Item7197: Detecting UTF-8 flag before the response output is written out

Item Form Data

AppliesTo: Engine
Component:
Priority: Normal
CurrentState: Closed
WaitingFor:
TargetRelease: major
ReleasedIn: 6.0.0


When a page is displayed whose content (from plugins, for example) contains any utf8-flagged characters, the browser appears to keep loading the page indefinitely. After a timeout, the entire page is replaced by an error page.

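The root cause is that the HTTP "Content-Length" header is skewed: it relies on Perl's built-in length() function, which, for a utf8-flagged string, returns the number of characters rather than the number of bytes. The declared length is therefore smaller than the number of bytes actually sent whenever the output contains multi-byte characters. Note: the loading issue probably does not occur if "Transfer-Encoding: chunked" is used.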
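The mismatch can be illustrated with a minimal sketch. The variable names are made up for this example and nothing here is taken from the TWiki code base; it only shows how length() differs between a utf8-flagged string and its byte representation.

```perl
use strict;
use warnings;

# A character string with the utf8 flag on (as a plugin might return it).
my $chars = "caf\x{E9}";        # "café", 4 characters
utf8::upgrade($chars);          # make sure the utf8 flag is set

# The same text as the raw UTF-8 byte sequence that goes over the wire.
my $bytes = $chars;
utf8::encode($bytes);           # characters -> UTF-8 octets, flag off

my $flagged_length = length($chars);   # counts characters
my $byte_length    = length($bytes);   # counts bytes

printf "utf8 flag on: %s\n", utf8::is_utf8($chars) ? 'yes' : 'no';
printf "length of flagged string: %d\n", $flagged_length;
printf "length of byte string:    %d\n", $byte_length;
```

If Content-Length is computed from the flagged string, it is one less than the number of bytes sent in this example, because "é" occupies two bytes in UTF-8.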

Ideally, all plugins should conform to TWiki's assumption that the utf8 flag is turned off everywhere, i.e. all Perl strings should be raw byte sequences, regardless of whether they hold UTF-8 encoded text or not. However, some plugins may return strings with the utf8 flag turned on without notice, and the resulting browser behavior makes users think that TWiki is broken.

As a precaution, TWiki should detect the utf8 flag just before writing out any output, forcibly turn the flag off, and write a warning to the log.
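One way such a precaution could look is sketched below. This is not the code from the actual check-ins: force_bytes() and the logger callback are hypothetical names, and the sketch assumes that converting with utf8::encode() is an acceptable way to turn the flag off.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the output as a byte string (utf8 flag off).
# $log is a hypothetical code reference that receives a warning message.
sub force_bytes {
    my ( $text, $log ) = @_;

    # Byte strings pass through untouched.
    return $text unless utf8::is_utf8($text);

    $log->('utf8 flag detected in response output; converting to bytes')
      if $log;

    utf8::encode($text);    # characters -> UTF-8 octets, flag turned off
    return $text;
}

# Usage: a body containing a wide character is utf8-flagged by Perl.
my @warnings;
my $body = force_bytes( "caf\x{263A}", sub { push @warnings, @_ } );

# Content-Length is now computed on bytes, not characters.
my $content_length = length($body);
print "Content-Length: $content_length\n";
```

Because the check runs just before the output is written, the byte count used for Content-Length matches what is sent, and the warning points the administrator to the fact that some component returned a flagged string.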

-- TWiki:Main/MahiroAndo - 2013-03-19

Summary Detecting UTF-8 flag before the response output is written out
ReportedBy TWiki:Main.MahiroAndo
Codebase ~twiki4
SVN Range TWiki-5.1.3-trunk, Fri, 15 Mar 2013, build 25443
AppliesTo Engine

Priority Normal
CurrentState Closed

Checkins TWikirev:25460 TWikirev:25461
TargetRelease major
ReleasedIn 6.0.0
Topic revision: r6 - 2013-10-15 - PeterThoeny
