When trying to create a new topic in a web with an international character in it - using Internet Explorer - you get an error like this (the web in question is called "Fårvang"):
- URL: /twiki/bin/oops/F%c3%a5rvang/Test%c5t?template=oopsaccessdenied;def=no_such_web;param1=edit
- MSG: The "FÃ¥rvang" web does not exist
There are no problems when using Firefox to do the same.
I have tried various I18N configurations, but I get the same result with all the settings I have tried so far.
--
SP
I have now also tried different versions of Apache and CGI::Session, but it always fails the same way.
--
SP
This looks like a failure of TWiki:Codev.EncodeURLsWithUTF8, which has been seen on Windows a couple of times. This happens only on IE, because IE sends a URL encoded with UTF8, which it does by default, whereas Firefox sends non-UTF8 encoded URLs (see TWiki:Codev.MozillaURLEncodingWithI18N for the original problem).
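To make the difference between the two URL forms concrete, here is a minimal sketch of how the same web name percent-encodes under each charset. The helper name `percent_encode` is made up for illustration and is not part of TWiki or CGI.pm; only the core `Encode` module is assumed.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: percent-encode every byte outside the
# unreserved ASCII set (letters, digits, "_", ".", "~", "-").
sub percent_encode {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9_.~-])/sprintf('%%%02x', ord($1))/ge;
    return $bytes;
}

my $web = "F\x{e5}rvang";    # "Fårvang" as a Perl character string

# The UTF-8 form, as seen in the oops URL reported above.
my $as_utf8 = percent_encode( encode( 'UTF-8', $web ) );

# The ISO-8859-1 form, i.e. the site charset in this report.
my $as_latin1 = percent_encode( encode( 'ISO-8859-1', $web ) );

# Reading the UTF-8 bytes as if they were ISO-8859-1 gives the
# garbled name from the error message.
my $misread = decode( 'ISO-8859-1', encode( 'UTF-8', $web ) );

binmode( STDOUT, ':encoding(UTF-8)' );
print "$as_utf8\n$as_latin1\n$misread\n";
```

The first line printed matches the `F%c3%a5rvang` in the oops URL, the second is the `%e5` form, and the third is the "FÃ¥rvang" from the error message.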
There is some advice on debugging this part of the TWiki code at TWiki:Support.GermanUmlauteAndWindows.
JoachimBlum also had this but his detailed support page (linked from this one) has disappeared. There were some promising lines of enquiry here, and I think the CGI.pm version is the strongest one (v3.04 works), as the CGI module maintainers have managed to break some things in the past re I18N with new versions - see TWiki:Support/ProblemsWithInternationCharactersInOddPlaces (different problem).
Providing the output of configure would be useful, as this includes the version of CGI.pm, which might be an issue. Platform details for client and server would also help, especially if you are using a Windows server without TWiki:Codev.TWikiVMDebianStable.
Hope that helps. Please do email me again if you need more input as I don't really have time to track this TWiki.
--
RD
Thanks for your input, Richard.
I noticed today that I only have the problem when both the name of the web and the name of the topic I am trying to create have international characters in them.
| Web | Topic | Result |
| Fårvang | OneTopic | OK |
| Fårvang | FårvangTopic | Unable to create a page - ends up as "The "FÃ¥rvang" web does not exist" |
| Sandbox | FårvangTopic | OK |
This is using the Jump box or WebTopicCreator.
If I simply enter an edit URL manually, I get this (this is all using Internet Explorer, I cannot provoke any errors with Firefox):
| Create URL | Expected topic name | Result |
| /twiki/bin/edit/Fårvang/FårvangTopic | FårvangTopic | FårvangTopic - OK |
| /twiki/bin/edit/F%e5rvang/FårvangTopic2 | FårvangTopic2 | FÃ¥rvangTopic2 - FAIL (TWiki automatically rewrites the URLs to the %e5-form while browsing) |
| /twiki/bin/edit/Sandbox/FårvangTopic2 | FårvangTopic2 | FårvangTopic2 - OK |
As a side note, it seems to get worse if the last character of the topic name is an international character - in that case, for example, "å" expands to "Ã…" instead of "Ã¥" (as in the above example).
I see that I last mentioned trying various versions of CGI::Session; I have now also tried various versions of CGI.pm, including http://cpan.org/modules/by-module/CGI/CGI.pm-3.04.tar.gz. The results are still the same. (Server is Apache 1.3.43/Linux Debian, IE is 6.0.29/English)
It surprised me that the error only occurs if the topic I am trying to create contains an international character. So it looks like this is not a problem with the international web name alone, but with its combination with an international topic name?
--
SP
I think I found it - it looks like multiple encodings can end up in the web and topic name parameters of CGI.pm at the same time when using IE:
query = bless( {
    '.script_name' => '/twiki/bin/view',
    'topic' => [
        'FårvangTopic'
    ],
    '.parameters' => [
        'topic'
    ],
    '.path_info' => '/FÃ¥rvang/WebHome',
    '.charset' => 'ISO-8859-1',
    '.fieldnames' => {},
    '.cookies' => {
        'TWIKISID' => bless( {
            'value' => [
                '075a9e8169c42114be68bddb2f3d90c0'
            ],
            'name' => 'TWIKISID',
            'path' => '/'
        }, 'CGI::Cookie' )
    },
    'escape' => 1
}, 'CGI' );
As these were decoded together in TWiki.pm, only one could end up right.
--
SP
I think I've figured out the issue here. The TWiki UTF-8 URL encoding work takes the web name and topic name, however TWiki found them (which could be PATH_INFO for the web name, and a CGI parameter topic for the topic name). UTF-8 URL encoding by IE perhaps only applies to the URL including PATH_INFO, probably doesn't apply to GET parameters after the '?', and certainly doesn't apply to POST parameters (which will be sent with the page encoding, e.g. ISO-8859-1 here).
So... CGI.pm decodes the topic parameter as ISO-8859-1 and the web name is taken from the PATH_INFO by TWiki code, in UTF-8.
The solution, although slightly less performant for non-ISO-8859-1 sites, is to decode the web name and topic name separately. For sites using ISO-8859-1, the conversion is done algorithmically, so there should be little performance overhead anyway.
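As a rough illustration of why the ISO-8859-1 case is cheap: code points U+0080 to U+00FF are encoded in UTF-8 as a lead byte 0xC2 or 0xC3 followed by one continuation byte, so the conversion can be a single substitution. The routine below is a hypothetical sketch, not TWiki's actual code, and it assumes the input really is UTF-8 (ISO-8859-1 input that happens to contain such a byte pair would be altered too).

```perl
use strict;
use warnings;

# Hypothetical helper (not TWiki's routine): fold each two-byte UTF-8
# sequence for U+0080..U+00FF into the single ISO-8859-1 byte.
# Assumes the input is UTF-8; longer sequences are left untouched.
sub utf8_bytes_to_latin1 {
    my ($bytes) = @_;
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/chr( ( ( ord($1) & 0x03 ) << 6 ) | ( ord($2) & 0x3F ) )/ge;
    return $bytes;
}

# Web name as it arrives in PATH_INFO from IE: "å" is the pair C3 A5.
my $web = utf8_bytes_to_latin1("F\xC3\xA5rvang");

# Show the result as hex bytes; the pair C3 A5 becomes the single byte e5.
my $hex = join ' ', map { sprintf '%02x', ord } split //, $web;
print "$hex\n";
```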
Presumably this only happens when creating a new topic because IE doesn't encode the topic name as UTF-8 if it's a parameter in a GET - however, the topic creation URL doesn't pass the topic that way, so it's a mystery how this happens... Some debug required...
The following is the code that is at fault, though it's better to really find out what's happening:
my $newt = $this->UTF82SiteCharSet( $this->{webName}.'.'.
$this->{topicName} );
Making this two calls to the routine should fix this.
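A minimal sketch of the difference between one call and two, using a stand-in for the conversion routine. `to_site_charset` is hypothetical and only assumed to behave like UTF82SiteCharSet (convert if the bytes are valid UTF-8, otherwise leave them alone); the real routine may differ in detail. The site charset is assumed to be ISO-8859-1, as in this report.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the conversion routine: if the bytes are
# strictly valid UTF-8, convert them to ISO-8859-1; otherwise assume
# they are already in the site charset and return them unchanged.
sub to_site_charset {
    my ($bytes) = @_;
    my $copy  = $bytes;    # decode() with a CHECK value may modify its argument
    my $chars = eval { decode( 'UTF-8', $copy, Encode::FB_CROAK() ) };
    return $bytes unless defined $chars;
    return encode( 'ISO-8859-1', $chars );
}

my $web   = "F\xC3\xA5rvang";     # from PATH_INFO: UTF-8 bytes
my $topic = "F\xE5rvangTopic";    # from the 'topic' parameter: ISO-8859-1 bytes

# One call on the joined string: the lone \xE5 in the topic makes the
# whole string invalid UTF-8, so the web name is never converted.
my $joined = to_site_charset( $web . '.' . $topic );

# Two calls: each part is checked and converted on its own.
my $fixed = to_site_charset($web) . '.' . to_site_charset($topic);

print( ( $joined eq $fixed ) ? "same\n" : "different\n" );
```

With the mixed input from the dump above, the joined call leaves the web name in its UTF-8 form, while the two separate calls give a web name and topic name that are both in the site charset.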
--
RD
4.1.0 released
KJL