
When trying to create a new topic in a web with an international character in it - using Internet Explorer - you get an error like this (the web in question is called "Fårvang"):

  • URL: /twiki/bin/oops/F%c3%a5rvang/Test%c5t?template=oopsaccessdenied;def=no_such_web;param1=edit
  • MSG: The "FÃ¥rvang" web does not exist

There are no problems when using Firefox to do the same.

I have tried various I18N configurations, but I get the same result with every setting I have tried so far.

-- SP

I have now also tried different versions of Apache and CGI::Session, but it always fails the same way.

-- SP

This looks like a failure of TWiki:Codev.EncodeURLsWithUTF8, which has been seen on Windows a couple of times. It happens only with IE because IE sends URLs encoded in UTF-8 by default, whereas Firefox sends non-UTF-8 encoded URLs (see TWiki:Codev.MozillaURLEncodingWithI18N for the original problem).
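To make the two URL forms concrete, here is a minimal sketch using only the core Encode module. The `percent_encode` helper is hypothetical (it is not part of TWiki or CGI.pm); it just shows how the same web name produces `%c3%a5` when encoded as UTF-8 and `%e5` when encoded as ISO-8859-1, and how reading the UTF-8 bytes as ISO-8859-1 gives the garbled name from the error message.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: percent-encode every byte outside the
# unreserved ASCII set (letters, digits, "-", ".", "_", "~").
sub percent_encode {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02x', ord($1))/ge;
    return $bytes;
}

my $web = "F\x{e5}rvang";    # "Fårvang" as a Perl character string

# UTF-8 form, as in the oops URL reported for IE
my $utf8_url = percent_encode( encode('UTF-8', $web) );

# ISO-8859-1 form, as in the URLs TWiki itself generates
my $latin1_url = percent_encode( encode('ISO-8859-1', $web) );

# UTF-8 bytes misread as ISO-8859-1: the name shown in the error message
my $misread = decode( 'ISO-8859-1', encode('UTF-8', $web) );

binmode STDOUT, ':encoding(UTF-8)';
print "$utf8_url\n";      # F%c3%a5rvang
print "$latin1_url\n";    # F%e5rvang
print "$misread\n";       # FÃ¥rvang
```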

There is some advice on debugging this part of the TWiki code at TWiki:Support.GermanUmlauteAndWindows. JoachimBlum also had this but his detailed support page (linked from this one) has disappeared. There were some promising lines of enquiry here, and I think the CGI.pm version is the strongest one (v3.04 works), as the CGI module maintainers have managed to break some things in the past re I18N with new versions - see TWiki:Support/ProblemsWithInternationCharactersInOddPlaces (different problem).

Providing the output of configure would be useful, as this includes the version of CGI.pm, which might be an issue. Platform details for client and server would also help, especially if you are using a Windows server without TWiki:Codev.TWikiVMDebianStable.

Hope that helps. Please do email me again if you need more input as I don't really have time to track this TWiki.

-- RD

Thanks for your input, Richard.

I noticed today that I only have the problem when both the name of the web and the name of topic I am trying to create have international characters in them.

| Web | Topic | Result |
| Fårvang | OneTopic | OK |
| Fårvang | FårvangTopic | Fails: unable to create the page; ends up as: The "FÃ¥rvang" web does not exist |
| Sandbox | FårvangTopic | OK |

This is using the Jump box or WebTopicCreator.

If I simply enter an edit URL manually, I get this (this is all using Internet Explorer, I cannot provoke any errors with Firefox):

| Create URL | Expected topic name | Result |
| /twiki/bin/edit/Fårvang/FårvangTopic | FårvangTopic | FårvangTopic - OK |
| /twiki/bin/edit/F%e5rvang/FårvangTopic2 | FårvangTopic2 | FÃ¥rvangTopic2 - FAIL (TWiki automatically rewrites the URLs to the %e5 form while browsing) |
| /twiki/bin/edit/Sandbox/FårvangTopic2 | FårvangTopic2 | FårvangTopic2 - OK |

As a side note, it seems to get worse if the last character of the topic name is an international character: in that case "å", for example, expands to "Ã…" instead of "Ã¥" (as in the above example).

Last time I mentioned trying various versions of CGI::Session; I have now also tried various versions of CGI.pm, including http://cpan.org/modules/by-module/CGI/CGI.pm-3.04.tar.gz. The results are still the same. (The server is Apache 1.3.43 on Debian Linux; IE is 6.0.29, English.)

It surprised me that the error occurs only if the topic I am trying to create contains an international character. It looks as if the problem is not the international web name by itself, but its combination with an international topic name?

-- SP

I think I found it - it looks like multiple encodings can end up in the web and topic name parameters of CGI.pm at the same time when using IE:

query = bless( {
                 '.script_name' => '/twiki/bin/view',
                 'topic' => [
                              'FårvangTopic'
                            ],
                 '.parameters' => [
                                    'topic'
                                  ],
                 '.path_info' => '/FÃ¥rvang/WebHome',
                 '.charset' => 'ISO-8859-1',
                 '.fieldnames' => {},
                 '.cookies' => {
                                 'TWIKISID' => bless( {
                                                        'value' => [
                                                                     '075a9e8169c42114be68bddb2f3d90c0'
                                                                   ],
                                                        'name' => 'TWIKISID',
                                                        'path' => '/'
                                                      }, 'CGI::Cookie' )
                               },
                 'escape' => 1
               }, 'CGI' );
 

As these were decoded together in TWiki.pm, only one could end up right.

-- SP

I think I've figured out the issue here. The TWiki UTF-8 URL encoding work takes the web name and topic name however TWiki found them (which could be PATH_INFO for the web name and a CGI parameter topic for the topic name). UTF-8 URL encoding by IE perhaps applies only to the URL path, including PATH_INFO; it probably doesn't apply to GET parameters after the '?', and certainly doesn't apply to POST parameters (which are sent in the page encoding, e.g. ISO-8859-1 here).

So... CGI.pm decodes the topic parameter as ISO-8859-1 and the web name is taken from the PATH_INFO by TWiki code, in UTF-8.

The solution, although slightly less performant for non-ISO-8859-1 sites, is to decode the web name and topic name separately. For sites using ISO-8859-1, the conversion is done algorithmically, so there should be little performance overhead anyway.
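One way to read "done algorithmically" is that UTF-8 to ISO-8859-1 needs no conversion tables, only bit arithmetic on two-byte sequences. The sketch below illustrates that idea; `utf8_to_latin1` is a hypothetical helper written for this example and is an assumption about the approach, not TWiki's actual routine.

```perl
use strict;
use warnings;

# Hypothetical helper (not TWiki's routine): convert UTF-8 bytes to
# ISO-8859-1 using bit arithmetic only, with no conversion tables.
# It handles only U+0080..U+00FF, which UTF-8 encodes as a C2 or C3
# lead byte plus one continuation byte; all other bytes pass through.
sub utf8_to_latin1 {
    my ($bytes) = @_;
    $bytes =~ s{([\xC2\xC3])([\x80-\xBF])}
               { chr( ((ord($1) & 0x03) << 6) | (ord($2) & 0x3F) ) }ge;
    return $bytes;
}

my $web = utf8_to_latin1("F\xc3\xa5rvang");    # UTF-8 bytes for "Fårvang"
print unpack('H*', $web), "\n";                # 46e57276616e67
```

The byte pair C3 A5 collapses to the single byte E5, which is "å" in ISO-8859-1. Characters outside ISO-8859-1 are not handled by this sketch.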

Presumably this only happens when creating a new topic because IE doesn't encode the topic name as UTF-8 if it's a parameter in a GET - however, the topic creation URL doesn't pass the topic that way, so it's a mystery how this happens... Some debug required...

The following is the code that is at fault, though it's better to really find out what's happening:

    my $newt = $this->UTF82SiteCharSet( $this->{webName}.'.'.
                                        $this->{topicName} );

Splitting this into two calls to the routine, one for the web name and one for the topic name, should fix this.
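A rough sketch of why separate calls help, under stated assumptions: `utf8_to_site_charset` below is a hypothetical stand-in for the conversion routine, and it is assumed to convert only input that is valid UTF-8 and to leave anything else untouched. The byte strings mirror the dump above, with a UTF-8 web name and an ISO-8859-1 topic name.

```perl
use strict;
use warnings;
use Encode qw(encode decode FB_CROAK LEAVE_SRC);

# Hypothetical stand-in for the conversion routine (an assumption, not
# TWiki's code): valid UTF-8 is converted to ISO-8859-1, anything else
# is returned unchanged.
sub utf8_to_site_charset {
    my ($bytes) = @_;
    my $chars = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $bytes unless defined $chars;    # not valid UTF-8: leave as-is
    return encode('ISO-8859-1', $chars);
}

my $web_utf8     = "F\xc3\xa5rvang";        # web name from PATH_INFO (UTF-8 bytes)
my $topic_latin1 = "F\xe5rvangTopic";       # topic parameter (ISO-8859-1 bytes)

# One call on the joined string: the ISO-8859-1 byte makes the whole
# string invalid UTF-8, so the web name is never converted.
my $joined = utf8_to_site_charset("$web_utf8.$topic_latin1");

# Two calls: each name is judged on its own encoding.
my $web   = utf8_to_site_charset($web_utf8);
my $topic = utf8_to_site_charset($topic_latin1);

print "joined: ", unpack('H*', $joined), "\n";
print "web:    ", unpack('H*', $web),    "\n";    # 46e57276616e67
print "topic:  ", unpack('H*', $topic),  "\n";
```

With the joined call the web name keeps its UTF-8 bytes, which matches the "FÃ¥rvang" symptom; with separate calls both names end up in ISO-8859-1.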

-- RD

4.1.0 released

KJL

ItemTemplate
| Summary | UTF-8 conversion fails with IE when both webname and topicname contains international characters |
| ReportedBy | TWiki:Main.SteffenPoulsen |
| Codebase | 4.0.4, ~twiki4 |
| SVN Range | TWiki-4.1, Wed, 04 Oct 2006, build 11657 |
| AppliesTo | Engine |
| Component | |
| Priority | Urgent |
| CurrentState | Closed |
| WaitingFor | |
| Checkins | 11688 |
| TargetRelease | minor |
Topic revision: r11 - 2007-01-16 - KennethLavrsen
 