Parentheses need to be encoded in order to pass a regular expression in a URL.
If you do a WebSearch for (bla):
https://develop.twiki.org/do/view/Bugs/WebSearch?search=%28bla%29
the search string is properly encoded (check the search box and the URL).
But when you write
%ENCODE{"(bla)" type="url"}%
the parentheses are not encoded. If you pass a URL with unencoded parentheses to WebSearch, you won't get the same results, so the search is effectively broken.
Test:
* %ENCODE{"(text with spaces and parentheses)" type="entity"}%
* %ENCODE{"(text with spaces and parentheses)" type="html"}%
* %ENCODE{"(text with spaces and parentheses)" type="url"}%
* %ENCODE{"(text with spaces and parentheses)"}%
results in:
- (text with spaces and parentheses)
- (text with spaces and parentheses)
- (text%20with%20spaces%20and%20parentheses)
- (text%20with%20spaces%20and%20parentheses)
The code doc at TWiki.pm->urlEncode says:
...Only alphanumerics [0-9a-zA-Z], the special characters $-_.+!*'(), and reserved characters used for their reserved purposes may be used unencoded within a URL.
The fix looks easy: change
$text =~ s/([^0-9a-zA-Z-_.:~!*'()\/%])/'%'.sprintf('%02x',ord($1))/ge;
to:
$text =~ s/([^0-9a-zA-Z-_.:~!*'\/%])/'%'.sprintf('%02x',ord($1))/ge;
After the change the test above will print:
- (text with spaces and parentheses)
- (text with spaces and parentheses)
- %28text%20with%20spaces%20and%20parentheses%29
- %28text%20with%20spaces%20and%20parentheses%29
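To make the difference concrete, here is a minimal standalone Perl sketch that runs both substitutions side by side. The sub names url_encode_current and url_encode_proposed are made up for this illustration and are not part of TWiki.pm; only the two regular expressions are copied from the lines above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution currently in use:
# '(' and ')' are in the set of characters left unencoded.
sub url_encode_current {
    my ($text) = @_;
    $text =~ s/([^0-9a-zA-Z-_.:~!*'()\/%])/'%'.sprintf('%02x',ord($1))/ge;
    return $text;
}

# Hypothetical wrapper around the proposed substitution:
# '(' and ')' are removed from the unencoded set, so they become %28 and %29.
sub url_encode_proposed {
    my ($text) = @_;
    $text =~ s/([^0-9a-zA-Z-_.:~!*'\/%])/'%'.sprintf('%02x',ord($1))/ge;
    return $text;
}

my $input = '(text with spaces and parentheses)';
print url_encode_current($input),  "\n";   # (text%20with%20spaces%20and%20parentheses)
print url_encode_proposed($input), "\n";   # %28text%20with%20spaces%20and%20parentheses%29
```

Both versions leave '%' untouched, so a string that is already encoded (for example %28bla%29) passes through unchanged in either case.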
--
AC
That doesn't sound right at all. The purpose of encoding is to stop an illegal URL being generated, but () are legal in a URL, so they don't have to be encoded. The text of that comment in TWiki.pm was taken verbatim from the URL spec.
Are you trying to use ENCODE for a purpose for which it wasn't intended? Can you please explain the use-case?
CC
Hmm, my examples here work ok:
I must be doing something wrong. Setting to discarded for now.
AC