Description of the problem
Suppose you have an Apache configuration allowing both http and https connects to your TWiki and you have a DefaultUrlHost like
http://www.example.com
. (means default is http if not specified).
If you use
ShorterUrlCookbook
and you have the following lines in your Apache config
Alias /twiki/pub/ /path/to/twiki/pub/
Alias / /path/to/twiki/cgi-bin/bin/view/
and your view URLs look like:
http://www.example.com/Web/Topic
or (and that's the important case for this bug)
https://www.example.com/Web/Topic
then you have problem in the second case (https case) because all links (that TWiki shows on the page) have the DefaultUrlHost prepended with http and not https!
Why?
Taking a look at
TWiki.pm
in the constructor (sub new) where urlHost is set, then we see
my $url = $query->url();
if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ) {
... lines left out ...
} else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}
If your view script gets called with
https://www.example.com/Web/Topic
, then url in this case is
https://www.example.com
without a trailing slash. Hence the regexp doesn't match (one slash after the two slashes of
protocol://
is required; and even if there were a trailing slash
$2
would be empty).
Then you have the wrong urlHost and especially all created links point to
http://www.example.com/...
and not
https://www.example.com/...
--
TWiki:Main/ChristianLudwig
- 21 Feb 2008
Seems to me this is a security issue - having url's that are intended to be ssl'd get rendered without https will pass out unencrypted text in places where its bad.
--
SvenDowideit - 01 Jun 2008
Four months old bug now
--
TWiki:Main.KennethLavrsen
- 01 Jul 2008
I know this bug is now four month old, but I cannot suggest a patch, because I do not know the intention of the code:
if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ) {
$this->{urlHost} = $1;
# If the urlHost in the url is localhost, this is a lot less
# useful than the default url host. This is because new CGI("")
# assigns this host by default - it's a default setting, used
# when there is nothing better available.
if( $this->{urlHost} eq 'http://localhost' ) {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
} elsif( $TWiki::cfg{RemovePortNumber} ) {
$this->{urlHost} =~ s/\:[0-9]+$//;
}
if( $TWiki::cfg{GetScriptUrlFromCgi} ) {
# SMELL: this is a really dangerous hack. It will fail
# spectacularly with mod_perl.
# SMELL: why not just use $query->script_name?
$this->{scriptUrlPath} = $2;
}
} else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}
I've several problems understanding what the code tries to do:
- I doesn't see, why "localhost" is that bad. I personally use localhost:8080 for the start of a ssh-tunnel for a test server. Fortunately I'm lucky and the
'http://localhost'
doesn't match because in my case there is an additional port at the end of the string.
- I don't know why it's important or necessary to have an additional slash after
://
in order to find out the urlHost
.
- I'm sure there are many reasons why the programmer(s) didn't use a simpler version, like
if (defined($url) && $url =~ m!^([^:]*://[^/]*)!) {
$this->{urlHost} = $1;
$this->{urlHost} =~ s/:[0-9]+$// if ($TWiki::cfg{RemovePortNumber});
}
else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}
$this->{scriptUrlPath}=$query->script_name if $TWiki::cfg{GetScriptUrlFromCgi};
Sorry; that's the reason, why I only gave a pointer to the code location in the bug report without an suggestion for improvement.
--
TWiki:Main.ChristianLudwig
- 01 Jul 2008
Hi,
I am able to recreate this Bug into my environment.
Let me try to summarise:
-You have $TWiki::cfg{DefaultUrlHost} = 'http://www.example.com'; in LocalSite.cfg
-Your webserver allows to serve TWiki/Web pages using http and https protocol.
-You have setup ShortUrlCookBook settings, you are accessing say https://www.example.com/Main/WebHome (this is actually https://www.example.com/twiki/bin/view/Main/WebHome)
-In the case, the line my $url = $query->url(); sets value to $url = http://www.example.com (it does not set to https://www.example.com or does not set to https://www.example.com/Main/WebHome), so it does not go through if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ), it goes via else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
} loop
So $this->{urlHost} gets the value $TWiki::cfg{DefaultUrlHost} which is http://www.example.com
So all links from the page/topic have links starting with http:// protocol and not with https://
I am working on to fix this issue
--
TWiki:Main.SopanShewale
- 09 Jul 2008
Hi,
Seems like the following code fixes this issue.
my $url = $query->url();
if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ) {
$this->{urlHost} = $1;
# If the urlHost in the url is localhost, this is a lot less
# useful than the default url host. This is because new CGI("")
# assigns this host by default - it's a default setting, used
# when there is nothing better available.
if( $this->{urlHost} eq 'http://localhost' ) {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
} elsif( $TWiki::cfg{RemovePortNumber} ) {
$this->{urlHost} =~ s/\:[0-9]+$//;
}
if( $TWiki::cfg{GetScriptUrlFromCgi} ) {
# SMELL: this is a really dangerous hack. It will fail
# spectacularly with mod_perl.
# SMELL: why not just use $query->script_name?
$this->{scriptUrlPath} = $2;
}
}elsif ( $url && $url =~ m!^([^:]*://[^/]*).*$!) {
$this->{urlHost} = $1;
}else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}
TWiki:Main/ChristianLudwig
, kindly request you to test this and let me know.
--
SopanShewale
- 14 Jul 2008
Thanks for this fix Sopan. And thanks to Christian for a very good bug report.
Setting to Waiting for Release.
--
KennethLavrsen
- 15 Jul 2008
The fix works for me. Thank you.
--
TWiki:Main.ChristianLudwig
- 15 Jul 2008
But breaks 9 unit tests
--
TWiki:Main.KennethLavrsen
- 16 Jul 2008
Because of the if … elseif … else construction there are now two completely different branches handling URLs. In the second branch (elsif) the RemovePortNumber config variable and the GetScriptUrlFromCGI config variable are both ignored and there is no localhost test.
Perhaps it's better to think about fixing the first regular expression (in the if) instead of opening more branches. But for fixing the reg exp some hints are needed what this reg exp should match and what not.
--
TWiki:Main.ChristianLudwig
- 17 Jul 2008
its actually better to use the url processing CPAN library - i didn't when I last poked the url parsing code because we try to avoid external dependancies, but I suspect it'd be safer to do so - perhaps by including a copy of the .pm in the release.
--
TWiki:Main.SvenDowideit
- 17 Jul 2008
But what is the code line supposed to do? That has nothing to do with CPAN libs.
We need the spec for what the original programmer intended this code to do.
--
TWiki:Main.KennethLavrsen
- 17 Jul 2008
Sorry, it seems obvious to me that the regex is intended to extract
$this->{urlHost} = $1;
.
Why its done this way is robably due to where the code started from, and the sequence of bugs that were found and fixed in it.
doesn't change the fact that hand parsing url's using a regex is a risky thing to do when there are well tested libraries that are known to do it well.
--
TWiki:Main.SvenDowideit
- 17 Jul 2008
Christian for sure understood at that low level what the code is doing.
Take a step back and tell us what the code is supposed to
do
As Christian writes: "I know this bug is now four month old, but I cannot suggest a patch, because I do not know the intention of the code"
What is the intention of the code? Even with a CPAN call you can get into trouble if you use it for the wrong purpose. And even without knowing the original intention exactly - for sure we are not going to force a new CPAN dependency of a lib that has to be installed just to perform such a simple little task. It is difficult for people using shared hosting to get the admins to install CPAN libs and for sure this one can be resolved with the right regex.
- The code block takes my $url = $query->url();
- and at the end it defines $this->{urlHost}
What - expressed in words - is the code supposed to do?
--
TWiki:Main.KennethLavrsen
- 18 Jul 2008
Kenneth, Which 9 test cases are failing - can you please let me know. I will look at those failed cases - may be need to go into deep to understand test cases/code from top to bottom.
--
TWiki:Main.SopanShewale
- 18 Jul 2008
Looking at the test report - i know the failed cases - working on it..
--
TWiki:Main.SopanShewale
- 18 Jul 2008
hmm.. this is interesting issue
If someone sets $TWiki::cfg{DefaultUrlHost} = 'http://192.168.168.129'; and $TWiki::cfg{ScriptUrlPath} = '/dev/bin';
He is expected to use
http://192.168.168.129/dev/bin
in Testing also.
Why test framework is calling
http://localhost/dev/bin
for testing purposes?
I feel - we should make the changes in the test framework. I am not sure - but this issue of test framework is already discussed.
While creating the fixature or while calling to any url - we should be using
DefaultUrl Host and not the localhost
--
TWiki:Main.SopanShewale
- 18 Jul 2008
Just to make happy Test Suite - i can change the code to following:
}elsif ( $url && $url =~ m!^([^:]*://[^/]*).*$!) {
$this->{urlHost} = $1;
if( $this->{urlHost} eq 'http://localhost' ) {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
} elsif( $TWiki::cfg{RemovePortNumber} ) {
$this->{urlHost} =~ s/\:[0-9]+$//;
}
}else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}
this does solves the problem - it passes the test cases.. for example ..
[root@localhost unit]# perl ../bin/TestRunner.pl Fn_SCRIPTURL.pm
Options:
exporting TWIKI_ASSERTS=1 for extra checking; disable by exporting TWIKI_ASSERTS=0
Assert checking on 1
Running Fn_SCRIPTURL
Fn_SCRIPTURL::test_SCRIPTURL
All tests passed (1)
[root@localhost unit]#
The same may happen with other test cases..
Suggest some one - what should be done? I feel -while running the test cases - we should be extracting the
DefaultURL host and run the test cases using that and not the localhost all the time.
--
TWiki:Main.SopanShewale
- 18 July 2008
I guess your quick fix above looks clean and simple and if it works check it in right away so we have a working 4.2.1 code base with clean unit test case run.
If you want to improve the test cases noone is going to blame you. On the contrary.
--
TWiki:Main.KennethLavrsen
- 20 Jul 2008
Hey - Thanks Kenneth, I will checkin the fix tomorrow morning.
--
TWiki:Main.SopanShewale
- 21 Jul 2008
It was confirmed at release meeting that your code is OK and that it is the unit tests that needs fixing.
Note that after Gilmars checkin the unit tests in the 4.2 branch and the trunk are not compatible so you need to fix it both places and you cannot copy over the file from trunk to 4.2.
--
TWiki:Main.KennethLavrsen
- 22 Jul 2008
Hi,
Looks like unit tests are modified to use $TWiki::cfg{DefaultUrlHost}. Cool.. i am not making any change to code now.. thank you
I am able to run the test cases with latest code and they are passing.
--
TWiki:Main.SopanShewale
- 22 Jul 2008
Hi,
Looks like unit tests are modified to use $TWiki::cfg{DefaultUrlHost}. Cool.. i am not making any change to code now.. thank you
I am able to run the test cases with latest code and they are passing.
--
TWiki:Main.SopanShewale
- 22 Jul 2008
I was looking at this issue and I think I have some contributions to give. Kenneth asked what was the purpose of the highlighted code. AFAICT, there are two goals:
- Set
$this->{urlHost}
- Set
$this->{scriptUrlPath}
(if $TWiki::cfg{GetScriptUrlFromCgi}
is true, what by default not holds)
To achieve the first goal, the original regex and the proposed at "elsif" are exactly the same. To achieve the second goal.. well, this one is a problem, cause:
- As its comment states: it doesn't work with mod_perl
- It can't work with TWiki:TWiki.ShorterUrlCookbook
by definition (GetScriptUrlFromCgi only works if there is such URL, and this URL is exactly what ShorterUrlCookbook eliminates).
So, my suggestion is a code like: (since the two goals are independent and a regex match isn't that heavy)
if( $url && $url =~ m{^([^:]*://[^/]*).*$} ) { # <-- the regex that catches urlHost
$this->{urlHost} = $1;
# If the urlHost in the url is localhost, this is a lot less
# useful than the default url host. This is because new CGI("")
# assigns this host by default - it's a default setting, used
# when there is nothing better available.
if( $this->{urlHost} eq 'http://localhost' ) {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
} elsif( $TWiki::cfg{RemovePortNumber} ) {
$this->{urlHost} =~ s/\:[0-9]+$//;
}
} else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}
if ( $TWiki::cfg{GetScriptUrlFromCgi}
&& $url
&& $url =~ m{^[^:]*://[^/]*(.*)/.*$} # <-- the "complete" regex, to catch scriptUrlPath
&& $1 )
{
# SMELL: this is a really dangerous hack. It will fail
# spectacularly with mod_perl.
# SMELL: why not just use $query->script_name?
$this->{scriptUrlPath} = $1;
}
I'll test it now, and if it works I'll commit.
Sorry for not looking at it earlier.
--
TWiki:Main.GilmarSantosJr
- 22 Jul 2008
I tested it and all works fine with TWikiRelease04x02. Waiting for feedback and unit test updates, so we can close this one
--
TWiki:Main.GilmarSantosJr
- 22 Jul 2008
Hey Thanks
TWiki:Main.GilmarSantosJr
- I will look test your code and will provide the feedback on tomorrow (India Time)
Thanks
--
TWiki:Main.SopanShewale
- 22 Jul 2008
Verified that unit tests pass now.
Changing to Waiting for release.
--
TWiki:Main.KennethLavrsen
- 22 Jul 2008
Why did you add yourself to Waiting for Gilmar?
Are there still open actions? Normally we clear that field when work is complete
--
TWiki:Main.KennethLavrsen
- 23 Jul 2008
Humm... since when state is "Being worked on" the "Waiting for" field have information about who is working, I thought (when I saw
Item4824 at
RecentlyClosed) that this field should have the name of people that fixed the bug when state was "Waiting for Release". Now that I read
instructions more carefully, I realized that I was using somethings in a wrong way... it's always nice to read
However I still have a question (that I didn't find the answer within docs): a bug on TWikiRelease04x02 implies on ticking "4.2.1" at Codebases form field?
And I think there is a pending issue:
TWiki:Main.SopanShewale
said he would test the fix and would also update unit tests. In this case should he open a new report or should we set state to "Waiting for Feedback" on Sopan (or keep "Being worked on")?
I took out my name from "Waiting For"
--
TWiki:Main.GilmarSantosJr
- 23 Jul 2008
Hey - thanks for your comments
TWiki:Main.GilmarSantosJr
. Everything is fixed about this bug (have not tested your code, i guess its not required).
Thanks - removed Waitingfor field.
Thanks Kenneth for tracking issues

we are always gr8full on you.
-
TWiki:Main.SopanShewale
- 23 Jul 2008
Gilmar, the normal procedure for a report to remain in "Being Worked on" with a name in the "Waiting For" field until the bug is fixed and unit tests pass. There should be no need to open a new report for unit test fixes. Once the report is "Waiting for Release" or "Closed" there is no need for a name in "Waiting For". The main purpose of the "Waiting For" field is to generate reminder emails for the person being waited on.
--
TWiki:Main.CrawfordCurrie
- 23 Jul 2008
you're welcome,
TWiki:Main.SopanShewale
Thanks for the clarifications Crawford!
--
TWiki:Main.GilmarSantosJr
- 23 Jul 2008
Cleaned "WaitingFor" field.
--
TWiki:Main.GilmarSantosJr
- 10 Aug 2008