• Do not register here on develop.twiki.org, login with your twiki.org account.
• Use View topic Item7848 for generic doc work for TWiki-6.1.1. Use View topic Item7851 for doc work on extensions that are not part of a release. More... Close
• Anything you create or change in standard webs (Main, TWiki, Sandbox etc) will be automatically reverted on every SVN update.
Does this site look broken?. Use the LitterTray web for test cases.

Item5382: TWiki.pm sets wrong urlHost if https protocol and ShorterUrlCookbook is used

Item Form Data

AppliesTo: Component: Priority: CurrentState: WaitingFor: TargetRelease ReleasedIn
Engine TWiki.pm Urgent Closed   patch 4.2.1, 5.0.0

Edit Form Data

Summary:
Reported By:
Codebase:
Applies To:
Component:
Priority:
Current State:
Waiting For:
Target Release:
Released In:
 

Detail

Description of the problem

Suppose you have an Apache configuration allowing both http and https connects to your TWiki and you have a DefaultUrlHost like http://www.example.com. (means default is http if not specified).

If you use ShorterUrlCookbook and you have the following lines in your Apache config

Alias /twiki/pub/ /path/to/twiki/pub/
Alias /  /path/to/twiki/cgi-bin/bin/view/
and your view URLs look like: http://www.example.com/Web/Topic or (and that's the important case for this bug) https://www.example.com/Web/Topic then you have problem in the second case (https case) because all links (that TWiki shows on the page) have the DefaultUrlHost prepended with http and not https!

Why?

Taking a look at TWiki.pm in the constructor (sub new) where urlHost is set, then we see
    my $url = $query->url();
    if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ) {
       ... lines left out ...
    } else {
        $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
    }
If your view script gets called with https://www.example.com/Web/Topic, then url in this case is https://www.example.com without a trailing slash. Hence the regexp doesn't match (one slash after the two slashes of protocol:// is required; and even if there were a trailing slash $2 would be empty).

Then you have the wrong urlHost and especially all created links point to http://www.example.com/... and not https://www.example.com/...

-- TWiki:Main/ChristianLudwig - 21 Feb 2008

Seems to me this is a security issue - having url's that are intended to be ssl'd get rendered without https will pass out unencrypted text in places where its bad.

-- SvenDowideit - 01 Jun 2008

Four months old bug now

-- TWiki:Main.KennethLavrsen - 01 Jul 2008

I know this bug is now four month old, but I cannot suggest a patch, because I do not know the intention of the code:

if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ) {
    $this->{urlHost} = $1;
    # If the urlHost in the url is localhost, this is a lot less
    # useful than the default url host. This is because new CGI("")
    # assigns this host by default - it's a default setting, used
    # when there is nothing better available.
    if( $this->{urlHost} eq 'http://localhost' ) {
        $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
    } elsif( $TWiki::cfg{RemovePortNumber} ) {
        $this->{urlHost} =~ s/\:[0-9]+$//;
    }
    if( $TWiki::cfg{GetScriptUrlFromCgi} ) {
        # SMELL: this is a really dangerous hack. It will fail
        # spectacularly with mod_perl.
        # SMELL: why not just use $query->script_name?
        $this->{scriptUrlPath} = $2;
    }
} else {
    $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
}

I've several problems understanding what the code tries to do:

  1. I doesn't see, why "localhost" is that bad. I personally use localhost:8080 for the start of a ssh-tunnel for a test server. Fortunately I'm lucky and the
    'http://localhost'
    doesn't match because in my case there is an additional port at the end of the string.
  2. I don't know why it's important or necessary to have an additional slash after :// in order to find out the urlHost.
  3. I'm sure there are many reasons why the programmer(s) didn't use a simpler version, like
    if (defined($url) && $url =~ m!^([^:]*://[^/]*)!) {
      $this->{urlHost} = $1;
      $this->{urlHost} =~ s/:[0-9]+$// if ($TWiki::cfg{RemovePortNumber});
    }
    else {
        $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
    }
    $this->{scriptUrlPath}=$query->script_name if $TWiki::cfg{GetScriptUrlFromCgi};
    
    Sorry; that's the reason, why I only gave a pointer to the code location in the bug report without an suggestion for improvement.

-- TWiki:Main.ChristianLudwig - 01 Jul 2008

Hi,

I am able to recreate this Bug into my environment.

Let me try to summarise:

-You have $TWiki::cfg{DefaultUrlHost} = 'http://www.example.com'; in LocalSite.cfg

-Your webserver allows to serve TWiki/Web pages using http and https protocol.

-You have setup ShortUrlCookBook settings, you are accessing say https://www.example.com/Main/WebHome (this is actually https://www.example.com/twiki/bin/view/Main/WebHome)

-In the case, the line my $url = $query->url(); sets value to $url = http://www.example.com (it does not set to https://www.example.com or does not set to https://www.example.com/Main/WebHome), so it does not go through if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ), it goes via else {
$this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
} loop

So $this->{urlHost} gets the value $TWiki::cfg{DefaultUrlHost} which is http://www.example.com

So all links from the page/topic have links starting with http:// protocol and not with https://

I am working on to fix this issue

-- TWiki:Main.SopanShewale - 09 Jul 2008

Hi,

Seems like the following code fixes this issue.

my $url = $query->url();

    if( $url && $url =~ m!^([^:]*://[^/]*)(.*)/.*$! && $2 ) {

        $this->{urlHost} = $1;

        # If the urlHost in the url is localhost, this is a lot less
        # useful than the default url host. This is because new CGI("")
        # assigns this host by default - it's a default setting, used
        # when there is nothing better available.
        if( $this->{urlHost} eq 'http://localhost' ) {
            $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
        } elsif( $TWiki::cfg{RemovePortNumber} ) {
            $this->{urlHost} =~ s/\:[0-9]+$//;
        }
        if( $TWiki::cfg{GetScriptUrlFromCgi} ) {
            # SMELL: this is a really dangerous hack. It will fail
            # spectacularly with mod_perl.
            # SMELL: why not just use $query->script_name?
            $this->{scriptUrlPath} = $2;
        }
    }elsif ( $url && $url =~ m!^([^:]*://[^/]*).*$!) {
        $this->{urlHost} = $1;
    }else {

        $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
  }
 

TWiki:Main/ChristianLudwig, kindly request you to test this and let me know.

-- SopanShewale - 14 Jul 2008

Thanks for this fix Sopan. And thanks to Christian for a very good bug report.

Setting to Waiting for Release.

-- KennethLavrsen - 15 Jul 2008

The fix works for me. Thank you.

-- TWiki:Main.ChristianLudwig - 15 Jul 2008

But breaks 9 unit tests

-- TWiki:Main.KennethLavrsen - 16 Jul 2008

Because of the if … elseif … else construction there are now two completely different branches handling URLs. In the second branch (elsif) the RemovePortNumber config variable and the GetScriptUrlFromCGI config variable are both ignored and there is no localhost test.

Perhaps it's better to think about fixing the first regular expression (in the if) instead of opening more branches. But for fixing the reg exp some hints are needed what this reg exp should match and what not.

-- TWiki:Main.ChristianLudwig - 17 Jul 2008

its actually better to use the url processing CPAN library - i didn't when I last poked the url parsing code because we try to avoid external dependancies, but I suspect it'd be safer to do so - perhaps by including a copy of the .pm in the release.

-- TWiki:Main.SvenDowideit - 17 Jul 2008

But what is the code line supposed to do? That has nothing to do with CPAN libs.

We need the spec for what the original programmer intended this code to do.

-- TWiki:Main.KennethLavrsen - 17 Jul 2008

Sorry, it seems obvious to me that the regex is intended to extract $this->{urlHost} = $1;.

Why its done this way is robably due to where the code started from, and the sequence of bugs that were found and fixed in it.

doesn't change the fact that hand parsing url's using a regex is a risky thing to do when there are well tested libraries that are known to do it well.

-- TWiki:Main.SvenDowideit - 17 Jul 2008

Christian for sure understood at that low level what the code is doing.

Take a step back and tell us what the code is supposed to do

As Christian writes: "I know this bug is now four month old, but I cannot suggest a patch, because I do not know the intention of the code"

What is the intention of the code? Even with a CPAN call you can get into trouble if you use it for the wrong purpose. And even without knowing the original intention exactly - for sure we are not going to force a new CPAN dependency of a lib that has to be installed just to perform such a simple little task. It is difficult for people using shared hosting to get the admins to install CPAN libs and for sure this one can be resolved with the right regex.

  • The code block takes my $url = $query->url();
  • and at the end it defines $this->{urlHost}

What - expressed in words - is the code supposed to do?

-- TWiki:Main.KennethLavrsen - 18 Jul 2008

Kenneth, Which 9 test cases are failing - can you please let me know. I will look at those failed cases - may be need to go into deep to understand test cases/code from top to bottom.

-- TWiki:Main.SopanShewale - 18 Jul 2008

Looking at the test report - i know the failed cases - working on it..

-- TWiki:Main.SopanShewale - 18 Jul 2008

hmm.. this is interesting issue

If someone sets $TWiki::cfg{DefaultUrlHost} = 'http://192.168.168.129'; and $TWiki::cfg{ScriptUrlPath} = '/dev/bin';

He is expected to use http://192.168.168.129/dev/bin in Testing also.

Why test framework is calling http://localhost/dev/bin for testing purposes?

I feel - we should make the changes in the test framework. I am not sure - but this issue of test framework is already discussed.

While creating the fixature or while calling to any url - we should be using DefaultUrl Host and not the localhost

-- TWiki:Main.SopanShewale - 18 Jul 2008

Just to make happy Test Suite - i can change the code to following:

 }elsif ( $url && $url =~ m!^([^:]*://[^/]*).*$!) {
        $this->{urlHost} = $1;
        if( $this->{urlHost} eq 'http://localhost' ) {
            $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
        } elsif( $TWiki::cfg{RemovePortNumber} ) {
            $this->{urlHost} =~ s/\:[0-9]+$//;
        }

    }else {

        $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
  }
 

this does solves the problem - it passes the test cases.. for example ..

[root@localhost unit]# perl ../bin/TestRunner.pl Fn_SCRIPTURL.pm
Options:
exporting TWIKI_ASSERTS=1 for extra checking; disable by exporting TWIKI_ASSERTS=0
Assert checking on 1
Running Fn_SCRIPTURL
        Fn_SCRIPTURL::test_SCRIPTURL
All tests passed (1)
[root@localhost unit]#
 

The same may happen with other test cases..

Suggest some one - what should be done? I feel -while running the test cases - we should be extracting the DefaultURL host and run the test cases using that and not the localhost all the time.

-- TWiki:Main.SopanShewale - 18 July 2008

I guess your quick fix above looks clean and simple and if it works check it in right away so we have a working 4.2.1 code base with clean unit test case run.

If you want to improve the test cases noone is going to blame you. On the contrary.

-- TWiki:Main.KennethLavrsen - 20 Jul 2008

Hey - Thanks Kenneth, I will checkin the fix tomorrow morning.

-- TWiki:Main.SopanShewale - 21 Jul 2008

It was confirmed at release meeting that your code is OK and that it is the unit tests that needs fixing.

Note that after Gilmars checkin the unit tests in the 4.2 branch and the trunk are not compatible so you need to fix it both places and you cannot copy over the file from trunk to 4.2.

-- TWiki:Main.KennethLavrsen - 22 Jul 2008

Hi,

Looks like unit tests are modified to use $TWiki::cfg{DefaultUrlHost}. Cool.. i am not making any change to code now.. thank you

I am able to run the test cases with latest code and they are passing.

-- TWiki:Main.SopanShewale - 22 Jul 2008

Hi,

Looks like unit tests are modified to use $TWiki::cfg{DefaultUrlHost}. Cool.. i am not making any change to code now.. thank you

I am able to run the test cases with latest code and they are passing.

-- TWiki:Main.SopanShewale - 22 Jul 2008

I was looking at this issue and I think I have some contributions to give. Kenneth asked what was the purpose of the highlighted code. AFAICT, there are two goals:

  • Set $this->{urlHost}
  • Set $this->{scriptUrlPath} (if $TWiki::cfg{GetScriptUrlFromCgi} is true, what by default not holds)

To achieve the first goal, the original regex and the proposed at "elsif" are exactly the same. To achieve the second goal.. well, this one is a problem, cause:

  • As its comment states: it doesn't work with mod_perl
  • It can't work with TWiki:TWiki.ShorterUrlCookbook by definition (GetScriptUrlFromCgi only works if there is such URL, and this URL is exactly what ShorterUrlCookbook eliminates).

So, my suggestion is a code like: (since the two goals are independent and a regex match isn't that heavy)

    if( $url && $url =~ m{^([^:]*://[^/]*).*$} ) {  # <-- the regex that catches urlHost
        $this->{urlHost} = $1;
        # If the urlHost in the url is localhost, this is a lot less
        # useful than the default url host. This is because new CGI("")
        # assigns this host by default - it's a default setting, used
        # when there is nothing better available.
        if( $this->{urlHost} eq 'http://localhost' ) {
            $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
        } elsif( $TWiki::cfg{RemovePortNumber} ) {
            $this->{urlHost} =~ s/\:[0-9]+$//;
        }
    } else {
        $this->{urlHost} = $TWiki::cfg{DefaultUrlHost};
    }
    if (   $TWiki::cfg{GetScriptUrlFromCgi} 
        && $url
        && $url =~ m{^[^:]*://[^/]*(.*)/.*$}  # <-- the "complete" regex, to catch scriptUrlPath
        && $1 )
    {

        # SMELL: this is a really dangerous hack. It will fail
        # spectacularly with mod_perl.
        # SMELL: why not just use $query->script_name?
        $this->{scriptUrlPath} = $1;
    }

I'll test it now, and if it works I'll commit.

Sorry for not looking at it earlier.

-- TWiki:Main.GilmarSantosJr - 22 Jul 2008

I tested it and all works fine with TWikiRelease04x02. Waiting for feedback and unit test updates, so we can close this one wink

-- TWiki:Main.GilmarSantosJr - 22 Jul 2008

Hey Thanks TWiki:Main.GilmarSantosJr - I will look test your code and will provide the feedback on tomorrow (India Time)

Thanks smile

-- TWiki:Main.SopanShewale - 22 Jul 2008

Verified that unit tests pass now.

Changing to Waiting for release.

-- TWiki:Main.KennethLavrsen - 22 Jul 2008

Why did you add yourself to Waiting for Gilmar?

Are there still open actions? Normally we clear that field when work is complete

-- TWiki:Main.KennethLavrsen - 23 Jul 2008

Humm... since when state is "Being worked on" the "Waiting for" field have information about who is working, I thought (when I saw Item4824 at RecentlyClosed) that this field should have the name of people that fixed the bug when state was "Waiting for Release". Now that I read instructions more carefully, I realized that I was using somethings in a wrong way... it's always nice to read smile

However I still have a question (that I didn't find the answer within docs): a bug on TWikiRelease04x02 implies on ticking "4.2.1" at Codebases form field?

And I think there is a pending issue: TWiki:Main.SopanShewale said he would test the fix and would also update unit tests. In this case should he open a new report or should we set state to "Waiting for Feedback" on Sopan (or keep "Being worked on")?

I took out my name from "Waiting For" wink

-- TWiki:Main.GilmarSantosJr - 23 Jul 2008

Hey - thanks for your comments TWiki:Main.GilmarSantosJr. Everything is fixed about this bug (have not tested your code, i guess its not required).

Thanks - removed Waitingfor field.

Thanks Kenneth for tracking issues smile we are always gr8full on you.

- TWiki:Main.SopanShewale - 23 Jul 2008

Gilmar, the normal procedure for a report to remain in "Being Worked on" with a name in the "Waiting For" field until the bug is fixed and unit tests pass. There should be no need to open a new report for unit test fixes. Once the report is "Waiting for Release" or "Closed" there is no need for a name in "Waiting For". The main purpose of the "Waiting For" field is to generate reminder emails for the person being waited on.

-- TWiki:Main.CrawfordCurrie - 23 Jul 2008

you're welcome, TWiki:Main.SopanShewale wink

Thanks for the clarifications Crawford!

-- TWiki:Main.GilmarSantosJr - 23 Jul 2008

Cleaned "WaitingFor" field.

-- TWiki:Main.GilmarSantosJr - 10 Aug 2008

ItemTemplate
Summary TWiki.pm sets wrong urlHost if https protocol and ShorterUrlCookbook is used
ReportedBy TWiki:Main.ChristianLudwig
Codebase 4.2.0, 4.2.1, ~twiki4
SVN Range

AppliesTo Engine
Component TWiki.pm
Priority Urgent
CurrentState Closed
WaitingFor

Checkins TWikirev:17017 TWikirev:17018 TWikirev:17019 TWikirev:17105 TWikirev:17106
TargetRelease patch
ReleasedIn 4.2.1, 5.0.0
Edit | Attach | Watch | Print version | History: r39 < r38 < r37 < r36 < r35 | Backlinks | Raw View |  Raw edit | More topic actions
Topic revision: r39 - 2008-08-10 - GilmarSantosJr
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback