Using
top
, I've observed that the apache processes seem to grow over time, from 25MB initially (I compile almost everything in the mod_perl startup file) to about 100MB. So far I have worked around by letting each process do only a limited number of requests, because I was convinced that the problem is in my self-compiled add-on modules to Apache.
Yesterday I stumbled over a circular reference between the
TWiki
object (mostly called
$session
) and the
TWiki::Prefs
object - and on a closer look there are more of that sort.
Perl's garbage collection is known to
not break these things, so memory is piling up because the objects can't be freed.
There's a simple script to reproduce the problem (resembling to what mod_perl does). Place the following in your
bin
directory as
view_loop
and start from the command line with
./view_loop 50 >/dev/null
(change the number of loops if you want, it defaults to 100 views). Watch with
top
.
#!/usr/bin/perl -w
my $view = '';
my $n_loops = $ARGV[0] || 100;
my $i_loops = 0;
{
open VIEW, './view' or die "Couldn't open view script: '$!'";
local $/ = undef;
$view = <VIEW>;
close VIEW;
}
eval "sub handler { $view }";
die "Couldn't compile handler: '$@'" if $@;
while ($i_loops++ < $n_loops) {
handler();
}
I am really surprised that I'm the first to spot this. Or maybe I'm hallucinating?
(Setting this to "Actioning" because I'm going to work on that the next days. Feedback about other suspicious things to look for is welcome, though)
No, you're not hallucinating. I knew about the circular refs, but never worried about them, because I thought the perl GC was smart enough to detect disconnected subgraphs (and I'm frankly slightly amazed that it isn't)
CC
Say
perldoc perlobj
to your computer and look for the heading "Two-Phased Garbage Collection".
Just to give a figure: In my rather small installation (some hundreds of topics, 25 registered users) every request took about 280kB extra RAM.
But anyway - here's the patch which allowed my installation to take several hundreds of
view
requests in one process, either with the test program or with apache/mod_perl, without taking up more RAM:
Index: lib/TWiki.pm
===================================================================
--- lib/TWiki.pm (revision 9912)
+++ lib/TWiki.pm (working copy)
@@ -1327,15 +1327,23 @@
---++ ObjectMethod finish
Complete processing after the client's HTTP request has been responded
-to. Right now this only entails one activity: calling TWiki::Client to
-flushing the user's
-session (if any) to disk.
+to. Right now this does two things:
+ 1 calling TWiki::Client to flushing the user's session (if any) to disk,
+ 2 breaking circular references to allow garbage collection in persistent
+ environments
=cut
sub finish {
my $this = shift;
$this->{client}->finish();
+
+ # no candidate for a beauty contest, but way easier than Prefs::finish
+ my $prefs = $this->{prefs};
+ @{$prefs->{PREFS}} = ();
+
+ # and finally get rid of the _contents_ of the TWiki object in one sweep
+ %$this = ();
}
=pod
I have not yet rigorously tested with other actions than
view
, and there have been only few
edit
actions in my installation today. Maybe I've missed another circular reference - but I can add that later.
I'd like to suggest that my sample program is added to the test suite. After all, there are some mod_perl installations around.
If I recall correctly, Sven has reported that mod_perl didn't help him much with regard of performance. Given his 15000 users (which would take another 300-400kB pile up per request) I could expect that: It is fresh memory for each request, defeating all attempts of intelligent swapping.
And I wonder whether some of the irreproducible results with mod_perl we had during Dakar beta testing have been related to TWiki hitting some RAM boundaries...
--
TWiki:Main.HaraldJoerg
perldoc perlobj
.... "a more complete garbage collection strategy will be implemented at some point in the future". They actually released this language with a brain-dead GC.
OK, yes, please do add it to the test suite. I
think I only used the session object in all subobjects (I derived them using mod_perlize, after all) but we obviously need some way to detect leakage. From what I recall of the code, I have a horrible feeling that CGI::Session will also leak.
I think Sven has other performance issues. He is re-covering ground that Rafael and I walked over last year. Until he has thrashed through the obvious issues with his platform, I am taking his reports with a pinch of salt. But yes, this could be a cause of this and other issues; I have seen similar issues running Cairo under
SpeedyCGI, which has a similar persistance model.
CC
Can't confirm a leakage of CGI::Session right now. The
CGI::Session::File
object is destroyed during the call to finish (OTOH, there have been various changes to CGI::Sessions recently, will have to check). ...
After having performed
TWiki:Codev.HowToTrackDownCircularReferences on another TWiki, I've found more circular references. In total, these are:
- from the TWiki object to its various sub-objects, as fixed by the patch above.
- from TWiki::Prefs to TWiki::PrefsCache objects. The patch above fixes this only partially, it fails to clear the
TEXT
and WEBS
substructure.
- in TWiki::Users, from the wikiname to the login structure. The patch above fails to fix these.
I'll attach my current version of the patch (
https://develop.twiki.org/pub/Bugs/Item2158/Item2158.diff), for easier update. For 1000 requests with
view_loop
, SVN r9917 without the patch shows a 280kB RAM increase per request up to almost 300MB, while with the patch memory increases from 14380kB to 14428kB. Maybe the patch is still incomplete.
--
TWiki:Main.HaraldJoerg
I was just about to check in your fix on SVN when Sven tells me that you have had SVN commit access since November. Did you know that?
Then I think you shall have the honour of checking in your patch yourself
I tried without the patch and after loading a test topic with ab 2400 times the machine ran out of resources. Clearly we have a memory leak which is a killer with mod_perl. After implementing the patch I ran ab with 17000 page loads and no leaking could be seen. So a great improvement. This patch should be checked in to both DEVELOP and TWiki4. If it is incomplete it is for sure a great improvement and we should start testing with it to ensure that nothing else breaks.
Changed classification to patch since this now just miss the SVN checkin.
--
KJL
Now I know that I have commit access
(and changed status to 'Waiting for release')
--
TWiki:Main.HaraldJoerg
I re-engineered the solution a bit, because i want the option of adding a cleanup method in other classes, and I
don't like the idea of poking private variables.
CC
With all due respect: my patch was intended to fix the bug (which it does) and not to re-engineer. I fully agree with your solution for DEVELOP, since there we 1) might easily run into the challenge of adding cleanup methods in other classes and 2) we have enough time to test and tweak it and 3) I don't like to poke in private variables as well.
But you've checked in for TWikiRelease04x00
only.
In TWikiRelease04x00 we should have bug fixes and
only bug fixes. For bug fixes I can hardly see the need to rename hash keys, and to have the option for a cleanup method in other classes. Ripping out another structure in
TWiki::finish
(whether private or not) will always be easier in case my patch missed something.
Bug fixing, in my experience, is rather like assassinating. If you want to squash a bug, be sure that it is dead before you leave, and take care not to spread blood all over the place (your patch is touching six files instead of one).
That said, I would not have been so upset if your patch wouldn't miss a data structure which is 1) handled by my patch, and 2) documented above. I wrote that I had to check on another TWiki installation (because on vanilla SVN my first shot has been apparently sufficient) to which I unfortunately don't have access right now. I sat the whole evening to try to reconstruct
why I had to hack through the
users
hash. In hindsight it is pretty easy: log in with an user id which is member of a group. Group points to members, member points to group. Gotcha.
So I'd suggest to keep my original patch for TWikiRelease04x00 and tweak your solution for DEVELOP to include TWiki::Disposable into TWiki::Users and maybe TWiki::User or TWiki::Users::TWikiUserMapping or whatever.
Erm - or is this what you intend to do since you've set the status to "Actioning"? If so, forget what I've written here...
-- Cheers,
TWiki:Main.HaraldJoerg
MD got hit by internal API changes:
Item2234
No, you are quite right, and I consider myself appropriately chastened.
What drove me to make this change in
TWikiRelease04x00 is actually history. In my experience with TWiki, once a patch goes in, then it rarely gets re-engineered. Patches linger, the smells pile up, and no-one can remember why they were done. So I try to do things "right first time" rather than patching and trying to recover later.
But you are quite right; this is the new order, patches
will be followed up with appropriate refactoring, and my fix should have gone into DEVELOP and not
TWikiRelease04x00.
I reverted the changes I made, and am checking into DEVELOP.
CC
I'm sorry, making a bug fix is no excuse for making worse code.
Commiting a version that at least tries to respect object boundaries.
-- SD
Close, but no banana.
This is the second time that someone tries to beautify my fix. Feel free to do so over and over again, it is a valuable exercise in Perl and object oriented programming.
But it would be nice if you actually ran the appropriate tests whether your code is really
better than mine before committing with regards to "correctness". Crawford missed to properly untangle the users/groups circle, while your "improvement" leaves the Prefs/PrefsCache circle intact. I only had to run the checks documented in
TWiki:Codev.HowToTrackDownCircularReferences to discover that.
How come? When copying/modifying you did a general replace of
$prefs
by
$this
. However, in the inner loop, there's an inner variable which is also called
$prefs
and which must
not be replaced by
$this
. You can blame me for writing misleading code, I myself would do so, but alas - it
works.
Be assured: I
need this fixed because I'm running my TWiki on a small machine where mod_perl is essential to keep response time acceptable. I'll mercilessly review every "fix" of my patch, though I'd prefer I wouldn't have to.
Sigh
--
TWiki:Main.HaraldJoerg
pffft,
sigh, oh, you did commit it, cool -- SD
Hey Harald - would tools/MemoryCycleTests.pl reliably show up the problem? I'd prefer it to be a unit test, but i failed initially -- SD
Cool! I haven't found
Devel::Monitor
when I've been looking for something from CPAN to spot cycles. Doesn't work on cygwin, but who cares - my production is Linux... and it needs only to be done once... --
TWiki:Main.HaraldJoerg