It seems that pasting nasty Microsoft stuff from Outlook into Wysiwyg can send Wysiwyg into permanent orbit, producing error log entries until the process is killed.
This is the error message:
save: Use of uninitialized value in pattern match (m//) at /var/www/twiki/lib/TWiki/Plugins/WysiwygPlugin/HTML2TML/Node.pm line 784., referer: http://ehc.comm.mot.com/twiki/bin/edit/Console/CM_HOWTO?cover=kupu&t=1142418171
Not being able to handle the garbage from the "evil empire" is one thing, but we should at least not go into an infinite loop.
I'm guessing from the document title that you might be able to share the file being imported, which is really essential to debugging this.
BTW that line can't be the reason for the infinite loop.
Note: please don't mark things as "patch" until I've at least had a chance to analyse the problem!
I cannot give you the exact copy/paste contents that caused the loop.
It was a user in Poland that tried to paste something from an email into the Wysiwyg window. He never succeeded in getting the topic saved. Instead he killed the browser - opened a new window - and made the text manually. And that saved topic is harmless correct TML which I can Wysiwyg edit and save without problems.
I do not expect you to fix the problem with handling the contents correctly.
What I hope for is some measures to avoid the looping - or at least limit it so nothing can run for days.
What I have is the error line, which makes up 99% of a 1.8 GB error_log generated in 1.5 days. We must assume that the text copied from the email was either full of rich-text "garbage" or was in HTML format. The function that is called again and again is obviously _handleDIV, and the questions worth tracing must be:
- Why is some variable uninitialized? Can this be avoided?
- Because the text contained a DIV that had no CLASS. Yes, it can be coded around.
- When it is uninitialized - why does the "parent" code keep on calling it?
- Because it is a warning, not an error.
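To illustrate the "coded around" answer above, here is a minimal sketch of guarding a pattern match against a missing attribute. It is not the actual Node.pm code: the helper name class_matches, the attribute hash layout, and the class name "example" are all hypothetical. It only shows the defined-check that stops Perl from emitting "Use of uninitialized value in pattern match (m//)" for a DIV without a CLASS.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from Node.pm): report whether a node's
# class attribute matches a pattern, treating a missing class as "no match".
sub class_matches {
    my ($attrs, $pattern) = @_;
    my $class = $attrs->{class};

    # A <div> with no class attribute yields undef here. Matching undef
    # against a regex is what triggers the warning, so bail out first.
    return 0 unless defined $class;

    return $class =~ $pattern ? 1 : 0;
}

# Usage: one DIV with a class, one without (both invented for illustration).
print class_matches({ class => 'example' }, qr/\bexample\b/), "\n";
print class_matches({}, qr/\bexample\b/), "\n";
```

Note that this only silences the warning. Since a warning does not stop execution, it would not by itself explain or cure a process that never finishes.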
I marked it as 'patch' because endless loops are a very bad thing and urgent to avoid. I would like to be able to go on vacation and not worry about the office TWiki server running crazy and flooding the disks in a few days.
The logic contains no potential or actual endless loops when transforming from HTML to TML. However, it uses the CPAN HTML parser, which may contain an endless loop (this is third-party code). Unfortunately, without any means of reproducing this, there's not much I can do, sorry.
Discarding, pending a test-case that reproduces the problem.
(Note that apache servers have built-in support for terminating processes that exceed resource limits. This is a more effective control measure than anything I could code in.)
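For comparison, an in-code limit could look something like the sketch below. This is an assumption-laden illustration, not something in the plugin: run_with_timeout is a hypothetical wrapper built on Perl's core alarm and eval. Perl delivers signals between its own operations, so a timeout like this may not interrupt a loop stuck inside compiled parser code, which is one reason server-level limits (for example Apache's RLimitCPU directive) are the more dependable control.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of WysiwygPlugin): run $code and give up
# after $seconds of wall-clock time. Returns ($result, undef) on success
# or (undef, 'timeout') if the limit was hit. Other errors are rethrown.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $result;

    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;          # arm the timer
        $result = $code->();
        alarm 0;                 # disarm on normal completion
        1;
    };
    alarm 0;                     # make sure no alarm is left pending

    if (!$ok) {
        my $err = $@ || 'unknown error';
        die $err unless $err eq "timeout\n";
        return (undef, 'timeout');
    }
    return ($result, undef);
}

# Usage: a stand-in for a conversion that finishes quickly.
my ($value, $error) = run_with_timeout(5, sub { return 'converted' });
print defined $error ? "failed: $error\n" : "ok: $value\n";
```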
As agreed on #twiki, I am changing this to waiting for feedback. I will try to provoke the error with some garbage over the next few days.
SVN 9566 (subject to confirmation)
I can confirm that this issue is resolved. I have tested it both on my home and on our office machine. No processes hanging and no entries in the error log when saving the same content that caused the problem.
Job well done
Unfortunately I have to reopen this bug.
Part of it may have been resolved but it is back.
Today, during a meeting, minutes were taken using Wysiwyg. The topic had been edited beforehand by quite a lot of people and was quite big. About 15 minutes in, the topic was saved, and the browser then just sat there for more than 15 minutes.
Again the error log was flooded, and by the same error as last time.
[Fri Mar 31 12:30:03 2006] [error] [client 188.8.131.52] [Fri Mar 31 12:30:03 2006] save: Use of uninitialized value in pattern match (m//) at /var/www/twiki/lib/TWiki/Plugins/WysiwygPlugin/HTML2TML/Node.pm line 784., referer: http://ehc.comm.mot.com/twiki/bin/edit/QMS/QmsReview20060331?cover=kupu&t=1143800564
So maybe the thing that triggered the last problem was resolved, but the basic issue is still unresolved. I probably have to deactivate the Wysiwyg plugin now, because people will lose their work again and again because of this bug.
The topic that caused it is confidential, so I will have to try to find the thing that triggered it. But I can say that after many Wysiwyg edit and save cycles the topic is totally full of odd HTML tags defining the same CSS classes many, many times. This plugin should not have been released as it works now.
Kenneth, what code are you running on? The code in TWiki4 branch doesn't have a pattern match on line 784 of Node.pm.
It is likely to be caused by the lack of the fix that just went in (SVN 9627) that unfortunately missed the release.
I will try with your latest improvement.
I will also try to reproduce it again. The topic I have is from before the problem, since the real version got lost because of the loop.
The MOT Copenhagen TWiki runs on 4.0.2. When the problem happened it was a 4.0.2 beta from 2 days before release, updated Wed 29 Mar. I do not have the SVN number here.
Did you check that change of Node.pm into the TWiki4 branch?
I do not see a difference between what I have in the release ZIP and what is on SVN.
I checked it in, and updated the release zip on plugins web. Are you still seeing the problem?
I have tried to reproduce this again but have not been able to.
So far we must assume that your fix worked. I put this in Waiting for release and patch so it goes in the change history for 4.0.3. I assume that is the right thing to do since there was a fix that missed the release and the plugin is part of the distribution.
Naturally I will reopen if it reappears. Maybe you can add the last SVN numbers.