Phil Gyford

Writing

Thursday 9 March 2006

Movable Type's over-enthusiastic sanitisation

Here’s one for anyone from the future who Googles for Movable Type, MT, Convert Line Breaks, comments, comment, formatting, sanitize, sanitise, GlobalSanitizeSpec, line breaks, paragraphs, etc.

At the weekend I did an awful lot of messing around on this site, and around the same time annotations on Pepys’ Diary started looking very wonky. For some reason comments were no longer formatted with <p></p> tags. Then I realised it was the same across all sites on my MT installation. Strange.

All my sites still had the comment formatting set to “Convert Line Breaks”, as they always had, and that value was still stored correctly in the database. What was going on all of a sudden? Over the next few days I found bits of time to poke around, trying to work out where this was happening in the MT code, but my Perl skillz were far from up to it.

If I switched comment formatting to use the installed Markdown plug-in, the comments were formatted correctly, using Markdown’s formatting. But switch back to “Convert Line Breaks” and comments were back to very long single lines.

I was getting worried my weekend hacking, including some probably ill-advised MySQL queries direct on MT’s database, had screwed things up forever. Then I tried replacing my MT installation with fresh clean files and it worked! I went through it all file by file, eventually realising the problem was somewhere within mt-config.cgi. I went through this line by line until, finally, the culprit was caught: The GlobalSanitizeSpec setting, which I’d changed at the weekend.

If you want to use “Convert Line Breaks” on comments you must list the p and br/ tags in the GlobalSanitizeSpec setting.
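For illustration, a GlobalSanitizeSpec line in mt-config.cgi that keeps “Convert Line Breaks” working might look something like this. The tag list here is only an assumed example, not the one from my installation; the parts that matter are p and br/.

```
# mt-config.cgi (illustrative tag list; adjust to taste, but keep p and br/)
GlobalSanitizeSpec a href,b,i,br/,p,strong,em,ul,ol,li,blockquote,pre
```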

This seems counter-intuitive to me. I’d expect the sanitise setting to filter the user input. Then I’d expect the “Convert Line Breaks” setting (or whatever other formatting you’ve chosen) to take effect on display of the comment. But it seems that the sanitising takes place after MT has applied its own formatting. MT adds paragraph and line-break tags according to a display setting and then removes them as specified by a user-input filtering setting. Weird, and, dare I say it, er, wrong?
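The ordering described above can be modelled in a few lines. This is only a rough Python sketch of the behaviour as I observed it, not MT's actual Perl code: the function names convert_line_breaks and sanitize, and the tag sets, are made up for the example.

```python
import re

def convert_line_breaks(text):
    """Hypothetical stand-in for "Convert Line Breaks": wrap blocks separated
    by blank lines in <p>...</p> and turn single newlines into <br />."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p]
    return "\n\n".join(
        "<p>" + p.replace("\n", "<br />\n") + "</p>" for p in paragraphs
    )

def sanitize(html, allowed_tags):
    """Hypothetical stand-in for the sanitiser: drop any tag whose name is
    not in allowed_tags, keeping the text between the tags."""
    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in allowed_tags else ""
    return re.sub(r"</?\s*([A-Za-z][A-Za-z0-9]*)[^>]*>", keep_or_drop, html)

comment = "First paragraph.\n\nSecond paragraph,\nsecond line."

# Formatting is applied first...
formatted = convert_line_breaks(comment)

# ...and sanitising runs afterwards, so a spec without p and br
# strips out the tags that the formatter has just added.
stripped = sanitize(formatted, {"a", "b", "i"})

# With p and br allowed, the formatting survives.
kept = sanitize(formatted, {"a", "b", "i", "p", "br"})

print(stripped)
print(kept)
```

With the narrower tag set the comment comes out with no paragraph or line-break tags at all, which matches the symptom; with p and br allowed the formatted version passes through untouched.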

Anyway, Future Googler, I hope that was what you were looking for, and that I’ve helped you work out why MT had suddenly started messing up your weblog.

Comments

Global sanitizing? Ye gods! Will these formatters stop at nothing?

Trebuchet looks so handsome onscreen one wonders it's not used more often---doesn't have that dead look of so many sans serifs.

Posted by Bradford on 9 March 2006, 10:30 pm

They tried sanitizing the Globe in Shakespeare's time. Burnt it down instead. Surely not a lesson, just a coincidence...

Posted by slangist on 13 March 2006, 8:52 pm