Subclassing HTML::Parser to support $p->include()

Andy Armstrong andy at hexten.net
Fri Feb 24 17:42:44 GMT 2006


I've just sent this to libwww at perl.org but I imagine someone here  
might have a bright idea :)

I'm using HTML::Parser as part of a templating system that parses
HTML-formatted templates and interprets certain special tags. I'd
like to be able to implement a tag like

  <include src="header.html" />

To do that I'd like to subclass HTML::Parser and add an include()  
method that can be called in a tag handler and has the effect of  
including a chunk of text in the parser's input. I need the included  
text to appear in the HTML stream that HTML::Parser sees right after  
the <include /> tag (so that the included text is in the right place).

My first thought is to provide a callback to $p->parse() that returns
the input text in chunks, breaking the text after each '>' - so that
the text immediately after each tag starts a new chunk. The
$p->include() method will tell the chunk-reading callback to read from
the included text up to EOF and then return to where it left off in
the original text (there'll be an include stack, of course, so that
nested includes work).
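
A minimal sketch of that chunk source, in plain Perl with no
HTML::Parser dependency so it can be run on its own. The names
split_after_tags() and make_chunk_source() are hypothetical, and the
loop at the bottom only simulates a tag handler calling include(); it
assumes the handler fires before the next chunk is requested, which is
exactly the behaviour in question.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into chunks that each end just
# after a '>' (a '>' inside an attribute value also ends a chunk,
# which only costs an extra chunk boundary).
sub split_after_tags {
    my ($text) = @_;
    return grep { length } split /(?<=>)/, $text;
}

# Hypothetical chunk source with an include stack. Returns two
# closures: $next yields the next chunk (undef once all input is
# exhausted) and $include pushes text to be read before the current
# source resumes.
sub make_chunk_source {
    my ($text) = @_;
    my @stack = ( [ split_after_tags($text) ] );

    my $next = sub {
        while (@stack) {
            my $chunks = $stack[-1];
            return shift @$chunks if @$chunks;
            pop @stack;    # this source hit EOF; resume the one below
        }
        return;
    };

    my $include = sub {
        my ($included) = @_;
        push @stack, [ split_after_tags($included) ];
    };

    return ( $next, $include );
}

# Simulated parse loop: pretend the <include /> handler runs as soon
# as the chunk holding that tag has been consumed.
my ( $next, $include ) =
    make_chunk_source('<p>a</p><include src="h.html" /><p>b</p>');
my @seen;
while ( defined( my $chunk = $next->() ) ) {
    push @seen, $chunk;
    $include->('<h1>H</h1>') if $chunk =~ /^<include\b/;
}
print join( '|', @seen ), "\n";
```

In the simulation the included text comes out immediately after the
<include /> chunk and before the rest of the original text. With the
real parser, $next would be the code reference handed to $p->parse(),
and $include would be called from the subclass's include() method.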

For that to work I have to rely on HTML::Parser issuing a tag
callback as soon as it sees the closing character of a tag - if it
reads ahead then it will have already digested text beyond the
<include /> tag by the time it issues the callback for the
<include /> tag.

I can easily check what the current behaviour is - and I shall - but  
there's no contract that I can see about the relationship between the  
text that HTML::Parser has read via a callback and when the handlers  
trigger. So even if it works now it could potentially change in the  
future - I don't want to rely on undocumented behaviour.

So, is what I'm proposing sensible? If not, is there a better way?
Assuming HTML::Parser currently behaves in the way I need it to, is it
likely always to do so?

Thanks :)

-- 
Andy Armstrong, hexten.net
