Word Documents

Paul Makepeace paul at makepeace.net
Thu Dec 8 08:51:39 GMT 2005


Sam Smith wrote:
> On Wed, 7 Dec 2005, Steve Mynott wrote:
>> On Tue, Dec 06, 2005 at 10:49:57PM +0000, Sam Smith typed:
>>> Does anyone know if there's a way to tell, from perl (on
>>> Unix) whether a word document has track changes turned on?
>>
>> Why don't you save a document without track changes and then with
>> track changes on and try a binary compare to work out the difference?
>>
>> (Although admittedly modern versions of Word documents always seem to
>> think they have been changed after opening and you may find several
>> binary changes).
> 
> I tried that, it didn't help.
> 
> I was hoping that it would be something like read byte X and
> jump to the offset stored in it. It isn't. Which is no
> surprise.

The reason it's unlikely to work is that Word's binary "format" is 
essentially a serialized blob of the in-memory representation of the 
document. (This, IIRC, led to some interesting side-effects like users 
having access to the undo history of other people's documents.)
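
For anyone who wants to repeat the binary compare suggested above, here
is a minimal sketch. It is in Python rather than Perl purely for
illustration, and the names diff_runs and diff_files are made up for
this example. It only reports runs of differing byte offsets between two
saves of the same document. It knows nothing about Word's layout, so it
cannot say which difference, if any, is the track-changes setting.

```python
from itertools import zip_longest


def diff_runs(a: bytes, b: bytes):
    """Return (start, end) offset pairs, end exclusive, where a and b differ.

    If one input is longer, the extra tail counts as a difference.
    """
    runs = []
    start = None
    # zip_longest pads the shorter input with None, which never equals a byte.
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            if start is None:
                start = i          # a new run of differing bytes begins
        elif start is not None:
            runs.append((start, i))  # the run ended just before offset i
            start = None
    if start is not None:
        # The run extends to the end of the longer input.
        runs.append((start, max(len(a), len(b))))
    return runs


def diff_files(path_a, path_b):
    """Read both files fully in binary mode and compare them byte by byte."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return diff_runs(fa.read(), fb.read())
```

Calling diff_files("off.doc", "on.doc") on two saves (hypothetical file
names) returns a list of offset ranges. A long list of scattered ranges
would be consistent with the "several binary changes" caveat quoted
above.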

Depending on how much time you have, you could spelunk the sources of 
OpenOffice, AbiWord, or Antiword, or ask on their developer lists.

Paul


More information about the london.pm mailing list