XML is still a huge mess, but at least I have now managed to get a few programs working that can handle it with reasonable-ish memory requirements.
For Perl, as I had thought, the XML::Twig module gave me a pleasant interface and was able to easily handle the document.
For Haskell it was a little bit trickier. I used the SAX parser in HaXml, but it is not like a regular SAX parser, since Haskell is so unlike any regular language. Instead of calling handler callbacks as it goes, the parser returns a lazy list of SAX events, so I had to make sure I processed the list in a single pass, without holding on to a reference that would force the whole thing into memory.
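To illustrate the idea of consuming a lazy event list without keeping it all in memory, here is a minimal sketch. It does not use HaXml itself: the `Event` type, `countOpen`, and `events` are hypothetical stand-ins invented for this example, and the real HaXml event type has more constructors. The point is only that a strict left fold (`foldl'`) looks at one event at a time, so the already-consumed part of the list can be garbage collected as long as nothing else holds on to the head of the list.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for a SAX event type (not HaXml's actual type).
data Event
  = Open String   -- opening tag with its name
  | Close String  -- closing tag with its name
  | Text String   -- character data
  deriving (Show, Eq)

-- Count opening tags with the given name, one event at a time.
-- foldl' forces the running count at every step, so no chain of
-- unevaluated thunks builds up while the list is traversed.
countOpen :: String -> [Event] -> Int
countOpen name = foldl' step 0
  where
    step n (Open t) | t == name = n + 1
    step n _                    = n

-- A lazily generated event stream standing in for real parser output.
events :: Int -> [Event]
events k = concatMap record [1 .. k]
  where
    record i = [Open "item", Text (show i), Close "item"]

main :: IO ()
main = print (countOpen "item" (events 1000000))
```

The thing to avoid is using the same list twice (for example, calling `length` on it and then folding over it), because the second use keeps the entire list alive while the first one runs.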
Now that I’ve dealt with the memory issue, it appears that I have a speed issue to deal with next.