Welcome to Steve’s Webshed

Snip #4: An Event-Driven XML Parser


Configuration files are a necessity for most non-trivial programs. In Windows, configuration is usually stored either in ini files, or in the registry. For PictureSaver and RandomSaver, I used a combination of a pseudo-ini file (for the file lists, which can grow and become unmanageable in the registry) and the registry (for options of fixed sizes, like times and colors). At the time this seemed like a good idea. Then I started working on a new project.

This new project really needed something more hierarchical and flexible. It'd be nice if I could easily handle multiple users, import and export data, and still maintain human-readability. XML seemed a natural choice.

Now, there's certainly a lot of existing code out on the 'net for parsing XML, but where's the fun in using that? So I wrote my own. The idea of this parser is similar to expat, in that every time a tag is encountered, the parser generates an event. Unlike expat, my parser is written in C++ and makes use of object-oriented techniques.

This parser is intended to parse well-formed, canonical XML, and does NOT attempt to perform any validation against a DTD or schema. It will recognize tags and attributes, free text elements, comments, CDATA, and entities (like &amp;). I used the XML tutorial at W3Schools.com for most of this. I'm not going to try to teach XML here, just how to use the parser, so check them out for more details. Like everything else here in Code Snips, the parser is wrapped in the slug namespace (slug = Steve's Library of Useful Goodies), and makes use of the standard library.
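To make the entity handling concrete, here is a minimal sketch of decoding the five entities that XML predefines. decodeEntities is a hypothetical helper written for this illustration; it is not taken from the slug sources, and the real parser may well handle entities differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of slug): replaces the five predefined XML
// entities in a single pass; any other '&' sequence is copied unchanged.
std::string decodeEntities(const std::string &in)
{
    static const char *const names[]  = { "&amp;", "&lt;", "&gt;", "&quot;", "&apos;" };
    static const char        values[] = { '&',     '<',    '>',    '"',      '\'' };

    std::string out;
    std::string::size_type i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (int k = 0; k < 5 && !matched; ++k) {
                const std::string name(names[k]);
                // compare() clamps the range, so a short tail simply fails to match
                if (in.compare(i, name.size(), name) == 0) {
                    out += values[k];
                    i += name.size();
                    matched = true;
                }
            }
        }
        if (!matched)
            out += in[i++];
    }
    return out;
}
```

Because the input is scanned once from left to right, an already-decoded ampersand is never re-examined, so "&amp;lt;" comes out as the literal text "&lt;" rather than "<".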

Oh, and before reading on, you might want to download the files.

Using the slug::XMLParser Class

Using the XMLParser class is pretty easy. xmlparser.h defines two classes that we're interested in: ElementHandlerBase, and XMLParser.

ElementHandlerBase is an interface class that you must inherit from in order to use XMLParser. You register your ElementHandlerBase descendant with XMLParser, and when interesting things are detected in the XML stream, XMLParser calls one of your inherited methods. And that's it!

Here's the declaration for ElementHandlerBase:

class ElementHandlerBase
{
public:
    typedef std::pair<std::string, std::string> Attribute;
    typedef std::list<Attribute>                AttributeList;

    // these are required directives that MUST be implemented by the client

    virtual void startElement(const std::string &element, const AttributeList &attributes) = 0;
        // called when a start/beginning tag is encountered; the attribute list contains
        // any attributes within the tag, stored as name/value pairs

    virtual void endElement(const std::string &element) = 0;
        // called when an ending tag is encountered (e.g., </end>)

    virtual void textHandler(const std::string &text) = 0;
        // called for free-text, between a start/end tag pair

    // these are optional directives; if you don't implement them, then the command/tag
    // is silently absorbed

    virtual void xmlDeclaration(const std::string &element, const AttributeList &attributes) {}
    virtual void xmlCDATA(const std::string &data) {}
};

As you can see, there are three main functions you need to worry about: startElement, endElement, and textHandler. When a start tag is detected, startElement is called, with the name of the element and a list of attribute name/value pairs as parameters. Likewise, when an ending tag is detected, endElement is called with the name of the element. textHandler is called when free text between tags is detected.

Two other methods are worth noting: if you care about the XML declaration or about CDATA (free text that isn't parsed for entity references the way regular free text is; it could also be binary data), you can override xmlDeclaration or xmlCDATA, respectively.
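As a rough sketch of what a handler might look like, the example below derives a class from ElementHandlerBase and records each event as a line of text. LoggingHandler is a hypothetical name. The base class is repeated from the declaration above only so the sketch compiles without xmlparser.h, and the virtual destructor and the unnamed parameters are additions made for this sketch.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>
#include <vector>

namespace slug {
// Repeated from the declaration above so this sketch stands alone.
class ElementHandlerBase
{
public:
    typedef std::pair<std::string, std::string> Attribute;
    typedef std::list<Attribute>                AttributeList;

    virtual ~ElementHandlerBase() {}   // added for this sketch only
    virtual void startElement(const std::string &element, const AttributeList &attributes) = 0;
    virtual void endElement(const std::string &element) = 0;
    virtual void textHandler(const std::string &text) = 0;
    virtual void xmlDeclaration(const std::string &, const AttributeList &) {}
    virtual void xmlCDATA(const std::string &) {}
};
}

// Hypothetical handler: stores one log line per event it receives.
class LoggingHandler : public slug::ElementHandlerBase
{
public:
    std::vector<std::string> log;

    virtual void startElement(const std::string &element, const AttributeList &attributes)
    {
        std::string line = "start " + element;
        for (AttributeList::const_iterator it = attributes.begin(); it != attributes.end(); ++it)
            line += " " + it->first + "=" + it->second;   // name=value, in document order
        log.push_back(line);
    }

    virtual void endElement(const std::string &element) { log.push_back("end " + element); }
    virtual void textHandler(const std::string &text)   { log.push_back("text " + text); }

    // xmlDeclaration and xmlCDATA are left alone, so those events are silently absorbed.
};
```

A real handler would more likely build up a configuration structure than a log, but the shape is the same: keep whatever state you need as members and update it from the three required callbacks.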

Here's the declaration for the XML parser:

class XMLParserImpl;
class XMLParser
{
public:
    XMLParser();

    void parseStream(ElementHandlerBase *handler, std::istream &is);
        // parses the provided stream and provides XML notifications to the specified
        // handler; throws if handler is 0, or if an error (i.e., a non-XMLism) is detected
        // in the stream

private:
    std::auto_ptr<XMLParserImpl> impl;

    XMLParser(const XMLParser &disabled);
    XMLParser& operator=(const XMLParser &disabled);
};

Using this parser is pretty simple: create an XMLParser object, and call parseStream on it. The parameters to parseStream are a pointer to your ElementHandlerBase-derived object and a reference to an open input stream.
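The calling pattern can be sketched as follows. Since xmlparser.h is not reproduced here, ToyParser and Handler are hypothetical, heavily cut-down stand-ins (no attributes, comments, CDATA, or entities) that exist only so the example compiles on its own. With the real library you would use slug::XMLParser and slug::ElementHandlerBase instead. The exception types are also an assumption, since the declaration above says only that parseStream throws.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Cut-down stand-in for slug::ElementHandlerBase (assumption: no attributes).
struct Handler
{
    virtual ~Handler() {}
    virtual void startElement(const std::string &element) = 0;
    virtual void endElement(const std::string &element) = 0;
    virtual void textHandler(const std::string &text) = 0;
};

// Toy stand-in for slug::XMLParser: understands only <tag>, </tag>, and text.
class ToyParser
{
public:
    void parseStream(Handler *handler, std::istream &is)
    {
        if (!handler) throw std::invalid_argument("handler is 0");
        std::vector<std::string> open;   // names of elements not yet closed
        std::string text;
        char c;
        while (is.get(c)) {
            if (c != '<') { text += c; continue; }
            if (!text.empty()) { handler->textHandler(text); text.clear(); }
            std::string tag;
            // getline hits end-of-file only when the closing '>' is missing
            if (!std::getline(is, tag, '>') || is.eof() || tag.empty())
                throw std::runtime_error("unterminated tag");
            if (tag[0] == '/') {
                const std::string name = tag.substr(1);
                if (open.empty() || open.back() != name)
                    throw std::runtime_error("mismatched end tag: " + name);
                open.pop_back();
                handler->endElement(name);
            } else {
                open.push_back(tag);
                handler->startElement(tag);
            }
        }
        if (!open.empty()) throw std::runtime_error("unclosed element: " + open.back());
        // any text left after the last tag is dropped without an event
    }
};

// Hypothetical handler that records events in order.
struct Recorder : Handler
{
    std::vector<std::string> events;
    void startElement(const std::string &e) { events.push_back("start " + e); }
    void endElement(const std::string &e)   { events.push_back("end " + e); }
    void textHandler(const std::string &t)  { events.push_back("text " + t); }
};

// Returns true if the document parsed cleanly, false if the parser threw.
bool parseString(const std::string &xml, Handler &handler)
{
    std::istringstream in(xml);   // any std::istream works; a std::ifstream is typical
    ToyParser parser;
    try {
        parser.parseStream(&handler, in);
        return true;
    } catch (const std::exception &) {
        return false;
    }
}
```

Wrapping the call in a try block matters because a malformed document is reported by exception rather than by a return code, and the handler may already have received some events by the time the error is found.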

One note, here, about the implementation details: I'm using a std::auto_ptr to implement the pointer-to-implementation idiom (or handle-body idiom, whatever), but this isn't really ideal. I'd much rather use boost's scoped_ptr, but for some reason Borland C++Builder 5 thinks that XMLParserImpl must be completely defined at this point, which is false. If I ever track down the error I'll update the code accordingly.
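A possible explanation, offered as a guess rather than a diagnosis of the C++Builder error: the class above declares no destructor, so the compiler generates one wherever XMLParser is used, and at that point XMLParserImpl is still incomplete. boost::scoped_ptr deliberately refuses to compile a delete of an incomplete type, whereas std::auto_ptr accepts it even though deleting an incomplete type with a non-trivial destructor is undefined behavior. The usual remedy is to declare the destructor in the header and define it in the source file, after the implementation class. The sketch below uses the hypothetical names Widget and WidgetImpl, and std::unique_ptr (C++11) as a stand-in smart pointer, since it has the same complete-type requirement at the point of destruction.

```cpp
#include <cassert>
#include <memory>
#include <string>

int g_liveImpls = 0;   // counts live WidgetImpl objects, for demonstration only

// ---- what would live in the header ----
class WidgetImpl;      // forward declaration only; the type is incomplete here

class Widget
{
public:
    Widget();
    ~Widget();         // declared here, defined where WidgetImpl is complete
    std::string name() const;

private:
    std::unique_ptr<WidgetImpl> impl;   // stand-in for boost::scoped_ptr

    Widget(const Widget &);             // copying disabled, as in XMLParser
    Widget &operator=(const Widget &);
};

// ---- what would live in the .cpp file ----
class WidgetImpl
{
public:
    std::string name;
    WidgetImpl() : name("impl") { ++g_liveImpls; }
    ~WidgetImpl() { --g_liveImpls; }
};

Widget::Widget() : impl(new WidgetImpl) {}
Widget::~Widget() {}   // WidgetImpl is complete here, so the delete is well-defined
std::string Widget::name() const { return impl->name; }
```

The empty destructor body is the whole point: it forces the smart pointer's cleanup code to be generated in the one place where the implementation class is fully known.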

Some Notes About the Guts of slug::XMLParser

I'm not going to provide details about the guts; you can look at the code for those. However, some notes are warranted since I did some non-obvious things in the code, partly because it seemed to make sense, but mostly because I wanted to try a few things.

Future Plans

XMLParser isn't quite done yet. It does everything I need it to now, but there are a few things I plan to add:

Those are my plans for now, unless anybody has any requests. Don't forget to download the code. Enjoy!
