UNISA Chatter – Design patterns in C++ Part 7: Parsing XML using QT

See UNISA – Summary of 2010 Posts for a list of related UNISA posts. Continued from UNISA Chatter – Design patterns in C++ Part 6: Widgets … Validation and Regular Expressions using QT.

IMPORTANT POINT: It is important to emphasize that the intent of these posts is to share my learnings as I dig through the last three subjects of my part-time UNISA studies. The posts by no means promote concepts or technologies … they are pure information sharing for fellow students … although they do highlight that we explore more technologies and concepts than we typically prefer :)

I am really looking forward to May, because I will be getting a chance to roll up my sleeves and jump into .NET 4 and C#, using Visual Studio 2010. While I am enjoying the change in technology as part of my part-time studies, I really miss the Visual Studio 2010 IDE and debugger when I work on my assignments :|

XML … my pet hate

  • XML is a class of file format that is understandable and editable by humans … really?
  • XML = eXtensible Markup Language
  • XML Document has:
    • <TAG attributes>element</TAG>

QT and Environment Summary

The following is a summary of findings as I worked through the course-related book. See the summary post for details on the book.

Terminology | Description | Example
SAX2 Parser | Simple API for XML; offers an event-driven way of parsing XML. | See fileHandler.h below.
DOM | Higher-level interface allowing us to digest an XML document as objects in a hierarchical tree structure. |
QXmlContentHandler | An abstract base class, which defines the methods used to signal events: startDocument, endDocument, startElement, endElement, characters. |
QXmlDefaultHandler | A concrete class that implements the above methods as empty do-nothing methods. When we inherit from this class, we can override one or more of the methods as needed. |
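
To make the relationship between the two handler classes a little more concrete, here is a minimal sketch in plain C++ with no Qt dependency. ContentHandler, DefaultHandler, ElementCollector, fireSampleEvents and collectSampleNames are hypothetical names used purely for illustration; they mirror the idea behind QXmlContentHandler and QXmlDefaultHandler but are not the Qt classes, and the real Qt callbacks take more parameters than shown here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for QXmlContentHandler: an abstract interface
// whose pure virtual methods are the parsing events.
class ContentHandler {
public:
    virtual ~ContentHandler() = default;
    virtual bool startDocument() = 0;
    virtual bool endDocument() = 0;
    virtual bool startElement(const std::string& name) = 0;
    virtual bool endElement(const std::string& name) = 0;
    virtual bool characters(const std::string& text) = 0;
};

// Hypothetical stand-in for QXmlDefaultHandler: every callback is an
// empty do-nothing method that returns true ("carry on parsing").
class DefaultHandler : public ContentHandler {
public:
    bool startDocument() override { return true; }
    bool endDocument() override { return true; }
    bool startElement(const std::string&) override { return true; }
    bool endElement(const std::string&) override { return true; }
    bool characters(const std::string&) override { return true; }
};

// We inherit from the default handler and override only the one
// callback we care about: remembering element names as they start.
class ElementCollector : public DefaultHandler {
public:
    bool startElement(const std::string& name) override {
        names.push_back(name);
        return true;
    }
    std::vector<std::string> names;
};

// A fake "parser" that fires the events a SAX parser would report for
// <book><title>Qt</title></book>. It stops at the first callback
// that returns false.
bool fireSampleEvents(ContentHandler& handler) {
    return handler.startDocument()
        && handler.startElement("book")
        && handler.startElement("title")
        && handler.characters("Qt")
        && handler.endElement("title")
        && handler.endElement("book")
        && handler.endDocument();
}

// Convenience wrapper: run the fake parser and return what was collected.
std::vector<std::string> collectSampleNames() {
    ElementCollector collector;
    const bool ok = fireSampleEvents(collector);
    assert(ok);  // the inherited do-nothing callbacks all return true
    (void)ok;
    return collector.names;
}
```

Calling collectSampleNames() yields the element names in document order (book, then title). The point of the default handler is visible in ElementCollector: four of the five callbacks are inherited untouched.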

Should we use DOM or SAX?

Using the DOM requires the entire document to be loaded into memory, which makes it a challenge with bigger XML files. SAX can write or dispose of processed data while continuing with parsing and processing. If you are looking for easy navigation, go DOM; however, if you have to process large files, go SAX.
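
The memory trade-off can be sketched without Qt. The following is a toy illustration in plain C++, not a real XML parser and not how Qt implements either API. scanTags, countElementsStreaming, Node and buildTree are hypothetical names, and the scanner only understands bare <name> and </name> tags (no attributes, comments or error handling). The SAX-style function keeps a single counter, while the DOM-style function keeps every element in a tree.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Toy scanner: calls onTag(name, isStart) for each tag as soon as it
// has been read from the stream. Character data is skipped.
void scanTags(std::istream& in,
              const std::function<void(const std::string&, bool)>& onTag) {
    char c;
    while (in.get(c)) {
        if (c != '<') continue;        // not a tag, ignore it
        std::string tag;
        std::getline(in, tag, '>');    // read up to the closing '>'
        if (tag.empty()) continue;
        const bool isStart = tag[0] != '/';
        onTag(isStart ? tag : tag.substr(1), isStart);
    }
}

// SAX style: react to each event and keep only a counter, so memory
// use does not grow with the size of the document.
int countElementsStreaming(std::istream& in) {
    int count = 0;
    scanTags(in, [&count](const std::string&, bool isStart) {
        if (isStart) ++count;
    });
    return count;
}

// DOM style: every element becomes a node that stays in memory.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> buildTree(std::istream& in) {
    auto root = std::make_unique<Node>();   // synthetic document node
    std::vector<Node*> open{root.get()};    // stack of open elements
    scanTags(in, [&open](const std::string& name, bool isStart) {
        assert(!open.empty());              // the root is never popped
        if (isStart) {
            auto child = std::make_unique<Node>();
            child->name = name;
            Node* raw = child.get();
            open.back()->children.push_back(std::move(child));
            open.push_back(raw);
        } else if (open.size() > 1) {
            open.pop_back();                // close the current element
        }
    });
    return root;
}
```

With the tree you can navigate freely afterwards (for example root->children[0]->children), which is the "easy navigation" argument for DOM. With the streaming version nothing is left to navigate once parsing is done, which is exactly why it copes with large files.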

As part of the assignment I implemented a QXmlDefaultHandler, experimenting with the various callbacks. While it all works pretty slick, I did encounter a few odd behaviours, which I had to code around … but then that’s half the fun.

The fileHandler.h file I ended up with:

    #ifndef FILEHANDLER_H
    #define FILEHANDLER_H

    // QString members are stored by value, so the full definition is needed.
    #include <QString>
    #include <QXmlDefaultHandler>

    // Only pointers to these are stored, so forward declarations suffice.
    class QFile;
    class QTextStream;

    // SAX handler that overrides the QXmlDefaultHandler callbacks of interest.
    class FileHandler : public QXmlDefaultHandler {
      public:
        FileHandler(QString outputFileName, QString elementName);
        ~FileHandler();

        // Callbacks invoked by the parser; returning false stops parsing.
        bool startDocument();
        bool endDocument();
        bool startElement( const QString & namespaceURI,
                           const QString & localName,
                           const QString & qName,
                           const QXmlAttributes & atts);
        bool characters(const QString& text);
        bool endElement( const QString & namespaceURI,
                         const QString & localName,
                         const QString & qName );
      private:
        bool            m_startElement;
        bool            m_startData;
        bool            m_startTag;
        QFile*          m_file;
        QTextStream*    m_out;
        QString         m_elementName;
        QString         m_fileName;
    };

    #endif // FILEHANDLER_H

You can also get a quick reference poster here, which summarises some of the XML concepts:
[Image: XML quick reference poster]

… next will be reflective programming, more patterns and the MVC pattern. However, as I will be starting my preparations for the May examinations, I will put this subject and the associated posts on the backburner until June.