TBXML V1.2

Documentation

Last Updated:
11 November 2009

Download:
TBXML-V1.2.zip
XMLBooks-V1.2.zip

Editors:
Tom Bradley


Table of Contents

  1 Introduction
    1.1 Origin and Goals
  2 Programming Reference
    2.1 Files
    2.2 Structures
    2.3 Methods
  3 Usage
    3.1 Inclusion into project
    3.2 Loading an XML document
    3.3 Extracting Elements
    3.4 Extracting Attributes
    3.5 Extracting Element Text
    3.6 Traversing Unknown elements / attributes
  4 Samples
    4.1 Sample XML File
    4.2 Sample code for parsing a known XML layout
    4.3 Sample code for parsing an unknown XML layout
  5 Licence
  6 Version History
    6.1 Version 1.0 - Original Version
    6.2 Version 1.1 - Bug Fixes
    6.3 Version 1.2 - Bug Fixes / New Features
  7 Known Bugs
    7.1 EXC_BAD_ACCESS When using initWithXMLString or initWithURL


1 Introduction

TBXML is a lightweight XML document parser written in Objective-C, designed for use on Apple iPhone / iPod Touch devices. TBXML aims to provide the fastest possible XML parsing whilst using the fewest resources. This efficiency is achieved at the expense of XML validation and modification. It is NOT possible to modify a document or generate valid XML from a TBXML object, and NO validation is performed whilst importing and parsing an XML document.

1.1 Origin and Goals

TBXML is developed and maintained by Tom Bradley.

The design goals for TBXML are:

  1. XML files conforming to the W3C XML spec 1.0 should be parsable.

  2. XML parsing should consume the fewest possible resources.

  3. XML parsing should be achieved in the shortest possible time.

  4. It shall be easy to write programs that utilise TBXML.

This specification provides all the information necessary to understand TBXML Version 1.2 and to write programs that use it.

This version of the TBXML specification may be distributed freely, as long as all text and legal notices remain intact.

2 Programming reference

2.1 Files

Files included with TBXML are:

  1. TBXML.h - Header file containing defines / structures and class definition.

  2. TBXML.m - Implementation file containing the TBXML class implementation and private interface.

  3. NSDataAdditions.h - Header file containing definition of NSData categories.

  4. NSDataAdditions.m - Implementation file containing NSData categories for decoding gzip & base64.

2.2 Structures

  1. TBXMLElement

    The TBXMLElement structure holds information about a single XML element. The structure holds the element name and text along with pointers to the first attribute, the parent element, the first and current child elements, and the next and previous sibling elements. Using this structure, we can create a linked list of TBXMLElements to map out an entire XML file.

    typedef struct _TBXMLElement {
        char * name;
        char * text;
        TBXMLAttribute * firstAttribute;
        struct _TBXMLElement * parentElement;
        struct _TBXMLElement * firstChild;
        struct _TBXMLElement * currentChild;
        struct _TBXMLElement * nextSibling;
        struct _TBXMLElement * previousSibling;
    } TBXMLElement;
  2. TBXMLAttribute

    The TBXMLAttribute structure holds information about a single XML attribute. The structure holds the attribute name, its value and a pointer to the next sibling attribute. This structure allows us to create a linked list of the attributes belonging to a specific element.

    typedef struct _TBXMLAttribute {
        char * name;
        char * value;
        struct _TBXMLAttribute * next;
    } TBXMLAttribute;
  3. TBXMLElementBuffer

    The TBXMLElementBuffer is a structure that holds a buffer of TBXMLElements. When the buffer of elements is used up, an additional buffer is created and linked to the previous one. This allows for efficient memory allocation/deallocation of elements.

    typedef struct _TBXMLElementBuffer {
        TBXMLElement * elements;
        struct _TBXMLElementBuffer * next;
        struct _TBXMLElementBuffer * previous;
    } TBXMLElementBuffer;
  4. TBXMLAttributeBuffer

    The TBXMLAttributeBuffer is a structure that holds a buffer of TBXMLAttributes. When the buffer of attributes is used up, an additional buffer is created and linked to the previous one. This allows for efficient memory allocation/deallocation of attributes.

    typedef struct _TBXMLAttributeBuffer {
        TBXMLAttribute * attributes;
        struct _TBXMLAttributeBuffer * next;
        struct _TBXMLAttributeBuffer * previous;
    } TBXMLAttributeBuffer;
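
To make the buffer chaining concrete, the sketch below shows one way a pool of TBXMLElements could grow by linking a new buffer to the previous one whenever the current buffer is full. It is plain C (which also compiles as Objective-C) and is not taken from TBXML.m: the capacity constant, the SketchElementPool type and the sketch_ functions are hypothetical names used only for illustration. The structure definitions are repeated from the list above so the block stands alone; in a real project you would include TBXML.h instead.

```c
#include <stdlib.h>

/* Structures as listed in section 2.2 (normally provided by TBXML.h). */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

typedef struct _TBXMLElement {
    char * name;
    char * text;
    TBXMLAttribute * firstAttribute;
    struct _TBXMLElement * parentElement;
    struct _TBXMLElement * firstChild;
    struct _TBXMLElement * currentChild;
    struct _TBXMLElement * nextSibling;
    struct _TBXMLElement * previousSibling;
} TBXMLElement;

typedef struct _TBXMLElementBuffer {
    TBXMLElement * elements;
    struct _TBXMLElementBuffer * next;
    struct _TBXMLElementBuffer * previous;
} TBXMLElementBuffer;

/* Hypothetical: how many elements each buffer holds. */
#define SKETCH_BUFFER_CAPACITY 4

/* Hypothetical bookkeeping for the chain of buffers. */
typedef struct {
    TBXMLElementBuffer * current;  /* most recently created buffer */
    size_t used;                   /* elements handed out from current */
    size_t bufferCount;            /* buffers created so far */
} SketchElementPool;

/* Allocate a zeroed buffer and link it after the previous one. */
static TBXMLElementBuffer * sketch_new_buffer(TBXMLElementBuffer * previous) {
    TBXMLElementBuffer * buffer = calloc(1, sizeof *buffer);
    if (!buffer) return NULL;
    buffer->elements = calloc(SKETCH_BUFFER_CAPACITY, sizeof(TBXMLElement));
    if (!buffer->elements) {
        free(buffer);
        return NULL;
    }
    buffer->previous = previous;
    if (previous) previous->next = buffer;
    return buffer;
}

/* Hand out the next free element, chaining a new buffer when the current one is full. */
static TBXMLElement * sketch_next_element(SketchElementPool * pool) {
    if (!pool->current || pool->used == SKETCH_BUFFER_CAPACITY) {
        TBXMLElementBuffer * buffer = sketch_new_buffer(pool->current);
        if (!buffer) return NULL;
        pool->current = buffer;
        pool->used = 0;
        pool->bufferCount++;
    }
    return &pool->current->elements[pool->used++];
}

/* Free every buffer by walking the chain backwards from the newest one. */
static void sketch_free_pool(SketchElementPool * pool) {
    TBXMLElementBuffer * buffer = pool->current;
    while (buffer) {
        TBXMLElementBuffer * previous = buffer->previous;
        free(buffer->elements);
        free(buffer);
        buffer = previous;
    }
    pool->current = NULL;
    pool->used = 0;
    pool->bufferCount = 0;
}
```

With this layout, releasing a document only requires walking the short chain of buffers rather than freeing each element individually, which is presumably the kind of saving the descriptions above refer to.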

2.3 Methods

  1. - (id)initWithXMLFile:(NSString*)aXMLFile fileExtension:(NSString*)aFileExtension

    Instantiates a TBXML object and parses the specified file. The following code instantiates a TBXML object and parses books.xml:

    TBXML * tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];
  2. - (id)initWithXMLString:(NSString*)aXMLString

    Instantiates a TBXML object and parses the specified XML string. The following code instantiates a TBXML object and parses the given XML:

    tbxml = [[TBXML alloc] initWithXMLString:@"<root><elem1 attribute1=\"elem1-attribute1\"/><elem2 attribute2=\"attribute2\"/></root>"];
  3. - (id)initWithXMLData:(NSData*)aData

    Instantiates a TBXML object and parses the XML held in the specified NSData object. The following code instantiates a TBXML object and parses an XML document held in an NSData object:

    TBXML * tbxml = [[TBXML alloc] initWithXMLData:myXMLData];
  4. - (id)initWithURL:(NSURL*)aURL

    Instantiates a TBXML object, then downloads and parses the XML file at the given URL. The following code instantiates a TBXML object and parses the note.xml file from w3schools.com:

    tbxml = [[TBXML alloc] initWithURL:[NSURL URLWithString:@"http://www.w3schools.com/XML/note.xml"]];
  5. - (TBXMLElement*) childElementNamed:(NSString*)aName parentElement:(TBXMLElement*)aParentXMLElement

    Retrieves the first child element of the specified name from the given parent element. The following code retrieves the first author element from the root element:

    TBXMLElement * author = [tbxml childElementNamed:@"author" parentElement:rootXMLElement];
  6. - (TBXMLElement*) nextSiblingNamed:(NSString*)aName searchFromElement:(TBXMLElement*)aXMLElement

    Retrieves the next sibling with the specified name, starting from the given element. The following returns the next "author" element after the given author element:

    TBXMLElement * nextAuthor = [tbxml nextSiblingNamed:@"author" searchFromElement:author];
  7. - (NSString*) valueOfAttributeNamed:(NSString *)aName forElement:(TBXMLElement*)aXMLElement

    Returns an NSString containing the value of the specified attribute belonging to the given element. The following returns the name attribute from the given author element:

    NSString * authorName = [tbxml valueOfAttributeNamed:@"name" forElement:authorElement];
  8. - (NSString*) textForElement:(TBXMLElement*)aXMLElement

    Returns an NSString containing the text for the specified element. The following returns the text from the book element:

    NSString * bookDescription = [tbxml textForElement:bookElement];
  9. - (NSString*) elementName:(TBXMLElement*)aXMLElement;

    Returns an NSString containing the element name for the specified element.

    NSString * elementName = [tbxml elementName:element];
  10. - (NSString*) attributeName:(TBXMLAttribute*)aXMLAttribute;

    Returns an NSString containing the attribute name for the specified attribute.

    NSString * attributeName = [tbxml attributeName:attribute];
  11. - (NSString*) attributeValue:(TBXMLAttribute*)aXMLAttribute;

    Returns an NSString containing the attribute value for the specified attribute.

    NSString * attributeValue = [tbxml attributeValue:attribute];

3 Usage

3.1 Inclusion into project

To use TBXML, add its 4 files to your project.

  1. In Xcode, right click your project file and select "New Group". Type TBXML as the group name.
  2. Right click the TBXML group and select "Add" then "Existing Files".
  3. Find and select the 4 files (TBXML.h, TBXML.m, NSDataAdditions.h, NSDataAdditions.m). Ensure the "Copy items into destination group's folder (if needed)" checkbox is ticked. This ensures a copy of TBXML stays with the project.
  4. Locate the Targets node in the group tree under your project. Click the arrow to expand and right click your project's target file. Select "Get Info" and navigate to the "General" tab. Click the plus symbol at the bottom of the window to add a linked library. From the list, select "libz.dylib". You can now close this info window.

3.2 Loading an XML document

To load an XML file, instantiate a TBXML object and supply the XML file to parse.

TBXML * tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];

Or instantiate a TBXML object and supply the XML string to parse.

TBXML * tbxml = [[TBXML alloc] initWithXMLString:@"<root><elem1 attribute1=\"elem1-attribute1\"/><elem2 attribute2=\"attribute2\"/></root>"];

Or instantiate a TBXML object and supply the NSData object containing XML data to parse.

TBXML * tbxml = [[TBXML alloc] initWithXMLData:myXMLData];

Or instantiate a TBXML object and supply the URL of an XML document to retrieve and parse.

TBXML * tbxml = [[TBXML alloc] initWithURL:[NSURL URLWithString:@"http://www.w3schools.com/XML/note.xml"]];

You can obtain the root node of the parsed XML document by accessing TBXML's property "rootXMLElement".

TBXMLElement * rootXMLElement = tbxml.rootXMLElement;

3.3 Extracting elements

The "childElementNamed: parentElement:" method allows you to search for a child element with a given name. The following returns the first "author" element from the document root.

TBXMLElement * author = [tbxml childElementNamed:@"author" parentElement:root];
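
A named-child search of this kind can be pictured as a walk along the firstChild and nextSibling links described in section 2.2. The following plain C sketch (also valid Objective-C) illustrates the idea under that assumption; it is not TBXML's actual implementation. The functions sketch_child_named and sketch_next_sibling_named are hypothetical helpers, and the structures are repeated from section 2.2 only so that the block compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* Structures as listed in section 2.2 (normally provided by TBXML.h). */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

typedef struct _TBXMLElement {
    char * name;
    char * text;
    TBXMLAttribute * firstAttribute;
    struct _TBXMLElement * parentElement;
    struct _TBXMLElement * firstChild;
    struct _TBXMLElement * currentChild;
    struct _TBXMLElement * nextSibling;
    struct _TBXMLElement * previousSibling;
} TBXMLElement;

/* Hypothetical helper: first child of parent whose name matches exactly, or NULL. */
static TBXMLElement * sketch_child_named(const char * name, TBXMLElement * parent) {
    if (!name || !parent) return NULL;
    for (TBXMLElement * child = parent->firstChild; child; child = child->nextSibling) {
        if (child->name && strcmp(child->name, name) == 0) return child;
    }
    return NULL;
}

/* Hypothetical helper: next sibling after element whose name matches exactly, or NULL. */
static TBXMLElement * sketch_next_sibling_named(const char * name, TBXMLElement * element) {
    if (!name || !element) return NULL;
    for (TBXMLElement * sibling = element->nextSibling; sibling; sibling = sibling->nextSibling) {
        if (sibling->name && strcmp(sibling->name, name) == 0) return sibling;
    }
    return NULL;
}
```

Calling sketch_child_named once and then sketch_next_sibling_named repeatedly until it returns NULL visits every child with the given name, which is the same loop shape used with the TBXML methods in section 4.2.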

3.4 Extracting attributes

You can obtain an attribute from an element using TBXML's "valueOfAttributeNamed: forElement:" method. The code below shows how you would extract the "name" attribute from the author element.

NSString * name = [tbxml valueOfAttributeNamed:@"name" forElement:author];
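
Since an element's attributes form a singly linked list (see TBXMLAttribute in section 2.2), a lookup by name amounts to following the "next" pointers until a match is found. The plain C sketch below illustrates that idea; sketch_attribute_value is a hypothetical helper rather than part of TBXML, it returns a C string instead of an NSString, and the structure is repeated here only so the block compiles standalone.

```c
#include <stddef.h>
#include <string.h>

/* Structure as listed in section 2.2 (normally provided by TBXML.h). */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

/* Hypothetical helper: value of the first attribute in the list whose name
   matches exactly, or NULL when no such attribute exists. */
static const char * sketch_attribute_value(const TBXMLAttribute * first, const char * name) {
    if (!name) return NULL;
    for (const TBXMLAttribute * attribute = first; attribute; attribute = attribute->next) {
        if (attribute->name && strcmp(attribute->name, name) == 0) {
            return attribute->value;
        }
    }
    return NULL;
}
```

In a real program you would pass element->firstAttribute as the first argument.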

3.5 Extracting element text

Given an XML element, you can obtain the text using the "textForElement:" method. The code below extracts the text from the descriptionElement.

NSString * description = [tbxml textForElement:descriptionElement];

3.6 Traversing Unknown elements / attributes

Each element contains a pointer to the next sibling element called "nextSibling". You can use this to loop through all sibling elements. Each element also has a pointer to the first child element called "firstChild". Once you have a child element, you can use [tbxml elementName:element] to return an NSString containing the name of the element.

Each element also has a pointer to the first attribute. You can use [tbxml attributeName:attribute] and [tbxml attributeValue:attribute] to return the name and value of the attribute. Each attribute has a pointer to the next attribute called "next". This can be used to loop through all attributes.
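
The pointer-following described above can be sketched in plain C, independent of the TBXML class. The example walks a hand-built tree depth first, printing each element name and its attributes with indentation and counting what it visits. SketchCounts and sketch_walk are hypothetical names introduced for this illustration, and the structures are repeated from section 2.2 so the block stands alone; with TBXML you would start from tbxml.rootXMLElement and use the NSString accessors instead of reading the char pointers directly.

```c
#include <stdio.h>

/* Structures as listed in section 2.2 (normally provided by TBXML.h). */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

typedef struct _TBXMLElement {
    char * name;
    char * text;
    TBXMLAttribute * firstAttribute;
    struct _TBXMLElement * parentElement;
    struct _TBXMLElement * firstChild;
    struct _TBXMLElement * currentChild;
    struct _TBXMLElement * nextSibling;
    struct _TBXMLElement * previousSibling;
} TBXMLElement;

/* Hypothetical tally of what the walk has visited. */
typedef struct {
    int elements;
    int attributes;
} SketchCounts;

/* Visit element and all of its following siblings, recursing into children. */
static void sketch_walk(const TBXMLElement * element, int depth, SketchCounts * counts) {
    for (; element; element = element->nextSibling) {
        /* Print the element name, indented two spaces per level. */
        printf("%*s%s\n", depth * 2, "", element->name ? element->name : "(unnamed)");
        counts->elements++;

        /* Follow the attribute list through its "next" pointers. */
        for (const TBXMLAttribute * attribute = element->firstAttribute;
             attribute; attribute = attribute->next) {
            printf("%*s@%s = %s\n", depth * 2 + 2, "",
                   attribute->name ? attribute->name : "",
                   attribute->value ? attribute->value : "");
            counts->attributes++;
        }

        /* Descend into the children; a NULL firstChild ends the recursion. */
        sketch_walk(element->firstChild, depth + 1, counts);
    }
}
```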

4 Samples

The following sample code gives an example of how you would use TBXML to decode an XML file.

4.1 Sample XML file

The sample code is based on the below XML.

<?xml version="1.0"?>
<authors>
  <author name="J.K. Rowling">
    <book title="Harry Potter and the Philosopher's Stone" price="9.99">
      <description>
        Harry Potter thinks he is an ordinary boy - until he is rescued from a beetle-eyed giant of a man, enrolls at Hogwarts School of Witchcraft and Wizardry, learns to play quidditch and does battle in a deadly duel.
      </description>
    </book>
    <book title="Harry Potter and the Chamber of Secrets" price="8.99">
      <description>
        When the Chamber of Secrets is opened again at the Hogwarts School for Witchcraft and Wizardry, second-year student Harry Potter finds himself in danger from a dark power that has once more been released on the school.
      </description>
    </book>
    <book title="Harry Potter and the Prisoner of Azkaban" price="12.99">
      <description>
        Harry Potter, along with his friends, Ron and Hermione, is about to start his third year at Hogwarts School of Witchcraft and Wizardry. Harry can't wait to get back to school after the summer holidays. (Who wouldn't if they lived with the horrible Dursleys?) But when Harry gets to Hogwarts, the atmosphere is tense. There's an escaped mass murderer on the loose, and the sinister prison guards of Azkaban have been called in to guard the school.
      </description>
    </book>
  </author>
  <author name="Douglas Adams">
    <book title="The Hitchhiker's Guide to the Galaxy" price="15.49">
      <description>
        Join Douglas Adams's hapless hero Arthur Dent as he travels the galaxy with his intrepid pal Ford Prefect, getting into horrible messes and generally wreaking hilarious havoc.
      </description>
    </book>
    <book title="The Restaurant at the End of the Universe " price="14.36">
      <description>
        Arthur and Ford, having survived the destruction of Earth by surreptitiously hitching a ride on a Vogon constructor ship, have been kicked off that ship by its commander. Now they find themselves aboard a stolen Improbability Drive ship commanded by Beeblebrox, ex-president of the Imperial Galactic Government and full-time thief.
      </description>
    </book>
  </author>
</authors>

4.2 Sample code for parsing a known XML layout

The following code can be used to parse the sample "books.xml" file into Author and Book classes. It starts by obtaining the root document element and traversing all child elements named "author". For each author found, an Author class is instantiated and populated with the author name. We then search the child elements of the author element for book elements. For each book found, a Book class is instantiated and populated with the book title. The "description" child element is then obtained from the book element, and its text extracted to give us the book's description.

// instantiate an array to hold author objects
authors = [[NSMutableArray alloc] initWithCapacity:10];

// Load and parse the books.xml file
tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];

// Obtain root element
TBXMLElement * root = tbxml.rootXMLElement;

// if root element is valid
if (root) {
    // search for the first author element within the root element's children
    TBXMLElement * author = [tbxml childElementNamed:@"author" parentElement:root];

    // loop while an author element was found
    while (author != nil) {

        // instantiate an author object
        Author * anAuthor = [[Author alloc] init];

        // add our author object to the authors array
        [authors addObject:anAuthor];

        // get the name attribute from the author element
        anAuthor.name = [tbxml valueOfAttributeNamed:@"name" forElement:author];

        // search the author's child elements for a book element
        TBXMLElement * book = [tbxml childElementNamed:@"book" parentElement:author];

        // loop while a book element was found
        while (book != nil) {

            // instantiate a book object
            Book * aBook = [[Book alloc] init];

            // add the book object to the author's books array
            [anAuthor.books addObject:aBook];

            // extract the title attribute from the book element
            aBook.title = [tbxml valueOfAttributeNamed:@"title" forElement:book];

            // find the description child element of the book element
            TBXMLElement * desc = [tbxml childElementNamed:@"description" parentElement:book];

            // if we found a description
            if (desc != nil) {
                // obtain the text from the description element
                aBook.description = [tbxml textForElement:desc];
            }

            // the books array retains aBook, so balance the alloc above
            [aBook release];

            // find the next sibling element named "book"
            book = [tbxml nextSiblingNamed:@"book" searchFromElement:book];
        }

        // the authors array retains anAuthor, so balance the alloc above
        [anAuthor release];

        // find the next sibling element named "author"
        author = [tbxml nextSiblingNamed:@"author" searchFromElement:author];
    }
}

// release resources
[tbxml release];

4.3 Sample code for parsing an unknown XML layout

The following code loads the "books.xml" file then traverses all elements, displaying their name along with all attributes and values to the log window.

- (void)loadUnknownXML {
    // Load and parse the books.xml file
    tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];

    // If TBXML found a root node, process element and iterate all children
    if (tbxml.rootXMLElement)
        [self traverseElement:tbxml.rootXMLElement];

    // release resources
    [tbxml release];
}

- (void) traverseElement:(TBXMLElement *)element {
    do {
        // Display the name of the element
        NSLog(@"%@",[tbxml elementName:element]);

        // Obtain first attribute from element
        TBXMLAttribute * attribute = element->firstAttribute;

        // if attribute is valid
        while (attribute) {
            // Display name and value of attribute to the log window
            NSLog(@"%@->%@ = %@",[tbxml elementName:element],[tbxml attributeName:attribute], [tbxml attributeValue:attribute]);

            // Obtain the next attribute
            attribute = attribute->next;
        }

        // if the element has child elements, process them
        if (element->firstChild)
            [self traverseElement:element->firstChild];

    // Obtain next sibling element
    } while ((element = element->nextSibling));
}

5 Licence

Copyright (c) 2009 Tom Bradley

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

6 Version History

6.1 Version 1.0

Original Version

Released
18 September 2009

Download
TBXML-V1.0.zip
XMLBooks-V1.0.zip

6.2 Version 1.1

Released
12 October 2009

Bug Fixes
Fixed a bug where searching for a node or attribute only compared names up to the length of the search string. This resulted in nodes and attributes being returned that didn't exactly match the search string.
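
The effect of this bug can be illustrated in a few lines of plain C. The sketch assumes, based only on the description above, that the old comparison behaved like strncmp bounded by the length of the search string; the two function names are hypothetical and neither is taken from TBXML.m.

```c
#include <stdbool.h>
#include <string.h>

/* Compares only the first strlen(search) characters, so any candidate that
   merely starts with the search string is accepted (assumed old behaviour). */
static bool sketch_prefix_match(const char * candidate, const char * search) {
    return strncmp(candidate, search, strlen(search)) == 0;
}

/* Compares the whole of both names, so only identical names are accepted. */
static bool sketch_exact_match(const char * candidate, const char * search) {
    return strcmp(candidate, search) == 0;
}
```

Under the prefix comparison, a search for "author" would also accept an element named "authors"; the exact comparison rejects it.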

New Features
Added an initialiser to allow parsing of an XML string rather than a file.

Download
TBXML-V1.1.zip
XMLBooks-V1.1.zip

6.3 Version 1.2

Released
11 November 2009

Bug Fixes
Properly clear text values when elements have children
Removed all whitespace from element text, attribute names and attribute values
Check for NULLs when obtaining element text, attribute names and attribute values
Fixed memory leak where bytes was not freed during dealloc

New Features
Added an initialiser to allow parsing of an NSData object.
Added an initialiser to download and parse a file from the given NSURL.
Added support for multiple CDATA sections within attributes and element text.
Added full support for comments.

Download
TBXML-V1.2.zip
XMLBooks-V1.2.zip

7 Known Bugs

7.1 EXC_BAD_ACCESS When using initWithXMLString or initWithURL

Problem

bytesLength is not set correctly for the required encoding scheme.


Solution

On Line 80 of TBXML.m in method "- (id)initWithXMLString:(NSString*)aXMLString"

Replace

bytesLength = [aXMLString length];

With

bytesLength = [aXMLString lengthOfBytesUsingEncoding:NSASCIIStringEncoding];