1 Introduction
TBXML is a lightweight XML document parser written in Objective-C and designed for use on Apple iPhone / iPod Touch devices. TBXML aims to provide the fastest possible XML parsing whilst using the fewest resources. This efficiency is achieved at the expense of XML validation and modification. It is NOT possible to modify a document or generate valid XML from a TBXML object, and NO validation is performed whilst importing and parsing an XML document.
1.1 Origin and Goals
TBXML is developed and maintained by Tom Bradley.
The design goals for TBXML are:
-
XML files conforming to the W3C XML 1.0 specification should be parsable.
-
XML parsing should use the fewest possible resources.
-
XML parsing should be achieved in the shortest possible time.
-
It shall be easy to write programs that utilise TBXML.
This specification provides all the information necessary to understand TBXML Version 1.1 and construct computer programs to utilise it.
This version of the TBXML specification may be distributed freely, as long as all text and legal notices remain intact.
2.1 Files
Files included with TBXML are:
-
TBXML.h - Header file containing defines / structures and class definition.
-
TBXML.m - Implementation containing TBXML declaration and private interface.
-
NSDataAdditions.h - Header file containing definition of NSData categories.
-
NSDataAdditions.m - Implementation file containing NSData categories for decoding gzip & base64.
2.2 Structures
-
TBXMLElement
The TBXMLElement structure holds information about a single XML element. The structure holds the element name and text along with pointers to the first attribute, the parent element, the first and current child elements, and the next and previous sibling elements. Using this structure, we can build a linked tree of TBXMLElements that maps out an entire XML file.
typedef struct _TBXMLElement {
    char * name;
    char * text;
    TBXMLAttribute * firstAttribute;
    struct _TBXMLElement * parentElement;
    struct _TBXMLElement * firstChild;
    struct _TBXMLElement * currentChild;
    struct _TBXMLElement * nextSibling;
    struct _TBXMLElement * previousSibling;
} TBXMLElement;
-
TBXMLAttribute
The TBXMLAttribute structure holds information about a single XML attribute. The structure holds the attribute name, its value and a pointer to the next sibling attribute. This structure allows us to create a linked list of the attributes belonging to a specific element.
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;
-
TBXMLElementBuffer
The TBXMLElementBuffer is a structure that holds a buffer of TBXMLElements. When the buffer of elements is used up, an additional buffer is created and linked to the previous one. This allows for efficient memory allocation/deallocation of elements.
typedef struct _TBXMLElementBuffer {
    TBXMLElement * elements;
    struct _TBXMLElementBuffer * next;
    struct _TBXMLElementBuffer * previous;
} TBXMLElementBuffer;
-
TBXMLAttributeBuffer
The TBXMLAttributeBuffer is a structure that holds a buffer of TBXMLAttributes. When the buffer of attributes is used up, an additional buffer is created and linked to the previous one. This allows for efficient memory allocation/deallocation of attributes.
typedef struct _TBXMLAttributeBuffer {
    TBXMLAttribute * attributes;
    struct _TBXMLAttributeBuffer * next;
    struct _TBXMLAttributeBuffer * previous;
} TBXMLAttributeBuffer;
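The chained-buffer idea described for TBXMLElementBuffer and TBXMLAttributeBuffer can be sketched in plain C (which also compiles as Objective-C). This is only an illustration under stated assumptions, not TBXML's actual code: the Element and ElementBuffer types are simplified stand-ins, and the ElementPool bookkeeping struct, the ELEMENTS_PER_BUFFER capacity and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for TBXMLElement; only the name is kept here. */
typedef struct Element {
    const char *name;
} Element;

/* Same shape as TBXMLElementBuffer: a block of elements plus two links. */
typedef struct ElementBuffer {
    Element *elements;
    struct ElementBuffer *next;
    struct ElementBuffer *previous;
} ElementBuffer;

/* Hypothetical bookkeeping: the newest buffer and how much of it is used. */
typedef struct ElementPool {
    ElementBuffer *current;
    size_t used;          /* elements handed out from the current buffer */
    size_t buffer_count;  /* buffers allocated so far */
} ElementPool;

enum { ELEMENTS_PER_BUFFER = 4 };  /* assumed capacity, kept small on purpose */

/* Allocate a zeroed buffer and link it after `previous` (which may be NULL). */
static ElementBuffer *buffer_create(ElementBuffer *previous) {
    ElementBuffer *buffer = calloc(1, sizeof *buffer);
    if (buffer == NULL) return NULL;
    buffer->elements = calloc(ELEMENTS_PER_BUFFER, sizeof *buffer->elements);
    if (buffer->elements == NULL) { free(buffer); return NULL; }
    buffer->previous = previous;
    if (previous != NULL) previous->next = buffer;
    return buffer;
}

/* Hand out the next free element, chaining a new buffer when the current one is full. */
Element *pool_next_element(ElementPool *pool) {
    assert(pool != NULL);
    if (pool->current == NULL || pool->used == ELEMENTS_PER_BUFFER) {
        ElementBuffer *buffer = buffer_create(pool->current);
        if (buffer == NULL) return NULL;
        pool->current = buffer;
        pool->used = 0;
        pool->buffer_count++;
    }
    return &pool->current->elements[pool->used++];
}

/* Free every buffer by walking the chain backwards from the newest one. */
void pool_free(ElementPool *pool) {
    assert(pool != NULL);
    ElementBuffer *buffer = pool->current;
    while (buffer != NULL) {
        ElementBuffer *previous = buffer->previous;
        free(buffer->elements);
        free(buffer);
        buffer = previous;
    }
    pool->current = NULL;
    pool->used = 0;
    pool->buffer_count = 0;
}
```

With this layout, one allocation serves many elements, and releasing a whole document costs one free per buffer rather than one per element.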
2.3 Methods
-
- (id)initWithXMLFile:(NSString*)aXMLFile fileExtension:(NSString*)aFileExtension
Instantiates a TBXML object and parses the specified file. The following code instantiates a TBXML object and parses books.xml:
TBXML * tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];
-
- (id)initWithXMLString:(NSString*)aXMLString
Instantiates a TBXML object and parses the specified XML string. The following code instantiates a TBXML object and parses the given XML:
TBXML * tbxml = [[TBXML alloc] initWithXMLString:@"<root><elem1 attribute1=\"elem1-attribute1\"/><elem2 attribute2=\"attribute2\"/></root>"];
-
- (id)initWithXMLData:(NSData*)aData
Instantiates a TBXML object and parses the XML held in the specified NSData object. The following code instantiates a TBXML object and parses an XML document held in an NSData object:
TBXML * tbxml = [[TBXML alloc] initWithXMLData:myXMLData];
-
- (id)initWithURL:(NSURL*)aURL
Instantiates a TBXML object, then downloads and parses the XML file at the given URL. The following code instantiates a TBXML object and parses the note.xml file from w3schools.com:
TBXML * tbxml = [[TBXML alloc] initWithURL:[NSURL URLWithString:@"http://www.w3schools.com/XML/note.xml"]];
-
- (TBXMLElement*) childElementNamed:(NSString*)aName parentElement:(TBXMLElement*)aParentXMLElement
Retrieves the first child element with the specified name from the given parent element. The following code retrieves the first author element from the root element:
TBXMLElement * author = [tbxml childElementNamed:@"author" parentElement:rootXMLElement];
-
- (TBXMLElement*) nextSiblingNamed:(NSString*)aName searchFromElement:(TBXMLElement*)aXMLElement
Retrieves the next sibling with the specified name, starting from the given element. The following code advances an existing author pointer to the next "author" element:
author = [tbxml nextSiblingNamed:@"author" searchFromElement:author];
-
- (NSString*) valueOfAttributeNamed:(NSString *)aName forElement:(TBXMLElement*)aXMLElement
Returns an NSString containing the value of the specified attribute belonging to the given element. The following code returns the name attribute from the given author element:
NSString * authorName = [tbxml valueOfAttributeNamed:@"name" forElement:authorElement];
-
- (NSString*) textForElement:(TBXMLElement*)aXMLElement
Returns an NSString containing the text for the specified element. The following code returns the text from the book element:
NSString * bookDescription = [tbxml textForElement:bookElement];
-
- (NSString*) elementName:(TBXMLElement*)aXMLElement
Returns an NSString containing the element name for the specified element.
NSString * elementName = [tbxml elementName:element];
-
- (NSString*) attributeName:(TBXMLAttribute*)aXMLAttribute
Returns an NSString containing the attribute name for the specified attribute.
NSString * attributeName = [tbxml attributeName:attribute];
-
- (NSString*) attributeValue:(TBXMLAttribute*)aXMLAttribute
Returns an NSString containing the attribute value for the specified attribute.
NSString * attributeValue = [tbxml attributeValue:attribute];
3.1 Inclusion into project
To use TBXML, add the four files to your project.
-
In Xcode, right-click your project file and select "New Group". Type TBXML as the group name.
-
Right-click the TBXML group and select "Add", then "Existing Files".
-
Find and select the four files (TBXML.h, TBXML.m, NSDataAdditions.h, NSDataAdditions.m). Make sure the "Copy items into destination group's folder (if needed)" checkbox is ticked. This ensures a copy of TBXML stays with the project.
-
Locate the Targets node in the group tree under your project. Click the arrow to expand it and right-click your project's target. Select "Get Info" and navigate to the "General" tab. Click the plus symbol at the bottom of the window to add a linked library. From the list, select "libz.dylib". You can now close the info window.
3.2 Loading an XML document
To load an XML file, instantiate a TBXML object and supply the XML file to parse.
TBXML * tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];
Or instantiate a TBXML object and supply the XML string to parse.
TBXML * tbxml = [[TBXML alloc] initWithXMLString:@"<root><elem1 attribute1=\"elem1-attribute1\"/><elem2 attribute2=\"attribute2\"/></root>"];
Or instantiate a TBXML object and supply the NSData object containing XML data to parse.
TBXML * tbxml = [[TBXML alloc] initWithXMLData:myXMLData];
Or instantiate a TBXML object and supply the URL of an XML document to retrieve and parse.
TBXML * tbxml = [[TBXML alloc] initWithURL:[NSURL URLWithString:@"http://www.w3schools.com/XML/note.xml"]];
You can obtain the root node of the parsed XML document by accessing TBXML's "rootXMLElement" property.
TBXMLElement * rootXMLElement = tbxml.rootXMLElement;
3.3 Extracting elements
The "childElementNamed:parentElement:" method allows you to search for a child element with a given name. The following returns the first "author" element from the document root.
TBXMLElement * author = [tbxml childElementNamed:@"author" parentElement:rootXMLElement];
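A lookup like this can be pictured as a walk along the firstChild and nextSibling pointers from section 2.2. The sketch below is plain C (which also compiles as Objective-C) and is not TBXML's actual implementation. The structures are copied from section 2.2, while the function names first_named_from, find_child_named and find_next_sibling_named are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Structures as listed in section 2.2. */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

typedef struct _TBXMLElement {
    char * name;
    char * text;
    TBXMLAttribute * firstAttribute;
    struct _TBXMLElement * parentElement;
    struct _TBXMLElement * firstChild;
    struct _TBXMLElement * currentChild;
    struct _TBXMLElement * nextSibling;
    struct _TBXMLElement * previousSibling;
} TBXMLElement;

/* Hypothetical helper: first element called `name` in the sibling chain
   that starts at `element` (inclusive), or NULL if there is none. */
static TBXMLElement *first_named_from(TBXMLElement *element, const char *name) {
    for (; element != NULL; element = element->nextSibling) {
        if (element->name != NULL && strcmp(element->name, name) == 0) {
            return element;
        }
    }
    return NULL;
}

/* First child of `parent` with an exactly matching name. */
TBXMLElement *find_child_named(TBXMLElement *parent, const char *name) {
    assert(name != NULL);
    return parent ? first_named_from(parent->firstChild, name) : NULL;
}

/* Next sibling after `element` with an exactly matching name. */
TBXMLElement *find_next_sibling_named(TBXMLElement *element, const char *name) {
    assert(name != NULL);
    return element ? first_named_from(element->nextSibling, name) : NULL;
}
```

Both searches return NULL when nothing matches, which is why a loop over same-named elements can simply test the returned pointer.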
3.4 Extracting attributes
You can obtain the value of an attribute from an element using TBXML's "valueOfAttributeNamed:forElement:" method. The code below shows how you would extract the "name" attribute from the author element.
NSString * name = [tbxml valueOfAttributeNamed:@"name" forElement:author];
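Because the attributes of an element form a singly linked list, a lookup by name amounts to following the "next" pointers until a name matches. The following plain C sketch (also valid Objective-C) illustrates that idea only; it is not TBXML's code, and the function name attribute_value is hypothetical. The real method returns an NSString, whereas this sketch returns the raw C string.

```c
#include <stddef.h>
#include <string.h>

/* Structure as listed in section 2.2. */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

/* Return the value of the attribute called `name`, starting the search at
   `first` (an element's firstAttribute), or NULL if no attribute matches. */
const char *attribute_value(const TBXMLAttribute *first, const char *name) {
    for (const TBXMLAttribute *attribute = first; attribute != NULL; attribute = attribute->next) {
        if (attribute->name != NULL && strcmp(attribute->name, name) == 0) {
            return attribute->value;
        }
    }
    return NULL;
}
```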
3.5 Extracting element text
Given an XML element, you can obtain the text using the "textForElement:" method. The code below extracts the text from the descriptionElement.
NSString * description = [tbxml textForElement:descriptionElement];
3.6 Traversing Unknown elements / attributes
Each element contains a pointer to the next sibling element called "nextSibling". You can use this to loop through all sibling elements. Each element also has a pointer to its first child element called "firstChild". Once you have a child element, you can use [tbxml elementName:element] to return an NSString containing the name of the element.
Each element also has a pointer to the first attribute. You can use [tbxml attributeName:attribute] and [tbxml attributeValue:attribute] to return the name and value of the attribute. Each attribute has a pointer to the next attribute called "next". This can be used to loop through all attributes.
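The pointer walk described above can be sketched directly on the structures from section 2.2. The plain C below (also valid Objective-C) counts attributes and elements instead of logging them. It is an illustration under stated assumptions, and the function names count_attributes and count_elements are hypothetical rather than part of TBXML.

```c
#include <stddef.h>

/* Structures as listed in section 2.2. */
typedef struct _TBXMLAttribute {
    char * name;
    char * value;
    struct _TBXMLAttribute * next;
} TBXMLAttribute;

typedef struct _TBXMLElement {
    char * name;
    char * text;
    TBXMLAttribute * firstAttribute;
    struct _TBXMLElement * parentElement;
    struct _TBXMLElement * firstChild;
    struct _TBXMLElement * currentChild;
    struct _TBXMLElement * nextSibling;
    struct _TBXMLElement * previousSibling;
} TBXMLElement;

/* Number of attributes on one element, following the `next` pointers. */
size_t count_attributes(const TBXMLElement *element) {
    size_t count = 0;
    if (element == NULL) return 0;
    for (const TBXMLAttribute *a = element->firstAttribute; a != NULL; a = a->next) {
        count++;
    }
    return count;
}

/* Number of elements in the subtree rooted at `element`, including itself.
   Children are reached via firstChild, then nextSibling. */
size_t count_elements(const TBXMLElement *element) {
    if (element == NULL) return 0;
    size_t count = 1;
    for (const TBXMLElement *child = element->firstChild; child != NULL; child = child->nextSibling) {
        count += count_elements(child);
    }
    return count;
}
```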
4 Samples
The following sample code gives an example of how you would use TBXML to decode an XML file.
4.1 Sample XML file
The sample code is based on the XML below.
<?xml version="1.0"?>
<authors>
    <author name="J.K. Rowling">
        <book title="Harry Potter and the Philosopher's Stone" price="9.99">
            <description>
                Harry Potter thinks he is an ordinary boy - until he is rescued from a beetle-eyed
                giant of a man, enrolls at Hogwarts School of Witchcraft and Wizardry, learns to
                play quidditch and does battle in a deadly duel.
            </description>
        </book>
        <book title="Harry Potter and the Chamber of Secrets" price="8.99">
            <description>
                When the Chamber of Secrets is opened again at the Hogwarts School for Witchcraft
                and Wizardry, second-year student Harry Potter finds himself in danger from a dark
                power that has once more been released on the school.
            </description>
        </book>
        <book title="Harry Potter and the Prisoner of Azkaban" price="12.99">
            <description>
                Harry Potter, along with his friends, Ron and Hermione, is about to start his third
                year at Hogwarts School of Witchcraft and Wizardry. Harry can't wait to get back to
                school after the summer holidays. (Who wouldn't if they lived with the horrible
                Dursleys?) But when Harry gets to Hogwarts, the atmosphere is tense. There's an escaped
                mass murderer on the loose, and the sinister prison guards of Azkaban have been
                called in to guard the school.
            </description>
        </book>
    </author>
    <author name="Douglas Adams">
        <book title="The Hitchhiker's Guide to the Galaxy" price="15.49">
            <description>
                Join Douglas Adams's hapless hero Arthur Dent as he travels the galaxy with his intrepid
                pal Ford Prefect, getting into horrible messes and generally wreaking hilarious havoc.
            </description>
        </book>
        <book title="The Restaurant at the End of the Universe" price="14.36">
            <description>
                Arthur and Ford, having survived the destruction of Earth by surreptitiously hitching a
                ride on a Vogon constructor ship, have been kicked off that ship by its commander. Now
                they find themselves aboard a stolen Improbability Drive ship commanded by Beeblebrox,
                ex-president of the Imperial Galactic Government and full-time thief.
            </description>
        </book>
    </author>
</authors>
4.2 Sample code for parsing a known XML layout
The following code can be used to parse the sample "books.xml" file into Author and Book classes. It starts by obtaining the root document element and traversing all child elements named "author". For each author found, an Author object is instantiated and populated with the author name. We then search the child elements of the author element for book elements. For each book found, a Book object is instantiated and populated with the book title. The "description" child element is then obtained from the book element, and its text extracted to give us the book's description.
// instantiate an array to hold author objects
authors = [[NSMutableArray alloc] initWithCapacity:10];
// load and parse the books.xml file
tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];
// obtain the root element
TBXMLElement * root = tbxml.rootXMLElement;
// if the root element is valid
if (root) {
    // search for the first author element within the root element's children
    TBXMLElement * author = [tbxml childElementNamed:@"author" parentElement:root];
    // loop for as long as an author element is found
    while (author != nil) {
        // instantiate an author object
        Author * anAuthor = [[Author alloc] init];
        // add our author object to the authors array
        [authors addObject:anAuthor];
        // the array now retains the author, so balance our alloc (manual reference counting)
        [anAuthor release];
        // get the name attribute from the author element
        anAuthor.name = [tbxml valueOfAttributeNamed:@"name" forElement:author];
        // search the author's child elements for a book element
        TBXMLElement * book = [tbxml childElementNamed:@"book" parentElement:author];
        // loop for as long as a book element is found
        while (book != nil) {
            // instantiate a book object
            Book * aBook = [[Book alloc] init];
            // add the book object to the author's books array
            [anAuthor.books addObject:aBook];
            // the books array now retains the book, so balance our alloc
            [aBook release];
            // extract the title attribute from the book element
            aBook.title = [tbxml valueOfAttributeNamed:@"title" forElement:book];
            // find the description child element of the book element
            TBXMLElement * desc = [tbxml childElementNamed:@"description" parentElement:book];
            // if we found a description
            if (desc != nil) {
                // obtain the text from the description element
                aBook.description = [tbxml textForElement:desc];
            }
            // find the next sibling element named "book"
            book = [tbxml nextSiblingNamed:@"book" searchFromElement:book];
        }
        // find the next sibling element named "author"
        author = [tbxml nextSiblingNamed:@"author" searchFromElement:author];
    }
}
// release resources
[tbxml release];
4.3 Sample code for parsing an unknown XML layout
The following code loads the "books.xml" file then traverses all elements, displaying their name along with all attributes and values to the log window.
- (void)loadUnknownXML {
    // Load and parse the books.xml file
    tbxml = [[TBXML alloc] initWithXMLFile:@"books" fileExtension:@"xml"];
    // If TBXML found a root node, process element and iterate all children
    if (tbxml.rootXMLElement)
        [self traverseElement:tbxml.rootXMLElement];
    // release resources
    [tbxml release];
}
- (void) traverseElement:(TBXMLElement *)element {
    do {
        // Display the name of the element
        NSLog(@"%@",[tbxml elementName:element]);
        // Obtain first attribute from element
        TBXMLAttribute * attribute = element->firstAttribute;
        // if attribute is valid
        while (attribute) {
            // Display name and value of attribute to the log window
            NSLog(@"%@->%@ = %@",[tbxml elementName:element],[tbxml attributeName:attribute], [tbxml attributeValue:attribute]);
            // Obtain the next attribute
            attribute = attribute->next;
        }
        // if the element has child elements, process them
        if (element->firstChild) [self traverseElement:element->firstChild];
    // Obtain next sibling element
    } while ((element = element->nextSibling));
}
5 Licence
Copyright (c) 2009 Tom Bradley
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
6.2 Version 1.1
- Released
- 12 October 2009
- Bug Fixes
- Fixed a bug where searching for a node or attribute only compared names up to the length of the search string. This resulted in nodes and attributes being returned that didn't exactly match the search string.
- New Features
- Added an initialiser to allow parsing of an XML string rather than a file.
- Download
- TBXML-V1.1.zip
- XMLBooks-V1.1.zip
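The name-matching bug fixed in Version 1.1 belongs to a common class of C string errors. The sketch below is plain C and does not reproduce TBXML's actual source, which is not shown in this document. The function names name_matches_prefix and name_matches_exact are hypothetical, and the use of strncmp is an assumption about how such a bug typically arises.

```c
#include <stdbool.h>
#include <string.h>

/* Flawed variant: only the first strlen(search) characters are compared,
   so any name that merely starts with the search string also matches. */
bool name_matches_prefix(const char *name, const char *search) {
    return strncmp(name, search, strlen(search)) == 0;
}

/* Exact variant: both strings must agree over their full length. */
bool name_matches_exact(const char *name, const char *search) {
    return strcmp(name, search) == 0;
}
```

With the flawed variant, a search for "author" would also accept an element named "authors". The exact variant rejects it.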
6.3 Version 1.2
- Released
- 11 November 2009
- Bug Fixes
- Properly clear text values when elements have children
- Removed all whitespace from element text, attribute names and attribute values
- Check for NULLs when obtaining element text, attribute names and attribute values
- Fixed memory leak where bytes was not freed during dealloc
- New Features
- Added an initialiser to allow parsing of an NSData object.
- Added an initialiser to download and parse a file from the given NSURL.
- Added support for multiple CDATA sections within attributes and element text.
- Added full support for comments.
- Download
- TBXML-V1.2.zip
- XMLBooks-V1.2.zip
7.1 EXC_BAD_ACCESS when using initWithXMLString or initWithURL
- Problem
bytesLength is not set correctly for the required encoding scheme.
- Solution
On line 80 of TBXML.m, in the method "- (id)initWithXMLString:(NSString*)aXMLString":
Replace
bytesLength = [aXMLString length];
With
bytesLength = [aXMLString lengthOfBytesUsingEncoding:NSASCIIStringEncoding];
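The underlying issue is that a character count and a byte count are different quantities. NSString's length reports UTF-16 code units, whereas a byte buffer needs its size in bytes for the chosen encoding. The plain C sketch below shows the same distinction for UTF-8 text. It is an illustration only, and the function name utf8_character_count is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Count code points in a UTF-8 string by skipping continuation bytes,
   which always have the bit pattern 10xxxxxx. */
size_t utf8_character_count(const char *s) {
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For pure ASCII text such as "note", utf8_character_count and strlen agree. For "caf\xC3\xA9" (the word "café" encoded as UTF-8), the character count is 4 while strlen reports 5 bytes, so sizing a buffer from the character count would be one byte short.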