Processing XML with Xerces and the DOM
Modifying the DOM Tree in Memory
Having in-memory, tree-style access to an XML document is useful for plucking out pieces of data, but half of the benefit of DOM is the ability to modify that tree and save it back to a file. The program step4 rounds out the examples by altering the in-memory tree and saving it back to the original config file. All of this happens in the XMLConfigData::commit() method.
commit() first calls updateObject() and updateXML(). These update the last-modified date and sync the object with the backing DOM tree, respectively. updateXML() involves updating some attributes, replacing a node, and adding some new nodes.
Updating an attribute is similar to getting its value: call the element node's setAttribute() member. This excerpt sets the <config> element's lastupdate attribute:
xercesc::DOMElement* configElement =
    finder_.getConfigElement() ;
configElement->setAttribute(
    finder_.ATTR_LASTUPDATE_.asXMLString() ,
    sm.convert( getLastUpdate() )
) ;
The sample code doesn't update the <login> tag's user or password attributes, but those would follow the same formula.
Updating the <reports> node takes a little more work. You could delete all of the child <report> elements and create new ones. However, for purely illustrative purposes, step4 takes the long route: it creates a new <reports> element, populates that element with new <report> children, and then swaps the old <reports> for the new.
The parent document owns all nodes, by default. To create an element, call the parent document's createElement() member:
xercesc::DOMElement* newReportsElement =
    xmlDoc->createElement( finder_.TAG_REPORTS_.asXMLString() ) ;
Next, create new <report> elements and add them under the new <reports> element:
for( ... each report in the XMLConfigData object ... ){
    xercesc::DOMElement* element =
        xmlDoc->createElement( ... ) ;
    newReportsElement->appendChild( element ) ;
}
Finally, swap the old and new elements:
xercesc::DOMElement* oldReportsElement =
    finder_.getReportsElement() ;
configElement->replaceChild(
    newReportsElement ,
    oldReportsElement
) ;
You don't have to free the oldReportsElement pointer explicitly: replaceChild() only detaches the old node from the tree, and the parent DOMDocument still owns it.
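To make the ownership rule concrete, here is a minimal sketch of the same create, populate, and swap sequence using toy classes written in plain standard C++. ToyDocument, ToyElement, and rebuildReports() are hypothetical names invented for this illustration; they are not Xerces classes and only imitate the behavior described above, where the document owns every node it creates and replaceChild() merely detaches the old one.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the Xerces classes.
struct ToyElement {
    std::string tag;
    std::vector<ToyElement*> children;  // non-owning links to child nodes

    explicit ToyElement(std::string t) : tag(std::move(t)) {}

    void appendChild(ToyElement* child) { children.push_back(child); }

    // Put newChild where oldChild was; return the detached node,
    // or nullptr if oldChild is not a child of this element.
    ToyElement* replaceChild(ToyElement* newChild, ToyElement* oldChild) {
        auto it = std::find(children.begin(), children.end(), oldChild);
        if (it == children.end()) return nullptr;
        *it = newChild;
        return oldChild;
    }
};

class ToyDocument {
public:
    // The document allocates every node and keeps ownership of it.
    ToyElement* createElement(const std::string& tag) {
        pool_.push_back(std::unique_ptr<ToyElement>(new ToyElement(tag)));
        return pool_.back().get();
    }

    std::size_t ownedNodeCount() const { return pool_.size(); }

private:
    std::vector<std::unique_ptr<ToyElement>> pool_;  // freed with the document
};

// The "long route": build a fresh <reports> element, fill it with
// <report> children, then swap it in for the old one.
ToyElement* rebuildReports(ToyDocument& doc,
                           ToyElement* config,
                           ToyElement* oldReports,
                           std::size_t reportCount) {
    ToyElement* fresh = doc.createElement("reports");
    for (std::size_t i = 0; i < reportCount; ++i) {
        fresh->appendChild(doc.createElement("report"));
    }
    config->replaceChild(fresh, oldReports);
    return fresh;  // oldReports is detached but still owned by doc
}
```

After the swap, the old element no longer appears under the config element, yet it lives on in the document's pool until the document itself goes away. In real Xerces code, a detached node can also be handed back early through its release() member if holding it until the document is released is a concern.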
Writing XML
The last part of commit() takes care of saving the DOM tree back to disk, using a LocalFileFormatTarget object. Xerces also supports storing XML in a memory buffer (MemBufFormatTarget) and writing to standard output (StdOutFormatTarget). All three derive from XMLFormatTarget, so you're free to implement your own subclass for custom output.
A DOMWriter object is responsible for writing out the data. step4 configures the DOMWriter to add spacing and formatting to make the document more human-readable:
xercesc::DOMWriter* writer = ... create new writer ...
writer->setFeature(
    xercesc::XMLUni::fgDOMWRTFormatPrettyPrint ,
    true
) ;
Finally, step4 calls the writer to write out the document:
writer->writeNode( outfile , *xmlDoc ) ;
Note that because the parent document does not own the LocalFileFormatTarget and DOMWriter objects, the code deletes them explicitly.
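The split between a writer and a pluggable output target can be sketched in plain standard C++. The classes below (ToyNode, ToyTarget, ToyMemoryTarget, ToyStdOutTarget, ToyWriter) are hypothetical toy stand-ins invented for this illustration, not the Xerces classes, and the pretty-print layout is an assumption that will not match DOMWriter's output byte for byte. The sketch only shows why the design lets the same writer feed a file, a memory buffer, standard output, or a custom destination.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the Xerces classes.
struct ToyNode {
    std::string tag;
    std::vector<ToyNode> children;
};

// Abstract destination: the writer only knows how to hand it chunks of text.
class ToyTarget {
public:
    virtual ~ToyTarget() {}
    virtual void write(const std::string& chunk) = 0;
};

// Collects output in memory.
class ToyMemoryTarget : public ToyTarget {
public:
    void write(const std::string& chunk) override { buffer_ += chunk; }
    const std::string& contents() const { return buffer_; }

private:
    std::string buffer_;
};

// Sends output to standard output.
class ToyStdOutTarget : public ToyTarget {
public:
    void write(const std::string& chunk) override { std::cout << chunk; }
};

// Serializes a tree to whichever target the caller supplies.
class ToyWriter {
public:
    void setPrettyPrint(bool on) { pretty_ = on; }

    void writeNode(ToyTarget& target, const ToyNode& node) const {
        emit(target, node, 0);
    }

private:
    void emit(ToyTarget& target, const ToyNode& node, std::size_t depth) const {
        // With pretty-printing on, indent two spaces per level and end each tag with a newline.
        const std::string indent = pretty_ ? std::string(depth * 2, ' ') : std::string();
        const std::string newline = pretty_ ? "\n" : "";
        if (node.children.empty()) {
            target.write(indent + "<" + node.tag + "/>" + newline);
            return;
        }
        target.write(indent + "<" + node.tag + ">" + newline);
        for (const ToyNode& child : node.children) {
            emit(target, child, depth + 1);
        }
        target.write(indent + "</" + node.tag + ">" + newline);
    }

    bool pretty_ = false;
};
```

Because ToyWriter depends only on the abstract ToyTarget interface, adding a new destination means writing one small subclass and leaving the serialization logic untouched, which is the same reason a custom target class is enough to redirect Xerces output.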
If you check step4's output, you'll notice the in-memory DOM tree has become well-formed XML. Furthermore, the file is an accurate representation of the DOM tree managed by the program: unmodified nodes remain as is, including comments. (Remember, comments are valid XML constructs; they're just not valid elements.)
That's all for Xerces-C++ and DOM. My next article will show Xerces's SAX side and explain XML validation using DTD and schema.
Resources
- You can download this article's sample code.
- Despite its title, Elliotte Rusty Harold's Processing XML with Java is a useful reference for XML processing in all languages. The book is available for purchase as a hard copy, or you can read it all online.
- The Xerces-C++ web site has links to documentation and downloads. Binaries are available for several platforms. While no RPMs are available, the source bundle includes a spec file for building your own.
Ethan McCallum grew from curious child to curious adult, turning his passion for technology into a career.
-
solution to "corrupt tarfile" problem
2007-02-09 07:43:29 Ethan McCallum
The file itself is intact. It appears the problem is that some browsers try to "help" people and change a filename's extension.
The download for this article is a ".tar.gz" file. If your browser renames it to a ".tar.tar" or even just a ".tar" extension, simply rename it to ".tar.gz" and it should extract without error.
-
tar file error
2006-11-17 12:00:07 GregMa
"Error reading header after processing 0 entries"
-
Excellent article but tar file for download is corrupted.
2006-10-29 21:48:15 Uthup_p
Can you please rectify this?
-
Excellent article but tar file for download is corrupted.
2006-11-10 04:30:21 Ethan McCallum
I just tested the file -- it extracts without error on my system.
What error do you see on your system?

// create a new XML document
XmlManager manager;
XmlDocument doc(manager.CreateDocument());
doc.setname("FilterObject.xml");
// Obtain DOM representation
xerces::DOMDocument &dom_doc (*doc.getContentsAsDOM());
filterscreenclass_(dom_doc,*c,map);
To my surprise, the compiler does not recognize the classes XmlManager, XmlDocument, and DOMDocument. I could find a class definition for DOMDocument, but not for the other two, anywhere in the Xerces source code that is available.
Does anyone know where these classes are located?
Please help me out.
Regards,
Rajesh G Manwani