Valuable Data Requires Open Formats
Chris Tyler

I've been doing more and more work with XML, and my appreciation for that family of technologies is growing by the day. XML and open data standards solved a problem that arose with OpenOffice.org Writer a few weeks ago.

OpenOffice.org is (of course) an open source office suite. I've been using a pre-2.0 test version of the software, and it has demonstrated a few instabilities but also has some great new features. I spent all of Tuesday evening using OOo to draft a detailed outline for a book. The outline contained nearly 200 entries, and while the resulting document was fairly short -- only a few pages long -- it represented a lot of work.

You can imagine my dismay when the document wouldn't open the next morning. "Read Error," the program whined. Something incomprehensible about a format error at (2,2847) in styles.xml. Not a problem, I thought -- I'll just use the backup that I'd saved. Same error! The previous version of the file -- same error!

If I'd written that document in WordPerfect or MS Word, that would have been the end of the story. I'd probably have had to rewrite it. I know it happens; I've been there.

But OOo 2.0 uses a document format that is an OASIS standard, which means that it's publicly documented XML. Actually, an OOo document is a zip archive containing multiple XML files. So I unzipped the OOo archive and checked the styles.xml file using xmlwf (checking to see if the XML was 'well-formed', which is step one of two on the road to correctness; the second hurdle is validity according to the schema). Sure enough, there was a duplicate attribute on an element at the line and column indicated in the cryptic OOo error message. Edit it out, zip it back up, try again, and ... same error, different location. But after a couple of iterations the problem was fixed. Sure, it was a pain, and sure, it should never have happened.
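The unzip-and-check step described above can be sketched in a few lines of Python. This is only a rough illustration, not the exact procedure used here: it assumes the document is an ordinary zip archive and that the members worth checking end in `.xml`, and the function name `check_archive` is made up for the example. It relies on Python's `xml.parsers.expat` module, which wraps the same Expat parser that xmlwf ships with, so it tests well-formedness only, not validity against the schema.

```python
import sys
import zipfile
from xml.parsers import expat


def check_archive(path):
    """Return (member, line, column, message) for every XML member
    of the zip archive at `path` that is not well-formed.

    `check_archive` is a hypothetical helper name used for illustration.
    """
    problems = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            # Assumption: only members ending in .xml are of interest.
            if not name.endswith(".xml"):
                continue
            # A fresh parser is needed for each document.
            parser = expat.ParserCreate()
            try:
                # True marks this chunk as the final (here: only) one.
                parser.Parse(archive.read(name), True)
            except expat.ExpatError as err:
                # lineno and offset are the position reported by Expat.
                problems.append(
                    (name, err.lineno, err.offset, expat.ErrorString(err.code))
                )
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    for name, line, column, message in check_archive(sys.argv[1]):
        print(f"{name}: {message} at line {line}, column {column}")
```

Expat stops at the first error in each file, so a document with several damaged spots needs the same fix-and-retry loop described above: repair the reported location, zip the files back up, and run the check again.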
But in an imperfect world, I'd much rather have my data in an accessible format that can be manipulated by many different tools than locked up in an undocumented, proprietary format.

Chris Tyler is a programmer and Linux network administrator with a focus on the X Window System and LAMP. He has programmed in two dozen different languages over the past 20 years, and now teaches at Seneca College, Toronto.
Weblog authors are solely responsible for the content and accuracy of their weblogs, including opinions they express, and O'Reilly Media, Inc., disclaims any and all liability for that content, its accuracy, and opinions it may contain.
The primary reason is to avoid being left at the mercy of the vendor. With vendors struggling to maintain their revenue streams through forced upgrades, they are relying increasingly on time-locking their software licenses. Thus, you have to re-license the software after a period of time determined by the vendor, and your valuable info may be unavailable until you do that. Open formats are a form of insurance against that happening, especially if they are also supported by one or more open source products.