Sebastopol, CA--Many developers see W3C XML Schema as the principal language for defining the content and structure of XML documents, while others resist the specification as unnecessarily complex, preferring to use tools such as DTDs, Schematron or RELAX NG. Eric van der Vlist, the author of the newly released "XML Schema" (O'Reilly, US $39.95), approaches this controversy with a sober and objective view: W3C XML Schema, he says, is both essential and potentially dangerous for XML.
"XML Schema is the most complex specification ever published by the W3C," van der Vlist says. "The technology itself is complex, and the specification was written in a way that's very difficult to read. Many experts lack the objectivity necessary to show the limitations and pitfalls of the technology. My book is an honest attempt to provide a description of W3C XML Schema that is neither bashing nor praising."
Involved in developing ISO standards as the editor of the Document Schema Definition Languages Part 5 specification describing "Object Oriented XML Schema languages," van der Vlist is an XML consultant and developer, creator and chief editor of XMLfr.org, and regular contributor to XML.com and xmlhack.com. He wrote "XML Schema" for O'Reilly because W3C XML Schema has become a key component of web services specifications such as SOAP and WSDL, and most developers who interchange XML documents will need to work with the specification on some level.
Primarily designed as a tutorial--with design choices, best practices, and limitations--"XML Schema" also serves as a reference to many aspects of XML Schema creation and processing. Schemas, the book explains, effectively serve as design tools for an array of XML-based applications, enabling developers to automate tasks such as validation, code generation, documentation, data binding and query optimization. Validation is the most common use for schemas: it ensures that XML documents conform to expectations, which in turn simplifies the code needed to process them.
W3C XML Schema's object-oriented approach enables XML developers to create very precise document descriptions, using a method of classification to derive types from other types. "Classification and object-orientation are useful ways to leverage what we know at a general level to a more specific level," van der Vlist explains. "For example, if I know that a cheetah is a mammal, I can infer further information about a cheetah--that it's warm-blooded and that female cheetahs nurse their young--which I don't need to formalize specifically for the cheetah. A similar principle applies to object-oriented programming and XML. Knowing that an element or an attribute has a certain type may give me information, which allows me to use algorithmic processes that apply to this type."
That, he asserts, is the big promise of both object orientation and W3C XML Schema. Instead of writing documentation and processes for each element and attribute--that is, for each object--developers can write documentation and processes for each type, or class of objects, where each type is used to describe several elements and attributes. The danger lies in W3C XML Schema's uniqueness. Since trying to impose a single schema language is as unrealistic as trying to impose a single programming language, developers might actually create two distinct and potentially incompatible types of XML applications: those that identify elements and attributes by their datatypes (with W3C XML Schema), and those that identify them by a set of rules or patterns (with other schemas).
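As an illustration only, the cheetah-and-mammal analogy could be expressed in W3C XML Schema roughly as follows. This is a minimal sketch, not an example taken from the book: the type names (mammalType, cheetahType) and the element names are hypothetical, and the schema assumes no target namespace. What is stated once for the general type is inherited by the derived type through an extension.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- General type: content declared once for every mammal (hypothetical names) -->
  <xs:complexType name="mammalType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="bodyTemperature" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Derived type: inherits the mammal content, then appends its own element -->
  <xs:complexType name="cheetahType">
    <xs:complexContent>
      <xs:extension base="mammalType">
        <xs:sequence>
          <xs:element name="topSpeedKmh" type="xs:positiveInteger"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- An element identified by its type -->
  <xs:element name="cheetah" type="cheetahType"/>

</xs:schema>
```

In this sketch, a schema-aware application that knows how to process mammalType could apply the same processing to a cheetah element, since cheetahType is derived from it. Documentation and code written for the general type would not need to be repeated for each derived type.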
"Like it or not, most of us will have to use W3C XML Schema, and it's up to us to use it for the better and not for the worse," van der Vlist counsels. "My approach to writing the book was to make a critical analysis of the features of the language, not taking anything for granted. I'm convinced that this is the only useful and practical way to approach this highly intrusive specification, and the purpose of my book is to guide the reader as safely as possible through this tour."
Additional Resources:
"XML Schema" is also available on
of the W3C XML Schema at the O'Reilly Open Source Convention, July 22-26, 2002, in San Diego
By Eric van der Vlist
ISBN 0-596-00252-1, 400 pages, $39.95 US, $61.95 CA
order@oreilly.com
1-800-998-9938; 1-707-827-7000
About O’Reilly
O’Reilly Media spreads the knowledge of innovators through its books, online services, magazines, and conferences. Since 1978, O’Reilly Media has been a chronicler and catalyst of cutting-edge development, homing in on the technology trends that really matter and spurring their adoption by amplifying “faint signals” from the alpha geeks who are creating the future. An active participant in the technology community, the company has a long history of advocacy, meme-making, and evangelism.