Issue: What is the best way to design an XML instance document that is to contain a collection of diverse data?
I want to create an XML instance document containing information about a camera I recently purchased.
The camera is a hybrid: it has a Nikon body, an Olympus lens, and a Pentax manual adaptor. Nikon provides this information about the body: weight and description. Olympus provides this information about the lens: zoom and f-stop. And Pentax provides this information about the manual adaptor: speed.
Thus my instance document must contain a diverse collection of information: basic camera information (date of purchase and warranty), the Nikon body information, the Olympus lens information, and the Pentax manual adaptor information.
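To make the problem concrete, here is a rough sketch of what such a compound instance document might look like. The namespace URIs, element names, and values are all assumptions made up for illustration; only the kinds of information (purchase date, warranty, weight, description, zoom, f-stop, speed) come from the description above.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch only: namespaces, element names, and values are assumptions -->
<camera xmlns="http://www.camera.example"
        xmlns:nikon="http://www.nikon.example"
        xmlns:olympus="http://www.olympus.example"
        xmlns:pentax="http://www.pentax.example">
  <!-- Basic camera information -->
  <purchase-date>2008-01-15</purchase-date>
  <warranty>1 year</warranty>
  <!-- Nikon body information -->
  <nikon:body>
    <nikon:weight>800 g</nikon:weight>
    <nikon:description>Hypothetical body description</nikon:description>
  </nikon:body>
  <!-- Olympus lens information -->
  <olympus:lens>
    <olympus:zoom>3x</olympus:zoom>
    <olympus:f-stop>2.8</olympus:f-stop>
  </olympus:lens>
  <!-- Pentax manual adaptor information -->
  <pentax:manual-adaptor>
    <pentax:speed>1/1000</pentax:speed>
  </pentax:manual-adaptor>
</camera>
```

Putting each vendor's information in its own namespace is what lets the later designs hand each part to a different schema.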
What's the best way to design this instance document? The following sections explore various designs. At the end I make a recommendation.
Design #1: Create an XML Schema for the basic camera information, and have it import the Nikon, Olympus, and Pentax XML Schemas.
The instance document conforms to the camera schema, along with its imported schemas.
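A minimal sketch of such a camera schema for Design #1 follows. The namespace URIs, file names (Nikon.xsd, Olympus.xsd, Pentax.xsd), and element names are assumptions for illustration, and the imported schemas are assumed to declare global `body`, `lens`, and `manual-adaptor` elements.

```xml
<?xml version="1.0"?>
<!-- Sketch of a camera schema that imports the three vendor schemas.
     All namespaces, file names, and element names are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.camera.example"
           xmlns:nikon="http://www.nikon.example"
           xmlns:olympus="http://www.olympus.example"
           xmlns:pentax="http://www.pentax.example"
           elementFormDefault="qualified">

  <!-- One import per vendor schema; each has its own target namespace -->
  <xs:import namespace="http://www.nikon.example" schemaLocation="Nikon.xsd"/>
  <xs:import namespace="http://www.olympus.example" schemaLocation="Olympus.xsd"/>
  <xs:import namespace="http://www.pentax.example" schemaLocation="Pentax.xsd"/>

  <xs:element name="camera">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="purchase-date" type="xs:date"/>
        <xs:element name="warranty" type="xs:string"/>
        <!-- References to elements declared in the imported schemas -->
        <xs:element ref="nikon:body"/>
        <xs:element ref="olympus:lens"/>
        <xs:element ref="pentax:manual-adaptor"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The camera schema here names every vendor schema explicitly, so adding or swapping a part means editing the camera schema itself.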
Here are the Design #1 files.
Design #2: Create an XML Schema for the basic camera information. Don't import the other schemas. Instead, make the camera schema extensible using the <any/> element.
The instance document fills in the open area created by the <any/> element with the Nikon, Olympus, and Pentax information.
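One way the Design #2 camera schema could be written is sketched below. The namespace and element names are assumptions; the choice of `namespace="##other"` and `processContents="lax"` is also an assumption about how open the extension area should be.

```xml
<?xml version="1.0"?>
<!-- Sketch of an extensible camera schema; names and namespace are hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.camera.example"
           elementFormDefault="qualified">
  <xs:element name="camera">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="purchase-date" type="xs:date"/>
        <xs:element name="warranty" type="xs:string"/>
        <!-- Open area: any number of elements from namespaces other than
             the camera namespace may appear here -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With `processContents="lax"`, a validator checks the vendor elements only when it can find declarations for them; `strict` would require declarations, and `skip` would require only well-formedness.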
Here are the Design #2 files.
Design #3: Create an XML Schema for the basic camera information. Don't import the other schemas, and don't make the camera schema extensible. In the instance document, assemble all the desired information: the basic camera information, the Nikon information, the Olympus information, and the Pentax information. Use NVDL to map each part of this compound document to the appropriate schema and to the XML Schema validator.
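The namespace-to-schema mapping for Design #3 might be sketched as the following NVDL script. The namespace URIs and schema file names are assumptions for illustration; the `rules`, `namespace`, `validate`, `anyNamespace`, and `reject` elements are standard NVDL.

```xml
<?xml version="1.0"?>
<!-- Sketch of an NVDL script: one rule per namespace.
     Namespace URIs and file names are hypothetical. -->
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <!-- Basic camera information -->
  <namespace ns="http://www.camera.example">
    <validate schema="Camera.xsd"/>
  </namespace>
  <!-- Nikon body information -->
  <namespace ns="http://www.nikon.example">
    <validate schema="Nikon.xsd"/>
  </namespace>
  <!-- Olympus lens information -->
  <namespace ns="http://www.olympus.example">
    <validate schema="Olympus.xsd"/>
  </namespace>
  <!-- Pentax manual adaptor information -->
  <namespace ns="http://www.pentax.example">
    <validate schema="Pentax.xsd"/>
  </namespace>
  <!-- Anything from an unknown namespace is an error -->
  <anyNamespace>
    <reject/>
  </anyNamespace>
</rules>
```

NVDL splits the compound document into sections by namespace and validates each section on its own, which is why the camera schema needs neither imports nor an open area.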
Here are the Design #3 files.
Design #4: In the prior designs all the XML grammars were expressed using XML Schemas. In this design the Olympus grammar is expressed as a RELAX NG schema. Proceed as with design #3. In the instance document, assemble all the information: the basic camera information, the Nikon information, the Olympus information, and the Pentax information. NVDL maps each part of the compound document to the appropriate schema and to the appropriate validator (XML Schema validator or RELAX NG validator).
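As an illustration of Design #4, the Olympus lens grammar could be written in RELAX NG (XML syntax) roughly as follows. The namespace URI and element names are assumptions, and the content is left as plain text because the source does not specify datatypes.

```xml
<?xml version="1.0"?>
<!-- Sketch of a RELAX NG schema for the Olympus lens.
     Namespace and element names are hypothetical. -->
<element name="lens"
         ns="http://www.olympus.example"
         xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- The ns attribute is inherited, so zoom and f-stop are
       in the Olympus namespace too -->
  <element name="zoom">
    <text/>
  </element>
  <element name="f-stop">
    <text/>
  </element>
</element>
```

In the NVDL script, the only change from a pure XML Schema setup would be that the rule for the Olympus namespace points at this file, for example `<validate schema="Olympus.rng"/>` (file name assumed).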
Here are the Design #4 files.
Design #5: Olympus has a RELAX NG schema to express grammar constraints, and a Schematron schema to express a co-constraint between the lens size and the f-stop. The other schemas express only grammar constraints and use XML Schema. Use NVDL to map to the XML Schemas, the RELAX NG schema, and the Schematron schema.
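A co-constraint of this kind might be sketched in Schematron as shown below. The source does not state the actual rule, so the `size` element, the threshold values, and the namespace URI are invented purely to show the shape of such a schema.

```xml
<?xml version="1.0"?>
<!-- Sketch of a Schematron co-constraint between lens size and f-stop.
     The rule itself, the size element, and the namespace are hypothetical. -->
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <ns prefix="o" uri="http://www.olympus.example"/>
  <pattern>
    <rule context="o:lens">
      <!-- Made-up rule: a lens of size 100 or more must have
           an f-stop of at least 2.8 -->
      <assert test="not(number(o:size) &gt;= 100) or number(o:f-stop) &gt;= 2.8">
        A lens of size 100 or more must have an f-stop of at least 2.8.
      </assert>
    </rule>
  </pattern>
</schema>
```

An NVDL rule may contain more than one `validate` action, so the rule for the Olympus namespace could list both the RELAX NG schema and this Schematron schema, and the lens section would be checked against each.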
Here are the Design #5 files.
Design #6: Modify the camera schema to incorporate optional hyperlinks. In the instance document the Nikon, Olympus, and Pentax information can be either linked to or embedded within the instance document. NVDL maps each part to the appropriate schemas.
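The source does not say how the hyperlinks are expressed, so the following instance is only a guess at the shape of a Design #6 document. It assumes XLink simple links, a hypothetical `part` element in the camera namespace, and made-up namespace URIs, file names, and values. The Nikon body is linked to, while the Olympus and Pentax information is embedded.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the link mechanism (XLink), the part element,
     namespaces, file names, and values are all assumptions -->
<camera xmlns="http://www.camera.example"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xmlns:olympus="http://www.olympus.example"
        xmlns:pentax="http://www.pentax.example">
  <purchase-date>2008-01-15</purchase-date>
  <warranty>1 year</warranty>
  <!-- Linked part: the Nikon body information lives in another document -->
  <part xlink:type="simple" xlink:href="nikon-body.xml"/>
  <!-- Embedded parts -->
  <olympus:lens>
    <olympus:zoom>3x</olympus:zoom>
    <olympus:f-stop>2.8</olympus:f-stop>
  </olympus:lens>
  <pentax:manual-adaptor>
    <pentax:speed>1/1000</pentax:speed>
  </pentax:manual-adaptor>
</camera>
```

Under this arrangement the embedded sections would still be dispatched by namespace through NVDL, while a linked document could be validated separately against its own schema.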
Here are the Design #6 files.
Each successive design adds flexibility and robustness, culminating in design #6.
Design #6 has these features: loose coupling, unlimited assembly of parts, an unconstrained set of schema languages, and independent schemas that are easier to develop and maintain. This is a powerful design. I recommend design #6.
Last updated: April 22, 2008, by Roger L. Costello.