Identifying XML Schema Design Objectives

by Roger L. Costello

No XML Schema should ever be developed without first creating and documenting a set of design objectives!

Documenting the design objectives of your XML Schemas serves two purposes:

Guide Schema Development: the objectives will guide you in the creation of your schemas.
Facilitate Schema Evaluation: how do you measure the quality of your Schema? One excellent method is to examine how well it meets its design objectives.

Below is a questionnaire to help you identify what to consider when developing a list of design objectives.

The questionnaire is split into two parts:

Design Objectives for Instance Documents: how you design a Schema affects every instance document that uses it. So, your objectives must take instance documents and their authors into consideration.
Design Objectives for the Schema: how a Schema is designed will impact how easily it is maintained and updated, and how reusable it is by other Schemas.

Here are some things to consider when designing a Schema from the perspective of the instance documents that will be generated:

Should instance document authors be empowered to add data above and beyond what the Schema dictates? If so, then the Schema needs to have extensibility built in. See [1], [7] in the References section below.
The Web is based upon an open, distributed publishing paradigm. Should instance document authors be allowed to publish "some data here, some data there"? That is, should instance document authors be allowed to follow the Web paradigm? (An aggregator tool is used to collect the distributed data.) See [1].
Should instance documents be designed to maximize the range of applications that can process the data? (For example, you may wish to design the Schema so that instance documents can be processed by RDF applications.) Or, is there a fixed set of applications that will process the data? Is it a good idea to design for a fixed set of applications? See [2].
Should instance document elements be treated as "plug-and-play pieces"? That is, should one element be substitutable for another? See [3].
Should instance documents be shielded from namespace complexity? See [5].
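
To make some of these questions concrete, here is a minimal sketch of how three of them might show up in a Schema: extensibility (an xsd:any wildcard), plug-and-play elements (a substitution group), and shielding authors from namespaces (elementFormDefault="unqualified"). It is only an illustration, not a recommended design. The namespace http://www.example.org/library and the names Catalogue, Publication, Book, Magazine and Title are all hypothetical.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema: every name and namespace URI here is illustrative only. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/library"
            xmlns="http://www.example.org/library"
            elementFormDefault="unqualified">
  <!-- elementFormDefault="unqualified": locally declared elements such as
       Title carry no namespace in instance documents. -->

  <xsd:element name="Catalogue">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Any member of Publication's substitution group may appear here. -->
        <xsd:element ref="Publication" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

  <xsd:complexType name="PublicationType">
    <xsd:sequence>
      <xsd:element name="Title" type="xsd:string"/>
      <!-- Extensibility: authors may append elements from other namespaces. -->
      <xsd:any namespace="##other" processContents="lax"
               minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Plug-and-play: Book and Magazine can stand in wherever Publication is allowed.
       With no type attribute, each takes the type of the head element. -->
  <xsd:element name="Publication" type="PublicationType"/>
  <xsd:element name="Book" substitutionGroup="Publication"/>
  <xsd:element name="Magazine" substitutionGroup="Publication"/>
</xsd:schema>
```

An instance document written against this sketch might look like the following, where the rev namespace and its Rating element are again hypothetical. Note that only the local element Title is unqualified; the global elements still need a namespace, so this shows just one degree of shielding.

```xml
<lib:Catalogue xmlns:lib="http://www.example.org/library"
               xmlns:rev="http://www.example.org/reviews">
  <lib:Book>
    <Title>Illusions</Title>
    <!-- Extra data beyond what the Schema dictates, admitted by the wildcard. -->
    <rev:Rating>5</rev:Rating>
  </lib:Book>
</lib:Catalogue>
```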

Here are some things to consider when designing a Schema from the perspective of the Schema itself and the Schemas that will use your Schema:

Should a Schema be standalone? That is, not dependent on other schemas. (Rationale for wanting schemas to be standalone: when other schemas reuse your schema they will import into their schemas a fixed, known set of complexity. Analogy: you would hate to ask for a banana, only to discover an elephant connected to the other end!) See [4].
Should a Schema be designed for frequent modifications? See [6].
Should a Schema be only one of a collection of schemas written in different schema languages, e.g., DTDs, RELAX NG, Schematron? See [8].
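
As a rough illustration of the standalone question, the sketch below shows a Schema that is not standalone: it imports a second schema, so anyone who reuses it also takes on whatever the imported schema contains or imports in turn. The file name address.xsd, the type AddressType, and both namespace URIs are assumptions made up for this example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema: names, namespace URIs, and address.xsd are illustrative only. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/orders"
            xmlns:addr="http://www.example.org/address"
            elementFormDefault="qualified">

  <!-- The dependency: reusing this schema also pulls in address.xsd
       and anything that address.xsd itself imports. -->
  <xsd:import namespace="http://www.example.org/address"
              schemaLocation="address.xsd"/>

  <xsd:element name="Order">
    <xsd:complexType>
      <xsd:sequence>
        <!-- AddressType is assumed to be defined in the imported schema. -->
        <xsd:element name="ShipTo" type="addr:AddressType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

A standalone version would define the ShipTo content itself and drop the xsd:import, trading reuse for a fixed, known set of complexity.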