How do we design the schema so that Book's content model is extensible? Below are two methods for implementing extensible content models.
This type substitutability mechanism is a powerful extensibility mechanism. However, it suffers from two problems:
For example, suppose that the instance document author discovers a schema containing a declaration for a Reviewer element:
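A minimal sketch of such a schema follows. The namespace URI http://www.example.org/reviewer and the Name child element are hypothetical, chosen only for illustration:

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema that declares a global Reviewer element. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/reviewer"
            xmlns="http://www.example.org/reviewer"
            elementFormDefault="qualified">
    <xsd:element name="Reviewer">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Assumed content: a single Name child -->
                <xsd:element name="Name" type="xsd:string"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```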
And suppose that it is important to an instance document author that, in addition to specifying the Title, Author, Date, ISBN, and Publisher of each Book, he/she also specify a reviewer. Because the schema has been designed with extensibility in mind, the instance document author can use the Reviewer element in his/her BookCatalogue. The instance document author has enhanced the instance document with an element that the schema designer may never have envisioned. We have empowered the instance author with a great deal of flexibility in creating the instance document. Wow!
An alternate schema design is to create a BookType (as we did above) and embed the <any> element within the BookType:
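A minimal sketch of such a BookType, assuming that each child element is a simple string and that the schema uses a hypothetical target namespace http://www.example.org/books:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            xmlns="http://www.example.org/books"
            elementFormDefault="qualified">
    <xsd:complexType name="BookType">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string"/>
            <!-- Extension point: at most one extra element from any namespace.
                 processContents="lax" validates it only if a declaration is found. -->
            <xsd:any namespace="##any" processContents="lax" minOccurs="0"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>
```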
and then declare Book of type BookType. However, we are then back to the "unexpected extensibility" problem. Namely, after the <Publisher> element any well-formed XML element may occur, and after that anything could be present, since a type derived from BookType could add further content.
There is a way to control the extensibility and still use a type. We can add a block attribute to Book:
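As a sketch, the declaration might look like the following. The value block="#all" is an assumption about how strict the designer wants to be; it blocks types derived by extension or restriction as well as substitution group members, whereas block="extension" would block only extension-derived types. The target namespace is hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            xmlns="http://www.example.org/books"
            elementFormDefault="qualified">
    <!-- Reusable type with a trailing extension point -->
    <xsd:complexType name="BookType">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
            <xsd:element name="Publisher" type="xsd:string"/>
            <xsd:any namespace="##any" processContents="lax" minOccurs="0"/>
        </xsd:sequence>
    </xsd:complexType>
    <!-- block="#all": an instance may not use xsi:type to swap in a type
         derived from BookType, nor substitute another element for Book. -->
    <xsd:element name="Book" type="BookType" block="#all"/>
</xsd:schema>
```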
The block attribute prohibits derived types from being used in Book's content model. Thus, by this method we have created a reusable component (BookType), and yet we still have control over the extensibility.
With the <any> element we have complete control over where, and how much, extensibility we want to allow. For example, suppose that we want to allow at most two new elements at the top of Book's content model. Here's how to specify that using the <any> element:
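A minimal sketch under stated assumptions: the target namespace is hypothetical, and namespace="##other" is used so that the wildcard cannot also match the Title element that follows it. With namespace="##any", a leading optional wildcard would make the content model ambiguous in XML Schema 1.0.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            xmlns="http://www.example.org/books"
            elementFormDefault="qualified">
    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Extension point at the top: zero, one, or two elements
                     from namespaces other than the target namespace -->
                <xsd:any namespace="##other" processContents="lax"
                         minOccurs="0" maxOccurs="2"/>
                <xsd:element name="Title" type="xsd:string"/>
                <xsd:element name="Author" type="xsd:string"/>
                <xsd:element name="Date" type="xsd:string"/>
                <xsd:element name="ISBN" type="xsd:string"/>
                <xsd:element name="Publisher" type="xsd:string"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```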
Note how the <any> element has been placed at the top of the content model, with maxOccurs="2". Thus, in instance documents the <Book> content will always end with <Title>, <Author>, <Date>, <ISBN>, and <Publisher>. Prior to that, up to two well-formed XML elements may occur.
In summary:
As a schema designer you need to recognize your limitations. You have no way of anticipating all the varieties of data that an instance document author might need in creating an instance document. Be smart enough to know that you're not smart enough to anticipate all possible needs! Design your schemas with flexibility built-in.
Definition: an open content schema is one that allows instance documents to contain additional elements beyond what is declared in the schema. As we have seen, this may be achieved by using the <any> (and <anyAttribute>) element in the schema.
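To illustrate the definition, here is a sketch of a type that is open in both its elements and its attributes. The namespace URIs, the choice of namespace="##other", and the unbounded maxOccurs are assumptions made for the example:

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/books"
            xmlns="http://www.example.org/books"
            elementFormDefault="qualified">
    <xsd:element name="Book">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Title" type="xsd:string"/>
                <xsd:element name="Publisher" type="xsd:string"/>
                <!-- Open content: any number of elements from other namespaces -->
                <xsd:any namespace="##other" processContents="lax"
                         minOccurs="0" maxOccurs="unbounded"/>
            </xsd:sequence>
            <!-- Open content: any attributes from other namespaces -->
            <xsd:anyAttribute namespace="##other" processContents="lax"/>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```

An instance document could then carry additional data that the schema never declared, for example (the rev prefix and its contents are hypothetical):

```xml
<?xml version="1.0"?>
<Book xmlns="http://www.example.org/books"
      xmlns:rev="http://www.example.org/reviewer"
      rev:status="reviewed">
    <Title>Sample Title</Title>
    <Publisher>Sample Publisher</Publisher>
    <!-- Extra element permitted by the xsd:any wildcard -->
    <rev:Reviewer>
        <rev:Name>Sample Reviewer</rev:Name>
    </rev:Reviewer>
</Book>
```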
Sprinkling <any> and <anyAttribute> elements liberally throughout your schema will yield benefits in terms of how evolvable your schema is:
In today's rapidly changing market, static schemas will be less commonplace, as the market pushes schemas to quickly support new capabilities. For example, consider the cellphone industry. Clearly, this is a rapidly evolving market. Any schema that the cellphone community creates will soon become obsolete as hardware/software changes extend the cellphone capabilities. For the cellphone community, rapid evolution of a cellphone schema is not just a nicety; the market demands it!
Suppose that the cellphone community gets together and creates a schema, cellphone.xsd. Imagine that every week NOKIA sends out to the various vendors an instance document (conforming to cellphone.xsd), detailing its current product set. Now suppose that a few months after cellphone.xsd is agreed upon NOKIA makes some breakthroughs in their cellphones - they create new memory, call, and display features, none of which are supported by cellphone.xsd. To gain a market advantage NOKIA will want to get information about these new capabilities to its vendors ASAP. Further, they will have little motivation to wait for the next meeting of the cellphone community to consider upgrades to cellphone.xsd. They need results NOW. How does open content help? That is described next.
Suppose that the cellphone schema is declared "open". Immediately NOKIA can extend its instance documents to incorporate data about the new features. How does this change impact the vendor applications that receive the instance documents? The answer is - not at all. In the worst case, the vendor's application will simply skip over the new elements. More likely, however, the vendors are showing the cellphone features in a list box, and these new features will be automatically captured with the other features. Let's stop and think about what has just been described. Without modifying the cellphone schema and without touching the vendor's applications, information about the new NOKIA features has been instantly disseminated to the marketplace! Open content in the cellphone schema is the enabler for this rapid dissemination.
Clearly, some types of instance document extensions may require modification to the vendor's applications. Recognize, however, that the vendors are free to upgrade their applications in their own time. The applications do not need to be upgraded before changes can be introduced into instance documents. At the very worst, the vendor's applications will simply skip over the extensions. And, of course, those vendors do not need to upgrade in lock-step.
To wrap up this example, suppose that several months later the cellphone community reconvenes to discuss enhancements to the schema. The new features that NOKIA first introduced into the marketplace are then officially added into the schema. This completes the cycle. Changes to the instance documents have driven the evolution of the schema.