Creating Variable Content Container Elements

(A Collectively Developed Set of Schema Design Guidelines)

Issue

What is the Best Practice for implementing a container element that is to be comprised of variable content?

Introduction

A typical problem when creating an XML Schema is designing a container element (e.g., Catalogue) whose content is variable (e.g., Book, or Magazine, or ...):
    <Catalogue>
        - variable content -
    </Catalogue>
Some things to consider:

Example

Throughout this discussion we will consider variable content containers (e.g., <Catalogue>) which are comprised of a collection of elements, where each element is variable.

Here's an example of a <Catalogue> container element comprised of two different kinds of elements:

    <Catalogue>
        <Book> ... </Book>
        <Magazine> ... </Magazine>
        <Book> ... </Book>
    </Catalogue>   
Below are four methods for implementing variable content containers.

Method 1: Implementing variable content containers using an abstract element and element substitution

Description:

There are five XML Schema concepts that must be understood for implementing this method:

Implementation:

Declare an abstract element (Publication):
    <xsd:element name="Publication" abstract="true" 
                 type="PublicationType"/>
Declare a variable content container element (Catalogue) to have as its content the abstract element ("ref" to the abstract element declaration):
    <xsd:element name="Catalogue">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="Publication" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
Note that maxOccurs="unbounded", so Catalogue may contain a collection (one or more) of Publication elements.

Declare the concrete elements (Book and Magazine) that are to be the contents of the variable content container and declare them to be in a substitutionGroup with the abstract element:

    <xsd:element name="Book" substitutionGroup="Publication" 
                 type="BookType"/>
    <xsd:element name="Magazine" substitutionGroup="Publication" 
                 type="MagazineType"/>
In order for Book and Magazine to substitute for Publication, their types (BookType and MagazineType) must derive from Publication's type (PublicationType). Here are the type definitions:

PublicationType - the base type:

    <xsd:complexType name="PublicationType">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string" 
                         minOccurs="0" maxOccurs="unbounded"/>
            <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
    </xsd:complexType>
BookType - extends PublicationType by adding two new elements, ISBN and Publisher:
    <xsd:complexType name="BookType">
        <xsd:complexContent>
            <xsd:extension base="PublicationType">
                <xsd:sequence>
                    <xsd:element name="ISBN" type="xsd:string"/>
                    <xsd:element name="Publisher" type="xsd:string"/>
                </xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>
MagazineType - restricts PublicationType by striking out the Author element:
    <xsd:complexType name="MagazineType">
        <xsd:complexContent>
            <xsd:restriction base="PublicationType">
                <xsd:sequence>
                    <xsd:element name="Title" type="xsd:string"/>
                    <xsd:element name="Author" type="xsd:string" 
                                 minOccurs="0" maxOccurs="0"/>
                    <xsd:element name="Date" type="xsd:gYear"/>
                </xsd:sequence>
            </xsd:restriction>
        </xsd:complexContent>
    </xsd:complexType>
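To make the substitution concrete, here is a sketch of an instance document that should validate against the declarations above, assuming they are collected in a single schema with no targetNamespace. The title, author, ISBN, and publisher values are placeholders invented for illustration.

```xml
<?xml version="1.0"?>
<Catalogue>
    <!-- Book substitutes for the abstract Publication element -->
    <Book>
        <Title>Example Book Title</Title>
        <Author>Example Author</Author>
        <Date>1998</Date>
        <ISBN>0-000-00000-0</ISBN>
        <Publisher>Example Publisher</Publisher>
    </Book>
    <!-- Magazine also substitutes for Publication; its restricted type forbids Author -->
    <Magazine>
        <Title>Example Magazine Title</Title>
        <Date>1998</Date>
    </Magazine>
</Catalogue>
```

Because Publication is declared abstract, a <Publication> element itself cannot appear in the instance document; only members of its substitutionGroup can.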

Advantages:

Disadvantages:

Method 2: Implementing variable content containers using a <choice> element

Description:

This method is quite straightforward - simply list within a <choice> element all the elements which can appear in the variable content container, and embed the <choice> element in the container element.

Implementation:

Declare within a <choice> element all the elements (e.g., Book, Magazine) that may be used in the variable content container. Embed the <choice> element within the container element (Catalogue):
    <xsd:element name="Catalogue">
        <xsd:complexType>
            <xsd:choice maxOccurs="unbounded">
                <xsd:element name="Book" type="BookType"/>
                <xsd:element name="Magazine" type="MagazineType"/>
            </xsd:choice>
        </xsd:complexType>
    </xsd:element>
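For reference, here is a self-contained sketch of a complete schema using this method, written with the xsd: prefix used elsewhere in this document. The two type definitions are deliberately simplified stand-ins (an assumption for illustration, not the definitions from Method 1); they show that a <choice> does not require its alternatives to share a common base type.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <!-- Simplified, hypothetical types: they share no common base type -->
    <xsd:complexType name="BookType">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="ISBN" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="MagazineType">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
    </xsd:complexType>
    <!-- The container lists every permitted child inside the choice -->
    <xsd:element name="Catalogue">
        <xsd:complexType>
            <xsd:choice maxOccurs="unbounded">
                <xsd:element name="Book" type="BookType"/>
                <xsd:element name="Magazine" type="MagazineType"/>
            </xsd:choice>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```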

Advantages:

Disadvantages:

Method 3: Implementing variable content containers using an abstract type and type substitution

Description:

There are three XML Schema concepts that must be understood for implementing this method:

Implementation:

Define an abstract base type (PublicationType):
    <xsd:complexType name="PublicationType" abstract="true">
        <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Author" type="xsd:string" 
                         minOccurs="0" maxOccurs="unbounded"/>
            <xsd:element name="Date" type="xsd:gYear"/>
        </xsd:sequence>
    </xsd:complexType>
Declare the container element (Catalogue) to contain an element (Publication), which is of the abstract type:
    <xsd:element name="Catalogue">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="Publication" type="PublicationType" 
                             maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
In instance documents, the content of <Publication> can only be of a concrete type which derives from PublicationType, such as BookType or MagazineType (we saw these type definitions in Method 1 above).

With this method, instance documents look different from those of the previous two methods. Namely, <Catalogue> does not contain variable content. Instead, it always contains the same element (Publication). However, that element contains variable content:

    <Catalogue>
        <Publication xsi:type="BookType"> ... </Publication>
        <Publication xsi:type="MagazineType"> ... </Publication>
        <Publication xsi:type="BookType"> ... </Publication>
    </Catalogue>
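A fuller sketch of such an instance document follows. It assumes a schema with no targetNamespace in which BookType and MagazineType derive from the abstract PublicationType as in Method 1; the element values are placeholders invented for illustration. Note that the xsi prefix must be bound to the XML Schema instance namespace.

```xml
<?xml version="1.0"?>
<Catalogue xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- xsi:type names the concrete type used for this Publication -->
    <Publication xsi:type="BookType">
        <Title>Example Book Title</Title>
        <Author>Example Author</Author>
        <Date>1998</Date>
        <ISBN>0-000-00000-0</ISBN>
        <Publisher>Example Publisher</Publisher>
    </Publication>
    <!-- MagazineType restricts the base type, so Author is not allowed here -->
    <Publication xsi:type="MagazineType">
        <Title>Example Magazine Title</Title>
        <Date>1998</Date>
    </Publication>
</Catalogue>
```

Since PublicationType is abstract, a <Publication> element without an xsi:type attribute would not be valid.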

Advantages:

Disadvantages:

Method 4: Implementing variable content containers using a dangling type

Motivation:

Thus far our variable content container has contained complex content (i.e., child elements). Suppose that we want a variable content container that holds simple content. None of the previous methods can be used. We need a method that allows us to create simpleType variable content containers.

There is one key XML Schema concept that must be understood for implementing this method:

Description:

Let's take an example. Suppose that we desire an element, sensor, which contains the name of a weather station sensor. For example:
    <sensor>Barometric Pressure</sensor>
There are several things to note:

Here's an elegant design for making the contents of <sensor> customizable by each weather station:

- When you create sensor, declare it to be of a type from another namespace.
- Then, when you <import> that namespace don't provide a schemaLocation.
- Thus, the element is declared to be of a type for which no particular schema is identified, i.e., we have a dangling type!

Implementation:

Let's go through the design, step by step. In your schema, declare the sensor element:
    <xsd:element name="sensor" type="s:sensor_type"/>
Note that the sensor element is declared to have a type "sensor_type", which is in a different namespace - the sensor namespace:
    xmlns:s="http://www.sensor.org"
Now here's the key - when you <import> this namespace, don't provide a value for schemaLocation! (In an import element schemaLocation is optional.) For example:
    <xsd:import namespace="http://www.sensor.org"/>
The instance document must then identify a schema that implements sensor_type. Thus, at run time (i.e., validation time) we are matching up the reference to sensor_type with an implementation of sensor_type. For example, an instance document may have this:
    xsi:schemaLocation=
          "http://www.weather-station.org weather-station.xsd
           http://www.sensor.org boston-sensors.xsd"
In this instance document schemaLocation is identifying a schema, boston-sensors.xsd, which is to provide the implementation of sensor_type.

Let's take a look at the schemas and instance documents for the weather station sensor example we have been considering. Here's the main schema, which contains the dangling type:

weather-station.xsd


    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.weather-station.org"
                xmlns="http://www.weather-station.org"
                xmlns:s="http://www.sensor.org"
                elementFormDefault="qualified">
        <xsd:import namespace="http://www.sensor.org"/>
        <xsd:element name="weather-station">
            <xsd:complexType>
                <xsd:sequence>
                    <xsd:element name="sensor" type="s:sensor_type"
                                 maxOccurs="unbounded"/>
                </xsd:sequence>
            </xsd:complexType>
        </xsd:element>
    </xsd:schema>
Note that the <import> element does not have a schemaLocation attribute to identify a particular schema which implements sensor_type. (Stated differently, this schema does not hardcode the identity of the schema which is to provide the implementation of sensor_type.) The schema validator will resolve the reference to sensor_type based upon the collection of schemas that is provided to it in the instance document.

The Boston weather station creates a schema which implements sensor_type:

boston-sensors.xsd


    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.sensor.org"
                xmlns="http://www.sensor.org"
                elementFormDefault="qualified">
        <xsd:simpleType name="sensor_type">
            <xsd:restriction base="xsd:string">
                <xsd:enumeration value="barometer"/>
                <xsd:enumeration value="thermometer"/>
                <xsd:enumeration value="anemometer"/>
            </xsd:restriction>
        </xsd:simpleType>
    </xsd:schema>
Now an instance document can conform to weather-station.xsd and use boston-sensors.xsd as the implementation of sensor_type:

boston-weather-station.xml

    <?xml version="1.0"?>
    <weather-station xmlns="http://www.weather-station.org"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xsi:schemaLocation=
                         "http://www.weather-station.org weather-station.xsd
                          http://www.sensor.org boston-sensors.xsd">
        <sensor>thermometer</sensor>
        <sensor>barometer</sensor>
        <sensor>anemometer</sensor>
    </weather-station>
Suppose that the London weather station has all the sensors that Boston has, plus some additional ones that are unique to the London weather patterns. Thus, London will create its own implementation of sensor_type:

london-sensors.xsd

    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                targetNamespace="http://www.sensor.org"
                xmlns="http://www.sensor.org"
                elementFormDefault="qualified">
        <xsd:simpleType name="sensor_type">
            <xsd:restriction base="xsd:string">
                <xsd:enumeration value="barometer"/>
                <xsd:enumeration value="thermometer"/>
                <xsd:enumeration value="anemometer"/>
                <xsd:enumeration value="hygrometer"/>
            </xsd:restriction>
        </xsd:simpleType>
    </xsd:schema>
Note that this implementation of sensor_type has an additional value that Boston's does not have - hygrometer.

Just as with the Boston weather station instance document, the London weather station instance document will conform to a collection of schemas: weather-station.xsd and london-sensors.xsd:

london-weather-station.xml

    <?xml version="1.0"?>
    <weather-station xmlns="http://www.weather-station.org"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                     xsi:schemaLocation=
                         "http://www.weather-station.org weather-station.xsd
                          http://www.sensor.org london-sensors.xsd">
        <sensor>thermometer</sensor>
        <sensor>barometer</sensor>
        <sensor>hygrometer</sensor>
        <sensor>anemometer</sensor>
    </weather-station>

Summary:

This method represents an extraordinarily powerful design pattern. The key to this design pattern is:

1. When you declare the variable content container element give it a type that is in another namespace, e.g., s:sensor_type

2. When you <import> that namespace don't provide a value for schemaLocation, e.g.,

    <xsd:import namespace="http://www.sensor.org"/>
3. Create any number of implementations of the dangling type, e.g.,

- boston-sensors.xsd
- london-sensors.xsd

4. In instance documents identify the schema that you want used to implement the dangling type, e.g.,

xsi:schemaLocation=
    "http://www.weather-station.org weather-station.xsd
     http://www.sensor.org london-sensors.xsd"

Both simpleType and complexType:

In our examples we have implemented the dangling type as a simpleType. The implementation of a dangling type does not have to be a simpleType. A schema could define it as a complexType.
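
As a rough sketch of the complexType case, a weather station might implement sensor_type as shown below. The file name detailed-sensors.xsd and the child elements name and units are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<!-- detailed-sensors.xsd (hypothetical): sensor_type as a complexType -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.sensor.org"
            xmlns="http://www.sensor.org"
            elementFormDefault="qualified">
    <xsd:complexType name="sensor_type">
        <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
            <xsd:element name="units" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>
```

With elementFormDefault="qualified", the children of <sensor> belong to the sensor namespace, so an instance document under this assumption would look something like:

```xml
<?xml version="1.0"?>
<weather-station xmlns="http://www.weather-station.org"
                 xmlns:s="http://www.sensor.org"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation=
                     "http://www.weather-station.org weather-station.xsd
                      http://www.sensor.org detailed-sensors.xsd">
    <sensor>
        <s:name>barometer</s:name>
        <s:units>hPa</s:units>
    </sensor>
</weather-station>
```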

Advantages:

Disadvantages:

Best Practice

Which method you should use to create your variable content containers ultimately depends on your requirements. Here are some things to consider.

Use Method 2 (<choice> element) when:

Use Method 4 (dangling type) when:

Use Method 3 (abstract type with type substitution) when:

Best Practice: Method 4 + Method 3. That is, create a schema with a dangling type. Then, in the schema that implements the dangling type, use an abstract type with type substitution.
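
One way the Method 4 + Method 3 combination might look is sketched below. The main schema (weather-station.xsd) is unchanged and still refers to the dangling type s:sensor_type. The implementing schema defines sensor_type as an abstract type and derives concrete types from it. The file name typed-sensors.xsd, the derived type names, and the child elements are all hypothetical, introduced only for illustration.

```xml
<?xml version="1.0"?>
<!-- typed-sensors.xsd (hypothetical): abstract implementation of the dangling type -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.sensor.org"
            xmlns="http://www.sensor.org"
            elementFormDefault="qualified">
    <!-- The dangling type is implemented as an abstract base type -->
    <xsd:complexType name="sensor_type" abstract="true">
        <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>
    <!-- Concrete types derive from the abstract base type -->
    <xsd:complexType name="barometer_type">
        <xsd:complexContent>
            <xsd:extension base="sensor_type">
                <xsd:sequence>
                    <xsd:element name="pressure_units" type="xsd:string"/>
                </xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>
    <xsd:complexType name="thermometer_type">
        <xsd:complexContent>
            <xsd:extension base="sensor_type">
                <xsd:sequence>
                    <xsd:element name="temperature_scale" type="xsd:string"/>
                </xsd:sequence>
            </xsd:extension>
        </xsd:complexContent>
    </xsd:complexType>
</xsd:schema>
```

Under these assumptions, an instance document would bind the s prefix to http://www.sensor.org, list typed-sensors.xsd in xsi:schemaLocation, and select a concrete type on each sensor, e.g., <sensor xsi:type="s:barometer_type">.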

Acknowledgements

This issue has turned out to have many interesting twists and turns. Special thanks to Curt Arnold and Len Bullard for their many excellent inputs. Without their inputs, this document would not be nearly as complete and detailed. Also, thanks to Rick Jelliffe and Jeff Rafter.