Designing an XML Schema For 3 Elements and at Least One Must be Present

Problem Statement

You have a sequence of three elements (A, B, C, in that order), and you need at least one of them to be present in your XML instance documents. How do you design an XML Schema to accomplish this?

Here are the valid instances:

<A>...</A>
<B>...</B>
<C>...</C>

or

<A>...</A>
<B>...</B>

or

<A>...</A>
<C>...</C>

or

<B>...</B>
<C>...</C>

or

<A>...</A>

or

<B>...</B>

or

<C>...</C>

Will this XML Schema definition work?

<complexType name="ThreeElements">
    <choice>
        <sequence>
            <element name="A" type="string" />
            <element name="B" type="string" minOccurs="0" />
            <element name="C" type="string" minOccurs="0" />
        </sequence> 
        <sequence>
            <element name="A" type="string" minOccurs="0" />
            <element name="B" type="string" />
            <element name="C" type="string" minOccurs="0" />
        </sequence> 
        <sequence>
            <element name="A" type="string" minOccurs="0" />
            <element name="B" type="string" minOccurs="0" />
            <element name="C" type="string" />
        </sequence>
    </choice>
</complexType>

Notice that in the first branch of the choice A is required and B, C are optional. In the second branch B is required and A, C are optional. In the third branch C is required and A, B are optional.

Let's play schema validator: You are parsing an instance document and encounter this element:

<A>...</A>

How do you validate it? The above schema contains three declarations for the <A> element, one in each branch of the choice. The only way for you to know which declaration to use is to look ahead in the instance document. For example, if no <B> element follows, then you can eliminate the second branch.

XML Schema does not allow a content model that requires the validator to look ahead (RELAX NG does allow it).

The technical term for the above content model is: non-deterministic content model.

XML Schema 1.0 does not allow non-deterministic content models. The specification calls this constraint Unique Particle Attribution (UPA).

Thus, the above complexType definition is not valid.
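
For contrast, RELAX NG imposes no determinism requirement, so the same three-branch idea can be written there directly. The following is only a sketch in the RELAX NG XML syntax; the wrapper element name Container is an assumption made for illustration and does not come from the schema above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- RELAX NG sketch. "Container" is a hypothetical wrapper element. -->
<element name="Container" xmlns="http://relaxng.org/ns/structure/1.0">
    <choice>
        <!-- A required; B and C optional -->
        <group>
            <element name="A"><text/></element>
            <optional><element name="B"><text/></element></optional>
            <optional><element name="C"><text/></element></optional>
        </group>
        <!-- B required; A and C optional -->
        <group>
            <optional><element name="A"><text/></element></optional>
            <element name="B"><text/></element>
            <optional><element name="C"><text/></element></optional>
        </group>
        <!-- C required; A and B optional -->
        <group>
            <optional><element name="A"><text/></element></optional>
            <optional><element name="B"><text/></element></optional>
            <element name="C"><text/></element>
        </group>
    </choice>
</element>
```

A RELAX NG validator may keep all three branches open while reading the children and only needs one of them to match in the end, which is exactly the look-ahead behavior that XML Schema rules out.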

Solution #1

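Here is an all-Schema solution. Each branch of the choice begins with a different required element (A, then B, then C), so the first child element the validator encounters tells it which branch applies, and no look-ahead is needed: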

<complexType name="ThreeElements">
    <choice>
        <sequence>
            <element name="A" type="string" />
            <element name="B" type="string" minOccurs="0" />
            <element name="C" type="string" minOccurs="0" />
        </sequence>
        <sequence>
            <element name="B" type="string" />
            <element name="C" type="string" minOccurs="0" />
        </sequence>
        <sequence>
            <element name="C" type="string" />
        </sequence>
    </choice>
</complexType>
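
To try this type in a validator, it has to sit inside a complete schema document with an element that uses it. The sketch below is one way to do that, under two assumptions that are not part of the original fragment: the root element name Container is hypothetical, and the schema has no target namespace. It uses an xs: prefix instead of the default namespace used in the fragments above because, without a target namespace, an unprefixed type="ThreeElements" would otherwise be looked up in the XML Schema namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Hypothetical root element that uses the type -->
    <xs:element name="Container" type="ThreeElements"/>

    <xs:complexType name="ThreeElements">
        <xs:choice>
            <!-- Branch 1: starts with A; B and C may follow -->
            <xs:sequence>
                <xs:element name="A" type="xs:string"/>
                <xs:element name="B" type="xs:string" minOccurs="0"/>
                <xs:element name="C" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <!-- Branch 2: no A; starts with B; C may follow -->
            <xs:sequence>
                <xs:element name="B" type="xs:string"/>
                <xs:element name="C" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <!-- Branch 3: C alone -->
            <xs:sequence>
                <xs:element name="C" type="xs:string"/>
            </xs:sequence>
        </xs:choice>
    </xs:complexType>

</xs:schema>
```

Against this schema, an instance such as <Container><A>...</A><C>...</C></Container> should validate through the first branch, while an empty <Container/> should be rejected because every branch requires at least one element.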

Solution #2

Here is a solution that uses a combination of XML Schema and Schematron.

First, create a simple XML Schema in which all three elements are optional:

<complexType name="ThreeElements">
    <sequence>
        <element name="A" type="string" minOccurs="0" />
        <element name="B" type="string" minOccurs="0" />
        <element name="C" type="string" minOccurs="0" />
    </sequence>
</complexType>

Then, create a Schematron assertion that implements the business rule that at least one element must be present. The assertion belongs inside a rule whose context is the element of type ThreeElements:

<sch:assert test="count(*) >= 1">
    At least one element must be present
</sch:assert>
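
The assertion above is a fragment; a Schematron processor expects it inside a schema, pattern, and rule. A minimal sketch of that wrapping follows, assuming ISO Schematron (hence the namespace shown) and assuming the element of type ThreeElements is named Container, a hypothetical name chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
    <sch:pattern>
        <!-- "Container" is a hypothetical element of type ThreeElements -->
        <sch:rule context="Container">
            <!-- The XML Schema already restricts children to A, B, C,
                 so counting all child elements is sufficient here -->
            <sch:assert test="count(*) >= 1">
                At least one element must be present
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

With this split, the XML Schema checks the names, order, and types of the children, and the Schematron rule reports the assertion message for any Container that has no children.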

Acknowledgements

Thanks to the following people for their input to this document:

  • Lyle Anderson
  • George Christian Bina
  • Pete Cordell
  • Roger Costello
  • Dave Czulada
  • Fraser Goffin
  • Robert Koberg
  • Bryan Rasmussen
  • Laurens van den Oever
  • Andrew Welch

Last Updated: August 15, 2008