Schematron Usage and Features
  • Roger L. Costello and
  • Robin A. Simmons
  1. Schematron is a schema language for making assertions about data in an Extensible Markup Language (XML) document.
  2. Use Schematron to verify data interdependencies (co-constraints), check data cardinality, and perform algorithmic checks.
  3. A co-constraint is a dependency between data within an XML document or across XML documents.
  4. Cardinality refers to the presence or absence of data.
  5. An algorithmic check determines data validity by performing an algorithm on the data.
  6. Schematron can perform any check that can be expressed as an XPath expression or an XSLT test condition.

Schematron is an assertion-based schema language. Data constraints are expressed by making assertions (using XPath) about what relationships and patterns should hold true in the data.

Example: The following XML instance document contains a classification attribute on the <Document> element and on the <Para> element:

    <?xml version="1.0"?>
    <Document classification="secret">
        <Para classification="unclassified">
            One if by land; two if by sea.
        </Para>
    </Document>

Schematron can be used to assert:

  1. The <Para> classification value cannot be more sensitive than the <Document> classification value.
  2. The <Para> element cannot contain any restricted keywords.

The first assertion is a co-constraint between the values of two attributes. A co-constraint can apply to any number of dependencies between XML structure components (elements and attributes) as well as between values. A co-constraint may be within an XML document or across multiple XML documents.
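
As a minimal sketch, the first assertion could be written as the Schematron schema below. The ordering of classification levels (unclassified, then secret, then top-secret) and the value "top-secret" are assumptions made for illustration; the text above does not define them.

```xml
<?xml version="1.0"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- Assumed ordering, least to most sensitive:
         unclassified, secret, top-secret (hypothetical) -->
    <rule context="Document/Para">
      <!-- Fails when the Para level is above the level of its parent Document -->
      <assert test="not(@classification = 'top-secret' and ../@classification != 'top-secret')
                    and not(@classification = 'secret' and ../@classification = 'unclassified')">
        The Para classification must not be more sensitive than the Document classification.
      </assert>
    </rule>
  </pattern>
</schema>
```

Against the instance document above, the assertion holds: the <Para> is unclassified and the <Document> is secret.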

The second assertion is a cardinality check: the text in the <Para> element must not contain any word from a list of restricted keywords. The list may be obtained dynamically from another file.
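
One possible way to express the keyword check is sketched below. It assumes an XSLT-based Schematron implementation (so that the XSLT document() function is available) and a hypothetical file named keywords.xml whose layout is shown in the comment; neither the file name nor its structure comes from the text above.

```xml
<?xml version="1.0"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <!-- Hypothetical keywords.xml:
         <Keywords>
           <Keyword>land</Keyword>
           <Keyword>sea</Keyword>
         </Keywords> -->
    <rule context="Para">
      <!-- Capture the paragraph text so it can be used inside the predicate -->
      <let name="text" value="string(.)"/>
      <!-- Fails if any Keyword occurs as a substring of the paragraph text -->
      <assert test="not(document('keywords.xml')/Keywords/Keyword[contains($text, .)])">
        The Para element must not contain a restricted keyword.
      </assert>
    </rule>
  </pattern>
</schema>
```

The contains() test is a case-sensitive substring match, so a stricter check may need to normalize case or tokenize the text first. How the relative path to keywords.xml is resolved depends on the implementation.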

In general, cardinality constraints are constraints on the occurrence of data, elements, or attributes. The cardinality constraints may apply over the entire document or to portions of the document.
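
A constraint on the occurrence of an attribute or element might look like the following sketch. Both rules are hypothetical examples chosen for illustration; the text does not say that the instance document is actually subject to them.

```xml
<?xml version="1.0"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="Document">
      <!-- Hypothetical: the classification attribute must be present -->
      <assert test="@classification">
        The Document element must have a classification attribute.
      </assert>
      <!-- Hypothetical: at least one Para child must occur -->
      <assert test="count(Para) &gt;= 1">
        The Document element must contain at least one Para element.
      </assert>
    </rule>
  </pattern>
</schema>
```

Because the rule context selects where the assertions apply, the same technique can constrain the whole document or only a portion of it.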

The above two uses of Schematron involve examination or comparison of data. A third category of Schematron usage involves validating the data by algorithmic processing of the data.

Example: The following XML instance document contains election results. To be a valid document, the candidate election results must total 100%.

    <?xml version="1.0"?>
    <ElectionResultsByPercentage>
        <Candidate name="John">61</Candidate>
        <Candidate name="Sara">24</Candidate>
        <Candidate name="Bill">15</Candidate>
    </ElectionResultsByPercentage>

Schematron can be used to assert that the values of the <Candidate> elements must total 100 percent.
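
A minimal sketch of such an assertion, using the XPath sum() function, is shown below. The wording of the message is illustrative only.

```xml
<?xml version="1.0"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron">
  <pattern>
    <rule context="ElectionResultsByPercentage">
      <!-- sum() adds the numeric values of all Candidate children -->
      <assert test="sum(Candidate) = 100">
        The Candidate percentages must total 100, but they total
        <value-of select="sum(Candidate)"/>.
      </assert>
    </rule>
  </pattern>
</schema>
```

For the election results above, 61 + 24 + 15 = 100, so the assertion holds. If the percentages could be fractional, an exact comparison may be too strict, and a small tolerance might be preferable.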

Recommendation:

Use Schematron to express data constraints:

  1. Co-constraints
  2. Cardinality checking
  3. Algorithmic checking

Additional Schematron Features. Schematron lets the schema author write the text of each error message, so a validation failure can be reported in terms the user understands. Schematron can also obtain data dynamically from external files. For the second assertion above, this allows the list of restricted keywords to be kept in a separate file.