Hide (Localize) Namespaces
Versus
Expose Namespaces
(A Collectively Developed Set of Schema Design Guidelines)
Issue
When should a schema be designed to hide (localize) namespaces within the schema, and
when should it be designed to expose namespaces in instance documents?
Introduction
A typical schema uses elements and types from multiple schemas, each with a different namespace.
A schema, then, may be composed of components from multiple namespaces. Thus, when a schema
is designed, the schema designer must decide whether or not the origin (namespace) of each element should be
exposed in instance documents. A binary "switch" attribute (elementFormDefault) on the <schema> element
controls the hiding/exposure of namespaces: setting elementFormDefault="unqualified" (the default when the attribute is omitted)
hides (localizes) the namespaces within the schema, and setting
elementFormDefault="qualified" exposes the namespaces in instance
documents.
Example
Below is a schema for describing a camera. The schema uses components from three other
schemas - the camera's <body> element uses a type from the Nikon schema,
the camera's <lens> element uses a type from the Olympus schema, and
the camera's <manual_adaptor> element uses a type from the Pentax schema.
Camera.xsd
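The Camera.xsd listing itself is not included in this copy. As a minimal sketch of what such a schema could look like, the following assumes illustrative namespace URIs (http://www.camera.org, http://www.nikon.com, http://www.olympus.com, http://www.pentax.com) and hypothetical type names (body_type, lens_type, manual_adaptor_type). Only the element names and file names come from the text.

```xml
<?xml version="1.0"?>
<!-- Sketch only: namespace URIs and type names are assumptions, not the original listing. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.camera.org"
            xmlns:nikon="http://www.nikon.com"
            xmlns:olympus="http://www.olympus.com"
            xmlns:pentax="http://www.pentax.com"
            elementFormDefault="unqualified">

    <!-- One import per foreign namespace whose components are used below. -->
    <xsd:import namespace="http://www.nikon.com" schemaLocation="Nikon.xsd"/>
    <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/>
    <xsd:import namespace="http://www.pentax.com" schemaLocation="Pentax.xsd"/>

    <!-- camera is global, so it is always namespace-qualified in instances. -->
    <xsd:element name="camera">
        <xsd:complexType>
            <xsd:sequence>
                <!-- Local elements: their form follows elementFormDefault. -->
                <xsd:element name="body" type="nikon:body_type"/>
                <xsd:element name="lens" type="olympus:lens_type"/>
                <xsd:element name="manual_adaptor" type="pentax:manual_adaptor_type"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```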
Note the three <import> elements for importing the Nikon, Olympus, and Pentax components.
Also note that the <schema> attribute elementFormDefault has been set to the value
unqualified. This is a critical attribute. Its value controls whether the namespaces
of the elements being used by the schema will be hidden or exposed in instance documents (thus, it behaves
like a switch turning namespace exposure on/off). Because it has
been set to "unqualified" in this schema, the namespaces will remain hidden (localized)
within the schema, and will not be visible in instance documents, as we see here:
Camera.xml (namespaces hidden)
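The instance listing is not included in this copy. A sketch of an instance with the namespaces hidden might look like the following, assuming the camera namespace URI http://www.camera.org, a hypothetical prefix my, and made-up element content. The child element names are those mentioned in the text.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the URI, the prefix "my", and the text values are assumptions. -->
<!-- A prefix is used rather than a default namespace, because a default
     namespace declaration would also apply to the unprefixed children. -->
<my:camera xmlns:my="http://www.camera.org">
    <body>
        <description>Ergonomically designed casing for easy handling</description>
    </body>
    <lens>
        <zoom>300mm</zoom>
        <f-stop>1.2</f-stop>
    </lens>
    <manual_adaptor>
        <speed>1/10,000 sec to 100 sec</speed>
    </manual_adaptor>
</my:camera>
```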
The only namespace qualifier exposed in the instance document is on the <camera>
root element. The rest of the document is completely free of namespace qualifiers.
The Nikon, Olympus, and Pentax namespaces are completely hidden (localized)
within the schema!
Looking at the instance document one would
never realize that the schema got its components from three other schemas.
Such complexities are localized to the schema. Thus, we say
that the schema has been designed in such a fashion that its component namespace complexities are
"hidden" from the instance document.
On the other hand, if the above schema had set elementFormDefault="qualified" then the namespace of each element would be
exposed in instance documents. Here's what the instance document would look like:
Camera.xml (namespaces exposed)
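The instance listing is not included in this copy. A sketch of the exposed form, assuming that Camera.xsd and all three imported schemas set elementFormDefault="qualified", and using the same assumed namespace URIs and hypothetical prefixes, could look like this:

```xml
<?xml version="1.0"?>
<!-- Sketch only: URIs, prefixes, and text values are assumptions. -->
<c:camera xmlns:c="http://www.camera.org"
          xmlns:nikon="http://www.nikon.com"
          xmlns:olympus="http://www.olympus.com"
          xmlns:pentax="http://www.pentax.com">
    <!-- body, lens, and manual_adaptor are declared in Camera.xsd,
         so they take the camera namespace. -->
    <c:body>
        <!-- description is declared in Nikon.xsd. -->
        <nikon:description>Ergonomically designed casing for easy handling</nikon:description>
    </c:body>
    <c:lens>
        <!-- zoom and f-stop are declared in Olympus.xsd. -->
        <olympus:zoom>300mm</olympus:zoom>
        <olympus:f-stop>1.2</olympus:f-stop>
    </c:lens>
    <c:manual_adaptor>
        <!-- speed is declared in Pentax.xsd. -->
        <pentax:speed>1/10,000 sec to 100 sec</pentax:speed>
    </c:manual_adaptor>
</c:camera>
```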
Note that each element is explicitly namespace-qualified. Also, observe
the declaration for each namespace. Due to the way the schema has been
designed, the complexities of where the schema obtained its components have been
pushed out to the instance document. Thus, the reader of this instance document is
"exposed" to the fact that the schema obtained the description element from the
Nikon schema, the zoom and f-stop elements from the Olympus schema, and the speed
element from the Pentax schema. The instance document is definitely
a lot busier!
All Schemas must have a Consistent Value for elementFormDefault!
Be sure to note that elementFormDefault applies just to the schema that it is in. It does not
apply to schemas that it includes or imports. Consequently, if you want to hide namespaces then
all schemas involved must have set elementFormDefault="unqualified". Likewise, if you want to
expose namespaces then all schemas involved must have set elementFormDefault="qualified". To see
what happens when you "mix" elementFormDefault values, suppose that Camera.xsd and Olympus.xsd both set
elementFormDefault="unqualified", while Nikon.xsd and Pentax.xsd both set elementFormDefault="qualified".
Here's what an instance document looks like in this "mixed" design:
Camera.xml (mixed design)
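The instance listing is not included in this copy. Under the mixed settings just described, and with the same assumed namespace URIs, hypothetical prefixes, and made-up content, an instance might look like this sketch:

```xml
<?xml version="1.0"?>
<!-- Sketch only: URIs, prefixes, and text values are assumptions. -->
<my:camera xmlns:my="http://www.camera.org"
           xmlns:nikon="http://www.nikon.com"
           xmlns:pentax="http://www.pentax.com">
    <!-- Declared locally in Camera.xsd (unqualified). -->
    <body>
        <!-- Declared locally in Nikon.xsd (qualified). -->
        <nikon:description>Ergonomically designed casing for easy handling</nikon:description>
    </body>
    <lens>
        <!-- Declared locally in Olympus.xsd (unqualified). -->
        <zoom>300mm</zoom>
        <f-stop>1.2</f-stop>
    </lens>
    <manual_adaptor>
        <!-- Declared locally in Pentax.xsd (qualified). -->
        <pentax:speed>1/10,000 sec to 100 sec</pentax:speed>
    </manual_adaptor>
</my:camera>
```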
Observe that in this instance document some of the elements are namespace-qualified, while others are not.
Namely, the local elements declared in the Camera and Olympus schemas are not qualified, whereas those declared in
the Nikon and Pentax schemas are qualified. (The <camera> root element is qualified in every design because it is globally declared.)
Technical Requirements for Hiding (Localizing) Namespaces
There are two requirements on an element for its namespace to be hidden from instance
documents:
[1] The value of elementFormDefault must be "unqualified".
[2] The element must not be globally declared. For example:
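The listing that belongs here is not included in this copy. A minimal sketch of a globally declared element, assuming a hypothetical target namespace and a simple string type, might be:

```xml
<?xml version="1.0"?>
<!-- Sketch only: the target namespace and the element's type are assumptions. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.example.org/foo"
            elementFormDefault="unqualified">
    <!-- foo is an immediate child of <schema>, so it is a global element. -->
    <xsd:element name="foo" type="xsd:string"/>
</xsd:schema>
```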
The element foo can never have its namespace hidden from instance documents,
regardless of the value of elementFormDefault. foo is a global element
(i.e., an immediate child of <schema>) and therefore must always be
qualified. To enable namespace hiding the element must be a local
element.
Best Practices
For this issue there is no definitive Best Practice with respect to whether to design your schemas
to hide/localize namespaces, or design them to expose namespaces. Sometimes it's best
to hide the namespaces. Other times it's best to expose the namespaces. Both
have their pluses and minuses, as discussed below.
However, there are Best Practices with regard to other aspects of this issue. They are:
1. Whenever you create a schema, make two copies of it. The copies should be identical, except that in one
copy set elementFormDefault="qualified", whereas in the other copy set elementFormDefault="unqualified".
If you make two versions of all your schemas then people who use your schemas will be able to implement
either design approach - hide/localize namespaces, or expose namespaces.
2. Minimize the use of global elements and attributes so that elementFormDefault (and, for attributes, attributeFormDefault) can behave as
an "exposure switch". The rationale for this Best Practice was described above, in
Technical Requirements for Hiding (Localizing) Namespaces.
Advantages of Hiding (Localizing) Component Namespaces within the Schema
The instance document is simple. It's easy to read and understand.
There are no namespace qualifiers cluttering up
the document, except for the one on the document element (which is okay because it shows the
domain of the document).
The knowledge of where the schema got its components
is irrelevant and localized to the schema.
Design your schema to hide (localize) namespaces within the schema ...
- when simplicity, readability, and understandability of instance documents are of
utmost importance
- when namespaces in the instance document provide no necessary additional information. In
many scenarios the users of the instance documents are not XML experts. Namespaces
would distract and confuse such users, who are concerned only with structure
and content.
- when you need the flexibility of being able to change the schema
without affecting instance documents. To see this, imagine that when a schema is
originally designed it imports elements/types from another namespace.
Since the schema has been designed to hide (localize) the namespaces, instance documents
do not see the namespaces of the imported elements. Then, imagine that at a later date
the schema is changed such that instead of importing the elements/types, those
elements and types are declared/defined right within the schema (inline). This change
from using elements/types from another namespace to using elements/types in the local namespace
has no impact on instance documents because the schema has been designed to shield
instance documents from where the components come from.
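To illustrate the last point, here is a sketch of a camera schema in which the body type is defined inline rather than imported from Nikon. The namespace URIs, the prefix cam, and the type names are assumptions made for illustration. Because elementFormDefault is "unqualified", the local <body> and <description> elements stay unqualified, so an instance written against the importing version of the schema should not need to change.

```xml
<?xml version="1.0"?>
<!-- Sketch only: URIs, the prefix "cam", and type names are assumptions. -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.camera.org"
            xmlns:cam="http://www.camera.org"
            xmlns:olympus="http://www.olympus.com"
            xmlns:pentax="http://www.pentax.com"
            elementFormDefault="unqualified">

    <!-- The Nikon import is gone; Olympus and Pentax are still imported. -->
    <xsd:import namespace="http://www.olympus.com" schemaLocation="Olympus.xsd"/>
    <xsd:import namespace="http://www.pentax.com" schemaLocation="Pentax.xsd"/>

    <!-- body_type now lives in the camera namespace itself. -->
    <xsd:complexType name="body_type">
        <xsd:sequence>
            <xsd:element name="description" type="xsd:string"/>
        </xsd:sequence>
    </xsd:complexType>

    <xsd:element name="camera">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="body" type="cam:body_type"/>
                <xsd:element name="lens" type="olympus:lens_type"/>
                <xsd:element name="manual_adaptor" type="pentax:manual_adaptor_type"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
```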
Advantages of Exposing Namespaces in Instance Documents
If your company spends the time and money to create a reusable schema component,
and you make it available to the marketplace, then you will most likely want recognition
for that component. Namespaces provide a means to achieve recognition. For example,
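The example that belongs here is not included in this copy. A namespace-qualified element of the kind being described might look like the following sketch, where the namespace URI and the element content are assumptions:

```xml
<!-- Sketch only: the URI and the text value are assumptions. -->
<nikon:description xmlns:nikon="http://www.nikon.com">
    Ergonomically designed casing for easy handling
</nikon:description>
```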
There can be no misunderstanding that this component comes from Nikon. The namespace qualifier
is providing information on the origin/lineage of the description element.
Another case where it is desirable to expose namespaces is when processing
instance documents. Oftentimes the namespace of an element is required
to determine how the element is to be processed. If the namespaces are hidden
then your application is forced to consult the schema to find out where each element came from,
which can be unacceptably slow.
Design your schema to expose namespaces in instance documents ...
- when lineage/ownership of the elements is important to the instance document users
(such as for copyright purposes).
- when there are multiple elements with the same name but different semantics, and you
want to namespace-qualify them so that they can be differentiated (e.g., publisher:body
versus human:body).
[In some cases you have multiple elements with the same name and different semantics
but the context of the element is sufficient to determine its semantics. Example: the title element in <person><title> is
easily distinguished from the title element in <chapter><title>. In such cases there is less justification
for designing your schema to expose the namespaces.]
- when processing (by an application) of the instance document elements is dependent upon knowledge of the
namespaces of the elements.