The Carbon Java Framework  

The Carbon Core Config Subsystem

Configuration Service Validation Design

Author: Jordan Reed (jreed at sapient.com)
Version: $Revision: 1.4 $($Author: dvoet $ / $Date: 2003/05/01 14:53:22 $)
Created: March 2003

Introduction

The major form of configuration in the Carbon system is XML-based. This makes it possible to use various forms of XML validation technology to help developers create properly defined configuration documents, without having to start the Carbon system and wait for each Component to validate its own configuration upon startup.

How is XML Validated?

There are three major forms of validation that can be performed against XML documents. Each level builds on the success of the previous one.

Structural Validation

Structural validation checks the well-formedness of an XML document. This means that the document is properly nested, with every element opened and closed in a proper hierarchy.
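The well-formedness check described above can be sketched with the standard JAXP parser. This is a minimal illustration, not Carbon's own code; the class name WellFormedCheck and the sample documents are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: checks well-formedness only, with no DTD or schema involved.
class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default handler print them.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser could not read the document: it is not well-formed.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not be used", e);
        }
    }

    public static void main(String[] args) {
        // Elements close in the order they were opened.
        System.out.println(isWellFormed("<Configuration><Name>x</Name></Configuration>"));
        // Name is closed after its parent, which breaks the hierarchy.
        System.out.println(isWellFormed("<Configuration><Name>x</Configuration></Name>"));
    }
}
```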

Additional structural validation can occur with the presence of a validation document such as a DTD or XML Schema. These documents can define the list of valid XML elements within the document, the order of the elements and what is allowed to be sub-elements of each. They also define the attributes that can be applied to the various elements.

Data Validation

Data validation is the validation of the information appearing within elements and attributes. It may restrict the content of elements or attributes to numbers, require it to match a regular expression, or apply various other content-based restrictions.

Data validation requires the use of a validation document which supports complex type definitions. DTDs do not have this capability, and thus the use of XML Schema is needed to support automatic data validation.
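As an illustration of data validation, the sketch below validates a small document against an XML Schema that restricts an element to an integer range. It uses the standard javax.xml.validation API, which is an assumption of this example and not something this document states Carbon uses. The Capacity element, its limits, and the class name are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaDataValidation {

    // Hypothetical schema: Capacity must be an integer between 1 and 10000.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Configuration'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='Capacity'>"
        + "<xs:simpleType><xs:restriction base='xs:integer'>"
        + "<xs:minInclusive value='1'/><xs:maxInclusive value='10000'/>"
        + "</xs:restriction></xs:simpleType>"
        + "</xs:element>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself could not be compiled", e);
        }
        try {
            // With no ErrorHandler set, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // structural or data validation failed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<Configuration><Capacity>500</Capacity></Configuration>"));
        System.out.println(isValid("<Configuration><Capacity>lots</Capacity></Configuration>"));
    }
}
```

A DTD could declare that Capacity exists, but only a schema of this kind can reject the second document because of what the element contains.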

Programmatic Validation

Programmatic validation, often called business validation, is the most complex level. There is no formal method of externalizing the rules of programmatic validation into a validation document. This type of validation is performed by a business application that reads in the XML file and performs programmatic checks against its content.
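A programmatic check of this kind might look like the sketch below. The rule (MinSize must not exceed MaxSize), the element names, and the class name are all hypothetical; they only stand in for whatever rules a real Component would apply after reading its configuration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ProgrammaticValidation {

    /** Returns a list of rule violations; an empty list means the document passed. */
    static List<String> validate(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            int min = intValue(doc, "MinSize");
            int max = intValue(doc, "MaxSize");
            // Hypothetical business rule relating two separate elements.
            if (min > max) {
                problems.add("MinSize (" + min + ") must not exceed MaxSize (" + max + ")");
            }
        } catch (Exception e) {
            problems.add("Could not read configuration: " + e);
        }
        return problems;
    }

    private static int intValue(Document doc, String tag) {
        Node node = doc.getElementsByTagName(tag).item(0);
        if (node == null) {
            throw new IllegalArgumentException("Missing element " + tag);
        }
        return Integer.parseInt(node.getTextContent().trim());
    }

    public static void main(String[] args) {
        System.out.println(validate(
            "<Configuration><MinSize>50</MinSize><MaxSize>10</MaxSize></Configuration>"));
    }
}
```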

Validation Documents

DTD

"XML 1.0 included a set of tools for defining XML document structures, called Document Type Definitions (DTDs). DTDs provide a set of tools for defining which element and attribute structures are permitted in a document, as well as mechanisms for providing default values for attributes, defining reusable constants (entities), and some kinds of metadata information (notations). While DTDs are widely supported and used, many XML developers quickly outgrew the capabilities DTDs provide." - XML Schema (Eric van der Vlist).

A complete description of DTDs can be found in the Extensible Markup Language (XML) 1.0 (Second Edition) technical documentation at the W3C.

Carbon supports the use of DTDs as validation documents inside of Configuration files, but does not recommend them. DTDs can only provide structural validation of an XML document and therefore do not add a significant level of value to the average developer of the configuration files.

XML Schema

XML Schema was created by the W3C as a method of adding data validation information to documents and providing an XML-based extensible way of defining the structure and content of the document.

A complete description of XML Schema can be found in the XML Schema technical documentation at the W3C.

Attaching an XML Schema to a Carbon Config

Attaching an XML Schema to a Carbon configuration document is a simple process that requires only two additional attributes to be added to the top of the Configuration document inside the <Configuration> element.

<?xml version="1.0" encoding="UTF-8"?>
<Configuration
        ConfigurationInterface="org.sape.carbon.services.cache.mru.MRUCacheConfiguration"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="http://carbon.sourceforge.net/schema/MRUCache.xsd">

    <!-- ... -->

</Configuration>
  • xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" - The first item defines a namespace to the XML processor stating that the document may include XML Schema based attributes or elements.
  • xsi:noNamespaceSchemaLocation="http://carbon.sourceforge.net/schema/MRUCache.xsd" - The second item associates a schema with this document. The value given is often called a "hint" to the location of the actual schema. Currently in Carbon, the value must be a valid URL (or a classpath location) that can be used to locate the schema. Examples include:
    • http://carbon.sourceforge.net/schema/MRUCache.xsd
    • file://d:/myApp/schema/MRUCache.xsd
    • classpath://com/sapient/services/cache/mru/MRUCache.xsd (Note: This is a Carbon specific enhancement to the resolution of the Schema location. Normal XML processing programs will not be able to resolve this URI without the help of XML Catalogs.)
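The handling of the classpath:// form could be sketched as follows. This is not Carbon's actual resolution code, which this document does not show; the class and method names are hypothetical, and the sketch simply assumes that the text after classpath:// is a resource path handed to a class loader.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper illustrating how a schema location hint might be resolved.
class SchemaLocationResolver {

    static final String CLASSPATH_PREFIX = "classpath://";

    /** Returns the resource path of a classpath:// hint, or null for any other form. */
    static String classpathResource(String hint) {
        return hint.startsWith(CLASSPATH_PREFIX)
                ? hint.substring(CLASSPATH_PREFIX.length())
                : null;
    }

    /** Resolves a hint to a URL; returns null when a classpath resource is not found. */
    static URL resolve(String hint) {
        String resource = classpathResource(hint);
        if (resource != null) {
            // Look the schema up among the resources visible to this class's loader.
            return SchemaLocationResolver.class.getClassLoader().getResource(resource);
        }
        try {
            // Anything else must already be a URL the platform understands.
            return URI.create(hint).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable schema location: " + hint, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://carbon.sourceforge.net/schema/MRUCache.xsd"));
        // Prints null unless the schema is actually on the classpath.
        System.out.println(resolve("classpath://com/sapient/services/cache/mru/MRUCache.xsd"));
    }
}
```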

Validation with XML Schema

Development-time

The use of XML Schema documents with Carbon is most useful during application development. When using an XML editing program (like Altova XMLSPY or TIBCO TurboXML), a developer can see immediately whether a document meets the structural and data validation requirements of the configuration. The editor will also provide useful IDE-like features such as automatically closing elements, listing possible sub-elements, etc.

It is also possible to develop the configuration XML in a standard text editor and use one of the many XML validation tools available on the web to quickly check if the document is validating properly.

Run-time

Run-time validation of the XML document is considered less useful and is only partially enforced inside the Carbon system.

Structural validation of XML documents is required since it is impossible for a parser to properly read in the information contained within the document if it is not well-formed XML.

Data validation errors will not break the Carbon system. If an XML document references a validation document (DTD or XML Schema), an attempt will be made to validate the document. If validation fails, an error message will be logged, but the system will continue to process the XML document, ignoring the validation rules.
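The log-and-continue behaviour can be sketched with an error handler that records validation errors instead of throwing them. This is an illustration built on the standard javax.xml.validation API, not Carbon's implementation; the class name, the sample schema, and the use of a list in place of a real logger are assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class LenientValidation {

    // Hypothetical schema: Capacity must be an integer.
    static final String SAMPLE_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Configuration'><xs:complexType><xs:sequence>"
        + "<xs:element name='Capacity' type='xs:integer'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Validates xml against xsd and returns the messages that would be logged. */
    static List<String> validate(String xsd, String xml) {
        final List<String> logged = new ArrayList<>();
        try {
            Validator validator = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)))
                    .newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { logged.add("WARN " + e.getMessage()); }
                // Validation errors are recorded, not rethrown, so processing continues.
                public void error(SAXParseException e) { logged.add("ERROR " + e.getMessage()); }
                // A document that is not well-formed cannot be read at all.
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("Document could not be read", e);
        }
        return logged;
    }

    public static void main(String[] args) {
        List<String> logged = validate(SAMPLE_XSD,
                "<Configuration><Capacity>lots</Capacity></Configuration>");
        for (String message : logged) {
            System.out.println(message);
        }
        System.out.println("Continuing to process the configuration");
    }
}
```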

Additional Questions

Why aren't XML Namespaces supported?

XML Namespaces provide a method of using multiple schemas to define a single XML document. With Carbon's concept of nested configurations, it seems like having multiple namespaces would be a natural extension.

For example, a Cache config document usually includes a nested DataLoader element. It would appear that this configuration should actually be validated against a Cache XML Schema as well as a DataLoader XML Schema, each in its own namespace.

This solution is possible on input, but creates serious problems when attempting to manipulate the document through JMX or other management tools. When adding elements to the document, there is no way for Carbon's configuration service to know what namespace the element is supposed to occur under. It can make educated guesses, but may often be incorrect. Therefore it was decided to only allow XML Schemas that do not include a namespace.
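The difficulty can be seen with the standard DOM API, used here purely as an illustration since this document does not describe how Carbon builds its documents. An element added by name alone ends up in no namespace; placing it correctly requires the caller to supply a namespace URI, which is exactly the information a management tool does not have. The namespace URIs and class name below are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NamespaceChoice {

    static final String CACHE_NS = "urn:example:cache";        // hypothetical
    static final String LOADER_NS = "urn:example:dataloader";  // hypothetical

    static Document emptyDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Adds a child by name only, as a tool that knows nothing about namespaces would. */
    static Element addUnqualified(Element parent, String name) {
        Element child = parent.getOwnerDocument().createElement(name);
        parent.appendChild(child);
        return child;
    }

    /** Adds a child in a namespace that the caller has to know in advance. */
    static Element addQualified(Element parent, String namespace, String name) {
        Element child = parent.getOwnerDocument().createElementNS(namespace, name);
        parent.appendChild(child);
        return child;
    }

    public static void main(String[] args) {
        Document doc = emptyDocument();
        Element cache = doc.createElementNS(CACHE_NS, "Cache");
        doc.appendChild(cache);
        // Same element name, two different outcomes depending on what the caller knows.
        System.out.println(addUnqualified(cache, "DataLoader").getNamespaceURI());
        System.out.println(addQualified(cache, LOADER_NS, "DataLoader").getNamespaceURI());
    }
}
```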

Do I have to hard-code my XML Schema location?

When including the location of an XML Schema in a configuration document with the xsi:noNamespaceSchemaLocation attribute, you must give a location that the XML processor can use to find the XML Schema document. The default entity resolver used by Xalan (Carbon's default XML processor) is only able to find locations given as a valid URL. It does not do additional processing to attempt to determine the location of the document.

Future versions of Carbon may include the Apache XML Commons Resolver, which provides support for OASIS XML Catalogs. A catalog allows a user to map a given URI for a schema location to a URL using a configuration file.
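The mapping idea behind a catalog can be sketched with a plain SAX EntityResolver backed by an in-memory map. This is not the Apache XML Commons Resolver and does not read the OASIS catalog format; the class name and the sample URI and URL are hypothetical. Whether a given parser consults an EntityResolver for schema locations varies, so the sketch shows only the lookup itself.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a catalog: maps a stable URI to wherever the schema lives.
class MappedSchemaResolver implements EntityResolver {

    private final Map<String, String> mappings = new HashMap<>();

    /** Registers the URL at which the document named by the given URI can be found. */
    void map(String uri, String url) {
        mappings.put(uri, url);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String target = mappings.get(systemId);
        // Returning null asks the parser to fall back to opening the system id itself.
        return target == null ? null : new InputSource(target);
    }

    public static void main(String[] args) {
        MappedSchemaResolver resolver = new MappedSchemaResolver();
        resolver.map("urn:example:schema:MRUCache", "file:///opt/myApp/schema/MRUCache.xsd");
        InputSource source = resolver.resolveEntity(null, "urn:example:schema:MRUCache");
        System.out.println(source.getSystemId());
    }
}
```

With a mapping like this, the configuration document can carry a location-independent name while each installation decides where the schema file actually lives.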


Copyright © 2001-2003, Sapient Corporation