JAXP Masquerading

Written by Kohsuke KAWAGUCHI

Introduction

"JAXP masquerading" is a feature of MSV that allows developers to incorporate MSV into existing JAXP-based applications relatively easily.

The key advantage of this feature is that the change to your existing code will be minimal.

How It Works

MSV provides its own implementation of the JAXP interfaces, which acts as a proxy (or wrapper) around another JAXP implementation. MSV does no actual work except validation; all other parsing tasks are delegated to the underlying JAXP implementation.

From the application developer's perspective, this feature effectively adds multi-schema validation capability to any JAXP parser.
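The delegation described above can be sketched with plain JAXP. This is a hypothetical illustration, not MSV's actual class: the name DelegatingFactory and the interceptedSchema accessor are assumptions made for the example, and the validation step itself is left out.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical sketch of a proxy factory; NOT the MSV implementation.
class DelegatingFactory extends DocumentBuilderFactory {
    // The attribute name documented for MSV's masquerading factories.
    static final String SCHEMA_ATTR = "http://www.sun.com/xml/msv/schema";

    private final DocumentBuilderFactory core; // the wrapped JAXP implementation
    private Object schema;                     // value intercepted by the wrapper

    DelegatingFactory(DocumentBuilderFactory core) {
        this.core = core;
    }

    // Hypothetical accessor, present only so the sketch can be inspected.
    Object interceptedSchema() {
        return schema;
    }

    @Override
    public void setAttribute(String name, Object value) {
        if (SCHEMA_ATTR.equals(name)) {
            schema = value;                  // handled by the wrapper itself
        } else {
            core.setAttribute(name, value);  // everything else is delegated
        }
    }

    @Override
    public Object getAttribute(String name) {
        return core.getAttribute(name);
    }

    @Override
    public void setFeature(String name, boolean value) throws ParserConfigurationException {
        core.setFeature(name, value);
    }

    @Override
    public boolean getFeature(String name) throws ParserConfigurationException {
        return core.getFeature(name);
    }

    @Override
    public void setNamespaceAware(boolean awareness) {
        core.setNamespaceAware(awareness);
    }

    @Override
    public boolean isNamespaceAware() {
        return core.isNamespaceAware();
    }

    @Override
    public DocumentBuilder newDocumentBuilder() throws ParserConfigurationException {
        // A real validating proxy would wrap the returned builder so that
        // parsed documents are checked against the schema; omitted here.
        return core.newDocumentBuilder();
    }
}
```

Constructing it as new DelegatingFactory(DocumentBuilderFactory.newInstance()) yields a factory that parses exactly like the wrapped one while keeping the schema attribute to itself.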

Using JAXP Masquerading

If your application uses DOM, it will contain code that creates a new instance of DocumentBuilderFactory:

// create a new parser
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

factory.setNamespaceAware(true);
Document dom =
  factory.newDocumentBuilder().parse(urlOfDocument);

// do something with the parsed dom

To use JAXP masquerading, change the above code as shown in the following example:

// create a new parser
DocumentBuilderFactory factory =
  new com.sun.msv.verifier.jaxp.DocumentBuilderFactoryImpl();

factory.setNamespaceAware(true);
factory.setAttribute("http://www.sun.com/xml/msv/schema",
     schemaUrl);

Document dom =
  factory.newDocumentBuilder().parse(urlOfDocument);

// do something with the parsed dom

If your application uses SAX, it will contain code that creates a new instance of SAXParserFactory:

// create a new parser
SAXParserFactory factory = SAXParserFactory.newInstance();

factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();

// parse the document
parser.parse(new File("abc.xml"),myHandler);

To use JAXP masquerading, change the above code as shown in the following example:

// create a new parser
SAXParserFactory factory =
  new com.sun.msv.verifier.jaxp.SAXParserFactoryImpl();

factory.setNamespaceAware(true);

SAXParser parser = factory.newSAXParser();
// setProperty is a method of SAXParser, not of SAXParserFactory
parser.setProperty("http://www.sun.com/xml/msv/schema",
     schemaUrl);

// parse the document
parser.parse(new File("abc.xml"),myHandler);

The first thing to notice is that the code instantiates com.sun.msv.verifier.jaxp.***Impl directly instead of calling the newInstance method. This selects the JAXP masquerading implementation of MSV rather than the normal XML parser implementation.
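If you would rather not reference the MSV class at compile time, the standard JAXP overload newInstance(String, ClassLoader) can load a factory by class name. The following is only a sketch: the helper name FactoryLoader and the fallback to the default parser are assumptions made for illustration, not something MSV prescribes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical helper; the fallback policy is an assumption of this sketch.
class FactoryLoader {
    // Class name taken from the example above.
    static final String MSV_DOM_FACTORY =
        "com.sun.msv.verifier.jaxp.DocumentBuilderFactoryImpl";

    // Tries to load the named factory; falls back to the platform default
    // (which does no MSV validation) when the class cannot be loaded.
    static DocumentBuilderFactory load(String className) {
        try {
            // A null class loader means the context class loader is used.
            return DocumentBuilderFactory.newInstance(className, null);
        } catch (FactoryConfigurationError e) {
            return DocumentBuilderFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        DocumentBuilderFactory factory = load(MSV_DOM_FACTORY);
        System.out.println("Using factory: " + factory.getClass().getName());
    }
}
```

Silently falling back means documents are no longer validated, so an application that depends on validation may prefer to let the FactoryConfigurationError propagate.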

The next thing to notice is the call to the setAttribute method (DOM) or the setProperty method (SAX), which sets the schema. Although these examples pass the URL of the schema, several other types of input are accepted. For details, see the "Syntax of Property" section.

Once the schema is set, every subsequent call to the parse method validates the document against it. Validation errors are reported just like well-formedness errors: if you set an error handler, it receives the errors; otherwise the parse method throws a SAXException.
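The error-handler path can be illustrated with standard JAXP alone. The sketch below uses the JDK's built-in DTD validation as a stand-in for MSV, since it does not assume the MSV classes are on the classpath; the class names are hypothetical. With an MSV factory, the same kind of handler would, per the description above, receive the schema validation errors.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; DTD validation stands in for MSV validation.
class ErrorHandlerSketch {
    // Collects recoverable errors instead of aborting the parse.
    static class CollectingHandler implements ErrorHandler {
        final List<SAXParseException> errors = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            // warnings are ignored in this sketch
        }

        @Override
        public void error(SAXParseException e) {
            errors.add(e); // validation errors arrive here
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            throw e; // well-formedness errors abort the parse
        }
    }

    // Parses the XML text and returns the validation errors it produced.
    static List<SAXParseException> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true); // DTD validation from the JDK parser
        DocumentBuilder builder = factory.newDocumentBuilder();
        CollectingHandler handler = new CollectingHandler();
        builder.setErrorHandler(handler);
        builder.parse(new InputSource(new StringReader(xml)));
        return handler.errors;
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        System.out.println(validate(dtd + "<a><b/></a>").size() + " error(s) in the valid document");
        System.out.println(validate(dtd + "<a/>").size() + " error(s) in the invalid document");
    }
}
```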

As you would expect, MSV provides the multi-schema capability, so you can pass a schema written in any of the supported schema languages.

Syntax of Property

The JAXP masquerading implementation of MSV supports only one property/attribute. All other properties/attributes will be processed by the underlying implementation.

Name: http://www.sun.com/xml/msv/schema
Type: java.lang.String, java.io.File, java.io.InputStream, org.xml.sax.InputSource, org.iso_relax.verifier.Schema
Access: write-only
Desc: Sets the schema that will be used to validate documents. If a String is passed, it is treated as the URL of the schema. If a File is passed, the schema is parsed from that file. If an InputStream is passed, the schema is read from that stream. If an InputSource is passed, the schema is read from that source. If a Schema object is passed, that schema is used as is.
Note:

Currently, a DTD is detected by checking for the file extension ".dtd". Therefore, if you plan to use DTDs, you cannot pass an InputStream, because a stream has no name. If you pass an InputSource, remember to call its setSystemId method to set the name.

All other schemas written in XML syntax (RELAX, W3C XML Schema, TREX, etc.) will be detected correctly no matter what the format is.
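The rules above can be summarized in a small sketch that uses only JDK classes. It is not MSV's code: the names SchemaSourceSketch, toInputSource, and looksLikeDtd are hypothetical, the case-sensitive extension check is an assumption, and the org.iso_relax.verifier.Schema case is left out because that class is not part of the JDK.

```java
import java.io.File;
import java.io.InputStream;
import org.xml.sax.InputSource;

// Hypothetical sketch of how the accepted value types could be normalized.
class SchemaSourceSketch {
    // Converts a property value into an InputSource, following the Desc rules.
    static InputSource toInputSource(Object value) {
        if (value instanceof String) {
            return new InputSource((String) value); // treated as a URL
        }
        if (value instanceof File) {
            return new InputSource(((File) value).toURI().toString());
        }
        if (value instanceof InputStream) {
            return new InputSource((InputStream) value); // no system ID available
        }
        if (value instanceof InputSource) {
            return (InputSource) value;
        }
        String type = (value == null) ? "null" : value.getClass().getName();
        throw new IllegalArgumentException("unsupported schema type: " + type);
    }

    // DTD detection by name: only possible when a system ID is present.
    // Assumption: the comparison is case-sensitive.
    static boolean looksLikeDtd(InputSource source) {
        String systemId = source.getSystemId();
        return systemId != null && systemId.endsWith(".dtd");
    }
}
```

A bare InputStream yields an InputSource without a system ID, which is why a DTD passed that way cannot be recognized, while calling setSystemId("schema.dtd") on an InputSource makes the check succeed.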

Advanced Usage

For further advanced use, please consult the javadoc of the com.sun.msv.verifier.jaxp package. Here are several tips: