Saturday, October 07, 2006

JAXB and XMLBeans: pros and cons

By Sandrick Melbouci

I would like to address some differences between JAXB and XMLBeans in the processing of XML documents.
JAXB is part of the Java EE standards and provides a convenient way to bind an XML schema to a set of Java classes. This gives you an easy way to process XML data in Java without having to understand all the ins and outs of XML technology.
JAXB 2.0, available in the open-source, Java EE 5-compliant application server from the GlassFish project, is part of the new integrated stack for Web services development. This new version of JAXB includes all the data binding functionality in a single package, whereas the previous Web services development stack (in Java Web Services Developer Pack 1.6 and earlier versions) carried out some data binding in the JAX-RPC 1.x package and some in the JAXB 1.x package. With the integrated stack comprising JAX-WS 2.0, JAXB 2.0, and SAAJ 1.3, the Web services description, data binding, and SOAP attachment processing functionality are more logically architected, enabling you to develop Web services, middle-tier, and Web applications more easily.

XMLBeans was originally developed by BEA Systems and is now an open source project. XMLBeans provides functionality similar to JAXB's but may require a good knowledge of XML technology.
Prior to these technologies, one way to process an XML document, perhaps the most typical way, was through a parser: the Simple API for XML (SAX) or the Document Object Model (DOM). Both of these parsers are provided by the Java API for XML Processing (JAXP). The developer writes code to invoke a SAX or DOM parser through the JAXP API to parse an XML document, that is, to scan the document and logically break it up into discrete pieces. The parsed content is then made available to the application. In the SAX approach, the parser starts at the beginning of the document and passes each piece of the document to the application in the sequence it finds it. Nothing is saved in memory. The application can take action on the data as it gets it from the parser, but it can't do any in-memory manipulation of the data. For example, it can't update the data in memory and return the updated data to the XML file.
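To make the SAX flow concrete, here is a minimal sketch that uses only the JAXP classes shipped with the JDK. The class name SaxSketch, the method readLeaves, and the sample document are my own illustrations, not part of any library. The handler receives each piece in document order, and nothing is kept except what the handler itself chooses to store.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {

    // Hypothetical helper: returns "name=text" for each leaf element, in document order.
    static List<String> readLeaves(String xml) throws Exception {
        List<String> seen = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean leaf; // true until a child element or the end tag is seen

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for this element
                leaf = true;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (leaf) {
                    seen.add(qName + "=" + text.toString().trim());
                }
                leaf = false; // the enclosing element is not a leaf
            }
        };

        // The parser pushes events to the handler; no document tree is built.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Sample document assumed for illustration only.
        String xml = "<promotion><discount>10%</discount><None>n/a</None></promotion>";
        System.out.println(readLeaves(xml));
    }
}
```

Note that the only state available to the application is what the handler saves itself (here, the list of leaf values); there is no document object to modify and write back.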
In the DOM approach, the parser creates a tree of objects that represents the content and organization of data in the document. In this case, the tree exists in memory. The application can then navigate through the tree to access the data it needs, and if appropriate, manipulate it.
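As a contrast with SAX, the following sketch parses a document into a DOM tree, changes a value in memory, and serializes the tree back to text. It relies only on JAXP classes from the JDK. The class name DomSketch, the method updateDiscount, and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomSketch {

    // Hypothetical helper: replaces the text of the first <discount> element
    // and returns the updated document as a string.
    static String updateDiscount(String xml, String newValue) throws Exception {
        // Build the whole tree in memory.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Navigate the tree and modify it in place.
        Node discount = doc.getElementsByTagName("discount").item(0);
        discount.setTextContent(newValue);

        // Serialize the modified tree with an identity transformation.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document assumed for illustration only.
        String xml = "<promotion><discount>10%</discount><None>n/a</None></promotion>";
        System.out.println(updateDiscount(xml, "20%"));
    }
}
```

The price of this convenience is that the entire document lives in memory as generic Node objects, which is the cost that data binding tools try to reduce.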

Now we have two technologies that convert an XML document into Java objects, so you manipulate the Java objects instead of the raw XML. However, JAXB and XMLBeans accomplish the marshalling and unmarshalling of XML documents quite differently.

XMLBeans processes an XML document without first converting it into Java objects, a conversion in which the integrity of the document can be lost. XMLBeans creates a cursor that can move through the XML document. You can access any element of the document, including comments and schema information, because the document is kept in full fidelity. This also makes it possible to execute an XQuery on the document. XMLBeans gives you both strongly typed access to the document and a more generic type of access, similar to a reflection API. However, it doesn't always work the way you want it to. Many times I have seen an error like "Invalid XML character 0x0" while processing a document that had previously been processed successfully.
More importantly, as I mentioned earlier, using XMLBeans requires you to know XML very well, along with SAX and DOM.

JAXB is bound to the XML schema. The first release of JAXB did not satisfy the development community, mainly because it did not support all XML Schema features, and its early versions supported only DTDs. The first JAXB version suffered from three major issues: it prevented support for type substitution and related features, it prevented support for extensibility and versioning, and it failed to provide readable bindings in many cases.
JAXB 2.0 was enhanced to fill these gaps and provides full XML Schema support, Java to XML Schema mapping, schema evolution, and portability. For ease of development, JAXB leverages J2SE 5.0 and JAXP 1.3 features:
generics for type-safe collections
enum types for binding simple types with enumeration facets
better support for XML Schema types, using the datatypes in JAXP 1.3
JAXP 1.3 validation, for a smaller footprint
more compact bindings based on the constraining facets defined in the schema
The JAXB architecture was redesigned to assist with schema evolution and invalid XML content. JAXB has introduced a flexible unmarshalling mode that allows invalid XML content to be unmarshalled. An application can use optional unmarshal-time validation to detect invalid XML content and decide whether to terminate unmarshalling.
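The JAXP 1.3 validation API that JAXB 2.0 builds on can be tried on its own, without JAXB. The sketch below is only an illustration of that API: the class name ValidationSketch, the method isValid, and the embedded schema (which wraps the promotion type in a global element so that a document can be validated against it) are assumptions of mine. In JAXB 2.0, a Schema object built this way is what you would pass to Unmarshaller.setSchema to turn on unmarshal-time validation.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class ValidationSketch {

    // Schema assumed for illustration: a global "promotion" element
    // with two required string children.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='promotion'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='discount' type='xs:string'/>"
        + "<xs:element name='None' type='xs:string'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Hypothetical helper: true if the document conforms to XSD, false otherwise.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<promotion><discount>10%</discount><None>n/a</None></promotion>";
        String bad = "<promotion><None>n/a</None></promotion>"; // missing <discount>
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

Here the application decides what to do with an invalid document, which mirrors the choice JAXB leaves to the application when validation is optional.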

Let's take an example to illustrate the basic usage of both technologies:

Let's say we have a schema:

<xs:complexType name="promotion">
<xs:sequence>
<xs:element name="discount" type="xs:string"/>
<xs:element name="None" type="xs:string"/>
</xs:sequence>
</xs:complexType>



With XMLBeans:
PromotionDocument pdoc =
PromotionDocument.Factory.parse(new File(xmlFile));

// Get the promotion element from the parsed XML instance.
Promotion p = pdoc.getPromotion();

With JAXB:
// "test.jaxb" is the package that holds the generated classes.
JAXBContext jc = JAXBContext.newInstance("test.jaxb");
Unmarshaller unmarshaller = jc.createUnmarshaller();
Promotion p = (Promotion) unmarshaller.unmarshal(new File(xmlFile));


The choice between these technologies depends on the performance you need
and the XML features you use. I would go with XMLBeans if:
I want to access the XML document itself and use XQuery, etc.
I want a native DOM representation, at the expense of performance and memory management.
I need to access schema metadata.
I have to use an older JDK version, 1.4 or older.


If you are looking for a simple way to convert between an XML document and Java classes,
JAXB is your choice.
If you want binding customization, JAXB allows these declarations to be made "inline", that is, in the schema, or in a separate document.
JAXB uses memory efficiently: the tree of content objects produced through JAXB
tends to be more efficient in terms of memory use than DOM-based trees.
JAXB allows you to create XML data without having to unmarshal it. Once a schema is bound, you can use the ObjectFactory methods to create the objects and then use the set methods in the generated objects to create content.

References:
JAXB Architecture, http://java.sun.com/webservices/jaxb
JSR 31, JAXB specs, http://jcp.org/en/jsr/detail?id=031
XMLBeans, http://xmlbeans.apache.org/
XMLBeans Architecture, http://dev2dev.bea.com/xml/xmlbeans.html
