February 3, 2010

Apex DOM classes (Document, XmlNode) reviewed - Spring '10 release!

Abstract

I was going through the code samples and Apex docs for the new Spring '10 Apex DOM classes. I feel these classes are a great missing piece for Apex developers who work on integrations via XML web services, for example the GData APIs. Although the API is a huge benefit for Apex developers, its design could be much better; for example, the very popular “getElementsByTagName” method is not available. So in this post I will explore the good and the bad/missing parts of this new API.

Apex DOM classes, the good part!

The only major advantage I see in this API is that it bypasses the governor limits on script execution. I didn’t mention XML DOM parsing/creation here because XMLDom.cls is already available for free. That class is pretty well designed and does everything required for traversing and creating an XML document.

But the problem with XMLDom is that it's a normal Apex class, just like the classes we create. So every operation it performs counts against the governor limits and adds to the execution cost. The major issues with XMLDom.cls are:

  1. Governor limits: it consumes a lot of the “Total number of executed script statements” limit.
  2. XMLDom methods like parseXmlReader, getElementsByTagName and toXmlString are pretty time consuming and costly.
  3. It pollutes the debug logs :))

Good Part 1: Governor limit advantages over XMLDom.cls

If you parse a big XML document with XMLDom, it is quite possible that you will hit, or come close to, the “Total number of executed script statements” limit. To represent a big XML node hierarchy, XMLDom has to build a deeply nested node structure, which means creating a lot of XMLDom.Element instances. On top of that, operations like parseXmlReader, getElementsByTagName and toXmlString (which is called just for debugging) add a lot to the executed script statement count. If you check your debug logs for a big XML document, they will be flooded with entries from these XMLDom methods. So even if someone is lucky enough to get a web service XML response through while fighting the 100KB response limit, this XML parsing is very likely to make the request fail.

Here is a snippet of the debug logs for a pretty common XML response when working with GData feeds.

20100203083559.140:Class.XMLDom: line 117, column 24:     returning from end of method public Element(String) in 1 ms
20100203083559.140:Class.XMLDom.parseXmlReader: line 63, column 21:     returning from end of method public Element(String) in 0 ms
20100203083559.140:Class.XMLDom.parseXmlReader: line 77, column 5:     returning from end of method public void appendChild(XMLDom.Element) in 0 ms
......................................................    
....... Trimmed similar 200 lines of Class.XMLDom.parseXmlReader 
........................................................

20100203083559.140:Class.XMLDom.Element.getElementsByTagName: line 194, column 3: SelectLoop:LIST:XMLDom.Element
20100203083559.140:Class.XMLDom.Element.getElementsByTagName: line 194, column 3:     Number of iterations: 0
20100203083559.140:Class.XMLDom.Element.getElementsByTagName: line 195, column 16:     returning LIST:XMLDom.Element from method public LIST:XMLDom.Element getElementsByTagName(String) in 0 ms
20100203083559.140:Class.XMLDom.Element.getElementsByTagName: line 194, column 3: SelectLoop:LIST:XMLDom.Element

......................................................    
....... Trimmed similar 600 lines of Class.XMLDom.Element.getElementsByTagName 
........................................................


20100203083559.140:Class.XMLDom.Element.toXmlString: line 225, column 3: SelectLoop:SET:String
20100203083559.140:Class.XMLDom.Element.toXmlString: line 225, column 3:     Number of iterations: 1
20100203083559.140:Class.XMLDom.Element.toXmlString: line 230, column 3: SelectLoop:LIST:XMLDom.Element

......................................................    
....... Trimmed similar 250 lines of Class.XMLDom.Element.toXmlString
........................................................

toXmlString() is the only one we can avoid, since it is only needed for debugging, but getElementsByTagName and parseXmlReader are unavoidable. So the new Apex Document and XmlNode classes should help a lot here.
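
To make the comparison concrete, here is a minimal sketch of how the same kind of parsing might look with the native classes. The class name DomSketch, the helper findByName and the feed/entry/title element names are all made up for illustration; they are not part of the Spring '10 API. Only Dom.Document.load, getRootElement, getChildElements, getName and getText are native calls. Note that the recursive helper is still ordinary Apex, so its own statements would count against the limits; only the parsing done inside load() is native.

```apex
public class DomSketch {

    // Hypothetical stand-in for the missing getElementsByTagName:
    // returns all descendants of 'node' with the given name, in document order.
    public static List<Dom.XmlNode> findByName(Dom.XmlNode node, String name) {
        List<Dom.XmlNode> found = new List<Dom.XmlNode>();
        collect(node, name, found);
        return found;
    }

    private static void collect(Dom.XmlNode node, String name, List<Dom.XmlNode> found) {
        for (Dom.XmlNode child : node.getChildElements()) {
            // equals() is case-sensitive; the == operator on Strings is not.
            if (name.equals(child.getName())) {
                found.add(child);
            }
            collect(child, name, found);
        }
    }

    // Parses the XML natively and returns the text of every <title> element.
    public static List<String> titles(String xml) {
        Dom.Document doc = new Dom.Document();
        doc.load(xml); // native parse, replaces XMLDom.parseXmlReader
        List<String> result = new List<String>();
        for (Dom.XmlNode n : findByName(doc.getRootElement(), 'title')) {
            result.add(n.getText());
        }
        return result;
    }
}
```

Calling DomSketch.titles(...) with a small feed-like XML string should return the title texts in the order they appear in the document.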

Good Part 2: Native Document and XmlNode will be faster than Apex XMLDom.cls

This again comes from Apex profiling. If we check the debug logs for the “10 most expensive method invocations”, we will find these two:

  • Class.XMLDom: parseXmlReader() : executed 133 times in 1202 ms
  • Class.XMLDom: getElementsByTagName() : executed 407 times in 1183 ms

I got these results when trying to publish to Google Sites from an Apex controller. I count this as a good part of the new API because the new API is no longer an Apex class; it must be something native, such as Java code, so it should execute faster than Apex.
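
If you want to check the speed claim against your own payloads, a rough sketch like the following could be used to time one native parse. The class name DomTiming and the method timeNativeParse are hypothetical names for this example, and I have not measured any numbers with it; it only shows one way to take the measurement.

```apex
public class DomTiming {

    // Returns the elapsed wall-clock time, in milliseconds,
    // for one native parse of the given XML string.
    public static Long timeNativeParse(String xml) {
        Long started = System.currentTimeMillis();
        Dom.Document doc = new Dom.Document();
        doc.load(xml);
        return System.currentTimeMillis() - started;
    }
}
```

Running the same XML through XMLDom.cls and comparing the two timings would show how much difference there is for a given response.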

 

Apex DOM classes, the BAD part!

The bad part was too much to fit in this post and deserves its own post, so I moved it out. Check the BAD part here (http://www.tgerm.com/2010/02/apex-dom-document-xmlnode-bad-design.html).

But I will end this post by giving my sincere thanks to the Salesforce team for coming out with this API. At least it will give us more headroom under the governor limits :)