
Streaming Versus DOM

by exposing a simple iterator based API. This allows the programmer to ask for
the next event (pull the event) and allows state to be stored in procedural
fashion." StAX was created to address limitations in the two most prevalent
parsing APIs, SAX and DOM.
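
The "ask for the next event" style described above can be sketched with the
StAX event iterator, XMLEventReader. This is a minimal illustration, not a
definitive implementation: the class name PullSketch, the method maxDepth, and
the sample document are assumptions made for this example. Note how the state
(the current depth) lives in ordinary local variables rather than in callback
fields.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class PullSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<orders>"
      + "<order id=\"1\"><item>pen</item></order>"
      + "<order id=\"2\"><item>ink</item></order>"
      + "</orders>";

    // Pulls events one at a time and tracks element nesting in local variables.
    static int maxDepth(String xml) {
        int depth = 0;
        int max = 0;
        try {
            XMLEventReader events =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            while (events.hasNext()) {
                XMLEvent event = events.nextEvent(); // the application asks for the next event
                if (event.isStartElement()) {
                    depth++;
                    max = Math.max(max, depth);
                } else if (event.isEndElement()) {
                    depth--;
                }
            }
            events.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maxDepth(XML));
    }
}
```

Because the application drives the loop, it can stop pulling events at any
point, for example as soon as it has found the element it needs.
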
Streaming Versus DOM
Generally speaking, there are two programming models for working with XML
infosets: document streaming and the document object model (DOM).
The DOM model involves creating in-memory objects that represent the entire
document tree and the complete infoset state of an XML document. Once in
memory, a DOM tree can be navigated freely and processed in any order, which
gives developers maximum flexibility. The cost of this flexibility is a
potentially large memory footprint and significant processor requirements,
because the entire document must be held in memory as objects for the duration
of processing. This may not be an issue with small documents, but memory and
processor requirements can escalate quickly with document size.
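
As a rough illustration of the DOM model, the sketch below loads a whole
document into a tree with the standard JAXP DocumentBuilder and then visits
nodes out of document order. The class name DomSketch, the method
lastThenFirstOrderId, and the sample document are assumptions made for this
example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<orders>"
      + "<order id=\"1\"><item>pen</item></order>"
      + "<order id=\"2\"><item>ink</item></order>"
      + "</orders>";

    // Builds the full tree first, then reads the last order before the first one.
    static String lastThenFirstOrderId(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The entire document is parsed into objects before any query runs.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList orders = doc.getElementsByTagName("order");
            Element last = (Element) orders.item(orders.getLength() - 1);
            Element first = (Element) orders.item(0);
            return last.getAttribute("id") + "," + first.getAttribute("id");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lastThenFirstOrderId(XML)); // prints "2,1"
    }
}
```

Random access of this kind is what the tree buys you. The price is that every
node of the document stays in memory until the Document object is released.
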
Streaming refers to a programming model in which XML infosets are transmitted
and parsed serially at application runtime, often in real time, and often from
dynamic sources whose contents are not precisely known beforehand. Moreover,
stream-based parsers can start generating output immediately, and infoset
elements can be discarded and garbage collected immediately after they are
used. Stream processing offers a smaller memory footprint, reduced processor
requirements, and higher performance in certain situations. The primary
trade-off is that you can see the infoset state at only one location in the
document at a time. You are essentially limited to the "cardboard tube" view
of a document, the implication being that you need to know what processing you
want to do before reading the XML document.
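
The "cardboard tube" view can be sketched with the StAX cursor API,
XMLStreamReader. In this minimal example the application decides up front that
it only wants the text of item elements, and everything else streams past
without being retained. The class name StreamSketch, the method itemNames, and
the sample document are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<orders>"
      + "<order id=\"1\"><item>pen</item></order>"
      + "<order id=\"2\"><item>ink</item></order>"
      + "</orders>";

    // Reads the document front to back, keeping only the text of <item> elements.
    static List<String> itemNames(String xml) {
        List<String> items = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // the cursor only ever moves forward
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // getElementText() consumes the text and stops at the end tag.
                    items.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(itemNames(XML)); // prints "[pen, ink]"
    }
}
```

There is no way to go back to an earlier order once the cursor has passed it,
which is why the filtering rule has to be known before parsing starts.
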
Streaming models for XML processing are particularly useful when your
application has strict memory limitations, as with a cellphone running J2ME,
or when your application needs to process several requests simultaneously, as
with an application server. In fact, it can be argued that the majority of XML
business logic can benefit from stream processing and does not require the
in-memory maintenance of entire DOM trees.