DomSax is an implementation of an XML-parser that operates on the standard Document Object Model principle.
One of the key features of DomSax is its ability to create a complete document for each of the repeating blocks found in most XML-documents. This means that the programmer can keep their code clean and control memory consumption, without sacrificing processing power.
The parser has been thoroughly tested and optimized for use with Java 1.5.1. Currently, there are two options available for parsing XML-files: SAX and DOM. While SAX provides flexibility and low memory consumption, it can complicate searches and slow down load times. In contrast, DOM provides an easier interface for searching elements in a fully loaded document, but comes with a high cost in memory consumption and speed.
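To illustrate the DOM side of this trade-off, here is a minimal sketch using only the standard JDK XML APIs. The class name `DomSearchExample`, the method `countMatches`, and the `catalog`/`record` element names are made up for the example. The whole document is loaded into memory first, after which a single XPath expression is enough to search it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: the names here are assumptions, not part of DomSax.
public class DomSearchExample {

    // Parses the entire XML string into a DOM tree, then counts the nodes
    // matched by the given XPath expression.
    public static int countMatches(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }
}
```

Searching is a one-liner, but the memory needed grows with the size of the file, because every node is held in the tree at once.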
At the outset of the project, one of the primary requirements was the ability to process XML-files that were 100 MB or larger. This meant that using the SAX parser was the only practical option, allowing for parsing the file element by element and keeping memory consumption within manageable limits. However, the resulting code was often messy and difficult to work with, due to the separation of opening tags, data, and closing tags.
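The kind of bookkeeping this leads to can be seen in a small plain-SAX sketch. The class `SaxStateExample` and the `record`/`name` element names are assumptions chosen for illustration. Even to read a single field per record, the handler has to carry state across three separate callbacks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative only: the names here are assumptions, not part of DomSax.
public class SaxStateExample {

    // Collects the text of every <name> element that sits inside a <record>.
    static class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<String>();
        private boolean inRecord;                        // set by the opening tag
        private boolean inName;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("record".equals(qName)) {
                inRecord = true;
            } else if (inRecord && "name".equals(qName)) {
                inName = true;
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so it has to be buffered.
            if (inName) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (inName && "name".equals(qName)) {
                names.add(text.toString());
                inName = false;
            } else if ("record".equals(qName)) {
                inRecord = false;
            }
        }
    }

    public static List<String> readNames(String xml) throws Exception {
        NameHandler handler = new NameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }
}
```

Each additional field or nesting level adds more flags and buffers of this kind, which is where the messiness comes from.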
This is why DomSax was designed to combine the flexibility of the SAX parser with the ease of use of the DOM approach. The underlying principle of DomSax is built around repeating blocks, which can be identified using the existing XPath technology. Most XML-files contain records that are repeated in the same manner; these can be separated into individual blocks using an XPath expression.
For each such block, DomSax creates a complete document containing only the data within that block. Once the document is complete, DomSax passes it to any registered listeners for further processing. Overall, DomSax is a powerful and easy-to-use tool that simplifies the process of parsing XML-files and working with data blocks.
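The general technique can be sketched with the standard JDK APIs alone. This is not DomSax's actual API: the `BlockSplitterSketch` class, the `BlockListener` interface and the `blockTexts` helper are hypothetical, and a plain absolute element path such as `/catalog/record` stands in for a real XPath expression. A SAX handler tracks the current path, builds a small DOM document while inside a matching block, and hands it to the listener when the block closes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of the block-splitting idea; not DomSax's real classes.
public class BlockSplitterSketch {

    /** Hypothetical callback that receives one complete document per block. */
    public interface BlockListener {
        void onBlock(Document block);
    }

    static class SplittingHandler extends DefaultHandler {
        private final String blockPath;          // simple path, assumed in place of XPath
        private final BlockListener listener;
        private final DocumentBuilder builder;
        private final StringBuilder path = new StringBuilder();
        private Document doc;                    // non-null only while inside a block
        private Node current;                    // node that receives the next child

        SplittingHandler(String blockPath, BlockListener listener) throws Exception {
            this.blockPath = blockPath;
            this.listener = listener;
            this.builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.append('/').append(qName);
            if (doc == null) {
                if (!path.toString().equals(blockPath)) {
                    return;                      // outside any block: keep nothing
                }
                doc = builder.newDocument();     // a fresh document for this block
                current = doc;
            }
            Element element = doc.createElement(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                element.setAttribute(atts.getQName(i), atts.getValue(i));
            }
            current.appendChild(element);
            current = element;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (doc != null) {
                current.appendChild(doc.createTextNode(new String(ch, start, length)));
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (doc != null) {
                current = current.getParentNode();
                if (current == doc) {            // the block's root element just closed
                    listener.onBlock(doc);
                    doc = null;                  // release the block before the next one
                    current = null;
                }
            }
            path.setLength(path.length() - qName.length() - 1);
        }
    }

    // Usage helper: returns the text content of every block found at blockPath.
    public static List<String> blockTexts(String xml, String blockPath) throws Exception {
        final List<String> out = new ArrayList<String>();
        BlockListener listener = new BlockListener() {
            public void onBlock(Document block) {
                out.add(block.getDocumentElement().getTextContent());
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new SplittingHandler(blockPath, listener));
        return out;
    }
}
```

In this sketch only one block is held in memory at a time, so memory use depends on the size of the largest block rather than on the size of the file, while the listener still works with an ordinary DOM document.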
Version 1.0.0: N/A