Abbey Workshop

SAX Parsing with JAXP

This tip shows how to parse an XML file using the SAX event model and JAXP. The parser objects are created on lines 15-17: first a factory object is created and configured, then the factory creates the parser object that will parse our XML file. Since SAX parsing is event driven, an event handler is created on line 20. Its class (lines 58-83) is where the methods for handling SAX events are defined.

Notice that on line 16 the parser is made namespace aware. This is done so that the handler can use the local part of the tag name it is passed, rather than the qualified name (qName), which includes any namespace prefix.
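The effect of the namespace setting can be seen in isolation with a small sketch. It is not part of MyParser.java: the class name NamespaceDemo and the method elementNames are hypothetical, and the XML is parsed from an in-memory string rather than a file. Each start tag is recorded as "local|qName" so the two names can be compared.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only
public class NamespaceDemo {

    // Parses the XML string and returns one "local|qName" entry per start tag
    public static List<String> elementNames(String xml, boolean namespaceAware)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        SAXParser parser = factory.newSAXParser();

        final List<String> names = new ArrayList<>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes att) {
                names.add(local + "|" + qName);
            }
        });
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<todo xmlns:td=\"http://www.abbeyworkshop.com/todo\">"
                   + "<td:item>Item 1</td:item></todo>";
        System.out.println("aware:     " + elementNames(xml, true));
        System.out.println("not aware: " + elementNames(xml, false));
    }
}
```

With namespace awareness turned on, the td:item element is reported with the local name item and the qName td:item. With it turned off, the qName is still td:item, but the SAX specification allows the local name to be empty, so a handler that relies on the local name should keep the setting from line 16.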

Listing for: MyParser.java

   1:import javax.xml.parsers.*;     // Includes SAXParser and SAXParserFactory
   2:import org.xml.sax.*;           // Needed for SAXException
   3:import org.xml.sax.helpers.*;   // Needed to include the DefaultHandler
   4:import java.io.*;
   5:
   6:// A Simple SAX Application using JAXP with Namespace support
   7:public class MyParser{
   8:    private SAXParserFactory factory; // Creates parser factory
   9:    private SAXParser saxParser; // Holds a parser object
  10:
  11:    private DefaultHandler handler; // Defines the handler for this parser
  12:    
  13:    public MyParser() throws SAXException{
  14:        try{
  15:            factory = SAXParserFactory.newInstance();
  16:            factory.setNamespaceAware(true);        
  17:            saxParser = factory.newSAXParser();
  18:                        
  19:            // Set Content Handlers
  20:            handler = new MyDefaultHandler();
  21:            
  22:        } catch (ParserConfigurationException e){
  23:            e.printStackTrace();
  24:        } catch (SAXException e){
  25:            e.printStackTrace();
  26:        }
  27:    }
  28:    
  29:    public void parseDocument(String xmlFile){
  30:        try{
  31:            saxParser.parse(xmlFile, handler); // Parses file using handler
  32:        } catch (SAXException e){
  33:            e.printStackTrace();
  34:        } catch (IOException e){
  35:            e.printStackTrace();
  36:        } catch (Exception e){
  37:            e.printStackTrace();
  38:        }
  39:    }
  40:    
  41:    public static void main(String[] args){
  42:        try {
  43:            if (args.length != 1) {
  44:                System.out.println(
  45:                    "Usage: java MyParser " +
  46:                    "[XML Document Filename]");
  47:                System.exit(0);
  48:            }
  49:            MyParser xmlApp = new MyParser();
  50:            xmlApp.parseDocument(args[0]);
  51:        } catch (SAXException e){
  52:            e.printStackTrace();
  53:        } catch (Exception e) {
  54:            e.printStackTrace();
  55:        }
  56:    }
  57:    
  58:    class MyDefaultHandler extends DefaultHandler{
  59:        private CharArrayWriter buff = new CharArrayWriter();
  60:        /* With a handler class, just override the methods you need to use
  61:        */
  62:        public void startElement(String uri, String local, String qName, Attributes att){
  63:            System.out.println("== Started element ==");
  64:            System.out.println("Local name: " + local);
  65:            System.out.println("Qname: " + qName);
  66:        }
  67:        
  68:        public void characters(char[] ch, int start, int length){
  69:            buff.write(ch, start, length);
  70:            
  71:            // Skip spaces and end of line markers
  72:            // A bit of a hack. Will switch to a RegEx in the future
  73:            if (buff.size() > 3){
  74:                System.out.println("Element Data: " + buff.toString());
  75:            }
  76:            buff.reset();
  77:        }
  78:
  79:        public void endElement(String uri, String local, String qName){
  80:            System.out.println("Ended element: " + local);
  81:        }
  82:        
  83:    }
  84:}
  85:

The DefaultHandler

The methods declared in this class define the events that we will handle. Because the handler extends DefaultHandler, which supplies do-nothing implementations of every callback, we override only the methods we need. A method does not need to be written for each possible type of event. See the class documentation for DefaultHandler for a list of all of the possible events.
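As a further illustration of overriding only what is needed, here is a minimal sketch of a handler that collects the text of each item element. It is a variation on the listing, not the listing's own approach: the class name ItemCollector and the method collect are hypothetical, the text is buffered until endElement (since a parser may deliver one element's text in several characters calls), and whitespace is dropped with trim() instead of the size check used on line 73.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, for illustration only
public class ItemCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> items = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes att) {
        text.setLength(0); // start with an empty buffer for each element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once per element
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        String value = text.toString().trim();
        if ("item".equals(local) && !value.isEmpty()) {
            items.add(value);
        }
        text.setLength(0); // discard whitespace between elements
    }

    // Parses an XML string and returns the text of every item element
    public static List<String> collect(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so that "local" is populated
        ItemCollector handler = new ItemCollector();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.items;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<todo><list name=\"List1\">\n"
                   + "  <item>Item 1</item>\n"
                   + "  <item>Item 2</item>\n"
                   + "</list></todo>";
        System.out.println(collect(xml)); // prints [Item 1, Item 2]
    }
}
```

Only three callbacks are overridden here. Every other event, such as startDocument or ignorableWhitespace, falls through to the empty implementations inherited from DefaultHandler.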

The following sample XML files can be used to test this class:

Listing for: todo.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!-- A Todo List -->
<todo>
  <list name="List1">
    <item>Item 1</item>
    <item>Item 2</item>
    <item>Item 3</item>
  </list>
  <list name="List2">
    <item>Item one</item>
    <item>Item two</item>
    <item>Item three</item>
  </list>
</todo>

Listing for: todons.xml

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!-- A Todo List -->
<todo xmlns:td="http://www.abbeyworkshop.com/todo">
  <td:list name="List1">
    <td:item>Item 1</td:item>
    <td:item>Item 2</td:item>
    <td:item>Item 3</td:item>
  </td:list>
  <td:list name="List2">
    <td:item>Item one</td:item>
    <td:item>Item two</td:item>
    <td:item>Item three</td:item>
  </td:list>
</todo>
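In todons.xml the list and item elements belong to the http://www.abbeyworkshop.com/todo namespace, while the root todo element has no namespace. A handler can therefore select elements by namespace URI plus local name instead of by prefix. The sketch below assumes a hypothetical class ListNames and reads the name attribute through the Attributes argument, which the MyParser.java listing receives but does not use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only
public class ListNames {
    // Namespace URI declared for the td prefix in todons.xml
    static final String TODO_NS = "http://www.abbeyworkshop.com/todo";

    // Returns the name attribute of every list element in the todo namespace
    public static List<String> listNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // required for uri and local to be set

        final List<String> names = new ArrayList<>();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName,
                                         Attributes att) {
                    // Match on URI and local name, so any prefix would work
                    if (TODO_NS.equals(uri) && "list".equals(local)) {
                        names.add(att.getValue("name"));
                    }
                }
            });
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<todo xmlns:td=\"http://www.abbeyworkshop.com/todo\">"
                   + "<td:list name=\"List1\"><td:item>Item 1</td:item></td:list>"
                   + "<td:list name=\"List2\"><td:item>Item one</td:item></td:list>"
                   + "</todo>";
        System.out.println(listNames(xml)); // prints [List1, List2]
    }
}
```

Run against todo.xml, which declares no namespace, the same handler would match nothing, because the uri passed for its list elements is an empty string.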