Stax parser
StAX XML Parser in Java
This article shows how to parse an XML file in Java using StAX.
XML: XML stands for eXtensible Markup Language. It was designed to store
and transport data in a form that is both human- and machine-readable.
That is why the design goals of XML emphasize simplicity, generality, and
usability across the Internet.
Why StAX instead of SAX?
SAX: SAX is a push-model API, which means that the parser calls your
handler rather than your handler calling the parser. The SAX parser thus
“pushes” events into your handler. With this push model you have no control
over how and when the parser iterates over the file. Once you start the
parser, it normally runs all the way to the end of the input, calling your
handler for each and every XML event in the document.
SAX Parser --> Handler
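The push model can be illustrated with a minimal sketch. The class name SaxPushDemo and the method collectEvents are hypothetical names chosen for this example, and the XML is passed as a string to keep the sketch self-contained. The handler only records what it is given; the parser decides when each callback runs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the callbacks a SAX parser pushes
class SaxPushDemo
{
    // Parses the XML text and returns the callbacks in delivery order
    static List<String> collectEvents(String xml) throws Exception
    {
        final List<String> events = new ArrayList<>();
        // The handler never asks for data; it only reacts when called
        DefaultHandler handler = new DefaultHandler()
        {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes)
            {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName)
            {
                events.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length)
            {
                // Ignore whitespace-only text between tags
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty())
                {
                    events.add("text " + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // parse() returns only after the whole document has been pushed
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception
    {
        String xml = "<company><name>Kunal Sharma</name></company>";
        for (String event : collectEvents(xml))
        {
            System.out.println(event);
        }
    }
}
```

Note that the calling code has no loop of its own: everything happens inside the single call to parse().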

StAX: StAX is a pull-model API, which means that your “handler” class
calls the parser, not the other way around. Your handler class therefore
controls when the parser moves on to the next event in the input. In other
words, your handler “pulls” the XML events out of the parser. Additionally,
you can stop parsing at any point. StAX is typically used in place of a
plain file reader when the input data is supplied as an XML file, whether
local or remote. The pull model is summarized like this:
Handler --> StAX Parser
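As a minimal sketch of the pull model, the following example asks the parser for one event at a time and stops as soon as it has seen the tag it was looking for. The class name StaxPullDemo and the method pullUntil are hypothetical names for this illustration, and the XML is passed as a string only to keep the sketch self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class: the calling code drives the parser
class StaxPullDemo
{
    // Returns the names of the opening tags pulled up to and including stopTag
    static List<String> pullUntil(String xml, String stopTag)
            throws XMLStreamException
    {
        List<String> seen = new ArrayList<>();
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        try
        {
            while (reader.hasNext())
            {
                // The parser advances only because this code asks it to
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement())
                {
                    String tag = event.asStartElement().getName().getLocalPart();
                    seen.add(tag);
                    if (tag.equals(stopTag))
                    {
                        // Stop early: the rest of the input is never parsed
                        break;
                    }
                }
            }
        }
        finally
        {
            reader.close();
        }
        return seen;
    }

    public static void main(String[] args) throws XMLStreamException
    {
        String xml = "<company><name>K</name><title>S</title>"
                + "<email>e</email></company>";
        System.out.println(pullUntil(xml, "title"));
    }
}
```

With the input shown in main, the method returns company, name and title; the email element is never requested from the parser.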

StAX can also both read and write XML documents, while SAX can only read
them. A SAX parser can be configured to validate a document against a DTD
or schema, whereas the StAX API has no built-in XML Schema validation. Both
parsers still report documents that are not well-formed, for example when
tags are nested incorrectly.
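The writing side of StAX can be sketched with XMLStreamWriter from the same javax.xml.stream package. The class name StaxWriteDemo and the method buildCompany are hypothetical names for this example, which builds a small document in memory instead of writing to a file.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class: writes XML with the StAX cursor API
class StaxWriteDemo
{
    // Builds a <company> element with a class attribute and one <name> child
    static String buildCompany(String site, String name)
            throws XMLStreamException
    {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("company");
        writer.writeAttribute("class", site);
        writer.writeStartElement("name");
        // writeCharacters escapes characters such as & and <
        writer.writeCharacters(name);
        writer.writeEndElement();   // closes <name>
        writer.writeEndElement();   // closes <company>
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException
    {
        System.out.println(buildCompany("geeksforgeeks.org", "Kunal Sharma"));
    }
}
```

The writer mirrors the reader: each call emits one piece of the document, in the order the calling code chooses.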

Implementation
How the StAX parser works:
Input file: This sample input file was made by the author as an example to
show how the StAX parser is used. Save it as data.xml and run the code. Real
XML data files are usually large and contain many tags nested within each
other.
<company class="geeksforgeeks.org">
<name>Kunal Sharma</name>
<title>Student</title>
<email>kunal@example.com</email>
<phone>(202) 456-1414</phone>
</company>
// Java code to implement a StAX parser
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Iterator;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.*;

public class Main
{
    // One flag per element: true while that element is open
    private static boolean bcompany, btitle, bname, bemail, bphone;

    public static void main(String[] args) throws IOException,
            XMLStreamException
    {
        // Create a File object with the appropriate XML file name
        File file = new File("data.xml");
        // Function for accessing the data
        parser(file);
    }

    public static void parser(File file) throws IOException,
            XMLStreamException
    {
        // No element is open before parsing starts
        bcompany = btitle = bname = bemail = bphone = false;
        // Factory that creates the StAX readers
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // try-with-resources closes the file even if parsing fails
        try (FileReader reader = new FileReader(file))
        {
            // Event reader from which this code pulls the XML events
            XMLEventReader eventReader = factory.createXMLEventReader(reader);
            // Check whether another event is available
            while (eventReader.hasNext())
            {
                // Three types of event are handled here:
                // <name>  = StartElement
                // </name> = EndElement
                // the data between the two = Characters
                XMLEvent event = eventReader.nextEvent();
                // Triggered when the tag is of type <...>
                if (event.isStartElement())
                {
                    StartElement element = event.asStartElement();
                    // Iterator over the attributes of the opening tag.
                    // Here, that is the class attribute of <company>.
                    Iterator<Attribute> iterator = element.getAttributes();
                    while (iterator.hasNext())
                    {
                        Attribute attribute = iterator.next();
                        QName name = attribute.getName();
                        String value = attribute.getValue();
                        System.out.println(name + " = " + value);
                    }
                    // Mark the matching element as open
                    setFlag(element.getName().getLocalPart(), true);
                }
                // Triggered when the tag is of type </...>
                if (event.isEndElement())
                {
                    EndElement element = event.asEndElement();
                    // Mark the matching element as closed
                    setFlag(element.getName().getLocalPart(), false);
                }
                // Triggered when there is data inside the open element
                if (event.isCharacters())
                {
                    Characters element = event.asCharacters();
                    // Print the data only for a field inside <company>.
                    // Whitespace between the child tags is skipped because
                    // no field flag is set at that point.
                    if (bcompany && (btitle || bname || bemail || bphone))
                    {
                        System.out.println(element.getData());
                    }
                }
            }
            eventReader.close();
        }
    }

    // Sets the flag of the element with the given tag name
    private static void setFlag(String tag, boolean open)
    {
        switch (tag)
        {
            case "company": bcompany = open; break;
            case "title":   btitle = open;   break;
            case "name":    bname = open;    break;
            case "email":   bemail = open;   break;
            case "phone":   bphone = open;   break;
            default:        break; // other elements are ignored
        }
    }
}
Output:
class = geeksforgeeks.org
Kunal Sharma
Student
kunal@example.com
(202) 456-1414
How does StAX work in the above code?
The code uses the factory pattern: XMLInputFactory creates the eventReader,
which then reads the XML file one event at a time. When the reader returns
an opening tag <…>, the event is a StartElement, and the boolean variable of
that tag is set to true to indicate that the tag has been opened. Next comes
the data. The text inside an element arrives as a Characters event, detected
with isCharacters, and it is printed only if the boolean variable of a
required tag is currently true. Finally, the closing tag </…> arrives as an
EndElement, and the code sets the boolean variable of that element back to
false to mark it as closed.
In short, each element is processed by opening its tag, reading its data, and
then closing it.
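The open, read, close cycle described above can also be written without one boolean per tag. The following is a sketch of an alternative, not the approach of the code above: it uses the cursor API (XMLStreamReader) and its getElementText method to read each field in one step. The class name CompanyFields and the method read are hypothetical, and the sketch assumes that every child of <company> contains only text.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: collects the fields of <company> into a map
class CompanyFields
{
    // Assumes every child element of <company> is text-only
    static Map<String, String> read(String xml) throws XMLStreamException
    {
        Map<String, String> fields = new LinkedHashMap<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try
        {
            while (reader.hasNext())
            {
                if (reader.next() == XMLStreamConstants.START_ELEMENT)
                {
                    String tag = reader.getLocalName();
                    if (tag.equals("company"))
                    {
                        // The root element carries the class attribute
                        fields.put("class",
                                reader.getAttributeValue(null, "class"));
                    }
                    else
                    {
                        // Reads the text and moves to the matching end tag
                        fields.put(tag, reader.getElementText());
                    }
                }
            }
        }
        finally
        {
            reader.close();
        }
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException
    {
        String xml = "<company class=\"geeksforgeeks.org\">"
                + "<name>Kunal Sharma</name><title>Student</title></company>";
        System.out.println(read(xml));
    }
}
```

The trade-off is that getElementText throws an exception if a field contains nested elements, so the flag-based version remains the more general one.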
