3.1. What is XML?
Last updated: 2 February 2013.
XML (eXtensible Markup Language) provides a simple and standard way to store and transport data between heterogeneous applications. Because it is widely used by a variety of applications (not only Java applications), it is important for Java developers to know how to deal with XML.
The XML format is based upon markup and content. The markup must obey certain rules and its purpose is to describe the content. Here is a basic XML file named hotel.xml containing data about the rooms of a hotel. In this example, the content is the text that appears between the tags (for instance, 17 or STANDARD).
<?xml version="1.0"?>
<HOTEL>
<ROOM>
<NUMBER>17</NUMBER>
<CATEGORY>STANDARD</CATEGORY>
<BED>DOUBLE SIZE</BED>
<FLOOR>5</FLOOR>
<LIVING_ROOM/>
<KITCHENETTE/>
<AIR_CONDITIONING>
<TABLE_FANS>2</TABLE_FANS>
</AIR_CONDITIONING>
</ROOM>
<ROOM>
<NUMBER>33</NUMBER>
<CATEGORY>SUITE</CATEGORY>
<BED>DOUBLE SIZE</BED>
<BED>KING SIZE</BED>
<FLOOR>9</FLOOR>
<LIVING_ROOM>17 FEET * 13 FEET</LIVING_ROOM>
<KITCHENETTE>10 FEET * 8 FEET</KITCHENETTE>
<JACUZZI>3 SEATS</JACUZZI>
<AIR_CONDITIONING>
<AIR_CONDITIONERS>5</AIR_CONDITIONERS>
</AIR_CONDITIONING>
</ROOM>
</HOTEL>
A room has a number, a category, one or more beds and a floor. A room may or may not have a living room, a kitchenette or a jacuzzi. Additionally, each room has air conditioning: either air conditioners with remote control or simple table fans. The first room in the file hotel.xml has two table fans whereas the second one has five air conditioners. Also, the first room has a single bed whereas the second one has two.
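To make this structure concrete, here is a minimal sketch that reads a shortened copy of hotel.xml and counts the rooms and their beds. It relies on the DOM API bundled with the JDK (javax.xml.parsers) rather than on JDOM, so that it runs without any extra library. The class name HotelReader and its methods are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HotelReader {

    // Shortened copy of hotel.xml, inlined so that the example is self-contained.
    static final String HOTEL_XML =
        "<?xml version=\"1.0\"?>"
        + "<HOTEL>"
        + "<ROOM><NUMBER>17</NUMBER><BED>DOUBLE SIZE</BED>"
        + "<AIR_CONDITIONING><TABLE_FANS>2</TABLE_FANS></AIR_CONDITIONING></ROOM>"
        + "<ROOM><NUMBER>33</NUMBER><BED>DOUBLE SIZE</BED><BED>KING SIZE</BED>"
        + "<AIR_CONDITIONING><AIR_CONDITIONERS>5</AIR_CONDITIONERS></AIR_CONDITIONING></ROOM>"
        + "</HOTEL>";

    // Parses the given XML text into a DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the number of BED elements of the room with the given number,
    // or -1 if no such room exists.
    static int countBeds(Document doc, String roomNumber) {
        NodeList rooms = doc.getElementsByTagName("ROOM");
        for (int i = 0; i < rooms.getLength(); i++) {
            Element room = (Element) rooms.item(i);
            String number = room.getElementsByTagName("NUMBER").item(0).getTextContent();
            if (number.equals(roomNumber)) {
                return room.getElementsByTagName("BED").getLength();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Document doc = parse(HOTEL_XML);
        System.out.println("Root element: " + doc.getDocumentElement().getTagName());
        System.out.println("Rooms: " + doc.getElementsByTagName("ROOM").getLength());
        System.out.println("Beds in room 17: " + countBeds(doc, "17"));
        System.out.println("Beds in room 33: " + countBeds(doc, "33"));
    }
}
```

Because BED may occur several times within a ROOM, the code asks for a list of BED elements instead of assuming there is exactly one.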
A well formed XML document must follow these rules:
- The first line (<?xml version="1.0"?>) is optional. It is the XML declaration, which belongs to the prolog and specifies in this case the version of XML in use. The declaration may contain other optional settings (such as the encoding) separated by spaces. Apart from the prolog, everything between the characters '<' and '>' is markup. The rest is content.
- If it holds content, an element must have an opening tag and a closing tag. Let's have a look at the following line:
<CATEGORY>STANDARD</CATEGORY>
Because it holds content (STANDARD), the element has an opening tag (<CATEGORY>) and a closing tag (</CATEGORY>).
- If there is no content within the element, there is no need for a separate closing tag. Instead, the tag can end with />. For example, for the first room, <LIVING_ROOM/> and <KITCHENETTE/> do not have a closing tag since they are empty. However, those lines can also be written as follows:
<LIVING_ROOM></LIVING_ROOM>
<KITCHENETTE></KITCHENETTE>
- An opening and closing tag (the child) can be enclosed in another opening and closing tag (the parent). For instance, in hotel.xml, ROOM is the parent of NUMBER. Likewise, AIR_CONDITIONING is the parent of TABLE_FANS and AIR_CONDITIONERS. A child element must be nested properly within its parent, that is, a child and its parent cannot interleave like this:
<PARENT><CHILD>
Child's content
</PARENT></CHILD>
The correct format is:
<PARENT><CHILD>
Child's content
</CHILD></PARENT>
- Every XML document must have a single opening and closing tag known as the root element (HOTEL in the above example). All the elements of the XML document must be enclosed in the root element. Note that elements are also termed nodes.
- An element (or node) can have one or more attributes that provide additional information about the element. Usually, attributes hold information appearing only once in the node they belong to. For example, if you want to add the hotel name to the file hotel.xml, instead of creating a new node (<NAME>Seasons Hotel</NAME>) that would have a single occurrence within the HOTEL node, you can add an attribute to the HOTEL node like this:
<HOTEL name="Seasons Hotel">
Attribute values must be enclosed in single quotes or double quotes (as shown above). If a node has multiple attributes, they must be separated by spaces like this:
<HOTEL name="Seasons Hotel" starRating="3">
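The rules above can be checked with any XML parser, since a parser refuses a document that is not well formed. The following sketch is one possible way to do so with the DOM API bundled with the JDK (not JDOM). The class WellFormedCheck and its methods are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class WellFormedCheck {

    // Returns the parsed document, or null when the text is not well-formed XML.
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // The parser rejects any violation of the well-formedness rules.
            // The JDK's default error handler may also print a message to standard error.
            return null;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Unexpected parser failure", e);
        }
    }

    static boolean isWellFormed(String xml) {
        return parseOrNull(xml) != null;
    }

    public static void main(String[] args) {
        // Properly nested child: accepted.
        System.out.println(isWellFormed("<PARENT><CHILD>Child's content</CHILD></PARENT>"));
        // Interleaved parent and child: rejected.
        System.out.println(isWellFormed("<PARENT><CHILD>Child's content</PARENT></CHILD>"));
        // Empty element written with the short form: accepted.
        System.out.println(isWellFormed("<LIVING_ROOM/>"));
        // Two root elements: rejected.
        System.out.println(isWellFormed("<ROOM/><ROOM/>"));
        // Attribute value without quotes: rejected.
        System.out.println(isWellFormed("<HOTEL name=Seasons Hotel></HOTEL>"));

        // Reading attributes from a well-formed element.
        Document doc = parseOrNull("<HOTEL name=\"Seasons Hotel\" starRating=\"3\"/>");
        System.out.println(doc.getDocumentElement().getAttribute("name"));
        System.out.println(doc.getDocumentElement().getAttribute("starRating"));
    }
}
```

Note that attribute values are always read as text: the star rating comes back as the string "3" and has to be converted if a number is needed.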
Namespaces
Namespaces are used to avoid naming conflicts in situations where nodes with the same name have different structures. As an example, suppose that your application processes HOTEL nodes like the one you have seen in the previous section but also HOTEL nodes coming from another application which deals with customer feedback:
<?xml version="1.0"?>
<HOTEL>
<ROOM>
<NUMBER>17</NUMBER>
<DATE>2 February 2013</DATE>
<CLEANLINESS>SATISFACTORY</CLEANLINESS>
<ROOM_SERVICE>EXCELLENT</ROOM_SERVICE>
<HOUSE_KEEPING_SERVICE>GOOD</HOUSE_KEEPING_SERVICE>
</ROOM>
<ROOM>
<NUMBER>33</NUMBER>
<DATE>31 January 2013</DATE>
<CLEANLINESS>GOOD</CLEANLINESS>
<ROOM_SERVICE>AVERAGE</ROOM_SERVICE>
<HOUSE_KEEPING_SERVICE>EXCELLENT</HOUSE_KEEPING_SERVICE>
</ROOM>
</HOTEL>
There will be a naming conflict since the above ROOM nodes do not have the same structure as the ROOM nodes shown in the previous section: the code that processes the documents cannot tell from the name alone which kind of ROOM node it is dealing with. To avoid such a naming conflict, you can use the attribute xmlns to declare a namespace for each type of ROOM node.
A namespace is identified by a unique URI (Uniform Resource Identifier), which is bound to a prefix within the document. For instance, you can declare two different namespaces to distinguish between the two types of ROOM nodes as follows:
<?xml version="1.0"?>
<HOTEL xmlns:i="http://www.seasonshotel.com/information/"
xmlns:f="http://www.seasonshotel.com/feedback/">
<i:ROOM>
<i:NUMBER>17</i:NUMBER>
<i:CATEGORY>STANDARD</i:CATEGORY>
<i:BED>DOUBLE SIZE</i:BED>
<i:FLOOR>5</i:FLOOR>
<i:LIVING_ROOM/>
<i:KITCHENETTE/>
<i:AIR_CONDITIONING>
<i:TABLE_FANS>2</i:TABLE_FANS>
</i:AIR_CONDITIONING>
</i:ROOM>
<i:ROOM>
<i:NUMBER>33</i:NUMBER>
<i:CATEGORY>SUITE</i:CATEGORY>
<i:BED>DOUBLE SIZE</i:BED>
<i:BED>KING SIZE</i:BED>
<i:FLOOR>9</i:FLOOR>
<i:LIVING_ROOM>17 FEET * 13 FEET</i:LIVING_ROOM>
<i:KITCHENETTE>10 FEET * 8 FEET</i:KITCHENETTE>
<i:JACUZZI>3 SEATS</i:JACUZZI>
<i:AIR_CONDITIONING>
<i:AIR_CONDITIONERS>5</i:AIR_CONDITIONERS>
</i:AIR_CONDITIONING>
</i:ROOM>
<f:ROOM>
<f:NUMBER>17</f:NUMBER>
<f:DATE>2 February 2013</f:DATE>
<f:CLEANLINESS>SATISFACTORY</f:CLEANLINESS>
<f:ROOM_SERVICE>EXCELLENT</f:ROOM_SERVICE>
<f:HOUSE_KEEPING_SERVICE>GOOD</f:HOUSE_KEEPING_SERVICE>
</f:ROOM>
<f:ROOM>
<f:NUMBER>33</f:NUMBER>
<f:DATE>31 January 2013</f:DATE>
<f:CLEANLINESS>GOOD</f:CLEANLINESS>
<f:ROOM_SERVICE>AVERAGE</f:ROOM_SERVICE>
<f:HOUSE_KEEPING_SERVICE>EXCELLENT</f:HOUSE_KEEPING_SERVICE>
</f:ROOM>
</HOTEL>
As you can see, the attribute xmlns is used in the root node to declare two namespaces. Each namespace has a URI and a prefix which is then used to link up the nodes to their namespace. The URI of a namespace does not necessarily point to an existing web page although it can be used to store information about the namespace. What matters is that each URI must be unique.
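As an illustration of how a program can use these namespaces, the sketch below counts the ROOM nodes of each kind in a shortened version of the document above. It assumes the DOM API bundled with the JDK (not JDOM), with namespace awareness switched on explicitly since it is off by default. The class NamespaceCount and its method are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceCount {

    static final String INFORMATION = "http://www.seasonshotel.com/information/";
    static final String FEEDBACK = "http://www.seasonshotel.com/feedback/";

    // Shortened version of the document: two information rooms, one feedback room.
    static final String XML =
        "<HOTEL xmlns:i=\"" + INFORMATION + "\" xmlns:f=\"" + FEEDBACK + "\">"
        + "<i:ROOM><i:NUMBER>17</i:NUMBER></i:ROOM>"
        + "<i:ROOM><i:NUMBER>33</i:NUMBER></i:ROOM>"
        + "<f:ROOM><f:NUMBER>17</f:NUMBER><f:CLEANLINESS>SATISFACTORY</f:CLEANLINESS></f:ROOM>"
        + "</HOTEL>";

    // Counts the ROOM elements that belong to the namespace with the given URI.
    static int countRooms(String xml, String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace lookup below
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagNameNS(namespaceUri, "ROOM").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Information rooms: " + countRooms(XML, INFORMATION));
        System.out.println("Feedback rooms: " + countRooms(XML, FEEDBACK));
    }
}
```

The lookup is done with the namespace URI and not with the prefix: the prefixes i and f are only shorthand inside the document and could be renamed without changing the result.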
The xmlns attribute also allows you to declare a default namespace, that is, a namespace which is defined without a prefix, so that the related nodes can be written without a prefix. In the next example, the default namespace URI is:
http://www.seasonshotel.com/information/
<?xml version="1.0"?>
<HOTEL xmlns="http://www.seasonshotel.com/information/"
xmlns:f="http://www.seasonshotel.com/feedback/">
<ROOM>
<NUMBER>17</NUMBER>
<CATEGORY>STANDARD</CATEGORY>
<BED>DOUBLE SIZE</BED>
<FLOOR>5</FLOOR>
<LIVING_ROOM/>
<KITCHENETTE/>
<AIR_CONDITIONING>
<TABLE_FANS>2</TABLE_FANS>
</AIR_CONDITIONING>
</ROOM>
<ROOM>
<NUMBER>33</NUMBER>
<CATEGORY>SUITE</CATEGORY>
<BED>DOUBLE SIZE</BED>
<BED>KING SIZE</BED>
<FLOOR>9</FLOOR>
<LIVING_ROOM>17 FEET * 13 FEET</LIVING_ROOM>
<KITCHENETTE>10 FEET * 8 FEET</KITCHENETTE>
<JACUZZI>3 SEATS</JACUZZI>
<AIR_CONDITIONING>
<AIR_CONDITIONERS>5</AIR_CONDITIONERS>
</AIR_CONDITIONING>
</ROOM>
<f:ROOM>
<f:NUMBER>17</f:NUMBER>
<f:DATE>2 February 2013</f:DATE>
<f:CLEANLINESS>SATISFACTORY</f:CLEANLINESS>
<f:ROOM_SERVICE>EXCELLENT</f:ROOM_SERVICE>
<f:HOUSE_KEEPING_SERVICE>GOOD</f:HOUSE_KEEPING_SERVICE>
</f:ROOM>
<f:ROOM>
<f:NUMBER>33</f:NUMBER>
<f:DATE>31 January 2013</f:DATE>
<f:CLEANLINESS>GOOD</f:CLEANLINESS>
<f:ROOM_SERVICE>AVERAGE</f:ROOM_SERVICE>
<f:HOUSE_KEEPING_SERVICE>EXCELLENT</f:HOUSE_KEEPING_SERVICE>
</f:ROOM>
</HOTEL>
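To see what the default namespace changes, the following sketch lists the namespace URI and local name of the root node and of its direct children in a shortened version of the document. As before, it assumes the DOM API bundled with the JDK rather than JDOM, and the class DefaultNamespaceDemo and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DefaultNamespaceDemo {

    // Shortened version of the document: one unprefixed room, one feedback room.
    static final String XML =
        "<HOTEL xmlns=\"http://www.seasonshotel.com/information/\""
        + " xmlns:f=\"http://www.seasonshotel.com/feedback/\">"
        + "<ROOM><NUMBER>17</NUMBER></ROOM>"
        + "<f:ROOM><f:NUMBER>17</f:NUMBER></f:ROOM>"
        + "</HOTEL>";

    // Writes an element name as {namespace URI}local name.
    static String expand(Element element) {
        return "{" + element.getNamespaceURI() + "}" + element.getLocalName();
    }

    // Returns the expanded names of the root element and of its direct child elements.
    static List<String> expandedNames(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            List<String> names = new ArrayList<>();
            names.add(expand(root));
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element) {
                    names.add(expand((Element) child));
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String name : expandedNames(XML)) {
            System.out.println(name);
        }
    }
}
```

The unprefixed ROOM node is reported in the information namespace, exactly as if it had been written with a prefix. The HOTEL root node now belongs to that namespace too, because a default namespace applies to the element that declares it.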