XSLT: Splitting an XML File into Multiple Files with XSLT
This tip explains how to split one XML file into several files using XSLT. The techniques for doing this are currently specific to each XSLT processor. This example uses the Apache Xalan XSLT processor. If you are using Saxon, Sablotron, or some other XSLT processor, the steps will be different.
For this example, the following student directory XML file will be split into three separate files. Each student element has an id attribute, which will be used to name a new file for each student.
Listing for: student_directory.xml
<student_list>
  <student id="1">
    <name>George Washington</name>
    <major>Politics</major>
    <phone>312-123-4567</phone>
    <email>gw@example.edu</email>
  </student>
  <student id="2">
    <name>Janet Jones</name>
    <major>Undeclared</major>
    <phone>311-122-2233</phone>
    <email>janetj@example.edu</email>
  </student>
  <student id="3">
    <name>Joe Taylor</name>
    <major>Engineering</major>
    <phone>211-111-2333</phone>
    <email>joe@example.edu</email>
  </student>
</student_list>
Enabling the Functionality
To enable the functionality in Xalan, there are a couple of steps you must take.
Add Redirect Extension Namespaces
First, you must declare the redirect namespace on your <xsl:stylesheet> element and add the extension-element-prefixes attribute. For example:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:redirect="http://xml.apache.org/xalan/redirect" extension-element-prefixes="redirect" version="1.0">
Note the addition of the redirect namespace and the extension-element-prefixes attribute. This information tells Xalan to make the internal Redirect class and its features available in this style sheet.
Add Elements That Write a File
The next step is to add special elements to your style sheet that are in the redirect namespace. These elements open, close, and write the files created from the style sheet.
<redirect:write> - This element opens, writes, and closes the file it is writing to, a sort of one-stop file creator. Simply wrap this element around whatever you want to write out, and that content will be written to the file you specify. The file name is specified with a file or select attribute. The file attribute takes a string value, while the select attribute takes an XPath expression. If the expression in the select attribute resolves to empty, then the file attribute is used.
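As a minimal sketch of how the two attributes interact, the style sheet below takes the file name from each student's id attribute and falls back to the file attribute when a student has no id. The fallback name unnamed_student.xml is an assumption made up for illustration. The files here would be named after the bare id value, with no extension.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    version="1.0">
  <xsl:output method="xml"/>

  <!-- Visit only the student elements -->
  <xsl:template match="/">
    <xsl:apply-templates select="student_list/student"/>
  </xsl:template>

  <xsl:template match="student">
    <!-- select is evaluated first. If @id is missing, the expression
         resolves to empty and the (hypothetical) file name is used. -->
    <redirect:write select="@id" file="unnamed_student.xml">
      <xsl:copy-of select="."/>
    </redirect:write>
  </xsl:template>
</xsl:stylesheet>
```

With student_directory.xml every student has an id, so the fallback should never be needed. It only matters for input where the attribute can be absent.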
<redirect:open> and <redirect:close> - These elements let you control when a file is opened and when it is closed. Both elements take the same attributes as redirect:write, but they are typically entered as empty elements. This allows you to open and close files at different points in the style sheet.
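The following sketch shows one way the open and close elements might be combined with redirect:write to collect several writes in a single file. The file name names.txt and the choice of the text output method are assumptions for illustration. The intent is that the file is opened once, each student template writes one line to it, and it is closed after the last student.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    version="1.0">
  <!-- Text output keeps the example free of XML declarations and root elements -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Open the file once, before any student is processed -->
    <redirect:open file="names.txt"/>
    <xsl:apply-templates select="student_list/student"/>
    <!-- Close it after the last student has been handled -->
    <redirect:close file="names.txt"/>
  </xsl:template>

  <xsl:template match="student">
    <!-- Each write targets the file that was opened above -->
    <redirect:write file="names.txt">
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
    </redirect:write>
  </xsl:template>
</xsl:stylesheet>
```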
append="true/yes" - One other point: you can add the append attribute to one of the above elements to append the information you are writing to the file. However, note that if you are using the XML output method, you could get XML version headers anywhere in the document, because a header is automatically added each time a write occurs.
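A small sketch of the append attribute, assuming your Xalan version supports it as described above. It uses the text output method to sidestep the repeated XML header problem. The file name majors.txt is made up for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    version="1.0">
  <!-- Text output: no XML declaration is emitted on each write -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="student_list/student"/>
  </xsl:template>

  <xsl:template match="student">
    <!-- append="true" adds to majors.txt instead of replacing it -->
    <redirect:write file="majors.txt" append="true">
      <xsl:value-of select="concat(name, ': ', major)"/>
      <xsl:text>&#10;</xsl:text>
    </redirect:write>
  </xsl:template>
</xsl:stylesheet>
```

Because the file is appended to rather than replaced, running the transformation a second time would be expected to add the same lines again, so you may want to delete the file between runs.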
For more information, take a look at the Redirect class in the Xalan Javadoc documentation: http://xml.apache.org/xalan-j/apidocs/
Here is an example that uses the redirect elements.
Listing for: student_split.xsl
<!-- Created by Michael Williams, Abbeyworkshop.com, 2004 -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    version="1.0"
>
  <xsl:output method="xml"/>

  <xsl:template match="/">
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template match="student_list">
    <xsl:apply-templates />
  </xsl:template>

  <xsl:template match="student">
    <xsl:variable name="filename" select="concat(@id,'.xml')" />
    <redirect:write select="$filename">
      <student id="{@id}">
        <xsl:apply-templates />
      </student>
    </redirect:write>
  </xsl:template>

  <xsl:template match="name | major | phone | email">
    <xsl:copy-of select="." />
  </xsl:template>

</xsl:stylesheet>
The key template here is, of course, the student template. An xsl:variable element builds the file name by appending '.xml' to the value of the id attribute, and the redirect:write element writes its content to the file with that name.
Since there are no changes to the subelements, they are just copied and written out in their original state.
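Because each student is written out unchanged, the same split could plausibly be done by copying the whole element in one step. The following is a sketch of that variant, not the style sheet from this tip. It assumes you want the student element, its id attribute, and all of its children copied exactly as they appear in the input.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:redirect="http://xml.apache.org/xalan/redirect"
    extension-element-prefixes="redirect"
    version="1.0">
  <xsl:output method="xml"/>

  <xsl:template match="/">
    <xsl:apply-templates select="student_list/student"/>
  </xsl:template>

  <xsl:template match="student">
    <!-- File name is the id plus '.xml', as in the listing above -->
    <redirect:write select="concat(@id, '.xml')">
      <!-- Deep copy of the student element, attributes included -->
      <xsl:copy-of select="."/>
    </redirect:write>
  </xsl:template>
</xsl:stylesheet>
```

The version with separate templates for name, major, phone, and email is the better starting point if you expect to change any of those elements later.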
If you download this style sheet and the XML file and run the transformation, you should get a file for each student: 1.xml, 2.xml, and 3.xml. You also get an empty output file for the main output of the style sheet. (I have not yet figured out a way to prevent this file from being written out.)