Abbey Workshop

XSLT: Splitting an XML File into Multiple Files with XSLT

This tip explains how to split one XML file into several files using XSLT. The techniques and methods for doing this are currently XSLT-processor specific. This example uses the Apache Xalan XSLT processor. If you are using Saxon, Sablotron, or some other XSLT processor, the steps will be different.

For this example, the following student directory XML file will be split into three separate files, one per student. Each student element has an id attribute, which is used to name the new file for that student.

Listing for: student_directory.xml

<student_list>
  <student id="1">
    <name>George Washington</name>
    <major>Politics</major>
    <phone>312-123-4567</phone>
    <email>gw@example.edu</email>
  </student>
  <student id="2">
    <name>Janet Jones</name>
    <major>Undeclared</major>
    <phone>311-122-2233</phone>
    <email>janetj@example.edu</email>
  </student>
  <student id="3">
    <name>Joe Taylor</name>
    <major>Engineering</major>
    <phone>211-111-2333</phone>
    <email>joe@example.edu</email>
  </student>
</student_list>

Enabling the Functionality

To enable the functionality in Xalan, there are a couple of steps you must take.

Add Redirect Extension Namespaces

First, you must add a redirect namespace declaration and an extension-element-prefixes attribute to your <xsl:stylesheet> element. For example:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:redirect="http://xml.apache.org/xalan/redirect" extension-element-prefixes="redirect" version="1.0">

You will notice the addition of the redirect namespace and the extension-element-prefixes attribute. This information tells Xalan to make the internal Redirect class and its features available in this style sheet.

Add Elements That Write a File

The next step is to add special elements to your style sheet that are in the redirect namespace. These elements open, close, and write the files created from the style sheet.

<redirect:write> - This element opens, writes to, and closes a file in one step, a sort of one-stop file creator. Simply wrap this element around whatever you want to write out, and that content will be written to the file you specify. The file name is specified with a file or select attribute. The file attribute takes a string value, while the select attribute takes an XPath expression. If the expression in the select attribute resolves to empty, then the file attribute is used.

<redirect:open> and <redirect:close> - These elements allow you to control when a file is opened and when it is closed. Both elements have the same attributes as redirect:write, but they are typically entered as empty elements. This allows you to open and close files at different points in the style sheet.

append="true/yes" - One other point: you can add the append attribute to one of the above elements to append the information you are writing to the file. However, note that if you are using the XML output method, you could get XML version headers anywhere in the document, because a header is automatically added each time a write occurs.
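The open and close elements described above can be sketched as follows. This is only an illustration, not part of the original example: the file name names.txt and the choice of the text output method are assumptions, and the sketch relies on Xalan keeping a file open across several redirect:write calls once it has been opened with redirect:open.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:redirect="http://xml.apache.org/xalan/redirect"
  extension-element-prefixes="redirect"
  version="1.0">
<xsl:output method="text"/>

<xsl:template match="student_list">
  <!-- Open the (hypothetical) file names.txt once, before any student is processed -->
  <redirect:open file="names.txt"/>
  <xsl:apply-templates select="student"/>
  <!-- Close the file after the last student has been processed -->
  <redirect:close file="names.txt"/>
</xsl:template>

<xsl:template match="student">
  <!-- Each write targets the file that was opened above -->
  <redirect:write file="names.txt">
    <xsl:value-of select="name"/>
    <xsl:text>&#10;</xsl:text>
  </redirect:write>
</xsl:template>

</xsl:stylesheet>
```

Run against student_directory.xml, a style sheet along these lines is meant to collect every student name into a single file instead of creating one file per student.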

For more information take a look at the Redirect class in the Xalan Javadoc documentation. http://xml.apache.org/xalan-j/apidocs/

Here is an example that uses the redirect elements.

Listing for: student_split.xsl

<!-- Created by Michael Williams, Abbeyworkshop.com, 2004 -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:redirect="http://xml.apache.org/xalan/redirect"
  extension-element-prefixes="redirect"
  version="1.0"
>
<xsl:output method="xml"/>

<xsl:template match="/">
  <xsl:apply-templates />
</xsl:template>

<xsl:template match="student_list">
  <xsl:apply-templates />
</xsl:template>

<xsl:template match="student">
  <xsl:variable name="filename" select="concat(@id,'.xml')" />
  <redirect:write select="$filename">
    <student id="{@id}">
      <xsl:apply-templates />
    </student>
  </redirect:write>
</xsl:template>

<xsl:template match="name | major | phone | email">
  <xsl:copy-of select="." />
</xsl:template>

</xsl:stylesheet>

The key template here is, of course, the student template. The redirect:write element writes its content to a file whose name is based on the id attribute. An xsl:variable element builds that name by appending '.xml' to the value of the id attribute.

Since there are no changes to the subelements, they are just copied and written out in their original state.
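Because the subelements are copied unchanged, the same split can probably be written more compactly. The variant below is a sketch rather than the author's style sheet: it assumes the select attribute accepts the concat() expression directly (it is described above as taking an XPath expression) and uses xsl:copy-of on the whole student element instead of rebuilding it.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:redirect="http://xml.apache.org/xalan/redirect"
  extension-element-prefixes="redirect"
  version="1.0">
<xsl:output method="xml"/>

<!-- Visit only the student elements, skipping the whitespace between them -->
<xsl:template match="student_list">
  <xsl:apply-templates select="student"/>
</xsl:template>

<xsl:template match="student">
  <!-- The file name is computed directly in the select attribute -->
  <redirect:write select="concat(@id, '.xml')">
    <!-- Copies the student element, its id attribute, and all of its children -->
    <xsl:copy-of select="."/>
  </redirect:write>
</xsl:template>

</xsl:stylesheet>
```

The longer form in the listing is still the better starting point if you expect to change or filter the subelements later, since each one already passes through its own template.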

If you download this style sheet and the XML file and run the transformation, you should get a file for each student: 1.xml, 2.xml, and 3.xml. You also get an empty output file for the main style sheet. (I haven't yet figured out a way to prevent this file from being written out.)