15 December 2006

Working with XML - Part 3 - Formatting XML using XSL

XSL is a language for transforming XML into different formats, written in XML using specific elements and attributes. To achieve the transform you simply load your XML and XSL into DOM objects, then call transformNode on the XML document, passing it your XSL document, and it returns the transformed output as a string.
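As a rough sketch of that call sequence in ASP/VBScript (the file names library.xml and library.xsl are placeholders I've assumed here, not files from this series):

```vbscript
'Load the source XML (library.xml is an assumed file name)
Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load Server.MapPath("library.xml")

'Load the stylesheet the same way (library.xsl is also assumed)
Set xslDoc = Server.CreateObject("Microsoft.XMLDOM")
xslDoc.async = False
xslDoc.load Server.MapPath("library.xsl")

'transformNode returns the transformed output as a string
Response.Write xmlDoc.transformNode(xslDoc)
```

Setting async to False makes each load call block until the file has been read, so both documents are ready by the time transformNode runs.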

XSL is effectively a procedural programming language. It has conditional and looping logic structures as well as variables and callable "functions". I'd recommend reading XSL @ W3Schools as this explains the basics well.

Recap

In parts one and two we looked at loading a sample bit of XML and how to select certain nodes from it using XPath.

<?xml version="1.0" ?>
<library>
  <authors>
     <author id="12345">
        <name>Charles Dickens</name>
     </author>
     <author id="23456">
        <name>Rudyard Kipling</name>
     </author>
  </authors>
  <books>
     <book>
        <title>Great Expectations</title>
        <author>12345</author>
     </book>
     <book>
        <title>The Jungle Book</title>
        <author>23456</author>
     </book>
  </books>
</library>

Using XSL we can convert this XML document into, for example, HTML to send to a web browser. We could also transform it into SQL INSERT statements for adding the data to a database.

HTML Output - Listing Data

Here's an example of how you could output the records from the sample XML:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- more stuff to go here -->

</xsl:stylesheet>

We start with the stylesheet element, which declares the XSL namespace and binds it to the "xsl" prefix used on all the special XSL elements.

<xsl:template match="/">
   <html>
      <head>
         <title>Library</title>
      </head>
      <body>
         <h1>Library</h1>
         <xsl:apply-templates select="library/authors/author" />
      </body>
   </html>
</xsl:template>

Next we have a template element with a match attribute equal to "/". This template matches the root of the XML document, so the transform starts here, outputting the contents of the template. The root is the document itself rather than the library element, so any path selected from here has to start at library. The apply-templates element then tells the transform to apply the appropriate templates to the nodes returned by the select attribute, whose value you'll notice is XPath.

<xsl:template match="author">
   <h2><xsl:value-of select="name" /> (<xsl:value-of select="@id" />)</h2>
   <p>Books by this author:</p>
   <table>
      <tr>
        <th>Title</th>
      </tr>
      <xsl:apply-templates select="/library/books/book[author = current()/@id]" />
   </table>
</xsl:template>

This template matches the author nodes that were selected, so processing passes here at this point, similar to a function call in a normal programming language. Here the value-of elements output the value of the nodes identified in their select attributes, which is XPath again.

The next apply-templates element selects all the book elements with an author child element whose value is equal to the current author element's id attribute. It uses the current() XSL function to get at the element being transformed by the template.

<xsl:template match="book">
   <tr>
      <td><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

Finally the titles of the books are written out as table rows, so what you end up with after calling transformNode is this:

<html>
  <head>
    <title>Library</title>
  </head>
  <body>
    <h1>Library</h1>
    <h2>Charles Dickens (12345)</h2>
    <p>Books by this author:</p>
    <table>
      <tr>
        <th>Title</th>
      </tr>
      <tr>
        <td>Great Expectations</td>
      </tr>
    </table>
    <h2>Rudyard Kipling (23456)</h2>
    <p>Books by this author:</p>
    <table>
      <tr>
        <th>Title</th>
      </tr>
      <tr>
        <td>The Jungle Book</td>
      </tr>
    </table>
  </body>
</html>

Using the output element

XHTML compliant output

XSL has an HTML output method that you can specify by adding this to the top of your stylesheet, before any template elements:

<xsl:output method="html" />

However, with the html method Microsoft.XMLDOM will rewrite your tags HTML-style (for example, an empty br element is written as <br> rather than <br />), so if you want your output to be XHTML compliant you need to use the xml output method instead:

<xsl:output method="xml" omit-xml-declaration="yes" />

Outputting a DOCTYPE

If you want to output a DOCTYPE (which you need to get IE to obey the CSS box model properly) you add a few attributes to your output element:

<xsl:output 
   method="xml" 
   omit-xml-declaration="yes"
   doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
   doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
/>

which will produce:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

For more info on the output element visit output @ W3Schools.

Useful XSL snippets

Alternating row classes on tables

<xsl:template match="book">
   <tr>
      <xsl:attribute name="class">
         <xsl:choose>
            <xsl:when test="(position() mod 2) = 0">even</xsl:when>
            <xsl:otherwise>odd</xsl:otherwise>
         </xsl:choose>
      </xsl:attribute>
      <td><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

Template applicability

You can make nodes of the same type produce different output by changing the match attribute of your template:

<xsl:template match="book[author = 12345]">
   <tr>
      <td class="highlight"><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

<xsl:template match="book">
   <tr>
      <td><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

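Here we're applying a highlight class to book rows by author 12345. Both templates match those books, but a match pattern with a predicate has a higher default priority than a plain element name, so the more specific template wins.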


06 November 2006

Working with XML - Part 2 - Using XPath to query XML

XPath is a simple language for querying XML documents in order to retrieve nodes matching particular criteria. There are some good references and tutorials out there to help you get to grips with the basics; I'd recommend reading XPath @ W3Schools for starters and then running through the Zvon XPath Tutorial before reading on.

XSL, which the next part of this series is about, makes much use of XPath, so you need to get up to speed with it before you venture into XSL.

Recap

In part one we loaded the following XML into the DOM and performed various operations with DOM properties and methods. To use XPath there are only two methods: selectNodes, which returns a node list, and selectSingleNode, which returns one node.

<?xml version="1.0" ?>
<library>
  <authors>
     <author id="12345">
        <name>Charles Dickens</name>
     </author>
     <author id="23456">  
        <name>Rudyard Kipling</name>
     </author>        
  </authors>
  <books>
     <book>
        <title>Great Expectations</title>
        <author>12345</author>
     </book>
     <book>
        <title>The Jungle Book</title>
        <author>23456</author>
     </book>
  </books>
</library>

XPath allows you to do some quite complex data selection and analysis using a mixture of path syntax, axes, predicates and functions. Having said that, I had quite a hard time finding decent examples of doing some quite simple stuff.

Important - Setting the SelectionLanguage property

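If you're using the Microsoft XMLDOM COM component you need to set the SelectionLanguage property of the DOM document to "XPath". Older versions of the component default to an earlier pattern syntax (XSLPattern) that doesn't support the full XPath language, so without this you'll get some very odd results. You do this as follows: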

xmlDoc.setProperty "SelectionLanguage", "XPath"

Example 1 - Selecting nodes and checking return value

'a single node
strXPath = "/library/authors"
Set ndAuthors = xmlDoc.documentElement.selectSingleNode(strXPath)

'This test checks whether authors was found or not...
If ndAuthors Is Nothing Then
  'error not found!
Else
  'do something with authors
End If

'multiple nodes
strXPath = "/library/authors/author"
Set nlAuthors = xmlDoc.documentElement.selectNodes(strXPath)

'This test checks whether author nodes were found or not...
If nlAuthors.Length = 0 Then
  'error not found!
Else
  For Each ndAuthor In nlAuthors
     'do something with nodes
  Next
End If

If you ran through the Zvon XPath tutorial earlier you should now be able to do some basic selecting of nodes using the two methods I've just shown you.

In the next few examples I'm going to run through some of the things you'll probably want to do but that tutorials like the Zvon one don't cover.

Example 2 - Predicates and Axes

Selecting the author element with id "12345"...

strXPath = "/library/authors/author[@id='12345']"
Set ndNode = xmlDoc.documentElement.selectSingleNode(strXPath)

Only selecting the author's name...

strXPath = "/library/authors/author[@id='12345']/name"
Set ndNode = xmlDoc.documentElement.selectSingleNode(strXPath)

Titles of books written by that author...

strXPath = "/library/books/book[author='12345']/title"
Set nlNodes = xmlDoc.documentElement.selectNodes(strXPath)

Example 3 - Functions

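Simple counting of nodes. selectSingleNode and selectNodes can only return nodes, so an XPath expression that evaluates to a number, such as count(/library/authors/author), can't be passed to them directly; select the nodes and read the length of the list instead...

strXPath = "/library/authors/author"
intCount = xmlDoc.documentElement.selectNodes(strXPath).length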

Combining count with the ancestor axis allows you to select nodes of a particular depth...

strXPath = "//*[count(ancestor::*) > 2]"
Set nlDeepNodes = xmlDoc.documentElement.selectNodes(strXPath)

Books with "The" in their title...

strXPath = "/library/books/book[contains(title,'The')]"
Set nlNodes = xmlDoc.documentElement.selectNodes(strXPath)

Example 4 - Common tasks

Remove nodes that match certain criteria...

strXPath = "/library/books/book[contains(title,'The')]"
Set nlNodes = xmlDoc.documentElement.selectNodes(strXPath)
For Each ndNode In nlNodes
   ndNode.parentNode.removeChild ndNode
Next


21 October 2006

Working with XML - Part 1 - Using the DOM

XML (Extensible Markup Language) is a way of storing information as text by applying structure and meaning to data using a system of nested elements and attributes. HTML looks very similar because it also consists of elements, attributes and text, although it doesn't always obey the strict rules necessary for well-formed XML.

The basic rules are:

  • Element and attribute names are case sensitive i.e. derek != Derek
  • Each element must be closed
  • An element's child elements must all be closed before the element itself is closed
  • An XML document must have exactly one element as its root node

Document Type Definitions (DTD) and XML Schema Definition (XSD) are methods for defining the structure of an XML document and ensuring it adheres to your specification. Although I'm not going to cover them in this post, they're very important, particularly if you're letting other people write XML for your system.

Examples are in ASP/VBScript

Example 1 - Some XML

<?xml version="1.0" ?>
<library>
   <authors>
      <author id="12345">
         <name>Charles Dickens</name>
      </author>
      <author id="23456">   
         <name>Rudyard Kipling</name>
      </author>         
   </authors>
   <books>
      <book>
         <title>Great Expectations</title>
         <author>12345</author>
      </book>
      <book>
         <title>The Jungle Book</title>
         <author>23456</author>
      </book>
   </books>
</library>

In this example we have a library element as the root node of the XML document. Inside that we have an authors element containing author elements and a books element containing book elements. The author elements have an id attribute and a name child element whereas the book elements have a title element and an author element containing the id of the associated author.

You can see from this example how easy it is to denote quite complex relationships in a simple, readable manner. I'll base the other examples in this post around this bit of XML.

Example 2 - Loading XML in to a DOM

Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load(Server.MapPath("mydoc.xml"))
Response.Write xmlDoc.xml

This snippet loads an XML document from a file into a DOM object. There's also a loadXML method on the Microsoft DOM object for loading a string containing XML.
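A minimal sketch of the string-based variant, assuming a short hand-written XML string; it also checks the return value, since loadXML returns False when the string can't be parsed:

```vbscript
Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False

'loadXML takes the XML itself rather than a file path
'(strXml is just an illustrative string, not data from this series)
strXml = "<library><authors /><books /></library>"

If xmlDoc.loadXML(strXml) Then
   'the document parsed, so the root element is available
   Response.Write xmlDoc.documentElement.tagName
Else
   'parseError describes what went wrong
   Response.Write "Parse error: " & xmlDoc.parseError.reason
End If
```

The same parseError check works after load, which also returns False on failure.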

Once we've got our XML loaded we can traverse the tree, read data, change properties and save it back to a file.

Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load(Server.MapPath("mydoc.xml"))
Set ndRoot = xmlDoc.documentElement

'retrieving element names
Response.Write ndRoot.tagName & "<br />"

'looping through nodes
Set ndAuthors = ndRoot.firstChild
For Each ndAuthor In ndAuthors.childNodes
   Response.Write ndAuthor.getAttribute("id") & "<br />"
Next

'setting attributes
Set ndSecondAuthor = ndAuthors.childNodes(1)
ndSecondAuthor.setAttribute "id", 99999
Response.Write ndSecondAuthor.getAttribute("id") & "<br />"

'retrieving and setting node text
Response.Write ndSecondAuthor.firstChild.text & "<br />"
ndSecondAuthor.firstChild.text = "Joe Bloggs"
Response.Write ndSecondAuthor.firstChild.text & "<br />"
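The example above changes the document in memory only. Saving it back to a file, which was mentioned earlier but not shown, could look something like this sketch (mydoc-updated.xml is an assumed file name, and the account ASP runs under is assumed to have write permission on the folder):

```vbscript
Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load Server.MapPath("mydoc.xml")

'change something: the first author element's id attribute
Set ndFirstAuthor = xmlDoc.documentElement.firstChild.firstChild
ndFirstAuthor.setAttribute "id", "54321"

'write the modified document out to a new file
'(mydoc-updated.xml is a made-up name for illustration)
xmlDoc.save Server.MapPath("mydoc-updated.xml")
```

Saving to a different file name keeps the original intact while you're experimenting.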


18 October 2006

Simple event driven programming using VBScript's GetRef

VBScript doesn't have an event implementation so if you fancy having features like attaching handlers which will respond to specific events on your object you can do it simply by using the GetRef function and a bit of "syntactic sugar".

I'm using ASP in these examples cos it's easy.

Example 1 - Simple Events

'Create a handler
Function MyHandler()
   Response.Write "Hello from the handler!"
End Function

'Create an event
Dim OnLoad
Set OnLoad = GetRef("MyHandler")

'Fire the event
OnLoad()

Here we've created a simple event which takes one handler function and fired the event which in turn has called the function we attached.

To turn this into a more useful event system we can use an array for the OnLoad event variable, like this...

'Create some handlers
Function MyHandler1()
   Response.Write "Hello from handler 1!"
End Function

Function MyHandler2()
   Response.Write "Hello from handler 2!"
End Function

'Create an event
Dim OnLoad
OnLoad = Array(GetRef("MyHandler1"), GetRef("MyHandler2"))

'Fire the event
For Each handler In OnLoad
   handler()
Next

Example 2 - Event Arguments

In most event implementations the event handlers take one argument, passed to them by the fired event, which contains things like the type of event and a reference to the object on which it was fired etc.

'Create a handler which takes one argument
Function MyHandler(e)
   Response.Write "Hello from the handler - i was called by " & e
End Function

'Create two events
Dim OnLoad
Set OnLoad = GetRef("MyHandler")

Dim OnUnload
Set OnUnload = GetRef("MyHandler")

'Fire the events
OnLoad("Load")
OnUnload("Unload")

Wrapping it up

We've established we can do all the basics of events; now all we need to do is wrap it up in a few classes to make it usable.

First we need an Event class that we can instantiate for each event we want. This will have to expose an event arguments property and methods for attaching handlers and firing the event. It will also have to keep track internally of the attached handlers. Let's have a go...

Class clsEvent

   'An array to keep track of our handlers
   Private aryHandlers()

   'Our event arguments object to be passed 
   'to the handlers
   Public EventArgs

   Private Sub Class_Initialize()
      ReDim aryHandlers(-1)
      Set EventArgs = New clsEventArgs
   End Sub

   Private Sub Class_Terminate()
      Set EventArgs = Nothing
      Erase aryHandlers
   End Sub

   'Method for adding a handler
   Public Function AddHandler(strFunctionName)
      ReDim Preserve aryHandlers(UBound(aryHandlers) + 1)
      Set aryHandlers(UBound(aryHandlers)) = _
         GetRef(strFunctionName)
   End Function

   'Method for firing the event
   Public Function Fire(strType, objCaller)
      EventArgs.EventType = strType
      Set EventArgs.Caller = objCaller
      For Each f In aryHandlers
         f(EventArgs)
      Next
   End Function

End Class

Next we need an EventArgs class for passing data about the event to the handlers. This just needs three properties: event type, caller and an arguments collection for event type specific things.

Class clsEventArgs

   Public EventType, Caller, Args

   Private Sub Class_Initialize()
      Set Args = CreateObject("Scripting.Dictionary")
   End Sub

   Private Sub Class_Terminate()
      Args.RemoveAll
      Set Args = Nothing
   End Sub

End Class

Next our class that has an event, in this case an OnLoad which fires after the object's Load method is called. We'll also create a few handlers and do a trial run.

Class MyClass

   Public OnLoad

   Private Sub Class_Initialize()
      'Setting up our event
      Set OnLoad = New clsEvent

      'Adding an argument
      OnLoad.EventArgs.Args.Add "arg1", "Hello"
   End Sub

   Public Function Load()
      Response.Write "loading the object here!<br />"
      
      'Firing the event
      OnLoad.Fire "load", Me
   End Function

End Class


'A couple of handler functions for the event
Function EventHandler(e)
   Response.Write "<h2>EventHandler</h2>"
   Response.Write "<p>Event """ & e.EventType & """ fired by object " & _
      "of type " & TypeName(e.Caller) & ".</p>"
End Function

Function EventHandler2(e)
   Response.Write "<h2>EventHandler2</h2>"
   For Each x In e.Args
      Response.Write x & ": " & e.Args(x) & "<br />"
   Next
End Function

'instantiate the object, attach the handlers and call the load
Set myObj = New MyClass
myObj.OnLoad.AddHandler("EventHandler")
myObj.OnLoad.AddHandler("EventHandler2")
myObj.Load()

Event based programming reverses the responsibility for code execution within your program. In conventional procedural programming it would be the responsibility of MyClass to make sure the two event handlers were called when its Load method was called. By using an OnLoad event instead, myObj doesn't have to know anything about the environment in which it's executing; it just blindly fires the event and any attached handlers will be called. In this way you can add additional functions which run when myObj's Load method is called without modifying MyClass.

In more complex systems, being able to add functionality with a minimum of intrusion into other parts of the system is a big bonus, and event based programming is an easy way of achieving it.