28 March 2007

Keeping your data secure online

There is a growing trend among Web 2.0 sites of asking you to enter a username and password for some other site in order to perform an action. Two recent examples I've come across are Technorati, which asks for your Blogger username and password in order to verify you own a particular blog, and Facebook, which asks for your e-mail account username and password in order to match your contact list against existing Facebook members.

As an IT professional I'm naturally skeptical of any such demand and, in the case of Technorati, chose to verify my blog ownership in a different way. Terms and Conditions and Data Protection declarations are one thing, but you've no guarantee your details won't be stored in a database, and if they are there's all the more chance they may at some point fall into the wrong hands. The only way to be sure is not to type them in in the first place.

A Blogger user account is one thing but Facebook asks for something totally different. I wonder how many people actually consider what data is stored in their e-mail inbox before they submit their details with the promise of being shown how many of their friends are already signed up.

When you sign up to the majority of sites you're e-mailed a confirmation containing all your details, including things like home and work addresses, phone numbers, date of birth, passwords and, perhaps most crucial of all, answers to secret questions, e.g. your mother's maiden name. If some unscrupulous person gets hold of your e-mail account password it will likely take them little effort to steal your identity, empty your bank accounts and max out your credit cards.

This BBC article, Many net users 'not safety-aware', serves to reaffirm the point that a large number of net users simply do not think before typing their sensitive information into random websites.

12 March 2007

Reference type keys and .NET dictionaries

The default implementation of the Equals method for reference types is to call ReferenceEquals, i.e. to test whether two variables reference the same instance of an object.

When using a ListDictionary the Equals method is used for key comparison, so you can be sure you will be accessing the correct item. However, if you use a HybridDictionary, which swaps to a Hashtable for collections of more than 10 items, you can get inconsistent results. This is down to the fact that the Hashtable uses the GetHashCode method to get a code representing an object, and that code is then used to locate the key. As you will read here, GetHashCode does not always return a unique code for dissimilar objects, so you can end up accessing the wrong item of your dictionary.

To get around this you can either stick to using the ListDictionary, or implement your own IHashCodeProvider (or override GetHashCode and Equals) for the classes used as keys in your dictionary.
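The same pitfall is easy to demonstrate outside .NET. Here's a sketch in Python (not the language used elsewhere on this blog, but it runs anywhere): a class that keeps the default identity-based hashing behaves just like a reference type with the default Equals/GetHashCode, so a value-equal key misses. The class and names are illustrative only.

```python
class Person:
    """No __eq__ or __hash__ defined, so identity (reference)
    semantics apply - the analogue of the default
    Equals/GetHashCode on a .NET reference type."""
    def __init__(self, name):
        self.name = name

d = {Person("Derek"): "editor"}

# A second, value-equal instance is a *different* reference,
# so the lookup fails even though the data is identical.
assert Person("Derek") not in d
assert len(d) == 1
```

Overriding both the equality and hash operations together, so they agree, is what fixes this in any language.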

20 February 2007

32-bit Windows Script Components under 64-bit Windows

We're currently setting up a new suite of 64-bit web and database servers at work all running Windows Server 2003 x64 Edition. We've got quite a few legacy 32-bit Windows Script Components we need to use which by all accounts should run fine in the 32-bit environment provided by WOW64.

Imagine our surprise when, on registering the components, none of them worked - throwing errors at the point of instantiation.

WOW64 is rather an odd beast, residing as it does in its own SYSWOW64 folder in the Windows directory. This folder essentially contains the 32-bit versions of all the DLLs and suchlike that are available in a 32-bit version of Windows. The caveat is that in order to get your 32-bit fare to work you need to call on the services of these SYSWOW64 versions rather than the ones in the folder still called SYSTEM32 (note the stupid naming convention).

When registering WSCs you actually register the hosting service, scrobj.dll, with regsvr32.exe, passing the path to your WSC as the command line for scrobj.dll using the /i switch e.g.

regsvr32 /i:"C:\Components\Display.wsc" "C:\WINDOWS\SYSTEM32\scrobj.dll"

Oddly, the Register option in the file association for WSCs seems to mix versions, calling the 64-bit version of regsvr32.exe and the 32-bit version of scrobj.dll.

"C:\WINDOWS\system32\REGSVR32.EXE" /i:"%1" "C:\WINDOWS\SYSWOW64\scrobj.dll"

I'm not sure of the significance of this version mixing; in any case it didn't work for us, so we added a 32-bit Register option which called the 32-bit versions of both files from the SYSWOW64 folder e.g.

"C:\WINDOWS\SYSWOW64\REGSVR32.EXE" /i:"%1" "C:\WINDOWS\SYSWOW64\scrobj.dll"

and a 32-bit Unregister e.g.

"C:\WINDOWS\SYSWOW64\REGSVR32.EXE" /u /n /i:"%1" "C:\WINDOWS\SYSWOW64\scrobj.dll"

which sorted the issue.

13 February 2007

XML namespace prefixes in MSXML

If you're working with XSL, or a similar technology that makes use of XML namespace prefixes, using the Microsoft XML DOM, you'll likely run into problems if you try to do anything more than just load in a file.

Adding elements

The W3C DOM specification includes a createElementNS method for creating an element scoped within a namespace; however, MSXML doesn't. You can create an element with a prefix using createElement but this doesn't correctly register the namespace of the node and you'll get a schema error something like:

msxml3.dll: Keyword xsl:stylesheet may not contain xsl:include.

In order to create an element and register it correctly you have to use createNode instead, which takes the node type (1 for an element), the node name and the namespace URI as arguments e.g.

Set ndIncl = xslDoc.createNode(1, "xsl:include", _
   "http://www.w3.org/1999/XSL/Transform")

Using XPath

Similar to the createElement problem, even if you've only loaded an XSL document you won't be able to use XPath to query it because oddly the namespaces aren't automatically registered with XPath e.g.

Set nlTemps = xslDoc.documentElement.selectNodes("/xsl:stylesheet/xsl:template")

yields the following error:

msxml3.dll: Reference to undeclared namespace prefix: 'xsl'.

To get this to play ball you have to set the "SelectionNamespaces" second-level property which takes a space delimited list of namespace definitions using a setProperty call of the form:

xslDoc.setProperty "SelectionNamespaces", _
   "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
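MSXML isn't alone in demanding explicit prefix registration, as an aside; Python's standard-library ElementTree, for instance, needs a prefix-to-URI map before a prefixed path will match, much like SelectionNamespaces. A quick illustrative sketch:

```python
import xml.etree.ElementTree as ET

# A minimal stylesheet, just enough to query against.
xsl = ET.fromstring(
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:template match="/"/>'
    '</xsl:stylesheet>')

# The xsl: prefix only resolves because we supply the mapping
# ourselves - the equivalent of setting SelectionNamespaces.
ns = {"xsl": "http://www.w3.org/1999/XSL/Transform"}
templates = xsl.findall("xsl:template", ns)
assert len(templates) == 1
```

The general lesson holds across XML APIs: the prefixes in your query belong to the query, not the document, so something has to bind them to URIs.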


05 February 2007

XML and SQL Server

In this post I'll cover how you can get SQL Server to return data as XML using the FOR XML clause, and how you can use XML as input for updating records and as a rich argument for row-returning stored procedures using the OPENXML function.

There are lots of reasons you may want to get data out of a database as XML:

  • You may be building an AJAX app and want to send XML to the client directly for processing by your client-side JavaScript
  • You may want to use XSL to transform your data into some format such as HTML or a CSV
  • You may want to export data and store it in a form which retains its original structure

Each of these also suggests a reason for passing XML into your database; with the AJAX app, for example, you may want to receive changes as XML from the client and post them straight to a stored procedure that updates your tables.

Examples are in VBScript using ASP and ADO.

Getting SQL Server to return XML

The key to getting SQL Server to return XML is the FOR XML command. It comes in three flavours:

FOR XML RAW
The least useful, RAW mode simply outputs the rows returned by your query as <row> nodes with the columns being either elements within this node or attributes of it as you define.
FOR XML AUTO
Automagically translates your SQL query, joins and all, into suitably nested XML elements and attributes. For example, if you join Orders to OrderItems the XML output will be OrderItem nodes nested within the associated Order node. You can alter the naming of the nodes by aliasing your table and column names, but that's about it.
FOR XML EXPLICIT
Explicit mode allows the most customisability but it's also the most fiddly, requiring you to alias all your column names to a specific format which describes which nodes they should belong to.

You'll mostly use AUTO mode because it gives you the most useful results in the least amount of time so here it is in an example:

SELECT [Order].*, OrderItem.*
FROM [Order]
INNER JOIN OrderItem
   ON [Order].order_key = OrderItem.order_fkey
WHERE [Order].customer_fkey = 1
FOR XML AUTO

All you do is tag FOR XML AUTO on to the end of your query, that's it! The output will look something like this:

<Order order_key="1" customer_fkey="1" date_placed="24/08/2006 12:31">
   <OrderItem orderitem_key="123" order_fkey="1" product_fkey="234" list_price="£14" />
   <OrderItem orderitem_key="124" order_fkey="1" product_fkey="64" list_price="£3" />
   <OrderItem orderitem_key="125" order_fkey="1" product_fkey="73" list_price="£27" />
</Order>
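Once a consumer has that XML, walking it is straightforward. Here's a quick sketch in Python (rather than the VBScript used elsewhere in this post) against output shaped like the sample above, with the currency symbols dropped so the prices can be summed; the totalling is just for illustration:

```python
import xml.etree.ElementTree as ET

order_xml = """
<Order order_key="1" customer_fkey="1" date_placed="24/08/2006 12:31">
   <OrderItem orderitem_key="123" order_fkey="1" product_fkey="234" list_price="14" />
   <OrderItem orderitem_key="124" order_fkey="1" product_fkey="64" list_price="3" />
   <OrderItem orderitem_key="125" order_fkey="1" product_fkey="73" list_price="27" />
</Order>"""

order = ET.fromstring(order_xml)

# The join nesting FOR XML AUTO produced: items sit inside their order.
items = order.findall("OrderItem")
assert len(items) == 3

# Total the line prices nested under the order.
assert sum(int(i.get("list_price")) for i in items) == 44
```

The nesting AUTO mode produces is exactly what makes this kind of traversal trivial - no re-joining of flat rows on the client.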

If you run this in Query Analyzer you'll notice in the results pane that the XML appears split across rows. We need to use an ADODB.Stream object to get at the output properly, thus:

Set conn = Server.CreateObject("ADODB.Connection")
Set cmd = Server.CreateObject("ADODB.Command")
Set strm = Server.CreateObject("ADODB.Stream")

conn.Open "Provider=SQLOLEDB;Data Source=myServerAddress;" & _
   "Initial Catalog=myDataBase;User Id=myUsername;Password=myPassword;"

strm.Open

Set cmd.ActiveConnection = conn

cmd.Properties("Output Stream").Value = strm
cmd.Properties("Output Encoding").Value = "UTF-8"
cmd.Properties("XML Root").Value = "Root"  'this can be anything you want

cmd.CommandType = adCmdText
cmd.CommandText = strSQL

cmd.Execute , , adExecuteStream

Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False

xmlDoc.LoadXML(strm.ReadText)

strm.Close : Set strm = Nothing
Set cmd = Nothing

xmlDoc now contains our XML to do with as we will.

Passing XML into SQL Server

The easiest way to get XML into SQL Server is as a parameter of a stored procedure thus:

cmd.Parameters.Append cmd.CreateParameter("somexml", adVarChar, _
   adParamInput, 8000, xmlDoc.xml)

You then use two System Stored Procedures along with the OPENXML command to SELECT from the contents of the XML parameter as if it were a table:

DECLARE @idoc int
EXEC sp_xml_preparedocument @idoc OUTPUT, @somexml

SELECT * FROM OPENXML (@idoc, '/Root/Order') WITH [Order]

EXEC sp_xml_removedocument @idoc

OPENXML takes the prepared XML document and an XPath expression telling it which nodes to take into account. The WITH clause in this case tells OPENXML that the nodes it is working on have the same schema as the rows in the Order table.

The result of this call is a list of records of the same schema as the Order table but which have actually come from the passed in XML document. Because the schema is that of Order you can put an INSERT INTO [Order] in front of the SELECT and this will add the rows from the XML to the Order table. You probably wouldn't want to do that but you get the idea.

You don't have to have a table in your database matching the schema of the XML you're passing in. WITH also accepts a normal schema declaration, i.e. comma-delimited column names with their types and the node each maps to in the XML:

SELECT order_key, customer_fkey, description
FROM OPENXML (@idoc, '/Root/Order') 
WITH (
   order_key     int           '@order_key',
   customer_fkey int           '@customer_fkey',
   description   nvarchar(100) 'description'
)
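Conceptually, OPENXML with a schema declaration is just shredding attributes and child elements into rows. This Python sketch (purely illustrative - it is not how SQL Server implements it, and the sample data is made up) does the same mapping: an '@name' binding reads an attribute, a bare name reads a child element.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<Root>'
    '<Order order_key="1" customer_fkey="48">'
    '<description>First order</description></Order>'
    '<Order order_key="2" customer_fkey="48">'
    '<description>Second order</description></Order>'
    '</Root>')

# Mirror the WITH clause bindings: attributes for @order_key and
# @customer_fkey, a child element for description.
rows = [
    {"order_key": int(o.get("order_key")),
     "customer_fkey": int(o.get("customer_fkey")),
     "description": o.findtext("description")}
    for o in doc.findall("Order")]

assert len(rows) == 2
assert rows[0] == {"order_key": 1, "customer_fkey": 48,
                   "description": "First order"}
```

Seen this way, OPENXML is simply a rowset adapter over a node selection, which is why the result can be joined to real tables like any other rowset.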

The advantage of being able to do this is that you can pass complex structured criteria into one of your stored procedures and use OPENXML to turn it into a rowset which you can use to JOIN to the tables in your database. Powerful stuff with a large number of applications in both improving querying data and updating it.

02 February 2007

A room with a Vista

I've been toying with the idea of buying Vista for some time now. I have a beta version installed in a spare partition but some hard disk corruption issues prevented me from giving it a good run for its money. So, having not looked at it for a while, I've been reading CNET's Seven days of Vista series over the past week to see what's what in the final release.

Being a technology-obsessed geek, the Ultimate Edition was the only viable option, but it came as a roundhouse kick to the face when I learnt this would retail at £350! "Guess I'll be sticking to XP for now then", I thought mournfully.

Not so, for at the end of day 7's post was the pièce de résistance:

Our final tip would be to consider buying OEM versions of Vista ... the consumer version of Ultimate is [£349], yet it's just £121.68 for the OEM version

I was aware you could buy OEM software but had always thought the difference in price was similar to that of retail and OEM hardware. "Surely that can't be right", I thought, but sure enough dabs.com has both versions, retail boxed and OEM.

The only differences with the OEM version are that you don't get any support and it's tied to the motherboard it's first installed on. I don't need support and, in the unlikely event that my mobo dies and I have to buy another OEM copy, I'll still be £50 richer. Besides, you can get three OEMs for the price of one retail copy so I could just get a few spares!

01 February 2007

SQL Server Oddness

I recently came across a rather odd bug in SQL Server 2000 which, although fixed in SP4, I had to find a work-around for, as an immediate solution was required that couldn't wait for scheduled server patching.

In my case it concerned a JOIN subselect which contained a JOIN using a user-defined scalar function as one of its predicates e.g.

...
LEFT JOIN (
   SELECT foo.col1, MIN(bar.col1)
   FROM foo
   INNER JOIN bar
      ON foo.col1 = bar.col2
      AND bar.col3 = dbo.fn_GetVal('derek')
   GROUP BY foo.col1
)
...

which produced the rather unhelpful error:

Server: Msg 913, Level 16, State 8, Line 4
Could not find database ID 102. Database may not be activated yet or may
be in transition.

The work-around in this case was to move the predicate to the WHERE clause rather than the ON e.g.

...
LEFT JOIN (
   SELECT foo.col1, MIN(bar.col1)
   FROM foo
   INNER JOIN bar
      ON foo.col1 = bar.col2
   WHERE bar.col3 = dbo.fn_GetVal('derek')
   GROUP BY foo.col1
)
...

24 January 2007

What is it with PC component categorisation?

Like many geeks I buy my computers as components and assemble them myself, taking great care to make sure the quality and compatibility of the kit is as good as I can afford. One constant annoyance, however, is the inadequacy of online PC component retailers' product categorisation and searching facilities.

With all the different types of processor, RAM, hard disks etc that are available, accurate categorisation is essential for you to find what you're looking for. Why is it then that so many of the online component retailers have these lacklustre categorisation schemes in place, often repeating categories of product, e.g. "Core 2 Duo" and "Core Duo 2"?

It does seem to be CPUs that suffer the worst, probably because of the many different ways you can group them: by manufacturer, product line, socket type, number of cores etc. A lot of these retailers only allow a product to be in one category rather than "tagging" products with all their relevant information and allowing a user to group by anything. Because of this one-category restriction each retailer has gone with what they see as the best categorisation, the problem being that none of them are the same and most of them are inflexible.

The result of this is a very poor user experience, making it difficult for consumers wanting to shop around. What's needed is an open data initiative between the component industry and retailers to standardise these categorisations, thus empowering the consumer to find what they're looking for more easily.

As much as this sounds like a massive plug, the only online component retailer I regularly use is dabs.com as they're the only one I know of that implements a good product categorisation scheme. So take note ebuyer et al - proper categorisation pays!

22 January 2007

Is it time for a thin-client resurrection?

Microsoft is making much of the performance benefits flash memory brings to Windows Vista. Two items on the performance features page utilise it; ReadyBoost as an extension to the RAM and ReadyDrive as a large hard disk cache - part of hybrid drive technology.

Hybrid drive hard disks cache frequently used data in flash memory attached to the disk. Surely this mostly means the operating system is cached, so why don't we just go the whole hog and give the OS its own flash drive to run from? But wait, we can go further than this: Office Live removes the need for having Office installed on a local disk, and on a lot of home PCs that is the only application installed.

We still need a hard disk for bulk storage of documents, media etc, but a NAS over gigabit ethernet has a theoretical maximum throughput of 125MB/s, which is comparable to a local hard disk. We've reached the point where the only people who need a local hard disk are those who rely on consistent real-time disk access, i.e. audio and video editing and the like, and these folk tend to use FireWire disks anyway.

Processor, RAM, a load of Compact Flash and maybe an optical drive is all that's needed - hopefully it won't be long until desktop PCs look like this:

18 January 2007

A convenient falsehood

This was supposed to be a post about thin-clients but that will just have to wait... Mike's power supply problems have struck a nerve.

His problem stemmed from the fact that computers are becoming more power hungry, necessitating ever higher wattage power supplies. 4GHz CPUs and double-height graphics cards requiring their own power connector all come at a price. It's not a case of "they don't make them like they used to", it's a case of "my CPU requires a 500W PSU where my last computer only needed 200W". On average you'd expect the 500W to burn out 2.5 times faster than the 200W.

It raises the question: is there really a need for this much horsepower in a desktop PC? I'd suggest there isn't; the majority of users could probably get by with a 2GHz machine and onboard graphics. Obviously gamers are a different matter, but hardware enthusiast and overclocker types make up a small percentage of PC users.

I've lost count of the number of times I've heard a PC salesman say "...and you've got the latest graphics card so the kids can play their games...". These kids have probably already got one of the latest consoles, so they aren't going to be bothered that the family PC's packing a behemoth. It'll be there anyway, sucking the life out of the planet like the Crystalline Entity from TNG.

It's a sorry state of affairs; considering all this "climate change" stuff, component manufacturers should be concentrating on making their fare more efficient and PC retailers should be advising customers to buy greener PCs.

15 January 2007

Apple's iPod/phone mashup

I'm purposefully steering clear of Cisco's iPhone brand name in the title of this post in the event that Apple lose the impending court battle. Not that this will make much difference now that every reference to Cisco's product is buried underneath a mountain of links to Apple's. Anyway, I will hereafter refer to Apple's product as the iPhone although I recognise this is Cisco's brand name yada yada...

I wasn't going to blog anything about the iPhone because there seems to be plenty of that going around already, all along the lines of "The 10 worst things about the iPhone (but I'm still going to buy one)". However, the comments in an article on the BBC News website entitled From iPhone to iGroan have prompted me to get writing.

It occurs to me that a lot of the contributors to the article haven't really considered their comments and are just complaining for the sake of it; the TV show "Grumpy Old Men" springs to mind. However...

I'll start my deconstruction of these arguments with this excerpt from one of the contributors:

The functionality really doesn't differ that much from some of the mobiles already on the market. It's just another example of how well Apple have mastered the use of brand loyalty.

When the iPod came out it didn't differ in functionality from other MP3 players on the market. The things that set it apart were its design and Apple's meticulous attention to detail making it perform those functions in the best way possible. You just need to look at the number of units sold to know that it's not just the loyal Apple fans who appreciated that and bought one. Its success had nothing to do with brand loyalty and everything to do with the fact that Apple had created a great product.

Older people seem to miss the point of convergence devices, saying things like "Why do I want to take photos on my mobile phone?". The point is convenience. Your mobile phone, by its nature, is something you have with you most of the time. Building functions like cameras and music players into it means that with no extra effort you can also take photos or listen to music whenever you want to. How many times have you wished you'd had a camera with you? How many times have you remembered a favourite song and wanted to listen to it straight away?

Several contributors allude to the fact that they have somewhat of a love/hate relationship with their phone. They can make calls, but all the other features are hidden away behind a labyrinth of menu screens. Convergence devices are all well and good but, as Apple know, there's no point making something unless it looks nice and is easy to use. Just as with the iPod, it's these two factors that will set the iPhone apart from the crowd and ensure it's a hit with not only the iPod generation but also the skeptics.

This leaves me asking one question - in this day and age, where cultural misunderstanding and racial hatred are rife, how can anything that facilitates better communication be a bad thing?

11 January 2007

Windows Script Components as an alternative to COM

Windows Script Components (WSCs) provide VBScript and JScript developers with the ability to create COM-style components. They can wrap shared functionality into a component which can be registered on a server and instantiated with a CreateObject call just like any other COM component.

Defining a component

<?xml version="1.0"?>
<component>
   <!-- the classid here is illustrative; generate your own GUID, e.g. with uuidgen -->
   <registration
      description="Example person component"
      progid="MyApp.MyClass"
      version="1.0"
      classid="{9f8a7b6c-1d2e-4f30-8a9b-0c1d2e3f4a5b}"/>
   <public>
      <property name="Forename" internalName="strForename" />
      <property name="Surname">
         <get/>
         <put/>
      </property>
      <method name="Save" />
   </public>
   <script language="VBScript">
   <![CDATA[
   
   Dim strForename, strSurname

   Function get_Surname()
      get_Surname = strSurname
   End Function

   Function put_Surname(strValue)
      strSurname = strValue
   End Function

   Function Save()
      If IsEmpty(strForename) Or IsEmpty(strSurname) Then
         Save = False
      Else
         Save = True
      End If
   End Function

   ]]>
   </script>
</component>

Usage

Set myObj = Server.CreateObject("MyApp.MyClass")
myObj.Forename = "Derek"
myObj.Surname = "Fowler"
myObj.Save()
Set myObj = Nothing

Advantages

  • You don't need to server-side include anything to use them, once they're registered they're available from anywhere
  • Like other COM you can use them Application or Session scoped in ASP which means you can use a tailored object rather than abusing Arrays or a Recordset to create your shopping basket
  • Anywhere you can use a COM component you can use a WSC - even within a .NET application

Disadvantages

  • You don't get the performance benefits of proper compiled COM components
  • They don't support destructors, which can make cleaning up a pain
  • You can't do proper locking in VBScript or JScript so it's difficult to avoid concurrency issues, such as when using them Application scoped in ASP

Having said all that, for the majority of applications the advantages certainly outweigh the disadvantages. Creating your data access and business tiers using WSCs allows you to work outside the confines of the ASP environment and create components you can use anywhere that supports COM.

For anyone working in a company that is resisting the adoption of .NET, this ability to write functionality for use in ASP which you can then reuse in ASP.NET provides a clear upgrade path.

Links

Microsoft downloads

06 January 2007

Freeing your digital media

Yesteryear

Ever since MP3 came about and the prospect of storing my entire music collection on my computer became reality, I've ripped CDs and downloaded tracks with much rejoicing. At the time I spent a large proportion of my time in front of a computer and it was great to have any music I felt like listening to either right there or a couple of minutes of downloading away. The iPod came along and added a whole other dimension, allowing me to venture forth into the world with my record collection tucked neatly in my pocket.

These days, however, I work in the computer industry and the last thing I want to do when I get home is sit in front of a computer to listen to my music.

I'd investigated the whole network music thing a while back when Slim Devices were the only kid on the block with their SLIMP3. Although it was a nice design it was rather expensive, and the prospect of having to turn on my computer to use it wasn't all that appealing.

Movies too?!

These days video has gone the way of audio, to the detriment of the MPAA. You have the two original industry standards, MPEG-1 and MPEG-2, but with MPEG-1's poor picture quality and MPEG-2's large file sizes they're not really viable options in the way MP3 is for audio. There are, however, a number of other formats which are:

MPEG-4 Part 2
An improvement upon MPEG-1 which produces similar file sizes but with picture quality closer to that of MPEG-2. There are several implementations but the main two are the commercial DivX and the open source XviD, both of which you may have seen support for on some new DVD players.
Windows Media Video
Now in its ninth incarnation, Microsoft's video compression format is good and has quite wide support, even if it does leave a bad taste in the mouth.
Real video
Has been around almost as long as MPEG-1 and is used for vidcasts by, among others, the BBC. That doesn't stop it being the worst of the three, however.

With these compression formats able to fit a DVD onto a CD (4.7GB down to 700MB) and ever more massive hard disks available, it starts looking like not only can we store and stream all our music but all our movies and TV shows too!

Affordable Network Storage

Network Attached Storage (NAS) appliances are similar to an external USB or FireWire hard disk except that they plug into your network and can be accessed by any computer on it. Recently companies have started making NAS appliances targeted at the home user and I've seen these popping up on Amazon for under £130 for 250GB.

One important point missed out of the product specs on Amazon, however, is that some of these NAS units are DLNA-compliant; indeed it was only after looking on some other sites for price comparisons that I even discovered DLNA existed.

DLNA - The Keystone

DLNA is the Digital Living Network Alliance (formerly the Digital Home Working Group). It's been around since 2003, coming up with a set of guidelines for interoperability between networked devices.

Although I'd never heard of it before, if you look at the roster, the list of companies involved is huge.

So what exactly does a NAS appliance being DLNA-compliant mean?

It means the NAS can not only store digital media, and any other file, but can also stream that media to any DLNA-compliant player connected to the network - something similar to the SLIMP3.

Some of these products are already on the market and more are in development. They include set-top boxes such as Buffalo's LinkTheatre and Philips' Wireless Multi-media Adapter, which will allow you to browse and watch media from your DLNA-compliant NAS, or any Windows PC, on your normal TV. You'll soon be able to buy TVs with a built-in ethernet port that do the same, and portable wireless devices that let you listen to music and watch video anywhere in the house over your network.

Exciting stuff

All this means that our digital media is about to be set free. A NAS for £150 and a media player to sit under the TV for £150 and you can watch all your digital video on your TV and listen to all your digital music on your hi-fi, all without needing your computer on. What's more, if you have HD video on your NAS you can stream that to your HD TV.

If your hi-fi is in another room and you want to stream music there as well, just buy a music player to sit on top of it that features a remote control and a screen for browsing through your music collection.

It seems DLNA has all the bases covered and I'll look forward to seeing more products coming to market with a DLNA Certified logo on. Now all they need to do is tell people about it, which may be easier said than done with Microsoft pushing Windows Media Connect and Intel pushing Viiv. DLNA may still have the edge, however, as both of these require you to buy an expensive media centre PC to go under your TV. We'll have to wait and see.

Update - CES 2007

A recent article from the BBC's Click technology programme on the 2007 Consumer Electronics Show has this to say about DLNA:

Many companies are supporting a set of standardised formats through industry groups like the Digital Living Network Alliance (DLNA).

Not much admittedly, but it's a start.

15 December 2006

Working with XML - Part 3 - Formatting XML using XSL

XSL is a language for transforming XML into different formats, itself written in XML using specific elements and attributes. To achieve the transform you simply load your XML and XSL into DOM objects, then call transformNode on the XML document, passing it your XSL document, and it will return the transformed output.

XSL is effectively a programming language: it has conditional and looping logic structures as well as variables and callable "functions". I'd recommend reading XSL @ W3Schools as this explains the basics well.

Recap

In parts one and two we looked at loading a sample bit of XML and how to select certain nodes from it using XPath.

<?xml version="1.0" ?>
<library>
   <authors>
      <author id="12345">
         <name>Charles Dickens</name>
      </author>
      <author id="23456">
         <name>Rudyard Kipling</name>
      </author>
   </authors>
   <books>
      <book>
         <title>Great Expectations</title>
         <author>12345</author>
      </book>
      <book>
         <title>The Jungle Book</title>
         <author>23456</author>
      </book>
   </books>
</library>

Using XSL we can convert this XML document into, for example, HTML to send to a web browser. We could also transform it into SQL INSERT statements for adding the data to a database.

HTML Output - Listing Data

Here's an example of how you could output the records from the sample XML:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- more stuff to go here -->

</xsl:stylesheet>

We start with the stylesheet element which specifies the XML namespace "xsl" which is used as the prefix for all the special XSL elements.

<xsl:template match="/">
   <html>
      <head>
         <title>Library</title>
      </head>
      <body>
         <h1>Library</h1>
         <xsl:apply-templates select="authors/author" />
      </body>
   </html>
</xsl:template>

Next we have a template element with a match attribute equal to "/". This template matches the root of the XML document, so the transform starts here, outputting the contents of the template. The apply-templates element then tells the transform to apply the appropriate templates to the nodes returned by the value of the select attribute which, you'll notice, is XPath.

<xsl:template match="author">
   <h2><xsl:value-of select="name" /> (<xsl:value-of select="@id" />)</h2>
   <p>Books by this author:</p>
   <table>
      <tr>
        <th>Title</th>
      </tr>
      <xsl:apply-templates select="/library/books/book[author = current()/@id]" />
   </table>
</xsl:template>

This template matches the author nodes selected, so processing passes here at this point - similar to a function call in a normal programming language. Here the value-of elements output the value of the nodes identified by their select attributes - XPath again.

The next apply-templates element selects all the book elements with an author child element whose value is equal to the current author element's id attribute. It uses the current() XSL function to get at the element being transformed by the template.

<xsl:template match="book">
   <tr>
      <td><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

Finally the titles of the books are written out in table rows, so what you end up with after calling transformNode is this:

<html>
  <head>
    <title>Library</title>
  </head>
  <body>
    <h1>Library</h1>
    <h2>Charles Dickens (12345)</h2>
    <p>Books by this author:</p>
    <table>
      <tr>
        <th>Title</th>
      </tr>
      <tr>
        <td>Great Expectations</td>
      </tr>
    </table>
    <h2>Rudyard Kipling (23456)</h2>
    <p>Books by this author:</p>
    <table>
      <tr>
        <th>Title</th>
      </tr>
      <tr>
        <td>The Jungle Book</td>
      </tr>
    </table>
  </body>
</html>

Using the output element

XHTML compliant output

XSL has an HTML output mode that you can specify by adding this to the top of your stylesheet, before any template elements:

<xsl:output method="html" />

However, Microsoft.XMLDOM will mess around with your tags if you use this, so if you want your output to be XHTML compliant you need to use the XML output mode instead:

<xsl:output method="xml" omit-xml-declaration="yes" />

Outputting a DOCTYPE

If you want to output a DOCTYPE (which you need to get IE to obey the CSS box model properly) you add a few attributes to your output element:

<xsl:output 
   method="xml" 
   omit-xml-declaration="yes"
   doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
   doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
/>

which will produce:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

For more info on the output element visit output @ W3Schools.

Useful XSL snippets

Alternating row classes on tables

<xsl:template match="book">
   <tr>
      <xsl:attribute name="class">
         <xsl:choose>
            <xsl:when test="(position() mod 2) = 0">even</xsl:when>
            <xsl:otherwise>odd</xsl:otherwise>
         </xsl:choose>
      </xsl:attribute>
      <td><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

Template applicability

You can make nodes of the same type produce different output by changing the match attribute of your template:

<xsl:template match="book[author = 12345]">
   <tr>
      <td class="highlight"><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

<xsl:template match="book">
   <tr>
      <td><xsl:value-of select="title" /></td>
   </tr>
</xsl:template>

Here we're applying a highlight class to the rows for books by author 12345.

06 November 2006

Working with XML - Part 2 - Using XPath to query XML

XPath is a simple language for querying XML documents in order to retrieve nodes matching particular criteria. There are some good references and tutorials out there to help you get to grips with the basics; i'd recommend reading XPath @ W3Schools for starters and then running through the Zvon XPath Tutorial before reading on.

XSL, which the next part of this post is about, makes much use of XPath so you need to get up to speed with it before you venture into XSL.

Recap

In part one we loaded the following XML into the DOM and performed various operations with DOM properties and methods. To use XPath there are only two methods: selectNodes, which returns a node list, and selectSingleNode, which returns a single node.

<?xml version="1.0" ?>
<library>
  <authors>
     <author id="12345">
        <name>Charles Dickens</name>
     </author>
     <author id="23456">  
        <name>Rudyard Kipling</name>
     </author>        
  </authors>
  <books>
     <book>
        <title>Great Expectations</title>
        <author>12345</author>
     </book>
     <book>
        <title>The Jungle Book</title>
        <author>23456</author>
     </book>
  </books>
</library>

XPath allows you to do some quite complex data selection and analysis using a mixture of path syntax, axes, predicates and functions. Having said that, i had a hard time finding decent examples of doing some fairly simple stuff.

Important - Setting the SelectionLanguage property

If you're using the Microsoft XMLDOM COM component you need to set the SelectionLanguage property of the DOM document to "XPath" otherwise you'll get some very odd results - you do this as follows:

xmlDoc.setProperty "SelectionLanguage", "XPath"

Example 1 - Selecting nodes and checking return value

'a single node
strXPath = "/library/authors"
Set ndAuthors = xmlDoc.documentElement.selectSingleNode(strXPath)

'This test checks whether authors was found or not...
If ndAuthors Is Nothing Then
  'error not found!
Else
  'do something with authors
End If

'multiple nodes
strXPath = "/library/authors/author"
Set nlAuthors = xmlDoc.documentElement.selectNodes(strXPath)

'This test checks whether author nodes were found or not...
If nlAuthors.Length = 0 Then
  'error not found!
Else
  For Each ndAuthor In nlAuthors
     'do something with nodes
  Next
End If

If you ran through the Zvon XPath tutorial earlier you should now be able to do some basic selecting of nodes using the two methods i've just shown you.

In the next few examples i'm going to run through some of the things you'll probably want to do but tutorials like the Zvon one don't cover.

Example 2 - Predicates and Axes

Selecting the author element with id "12345"...

strXPath = "/library/authors/author[@id='12345']"
Set ndNode = xmlDoc.documentElement.selectSingleNode(strXPath)

Only selecting the author's name...

strXPath = "/library/authors/author[@id='12345']/name"
Set ndNode = xmlDoc.documentElement.selectSingleNode(strXPath)

Titles of books written by that author...

strXPath = "/library/books/book[author='12345']/title"
Set nlNodes = xmlDoc.documentElement.selectNodes(strXPath)

Example 3 - Functions

Simple counting of nodes - note that the Microsoft DOM's selectSingleNode can only evaluate expressions that return nodes, so you can't pass it an XPath function like count() directly; the easiest way to count matches is via the Length of a node list...

strXPath = "/library/authors/author"
intCount = xmlDoc.documentElement.selectNodes(strXPath).Length

Combining count with the ancestor axis allows you to select nodes of a particular depth...

strXPath = "//*[count(ancestor::*) > 2]"
Set nlDeepNodes = xmlDoc.documentElement.selectNodes(strXPath)

Books with "The" in their title...

strXPath = "/library/books/book[contains(title,'The')]"
Set nlNodes = xmlDoc.documentElement.selectNodes(strXPath)

Example 4 - Common tasks

Remove nodes that match certain criteria...

strXPath = "/library/books/book[contains(title,'The')]"
Set nlNodes = xmlDoc.documentElement.selectNodes(strXPath)
For Each ndNode In nlNodes
   ndNode.parentNode.removeChild ndNode
Next

21 October 2006

Working with XML - Part 1 - Using the DOM

XML (Extensible Markup Language) is a way of storing information as text by applying structure and meaning to data using a system of nested elements and attributes. HTML is a loose form of XML because it consists of elements, attributes and text, although it doesn't always obey the strict rules necessary for valid XML.

The basic rules are:

  • Element and attribute names are case sensitive i.e. derek != Derek
  • Each element must be closed
  • All an element's child elements must be closed before the element itself is closed (i.e. elements must nest properly)
  • An XML document must have one element only as its root node
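
For example, this illustrates the nesting rule - the first snippet is tolerated by most browsers as HTML but is invalid XML, because the i element is closed after its parent b:

```xml
<!-- invalid: child closed after its parent -->
<b><i>text</b></i>

<!-- valid: all children closed first -->
<b><i>text</i></b>
```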

Document Type Definitions (DTD) and XML Schema Definition (XSD) are methods for defining the structure of an XML document and ensuring it adheres to your specification. Although i'm not going to cover them in this post they're very important particularly if you're letting other people write XML for your system.

Examples are in ASP/VBScript

Example 1 - Some XML

<?xml version="1.0" ?>
<library>
   <authors>
      <author id="12345">
         <name>Charles Dickens</name>
      </author>
      <author id="23456">   
         <name>Rudyard Kipling</name>
      </author>         
   </authors>
   <books>
      <book>
         <title>Great Expectations</title>
         <author>12345</author>
      </book>
      <book>
         <title>The Jungle Book</title>
         <author>23456</author>
      </book>
   </books>
</library>

In this example we have a library element as the root node of the XML document. Inside that we have an authors element containing author elements and a books element containing book elements. The author elements have an id attribute and a name child element whereas the book elements have a title element and an author element containing the id of the associated author.

You can see from this example how it's easy to denote quite complex relationships in a simple, readable manner. I'll base the other examples in this post around this bit of XML.

Example 2 - Loading XML in to a DOM

Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load(Server.MapPath("mydoc.xml"))
Response.Write xmlDoc.xml

This snippet loads an XML document from a file into a DOM object. There's also a loadXML method on the Microsoft DOM object for loading a string containing XML.
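
For example, loadXML takes the XML directly as a string. Whichever method you use, it's worth checking the parseError property afterwards, because a failed load doesn't raise a script error by itself:

```vbscript
Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.loadXML "<library><authors /><books /></library>"

'load and loadXML fail silently - check parseError to be sure
If xmlDoc.parseError.errorCode <> 0 Then
   Response.Write "Parse error: " & xmlDoc.parseError.reason
End If
```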

Once we've got our XML loaded we can traverse the tree, read data, change properties and save it back to a file.

Set xmlDoc = Server.CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load(Server.MapPath("mydoc.xml"))
Set ndRoot = xmlDoc.documentElement

'retrieving element names
Response.Write ndRoot.tagName & "<br />"

'looping though nodes
Set ndAuthors = ndRoot.firstChild
For Each ndAuthor In ndAuthors.childNodes
   Response.Write ndAuthor.getAttribute("id") & "<br />"
Next

'setting attributes
Set ndSecondAuthor = ndAuthors.childNodes(1)
ndSecondAuthor.setAttribute "id", 99999
Response.Write ndSecondAuthor.getAttribute("id") & "<br />"

'retrieving and setting node text
Response.Write ndSecondAuthor.firstChild.text & "<br />"
ndSecondAuthor.firstChild.text = "Joe Bloggs"
Response.Write ndSecondAuthor.firstChild.text & "<br />"

18 October 2006

Simple event driven programming using VBScript's GetRef

VBScript doesn't have an event implementation, so if you fancy features like attaching handlers which respond to specific events on your objects, you can get them simply by using the GetRef function and a bit of "syntactic sugar".

I'm using ASP in these examples cos it's easy.

Example 1 - Simple Events

'Create a handler
Function MyHandler()
   Response.Write "Hello from the handler!"
End Function

'Create an event
Dim OnLoad
Set OnLoad = GetRef("MyHandler")

'Fire the event
OnLoad()

Here we've created a simple event with a single handler function and fired it, which in turn called the function we attached.

To turn this in to a more useful event system we can use an array for the OnLoad event variable thus...

'Create some handlers
Function MyHandler1()
   Response.Write "Hello from handler 1!"
End Function

Function MyHandler2()
   Response.Write "Hello from handler 2!"
End Function

'Create an event
Dim OnLoad
OnLoad = Array(GetRef("MyHandler1"), GetRef("MyHandler2"))

'Fire the event
For Each handler In OnLoad
   handler()
Next

Example 2 - Event Arguments

In most event implementations the event handlers take one argument, passed to them by the fired event, which contains things like the type of event and a reference to the object on which it was fired etc.

'Create a handler which takes one argument
Function MyHandler(e)
   Response.Write "Hello from the handler - i was called by " & e
End Function

'Create two events
Dim OnLoad
Set OnLoad = GetRef("MyHandler")

Dim OnUnload
Set OnUnload = GetRef("MyHandler")

'Fire the events
OnLoad("Load")
OnUnload("Unload")

Wrapping it up

We've established we can do all the basics of events, now all we need to do is wrap it up in a few classes to make it usable.

First we need an Event class that we can instantiate for each event we want. This will have to expose an event arguments property and methods for attaching handlers and firing the event. It will also have to keep track internally of the attached handlers. Let's have a go...

Class clsEvent

   'An array to keep track of our handlers
   Private aryHandlers()

   'Our event arguments object to be passed 
   'to the handlers
   Public EventArgs

   Private Sub Class_Initialize()
      ReDim aryHandlers(-1)
      Set EventArgs = New clsEventArgs
   End Sub

   Private Sub Class_Terminate()
      Set EventArgs = Nothing
      Erase aryHandlers
   End Sub

   'Method for adding a handler
   Public Function AddHandler(strFunctionName)
      ReDim Preserve aryHandlers(UBound(aryHandlers) + 1)
      Set aryHandlers(UBound(aryHandlers)) = _
         GetRef(strFunctionName)
   End Function

   'Method for firing the event
   Public Function Fire(strType, objCaller)
      EventArgs.EventType = strType
      Set EventArgs.Caller = objCaller
      For Each f In aryHandlers
         f(EventArgs)
      Next
   End Function

End Class

Next we need an EventArgs class for passing data about the event to the handlers. This just needs three properties: event type, caller and an arguments collection for event-type-specific things.

Class clsEventArgs

   Public EventType, Caller, Args

   Private Sub Class_Initialize()
      Set Args = CreateObject("Scripting.Dictionary")
   End Sub

   Private Sub Class_Terminate()
      Args.RemoveAll
      Set Args = Nothing
   End Sub

End Class

Next, our class that has an event - in this case an OnLoad event which fires when the object's Load method is called. We'll also create a few handlers and do a trial run.

Class MyClass

   Public OnLoad

   Private Sub Class_Initialize()
      'Setting up our event
      Set OnLoad = New clsEvent

      'Adding an argument
      OnLoad.EventArgs.Args.Add "arg1", "Hello"
   End Sub

   Public Function Load()
      Response.Write "loading the object here!<br />"
      
      'Firing the event
      OnLoad.Fire "load", Me
   End Function

End Class


'A couple of handler functions for the events
Function EventHandler(e)
   Response.Write "<h2>EventHandler</h2>"
   Response.Write "<p>Event """ & e.EventType & """ fired by " & _
      "object of type " & TypeName(e.Caller) & ".</p>"
End Function

Function EventHandler2(e)
   Response.Write "<h2>EventHandler2</h2>"
   For Each x In e.Args
      Response.Write x & ": " & e.Args(x) & "<br />"
   Next
End Function

'instantiate the object, attach the handlers and call the load
Set myObj = New MyClass
myObj.OnLoad.AddHandler("EventHandler")
myObj.OnLoad.AddHandler("EventHandler2")
myObj.Load()

Event based programming reverses the responsibility for code execution within your program. In conventional procedural programming it would be the responsibility of MyClass to make sure the two event handlers were fired when its Load method was called. By using an OnLoad event instead, myObj doesn't have to know anything about the environment in which it's executing - it just blindly fires the event and any attached handlers are called. In this way you can add functions which run when myObj's Load method is called without modifying MyClass.

In more complex systems being able to add functionality with a minimum of intrusion into other parts of the system is a big bonus and event based programming is an easy way of achieving it.