XML Connections

Friday, November 7, 2008

Grouping an XML document based on element names

It has been a while since we last tackled a "pure XML" problem with XQuery, so when I read this unanswered post on the Stylus Studio Developer Network, I thought it was a good chance to discuss it here as an interesting XQuery example.
The problem involves moving from a flat XML structure, where each element name looks like "MAIN1_SUB1_COLNAME1", to a more hierarchical XML that "explodes" the implicit structure hidden in the original XML element names. In the end this is a grouping problem, but a bit trickier than usual, because the groups have to be recognized and extracted from the original element names.

Even though XQuery 1.0 doesn't support grouping explicitly, the fn:distinct-values() function is extremely useful in solving grouping problems. fn:distinct-values() takes a sequence of atomic values as input and returns a sequence containing the same values with any duplicates removed. That helps a lot with our problem, as we can retrieve the unique top-level categories (MAINx) and, for each of them, the unique subcategories (SUBy). Add to that a very simple use of the fn:tokenize() function, which splits a name like "MAIN1_SUB1_COLNAME1" into a sequence like ("MAIN1", "SUB1", "COLNAME1"), and the problem is easily solved with a short XQuery that generates the hierarchical XML we are looking for.

A simple example, but it makes use of some useful functions and structures in XQuery. It is a reminder that, while we keep talking about how useful XQuery is for dealing with heterogeneous data sources and for using the XML Data Model as an abstraction from the physical details of the data, XQuery is also extremely powerful and flexible in the "simpler" cases where you need to manipulate and rearrange XML structures.
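
The fn:distinct-values() plus fn:tokenize() approach described above can be sketched roughly as follows. This is only an illustration, not the query from the original post: the ROW wrapper element, the sample values, and the assumption that every element name has exactly three underscore-separated parts are all made up for the sketch.

```xquery
xquery version "1.0";

(: Hypothetical flat input. The ROW wrapper and the values are assumptions;
   only the MAINx_SUBy_COLNAMEz naming pattern comes from the text. :)
declare variable $flat :=
  <ROW>
    <MAIN1_SUB1_COLNAME1>a</MAIN1_SUB1_COLNAME1>
    <MAIN1_SUB1_COLNAME2>b</MAIN1_SUB1_COLNAME2>
    <MAIN1_SUB2_COLNAME1>c</MAIN1_SUB2_COLNAME1>
    <MAIN2_SUB1_COLNAME1>d</MAIN2_SUB1_COLNAME1>
  </ROW>;

<ROW>{
  (: Unique top-level categories: the first token of every element name. :)
  for $main in distinct-values(
      for $e in $flat/* return tokenize(local-name($e), "_")[1])
  return
    element { $main } {
      (: Unique sub-categories within the current top-level category. :)
      for $sub in distinct-values(
          for $e in $flat/*
          where tokenize(local-name($e), "_")[1] eq $main
          return tokenize(local-name($e), "_")[2])
      return
        element { $sub } {
          (: Leaf elements: the third token becomes the new element name,
             and the original content is copied across. :)
          for $e in $flat/*
          let $t := tokenize(local-name($e), "_")
          where $t[1] eq $main and $t[2] eq $sub
          return element { $t[3] } { $e/node() }
        }
    }
}</ROW>
```

With this input, the result should contain a MAIN1 element holding SUB1 (with COLNAME1 and COLNAME2) and SUB2 (with COLNAME1), plus a MAIN2 element holding SUB1 (with COLNAME1). Note that the order in which fn:distinct-values() returns its values is implementation-dependent, so the groups may not come out in document order on every processor.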
