XML Connections

Thursday, December 13, 2007

Natural sorting in XQuery

Jeff Atwood touched on an interesting topic yesterday: Sorting for Humans: Natural Sort Order.

Let's sort the following strings: a, A, b, B, 1, 2, 10
The ASCIIbetical order is as follows: 1, 10, 2, A, B, a, b
The natural sorting, for most human beings, is as follows: 1, 2, 10, a, A, b, B

Sorting strings in most programming languages produces the ASCIIbetical order, and Jeff wonders whether a more human-friendly natural sort option should be built into mainstream programming languages. What about XQuery? What we're really talking about here is collations, and XQuery has built-in support for collations.

The default collation in XQuery is the Unicode Codepoint collation. For example,

for $s in ("a", "A", "b", "B", "1", "2", "10")
order by $s
return $s

yields the following result: 1, 10, 2, A, B, a, b.

XQuery implementations are allowed to use a different default collation. With DataDirect XQuery, the default collation is based on the locale of your Java Virtual Machine. The query above will result in: 1, 10, 2, a, A, b, B. That's already better: 'a' and 'A' are sorted before 'b' and 'B'. By the way, using the locale implies that on a German system, for example, characters with umlauts will collate as a German would expect.
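
Because the default collation can differ between processors, it sometimes helps to name the collation in the order by clause itself. The following is a minimal sketch, not taken from the DataDirect documentation; it uses the Unicode codepoint collation URI defined by the XQuery 1.0 and XPath 2.0 Functions and Operators specification.

```xquery
(: Ask for the Unicode codepoint collation explicitly, so the ordering
   does not depend on the processor's default collation. :)
for $s in ("a", "A", "b", "B", "1", "2", "10")
order by $s collation "http://www.w3.org/2005/xpath-functions/collation/codepoint"
return $s
```

This returns 1, 10, 2, A, B, a, b whatever the processor's default collation is.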

But we're not there yet: even with the locale-based collation, the numbers are still not sorted naturally. With DataDirect XQuery you can achieve this by explicitly overriding the default collation and specifying the alphanumeric option, as shown in the next query:

declare default collation "http://www.datadirect.com/xquery/collation?alphanumeric=yes";
for $s in ("a", "A", "b", "B", "1", "2", "10")
order by $s
return $s

And we get the desired result: 1, 2, 10, a, A, b, B.
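
The alphanumeric option is specific to DataDirect XQuery. On other processors, a rough approximation can be built from ordinary order by keys. The sketch below assumes each string is either a whole integer or contains no digits at all, so it does not handle mixed values such as "file10". The codepoint collation is named explicitly so that the tie-break between 'a' and 'A' does not depend on the processor's default.

```xquery
(: Approximate natural order for strings that are either whole integers
   or contain no digits. Mixed strings are NOT handled. :)
for $s in ("a", "A", "b", "B", "1", "2", "10")
let $is-number := $s castable as xs:integer
order by
  (: numbers first: true sorts after false, hence descending :)
  $is-number descending,
  (: numbers by numeric value; non-numbers all share the key 0 :)
  (if ($is-number) then xs:integer($s) else 0),
  (: letters case-insensitively :)
  lower-case($s) collation "http://www.w3.org/2005/xpath-functions/collation/codepoint",
  (: tie-break: lowercase before uppercase :)
  $s descending collation "http://www.w3.org/2005/xpath-functions/collation/codepoint"
return $s
```

For this input the result is 1, 2, 10, a, A, b, B. A real collation remains the better choice when your processor offers one, since it also covers locale-specific rules.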

Wednesday, November 7, 2007

Updating XML with XQuery 1.0

I was reading an interesting discussion yesterday on xquery-talk about replacing a node in in-memory XML.

How can one modify an XML structure through XQuery? In the future, the answer is definitely the XQuery Update Facility. But the XQuery Update Facility is still a work in progress and not yet widely supported. What do we do today?

Ryan Grimm wrote an XQuery library to update an in-memory XML structure. The in-mem-update library looks pretty much functionally complete, offering the following functions:

  • node-insert-child
  • node-insert-before
  • node-insert-after
  • node-replace
  • node-delete

How do you use these functions? Let's have a look at a query from the XQuery Update Facility Use Cases, and show an equivalent implementation based on the in-mem-update library.

Consider Q2: "Enter a bid for user Annabel Lee on February 1st, 1999 for 60 dollars on item 1001." The solution based on the XQuery Update Facility is as follows:

let $uid := 
doc("users.xml")/users/user_tuple[name="Annabel Lee"]/userid
return do
insert
<bid_tuple>
<userid>{data($uid)}</userid>
<itemno>1001</itemno>
<bid>60</bid>
<bid_date>1999-02-01</bid_date>
</bid_tuple>
into doc("bids.xml")/bids

Using the library, we end up with something like the following:

import module namespace mem = "http://xqdev.com/in-mem-update" at "in-mem-update.xqy";
let $uid :=
doc("users.xml")/users/user_tuple[name="Annabel Lee"]/userid
return
mem:node-insert-child(
doc("bids.xml")/bids,
<bid_tuple>
<userid>{data($uid)}</userid>
<itemno>1001</itemno>
<bid>60</bid>
<bid_date>1999-02-01</bid_date>
</bid_tuple>)

Looks pretty similar, no? There is one fundamental difference. With the XQuery Update Facility, the bids.xml document itself is updated. The in-mem-update variant doesn't update the bids.xml document; it returns a copy of the original document reflecting the change.

This shows one of the possible issues with the library: each modification made to an XML structure results in a copy. Making a lot of changes to a single XML structure, or updating a huge XML structure, might affect performance. Still, I believe the library is useful in a lot of common scenarios.
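
The copy semantics can be seen without the library at all. The sketch below is not how in-mem-update is implemented. It uses a plain element constructor and made-up sample data (the user ids U01 and U02 and the 45 bid are hypothetical) to show that "inserting" a child in XQuery 1.0 means building a new element around copies of the old content.

```xquery
(: $bids stands in for the content of bids.xml; the values are sample data. :)
let $bids :=
  <bids>
    <bid_tuple><userid>U02</userid><itemno>1002</itemno><bid>45</bid></bid_tuple>
  </bids>
(: The element constructor copies the existing attributes and children
   and appends the new bid; $bids itself is left unchanged. :)
let $updated :=
  <bids>{
    $bids/@*,
    $bids/node(),
    <bid_tuple>
      <userid>U01</userid>
      <itemno>1001</itemno>
      <bid>60</bid>
      <bid_date>1999-02-01</bid_date>
    </bid_tuple>
  }</bids>
return (count($bids/bid_tuple), count($updated/bid_tuple))
```

The query returns 1, 2: the original still holds one bid and only the copy holds two. Every further change repeats the copying, which is where the performance concern comes from.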

The library was written for use with MarkLogic Server and is unfortunately based on an older version of the XQuery specification, so it fails out of the box on XQuery 1.0 compliant processors. I updated the XQuery module to make it XQuery 1.0 compatible, and also added support for document nodes. It's available for download here.

So, you can now update all your data with DataDirect XQuery. Using the ddtek:sql-insert, ddtek:sql-update and ddtek:sql-delete functions, you can update your relational database. And using the in-mem-update library, you can now also make changes to your XML documents.

I believe this library is complementary to the functions modifying XML elements and attributes available in the FunctX XQuery library. Wouldn't it be cool to have these functions added to FunctX? I leave it to Ryan Grimm and Priscilla Walmsley to discuss this in detail.

Friday, October 5, 2007

XQuery questions you've always wanted to ask (but never dared to)

When using a programming language, sooner or later we all end up trying to solve similar problems. When I enjoyed writing applications in Prolog or C++ (yes, many years ago; and yes, I said enjoy), I wasn't lucky enough to be able to search the Internet for answers, and I had to find solutions to problems that I was sure thousands of other developers had already faced (and solved!).

But Internet or no, developers today are still confronted with questions and problems, especially when dealing with relatively new languages; and this is true of XQuery, of course — How do I return a sequence of elements? How do I do grouping? How do I use variables in expressions? Why does using the default namespace make my query fail? And many more.

Since we don't want you to suffer the same way I did when I was younger, we thought it would be a good idea to share with you typical questions (and answers) that we have experienced in the past few years of work on XQuery. The result is a collection of "tips and tricks" that already covers dozens of topics, and we'll augment the collection over time.

If you have recently hit an XQuery problem about which you have a question, or if you have recently solved a problem that you think might be encountered by other XQuery users, let us know! Who knows: maybe the next addition to our tips and tricks pages to help other XQuery developers will be yours!

