XML Connections

Thursday, December 27, 2007

Create (EDI) messages out of your database

This is our third post in a series about XQuery generating multiple XML documents.

It's not uncommon to have the needs to generate XML messages out of data stored in your relational database in some type of batch process. Obviously a first class use case for DataDirect XQuery, and the built-in function ddtek:serialize-to-url comes again to rescue.

Suppose we're a bookstore, running a MySQL database with supplier and order information. As we know, with DataDirect XQuery tables in your relational database are accessed through the fn:collection function. The following query creates an XML document for each of the suppliers, listing all orders of today, and each order includes the book's ISBN number and a quantity. The XML documents are saved in the directory C:\output, as "suppliername.xml" files.

for $supplier in collection("SUPPLIERS")/SUPPLIERS
let $orders := collection("ORDERS")/ORDERS[SUP_ID = $supplier/ID and
ORDER_DATE = current-date()]
where $orders (: ignore the supplier if there are no orders for today :)
return
let $url := concat("file:///C:/output/", $supplier/NAME, ".xml")
let $msg := <supplier name="{$supplier/NAME/}">
<orders>{
for $order in $orders
return
<order>
<isbn>{$order/ISBN/text()}</isbn>
<quantity>{$order/QUANTITY/text()}</quantity>
}</orders>
</supplier>
return
ddtek:serialize-to-url($msg, $url, "method=xml")

But now we need to communicate through EDI with our business partners. DataDirect XQuery not only integrates with DataDirect XML Converters to query and transform EDI messages as shown in a previous post, but also to generate your EDI messages.

In the example above, the third argument of ddtek:serialize-to-url is "method=xml". This instructs DataDirect XQuery to serialize the messages in XML format. By simply specifying "method=EDI", DataDirect XQuery will use the appropriate DataDirect XML Converter to produce EDI messages rather than XML.

The next example is similar to the previous, but generates EANCOM ORDERS messages for each of the suppliers.

for $supplier in collection("SUPPLIERS")/SUPPLIERS
let $orders := collection("ORDERS")/ORDERS[SUP_ID = $supplier/ID and
ORDER_DATE = current-date()]
where $orders
return
let $url := concat("file:///C:/output/", $supplier/NAME, ".edi")
let $edi := (: this message is incomplete :)
<EDIFACT>
<UNB>
<UNB01>
<UNB0101>UNOA</UNB0101>
<UNB0102>4</UNB0102>
</UNB01>
<UNB02>
<UNB0201>MYBOOKSHOP</UNB0201>
<UNB0202>666</UNB0202>
</UNB02>
<UNB03>
<UNB0301>{$supplier/NAME/text()}</UNB0301>
<UNB0302>{$supplier/BUSINESS_ID/text()}</UNB0302>
</UNB03>
<UNB05>6002</UNB05>
</UNB>
<ORDERS>{
for $order at $lineitem in $orders
return
<GROUP_28>
<LIN>
<LIN01>{$lineitem}</LIN01>
<LIN02>1</LIN02>
<LIN03>
<LIN0301>{$order/ISBN/text()}</LIN0301>
<LIN0302>IB</LIN0302>
</LIN03>
</LIN>
<QTY>
<QTY01>
<QTY0101>1</QTY0101>
<QTY0102>{$order/QUANTITY/text()}</QTY0102>
</QTY01>
</QTY>
</GROUP_28>
}</ORDERS>
</EDIFACT>
return
ddtek:serialize-to-url($edi, $url, "method=EDI")

This example creates a fixed EDI message for each of the suppliers, but this can easily be tuned as we're using XQuery. Suppose some of your suppliers require the order information as XML and others as EDI. We can combine parts of the previous two queries.

for $supplier in collection("SUPPLIERS")/SUPPLIERS
let $orders := collection("ORDERS")/ORDERS[SUP_ID = $supplier/ID and
ORDER_DATE = current-date()]
where $orders
return
if ($supplier/MESSAGE_FORMAT = "XML") then
(: continue here with the first query :)
else
(: continue here with the second query :)

I hope this gives you an idea how easy it is to generate XML or EDI messages out of your database. And this is not limited to EANCOM messages, many other standards are supported too, think of X12, EDIFACT, HL7, IATA, etc.
Using DataDirect XQuery together with DataDirect XML Converters and all the expressiveness of the XQuery language there are close to no limits.

Tech Tags:

Labels: , ,

Wednesday, December 19, 2007

Transform a directory of EDI messages

We recently announced DataDirect XQuery 3.1. It integrates now nicely with the DataDirect XML Converters, and as we have the ability to query all files in a directory, this opens a new range of use cases.

Suppose you have a directory with EDI messages and need to transform them all to XML, for example to make the messages consumable for some other business process.

In XQuery generating multiple XML documents? we learned how to query a directory of XML document, and to transform and save each of the documents into a new XML file. To refresh our mind, the following query copies all XML files from one to another directory,

declare function local:get-file-name($document-uri as xs:string){
tokenize($document-uri, "/")[last()]
};

for $doc in fn:collection("file:///C:/input?select=*.xml")
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($doc)))
return
ddtek:serialize-to-url($doc, $filename, "")

But ok, we have now a directory of EDI messages, and want to transform them all into XML. Only a few changes to the above query are needed. The argument to fn:collection is different, in order to make DataDirect XQuery invoke the appropriate DataDirect XML Converter. And second, the local:get-file-name() is a bit more complex to generate the matching .xml filename.

declare function local:get-file-name($document-uri as xs:string){
let $file-name := tokenize($document-uri, "/")[last()]
let $file-name := replace($file-name,'^(.*)\..*','$1')
let $file-name := concat($file-name, ".", "edi")
return $file-name
};

for $doc in fn:collection("converter:///EDI?file:///C:/input?select=*.edi")
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($doc)))
return
ddtek:serialize-to-url($doc, $filename, "")

The resulting XML document need to conform to a specific XML Schema. No problem, XQuery is designed to perform such transformations. Assume we need to transform EDIFACT messages, we start from our previous query and simply add some transformation logic.

for $edi in fn:collection("converter:///EDI?file:///C:/input?select=*.edi")

let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($edi)))
let $xml :=
<order submitter="{$edi/EDIFACT/UNB/UNB02/UNB0201}">{
for $group28 in $edi/EDIFACT/ORDERS/GROUP_28
return
<book>
<id>{
$group28/LIN/LIN03/LIN0301/text()
}</id>
<quantity>{
$group28/QTY/QTY01/QTY0102/text()
}</quantity>
<ISBN>{
$group28/LIN/LIN03/LIN0301/text()
}</ISBN>
</book>
}</order>
return
ddtek:serialize-to-url($xml, $filename, "")

We have still fairly simple code with already significant functionality. You’re now ready to use any of the built-in DataDirect XQuery functionality. Assume the data needs to be enriched with data from a web service or your customer data stored in a relational database. Let’s extend the previous example and add a shipping address to the result, which is obtained out of the CUSTOMERS table in our database.

for $edi in fn:collection("converter:///EDI?file:///C:/input?select=*.edi")
let $address := collection("CUSTOMERS")/CUSTOMERS[ID = $edi/EDIFACT/UNB/UNB02/UNB0201]
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($edi)))
let $xml :=
<order submitter="{$edi/EDIFACT/UNB/UNB02/UNB0201}">{
<shipping_address>
<street>{
concat($address/STREET, " ", $address/NUMBER)
}</street>
<zip>{
$address/ZIPCODE
}</zip>
<state>{
$address/STATE
}</state>
</shipping_address>,
for $group28 in $edi/EDIFACT/ORDERS/GROUP_28
return
<book>
<id>{
$group28/LIN/LIN03/LIN0301/text()
}</id>
<quantity>{
$group28/QTY/QTY01/QTY0102/text()
}</quantity>
<ISBN>{
$group28/LIN/LIN03/LIN0301/text()
}</ISBN>
</book>
}</order>
return
ddtek:serialize-to-url($xml, $filename, "")


Hopefully I demonstrated how to query, enrich an transform XML messages. Soon we'll talk about "split a huge XML file into smaller documents".

Tech Tags:

Labels: , ,

Monday, December 17, 2007

New product versions released!


I just wanted to let you know that we (DataDirect) have recently released new version of our XML products; all of them are available for download and free evaluation:

DataDirect XQuery 3.1
Complete release highlights are available here.
  • New database support for DB2 v9.1 for z/OS, MySQL, Oracle 11g, Informix, PostgreSQL
  • Support for XML-based office document standards (OpenDocument Format, Office Open XML, new PDF versions)
  • Richer support for XQuery output formats (EDI and flat files)
  • New Eclipse-based plug-in for developing and testing queries
  • Performance enhancements
DataDirect XML Converters 3.1
Complete release highlights available here.
  • Support for the HL7 EDI standard
  • XML Schema API generates an XML Schema that describes how to convert XML to non-XML
  • Support for Standard Exchange Format (SEF) to manage custom EDI message types
  • New exception and error listener interfaces for flexible error handling
Stylus Studio 2008
Complete release highlights availabe here.
  • Support for DataDirect XQuery 3.1 and XML Converters 3.1 (embedded in the product)
  • New WSDL editor
  • C# code generation for .NET
  • Simplified generated Java code
  • Auto-complete XML Fragments
  • XML Schema Editor Enhancements
In the coming weeks I'm planning to post examples showing how you can use the new (and old) functionality in our XML products. Of course, if there are specific examples you would like to see covered, I would be very interested in hearing about them.

Labels: , , ,

Thursday, December 13, 2007

Natural sorting in XQuery

Jeff Atwood touched an interesting topic yesterday, Sorting for Humans : Natural Sort Order.

Let's sort the following strings: a, A, b, B, 1, 2, 10
The ASCIIbetical order is as follows: 1, 10, 2, A, B, a, b
The natural sorting, for most human beings, is as follows: 1, 2, 10, a, A, b, B

Sorting strings in most programming languages will result in the ASCIIbetical result, and Jeff wonders if a more human-friendly natural sort option should be built into mainstream programming languages. What about XQuery? We're talking here about collations, and XQuery has built-in support for collations.

The default collation in XQuery is the Unicode Codepoint collation. For example,

for $s in ("a", "A", "b", "B", "1", "2", "10")
order by $s
return $s

yields the following result: 1, 10, 2, A, B, a, b.

XQuery implementation are allowed to used a different default collation. With DataDirect XQuery, the default collation is based on the locale of your Java Virtual Machine. The query above will result in: 1, 10, 2, a, A, b, B. That's already better, 'a' and 'A' are sorted before 'b' and 'B'. By the way, using the locale implies that on a German system for example, characters like umlaut will collate as a German would expect.

But we're not yet there, the numbers are still not naturally sorted. You can achieve this with DataDirect XQuery, by explicitly overriding the default collation and specify the alphanumeric option. As shown in the next query,

declare default collation "http://www.datadirect.com/xquery/collation?alphanumeric=yes";
for $s in ("a", "A", "b", "B", "1", "2", "10")
order by $s
return $s
And we get the desired result: 1, 2, 10, a, A, b, B.

Want more tips and tricks?

Tech Tags:

Labels: ,

Sunday, December 9, 2007

XQuery 1.0 errata published

For the die-hards, or if you care about the details, the W3C has published the first errata for XQuery 1.0 and accompanying specifications.

Errata for XQuery 1.0: An XML Query Language
Errata for XQuery 1.0 and XPath 2.0 Functions and Operators
Errata for XQuery 1.0 and XPath 2.0 Data Model (XDM)
Errata for XSLT 2.0 and XQuery 1.0 Serialization
Errata for XQuery 1.0 and XPath 2.0 Formal Semantics
Errata for XML Syntax for XQuery 1.0 (XQueryX)

Tech Tags:

Labels:

Tuesday, December 4, 2007

XQuery generating multiple XML documents?

A requirement we often see with DataDirect XQuery customers, is the ability to generate multiple results. Imagine a directory with XML files, each of which needs to be transformed and result in a new XML document. Wouldn’t it be great to query the complete directory and generate the corresponding results, all in a single XQuery? No need to glue multiple queries together, do external scripting, write Java or any other proprietary code. Simply a single XQuery!

The use case is not new. And not surprising, people are doing this in XSLT since a long time through the xsl:result-document instruction. Despite the library of almost 200 built-in functions in XQuery, there is not such a function in XQuery 1.0.

The good news is that DataDirect XQuery 3.1 adds out-of-the-box support to serialize query results in a file.

ddtek:serialize-to-url(
$items as item()*,
$url as xs:string,
$options as xs:string)

ddtek:serialize-to-url() has three arguments, first the data to be serialized, second a URL specifying the file to be saved and the last argument tweaks the serialization process.

The following example creates a output.xml file in the c:\results directory,

let $item := <message>XQuery rocks!</message>
return
ddtek:serialize-to-url($item,
"file:///results/output.xml","")

As you see there is no need at all to write any Java code, this can be executed from within your favorite XQuery editor or using the DataDirect XQuery command line utility.

As a URL specifies the output location, you can for example easily upload the result to an FTP server,

let $item := <message>XQuery rocks!</message>
return
ddtek:serialize-to-url($item,
"ftp://uid:pwd@myftpserver/results/", "")

Finally, the serialization of the results can be tweaked. Here we encode the XML output as UTF-16 and specify to include an XML declaration,

let $item := <message>XQuery rocks!</message>
return
ddtek:serialize-to-url($item,
"file:///results/output.xml",
"omit-xml-declaration=no,encoding=UTF-16")

Let's look at a more concrete example, where all XML document are copied from one directory to another. in this query we use DataDirect XQuery's capability to query a complete directory through fn:collection.

declare function local:get-file-name($document-uri as xs:string){
tokenize($document-uri, "/")[last()]
};

for $doc in fn:collection("file:///C:/input?select=*.xml")
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($doc)))
return
ddtek:serialize-to-url($doc, $filename, "")

Assuming this XQuery is saved as C:\xquery\copy-document.xq, it can be executed from the command line as follows

java -jar C:\ddxq\lib\ddxq.jar C:\xquery\copy-document.xq

In the next days, we’ll show through some real use cases how to take advantage of ddtek:serialize-to-url

Tech Tags:

Labels: ,