XML Connections

Tuesday, January 29, 2008

Converting tab delimited files into EDI


A few days ago I received a email saying "I am a DataDirect XQuery user, and I'm successfully using your product to create XML-based reports from incoming EDI messages. Now I'm in need to do something somewhat different: it may happen that my company receives tab-delimited messages, and those messages need to be translated to EDI X12 831, rather than converted to XML. Can your product help me handling tab-delimited to EDI conversions?".

Not being a marketing person, I usually like showing how things are done, rather than just claiming that they can be easily done using our products... :)
So, I asked the user for an example of his data, a tab-delimited file. If you have read some of the other posts in this Blog, you will know that DataDirect XQuery can access tab-delimited files thanks to the XML Converters technology. So, consider this tab-delimited file:

id description quantity unitprice
001 Telephone 1 23.00
002 Desk 1 129.00
003 Keyboard 2 21.00

The following XQuery will return, for example, all the description fields in the tab-delimited input message:

 <items> {
  for $item in doc("converter:TAB:first=yes?file:///c:/sample.tab")//row
  return $item/description
 } </items>

The result is:

 <items>
  <description>Telephone</description>
  <description>Desk</description>
  <description>Keyboard</description>
 </items>

That shows that we can handle a tab-delimited file in DataDirect XQuery and generate XML; but it doesn't help much our user: he needs to output EDI.
But we know how to do that; Marc has shown us in this post how you can create EDI messages from XQuery. We just need to apply the same idea to our example; the following XQuery uses the incoming tab-delimited file to apparently generate an XML fragment consistent with a specific vocabulary; that vocabulary is what the underlying XML Converters technology knows how to translate in the corresponding EDI message (an X12 832 message in this case, which is a price/sales catalog). The serialization option at the beginning of the XQuery is instructing DataDirect XQuery to output a raw EDI message rather than XML; that way we are moving data from a tab-delimited file format into an EDI message manipulating it as if it was all XML, but without ever materializing it as XML! Many of the EDI fields typically specified are omitted here; but this example will run if you want to see how it works.

 declare option ddtek:serialize "method=EDI,long=yes";
 <X12>
  <ISA>
   <ISA06-InterchangeSenderId>1515151515</ISA06-InterchangeSenderId>
   <ISA08-InterchangeReceiverId>5151515151</ISA08-InterchangeReceiverId>
   <ISA11-RepetitionSeparator>^</ISA11-RepetitionSeparator>
   <ISA13-InterchangeControlNumber>000032123</ISA13-InterchangeControlNumber>
   <ISA14-AcknowledgmentRequested>0</ISA14-AcknowledgmentRequested>
   <ISA15-UsageIndicator>P</ISA15-UsageIndicator>
   <ISA16-ComponentElementSeparator>*</ISA16-ComponentElementSeparator>
  </ISA>
  <GS>
   <GS01-FunctionalIdentifierCode>CT</GS01-FunctionalIdentifierCode>
   <GS02-ApplicationSendersCode>9988776655</GS02-ApplicationSendersCode>
   <GS03-ApplicationReceiversCode>1122334455</GS03-ApplicationReceiversCode>
   <GS06-GroupControlNumber>128</GS06-GroupControlNumber>
   <GS07-ResponsibleAgencyCode>X</GS07-ResponsibleAgencyCode>
   <GS08-VersionReleaseIndustry>004030</GS08-VersionReleaseIndustry>
  </GS>
  <TS_832>
   <ST>
    <ST01-TransactionSetIdentifierCode>832</ST01-TransactionSetIdentifierCode>
    <ST02-TransactionSetControlNumber>12345</ST02-TransactionSetControlNumber>
   </ST>
   <BCT>
    <BCT01-CatalogPurposeCode>CP</BCT01-CatalogPurposeCode>
    <BCT02-CatalogNumber>GOV56789</BCT02-CatalogNumber>
    <BCT10-TransactionSetPurposeCode>00</BCT10-TransactionSetPurposeCode>
   </BCT>
   {
    for $row in doc('converter:TAB:first=yes?file:///c:/sample.tab')/table/row
    return
     <GROUP_5>
      <LIN>
       <LIN02-ProductServiceIdQualifier>MF</LIN02-ProductServiceIdQualifier>
       <LIN03-ProductServiceId>{$row/id/text()}</LIN03-ProductServiceId>
       <LIN07-ProductServiceId>{$row/description/text()}</LIN07-ProductServiceId>
       <LIN09-ProductServiceId>{$row/quantity/text()}</LIN09-ProductServiceId>
      </LIN>
      <GROUP_6>
       <CTP>
        <CTP03-UnitPrice>{$row/unitprice/text()}</CTP03-UnitPrice>
       </CTP>
      </GROUP_6>
     </GROUP_5>
   }
   <SE/>
  </TS_832>
  <GE/>
 </X12>

Curious about the result? Here you go:
ISA+00+ +00+ +ZZ+1515151515 +ZZ+5151515151 +080129+1710+U+00000+000032123+0+P+*'
GS+CT+9988776655+1122334455+20080129+1710+128+X+004030'
ST+832+12345'
BCT+CP+GOV56789++++++++00'
LIN++MF+001++++Telephone++1'
CTP+++23'
LIN++MF+002++++Desk++1'
CTP+++129'
LIN++MF+003++++Keyboard++2'
CTP+++21'
SE+9+12345'
GE+1+128'
IEA+1+000032123'

The beauty of this approach is that you can use it in conjunction with all the other data aggregation capabilities of DataDirect XQuery, like possibly adding data to the outgoing EDI message which is not available in the incoming tab-delimited file but available inside a relational database (like a encoded field that may need to be looked up in a table); and this will all work taking advantage of the performance and scalability features of DataDirect XQuery and XML Converters: the example above will seamlessly stream from tab-delimited to EDI without ever allocating in memory for than a few items per time.

The same technique applies of course to all the data sources supported by DataDirect XQuery and XML Converters; you can find more details at the following links:

Labels: , , , , ,

Thursday, December 27, 2007

Create (EDI) messages out of your database

This is our third post in a series about XQuery generating multiple XML documents.

It's not uncommon to have the needs to generate XML messages out of data stored in your relational database in some type of batch process. Obviously a first class use case for DataDirect XQuery, and the built-in function ddtek:serialize-to-url comes again to rescue.

Suppose we're a bookstore, running a MySQL database with supplier and order information. As we know, with DataDirect XQuery tables in your relational database are accessed through the fn:collection function. The following query creates an XML document for each of the suppliers, listing all orders of today, and each order includes the book's ISBN number and a quantity. The XML documents are saved in the directory C:\output, as "suppliername.xml" files.

for $supplier in collection("SUPPLIERS")/SUPPLIERS
let $orders := collection("ORDERS")/ORDERS[SUP_ID = $supplier/ID and
ORDER_DATE = current-date()]
where $orders (: ignore the supplier if there are no orders for today :)
return
let $url := concat("file:///C:/output/", $supplier/NAME, ".xml")
let $msg := <supplier name="{$supplier/NAME/}">
<orders>{
for $order in $orders
return
<order>
<isbn>{$order/ISBN/text()}</isbn>
<quantity>{$order/QUANTITY/text()}</quantity>
}</orders>
</supplier>
return
ddtek:serialize-to-url($msg, $url, "method=xml")

But now we need to communicate through EDI with our business partners. DataDirect XQuery not only integrates with DataDirect XML Converters to query and transform EDI messages as shown in a previous post, but also to generate your EDI messages.

In the example above, the third argument of ddtek:serialize-to-url is "method=xml". This instructs DataDirect XQuery to serialize the messages in XML format. By simply specifying "method=EDI", DataDirect XQuery will use the appropriate DataDirect XML Converter to produce EDI messages rather than XML.

The next example is similar to the previous, but generates EANCOM ORDERS messages for each of the suppliers.

for $supplier in collection("SUPPLIERS")/SUPPLIERS
let $orders := collection("ORDERS")/ORDERS[SUP_ID = $supplier/ID and
ORDER_DATE = current-date()]
where $orders
return
let $url := concat("file:///C:/output/", $supplier/NAME, ".edi")
let $edi := (: this message is incomplete :)
<EDIFACT>
<UNB>
<UNB01>
<UNB0101>UNOA</UNB0101>
<UNB0102>4</UNB0102>
</UNB01>
<UNB02>
<UNB0201>MYBOOKSHOP</UNB0201>
<UNB0202>666</UNB0202>
</UNB02>
<UNB03>
<UNB0301>{$supplier/NAME/text()}</UNB0301>
<UNB0302>{$supplier/BUSINESS_ID/text()}</UNB0302>
</UNB03>
<UNB05>6002</UNB05>
</UNB>
<ORDERS>{
for $order at $lineitem in $orders
return
<GROUP_28>
<LIN>
<LIN01>{$lineitem}</LIN01>
<LIN02>1</LIN02>
<LIN03>
<LIN0301>{$order/ISBN/text()}</LIN0301>
<LIN0302>IB</LIN0302>
</LIN03>
</LIN>
<QTY>
<QTY01>
<QTY0101>1</QTY0101>
<QTY0102>{$order/QUANTITY/text()}</QTY0102>
</QTY01>
</QTY>
</GROUP_28>
}</ORDERS>
</EDIFACT>
return
ddtek:serialize-to-url($edi, $url, "method=EDI")

This example creates a fixed EDI message for each of the suppliers, but this can easily be tuned as we're using XQuery. Suppose some of your suppliers require the order information as XML and others as EDI. We can combine parts of the previous two queries.

for $supplier in collection("SUPPLIERS")/SUPPLIERS
let $orders := collection("ORDERS")/ORDERS[SUP_ID = $supplier/ID and
ORDER_DATE = current-date()]
where $orders
return
if ($supplier/MESSAGE_FORMAT = "XML") then
(: continue here with the first query :)
else
(: continue here with the second query :)

I hope this gives you an idea how easy it is to generate XML or EDI messages out of your database. And this is not limited to EANCOM messages, many other standards are supported too, think of X12, EDIFACT, HL7, IATA, etc.
Using DataDirect XQuery together with DataDirect XML Converters and all the expressiveness of the XQuery language there are close to no limits.

Tech Tags:

Labels: , ,

Wednesday, December 19, 2007

Transform a directory of EDI messages

We recently announced DataDirect XQuery 3.1. It integrates now nicely with the DataDirect XML Converters, and as we have the ability to query all files in a directory, this opens a new range of use cases.

Suppose you have a directory with EDI messages and need to transform them all to XML, for example to make the messages consumable for some other business process.

In XQuery generating multiple XML documents? we learned how to query a directory of XML document, and to transform and save each of the documents into a new XML file. To refresh our mind, the following query copies all XML files from one to another directory,

declare function local:get-file-name($document-uri as xs:string){
tokenize($document-uri, "/")[last()]
};

for $doc in fn:collection("file:///C:/input?select=*.xml")
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($doc)))
return
ddtek:serialize-to-url($doc, $filename, "")

But ok, we have now a directory of EDI messages, and want to transform them all into XML. Only a few changes to the above query are needed. The argument to fn:collection is different, in order to make DataDirect XQuery invoke the appropriate DataDirect XML Converter. And second, the local:get-file-name() is a bit more complex to generate the matching .xml filename.

declare function local:get-file-name($document-uri as xs:string){
let $file-name := tokenize($document-uri, "/")[last()]
let $file-name := replace($file-name,'^(.*)\..*','$1')
let $file-name := concat($file-name, ".", "edi")
return $file-name
};

for $doc in fn:collection("converter:///EDI?file:///C:/input?select=*.edi")
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($doc)))
return
ddtek:serialize-to-url($doc, $filename, "")

The resulting XML document need to conform to a specific XML Schema. No problem, XQuery is designed to perform such transformations. Assume we need to transform EDIFACT messages, we start from our previous query and simply add some transformation logic.

for $edi in fn:collection("converter:///EDI?file:///C:/input?select=*.edi")

let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($edi)))
let $xml :=
<order submitter="{$edi/EDIFACT/UNB/UNB02/UNB0201}">{
for $group28 in $edi/EDIFACT/ORDERS/GROUP_28
return
<book>
<id>{
$group28/LIN/LIN03/LIN0301/text()
}</id>
<quantity>{
$group28/QTY/QTY01/QTY0102/text()
}</quantity>
<ISBN>{
$group28/LIN/LIN03/LIN0301/text()
}</ISBN>
</book>
}</order>
return
ddtek:serialize-to-url($xml, $filename, "")

We have still fairly simple code with already significant functionality. You’re now ready to use any of the built-in DataDirect XQuery functionality. Assume the data needs to be enriched with data from a web service or your customer data stored in a relational database. Let’s extend the previous example and add a shipping address to the result, which is obtained out of the CUSTOMERS table in our database.

for $edi in fn:collection("converter:///EDI?file:///C:/input?select=*.edi")
let $address := collection("CUSTOMERS")/CUSTOMERS[ID = $edi/EDIFACT/UNB/UNB02/UNB0201]
let $filename := concat("file:///C:/output/",
local:get-file-name(document-uri($edi)))
let $xml :=
<order submitter="{$edi/EDIFACT/UNB/UNB02/UNB0201}">{
<shipping_address>
<street>{
concat($address/STREET, " ", $address/NUMBER)
}</street>
<zip>{
$address/ZIPCODE
}</zip>
<state>{
$address/STATE
}</state>
</shipping_address>,
for $group28 in $edi/EDIFACT/ORDERS/GROUP_28
return
<book>
<id>{
$group28/LIN/LIN03/LIN0301/text()
}</id>
<quantity>{
$group28/QTY/QTY01/QTY0102/text()
}</quantity>
<ISBN>{
$group28/LIN/LIN03/LIN0301/text()
}</ISBN>
</book>
}</order>
return
ddtek:serialize-to-url($xml, $filename, "")


Hopefully I demonstrated how to query, enrich an transform XML messages. Soon we'll talk about "split a huge XML file into smaller documents".

Tech Tags:

Labels: , ,

Tuesday, October 16, 2007

Converting between EDI and XML... Why?


EDI (Electronic Data Interchange) pre-dates XML by several years; people started using the term EDI to describe the transfer of data, typically across multiple companies, often using VANs (Value Added Networks) or the Internet. Many standards bodies have been creating EDI standards in the past several years; some of the most popular EDI standards in active use today include X12, HL7, EDIFACT, IATA.

And how is XML related to all this? Since the start of XML's adoption in the late 90's, one of XML's primary applications was in handling B2B or even B2C data interchange. The benefits compared to EDI were obvious, numerous, and embedded in the very nature of XML: human readable, easily extensible, easy to validate (well, after XML Schema saw the light, at least), easily parsed using off-the-shelf components, and easily accessed through standard interfaces like DOM, Sax, and StAX. None of these features applies to the EDI standards, as anyone dealing with EDI knows very well.

So, there is no point in worrying about EDI standards; we can just use XML and rely on powerful XML tools like DataDirect XQuery to create and manipulate messages used for data interchange across companies, right? Wrong.

It is true that "the world" is moving in the direction of using XML more and more to deal with data interchange. But it's also true that there are so many existing critical applications based on EDI standards, in so many different industry verticals (health, airlines, and insurance, to name but a few), that EDI is not going away anytime soon.

As you can imagine, the contemporaneous existence of two strong data interchange formats creates another problem: you are company A, and you deal with company B, which "speaks" only EDI, and with company C, which has recently upgraded all its systems which now "speak" only XML... (I wish I were making up this scenario for illustrative purposes, but I am not.) You must become bilingual, speaking and understanding both EDI and XML. Many companies try to address that problem creating ad-hoc code that knows how to deal with EDI and XML, and the many variants that various partner companies may be using in their EDI approach; but that's far from an optimal approach, as developing and testing applications becomes a nightmare, and handling changes is a big challenge.

That's exactly where a product like DataDirect XML Converters can help — XML Converters are streaming-based Java and .NET components that translate between EDI and XML. This means that if your company (company A) receives EDI messages from a partner company (company B), you can still treat that EDI as XML data thanks to XML Converters; similarly, if you need to send an EDI message to a partner company, you can create XML and then rely on XML Converters to translate that XML into EDI. You don't need to write EDI parsers; you can handle variations from the EDI standards through XML Converters; and you can just focus on manipulating XML, preferably through a language like XQuery.

Edict Systems understood the benefits that DataDirect XML Converters bring to the table, and they are now part of the growing number of companies that use XML Converters to handle business situations in which EDI and XML need to co-exist, or where developers want to be shielded from the task of parsing and creating raw EDI. An interesting case study of Edict System's DataDirect XML Converters implementation is now available on the DataDirect XML Converters web site: http://www.xmlconverters.com/customers/edict-systems.html

There's much more to write about other usecases where using DataDirect XML Converters and DataDirect XQuery together provide a very powerful combination, and I'll be blogging about that in future posts.



Tech Tags:

Labels: , , ,