XQuery for the SQL Programmer - Grouping and aggregation
Today's topic in the XQuery for the SQL programmer series is grouping.
Where grouping is built-in the SQL language, this is unfortunately a bit less trivial in XQuery. Let's start with a simple example, count the number of bids for every user.
In the margin, the XML Query Working Group has recognized the issue, and is active working on adding native grouping capabilities to the language. All this in scope of XQuery 1.1. Today still a working draft, but at least we know it will change and improve over time.
Where grouping is built-in the SQL language, this is unfortunately a bit less trivial in XQuery. Let's start with a simple example, count the number of bids for every user.
select USERID, count(*)XQuery has no built-in grouping construct, but you can achieve the same using the distinct-values() function, which we introduced in our last week's post.
from BIDS
group by USERID
for $userid in distinct-values(collection("BIDS")/BIDS/USERID)
return
<user>
<userid>{$userid}</userid>
<count>{count(collection("BIDS")/BIDS[USERID = $userid])}</count>
</user>Let's now add a having clause, we're interested in the users with more than 1 bid,select USERID, count(*)Our previous XQuery is changed as follows. Note that we use a let-clause, which we introduced in Joining Data.
from BIDS
group by USERID
having count(*) > 1
for $userid in distinct-values(collection("BIDS")/BIDS/USERID)
let $count := count(collection("BIDS")/BIDS[USERID = $userid])
where $count > 1
return
<user>
<userid>{$userid}</userid>
<count>{$count }</count>
</user>And we further detail our query, as we're only interested in significant bids of more than 100$.select USERID, count(*)In XQuery we do,
from BIDS
where BID > 100
group by USERID
having count(*) > 1
for $userid in distinct-values(collection("BIDS")/BIDS/USERID)
let $count := count(collection("BIDS")/BIDS[USERID = $userid and BID > 100])
where $count > 1
return
<user>
<userid>{$userid}</userid>
<count>{$count }</count>
</user>As a final example, let's discuss a grouping scenario based on two values. For all items group on the rating of the user, and count the number of bids, in addition to the average bid,select ITEMNO, RATING, AVG(BID), COUNT(*) from xvs001.BIDS, xvs001.USERSIn XQuery one could write the following query. Not the two for-loops over distinct-values().
where USERS.USERID = BIDS.USERID
GROUP BY RATING, ITEMNO
ORDER BY ITEMNO, RATING
for $rating in distinct-values(collection("USERS")/USERS/RATING)
for $itemno in distinct-values(collection("BIDS")/BIDS/ITEMNO)
for $userid in collection("USERS")/USERS[RATING=$rating]/USERID
let $bids := collection("BIDS")/BIDS[ITEMNO=$itemno and USERID = $userid]
let $avg := avg($bids/BID)
let $count := count($bids)
where $bids
order by $itemno, $rating
return
<result>
<<itemno>{$itemno}</itemno>
<rating>{$rating }</rating>
<avg>{$avg}</avg>
<count>{$count}</count>
</result>We can conclude that resolving grouping problems in XQuery 1.0 is less simple than in SQL. Unfortunate, but today's reality.In the margin, the XML Query Working Group has recognized the issue, and is active working on adding native grouping capabilities to the language. All this in scope of XQuery 1.1. Today still a working draft, but at least we know it will change and improve over time.
Labels: relational, SQL, XQuery

0 Comments:
Post a Comment
<< Home