IBM DB2 v9
     pureXML


          amolpujari@gmail.com
Agenda

•   XML in DB2
•   Saving hordes of lines of code and precious time too
•   SQL/XML
•   XQuery
•   RSS Generator
•   Workflow example
•   XML in Oracle
•   DB2 vs. Oracle (XQuery performance)
XML in DB2
(Native XML Database)


•   DB2 9 was code-named Viper

•   Native XML support
     –   Native Storage for XML
     –   Stores parsed XML
     –   Big documents get divided into regions


•   Indexes on XML column
     –   Internal indexes
     –   User Created indexes


•   XML Schema

•   XQuery
     –   XPath
     –   FLWOR
Native XML support

•   Store XML hierarchically
•   Keep XML as XML
•   DB2 will store XML in parsed hierarchical format (similar to the DOM representation)
•   “Native” = the best-suited on-disk representation of XML

                                CREATE TABLE dept ( deptID char(8), … , doc xml);




•   Relational columns are stored in relational format
•   XML columns are stored natively
•   All XML data is stored in XML-typed columns
Native XML Storage




                     •   One string table per database
                     •   Database-wide dictionary for all tags in all XML columns
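
As a rough illustration of the dictionary idea above, tag names can be replaced by small integer IDs drawn from one shared table. This is a sketch only and not DB2's actual on-disk format; `StringTable` and `encode` are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET


class StringTable:
    """Hypothetical shared dictionary: one integer ID per distinct tag name."""

    def __init__(self):
        self.ids = {}     # name -> ID
        self.names = []   # ID -> name

    def intern(self, name):
        # Assign the next free ID the first time a name is seen.
        if name not in self.ids:
            self.ids[name] = len(self.names)
            self.names.append(name)
        return self.ids[name]


def encode(element, table):
    """Return a nested (tag_id, text, children) tuple for a parsed element."""
    tag_id = table.intern(element.tag)
    text = (element.text or "").strip()
    children = [encode(child, table) for child in element]
    return (tag_id, text, children)


table = StringTable()
doc1 = encode(ET.fromstring("<msg><item><title>A</title></item></msg>"), table)
doc2 = encode(ET.fromstring("<msg><item><title>B</title></item></msg>"), table)
print(table.names)  # both documents share the same three entries
```

Both documents reuse the same IDs, so each tag name is stored once no matter how many documents contain it.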
XML Node Storage Layout
XML in DB2

   DB2 Client/Application
      |  SQL/XML  ->  Relational Interface
      |  XQuery   ->  XML Interface
      v
   DB2 Engine
      SQL/XML Parser   |   XQuery Parser
      Hybrid SQL/XQuery Compiler
      Query Evaluation and Runtime XML Navigation
XML in DB2

•   CREATE TABLE msg ( item XML)

•   INSERT INTO msg VALUES (
     XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'
     PRESERVE WHITESPACE)
     )

•   REGISTER XMLSCHEMA 'http://sample/po'
    FROM 'file:item.xsd'
    AS xscma COMPLETE

•   INSERT INTO msg VALUES (
     XMLVALIDATE(XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'
     PRESERVE WHITESPACE) ACCORDING TO XMLSCHEMA ID xscma)
     )

•   CREATE INDEX xind_newsgroup
    ON msg(item)
    GENERATE KEY USING XMLPATTERN '//@newsgroup'
    AS SQL VARCHAR(50)
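
To make the purpose of an index on `//@newsgroup` concrete, here is a minimal Python sketch. It is an analogue and not how DB2 implements XML indexes; the `rows` data and the `build_attribute_index` helper are hypothetical. It maps each attribute value to the rows that contain it, so a lookup does not need to parse every document.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical table contents: row ID -> XML document text.
rows = {
    1: "<msg id='1' newsgroup='comp.lang.c'><item><title>A</title></item></msg>",
    2: "<msg id='2' newsgroup='ibm.software.unicode'><item><title>B</title></item></msg>",
    3: "<msg id='3' newsgroup='comp.lang.c'><item><title>C</title></item></msg>",
}


def build_attribute_index(rows, attribute):
    """Map each value of `attribute`, found at any depth, to the row IDs holding it."""
    index = defaultdict(list)
    for row_id, text in rows.items():
        root = ET.fromstring(text)
        for node in root.iter():  # the root and all descendants, like '//'
            value = node.get(attribute)
            if value is not None:
                index[value].append(row_id)
    return index


newsgroup_index = build_attribute_index(rows, "newsgroup")
print(newsgroup_index["comp.lang.c"])
```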
Saving hordes of lines of code

• Web applications use databases

• What they get back from the database is relational data

• That relational data must then be turned into XML in the application,
  which involves DOM/SAX operations

• But what if the application could get the required XML directly from the
  database by running a single XQuery?

• With DB2 XML, you
   –   Don't involve so many relational tables
   –   Don't keep fetching relational records
   –   Don't need external DOM/SAX operations
   –   Need just a single XQuery, and the required XML document is ready in one fetch
   –   Save a lot of execution time as well as many lines of code
SQL/XML

•    A standardized mechanism for using SQL and XML together
•    Retrieve data as XML from relational objects
•    A set of functions


xmlelement()      Creates an XML element, allowing the name to be specified
xmlattributes()   Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute
xmlroot()         Creates the root node of an XML document
xmlcomment()      Creates an XML comment
xmlpi()           Creates an XML processing instruction
xmlparse()        Parses a string as XML and returns the resulting XML structure
xmlforest()       Creates XML elements from columns, using the name of each column as the name of the corresponding element
xmlconcat()       Combines a list of individual XML values to create a single value containing an XML forest
xmlagg()          Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.
SQL/XML

SELECT XMLELEMENT(NAME "Department",
              XMLATTRIBUTES( e.department AS "name"),
              XMLAGG( XMLELEMENT(NAME "emp", e.firstname))
) AS "department_list"
FROM employee e
WHERE . . .
GROUP BY e.department




Input (employee):

 firstname      lastname     department

 SEAN           LEE          A00
 MICHAEL        JOHNSON      B01
 VINCENZO       BARELLI      A00
 CHRISTINE      SMITH        A00

Output (department_list):

 <Department name="A00">
    <emp>CHRISTINE</emp>
    <emp>VINCENZO</emp>
    <emp>SEAN</emp>
 </Department>

 <Department name="B01">
    <emp>MICHAEL</emp>
 </Department>
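
For readers who want to see the grouping step spelled out, here is a small Python sketch of what the query above computes: one `Department` element per group, with one `emp` child per row. It is an analogue written with `xml.etree.ElementTree`, not DB2 code; `department_list` is a hypothetical helper, and the rows are the sample rows from this slide.

```python
import xml.etree.ElementTree as ET

employees = [
    ("SEAN", "LEE", "A00"),
    ("MICHAEL", "JOHNSON", "B01"),
    ("VINCENZO", "BARELLI", "A00"),
    ("CHRISTINE", "SMITH", "A00"),
]


def department_list(rows):
    """Group rows by department (GROUP BY) and aggregate emp elements (XMLAGG)."""
    departments = {}
    for firstname, _lastname, department in rows:
        element = departments.get(department)
        if element is None:
            # XMLELEMENT(NAME "Department", XMLATTRIBUTES(department AS "name"))
            element = ET.Element("Department", {"name": department})
            departments[department] = element
        # XMLELEMENT(NAME "emp", firstname)
        ET.SubElement(element, "emp").text = firstname
    return [ET.tostring(e, encoding="unicode") for e in departments.values()]


for xml_text in department_list(employees):
    print(xml_text)
```

The sketch emits employees in input order. The SQL query has no ORDER BY inside XMLAGG, so the order of `emp` elements in its result should not be relied on either.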
SQL/XML

Traditional way

db2 select empno, firstnme, lastname from employee

EMPNO    FIRSTNME       LASTNAME
------   ------------   ---------------
000010   CHRISTINE      HAAS
000020   MICHAEL        THOMPSON
.
.
000030   SALLY          KWAN
200340   ROY            ALONZO

 42 record(s) selected.

// fetching relational data
// construct html table
<table ….
<!-- setting table attributes -->
<% while (rs.next()) { // 42 fetches %>
     // construct table rows
     <tr…>
     <!-- setting row attributes -->
             // construct table columns
             <!-- setting column attributes -->
             <td…><%=(rs.getString("EMPNO"))%>
             <td…><%=(rs.getString("FIRSTNME"))%>
             <td…><%=(rs.getString("LASTNAME"))%>
     </tr>
<% } %>

SQL/XML way

SELECT XMLSerialize(
     XMLELEMENT(NAME "TABLE",
--   XMLATTRIBUTES('80%' AS "width"),
     XMLAGG(      XMLELEMENT(NAME "TR",
                      XMLELEMENT(NAME "TD", empno),
                      XMLELEMENT(NAME "TD", firstnme),
                      XMLELEMENT(NAME "TD", lastname))))
AS varchar(4000)) FROM employee

// single fetch and html(xml) is ready
<% rs.next(); %>
<%=(rs.getString(1))%>
// job done
XQuery
                   New kid on the block


•   A language for running queries against XML-tagged documents in files and “databases”
•   Provides XPath compatibility
•   Supports conditional expressions and element constructors
•   FLWOR expressions: the syntax for retrieving, filtering, and transforming data, alongside
    operators, functions, and path expressions
•   The result of an XQuery is an instance of the XML Query Data Model
•   Uses XML Schema types, offers static typing at compile time and dynamic typing at run time,
    and supports primitive and derived types
•   A query can evaluate to nodes (such as elements and attributes), to atomic values (such as
    strings and numbers), or to sequences of both
•   XQuery update is planned
FLWOR Expression

•   FOR: iterates through a sequence, binding a variable to each item
•   LET: binds a variable to a sequence
•   WHERE: eliminates items of the iteration
•   ORDER BY: reorders items of the iteration
•   RETURN: constructs query results




for $movie in db2-fn:xmlcolumn('MOVIE.DOC')
let $actors := $movie/actor
where $movie/duration > 90
order by $movie/@year
return
<movie>
     {$movie/title, $actors}
</movie>

Result:

<movie>
  <title>Chicago</title>
  <actor>Renée Zellweger</actor>
  <actor>Richard Gere</actor>
  <actor>Catherine Zeta-Jones</actor>
</movie>
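
The five clauses map onto ordinary loop, filter, and sort steps. The following Python sketch mirrors them over a small in-memory list. The three movie documents, including their durations and years, are made up for illustration, and `flwor` is a hypothetical helper rather than any DB2 or XQuery API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents standing in for the MOVIE.DOC column.
movies = [ET.fromstring(text) for text in (
    "<movie year='2002'><title>Chicago</title><duration>113</duration>"
    "<actor>Richard Gere</actor></movie>",
    "<movie year='1999'><title>Short Film</title><duration>20</duration>"
    "<actor>Someone</actor></movie>",
    "<movie year='1994'><title>Long Film</title><duration>142</duration>"
    "<actor>Another</actor></movie>",
)]


def flwor(movies):
    """Mirror for / let / where / order by / return using plain Python steps."""
    # where $movie/duration > 90
    selected = [m for m in movies if int(m.findtext("duration")) > 90]
    # order by $movie/@year
    selected.sort(key=lambda m: int(m.get("year")))
    results = []
    for movie in selected:                  # for $movie in ...
        actors = movie.findall("actor")     # let $actors := $movie/actor
        out = ET.Element("movie")           # return <movie>{...}</movie>
        out.append(movie.find("title"))
        out.extend(actors)
        results.append(ET.tostring(out, encoding="unicode"))
    return results


for xml_text in flwor(movies):
    print(xml_text)
```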
XQuery
(sample data)


                                                 Item (xml)

                <msg id='12' newsserver='news.persistent.co.in' newsgroup='comp.lang.c'>
                    <item>
                        <title>Re: SIGPIPE - Finding the thread</title>
                        <link>&lt;e1lqu9$2ee$1@news.intranet.pspl.co.in&gt;</link>
                        <author>sushrut bidwai &lt;sushrut_bidwai@persistent.co.in&gt;</author>
                        <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
                        <description>some description here…</description>
                    </item>
                </msg>

                <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
                    <item>
                        <title>Gold Mobile</title>
                        <link>&lt;d1nl7v$4lug$5@news.boulder.ibm.com&gt;</link>
                        <author>Nadine &lt;Nadine.grantham@gmail.com&gt;</author>
                        <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
                        <description>some description here…</description>
                    </item>
                </msg>
XQuery
examples

•   Getting the list of messages where the description contains a particular string
    (“uninitialized” in this case)

     xquery
     for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
     where contains($a/item/description,"uninitialized")
     return $a



•   Getting the first 3 messages sent by an author to the news group

    xquery
    let $a :=
    (
         for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
         where contains($b/item/author,"Shridhar")
         return $b
    )
    return $a [position() < 4]
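
The predicate `[position() < 4]` is applied to the already filtered sequence, and positions are counted from 1. A minimal Python sketch of the same two steps follows; the message list and the `first_messages_by` helper are hypothetical.

```python
# Hypothetical messages standing in for the msg documents.
messages = [
    {"id": 1, "author": "Shridhar <shridhar@example.org>"},
    {"id": 2, "author": "Nadine <nadine@example.org>"},
    {"id": 3, "author": "Shridhar <shridhar@example.org>"},
    {"id": 4, "author": "Shridhar <shridhar@example.org>"},
    {"id": 5, "author": "Shridhar <shridhar@example.org>"},
]


def first_messages_by(messages, name, limit):
    """Filter by author substring, then keep items whose 1-based position <= limit."""
    # for $b in ... where contains($b/item/author, name) return $b
    matching = [m for m in messages if name in m["author"]]
    # $a[position() < limit + 1]
    return [m for position, m in enumerate(matching, start=1) if position < limit + 1]


print([m["id"] for m in first_messages_by(messages, "Shridhar", 3)])
```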
XQuery
examples

•   Getting the last 5 messages sent by an author to the news group
    xquery
    let $a := for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
    where contains($b/item/author,"Shridhar")
    return $b
    let $c := count($a)
    let $d := $c - 5
    return $a [position() > $d]


•   Getting the list of authors and the number of messages each has sent to the group


    xquery
    let $a := db2-fn:xmlcolumn('MSG2.ITEM')/msg/item/author
    let $b := distinct-values($a)
    for $e in ($b)
    let $d := count(for $c in db2-fn:xmlcolumn('MSG2.ITEM')/msg/item
    where $c/author = $e
    return $c )
    return
      <result>
         <author>{$e}</author>
         <message-count>{$d}</message-count>
      </result>
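
The author-count query is a group-and-count over the author values. A hedged Python equivalent is sketched below; the author list and `message_counts` are made up for illustration. Unlike the XQuery above, which rescans the column once per distinct author, this sketch counts in a single pass.

```python
from collections import Counter

# Hypothetical author values, one per message.
authors = ["Nadine", "Shridhar", "Nadine", "Sushrut", "Nadine"]


def message_counts(authors):
    """Return (author, count) pairs, one per distinct author in first-seen order."""
    counts = Counter(authors)
    # dict.fromkeys keeps the first occurrence of each value, like distinct-values().
    return [(author, counts[author]) for author in dict.fromkeys(authors)]


print(message_counts(authors))
```

The order in which `distinct-values` returns its items is implementation-dependent, so the first-seen order used here is only a convenience of the sketch.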
RSS Generator

•   RSS: Really Simple Syndication (a lightweight XML format designed for sharing data)

•   A web application that generates RSS and Atom feeds

•   Source: data (messages) from news servers

•   Messages are uploaded from the news server to xmldb2 as XML documents

•   Uses XML Schema definition support for validation at the database level

•   Uses XML indexes as needed, based on the XQueries

•   Needs just a single XQuery fetch to generate an RSS/Atom feed
RSS example

<rss version="2.0">
    <channel>
    <title>news.persistent.co.in: comp.lang.c</title>
    <link>http://news.persistent.co.in</link>
    <description>The latest content from news.persistent.co.in: comp.lang.c</description>
    <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
    <language>en-us</language>
    <copyright>Copyright 2006 Persistent System Private Limited</copyright>

          <item>
          <title>Re: SIGPIPE - Finding the thread</title>
          <link>&lt;e1lqu9$2ee$1@news.intranet.pspl.co.in&gt;</link>
          <author>sushrut bidwai &lt;sushrut_bidwai@persistent.co.in&gt;</author>
          <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
          <description>some description here…</description>
          </item>
          .
          .
          <item>
          .
          .
          <item>
          .
    </channel>
</rss>
RSS Generator
(Administration)




Uploading newsgroup messages to NXD:

   News Server --(news message)--> News Updater --(xml record)--> NXD (Database)

Example xml record:

                 <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
                     <item>
                         <title>Gold Mobile</title>
                         <link>&lt;d1nl7v$4lug$5@news.boulder.ibm.com&gt;</link>
                         <author>Nadine &lt;Nadine.grantham@gmail.com&gt;</author>
                         <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
                         <description>some description here…</description>
                     </item>
                 </msg>
RSS Generator


   Web Browser --(request)--> Generator --(XQuery)--> NXD
   NXD --(<xml>)--> Generator --(response)--> Web Browser
RSS Generator

•    One XQuery and the job is done
•    The result of the XQuery is a single record, which is an RSS document
•    No DOM/SAX code
•    Not even a second fetch


xquery
for $a in ( 1 to 1 )
return
<rss version="2.0">
       <channel>
       <title> newsServer:newsGroup </title>
       <link>http://newsServer</link>
       <description>The latest content from newsServer:newsGroup</description>
       <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
       {
              let $e :=

               (    for $b in
                   db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="newsServer"][@newsgroup="newsGroup"]
                   where $b/item[fn:contains(title,"subject")]
                   and/or $b/item[fn:contains(author,"author")]
                   and/or $b/item[fn:contains(description,"description")]
                   order by fn:number($b/@id) descending
                   return $b
               )
               for $i in ( 1 to n)
               return $e[$i]/item
         }
         </channel>
</rss>
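
The shape of this query (select by server and group, sort by id descending, keep the first n, wrap the items in a channel) can be sketched in Python as follows. This is an illustrative analogue, not the generator's actual code: `build_rss` and the sample documents are hypothetical, and the optional title/author/description filters are left out.

```python
import xml.etree.ElementTree as ET


def build_rss(msg_docs, newsserver, newsgroup, limit):
    """Build an RSS 2.0 document from the newest `limit` matching msg documents."""
    selected = []
    for text in msg_docs:
        msg = ET.fromstring(text)
        # /msg[@newsserver="..."][@newsgroup="..."]
        if msg.get("newsserver") == newsserver and msg.get("newsgroup") == newsgroup:
            selected.append(msg)
    # order by fn:number($b/@id) descending
    selected.sort(key=lambda m: float(m.get("id")), reverse=True)

    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{newsserver}: {newsgroup}"
    ET.SubElement(channel, "link").text = f"http://{newsserver}"
    for msg in selected[:limit]:          # for $i in (1 to n) return $e[$i]/item
        channel.extend(msg.findall("item"))
    return ET.tostring(rss, encoding="unicode")


docs = [
    "<msg id='1' newsserver='s' newsgroup='g'><item><title>old</title></item></msg>",
    "<msg id='2' newsserver='s' newsgroup='g'><item><title>new</title></item></msg>",
    "<msg id='3' newsserver='s' newsgroup='other'><item><title>x</title></item></msg>",
]
print(build_rss(docs, "s", "g", 1))
```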
RSS Generator
Workflow Example
A Document Approval System (one simple content management use case)



•   A Web Application

•   Uses Native XML features

•   Just a single xquery fetch and html (xml) is ready

•   Simple and easy to use

•   Facilitates document review process

•   Uses NXD to store document state related info

•   Facilitates easy querying of requests based on assignee, reviewer, request states etc
Workflow Example
XML in Oracle: Storage

• XMLType storage (gone relational, not native)

   – CLOB

       •   Whole Document Stored in one column
       •   Requires DOM operations
       •   Text Indexing
       •   Inefficient update


   – Object Relational

       • Document Shredded across tables, rows and columns
       • Requires XML Schema
       • Insert/retrieval requires (de)composition
XML in Oracle: XML Index


•   CTXXPATH (gone relational, not native)


     – Used when you need to speed up existsNode() queries on an XMLType column.
     – e.g.

         • CREATE INDEX [schema.]index ON [schema.]table(XMLType column)
           INDEXTYPE IS ctxsys.CTXXPATH
           [PARAMETERS ('[storage storage_pref] [memory memsize]')];


     – Looks a bit complicated
     – No XML specific index support
XML in Oracle: XQuery

•   No full support for XQuery
     e.g.

     SELECT XMLQuery('for $a in
        ora:view("MSG")/ROW/ITEM/msg[@newsgroup="pspl.misc"]
     return <root>{$a/item/title}</root>' RETURNING CONTENT) FROM MSG



     This query returns “null” values for rows where the newsgroup condition does not match.

     The query has to be modified to suppress the “null” values and so get the proper result:

     SELECT XMLQuery('for $a in
        ora:view("MSG")/ROW/ITEM/msg[@newsgroup="pspl.misc"]
     return <root>{$a/item/title}</root>' RETURNING CONTENT) FROM MSG
     WHERE ExistsNode(ITEM,'/msg[@newsgroup="pspl.misc"]')=1


     The more conditions there are, the bigger the query gets, with more ExistsNode calls
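
The “null” rows follow from the query being evaluated once per table row. The Python sketch below imitates that behaviour on made-up data; both helpers are hypothetical and only model the effect described on this slide, not Oracle's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows of the MSG table (ITEM column).
rows = [
    "<msg newsgroup='pspl.misc'><item><title>Lunch</title></item></msg>",
    "<msg newsgroup='comp.lang.c'><item><title>SIGPIPE</title></item></msg>",
    "<msg newsgroup='pspl.misc'><item><title>Cab pool</title></item></msg>",
]


def titles_per_row(rows, newsgroup):
    """One result per row: rows that do not match contribute None."""
    results = []
    for text in rows:
        msg = ET.fromstring(text)
        if msg.get("newsgroup") == newsgroup:
            results.append(msg.findtext("item/title"))
        else:
            results.append(None)
    return results


def titles_filtered(rows, newsgroup):
    """Restrict the rows first, as the added WHERE ExistsNode(...) = 1 does."""
    kept = [t for t in rows if ET.fromstring(t).get("newsgroup") == newsgroup]
    return titles_per_row(kept, newsgroup)


print(titles_per_row(rows, "pspl.misc"))
print(titles_filtered(rows, "pspl.misc"))
```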
XML in Oracle: XQuery

•   No full support for XQuery
     another example

     SELECT XMLQuery('for $a in ora:view("MSG")/ROW/ITEM/msg
     where contains($a/item/title,"join")
     return <root>{$a/item/title}</root>' RETURNING CONTENT) FROM MSG



     This query returns “null” values for rows where contains() returns false.

     –   There is no workaround for this within the query. It cannot be modified to give the proper
         result, because the “contains” function cannot be specified within ExistsNode. The possible
         workaround is to add code at the application level to suppress the “null” values.
DB2 XML vs. Oracle: XQuery performance
(Sample Database Design)

Table1: msg, with a single column Item (xml)

<msg id='12' newsserver='news.persistent.co.in' newsgroup='comp.lang.c'>
    <item>
        <title>Re: SIGPIPE - Finding the thread</title>
        <link>&lt;e1lqu9$2ee$1@news.intranet.pspl.co.in&gt;</link>
        <author>sushrut bidwai &lt;sushrut_bidwai@persistent.co.in&gt;</author>
        <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
        <description>some description here…</description>
    </item>
</msg>

<msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
    <item>
        <title>Gold Mobile</title>
        <link>&lt;d1nl7v$4lug$5@news.boulder.ibm.com&gt;</link>
        <author>Nadine &lt;Nadine.grantham@gmail.com&gt;</author>
        <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
        <description>some description here…</description>
    </item>
</msg>

DB2 xml index:

create index xind_newsserver
on msg(item)
generate key using
xmlpattern '//@newsserver'
as sql varchar(50);

Oracle CTXXPATH index:

CREATE INDEX on MSG (ITEM)
INDEXTYPE IS ctxsys.CTXXPATH

Around 4,50,000 (450,000) XML records on both sides, DB2 and Oracle
DB2 XML vs. Oracle: XQuery performance
Db2_1.sql


  xquery
  for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
  where contains($a/item/description,"sample")
  return $a




                                                 Execution time in milliseconds: 187525




ora_1.sql


  select
  xmlquery ('for $a in /msg
  where contains($a/item/description,"sample")
  return $a'
  passing item returning content) result
  from msg




                                                 ORA-04030: out of process memory
DB2 XML vs. Oracle: XQuery performance
Db2_2.sql


  xquery
  for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
  where contains($a/item/title,"Lint")
  return $a




                                               Execution time in milliseconds: 198474




ora_2.sql


  select
  xmlquery ('for $a in /msg
  where contains($a/item/title,"Lint")
  return $a'
  passing item returning content) result
  from msg




                                               ORA-04030: out of process memory
DB2 XML vs. Oracle: XQuery performance
Db2_3.sql


  xquery
  for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
  where $a/@newsgroup = "control.cancel"
  return $a




                                                                 Execution time in milliseconds:   126858




ora_3.sql


  select
  xmlquery ('for $a in /msg
  return $a'
  passing item returning content) result
  from msg
  Where existsNode(ITEM,'/msg[@newsgroup="control.cancel"]')=1




                                                                 Took more than an hour to fetch all records
DB2 XML vs. Oracle: XQuery performance
Db2_4.sql


  xquery
  let $a := (
  for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
  where contains($b/item/author,"Shantanu Gadgil")
  return $b
            )
  return $a [position() < 10]




                                                                             Execution time in milliseconds:    173419




ora_4.sql


  select * from
  (
                  select xmlquery ('for $a in /msg
                               where contains($a/item/author,"Shantanu Gadgil ")
                               order by $a[@id]
                               return $a'
                  passing item returning content) result
                  from msg
  )
  where rownum <= 10




                                                                             ORA-04030: out of process memory
Thank You

DB2 Native XML

  • 1. IBM DB2 v9 pureXML amolpujari@gmail.com
  • 2. Agenda • XML in DB2 • Saving hordes of lines of code and precious time too • SQL/XML • XQuery • RSS Generator • Workflow example • XML in oracle • DB2 x Oracle (xquery performance)
  • 3. XML in DB2 (Native XML Database) • DB2 code name was Viper • Native XML support – Native Storage for XML – Stores parsed XML – Big documents get divided into regions • Indexes on XML column – Internal indexes – User Created indexes • XML Schema • Xquery – XPATH – FLOWR
  • 4. Native XML support • Storing it hierarchical • Keep XML as XML • DB2 will store XML in parsed hierarchical format (similar to the DOM representation) • “Native” = the best-suited on-disk representation of XML CREATE TABLE dept ( deptID char(8), … , doc xml); • Relational columns are stored in relational format • XML columns are stored natively • All XML data is stored in XML-typed columns
  • 5. Native XML Storage •1 String table per database •Database wide dictionary for all tags in all xml columns
  • 7. XML in DB2 DB2 Client/Application SQL/XML XQuery Relational Interface XML Interface DB2 Engine SQL/XML Parser XQuery Parser Hybrid SQL/XQuery Compiler Query Evaluation and Runtime XML Navigation
  • 8. XML in DB2 • CREATE TABLE msg ( item XML) • INSERT INTO msg VALUES ( XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>' PRESERVE WHITESPACE) ) • REGISTER XMLSCHEMA 'http://sample/po' FROM 'file:item.xsd' AS xscma COMPLETE • INSERT INTO msg VALUES ( XMLVALIDATE(XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>' PRESERVE WHITESPACE) ACCORDING TO XMLSCHEMA ID xscma) ) • CREATE INDEX xind_newsgroup ON msg(item) GENERATE KEY USING XMLPATTERN '//@newsgroup„ AS SQL VARCHAR(50)
  • 9. Saving hordes of lines of code • Web applications use databases • What they get from database is relational data • Relational data need to be used to form xml in the end and this involves DOM/SAX operations • But what if they get the required xml formed direct from database by firing a single xquery? • With DB2 XML, you – Don't involve so many relational tables – Don't keep fetching relational records out – Don't need external DOM/SAX operations – Just need a single Xquery and required xml doc is ready in one fetch – Save a lot of execution time and also hordes of lines of code
  • 10. SQL/XML • A standardized mechanism for using SQL and XML together • Retrieve data as XML from relational objects • A set of functions xmlelement() Creates an XML element, allowing the name to be specified xmlattributes() Creates XML attributes from columns, using the name of each column as the name of the corresponding attribute xmlroot() Creates the root node of an XML document xmlcomment() Creates an XML comment xmlpi() Creates an XML processing instruction xmlparse() Parses a string as XML and returns the resulting XML structure xmlforest() Creates XML elements from columns, using the name of each column as the name of the corresponding element xmlconcat() Combines a list of individual XML values to create a single value containing an XML forest xmlagg() Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.
  • 11. SQL/XML SELECT XMLELEMENT(NAME, “Department”, XMLATTRIBUTES( e.department AS “name”), XMLAGG( XMLELEMENT(NAME “emp”, e.firstname)) ) AS “department_list” FROM employee e WHERE . . . GROUP BY e.department department_list firstname lastname department <Department name=“A00”> <emp>CHRISTINE</emp> SEAN LEE A00 <emp>VINCENZO</emp> <emp>SEAN</emp> MICHAEL JOHNSON B01 </Department> VINCENZO BARELLI A00 <Department name=“B01”> CHRISTINE SMITH A00 <emp>MICHAEL</emp> </Department>
  • 12. SQL/XML Traditional way SQL/XML db2 select empno, firstnme, lastname from employee SELECT XMLSerialize( XMLELEMENT(NAME "TABLE", EMPNO FIRSTNME LASTNAME -- XMLATTRIBUTES(‟80%‟ AS “width”) ------ ------------ --------------- XMLAGG( XMLELEMENT(NAME "TR", 000010 CHRISTINE HAAS XMLELEMENT(NAME "TD", empno), 000020 MICHAEL THOMPSON XMLELEMENT(NAME "TD", firstnme), . XMLELEMENT(NAME "TD", lastname)))) . AS varchar(4000)) FROM employee 000030 SALLY KWAN 200340 ROY ALONZO 42 record(s) selected. // fetching relational data // single fetch and html(xml) is ready //construct html table <% rs.next(); %> <table …. <%=(rs.getString(1))%> <!—setting table attributes -> // job done <%While(rs.next())// 42 fetches {%> // construct table rows <tr…> <!—setting row attributes -> //construct table columns <!—setting column attributes -> <td…><%=(rs.getString(“EMPNO”))%> <td…><%=(rs.getString(“FIRSTNME”))%> <td…><%=(rs.getString(“LASTNAME”))%> </tr> <%}%>
  • 13. XQuery New kid on the block • A language for running queries against XML-tagged documents in files and “databases” • Provides XPath compatibility • Supports conditional expressions, element constructors • FLOWR expressions the syntax for retrieving, filtering, and transforming operators, functions, path • Result of an XQuery is an instance of XML Query Data Model • Uses XML Schema types, offers static typing at compile time and dynamic typing at run time, supports primitive and derived types • could evaluate to simple node values (such as elements and attributes) or atomic values (such as strings and numbers). XQueries can also evaluate to sequences of both nodes and simple values. • XQuery update is planned
  • 14. FLWOR Expression • FOR: iterates through a sequence, bind variables to items • LET: binds a variable to a sequence • WHERE: eliminates items of the iteration • ORDER: reorders items of the iteration • RETURN: constructs query results FOR $movie in db2-fn:xmlcolumn(„MOVIE.DOC‟) LET $actors :=$movie/actor WHERE $movie/duration > 90 ORDER by $movie/@year RETURN <movie> <title>Chicago</title> <movie> <actor>Renne Zellweger</actor> {$movie/title, $actors} <actor>Richard Gere</actor> <actor>Catherine Zita-Jones</actor> </movie> </movie>
  • 15. XQuery (sample data)
      Item (xml)
          <msg id='12' newsserver='news.persistent.co.in' newsgroup='comp.lang.c'>
            <item>
              <title>Re: SIGPIPE - Finding the thread</title>
              <link>&lt;e1lqu9$2ee$1@news.intranet.pspl.co.in&gt;</link>
              <author>sushrut bidwai &lt;sushrut_bidwai@persistent.co.in&gt;</author>
              <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>

          <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
            <item>
              <title>Gold Mobile</title>
              <link>&lt;d1nl7v$4lug$5@news.boulder.ibm.com&gt;</link>
              <author>Nadine &lt;Nadine.grantham@gmail.com&gt;</author>
              <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>
  • 16. XQuery examples
      • Getting the list of messages whose description contains a particular string ("uninitialized" in this case)
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where contains($a/item/description, "uninitialized")
          return $a
      • Getting the first 3 messages sent by an author to the newsgroup
          xquery
          let $a := (
            for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
            where contains($b/item/author, "Shridhar")
            return $b )
          return $a[position() < 4]
  • 17. XQuery examples
      • Getting the last 5 messages sent by an author to the newsgroup
          xquery
          let $a := (
            for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
            where contains($b/item/author, "Shridhar")
            return $b )
          let $c := count($a)
          let $d := $c - 5
          return $a[position() > $d]
      • Getting the list of authors and the number of messages each has sent to the group
          xquery
          let $a := db2-fn:xmlcolumn('MSG2.ITEM')/msg/item/author
          let $b := distinct-values($a)
          for $e in ($b)
          let $d := count(
            for $c in db2-fn:xmlcolumn('MSG2.ITEM')/msg/item
            where $c/author = $e
            return $c )
          return
            <result>
              <author>{$e}</author>
              <message-count>{$d}</message-count>
            </result>
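      The positional predicates and the per-author count on these two slides correspond to list slicing and a frequency count. The following Python sketch mirrors that logic with xml.etree.ElementTree and collections.Counter; it is an analogue for readers checking their understanding, not what DB2 executes. The MSGS data and the helper names by_author, first_n, last_n and message_counts are assumptions made for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical messages standing in for the MSG.ITEM column.
AUTHORS = ["Shridhar", "Nadine", "Shridhar", "Shridhar", "Nadine", "Shridhar"]
MSGS = [
    f'<msg id="{i}"><item><author>{name}</author></item></msg>'
    for i, name in enumerate(AUTHORS, start=1)
]


def by_author(docs, needle):
    """Messages whose author contains the given string (the contains() filter)."""
    parsed = (ET.fromstring(d) for d in docs)
    return [m for m in parsed if needle in (m.findtext("item/author") or "")]


def first_n(seq, n):
    """Counterpart of $a[position() < n + 1]."""
    return seq[:n]


def last_n(seq, n):
    """Counterpart of $a[position() > count($a) - n]."""
    return seq[max(len(seq) - n, 0):]


def message_counts(docs):
    """Author name mapped to number of messages, like distinct-values plus count."""
    return Counter(ET.fromstring(d).findtext("item/author") for d in docs)


hits = by_author(MSGS, "Shridhar")
first_ids = [m.get("id") for m in first_n(hits, 3)]
last_ids = [m.get("id") for m in last_n(hits, 2)]
counts = message_counts(MSGS)
print(first_ids, last_ids, dict(counts))
```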
  • 18. RSS Generator
      • Really Simple Syndication (a lightweight XML format designed for sharing data)
      • A web application that generates RSS and ATOM feeds
      • Source: data (messages) from news servers
      • Messages are uploaded from the news server to DB2 as XML documents
      • Uses XML Schema definition support for validation at the database level
      • Uses XML indexes as necessary, based on the XQueries
      • Needs just a single XQuery fetch to generate an RSS/ATOM feed
  • 19. RSS example
          <rss version="2.0">
            <channel>
              <title>news.persistent.co.in: comp.lang.c</title>
              <link>http://news.persistent.co.in</link>
              <description>The latest content from news.persistent.co.in: comp.lang.c</description>
              <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
              <language>en-us</language>
              <copyright>Copyright 2006 Persistent System Private Limited</copyright>
              <item>
                <title>Re: SIGPIPE - Finding the thread</title>
                <link>&lt;e1lqu9$2ee$1@news.intranet.pspl.co.in&gt;</link>
                <author>sushrut bidwai &lt;sushrut_bidwai@persistent.co.in&gt;</author>
                <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
                <description>some description here…</description>
              </item>
              . .
              <item> . .
              <item> .
            </channel>
          </rss>
  • 20. RSS Generator (Administration)
      Diagram (uploading newsgroup messages to NXD): News Server -> news message -> News Updater -> xml record -> NXD Database
      Example xml record:
          <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
            <item>
              <title>Gold Mobile</title>
              <link>&lt;d1nl7v$4lug$5@news.boulder.ibm.com&gt;</link>
              <author>Nadine &lt;Nadine.grantham@gmail.com&gt;</author>
              <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>
  • 21. RSS Generator
      Diagram: the Web Browser sends a request to the Generator; the Generator runs an XQuery against the NXD; the <xml> result goes back to the Web Browser as the response
  • 22. RSS Generator
      • One XQuery and the job is done
      • The result of the XQuery is a single record, which is an RSS document
      • No DOM/SAX processing
      • Not even a second fetch
      Query template (newsServer, newsGroup, "subject", "author", "description" and n are placeholders; and/or stands for whichever operator the search needs):
          xquery
          for $a in (1 to 1)
          return
          <rss version="2.0">
            <channel>
              <title>newsServer:newsGroup</title>
              <link>http://newsServer</link>
              <description>The latest content from newsServer:newsGroup</description>
              <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>
              {
                let $e := (
                  for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="newsServer"][@newsgroup="newsGroup"]
                  where $b/item[fn:contains(title,"subject")]
                    and/or $b/item[fn:contains(author,"author")]
                    and/or $b/item[fn:contains(description,"description")]
                  order by fn:number($b/@id) descending
                  return $b )
                for $i in (1 to n)
                return $e[$i]/item
              }
            </channel>
          </rss>
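      For comparison, this is roughly the application-side work that the single XQuery replaces. The Python sketch below filters messages by news server and newsgroup, orders them by numeric id descending, keeps the first n, and wraps their items in an RSS channel. It uses only xml.etree.ElementTree; the build_rss function, its parameters and the MSGS sample data are assumptions made for illustration, and the optional title/author/description search from the template is left out to keep it short.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored messages (same shape as the msg records on the earlier slides).
MSGS = [
    '<msg id="2" newsserver="news.example" newsgroup="comp.lang.c">'
    '<item><title>Older</title></item></msg>',
    '<msg id="10" newsserver="news.example" newsgroup="comp.lang.c">'
    '<item><title>Newest</title></item></msg>',
    '<msg id="5" newsserver="news.example" newsgroup="other.group">'
    '<item><title>Unrelated</title></item></msg>',
]


def build_rss(docs, server, group, n):
    """Return an RSS 2.0 document holding the items of the n newest matching messages."""
    msgs = [ET.fromstring(d) for d in docs]
    hits = [m for m in msgs
            if m.get("newsserver") == server and m.get("newsgroup") == group]
    # Numeric, descending order on @id, like "order by fn:number($b/@id) descending".
    hits.sort(key=lambda m: float(m.get("id")), reverse=True)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{server}:{group}"
    ET.SubElement(channel, "link").text = f"http://{server}"
    ET.SubElement(channel, "description").text = (
        f"The latest content from {server}:{group}")
    for msg in hits[:n]:
        channel.extend(msg.findall("item"))
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(MSGS, "news.example", "comp.lang.c", 1)
print(feed)
```

      With the XQuery approach none of this code lives in the application: the filtering, ordering and element construction all happen in one statement inside the database.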
  • 24. Workflow Example
      A Document Approval System (one simple content management use case)
      • A web application
      • Uses native XML features
      • Just a single XQuery fetch and the HTML (XML) is ready
      • Simple and easy to use
      • Facilitates the document review process
      • Uses the NXD to store document state information
      • Makes it easy to query requests by assignee, reviewer, request state, etc.
  • 26. XML in Oracle: Storage
      • XMLType storage (gone relational, not native)
          – CLOB
              • Whole document stored in one column
              • Requires DOM operations
              • Text indexing
              • Inefficient update
          – Object Relational
              • Document shredded across tables, rows and columns
              • Requires XML Schema
              • Insert/retrieval requires (de)composition
  • 27. XML in Oracle: Indexing XML
      • CTXXPATH (gone relational, not native)
          – Used when you need to speed up existsNode() queries on an XMLType column
          – e.g.
              CREATE INDEX [schema.]index ON [schema.]table(XMLType column)
              INDEXTYPE IS ctxsys.CTXXPATH
              [PARAMETERS ('[storage storage_pref] [memory memsize]')];
          – Looks a bit complicated
          – No XML-specific index support
  • 28. XML in Oracle: XQuery
      • No full support for XQuery, e.g.
          SELECT XMLQuery('Xquery for $a in ora:view("MSG")/ROW/ITEM/msg[@newsgroup="pspl.misc"]
                           return <root>{$a/item/title}</root>')
          FROM MSG
        This XQuery returns "null" values for rows where the newsgroup condition does not match.
        The query has to be modified to suppress the null values and get the proper result:
          SELECT XMLQuery('Xquery for $a in ora:view("MSG")/ROW/ITEM/msg[@newsgroup="pspl.misc"]
                           return <root>{$a/item/title}</root>')
          FROM MSG
          WHERE ExistsNode(ITEM, '/msg[@newsgroup="pspl.misc"]') = 1
        The more conditions there are, the bigger the query becomes, with more ExistsNode calls.
  • 29. XML in Oracle: XQuery
      • No full support for XQuery, another example
          SELECT XMLQuery('Xquery for $a in ora:view("MSG")/ROW/ITEM/msg
                           where contains($a/item/title, "join")
                           return <root>{$a/item/title}</root>')
          FROM MSG
        This XQuery returns "null" values for rows where contains() returns false.
          – There is no query-level workaround for this: the query cannot be modified to give the proper result, because the "contains" function cannot be specified within ExistsNode.
          – The only possible workaround is to add code at the application level to suppress the null values.
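      The application-level workaround mentioned above amounts to discarding empty results after the fetch. A minimal Python sketch follows, assuming the rows arrive as one-column tuples in the shape a DB-API cursor.fetchall() call would return; the ROWS sample and the drop_nulls helper are hypothetical and no Oracle driver is involved.

```python
# Hypothetical fetch result: one XML fragment (or None) per row of the MSG table.
ROWS = [
    ("<root><title>How to join threads</title></root>",),
    (None,),
    (None,),
    ("<root><title>Outer join question</title></root>",),
]


def drop_nulls(rows):
    """Keep only the XML fragments that are present, discarding null or empty results."""
    return [row[0] for row in rows if row and row[0] not in (None, "")]


fragments = drop_nulls(ROWS)
print(fragments)
```

      Note that the database still produces and ships one value per row, so this hides the nulls without saving any work.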
  • 30. DB2 x Oracle (Sample Database Design): XQuery performance
      Table1: msg, column Item (xml)
          <msg id='12' newsserver='news.persistent.co.in' newsgroup='comp.lang.c'>
            <item>
              <title>Re: SIGPIPE - Finding the thread</title>
              <link>&lt;e1lqu9$2ee$1@news.intranet.pspl.co.in&gt;</link>
              <author>sushrut bidwai &lt;sushrut_bidwai@persistent.co.in&gt;</author>
              <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>

          <msg id='12' newsserver='news.software.ibm.com' newsgroup='ibm.software.unicode'>
            <item>
              <title>Gold Mobile</title>
              <link>&lt;d1nl7v$4lug$5@news.boulder.ibm.com&gt;</link>
              <author>Nadine &lt;Nadine.grantham@gmail.com&gt;</author>
              <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate>
              <description>some description here…</description>
            </item>
          </msg>
      DB2 XML index:
          create index xind_newsserver on msg(item)
            generate key using xmlpattern '//@newsserver' as sql varchar(50);
      Oracle CTXXPATH index:
          CREATE INDEX on MSG (ITEM) INDEXTYPE IS ctxsys.CTXXPATH
      Around 4,50,000 (450,000) XML records on both sides, DB2 and Oracle
  • 31. DB2 x Oracle: XQuery performance
      Db2_1.sql
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where contains($a/item/description, "sample")
          return $a
        Execution time in milliseconds: 187525
      ora_1.sql
          select xmlquery('for $a in /msg
                           where contains($a/item/description, "sample")
                           return $a'
                          passing item returning content) result
          from msg
        ORA-04030: out of process memory
  • 32. DB2 x Oracle: XQuery performance
      Db2_2.sql
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where contains($a/item/title, "Lint")
          return $a
        Execution time in milliseconds: 198474
      ora_2.sql
          select xmlquery('for $a in /msg
                           where contains($a/item/title, "Lint")
                           return $a'
                          passing item returning content) result
          from msg
        ORA-04030: out of process memory
  • 33. DB2 x Oracle: XQuery performance
      Db2_3.sql
          xquery
          for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg
          where $a/@newsgroup = "control.cancel"
          return $a
        Execution time in milliseconds: 126858
      ora_3.sql
          select xmlquery('for $a in /msg return $a'
                          passing item returning content) result
          from msg
          where existsNode(ITEM, '/msg[@newsgroup="control.cancel"]') = 1
        Took more than an hour to fetch all records
  • 34. DB2 x Oracle: XQuery performance
      Db2_4.sql
          xquery
          let $a := (
            for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg
            where contains($b/item/author, "Shantanu Gadgil")
            return $b )
          return $a[position() < 10]
        Execution time in milliseconds: 173419
      ora_4.sql
          select * from (
            select xmlquery('for $a in /msg
                             where contains($a/item/author, "Shantanu Gadgil ")
                             order by $a[@id]
                             return $a'
                            passing item returning content) result
            from msg )
          where rownum <= 10
        ORA-04030: out of process memory