Database Tools for Technologists: plug-ins for storing and querying large data sequences
        and gridded data, and for querying external files



            Barrodale Computing Services Ltd. (BCS)
www.barrodale.com




 Dealing with Data - Challenges
  Data comes in many flavors
        Gridded, time series, spatial series, …
        Measured data, sensor data, model data, …
  Lots of file formats
        NetCDF, HDF5, FITS, GRIB, …
  Data volumes can be huge
        tens of terabytes for LOFAR radio-astronomy data files
             http://pos.sissa.it/archive/conferences/112/062/ISKAF2010_062.pdf
        Terabytes/day from a single next generation sequencing run
             http://cloudfront-blog-cache.bioteam.net/wp-content/uploads/2008/04/gen_apr15_datamanagement.pdf







 Our Solutions
  The Grid DataBlade
        Slice, dice, and reproject your gridded data


  DBXten
        Store and query huge data series efficiently


  Universal File Interface (UFI)
        Query your external files from inside a database






 Using the Grid DataBlade








 Dealing with Data Sequences








 DBXten – Performance Results
  Task                  Conventional approach   BCS DBXten    Improvement ratio
  Size of table         15.6 GB                 1.4 GB        X 11
  Size of index         6,605 MB                6.8 MB        X 971
  Index creation time   5.25 hours              5 seconds     X 3,780
  Insertion time        1.67 minutes            1.2 seconds   X 83
  Retrieval time        14.7 seconds            3.8 seconds   X 4
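
The improvement ratios in the table can be checked with a little arithmetic. The sketch below assumes each ratio is the conventional figure divided by the DBXten figure after converting both to the same unit. The names `RESULTS` and `improvement_ratios` are made up for this illustration and are not part of any BCS product.

```python
# Figures copied from the performance table, converted to a common unit per row.
# Each entry is (conventional approach, BCS DBXten).
RESULTS = {
    "Size of table (GB)": (15.6, 1.4),
    "Size of index (MB)": (6605.0, 6.8),
    "Index creation time (s)": (5.25 * 3600, 5.0),  # hours -> seconds
    "Insertion time (s)": (1.67 * 60, 1.2),         # minutes -> seconds
    "Retrieval time (s)": (14.7, 3.8),
}


def improvement_ratios(results):
    """Return conventional / DBXten for every task (assumed definition of the ratio)."""
    return {
        task: conventional / dbxten
        for task, (conventional, dbxten) in results.items()
    }


if __name__ == "__main__":
    for task, ratio in improvement_ratios(RESULTS).items():
        print(f"{task}: about {ratio:,.1f} times")
```

Under that assumption the computed values agree with the slide's figures to within rounding.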








 Dealing with Data in Files








 UFI in Action – Weatherdemo








 UFI in Action – Weatherdemo








 For more information …
   Website: http://www.barrodale.com




   Contact: BCSInfo@barrodale.com or (250) 412-7428
   More: http://www.barrodale.com/DBToolsinDepth.pdf



Database tools for technologists - short


Editor's Notes

  • #3: Technologists generate and deal with data of many different types, in multiple formats, and in volumes that can be small, medium, large, or enormous. Given that technologists want to benefit from (or at least explore the feasibility of) working with a database management system (DBMS), the question is how well a DBMS can be adapted to their needs. Our database extensions (or plug-ins), which install into popular object-relational DBMSs, speed up data loading and querying and shrink data space requirements, which in turn reduces storage costs. In addition, they shield users from much computational and data management complexity.
  • #4: The three BCS database extension products presented here deal with storing and querying gridded data, storing and querying large data series, and providing SQL querying facilities on top of traditional files.
  • #5: I’ll talk about the Grid DataBlade first. We developed the Grid DataBlade to address specific needs of the US Navy to customize and speed up its generation and delivery of tactical weather forecasts to the fleet. FNMOC, in Monterey, California, provides worldwide weather information to the US Navy. It generates very large 4D grids, each covering a vast expanse of the globe, but what’s of interest to a pilot on a mission are the forecast conditions around his or her plane, for the duration of that mission. So the Grid DataBlade is used to extract that relatively small amount of vital information from the large grid. The resulting productivity gains at FNMOC from using the Grid DataBlade exceeded an order of magnitude.
  • #6: Our second product is DBXten. The motivation for DBXten was the problem of ingesting fast streams of data sequences into a database: that is, complex data produced, say, by multiple sensors, related to any of the sources shown in this picture.
  • #7: There is extensive information about DBXten on our website, and we’ve done lots of performance testing, also documented on our website. This slide shows the impressive performance results we achieved in one of these tests. I’ll leave this slide up for a few more seconds as we’re quite proud of these results, which are typical of the gains achieved with DBXten! BCS holds US Patent 8077059 for DBXten.
  • #8: Our latest product, the Universal File Interface (UFI), addresses one of the primary reasons we’ve heard from technologists who avoid, or limit their use of, DBMSs: “…my files are too cumbersome/complex/large to load into a database”. UFI allows users to index and query their files without actually loading them into a database. Files often have an inherent structure: CSV files have columns separated by commas, DBF files have fields, NetCDF files have attributes and columns, and so on. With UFI we map these file components to columns of virtual tables. If we want, we can map several files to a single table, thereby creating a virtual table that represents the union of the data in many files. And we can run SQL against that virtual table, just as if it were a real table in a database. The only difference is that we didn’t have to spend time loading it.
  • #9: UFI is one of the products behind a popular demo on our website, which we call the “Weatherdemo”, shown in this picture.
  • #10: It lets you zoom and pan around the contiguous United States and look at the forecasts made for a particular day and time in the future, and then compare those forecasts with the actual weather recorded on that day and time. So, every day we download gridded binary files containing temperature, wind, and rain forecast data for the USA; we then use UFI and SQL to join this forecast data, stored in files, with actual observed weather values, stored in a database. Our website explains what information is available in each colored square, circle, teardrop, and little star depicted above (e.g., clicking on a teardrop will give you a textual display of how the forecast compared with the actual conditions at that location and time).
  • #11: This concludes our mini presentation on Database Tools for Technologists. Thank you for your interest. For more information please visit our website, send us an email, or give us a call. The final bullet above points to an in-depth downloadable presentation on Database Tools for Technologists. Goodbye for now.
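
The UFI and Weatherdemo notes above describe two ideas: presenting several files as one table (their union), and joining that file-based data with observations held in a database. The sketch below is only an analogy built from Python's standard `csv` and `sqlite3` modules; it is not UFI and does not use any BCS API. All file names, station codes, timestamps, and temperatures are invented sample data. One important difference is that SQLite needs the file rows copied into a temporary table, which is exactly the loading step that UFI is described as avoiding.

```python
import csv
import os
import sqlite3
import tempfile

# Hypothetical sample data: two forecast files and a set of observations.
FORECAST_FILES = {
    "forecast_a.csv": [("KSEA", "2011-01-10T12:00", 4.0),
                       ("KPDX", "2011-01-10T12:00", 6.5)],
    "forecast_b.csv": [("KSFO", "2011-01-10T12:00", 12.0)],
}
OBSERVATIONS = [
    ("KSEA", "2011-01-10T12:00", 5.0),
    ("KPDX", "2011-01-10T12:00", 6.0),
    ("KSFO", "2011-01-10T12:00", 12.5),
]


def write_sample_files(directory):
    """Write the sample forecast rows to CSV files and return their paths."""
    paths = []
    for name, rows in FORECAST_FILES.items():
        path = os.path.join(directory, name)
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["station", "valid_time", "temp_c"])
            writer.writerows(rows)
        paths.append(path)
    return paths


def forecast_rows(paths):
    """Yield rows from every file in turn, i.e. the union of the files."""
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                yield row["station"], row["valid_time"], float(row["temp_c"])


def compare_forecasts(paths):
    """Join file-based forecasts with table-based observations using SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE observed (station TEXT, valid_time TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO observed VALUES (?, ?, ?)", OBSERVATIONS)
    # Stand-in for a virtual table: the rows are copied in here, unlike UFI.
    conn.execute("CREATE TEMP TABLE forecast (station TEXT, valid_time TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO forecast VALUES (?, ?, ?)", forecast_rows(paths))
    query = """
        SELECT f.station, f.temp_c, o.temp_c, f.temp_c - o.temp_c
        FROM forecast AS f
        JOIN observed AS o
          ON o.station = f.station AND o.valid_time = f.valid_time
        ORDER BY f.station
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def run_demo():
    """Create the sample files in a temporary directory and run the comparison."""
    with tempfile.TemporaryDirectory() as directory:
        return compare_forecasts(write_sample_files(directory))


if __name__ == "__main__":
    for station, forecast, observed, error in run_demo():
        print(f"{station}: forecast {forecast}, observed {observed}, error {error:+.1f}")
```

The query itself does not care whether `forecast` is a real table or a view over files, which is the point the notes make about virtual tables.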