Grab some coffee and enjoy 
the pre-show banter before 
the top of the hour!
The New Database Frontier: Harnessing the Cloud 
The Briefing Room
Twitter Tag: #briefr 
The Briefing Room 
Welcome 
Host: 
Eric Kavanagh 
eric.kavanagh@bloorgroup.com 
@eric_kavanagh
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
Topics
This Month: DATABASE 
June: ANALYTICS & MACHINE LEARNING 
July: INNOVATIVE TECHNOLOGY 
2014 Editorial Calendar at 
www.insideanalysis.com/webcasts/the-briefing-room
“We are stuck with technology when what we really want is just stuff that works.” - Douglas Adams
Analyst: Rick Sherman
Rick Sherman is CEO of Athena IT Solutions.
MarkLogic
• MarkLogic offers a distributed, scale-out, enterprise NoSQL database
• The platform comprises a database, search engine and application services
• MarkLogic can run directly on the Hadoop file system, and it features full text search, location services and geospatial alerting
Guest: Ken Krupa
Ken Krupa is Chief Field Architect at MarkLogic. With 24 years of professional IT experience, Mr. Krupa has a unique breadth and depth of expertise within nearly all aspects of IT architecture. Prior to joining MarkLogic, Ken consulted at some of the largest North American financial institutions during difficult economic times, advising senior and C-level executives. Prior to that, he consulted with Sun Microsystems as a direct partner and also served as Chief Architect of GFI Group, a Wall St. inter-dealer brokerage. Although his work primarily involves high-level technology strategy, Mr. Krupa remains an active hands-on engineer.
In 2005, Ken was awarded patent #6,915,304, “System and method for converting an XML data structure into a relational database.” Today Ken continues to pursue both individual and community-based engineering activities.
Expect More From Your Database 
© COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. 
Ken Krupa, Chief Field Architect, MarkLogic Corporation
Overview 
§ Evolution of the Database 
§ The Enterprise Data Warehouse (EDW) 
§ Big Data 
§ NoSQL 
§ Enterprise NoSQL 
§ The Logical Data Warehouse (LDW) 
§ Unified Database 
§ Parting Thoughts 
We Are the New Generation Database
Hierarchical Era: “For your application data!”
• Application- and hardware-specific
Relational Era: “For all your structured data!”
• Normalized, tabular model
• Application-independent query
• User control
Any Structure Era: “For all your data!”
• Schema-agnostic
• Massive scale
• Query and search
• Analytics
• Application services
• Faster time-to-results
RDBMS: One Tool, Many Contortions
§ OLTP
  o 3rd normal form, updates, simple query
§ Reporting DB
  o Because the OLTP app slowed down during heavy query use
§ Enterprise Data Warehouse
  o Star schema - unified view of the enterprise
§ Data Marts
  o Because the EDW didn’t have everything – Also star schema
§ Federated
  o Because it took too long to agree on a standard model
§ Hybrid
  o Because Federated is too slow
It’s Complicated
[Diagram, built up over three slides. Labels: OLTP, Reference Data, Warehouse, Data Marts and Archives, linked by ETL steps; “Unstructured” sources (Documents, Messages, Video, Audio, Signals, Logs, Streams, Metadata, Social) linked by a further ETL step to Search]
Look closely…
ETL
…at the hidden “M”
At a crossroads…
§ Pre-requisite modeling has become an unsustainable friction point
  o Can/should we wait until the data is perfectly modeled to do discovery?
  o A real dollar cost before value is realized
  o “Cost per column”
§ Moving data around is also becoming a friction point
  o There’s too much of it to do it for all cases
  o Also a real dollar cost before value is realized
§ Traditional data warehousing largely leaves out “the 80%”
  o Most of the world’s data is unstructured
  o Dimensional warehouses as we know them are not up to the task
Gartner’s Take 
§ Organizations that failed to deploy strategies to address data 
complexity and volume issues for their analytics by 2012 will 
experience more than doubling costs of ownership for their 
data warehouse and mart environments in disorganized 
attempts to meet this new demand. 
§ By 2014, 85% of organizations will fail to deploy new strategies to 
address data complexity and volume in their analytics. 
§ By 2014, organizations that have deployed analytics to support new 
complex data types and large volumes of data in analytics will 
outperform their market peers by more than 20% in revenue, 
margins, penetration and retention. 
Source: Gartner Does the 21st Century, Beyer and Feinberg. 
What’s being done? 
What about Hadoop?
[Diagram, built up over two slides. Labels: Staging, Analytics, Aggregates, Models, Persistence; the second build adds Updates and Queries with a question mark]
Hadoop + RDBMS
ETL
Distill into RDBMS
…or spill-over into Hadoop
Hadoop – What You Get
Advantages
§ HDFS provides scale and economies of scale
§ File-based nature allows for greater Variety
§ Raw data is fine and any shape will do
§ Schema-on-read possible
§ Map-reduce enables massive parallel scaling
Limitations
§ Hadoop was designed for batch processing
§ Does not support real-time applications on its own
§ Requires expertise to configure, deploy and manage
§ Has security limitations
§ Is not a database
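The two advantages named on this slide, schema-on-read and map-reduce, can be illustrated with a minimal single-process sketch. This is not Hadoop code: the record shapes, field names and the `run_job` helper are assumptions made up for illustration, and the "shuffle" is just an in-memory dictionary.

```python
import json
from collections import defaultdict

# Raw records of differing shapes, stored as-is (no schema enforced on write).
# The fields shown here are hypothetical.
raw_lines = [
    '{"type": "trade", "amount": 100}',
    '{"type": "trade", "amount": 250, "payer": "A"}',
    '{"type": "log", "message": "started"}',
]

def map_record(line):
    """Map step: impose structure only at read time (schema-on-read)."""
    record = json.loads(line)
    # Records without an "amount" field contribute 0 rather than failing.
    yield record.get("type", "unknown"), record.get("amount", 0)

def reduce_values(key, values):
    """Reduce step: combine all values emitted for one key."""
    return key, sum(values)

def run_job(lines):
    """Run map, group by key (the 'shuffle'), then reduce, all in one process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line):
            groups[key].append(value)
    return dict(reduce_values(k, v) for k, v in groups.items())

totals = run_job(raw_lines)
print(totals)  # {'trade': 350, 'log': 0}
```

In a real cluster the map and reduce steps would run in parallel across many nodes; the batch-oriented shape of the job is the same, which is also why the slide lists real-time applications as a limitation.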
Enter NoSQL 
§ Agility 
§ Flexible data models (or none at all) 
§ Many different types 
o Simple Key/Value 
o Columnar 
o Document 
o Graph 
o Etc. 
§ Enterprise features? 
§ More confusion…? 
Tough Choice…
Legacy RDBMS
§ Indexes
§ Transactions
§ Security
§ Enterprise operations
“NoSQL”
§ Flexible data model
§ Commodity scale out
§ Distributed, fault-tolerant
§ Hadoop integration
[Figure labels: Cashflows, PartyID, Net, Date, Reference, TradeID, Payer, Amount, Receiver]
Enterprise NoSQL – Best of Both Worlds…
§ Flexible, schema-agnostic, document-oriented data model
§ Comprehensive indexes
o Documents: Hierarchy, text, values, tags (schema “on-demand”)
o Scalars: Aggregates and range filters, including geospatial
o Triples: Linked facts and inferencing
o Permissions: Users, roles, compartments, and privileges
o Queries: Reverse indexes for alerting, matching
§ Ad-hoc dimensions
§ Real-time transformation and/or schema on read
§ Lock-free reads
§ Strict consistency throughout
§ Oh yeah… SQL too
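The “reverse indexes for alerting, matching” item inverts the usual lookup: queries are stored and indexed, and each incoming document is matched against them. The sketch below is one plausible reading of that idea, not MarkLogic's implementation; the alert names, fields and helper functions are hypothetical.

```python
# Stored queries: each maps a hypothetical alert name to required field values.
stored_queries = {
    "roche-alert": {"company": "Roche"},
    "sorbitol-at-roche": {"company": "Roche", "name": "sorbitol"},
    "other-company": {"company": "Acme"},
}

def build_reverse_index(queries):
    """Index the stored queries by each (field, value) pair they require."""
    index = {}
    for query_id, conditions in queries.items():
        for pair in conditions.items():
            index.setdefault(pair, set()).add(query_id)
    return index

def matching_queries(document, queries, index):
    """Return ids of stored queries whose conditions all hold for the document."""
    candidates = set()
    for pair in document.items():
        candidates |= index.get(pair, set())
    return sorted(
        q for q in candidates
        if all(document.get(f) == v for f, v in queries[q].items())
    )

reverse_index = build_reverse_index(stored_queries)
incoming = {"name": "sorbitol", "date": "2012-06-04", "company": "Roche"}
alerts = matching_queries(incoming, stored_queries, reverse_index)
print(alerts)  # ['roche-alert', 'sorbitol-at-roche']
```

Only queries sharing at least one (field, value) pair with the document are examined, so the cost of alerting depends on the candidates found rather than on the total number of stored queries.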
A Unified Platform
Universal Index
§ Content: Words, phrases, entities, positions, etc.
  o “... Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) ...”
  o “... ACE inhibitors, since the risk of lithium toxicity is very high in such patients...”
§ Structure: Label, Author, Ing, Comp, ID, Para, Org
§ Values: name:sorbitol, date:2012-06-04, company:Roche
§ Geospatial: Lat: 46.946584, Long: 93.076172
§ Relationships: Trenton isCityOf NewJersey; James livesIn Trenton
§ Security: Role:researcher-worldwide
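A toy version of the idea behind a single index over content, structure and values might look like the sketch below. It is an illustrative assumption, not a description of how MarkLogic builds its Universal Index; the document shapes, index key layout and `index_document` helper are all invented for the example.

```python
from collections import defaultdict

def index_document(doc_id, doc, index, path=""):
    """Index one nested dict: record structure (paths), values, and words."""
    for key, value in doc.items():
        here = f"{path}/{key}"
        index[("path", here)].add(doc_id)
        if isinstance(value, dict):
            index_document(doc_id, value, index, here)
        else:
            index[("value", here, str(value))].add(doc_id)
            for word in str(value).lower().split():
                index[("word", word)].add(doc_id)

index = defaultdict(set)
index_document("doc1", {"name": "sorbitol", "company": "Roche",
                        "para": "risk of lithium toxicity"}, index)
index_document("doc2", {"author": {"org": "Roche"},
                        "para": "Semantic Web movement"}, index)

# Combine a word lookup with a structural lookup via set intersection.
hits = index[("word", "roche")] & index[("path", "/author/org")]
print(sorted(hits))  # ['doc2']
```

Because every kind of term resolves to a set of document ids, a text search and a structural or value constraint can be answered together with one intersection, without a separate search engine to keep in sync.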
DB + Search
§ DB separate from search is another unnecessary friction point
§ Best if integrated at the DB layer:
  o Quicker time to information
  o Greater than the sum of its parts
  o No separate indexes to maintain
§ Search as a text processing engine
  o Query capability for unstructured text
  o Turn text into numbers, create new dimensions
  o Infer new information from text search and enrich
The advanced interface 
Semantics and RDF
Data stored in Triples
§ Expressed as Subject : Predicate : Object
  o John Smith : LivesIn : Brooklyn
  o Brooklyn : PartOf : New York
§ Can make inferences – e.g. John Smith LivesIn New York
§ Can create relationships on-the-fly
  o “We’ve identified a special relationship between a drug and an interaction…”
§ Machine-comprehensible “knowledge”
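The inference on this slide can be reproduced with a few lines of plain Python. The sketch assumes a single hand-written rule (if x LivesIn y and y PartOf z, then x LivesIn z); a real triple store would express this with an ontology or rule language rather than the hypothetical `infer_lives_in` function used here.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
triples = {
    ("John Smith", "LivesIn", "Brooklyn"),
    ("Brooklyn", "PartOf", "New York"),
}

def infer_lives_in(facts):
    """Apply one illustrative rule until no new facts appear:
    (x LivesIn y) and (y PartOf z)  =>  (x LivesIn z)."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, p1, y) in list(inferred):
            if p1 != "LivesIn":
                continue
            for (y2, p2, z) in list(inferred):
                if p2 == "PartOf" and y2 == y and (x, "LivesIn", z) not in inferred:
                    inferred.add((x, "LivesIn", z))
                    changed = True
    return inferred

all_facts = infer_lives_in(triples)
new_facts = all_facts - triples
print(sorted(new_facts))  # [('John Smith', 'LivesIn', 'New York')]
```

Adding a relationship "on-the-fly" is just adding another tuple; no table or schema change is involved, and the rule picks it up on the next run.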
The World of Triples 
Linked Open Data 
(Free semantic facts available to anyone) 
Proprietary Semantic Facts 
(Facts and Taxonomies in your organization) 
Semantic 
World 
Document 
World 
Facts from Free-Flowing Text 
(Derived from semantic enrichment) 
MarkLogic 
Facts in Documents 
(Part of metadata or added with authoring tools)
Data Everywhere
[Diagram, built up over four slides. Labels: Application; Data stored in MarkLogic; On Tiered Storage; Searchable and Queryable; Widely Accessible; Calling out to endpoints (RDBMS, SPARQL, REST); Logical Data Warehouse Reimagined]
More from Gartner… 
§ 64% of surveyed organizations either have invested in big data already (30%) or 
have plans to invest within 24 months 
§ Through 2017, 90% of the information assets from big data analytic efforts will be 
siloed and unleveragable across multiple business processes 
§ By 2016, excessive focus of truth over trust in big data will prompt leadership 
change in 75% of projects 
§ Through 2017, premiums for big data-related technology and project skills will remain 
20% to 30% above norms for traditional information management skills 
Source: Gartner – Predicts 2014: Big Data, Heudecker, Beyer, et al 
§ Companies will spend more on application integration than on new application 
systems 
§ By 2018, more than 50% of the cost of implementing 90% of new large systems will be 
spent on integration 
Source: Gartner – Predicts 2013: Application Integration, Lheureux, Pezzini, et al 
Integration…? 
Hadoop Ecosystem 
The New Data Life-cycle
[Diagram labels: Textual, Structured, Multi-media, Geospatial, Social, Semantic; Discovery/Model Loop]
Some parting thoughts 
§ ETL is a friction point that should be minimized 
§ Integration is another friction point 
§ Modeling should no longer be pre-requisite to discovery 
§ Evolve your model alongside discovery 
§ Expect more from your database 
§ Schema agility of Hadoop but in a DBMS 
§ Enterprise capabilities of traditional DBMS 
§ ACID, Security, HA/DR, etc. 
§ Support for indexing and analyzing heterogeneous 
information assets (text, data, geospatial, semantics, etc.) 
§ Support for heterogeneous locality of data for strategy execution 
§ Operational data + LDW with fewer moving parts 
§ Tiered Storage 
Thank You 
Ken Krupa 
Ken.krupa@marklogic.com 
@kenkrupa 
Check out these pages: 
www.marklogic.com 
developer.marklogic.com 
Now reimagine the possibilities! 
Join us in a city near you! 
May 15 Amsterdam 
May 20 London 
June 3 New York City 
June 5 Chicago 
June 19 Baltimore 
June 24 Washington DC
Perceptions & Questions
Analyst: Rick Sherman
The Briefing Room: The New Database Frontier
Rick Sherman
Athena IT Solutions
rsherman@athena-solutions.com
Copyright © 2014 Athena IT Solutions
The New Database Frontier:
Our History
• Relational emerged in 1980s & went mainstream in 1990s
  o Transactional
  o Data warehousing, Business Intelligence & Analytics
• Relational keeps adding features for BI & DW
  o More infrastructure for I/O & memory
  o More complexity
  o More skills
  o Can “one size fit all”?
• OLAP (On-line Analytical Processing) emerges
  o Successful but…not pervasive
  o Proprietary: Database, ETL & BI
  o Specialized skills
  o Scalability & extensibility issues
The New Database Frontier:
Data & Analytical Needs Have Expanded
• Data Volume, Velocity & Variety Exploding
• Data Integration has Succeeded in Many Ways
  o 5 C’s: comprehensive, consistent, clean & current
  o Achieved at a cost
• BI & Analytics has Evolved & Expanded
The New Database Frontier:
Do We Need to Change?
• Volume & Velocity for Structured Data
  o Typically can be handled by traditional DW platform with enhancements & emerging technologies
    - In-memory, Columnar, MPP, other
    - Infrastructure, Appliances, Cloud
    - Better architecture & design
  o But are there other approaches that can do it better?
• Variety is the key difference & requires a different approach
  o Unstructured: text, audio, video, click streams, log files, social media
  o Semi-structured: XML, RSS feeds, machine data
  o None of the big 3 (ETL, databases & BI) were built for it
The New Database Frontier:
NoSQL Data Stores
• NoSQL data stores differ from relational databases in:
  o Structure
  o Purpose
  o SQL not used as primary query language
• Structures:
  o Wide Column Store / Column Families
  o Document Store
  o Key Value / Tuple Store
  o Graph Databases
• Characteristics
  o Scalable, flexible, commodity hardware
  o Supports 3 V’s
  o Fixed table schemas not needed
  o May not guarantee ACID (atomicity, consistency, isolation, durability)
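To make the four structures concrete, the sketch below writes one made-up trade record in each style using plain Python containers. The record, keys and the `payer_of_trade_42` helper are hypothetical; real products add persistence, distribution and their own query interfaces on top of these shapes.

```python
import json

# Key/value: an opaque value looked up by a single key.
key_value = {"trade:42": '{"payer": "A", "receiver": "B", "amount": 100}'}

# Document: a nested, self-describing structure; fields can vary per document.
document = {"tradeId": 42, "payer": "A", "receiver": "B",
            "cashflows": [{"date": "2014-05-01", "amount": 100}]}

# Wide column: row key -> column family -> column -> value.
wide_column = {"42": {"parties": {"payer": "A", "receiver": "B"},
                      "cashflows": {"2014-05-01": 100}}}

# Graph: nodes joined by labeled edges.
graph_edges = [("trade:42", "paidBy", "A"), ("trade:42", "receivedBy", "B")]

def payer_of_trade_42():
    """Read the payer back from each representation."""
    return {
        "key_value": json.loads(key_value["trade:42"])["payer"],
        "document": document["payer"],
        "wide_column": wide_column["42"]["parties"]["payer"],
        "graph": next(o for s, p, o in graph_edges if p == "paidBy"),
    }

print(payer_of_trade_42())
```

The same fact is reachable in all four, but what the store can index and query differs: the key/value store sees only the key, while the document, wide-column and graph forms expose internal structure.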
The New Database Frontier:
Big Data is Evolving
Big data platforms are evolving
• NoSQL tools versus platforms
• Cloud deployments
• Integration
• Advanced analytics
Benefits:
• Increased capabilities & reduced programming
• Shift roles
  o IT & Business
  o Data scientists & business analysts
  o Services
• Lower costs & time to market
Q&A
• Big Data Implementations
o How are you addressing high cost, manual coding, time to market & skills shortage?
• Use Cases: (Assume used for unstructured data)
o When would you use your database to store structured data?
o Would your database be used for operational & transaction processing applications?
• Data Ingestion & Integration:
o How would data sources typically be ingested?
o Are there data integration capabilities similar to ETL available?
• BI:
o Is a data model needed in order to use BI tools or SQL?
o How would extension keywords (“MATCH”) be used?
• Modeling data:
o Contrast semantic triples/RDF vs dimensional models
o Compare skill sets needed
• What are the differences in how your database handles:
o Data capture versus information analysis
o Data ingestion versus processing business transactions or processes
Upcoming Topics
This Month: DATABASE
June: ANALYTICS & MACHINE LEARNING
July: INNOVATIVE TECHNOLOGY
www.insideanalysis.com/webcasts/the-briefing-room
2014 Editorial Calendar at www.insideanalysis.com
THANK YOU for your ATTENTION!


  • 24. Hadoop + RDBMS Distill into RDBMS …or spill-over into Hadoop © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 24 ALL RIGHTS RESERVED.
  • 25. Hadoop + RDBMS ETL Distill into RDBMS …or spill-over into Hadoop © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 25 ALL RIGHTS RESERVED.
  • 26. Hadoop – What You Get Advantages: § HDFS provides scale and economies of scale § File-based nature allows for greater Variety § Raw data is fine and any shape will do § Schema-on-read possible § Map-reduce enables massive parallel scaling Limitations: § Hadoop was designed for batch processing § Does not support real-time applications on its own § Requires expertise to configure, deploy and manage § Has security limitations § Is not a database © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 26 ALL RIGHTS RESERVED.
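The "schema-on-read" advantage listed on this slide can be illustrated with a minimal sketch. It assumes raw records are kept as JSON lines of varying shape; the field names, the `read_with_schema` helper, and the schema format are all hypothetical and are not taken from Hadoop or MarkLogic.

```python
import json

# Raw records are stored untouched; each line may have a different shape.
raw_lines = [
    '{"trade_id": 1, "payer": "A", "amount": "100.5"}',
    '{"trade_id": 2, "receiver": "B", "amount": 20, "note": "late"}',
]

def read_with_schema(lines, schema):
    """Apply a schema only at read time.

    schema maps field name -> (converter, default). Fields missing from a
    record get the default; fields not named in the schema are ignored.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: convert(record[field]) if field in record else default
            for field, (convert, default) in schema.items()
        }

# Hypothetical schema chosen by the reader, not by the writer of the data.
schema = {
    "trade_id": (int, None),
    "amount": (float, 0.0),
    "payer": (str, "unknown"),
}

rows = list(read_with_schema(raw_lines, schema))
for row in rows:
    print(row)
```

A different consumer could read the same raw lines with a different schema, which is the point of deferring the model until read time.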
  • 27. Enter NoSQL § Agility § Flexible data models (or none at all) § Many different types o Simple Key/Value o Columnar o Document o Graph o Etc. § Enterprise features? § More confusion…? © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 27 ALL RIGHTS RESERVED.
  • 28. Tough Choice… Legacy RDBMS § Indexes § Transactions § Security § Enterprise operations “NoSQL” § Flexible data model § Commodity scale out § Distributed, fault-tolerant § Hadoop integration Cashflows PartyID Net Date Reference TradeID Payer Amount Receiver © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 28 ALL RIGHTS RESERVED.
  • 29. Enterprise NoSQL – Best of Both Worlds… § Flexible, schema agnostic, document oriented data model § Comprehensive indexes o Documents: Hierarchy, text, values, tags—schema “on-demand” o Scalars: Aggregates and range filters, including geospatial o Triples: Linked facts and inferencing o Permissions: Users, roles, compartments, and privileges o Queries: Reverse indexes for alerting, matching § Ad-hoc dimensions § Real-time transformation and/or schema on read § Lock-free reads § Strict consistency throughout § Oh yeah.. SQL too © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 29 ALL RIGHTS RESERVED.
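The "reverse indexes for alerting, matching" item can be sketched as follows: instead of indexing documents and running queries against them, the queries are stored and each incoming document is matched against them. This is only a toy illustration under stated assumptions; the `ReverseIndex` class, its methods, and the term-set query model are hypothetical and do not represent MarkLogic's implementation or API.

```python
from collections import defaultdict

class ReverseIndex:
    """Stores queries (sets of required terms) and finds which ones a new document satisfies."""

    def __init__(self):
        self.queries = {}                # query_id -> frozenset of required terms
        self.by_term = defaultdict(set)  # term -> ids of queries that require it

    def register(self, query_id, terms):
        required = frozenset(t.lower() for t in terms)
        self.queries[query_id] = required
        for term in required:
            self.by_term[term].add(query_id)

    def match(self, text):
        words = set(text.lower().split())
        # Only queries sharing at least one term with the document are candidates.
        candidates = set()
        for word in words:
            candidates |= self.by_term.get(word, set())
        # A query fires when all of its required terms appear in the document.
        return sorted(q for q in candidates if self.queries[q] <= words)

index = ReverseIndex()
index.register("lithium-alert", ["lithium", "toxicity"])
index.register("roche-watch", ["roche"])

alerts = index.match("the risk of lithium toxicity is very high in such patients")
print(alerts)
```

With the two stored queries above, only `lithium-alert` fires for the sample sentence, because `roche` does not occur in it.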
  • 30. Universal Index A Unified Platform © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 30 ALL RIGHTS RESERVED.
  • 31. A Unified Platform Content: Words, phrases, entities, positions, etc. ... Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) ... ... ACE inhibitors, since the risk of lithium toxicity is very high in such patients... Structure Label Author Ing Comp ID Para Org Values name:sorbitol date:2012-06-04 company:Roche Geospatial Relationships Security Lat: 46.946584 Long: 93.076172 Trenton isCityOf NewJersey James livesIn Trenton Role:researcher-worldwide Universal Index © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 31 ALL RIGHTS RESERVED.
  • 32. A Unified Platform Content: Words, phrases, entities, positions, etc. ... Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) ... ... ACE inhibitors, since the risk of lithium toxicity is very high in such patients... Structure Label Author Ing Comp ID Para Org Values name:sorbitol date:2012-06-04 company:Roche Geospatial Relationships Security Lat: 46.946584 Long: 93.076172 Trenton isCityOf NewJersey James livesIn Trenton Role:researcher-worldwide Universal Index © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 32 ALL RIGHTS RESERVED.
  • 33. DB + Search § DB separate from search is another unnecessary friction point § Best if integrated at the DB layer: § Quicker time to information § Greater than the sum of its parts § No separate indexes to maintain § Search as a text processing engine § Query capability for unstructured text § Turn text into numbers, create new dimensions § Infer new information from text search and enrich © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 33 ALL RIGHTS RESERVED.
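"Turn text into numbers, create new dimensions" can be read as deriving numeric columns from free text so they can be filtered and aggregated like any other value. The sketch below is one simple interpretation using term counts; the sample documents, the term list, and the `term_dimensions` helper are illustrative assumptions.

```python
import re
from collections import Counter

# Sample documents (illustrative text only).
docs = {
    "doc1": "ACE inhibitors raise the risk of lithium toxicity.",
    "doc2": "Sorbitol is a common ingredient. Sorbitol is also a sweetener.",
}
terms = ["lithium", "sorbitol"]

def term_dimensions(text, terms):
    """Count how often each term of interest occurs in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {term: counts[term] for term in terms}

# One numeric "dimension" per term, per document.
table = {doc_id: term_dimensions(text, terms) for doc_id, text in docs.items()}
for doc_id, dims in table.items():
    print(doc_id, dims)
```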
  • 34. The advanced interface © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 34 ALL RIGHTS RESERVED.
  • 35. Semantics and RDF Data stored in Triples § Expressed as Subject : Predicate : Object § John Smith : LivesIn : Brooklyn § Brooklyn : PartOf : New York § Can make inferences – e.g. John Smith : LivesIn : New York § Can create relationships on-the-fly § “We’ve identified a special relationship between a drug and an interaction…” § Machine-comprehensible “knowledge” © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 35 ALL RIGHTS RESERVED.
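The inference on this slide can be worked through in a few lines. This is a minimal sketch that stores triples as plain tuples and applies one hand-written rule; it is not SPARQL or any triple-store API, and the `infer_lives_in` function is a hypothetical name.

```python
# Triples as (subject, predicate, object) tuples, taken from the slide's example.
triples = {
    ("John Smith", "LivesIn", "Brooklyn"),
    ("Brooklyn", "PartOf", "New York"),
}

def infer_lives_in(facts):
    """Apply one rule until no new facts appear:
    (x LivesIn y) and (y PartOf z) => (x LivesIn z).
    """
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(inferred):
            if p != "LivesIn":
                continue
            for (s2, p2, o2) in list(inferred):
                if p2 == "PartOf" and s2 == o:
                    new_fact = (s, "LivesIn", o2)
                    if new_fact not in inferred:
                        inferred.add(new_fact)
                        changed = True
    return inferred

facts = infer_lives_in(triples)
print(("John Smith", "LivesIn", "New York") in facts)
```

The loop repeats until a pass adds nothing, so longer `PartOf` chains (city, state, country) would be followed as well.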
  • 36. The World of Triples Linked Open Data (Free semantic facts available to anyone) Proprietary Semantic Facts (Facts and Taxonomies in your organization) Semantic World Document World Facts from Free-Flowing Text (Derived from semantic enrichment) MarkLogic Facts in Documents (Part of metadata or added with authoring tools)
  • 37. Data Everywhere Application Data stored in MarkLogic © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 37 ALL RIGHTS RESERVED.
  • 38. Data Everywhere Application Data stored in MarkLogic On Tiered Storage Widely Accessible © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 38 ALL RIGHTS RESERVED.
  • 39. Data Everywhere Application Data stored in MarkLogic On Tiered Storage Calling out to endpoints Widely Accessible RDBMS SPARQL REST © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 39 ALL RIGHTS RESERVED.
  • 40. Data Everywhere Application Data stored in MarkLogic On Tiered Storage Searchable and Queryable Calling out to endpoints Widely Accessible RDBMS SPARQL REST Logical Data Warehouse Reimagined © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 40 ALL RIGHTS RESERVED.
  • 41. More from Gartner… § 64% of surveyed organizations either have invested in big data already (30%) or have plans to invest within 24 months § Through 2017, 90% of the information assets from big data analytic efforts will be siloed and unleveragable across multiple business processes § By 2016, excessive focus of truth over trust in big data will prompt leadership change in 75% of projects § Through 2017, premiums for big data-related technology and project skills will remain 20% to 30% above norms for traditional information management skills Source: Gartner – Predicts 2014: Big Data, Heudecker, Beyer, et al § Companies will spend more on application integration than on new application systems § By 2018, more than 50% of the cost of implementing 90% of new large systems will be spent on integration Source: Gartner – Predicts 2013: Application Integration, Lheureux, Pezzini, et al © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 41 ALL RIGHTS RESERVED.
  • 42. Integration…? Hadoop Ecosystem © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 42 ALL RIGHTS RESERVED.
  • 43. The New Data Life-cycle Textual Structured Multi-media Geospatial “ Social ” Semantic Discovery/Model Loop © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 43 ALL RIGHTS RESERVED.
  • 44. Some parting thoughts § ETL is a friction point that should be minimized § Integration is another friction point § Modeling should no longer be pre-requisite to discovery § Evolve your model alongside discovery § Expect more from your database § Schema agility of Hadoop but in a DBMS § Enterprise capabilities of traditional DBMS § ACID, Security, HA/DR, etc. § Support for indexing and analyzing heterogeneous information assets (text, data, geospatial, semantics, etc.) § Support for heterogeneous locality of data for strategy execution § Operational data + LDW with fewer moving parts § Tiered Storage © COPYRIGHT 2013 MARKLOGIC CORPORATION. SLIDE: 44 ALL RIGHTS RESERVED.
  • 45. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Thank You Ken Krupa Ken.krupa@marklogic.com @kenkrupa Check out these pages: www.marklogic.com developer.marklogic.com Now reimagine the possibilities! Join us in a city near you! May 15 Amsterdam May 20 London June 3 New York City June 5 Chicago June 19 Baltimore June 24 Washington DC
  • 46. Twitter Tag: #briefr The Briefing Room Perceptions & Questions Analyst: Rick Sherman
  • 47. The Briefing Room: The New Database Frontier Rick Sherman Athena IT Solutions rsherman@athena-solutions.com Copyright © 2014 Athena IT Solutions
  • 48. The New Database Frontier: Our History • Relational emerged in the 1980s & went mainstream in the 1990s ✓ Transactional ✓ Data warehousing, Business Intelligence & Analytics • Relational keeps adding features for BI & DW ✓ More infrastructure for I/O & memory ✓ More complexity ✓ More skills ✓ Can “one size fit all”? • OLAP (On-line Analytical Processing) emerged ✓ Successful but…not pervasive ✓ Proprietary: Database, ETL & BI ✓ Specialized skills ✓ Scalability & extensibility issues Slide 48 Copyright © 2014 Athena IT Solutions All rights reserved.
  • 49. The New Database Frontier: Data & Analytical Needs Have Expanded • Data Volume, Velocity & Variety Exploding • Data Integration has Succeeded in Many Ways ✓ 5 C’s: comprehensive, consistent, clean & current ✓ Achieved at a cost • BI & Analytics has Evolved & Expanded Slide 49 Copyright © 2014 Athena IT Solutions All rights reserved.
  • 50. The New Database Frontier: Do We Need to Change? • Volume & Velocity for Structured Data ✓ Typically can be handled by a traditional DW platform with enhancements & emerging technologies - In-memory, Columnar, MPP, other - Infrastructure, Appliances, Cloud - Better architecture & design ✓ But are there other approaches that can do it better? • Variety is the key difference & requires a different approach ✓ Unstructured: text, audio, video, click streams, log files, social media ✓ Semi-structured: XML, RSS feeds, machine data ✓ None of the big 3 (ETL, databases & BI) were built for it Slide 50 Copyright © 2014 Athena IT Solutions All rights reserved.
  • 51. The New Database Frontier: NoSQL Data Stores • NoSQL differs from relational databases ✓ Structure ✓ Purpose ✓ SQL not used as primary query language • Structures: ✓ Wide Column Store / Column Families ✓ Document Store ✓ Key Value / Tuple Store ✓ Graph Databases • Characteristics ✓ Scalable, flexible, commodity hardware ✓ Supports 3 V’s ✓ Fixed table schemas not needed ✓ May not guarantee ACID (atomicity, consistency, isolation, durability) Slide 51 Copyright © 2014 Athena IT Solutions All rights reserved.
  • 52. The New Database Frontier: Big Data is Evolving Big data platforms are evolving • NoSQL tools versus platforms • Cloud deployments • Integration • Advanced analytics Benefits: • Increased capabilities & reduced programming • Shift roles ✓ IT & Business ✓ Data scientists & business analysts ✓ Services • Lower costs & time to market Slide 52 Copyright © 2014 Athena IT Solutions All rights reserved.
  • 53. Q&A • Big Data Implementations o How are you addressing high cost, manual coding, time to market & skills shortage? • Use Cases: (Assume used for unstructured data) o When would you use your database to store structured data? o Would your database be used for operational & transaction processing applications? • Data Ingestion & Integration: o How would data sources typically be ingested? o Are there data integration capabilities similar to ETL available? • BI: o In order to use BI tools or SQL, is a data model needed? o How can extension keywords (“MATCH”) be used? • Modeling data: o Contrast semantic triples/RDF vs dimensional models o Compare skill sets needed • What are the differences in how your database handles: o Data capture versus information analysis o Data ingestion versus processing business transactions or processes Slide 53 Copyright © 2014 Athena IT Solutions All rights reserved.
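The four NoSQL structures listed on this slide can be compared by holding the same record in each shape. The sketch below uses plain Python data structures rather than any real product; the trade record, key formats, and edge labels are made up for illustration.

```python
import json

# Key/value store: an opaque value looked up by key; the store does not interpret it.
key_value = {"trade:42": json.dumps({"payer": "A", "receiver": "B", "amount": 100.0})}

# Document store: a nested, self-describing record that can be queried by field.
document = {"_id": 42, "payer": {"id": "A"}, "receiver": {"id": "B"}, "amount": 100.0}

# Wide column store: row key -> column family -> columns.
wide_column = {
    "42": {
        "parties": {"payer": "A", "receiver": "B"},
        "cash": {"amount": 100.0},
    }
}

# Graph database: nodes connected by labelled edges.
edges = [("A", "PAYS", "trade:42"), ("trade:42", "PAID_TO", "B")]

# Reading the payer back requires a different access path in each structure.
payer_from_kv = json.loads(key_value["trade:42"])["payer"]
payer_from_doc = document["payer"]["id"]
payer_from_columns = wide_column["42"]["parties"]["payer"]
payer_from_graph = next(src for src, label, dst in edges
                        if label == "PAYS" and dst == "trade:42")

print(payer_from_kv, payer_from_doc, payer_from_columns, payer_from_graph)
```

None of the four needs a fixed table schema up front, which matches the "Fixed table schemas not needed" characteristic on the slide.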
  • 54. Twitter Tag: #briefr The Briefing Room
  • 55. This Month: DATABASE June: ANALYTICS & MACHINE LEARNING July: INNOVATIVE TECHNOLOGY www.insideanalysis.com/webcasts/the-briefing-room Twitter Tag: #briefr The Briefing Room Upcoming Topics 2014 Editorial Calendar at www.insideanalysis.com
  • 56. Twitter Tag: #briefr THANK YOU for your ATTENTION! The Briefing Room