Big Data, NoSQL & Data Modeling
10 Tips for Data Modeling Success on Modern Data Projects
Karen Lopez, InfoAdvisors
www.datamodel.com
Data Models – Traditional Process
Conceptual (Data) Model
Logical Data Model
Physical Data Model(s): OLTP, MART
Aug 2014©InfoAdvisors - infoadvisors.com
Relational
Data models started with relational modeling, so they look like relational database structures.
But….
That doesn’t mean they can’t be used to model data that goes into a non-relational format.
All that formatting happens at build or consumption time, not requirements time.
The Big Data Story
Lots of data
Coming at us fast
Lots of variety in format & quality
We want all the data
Highly available
“It’s web scale”
What do we really mean by scale?
Bringing computing to the data
Massively parallel processing
Cheap, commodity hardware, but lots of it
Optimized for Query/Reads/Questions/Telling stories
We’ve been down this road before…
Traditional transactional applications
Reporting-optimized tables/structures
Data Warehouse / Dimensional Modeling
From highly normalized to highly denormalized
[Diagram: ETL, EDW, Data Mart, Data Mart]
[Diagram: Hadoop, ETL, EDW, Analytics Mart, Data Mart]
NoSQL, Not Only SQL
Relational
Graph
Columnar/Column Family
Key Value
Document
Databases
Others
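To make the categories above a little more concrete, here is a rough sketch of one made-up customer order expressed in several of those shapes, using plain Python structures. All names and values are hypothetical, and real products in each family differ in the details.

```python
import json

# One hypothetical data fact: customer 42 (Ada) placed order 1001 for 25.0.

# Relational: separate tables of rows, related by keys.
customers = [{"customer_id": 42, "name": "Ada"}]
orders = [{"order_id": 1001, "customer_id": 42, "total": 25.0}]

# Key value: an opaque value looked up by a single key.
key_value = {"order:1001": '{"customer_id": 42, "total": 25.0}'}

# Document: a nested, self-describing record.
document = {"_id": 1001, "total": 25.0, "customer": {"id": 42, "name": "Ada"}}

# Column family: a row key mapping to families of columns.
column_family = {
    "customer:42": {
        "profile": {"name": "Ada"},
        "orders": {"1001": 25.0},
    }
}

# Graph: nodes plus labeled relationships between them.
nodes = {
    "c42": {"label": "Customer", "name": "Ada"},
    "o1001": {"label": "Order", "total": 25.0},
}
edges = [("c42", "PLACED", "o1001")]


def order_total_relational(order_id):
    """Find the order total the relational way: scan the orders table."""
    return next(o["total"] for o in orders if o["order_id"] == order_id)


# The same data fact can be recovered from every shape.
totals = [
    order_total_relational(1001),
    json.loads(key_value["order:1001"])["total"],
    document["total"],
    column_family["customer:42"]["orders"]["1001"],
    nodes[edges[0][2]]["total"],
]
print(totals)
```

The storage shape changes from one family to the next, but the fact being recorded, and what it means, does not.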
Sample Hive Statement
-- Columns are applied to the comma-delimited text files when they are read.
CREATE EXTERNAL TABLE TaxRebateUsage (
  state string,
  zipcode string,
  agi_class int,
  n1 int,
  mars2 int,
  prep int,
  n2 int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
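One way to read the statement above: the table definition lays column names and types over comma-delimited text when the data is read, and the text itself is not changed. The Python sketch below mimics that idea outside Hive. The sample rows and the `read_rows` helper are hypothetical and invented for illustration; this is not how Hive is implemented.

```python
import csv
import io

# Column names and types copied from the TaxRebateUsage statement.
SCHEMA = [
    ("state", str),
    ("zipcode", str),
    ("agi_class", int),
    ("n1", int),
    ("mars2", int),
    ("prep", int),
    ("n2", int),
]


def read_rows(text):
    """Apply SCHEMA to comma-delimited text while reading it (hypothetical helper)."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        row = {}
        for (name, cast), raw in zip(SCHEMA, fields):
            try:
                row[name] = cast(raw)
            except ValueError:
                # In this sketch a value that does not fit the declared type
                # becomes None at read time; nothing was rejected at write time.
                row[name] = None
        rows.append(row)
    return rows


# Made-up sample data; the second row has a non-numeric n1 value.
sample = "AL,35004,1,1500,400,700,2600\nAL,35005,2,oops,300,500,1900\n"
rows = read_rows(sample)
print(rows[0])
print(rows[1])
```

Note that `zipcode` is declared as a string, so it is kept as text rather than being turned into a number.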
Sample JSON/MongoDB Notation
Sample FoundationDB Statement
Sample Cassandra Statement
Sample Vertica Statement
Sample Neo4j Statement
Those weren’t SCHEMALESS….
They had data facts, which had meanings, and sometimes expected formats, precisions, and types.
In the NoSQL world, we don’t necessarily apply those at write time, but at read time.
SCHEMALESS really is MULTIPLE SCHEMAs (Polyschematic) or VARYING SCHEMAs.
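As a minimal sketch of schema on read with varying schemas: the documents below were written in three different shapes, and a reader function applies the expected names and types only when the data is consumed. The field names, the sample records, and the `read_customer` helper are all hypothetical.

```python
# Three documents written at different times with different shapes (made up).
stored = [
    {"name": "Ada", "zip": "35004"},
    {"full_name": "Grace", "zipcode": 35005},        # renamed fields, numeric zip
    {"name": "Edgar", "address": {"zip": "35006"}},  # zip nested in an address
]


def read_customer(doc):
    """Apply one read-time schema: a name (str) and a zipcode (str)."""
    name = doc.get("name") or doc.get("full_name")
    zipcode = (
        doc.get("zip")
        or doc.get("zipcode")
        or doc.get("address", {}).get("zip")
    )
    # Coerce to the expected types only now, at consumption time.
    return {"name": str(name), "zipcode": str(zipcode)}


customers = [read_customer(d) for d in stored]
for customer in customers:
    print(customer)
```

Nothing stopped the three shapes from being written, so the knowledge of what a name and a zipcode are ends up in `read_customer`. The modeling work has moved, not disappeared.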
The Big Data Big Lies
Schemaless
• Schema on Read, not Schema on Write
• Polyschematic
Big
• New data stories
• New technologies
• Not just volume
10 Tips For Modeling in a Hybrid World
1. Models require a modeler
2. Data modeling tools are essential
3. There are many types of data models: know which ones you need
4. Modeling does not have to happen at the same time in every project. It should happen at the right time
5. Modeling is not just schema design. Think outside the boxes and lines
10 Tips for Modeling in a Hybrid World
6. A data model is much more than a diagram
7. You will need training.
8. Team members may not understand modeling. They will need training
9. NoSQL is not one thing. Learn many patterns
10. Modern data architectures are likely hybrid solutions. You can’t just support one part.
What does this mean for data modelers?
There will be jobs for traditional, ERD, relational modelers…
…just like there are still jobs for RPG and COBOL programmers.
All data has a data story. Many data stories.
A good modeler is an architect at heart, finding the right solution for the data story.
Business Intelligence Journal
Look for the September 2014 issue article on Modern Data Architectures
Thank You!
www.infoadvisors.com
www.datamodel.com
www.dataversity.net
community.embarcadero.com
#TEAMDATA
