Data Modelers Save Their Careers:
Surviving and Thriving with NoSQL
Joe Maguire
Data Quality Strategies, LLC
http://www.DataQualityStrategies.com/
© 2013 Data Quality Strategies, LLC
Thesis
• Relational DBMSs have dominated,
• ...so relational modeling subsumed other
forms, including conceptual modeling.
• As the RDBMS wanes, so does relational
modeling, and sadly, whatever it subsumed.
• Conceptual modeling must be saved.
• Relational modelers can step in to save it...
• ...with some significant effort.
25 June 2013 © 2013 Data Quality Strategies, LLC 2
My Perspective
• Over three decades in industry
• Career is a three-legged stool
– Product development for software vendors
– Solution design for enterprises
– Author, Industry Analyst, Thought Leader
• Specialize in
– Modeling
– Requirements analysis
– Data architecture
– Data quality
• Joe.Maguire@DataQualityStrategies.com
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Q&A
A Big-Picture Framework
Meta-model    Data Perspective
Conceptual    • Entities
              • Attributes
              • Relationships
              • Identifiers
Logical       • Tables
              • Columns
              • Primary and foreign keys
Physical      • Indexes
              • Table spaces
              • Vertical and horizontal partitioning
              • Denormalizations
Good Ideas in the Framework
• Information Hiding
– e.g., conceptual excludes implementation details
• The Type/Instance distinction
– Models describe categories, data describes members
• Application/Data Independence
– Data modeling is separate from process modeling
• User Requirements ≠ System Requirements
– Users should not participate in logical and physical modeling
• Model-Driven Development
– Forward and reverse engineering across model levels
A Big-Picture Framework, distorted
Meta-model    Data Perspective
Relational    • Entities / Tables
              • Attributes / Columns
              • Relationships / FKs
              • Identifiers / PKs
Physical      • Indexes
              • Table spaces
              • Vertical and horizontal partitioning
              • Denormalizations
How the Distortion Happens
• Tool Vendors Dismiss Conceptual Modeling
– Because their tools cannot support it anyway
• Information Management Specialists Confuse Models with Reality
– E.g., believing the relational model suffices to
describe the universe
• Institutionalized Expediency
– We know about conceptual modeling, but to save
time, we combine it with relational modeling...
– ...then we formalize that into our dev processes...
– ...and eventually, that becomes the “best practices.”
Distortions, Revisited
• Summary of Distortions:
– Distortion: Conceptual means vague
– Distortion: Logical implies relational
• When logical could equally imply XML, OO, KV Store, Array
Database, or Graph Database
• Results of Distortions:
– Two levels only: relational and physical
– Relational modeling used for user requirements
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Q&A
Current Events: NoSQL
• The “Just Say No” Interpretation
Meta-model              Data Perspective
Logical / Relational    • Entities / Tables
                        • Attributes / Columns
                        • Relationships / FKs
                        • Identifiers / PKs
Physical                NO LONGER RELATIONAL:
                        • Schemas Based on Big Table Implementations
                        • Alien DDL
                        • Limited Support from Modeling Tools
Current Events: NoSQL
• The “Not Only SQL” Interpretation
– Okay, so there might be some work for you
– But you’re at risk of being marginalized
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A
Your Future as a Modeler
• Remaining Relevant
– Selfishly: Saving your career
– Nobly: Serving your client / company / customer
• What You Can Do:
– Wait for relational projects
– Become a NoSQL database designer
– Help your client choose data platforms
• That starts with understanding the problems
– which starts with CONCEPTUAL MODELING.
A New (?) Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-model
• Logical Modeling
• Physical Modeling
• Tool Support?
Conceptual Modeling
• Compared with relational modeling, behaviors and
constructs fall into four groups:
– Keep some
– Discard some
– Stress some
– Change some
Conceptual Data Model Example
[Diagram not reproduced in this transcript]
Keep Some
• Keep Entities
• Keep Attributes
• Keep Relationships
• Keep Identifiers
• Keep Maximum Cardinality of Relationships
Keep Entities
• Minimum Expressiveness
• Entities, Not Tables
– Don’t express horizontal or vertical partitioning for
performance
• But do express partitioning if it is motivated by privacy/security/risk
• Entity names, not table names
– Honor user vocabulary, not IT naming standards
Keep Attributes
• Honor The User Phenomenon
– Attributes are part of user discourse
• Attributes, Not Columns
– Worry about scale (nominal, numeric, ordinal, Boolean, cyclic),
not data type
– Attribute names, not column names
• Support In-Progress Models
– During which attributes can become entities
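To make the scale-versus-data-type point concrete, here is a minimal Python sketch. It assumes a conceptual attribute can be recorded as a user-facing name plus one of the five scales listed above; the `Scale` and `Attribute` classes, the `supports_ordering` helper, and the sample "meeting" attributes are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Scale(Enum):
    """Measurement scales named on the slide (not storage data types)."""
    NOMINAL = "nominal"
    NUMERIC = "numeric"
    ORDINAL = "ordinal"
    BOOLEAN = "boolean"
    CYCLIC = "cyclic"


@dataclass(frozen=True)
class Attribute:
    """A conceptual attribute: a user-vocabulary name plus its scale."""
    name: str      # e.g. "day of week", not a column name like "DOW_CD"
    scale: Scale


# Hypothetical attributes for a "meeting" entity.
meeting_attributes = [
    Attribute("location", Scale.NOMINAL),
    Attribute("expected attendance", Scale.NUMERIC),
    Attribute("priority", Scale.ORDINAL),
    Attribute("is public", Scale.BOOLEAN),
    Attribute("day of week", Scale.CYCLIC),
]


def supports_ordering(attribute: Attribute) -> bool:
    """Assumption: only ordinal and numeric scales have a plain 'less than'."""
    return attribute.scale in (Scale.ORDINAL, Scale.NUMERIC)
```

Nothing here commits to VARCHAR, INT, or any other storage type; that decision can wait until a logical meta-model has been chosen.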
Keep Relationships
• Minimum Expressiveness
– Relationships are part of user discourse
• Allow Many-Many and Collection Entities
– If the latter seem illegal, you’ve been in IT too long
• Relationships, not Foreign Keys
– (achievement DOES NOT have code or creatureID)
• Many-Many Allowed
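The two points above (relationships rather than foreign keys, and many-many allowed) can be sketched in Python as follows. This is an illustration under assumptions: the original diagram is not available, so the `Entity` and `Relationship` classes and every instance are hypothetical; only the names "achievement", "code", and "creatureID" come from the slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """An entity instance described only by its own attributes."""
    kind: str
    label: str
    attributes: tuple = ()   # attribute names; no foreign keys here


@dataclass
class Relationship:
    """A named relationship held as a set of (left, right) pairs."""
    name: str
    pairs: set = field(default_factory=set)

    def relate(self, left: Entity, right: Entity) -> None:
        self.pairs.add((left, right))

    def rights_of(self, left: Entity) -> set:
        return {b for (a, b) in self.pairs if a == left}

    def lefts_of(self, right: Entity) -> set:
        return {a for (a, b) in self.pairs if b == right}


# Hypothetical instances.
rex = Entity("creature", "Rex", ("name",))
fido = Entity("creature", "Fido", ("name",))
fetch = Entity("skill", "fetch", ("code", "description"))
swim = Entity("skill", "swim", ("code", "description"))
rex_fetch = Entity("achievement", "Rex learned fetch", ("date",))

# The achievement is connected through relationships, not copied keys.
earned_by = Relationship("earned by")
earned_by.relate(rex_fetch, rex)
is_of = Relationship("is of")
is_of.relate(rex_fetch, fetch)

# Many-many is stated directly, with no junction table.
practices = Relationship("practices")
practices.relate(rex, fetch)
practices.relate(rex, swim)
practices.relate(fido, fetch)
```

How each relationship is eventually stored (foreign key, junction table, embedded list, graph edge) is deliberately left open at this level.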
Keep Identifiers
• Identifiers, Not PKs
– IDs are not motivated by computerization, but by
typography
– IDs predate the information revolution
• and the automotive revolution, for that matter
– Allow collection entities
• Support In-Progress Modeling
– IDs help the modeler ferret out the homonym
problem
Keep Identifiers
• Identifiers, not PKs. (E.g., Collection Entities):
– (each squad is identified by the skaters on it.)
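A collection entity of this kind can be sketched in Python with a `frozenset` as the identifier. The `Squad` class and the skater names are hypothetical; the only idea taken from the slide is that a squad is identified by the skaters on it rather than by a surrogate key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Squad:
    """A collection entity: its identifier is the set of skaters on it."""
    skaters: frozenset

    @classmethod
    def of(cls, *names: str) -> "Squad":
        return cls(frozenset(names))


# Hypothetical skaters.
first = Squad.of("Ana", "Ben", "Cy")
same_again = Squad.of("Cy", "Ana", "Ben")   # same skaters, different order
smaller = Squad.of("Ana", "Ben")
```

Because equality and hashing come from the member set, `first` and `same_again` are the same squad, while `smaller` is a different one.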
Discard Some
• Discard Foreign Keys
– They’re relational
• Discard Minimum Cardinality
– A function of process or policy, not data
– Over-reported by users
• Discard Most Constraints
– A function of process or policy, not data
– Over-reported by users
Discard Minimum Cardinality
• Must EVERY instance of meeting have a person?
– No. E.g., Cassandra Summit 2014 already has a date and
location but has zero persons associated with it.
• More generally: Should the DBMS refuse to store
incomplete data?
– People get interrupted and want to save their partial
work.
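As a rough Python sketch of the meeting example: the data structure accepts a meeting with zero persons, and the "at least one person" rule lives in a separate policy check. The `Meeting` class, the `ready_to_run` function, and the placeholder date and location strings are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Meeting:
    name: str
    date: Optional[str] = None
    location: Optional[str] = None
    persons: set = field(default_factory=set)   # zero persons is allowed


def ready_to_run(meeting: Meeting) -> bool:
    """A process/policy rule, checked when needed, not a storage constraint."""
    return bool(meeting.date and meeting.location and meeting.persons)


# Placeholder values; the slide says only that a date and location exist.
summit = Meeting("Cassandra Summit 2014",
                 date="(placeholder date)",
                 location="(placeholder location)")
```

The incomplete meeting can be saved as it stands; the policy check starts passing once someone is added, for example with `summit.persons.add("a first attendee")`.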
Keep/Discard Rule of Thumb
• Keep
– Anything that helps you and the users together
discover and name the user categories
• Discard
– Anything else
Conceptual Data Model Examples
[Diagrams not reproduced in this transcript]
Stress Some
• Stress Consistency Requirements
– Relational modelers (of non-distributed databases)
have not been asking about these.
• Stress Data Volume / Velocity Requirements
– Can lead or force you to relax application-data
independence
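One way to keep these requirements visible is to record them next to each entity while talking with users. The Python sketch below assumes a simple per-entity record; the `Consistency` enum, the `DataRequirements` class, and all entity names and numbers are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Consistency(Enum):
    IMMEDIATE = "immediate"   # readers must see the latest write
    EVENTUAL = "eventual"     # readers may briefly see older data


@dataclass(frozen=True)
class DataRequirements:
    entity: str
    consistency: Consistency
    writes_per_second: int     # velocity
    expected_instances: int    # volume


# Hypothetical answers gathered from users; the numbers are invented.
requirements = [
    DataRequirements("account balance", Consistency.IMMEDIATE, 50, 1_000_000),
    DataRequirements("page view", Consistency.EVENTUAL, 20_000, 5_000_000_000),
]


def tolerates_stale_reads(req: DataRequirements) -> bool:
    return req.consistency is Consistency.EVENTUAL


# Entities for which immediate consistency turned out to be negotiable.
negotiable = [r.entity for r in requirements if tolerates_stale_reads(r)]
```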
Change Some
• Change Your Process
– From math-y normalization to English-y
conversation with users
– Very difficult to achieve rigor conversationally
• More help:
– Mastering Data Modeling: A User-Driven Approach,
by Carlis & Maguire
A New Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?
Choosing a Logical Meta-Model
• Don’t Assume Relational (Duh...)
• Don’t Assume Big Table, KV-Store, Cassandra
• Lots of Choices
– Relational
– Key-Value Store
– XML/Document Database
– Graph database
– Array database
– ...
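To illustrate why the choice can wait, the Python sketch below takes one conceptual description and renders it under two of the meta-models listed above. Everything in it is an assumption: the dictionary layout, the `to_relational` and `to_key_value` helpers, the generic SQL text, and the key format are hypothetical and do not target any specific product.

```python
import json

# Hypothetical conceptual description: entity, identifier, attributes.
conceptual = {
    "entity": "meeting",
    "identifier": ["name"],
    "attributes": ["name", "held_on", "location"],
}


def to_relational(model: dict) -> str:
    """Render generic CREATE TABLE text; every column is TEXT for brevity."""
    columns = [f"{attr} TEXT" for attr in model["attributes"]]
    key = ", ".join(model["identifier"])
    body = ", ".join(columns + [f"PRIMARY KEY ({key})"])
    return f"CREATE TABLE {model['entity']} ({body})"


def to_key_value(model: dict, instance: dict) -> tuple:
    """Render one instance as a (key, JSON value) pair for a key-value store."""
    id_part = "|".join(str(instance[i]) for i in model["identifier"])
    key = f"{model['entity']}:{id_part}"
    value = json.dumps(
        {attr: instance.get(attr) for attr in model["attributes"]},
        sort_keys=True,
    )
    return key, value


ddl = to_relational(conceptual)
kv_key, kv_value = to_key_value(conceptual, {"name": "Cassandra Summit 2014"})
```

The conceptual description is unchanged in both cases; only the rendering differs, which is the point of choosing the logical meta-model after conceptual modeling.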
A New Modeling Framework
• Conceptual Modeling
• Choosing a Logical Meta-Model
• Logical Modeling
• Physical Modeling
• Tool Support?
Logical, Physical, and Tool Support
• Minimal Support From Modeling Tools
– Because few tools support conceptual modeling
– Because vendors have not caught up to NoSQL yet
• Community Needs to Develop Shapes
– And the attendant transformations from conceptual
shapes to Big-Table shapes
• During Logical NoSQL Modeling, Process
Requirements Will Infiltrate
Agenda
• History
• Current Events
• Your Future as a Data Modeler
• Summary
• Q&A
Summary
• Recommit to Conceptual Modeling for
Requirements Analysis
– Some but not all relational-modeling skills will
apply
– Must learn to focus on user communication, not
nerdy stuff like intermediate normal forms
Summary
• Remember the fundamentals, so that you can
make informed decisions about relaxing them
– Application-data independence (relax knowingly)
– Distinguish problems from solutions (relax at your
own peril)
– Consistency level as a user requirement (as you
ask, you’ll find immediate consistency is often
negotiable)
Summary
• Additional Benefits
– Users will like you better
– Agile developers will like you better
– This framework works in traditional, all-SQL
environments
Q&A
• Joe.Maguire@DataQualityStrategies.com
• www.DataQualityStrategies.com

More Related Content

PDF
Customer Journey Mapping Research Report
PPTX
5 Key Steps to Drive with Fintech Customer Journeys
PDF
UX STRAT Asia 2020: Kévin Boezennec, Bank of Singapore
PDF
RD2S_algorithm
PPTX
Finwizz Financial Services
PDF
How HorizonCX uses UXPressia to create better maps
PPTX
Designing the User Experience
PPTX
Future of ba iiba slides
Customer Journey Mapping Research Report
5 Key Steps to Drive with Fintech Customer Journeys
UX STRAT Asia 2020: Kévin Boezennec, Bank of Singapore
RD2S_algorithm
Finwizz Financial Services
How HorizonCX uses UXPressia to create better maps
Designing the User Experience
Future of ba iiba slides

What's hot (20)

PPTX
How to Scale Your UX Research
PPTX
Predictive Analytics World for Business Germany 2018
PPT
Embedded Analytics for the ISV: Supercharging Applications with BI
PPTX
Preparing for Peak in Ecommerce | eTail Asia 2020
PDF
Spocto :: NPA and Data Recovery Solution
PPTX
Analytics in business
PDF
Smart Answers for Employee and Customer Support After COVID-19
PDF
CRM 2.0 - Social CRM - The New Discipline
PDF
Capabilities Packet-7-for-Web
PDF
SMARI Capabilities Packet
PDF
SMARI Capabilities Packet
PDF
What's So Great About Embedded Analytics?
PDF
Monitoring the Digital World – Demystifying Customer Experience
PDF
Explainability for Natural Language Processing
PDF
Real Time Customer Insights
PDF
TrendMiner Award Write Up
PDF
Designing Big Data Interactions Using the Language of Discovery
PDF
Bpma contextual inquiry
PPTX
Applying AI & Search in Europe - featuring 451 Research
PPT
Search Me: Designing Information Retrieval Experiences
How to Scale Your UX Research
Predictive Analytics World for Business Germany 2018
Embedded Analytics for the ISV: Supercharging Applications with BI
Preparing for Peak in Ecommerce | eTail Asia 2020
Spocto :: NPA and Data Recovery Solution
Analytics in business
Smart Answers for Employee and Customer Support After COVID-19
CRM 2.0 - Social CRM - The New Discipline
Capabilities Packet-7-for-Web
SMARI Capabilities Packet
SMARI Capabilities Packet
What's So Great About Embedded Analytics?
Monitoring the Digital World – Demystifying Customer Experience
Explainability for Natural Language Processing
Real Time Customer Insights
TrendMiner Award Write Up
Designing Big Data Interactions Using the Language of Discovery
Bpma contextual inquiry
Applying AI & Search in Europe - featuring 451 Research
Search Me: Designing Information Retrieval Experiences
Ad

Viewers also liked (20)

PPTX
Webinar: Don't Leave Your Data in the Dark
PPTX
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
PPTX
How much money do you lose every time your ecommerce site goes down?
PDF
Cassandra Community Webinar | In Case of Emergency Break Glass
PPTX
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
PPTX
Webinar | Introducing DataStax Enterprise 4.6
PPTX
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
PPTX
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
PDF
Webinar | How Clear Capital Delivers Always-on Appraisals on 122 Million Prop...
PPTX
Webinar: Eventual Consistency != Hopeful Consistency
PPTX
Cassandra Community Webinar: Back to Basics with CQL3
PDF
Cassandra TK 2014 - Large Nodes
PPTX
Webinar | From Zero to 1 Million with Google Cloud Platform and DataStax
PDF
Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optim...
PPT
Webinar: 2 Billion Data Points Each Day
PPT
Webinar: Getting Started with Apache Cassandra
PPTX
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
PPTX
Cassandra Community Webinar | Make Life Easier - An Introduction to Cassandra...
PPTX
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
PPTX
Webinar: Building Blocks for the Future of Television
Webinar: Don't Leave Your Data in the Dark
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
How much money do you lose every time your ecommerce site goes down?
Cassandra Community Webinar | In Case of Emergency Break Glass
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar | Introducing DataStax Enterprise 4.6
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar | How Clear Capital Delivers Always-on Appraisals on 122 Million Prop...
Webinar: Eventual Consistency != Hopeful Consistency
Cassandra Community Webinar: Back to Basics with CQL3
Cassandra TK 2014 - Large Nodes
Webinar | From Zero to 1 Million with Google Cloud Platform and DataStax
Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optim...
Webinar: 2 Billion Data Points Each Day
Webinar: Getting Started with Apache Cassandra
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Cassandra Community Webinar | Make Life Easier - An Introduction to Cassandra...
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: Building Blocks for the Future of Television
Ad

Similar to Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment (20)

PDF
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
PPTX
When to Consider Semantic Technology for Your Enterprise
PDF
When to Consider Semantic Technology for Your Enterprise
PDF
Data Ed: Best Practices with the DMM
PDF
Data-Ed Webinar: Best Practices with the DMM
PDF
Data-Ed Online: Data Management Maturity Model
PDF
Data-Ed: Best Practices with the Data Management Maturity Model
PDF
Best Practices with the DMM
PDF
DataEd Slides: Data Management Maturity - Achieving Best Practices Using DMM
PDF
Data-Ed Webinar: Design & Manage Data Structures
PDF
Data-Ed: Design and Manage Data Structures
PDF
Exploring Business Intelligence: How BI Transforms Business Operations and Fu...
PPTX
Building enterprise advance analytics platform
PDF
Implementing the Data Maturity Model (DMM)
PDF
2013 ALPFA Leadership Submit, Data Analytics in Practice
PDF
Business Centric Data Modeling
PDF
LDM Webinar: Data Modeling & Business Intelligence
PPTX
RACI_Product_Services_Training.ChainSys.
PPTX
Chapter 5: Data Development
PDF
chapter5-220725172250-dc425eb2.pdf
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
When to Consider Semantic Technology for Your Enterprise
When to Consider Semantic Technology for Your Enterprise
Data Ed: Best Practices with the DMM
Data-Ed Webinar: Best Practices with the DMM
Data-Ed Online: Data Management Maturity Model
Data-Ed: Best Practices with the Data Management Maturity Model
Best Practices with the DMM
DataEd Slides: Data Management Maturity - Achieving Best Practices Using DMM
Data-Ed Webinar: Design & Manage Data Structures
Data-Ed: Design and Manage Data Structures
Exploring Business Intelligence: How BI Transforms Business Operations and Fu...
Building enterprise advance analytics platform
Implementing the Data Maturity Model (DMM)
2013 ALPFA Leadership Submit, Data Analytics in Practice
Business Centric Data Modeling
LDM Webinar: Data Modeling & Business Intelligence
RACI_Product_Services_Training.ChainSys.
Chapter 5: Data Development
chapter5-220725172250-dc425eb2.pdf

More from DataStax (20)

PPTX
Is Your Enterprise Ready to Shine This Holiday Season?
PPTX
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
PPTX
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
PPTX
Best Practices for Getting to Production with DataStax Enterprise Graph
PPTX
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
PPTX
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
PDF
Webinar | Better Together: Apache Cassandra and Apache Kafka
PDF
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
PDF
Introduction to Apache Cassandra™ + What’s New in 4.0
PPTX
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
PPTX
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
PDF
Designing a Distributed Cloud Database for Dummies
PDF
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
PDF
How to Evaluate Cloud Databases for eCommerce
PPTX
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
PPTX
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
PPTX
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
PPTX
Datastax - The Architect's guide to customer experience (CX)
PPTX
An Operational Data Layer is Critical for Transformative Banking Applications
PPTX
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Is Your Enterprise Ready to Shine This Holiday Season?
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Best Practices for Getting to Production with DataStax Enterprise Graph
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | Better Together: Apache Cassandra and Apache Kafka
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Introduction to Apache Cassandra™ + What’s New in 4.0
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Designing a Distributed Cloud Database for Dummies
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Evaluate Cloud Databases for eCommerce
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Datastax - The Architect's guide to customer experience (CX)
An Operational Data Layer is Critical for Transformative Banking Applications
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking

Recently uploaded (20)

PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Encapsulation_ Review paper, used for researhc scholars
DOCX
The AUB Centre for AI in Media Proposal.docx
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Machine learning based COVID-19 study performance prediction
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPT
Teaching material agriculture food technology
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
cuic standard and advanced reporting.pdf
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPTX
Cloud computing and distributed systems.
PDF
Approach and Philosophy of On baking technology
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PPTX
Big Data Technologies - Introduction.pptx
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Chapter 3 Spatial Domain Image Processing.pdf
Encapsulation_ Review paper, used for researhc scholars
The AUB Centre for AI in Media Proposal.docx
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
Spectral efficient network and resource selection model in 5G networks
Machine learning based COVID-19 study performance prediction
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Teaching material agriculture food technology
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
cuic standard and advanced reporting.pdf
CIFDAQ's Market Insight: SEC Turns Pro Crypto
“AI and Expert System Decision Support & Business Intelligence Systems”
Cloud computing and distributed systems.
Approach and Philosophy of On baking technology
Understanding_Digital_Forensics_Presentation.pptx
Big Data Technologies - Introduction.pptx
Diabetes mellitus diagnosis method based random forest with bat algorithm
Build a system with the filesystem maintained by OSTree @ COSCUP 2025

Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment

  • 1. Data Modelers Save Their Careers: Surviving and Thriving with NoSQL Joe Maguire Data Quality Strategies, LLC http://www.DataQualityStrategies.com/ © 2013 Data Quality Strategies, LLC
  • 2. Thesis • Relational DBMS’s have dominated, • ...so relational modeling subsumed other forms, including conceptual modeling. • As R-DBMS wanes, so does relational modeling – and sadly, whatever it subsumed. • Conceptual modeling must be saved. • Relational modelers can step in to save it... • ...with some significant effort. 25 June 2013 © 2013 Data Quality Strategies, LLC 2
  • 3. My Perspective • Over three decades in industry • Career is a three-legged stool – Product development for software vendors – Solution design for enterprises – Author, Industry Analyst, Thought Leader • Specialize in – Modeling – Requirements analysis – Data architecture – Data quality • Joe.Maguire@DataQualityStrategies.com 25 June 2013 © 2013 Data Quality Strategies, LLC 3
  • 4. Agenda • History • Current Events • Your Future as a Data Modeler • Q&A 25 June 2013 © 2013 Data Quality Strategies, LLC 4
  • 5. A Big-Picture Framework 25 June 2013 © 2013 Data Quality Strategies, LLC 5 Meta-model Data Perspective Conceptual • Entities • Attributes • Relationships • Identifiers Logical • Tables • Columns • Primary and foreign keys Physical • Indexes • Table spaces • Vertical and horizontal partitioning • Denormalizations
  • 6. Good Ideas in the Framework • Information Hiding – e.g., conceptual excludes implementation details • The Type/Instance distinction – Models describe categories, data describes members • Application/Data Independence – Data modeling is separate from process modeling • User Requirements ≠ System Requirements – Users should not participate in logical and physical • Model-Driven Development – Forward and reverse engineering across model levels 25 June 2013 © 2013 Data Quality Strategies, LLC 6
  • 7. A Big-Picture Framework, distorted 25 June 2013 © 2013 Data Quality Strategies, LLC 7 Meta-model Data Perspective Relational • Entities / Tables • Attributes / Columns • Relationships / FKs • Identifiers / PKs Physical • Indexes • Table spaces • Vertical and horizontal partitioning • Denormalizations
  • 8. How the Distortion Happens • Tool Vendors Dismiss Conceptual Modeling – Because their tools cannot support it anyway • Info Mgmt Specialists Confuse Models w Reality – E.g., believing the relational model suffices to describe the universe • Institutionalized Expediency – We know about conceptual modeling, but to save time, we combine it with relational modeling... – ...then we formalize that into our dev processes... – ...and eventually, that becomes the “best practices.” 25 June 2013 © 2013 Data Quality Strategies, LLC 8
  • 9. Distortions, Revisited • Summary of Distortions: – Distortion: Conceptual means vague – Distortion: Logical implies relational • Rather than implying XML, OO, KV Store, Array Database, Graph Database • Results of Distortions: – Two levels only: relational and physical – Relational modeling used for user requirements 25 June 2013 © 2013 Data Quality Strategies, LLC 9
  • 10. Agenda • History • Current Events • Your Future as a Data Modeler • Q&A 25 June 2013 © 2013 Data Quality Strategies, LLC 10
  • 11. Current Events: NoSQL • The “Just Say No” Interpretation 25 June 2013 © 2013 Data Quality Strategies, LLC 11 Meta-model Data Perspective Logical Relational • Entities / Tables • Attributes / Columns • Relationships / FKs • Identifiers / PKs Physical NO LONGER RELATIONAL: • Schemas Based on Big Table Implementations • Alien DDL language • Limited Support from Modeling Tools
  • 12. Current Events: NoSQL 25 June 2013 © 2013 Data Quality Strategies, LLC 12 • The “Not Only SQL” Interpretation – Okay, so there might be some work for you – But you’re at risk of being marginalized
  • 13. Agenda • History • Current Events • Your Future as a Data Modeler • Summary • Q&A 25 June 2013 © 2013 Data Quality Strategies, LLC 13
  • 14. Your Future as a Modeler 25 June 2013 © 2013 Data Quality Strategies, LLC 14 • Remaining Relevant – Selfishly: Saving your career – Nobly: Serving your client / company / customer • What You Can Do: – Wait for relational projects – Become a NoSQL database designer – Help your client choose data platforms • That starts with understanding the problems – which starts with CONCEPTUAL MODELING.
  • 15. A New (?) Modeling Framework • Conceptual Modeling • Choosing a Logical Meta-model • Logical Modeling • Physical Modeling • Tool Support? 25 June 2013 © 2013 Data Quality Strategies, LLC 15
  • 16. Conceptual Modeling • Behaviors and constructs will compare to relational modeling: – Keep some – Discard some – Stress some – Change some 25 June 2013 © 2013 Data Quality Strategies, LLC 16
  • 17. Conceptual Data Model Example 25 June 2013 © 2013 Data Quality Strategies, LLC 17
  • 18. Keep Some • Keep Entities • Keep Attributes • Keep Relationships • Keep Identifiers • Keep Maximum Cardinality of Relationships 25 June 2013 © 2013 Data Quality Strategies, LLC 18
  • 19. Keep Entities • Minimum Expressiveness • Entities, Not Tables – Don’t express horizontal or vertical partitioning for performance • But yes if motivated by privacy/security/risk • Entity names, not table names – Honor user vocabulary, not IT naming standards 25 June 2013 © 2013 Data Quality Strategies, LLC 19
  • 20. Keep Attributes • Honor The User Phenomenon – Attributes are part of user discourse • Attributes, Not Columns – Worry about scale (nominal, numeric, ordinal, Boolean, cyclic), not data type – Attribute names, not column names • Support In-Progress Models – During which attributes can become entities 25 June 2013 © 2013 Data Quality Strategies, LLC 20
  • 21. Keep Relationships • Minimum Expressiveness – Relationships are part of user discourse • Allow Many-Many and Collection Entities – If the latter seem illegal, you’ve been in IT too long • Relationships, not FKs 25 June 2013 © 2013 Data Quality Strategies, LLC 21
  • 22. • Relationships, not Foreign Keys – (achievement DOES NOT have code or creatureID) Keep Relationships 25 June 2013 © 2013 Data Quality Strategies, LLC 22
  • 23. • Many-Many Allowed Keep Relationships 25 June 2013 © 2013 Data Quality Strategies, LLC 23
  • 24. Keep Identifiers • Identifiers, Not PKs – IDs are not motivated by computerization, but by typography – IDs predate the information revolution • and the automotive revolution, for that matter – Allow collection entities • Support In-Progress Modeling – IDs help the modeler ferret out the homonym problem 25 June 2013 © 2013 Data Quality Strategies, LLC 24
  • 25. Keep Identifiers • Identifiers, not PKs. (E.g., Collection Entities): – (each squad is identified by the skaters on it.) 25 June 2013 © 2013 Data Quality Strategies, LLC 25
  • 26. Discard Some • Discard Foreign Keys – They’re relational • Discard Minimum Cardinality – A function of process or policy, not data – Over-reported by users • Discard Most Constraints – A function of process or policy, not data – Are over-reported by users 25 June 2013 © 2013 Data Quality Strategies, LLC 26
  • 27. Discard Minimum Cardinality • Must EVERY instance of meeting have a person? – No. E.g., CassandraSummit 2014 already has a date and location but has zero persons associated with it. • More generally: Should the DBMS refuse to store incomplete data? – People get interrupted and want to save their partial work. 25 June 2013 © 2013 Data Quality Strategies, LLC 27
  • 28. Keep/Discard Rule of Thumb • Keep – Anything that helps you and the users together discover and name the user categories • Discard – Anything else 25 June 2013 © 2013 Data Quality Strategies, LLC 28
  • 29. Conceptual Data Model Examples 25 June 2013 © 2013 Data Quality Strategies, LLC 29
  • 30. Stress Some • Stress Consistency Requirements – Relational modelers (of non-distributed databases) have not been asking about these. • Stress Data Volume / Velocity Requirements – Can lead or force your to relax application-data independence 25 June 2013 © 2013 Data Quality Strategies, LLC 30
  • 31. Change Some • Change Your Process – From math-y normalization to English-y conversation with users – Very difficult to achieve rigor conversationally 25 June 2013 © 2013 Data Quality Strategies, LLC 31 • More help: – Mastering Data Modeling: A User-Driven Approach by Carlis & Maguire
  • 32. A New Modeling Framework • Conceptual Modeling • Choosing a Logical Meta-Model • Logical Modeling • Physical Modeling • Tool Support? 25 June 2013 © 2013 Data Quality Strategies, LLC 32
  • 33. Choosing a Logical Meta-Model • Don’t Assume Relational (Duh...) • Don’t Assume Big Table, KV-Store, Cassandra • Lots of Choices – Relational – Key-Value Store – XML/Document Database – Graph database – Array database – ... 25 June 2013 © 2013 Data Quality Strategies, LLC 33
  • 34. A New Modeling Framework
      • Conceptual Modeling
      • Choosing a Logical Meta-Model
      • Logical Modeling
      • Physical Modeling
      • Tool Support?
  • 35. Logical, Physical, and Tool Support
      • Minimal Support From Modeling Tools
          – Because few tools support conceptual modeling
          – Because vendors have not caught up to NoSQL yet
      • Community Needs to Develop Shapes
          – And the attendant transformations from conceptual shapes to Big-Table shapes
      • During Logical NoSQL Modeling, Process Requirements Will Infiltrate
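A toy version of such a transformation is sketched below, assuming a many-to-many conceptual relationship between persons and meetings and a wide-row target shape. The sample pairs, the `wide_rows` function, and its `partition_by` parameter are all hypothetical. The sketch also shows how a process requirement infiltrates: the row key cannot be chosen until you know which question the application will ask.

```python
from collections import defaultdict

# Conceptual relationship instances: (person, meeting) pairs (hypothetical data).
attendance = [("p1", "m1"), ("p1", "m2"), ("p2", "m1")]


def wide_rows(pairs, partition_by):
    """Fold relationship pairs into wide rows keyed for one access pattern.

    partition_by="person"  serves "which meetings does this person attend?"
    partition_by="meeting" serves "which persons attend this meeting?"
    """
    rows = defaultdict(dict)
    for person, meeting in pairs:
        if partition_by == "person":
            rows[person][meeting] = True    # column name carries the data
        elif partition_by == "meeting":
            rows[meeting][person] = True
        else:
            raise ValueError("partition_by must be 'person' or 'meeting'")
    return dict(rows)
```

Supporting both questions efficiently means storing both layouts, which is exactly the kind of knowingly relaxed application-data independence the deck describes.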
  • 36. Agenda
      • History
      • Current Events
      • Your Future as a Data Modeler
      • Summary
      • Q&A
  • 37. Summary
      • Recommit to Conceptual Modeling for Requirements Analysis
          – Some but not all relational-modeling skills will apply
          – Must learn to focus on user communication, not nerdy stuff like intermediate normal forms
  • 38. Summary
      • Remember the fundamentals, so that you can make informed decisions about relaxing them
          – Application-data independence (relax knowingly)
          – Distinguish problems from solutions (relax at your own peril)
          – Consistency level as a user requirement (as you ask, you’ll find immediate consistency is often negotiable)
  • 39. Summary
      • Additional Benefits
          – Users will like you better
          – Agile developers will like you better
          – This framework works in traditional, all-SQL environments

Editor's Notes

  • #6, #8, #13, #15: Point of having a merged cell for physical: it’s all coming together; it’s increasingly difficult to distinguish the underlying physical model services. Here again, hypertext is not 1:1 with HTML; it’s beyond-the-basics hypertext as manifested, e.g., in Web publishing and collaboration-oriented systems/servers. XQuery is not mainstream today, but it is exceptionally powerful and was co-developed in conjunction with XPath 2.0.
  • #18, #30: Point of this slide: reinforce the ability to discern major similarities and differences between two tools/services focused on a similar domain, by comparing and contrasting model diagrams. Non-technical people can easily learn how to read and use this type of model, which is not the case with most logical and physical model diagramming techniques. Evernote conceptual model fragment example from http://www.quepublishing.com/articles/article.aspx?p=1684320. Incomplete: a full conceptual model includes accompanying documentation, e.g., with entity definitions and examples. Microsoft OneNote 2010 conceptual model fragment example from http://www.quepublishing.com/articles/article.aspx?p=1684320. Reason for including it: compared with the Evernote conceptual model fragment, it shows how easy it is to understand domains when using conceptual models, e.g., the fact that OneNote has a more elaborate info item containment structure and supports tags at the item/paragraph level, while Evernote tagging is at the note/page level. That’s not meant to be a judgment call; the extent to which Evernote or OneNote is more useful is a function of your info item/note-taking needs.