Demystifying NoSQL Databases
Mike King & Matt Thomas
Enterprise Technologists, Big Data
2 Dell - Restricted - Confidential
What are databases?
• Ted Codd & Chris Date
– 13 rules (Codd's "12 rules", numbered 0 through 12)
– An Introduction to Database Systems
• Wikia/Wikipedia
• Mike
– An organized collection of data offering varying levels
of availability, scalability, performance, consistency,
management, accessibility and quality.
• Matt
Databases defined
What types of databases
exist?
• Network – Adabas
• Hierarchical – IMS
• Relational – PostgreSQL
• Object Oriented – Versant
• NoSQL – MongoDB, HBase
• NewSQL – VoltDB, MemSQL
• XML – MarkLogic, Xyleme
NoSQL background, issues and considerations
• History
– Google Bigtable, Amazon Dynamo
• What does schema-less mean?
– On read
– Still structured
– Embedded
– Can vary between records
• Languages & formats used
– Java, Python
– JSON, BSON, XML, CSV
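A minimal sketch of what "schema on read" can look like in practice, assuming JSON documents and a hypothetical `read_customer` helper (neither the records nor the field names come from the deck). The records are still structured, one carries an embedded sub-document, and the fields vary between records; the reader, not the store, decides which fields it expects.

```python
import json

# Two hypothetical records in the same collection; their fields differ.
raw_records = [
    '{"id": 1, "name": "Ada", "address": {"city": "Austin", "zip": "78701"}}',
    '{"id": 2, "name": "Grace", "emails": ["grace@example.com"]}',
]


def read_customer(raw):
    """Apply the 'schema' at read time by picking out the fields we expect."""
    doc = json.loads(raw)
    return {
        "id": doc["id"],
        "name": doc["name"],
        # Embedded sub-document; absent in some records, so default to None.
        "city": doc.get("address", {}).get("city"),
        # Optional repeated field; default to an empty list.
        "emails": doc.get("emails", []),
    }


customers = [read_customer(r) for r in raw_records]
for customer in customers:
    print(customer)
```

The trade-off is that every reader has to cope with missing or extra fields, which is work a static schema would otherwise do once at write time.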
NoSQL background, issues and considerations, continued
• Eric Brewer’s CAP theorem
– Can’t guarantee all three (consistency, availability, partition tolerance) at once.
• What does NoSQL really mean?
– Distributed, shared-nothing aggregate oriented database
– “Not only SQL” versus “No”
• What are the factors for the various choices?
– Best fit
– Use case(s)
– KV
– HA, Multi-site
– Network
– Kevin Bacon (degrees of separation, i.e. graph traversal)
• Sharding
– Partitioning
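Sharding by key can be sketched as below. This is an illustration only: the shard count, the `shard_for` helper and the in-memory dictionaries standing in for nodes are all assumptions, and real products each have their own partitioning scheme.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size, for illustration only


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so is not stable across restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Each dict stands in for one shared-nothing node.
shards = {i: {} for i in range(NUM_SHARDS)}


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    return shards[shard_for(key)].get(key)


for user_id in range(10):
    put(f"user:{user_id}", {"id": user_id})

print({shard: len(rows) for shard, rows in shards.items()})
```

Plain modulo hashing moves most keys when the shard count changes; Dynamo-style systems use consistent hashing to limit that movement.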
NewSQL
• SQL as predominant access method
• OLTP
• Larger user populations than NoSQL
• Better consistency than NoSQL
• Still subject to Brewer’s CAP theorem
• Examples
– VoltDB, MemSQL, Clustrix, NuoDB
RDBMS or NoSQL?
• RDBMS
– Large user populations
– Structured
– Static schema
– Strong typing
– Access by PK, AK, indexes
– Complex structures
– Feature rich
– Multi-purpose, shared by apps
– OLTP
– ACID
– Complex queries
– >3 way joins
– Small to medium sized dbs
– COTS pkgs
– Datamarts
• NoSQL
– Smaller user populations
– Multi-structured
– Schema evolution
– Weak typing
– Mostly random access by PK
– Simple structures
– Bare bones functionality
– Single purpose/use case, not shared by apps
– Not transactional
– BASE
– Simple queries
– VLDB
– Horizontal scalability
NoSQL Database Types
• Four types
– Columnar: HBase, Cassandra
– Document: MongoDB, Couchbase
– KV: Riak, Redis
– Graph: Neo4j, Titan
• How many do you need?
– By type
– Within type
• Who will manage them?
– DBAs
• How do you access them?
– SQL, NoSQL
– Sequential
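To make the four types concrete, here is one small, made-up data set shaped the way each model would roughly hold it. Plain Python structures stand in for the stores; none of this is any product's API, and all names and values are hypothetical.

```python
# Key-value: an opaque value looked up by key; the store does not parse it.
kv = {"user:42": b'{"name": "Ada"}'}

# Document: a nested, self-describing record addressed by an id.
document = {"_id": 42, "name": "Ada", "orders": [{"sku": "A1", "qty": 2}]}

# Columnar (wide-column): row key -> column family -> qualifier -> value.
wide_column = {"42": {"info": {"name": "Ada"}, "orders": {"A1": "2"}}}

# Graph: nodes plus directed, labeled edges that carry their own properties.
nodes = {42: {"name": "Ada"}, "A1": {"kind": "sku"}}
edges = [(42, "ORDERED", "A1", {"qty": 2})]

# The same question ("how many A1 did user 42 order?") in three of the shapes.
from_document = document["orders"][0]["qty"]
from_wide_column = int(wide_column["42"]["orders"]["A1"])
from_graph = next(
    props["qty"]
    for src, label, dst, props in edges
    if src == 42 and label == "ORDERED" and dst == "A1"
)
print(from_document, from_wide_column, from_graph)
```

The key-value shape is left out of the query on purpose: the store sees only bytes, so the application has to decode the value itself.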
NoSQL Commonalities
• Mostly open source
• Weak typing
• Multi-structured
• Horizontal scale
• No standardization
• VLDB
• Single purpose, per database
NoSQL Differences
• Access
• Formats supported
• Features
• Management
• Administration
• VLDB
• Performance & tuning
• Resource consumption
• Language bindings
• APIs
• Security
• Persistence
• Programmability
• ?Schemas
How are NoSQL databases typically used?
• As an adjunct to Hadoop
• As a partial replacement for some RDBMS workloads
• To scale linearly
• As a data store for semi-structured and multi-structured data
What questions do our customers ask?
• Why is my HBase cluster so CPU hungry?
• Do you have a reference architecture (RA) for <Your favorite NoSQL db goes here>?
• Can I replace all my Oracle databases with some NoSQL databases?
What are some common problems?
• Cohabitation with Hadoop and other programs on a cluster.
• Poor db design
• Falling prey to vendor hype
How about some general recommendations?
• Read a book or two on your target NoSQL db.
• Search through the blogosphere & twitterverse.
• Don’t use more than one type, unless you’re an SI or large service provider.
• If performance & service levels are important, isolate the cluster.
• Review your database design with DBAs & those that have done it already.
– Presentations, conference proceedings, boutique consultancies
NoSQL Examples, Diving Deeper
• HBase
• MongoDB
• Redis
• Neo4j
HBase
• Columnar
– Column families
• Uses ZooKeeper (ZK)
• Has a master
• Write-ahead log (WAL)
• Region servers
• MemStore
• HFiles
• HDFS
• Uses JVM heap
• Access
– Row key
– Get
– Put
– Scan
– Bulk load
• Design
– Beware of skew
– Tune for peaks
• Perf
– CPU intensive
– Very fast for puts & gets by key
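The access pattern above (put, get and scan, all driven by a sorted row key) can be sketched with a toy in-memory store. This is not the HBase client API: `SortedRowStore` and `salted_key` are hypothetical names, and the salting helper is just one common way to spread sequential keys and reduce skew.

```python
import bisect
import zlib


class SortedRowStore:
    """Toy stand-in for a table kept sorted by row key (not the HBase API)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        return self._rows.get(row_key)

    def scan(self, start, stop):
        """Yield rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


def salted_key(row_key, buckets=4):
    """Prefix a stable hash bucket so sequential keys spread across ranges."""
    salt = zlib.crc32(row_key.encode("utf-8")) % buckets
    return f"{salt:02d}-{row_key}"


store = SortedRowStore()
for i in (3, 1, 2):
    store.put(f"user#{i:03d}", "info:name", f"name-{i}")

print(store.get("user#002"))
print([key for key, _ in store.scan("user#001", "user#003")])
print(salted_key("user#001"))
```

Salting trades scan locality for write distribution: a range scan over the original key order then has to touch every bucket.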
Neo4j
• Property graph
– Nodes, edges (relationships/arcs), direction, data/properties (on nodes & arcs)
– Edge-labeled multi-digraph
• REST API
• ACID
• Fast, scalable lookups
• Lucene index for search
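A property graph of this kind (an edge-labeled multi-digraph with properties on nodes and edges) can be sketched in a few lines. This is plain Python rather than Neo4j's API, and the people, films and the `ACTED_WITH` label are invented for illustration. The breadth-first search is the "degrees of separation" style of query that graph stores are built for.

```python
from collections import deque

# Nodes carry properties.
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
}

# Edges are directed and labeled, carry properties, and may be parallel.
edges = [
    ("alice", "ACTED_WITH", "bob", {"film": "Film X"}),
    ("alice", "ACTED_WITH", "bob", {"film": "Film Y"}),  # parallel edge
    ("bob", "ACTED_WITH", "carol", {"film": "Film Z"}),
]


def neighbours(node, label):
    """Nodes reachable from `node` over one outgoing edge with this label."""
    return {dst for src, lbl, dst, _ in edges if src == node and lbl == label}


def degrees_of_separation(start, goal, label="ACTED_WITH"):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours(node, label) - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None


print(degrees_of_separation("alice", "carol"))
print(degrees_of_separation("carol", "alice"))
```

Because edges are directed, the reverse query finds no path here; an undirected relationship would need an edge in each direction or a traversal that ignores direction.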
Our Contact Info
• Mike_King2@dell.com
• @MikeDataKing
• 901-262-7918
• Matt_Thomas@Dell.com
• ?twitter?
• 904-429-6709

Editor's Notes

  • #3: https://en.wikipedia.org/wiki/Codd's_12_rules
  • #4: https://en.wikipedia.org/wiki/NewSQL https://en.wikipedia.org/wiki/XML_database
  • #5: ? Aggregate orientation VS relational?
  • #9: https://itsavant.wordpress.com/2013/04/23/can-you-get-by-with-just-one-nosql-database/ https://en.wikipedia.org/wiki/NoSQL