Non-relational Databases
A new kind of database for handling web scale
Agenda
The problem
The solution
Benefits
Cost
Example: Cassandra
The problem
The Web introduces a new scale for applications, in terms of:
Concurrent users (millions of requests/second)
Data (petabytes generated daily)
Processing (all this data needs processing)
Exponential growth (surging, unpredictable demand)
The problem (contd.)
Web sites with very large traffic have no way to deal with this using existing RDBMS solutions:
Oracle
MS SQL
Sybase
MySQL
PostgreSQL
Even with their high-end clustering solutions
The problem (contd.)
Why?
Applications using a normalized database schema require joins, which don't perform well with large amounts of data and/or many nodes
Existing RDBMS clustering solutions require scaling up, which is limited & cannot keep pace with exponential growth
Machines have upper limits on capacity, & sharding the data & processing across machines is very complex & app-specific
The problem (contd.)
Why not just use sharding?
Very problematic when adding/removing nodes
Basically, you end up denormalizing everything & losing all the benefits of relational databases
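To make the "adding/removing nodes" problem concrete, here is a minimal sketch, assuming the simplest possible sharding scheme (hash the key, take the remainder by the node count). The names `shard_for` and `moved_fraction` and the `user:N` keys are hypothetical, chosen only for illustration; real applications may shard differently.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Pick a node index for a key: hash the key, then take the remainder."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose node changes when the node count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_nodes) != shard_for(k, new_nodes)
    )
    return moved / len(keys)


# Hypothetical keys, used only to exercise the functions above.
keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_nodes=4, new_nodes=5)
print(f"{fraction:.0%} of keys change node when growing from 4 to 5 nodes")
```

With this scheme a key stays on its node only when its hash gives the same remainder for 4 and for 5, which holds for about one hash value in five, so roughly four fifths of the data has to be moved just to add a single machine.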
Who faced this problem?
Web applications dealing with high traffic, massive data, large user-base & user-generated content, such as:
Google
Yahoo!
Amazon
Facebook
Twitter
LinkedIn
& many more
1 difference though
Compared to traditional large applications (telco, financial, &c), these web applications are usually free & can therefore sacrifice data integrity / consistency
No one will sue them for not receiving the most current:
Status of their friends (Facebook/Twitter)
Web search result (Google/Yahoo!)
Item added to cart (Amazon)
The solution
These companies had to come up with a new kind of DBMS, capable of handling web scale
Possibly sacrificing some level of consistency or some other feature
Must we sacrifice something?
In 2000, Eric Brewer (co-founder of Inktomi) formulated the CAP theorem, claiming that you can only optimize 2 out of these 3:
Consistency
Availability
Partition-tolerance
BTW, the theorem was later proved by MIT scientists in 2002
Simple example
When you have a lot of data which needs to be highly available, you'll usually need to partition it across machines & also replicate it to be more fault-tolerant
This means that when writing a record, all replicas must be updated too
Now you need to choose between:
Lock all relevant replicas during update => be less available
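The locking option described above can be sketched as follows. This is a toy, in-process illustration under stated assumptions, not how any particular database implements replication: the `Replica` class, the `write_all` function and its `during_update` hook are all hypothetical names, and each "replica" is just a dictionary guarded by a lock.

```python
import threading


class Replica:
    """Hypothetical in-memory replica: a dict guarded by a lock."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.lock = threading.Lock()

    def read(self, key):
        # Refuse to serve reads while the replica is locked for an update.
        if not self.lock.acquire(blocking=False):
            raise RuntimeError(f"{self.name} is locked for an update")
        try:
            return self.data.get(key)
        finally:
            self.lock.release()


def write_all(replicas, key, value, during_update=None):
    """Lock every replica, apply the write everywhere, then unlock."""
    for replica in replicas:
        replica.lock.acquire()
    try:
        if during_update is not None:
            during_update()  # hook that simulates a client arriving mid-update
        for replica in replicas:
            replica.data[key] = value
    finally:
        for replica in replicas:
            replica.lock.release()


replicas = [Replica(f"replica-{i}") for i in range(3)]
rejected = []


def try_read():
    """Simulated client read that arrives while the update is in progress."""
    try:
        replicas[0].read("status")
    except RuntimeError as error:
        rejected.append(str(error))


write_all(replicas, "status", "hello", during_update=try_read)
values_after = [replica.read("status") for replica in replicas]
print(rejected)
print(values_after)
```

The read that arrives during the update is rejected, which is the loss of availability; once the locks are released, every replica returns the same value, which is the consistency bought with it.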
