NoSQL for Artificial Intelligence
How NoSQL Fundamentally Changed Big Data,
Machine Learning and Artificial Intelligence
Who am I? Sebastián Ramírez
Chief Data Scientist at Datum Consultants
Boutique Data Science company specialized in Artificial Intelligence
Two different topics, "joined" at the end:
● NoSQL and Big Data
● Artificial Intelligence
● NoSQL for AI
What we'll see: NoSQL and Big Data
● How relational ("standard") databases work
● What was missing, enter "Big Data"
● The different evolutions of Big Data
What we'll see: Artificial Intelligence
● What is Artificial Intelligence (AI) and Machine Learning (ML)
● What can be done with AI and ML
● How AI and ML work (visually, no math)
What we'll see: NoSQL for AI
Relational Database Management Systems (RDBMSs)
Entities
Entities to relations
RDBMSs features
● Uses as little disk as possible (it was expensive in the 70s)
● Simple and very popular query language (SQL)
● Data consistency (enforced with normalization)
RDBMSs restrictions
● All data on a single machine, doesn't scale
● Fixed schema from start, difficult to change (add columns)
● Reads and writes are expensive
● Difficulties with fast I/O
● Difficulties with big datasets, even more so with analytics
● Any failure is catastrophic
RDBMS: Examples
Big Data
None of these work well with traditional RDBMS
Big Data: Distributed batch processing
Big Data: Batch Processing
MapReduce
● Distributed
● Parallel
● Fault tolerant
● Batch, data processing
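To make the MapReduce idea concrete, here is a minimal single-process sketch of its three steps (map, shuffle, reduce) applied to word counting. The function names and the sample documents are illustrative assumptions, not part of the talk; a real system would run the map and reduce steps on many machines in parallel and handle machine failures.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit one (word, 1) pair for every word in a document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each document (or block of data) would be mapped
    # on a different machine; here everything runs in one process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input.
counts = word_count(["big data big", "data for AI"])
```

Because each map call only looks at its own document and each reduce call only looks at one key, both steps can be spread across machines without coordination.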
Big Data: Batch Processing
Hadoop
Released in 2005
Open Source
MapReduce implementation
HDFS (Hadoop Distributed File System)
Big Data: Distributed Batch Processing
● Distributed
● Batch analytics (processing all the data together)
● Aggregations (avg, max, min)
● Takes long, finishes at some point
● Fault tolerant
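One way to see why aggregations such as avg, max and min fit distributed batch processing: each machine can summarize its own partition, and the small summaries can then be merged. The sketch below assumes three hypothetical partitions held in plain lists; the `Partial` class and the function names are illustrative, not taken from any specific tool.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Partial:
    """Summary of one partition, small enough to send over the network."""
    count: int
    total: float
    minimum: float
    maximum: float


def summarize(values):
    """Runs independently on each partition (one per machine)."""
    return Partial(len(values), sum(values), min(values), max(values))


def merge(a, b):
    """Combines two summaries without looking at the raw data again."""
    return Partial(
        a.count + b.count,
        a.total + b.total,
        min(a.minimum, b.minimum),
        max(a.maximum, b.maximum),
    )


# Hypothetical data, already split across three machines.
partitions = [[3.0, 5.0], [10.0], [2.0, 4.0, 6.0]]
result = reduce(merge, (summarize(p) for p in partitions))
average = result.total / result.count
```

Note that the average is computed from the merged count and total; averaging the per-partition averages would give a wrong result when partitions differ in size.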
Big Data: Distributed, in-memory, batch processing
In-memory first
Distributed batch processing
MapReduce (as with Hadoop)
...and other algorithms and tools
Same features as Hadoop:
● Distributed
● Parallel
● Fault tolerant
● Batch, data processing
● MapReduce
Plus:
● In-memory first, faster than Hadoop
● Other algorithms and tools apart from MapReduce
Big Data for applications: NoSQL
Big Data for applications: NoSQL, key-value stores
● Distributed, parallel, fault tolerant, etc
● Minimal latency
● Read / Write individual records
● Know the IDs of each record
● Not complex queries
● Denormalization, duplicate for speed
● Data consistency is harder
● Precomputed aggregations (avg, max, min)
● Reads and Writes are cheap
● High volume, velocity, variety
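The sketch below illustrates three of the points above: access by known ID, denormalization, and precomputed aggregations. A plain Python dict stands in for a distributed key-value store; the key format, field names and sample user are assumptions made for illustration only.

```python
# Stand-in for a distributed key-value store (assumption: a plain dict).
store = {}


def put(key, value):
    store[key] = value


def get(key, default=None):
    return store.get(key, default)


def record_purchase(user_id, user_name, amount):
    """Update the precomputed aggregates for one user at write time."""
    key = f"user_stats::{user_id}"  # the application must know this ID
    stats = get(key) or {"name": user_name, "count": 0, "total": 0.0, "max": None}
    # Denormalization: the user's name is duplicated into the stats record,
    # so a read needs a single lookup instead of a join.
    stats["count"] += 1
    stats["total"] += amount
    stats["max"] = amount if stats["max"] is None else max(stats["max"], amount)
    put(key, stats)


# Hypothetical sample data.
record_purchase("u1", "Ana", 20.0)
record_purchase("u1", "Ana", 40.0)

stats = get("user_stats::u1")  # one cheap read, no query
average = stats["total"] / stats["count"]
```

The trade-off named in the list shows up directly: if the user's name changes, every record that duplicates it has to be updated, which is why data consistency is harder.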
NoSQL, key-value stores: examples
Big Data for applications: NoSQL, key-value memory stores
Same characteristics as key-value stores:
● Distributed, parallel, fault tolerant, etc
● Minimal latency
● Read / Write individual records
● Know the IDs of each record
● Not complex queries
● Denormalization, duplicate for speed
● Data consistency is harder
● Precomputed aggregations (avg, max, min)
● Reads and Writes are cheap
● High volume, velocity, variety
Plus:
● Memory first storage, faster
NoSQL, key-value memory stores: examples
Big Data for applications: NoSQL, Document Stores
Key-value stores' characteristics:
● Distributed, parallel, fault tolerant, etc
● Minimal latency
● Read / Write individual records
● Precomputed aggregations (avg, max, min)
● Reads and Writes are cheap
● High volume, velocity, variety
Plus:
● Complex structures (JSON "documents")
● Arbitrary indexes on non-key fields
● Complex queries, by non-key fields
● Some denormalization, much less duplication
● Data consistency is easier than key-value
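As a rough illustration of what a document store adds on top of a key-value store, the toy class below keeps JSON-like documents by ID and maintains an index on a non-key field so that documents can be queried by that field. `TinyDocumentStore`, its methods and the sample documents are hypothetical; real document stores expose their own APIs and query languages.

```python
from collections import defaultdict


class TinyDocumentStore:
    """Toy in-memory document store with optional indexes on non-key fields."""

    def __init__(self):
        self.docs = {}      # document ID -> document (a dict)
        self.indexes = {}   # field name -> {field value -> set of IDs}

    def create_index(self, field):
        index = defaultdict(set)
        for doc_id, doc in self.docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self.indexes[field] = index

    def put(self, doc_id, doc):
        old = self.docs.get(doc_id)
        for field, index in self.indexes.items():
            # Keep every index in sync with the new version of the document.
            if old is not None and field in old:
                index[old[field]].discard(doc_id)
            if field in doc:
                index[doc[field]].add(doc_id)
        self.docs[doc_id] = doc

    def find(self, field, value):
        if field in self.indexes:
            ids = self.indexes[field].get(value, set())
            return [self.docs[doc_id] for doc_id in sorted(ids)]
        # Without an index, fall back to scanning every document.
        return [doc for doc in self.docs.values() if doc.get(field) == value]


db = TinyDocumentStore()
db.create_index("city")
# Hypothetical documents; note that they do not share the same fields.
db.put("u1", {"name": "Ana", "city": "Bogotá", "tags": ["ml"]})
db.put("u2", {"name": "Luis", "city": "Medellín"})
db.put("u3", {"name": "Eva", "city": "Bogotá"})

names = [doc["name"] for doc in db.find("city", "Bogotá")]
```

The query works without knowing any document ID, which is the main difference from the key-value examples.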
NoSQL, Document Stores: examples
Big Data for applications: NoSQL, Search Engines
Search Engines' characteristics:
● Distributed, parallel, fault tolerant, etc
● Minimal latency
● Store complex text documents
● Specialized text processing indexes for search
● Copy data from main data store to search engine
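The "specialized text processing indexes" are typically built around an inverted index: a map from each word to the documents that contain it. The following is a minimal sketch of that idea; the tokenizer, function names and sample documents are assumptions, and real search engines add ranking, stemming and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Very simple text processing: lowercase, keep letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Inverted index: token -> set of IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents containing every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical documents, as if copied from the main data store.
documents = {
    1: "NoSQL for Artificial Intelligence",
    2: "Big Data and NoSQL",
    3: "Artificial Intelligence without math",
}
index = build_index(documents)
hits = search(index, "artificial intelligence")
```

The index is a derived copy of the data, which matches the last bullet: the main store stays the source of truth and the search engine is refreshed from it.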
NoSQL, Search Engines: examples
Big Data for applications: NoSQL, data synchronization
NoSQL document stores' characteristics:
● Distributed, parallel, fault tolerant, etc
● Minimal latency
● Read / Write individual records
● Precomputed aggregations (avg, max, min)
● Reads and Writes are cheap
● High volume, velocity, variety
● Complex structures (JSON "documents")
● Arbitrary indexes on non-key fields
● Complex queries, by non-key fields
● Some denormalization, much less duplication
● Data consistency is easier than key-value
Plus:
● Server to server data synchronization
● Edge data synchronization (mobile, IoT)
● Offline-first complex applications
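A very rough sketch of what synchronizing documents between a server and an offline device can look like. It assumes each document carries a revision counter (`rev`) and that the higher revision simply wins; that rule, the field names and the sample data are illustrative assumptions. Real synchronization systems use more elaborate conflict detection and resolution.

```python
import copy


def sync(replica_a, replica_b):
    """Two-way sync: for every document, the higher revision wins."""
    for doc_id in set(replica_a) | set(replica_b):
        a = replica_a.get(doc_id)
        b = replica_b.get(doc_id)
        if a is None or (b is not None and b["rev"] > a["rev"]):
            replica_a[doc_id] = copy.deepcopy(b)
        elif b is None or a["rev"] > b["rev"]:
            replica_b[doc_id] = copy.deepcopy(a)
        # Equal revisions are left untouched; a real system would have to
        # detect and resolve the case where the contents still differ.


# Hypothetical state: the phone edited one note and created another offline.
server = {"note1": {"rev": 1, "text": "draft"}}
phone = {
    "note1": {"rev": 2, "text": "edited offline"},
    "note2": {"rev": 1, "text": "new on phone"},
}

sync(server, phone)
```

After the call both replicas hold the same documents, so the application on the device can keep working offline and catch up whenever a connection is available.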
NoSQL, data synchronization: examples
Big Data for applications: NoSQL, with all the toppings
NoSQL restrictions
● Extra work when:
○ Duplication is needed
○ Data updates are required
● New way of thinking and designing systems
○ Some extra learning for RDBMS people
● Strict multi-record transactions require more work
○ But in many cases a single record (document) can store the transaction data
○ Like bank transactions
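One possible reading of the bank example, sketched under stated assumptions: instead of updating two account records inside a multi-record transaction, each transfer is written as a single document, and balances are derived from the stored transfers. The dict standing in for the store, the field names and the accounts are all hypothetical.

```python
# Stand-in for a document store (assumption): transfer ID -> document.
transfers = {}


def record_transfer(transfer_id, source, target, amount):
    # A single write of a single document: either the whole transfer
    # is stored or none of it is, with no multi-record transaction.
    transfers[transfer_id] = {"from": source, "to": target, "amount": amount}


def balance(account, opening=0.0):
    """Derive a balance by replaying every stored transfer."""
    total = opening
    for transfer in transfers.values():
        if transfer["to"] == account:
            total += transfer["amount"]
        if transfer["from"] == account:
            total -= transfer["amount"]
    return total


# Hypothetical transfers.
record_transfer("t1", "alice", "bob", 30.0)
record_transfer("t2", "bob", "carol", 10.0)

bob_balance = balance("bob")
alice_balance = balance("alice", opening=100.0)
```

In practice the derived balance would usually be kept as a precomputed aggregation rather than recomputed on every read.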
What is Artificial Intelligence (AI)?
[Diagram: Artificial Intelligence, Machine Learning, Deep Learning]
What can be done with Machine Learning (ML)?
What can be done with ML?
How Machine Learning works
...explained without math
ML: Linear Regression
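The talk explains linear regression visually; for readers who prefer code, here is a small sketch of fitting a straight line to points with ordinary least squares and using it for a prediction. The function name and the sample data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)   # "training"
prediction = slope * 5.0 + intercept  # "prediction" for an unseen x
```

Training finds the line; prediction just reads a value off that line for an input the model has not seen.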
ML: K-Means
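K-Means groups points by repeating two steps: assign every point to its closest centroid, then move each centroid to the mean of its points. The sketch below assumes fixed starting centroids and a fixed number of iterations to stay deterministic; the names and data are illustrative, and library implementations choose starting points and stopping rules more carefully.

```python
def closest(point, centroids):
    """Index of the centroid nearest to the point (squared Euclidean distance)."""
    distances = [
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for centroid in centroids
    ]
    return distances.index(min(distances))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def k_means(points, centroids, iterations=10):
    for _ in range(iterations):
        # Step 1: assign each point to its closest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[closest(point, centroids)].append(point)
        # Step 2: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            mean(cluster) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [closest(point, centroids) for point in points]
    return centroids, labels


# Hypothetical 2D points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

No labels are given as input; the groups come only from the positions of the points, which is what makes this an unsupervised method.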
ML: datasets for training
ML: datasets for prediction
How NoSQL helps AI / ML
● Data Volume
● Data Velocity
● Data Variety
● ML iteration
● Update schemas, queries
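A small sketch of how schema flexibility can help ML iteration: documents with different fields are turned into feature vectors, and a new feature is tried by changing only the list of fields, with no schema migration. The field names, default value and sample documents are assumptions made for illustration.

```python
def to_features(document, fields, default=0.0):
    """Build a numeric feature vector, filling missing fields with a default."""
    return [float(document.get(field, default)) for field in fields]


# Hypothetical documents as they might come out of a document store.
documents = [
    {"age": 34, "purchases": 3},
    # A newer record that already carries an extra field.
    {"age": 51, "purchases": 7, "visits_last_week": 4},
]

# First iteration of the model uses two features.
features_v1 = [to_features(doc, ["age", "purchases"]) for doc in documents]

# Next iteration adds a feature; older documents just get the default value.
features_v2 = [
    to_features(doc, ["age", "purchases", "visits_last_week"])
    for doc in documents
]
```

Whether a default value is an acceptable stand-in for missing data depends on the model and the feature, so treat this only as a starting point.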
Thank you! Questions?
Sebastián Ramírez
@tiangolo
sebastian@datumcon.com
