Put Data to Work. Simply.
An Introduction to Crate
Johannes Moser, Head of Integrations
About Me
• Loves working with people and code
• Head of Integrations at @CrateIO
johannes@crate.io
@joemoeAT
Put Data to Work. Simply.
Crate is a fully sharded, distributed, horizontally scaling
SQL database with realtime search and aggregation to
build worry-free applications and services at scale.
super simple to use & operate
plays well with others
delivers fast results
Crate is...
• distributed standard SQL
• shared nothing architecture
• auto-sharding/partitioning/replication
• extremely simple to install/operate
• all data (relational, json docs, blobs)
• realtime search & aggregations
• horizontal scaling, elastic, resilient
• dynamic schemas, agile dev
• eventually consistent, atomic on doc level
• balanced memory-disk usage
• ideal for containers
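
The sharding and replication bullets above can be pictured with a small sketch. This is a generic illustration of hash-based routing, not Crate's actual algorithm: the hash function, the round-robin placement, and the names shard_for, place_replicas, and node-a to node-d are all assumptions made for the example.

```python
import zlib
from typing import List


def shard_for(routing_key: str, num_shards: int) -> int:
    """Map a routing key to a shard number using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # crc32 gives the same value in every process, unlike the built-in
    # hash() for strings, which is randomized per interpreter run.
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


def place_replicas(shard: int, nodes: List[str], copies: int) -> List[str]:
    """Pick `copies` distinct nodes for a shard, round-robin from its number."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    return [nodes[(shard + i) % len(nodes)] for i in range(copies)]


# Hypothetical four-node cluster, six shards, two copies of every shard.
nodes = ["node-a", "node-b", "node-c", "node-d"]
for key in ["sensor-1", "sensor-2", "sensor-3"]:
    shard = shard_for(key, num_shards=6)
    print(key, "-> shard", shard, "on", place_replicas(shard, nodes, copies=2))
```

Because every node can compute the same mapping, no central master has to be asked where a row lives, which is the property the shared nothing bullet refers to.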
Crate - Stateful Container
enabled through masterless shared nothing architecture
[Diagram: four hosts, each with a crate node, app containers, and local storage; labels: Crate Cluster, Crate Container Cluster]
Crate is like an ether - an omnipresent, persistent layer for your data
Crate clusters running right now (ca. 400)
Competition
• MySQL
• lacks scalability, fault-tolerance, search
• Hadoop World
• it's never realtime and not hot data
• Cassandra
• rather inflexible, complex indexing, fat stack
• Elasticsearch
• no SQL, not a DB, no compute of result
• MemSQL
• memory only really, closed source
• MongoDB
• slow aggregations, lack of search, tough scaling
• traditional, proprietary SQL servers
• lack of scaling, complex and $$$
Customer “Skyhigh Networks”
• Security Events (time series)
• Realtime Dashboard
• side-by-side to Hadoop cluster
• 500 customers running on Crate
• better spread of data & load &
resilience
• 3.2 billion inserts/updates per day
• 48,000 concurrent TCP connections
• about 60 nodes on AWS
• 100% drop-in for MySQL cluster
75% fewer servers, but 20x performance
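
To put the daily figures above into per-second terms, here is a quick back-of-the-envelope calculation. It assumes the load is spread evenly over the day and over the roughly 60 nodes, which the slide does not state; real traffic will have peaks, so read the results as averages only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

writes_per_day = 3_200_000_000  # inserts/updates per day, from the slide
nodes = 60                      # "about 60 nodes", from the slide
connections = 48_000            # concurrent TCP connections, from the slide

# Averages under the even-spread assumption named above.
writes_per_second = writes_per_day / SECONDS_PER_DAY
writes_per_second_per_node = writes_per_second / nodes
connections_per_node = connections / nodes

print(f"{writes_per_second:,.0f} writes/s cluster-wide")
print(f"{writes_per_second_per_node:,.0f} writes/s per node")
print(f"{connections_per_node:,.0f} connections per node")
```

On average this works out to roughly 37,000 writes per second across the cluster, a little over 600 per node, and about 800 connections per node.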
Use Cases for crate
Well Suited
• Real time analytics and business
intelligence, Dashboards
• Internet of Things backend
• High volume, semi-structured/dynamic data
• Operational database for web
applications at scale
• Combination of operational DB and
analytics DB
Not Well Suited
• Systems that require strong
transactional consistency
• Strong, complex relational data (e.g. insurance…)
DEMO
Crate Components (Crate, Elasticsearch, other Open Source)
Crate Open Source Stack
Benefits for Developers
• run faster!
• use SQL and ORMs
• migrate easily from existing solutions
• native search & text processing
• combine all data (relational, documents,
blobs) in one store
• schema changes any time without penalty
• time series and geospatial support
• run Crate on your notebook
• deploy app & database together (with
Docker)
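
As a rough idea of what "use SQL" can look like from application code, the sketch below sends statements to a Crate node over HTTP using only the Python standard library. The URL assumes a local node with the SQL endpoint on its default port; the events table, its columns, and the helper names build_sql_request and run_sql are hypothetical, and the exact DDL options may differ between Crate versions.

```python
import json
import urllib.request

# Assumed endpoint of a local, single-node install; adjust for your setup.
CRATE_SQL_URL = "http://localhost:4200/_sql"


def build_sql_request(stmt, args=None, url=CRATE_SQL_URL):
    """Build (but do not send) a POST request carrying one SQL statement."""
    body = {"stmt": stmt}
    if args is not None:
        body["args"] = list(args)  # positional parameters for ? placeholders
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, args=None, url=CRATE_SQL_URL):
    """Send the statement and return the decoded JSON response."""
    with urllib.request.urlopen(build_sql_request(stmt, args, url)) as resp:
        return json.load(resp)


# Hypothetical table: sharding and replication are declared in the DDL,
# and the payload column accepts new keys without a schema change.
CREATE_EVENTS = """
CREATE TABLE IF NOT EXISTS events (
    id STRING PRIMARY KEY,
    ts TIMESTAMP,
    payload OBJECT(DYNAMIC)
) CLUSTERED INTO 6 SHARDS WITH (number_of_replicas = 1)
"""

# Usage against a running node (left commented out so the sketch has no side effects):
# run_sql(CREATE_EVENTS)
# run_sql("INSERT INTO events (id, ts, payload) VALUES (?, ?, ?)",
#         ["e1", 1430000000000, {"source": "demo"}])
```

Keeping the request builder separate from the network call makes the payload easy to inspect and test without a database.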
Ruby
Benefits for Ops
• horizontally scalable, elastic
• high availability out of the box
• ease of use, simple operation
• massively parallel read & write
• incremental backup & restore
• on prem & cloud integrations &
containers
• manage TB-clusters for realtime queries
• available for Linux, Mac, Windows
• run in containers: Docker, CoreOS, Tutum, Containership
THANK YOU
Johannes Moser, Head of Integrations
@joemoeAT, johannes@crate.io

Basic Introduction to Crate @ ViennaDB Meetup
