DataStax Academy: What? Why? How?
DuyHai DOAN
Cassandra as Infrastructure Technology
•  ING (Cassandra as a service)
•  Netflix
•  Sony PlayStation Network
•  Microsoft Office 365
•  eBay (40 TB in a single table …)
•  Etc.
© 2015 DataStax, All Rights Reserved.
Cassandra as Infrastructure Technology
•  The SMACK stack as an alternative to the Hadoop stack for streaming
•  Spark
•  Mesos
•  Akka
•  Cassandra
•  Kafka
•  Read @helenaedelson's slides here: http://goo.gl/cCIE7F
Rich ecosystem around Cassandra
•  Apache Spark (C* connector)
•  Apache Zeppelin (C* interpreter)
•  Apache Mesos (https://github.com/mesosphere/cassandra-mesos)
•  Apache Kafka (KIP-30)
•  Apache Shiro (C* as cluster session store)
•  Hunk, JasperSoft, Pentaho, Tableau ..
Increasing SQL-like features
•  CQL DML (SELECT, INSERT, UPDATE, DELETE …)
•  CQL DDL CREATE/ALTER/DROP (SCHEMA, TABLE, TYPE, FUNCTION …)
•  CQL Credentials
•  CREATE/ALTER/DROP (USER, ROLE)
•  GRANT <xxx> PERMISSION ON <resource> TO <user_name>
•  REVOKE <xxx> PERMISSION ON <resource> FROM <user_name>
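As a hedged illustration of the role and permission statements above, here is a minimal sketch in the Cassandra 2.2+ role syntax. The keyspace `demo`, the role `analyst`, and the password are hypothetical, and the sketch assumes the cluster is configured with password authentication and an authorizer that enforces permissions.

```cql
-- Hypothetical role that is allowed to log in
CREATE ROLE IF NOT EXISTS analyst
    WITH PASSWORD = 'change_me' AND LOGIN = true;

-- Give the role read-only access to every table of the (hypothetical) keyspace
GRANT SELECT ON KEYSPACE demo TO analyst;

-- Take the permission back
REVOKE SELECT ON KEYSPACE demo FROM analyst;
```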
Increasing SQL-like features
•  User Defined Functions
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] keyspace.function_name ( <arg-name> <arg-type> )
(CALLED | RETURNS NULL) ON NULL INPUT
RETURNS <type>
LANGUAGE <language>
AS <body>
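A concrete instance of this template might look like the sketch below. The keyspace `demo`, the function `max_of`, and the table `demo.pairs` are hypothetical names, and the sketch assumes user-defined functions have been enabled on the nodes (in Cassandra 2.2/3.x they are turned off by default in cassandra.yaml).

```cql
-- Hypothetical UDF: returns the larger of two ints.
-- RETURNS NULL ON NULL INPUT means the body is skipped when any argument is null.
CREATE OR REPLACE FUNCTION demo.max_of (a int, b int)
    RETURNS NULL ON NULL INPUT
    RETURNS int
    LANGUAGE java
    AS 'return Math.max(a, b);';

-- Usage against a hypothetical table demo.pairs (id uuid PRIMARY KEY, x int, y int)
SELECT id, demo.max_of(x, y) FROM demo.pairs;
```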
Increasing SQL-like features
•  Materialized Views
CREATE MATERIALIZED VIEW [IF NOT EXISTS] keyspace_name.view_name AS
SELECT column1, column2, ...
FROM keyspace_name.table_name
WHERE column1 IS NOT NULL AND column2 IS NOT NULL ...
PRIMARY KEY(column1, column2, ...)
•  Real time notifications (CDC) CASSANDRA-8844
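To make the materialized view template concrete, here is a minimal sketch assuming Cassandra 3.0 or later. The keyspace `demo`, the table `users`, and the view `users_by_country` are hypothetical. Note that the view's primary key has to contain every primary key column of the base table, and each of its columns needs an IS NOT NULL restriction.

```cql
-- Hypothetical base table, keyed by user id
CREATE TABLE IF NOT EXISTS demo.users (
    id      uuid PRIMARY KEY,
    country text,
    name    text
);

-- View over the same data, partitioned by country so it can be queried by country
CREATE MATERIALIZED VIEW IF NOT EXISTS demo.users_by_country AS
    SELECT id, country, name
    FROM demo.users
    WHERE country IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (country, id);

-- Usage: read from the view like a regular table
SELECT id, name FROM demo.users_by_country WHERE country = 'FR';
```

Cassandra keeps the view in sync with the base table on every write, so the application no longer has to maintain a second, denormalized table by hand.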
More powerful search in the future
•  Apple open-sourced its secondary index implementation (SASI)
•  https://github.com/xedin/sasi
CREATE CUSTOM INDEX ON sasi (bio) USING
'org.apache.cassandra.db.index.SSTableAttachedSecondaryIndex'
WITH OPTIONS = {
'analyzer_class': 'org.apache.cassandra.db.index.sasi.analyzer.StandardAnalyzer',
'tokenization_enable_stemming': 'true',
'analyzed': 'true',
'tokenization_normalize_lowercase': 'true',
'tokenization_locale': 'en'
};
More powerful search in the future
•  Apple open-sourced its secondary index implementation (SASI)
•  https://github.com/xedin/sasi
SELECT *
FROM sasi
WHERE (created_at > 1442959315018 OR first_name = 'P')
AND age > 26
ALLOW FILTERING;
More powerful search in the future
•  Limited to the 2.0.x branch
•  Needs a special patch to the OSS code
•  Supports only COMPACT STORAGE tables
•  Only compatible with Murmur3Partitioner
•  CASSANDRA-10661 to merge into Cassandra 3.0!
•  GitHub issue to support full CQL3 (https://github.com/xedin/sasi/issues/3)
DataStax Gartner reports (Operational DB)
•  Oct 2013
•  Oct 2014
•  Oct 2015
Cassandra Job Trend
Cassandra Job Offers (I’ve received)
Problem?
•  Lack of Cassandra skills
•  Difficulty hiring Cassandra experts
Solution: https://academy.datastax.com
Self-Paced Courses
FREE
Instructor-Led Training
FREE
O’Reilly Certification
Technical Evangelists
•  On-site help, data modeling, cluster health checks
•  duy_hai.doan@datastax.com, @doanduyhai
FREE
From the Devs & Ops perspective
Cassandra is mainstream + You are trained & certified = Career Boost
academy.datastax.com
