Apache Cassandra Lunch #75: Getting Started with DataStax Enterprise on Docker

Version 1.0
Getting Started with DataStax
Enterprise (DSE) on Docker
In Cassandra Lunch #75, we are going to look at getting
started with DataStax Enterprise on Docker.
Isaac Omolayo
Jr. Software Engineer
@Anant

Getting Started with DataStax Enterprise
● What is DataStax Enterprise?
● Packages and capabilities of DataStax Enterprise
● Using the DSE Search to solve data problems
● Using DSE Analytics (Spark) to handle data workloads
● Using the DataStax Enterprise Graph
● Running DataStax Enterprise packages on Docker
● Working with DSE Studio, DSE Search, DSE Analytics, DSE Graph

What is DataStax Enterprise?
● DataStax Enterprise helps enterprises build transformational data architectures for applications, microservices, and
other use cases. The goals are data sovereignty, availability, scalability, agility, and accessibility by any user
● DataStax Enterprise (DSE) is built on Apache Cassandra
● DataStax positions DSE as the world’s most scalable database, known for 100% uptime and very low latency
● DSE has the ability to handle and manage massive data at planetary scale
● DataStax Enterprise is a cohesive data management platform
● You can handle different workloads for different use cases through the integrated DSE Graph, DSE Analytics, and
DSE Search

Packages and capabilities of DataStax Enterprise
● There are different packages that come together to form the DataStax Enterprise ecosystem
○ DataStax OpsCenter
○ DataStax Studio
○ DataStax Enterprise
○ DataStax Enterprise Search
○ DataStax Enterprise Analytics with Spark
○ DataStax Enterprise Graph, etc.

DataStax Enterprise with Search
● DSE Search allows you to quickly find data and provides a more flexible search experience for your users
● With DSE Search you can easily create features like product catalogs, document repositories, and ad-hoc reporting engines
● Data is written to the database first, and the indexes are updated afterwards. You must create a search index on your data
to enable search capabilities
● The benefits of running enterprise search functions through DataStax Enterprise and DSE Search include:
○ DSE Search is backed by a scalable database, and the connections and packages are fully integrated
○ A persistent store for search indexes
○ You can easily examine and aggregate data in real-time using CQL
○ Supports indexing and querying of advanced data types, including tuples and user-defined types (UDT)
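
The write-then-index flow on this slide can be sketched as a short sequence of CQL statements. The Python snippet below only assembles the statement text and does not connect to a cluster. The keyspace `demo`, the table `products`, and the column names are hypothetical, and the keyspace is assumed to exist already. The statements rely on DSE's `CREATE SEARCH INDEX` command and the `solr_query` column, so check them against the DSE version you run.

```python
from typing import List

# Hypothetical keyspace and table names, used only for illustration.
KEYSPACE = "demo"
TABLE = "products"


def search_demo_statements(keyspace: str, table: str) -> List[str]:
    """Return CQL statements in the order the slide describes:
    create the table, write data, create the search index, then query it."""
    target = f"{keyspace}.{table}"
    return [
        # 1. The table lives in the database like any other Cassandra table.
        f"CREATE TABLE IF NOT EXISTS {target} "
        "(id int PRIMARY KEY, name text, category text);",
        # 2. Data is written to the database first.
        f"INSERT INTO {target} (id, name, category) "
        "VALUES (1, 'laptop', 'electronics');",
        # 3. A search index must exist before search queries will work.
        f"CREATE SEARCH INDEX IF NOT EXISTS ON {target};",
        # 4. Search queries go through CQL using the solr_query column.
        f"SELECT * FROM {target} WHERE solr_query = 'category:electronics';",
    ]


if __name__ == "__main__":
    for statement in search_demo_statements(KEYSPACE, TABLE):
        print(statement)
```

The printed statements could be pasted into `cqlsh` on a node started with the Search workload enabled.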

DataStax Enterprise with Analytics (Spark)
● DSE integrates real-time and batch operational analytics capabilities through its integration with Apache Spark
● With DSE Analytics you can easily generate reports, target customers, and process real-time streams of data
● Take care when enabling both Search and Analytics on the same DSE node
● Provision sufficient memory and compute resources to accommodate the specific indexing, query, and processing
workloads of the use case
● Spark is the default mode when you start an analytics node in a packaged installation. Spark runs locally on each node

DataStax Enterprise with Analytics (Spark)
● DSE Analytics includes integration with Apache Spark, the framework that supports the analytics
applications. Use DSE Analytics to analyze huge databases
● Spark is a distributed computation engine designed for big data and in-memory processing
● Features of DSE Analytics
○ Spark Master management
○ Analytics without ETL
○ DataStax Enterprise file system (DSEFS)
○ DSE Analytics Solo
○ Integrated security
○ AlwaysOn SQL

DataStax Enterprise with Graph
● DSE Graph is built on top of Apache TinkerPop, Apache Cassandra, Apache Solr, and Apache Spark
● DSE Graph uses Apache TinkerPop standards for data and traversal while also using Apache Cassandra for scalable storage
and retrieval
● DSE Graph supports both transactional and analytic workloads, using two different engines
○ OLAP: Online analytical processing (OLAP) is typically used to perform multidimensional analysis of data
■ Complex calculations on aggregated historical data
○ OLTP: Online transactional processing (OLTP) is characterized by a large number of short, online transactions for
very fast query processing
■ OLTP is typically used for data entry and retrieval with transaction-oriented applications
■ OLTP queries are best for questions that require access to a limited subset of the entire graph
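
The difference between the two query styles can be illustrated with a toy example. The sketch below is plain Python over a made-up in-memory adjacency list. It does not use DSE Graph, Gremlin, or any DataStax API, and the vertex names are invented. It only shows that an OLTP-style question touches a small neighbourhood of one vertex, while an OLAP-style question scans the whole graph to build an aggregate.

```python
from collections import Counter
from typing import Dict, List

# A tiny, made-up "knows" graph stored as an adjacency list.
GRAPH: Dict[str, List[str]] = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": [],
    "dave": ["alice"],
}


def friends_of(graph: Dict[str, List[str]], person: str) -> List[str]:
    """OLTP-style question: start at one vertex and read only its neighbours."""
    return sorted(graph.get(person, []))


def in_degree_counts(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """OLAP-style question: scan every edge in the graph to build an aggregate."""
    counts = Counter({vertex: 0 for vertex in graph})
    for neighbours in graph.values():
        counts.update(neighbours)  # one increment per incoming edge
    return dict(counts)


if __name__ == "__main__":
    print(friends_of(GRAPH, "alice"))
    print(in_degree_counts(GRAPH))
```

In a real deployment the first kind of query would typically run against the transactional engine and the second against the analytic engine.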

DataStax Enterprise with Graph
● All the DataStax Enterprise components are integrated into DSE Graph to form a real-time graph database management
system
● It has built-in DSE Analytics and DSE Search functionality, visual management and monitoring, and development tools,
including DataStax Studio

Running DataStax Enterprise packages on Docker
● Install Docker on your machine
● Pull the required DataStax Enterprise Docker images
● Set up DSE Search, DSE Analytics, and DSE Graph in Docker containers
● Connect to the running Docker containers
● Create a table in Cassandra using CQL
● Create and access a search index on the table
● Transform the Cassandra table with Spark (Scala) using DSE Analytics
● Access the table in DataStax Studio
● Use the DSE Graph to query the data
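
The first few steps above can be sketched as Docker commands. The Python snippet below only builds and prints the command lines and does not run Docker. It assumes the image names `datastax/dse-server` and `datastax/dse-studio`, the `DS_LICENSE=accept` environment variable, and the `-s`, `-k`, and `-g` workload flags as documented for the public DataStax Docker images. The container names `my-dse` and `my-studio` are hypothetical, and no image tag is pinned, so verify everything against the image version you pull.

```python
import shlex
from typing import List

# Assumed image names from the public DataStax Docker images.
DSE_IMAGE = "datastax/dse-server"
STUDIO_IMAGE = "datastax/dse-studio"


def dse_server_command(name: str, search: bool, analytics: bool, graph: bool) -> List[str]:
    """Build a `docker run` command that starts one DSE node with the chosen workloads."""
    command = ["docker", "run", "-e", "DS_LICENSE=accept",
               "--name", name, "-d", DSE_IMAGE]
    if graph:
        command.append("-g")  # enable DSE Graph
    if search:
        command.append("-s")  # enable DSE Search
    if analytics:
        command.append("-k")  # enable DSE Analytics (Spark)
    return command


def studio_command(name: str, dse_container: str) -> List[str]:
    """Build a `docker run` command for DataStax Studio linked to the DSE container."""
    return ["docker", "run", "-e", "DS_LICENSE=accept",
            "--link", dse_container, "-p", "9091:9091",
            "--name", name, "-d", STUDIO_IMAGE]


def cqlsh_command(dse_container: str) -> List[str]:
    """Build a `docker exec` command that opens cqlsh inside the DSE container."""
    return ["docker", "exec", "-it", dse_container, "cqlsh"]


if __name__ == "__main__":
    # Hypothetical container names, for illustration only.
    for cmd in (
        dse_server_command("my-dse", search=True, analytics=True, graph=True),
        studio_command("my-studio", "my-dse"),
        cqlsh_command("my-dse"),
    ):
        print(shlex.join(cmd))
```

Running all three workloads on a single container is convenient for a demo, but it needs generous memory, in line with the earlier note about provisioning for combined Search and Analytics nodes.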

Demo
● https://github.com/yTek01/Getting-Started-with-DSE-on-Docker

Resources
● https://docs.datastax.com/en/dse/6.7/dse-admin/datastax_enterprise/newFeatures.html
● https://docs.datastax.com/en/dse/6.7/dse-dev/datastax_enterprise/dseGettingStarted.html
● https://docs.datastax.com/en/dse/6.7/dse-arch/datastax_enterprise/dbArch/archGraphSimilarDiff.html
● https://github.com/datastax/docker-images
● https://github.com/roberd13/Getting-Started-With-DSE-and-Docker
● https://docs.docker.com/engine/install/

Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037
