What’s new in MariaDB AX,
the database for modern analytics
& data warehousing
Dipti Joshi
Director Product Management
MariaDB
Conrad Hotel, New York City
February 26–27, 2018
m18.mariadb.com
M|18, the second annual MariaDB user conference, is where global MariaDB experts and
practitioners meet to exchange ideas, best practices and success stories. Join us in NYC to
share journeys on open source strategies and infrastructure modernization with MariaDB.
-2017 SPONSORS-
YOTTA ZETTA EXA
MariaDB AX
Analytics made easy –
simple, fast, scalable…
and open source
Customer Use Case
Industry: healthcare (Medicaid)
Data: surveys
Use case: decision support system
Details:
1. Identify trends and patterns
2. Determine population cohorts
3. Predict health outcomes
4. Anticipate funding / capacity
5. Recommend intervention
Can’t do complex queries on current
hardware with Oracle and snowflake
schemas
Limited to optimizing for simple, known
queries (2-3 columns)
Replaced with ColumnStore
> a single table
> 2.5 million rows, 248 columns
> complex, ad-hoc queries
> query 20+ columns in seconds
Customer Use Cases
By industry
Finance
Identify trade patterns
Detect fraud and anomalies
Predict trading outcomes
Manufacturing
Simulations to improve design/yield
Detect production anomalies
Predict machine failures (sensor data)
Telecom
Behavioral analysis of customer calls
Network analysis (perf and reliability)
Healthcare
Find genetic profiles/matches
Analyze health vs spending
Predict viral outbreaks
MariaDB AX
MariaDB Server
MariaDB MaxScale
MariaDB ColumnStore
Parallel queries
Distributed storage
No indexes
Automatic partitioning
Read optimized
High compression
Low disk IO
[Diagram: MariaDB MaxScale in front of MariaDB Server (ColumnStore) nodes, which sit on top of ColumnStore Storage nodes]
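To illustrate why a column-oriented layout keeps disk IO low for analytical queries, here is a minimal Python sketch. It is a toy in-memory model under stated assumptions, not ColumnStore's actual storage format; the function names `to_columns` and `values_read` and the sample rows are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, int]


def to_columns(rows: List[Row]) -> Dict[str, List[int]]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, List[int]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_read(num_rows: int, total_columns: int, queried_columns: int) -> Dict[str, int]:
    """Count values touched by a full scan in each layout.

    A row store reads every column of every row; a column store reads
    only the columns the query names.
    """
    return {
        "row_store": num_rows * total_columns,
        "column_store": num_rows * queried_columns,
    }


# Hypothetical sample data: three rows, four columns.
rows = [
    {"id": 1, "age": 34, "visits": 2, "cost": 120},
    {"id": 2, "age": 51, "visits": 5, "cost": 480},
    {"id": 3, "age": 29, "visits": 1, "cost": 60},
]
columns = to_columns(rows)

# An aggregate such as SUM(cost) needs only the "cost" column.
total_cost = sum(columns["cost"])
```

Applied to the healthcare table from the customer use case (2.5 million rows, 248 columns), a query over 20 columns would touch roughly 8% of the values under this simplified model.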
MariaDB AX
Goals
1. Expand high availability/disaster recovery options
2. Make it easier to perform custom, complex analytics
3. Streamline and simplify the process of ingesting data
MariaDB AX
What was there
Manual import
Manual backup/restore
Window functions
Aggregate functions
User-defined functions
Cross-engine joins
ColumnStore Storage
MariaDB Server
ColumnStore
InnoDB
Applications / Spark
MariaDB MaxScale
MariaDB AX
What’s new
MariaDB ColumnStore 1.1
Streaming data adapters
Bulk data adapters
User defined
Window functions
Distributed aggregates
Spark support
Phase I: SQL (JDBC)
Phase II: data adapters
High availability
Local storage (GlusterFS)
Parallel backup/restore
ColumnStore Storage
Backup/Restore GlusterFS
MariaDB Server
ColumnStore
Applications / Spark
Bulk Data Adapters
User Defined Window Functions
Streaming Data Adapters
User Defined Aggregate Functions
MariaDB MaxScale
What’s new in MariaDB AX
INGESTION: Applications, Apache Kafka, MariaDB MaxScale
ANALYTICS: User-defined aggregate and window functions
HA / DR: GlusterFS support, Parallel backup/restore
DATA TYPES: Text, BLOB columns
SECURITY: Auditing
CERTIFICATION: Tableau
Extend high availability/disaster recovery options
High availability
GlusterFS can replicate files within a volume - HA without the need for an expensive SAN
ColumnStore storage nodes can read other files within a volume - simple, automatic failover
[Diagram: two MariaDB Server (ColumnStore) nodes and three ColumnStore Storage nodes (dbroot1, dbroot2, dbroot3) on a replicated GlusterFS volume; the bricks hold /dbroot1 /dbroot2, /dbroot2 /dbroot3 and /dbroot3 /dbroot1]
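The replica layout on this slide (each dbroot stored on two bricks) can be sketched as a small failover lookup. This is an illustrative model only, assuming the bricks pair up as /dbroot1 /dbroot2, /dbroot2 /dbroot3 and /dbroot3 /dbroot1 and that each storage node normally serves the dbroot it is labelled with; the node names and the `serving_node` function are hypothetical and do not reflect how GlusterFS or ColumnStore implement failover.

```python
from typing import Dict, List, Optional, Set

# Assumed brick contents per storage node (node names are hypothetical).
BRICKS: Dict[str, List[str]] = {
    "node1": ["dbroot1", "dbroot2"],
    "node2": ["dbroot2", "dbroot3"],
    "node3": ["dbroot3", "dbroot1"],
}

# Assumed normal owner of each dbroot.
PRIMARY: Dict[str, str] = {
    "dbroot1": "node1",
    "dbroot2": "node2",
    "dbroot3": "node3",
}


def serving_node(dbroot: str, failed: Set[str]) -> Optional[str]:
    """Return the node that can serve dbroot, preferring its primary owner."""
    primary = PRIMARY[dbroot]
    if primary not in failed:
        return primary
    # Fall back to any surviving node whose brick holds a replica.
    for node, held in BRICKS.items():
        if node not in failed and dbroot in held:
            return node
    return None  # every copy of this dbroot is unavailable


def assignments(failed: Set[str]) -> Dict[str, Optional[str]]:
    """Map every dbroot to the node serving it under the given failures."""
    return {dbroot: serving_node(dbroot, failed) for dbroot in PRIMARY}


healthy = assignments(set())
after_node1_failure = assignments({"node1"})
```

With one node down every dbroot still has a surviving copy; losing two nodes leaves one dbroot without a replica in this three-brick layout.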
[Diagram, failover case: the dbroot1 storage node and its brick are no longer shown; the remaining bricks on the GlusterFS volume hold /dbroot2 /dbroot3 and /dbroot3 /dbroot1, with ColumnStore Storage (dbroot2) and (dbroot3) still present]
Parallel backup/restore
Parallel backup/restore using rsync - faster backup and restore
Support incremental backup and restore - faster backup and restore
Consolidate data from multiple storage nodes in a single backup location - simplified, automatic backups and restores
[Diagram: the backup and restore tool runs rsync /data1/*, rsync /data2/* and rsync /data3/* against three ColumnStore Storage nodes]
Backup locations shown:
/home/user/columnstoreBackupData/pm1dbroot1
/home/user/columnstoreBackupData/pm2dbroot2
/home/user/columnstoreBackupData/pm3dbroot3
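The idea of copying each storage node's data concurrently into one consolidated location can be sketched as follows. This is not the MariaDB backup/restore tool or its interface: `shutil.copytree` stands in for the rsync calls (so it copies everything rather than incrementally), and the function names, file name and temporary directories are assumptions made for illustration.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict


def backup_node(source: Path, destination: Path) -> Path:
    """Copy one node's data directory (stand-in for an rsync call)."""
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return destination


def parallel_backup(sources: Dict[str, Path], backup_root: Path) -> Dict[str, Path]:
    """Back up every node concurrently into one consolidated location."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {
            name: pool.submit(backup_node, src, backup_root / name)
            for name, src in sources.items()
        }
        # result() re-raises any copy error from the worker thread.
        return {name: future.result() for name, future in futures.items()}


# Demo on throwaway directories that mimic three storage nodes.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sources = {}
    for i in (1, 2, 3):
        src = root / f"data{i}"
        src.mkdir()
        (src / "part.dat").write_text(f"node {i}")
        sources[f"pm{i}dbroot{i}"] = src
    result = parallel_backup(sources, root / "columnstoreBackupData")
    copied = {name: (path / "part.dat").read_text() for name, path in result.items()}
```

Threads are enough here because the work is IO-bound; a restore would run the same copies in the opposite direction.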
Make it easier to perform custom, complex analytics
User-defined
aggregate and
window functions
User-defined distributed
aggregate functions - custom
analytical functions and
better performance
User-defined window functions
Example: calculate a weighted
sum (revenue)
$1-10 (0.5)
$11-100 (1.0)
$100+ (1.5)
[Diagram: two MariaDB Server (ColumnStore) nodes above three ColumnStore Storage nodes, each showing its column values and a partial WSUM]
Storage node 1
Column   WSUM
$10      $5
$100     $100
$200     $300
WSUM = $405
Storage node 2
Column   WSUM
$4       $2
$8       $4
$20      $20
WSUM = $26
Storage node 3
Column   WSUM
$12      $6
$60      $60
$300     $450
WSUM = $516
Total: WSUM = $947
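The weighted-sum example can be sketched as a distributed aggregate: each storage node computes a partial result over its own values and the partials are then combined. This is a plain Python illustration, not the ColumnStore user-defined aggregate interface; the function names are hypothetical, and the tier boundaries are read from the slide as "up to $10", "up to $100" and "above $100".

```python
from typing import Iterable, List


def weight(amount: float) -> float:
    """Weight tiers from the slide: $1-10 -> 0.5, $11-100 -> 1.0, above -> 1.5."""
    if amount <= 10:
        return 0.5
    if amount <= 100:
        return 1.0
    return 1.5


def partial_wsum(values: Iterable[float]) -> float:
    """Partial aggregate computed locally on one storage node."""
    return sum(value * weight(value) for value in values)


def combine(partials: Iterable[float]) -> float:
    """Final aggregate: partial weighted sums simply add up."""
    return sum(partials)


# Column values per storage node, as shown on the slide.
nodes: List[List[float]] = [
    [10, 100, 200],
    [4, 8, 20],
    [12, 60, 300],
]
partials = [partial_wsum(values) for values in nodes]
total = combine(partials)
```

The first two partials match the slide (405 and 26). For the third node the slide lists $6 for the $12 value, whereas under the stated tiers $12 falls in the $11-100 band, so this sketch yields 522 for that node and 953 overall rather than 516 and 947.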
Streamline and simplify the process of data ingestion
Motivation
Organizations need to make data available for analysis as soon as it arrives
Machine learning results need to be stored where other business/data analysts work with them
Time to insight and time to action are now competitive differentiators for businesses
Bulk data adapters
Applications can use bulk data
adapters to collect and write
data - on-demand data
loading
Bypass SQL interface, parser
and optimizer - faster writes
C++
Python
Java
More on the way
[Diagram: an Application uses the Bulk Data Adapter, which calls the Write API on each of three ColumnStore Storage nodes, alongside MariaDB Server (ColumnStore) nodes]
1. For each row
a. For each column
i. bulkInsert->setColumn
b. bulkInsert->writeRow
2. bulkInsert->commit
* Buffer 100,000 rows by default
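The steps above can be sketched in Python with a small stand-in class. `BulkInsertSketch` is hypothetical and only mirrors the setColumn / writeRow / commit flow and the 100,000-row default buffer mentioned on the slide; it is not the adapter's real API, and the `sink` callback is an assumption standing in for the write to the storage nodes.

```python
from typing import Any, Callable, Iterable, List, Sequence

Row = List[Any]


class BulkInsertSketch:
    """Hypothetical stand-in mirroring the setColumn / writeRow / commit steps."""

    def __init__(self, num_columns: int, sink: Callable[[List[Row]], None],
                 buffer_rows: int = 100_000) -> None:
        self._num_columns = num_columns
        self._sink = sink                  # receives each flushed batch of rows
        self._buffer_rows = buffer_rows
        self._current: Row = [None] * num_columns
        self._buffer: List[Row] = []

    def set_column(self, index: int, value: Any) -> "BulkInsertSketch":
        self._current[index] = value
        return self

    def write_row(self) -> "BulkInsertSketch":
        self._buffer.append(self._current)
        self._current = [None] * self._num_columns
        if len(self._buffer) >= self._buffer_rows:
            self._flush()                  # buffer full: send the batch
        return self

    def commit(self) -> None:
        self._flush()                      # send whatever is still buffered

    def _flush(self) -> None:
        if self._buffer:
            self._sink(self._buffer)
            self._buffer = []


def load(rows: Iterable[Sequence[Any]], bulk: BulkInsertSketch) -> None:
    """1. for each row: a. set each column, b. write the row; 2. commit."""
    for row in rows:
        for index, value in enumerate(row):
            bulk.set_column(index, value)
        bulk.write_row()
    bulk.commit()


# Demo with a tiny buffer so the batching is visible.
batches: List[List[Row]] = []
load([(1, "a"), (2, "b"), (3, "c")], BulkInsertSketch(2, batches.append, buffer_rows=2))
```

In this sketch the first two rows are flushed as one batch when the buffer fills and the third is sent on commit; rows never pass through a SQL parser or optimizer, which is the point the slide makes about faster writes.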
Customer Use Case
Industry: biotechnology (genetics)
Data: genotypes
Use case: genetic profiling
Details:
1. Find genetic mates for cattle
2. Predict meat production
3. Gene/DNA analysis
Had to convert to CSV files and schedule
import jobs (cron)
Always receiving new genetic data
Migrated to data adapter (Python)
> streamline import process
> remove steps / possible error
> remove delays
> import data on demand
> immediate customer access
Streaming data
adapters – MaxScale
CDC
Stream all writes from
MariaDB TX to MariaDB AX
automatically and continuously
- ensure analytical data is
up to date and not stale, no
need for batch jobs,
manual processes or
human intervention
[Diagram: MariaDB Server (InnoDB) -> MariaDB MaxScale (CDC Server) -> Streaming Data Adapter (CDC Client) -> Write API on three ColumnStore Storage nodes, alongside MariaDB Server (ColumnStore) nodes]
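A rough sketch of what continuous change streaming involves on the client side: change events arrive in order and each is applied once to the analytical copy, even if the stream is replayed after a reconnect. Everything here is an assumption for illustration; the event shape (`sequence`, `row`), the class name and the append-only handling are hypothetical and are not the MaxScale CDC protocol or the adapter's implementation.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]


class CdcClientSketch:
    """Hypothetical CDC client: applies change events once, in order."""

    def __init__(self) -> None:
        self.last_sequence = 0                       # highest sequence already applied
        self.analytics_rows: List[Dict[str, Any]] = []   # stand-in for the analytical table

    def apply(self, events: Iterable[Event]) -> int:
        """Append unseen events to the analytical copy; return how many were applied."""
        applied = 0
        for event in events:
            if event["sequence"] <= self.last_sequence:
                continue                             # already streamed, e.g. after a reconnect
            self.analytics_rows.append(event["row"])
            self.last_sequence = event["sequence"]
            applied += 1
        return applied


# Hypothetical writes captured on the transactional side.
events = [
    {"sequence": 1, "row": {"order_id": 10, "amount": 25}},
    {"sequence": 2, "row": {"order_id": 11, "amount": 40}},
    {"sequence": 3, "row": {"order_id": 12, "amount": 15}},
]

client = CdcClientSketch()
first_pass = client.apply(events[:2])    # initial stream
second_pass = client.apply(events[1:])   # replay overlaps with event 2
```

Tracking the last applied sequence is one simple way to keep the analytical copy current without batch jobs or manual reloads.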
Streaming data
adapters – Apache
Kafka
Stream all messages published
to Apache Kafka topics to
MariaDB AX automatically and
continuously - enable data
from many sources to be
streamed and collected for
analysis without complex
code
[Diagram: Apache Kafka topics -> Streaming Data Adapter (Kafka Client) -> Write API on three ColumnStore Storage nodes, alongside MariaDB Server (ColumnStore) nodes]
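As an illustration of collecting messages from several topics into analytical tables, the sketch below decodes JSON messages and groups them into rows per target table. It uses plain `(topic, payload)` tuples instead of a Kafka client library, and the topic names, table names, column lists and `rows_by_table` function are all hypothetical; the real adapter's configuration and message format may differ.

```python
import json
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

Message = Tuple[str, bytes]

# Hypothetical mapping: topic -> (target table, column order).
TOPIC_MAP: Dict[str, Tuple[str, List[str]]] = {
    "sensor-readings": ("analytics.readings", ["device", "temp"]),
    "page-views": ("analytics.views", ["user", "url"]),
}


def rows_by_table(messages: Iterable[Message]) -> Dict[str, List[List[Any]]]:
    """Decode JSON messages and group them into rows per target table."""
    tables: Dict[str, List[List[Any]]] = defaultdict(list)
    for topic, payload in messages:
        if topic not in TOPIC_MAP:
            continue                      # ignore topics with no mapping
        table, columns = TOPIC_MAP[topic]
        record = json.loads(payload)
        # Missing fields become None so every row has the same width.
        tables[table].append([record.get(name) for name in columns])
    return dict(tables)


# Stand-in for messages polled from a consumer.
messages: List[Message] = [
    ("sensor-readings", b'{"device": "a1", "temp": 21.5}'),
    ("page-views", b'{"user": "u7", "url": "/home"}'),
    ("sensor-readings", b'{"device": "b2"}'),
    ("unmapped-topic", b'{"x": 1}'),
]
grouped = rows_by_table(messages)
```

Each per-table list could then be handed to a bulk write in one batch, which keeps the per-source code down to a mapping entry.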
The big picture – putting it all together
Analytics
Operations
Ingestion
Apache Kafka
Streaming Data Adapters
Data Services
Bulk Data Adapters
Spark / Python / ML
Bulk Data Adapters
Transaction (OLTP)
MariaDB Server
InnoDB
MariaDB MaxScale
Web/Mobile Services
MariaDB MaxScale
Analytics (OLAP)
MariaDB Server
ColumnStore
Demo
TX to AX data streaming
Resources
Documentation
https://mariadb.com/kb/en/library/mariadb-columnstore/
Blogs
https://mariadb.com/blog-tags/columnstore
https://mariadb.com/blog-tags/big-data
Download
MariaDB ColumnStore 1.1
https://mariadb.com/downloads/mariadb-ax
Bulk Data Adapters and Streaming Data Adapters
https://mariadb.com/downloads/mariadb-ax/data-adapters
MariaDB ColumnStore Backup/Restore Tool
https://mariadb.com/downloads/mariadb-ax/tools-ax
Reach me
dipti.joshi@mariadb.com
What’s new in MariaDB AX
Summary
Improved HA/DR
GlusterFS support
Parallel backup/restore
Streamlined data ingestion
Streaming data adapters
Bulk data adapters
Complex, custom analytics
User-defined aggregate functions
User-defined window functions
Text and binary columns
Spark integration (in-progress)
JDBC (SQL)
Direct (data adapter)
Thank you
