#MDBW17
MongoDB BI Connector & Tableau
JUMPSTART
RONÁN BOHAN
Solutions Architect, MongoDB
ronan@mongodb.com
VAIDY KRISHNAN
Sr Manager, Technology Partnerships, Tableau
vkrishnan@tableau.com
People who know the data should ask the questions
Software should be designed for deeper thinking
Analytics at scale can drive change
Connectivity: Access to all data
Performance: Fast interaction with all data
Discovery: Finding the right data
Big Data Focus
Analytics for All your Data
Broad access to Big Data platforms
Visual analytics without coding
Platform query performance
Consistent visual interface
Hybrid data architecture
Why enterprise customers choose Tableau
• Connect to almost any data (esp. MongoDB)
• Fast direct connections, fast in-memory
• Ease of use/rapid adoption times
• Speed of deployment
• Pace of innovation
• Enterprise security (Kerberos, AD, etc.)
• A leader in the Gartner Magic Quadrant
• The leader in visual analytics
SO WHAT IF?
MongoDB  Tableau
HOUSTON, WE HAVE A PROBLEM
Flexible Schema
MQL
Structured Tabular Data
SQL
?
RELATIONAL VS DOCUMENT MODEL
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: {
    type: 'Point',
    coordinates: [45.123, 47.232]
  },
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
MongoDB  RDBMS
WHAT’S THE SOLUTION?
HOUSTON, WE HAVE A SOLUTION!
MongoDB  Tableau
BI Connector
SQL-99 SELECT Compatible
MySQL Wire Protocol
MQL
DEMO
UNDER THE COVERS
TWO COMPONENTS
MongoDRDL: Schema Generator (mongodrdl)
MongoSQLD: Daemon Process (mongosqld)
SCHEMA FILE GENERATION
mongodrdl  DRDL  DRDL  DRDL
One-time Setup
BI-CONNECTOR IN ACTION
mongosqld
MongoDB  Tableau
DRDL  DRDL  DRDL
Runtime
SCHEMA FILES
DRDL: Document / Relational Definition Language
Provides a mapping between MongoDB Rich Documents and Structured Tabular Data
DRDL DEFINITION FILE
schema:
- db: owners
  tables:
  - table: people
    collection: people
    columns:
    - Name: first_name
      MongoType: string
      SqlName: name
      SqlType: varchar
    - Name: …
      MongoType: …
YAML file
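To make the column mapping concrete, here is a minimal Python sketch of what one DRDL column entry expresses: read the MongoDB field given by `Name` and expose it under the SQL column given by `SqlName`. The helper names `get_path` and `project_row` are hypothetical and exist only for this illustration; this is not how mongosqld is implemented.

```python
def get_path(doc, dotted_name):
    """Resolve a dotted field name such as 'location.type' inside a nested document."""
    current = doc
    for part in dotted_name.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # missing fields surface as NULL-like values
        current = current[part]
    return current


def project_row(doc, columns):
    """Build one SQL-style row from a document using DRDL-like column entries."""
    return {col["SqlName"]: get_path(doc, col["Name"]) for col in columns}


# Column entry copied from the slide; only the first column is spelled out there.
columns = [
    {"Name": "first_name", "MongoType": "string", "SqlName": "name", "SqlType": "varchar"},
]

row = project_row({"first_name": "Paul", "surname": "Miller"}, columns)
print(row)
```

Fields with no column entry (here `surname`) simply do not appear in the projected row.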
MAPPINGS
Compound Document Type   Mapping
Simple Field             Mapped as is
Embedded Documents       Flattened using a '.' separator
Embedded Arrays          Array elements unwound into separate 'rows'
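The three mapping rules can be sketched in a few lines of Python. This is an illustration of the idea only, not the BI Connector's code: `flatten` and `unwind` are hypothetical helpers, the `cars_idx` position column is an assumed naming choice, and the sample document drops the fields elided with "…" on the earlier slide.

```python
def flatten(doc, prefix=""):
    """Flatten embedded documents into dotted column names (arrays are kept as-is)."""
    row = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            row.update(flatten(value, prefix=name + "."))
        else:
            row[name] = value
    return row


def unwind(doc, array_field):
    """Return one flattened row per element of doc[array_field]."""
    parent = flatten({k: v for k, v in doc.items() if k != array_field})
    rows = []
    for idx, element in enumerate(doc.get(array_field, [])):
        row = dict(parent)
        row[array_field + "_idx"] = idx  # assumed name for the array-position column
        if isinstance(element, dict):
            row.update(flatten(element, prefix=array_field + "."))
        else:
            row[array_field] = element
        rows.append(row)
    return rows


person = {
    "first_name": "Paul",
    "surname": "Miller",
    "city": "London",
    "location": {"type": "Point", "coordinates": [45.123, 47.232]},
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}

rows = unwind(person, "cars")
for row in rows:
    print(row["first_name"], row["location.type"], row["cars_idx"], row["cars.model"])
```

Each car becomes its own row, and the simple and embedded fields of the parent document are repeated on every row.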
SUPPORTED TYPES & CONVERSIONS
MongoDB Type           SQL Type
String                 varchar
Numeric                Homogeneous data types: most precise numeric type
                       Mixture of floating point & integer: numeric
Dates                  timestamp
ObjectID               varchar
UUID                   varchar
Geospatial             An array of numeric longitude-latitude coordinates
Heterogeneous Fields   Most frequently sampled type
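A rough Python sketch of two of the rules in this table: a field that mixes integers and floats maps to `numeric`, and a heterogeneous field takes the most frequently sampled type. The function `infer_sql_type` is hypothetical, the SQL names used for pure-integer and pure-float fields are placeholders (the slide only says "most precise numeric type"), and the `varchar` fallback for unlisted types is an assumption; the real sampling logic in mongodrdl may differ.

```python
from collections import Counter
from datetime import datetime

# Python type -> SQL type; the "int" and "float" names are illustrative placeholders.
SQL_TYPES = {str: "varchar", datetime: "timestamp", int: "int", float: "float"}


def infer_sql_type(samples):
    """Pick a SQL type for one field from sampled values, following the slide's rules."""
    types = [type(value) for value in samples if value is not None]
    if not types:
        return None  # nothing sampled, nothing to infer
    if set(types) == {int, float}:
        return "numeric"  # mixture of floating point and integer
    most_common_type, _count = Counter(types).most_common(1)[0]
    return SQL_TYPES.get(most_common_type, "varchar")  # fallback is an assumption


print(infer_sql_type(["a", "b", 3]))
print(infer_sql_type([1, 2.5, 3]))
print(infer_sql_type([datetime(2017, 6, 20)]))
```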
RECAP
mongodrdl  DRDL  DRDL  DRDL
One-time Setup
RECAP
mongosqld  DRDL  DRDL  DRDL  mongodrdl
Runtime
PERFORMANCE
PERFORMANCE
mongosqld
MongoDB  Tableau
PushDown
PERFORMANCE
• Leverages Aggregation Framework
‒ Makes use of latest MongoDB features
‒ Supports MongoDB 3.2, optimized for MongoDB 3.4
‒ Includes $lookup aggregation stage to support SQL Joins
• Full Index support
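As a conceptual illustration of pushdown, the sketch below builds the kind of aggregation pipeline that a simple SQL `GROUP BY` could be translated into, so that filtering and grouping run inside MongoDB rather than in the connector. The function `group_count_pipeline` is hypothetical, and the pipeline mongosqld actually generates for a given query may look different.

```python
def group_count_pipeline(group_field, match=None):
    """Build an aggregation pipeline equivalent to SELECT f, COUNT(*) ... GROUP BY f."""
    pipeline = []
    if match:
        # WHERE clause: filter first so that indexes can be used.
        pipeline.append({"$match": match})
    # GROUP BY clause: one output document per distinct value of group_field.
    pipeline.append({"$group": {"_id": "$" + group_field, "count": {"$sum": 1}}})
    return pipeline


# SQL: SELECT city, COUNT(*) FROM people WHERE surname = 'Miller' GROUP BY city
pipeline = group_count_pipeline("city", match={"surname": "Miller"})
print(pipeline)
```

With a driver such as PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)`.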
DEMO
SECURITY
FULLY SECURE
mongosqld
User: JaneDoe
Pass: <pwd>
User: JaneDoe
Pass: <pwd>
Full pass-through Authentication
SSL required  SSL optional
CLIENT SECURITY REQUIREMENTS
• Username adheres to MongoDB Connection String Format
‒ <username>
‒ <username>?source=<database>
‒ <username>?source=<database>&mechanism=SCRAM-SHA-1 | PLAIN
• Requires Clear Text Plugin
• Connection must use SSL
‒ Client SSL key, Client SSL cert, CA Cert (for self-signed certs)
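The three username forms above can be assembled with a small helper. This is a sketch only: `bi_username` is a hypothetical function, and it deliberately accepts just the combinations shown on the slide.

```python
ALLOWED_MECHANISMS = ("SCRAM-SHA-1", "PLAIN")  # the two mechanisms named on the slide


def bi_username(user, source=None, mechanism=None):
    """Build a username string in one of the three forms listed on the slide."""
    if mechanism is not None:
        if mechanism not in ALLOWED_MECHANISMS:
            raise ValueError("unsupported mechanism: " + mechanism)
        if source is None:
            # The slide only shows mechanism together with source.
            raise ValueError("mechanism requires a source database")
    result = user
    if source is not None:
        result += "?source=" + source
    if mechanism is not None:
        result += "&mechanism=" + mechanism
    return result


print(bi_username("JaneDoe"))
print(bi_username("JaneDoe", source="admin"))
print(bi_username("JaneDoe", source="admin", mechanism="PLAIN"))
```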
TABLEAU DATASOURCE CONNECTION
Tableau users can use a Tableau Datasource Connection file:
<?xml version='1.0' encoding='utf-8' ?>
<connection-customization class='mongodb' version='7.7' enabled='true'>
  <vendor name='mongodb' />
  <driver name='mongodb' />
  <customizations>
    <customization name='odbc-connect-string-extras'
      value='SSLKEY={/path_to_key/mongodb.key};SSLCERT={/path_to_cert/mongodb.crt};SSLCA={/path_to_CAcert/ca.crt};ENABLE_CLEARTEXT_PLUGIN=1;SSL_ENFORCE=1' />
  </customizations>
</connection-customization>
DEMO
airlines.json dataset obtained from the CORGIS Dataset Project at Virginia Tech
https://think.cs.vt.edu/corgis/json/index.html
BEST PRACTICE
BEST PRACTICES
Do's
• Model for use
• Index efficiently
• Index effectively
• Leverage Views
• Use latest Server Version
Don'ts
• Casts
• Cross-collection (until at least MongoDB 3.6)
‒ Non-equijoins
‒ Subqueries
COLD WARM HOT
[Chart: Data Size vs. Performance; regions labeled Large data (raw or prepared), Prepared data, Aggregated data]
Data Lakes
Store Everything and Anything
Unknown Questions with Unknown Answers
Unstructured / Data Mining / Data Science
Data Warehouses
Analytical Queries
Known Questions with Unknown Answers
Regularly Refreshed Business Concepts
Analytical Databases
Precomputed Aggregates
Known Questions and Known Answers
Analytical Dashboard Already Constructed
Cold, Warm, Hot Strategy
[Chart: Data Size vs. Performance; regions labeled Large data (raw or prepared), Prepared data, Aggregated data]
Cold, Warm, Hot Strategy with Optimized MongoDB
[Chart: Data Size vs. Performance; regions labeled Large data (raw or prepared), Prepared data, Aggregated data]
How do we see customers using Tableau on MongoDB?
• Use Case
‒ Data Exploration/Mining
‒ Ad-Hoc Report Conceptual Modeling
‒ Query directly/Explore Concepts to Migrate to Analytically Optimized Data Stores
Success Across Industries & Verticals
• Financial Services: Analyze ticks, tweets, satellite imagery, weather trends, and any other type of data to inform trading algorithms in real time.
• Government: Identify social program fraud within seconds based on program history, citizen profile, and geospatial data.
• High Tech: Identify unique individuals across any type of device, browser or app and use a holistic behavioral model to advertise to them.
• Retail: Set up a digital geo-fence around your brick-and-mortar locations to push in-store incentives to shoppers in real time.
WHAT’S NEXT?
THE FUTURE?
• Additional PushDown capabilities
• Improved authentication mechanism
• Centralized management of DRDL files (longer term)
MORE INFO
Online documentation:
https://docs.mongodb.com/bi-connector/master/
Download:
https://www.mongodb.com/download-center#bi-connector
Platforms (x64):
‒ Linux (RHEL/CentOS*, Amazon Linux, Debian, Ubuntu, SUSE Enterprise Linux)
‒ OSX / macOS
‒ Windows
* Also IBM Power & System Z
Editor's Notes

  • #3: 1:00
  • #4: Our mission is to help people see and understand data, big and small, whatever their skill set: data scientist, doctor, or teacher. We believe everyone can benefit from harnessing the power of data. Today I will provide an overview of our Big Data vision and how we execute on it, especially as it pertains to MongoDB and NoSQL data. I will also share some best practices and techniques customers may want to employ when using Tableau on top of MongoDB for performant, efficient results.
  • #5: People who know the data should ask the questions. It seems so simple, yet BI infrastructure and processes are designed to prevent this.
  • #6: Software should be designed for deeper thinking. It should get out of the way and preserve user flow; prompts and wizards should stay focused on tasks. Software should do the heavy lifting for you, not ask you even more questions. We need to build software that makes smart decisions and guides you to questions you never thought to ask.
  • #7: Analytics at scale can drive change. Data is one of the greatest assets (don't know – advantage). Your customers' organizations need to change to harness the potential of their data. This is not about one dashboard but about everybody: changing processes, relationships and, yes, even the power structure. Analytics, and the quest for facts, can rapidly drive that change.
  • #8: Start by sharing how we are thinking about and approaching the Big Data space. In addition to everything else in Tableau (the focus on cloud, analytics, mobile), we have this perspective on Big Data.
    Access to all data: data of any size, stored in any database or file, in any format, structured or unstructured, messy or perfectly organized. We also believe there are, and always will be, many sources of data that one person wishes to use.
    Data at scale: being able to have a fast experience with my data. How do I connect to billions of records and get to the data point fast? We're not big advocates of sampling data or using machine learning tools. We want users to have access to all their data, at scale, to integrate it with other data, and to find insights fast. No compromises.
    Discovery: today, the data needs to be created in HCatalog for a user to connect and analyze. But what if I want to explore my data and build a schema, or build a table in HCatalog? We're building the tools to make discovery on unstructured data an easy step for the end user.
  • #9: How do we achieve analytics for all data? How does Tableau make Fast, Easy & Beautiful possible for Big Data? It boils down to 6 main pillars.
    1. We provide broad access to Big Data platforms. We have a number of direct connectors in our product, including: Hadoop (Cloudera Impala & Hive, Hortonworks Hive, MapR Hive, Amazon EMR with Impala & Hive, Pivotal HAWQ, IBM BigInsights); NoSQL (MarkLogic, DataStax); Spark (Apache Spark SQL); Cloud (Amazon Redshift, Google BigQuery); Operational Data (Splunk).
    2. We enable visual analytics without coding. Business users can visualize their data using drag-and-drop operations without writing complex SQL, Java code or MapReduce jobs. Tableau simplifies the task of analyzing data: users can discover visual insights about their data faster than they ever could before.
    3. We have a hybrid data architecture. Tableau can connect live to data sources or bring data in-memory. Live connectivity works great when connecting to fast interactive query engines such as Impala and Spark and to large datasets. However, we can also augment and accelerate slower data sources by creating an extract of the data and bringing it into our in-memory Data Engine.
    4. We enable data blending. Distributed data is oftentimes an even bigger challenge than Big Data. It is rare that an analyst's data is nicely packaged in a single place; instead, data is all over the place, residing in disparate technologies and platforms. Tableau enables users to traverse data sources by blending Big Data with other data sources (e.g. Salesforce, MySQL, Excel files), allowing organizations to keep their data assets where they reside.
    5. We invest in our platform query performance. As data volumes grow, Tableau continues to invest in core query performance improvements that help facilitate real-time conversations with data. Features in 9.0 such as parallel queries are especially relevant on distributed scale-out architectures such as Hadoop. Speed to insight is achieved both by speeding access and by making the whole process of analytics more agile: business users can ask and answer their own questions without relying on specialists. MongoDB helps provide our customers the same benefit at the application layer/business availability layer: you don't have to ETL or reformat the data, and without having to move data, significant cost and complexity can be eliminated.
    6. We provide a consistent interface for visualizing data. The way that you use Tableau is nearly identical across all data sources. If you or your other data analysts have used Tableau with other data sources, you'll be familiar with our visual interface to data, which is the same across big and small data. Across the spectrum of "big data" vs. "small data", the tools used at each end are different. It's not viable to use complex, high-end tools for simple problems over a few thousand records; nor is it viable to solve complex queries over billions of records with Excel. With Tableau, however, you have a single tool that lets you span the range.
    Partnering with best of breed. Analytics for all of your data: Tableau empowers people throughout the organization to answer questions of their data, large or small, in real time. The more questions they ask, the more value they extract from the data, leading to smarter business decisions every day. Tableau works with best-of-breed technologies and works seamlessly with Big Data databases in addition to more traditional databases, so you can have one interface into all of your data. This makes Tableau itself the best "Tableau for Big Data" tool.
  • #11: 5:30
  • #16: demo1.mov
  • #18: MongoSQLD = mediator process = involved in translation to/from SQL. Change from BI Connector 1.x (which was a PostgreSQL foreign data wrapper based solution).
  • #34: demo2.mov
  • #35: Compare to BIC 1.x which used foreign data wrapper & didn’t support the same level of pushdown
  • #36: 16:00 Note all Agg Framework tips & tricks apply
  • #37: demo2.mov
  • #38: 21:00
  • #43: demo3.mov
  • #44: 25:00
  • #46: We’ll talk about a Cold, Warm, Hot strategy for data which is something that has come out of our customers who are using Tableau with petabytes of data.
  • #47: This is a new way of thinking about how you store, process and use data. Gone are the days when you used one platform for everything. In the paradigm of cold, warm and hot data, not everything needs a sub-2-second response time; sometimes queries that run for a while are OK (cold use cases such as ETL and algorithmic analysis).
    The idea is that you have a large cold data store of raw or unprepared data, structured or unstructured, that is used for data exploration and where query response may be slow. Next is warm data, which has been prepared or ETL'd and typically sits in a relational data warehouse or fast analytical database. Finally there is hot data, typically smaller in-memory datasets used on the highest-value vizzes where performance is critical.
    In the NoSQL world, where you store your data and how you access it is a function of what you want to do with it, and you can optimize that at many levels. Cold, Warm, Hot represents three tiers of data. Each tier varies by: 1. the amount of data and level of aggregation; 2. the speed of access required; 3. how frequently you need to access the data.
  • #48: Just a double-click into our Cold, Warm, Hot strategy. If you don't like that nomenclature you could also call it Raw, Entity and Analytical structures. We can use guided analysis methods in Tableau to build out interactive visualisations that leverage the hot tier for the highest-value elements while still providing a path to drill down into the granular data when needed. This chart is just a generalization to help understand the different tiers of data and data access. The larger the data volumes, the more likely you are to have them stored in unstructured and semi-structured forms. You can fly all over the US with a TB of storage, an unfathomable amount that was not possible a few years ago.
    MongoDB, before the launch of this optimized connector, was traditionally "Cold": you could either be exploring the raw data or working off a prepared view.
    Relational databases can be seen as your "Warm". This is the data we're mostly used to: the Kimball stars, the cross-tabular reports we cut our teeth on. Typically you are working from prepared views of the data and need fairly fast access: semi-analytical structures supporting production reporting.
    Lastly, you have in-memory Tableau Data Extracts as well as other fast analytical databases (like Vertica/Exasol) that took the burden off traditional data warehouses to speed things up for read-style queries. The question was whether MPP-style execution could make the data warehouse and speed to business intelligence faster. Extracts (columnar) and fast analytical databases are used solely to back your Tableau dashboards. We also have the loads of tools that have cropped up generating loads of SQL queries. We're also seeing Tableau make an investment with our Hyper engine: we want to put more things into Tableau and federate things, bringing more data into Tableau and accessing it faster.
  • #49: MongoDB before the launch of this optimized connector was traditionally “Cold”. You could either be exploring the raw data or working off of a prepared view….
  • #50: MongoDB is a great place to capture and store all of your information. There are a couple of main use cases that I hear about from customers.
    1. In the Data Exploration/Mining use case, customers simply use Tableau as a way to understand general trends and outliers in the data. These are your core data scientists, who'll use Python, Scala or Java to process data and use Tableau to get quick insights into it.
    2. In Ad-Hoc Report Conceptual Modeling, they ask whether an idea is even worth exploring. They'll use Tableau to figure out which parts of the data they actually want to use in preparing a more finalized view of the data, so they'll explore concepts and migrate them into a more aggregated or analytically optimized structure, or up the stack into something like Impala to expose them to business users.
    Technologies for Hive on MapReduce that support these types of use cases include Cloudera, Hortonworks, MapR and Amazon EMR.
  • #51: Here are some use cases I've seen out in the field speaking with customers: fixed-width financial transactions, storing raw text, airline booking records, green-screen-style printouts. E-commerce: dumb amounts of data, in massively multiplayer online games. E-commerce: all browsing activity and click activity; people like to do A/B testing and campaign optimization to deliver targeted ads. Telco: so many different devices have to be in place in the infrastructure to move communications forward. Every vendor, even of hardware, makes a big data play; then you have the service providers, the AT&Ts and Verizons of the world.
  • #52: 30:00
  • #55: Mention: Wisdom's talk. Pre-empt first question about other BI Tools / ODBC / MySQL clients.