Kyligence Introduction
MicroStrategy Partnership
Saswata Sengupta
© Kyligence Inc. 2019, Confidential.
Apache Kylin
Top Level Apache Project
• The only open-source OLAP on big data platform
Best Open-source Big Data Tool
• InfoWorld’s Bossies (Best of Open Source Software Awards) in 2015 & 2016
Sub-Second Interactive Query
• Large scale, high concurrency, sub-second query latency, multi-dimension
1000+ Organizations
• Adopted by thousands of organizations globally
Kyligence = Kylin + Intelligence
• Founded in 2016 by the creators of Apache Kylin
• Built around Kylin and enhanced with AI augmentation to deliver unprecedented enterprise analytic performance
• Named one of CRN's top 10 big data startups in 2018
• Global presence: San Jose, Seattle, New York, Shanghai, Beijing
• VCs: Fidelity International, Shunwei Capital, Broadband Capital, Redpoint, Cisco, Coatue
Accelerate Critical Business Decisions with AI-Augmented Data Management and Analytics
Funding timeline:
• 2016: Founded; Pre-A (Redpoint, Cisco)
• 2017: Series A (CBC, Shunwei)
• 2018: Series B (8Roads)
• 2019: Series C (Coatue)
Trusted by Global Fortune 500
BFSI
Telecom
Technology
Manufacturing, Retail, etc.
Pains in Collaboration
Data Engineer
• Manage data source
• Design data model to keep one source of truth
• ETL and load data
Data Analyst
• Develop dashboard/reporting
• Self-service analysis to answer business questions
Pains
• Low efficiency in development to fulfill business requirements
• Limited dimensions and measures in a model to serve complex calculations
• Difficulty when analytics requirements or source data change
• Time to insight is slow
Kyligence Ecosystem
Global Partners
• Fully enabled on leading cloud and data platforms (Azure, AWS, Google Cloud, Cloudera)
• Integrated with popular BI and visualization tools (Tableau, Power BI, Qlik, MicroStrategy)
• Certified on main Hadoop distributions (CDP)
Kyligence Enterprise: Accelerate Mission-critical Analytics Intelligently
• Unified Query Entrance
• Business Semantic Layer
• Query Pattern for all data
• High Performance Engine
[Architecture diagram. Query interfaces: ODBC/JDBC, API/SDK, SQL/MDX. Components: Semantic Services, Distributed Query Engine, AI-Augmented Engine, Smart Pushdown, Metadata Management, Enterprise Security, Cube Index. Sources: RDBMS, Hive. Business domains: Finance, Marketing, Sales, Customer, Checkout. Percentage labels: 10%, 4%, 80%, 6%.]
Kyligence Cloud
[Architecture diagram. Stages: Source, Landing & Transformation, Semantic & Augmentation, Applications. Labels: Azure Blob Storage, Azure Synapse, Index; business domains: Finance, Marketing, Sales, more…]
AI Augmented Engine: Intelligent Data Development
AI Augmented Engine: One-click Acceleration
• Self-maintaining
• Dynamic auto-modeling
• Self-learning engine
• One-click acceleration
• Adaptive model
AI-Augmented Engine — Learn From Your Analytics History
Advanced Tuning Features – Push Down and Aggregate Index
Under the hood: Smart Cuboids
• Each model consists of N-dimensional cuboids; each cuboid is a combination of several of the model's dimensions.
• Apache Spark is used to build the cuboids ahead of query time, which makes query results extremely fast.
• When a user sends a query, the model looks for the matching cuboid/segment and returns the results from it.
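The cuboid idea above can be sketched in a few lines of plain Python. This is only an illustrative toy, not Kyligence's implementation (which builds cuboids with Spark and does not necessarily materialize every combination). The function names `enumerate_cuboids`, `build_cuboids`, `pick_cuboid` and `query`, and the choice of SUM as the only measure, are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def enumerate_cuboids(dimensions):
    """Return every combination of dimensions, including the empty one (grand total)."""
    return [combo
            for size in range(len(dimensions) + 1)
            for combo in combinations(dimensions, size)]


def build_cuboids(rows, dimensions, measure):
    """Precompute SUM(measure) grouped by each dimension combination."""
    cuboids = {}
    for dims in enumerate_cuboids(dimensions):
        totals = defaultdict(int)
        for row in rows:
            totals[tuple(row[d] for d in dims)] += row[measure]
        cuboids[dims] = dict(totals)
    return cuboids


def pick_cuboid(cuboids, query_dims):
    """Choose the smallest precomputed cuboid that contains all queried dimensions."""
    needed = set(query_dims)
    candidates = [dims for dims in cuboids if needed <= set(dims)]
    return min(candidates, key=lambda dims: len(cuboids[dims]))


def query(cuboids, query_dims):
    """Answer a GROUP BY query from a precomputed cuboid instead of the raw rows."""
    dims = pick_cuboid(cuboids, query_dims)
    positions = [dims.index(d) for d in query_dims]
    result = defaultdict(int)
    for key, value in cuboids[dims].items():
        result[tuple(key[p] for p in positions)] += value
    return dict(result)
```

With N dimensions the full set has 2^N cuboids, so three dimensions give eight. A query then reads a small precomputed aggregate rather than scanning the source rows, which is the reason for the speed-up the slide describes.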
Unified Semantic Layer
• Translate technical details into business terminology
• Synchronize semantics across major BI tools
• Unified business definitions
• Flexible business calculations
[Architecture diagram. Layers: BI tools (Excel, MicroStrategy, Other BI Tools), Semantic Layer, Query Platform, Data Sources (Cloud DW, Parquet, ORC, Blob Storage, CSV, Snowflake). Other labels: BI Integration, Access Control, Enterprise Security, Query Engine, Model.]
Elastic Scaling: Handle Peak Time Automatically
• Fewer compute and storage resources utilized
• Dynamic on-demand cluster resizing
• Uses spot instances
• Efficient planning for data growth
TPC-H 22 Queries
For each dataset:
• No warm-up
• Run each query 3 times
• Record the average time
• Lower is better
[Chart: Query Response Time | 0.5 Billion (SF=50)]
[Chart: Query Response Time | 5 Billion (SF=500)]
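The test method listed on this slide (no warm-up, three runs per query, average recorded) could be scripted roughly as follows. This is a minimal sketch, not the harness used for the published numbers: `benchmark` and its `run_query` callback are hypothetical names, and `run_query` stands in for whatever client call submits SQL to the engine under test.

```python
import time
from statistics import mean


def benchmark(queries, run_query, repeats=3):
    """Time each query `repeats` times with no warm-up run.

    `queries` maps a query name to its SQL text; `run_query` is a
    caller-supplied function that executes one SQL string.
    Returns the average wall-clock seconds per query name.
    """
    averages = {}
    for name, sql in queries.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(sql)
            timings.append(time.perf_counter() - start)
        averages[name] = mean(timings)
    return averages
```

Because there is no warm-up run, the first timing of each query includes any cold-cache cost, so the average reflects what a first-time user would see rather than best-case latency.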
Financial Risk Management - replacing the large SSAS cube

Challenges
• 5 TB SSAS cube with 5 billion rows daily incremental data
• 14 lookup tables, half with cardinality over 20M (largest 200M)
• 600+ dimensions
• 30+ analysis users
• Analysts’ work locked by the incremental loading workload; system crashes happen frequently
• Poor performance on data loading and queries (especially on UHC (ultra-high cardinality), Count Distinct, Correlation)
• Limited concurrent users

Kyligence’s Solution
Modernization with:
• same data source
• same front-end BI
• similar OLAP concepts
• comparable semantic layer
• finer-grained access control
Scalability, Performance, Low Cost
• Single cube, easy management
• Analysts’ work no longer interrupted
• Transparent to business users, same analysis tool (Excel)
• Improved query and loading performance
• Support 1000+ concurrent users
• Meet future requirements: predicted 40% data volume growth, migration to cloud, real-time
THANK YOU


Lightning-Fast, Interactive Business Intelligence Performance with MicroStrategy and Kyligence


Editor's Notes

  • #7: UBS case uses Databricks
  • #8: UBS case uses Databricks
  • #9: Azure storage to be generic, replace Alibaba with Hadoop
  • #10: Flexible multidimensional modeling. Changes in the model affect only the relevant indexes. Changes in model definitions and data loading do not affect each other.
  • #11: Flexible multidimensional modeling. Changes in the model affect only the relevant indexes. Changes in model definitions and data loading do not affect each other.
  • #17: TPC-H Benchmark
    Why TPC-H:
      • Industry-recognized data analysis test data sets
      • Analysis of key business decisions
      • Practical business significance
      • Examines large volumes of data
      • High-complexity queries
      • Answers critical business questions
      • 22 decision-making queries
    Example: The Shipping Priority Query retrieves the shipping priority and potential revenue of the orders having the largest revenue among those that had not been shipped as of a given date. Top 10 orders are listed in decreasing order of revenue.
    Test method: 0.5 billion dataset, TPC-H 22 queries; each query is run 3 times and averaged, with no query engine warm-up.
    Hardware configuration (0.5 billion dataset):
      • Same 4 physical nodes
      • Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz * 2
      • 86 vCores and 188 GB memory in total
      • Same Spark configuration for both KE 4 Beta and SparkSQL 2.4:
          spark.driver.memory=16g
          spark.executor.memory=8g
          spark.yarn.executor.memoryOverhead=2g
          spark.yarn.am.memory=1024m
          spark.executor.cores=5
          spark.executor.instances=17
    Hardware configuration (Query Response Time | 5 Billion):
      • Same 4 physical nodes
      • Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz * 2
      • 86 vCores and 188 GB memory in total
      • Same Spark configuration for both KE 4 Beta and SparkSQL 2.4:
          spark.driver.memory=16g
          spark.executor.memory=20g
          spark.yarn.executor.memoryOverhead=2g
          spark.yarn.am.memory=1024m
          spark.executor.cores=5
          spark.executor.instances=30
  • #18: Benefits:
      • Unlimited scale-out solution to fit future data volume growth
      • 1 hour non-blocking incremental loading
      • Single cube, easy maintenance
      • Low infrastructure cost with auto scaling
      • Support 100 concurrent users
      • Transparent to business users, same analysis tool (Excel)
    Architecture:
      • Kyligence Enterprise 4.0
      • Azure HDInsight 3.6
      • Azure Data Lake gen2
      • Cluster size: 30 D3 V2 worker nodes
      • (potentially) ingest data from Oracle
    Query performance:
      • 90% SQL queries within 5s
      • 90% MDX queries within 60s
      • 80% MDX queries within 20s
      • 50 QPS per query node
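
The Spark settings quoted in note #17 can be turned into a rough resource estimate. The sketch below assumes the usual YARN sizing rule, where each executor container requests `spark.executor.memory` plus `spark.yarn.executor.memoryOverhead` and the total core count is `spark.executor.cores` times `spark.executor.instances`. The helper names (`parse_memory_mb`, `parse_conf`, `requested_resources`) are made up for this example, and only the `m` and `g` size suffixes that appear in the note are handled.

```python
import re

UNITS_MB = {"m": 1, "g": 1024}

CONF_SF50 = """
spark.driver.memory=16g
spark.executor.memory=8g
spark.yarn.executor.memoryOverhead=2g
spark.yarn.am.memory=1024m
spark.executor.cores=5
spark.executor.instances=17
"""

CONF_SF500 = """
spark.driver.memory=16g
spark.executor.memory=20g
spark.yarn.executor.memoryOverhead=2g
spark.yarn.am.memory=1024m
spark.executor.cores=5
spark.executor.instances=30
"""


def parse_memory_mb(value):
    """Convert a size such as '8g' or '1024m' to megabytes."""
    match = re.fullmatch(r"(\d+)([mg])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    number, unit = match.groups()
    return int(number) * UNITS_MB[unit]


def parse_conf(text):
    """Parse 'key=value' lines into a dictionary."""
    conf = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition("=")
        conf[key] = value
    return conf


def requested_resources(conf):
    """Total executor cores and executor memory (heap plus overhead) requested."""
    instances = int(conf["spark.executor.instances"])
    cores = int(conf["spark.executor.cores"])
    per_executor_mb = (parse_memory_mb(conf["spark.executor.memory"])
                       + parse_memory_mb(conf["spark.yarn.executor.memoryOverhead"]))
    return {
        "executor_cores": instances * cores,
        "executor_memory_gb": instances * per_executor_mb / 1024,
    }
```

For the 0.5 billion configuration this gives 85 executor cores and 170 GB, which fits the 86 vCores and 188 GB listed. For the 5 billion configuration the same arithmetic gives 150 cores and 660 GB, more than the hardware line in the note states; the note does not explain the difference, so the hardware description for that run may simply have been carried over from the smaller setup.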