Memory Speed Big Data Analytics: Alluxio vs Apache Ignite
Data Analytics Explained Meetup
Irfan Elahi - Deloitte
About Me
• Senior Consultant at Deloitte (Analytics Service Line)
• Trainer for Deloitte’s Data Science Training
• Speaker at DataWorks Summit, Sydney (2017) and AWS meetup, Melbourne
• Premium Udemy Instructor with 18,000+ students from 145 countries
• Author of an upcoming book about Scala Programming for Big Data Analytics
• Technical Reviewer of an upcoming book on Hadoop published by Apress
Irfan Elahi - Deloitte When Databases Meet Big Data
Agenda
Overview
• Traditional Databases
• Drivers of Disruption
• The Advent & Rise of Big Data
• Data Warehousing in Hadoop
• Big Data Adoption in Enterprises
• Above and Beyond
Traditional Databases
Value & Use-Cases
Core Capabilities
• Store: Store data in relational models/objects
• Process: Perform relational processing via SQL Language
Tightly Coupled
Use Cases
• Backend Persistence Layer of applications (OLTP -> OLAP)
• Batch Analysis
• Reporting
• Business Intelligence
• Data Warehousing
Functional Capabilities
Traditional Databases
ACID
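
The atomicity part of ACID can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a traditional database; the `accounts` table, the `transfer` helper and the balances are hypothetical and not taken from the slides.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing unit of work."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            # Credit first, so a failing debit leaves work that must be undone.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back.
        return False


def demo_balances():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    transfer(conn, "alice", "bob", 30)   # valid: both updates are committed
    transfer(conn, "alice", "bob", 500)  # overdraft: neither update survives
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    print(demo_balances())
```

The second transfer credits one account before the debit is rejected, and the rollback undoes that credit, so the data never reflects a half-applied change.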
Functional Capabilities
Traditional Databases
Indexing
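
As a rough illustration of what indexing buys, the sketch below asks SQLite (through Python's sqlite3 module) for its query plan before and after creating an index. The `orders` table, its columns and the index name are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # fourth column holds the detail text


def demo_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, total) VALUES (?, ?)",
        [(f"c{i % 100}", float(i)) for i in range(1000)],
    )
    sql = "SELECT total FROM orders WHERE customer = ?"
    before = query_plan(conn, sql, ("c7",))  # no index yet: full table scan
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
    after = query_plan(conn, sql, ("c7",))   # planner can now search the index
    return before, after


if __name__ == "__main__":
    before, after = demo_plans()
    print("before:", before)
    print("after: ", after)
```

Without the index the planner has to scan every row; with it, the lookup becomes a search on `customer`.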
Functional Capabilities
Traditional Databases
Mutability
Capabilities
Traditional Databases
Unsurpassed SQL Support
Drivers of Disruption
Changing Trends
• Decreasing Cost of Resources: Compute and storage continued to become cheaper
• Data Explosion: Volume, variety and veracity of data continued to increase
• Increased Adoption of the Cloud: Propositions like elasticity, lower CAPEX and shorter time to value accelerated adoption of the Cloud
Databases and Scalability
Drivers of Disruption
Approaches to scale:
• Denormalization
• Caching
• Sharding
• Materialized Views
But…
• Efficiency and operational cost of operating at scale?
• Ability to store massive volumes of unstructured data at high velocity?
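
Sharding, one of the approaches to scale listed above, can be sketched as hash-based key placement. This is a minimal in-memory illustration rather than how any particular database implements it; `shard_for` and `ShardedStore` are hypothetical names, and each dict stands in for a separate database server.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store that spreads keys across several shards."""

    def __init__(self, num_shards: int):
        # Each dict plays the role of one independent database node.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(4)
    for i in range(1000):
        store.put(f"user-{i}", i)
    print([len(shard) for shard in store.shards])  # keys per shard
    print(store.get("user-42"))
```

With modulo placement, changing the number of shards remaps most keys, which is part of the operational cost the slide asks about; consistent hashing is a common way to reduce that data movement.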
The Advent & Rise of Big Data
Key Propositions
• Low CAPEX/OPEX and TCO: Leveraged commodity hardware and open source licensing, resulting in lower operational costs
• Decoupled Storage and Compute: Segregated computing and storage layers, which enabled innovative capabilities
• Fault Tolerance: Provided fault-tolerance and high availability to optimize operations
• Optimal Data Locality: Championed the notion of bringing compute to data
• Horizontal Scaling: Addressed scalability challenges by introducing a horizontal scaling model, resulting in distributed storage and compute
• Originally for Analytics Use-Cases: Was devised to enable large-scale processing of unstructured data like logs and social network data
Nuances of Scalability
Drivers of Disruption
source: ResearchGate
The Advent & Rise of Big Data
Boulevard to Data Warehousing in Hadoop
• Hadoop Ecosystem Growth
• Commercialization of Hadoop: For instance Cloudera, Hortonworks, MapR, Pivotal, EMR, HDInsight…
• Embodiment of Enterprise Appealing Capabilities
• Wide Spectrum of Use-cases: IoT, Machine Learning, NLP…
• SQL on Hadoop Technologies
• Data Warehousing in Hadoop?
Data Warehousing on Big Data
Panoramic View
Traditional use-case; New Platform
Same expectations and mind-set may result in conflict of expectations
Challenges
x No In-place Mutation*
x NoSQL not fit for relational processing
x Table Locking
x ACID
x No PK/FK constraints*
x No Indexing*
x Limited Data Models support (Multi-Dim vs Tabular)
x Limited SQL coverage
Opportunities
New Products supporting in-place mutation (Kudu, Apache Ignite)
New Products enabling MOLAP on Hadoop (Apache Kylin)
Data Virtualization enabling enterprise-wide integration (Denodo, AtScale)
Enhancement in existing products (Hive ACID)
SQL Interface to everything (Kafka, Spark, Ignite)
Shift to Cloud-Native technology stack: Big Data (Altus, HDInsight, Databricks), Storage (Blob, ADLS, S3)
Above and Beyond
Data Warehousing on Hadoop trend will continue to grow
Success will lie in the sweet spot of enhanced capabilities of the stack and flexible mind-set
Elasticity
…Questions?
If interested: Develop skill-set in Big Data and Analytics from my resources:
• Apache Spark Hands-on Specialization for Big Data Analytics (Udemy Course)
• R Programming Hands-on Specialization for Data Science (Udemy Course)
• Scala Programming for Big Data Analytics (E-book)

When Databases Meet Big data and Hadoop - Uni of Tromso Online Lecture
