IN-MEMORY BIG DATA
ANALYTICS
Supreeth MP
1st sem BDA
18/9/2017 1
Table of Contents:
1. Data is growing
2. What is In-Memory analytics?
3. Why In-Memory Now?
4. The landscape of disk-based and in-memory data management systems
5. In-Memory vs. Traditional (on-disk) database management system
6. Optimization Aspects on In-Memory Data Management and Processing
7. Some questions on in-memory analytics
8. References
Data is growing:
• Continuous flow of data
• Real-time, 24/7 streaming updates
• More than 2.5 quintillion bytes of data added daily
• Data is always available
• Democratization of data
• Main source for business decisions
• Shift to digital and STP
• Affordable technology
• Better and faster analytics
• Business Intelligence
• Cloud and subscription-based computing
What is In-Memory analytics?
An in-memory analytics system is essentially a database management system that
stores data entirely in main memory (RAM). This contrasts with traditional
(on-disk) database systems, which are designed to store data on persistent media
such as hard disks. Working with data in memory is much faster than writing to
and reading from a file system.
In-memory is ideal when:
• Your database is too slow for interactive analytics
• You need to perform real-time data analytics
• You need to be offline and can't connect to your data live
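As a rough illustration of the idea above, the sketch below uses Python's standard sqlite3 module, which can hold a whole database in RAM when opened with the special name ":memory:". SQLite is a general-purpose engine rather than a dedicated in-memory analytics system, and the sales table and its rows are hypothetical, so treat this as a minimal sketch only.

```python
import sqlite3

# ":memory:" asks SQLite to keep the entire database in RAM;
# no database file is created on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, used only for illustration.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# An interactive-style aggregate answered from memory.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)
print(totals)
conn.close()
```

Because the data lives only in RAM, it is gone once the connection is closed, which is the volatility trade-off that in-memory systems have to manage.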
Why In-Memory Now?
• RAM is cited as roughly 200 times faster than disk storage and typically enables data access
50 to 100 times quicker
• Memory capacity and bandwidth have been doubling roughly every three years,
while memory prices have been dropping by a factor of 10 every five years.
• Modern high-end servers now have multiple sockets, each of which can have tens or
hundreds of gigabytes of DRAM
• Growth of distributed systems
• The increasing adoption of 64-bit computer technology has made RAM more suitable for
use with large datasets.
• Database systems have been evolving over the last few decades.
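The growth rates quoted above compound quickly. The short calculation below is a sketch that takes the slide's figures at face value (capacity doubling every three years, price falling tenfold every five years) and applies them over a 15-year horizon, which is an arbitrary choice for illustration. It also shows the theoretical address-space limits behind the 64-bit point; real processors implement fewer address bits than the full 64.

```python
def capacity_growth(years, doubling_period_years=3):
    """Growth factor if capacity doubles once per doubling period."""
    return 2 ** (years / doubling_period_years)


def price_ratio(years, tenfold_drop_period_years=5):
    """Remaining price fraction if price falls 10x once per period."""
    return 10 ** (-years / tenfold_drop_period_years)


years = 15  # illustrative horizon, not a figure from the slides
capacity = capacity_growth(years)
price = price_ratio(years)
print(f"capacity grows {capacity:.0f}x, price falls to {price:.3f} of the original")

# Theoretical byte-addressable limits of 32-bit and 64-bit pointers.
addr_32_gib = 2 ** 32 // 2 ** 30
addr_64_eib = 2 ** 64 // 2 ** 60
print(f"32-bit: {addr_32_gib} GiB, 64-bit: {addr_64_eib} EiB")
```

A 32-bit address space tops out at 4 GiB, which is why datasets of tens or hundreds of gigabytes only became practical to hold in RAM with 64-bit systems.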
The landscape of disk-based and in-memory
data management systems:
In-Memory vs. Traditional (on-disk) database
management system:
Aspect | DBDMS (disk-based) | IMDBS (in-memory)
File I/O | Carries file I/O burden | No file I/O burden
Storage usage | Assumes storage is abundant | Uses storage more efficiently
Algorithms | Optimized for disk | Optimized for memory
CPU cycles | More CPU cycles | Fewer CPU cycles
Persistence | Non-volatile | Volatile
Locks | Fine-grained locks | Coarse-grained locks
Optimization Aspects on In-Memory Data
Management and Processing:
Aspect | Concerns | Techniques
Index | Cache consciousness, time/space efficiency | Hash-based, tree-based
Data layout | Cache consciousness, space efficiency | Columnar layout
Parallelism | Linear scaling, partitioning | Data-level, shared-memory scale-up and shared-nothing scale-out parallelism
Concurrency control | Overhead, correctness | Coarse-grained locks
Query processing | Code locality, register temporal locality, time efficiency | Coarse-grained stored procedures
Fault tolerance | Durability, correlated failures, availability | Checkpoints and transaction logging
Data overflow | Locality, paging strategy, hot/cold classification | Anti-caching
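To make the "columnar layout" technique above more concrete, the sketch below stores the same hypothetical records in a row layout and in a columnar layout. It assumes nothing about any particular product. Python itself will not expose the cache behaviour, so the example shows only how the data is organised: a columnar scan of one attribute reads a single contiguous array instead of visiting every full record.

```python
from array import array

# Row layout: each record keeps all of its fields together.
# The records are hypothetical and exist only for illustration.
rows = [
    (1, "north", 120.0),
    (2, "south", 80.0),
    (3, "north", 30.0),
]

# Columnar layout: one sequence per attribute. The numeric columns use
# array, which stores values contiguously as machine-level numbers.
columns = {
    "id": array("q", (r[0] for r in rows)),
    "region": [r[1] for r in rows],
    "amount": array("d", (r[2] for r in rows)),
}

# Row scan: every record is visited even though one field is needed.
row_total = sum(r[2] for r in rows)

# Column scan: only the "amount" array is read.
col_total = sum(columns["amount"])

print(row_total, col_total)
```

Both scans give the same total. The difference is in how much unrelated data has to pass through the cache to produce it, which is the cache-consciousness and space-efficiency concern listed in the table.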
Some questions on in-memory analytics:
• What do companies need to think about as they take on an in-memory analytics path?
• What are some potential speed bumps in adopting in-memory analytics?
• What role do skills play here?
• If an in-memory database system boosts performance by holding all records in memory,
can’t we get the same result by creating a RAM disk and deploying a traditional database
there?
• Won’t an in-memory database require huge amounts of memory because database systems
are large?
• Isn’t the database just lost if there’s a system crash?
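The last question relates to the fault-tolerance techniques listed earlier, checkpoints and transaction logging. The sketch below is one hypothetical way to combine them in a tiny key-value store. The class name TinyStore, the file names, and the JSON format are all assumptions made for illustration, and the code ignores torn writes and concurrency. It is not how any specific in-memory database works.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical in-memory key-value store with a log and checkpoints."""

    def __init__(self, directory):
        self.data = {}
        self.checkpoint_path = os.path.join(directory, "checkpoint.json")
        self.log_path = os.path.join(directory, "txn.log")
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the last checkpoint, then replay writes logged after it.
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Append to the durable log first, then update memory.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def checkpoint(self):
        # Write a snapshot of memory, then start a fresh, empty log.
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.checkpoint_path)
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as d:
    store = TinyStore(d)
    store.put("a", 1)
    store.checkpoint()
    store.put("b", 2)
    store.close()  # stand-in for a crash: the in-memory dict is discarded

    recovered = TinyStore(d)  # rebuilds state from checkpoint plus log
    state = dict(recovered.data)
    recovered.close()

print(state)
```

Under these assumptions the contents of memory are lost in a crash, but the state can be rebuilt from the latest checkpoint plus the log entries written after it.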
References:
[1] In-Memory Big Data Management and Processing: A Survey, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, JULY 2015
[2] Using In-Memory Analytics to Quickly Crunch Big Data, by Lee Garber
[3] https://www.sas.com/en_us/insights/articles/big-data/in-memory-analytics-questions.html
[4] Data Analytics using In-Memory Computing: https://www.gridgain.com/
[5] How Computers Work: Disks And Secondary Storage: http://homepage.cs.uri.edu/faculty/wolfe/book/Readings/Reading05.htm
[6] http://www.mcobject.com/in_memory_database
[7] In-Memory Database Computing – Smarter way of data analysis: http://www.xoriant.com/blog/big-data-analytics/memory-database-computing-faster-smarter-analysis-big-data-world.html
[8] How Computers Work: The CPU and Memory: http://homepage.cs.uri.edu/book/cpu_memory/cpu_memory.htm
THANK YOU
