BIG DATA WITH AWS
Architecting for Big Data with AWS
www.blazeclan.com
Phone: +91 20 6529 0035 | e-mail: info@blazeclan.com
- By Stany Simon
Agenda
• Introduction to Big Data
• When starting a Big Data Project
• Cost vs. ROI
• Architecting for Big Data
What is Big Data
Big data is an evolving term that describes any voluminous amount of
structured, semi-structured and unstructured data that has the potential to be
mined for information.
Big Data is less about the data itself and more about what you do with the
data.
Data vs. Information
Data is raw, unorganized facts.
Information is derived from data.
Data is useless unless useful information is derived from it.
The 3 Vs of Big Data
• Velocity: Speed at which data is created, processed & analyzed
• Variety: Varying formats of the data (structured / unstructured)
• Volume: Size of the data to be handled
Are the 3 V's enough to define Big Data today?
Veracity
Variability
Visualization
Value
Validity
Volatility
Where has Big Data Analytics helped?
• Solving the Big Pie Mystery
• Faster Fast Food
• Who stole my Cheese?
• Tooth Fairy
When starting a Big Data Project
Identify clear business need and value.
Build a strong data infrastructure to host and manage data.
Time taken for a useful outcome.
[Figure: Big Data Project, with Time, Money and Outcome]
Big Data Today….
• Data volume estimated to grow to 40 zettabytes by 2020.
• Investment expected to top $114 billion by 2018.
• Data market reached $27.36B in 2014.
• Big Data market will top $84B in 2026.
BUT....
Gartner predicted that through 2017, 60% of
Big Data projects will fail to go beyond piloting
and experimentation and will be abandoned.
BUT....
30% of Big Data projects, 65% failure, 35% successful (5% with useful insights)
Case one
Brief: A retailer running a range of Big Data projects.
Objective: Mining all of their stock and purchase data.
Outcome: Importance of identifying clear need and value
Case two
Brief: Web-based startup for mothers, focused on child development.
Objective: Collection of data from multiple sources for adaptive learning & behavioral pattern analysis.
Outcome:
• Importance of identifying clear need and value
• Budget
Points to Ponder
Does this data help your organization with its business decisions?
How to capture the data?
Where to store it?
How to analyze it?
Cost vs. ROI: Value
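The cost-versus-ROI question can be made concrete with the usual return-on-investment ratio, (value gained minus cost) divided by cost. The sketch below is only an illustration: the function name `roi` and both figures are hypothetical and do not come from the slides.

```python
def roi(value_gained: float, total_cost: float) -> float:
    """Return on investment as a fraction: (value - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (value_gained - total_cost) / total_cost


# Hypothetical figures, for illustration only.
project_cost = 120_000.0   # capture + storage + analysis
insight_value = 150_000.0  # estimated value of the decisions the data enabled

print(f"ROI: {roi(insight_value, project_cost):.0%}")  # prints "ROI: 25%"
```

A negative result means the project cost more than the value its insights returned.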
Big Data Break Down
Flow of Data: Ingest → Store → Process/Analyze → Visualize
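As a rough illustration of the four-stage flow, the sketch below chains ingest, store, process and visualize over a few log lines. Everything in it is an assumption made for the example: the "status path" line format, the function names, and the in-memory list standing in for a real store such as S3 or a database.

```python
from collections import Counter
from typing import Iterable


def ingest(lines: Iterable[str]) -> list[dict]:
    """Ingest: parse raw 'status path' log lines into records."""
    records = []
    for line in lines:
        status, path = line.split(maxsplit=1)
        records.append({"status": int(status), "path": path.strip()})
    return records


def store(records: list[dict], storage: list[dict]) -> None:
    """Store: append records to the store (a plain list in this sketch)."""
    storage.extend(records)


def process(storage: list[dict]) -> Counter:
    """Process/Analyze: count requests per HTTP status code."""
    return Counter(record["status"] for record in storage)


def visualize(counts: Counter) -> str:
    """Visualize: render the counts as a plain-text bar chart."""
    return "\n".join(f"{status} {'#' * n}" for status, n in sorted(counts.items()))


raw = ["200 /home", "404 /missing", "200 /about"]  # hypothetical input
storage: list[dict] = []
store(ingest(raw), storage)
report = visualize(process(storage))
print(report)
```

Keeping each phase behind its own function makes it easier to swap one stage (for example, the store) without touching the others.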
Phase 1: Ingestion
• Transactional: Orders, Invoices, Travel Records
• File: Server Logs
• Stream: IoT
Phase 2: Storage
Database Storage
Cloud Storage
Stream Storage
AWS Services for Storage
• Amazon CloudSearch
• Amazon DynamoDB
• Amazon ElastiCache
• Amazon Elasticsearch Service
• Amazon Kinesis
• Amazon RDS
• Amazon Glacier
• Amazon S3
• Amazon Redshift
Points to Ponder
Size of the data to be stored.
Time for which the data needs to be stored.
Cost of storage per GB.
Criticality of the data in terms of security & recovery.
Availability of data.
Data Structure
Query Pattern
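Three of the points above (size, retention time and cost per GB) combine into a simple estimate: size × months × price per GB-month. The sketch below assumes monthly per-GB billing; the tier names and prices are hypothetical placeholders, not actual AWS pricing.

```python
def storage_cost(size_gb: float, months: int, price_per_gb_month: float) -> float:
    """Total cost of keeping size_gb stored for the given number of months."""
    return size_gb * months * price_per_gb_month


# Hypothetical tiers and prices, NOT actual AWS pricing.
tiers = {"frequent-access": 0.025, "archive": 0.005}
size_gb, months = 2_000, 12

costs = {name: storage_cost(size_gb, months, price) for name, price in tiers.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
```

An estimate like this covers storage only; retrieval, request and transfer charges, as well as the availability and recovery requirements listed above, would need to be weighed separately.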
Traditional Architecture
Client Tier
Web-App Tier
Data Storage Tier: SQL
Storage Architecture
Storage Tier: SQL, NoSQL, Search, Cache
Phase 3: Process/Analyze
Batch Processing
• Hourly, daily, weekly reports
• Works on large data sets in one go
• Response time: might take a few hours to answer your questions
Real-Time Processing
• Minute-based reports
• Works on very small data sets
• Response time: takes a few seconds to answer your questions
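The difference between the two modes can be sketched in a few lines: a batch job scans the full stored data set on each run, while a real-time job keeps a small running state and updates it per event. This is a toy illustration, not how Amazon EMR or Kinesis work internally; the class name, the 60-second window and the event values are assumptions.

```python
from collections import deque


def batch_total(events: list[tuple[int, float]]) -> float:
    """Batch: scan the full stored data set in one go."""
    return sum(amount for _, amount in events)


class SlidingWindowTotal:
    """Real-time: keep a running total over the last `window_s` seconds."""

    def __init__(self, window_s: int) -> None:
        self.window_s = window_s
        self.events = deque()  # (timestamp, amount) pairs still inside the window
        self.total = 0.0

    def add(self, ts: int, amount: float) -> float:
        """Record one event and return the updated windowed total."""
        self.events.append((ts, amount))
        self.total += amount
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] <= ts - self.window_s:
            _, old_amount = self.events.popleft()
            self.total -= old_amount
        return self.total


# (timestamp in seconds, amount); hypothetical events.
events = [(0, 10.0), (30, 5.0), (70, 2.0)]

print(batch_total(events))  # 17.0

window = SlidingWindowTotal(window_s=60)
latest = [window.add(ts, amount) for ts, amount in events]
print(latest)  # [10.0, 15.0, 7.0]
```

The batch answer covers everything but is only available after the full scan; the windowed answer is available after every event but only reflects recent data.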
AWS Services
• Amazon EMR
• Amazon Kinesis
• Amazon Redshift
• AWS Lambda
Phase 4: Visualize
Value to the users
Points to Ponder
Business:
• Value of the insights from the project
Architecture:
• Frequency of data flowing in
• Size of data coming in
• Tools to be used for processing the data
• Scaling the infrastructure & HA as per requirement
• Maintenance
• Cost of maintenance
Why AWS?
• Scalability
• Elasticity
• Pay as you go
• Maintenance & Management
Tooth Fairy-Architecture
