Faiz ul haque Zeya
MS CS University of Tulsa,OK,USA
Topics covered
1. Introduction
2. Big data: how big it is
3. Big data technology
4. A few examples of big data
5. Airline reservation system
6. Google Translate
7. Amazon recommendation
8. Netflix recommendation
9. Hadoop, MapReduce
10. Q&A
Introduction
 Very large data sets, with sizes on the order of petabytes or exabytes.
 Often not stored in relational form.
 Requires computation at massive scale.
 Often queried through NoSQL systems rather than SQL.
 Relies on newer technologies such as MapReduce and Hadoop.
 Reason: traditional systems scale poorly and perform badly on data this large.
How large it is
 Petabyte: 10^15 bytes
 Exabyte: 10^18 bytes
 Zettabyte: 10^21 bytes
 Google processed about 24 petabytes of data per day in 2009.
 Yahoo stores 2 petabytes of data on user behavior.
 eBay.com uses two data warehouses, at 7.5 petabytes and 40 petabytes, as well as a 40-petabyte Hadoop cluster for search, consumer recommendations, and merchandising.
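To put the daily volume in perspective, the short sketch below converts the 24 petabytes per day quoted above into an average rate per second. It assumes decimal (SI) units, as in the powers of ten listed on this slide; the constant and function names are illustrative only.

```python
# Decimal (SI) byte units, matching the powers of ten on the slide.
PETABYTE = 10**15
EXABYTE = 10**18
ZETTABYTE = 10**21

SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(bytes_per_day):
    """Average throughput implied by a daily data volume."""
    return bytes_per_day / SECONDS_PER_DAY


# Daily volume quoted on the slide for Google in 2009.
google_daily = 24 * PETABYTE
rate = bytes_per_second(google_daily)

print(f"{rate / 10**9:.1f} GB/s")  # prints "277.8 GB/s"
```

This is an average over a whole day; real traffic would peak well above it.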
Big data introduction
BigData Technologies
 Relational databases and SQL queries do not scale well to this amount of data.
 Other technologies are therefore required.
 MapReduce: parallel computation spread across many machines.
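The MapReduce idea can be sketched on a single machine as a word count. This is a minimal illustration, not Hadoop's API: a thread pool stands in for the cluster, and the names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents, workers=2):
    # Each document is mapped independently, so the map calls can run in
    # parallel. A thread pool stands in for a cluster of machines here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, documents))
    grouped = shuffle(chain.from_iterable(mapped))
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Because each map call touches only its own document and each reduce call touches only one key, both phases can be distributed without shared state, which is what makes the pattern scale.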
Few examples of Big Data
 Airline reservation system
 Google Translate
 Netflix movie recommendation
 Amazon book recommendation
Airline reservation system
 Oren Etzioni of the University of Washington founded Farecast, a venture-capital-backed startup.
 It predicts, based on past data, whether airline prices will go up or down.
 Etzioni uses a predictive model for this.
 Microsoft purchased it for $110 million.
 It became part of the Bing search engine.
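The slide does not describe the predictive model itself. As a rough illustration of predicting a direction from past data, the sketch below fits a least-squares trend line to a series of observed fares and reports the sign of its slope. This is an assumed stand-in, not Farecast's actual method; the function names and sample fares are hypothetical.

```python
def fit_slope(prices):
    """Least-squares slope of price against observation index (0, 1, 2, ...)."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    covariance = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(prices))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance


def predict_direction(prices):
    """Return 'up', 'down', or 'flat' from the fitted trend.

    Illustrative only: a real fare model would use many more features
    (days to departure, route, season) than a single price series.
    """
    slope = fit_slope(prices)
    if slope > 0:
        return "up"
    if slope < 0:
        return "down"
    return "flat"


# Hypothetical fares observed on consecutive days for one route.
print(predict_direction([300, 310, 325, 340]))  # prints "up"
```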
Google Translate
 Uses the whole internet as its training corpus.
 Google released a trillion-word corpus in 2006.
 It accepts messy data.
 IBM's Candide used 3 million translated sentence pairs.
 Google uses billions of pages from the internet.
Netflix $1 million prize
 Netflix announced a $1 million prize for the team that improved its recommendation algorithm by 10%.
 Netflix acts as a movie recommender.
 Most of its sales come from recommendations made by the site.
 The reason is that there are many titles that users do not even know about.
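Improvement in the Netflix Prize was measured as a reduction in root-mean-square error (RMSE) of predicted ratings relative to Netflix's own system. The sketch below shows how such a percentage could be computed; the ratings are made-up sample values and the function names are illustrative.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def improvement_percent(baseline_rmse, candidate_rmse):
    """Percentage reduction in RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Made-up ratings on a 1-5 scale, for illustration only.
actual = [4, 3, 5, 2]
baseline = [3, 3, 4, 3]   # predictions from an existing system
candidate = [4, 3, 4, 2]  # predictions from a new algorithm

gain = improvement_percent(rmse(baseline, actual), rmse(candidate, actual))
print(f"RMSE improvement: {gain:.1f}%")
```

A lower RMSE means predicted ratings sit closer to what users actually gave, so the percentage is a reduction in error rather than a gain in accuracy.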
Amazon’s recommendation
 Amazon uses item-to-item collaborative filtering instead of traditional user-based collaborative filtering.
 Item-to-item recommendation searches for similar items rather than similar users.
 This approach scales to large data sets.
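The idea of comparing items rather than users can be sketched with cosine similarity over the sets of customers who bought each item. This is a minimal illustration under assumed data, not Amazon's production algorithm; the purchase data and function names are hypothetical.

```python
import math
from collections import defaultdict


def item_vectors(purchases):
    """Map each item to the set of users who bought it."""
    vectors = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            vectors[item].add(user)
    return vectors


def cosine(users_a, users_b):
    """Cosine similarity between two items, treated as binary user vectors."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))


def similar_items(purchases, target, top_n=3):
    """Rank other items by similarity to `target`, best first."""
    vectors = item_vectors(purchases)
    target_users = vectors.get(target, set())
    scores = [
        (other, cosine(target_users, users))
        for other, users in vectors.items()
        if other != target
    ]
    scores = [(item, score) for item, score in scores if score > 0]
    # Sort by descending score, then by name for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


# Hypothetical purchase history: user -> set of items bought.
purchases = {
    "ann": {"book_a", "book_b"},
    "bob": {"book_a", "book_b", "book_c"},
    "cat": {"book_b", "book_c"},
}
recommended = similar_items(purchases, "book_a")
print(recommended)
```

The item-to-item similarities depend only on the catalog and purchase history, so they can be computed offline and looked up quickly at recommendation time, which is one reason the approach suits large data sets.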
Map Reduce
 Q&A

