Developing a Database for Netflix
Database and SQL
Hannah Parker, Sean Scott, Anqi Wang, Dongqi Wang
17 February 2016
Professor Nejad
I. Introduction
Netflix provides streaming movies and TV shows to over 75 million subscribers across
the globe. For a monthly subscription fee of about ten dollars, customers can watch as many
shows and movies as they want as long as they are connected to the internet. Netflix produces
original content and also pays for the rights to stream feature films and shows.
In order to understand customer behavior, Netflix needs to track its customers, its content,
and the content that specific customers watch. Understanding which users watch which shows
and movies will allow the firm to recommend similar content that the user is also likely to enjoy.
Collecting and analyzing this data in order to provide recommendations offers customers
an enjoyable, convenient streaming experience. Moreover, the database will track important
metrics such as customer churn and poorly performing content (content that receives poor ratings
and content that is rarely streamed).
II. Database Design
In order for Netflix to collect the information it needs, three tables need to be established.
Netflix first needs to compile a table listing all of its content (movies, TV shows, etc.). Content
will be uniquely identified by its ‘ContentID’. For TV shows, each episode will have a unique
ContentID. Netflix also needs to gather information on its customers. The customer table will
collect data including individual customer names, phone numbers, addresses, emails, and dates
of birth. Customers are uniquely identified by their ‘CustomerID’. Once Netflix is properly
recording all content and customers, it also needs a table for streams. A stream is defined as a
single instance of one customer watching one piece of content. Streams are uniquely
identified by a ‘StreamID’ and also characterized by the customer’s ID, the content’s ID, the
date of the stream, the time of the stream, the length or duration of the stream (i.e., did the
customer watch the entire show or movie?), and finally the rating that the customer gives the
content.
Figure 1
Relationships Diagram
III. Table Schemas
Table 1
Customer Table
| Field | Type | Length | Description | Primary Key or Foreign Key |
|---|---|---|---|---|
| CustomerID | Character with fixed length | 9 | Uniquely identifies a customer | Primary Key |
| CustomerName | Character with fixed length | 20 | Shows a customer name | |
| CustomerPhoneNumber | Number | 15 | Shows a customer phone number | |
| CustomerAddress | Character with fixed length | 30 | Shows a customer address | |
| CustomerEmail | Character with fixed length | 30 | Shows a customer email | |
| CustomerDOB | Date | | Shows a customer birthday | |
Figure 2
Example of a Record from the Customer Table.
Table 2
Content Table
| Field | Type | Length | Description | Primary Key or Foreign Key |
|---|---|---|---|---|
| ContentID | Character with fixed length | 9 | Uniquely identifies a content | Primary Key |
| Title | String with variable length | 50 | Shows the title of a content | |
| Episode | String with variable length | 10 | Shows the episode of a content | |
| Genre | String with variable length | 20 | Shows the category of a content | |
| TimeLength | Time | | Shows the length of a content | |
| CostPerStream | Currency | | Shows the cost of every stream | |
| ReleaseDate | Date | | Shows the release date of a content | |
| Distributer | String with variable length | 20 | Shows the distributer of a content | |
Figure 3
Example of a Record from the Content Table.
Table 3
Streams Table
| Field | Type | Length | Description | Primary Key or Foreign Key |
|---|---|---|---|---|
| StreamID | Character | 9 | Uniquely identifies a stream | Primary Key |
| CustomerID | Character | 9 | Identifies a customer | Foreign Key |
| ContentID | Character | 9 | Identifies a content | Foreign Key |
| StreamDate | Character | 10 | Shows the date of a stream | |
| StreamTime | Time | | Shows the time of a stream | |
| StreamLength | Time | | Shows the length of a stream | |
| StreamRating | Number | | Shows the rating a customer gives a stream | |
Figure 4
Example of a Record from the Streams Table.
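The three schemas above can be sketched as table definitions. The example below is a minimal illustration using Python's built-in sqlite3 module, not the authors' implementation: the SQLite type names are assumptions mapped from the Type and Length columns, the rating column is named StreamRating as in the queries of Section IV, and the inserted IDs are made up.

```python
import sqlite3

# Illustrative DDL for the three tables; the type names are assumptions
# mapped from the Type and Length columns of Tables 1-3.
SCHEMA = """
CREATE TABLE Customer (
    CustomerID          CHAR(9) PRIMARY KEY,
    CustomerName        CHAR(20),
    CustomerPhoneNumber NUMERIC(15),
    CustomerAddress     CHAR(30),
    CustomerEmail       CHAR(30),
    CustomerDOB         DATE
);
CREATE TABLE Content (
    ContentID     CHAR(9) PRIMARY KEY,
    Title         VARCHAR(50),
    Episode       VARCHAR(10),
    Genre         VARCHAR(20),
    TimeLength    TIME,
    CostPerStream DECIMAL(10, 2),
    ReleaseDate   DATE,
    Distributer   VARCHAR(20)
);
CREATE TABLE Streams (
    StreamID     CHAR(9) PRIMARY KEY,
    CustomerID   CHAR(9) REFERENCES Customer(CustomerID),
    ContentID    CHAR(9) REFERENCES Content(ContentID),
    StreamDate   CHAR(10),
    StreamTime   TIME,
    StreamLength TIME,
    StreamRating NUMERIC
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(SCHEMA)

# A stream that points at a customer who does not exist should be rejected
# by the foreign key on Streams.CustomerID (the IDs here are made up).
try:
    conn.execute(
        "INSERT INTO Streams (StreamID, CustomerID, ContentID) "
        "VALUES ('S00000001', 'C00000001', 'M00000001')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("Orphan stream rejected:", orphan_rejected)
```

Declaring the two foreign keys on Streams is what ties each stream to exactly one customer and one piece of content, as described in Section II.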
IV. Queries
SELECT *
FROM Content;
This query returns all columns and all rows of the Content table in the database. It is
useful when one needs to see all of the information stored for each piece of content.
SELECT CustomerName, CustomerPhoneNumber, CustomerEmail
FROM Customer;
This query returns the name, phone number, and email for each customer in the database. The
purpose of this query is to return customer contact information. If Netflix were running a new
promotional campaign and needed to get in touch with all of its customers by phone and/or email,
this query would provide the information it needs.
SELECT Customer.CustomerID, Customer.CustomerName,
Customer.CustomerEmail, Streams.StreamDate, Streams.StreamRating
FROM Customer, Streams
WHERE Customer.CustomerID = Streams.CustomerID;
This query uses an inner join to match a customer’s name and email with the date of each of their
streams and the rating they gave the stream. The information provided with this query paints a
picture of how ratings have changed over time for each customer. If a customer’s ratings are
getting lower, this query also gives an email, so Netflix can get in touch with them and
recommend some new content.
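The same inner join can also be written with explicit JOIN ... ON syntax. The sketch below runs it in an in-memory SQLite database through Python's sqlite3 module. The tables are trimmed to the columns the query needs, the sample rows are made up, and the ORDER BY clause plus the YYYY-MM-DD date format are assumptions added so that each customer's ratings read in chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down tables with made-up sample rows, for illustration only.
conn.executescript("""
CREATE TABLE Customer (
    CustomerID CHAR(9) PRIMARY KEY, CustomerName CHAR(20), CustomerEmail CHAR(30));
CREATE TABLE Streams (
    StreamID CHAR(9) PRIMARY KEY, CustomerID CHAR(9),
    StreamDate CHAR(10), StreamRating NUMERIC);
INSERT INTO Customer VALUES ('C00000001', 'Ann Lee', 'ann@example.com'),
                            ('C00000002', 'Bo Chen', 'bo@example.com');
INSERT INTO Streams VALUES ('S00000001', 'C00000001', '2016-01-05', 5),
                           ('S00000002', 'C00000001', '2016-02-01', 3),
                           ('S00000003', 'C00000002', '2016-01-20', 4);
""")

# Same matching rule as the comma-style join, written with INNER JOIN ... ON
# and sorted by customer, then by stream date.
rows = conn.execute("""
    SELECT Customer.CustomerID, Customer.CustomerName, Customer.CustomerEmail,
           Streams.StreamDate, Streams.StreamRating
    FROM Customer
    INNER JOIN Streams ON Customer.CustomerID = Streams.CustomerID
    ORDER BY Customer.CustomerID, Streams.StreamDate
""").fetchall()

for row in rows:
    print(row)
```

With the rows sorted this way, a falling rating for one customer shows up as consecutive lines, which is the trend the query is meant to expose.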
SELECT Content.ContentID, Content.Title, Streams.StreamDate, Streams.StreamTime
FROM Content LEFT JOIN Streams
ON Content.ContentID = Streams.ContentID;
This query uses a left outer join to list each title along with its content ID and the date
and time of every stream of that content. This query shows trends in stream time. Because it
is an outer join, content that has never been streamed still appears in the result, with empty
(NULL) StreamDate and StreamTime values, which tells Netflix what content has yet to be
streamed. This is important because Netflix does not want to pay for content that no one is
watching.
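To list only the content that has never been streamed, one option is to keep the left join and filter for rows where no matching stream was found. The following is a minimal sketch with Python's sqlite3 module; the trimmed-down tables and the sample titles are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two pieces of content, only one of which has a stream (made-up data).
conn.executescript("""
CREATE TABLE Content (ContentID CHAR(9) PRIMARY KEY, Title VARCHAR(50));
CREATE TABLE Streams (
    StreamID CHAR(9) PRIMARY KEY, ContentID CHAR(9),
    StreamDate CHAR(10), StreamTime TIME);
INSERT INTO Content VALUES ('M00000001', 'Sample Movie A'),
                           ('M00000002', 'Sample Movie B');
INSERT INTO Streams VALUES ('S00000001', 'M00000001', '2016-01-05', '20:15:00');
""")

# The left join keeps every Content row; rows without a matching stream have
# NULL in all Streams columns, so testing the Streams key isolates them.
unstreamed = conn.execute("""
    SELECT Content.ContentID, Content.Title
    FROM Content
    LEFT JOIN Streams ON Content.ContentID = Streams.ContentID
    WHERE Streams.StreamID IS NULL
""").fetchall()

print(unstreamed)
```

Testing the primary key of Streams for NULL is a safe way to detect the missing match, because a real stream row always has a StreamID.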
SELECT C.CustomerName, SUM(Co.CostPerStream) AS TotalCost
FROM Customer C, Content Co, Streams S
WHERE C.CustomerID = S.CustomerID
AND S.ContentID = Co.ContentID
GROUP BY C.CustomerName
HAVING SUM(Co.CostPerStream) > 1.00;
By joining the Customer, Streams, and Content tables, this query shows managers which
customers are the most costly. It adds up the CostPerStream of every stream a customer has
watched and returns only the customers whose total exceeds 1.00. Once the database is bigger,
Netflix will be able to identify its most costly customers (on a large scale) and decide what to do
with those customers. If a customer is costing Netflix more than he or she is worth, it may not be
practical to keep that customer.
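One caveat with grouping by CustomerName is that names are not guaranteed to be unique, so two different customers who share a name would be added together. The sketch below, using Python's sqlite3 module, contrasts that with grouping by CustomerID. It is an illustrative variation rather than the authors' query, and the customers, costs, and streams are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up data: two different customers share the name 'Sam Park'.
conn.executescript("""
CREATE TABLE Customer (CustomerID CHAR(9) PRIMARY KEY, CustomerName CHAR(20));
CREATE TABLE Content (ContentID CHAR(9) PRIMARY KEY, CostPerStream DECIMAL(10, 2));
CREATE TABLE Streams (StreamID CHAR(9) PRIMARY KEY, CustomerID CHAR(9), ContentID CHAR(9));
INSERT INTO Customer VALUES ('C00000001', 'Sam Park'),
                            ('C00000002', 'Sam Park'),
                            ('C00000003', 'Ann Lee');
INSERT INTO Content VALUES ('M00000001', 0.75), ('M00000002', 0.50);
INSERT INTO Streams VALUES ('S00000001', 'C00000001', 'M00000001'),
                           ('S00000002', 'C00000001', 'M00000002'),
                           ('S00000003', 'C00000002', 'M00000001'),
                           ('S00000004', 'C00000003', 'M00000002');
""")

# Grouping by name only: both 'Sam Park' customers collapse into one row.
by_name = conn.execute("""
    SELECT C.CustomerName, SUM(Co.CostPerStream) AS TotalCost
    FROM Customer C
    JOIN Streams S ON C.CustomerID = S.CustomerID
    JOIN Content Co ON S.ContentID = Co.ContentID
    GROUP BY C.CustomerName
    HAVING SUM(Co.CostPerStream) > 1.00
""").fetchall()

# Grouping by the primary key keeps each customer separate.
by_id = conn.execute("""
    SELECT C.CustomerID, C.CustomerName, SUM(Co.CostPerStream) AS TotalCost
    FROM Customer C
    JOIN Streams S ON C.CustomerID = S.CustomerID
    JOIN Content Co ON S.ContentID = Co.ContentID
    GROUP BY C.CustomerID, C.CustomerName
    HAVING SUM(Co.CostPerStream) > 1.00
    ORDER BY TotalCost DESC
""").fetchall()

print("Grouped by name:", by_name)
print("Grouped by ID:  ", by_id)
```

In this sample, grouping by name reports a single 'Sam Park' with the combined cost of two people, while grouping by CustomerID reports only the one customer who individually exceeds the threshold.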
SELECT Co.Title, Co.Episode, AVG(S.StreamRating) AS AverageRating,
COUNT(S.StreamID) AS TotalStreams
FROM Content Co, Streams S
WHERE Co.ContentID = S.ContentID
GROUP BY Co.Episode, Co.Title;
This query calculates the average rating given to each piece of content, along with the total
number of times it has been streamed. It will show Netflix managers the performance of each
piece of content in the company’s database. Therefore, if one show or movie has an extremely
low rating, the company may want to consider discontinuing that streaming option.
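To go from the full performance report to a short list of candidates for discontinuation, a HAVING clause can filter on the average. The sketch below uses Python's sqlite3 module; the cutoff of 3 is a hypothetical threshold, not a value from the report, and the show, episodes, and ratings are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up data: two episodes of one show with different ratings.
conn.executescript("""
CREATE TABLE Content (ContentID CHAR(9) PRIMARY KEY, Title VARCHAR(50), Episode VARCHAR(10));
CREATE TABLE Streams (StreamID CHAR(9) PRIMARY KEY, ContentID CHAR(9), StreamRating NUMERIC);
INSERT INTO Content VALUES ('M00000001', 'Sample Show', 'S1E1'),
                           ('M00000002', 'Sample Show', 'S1E2');
INSERT INTO Streams VALUES ('S00000001', 'M00000001', 5),
                           ('S00000002', 'M00000001', 4),
                           ('S00000003', 'M00000002', 2),
                           ('S00000004', 'M00000002', 1);
""")

LOW_RATING_CUTOFF = 3  # hypothetical threshold chosen for this example

# Average rating and stream count per piece of content, keeping only the
# content whose average falls below the cutoff.
low_rated = conn.execute("""
    SELECT Co.Title, Co.Episode,
           AVG(S.StreamRating) AS AverageRating,
           COUNT(S.StreamID)   AS TotalStreams
    FROM Content Co
    JOIN Streams S ON Co.ContentID = S.ContentID
    GROUP BY Co.ContentID, Co.Title, Co.Episode
    HAVING AVG(S.StreamRating) < ?
""", (LOW_RATING_CUTOFF,)).fetchall()

print(low_rated)
```

Reading TotalStreams next to AverageRating matters here, since an average based on only one or two streams is weak evidence for dropping a title.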
CREATE VIEW AllStreams AS
SELECT *
FROM customerstreams;
This statement creates a view, AllStreams, on top of a previously mentioned query, referred to
here as customerstreams. The purpose of this view is to store, in one place, information from
multiple tables: customer information and streaming information.
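The report does not show how customerstreams is defined. The sketch below assumes it is the customer and stream inner join from earlier in this section, saved as a view, and builds AllStreams on top of it using Python's sqlite3 module. That definition and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down tables with made-up rows.
conn.executescript("""
CREATE TABLE Customer (
    CustomerID CHAR(9) PRIMARY KEY, CustomerName CHAR(20), CustomerEmail CHAR(30));
CREATE TABLE Streams (
    StreamID CHAR(9) PRIMARY KEY, CustomerID CHAR(9),
    StreamDate CHAR(10), StreamRating NUMERIC);
INSERT INTO Customer VALUES ('C00000001', 'Ann Lee', 'ann@example.com');
INSERT INTO Streams VALUES ('S00000001', 'C00000001', '2016-01-05', 5);

-- ASSUMPTION: customerstreams is the customer/stream inner join, saved as a view.
CREATE VIEW customerstreams AS
    SELECT Customer.CustomerID, Customer.CustomerName, Customer.CustomerEmail,
           Streams.StreamDate, Streams.StreamRating
    FROM Customer
    JOIN Streams ON Customer.CustomerID = Streams.CustomerID;

-- The statement from the report, layered on top of that view.
CREATE VIEW AllStreams AS
    SELECT *
    FROM customerstreams;
""")

# A view stores the query, not the data, so rows added later show up too.
conn.execute("INSERT INTO Streams VALUES ('S00000002', 'C00000001', '2016-02-01', 3)")
all_streams = conn.execute("SELECT * FROM AllStreams ORDER BY StreamDate").fetchall()

for row in all_streams:
    print(row)
```

Because a view is re-evaluated each time it is queried, AllStreams always reflects the current contents of the Customer and Streams tables.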
