Amazon Redshift
~ Ahasan Habib
Technical Project Manager, Ixora Solution
Dhaka, Bangladesh
Data warehouse concept
What is a data warehouse?
● Relational database
● Query & analysis
● Transaction processing data => Historical data
● Transaction workload => Analysis workload
● Extract, Transform, Load
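The Extract, Transform, Load step can be sketched in a few lines of Python. This is only an illustration: the sample transactions, the field names, and the in-memory list standing in for a warehouse table are all assumptions, not part of the original deck.

```python
from collections import defaultdict

# Hypothetical operational records, as an order system might emit them.
transactions = [
    {"order_id": 1, "date": "2016-01-05", "product": "Laptop", "amount": "1200.00"},
    {"order_id": 2, "date": "2016-01-05", "product": "Mouse", "amount": "25.50"},
    {"order_id": 3, "date": "2016-01-06", "product": "Laptop", "amount": "1150.00"},
    {"order_id": 4, "date": "2016-01-05", "product": "Laptop", "amount": "800.00"},
]


def extract(source):
    """Extract: read the raw rows from the operational source."""
    return list(source)


def transform(rows):
    """Transform: cast amounts to numbers and total them per day and product."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["date"], row["product"])] += float(row["amount"])
    return [
        {"date": date, "product": product, "total_sales": total}
        for (date, product), total in sorted(totals.items())
    ]


def load(rows, warehouse):
    """Load: append the summarized rows to the warehouse table."""
    warehouse.extend(rows)
    return len(rows)


warehouse_table = []  # stands in for a historical, analysis-ready table
loaded = load(transform(extract(transactions)), warehouse_table)
```

Four transactional rows become three historical summary rows, which is the shift from transaction workload to analysis workload described above.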
Data warehouse architecture
Big Data
Data sets so large and complex that traditional data processing applications are not adequate
Characteristics:
● Volume
The amount (quantity) of data.
● Velocity
The rate at which data is created.
● Variety
The different types of data.
Big data Architecture
Operational Data
● Transactional data.
● Event data.
● Real-time data
● Helps to run day-to-day system and business operations.
Analytical Data
● Historical data
● Numerical values, measures, metrics (numerical measurements)
● Business intelligence & decision making
Row vs. Columnar Database
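A small Python sketch may help contrast the two layouts. The records and column names are made up for illustration, and real databases store data in disk blocks rather than Python lists; the point is only which values sit next to each other.

```python
# Three hypothetical records: (customer_id, customer_name, net_sales)
records = [
    (1, "Alice", 120.0),
    (2, "Bob", 80.0),
    (3, "Carol", 200.0),
]
column_names = ("customer_id", "customer_name", "net_sales")

# Row-oriented layout: all values of one record are stored together.
row_store = [value for record in records for value in record]

# Column-oriented layout: all values of one column are stored together.
column_store = {
    name: [record[i] for record in records]
    for i, name in enumerate(column_names)
}


def total_from_rows(store, width, col_index):
    """Sum one field by stepping across every stored record."""
    return sum(store[i] for i in range(col_index, len(store), width))


def total_from_columns(store, column):
    """Sum one field by reading a single contiguous column."""
    return sum(store[column])


row_total = total_from_rows(row_store, width=3, col_index=2)
column_total = total_from_columns(column_store, "net_sales")
```

Both totals are the same, but the columnar version touches only the `net_sales` values, which is why analytical queries that aggregate a few columns over many rows tend to favor columnar storage.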
What is Redshift?
● A data warehouse management tool.
● Developed and managed by Amazon.
● Cloud hosted large data management system.
● Distributed data management system.
● Columnar data storage.
Redshift Speciality
1. Extremely fast.
2. Web service API based communication.
3. Massively parallel processing (MPP).
4. Full ANSI SQL support.
5. Columnar database.
6. Easy to learn.
Redshift Product History
● November 2012: beta (preview) release
● February 15, 2013: general availability
● Based on PostgreSQL 8.0.2
Redshift Architecture
Advantages using Redshift
● Extremely fast for analytical data processing.
● Supports ANSI SQL syntax.
● Cloud-based solution.
● Highly secure (in terms of data and system access)
Redshift data warehouse design
1. Star schema
2. Snowflake schema
3. Denormalized Fact Table
Customer Id
Customer Name
Customer Address
State
City
Country
Product Id
Product Name
Product Category
Gross Sales Amount
Net Sales Amount
Index and Constraints
1. Sort Key
2. Distribution Key
3. Primary key / foreign key (informational only; Redshift does not enforce them)
4. Triggers (not supported in Redshift)
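As a rough sketch of how a sort key and a distribution key are declared, the Python helper below assembles a Redshift-style CREATE TABLE statement. The table name `sales_fact`, the column types, and the choice of keys are assumptions made for illustration; the deck does not specify them.

```python
def build_create_table(table, columns, dist_key, sort_keys):
    """Build a CREATE TABLE statement with DISTKEY and SORTKEY clauses.

    columns is a list of (name, sql_type) pairs.
    """
    known = {name for name, _ in columns}
    missing = [key for key in [dist_key, *sort_keys] if key not in known]
    if missing:
        raise ValueError(f"unknown key column(s): {missing}")

    column_sql = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n{column_sql}\n)\n"
        "DISTSTYLE KEY\n"
        f"DISTKEY ({dist_key})\n"
        f"SORTKEY ({', '.join(sort_keys)});"
    )


# Hypothetical subset of the denormalized fact table; types are assumed.
ddl = build_create_table(
    table="sales_fact",
    columns=[
        ("customer_id", "INTEGER NOT NULL"),
        ("customer_name", "VARCHAR(100)"),
        ("product_id", "INTEGER NOT NULL"),
        ("product_category", "VARCHAR(50)"),
        ("gross_sales_amount", "DECIMAL(12,2)"),
        ("net_sales_amount", "DECIMAL(12,2)"),
    ],
    dist_key="customer_id",
    sort_keys=["product_id", "customer_id"],
)
print(ddl)
```

The distribution key decides which node slice a row is stored on, and the sort key decides the on-disk order within each slice; good choices depend on the join and filter patterns of the actual workload.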
Data Types
Data Type          Alias               Description
SMALLINT           INT2                Signed 2-byte integer
INTEGER            INT4                Signed 4-byte integer
BIGINT             INT8                Signed 8-byte integer
DECIMAL            NUMERIC             Selectable precision
REAL               FLOAT4              Single precision (32-bit)
DOUBLE PRECISION   FLOAT8              Double precision (64-bit)
CHAR               CHARACTER, NCHAR    Fixed length (up to 4096)
VARCHAR            NVARCHAR, TEXT      Variable length (up to 65535)
DATE               -                   Calendar date
TIMESTAMP          -                   Date and time (UTC)
BOOLEAN            BOOL                True/False
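The byte widths of the integer types translate directly into value ranges. The short Python sketch below works out those ranges and picks the narrowest type for a value; the helper names are hypothetical and exist only for this example.

```python
def signed_range(num_bytes):
    """Return (min, max) for a signed two's-complement integer of this width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Byte widths as listed in the data type table, narrowest first.
INTEGER_TYPES = {"SMALLINT": 2, "INTEGER": 4, "BIGINT": 8}


def smallest_integer_type(value):
    """Pick the narrowest integer type that can hold the given value."""
    for name, width in INTEGER_TYPES.items():
        low, high = signed_range(width)
        if low <= value <= high:
            return name
    raise OverflowError(f"{value} does not fit in BIGINT")


for name, width in INTEGER_TYPES.items():
    print(name, signed_range(width))
```

Choosing the narrowest type that safely fits the data keeps columns compact, which matters more in a warehouse than in a small transactional table.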
Data Loading
● S3
● COPY command
● Data Pipeline
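Loading from S3 is typically done with a COPY statement. The Python sketch below only assembles such a statement as a string; the bucket, table name, and IAM role ARN are placeholders, and the CSV options shown are one possible choice rather than what the deck used.

```python
def build_copy_command(table, s3_path, iam_role, ignore_header=0):
    """Assemble a Redshift COPY statement for CSV files stored in S3."""
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with 's3://'")

    lines = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"IAM_ROLE '{iam_role}'",
        "FORMAT AS CSV",
    ]
    if ignore_header:
        # Skip header rows at the top of each file.
        lines.append(f"IGNOREHEADER {ignore_header}")
    return "\n".join(lines) + ";"


# All values below are hypothetical placeholders.
copy_sql = build_copy_command(
    table="sales_fact",
    s3_path="s3://example-bucket/sales/2016/",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    ignore_header=1,
)
print(copy_sql)
```

The resulting statement would be run through any SQL client connected to the cluster. COPY reads the files under the S3 prefix in parallel, which is why it is generally preferred over row-by-row INSERT statements for bulk loads.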
Query
● CRUD
● Dynamic query
● Metadata Query
● Query execution Plan
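Metadata queries and execution plans can be fetched from Python through any DB-API style cursor. The two helpers below are a minimal sketch: they assume a cursor that accepts `%s` placeholders (as psycopg2 does) and that is already connected to a cluster, and the function names are hypothetical.

```python
def explain_query(cursor, sql):
    """Return the execution plan of a query as a list of text lines.

    EXPLAIN returns one row per plan step, with the text in the first column.
    """
    cursor.execute("EXPLAIN " + sql)
    return [row[0] for row in cursor.fetchall()]


def describe_table(cursor, table_name):
    """Return column metadata for a table from the PG_TABLE_DEF catalog view.

    Each row holds (column name, type, is distribution key, sort key position).
    """
    cursor.execute(
        'SELECT "column", type, distkey, sortkey '
        "FROM pg_table_def WHERE tablename = %s",
        (table_name,),
    )
    return cursor.fetchall()
```

With a live connection, `describe_table(cursor, "sales_fact")` would list each column with its distribution and sort key settings, and `explain_query(cursor, "SELECT ...")` would show how the cluster plans to run the query. PG_TABLE_DEF only lists tables in schemas that are on the session's search path.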
Other database objects
● Built-in functions
● User-defined functions
● Stored Procedures
● Transactions
Security
● User Management
● Role Management
● Schema Management
Client Development Tools
● Navicat
● SQL Server Management Studio
● Various Drivers:
Linux
Visual Studio
Scala
Python
Q & A