Teradata - Architecture of Teradata
Introduction To Teradata
Teradata Company Highlights
• Founded 1979 – West LA
• First product to market – 1984
• First Terabyte system – 1987
• Acquired by AT&T and
merged with acquired NCR – 1992
• Tri-vested as part of NCR - 1997
• Teradata Corporation – (re)Launched October 1, 2007
– Global Leader in Enterprise Data Warehousing
• EDW/ADW Database Technology
• Analytic Solutions
– Positioned in Gartner’s Leaders Quadrant
in data warehousing since 1999
• Top 10 U.S. publicly-traded software company
– S&P 500 Member
– Listed NYSE: “TDC”
– 2007 - $1.7B revenue
Continuous (R)evolution
Hardware
+ Database
+ Consulting
+ Data models and
reports
+ Analytic applications
Continuous (R)evolution
Sell the HW, give everything else
away
Sell the SW with some HW to
run on
Sell solving business problems – and technology to
solve them
Sell applications with consulting, SW
and HW inside
Continuous (R)evolution
90% R&D
10% integration
80286
70% R&D
30% integration
i486
20% R&D
80% integration
Pentium
10% R&D
90% integration
Xeon Quad Core
Scale
• Every dimension of the technology must scale to meet today’s requirements
– Data, Data model complexity, Users, Performance, queries, Data loading, …
• What is a big Data Warehouse?
• Total spinning disk?
– 2.5 Petabytes
• Big table?
– 150 billion rows
• Number of tables?
– 300,000
• Insert/Update per day?
– 5 billion records
• Identified users?
– 100,000
• Queries per day?
– 5 million
• Data Turnover rate?
– 1TB per 5 seconds
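To make the daily totals above easier to compare, they can be converted into average per-second rates. This is only a back-of-the-envelope sketch: it assumes load is spread evenly over 24 hours (real workloads peak), takes 1 TB as 1000 GB, and the helper name `per_second` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(events_per_day: float) -> float:
    """Average sustained rate implied by a daily total."""
    return events_per_day / SECONDS_PER_DAY


queries_per_sec = per_second(5_000_000)      # 5 million queries per day
changes_per_sec = per_second(5_000_000_000)  # 5 billion inserts/updates per day
turnover_gb_per_sec = 1000 / 5               # 1 TB (taken as 1000 GB) every 5 seconds

print(f"{queries_per_sec:.1f} queries/s")
print(f"{changes_per_sec:,.0f} row changes/s")
print(f"{turnover_gb_per_sec:.0f} GB/s turnover")
```

On these assumptions the system must sustain roughly 58 queries and about 58 thousand row changes every second, around the clock.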
The Problem
10 > 09/2009
Accts. Payable
Accts. Receivable
Invoicing
Sales/Orders
Finance G/L
Customer Support
HR
Payroll
Purchasing
Order Fulfillment
Manufacturing
Inventory …
Marketing
Supply Chain
Finance
Risk Management
Maintenance
Sales
Operations
Inventory
Call Center …
Operational Systems Decision Makers
The EDW Solution
Accts. Payable
Accts. Receivable
Invoicing
Sales/Orders
Finance G/L
Customer Support
HR
Payroll
Purchasing
Order Fulfillment
Manufacturing
Inventory …
Enterprise Data Warehouse (EDW)
Marketing
Supply Chain
Finance
Risk Management
Maintenance
Sales
Operations
Inventory
Call Center …
Operational Systems Decision Makers
Active Enterprise Intelligence™
An Obvious Trend: More Speed, More Users
Strategic Intelligence Operational Intelligence
Enterprise Data Warehouse
BI Tools & reports
Analysis & visualization
Predictive Analytics
EDW Enterprise Integration
Mixed workload management
SOA, BPMS, IDEs
Portals/composite applications
Days
Seconds
Active Enterprise Intelligence™ enabled by an
Active Data Warehouse™
STRATEGIC INTELLIGENCE OPERATIONAL INTELLIGENCE
Business Intelligence
Tools and Applications
Teradata Warehouse
Workflow & Applications
Active Events
Active Access
Suppliers, Customers, Call Center, Logistics, Marketing, Finance, Product/Services, Executive
Active Enterprise Integration
Active
Availability
Active
Workload
Management
Active
Load
Active Enterprise Intelligence™ in Retail
Detecting Retail Fraud
Situation
Thieves make copies of cash register receipts, walk into
the store, pick up merchandise, and return items for
cash.
Problem
Associates in returns department did not have historical
POS receipt retrieval access to verify against previously
“returned” receipts or to do returns without receipts.
Solution
Associates query Teradata to quickly check if a return
has already occurred on that receipt number. Also used
by analysts to understand and prevent excessive
returns.
Impact
(for 500-store chain)
• 100% ROI in 5 months
• Stopped a crime ring on the
first day of rollout
• “Cost savings have been
huge”
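The receipt check described in the Solution can be sketched as a single lookup. This is an illustrative sketch only: it uses Python's built-in SQLite as a stand-in for the Teradata warehouse, and the `returns` table, its columns, and the `already_returned` helper are hypothetical names, not taken from the deck.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE returns (receipt_no TEXT, returned_at TEXT)")
conn.execute("INSERT INTO returns VALUES ('R-1001', '2009-09-01')")


def already_returned(conn: sqlite3.Connection, receipt_no: str) -> bool:
    """Return True if a return was already recorded for this receipt number."""
    row = conn.execute(
        "SELECT 1 FROM returns WHERE receipt_no = ? LIMIT 1",
        (receipt_no,),
    ).fetchone()
    return row is not None


print(already_returned(conn, "R-1001"))  # a copied receipt would be caught here
print(already_returned(conn, "R-2002"))  # no earlier return on record
```

The point of the design is that the associate's query is a short keyed lookup, which is the kind of tactical request an active data warehouse is meant to answer in well under a second.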
Active Enterprise Intelligence™ in Retail
Single View of the Customer Across All Channels
Situation
Needed to add Web channel for selling shoes.
Problem
Too much time and cost to keep multiple customer
systems synchronized. Realized they needed just
one customer database, not one more for the Web,
in addition to Call Center, and POS/Store databases.
Solution
Adopted an ADW strategy, moved all customer data
to one Teradata system, revised data models to
cover all channels, added web channel for
commerce, used web services, added TASM to
handle multiple workload types
Impact
• 1M tactical hits to the
EDW per day from the
POS, Call Center, and
Web with 0.11 sec
response time
• Runs simultaneously
with back-office BI,
reports, and ETL
workloads
• Eliminated all other
customer data systems
What is the Measure of a Great
Architecture?
Handle huge changes of underlying technologies and
dependent components while continuing to deliver the
key value proposition.
Processor Roadmap
CPU power radically increasing
[Figure: process nodes of 90nm, 65nm, 45nm, 32nm, and 22nm across 2003, 2005, 2007, 2009, and 2011, moving from Hyper-Threading to Dual Core to Multi Core. A SPECInt2000 chart spanning 2000, 2004, and 2008+ compares single-core performance with dual/multi-core performance and carries a 5X label.]
What Does Shared Nothing Mean?
• 1985 – Every hardware part, every line of software – “pure” shared
nothing
• 1995 – Multiple units of parallelism sharing CPU, memory
• 2004 – Multiple units of parallelism sharing multiple cores, memory
• 2009 – Multiple units of parallelism sharing same physical spindles
– but still not sharing data
• Future – Multiple units of parallelism in Virtual machines/cloud
not even knowing what physical machine it is on or sharing
Copyright Teradata © 2007-2009
– All rights Reserved
Teradata MPP Server Architecture
• Nodes
– Incrementally scalable to 1024
nodes
• Operating System
– Linux, Windows, Unix
• Storage
– Independent I/O
– Scales per node
• BYNET Interconnect
– Fully scalable bandwidth
• Connectivity
– Fully scalable
– Channel – ESCON/FICON
– LAN, WAN
• Server Management
– One console to view
the entire system
SMP Node1 SMP Node2 SMP Node3 SMP Node4
Server
Management
Dual BYNET Interconnects
CPU1 CPU2
Memory
Operating Sys
CPU1 CPU2
Memory
Operating Sys
CPU1 CPU2
Memory
Operating Sys
CPU1 CPU2
Memory
Operating Sys
Shared Nothing - Dividing the Work
• “Virtual processors” (vprocs) do the work
• Two types
– AMP: owns and operates on the data
– PE: handles SQL and external interaction
• Configure multiple vprocs per hardware node
– Take full advantage of SMP CPU and memory
• Each vproc has many threads of execution
– Many operations executing concurrently
– Each thread can do work for any user, transaction
• Software is equivalent regardless of configuration
– No user changes as system grows from small SMP to huge MPP
Shared Nothing - Dividing the Work
• Basis of Teradata scalability
– Each AMP owns an equal slice of the disk
– Only that AMP reads that slice
• No single point of control for any operation
– I/O, Buffers, Locking, Logging, Dictionary
– Nothing centralized
– Exponential communication costs avoided
[Figure: coordination cost plotted against number of nodes; labels: Logs, Locks, Buffers, I/O, AMPs, Teradata.]
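The division of work between PEs and AMPs can be sketched with a toy model. This is a simplified sketch under stated assumptions, not Teradata's implementation: the class names `Amp` and `ParsingEngine` are invented for illustration, rows are placed round-robin rather than by hash, and the AMPs are called one after another where a real system would run them in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class Amp:
    """Toy AMP: owns a private slice of rows that no other AMP reads."""
    rows: list = field(default_factory=list)

    def partial_sum(self, column: str) -> int:
        # Each AMP scans only its own slice.
        return sum(row[column] for row in self.rows)


class ParsingEngine:
    """Toy PE: accepts a request, fans it out, and merges the partial answers."""

    def __init__(self, amps: list):
        self.amps = amps

    def total(self, column: str) -> int:
        return sum(amp.partial_sum(column) for amp in self.amps)


amps = [Amp() for _ in range(4)]
for i in range(100):
    amps[i % len(amps)].rows.append({"id": i, "amount": 10})

pe = ParsingEngine(amps)
print(pe.total("amount"))
```

Because no AMP ever touches another AMP's rows, adding AMPs adds capacity without adding a shared lock, buffer pool, or log that all of them would have to coordinate on.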
Teradata Data Distribution
• Rows automatically distributed evenly by hash partitioning
– Even distribution results in scalable performance
– Done in real-time as data are loaded, appended, or changed.
– Hash map defined and maintained by the system
• 2**32 hash codes, 64K buckets distributed to AMPs
– Primary Index (PI) column(s) are hashed
– Hash is always the same - for the same values
– No reorgs, repartitioning, space management
[Figure: rows of Table A, Table B, and Table C pass through the Teradata parallel hash function on the Primary Index and are spread across AMP1, AMP2, AMP3, AMP4 ... AMPn. Each stored row is shown as a RowHash (hash bucket) followed by its data fields.]
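The path from a Primary Index value to an AMP can be sketched as follows. This is a minimal sketch with several assumptions: Teradata's real hash function is not given in the deck, so CRC32 is used as a stand-in; taking the high-order 16 bits of the 32-bit hash as the bucket number and assigning buckets to AMPs round-robin are illustrative choices; and all function names are hypothetical.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 65536  # the "64K buckets" from the slide


def row_hash(pi_value: str) -> int:
    """32-bit hash of the Primary Index value (CRC32 is a stand-in)."""
    return zlib.crc32(pi_value.encode("utf-8")) & 0xFFFFFFFF


def bucket_of(hash_code: int) -> int:
    """Assumption: the high-order 16 bits of the hash select the bucket."""
    return hash_code >> 16


def build_hash_map(num_amps: int) -> list:
    """System-maintained map from bucket number to owning AMP (round-robin here)."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for(pi_value: str, hash_map: list) -> int:
    return hash_map[bucket_of(row_hash(pi_value))]


hash_map = build_hash_map(num_amps=4)
counts = Counter(amp_for(f"cust-{i}", hash_map) for i in range(100_000))
print(sorted(counts.items()))
```

The same value always hashes to the same bucket, so a row can be located without any central directory. In this sketch, growing the system would mean reassigning buckets in the map rather than rehashing the data, which is one way to read the "no reorgs, repartitioning" bullet.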
Disk Capacity Exploding
with Little Increase in Performance
Disk drive capacity    Disk drive bandwidth (MB/sec)    Performance per capacity (MB/sec/GB)
36 GB                  5.5                              .155
73 GB                  6.0                              .080
146 GB                 6.4                              .044
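The performance-per-capacity figures can be rechecked by dividing each drive's bandwidth by its capacity. This is a quick sketch: the pairing of bandwidth values with drive sizes is inferred from the chart labels, and the computed ratios come out close to, but not exactly equal to, the values printed on the slide.

```python
# (capacity in GB, bandwidth in MB/sec) as read from the slide
drives = [(36, 5.5), (73, 6.0), (146, 6.4)]

densities = []
for capacity_gb, mb_per_sec in drives:
    density = mb_per_sec / capacity_gb  # MB/sec available per GB stored
    densities.append(density)
    print(f"{capacity_gb:>4} GB: {density:.3f} MB/sec/GB")

# Capacity grew about 4x while bandwidth grew about 16%.
print(f"capacity x{drives[-1][0] / drives[0][0]:.1f}, "
      f"bandwidth x{drives[-1][1] / drives[0][1]:.2f}")
```

Each doubling of capacity roughly halves the I/O available per stored gigabyte, which is why the next slide shifts the focus from CPU cycles and disk space to managing I/O.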
Platform Change
• Focus used to be
– Optimization of expensive CPU cycles
– Micro-management of precious disk space
• Now
– Manage I/O
– Balance CPU power to the I/O capacity
– Find new ways to optimize I/O, trading for CPU use as necessary
– Pulling 2.5GB/sec per node continuous
• Discontinuity coming
– SSDs become price competitive and reliable
File System
• Teradata wrote a new rule book
– Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata
• File system built of raw slices
• Rows stored in blocks
– Variable length
– Grow and shrink on demand
– Rows located dynamically
• May be moved to reclaim space, defrag
– Maximum block size is configurable
• System default or per table
• 8K to 128K
• Change dynamically
• Indexes are just rows in tables
• Has evolved from direct management of single spindles to completely virtualized storage, not even
knowing spindle location
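The idea of variable-length blocks with a configurable maximum size can be sketched as below. This is a hedged illustration, not Teradata's file system: the deck only says that blocks grow and shrink on demand, so splitting an oversized block in half is an assumption, shrinking and merging are not modelled, and `Block` and `insert_row` are hypothetical names.

```python
import bisect

MAX_BLOCK_BYTES = 128 * 1024  # upper end of the 8K to 128K range on the slide


class Block:
    """Variable-length block holding (row_hash, payload) pairs in hash order."""

    def __init__(self):
        self.rows = []

    @property
    def size(self) -> int:
        return sum(len(payload) for _, payload in self.rows)


def insert_row(blocks, row_hash, payload, max_bytes=MAX_BLOCK_BYTES):
    """Insert into the block covering row_hash; split it if it outgrows max_bytes."""
    if not blocks:
        blocks.append(Block())
    first_keys = [b.rows[0][0] if b.rows else 0 for b in blocks]
    idx = max(bisect.bisect_right(first_keys, row_hash) - 1, 0)
    block = blocks[idx]
    bisect.insort(block.rows, (row_hash, payload))
    if block.size > max_bytes and len(block.rows) > 1:
        mid = len(block.rows) // 2
        new_block = Block()
        new_block.rows = block.rows[mid:]  # upper half moves to a new block
        block.rows = block.rows[:mid]
        blocks.insert(idx + 1, new_block)


blocks = []
for i in range(1000):
    insert_row(blocks, (i * 7919) % 100_003, b"x" * 1024, max_bytes=8 * 1024)

print(len(blocks), "blocks, largest is", max(b.size for b in blocks), "bytes")
```

Rows stay in hash order across blocks, so a row can still be found by its row hash even after blocks have been split or moved.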
Workload Management Evolution
• 1984 – pure timeshare
• 1987 – 4 priorities, defined by user
• 1995 – multiple priorities in multiple partitions
• 2000 – weighted workload groups
• 2004 – queuing, reserved resources, focus on tactical work
• 2009 – Visualization and detailed workgroup management
• Future – Set service level goals, our job to deliver
Active Workload Management
• Manage workloads
– Reduce server congestion
• Dynamically adjust
in-flight task priority
– Turn the dial – change priorities
• Fast active access queries
– Performance, performance,
performance
• Get maximum throughput
[Figure: Active Data Warehouse workloads (Active Events, Active Access, Query and Reporting, Active Load), each with a speed dial; dial readings shown: 10, 60, 75, 25.]
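"Turning the dial" on a workload can be pictured as changing its weight in a weighted share of system resources. This is a minimal sketch, assuming a simple proportional-share model: the workload names and weights are hypothetical, and the deck does not say how TASM actually computes priorities.

```python
def resource_shares(weights: dict) -> dict:
    """Split resources among workloads in proportion to their weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical starting weights for three workload groups.
weights = {"tactical": 40, "reporting": 40, "load": 20}
before = resource_shares(weights)

# Turn the dial: raise the tactical weight while work is in flight.
weights["tactical"] = 80
after = resource_shares(weights)

for name in weights:
    print(f"{name:>9}: {before[name]:.2f} -> {after[name]:.2f}")
```

Raising one weight lowers every other workload's share without starving any of them, which matches the goal of keeping tactical queries fast while reports and loads continue to run.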
TASM Reporting/Monitoring - 13.10
Availability Requirements
[Figure: number of users on a log scale from 10 to 1,000,000, on an axis spanning Strategic Intelligence to Operational Intelligence. User groups shown: IT, Finance, Planners, Power Users, Data Miners; Executives, Middle Managers, Marketing; Category Managers, Line Managers, Service Managers; Operational Employees; Consumers, Suppliers, B2B. Availability labels: Mission Critical, Dual Active.]
“Always ON” – An Elusive Challenge
• Unplanned downtime
– Hardware faults
– Software faults
– Hangs
• Planned downtime
– Software upgrade
– Hardware upgrade
– Data center maintenance
• “Disasters”
– Multi-component failures
– Building disasters
– Area disasters
• And optimize resource value to the business
• And avoid hidden costs and surprises
– E.g., major performance variations
• Major opportunity for research – but must be holistic
– Reaches far beyond core database
Real time Operational Actions
Strategic
Intelligence
Operational
Intelligence
1. Customer makes
multi-segment
travel reservation
2. Flight rerouted
causing missed
connections.
“Active”
Enterprise Data
Warehouse
3. What is each customer’s
flying history?
4. How profitable is each
customer?
5. Which customers
experienced delays or
other problems in last 6
months?
WebSphere MQ,
Oracle AQ,
Microsoft MSMQ
6. Customer re-booked
and notified.
7. Airport operations
adjusted
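The flow in steps 1 to 7 can be sketched as an event arriving on a queue, a lookup against warehouse data, and a resulting action. This is an illustrative sketch only: Python's in-process `queue.Queue` stands in for a message broker such as those named on the slide, the `customers` dictionary stands in for warehouse queries, and ranking by recent delays and then profitability is an assumed rule rather than one stated in the deck.

```python
import queue

# Hypothetical warehouse answers to questions 3 to 5, keyed by customer id.
customers = {
    "C1": {"profit": 1200.0, "recent_delays": 2},
    "C2": {"profit": 300.0, "recent_delays": 0},
}

events = queue.Queue()
events.put({"type": "flight_rerouted", "affected": ["C2", "C1"]})


def rebooking_order(affected: list) -> list:
    """Assumed rule: customers with more recent delays, then higher profit, go first."""
    return sorted(
        affected,
        key=lambda cid: (customers[cid]["recent_delays"], customers[cid]["profit"]),
        reverse=True,
    )


notifications = []
while not events.empty():
    event = events.get()
    if event["type"] == "flight_rerouted":
        for cid in rebooking_order(event["affected"]):
            notifications.append(f"rebook {cid}")

print(notifications)
```

The operational event triggers the action, but the decision about whom to rebook first draws on history held in the warehouse, which is the combination the slide calls operational intelligence.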
Real Time Customer Management
Strategic Intelligence Operational Intelligence
1. Customer inserts Total Rewards Card at slot machine.
2. What is the customer’s past spending history in all our casinos?
3. What is a significant loss for this person based on market segment, past and predicted behavior?
4. Is this customer approaching the predicted loss rate for their segment?
5. What offers are available for this customer?
6. Message sent to floor Luck Ambassador with customer offer to prevent additional losses.
“Active” Enterprise Data Warehouse
TIBCO
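The check in step 4 amounts to comparing a running session loss with a per-customer threshold. This is a minimal sketch under stated assumptions: the function name, the 80% alert ratio, and the sample amounts are all hypothetical, and in practice the threshold would come from the warehouse's segment model.

```python
def should_send_offer(session_loss: float, significant_loss: float,
                      alert_ratio: float = 0.8) -> bool:
    """True once the session loss reaches alert_ratio of the significant-loss level."""
    return session_loss >= alert_ratio * significant_loss


# Hypothetical customer whose significant-loss level is 500.
print(should_send_offer(410.0, 500.0))  # approaching the level, so alert the floor
print(should_send_offer(150.0, 500.0))  # well below the level, so no action yet
```

Alerting before the threshold is reached gives the Luck Ambassador time to reach the customer, which is why the slide asks whether the customer is approaching the predicted loss rather than whether it has been exceeded.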
That’s a Wrap!
• Business requires a new level of decision making
– Many more decisions by many more people much faster
– Current representation of the state of the enterprise
• Data Warehouse must evolve to support the requirements of Active
Enterprise Intelligence
• Technology must evolve to deal with the new requirements
– Rich area for research and innovation
– Change view of what data warehouse/BI means
• Teradata driving an aggressive roadmap to meet real business
requirements
For More Information click below link:
Follow Us on:
http://vibranttechnologies.co.in/teradata-classes-in-mumbai.html
Thank You !!!


Teradata - Architecture of Teradata

  • 4. Teradata Company Highlights • Founded 1979 – West LA • First product to market – 1984 • First Terabyte system – 1987 • Acquired by AT&T and merged with acquired NCR – 1992 • Tri-vested as part of NCR - 1997 • Teradata Corporation – (re)Launched October 1, 2007 – Global Leader in Enterprise Data Warehousing • EDW/ADW Database Technology • Analytic Solutions – Positioned in Gartner’s Leaders Quadrant in data warehousing since 1999 • Top 10 U.S. publicly-traded software company – S&P 500 Member – Listed NYSE: “TDC” – 2007 - $1.7B revenue
  • 6. Continuous (R)evolution Hardware + Database + Consulting + Data models and reports + Analytic applications
  • 7. Continuous (R)evolution Sell the HW, give everything else away Sell the SW with some HW to run on Sell solving business problems – and technology to solve them Sell applications with consulting, SW and HW inside
  • 8. Continuous (R)evolution 90% R&D 10% integration 80286 70% R&D 30% integration i486 20% R&D 80% integration Pentium 10% R&D 90% integration Xeon Quad Core
  • 9. Scale • Every dimension of the technology must scale to meet today’s requirements – Data, Data model complexity, Users, Performance, queries, Data loading, … • What is a big Data Warehouse? • Total spinning disk? – 2.5 Petabytes • Big table? – 150 billion rows • Number of tables? – 300,000 • Insert/Update per day? – 5 billion records • Identified users? – 100,000 • Queries per day? – 5 million • Data Turnover rate? – 1TB per 5 seconds
  • 10. The Problem 10 > 09/2009 Accts. Payable Accts. Receivable Invoicing Sales/Orders Finance G/L Customer Support HR Payroll Purchasing Order Fulfillment Manufacturing Inventory … Marketing Supply Chain Finance Risk Management Maintenance Sales Operations Inventory Call Center … Operational Systems Decision Makers
  • 11. The EDW Solution Accts. Payable Accts. Receivable Invoicing Sales/Orders Finance G/L Customer Support HR Payroll Purchasing Order Fulfillment Manufacturing Inventory … EnterpriseEnterprise DataData WarehouseWarehouse (EDW)(EDW) Marketing Supply Chain Finance Risk Management Maintenance Sales Operations Inventory Call Center … Operational Systems Decision Makers
  • 12. Active Enterprise Intelligence™ An Obvious Trend: More Speed, More Users Strategic Intelligence Operational Intelligence Enterprise Data Warehouse BI Tools & reports Analysis & visualization Predictive Analytics EDW Enterprise Integration Mixed workload management SOA, BPMS, IDEs Portals/composite applications Days Seconds
  • 13. Active Enterprise Intelligence™ enabled by an Active Data Warehouse™ STRATEGIC INTELLIGENCEOPERATIONAL INTELLIGENCE Business Intelligence Tools and Applications Teradata Warehouse Workflow & Applications Active EventsActive Access Suppliers Customers Call Center Logistics MarketingFinanceProduct/ Services Executive Active Enterprise Integration Active Availability Active Workload Management Active Load
  • 14. Active Enterprise Intelligence™ in Retail Detecting Retail Fraud Situation Thieves make copies of cash register receipts, walk into the store, pick up merchandise, and return items for cash. Problem Associates in returns department did not have historical POS receipt retrieval access to verify against previously “returned” receipts or to do returns without receipts. Solution Associates query Teradata to quickly check if a return has already occurred on that receipt number. Also used by analysts to understand and prevent excessive returns. Impact (for 500-store chain) • 100% ROI in 5 months • Stopped a crime ring on the first day of rollout • “Cost savings have been huge”
  • 15. Active Enterprise Intelligence™ in Retail Single View of the Customer Across All Channels Situation Needed to add Web channel for selling shoes. Problem Too much time and cost to keep multiple customer systems synchronized. Realized they needed just one customer database, not one more for the Web, in addition to Call Center, and POS/Store databases. Solution Adopted an ADW strategy, moved all customer data to one Teradata system, revised data models to cover all channels, added web channel for commerce, used web services, added TASM to handle multiple workload types Impact • 1M tactical hits to the EDW per day from the POS, Call Center, and Web with 0.11 sec response time • Runs simultaneously with back-office BI, reports, and ETL workloads • Eliminated all other customer data systems
  • 16. What is the Measure of a Great Architecture? Handle huge changes of underlying technologies and dependent components while continuing to deliver the key value proposition.
  • 18. Processor RoadmapCPU power radically increasing 2003 2005 2009 2011 90nm process 45nm process 65nm process 32nm process 22nm process Hyper-Threading Dual Core Multi Core 20002000 2008+2008+ SPECInt2000SPECInt2000 5X5X SINGLE-CORESINGLE-CORE PERFORMANCEPERFORMANCE DUAL/MULTI-CORE PERFORMANCE 2007 20042004
  • 19. What Does Shared Nothing Mean? • 1985 – Every hardware part, every line of software – “pure” shared nothing • 1995 – Multiple units of parallelism sharing CPU, memory • 2004 – Multiple units of parallelism sharing multiple cores, memory • 2009 – Multiple units of parallelism sharing same physical spindles – but still not sharing data • Future – Multiple units of parallelism in Virtual machines/cloud not even knowing what physical machine it is on or sharing 19 > 09/2009 Copyright Teradata © 2007-2009 – All rights Reserved
  • 20. Teradata MPP Server Architecture • Nodes – Incrementally scalable to 1024 nodes • Operating System – Linux, Windows, Unix • Storage – Independent I/O – Scales per node • BYNET Interconnect – Fully scalable bandwidth • Connectivity – Fully scalable – Channel – ESCON/FICON – LAN, WAN • Server Management – One console to view the entire system SMP Node1 SMP Node2 SMP Node3 SMP Node4 Server Management Dual BYNET Interconnects CPU1 CPU2 Memory Operating Sys CPU1 CPU2 Memory Operating Sys CPU1 CPU2 Memory Operating Sys CPU1 CPU2 Memory Operating Sys
  • 21. Shared Nothing - Dividing the Work • “Virtual processors” (vprocs) do the work • Two types – AMP: owns and operates on the data – PE: handles SQL and external interaction • Configure multiple vprocs per hardware node – Take full advantage of SMP CPU and memory • Each vproc has many threads of execution – Many operations executing concurrently – Each thread can do work for any user, transaction • Software is equivalent regardless of configuration – No user changes as system grows from small SMP to huge MPP
  • 22. Shared Nothing - Dividing the Work • Basis of Teradata scalability – Each AMP owns an equal slice of the disk – Only that AMP reads that slice • No single point of control for any operation – I/O, Buffers, Locking, Logging, Dictionary – Nothing centralized – Exponential communication costs avoided AMPsLogs Locks Buffers I/O # Nodes Coordination cost Teradata
  • 23. Teradata Data Distribution • Rows automatically distributed evenly by hash partitioning – Even distribution results in scalable performance – Done in real-time as data are loaded, appended, or changed. – Hash map defined and maintained by the system • 2**32 hash codes, 64K buckets distributed to AMPs – Prime Index (PI) column(s) are hashed – Hash is always the same - for the same values – No reorgs, repartitioning, space management Table A Table B Table C AMP1 AMP2 AMP3 AMP4 ……………………………………………………… AMPn Primary Index Teradata Parallel Hash Function P DM P DM P DM P DM P DM P DM P DM P DM P DM RowHash (Hash Bucket) Data Fields
  • 24. Disk Capacity Exploding with Little Increase in Performance 36 GB 5.5 73 GB 6.0 146 GB 6.4 .044 .080 .155 PerformanceperCapacity MB/Sec/GB DiskDriveBandwidth(MB/Sec) 1 2 3 4 5 6 7 8 Disk Drive Capacity
  • 25. Platform Change • Focus used to be – Optimization of expensive CPU cycles – Micro-management of precious disk space • Now – Manage I/O – Balance CPU power to the I/O capacity – Find new ways to optimize I/O, trading for CPU use as necessary – Pulling 2.5GB/sec per node continuous • Discontinuity coming – SSDs become price competitive and reliable
  • 26. File System • Teradata wrote a new rule book – Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata • File system built of raw slices • Rows stored in blocks – Variable length – Grow and shrink on demand – Rows located dynamically • May be moved to reclaim space, defrag – Maximum block size is configurable • System default or per table • 8K to 128K • Change dynamically • Indexes are just rows in tables • Has evolved from direct management of single spindles to completely virtualized storage, not even knowing spindle location
  • 27. Workload Management Evolution • 1984 – pure timeshare • 1987 – 4 priorities, defined by user • 1995 – multiple priorities in multiple partitions • 2000 – weighted workload groups • 2004 – queuing, reserved resources, focus on tactical work • 2009 – Visualization and detailed workgroup management • Future – Set service level goals, our job to deliver
  • 28. Active Workload Management • Manage workloads – Reduce server congestion • Dynamically adjust in-flight task priority – Turn the dial – change priorities • Fast active access queries – Performance, performance, performance • Get maximum throughput Speed 10 Active Events Active Access Query and ReportingActive Load Active Data Warehouse Speed 60 Speed 75 Speed 25
  • 30. Availability Requirements IT, Finance, Planners, Power Users, Data Miners Executives, Middles Managers, Marketing 1000000 100000 10000 1000 100 10 Consumers Suppliers B2B Operational Employees Category Mgr, Line Managers, Service Managers Users Mission Critical Dual Active Strategic Intelligence Operational Intelligence
  • 31. “Always ON” – An Elusive Challenge • Unplanned downtime – Hardware faults – Software faults – Hangs • Planned downtime – Software upgrade – Hardware upgrade – Data center maintenance • “Disasters” – Multi-component failures – Building disasters – Area disasters • And optimize resource value to the business • And avoid hidden costs and surprises – Eg Major performance variations • Major opportunity for research – but must be holistic – Reaches far beyond core database
  • 32. Real time Operational Actions 1. Customer makes multi-segment travel reservation. 2. Flight rerouted, causing missed connections. 3. What is the customers’ flying history? 4. How profitable is each customer? 5. Which customers experienced delays or other problems in the last 6 months? 6. Customer re-booked and notified. 7. Airport operations adjusted. [Figure labels: Strategic Intelligence; Operational Intelligence; “Active” Enterprise Data Warehouse; WebSphere MQ, Oracle AQ, Microsoft MSMQ]
  • 33. Real Time Customer Management 1. Customer inserts Total Rewards Card at Slot Machine. 2. What is the customer’s past spending history in all our casinos? 3. What is a significant loss for this person based on market segment, past and predicted behavior? 4. Is this customer approaching the predicted loss rate for their segment? 5. What offers are available for this customer? 6. Message sent to floor Luck Ambassador with customer offer to prevent additional losses. [Figure labels: Strategic Intelligence; Operational Intelligence; “Active” Enterprise Data Warehouse; TIBCO]
  • 34. That’s a Wrap! • Business requires a new level of decision making – Many more decisions by many more people much faster – Current representation of the state of the enterprise • Data Warehouse must evolve to support the requirements of Active Enterprise Intelligence • Technology must evolve to deal with the new requirements – Rich area for research and innovation – Change view of what data warehouse/BI means • Teradata driving an aggressive roadmap to meet real business requirements

Editor's Notes

  • #15: Retail fraud is a $16 billion per year problem in the USA alone. With web receipts and better copying capabilities, thieves can make multiple copies of a single receipt and make multiple returns for cash or other merchandise. Or they can bring back shoplifted items and try to exchange them for cash. The problem is that associates in the Returns department often don’t have access to past sales information and can’t easily keep track of returned merchandise. This is especially problematic if the policy is to accept returns without receipts. So the solution is straightforward: hook up the Point of Sale systems so that, within seconds, the Teradata data warehouse is updated with sales, return, exchange, and void data, and provide the Returns department with the entire history of purchases by that customer, so they can ensure that a sold product can only be returned once. <Click> The impact? Huge, according to one Teradata customer who has already built this system. They stopped a crime ring on the first day of their rollout, a group that had defrauded the company of thousands of dollars. They saw a 100% payback on their investment in just 5 months, and continue to reap the benefits of this example use of Active Enterprise Intelligence.
  • #25: (CLICK) In this chart, we have 3 different disk drive sizes, and you can see that per generation, disk drive bandwidth hasn’t increased very much. (CLICK) As disk capacities get larger (36 GB → 73 GB → 146 GB), the performance per capacity ratio (Capacity vs. Disk Bandwidth on the right side of the chart) declines significantly. The key metric on this slide is performance per capacity (MB/sec/GB). Look at this slide! Capacity is doubling, but throughput per gigabyte is diminishing! If you fill all the drives up with data, you will not have enough I/O or bandwidth! Choosing twice as much storage capacity in a configuration, but not increasing the number of physical disks (to keep I/O constant), will result in performance degradation.
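The performance-per-capacity metric in note #25 can be made concrete with a little arithmetic. The sketch below is illustrative only: the three capacities come from the note, but the bandwidth figures are assumed placeholder values (the slide's actual numbers are not in the transcript), chosen merely to reflect "bandwidth hasn't increased very much".

```python
# Illustrative only: capacities are from the note, bandwidths are ASSUMED values.
DRIVES = [
    # (capacity in GB, assumed sustained bandwidth in MB/s)
    (36, 50.0),
    (73, 55.0),
    (146, 60.0),
]


def perf_per_capacity(capacity_gb, bandwidth_mb_s):
    """Return the bandwidth available per gigabyte stored (MB/s/GB)."""
    return bandwidth_mb_s / capacity_gb


def full_scan_seconds(capacity_gb, bandwidth_mb_s):
    """Return the time to read one completely full drive end to end."""
    return capacity_gb * 1024 / bandwidth_mb_s


ratios = [perf_per_capacity(cap, bw) for cap, bw in DRIVES]

for (cap, bw), ratio in zip(DRIVES, ratios):
    scan = full_scan_seconds(cap, bw)
    print(f"{cap:>4} GB at {bw:.0f} MB/s: {ratio:.2f} MB/s/GB, full scan {scan:.0f} s")
```

Under these assumed figures, each doubling of capacity roughly halves the MB/s available per GB and roughly doubles the time to scan a full drive, which is the degradation the note warns about when capacity grows without adding spindles.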
  • #29: Assuming workloads are categorized, this illustration shows “speed limits” which are actually resource limits for each workload. Each workload is allowed to consume a limited amount of resources at any given time to ensure other workloads get their rightful share.
Dynamic Resource Prioritization
Inside every fully utilized active data warehouse, there’s a major turf battle going on. Each job in the database is engaged in an ongoing struggle for more and more resources for its own work, often competing against other diverse activities. In most databases, these me-first conflicts result in short, resource-light queries falling victim to the heavier jobs. Those batch fraud-detection reports and long-running market share analysis queries essentially take ownership of the database and all it has to give.
But Teradata Database lets your specific business needs determine how your precious database resources are divided. Once a definition for equitable sharing of database assets is in place, it automatically controls what percent of the CPU and disk I/O those batch reports and complex queries, as well as those vulnerable short queries, will receive. When there’s a handful of users on the system, Teradata Database spreads available resources out relative to the priorities and assignments that have been made to those particular users, without a single sub-second of CPU being wasted.
Teradata Database has made job scheduling and prioritization of the work a core competency since 1988. And recently, that technology has deepened and matured, offering even more flexibility. Teradata’s Priority Scheduler can be used to ensure that the event-driven work coming from the web is allowed to cut into line to grab the CPU it needs to get that promotion back to the client quickly.
For example, if the tactical query that comes up with that promotion returns an answer in 1 second when running alone in the database, that same query, if armed with a high Teradata Database priority, can maintain a similar turnaround even if multiple complex inventory adjustment queries begin executing at the same time. For the active data warehouse, it will be critical to keep more resource-hungry complex queries from dominating the resources in the system, starving out the shorter tactical work. Teradata’s Dynamic Workload Manager will play a big role in enabling favored work to be as near to real time as it needs to be.
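The idea in note #29 of dividing resources among active workloads by priority, with per-workload "speed limits", can be sketched as a weighted-share calculation. This is a generic illustration and not Teradata's Priority Scheduler algorithm: the function name, the workload names, the weights, and the limits are all hypothetical.

```python
def allocate_cpu(active, weights, limits=None):
    """Split 100% of CPU among active workloads in proportion to weight.

    active  -- names of workloads that currently have work to do
    weights -- relative priority weight per workload (hypothetical values)
    limits  -- optional cap per workload, in percent (the "speed limit")
    """
    limits = limits or {}
    shares = {name: 0.0 for name in active}
    remaining = 100.0
    open_set = set(active)
    while open_set and remaining > 1e-9:
        total_weight = sum(weights[n] for n in open_set)
        # Find workloads whose proportional offer would reach their cap.
        capped = {
            n for n in open_set
            if shares[n] + remaining * weights[n] / total_weight
            >= limits.get(n, 100.0)
        }
        if not capped:
            # Nobody hits a cap: hand out everything proportionally.
            for n in open_set:
                shares[n] += remaining * weights[n] / total_weight
            remaining = 0.0
        else:
            # Fill capped workloads to their limit, then redistribute the rest.
            for n in capped:
                give = limits.get(n, 100.0) - shares[n]
                shares[n] += give
                remaining -= give
            open_set -= capped
    return shares


weights = {"tactical": 3, "complex": 1}
both = allocate_cpu(["tactical", "complex"], weights)
alone = allocate_cpu(["tactical"], weights)
limited = allocate_cpu(["tactical", "complex"], weights, {"complex": 10.0})
print(both, alone, limited)
```

In this sketch only workloads that are actually active receive a share, so nothing is held back for idle ones, and a capped workload's unused entitlement flows to the others. That mirrors the note's description at a high level, but the real scheduler's mechanics are not described in the source.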
  • #31: While no two-dimensional drawing can accurately portray such complex issues, this graphic frames the discussion around when to move to mission critical and dual active solutions. In general, the type of users often correlates with the population of users. For example, we know that the consumer population for many industries can mean tens of thousands to millions of possible users via the internet. Similarly, for some industries, the population of supplier employees who access your data warehouse can be enormous, maybe not always in concurrent users but certainly in potential users. At the other end of the spectrum, planning, analysis, and power users tend to be a small community, albeit an influential one. In the middle of the graphic we see overlaps of many kinds because line managers (category managers, sales managers, service managers, etc.) often bounce between strategic decisions and operational decisions, with probably more time spent on operational tasks. Business critical is not a well-defined term in our industry. It tends to mean anything less than mission critical. These users can often tolerate downtime, from a few hours to perhaps even an entire day. But many data warehouse sites have become so dependent on the EDW that they have “hardened” the server, software, and procedures to a mission critical level. This means the executives realize that so many decisions are made daily based on BI tool reporting that they are willing to fund the project to increase system availability. Mission critical can begin in the EDW and certainly extends all the way to the end of the graphic. These clients understand that large populations of front line users will demand 24x7 data availability. With operational employees you MIGHT be able to tolerate a 10-20 minute outage every month. It depends very much on the business use of the EDW.
As the EDW evolves to larger populations and more operational ACTIVE tasks, outages become increasingly expensive, so additional investments in availability become mandatory. In some cases, an active data warehouse becomes so critical to the operational employee that it becomes necessary to step up to a dual active configuration. This is particularly true in retail, with hundreds of concurrent employees and suppliers using the data, but it may also occur with large call centers or sales staff. Finally, we hope it is obvious that when consumers gain access to the data warehouse, it is typically for eCommerce purchasing. No downtime is tolerated in this case because the loss of revenue cannot be tolerated.
  • #33: Problem: Lack of ability to track customer gaming behavior and comp redemption; no mechanism to communicate or react to specific behaviors and trends. Solution: Player Contact System. When a patron swipes his/her card at a casino, that information is sent to Teradata. The player profile is accessed and it is determined if the casino should make personal contact with that player. This allows Harrah’s to provide real-time offers to customers at each gaming point. It also enables Harrah’s to track the redemption of any comp provided to a guest as the comp is redeemed or partially redeemed, so that they do not “over-comp” guests. Future: “Marketing At The Slots” initiative. This implementation has a BusinessWorks process receiving inbound card-swipes from the Slot Data System and building an EDW query. It then makes a Request/Reply call to Teradata to solicit and compile an XML message, which is then published back out on the TIB for consumption by other applications. This will drive CRM to a new “real-time” level, allowing interaction with the customer while they are gaming.