C02- Perfect Trio: Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAA
Mehmet Cuneyt Goksu, mehmet.goksu@ibm.com
IDAA & Db2 Tools Lab Advocate
IBM Germany R&D, Böblingen
Agenda
• Data archiving requirements and challenges
• Data archiving solutions for z/OS systems
• Temporal Tables & History Generation
• Transparent Archiving & History Generation
• Overview of IDAA Technology
• Combining solutions for different use cases
Why retain data for long periods of time?
• Sometimes due to legal requirements
• Sometimes in support of customer service ("We need to repair your 2005 vehicle")
• Sometimes for analytics purposes ("If we analyze more data, we'll get more valuable insight…")
Data retention's impact: application performance
• For DB2 tables with a non-continuously-ascending clustering key (new rows get inserted throughout the table), data retention can increase the CPU cost of data access
• The most recently inserted rows are often the most frequently accessed, but sets of such rows will be separated by ever-larger numbers of "old and cold" rows
• Result: more and more DB2 GETPAGEs are required to retrieve the same result sets, and more GETPAGEs means more CPU
• Even for a DB2 table with a continuously-ascending clustering key (so newer rows are concentrated at the "end" of the table), growth means larger indexes, and that means more CPU
• A larger index has more levels, leading to more GETPAGEs
• DB2 utilities that process indexes (such as REORG and RUNSTATS) may become more expensive to run
Data retention's impact: data storage costs
• Storing years of historical data on the high-end disk subsystems typically used with z Systems can cost a lot of $$$
• A cost-reducing alternative – storing historical data offline, on tape – has its own problems
• No dynamic query access – data requested for analysis might be restored to disk overnight and become available the next day
• Even then, it is likely that only a subset of the data on tape would be restored at any given time
• Is there a better way? Yes – several of them!
[Diagram: a typical tiered archiving landscape. The production database holds current data (1-2 years) and active historical data (3-4 years). Archive definitions drive archive and restore processes against an archive database of compressed archives – the online archive (5-6 years). Older data (7+ years) moves to an offline archive, either on an offline retention platform (CD, tape, optical) or on a non-DBMS retention platform (ATA file server, EMC Centera, IBM RS550, HDS), again as compressed archives.]
DB2 Temporal Tables – Time Travel Query
• One of the major improvements since DB2 10
• The ability for the database to reduce the complexity and amount of coding needed to implement "versioned" data – data that has different values at different points in time
• Data that you need to keep a record of for any given point in time
• Data that you may need to look at for past, current or future situations
• The ability to support history or auditing queries
• Business Time & System Time
Temporal Concepts
• Business Time (Effective Dates, Valid Time, From/To-dates)
– Every row has a pair of TIMESTAMP or DATE columns set by Application
• Begin time : when the business deems the row valid
• End Time : when the business deems row validity ends
– Constraint created to ensure Begin time < End time
– Query at current, any prior, or future point/period in business time
• System Time (Assertion Dates, Knowledge Dates, Transaction Time, Audit Time, In/Out-dates)
– Every row has a pair of TIMESTAMP columns set by DBMS
• Begin time : when the row was inserted in the DBMS
• End Time : when the row was modified/deleted
– Every base row has a Transaction Start ID timestamp
– Query at current or any prior point/period in system time
• Start times are inclusive; end times are exclusive
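To make the business-time half concrete, here is a minimal, hedged DDL and query sketch (the table and column names POLICY, BUS_START and BUS_END are hypothetical, not from the slides):

-- Application-period temporal table (ATT): the application maintains the period columns
CREATE TABLE POLICY
  (POLICY_ID CHAR(10) NOT NULL,
   COVERAGE  INTEGER,
   BUS_START DATE NOT NULL,
   BUS_END   DATE NOT NULL,
   PERIOD BUSINESS_TIME (BUS_START, BUS_END),               -- implicit check: BUS_START < BUS_END
   PRIMARY KEY (POLICY_ID, BUSINESS_TIME WITHOUT OVERLAPS));

-- Query the coverage that was (or will be) valid on a given date
SELECT COVERAGE
  FROM POLICY
  FOR BUSINESS_TIME AS OF DATE('2020-06-01')
  WHERE POLICY_ID = 'A123';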
DB2 Temporal Tables - History Generation
• Concept of period (SYSTEM_TIME and BUSINESS_TIME periods)
• A period is represented by a pair of datetime columns in DB2 relations, one column stores start
time, the other one stores end time
• SYSTEM_TIME period captures DB2’s creation and deletion of records. DB2 SYSTEM_TIME
versioning automatically keeps historical versions of records
• BUSINESS_TIME period allows users to create their own valid period for a given record. Users
maintain the valid times for a record.
• Temporal tables: System-period Temporal Table (STT), Application-period Temporal Table (ATT)
• Business value
• It helps meet compliance requirements
• It typically performs better than application-coded versioning
• It is easier to manage than home-grown solutions
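A minimal DDL sketch of the STT pattern just described (the table name ACCOUNT and the column names/types are assumptions for illustration): create the base table with a SYSTEM_TIME period, create a logically identical history table, then link them with ADD VERSIONING.

CREATE TABLE ACCOUNT
  (ACCOUNT_ID  INTEGER NOT NULL,
   BALANCE     DECIMAL(15,2),
   SYS_START   TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,   -- set by DB2
   SYS_END     TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,     -- set by DB2
   TRANS_START TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID,
   PERIOD SYSTEM_TIME (SYS_START, SYS_END));

CREATE TABLE ACCOUNT_HIST LIKE ACCOUNT;

ALTER TABLE ACCOUNT
  ADD VERSIONING USE HISTORY TABLE ACCOUNT_HIST;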
Row Maintenance with System Time – History Generation
* T1: INSERT Row A
* T2: UPDATE Row A
* T3: UPDATE Row A
* T4: DELETE Row A
* T5: INSERT Row A
Base Table / History Table states over time (HV = high values):
– After T1 (INSERT): Base = Row A1: T1-HV; History = (empty)
– After T2 (UPDATE): Base = Row A2: T2-HV; History = Row A1: T1-T2
– After T3 (UPDATE): Base = Row A3: T3-HV; History = Row A1: T1-T2, Row A2: T2-T3
– After T4 (DELETE): Base = (empty); History = Row A1: T1-T2, Row A2: T2-T3, Row A3: T3-T4
– After T5 (INSERT): Base = Row A4: T5-HV; History = Row A1: T1-T2, Row A2: T2-T3, Row A3: T3-T4
* Notes:
– INSERT has no History Table impact
– The first UPDATE begins a lineage for Row A
• History Table ST End = Base Table ST Begin (no gap)
• The Base Table ST End is always high values (HV)
– The second UPDATE deepens the lineage
• No gaps exist across all generations of Row A
– The DELETE adds to the lineage in the History Table
• There is no current row (Base Table) after the DELETE
– The second INSERT begins a new row lineage
• There is a gap between the History Table rows and the Base Table row
– If all of the above statements happened in the same UOW, there would be no History Table rows
DB2 Temporal Tables - History Generation
[Diagram: the current table holds the latest version of each row, while history generation moves prior versions (e.g., Jul 2008, Aug 2008, Sep 2008 audit data) into the history table. SQL using current data reads the current table; SQL using AS OF gets transparent/automatic access to the history table to satisfy the query. The history table contains a version for every update of a single row.]
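For example (a sketch reusing the hypothetical ACCOUNT table from the earlier DDL sketch), the same SELECT can read current data or be pointed at a prior point in time; DB2 decides transparently whether the history table is needed:

-- current data: reads only the base table
SELECT ACCOUNT_ID, BALANCE
  FROM ACCOUNT;

-- "as of" August 15, 2008: DB2 transparently includes the history table as needed
SELECT ACCOUNT_ID, BALANCE
  FROM ACCOUNT
  FOR SYSTEM_TIME AS OF TIMESTAMP('2008-08-15-00.00.00');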
Temporal auditing
• Track which SQL operation caused a modification
− Also: who modified the data
− Usage is not restricted to DB2 temporal
• GENERATED ALWAYS AS ... can also be defined for non-temporal tables
• Example table BANK_ACC_STT with columns ACCOUNT_ID, BALANCE, USER, OP_CODE, SYS_START, SYS_END:
− USER is GENERATED ALWAYS AS (SESSION_USER); other special registers such as CURRENT CLIENT_USERID, CURRENT SQLID or CURRENT CLIENT_ACCTNG can be used as well
− OP_CODE is CHAR(1) GENERATED ALWAYS AS (DATA CHANGE OPERATION)
Temporal auditing - example
• User JOE inserts an entry for ACCOUNT_ID 56789

BANK_ACC_STT
ACCOUNT_ID  BALANCE  USER   OP_CODE  SYS_START   SYS_END
56789       1234.56  JOE    I        2017-01-19  9999-12-30

• User DON updates this record

BANK_ACC_STT
ACCOUNT_ID  BALANCE  USER   OP_CODE  SYS_START   SYS_END
56789       88.77    DON    U        2017-01-21  9999-12-30

BANK_ACC_HIST
ACCOUNT_ID  BALANCE  USER   OP_CODE  SYS_START   SYS_END
56789       1234.56  JOE    I        2017-01-19  2017-01-21

• User LAURA deletes this record

BANK_ACC_STT
ACCOUNT_ID  BALANCE  USER   OP_CODE  SYS_START   SYS_END
(no current row)

BANK_ACC_HIST
ACCOUNT_ID  BALANCE  USER   OP_CODE  SYS_START   SYS_END
56789       1234.56  JOE    I        2017-01-19  2017-01-21
56789       88.77    DON    U        2017-01-21  2017-02-15
56789       88.77    LAURA  D        2017-02-15  2017-02-15 *

* The extra 'D' row requires ON DELETE ADD EXTRA ROW in the temporal DDL
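A hedged sketch of the DDL pieces behind this example (the column name AUDIT_USER and the data types are assumptions; the slide labels the first column simply USER): the auditing columns are part of the BANK_ACC_STT definition, and ON DELETE ADD EXTRA ROW is what makes DB2 write the extra LAURA/'D' row into the history table.

-- inside CREATE TABLE BANK_ACC_STT (...):
   AUDIT_USER VARCHAR(128) GENERATED ALWAYS AS (SESSION_USER),           -- who changed the row
   OP_CODE    CHAR(1)      GENERATED ALWAYS AS (DATA CHANGE OPERATION),  -- 'I', 'U' or 'D'

-- link base and history table; record an extra row on DELETE
ALTER TABLE BANK_ACC_STT
  ADD VERSIONING USE HISTORY TABLE BANK_ACC_HIST
  ON DELETE ADD EXTRA ROW;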
System Time Temporal Query Routing with DB2 12 and IDAA
• Both active and history tables with TIMESTAMP(12) columns can be loaded to the Accelerator
• A special query rewrite is applied for the following 3 temporal SQL clauses:
• FOR SYSTEM_TIME AS OF expr
• FOR SYSTEM_TIME FROM expr1 TO expr2
• FOR SYSTEM_TIME BETWEEN expr1 AND expr2
• Queries on system temporal tables are routed to the Accelerator when ZPARM QUERY_ACCEL_OPTIONS is set to 5
5: allows accelerated queries to run against STT and bi-temporal tables
6: queries that reference timestamp columns with precision 12 are also offloaded
• All existing offloading criteria have to be met
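A sketch of what such an offload-eligible temporal query might look like (table name taken from the auditing example above; actual routing also depends on the acceleration settings and the usual offload criteria):

SET CURRENT QUERY ACCELERATION = ENABLE;

SELECT ACCOUNT_ID, BALANCE
  FROM BANK_ACC_STT
  FOR SYSTEM_TIME BETWEEN TIMESTAMP('2017-01-01-00.00.00')
                      AND TIMESTAMP('2017-12-31-00.00.00');

-- With QUERY_ACCEL_OPTIONS 5 (and 6 for TIMESTAMP(12) columns) in effect, DB2 rewrites
-- this against the accelerated base and history tables and routes it to the Accelerator.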
Can system-time temporal be a form of archiving?
• Yes – it is a "historical" data retention option
• With system-time temporal, you are retaining data that was once, but is no longer, in effect
• The needs of the business determine which data retention approach is appropriate for a given situation
• When data previously inserted in a table is changed (updated or deleted), is there a need to retain a "before" image of the changed row, along with the "from" and "to" times of the row's "in effect" period?
• That's what system-time temporal is for – it lets you see data that WAS current at some prior point in time
Why DB2 Archive Transparency
• Querying and managing tables that contain a large amount of data is a common problem
• Maintaining the performance of a large table is another pain point ("to index or not?", poor application performance)
• One known solution is to archive the inactive/cold data to a different environment
• Challenges lie in ease of use and performance
• How to access both current and archived data within a single query
• How to make data archiving and access "transparent" with minimal application changes
DB2-managed data archiving – how it's done
1. DBA creates a table (e.g., T1_AR) to be used as the archive for table T1
2. DBA tells DB2 to enable archiving for T1, using archive table T1_AR:
ALTER TABLE T1 ENABLE ARCHIVE USE T1_AR;
3. Program deletes to-be-archived rows from T1
• If the program sets the DB2 global variable SYSIBMADM.MOVE_TO_ARCHIVE to 'Y', all it has to do is delete from T1 – DB2 will move the deleted rows to T1_AR
• The value of a global variable affects only the DB2 thread for which it was set
4. Bind packages appropriately (the bind option affects static and dynamic SQL)
• If a program will ALWAYS access ONLY the base table, it should be bound with ARCHIVESENSITIVE(NO)
• If a program will SOMETIMES or ALWAYS access rows in the base table and the associated archive table, it should be bound with ARCHIVESENSITIVE(YES)
• If a program sets the DB2 global variable SYSIBMADM.GET_ARCHIVE to 'Y' and issues a SELECT against the base table, DB2 will automatically drive that SELECT against the associated archive table, too, and will merge the results with UNION ALL
• So, with DB2-managed archiving, a program can retrieve data from an archive table without having to reference the archive table
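Pulling steps 1-4 together, a minimal sketch (T1 is the slide's example table; T1_AR, the ORDER_DATE predicate and the LIKE shortcut are illustrative assumptions):

-- 1-2. DBA: create the archive table and enable archiving
CREATE TABLE T1_AR LIKE T1;
ALTER TABLE T1 ENABLE ARCHIVE USE T1_AR;

-- 3. Archiving program: deleted rows are moved to T1_AR instead of being discarded
SET SYSIBMADM.MOVE_TO_ARCHIVE = 'Y';
DELETE FROM T1 WHERE ORDER_DATE < CURRENT DATE - 1 YEAR;

-- 4. Reporting program (bound with ARCHIVESENSITIVE(YES)):
SET SYSIBMADM.GET_ARCHIVE = 'Y';
SELECT ... FROM T1 WHERE ...;   -- DB2 adds a UNION ALL against T1_AR automatically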
DB2-managed data archiving
• NOT the same thing as system-time temporal data
• When versioning (system time) is activated for a table, the "before" images of rows made "non-current" by update or delete are inserted into an associated history table
• With DB2-managed archiving, rows in an archive table are current in terms of validity – they are just older than rows in the associated base table (if row age is the archive criterion)
• When most access is to rows recently inserted into a table, moving older rows to an archive table can improve performance for newer-row retrieval
• Particularly useful when data is clustered by a non-continuously-ascending key
• DB2 users have been doing this for years – DB2 just makes it easier
[Diagram: before DB2-managed data archiving, newer, more "popular" rows and older, less frequently retrieved rows share the base table; after, the older rows live in the archive table and the base table holds only the newer rows]
DB2 Archive Transparency - History Generation
[Diagram: the archive-enabled table holds current data; rows are moved into the archive table at DELETE (with MOVE_TO_ARCHIVE = 'Y' | 'E') or via REORG DISCARD, and the archive accumulates older periods (e.g., Jul 2008, Aug 2008, Sep 2008). SQL against current data reads only the archive-enabled table; with GET_ARCHIVE = 'Y', SQL is transparently/automatically extended to the archive table to satisfy the query.]
DB2 Transparent archiving – what's new
• Transparent archiving introduced with DB2 11
− Enables archiving of deleted rows in separate tables
− Similar to temporal / SYSTEM TIME
• New with DB2 12: a new ZPARM to specify the default value for the MOVE_TO_ARCHIVE global variable
− retrofitted to DB2 11 with APAR PI56767
• New with DB2 12: a row change timestamp column can be part of the partitioning key
− can facilitate archiving of the archive table to the DB2 Analytics Accelerator (on a partition basis)
− retrofitted to DB2 11 with APAR PI63830
• AND: optimizer improvements in DB2 12 (e.g. UNION ALL) with positive impact on transparent archiving and temporal tables
DB2: temporal (system time) versus archive
• System-time temporal support and DB2-managed archiving cannot be activated for the same table – use one or the other
• Key differences:
• System-time temporal
• Implemented with a base table and an associated history table
• Rows in the history table are NOT current – they are the "before" images of rows that were made non-current by DELETE or UPDATE operations targeting the base table
• DB2-managed archiving
• Implemented with a base table and an associated archive table
• Rows in the archive table ARE current – they are just older than the rows in the base table (assuming that age is the archive criterion)
Query execution process flow
[Diagram: an application connects through the application interface (DRDA requestor) to DB2 for z/OS; a heartbeat supplies availability and performance indicators from the Db2 Analytics Accelerator. The DB2 optimizer decides, query by query, whether to use the local query execution run-time (for queries that cannot be or should not be routed to the Accelerator) or to execute the query with the Accelerator.]
Introducing the Accelerator-only table type in DB2 for z/OS
Creation (DDL) and access remain through DB2 for z/OS in all cases:
• Non-accelerator DB2 table – data in DB2 only
• Accelerator-shadow table – data in DB2 and the Accelerator
• Accelerator-archived table / partition – empty read-only partition in DB2; partition data is in the Accelerator only
• Accelerator-only table (AOT) – "proxy table" in DB2; data is in the Accelerator only
Combining two solutions - DB2-managed archiving and IDAA
1. A base table and its associated archive table can be selected for acceleration (so both tables will exist on both the front-end DB2 for z/OS system and the back-end Analytics Accelerator)
2. The archive table can be partitioned, regardless of whether or not the base table is partitioned (base and associated archive table only have to be logically – not physically – identical)
3. If the archive table is partitioned on a date basis (which could require adding a timestamp column to the base and archive tables), and if older rows are not updated, the High-Performance Storage Saver can be utilized
• In that case, the large majority of the archive table's data would physically exist only on the Analytics Accelerator
• The timestamp column, if added to the base and archive tables to facilitate date-based partitioning of the archive table, can be defined as GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP
• DB2 will generate a value for this column when a row is moved from the base table to the archive table
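A sketch of that column definition (the column name ROW_CHG_TS is hypothetical; in practice the column would be added to both the base and the archive table, observing any ALTER restrictions that apply to archive-enabled tables):

ALTER TABLE T1
  ADD COLUMN ROW_CHG_TS TIMESTAMP NOT NULL
      GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP;

ALTER TABLE T1_AR
  ADD COLUMN ROW_CHG_TS TIMESTAMP NOT NULL
      GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP;

-- T1_AR can then be range-partitioned on ROW_CHG_TS; DB2 generates a value for the column
-- when a row is moved from T1 to T1_AR, so older partitions can be archived to the
-- Accelerator with the High-Performance Storage Saver.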
The archiving combination, in a picture
[Diagram: on the front-end DB2 system, base table T1 holds the most recent 3 months of data and archive table T1_AR is partitioned by week (Week n, n-1, …, n-5). On the DB2 Analytics Accelerator, "accelerated" table T1 mirrors the base table and "accelerated" table T1_AR holds all weekly partitions. "Trickle-feed" replication keeps the "accelerated" tables within 1-2 minutes of currency. Older archive partitions exist only logically on the front-end DB2 – their data physically resides only on the Accelerator. In this example, the base table holds 3 months of data and the archive table is partitioned by week.]
Combining History in DB2 and on the Accelerator
• Both the active | archive-enabled table and its history | archive table need to be accelerated in order to route SQL to IDAA
[Diagram: in DB2, active tables pair with history tables and archive-enabled tables pair with archive tables; the same pairs are loaded into the Accelerator, so queries (SQL1, SQL2) that touch current data, history/archive data, or both can be routed there.]
Combining solutions for ETL Modernization
• ETL processing pattern
• Move data from the original data source(s) through tools or custom transformation programs to the target DW/DM
• Typically, data is stored several times in intermediate staging areas
• Myth: the main purposes of ETL are
• To make data consumable for end users
• To optimize for performance (star schema)
• Merging and cleansing (making data consistent)
• Reality: the majority of ETL processing is generating history data…
ETL with Accelerator-Only Tables
Accelerator-only tables store temporary results during the reporting process; the CREATE, SELECT and DROP statements are all routed to accelerator ACC1:

CREATE TABLE T1 (...) IN ACCELERATOR ACC1;
INSERT INTO T1
  SELECT ... FROM CUST_TABLE_1 JOIN TRANS_TABLE_1 ...;

CREATE TABLE T2 (...) IN ACCELERATOR ACC1;
INSERT INTO T2
  SELECT ... FROM CUST_TABLE_2 JOIN TRANS_TABLE_2 ...;

SELECT ... FROM T1 JOIN T2 ...;

DROP TABLE T1;
DROP TABLE T2;

[Diagram: the data for analytical processing – credit card transaction history and a customer summary mart – resides on ACC1 in sets of tables CUST_TABLE_x and TRANS_TABLE_x; a reporting application runs a multi-step report against the accelerator-only tables and delivers reports and dashboards]