DB2 Data Movement Utilities: A Comparison
Speaker: Jeyabarathi(JB) Chakrapani
NASCO
Session Code: D09
Wed, May 06, 2015 (08:00 AM - 09:00 AM) : Hancock | Platform: DB2 LUW II
Agenda
• Learn the various tools that are available with DB2 for achieving
efficient data movement within the database environment.
• Offer a brief introduction to each utility, including the DB2
Admin Online Table Move procedure.
• Learn the various enhancements offered in each DB2 version
for each of these utilities.
• Understand how to use the different utilities with examples.
• Learn what it takes to maximize the performance of your choice
of data movement utility, along with useful tips and tricks.
Introduction to DB2 data movement utilities
Load Utility
Export Utility
Import Utility
Ingest Utility
DB2move tool
Restore utility
ADMIN_COPY_SCHEMA
ADMIN_MOVE_TABLE
Split Mirror
IBM replication tools
3
What are the available tools and options for data movement?
LOAD UTILITY
4
Load utility
Required input for Load:
• The path and the name of the input file, named pipe, or
device.
• The name or alias of the target table.
• The format of the input source: DEL, ASC, PC/IXF, or
CURSOR.
• Whether the input data is to be appended to the table or
is to replace the existing data in the table.
• A message file name, if the utility is invoked through the
application programming interface (API), db2Load.
5
Load
Load phases:
• Load
• Build
• Delete
• Index Copy
Load modes:
• Insert
• Replace
• Restart
• Terminate
6
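A minimal sketch of how the four load modes appear on the command line (table and file names are illustrative, and the exact clause order can vary by DB2 version):

```sql
-- INSERT: append the input data to the existing rows
LOAD FROM input.del OF DEL MESSAGES load.msg INSERT INTO myschema.table1;

-- REPLACE: delete the existing rows, then load the input data
LOAD FROM input.del OF DEL MESSAGES load.msg REPLACE INTO myschema.table1;

-- RESTART: resume a load that was interrupted
LOAD FROM input.del OF DEL MESSAGES load.msg RESTART INTO myschema.table1;

-- TERMINATE: roll back a failed load and clear the pending state
LOAD FROM input.del OF DEL MESSAGES load.msg TERMINATE INTO myschema.table1;
```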
Load options include:
• If the load utility is invoked from a remotely connected client,
the data file must be on the client. XML and LOB data are
always read from the server, even if you specify the CLIENT
option.
• The method to use for loading the data: column location,
column name, or relative column position.
• How often the utility is to establish consistency points.
• The names of the table columns into which the data is to be
inserted.
• Whether preexisting data in the table can be queried
while the load operation is in progress.
• Whether the load operation should wait for other utilities or
applications to finish using the table, or force the other
applications off before proceeding.
(Option categories: Client options, Method, Consistency points, Access level,
Paths, Table space, Statistics, Recovery, COPY NO/YES)
7
Load options include:
• An alternate system temporary table space in which to build the
index.
• The paths and the names of the input files in which LOBs are
stored.
• A message file name.
• Whether the utility should modify the amount of free space
available after a table is loaded.
• Whether statistics are to be gathered during the load process.
This option is only supported if the load operation is running in
REPLACE mode.
• Whether to keep a copy of the changes made. This is done to
enable rollforward recovery of the database.
• The fully qualified path to be used when creating temporary files
during a load operation. The name is specified by the TEMPFILES
PATH parameter of the LOAD command.
8
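The options above can be combined in a single invocation. A hedged sketch (the paths, counts, and clause order are illustrative and may need adjusting for your DB2 version):

```sql
LOAD FROM input.del OF DEL
  SAVECOUNT 100000           -- establish a consistency point every 100,000 rows
  MESSAGES load.msg          -- message file name
  TEMPFILES PATH /db/tmp     -- where the utility creates temporary files
  REPLACE INTO myschema.table1
  STATISTICS USE PROFILE     -- gather statistics; only supported in REPLACE mode
  COPY YES TO /db/loadcopy   -- keep a copy of the changes for rollforward recovery
  ALLOW READ ACCESS;         -- preexisting data stays queryable during the load
```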
Load Restrictions:
• Loading data into nicknames is not supported.
• Loading data into typed tables, or tables with
structured type columns, is not supported.
• Loading data into declared temporary tables and
created temporary tables is not supported.
• XML data can only be read from the server side;
if you want to have the XML files read from the
client, use the import utility.
• You cannot create or drop tables in a table space
that is in Backup Pending state.
• If an error occurs during a LOAD REPLACE
operation, the original data in the table is lost.
Retain a copy of the input data to allow the load
operation to be restarted.
• Triggers are not activated on newly loaded rows.
Business rules associated with triggers are not
enforced by the load utility.
• Loading encrypted data is not supported.
(Restriction categories: Nicknames, Structured data types, Temporary tables,
XML support, Backup Pending, Load Replace, Triggers, Data encryption,
Partitioned tables)
9
Load from Cursor…
Examples:
DECLARE mycurs CURSOR FOR SELECT * FROM abc.table1;
LOAD FROM mycurs OF cursor INSERT INTO abc.table2;
DECLARE C1 CURSOR FOR SELECT * FROM customers
WHERE XMLEXISTS('$DOC/customer[income_level=1]');
LOAD FROM C1 OF CURSOR INSERT INTO lvl1_customers;
The ANYORDER file type modifier is supported for
loading XML data into an XML column.
• Loads the results of a query
directly into the target table,
no intermediate export
necessary.
• XML data can be loaded
with the cursor option.
• Nicknames can be
referenced in the SQL query
of the cursor using the
DATABASE option.
• Load from a remote database
using the DATABASE option.
10
Examples:
• Loading from a federated database:
Federation should be enabled and the data source cataloged.
CREATE NICKNAME myschema1.table1 FOR source.abc.table1;
DECLARE mycurs CURSOR FOR SELECT c1,c2,c3 FROM myschema1.table1;
LOAD FROM mycurs OF cursor INSERT INTO abc.table2;
• Loading from a remote database:
The remote database should be cataloged.
DECLARE mycurs CURSOR DATABASE dbsource USER dsciaraf USING mypasswd
FOR SELECT * FROM abc.table1;
LOAD FROM mycurs OF cursor INSERT INTO abc.table2;
11
Checking for integrity violations…
Load puts the table in check pending status when:
• The table has check constraints or RI constraints.
• The table has identity columns and a V7 or earlier client was used to load data.
• The table has descendent immediate staging tables or MQTs referencing it.
• The table is a staging table or MQT.
12
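A table left in check pending status is brought back into use with SET INTEGRITY; a sketch (the exception table name is illustrative):

```sql
-- Validate constraints and clear the check pending state
SET INTEGRITY FOR myschema.table1 IMMEDIATE CHECKED;

-- Or move violating rows into an exception table instead of failing
SET INTEGRITY FOR myschema.table1 IMMEDIATE CHECKED
  FOR EXCEPTION IN myschema.table1 USE myschema.table1_exc;
```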
Load Performance….
CPU_PARALLELISM - specifies the number of threads used by the load utility to
parse, convert, and format data records
DISK_PARALLELISM - specifies the number of processes or threads used by the load
utility to write data records to disk
DATA_BUFFER - total amount of memory, in 4 KB units, allocated to the load utility
as a buffer
NONRECOVERABLE – Does not put the table in backup pending.
SAVE COUNT – Specifies consistency points.
STATISTICS USE PROFILE – Collection of statistics after load
FASTPARSE – Used when data is known to be valid.
NOROWWARNINGS - use this when multiple warnings are expected.
PAGEFREESPACE, INDEXFREESPACE, TOTALFREESPACE – Specify to reduce the need
for reorg
13
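Several of these knobs on one command; a sketch under illustrative values, not a tuned recommendation (clause order may vary by DB2 version):

```sql
LOAD FROM input.del OF DEL
  MODIFIED BY FASTPARSE NOROWWARNINGS  -- input is known valid; suppress row warnings
  MESSAGES load.msg
  REPLACE INTO myschema.table1
  STATISTICS USE PROFILE               -- collect statistics during the load
  NONRECOVERABLE                       -- do not put the table space in backup pending
  DATA BUFFER 8192                     -- buffer size in 4 KB units
  CPU_PARALLELISM 4                    -- threads for parse/convert/format
  DISK_PARALLELISM 4;                  -- threads for writing to disk
```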
EXPORT
14
EXPORT UTILITY
• Required input:
- Pathname for the output file.
- Format of the output file: IXF or DEL.
- Specification of the data to be
extracted, using a SELECT
statement.
• Additional options:
- Subset of columns to be extracted
using the METHOD option.
- XML TO, XMLFILE, XML SAVESCHEMA:
export and store XML data in
different ways.
- The SELECT statement used for
extracting data can be optimized the
same way any SQL query can be
optimized, to improve export
performance.
- The MESSAGES option allows messages
generated by the export utility to be
written to a file.
15
Data extraction using SQL query or XQuery statements
EXPORT UTILITY
16
Examples…
• EXPORT TO table1.ixf OF IXF MESSAGES msg.txt SELECT * FROM
myschema.table1
This simple export command exports all rows to the IXF file.
• EXPORT TO table1export.del OF DEL XML TO /db/xmlpath
XMLFILE xmldocs XMLSAVESCHEMA SELECT * FROM
myschema.table1
• EXPORT TO table1.del OF DEL LOBS TO /db/lob1, /db/lob2/ MODIFIED
BY lobsinfile SELECT * FROM myschema.table1
IMPORT
17
IMPORT
• Required input for Import:
- The path and the name of the
input file.
- The name or alias of the target
table or view.
- The format of the data in the
input file.
- The method by which the data is
to be imported.
- The traverse order, when
importing hierarchical data.
- The subtable list, when importing
typed tables.
• Additional options:
- MODIFIED BY clause.
- ALLOW WRITE ACCESS: Import
acquires a nonexclusive lock.
- ALLOW NO ACCESS: Import
acquires an exclusive lock, waiting
until other work on the table
completes before it can acquire the lock.
- COMMITCOUNT: Commits after
the specified number of rows.
- MESSAGES.
18
Data append/update using SQL query or XQuery statements
Import
• Import support
- Import supports IXF, ASC, and DEL
data formats.
- Used with file type modifiers to
customize the import operation.
- Used to move hierarchical data and
typed tables.
- Import logs all activity, updates
indexes, verifies constraints, and
fires triggers.
- Allows you to specify the names of
the columns within the table or
view into which the data is to be
inserted.
• Import modes
- INSERT: Adds data to the existing
table without changing existing data.
- INSERT_UPDATE: Updates rows with
matching primary key values;
otherwise inserts.
- REPLACE: Deletes existing data and
inserts new data.
- CREATE: Creates the target table and
its index definitions.
- REPLACE_CREATE: Deletes existing
data and inserts new data. If the
target table does not exist, it is
created.
19
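A sketch of the import modes in practice (file, table, and count values are illustrative):

```sql
-- Upsert with periodic commits to keep transaction log usage bounded
IMPORT FROM input.ixf OF IXF
  ALLOW WRITE ACCESS        -- other applications can update the table meanwhile
  COMMITCOUNT 5000          -- commit every 5,000 rows
  MESSAGES import.msg
  INSERT_UPDATE INTO myschema.table1;

-- Re-create a table and its indexes from an IXF file exported with SELECT *
IMPORT FROM input.ixf OF IXF CREATE INTO myschema.table1_copy;
```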
IMPORT
Restrictions
• If the table has a primary key that is
referenced by a foreign key, it can
only be appended to.
• You cannot perform an import replace
operation into an underlying table of a
materialized query table defined in
refresh immediate mode.
• You cannot import data into a system
table, a summary table, or a table with
a structured type column.
• You cannot import data into declared
temporary tables.
• Views cannot be created through the
import utility.
• Cannot import encrypted data.
• Referential constraints and foreign key
definitions are not preserved when
creating tables from PC/IXF files.
(Primary key definitions are preserved if
the data was previously exported by
using SELECT *.)
• Because the import utility generates its
own SQL statements, the maximum
statement size of 2 MB might, in some
cases, be exceeded.
• You cannot re-create a partitioned table
or a multidimensional clustered table
(MDC) by using the CREATE or
REPLACE_CREATE import parameters.
• Cannot re-create tables containing XML
columns.
• Does not honor the NOT LOGGED
INITIALLY clause.
20
IMPORT Restrictions …
Remote import is not
allowed if
• The application and
database code pages are
different.
• The file being imported is a
multiple-part PC/IXF file.
• The method used for
importing the data is either
column name or relative
column position.
• The target column list
provided is longer than 4 KB.
• The LOBS FROM clause or
the lobsinfile modifier is
specified.
• The NULL INDICATORS
clause is specified for ASC
files.
21
IMPORT Performance …
22
• If the workload is mostly insert, consider altering the table to
APPEND ON.
• To avoid a transaction-log-full condition, choose an
appropriate COMMITCOUNT value.
• Enable the DB2_PARALLEL_IO registry variable.
• Review the logbufsz db cfg value and increase it as necessary.
• Review the utility heap (util_heap_sz) db cfg value and increase it as needed.
• Review the num_ioservers and num_iocleaners parameters.
INGEST
23
INGEST
• INGEST characteristics
- Fast: multithreaded design processes data in parallel.
- Available: uses row-level locking, so tables remain
available to concurrent applications.
- Continuous: can continuously ingest data streams from
pipes or files.
- Robust: handles unexpected failures and can be restarted
from the last commit point.
- Flexible and functional: supports different input formats
and target table types, and has rich data manipulation
capabilities.
24
INGEST
Supported Table Types
• multidimensional clustering (MDC)
and insert time clustering (ITC)
tables
• range-partitioned tables
• range-clustered tables (RCT)
• materialized query tables (MQTs)
that are defined as MAINTAINED
BY USER, including summary
tables
• temporal tables
• updatable views (except typed
views)
Supported data formats
• Delimited text
• Positional text and binary
• Columns in various orders and
formats
25
Ingest
[Architecture diagram: multiple input files or pipes feed transporter threads;
transporters write to formatter queues; formatters hash each record by database
partition and write to flusher queues; flushers perform array inserts via SQL
into DB partitions 1 through n.]
Main components:
• Transporter
• Formatter
• Flusher
INGEST
• Transporter:
- Reads from the data source and
writes to the formatter queues.
For INSERT and MERGE
operations, there is one
transporter thread for each
input source. For UPDATE and
DELETE operations, there is
only one transporter thread.
• Formatter:
- Parses each record, converts the
data into the format that DB2
requires, and writes each
formatted record to one of
the flusher queues for that
record's partition.
- The num_formatters
configuration parameter
specifies the number of
formatter threads. The default
is (number of logical CPUs)/2.
27
INGEST
• Flusher:
- The flushers issue the SQL statements that perform the operations
on the DB2 tables. The number of flushers for each partition is
specified by the num_flushers_per_partition configuration
parameter. The default is max(1, ((number of logical
CPUs)/2)/(number of partitions)).
28
INGEST Examples
29
INGEST FROM FILE my_file.del FORMAT DELIMITED INSERT INTO
my_table;
Input records are sent over a named pipe
INGEST FROM PIPE my_pipe FORMAT DELIMITED INSERT INTO
my_table;
Input records delimited by CRLF; fields are delimited by vertical bar
INGEST FROM FILE my_file.del FORMAT DELIMITED '|' INSERT
INTO my_table;
INGEST Examples
30
INGEST FROM FILE input_file.txt
FORMAT DELIMITED
(
$key1 INTEGER EXTERNAL,
$data1 CHAR(8),
$data2 CHAR(32),
$data3 DECIMAL(5,2) EXTERNAL
)
MERGE INTO target_table
ON (key1 = $key1)
WHEN MATCHED THEN
UPDATE SET (data1, data2, data3) = ($data1, $data2,
$data3)
WHEN NOT MATCHED THEN
INSERT VALUES($key1, $data1, $data2, $data3);
INGEST – Examples…
31
Ingest configuration:
connect to mydb user <username> using <password>;
INGEST SET num_flushers_per_partition 1;
INGEST SET NUM_FORMATTERS 12;
INGEST SET shm_max_size 12 GB;
INGEST SET commit_count 20000;
ingest from file
/mydir/file1
FORMAT DELIMITED by ','
RESTART OFF
insert into myschema.tab1;
INGEST – Restart ..
32
Restart information is stored in a separate table
(SYSTOOLS.INGESTRESTART), which is created once.
To create the restart table on DB2 10.1:
CALL SYSPROC.SYSINSTALLOBJECTS('INGEST', 'C',
NULL, NULL);
The table contains counters that keep track of which records
have been ingested.
INGEST - Restart
33
• RESTART CONTINUE: restart a previously failed job
(and clean up the restart data).
• RESTART TERMINATE: clean up the restart data from
a failed job you don't plan to restart.
• RESTART OFF: suppress saving of restart information
(in which case the ingest job is not restartable).
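Restartable jobs pair RESTART NEW with a job ID, then use CONTINUE or TERMINATE under the same ID; a sketch (the job ID 'job001' is illustrative):

```sql
-- Start a restartable job under an explicit job ID
INGEST FROM FILE my_file.del FORMAT DELIMITED
  RESTART NEW 'job001' INSERT INTO my_table;

-- After a failure, resume from the last commit point
INGEST FROM FILE my_file.del FORMAT DELIMITED
  RESTART CONTINUE 'job001' INSERT INTO my_table;

-- Or discard the saved restart data for a job you won't resume
INGEST FROM FILE my_file.del FORMAT DELIMITED
  RESTART TERMINATE 'job001' INSERT INTO my_table;
```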
INGEST – Additional features
34
• Commit by time or number of rows: commit_count or
commit_period configuration parameter.
• Support for copying rejected records to a file or table:
DUMPFILE or EXCEPTION TABLE parameter.
• Support for restart and recovery: retry_count ingest
configuration parameter.
INGEST - Monitoring
35
• INGEST LIST and INGEST GET STATS commands.
• Read information that the utility maintains in shared
memory.
• Must be run in a separate window on the same machine
as the INGEST command.
• Can display detailed information.
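From a second session on the same machine, a sketch (the job ID 4 is illustrative; take it from the INGEST LIST output):

```sql
-- List all ingest jobs currently running on this machine
INGEST LIST;

-- Show statistics for one job, refreshed periodically
INGEST GET STATS FOR 4 EVERY 10 SECONDS;
```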
INGEST and LOAD
• INGEST
• When the table needs to
remain concurrent during load.
• You need only some fields from
the input file to be loaded.
• You need to specify an SQL
statement other than INSERT
• You need to be able to use an
SQL expression (to construct a
column value from field values)
• You need to recover and
continue on when the utility
gets a recoverable error
• LOAD
• Don’t need the table to remain
concurrent.
• XML & LOB data to be loaded.
• Load from cursor or load from a
device
• Input source file is in IXF format
• Load a GENERATED ALWAYS column
or SYSTEM_TIME column from the
input file
• Use SYSPROC.ADMIN_CMD
• Invoke the utility through an API
• Don't want the INSERTs to be logged
36
INGEST - Performance
• Field type and column type
• Define fields to be the same type as their corresponding column types.
• Materialized query tables (MQTs)
• If you are using the ingest utility against a base table of an MQT defined as
refresh immediate, performance can degrade significantly because of the time
required to update the MQT.
• Row size
• Increase the commit_count setting for tables with a smaller row size
and reduce it for tables with a larger row size.
• Other workloads
• If multiple workloads are running alongside the ingest, consider increasing
the locklist database configuration parameter and reducing the
commit_count ingest configuration parameter.
37
Comparison between Import, Load and Ingest
38
Table type: IMPORT / LOAD / INGEST
Created global temporary table: no / no / no
Declared global temporary table: no / no / no
Detached table that has a dependent table where SET INTEGRITY was not run
(detached table has SYSCAT.TABLES.TYPE = 'L'): no (SQL20285N, reason code 1) /
no (SQL20285N, reason code 1) / no
Multidimensional clustering (MDC) table: yes / yes / yes
Materialized query table (MQT) that is maintained by user: yes / yes / yes
Nickname: relational except ODBC / no (SQL2305N) / yes
Range-clustered table (RCT): yes / no / yes
Range-partitioned table: yes / yes / yes
Summary table: no / yes / yes
Typed table: yes / no (SQL3211N) / no
Typed view: yes / no (SQL2305N) / no
Untyped (regular) table: yes / yes / yes
Updatable view: yes / no (SQL2305N) / yes
Comparison to IMPORT and LOAD – Column types
39
Column data type: IMPORT / LOAD / INGEST
Numeric (SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE, DECFLOAT):
yes / yes / yes
Character (CHAR, VARCHAR, NCHAR, NVARCHAR, plus corresponding FOR BIT DATA
types): yes / yes / yes
Graphic (GRAPHIC, VARGRAPHIC): yes / yes / yes
Long types (LONG VARCHAR, LONG VARGRAPHIC): yes / yes / yes
Date/time (DATE, TIME, TIMESTAMP(p)): yes / yes / yes
DB2SECURITYLABEL: yes / yes / yes
LOBs from files (BLOB, CLOB, DBCLOB, NCLOB): yes / yes / no
Inline LOBs: yes / yes / no
XML from files: yes / yes / no
Inline XML: no / no / no
Distinct type (note 1): yes / yes / yes
Structured type: no / no / no
Reference type: yes / yes / yes
Comparison to IMPORT and LOAD
Input Types and Formats
40
Input type: IMPORT / LOAD / INGEST
Cursor: no / yes / no
Device: no / yes / no
File: yes / yes / yes
Pipe: no / yes / yes
Multiple input files, multiple pipes, etc.: no / yes / yes

Input format: IMPORT / LOAD / INGEST
ASC (including binary): yes, except binary / yes / yes
DEL: yes / yes / yes
IXF: yes / yes / no
WSF (worksheet format): yes, but discontinued in DB2 10.1 / no / no
Comparison to IMPORT and LOAD – Other features
41
Feature: IMPORT / LOAD / INGEST
Can other apps update the table while the utility is loading it?: yes / no / yes
Can use SQL expressions?: no / no / yes
Support for REPLACE: yes / yes / yes
Support for UPDATE, MERGE, and DELETE: update only / no / yes
Can update GENERATED ALWAYS and SYSTEM_TIME columns?: no / yes / no
Performance for a large number of input records: slow / best / comparable to a
load into a staging table followed by multiple concurrent inserts from the
staging table to the target table
API: yes / yes / no (planned for a fix pack)
SYSPROC.ADMIN_CMD support: no / yes / no
Inserts and updates are logged?: yes / no / yes (cannot be turned off; no
support for NOT LOGGED INITIALLY)
Error recovery: no / no / yes
Restart: no / yes / yes
ADMIN_MOVE_TABLE Procedure
42
Can be done online or offline.
A shadow copy of the source table is taken.
Source table changes are captured and applied through triggers.
The source table is taken offline briefly to rename the shadow
copy and its indexes to the source table name.
ADMIN_MOVE_TABLE Procedure
43
Call the stored procedure once, specifying at least the schema
name and the table name:
CALL SYSPROC.ADMIN_MOVE_TABLE('schema name',
'source table', '','','','','','','','','MOVE')
Or call the procedure multiple times, once for each operation of the
move:
CALL SYSPROC.ADMIN_MOVE_TABLE('schema name',
'source table', '','','','','','','','','operation name')
ADMIN_MOVE_TABLE Procedure
44
Moving range partitioned tables
CREATE TABLE "SCHEMA1"."T1" ("I1" INTEGER, "I2" INTEGER)
DISTRIBUTE BY HASH("I1") PARTITION BY RANGE("I1")
(PART "PART0" STARTING(0) ENDING(100) IN "TS1",
PART "PART1" STARTING(101) ENDING(MAXVALUE) IN "TS2");
Move the T1 table from schema SCHEMA1 to the TS3 table space, leaving
the first partition in TS1:
db2 "CALL SYSPROC.ADMIN_MOVE_TABLE
('SCHEMA1','T1','TS3','TS3','TS3','','',
'(I1) (STARTING 0 ENDING 100 IN TS1 INDEX IN TS1 LONG IN TS1,
STARTING 101 ENDING MAXVALUE IN TS3 INDEX IN TS3 LONG IN
TS3)', '','','MOVE')"
IBM Replication tools
45
• Q replication
Q Capture and Q Apply components.
Q Capture reads the DB2 recovery logs and translates committed data into
WebSphere MQ messages.
Q Apply reads the messages from the queue and translates them into SQL
statements that are applied to the target server.
• SQL replication
Capture and Apply components.
Capture reads DB2 log data and writes to change-data tables. Apply reads
the change-data tables and replicates the changes to the target tables.
DB2move utility and ADMIN_COPY_SCHEMA
46
• ADMIN_COPY_SCHEMA procedure to copy a single schema within the
same database.
Options: DDL, COPY, COPYNO.
• db2move utility with the COPY action and -co options to copy a single schema
or multiple schemas from a source database to a target database.
E.g.:
db2move <dbname> COPY -sn schema1 -co TARGET_DB target SCHEMA_MAP
"((schema1,schema2))" TABLESPACE_MAP "((TS1, TS2),(TS3, TS4),
SYS_ANY)" -u userid -p password
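A hedged sketch of the corresponding ADMIN_COPY_SCHEMA call; the schema, table space, and error-table names are illustrative, and the parameter list should be checked against your DB2 version:

```sql
-- Copy SCHEMA1 to SCHEMA2 within the same database, including data ('COPY')
CALL SYSPROC.ADMIN_COPY_SCHEMA('SCHEMA1', 'SCHEMA2', 'COPY',
     NULL,                    -- object owner: keep the source owner
     'TS1', 'TS2',            -- source/target table space mapping
     'ERRSCHEMA', 'ERRTAB');  -- table where copy errors are recorded
```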
DB2 Redirected Restore utility
47
Perform redirected restores to build partial or full database images.
db2 restore db test from <directory/tsm> taken at <timestamp>
redirect generate script redirect.sql
Transport a set of table spaces, storage groups and SQL schemas from
database backup image to a database using the TRANSPORT option (in DB2
Version 9.7 Fix Pack 2 and later fix packs).
db2 restore db <sourcedb> tablespace (mydata1)
schema(schema1,schema2)
from <Media_Target_clause> taken at <date-time>
transport into <targetdb> redirect
db2 list tablespaces
db2 set tablespace containers for <tablespace ID for mydata1> using
(path '/db2DB/data1')
Suspended I/O and online split mirror
48
For large databases, make copies from a mirrored image by using suspended
I/O and the split mirror function. This approach also:
• Eliminates backup operation overhead from the production machine.
• Represents a fast way to clone systems.
• Represents a fast implementation of idle standby failover.
Disk mirroring is the process of writing data to two separate hard disks at the
same time. One copy of the data is called a mirror of the other. Splitting a
mirror is the process of separating the two copies.
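The suspended I/O sequence, sketched with the standard DB2 commands (database names are illustrative; the actual mirror split happens at the storage level and is platform specific):

```shell
# On the production database: suspend writes while the mirror is split
db2 connect to proddb
db2 set write suspend for database
# ... split the mirror at the storage level ...
db2 set write resume for database

# On the clone machine: initialize the split image, e.g. as a snapshot
db2inidb clonedb as snapshot
```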
Summary
49
Load - This utility is best suited to situations where performance is your
primary concern.
Ingest - This utility strikes a good balance between performance and
availability; if availability is more important to you, choose the ingest
utility instead of the load utility.
Import - The import utility can be a good alternative to the load utility
in the following situations:
- the target table is a view.
- the target table has constraints and you don't want it to be put
in the Set Integrity Pending state.
- the target table has triggers and you want them fired.
References
50
IBM Redbooks on DB2 data movement.
IBM Knowledge Center for DB2 V9.7 and V10.1.
IBM developerWorks technical library.
IDUG technical archives.
JEYABARATHI CHAKRAPANI
NASCO
jbchakra@gmail.com
Session D09
Title: DB2 Data Movement Utilities : A comparison
Please fill out your session
evaluation before leaving!
More Related Content

PDF
PDF
Postgresql tutorial
PDF
One PDB to go, please!
PDF
Oracle GoldenGateでの資料採取(トラブル時に採取すべき資料)
PPT
Dwh lecture 08-denormalization tech
PDF
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
PDF
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
PDF
Get to know PostgreSQL!
Postgresql tutorial
One PDB to go, please!
Oracle GoldenGateでの資料採取(トラブル時に採取すべき資料)
Dwh lecture 08-denormalization tech
Oracle Database Performance Tuning Advanced Features and Best Practices for DBAs
Practical Recipes for Daily DBA Activities using DB2 9 and 10 for z/OS
Get to know PostgreSQL!

What's hot (20)

PPTX
Why oracle data guard new features in oracle 18c, 19c
DOCX
Db2 Important questions to read
PPTX
IBM DB2 for zOSのソースエンドポイントとしての利用
PDF
A deep dive about VIP,HAIP, and SCAN
PDF
IBM DB2 for z/OS Administration Basics
 
PPT
Oracle archi ppt
PDF
Solving PostgreSQL wicked problems
PDF
DB2 for z/OS Architecture in Nutshell
PDF
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
PDF
Oracle db performance tuning
PDF
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
PDF
MongoDB WiredTiger Internals: Journey To Transactions
DOC
DB2 utilities
PDF
PostgreSQLアーキテクチャ入門(PostgreSQL Conference 2012)
PDF
Advanced RAC troubleshooting: Network
PDF
More mastering the art of indexing
PPT
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
PPT
Less14 br concepts
PDF
Oracle RAC 19c: Best Practices and Secret Internals
PPTX
Extreme Replication - Performance Tuning Oracle GoldenGate
Why oracle data guard new features in oracle 18c, 19c
Db2 Important questions to read
IBM DB2 for zOSのソースエンドポイントとしての利用
A deep dive about VIP,HAIP, and SCAN
IBM DB2 for z/OS Administration Basics
 
Oracle archi ppt
Solving PostgreSQL wicked problems
DB2 for z/OS Architecture in Nutshell
MySQL 8.0.18 latest updates: Hash join and EXPLAIN ANALYZE
Oracle db performance tuning
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
MongoDB WiredTiger Internals: Journey To Transactions
DB2 utilities
PostgreSQLアーキテクチャ入門(PostgreSQL Conference 2012)
Advanced RAC troubleshooting: Network
More mastering the art of indexing
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
Less14 br concepts
Oracle RAC 19c: Best Practices and Secret Internals
Extreme Replication - Performance Tuning Oracle GoldenGate
Ad

Similar to IDUG 2015 NA Data Movement Utilities final (20)

PDF
Ibm db2 10.5 for linux, unix, and windows data movement utilities guide and...
PPT
DB2UDB_the_Basics Day 5
PPTX
Oracle database 12.2 new features
PPTX
Database Migration using Oracle SQL Developer: DBA Stuff for the Non-DBA
PPT
5\9 SSIS 2008R2_Training - DataFlow Basics
PPT
Lauri Pietarinen - What's Wrong With My Test Data
PPTX
Optimizing your Database Import!
PDF
Session 1 - Databases-JUNE 2023.pdf
PPTX
moving data between the data bases in database
PPT
Oracle data pump
PDF
Intruduction to SQL.Structured Query Language(SQL}
PPTX
The Tools for Data Migration Between Oracle , MySQL and Flat Text File.
PDF
Oracle 11G Development Training noida Delhi NCR
PDF
Oracle11gdevtrainingindelhincr
PDF
Import data rahul vishwanath
DOCX
Migration from 8.1 to 11.3
PPT
Lec 1 = introduction to structured query language (sql)
PPT
15925 structured query
PPT
Introduction to Structured Query Language (SQL).ppt
PPT
Introduction to Structured Query Language (SQL) (1).ppt
Ibm db2 10.5 for linux, unix, and windows data movement utilities guide and...
DB2UDB_the_Basics Day 5
Oracle database 12.2 new features
Database Migration using Oracle SQL Developer: DBA Stuff for the Non-DBA
5\9 SSIS 2008R2_Training - DataFlow Basics
Lauri Pietarinen - What's Wrong With My Test Data
Optimizing your Database Import!
Session 1 - Databases-JUNE 2023.pdf
moving data between the data bases in database
Oracle data pump
Intruduction to SQL.Structured Query Language(SQL}
The Tools for Data Migration Between Oracle , MySQL and Flat Text File.
Oracle 11G Development Training noida Delhi NCR
Oracle11gdevtrainingindelhincr
Import data rahul vishwanath
Migration from 8.1 to 11.3
Lec 1 = introduction to structured query language (sql)
15925 structured query
Introduction to Structured Query Language (SQL).ppt
Introduction to Structured Query Language (SQL) (1).ppt
Ad

IDUG 2015 NA Data Movement Utilities final

  • 1. DB2 Data Movement Utilities: A Comparison Speaker: Jeyabarathi(JB) Chakrapani NASCO Session Code: D09 Wed, May 06, 2015 (08:00 AM - 09:00 AM) : Hancock | Platform: DB2 LUW II
  • 2. Agenda  Learn the various tools that are available with DB2 for achieving efficient data movement within the database environment.  Offer a brief introduction into each utility including the DB2 Admin Online Table Move procedure.  Learn the various enhancements offered in each DB2 version for each of these utilities  Understand how to use the different utilities with examples.  Learn what it takes to maximize the performance of your choice of data movement utility along with useful tricks and tips.
  • 3. Introduction to DB2 data movement utilities Load Utility Export Utility Import Utility Ingest Utility DB2move tool Restore utility ADMIN_COPY_SCHEMA ADMIN_MOVE_TABLE Split Mirror IBM replication tools 3 What are the available tools and options for data movement?
  • 5. Load utility Required input for Load:  The path and the name of the input file, named pipe, or device.  The name or alias of the target table.  The format of the input source. This format can be DEL, ASC, PC/IXF, or CURSOR.  Whether the input data is to be appended to the table, or is to replace the existing data in the table.  A message file name, if the utility is invoked through the application programming interface (API), db2Load. 5
  • 6. Load Load phases: • Load • Build • Delete • Index Copy Load modes: • Insert • Replace • Restart • Terminate 6
  • 7. Load options Include: • If the load utility is invoked from a remotely connected client, the data file must be on the client. XML and LOB data are always read from the server, even you specify the CLIENT option. • The method to use for loading the data: column location, column name, or relative column position. • How often the utility is to establish consistency points. • The names of the table columns into which the data is to be inserted. • Whether or not preexisting data in the table can be queried while the load operation is in progress. • Whether the load operation should wait for other utilities or applications to finish using the table or force the other applications off before proceeding. Client Options Method Consistency Points Access level Paths TableSpace Statistics Recovery COPY NO/YES 7
  • 8. Load options Include: • An alternate system temporary table space in which to build the index. • The paths and the names of the input files in which LOBs are stored. • A message file name. • Whether the utility should modify the amount of free space available after a table is loaded. • Whether statistics are to be gathered during the load process. This option is only supported if the load operation is running in REPLACE mode. • Whether to keep a copy of the changes made. This is done to enable rollforward recovery of the database. • The fully qualified path to be used when creating temporary files during a load operation. The name is specified by the TEMPFILES PATH parameter of the LOAD command. Client Options Method Consistency Points Access level Paths TableSpace Statistics Recovery COPY NO/YES 8
  • 9. Load Restrictions: • Loading data into nicknames is not supported. • Loading data into typed tables, or tables with structured type columns, is not supported. • Loading data into declared temporary tables and created temporary tables is not supported. • XML data can only be read from the server side; if you want to have the XML files read from the client, use the import utility. • You cannot create or drop tables in a table space that is in Backup Pending state. • If an error occurs during a LOAD REPLACE operation, the original data in the table is lost. Retain a copy of the input data to allow the load operation to be restarted. • Triggers are not activated on newly loaded rows. Business rules associated with triggers are not enforced by the load utility. • Loading encrypted data is not supported. Nick names Structured Data Types Temporary Tables XML support Backup Pending Load Replace Triggers Data encryptions Partitioned Tables 9
  • 10. Load from Cursor… Examples: DECLARE mycurs CURSOR FOR SELECT * FROM abc.table1; LOAD FROM mycurs OF cursor INSERT INTO abc.table2; DECLARE C1 CURSOR FOR SELECT * FROM customers WHERE XMLEXISTS(’$DOC/customer[income_level=1]’); LOAD FROM C1 OF CURSOR INSERT INTO lvl1_customers; The ANYORDER file type modifier is supported for loading XML data into an XML column. • Loads the results of a query directly into the target table, no intermediate export necessary. • XML data can be loaded with the cursor option. •Nicknames can be referenced in the SQL query of the cursor using the DATABASE option. •Load from remote database using the DATABASE option. 10
• 11. Examples:
  Loading from a federated database (federation must be enabled and the data source cataloged):
  CREATE NICKNAME myschema1.table1 FOR source.abc.table1;
  DECLARE mycurs CURSOR FOR SELECT c1,c2,c3 FROM myschema1.table1;
  LOAD FROM mycurs OF CURSOR INSERT INTO abc.table2;

  Loading from a remote database (the remote database must be cataloged):
  DECLARE mycurs CURSOR DATABASE dbsource USER dsciaraf USING mypasswd
    FOR SELECT * FROM abc.table1;
  LOAD FROM mycurs OF CURSOR INSERT INTO abc.table2;
• 12. Checking for integrity violations
  Load puts a table in check pending (Set Integrity Pending) state when:
  • The table has check constraints or referential integrity (RI) constraints.
  • The table has identity columns and a V7 or earlier client was used to load the data.
  • The table has descendent immediate staging tables or MQTs referencing it.
  • The table is a staging table or an MQT.
• 13. Load performance
  • CPU_PARALLELISM – the number of threads used by the load utility to parse, convert, and format data records.
  • DISK_PARALLELISM – the number of processes or threads used by the load utility to write data records to disk.
  • DATA BUFFER – the total amount of memory, in 4 KB units, allocated to the load utility as a buffer.
  • NONRECOVERABLE – does not put the table space in Backup Pending state.
  • SAVECOUNT – specifies consistency points.
  • STATISTICS USE PROFILE – collects statistics during the load.
  • FASTPARSE – use when the data is known to be valid.
  • NOROWWARNINGS – use when many row warnings are expected.
  • PAGEFREESPACE, INDEXFREESPACE, TOTALFREESPACE – specify these to reduce the need for reorgs.
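A performance-oriented invocation might combine these parameters as sketched below. All values and names are illustrative and should be tuned for your system:

```sql
-- Performance-oriented LOAD sketch; values are illustrative, not recommendations.
LOAD FROM /data/big_table.del OF DEL
  MODIFIED BY FASTPARSE NOROWWARNINGS   -- data known to be valid; suppress row warnings
  MESSAGES /tmp/load_msgs.txt
  INSERT INTO myschema.big_table
  NONRECOVERABLE                        -- avoid Backup Pending at the cost of recoverability
  DATA BUFFER 8192                      -- 8192 x 4 KB pages for the load buffer
  CPU_PARALLELISM 4                     -- parse/convert/format threads
  DISK_PARALLELISM 8                    -- disk-write threads
```

The trade-off with NONRECOVERABLE is that the load is not rollforward-recoverable, so it suits data that can be reloaded from source.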
• 15. EXPORT utility
  Data extraction using SQL query or XQuery statements.
  Required input:
  • Path name for the output file.
  • Format of the output file – IXF or DEL.
  • Specification of the data to be extracted, using a SELECT statement.
  Additional options:
  • A subset of columns to be extracted, using the METHOD option.
  • XML TO, XMLFILE, XML SAVESCHEMA – to export and store XML data in different ways.
  • The SELECT statement used for extracting data can be optimized the same way any SQL query can be, to improve export performance.
  • The MESSAGES option allows messages generated by the export utility to be written to a file.
• 16. EXPORT utility – examples
  A simple export that writes all rows to an IXF file:
  EXPORT TO table1.ixf OF IXF MESSAGES msg.txt SELECT * FROM myschema.table1;
  Exporting XML data:
  EXPORT TO table1export.del OF DEL XML TO /db/xmlpath XMLFILE xmldocs XMLSAVESCHEMA SELECT * FROM myschema.table1;
  Exporting LOB data:
  EXPORT TO table1.del OF DEL LOBS TO /db/lob1, /db/lob2/ MODIFIED BY lobsinfile SELECT * FROM myschema.table1;
• 18. IMPORT
  Data append/update using SQL query or XQuery statements.
  Required input for import:
  • The path and the name of the input file.
  • The name or alias of the target table or view.
  • The format of the data in the input file.
  • The method by which the data is to be imported.
  • The traverse order, when importing hierarchical data.
  • The subtable list, when importing typed tables.
  Additional options:
  • MODIFIED BY clause.
  • ALLOW WRITE ACCESS – import acquires a nonexclusive lock.
  • ALLOW NO ACCESS – import acquires an exclusive lock, waiting for other work to complete before it can acquire the lock.
  • COMMITCOUNT – commits after the specified number of rows.
  • MESSAGES – writes utility messages to a file.
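A hedged sketch of an import that keeps the table available to writers and commits periodically. The file path and table name are placeholders, and INSERT_UPDATE assumes the target has a primary key:

```sql
IMPORT FROM /data/table1.del OF DEL
  ALLOW WRITE ACCESS              -- nonexclusive lock; other apps can keep writing
  COMMITCOUNT 10000               -- commit every 10,000 rows to limit log usage
  MESSAGES /tmp/import_msgs.txt
  INSERT_UPDATE INTO myschema.table1
```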
• 19. Import
  Import support:
  • Supports IXF, ASC, and DEL data formats.
  • Used with file type modifiers to customize the import operation.
  • Can move hierarchical data and typed tables.
  • Logs all activity, updates indexes, verifies constraints, and fires triggers.
  • Allows you to specify the names of the columns within the table or view into which the data is to be inserted.
  Import modes:
  • INSERT – adds data to the existing table without changing existing data.
  • INSERT_UPDATE – updates rows with matching primary key values; otherwise inserts.
  • REPLACE – deletes existing data and inserts new data.
  • CREATE – creates the target table and its index definitions.
  • REPLACE_CREATE – deletes existing data and inserts new data; if the target table does not exist, it is created.
• 20. IMPORT restrictions
  • If a table has a primary key that is referenced by a foreign key, data can only be appended to it.
  • You cannot perform an import replace operation into an underlying table of a materialized query table defined in refresh immediate mode.
  • You cannot import data into a system table, a summary table, or a table with a structured type column.
  • You cannot import data into declared temporary tables.
  • Views cannot be created through the import utility.
  • Encrypted data cannot be imported.
  • Referential constraints and foreign key definitions are not preserved when creating tables from PC/IXF files. (Primary key definitions are preserved if the data was previously exported by using SELECT *.)
  • Because the import utility generates its own SQL statements, the maximum statement size of 2 MB might, in some cases, be exceeded.
  • You cannot re-create a partitioned table or a multidimensional clustering (MDC) table by using the CREATE or REPLACE_CREATE import parameters.
  • Tables containing XML cannot be re-created.
  • The NOT LOGGED INITIALLY clause is not honored.
• 21. IMPORT restrictions (continued)
  Remote import is not allowed if:
  • The application and database code pages are different.
  • The file being imported is a multiple-part PC/IXF file.
  • The method used for importing the data is either column name or relative column position.
  • The target column list provided is longer than 4 KB.
  • The LOBS FROM clause or the lobsinfile modifier is specified.
  • The NULL INDICATORS clause is specified for ASC files.
• 22. IMPORT performance
  • If the workload is mostly inserts, consider altering the table to APPEND ON.
  • To avoid a transaction-log-full condition, choose an appropriate COMMITCOUNT value.
  • Enable the DB2_PARALLEL_IO registry variable.
  • Review the LOGBUFSZ database configuration value and increase it as necessary.
  • Review the utility heap database configuration value and increase it as needed.
  • Review the NUM_IOSERVERS and NUM_IOCLEANERS parameters.
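The tuning points above can be sketched as follows. The database name, table name, and all values are illustrative placeholders, not recommendations:

```sql
-- Reduce free-space searching for insert-heavy imports
ALTER TABLE myschema.table1 APPEND ON;

-- Illustrative configuration updates; tune the values for your system
UPDATE DB CFG FOR mydb USING LOGBUFSZ 2048;
UPDATE DB CFG FOR mydb USING UTIL_HEAP_SZ 65536;
UPDATE DB CFG FOR mydb USING NUM_IOSERVERS AUTOMATIC NUM_IOCLEANERS AUTOMATIC;
```

Remember to turn APPEND OFF and reorganize later if clustering matters for the table's query workload.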
• 24. INGEST characteristics
  • Fast – multithreaded design processes data in parallel.
  • Available – uses row-level locking, so tables remain available for concurrent access.
  • Continuous – can continuously ingest data streams from pipes or files.
  • Robust – handles unexpected failures and can be restarted from the last commit point.
  • Flexible and functional – supports different input formats and target table types, and has rich data manipulation capabilities.
• 25. INGEST
  Supported table types:
  • Multidimensional clustering (MDC) and insert time clustering (ITC) tables
  • Range-partitioned tables
  • Range-clustered tables (RCT)
  • Materialized query tables (MQTs) that are defined as MAINTAINED BY USER, including summary tables
  • Temporal tables
  • Updatable views (except typed views)
  Supported data formats:
  • Delimited text
  • Positional text and binary
  • Columns in various orders and formats
• 26. Ingest architecture (diagram): data from multiple files or pipes is read by transporters, parsed by an array of formatters, hashed by database partition, and written by flushers performing array inserts into each database partition (DB partition 1 through n).
  Main components:
  • Transporter
  • Formatter
  • Flusher
• 27. INGEST
  Transporter:
  • Reads from the data source and writes to the formatter queues. For INSERT and MERGE operations, there is one transporter thread for each input source. For UPDATE and DELETE operations, there is only one transporter thread.
  Formatter:
  • Parses each record, converts the data into the format that DB2 requires, and writes each formatted record to one of the flusher queues for that record's partition.
  • The num_formatters configuration parameter specifies the number of formatter threads. The default is (number of logical CPUs)/2.
• 28. INGEST
  Flusher:
  • The flushers issue the SQL statements that perform the operations on the DB2 tables. The number of flushers for each partition is specified by the num_flushers_per_partition configuration parameter. The default is max(1, ((number of logical CPUs)/2)/(number of partitions)).
• 29. INGEST examples
  Insert from a delimited file:
  INGEST FROM FILE my_file.del FORMAT DELIMITED INSERT INTO my_table;
  Input records sent over a named pipe:
  INGEST FROM PIPE my_pipe FORMAT DELIMITED INSERT INTO my_table;
  Input records delimited by CRLF, fields delimited by a vertical bar:
  INGEST FROM FILE my_file.del FORMAT DELIMITED BY '|' INSERT INTO my_table;
• 30. INGEST examples
  Using MERGE to update matching rows and insert new ones:
  INGEST FROM FILE input_file.txt
    FORMAT DELIMITED
    (
      $key1 INTEGER EXTERNAL,
      $data1 CHAR(8),
      $data2 CHAR(32),
      $data3 DECIMAL(5,2) EXTERNAL
    )
    MERGE INTO target_table
      ON (key1 = $key1)
      WHEN MATCHED THEN
        UPDATE SET (data1, data2, data3) = ($data1, $data2, $data3)
      WHEN NOT MATCHED THEN
        INSERT VALUES($key1, $data1, $data2, $data3);
• 31. INGEST examples – configuration
  CONNECT TO mydb USER <username> USING <password>;
  INGEST SET num_flushers_per_partition 1;
  INGEST SET num_formatters 12;
  INGEST SET shm_max_size 12 GB;
  INGEST SET commit_count 20000;
  INGEST FROM FILE /mydir/file1 FORMAT DELIMITED BY ',' RESTART OFF INSERT INTO myschema.tab1;
• 32. INGEST restart
  Restart information is stored in a separate table (SYSTOOLS.INGESTRESTART), which is created once. To create the restart table on DB2 10.1:
  CALL SYSPROC.SYSINSTALLOBJECTS('INGEST', 'C', NULL, NULL);
  The table contains counters that keep track of which records have been ingested.
• 33. INGEST restart options
  • RESTART CONTINUE – restart a previously failed job (and clean up the restart data).
  • RESTART TERMINATE – clean up the restart data from a failed job you don't plan to restart.
  • RESTART OFF – suppress the saving of restart information (the ingest job is not restartable).
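A sketch of a restartable ingest job; the job id 'nightly_tab1' and all paths and table names are placeholders:

```sql
-- Start a restartable job with an explicit job id
INGEST FROM FILE /mydir/file1 FORMAT DELIMITED
  RESTART NEW 'nightly_tab1'
  INSERT INTO myschema.tab1;

-- If the job fails, rerun the same command with RESTART CONTINUE
-- to resume from the last commit point
INGEST FROM FILE /mydir/file1 FORMAT DELIMITED
  RESTART CONTINUE 'nightly_tab1'
  INSERT INTO myschema.tab1;
```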
• 34. INGEST – additional features
  • Commit by time or number of rows – commit_count or commit_period configuration parameter.
  • Copying rejected records to a file or table – DUMPFILE or EXCEPTION TABLE parameter.
  • Restart and recovery – retry_count ingest configuration parameter.
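For example, rejected records can be captured as sketched below. The file and table names are placeholders, and the exception table is assumed to mirror the target table's definition:

```sql
INGEST FROM FILE /mydir/file1 FORMAT DELIMITED
  DUMPFILE /mydir/rejected.del        -- records the formatters could not parse
  EXCEPTION TABLE myschema.tab1_excp  -- rows rejected by DB2 (e.g., constraint violations)
  INSERT INTO myschema.tab1;
```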
• 35. INGEST monitoring
  • INGEST LIST and INGEST GET STATS commands.
  • They read information that the utility maintains in shared memory.
  • They must be run in a separate window on the same machine as the INGEST command.
  • They can display detailed information.
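A quick monitoring sketch; the job id 4 is illustrative (use the id reported by INGEST LIST):

```sql
-- From a second session on the same machine as the running INGEST:
INGEST LIST;                              -- show running ingest jobs and their ids
INGEST GET STATS FOR 4;                   -- one-time statistics snapshot for job 4
INGEST GET STATS FOR 4 EVERY 10 SECONDS;  -- refresh the statistics every 10 seconds
```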
• 36. INGEST and LOAD
  Choose INGEST when:
  • The table needs to remain available for concurrent access during the load.
  • You need only some fields from the input file to be loaded.
  • You need to specify an SQL statement other than INSERT.
  • You need to use an SQL expression (to construct a column value from field values).
  • You need to recover and continue when the utility hits a recoverable error.
  Choose LOAD when:
  • You don't need the table to remain available for concurrent access.
  • XML or LOB data must be loaded.
  • You are loading from a cursor or a device.
  • The input source file is in IXF format.
  • You need to load a GENERATED ALWAYS or SYSTEM_TIME column from the input file.
  • You want to use SYSPROC.ADMIN_CMD.
  • You need to invoke the utility through an API.
  • You don't want the INSERTs to be logged.
• 37. INGEST performance
  • Field type and column type – define fields to be the same type as their corresponding column types.
  • Materialized query tables (MQTs) – if you run ingest against a base table of an MQT defined as REFRESH IMMEDIATE, performance can degrade significantly because of the time required to update the MQT.
  • Row size – increase the commit_count setting for tables with smaller rows and reduce it for tables with larger rows.
  • Other workloads – if other workloads run alongside the ingest, consider increasing the LOCKLIST database configuration parameter and reducing the commit_count ingest configuration parameter.
• 38. Comparison between IMPORT, LOAD, and INGEST – table types

  Table type                                                                       | IMPORT                        | LOAD                          | INGEST
  Created global temporary table                                                   | no                            | no                            | no
  Declared global temporary table                                                  | no                            | no                            | no
  Detached table with a dependent table where SET INTEGRITY has not been run (SYSCAT.TABLES.TYPE = 'L') | no (SQL20285N, reason code 1) | no (SQL20285N, reason code 1) | no
  Multidimensional clustering (MDC) table                                          | yes                           | yes                           | yes
  MQT maintained by user                                                           | yes                           | yes                           | yes
  Nickname                                                                         | yes (relational, except ODBC) | no (SQL2305N)                 | yes
  Range-clustered table (RCT)                                                      | yes                           | no                            | yes
  Range-partitioned table                                                          | yes                           | yes                           | yes
  Summary table                                                                    | no                            | yes                           | yes
  Typed table                                                                      | yes                           | no (SQL3211N)                 | no
  Typed view                                                                       | yes                           | no (SQL2305N)                 | no
  Untyped (regular) table                                                          | yes                           | yes                           | yes
  Updatable view                                                                   | yes                           | no (SQL2305N)                 | yes
• 39. Comparison to IMPORT and LOAD – column types

  Column data type                                                                 | IMPORT | LOAD | INGEST
  Numeric: SMALLINT, INTEGER, BIGINT, DECIMAL, REAL, DOUBLE, DECFLOAT              | yes    | yes  | yes
  Character: CHAR, VARCHAR, NCHAR, NVARCHAR, plus corresponding FOR BIT DATA types | yes    | yes  | yes
  Graphic: GRAPHIC, VARGRAPHIC                                                     | yes    | yes  | yes
  Long types: LONG VARCHAR, LONG VARGRAPHIC                                        | yes    | yes  | yes
  Date/time: DATE, TIME, TIMESTAMP(p)                                              | yes    | yes  | yes
  DB2SECURITYLABEL                                                                 | yes    | yes  | yes
  LOBs from files: BLOB, CLOB, DBCLOB, NCLOB                                       | yes    | yes  | no
  Inline LOBs                                                                      | yes    | yes  | no
  XML from files                                                                   | yes    | yes  | no
  Inline XML                                                                       | no     | no   | no
  Distinct type                                                                    | yes    | yes  | yes
  Structured type                                                                  | no     | no   | no
  Reference type                                                                   | yes    | yes  | yes
• 40. Comparison to IMPORT and LOAD – input types and formats

  Input type                            | IMPORT             | LOAD | INGEST
  Cursor                                | no                 | yes  | no
  Device                                | no                 | yes  | no
  File                                  | yes                | yes  | yes
  Pipe                                  | no                 | yes  | yes
  Multiple input files, multiple pipes  | no                 | yes  | yes

  Input format                          | IMPORT                                 | LOAD | INGEST
  ASC (including binary)                | yes, except binary                     | yes  | yes
  DEL                                   | yes                                    | yes  | yes
  IXF                                   | yes                                    | yes  | no
  WSF (worksheet format)                | yes, but discontinued in DB2 10.1      | no   | no
• 41. Comparison to IMPORT and LOAD – other features

  Feature                                                  | IMPORT      | LOAD | INGEST
  Can other apps update the table while the utility runs?  | yes         | no   | yes
  Can use SQL expressions?                                 | no          | no   | yes
  Support for REPLACE                                      | yes         | yes  | yes
  Support for UPDATE, MERGE, and DELETE                    | UPDATE only | no   | yes
  Can update GENERATED ALWAYS and SYSTEM_TIME columns?     | no          | yes  | no
  Performance for a large number of input records          | slow        | best | comparable to load into a staging table followed by multiple concurrent inserts from staging to target
  API                                                      | yes         | yes  | no (planned for a fix pack)
  SYSPROC.ADMIN_CMD support                                | no          | yes  | no
  Inserts and updates are logged?                          | yes         | no   | yes (cannot be turned off; no support for NOT LOGGED INITIALLY)
  Error recovery                                           | no          | no   | yes
  Restart                                                  | no          | yes  | yes
• 42. ADMIN_MOVE_TABLE procedure
  • Can be run online or offline.
  • A shadow copy of the source table is made.
  • Source table changes are captured and applied through triggers.
  • The source table is taken offline briefly to rename the shadow copy and its indexes to the source table name.
• 43. ADMIN_MOVE_TABLE procedure
  Call the stored procedure once, specifying at least the schema name and the table name:
  CALL SYSPROC.ADMIN_MOVE_TABLE('schema name', 'source table', '','','','','','','','','MOVE')
  Or call the procedure multiple times, once for each operation of the move:
  CALL SYSPROC.ADMIN_MOVE_TABLE('schema name', 'source table', '','','','','','','','','operation name')
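A sketch of the multi-call form, one call per phase of the move. The schema and table names are placeholders, and the phase names below assume the standard operation sequence:

```sql
-- Phased table move; each call performs one operation of the move
CALL SYSPROC.ADMIN_MOVE_TABLE('MYSCHEMA','T1','','','','','','','','','INIT');
CALL SYSPROC.ADMIN_MOVE_TABLE('MYSCHEMA','T1','','','','','','','','','COPY');
CALL SYSPROC.ADMIN_MOVE_TABLE('MYSCHEMA','T1','','','','','','','','','REPLAY');
CALL SYSPROC.ADMIN_MOVE_TABLE('MYSCHEMA','T1','','','','','','','','','SWAP');  -- brief offline window
```

The multi-call form lets you schedule the REPLAY and SWAP phases, which touch the source table, for a low-activity window.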
• 44. ADMIN_MOVE_TABLE procedure
  Moving range-partitioned tables:
  CREATE TABLE "SCHEMA1"."T1" ("I1" INTEGER, "I2" INTEGER)
    DISTRIBUTE BY HASH("I1")
    PARTITION BY RANGE("I1")
    (PART "PART0" STARTING(0) ENDING(100) IN "TS1",
     PART "PART1" STARTING(101) ENDING(MAXVALUE) IN "TS2");
  Move the T1 table in schema SCHEMA1 to the TS3 table space, leaving the first partition in TS1:
  db2 "CALL SYSPROC.ADMIN_MOVE_TABLE('SCHEMA1','T1','TS3','TS3','TS3','','',
    '(I1) (STARTING 0 ENDING 100 IN TS1 INDEX IN TS1 LONG IN TS1,
    STARTING 101 ENDING MAXVALUE IN TS3 INDEX IN TS3 LONG IN TS3)',
    '','','MOVE')"
• 45. IBM replication tools
  Q replication – Q Capture and Q Apply components:
  • Q Capture reads the DB2 recovery logs and translates committed data into WebSphere MQ messages.
  • Q Apply reads the messages from the queue and translates them into SQL statements that are applied to the target server.
  SQL replication – Capture and Apply components:
  • Capture reads DB2 log data and writes it to change-data tables.
  • Apply reads the change-data tables and replicates the changes to the target tables.
• 46. db2move utility and ADMIN_COPY_SCHEMA
  • ADMIN_COPY_SCHEMA procedure copies a single schema within the same database. Copy modes: DDL, COPY, COPYNO.
  • The db2move utility with the COPY action copies a single schema or multiple schemas from a source database to a target database. Example:
  db2move <dbname> COPY -sn schema1 -co TARGET_DB <target-dbname> SCHEMA_MAP "((schema1,schema2))" TABLESPACE_MAP "((TS1,TS2),(TS3,TS4),SYS_ANY)" -u userid -p password
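ADMIN_COPY_SCHEMA takes the source and target schema names, a copy mode, and table-space and error-table details. A hedged sketch, with all names as placeholders:

```sql
-- Copy SCHEMA1 to SCHEMA2 within the same database, including data ('COPY' mode),
-- mapping table space TS1 to TS2; rows describing errors go to ERRSCH.COPYERR
CALL SYSPROC.ADMIN_COPY_SCHEMA('SCHEMA1', 'SCHEMA2', 'COPY', NULL,
                               'TS1', 'TS2', 'ERRSCH', 'COPYERR');
```

Use 'DDL' mode to copy only the object definitions, or 'COPYNO' to copy data without making the target table spaces recoverable via COPY images.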
• 47. DB2 redirected restore utility
  Perform redirected restores to build partial or full database images:
  db2 restore db test from <directory/tsm> taken at <timestamp> redirect generate script redirect.sql
  Transport a set of table spaces, storage groups, and SQL schemas from a database backup image into another database using the TRANSPORT option (DB2 Version 9.7 Fix Pack 2 and later fix packs):
  db2 restore db <sourcedb> tablespace (mydata1) schema (schema1,schema2) from <Media_Target_clause> taken at <date-time> transport into <targetdb> redirect
  db2 list tablespaces
  db2 set tablespace containers for <tablespace ID for mydata1> using (path '/db2DB/data1')
• 48. Suspended I/O and online split mirror
  For large databases, make copies from a mirrored image by using suspended I/O and the split mirror function. This approach also:
  • Eliminates backup operation overhead on the production machine.
  • Provides a fast way to clone systems.
  • Provides a fast implementation of idle standby failover.
  Disk mirroring is the process of writing data to two separate hard disks at the same time; one copy of the data is called a mirror of the other. Splitting a mirror is the process of separating the two copies.
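A hedged outline of the suspended-I/O sequence; the database name is a placeholder, and the storage-level split itself depends on your disk subsystem:

```sql
-- On the primary, while connected to the database:
db2 set write suspend for database
-- ...split the mirror at the storage level...
db2 set write resume for database

-- On the clone host, initialize the split image:
db2inidb mydb as snapshot    -- or: as standby / as mirror
```

The db2inidb mode determines the clone's use: snapshot for a usable copy, standby for a rollforward-ready standby, mirror to restore over the primary.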
• 49. Summary
  • Load – best suited to situations where performance is your primary concern.
  • Ingest – strikes a good balance between performance and availability; if availability matters more than raw speed, choose ingest over load.
  • Import – a good alternative to load in the following situations:
  - the target table is a view.
  - the target table has constraints and you don't want it put in Set Integrity Pending state.
  - the target table has triggers and you want them fired.
• 50. References
  • IBM Redbook on DB2 data movement.
  • IBM Knowledge Center for DB2 V9.7 and V10.1.
  • IBM developerWorks technical library.
  • IDUG technical archives.
  • 51. JEYABARATHI CHAKRAPANI NASCO jbchakra@gmail.com Session D09 Title: DB2 Data Movement Utilities : A comparison Please fill out your session evaluation before leaving!