Table of Contents
SSIS Partitioning and Best Practices
Sliding window
Parallel Execution Using partition logic
SSIS Best Practices
Benefits of using SSIS Partitioning
Appendix

SSIS Partitioning and Best Practices

Date: 27/1/2014
Owner: Vinod kumar kodatham

OBJECT OVERVIEW

Technical Name: SSIS Partitioning and Best Practices
Description: Partitioning divides a large table and its indexes into smaller parts (partitions) so that
maintenance operations can be applied on a partition-by-partition basis, rather than on the entire table.
SSIS Partitioning and Best Practices
This document covers partitioning and the best practices to follow while developing SSIS ETL packages to
improve package performance.
Types of Partitions
• Vertical partitioning – some columns are stored in one table and the remaining columns in another table.
• Horizontal partitioning – the table is split into ranges of rows.

Requirements for Table Partition
• Partition Function – logical; defines the boundary points on the value range (RANGE RIGHT or RANGE LEFT).

Syntax: CREATE PARTITION FUNCTION [partfunc_TinyInt_MOD10](tinyint) AS
RANGE RIGHT FOR VALUES (0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09)
GO

Example: creating a RANGE LEFT partition function on an int column
CREATE PARTITION FUNCTION myRangePF1 (int) AS RANGE LEFT FOR VALUES (1, 100, 1000);

Example: creating a RANGE RIGHT partition function on an int column
CREATE PARTITION FUNCTION myRangePF2 (int) AS RANGE RIGHT FOR VALUES (1, 100, 1000);

• Partition Scheme – physical; maps the partitions defined by the partition function to filegroups.

Syntax: CREATE PARTITION SCHEME [partscheme_DATA_TinyInt_MOD10]
AS PARTITION [partfunc_TinyInt_MOD10] TO ([DATA], [DATA], [DATA],
[DATA], [DATA], [DATA], [DATA], [DATA], [DATA], [DATA])
GO
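
The scheme above maps every partition to the same [DATA] filegroup. As a point of comparison, here is a minimal
sketch of a scheme that spreads the myRangePF1 example across separate filegroups; the filegroup names fg1-fg4
are assumptions and must already exist in the database:

-- Illustrative only: three boundary values in myRangePF1 produce four partitions,
-- so four filegroups are listed, one per partition.
CREATE PARTITION SCHEME myRangePS1
AS PARTITION myRangePF1
TO (fg1, fg2, fg3, fg4);
GO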
• Partitioning Key – a single column, or a computed column that is marked PERSISTED.
All data types that are valid as index columns can be used, except timestamp; LOB data types and CLR user-defined
types cannot be used.
On a clustered table, the partitioning key must be part of either the primary key or the clustered index.
Ideally, queries should use the partitioning key as a filter.

Partitioning Usage in Table
Create the table with the PARTITION SCHEME:
CREATE TABLE [tmp].[Table_1](
...
) ON [partscheme_DATA_TinyInt_MOD10]([MOD10])
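
For illustration, a fuller sketch of such a table follows; the column names and the definition of [MOD10] as a
persisted computed column are assumptions rather than the original design:

-- Minimal sketch, assuming [MOD10] is a persisted computed column derived from an ID column.
CREATE TABLE [tmp].[Table_1](
    [Id]      bigint        NOT NULL,
    [Payload] nvarchar(100) NULL,
    [MOD10]   AS CAST([Id] % 10 AS tinyint) PERSISTED NOT NULL,
    CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED ([Id], [MOD10])
) ON [partscheme_DATA_TinyInt_MOD10]([MOD10]);
GO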

Sliding window
1. Create a non-partitioned archive table with the same structure and a matching clustered index (if
   required). Place it on the same filegroup as the oldest partition.
2. Use SWITCH to move the oldest partition from the partitioned table to the archive table.
3. Remove the boundary value of the oldest partition using MERGE: get the smallest range value from
   sys.partition_range_values and MERGE it.
   Syntax: ALTER PARTITION FUNCTION pf_k_rows()
   MERGE RANGE (@merge_range)
4. Designate the NEXT USED filegroup.
5. Create a new boundary value for the new partition using SPLIT (the best practice is to split an empty
   partition at the leading end of the table into two empty partitions, to minimize data movement): get the
   largest range value from sys.partition_range_values and SPLIT the last range with a new value.
   Syntax: SELECT @split_range = @split_range + 1000
   ALTER PARTITION FUNCTION pf_k_rows()
   SPLIT RANGE (@split_range)
6. Create a staging table that has the same structure as the partitioned table on the target filegroup.
7. Populate the staging table.
8. Add indexes.
9. Add a check constraint that matches the constraint of the new partition.
10. Ensure that all indexes are aligned.
11. Switch the newest data into the partitioned table (the staging table is now empty).
12. Update statistics on the partitioned table.
An end-to-end sketch of this cycle is shown below.
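
The following is a minimal, illustrative T-SQL sketch of the sliding-window cycle. It reuses the pf_k_rows
partition function from the syntax above; the table names dbo.FactRows and dbo.FactRows_Archive, the scheme name
ps_k_rows, the [DATA] filegroup, and the integer boundary arithmetic are assumptions made to keep the sketch
self-contained:

-- Assumptions: dbo.FactRows is partitioned on pf_k_rows / ps_k_rows, and
-- dbo.FactRows_Archive has an identical structure on the oldest partition's filegroup.
DECLARE @merge_range int, @split_range int;

-- Steps 1-2: switch the oldest partition (partition 1) out to the archive table.
ALTER TABLE dbo.FactRows SWITCH PARTITION 1 TO dbo.FactRows_Archive;

-- Step 3: merge away the oldest boundary value.
SELECT @merge_range = MIN(CAST(prv.value AS int))
FROM sys.partition_range_values AS prv
JOIN sys.partition_functions AS pf ON pf.function_id = prv.function_id
WHERE pf.name = 'pf_k_rows';
ALTER PARTITION FUNCTION pf_k_rows() MERGE RANGE (@merge_range);

-- Steps 4-5: designate the next filegroup and split a new empty partition at the leading end.
ALTER PARTITION SCHEME ps_k_rows NEXT USED [DATA];
SELECT @split_range = MAX(CAST(prv.value AS int)) + 1000
FROM sys.partition_range_values AS prv
JOIN sys.partition_functions AS pf ON pf.function_id = prv.function_id
WHERE pf.name = 'pf_k_rows';
ALTER PARTITION FUNCTION pf_k_rows() SPLIT RANGE (@split_range);

-- Steps 6-11: load and index a staging table (not shown), then switch it into the new partition.
-- ALTER TABLE dbo.FactRows_Staging SWITCH TO dbo.FactRows PARTITION $PARTITION.pf_k_rows(@split_range);

-- Step 12: refresh statistics.
UPDATE STATISTICS dbo.FactRows;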

Parallel Execution Using partition logic
Table data refresh time can be improved using partitioned parallel execution (a sketch of one parallel branch
follows the list):
1. Create the PARTITION FUNCTION.
2. Create the PARTITION SCHEME.
3. Create the control table [dbo].[syslargevolumelog].
4. Check the control table: if loading is not yet complete, continue with step 5; otherwise go to step 8.
5. Create the target table with the PARTITION SCHEME.
6. Load the target table from the source table one partition at a time (e.g. the rows where idcolumn/10 = 1,
   and so on), running the partition loads in parallel.
7. Update [syslargevolumelog] to record that the data for this partition has been loaded.
8. Create a temporary table with the same structure as the original table.
9. Switch all partitions to the temporary table.
10. Create unique clustered indexes.
11. Rename the temporary table as the original table.
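
A minimal sketch of one parallel load branch is shown below. The target, source, and control-table column names,
and the exact partition predicate (written here as idcolumn % 10 = 1), are assumptions; each SSIS branch would run
the same statements with its own partition number:

-- Illustrative only: one branch loads the slice that maps to partition 1.
INSERT INTO dbo.TargetTable (idcolumn, col1, col2, MOD10)
SELECT s.idcolumn, s.col1, s.col2, CAST(s.idcolumn % 10 AS tinyint)
FROM dbo.SourceTable AS s
WHERE s.idcolumn % 10 = 1;      -- branch n filters on idcolumn % 10 = n

-- Record progress in the control table (column names assumed).
UPDATE dbo.syslargevolumelog
SET is_loaded = 1, loaded_at = GETDATE()
WHERE partition_no = 1;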

SSIS Best Practices
Avoid SELECT *
Select only the columns you actually need; removing unused output columns can increase Data Flow task performance.
Steps to consider while loading the data (a sketch follows below):
If any non-clustered index(es) exist, DROP all non-clustered index(es).
If a clustered index exists, DROP the clustered index.
Steps to consider while selecting (querying) the data:
If the clustered index does not exist, CREATE the clustered index.
If the non-clustered index(es) do not exist, CREATE the non-clustered index(es).
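A minimal sketch of dropping and recreating indexes around a bulk load; the table and index names are assumptions:

-- Illustrative only; run before the SSIS load.
DROP INDEX IX_FactSales_Customer ON dbo.FactSales;      -- repeat for each non-clustered index
DROP INDEX CIX_FactSales_Date ON dbo.FactSales;         -- clustered index, if appropriate

-- ...the SSIS Data Flow performs the bulk load here...

-- Run after the load, before the table is queried again.
CREATE CLUSTERED INDEX CIX_FactSales_Date ON dbo.FactSales (SaleDate);
CREATE NONCLUSTERED INDEX IX_FactSales_Customer ON dbo.FactSales (CustomerId);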
Effect of OLEDB Destination Settings

Keep Identity – By default this setting is unchecked. If you check it, the dataflow engine ensures that the source
identity values are preserved and the same values are inserted into the destination table.
Keep Nulls – By default this setting is unchecked. If you check it, the default constraint on the destination
table's column is ignored and NULL values from the source column are preserved and inserted into the destination.
Table Lock – By default this setting is checked, and the recommendation is to leave it checked unless the same
table is being used by some other process at the same time.
Check Constraints – By default this setting is checked; the recommendation is to uncheck it if you are sure that
the incoming data will not violate the constraints of the destination table. Unchecking this option improves the
performance of the data load.
Better performance with parallel execution
MaxConcurrentExecutables – The default value is -1, which means the total number of available processors + 2; if
hyper-threading is enabled, it is the total number of logical processors + 2.
Avoid asynchronous transformations (such as the Sort transformation) wherever possible.
Examples of asynchronous (blocking or partially blocking) transformations:
- Aggregate
- Fuzzy Grouping
- Merge
- Merge Join
- Sort
- Union All
How DelayValidation property can help you
In general, the package is validated at design time. However, we can control this behavior using the
DelayValidation property. The default value of this property is False; by setting it to True, we can delay
validation of the package until run time.
When to use events logging and when to avoid...
The recommendation is to enable logging only if it is required. You can dynamically set the value of the
LoggingMode property (of a package and its executables) to enable or disable logging without modifying the
package. You should also choose to log only for those executables which you suspect to have problems, and log
only those events which are absolutely required for troubleshooting.
Effect of Rows Per Batch and Maximum Insert Commit Size Settings
Rows per batch – The default value for this setting is -1, which specifies that all incoming rows are treated as a
single batch. You can change this default behavior and break the incoming rows into multiple batches; the only
allowed value is a positive integer, which specifies the maximum number of rows in a batch.
Maximum insert commit size – The default value for this OLEDB Destination setting is 2147483647 (the largest value
for a 4-byte integer type), which specifies that all incoming rows are committed once on successful completion.
You can specify a positive value for this setting to indicate that a commit is issued after that number of records.
Changing the default value puts overhead on the dataflow engine to commit several times, but at the same time it
relieves the pressure on the transaction log and tempdb, which can otherwise grow tremendously during high-volume
data transfers.
DefaultBufferMaxSize and DefaultBufferMaxRows
The number of buffers created depends on how many rows fit into a buffer, which in turn depends on a few factors
(a worked example follows):
1. The estimated row size.
2. The DefaultBufferMaxSize property of the Data Flow task. Its default value is 10 MB, and its upper and lower
boundaries are MaxBufferSize (100 MB) and MinBufferSize (64 KB).
3. The DefaultBufferMaxRows property of the Data Flow task, which specifies the default maximum number of rows in
a buffer. Its default value is 10000.
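As a worked example with assumed numbers: if the estimated row size is about 1,200 bytes, a 10 MB buffer can hold
roughly 10,485,760 / 1,200 ≈ 8,700 rows, so DefaultBufferMaxSize is the effective limit; with 500-byte rows the
same buffer could hold about 20,900 rows, but DefaultBufferMaxRows (10,000) caps it first.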
Lookup transformation consideration
Choose the caching mode wisely after analyzing your environment.
If you are using Partial Cache or No Cache mode, ensure you have an index on the reference table for better
performance.
Instead of directly specifying a reference table in the lookup configuration, use a SELECT statement with only the
required columns (see the sketch below).
Use a WHERE clause to filter out all rows that are not required for the lookup.
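A minimal sketch of such a lookup reference query; the dbo.DimCustomer table and its columns are assumptions:

-- Illustrative only: return just the join and output columns, filtered to the rows the lookup needs.
SELECT CustomerKey, CustomerAlternateKey
FROM dbo.DimCustomer
WHERE IsCurrent = 1;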
Set the data type of each column appropriately, especially if your source is a flat file. This will enable you to
accommodate as many rows as possible in the buffer.

Avoid many small buffers. Tweak the values of DefaultBufferMaxRows and DefaultBufferMaxSize to get as many records
into a buffer as possible, especially when dealing with large data volumes.

Full Load vs Delta Load
Design the package so that it does a full pull of the data only in the beginning or on demand; from then on it
should do an incremental pull. This greatly reduces the volume of data load operations, especially when volumes
are likely to increase over the lifecycle of an application. For this purpose, use the Change Data Capture (CDC)
feature of SQL Server 2008 where it is enabled upstream; for earlier versions of SQL Server, implement the
incremental pull logic yourself (a sketch follows).
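A minimal sketch of an incremental pull using CDC, assuming CDC has already been enabled on the source table with
net-changes support and a capture instance named dbo_SourceTable; the watermark handling is simplified:

-- Illustrative only; in a real package the @from_lsn value would come from a saved watermark.
DECLARE @from_lsn binary(10), @to_lsn binary(10);
SET @from_lsn = sys.fn_cdc_get_min_lsn('dbo_SourceTable');
SET @to_lsn   = sys.fn_cdc_get_max_lsn();

SELECT *   -- net changes since the last load
FROM cdc.fn_cdc_get_net_changes_dbo_SourceTable(@from_lsn, @to_lsn, 'all');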
Use merge instead of SCD
The big advantage of the MERGE statement is that it can handle multiple actions (insert, update, delete) in a
single pass over the data sets, rather than requiring multiple passes with separate inserts and updates. A
well-tuned optimizer can handle this extremely efficiently (see the sketch below).
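A minimal sketch of a Type 1-style upsert with MERGE; the dimension and staging table names and columns are
assumptions:

-- Illustrative only.
MERGE dbo.DimCustomer AS tgt
USING stg.Customer AS src
    ON tgt.CustomerAlternateKey = src.CustomerAlternateKey
WHEN MATCHED AND tgt.CustomerName <> src.CustomerName THEN
    UPDATE SET tgt.CustomerName = src.CustomerName
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerAlternateKey, CustomerName)
    VALUES (src.CustomerAlternateKey, src.CustomerName);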
Set the packet size in the connection to 32767.
Keep data types as narrow as possible to reduce memory usage.
Do not perform excessive casting.
Use GROUP BY in the source query instead of the Aggregate transformation (see the sketch below).
Weigh delta detection against a simple reload; avoid unnecessary delta detection when a full reload is cheaper.
A Maximum insert commit size of 0 is the fastest (all rows are committed in a single transaction).
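For the GROUP BY tip, a minimal sketch of pre-aggregating in the source query so that fewer rows flow through the
pipeline; the table and column names are assumptions:

-- Illustrative only.
SELECT CustomerId, SUM(Amount) AS TotalAmount, COUNT(*) AS OrderCount
FROM dbo.SalesOrder
GROUP BY CustomerId;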

Benefits of using SSIS Partitioning
Following are some of the benefits of following SSIS partitioning and best practices:
It facilitates the management of large fact tables in data warehouses.
It provides performance and parallelism benefits.
Dividing the table across filegroups benefits I/O operations, fetching the latest data, re-indexing, and backup
and restore.
It works well for range-based inserts and range-based deletes.
It enables the sliding window scenario.
In SQL Server 2008 SP2 and SQL Server 2008 R2 SP1, you can choose to enable support for 15,000 partitions.
Appendix
References used for best practices:
http://msdn.microsoft.com/en-us/library/ms190787.aspx
http://www.mssqltips.com/sql_server_business_intelligence_tips.asp
