Informatica Partitioning
Partitioning Sessions
You can improve performance by processing data in parallel within a single session: create multiple partitions of the pipeline. If the PowerCenter partitioning option is available, increase the number of partitions in a pipeline to improve session performance. With more partitions, the Integration Service can open multiple connections to the sources and process partitions of source data concurrently.
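The idea of processing partitions of source data concurrently can be sketched in a few lines of Python. This is only an illustration, not how the Integration Service is implemented: `process_partition`, the slicing scheme, and the doubling "transformation" are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

def process_partition(rows):
    # Hypothetical stand-in for one reader -> transformation -> writer
    # pipeline running on a single partition.
    return [value * 2 for value in rows]

source = list(range(10))
num_partitions = 2

# Split the source into num_partitions slices (an arbitrary choice here).
partitions = [source[i::num_partitions] for i in range(num_partitions)]

# One worker thread per partition, all running concurrently.
with ThreadPoolExecutor(max_workers=num_partitions) as pool:
    results = list(pool.map(process_partition, partitions))
```

Each worker sees only its own slice of the source, which is why the choice of partition type matters once a transformation needs related rows together.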
Session Partition
[Diagram: source data passes through reader, transformation, and writer threads (Thread 1 and Thread 2) to the target data.]
Partition Points & Partitions
Partition Types
Round-robin Partitioning
Hash Partitioning
Key Range Partitioning
Pass-through Partitioning
Partition Types
Round-robin Partitioning: The Integration Service distributes rows evenly among all partitions. Use round-robin partitioning when you need to distribute rows evenly and do not need to group data among partitions.
Hash Partitioning: The Integration Service uses a hash function to group rows of data among partitions, based on a partition key. There are two types of hash partitioning:
Partition Types
Hash auto-keys: The Integration Service uses all grouped or sorted ports as a compound partition key. You can use hash auto-keys partitioning at or before Rank, Sorter, and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations.
Hash user keys: The Integration Service uses a hash function to group rows of data among partitions based on a user-defined partition key. You choose the ports that define the partition key.
Partition Types
Key Range Partitioning: You specify one or more ports to form a compound partition key for a source or target. The Integration Service then passes data to each partition depending on the ranges you specify for each port.
Pass-through Partitioning: The Integration Service passes all rows at one partition point to the next partition point without redistributing them.
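As a rough sketch of key range and pass-through routing, the functions below assign a row to a partition by checking its key against a list of ranges. The function names, the sample ranges, and the treatment of the start of a range as inclusive and the end as exclusive are assumptions made for illustration; the sketch also leaves out-of-range rows unassigned rather than guessing at product behavior.

```python
def key_range_partition(value, ranges):
    """Return the index of the range containing value, or None.

    ranges is a list of (start, end) pairs, one per partition.
    Assumption: start is inclusive and end is exclusive.
    """
    for index, (start, end) in enumerate(ranges):
        if start <= value < end:
            return index
    return None  # value falls outside every configured range

def pass_through_partition(current_partition):
    # Pass-through: the row stays in the partition it already occupies.
    return current_partition

# Hypothetical ranges on a numeric key port, one per partition.
ranges = [(1, 1000), (1000, 2000), (2000, 3000)]
assignments = [key_range_partition(v, ranges) for v in (1, 999, 1000, 2500, 5000)]
```

Unlike round-robin, this scheme can produce uneven partitions when the key values cluster in one range.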
Optimizing Sorter/Aggregator with partitions
Add a hash auto-keys partition point at the Sorter or Aggregator transformation. To get correct results and the best performance when you partition a Sorter or Aggregator transformation, the data must be grouped and sorted. To group data, ensure that rows with the same key value are routed to the same partition. The best way to ensure that data is both grouped and distributed evenly among partitions is to add a hash auto-keys partition.
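A small sketch shows why rows with the same key must reach the same partition before aggregation. Each partition sums its own rows independently; with round-robin assignment a group is split and yields partial totals, while hash assignment keeps each group whole. The helper `aggregate_per_partition` and the use of CRC32 as the hash are assumptions chosen for illustration.

```python
import zlib
from collections import defaultdict

def aggregate_per_partition(rows, assign, num_partitions):
    """Sum amounts per key inside each partition, then emit every partition's totals."""
    partitions = [defaultdict(int) for _ in range(num_partitions)]
    for position, (key, amount) in enumerate(rows):
        target = assign(position, key) % num_partitions
        partitions[target][key] += amount
    return [(key, total) for part in partitions for key, total in part.items()]

rows = [("A", 10), ("A", 7), ("B", 5), ("B", 1), ("A", 3)]

# Round-robin: the partition depends only on the row's position.
round_robin = aggregate_per_partition(rows, lambda position, key: position, 2)

# Hash: the partition depends only on the key (CRC32 is a stand-in hash).
hashed = aggregate_per_partition(
    rows, lambda position, key: zlib.crc32(key.encode("utf-8")), 2
)
```

Here `round_robin` contains two partial totals for each key, whereas `hashed` contains exactly one total per key.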
How hash key partitioning works
Hash partitioning maps rows to partitions by applying a hashing algorithm to the specified partition keys. Rows with the same key value always produce the same hash value, so the hash function places them in the same bucket (partition). Rows with different key values may or may not share a bucket.
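The bucket assignment can be sketched as a hash of the key taken modulo the number of partitions. The source does not name the hash function the product uses, so CRC32 below is an assumption, as are the function name `bucket_for` and the sample keys.

```python
import zlib
from collections import defaultdict

def bucket_for(key, num_partitions):
    # CRC32 is a stand-in; the actual hashing algorithm is not stated in the source.
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions

keys = ["ORD-1", "ORD-2", "ORD-1", "ORD-3", "ORD-2", "ORD-1"]

# Group the keys by the bucket each one hashes to.
buckets = defaultdict(list)
for key in keys:
    buckets[bucket_for(key, 4)].append(key)
```

Every occurrence of a given key ends up in one bucket, although a bucket may also hold other keys that happen to hash to the same index.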
Summary
This presentation covered:
Problem definition
Informatica partitions
Approaching the performance tuning challenge



Editor's Notes

  • #4: Memory Optimization 15. This diagram is a simplification of how the DTM uses memory. The DTM Buffer allows each thread to pass data on to the next thread and allows the Writer to receive data to pass to the target. The DTM Buffer is divided into blocks, and different threads control different blocks. If there are multiple transformation threads, each requires its own set of blocks to pass data to the next thread. Thus, the number of required blocks is a function of the number of sources, targets, and stages in your pipeline. In addition to the DTM Buffer, certain transformations require memory known as the transformation caches. The transformation caches reside outside of the DTM Buffer, so they represent an additional memory requirement beyond the DTM Buffer.
  • #10: Memory Optimization 15. The transformation caches are separate from the DTM Buffer.
  • #11: Memory Optimization 15. Use the auto settings as a starting point. Check the session log to see the actual runtime allocations. Note that each transformation stage also requires a minimum of 2 blocks.
  • #12: Memory Optimization 15. Purpose: To allow for a review. Steps: Ensure that students “got” the material, have completed lab successfully, etc.