ETL Metadata Injection
David Fombella Pombal
http://anonymousbi.wordpress.com
@Pentaho_Fan
About me
David Fombella Pombal
BI Consultant, Technical Developmental Editor and Pentaho Community Guy
Introduction
•The ETL Metadata Injection step injects step metadata into a
template transformation. Instead of entering ETL metadata
statically in a step dialog, you pass it at run time. This makes
it possible to solve repetitive ETL workloads such as loading
text files, data migration and so on.
Problem description
•We need to load cost files whose header is dynamic
•Budget files contain 12 months of information
•Forecast files contain between 1 and 12 months
•Every file has 3 fixed columns: Entity, Account and Cost Centre
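The dynamic header is what makes metadata injection useful here: the field list has to be derived from each file before it is loaded. A minimal sketch of that derivation, written in plain Python rather than PDI. The semicolon delimiter, the month column names and the `String`/`Number` types are assumptions for illustration; the slides do not show the real file layout.

```python
import csv
import io

FIXED_COLUMNS = 3  # Entity, Account and Cost Centre, per the problem description


def field_metadata_from_header(header_line, delimiter=";"):
    """Build one field definition per column found in a file's header row."""
    names = next(csv.reader(io.StringIO(header_line), delimiter=delimiter))
    month_count = len(names) - FIXED_COLUMNS
    if not 1 <= month_count <= 12:
        raise ValueError(f"expected 1 to 12 month columns, found {month_count}")
    fields = []
    for position, name in enumerate(names):
        is_fixed = position < FIXED_COLUMNS
        fields.append({
            "name": name.strip(),
            "type": "String" if is_fixed else "Number",  # assumed types
            "role": "key" if is_fixed else "month",
        })
    return fields


# Hypothetical header of a forecast file covering 3 months.
forecast_header = "Entity;Account;Cost Centre;2014-10;2014-11;2014-12"
for field in field_metadata_from_header(forecast_header):
    print(field)
```

A list like this is the kind of metadata that would be passed to the template transformation at run time instead of being typed into the step dialog.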
File Types
•Budget file containing 2015 (12 months)
•Forecast file October 2014 (3 months)
Template transformation (Text file input)
Template transformation (Row normaliser)
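The screenshot for this slide is not available, so as a rough illustration of what the normalising part of the template does, here is a plain-Python sketch that turns one wide row (one column per month) into one row per month. It is not PDI code; the output field names `Month` and `Amount` and the sample values are assumptions.

```python
def normalise(row, fixed_columns=("Entity", "Account", "Cost Centre")):
    """Turn one wide row into one output row per month column."""
    fixed = {name: row[name] for name in fixed_columns}
    return [
        {**fixed, "Month": column, "Amount": float(value)}
        for column, value in row.items()
        if column not in fixed_columns
    ]


# Hypothetical forecast row with 3 month columns.
sample = {
    "Entity": "E1",
    "Account": "A100",
    "Cost Centre": "CC1",
    "2014-10": "10.5",
    "2014-11": "20",
    "2014-12": "30.25",
}
for out_row in normalise(sample):
    print(out_row)
```

Because the set of month columns changes from file to file, the list of columns to normalise is exactly the part that has to be injected rather than hard-coded.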
Load file Metadata Injection 1
Load file Metadata Injection 2
Load file Metadata Injection 3
Load file after Injection
Non-Native Metadata Injection
Metadata Injection is one of the most powerful features of
PDI. However, only some steps support it. PDI has hundreds of
steps, with more being added weekly, so there is always going
to be a scenario where the step you want to inject into
doesn't support it.
So what do you do? Use the API. The API is how PDI itself
sets the metadata for a step.
So the steps are:
1. Open the transformation file
2. Find the step(s) we want to change
3. Configure it accordingly
4. Save the file
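The four steps above can be sketched as follows. Note that this is not the PDI API the slide refers to: it is a stand-in that edits the transformation file's XML directly with the Python standard library, assuming a `.ktr` file is XML in which each `<step>` element has a `<name>` child. The function `set_step_option`, the tiny template and the `sort_size` option are all made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def set_step_option(ktr_in, ktr_out, step_name, option, value):
    """Open a transformation file, change one option on a named step, save it."""
    tree = ET.parse(ktr_in)                                   # 1. open the file
    steps = [step for step in tree.getroot().iter("step")
             if step.findtext("name") == step_name]           # 2. find the step(s)
    if not steps:
        raise LookupError(f"no step named {step_name!r}")
    for step in steps:
        node = step.find(option)
        if node is None:
            node = ET.SubElement(step, option)
        node.text = value                                     # 3. configure it
    tree.write(ktr_out, encoding="UTF-8", xml_declaration=True)  # 4. save the file
    return len(steps)


# Hypothetical, heavily simplified template used only for this demo.
TEMPLATE = """<transformation>
  <step><name>Sort rows</name><type>SortRows</type><sort_size>1000</sort_size></step>
</transformation>"""

with tempfile.TemporaryDirectory() as folder:
    template = os.path.join(folder, "template.ktr")
    configured = os.path.join(folder, "configured.ktr")
    with open(template, "w", encoding="utf-8") as handle:
        handle.write(TEMPLATE)
    changed = set_step_option(template, configured, "Sort rows", "sort_size", "5000")
    result = ET.parse(configured).getroot().findtext("step/sort_size")
    print(changed, result)
```

Writing to a separate output file keeps the template untouched, so the same template can be reconfigured for each run.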
Non-Native Metadata Injection (template)
Non-Native Metadata Injection (Inject 1)
Non-Native Metadata Injection (Inject 2)
Non-Native Metadata Injection (Inject 3)
Non-Native Metadata Injection
Thanks to…

