Data transformations.
Using Kettle transformations
■ Andriy Kyrylenko (Gera-IT)
How to manage import files from different vendors?
What are the common transformations?
How can Kettle help with managing different transformations?
Import records: standard feeds
NEWS
PRODUCTS
RSS feed
ATOM feed
???
Import products
Vendor1
Vendor2
Vendor3
Vendor4
Products1
Products2
Products3
Products4
Shop products
Import products: source and file types
FTP
HTTP/HTTPS
Local file
Service API
Source file
file
Records
XML file
Shop Products
DB
Import products: source and file types
Source type1
Source type2
Source type3
Service API
File format1
File format2
File format3
Records
Shop Products
DB
Import products:
several readers
Fieldset1
Fieldset2
FieldsetN
Service API
Readers
Converter1
Converter2
ConverterN
Shop Products
DB
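
The reader and converter split on this slide can be sketched roughly as follows. This is only an illustration in Python, not the code from the talk: the names `READERS`, `CONVERTERS`, `make_converter` and `import_products`, the two file formats, and the vendor field names are all assumptions made for the example. One reader per file format turns a file into records, and one converter per vendor maps that vendor's field set onto the shop's fields.

```python
import csv
import io
import json


def read_csv(text):
    """Reader for CSV files: one record (dict) per row."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json(text):
    """Reader for JSON files that contain a list of objects."""
    return json.loads(text)


# Hypothetical registry: one reader per file format.
READERS = {"csv": read_csv, "json": read_json}


def make_converter(mapping):
    """Build a converter from a {vendor_field: shop_field} mapping."""
    def convert(record):
        return {shop: record[vendor] for vendor, shop in mapping.items()}
    return convert


# Hypothetical registry: one converter per vendor field set.
CONVERTERS = {
    "vendor1": make_converter({"SKU": "sku", "Title": "name"}),
    "vendor2": make_converter({"id": "sku", "label": "name"}),
}


def import_products(vendor, file_format, text):
    """Read a vendor file and convert every record to shop fields."""
    reader = READERS[file_format]
    convert = CONVERTERS[vendor]
    return [convert(record) for record in reader(text)]
```

Supporting a new vendor then means adding one mapping, and supporting a new file format means adding one reader, which is the growth in custom code that the following slides aim to replace.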
Import products:
several Ruby classes
Fieldset1
Fieldset2
FieldsetN
Service API
RubyClassS
RubyClass1
RubyClass2
RubyClassN
Shop Products
DB
Converting records:
simple cases
■ Filtering fields
■ Filtering records by field value
■ Cloning records
■ Splitting fields into several fields
■ Joining fields
■ Creating new fields
■ Merging two files
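
To make the simple cases above concrete, here is a minimal sketch in Python that treats each record as a dictionary of strings. The function names and the sample field names (`sku`, `name`, `price` and so on) are hypothetical, chosen only for illustration; they are not taken from the talk or from Kettle.

```python
def filter_fields(record, keep):
    """Filtering fields: keep only the named fields."""
    return {k: v for k, v in record.items() if k in keep}


def filter_records(records, field, value):
    """Filtering records by field value."""
    return [r for r in records if r.get(field) == value]


def clone_record(record, field, values):
    """Cloning records: one copy per value of the given field."""
    return [{**record, field: v} for v in values]


def split_field(record, field, into, sep=" "):
    """Splitting one field into several fields."""
    parts = record[field].split(sep, len(into) - 1)
    out = {k: v for k, v in record.items() if k != field}
    out.update(zip(into, parts))
    return out


def join_fields(record, fields, into, sep=" "):
    """Joining several fields into one."""
    out = {k: v for k, v in record.items() if k not in fields}
    out[into] = sep.join(record[f] for f in fields)
    return out


def add_field(record, name, value):
    """Creating a new field with a fixed value."""
    return {**record, name: value}


def merge_by_key(left, right, key):
    """Merging two files: enrich left records with matching right records."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key.get(l[key], {})} for l in left]
```

Each function returns new records instead of changing its input, so the steps can be chained in any order, much like steps in a transformation.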
Import products:
several Ruby classes
FileSet1
FileSet2
FileSetN
Service API
KettleS
Kettle1
Kettle2
KettleN
Shop Products
DB
http://kettle.pentaho.com
Pentaho Kettle Transformations
■ http://kettle.pentaho.com
■ Download the data-integration package
■ Run "Spoon"
■ Enjoy
