Francesco Ganora
DataWeave
A functional
data transformation language
from MuleSoft
The data mapping challenge
Source formats: JSON, XML, CSV, Fixed Width, POJO
Target formats: JSON, XML, CSV, Fixed Width, POJO
Mapping concerns: structural transformation, value transformation, conditional mapping, filtering, grouping
Best practice: always define the mapping in terms of the desired target data structure
The old programmatic approach
❖ Map the target message from the source message
programmatically (e.g., via a script or Java method)
❖ Sequence of procedural steps that incrementally build the
target message from the source message
❖ Typical example: loop on elements of a source sequence
and for each element instantiate a target sub-structure, then
attach it to the overall target structure
❖ This approach is neither concise nor expressive; if
implemented incorrectly, it is also inefficient
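As a rough illustration of the procedural style described above, here is a minimal Python sketch. The field names (orderId, items, sku, qty) and the target layout are hypothetical and chosen only for this example; the point is the step-by-step construction of the target, not the specific data.

```python
# Procedural mapping: the target is built incrementally from the source.
# All field names here are hypothetical, for illustration only.
source = {
    "orderId": "DO1234",
    "items": [
        {"sku": "1233244", "qty": "20"},
        {"sku": "1233255", "qty": "50"},
    ],
}

target = {}
target["order_nr"] = source["orderId"]
target["lines"] = []
for item in source["items"]:
    line = {}                            # instantiate a target sub-structure
    line["product"] = item["sku"]
    line["quantity"] = int(item["qty"])
    target["lines"].append(line)         # attach it to the overall target
```

The shape of the target is never stated in one place; it only emerges from the sequence of assignments, which is the lack of expressiveness the slide refers to.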
The templating approach
❖ Template engines can be used as
data mapping engines:
❖ We define the target structure
(template)
❖ We define how each part of the
template is generated dynamically
from source data
❖ The template consists of a semi-literal
expression with placeholders,
e.g. $() in this example
❖ More constructs are necessary to
instantiate repetitive structures
(looping), for conditional
mapping, etc.
JSON
{"user":
  {"id": "$(sourceData.userID)",
   "firstName": "$(sourceData.givenName)",
   "lastName": "$(sourceData.lastName)",
   "contacts": {
     "phone": "$(sourceData.phoneNumber)",
     "email": "$(sourceData.emailAddress)"
   }}}
XML
<?xml version="1.0"?>
<user>
  <id> $(sourceData.userID) </id>
  <firstName> $(sourceData.givenName) </firstName>
  <lastName> $(sourceData.lastName) </lastName>
  <contacts>
    <phone> $(sourceData.phoneNumber) </phone>
    <email> $(sourceData.emailAddress) </email>
  </contacts>
</user>
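The templating idea can be tried out with Python's standard string.Template class. This is only a sketch: string.Template uses ${name} placeholders rather than the $() markers of the slide, and the sample values are invented.

```python
import json
from string import Template

# The template is the target structure with placeholders for source values.
# Note: ${...} is Python's placeholder syntax, not the $() used on the slide.
json_template = Template(
    '{"user": {"id": "${userID}", "firstName": "${givenName}"}}'
)

# Hypothetical source data.
source_data = {"userID": "u42", "givenName": "Ada"}

rendered = json_template.substitute(source_data)
parsed = json.loads(rendered)  # the rendered text is valid JSON for these values
```

The template is tied to JSON syntax, and a source value containing a double quote would break the output. An XML target would need a second template.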
Issues with standard templating
❖ Template depends on the concrete syntax of the target message (separate
templates for XML, JSON etc.)
❖ Placeholder syntax depends on the type of source message (e.g., XPath for
XML, JSONPath for JSON, non-standard syntax for other media types)
❖ Placeholder syntax may clash with target message syntax (cannot use for
example <> as placeholder markers with XML)
❖ Looping constructs of traditional template engines mix engine syntax with
generated content (“PHP-like”)
❖ XSLT is a very powerful templating and transformation language, but it
has drawbacks (verbose XML syntax, cannot operate on non-tree-structured
source messages that cannot be rendered into XML, etc.)
DataWeave (DW)
❖ Data mapping and
transformation tool from
MuleSoft
❖ Tightly integrated with
Anypoint Studio IDE
❖ Non-procedural expression
language
❖ Applies functional
programming constructs
(lambdas)
❖ Uses internal, canonical data
format (application/dw)
Canonical data representation
1. DW parses the source message into the canonical application/dw format using supplied
metadata / DataSense capability
2. A DW expression is used to transform the source message (the result is still in the canonical
application/dw format)
3. DW renders the canonical target message into the target MIME type specified as a “header”
to the DW expression (e.g. %output application/json)
This decouples the transformation from the concrete syntax of source and target messages!
[Diagram: source message (<source MIME type>) -> parser -> source message (canonical, application/dw) -> DW expression -> target message (canonical, application/dw) -> renderer -> target message (<target MIME type>)]
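The parse / transform / render decoupling can be sketched in Python, with plain lists and dicts standing in for the canonical format. This is an analogy rather than how DataWeave is implemented; the function names and the sample fields (sku, qty) are assumptions made for the example.

```python
import csv
import io
import json

def parse_csv(text, delimiter=";"):
    """Parser: CSV text -> canonical list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

def parse_json(text):
    """Parser: JSON text -> canonical list of dicts."""
    return json.loads(text)

def transform(records):
    """Transformation: sees only canonical data, never the source syntax."""
    return [{"sku": r["sku"], "qty": int(r["qty"])} for r in records]

def render_json(data):
    """Renderer: canonical data -> JSON text."""
    return json.dumps(data)

csv_source = "sku;qty\n1233244;20\n1233255;50\n"
json_source = '[{"sku": "1233244", "qty": "20"}, {"sku": "1233255", "qty": "50"}]'

# The same transform() serves both source formats.
from_csv = render_json(transform(parse_csv(csv_source)))
from_json = render_json(transform(parse_json(json_source)))
```

Supporting another source or target format means adding a parser or renderer; transform() stays unchanged.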
The DW canonical format
❖ Only 3 kinds of data in DW:
• Simple (String, Number,
Boolean, Date types)
• Array
• Objects (key:value pairs)
❖ The canonical application/dw format
is shown in a JSON-like concrete
syntax in Anypoint Studio
❖ Parsing and rendering between
application/json and application/dw
is straightforward
[
{
"order_nr": "DO1234",
"order_date": "2016-03-12T13:30:23+8.00",
sku: "1233244",
"sku_description": "Product A",
qty: "20"
},
{
"order_nr": "DO1234",
"order_date": "2016-03-12T13:30:23+8.00",
sku: "1233255",
"sku_description": "Product B",
qty: "50"
}
]
XML Parsing
❖ repeated XML elements —> repeated object keys
❖ XML attributes —> special @() object
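To make the two XML rules concrete, here is a small Python sketch that mimics them. It is not DataWeave's parser: an object is modelled as a list of (key, value) pairs because a Python dict cannot hold repeated keys, attributes are stored under an "@" key, and the text of elements that also have attributes or children is ignored for brevity. The sample XML is invented.

```python
import xml.etree.ElementTree as ET

def to_canonical(elem):
    """Leaf element -> its text; otherwise a list of (key, value) pairs."""
    pairs = []
    if elem.attrib:
        pairs.append(("@", dict(elem.attrib)))  # attributes under a special key
    for child in elem:
        pairs.append((child.tag, to_canonical(child)))  # repeated tags repeat the key
    if not pairs:
        return (elem.text or "").strip()
    return pairs

xml_text = '<order id="DO1234"><item>A</item><item>B</item></order>'
root = ET.fromstring(xml_text)
canonical = [(root.tag, to_canonical(root))]
```

Here canonical holds one "order" entry whose value contains the "@" attributes entry followed by two separate "item" entries.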
CSV parsing
❖ Array of records (lines)
❖ Record (line) —> array
element of type Object
❖ Field in record: object
field (key is taken from
CSV header line or
configured metadata)
❖ Reader configuration to
set field separator, etc.
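The CSV rules above correspond closely to what Python's csv.DictReader does, which makes for a quick way to visualise the result. The sample below is a shortened, hypothetical extract with a semicolon separator; it shows the shape of the parsed data, not DataWeave's reader itself.

```python
import csv
import io

csv_text = (
    "OrderId;City;Quantity\n"
    "000001;London;120\n"
    "000002;Paris;60\n"
)

# The delimiter plays the role of the reader configuration (field separator).
reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")

# One object per record; keys come from the header line; values stay strings.
records = [dict(row) for row in reader]
```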
DW transform structure
%dw 1.0
%input payload application/csv
%output application/json
%type sapDate = :string { format: "YYYYMMDD" }
%var unitOfMeasure = 'EA'
%var doubleNumber = (nr) -> [nr * 2.0]
%namespace xsi http://www.w3.org/2001/XMLSchema-instance
%function fname(name) {firstName: upper name}
---
order: {
  ID: payload.orderID ++ " dated " ++ payload.orderDate,
  nrLines: (sizeOf payload.orderItems) + 1,
  totalOrderAmount: payload.*orderItems reduce
    $$ + (($.orderQuantity as :number) * ($.unitPrice as :number))
}
Optional header contains:
• transformation directives
• reusable declarations
Body contains the DW
transformation expression
Case study: introduction
Transforming a list of order items into a corresponding list of delivery routes.
The source payload is an unsorted list of items in CSV format:
OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductA;100
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductD;15
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductD;30
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
000004;2016-09-15;Customer4;2016-09-20;London;ProductE;30
000005;2016-09-16;Customer4;2016-09-20;London;ProductB;20
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductD;7
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductE;30
000007;2016-09-16;Customer5;2016-09-22;Berlin;ProductB;12
The target structure (described in the following slide) is a multi-level JSON structure.
This case study focuses on the structural transformation capabilities of DW, but DW offers a
wide range of value and formatting capabilities, conditional mapping, and much more!
Case study: target format
[
{
city: "<City>",
deliveryDate: "<DeliveryDate>",
stops: [
{
customer: "<CustomerId>",
orderitems: [
{
ordernr: "<OrderId>",
orderdate: "<OrderDate>",
product: "<ProductId>",
qty: "<Quantity>"
}
]
}
]
}
]
JSON document with
sequence of delivery
routes by delivery date
and city:
❖ Sort CSV order lines by
city and delivery date
❖ Within each delivery
date and city, group
order lines by customer
❖ Render the structure as
JSON
Nesting levels of the target structure, from outermost to innermost: by city / delivery date, by customer, by order item
Case study: step 1
Source message parsed as application/dw:
The DW expression payload evaluates to the entire message payload (see the earlier slide “CSV parsing”).
NOTE: the DW transformer Preview functionality in MuleSoft Anypoint Studio maps the sample
source in real time as you type the transformation!
Case study: step 2
Sorting and grouping by combination of city and delivery date:
A composite key is used for sorting and grouping via the string concatenation operator (++).
The groupBy operator creates an object with the group values as keys.
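The DataWeave expression for this step is not reproduced in the text, so the following is a Python sketch of the same idea, not the original code. The "|" separator inside the composite key and the reduced set of columns are assumptions for the example.

```python
from itertools import groupby

rows = [
    {"City": "Paris", "DeliveryDate": "2016-09-20", "CustomerId": "Customer2"},
    {"City": "London", "DeliveryDate": "2016-09-20", "CustomerId": "Customer1"},
    {"City": "Paris", "DeliveryDate": "2016-09-22", "CustomerId": "Customer2"},
    {"City": "London", "DeliveryDate": "2016-09-20", "CustomerId": "Customer4"},
]

def route_key(row):
    # Composite key built by string concatenation ("|" is an assumed separator).
    return row["City"] + "|" + row["DeliveryDate"]

# Sort first, then group: the result is an object keyed by the group values.
sorted_rows = sorted(rows, key=route_key)
groups = {key: list(members) for key, members in groupby(sorted_rows, key=route_key)}
```

Sorting before grouping matters here because itertools.groupby only merges adjacent rows with equal keys.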
Case study: step 3
Iterating over the group values (city/delivery date combination) to
generate the 1st level of the target structure:
The pluck operator maps an object into an array. $$ is the key in the current iteration, $ is the
value.
City and delivery date are mapped from the composite key by String manipulation.
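A Python analogue of this step is sketched below. The helper pluck is a hypothetical stand-in for the DataWeave operator: it calls a function with the value and the key of each entry (the roles of $ and $$) and collects the results in a list. The "|" separator and the lineCount field are assumptions; lineCount only marks where the nested stops would go.

```python
# Grouped object, as produced by a sort-and-group step (sample data).
groups = {
    "London|2016-09-20": [{"CustomerId": "Customer1"}, {"CustomerId": "Customer4"}],
    "Paris|2016-09-20": [{"CustomerId": "Customer2"}],
}

def pluck(obj, fn):
    """Map an object to an array: fn receives (value, key) for each entry."""
    return [fn(value, key) for key, value in obj.items()]

def to_route(rows, key):
    # Recover city and delivery date from the composite key by string splitting.
    city, delivery_date = key.split("|")
    return {"city": city, "deliveryDate": delivery_date, "lineCount": len(rows)}

routes = pluck(groups, to_route)
```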
Case study: step 4
Within each route group, group by customer and generate 2nd (inner) level of target
structure:
In the inner pluck the context for $ and $$ changes (e.g., $$ is now the CustomerID key).
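The inner grouping can be sketched in Python as follows. The code is an approximation of the idea, not the slide's DataWeave expression; group_by is a hypothetical helper and the products field is a simplified placeholder for the order items.

```python
def group_by(rows, key_fn):
    """Group rows into a dict keyed by key_fn(row), keeping first-seen order."""
    groups = {}
    for row in rows:
        groups.setdefault(key_fn(row), []).append(row)
    return groups

# Rows belonging to a single route (one city / delivery date combination).
route_rows = [
    {"CustomerId": "Customer1", "ProductId": "ProductA"},
    {"CustomerId": "Customer1", "ProductId": "ProductB"},
    {"CustomerId": "Customer4", "ProductId": "ProductC"},
]

# Inside this loop the "key" is the customer ID, no longer the route key.
stops = [
    {"customer": customer, "products": [r["ProductId"] for r in rows]}
    for customer, rows in group_by(route_rows, lambda r: r["CustomerId"]).items()
]
```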
Case study: (final) step 5
Within each customer group, generate the 3rd (innermost) level of the target
structure via the map operator:
The JSON rendering is obtained simply by changing the %output directive.
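Putting the five steps together, the whole case study can be approximated in Python as below. This is a sketch of the transformation logic, not the DataWeave code from the slides: the CSV sample is abridged and reordered, tuples are used as composite keys instead of concatenated strings, and group_by, to_item and build_routes are names invented for the example.

```python
import csv
import io
import json

CSV_TEXT = """OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
"""

def group_by(rows, key_fn):
    """Group rows into a dict keyed by key_fn(row), keeping first-seen order."""
    groups = {}
    for row in rows:
        groups.setdefault(key_fn(row), []).append(row)
    return groups

def to_item(row):
    """Innermost level: one order item per CSV line."""
    return {"ordernr": row["OrderId"], "orderdate": row["OrderDate"],
            "product": row["ProductId"], "qty": row["Quantity"]}

def route_key(row):
    return (row["City"], row["DeliveryDate"])

def build_routes(rows):
    routes = []
    by_route = group_by(sorted(rows, key=route_key), route_key)
    for (city, delivery_date), route_rows in by_route.items():
        by_customer = group_by(route_rows, lambda r: r["CustomerId"])
        stops = [{"customer": customer, "orderitems": [to_item(r) for r in cust_rows]}
                 for customer, cust_rows in by_customer.items()]
        routes.append({"city": city, "deliveryDate": delivery_date, "stops": stops})
    return routes

rows = list(csv.DictReader(io.StringIO(CSV_TEXT), delimiter=";"))
routes = build_routes(rows)
rendered = json.dumps(routes, indent=2)  # rendering is a separate, final step
```

As in DataWeave, the choice of output format is isolated in the last line; build_routes works only on parsed data.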
Thanks!
This is just a “taste” of the innovative DataWeave
transformation language.
Find out more at:
https://docs.mulesoft.com/mule-user-guide/v/3.8/dataweave

