1 
© Talend 2014 
Talend: Solutions 
Overview
2 
About the Presenter 
Rajan Kanitkar 
• Senior Solutions Engineer 
• Rajan Kanitkar is a Pre-Sales Consultant with Talend. He 
has been active in the broader Data Integration space for 
the past 15 years and has experience with several 
leading-edge software companies in these areas. His areas 
of specialty at Talend include Data Integration (DI), Big 
Data (BD), Data Quality (DQ), and Master Data 
Management (MDM). 
• Contact: rkanitkar@talend.com 
3 
Talend Big Data Platform 
Hadoop, MapReduce, NoSQL capabilities … 
4 
The Big Data Ecosystem 
• Hadoop: the core project 
• HDFS: the Hadoop Distributed File System 
• MapReduce: the software framework for distributed 
processing of large data sets 
• Hive: a data warehouse infrastructure that provides 
data summarization and a querying language 
• Pig: a high-level data-flow language and execution 
framework for parallel computation 
• HBase: this is the Hadoop database. Use it when 
you need random, realtime read/write access to 
your Big Data 
• And many more: Sqoop, HCatalog, 
ZooKeeper, Oozie, Cassandra, MongoDB, Flume, 
Impala, Stinger, Neo4j, etc. 
5 
Talend’s Solution 
6 
Key Differentiator of Our Next-Gen Architecture… 
• JAVA: ETL, day-to-day integration, run everywhere 
• SQL: ELT, DW appliance (Teradata, Netezza…) 
• MapReduce + PIG + HiveQL + Sqoop + …: Hadoop, highly scalable, Hadoop grid 
• CAMEL: CAMEL, message transformation, high frequency 
• No black-box engine 
• Enables light-weight distributed, customizable and parallelizable run time 
• Standards-Based 
Code Generator
7 
Trying to get from this…
8 
Talend Big Data – “pure Hadoop” 
Design visually in MapReduce and optimize before 
deploying on Hadoop 
to this…
9 
Native Map/Reduce Jobs 
• Create classic ETL patterns using native Map/Reduce 
- Only data management solution on the market to generate native 
Map/Reduce code 
• Reduce the need for big 
data coding skills 
• Zero pre-installation on 
the Hadoop cluster 
• Hadoop is the “engine” 
for data processing
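To make the "classic ETL pattern as Map/Reduce" idea concrete, here is a minimal in-memory sketch in plain Java. It is not the code Talend generates and it does not use the Hadoop API; it only illustrates the shape of such a job. The class name `EtlMapReduceSketch`, the `customer,amount` record layout, and the sample data are all assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory illustration of an ETL aggregation written as map and reduce steps. */
final class EtlMapReduceSketch {

    /** Map step: parse one "customer,amount" record and emit (customer, amount). */
    static List<Map.Entry<String, Double>> map(String line) {
        List<Map.Entry<String, Double>> out = new ArrayList<>();
        String[] fields = line.split(",");
        if (fields.length != 2) {
            return out; // filter: drop records that do not have exactly two fields
        }
        try {
            out.add(new SimpleEntry<>(fields[0].trim(), Double.parseDouble(fields[1].trim())));
        } catch (NumberFormatException e) {
            // filter: drop records whose amount is not numeric
        }
        return out;
    }

    /** Reduce step: sum all amounts emitted for one key. */
    static double reduce(String key, List<Double> values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Driver: map every record, group by key (the "shuffle"), then reduce each group. */
    static Map<String, Double> run(List<String> lines) {
        Map<String, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Double> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, List<Double>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data is made up; two of the five rows are deliberately malformed.
        List<String> input = List.of("alice,10.5", "bob,4", "alice,2.5", "broken-row", "bob,abc");
        System.out.println(run(input)); // prints {alice=13.0, bob=4.0}
    }
}
```

On a real cluster the grouping step is done by the framework across machines; here a sorted map stands in for it so the extract (parse), transform (filter) and load (aggregate) stages stay visible in one file.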
10 
MapReduce 2.0, YARN, Storm, Spark 
• YARN: Ensures predictable performance & QoS for all apps 
• Enables apps to run “IN” Hadoop rather than “ON” it 
• In Labs: Streaming with Apache Storm 
• In Labs: Mini-batch and in-memory processing with Apache Spark 
Applications Run Natively IN Hadoop 
YARN (Cluster Resource Management) 
HDFS2 (Redundant, Reliable Storage) 
• BATCH (MapReduce) 
• INTERACTIVE (Tez) 
• STREAMING (Storm, Spark) 
• GRAPH (Giraph) 
• NoSQL (MongoDB) 
• EVENTS (Falcon) 
• ONLINE (HBase) 
• OTHER (Search) 
Source: Hortonworks
11 
Talend: Ingest – Transform – Deliver 
• INGEST (Ingestion): SQOOP, FLUME, HDFS API, HBase API 
• TRANSFORM (Data Refinement): MAP, PROFILE, PARSE, CLEANSE, CDC, STANDARDIZE, MACHINE LEARNING, MATCH 
• DELIVER (as an API) 
• Hadoop stack: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, Spark), GRAPH (Giraph), NoSQL (MongoDB), EVENTS (Falcon), ONLINE (HBase), OTHER (Search) 
• YARN (Cluster Resource Management) 
• HDFS2 (Redundant, Reliable Storage) 
• Other labels on the slide: iPaaS, MDM, HA, Govern, Security, Meta, Storm, Kafka, CXF, Camel, Karaf, ActiveMQ, 800+, HIVE
12 
Talend Big Data Sandbox & 
Talend Big Data Jumpstart 
Delivering instant value from all your data
13 
BIG DATA CHALLENGES 
The Big Data Customer Discussion 
14 
Top Big Data Challenges 
Talend Directly Addresses These Challenges 
Source: Gartner - Survey Analysis: Big Data Adoption in 2013 Shows Substance Behind the Hype - 12 September 2013 - G00255160
15 
Talend’s Solution 
16 
TALEND BIG DATA SANDBOX 
30 day customer trial 
17 
Cookbook Step-by-Step Directions 
• Completely Self-contained Demo Sandbox 
• Key Scenarios: 
- Twitter Analysis 
- Clickstream Analysis 
- Web Log Analysis 
- ETL Offload 
• Scenario Summaries 
- Social Media insights 
- Channel optimization 
- Customer insights 
- Data Warehouse Cost Reduction 
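As a rough idea of what a web log scenario computes, the sketch below parses lines in the Apache Common Log Format and counts requests per HTTP status code. It is an assumption-laden illustration, not the Sandbox's actual job: the class name `WebLogSketch`, the regular expression, and the sample log lines are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: count requests per HTTP status code in Common Log Format lines. */
final class WebLogSketch {

    // Groups: 1 = client host, 2 = timestamp, 3 = method, 4 = path, 5 = status, 6 = bytes.
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"(\\S+) (\\S+)[^\"]*\" (\\d{3}) (\\S+)$");

    /** Returns status code -> number of matching lines; lines that do not parse are skipped. */
    static Map<String, Integer> statusCounts(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = CLF.matcher(line);
            if (!m.matches()) {
                continue; // skip anything that is not a well-formed log line
            }
            counts.merge(m.group(5), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines, including one that is not a log record.
        List<String> lines = List.of(
            "10.0.0.1 - - [15/Jul/2014:10:00:00 +0000] \"GET /index.html HTTP/1.1\" 200 1024",
            "10.0.0.2 - - [15/Jul/2014:10:00:01 +0000] \"GET /missing HTTP/1.1\" 404 0",
            "10.0.0.1 - - [15/Jul/2014:10:00:02 +0000] \"GET /index.html HTTP/1.1\" 200 1024",
            "not a log line");
        System.out.println(statusCounts(lines)); // prints {200=2, 404=1}
    }
}
```

The same parse-then-count structure carries over to the clickstream scenario, where the grouping key would be a page or session rather than a status code.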
18 
Ready for Launch 
• Announcements 
- Public announcement Tuesday 15th 
- Newsletter was sent 9th July 
• Customer Nurture campaign 
- Scenario reminders, videos & Links 
- Reminder to Talend AE 
• Two Routes for 5.5 
- Sandbox Download publicly available – 15th July 
- Jumpstart and AE ‘access’ – 15th July 
• Links for the 15th (Sandbox download) 
- Public: http://www.talend.com/talend-big-data-sandbox 
- Account Exec: send download link for customer to fill in: 
• https://info.talend.com/prodevaltpbdsandbox
19 
TALEND BIG DATA JUMPSTART 
A ‘guided tour’ of the Sandbox 
20 
Why the ‘Jumpstart’? 
Practical 
Guided Tour 
• Led by a Talend Solutions Engineer 
• Learn about the Talend Studio 
• See how to execute Hadoop processes 
- Map/Reduce with YARN 
- Pig 
- HDFS 
• See NoSQL Examples 
- Hive 
- HBase 
- MongoDB 
- Cassandra 
21 
Key benefits 
• NO Configuration/Development 
• INSTANT results now, for the Future 
• Valuable prototypes for FREE 
• Working on the top THREE Hadoop Distributions 
22 
3 Simple Messages 
• Sandbox is customer-led, Jumpstart is sales-led 
• Jumpstart is the best way to ‘get Talend’ 
- Google: Talend Jumpstart 
• Work to get the best conversation & involve pre-sales 
23 
Sandbox 
- Talend Jumpstart Sandbox - virtual image installed with: 
• Apache Hadoop distributions provided by Hortonworks, Cloudera & MapR 
• Pre-configured Talend Platform for Big Data 5.5* 
• Four scenarios for you to try: 
– Clickstream data 
– Twitter sentiment 
– Apache weblogs 
– ETL Offload 
• Demonstrations of several NoSQL databases 
*Includes Talend Studio (graphical IDE), team working, 
management, data quality and advanced big data features. 
www.talend.com/products/platform-for-big-data
24 
SHOW ME 
Talend Demo 

Talend Big Data Capabilities - 2014
