www.antuit.com
The information contained in this document is proprietary.
©2016 Antuit. All rights reserved.
Program Highlights
GLOBAL MANUFACTURER AVERTS DATA SWAMP WITH NEW
DATA LAKE ARCHITECTURE
A multi-billion dollar global manufacturer of
electronic components, connectors and sensors
wanted to enhance the value being derived from
their extensive data. The company had launched a
strategic initiative to utilize data as a strategic asset
to transform the business. The company had built a
Hadoop data lake consisting of multiple disparate
data sources of structured and unstructured data,
yet they were unable to effectively leverage the
data to create actionable business insights. The
Hadoop implementation was also showing signs of
performance issues. The organization turned to
Antuit’s team of big data architects and engineers
to improve the performance of their architecture
and create a scalable platform for data consumption.
Working collaboratively with the client, the Antuit
team audited the existing process, and then
re-engineered and implemented a robust, scalable
architecture. As a result of this process, Antuit
uncovered a number of challenges. The client’s
existing architecture could not handle the 10+ years
of sales and marketing data. The existing systems
did not scale, and therefore were not prepared to
handle the velocity or volume of data expected in the
future. Some power users within the organization
were executing overwhelmingly complex queries
that exceeded system limitations. Finally, the data
systems themselves were housed and managed by
disparate business units with minimal integration.
Mindful of the significant investment the client
had made in its Hadoop architecture, Antuit
recommended and then implemented a number
of changes. The Antuit team helped the client
restructure the data lake and created a better
data process by partitioning and compressing
data, using splittable file formats, and helping
them to identify and use the right data types. To
create a seamless experience, Antuit was able to
leverage multiple test environments to validate
approaches and identify ancillary technologies that,
once integrated, would keep their data lake running
smoothly and efficiently.
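
As a rough illustration of two of these changes, the Python sketch below casts raw text fields to proper types and groups records under Hive-style `key=value` partition directories. The field names (`sale_date`, `region`, `amount`), the sample rows, the `/data/sales` base path, and the choice of sale year as the partition key are all hypothetical and are not taken from the client's system.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

# Hypothetical raw records as they might land in the lake: every field is text.
RAW_ROWS = [
    {"sale_date": "2014-03-07", "region": "EMEA", "amount": "1200.50"},
    {"sale_date": "2015-11-21", "region": "APAC", "amount": "310.00"},
    {"sale_date": "2015-01-02", "region": "EMEA", "amount": "99.99"},
]


def cast_row(row):
    """Convert text fields to typed values (date, Decimal) instead of keeping strings."""
    return {
        "sale_date": date.fromisoformat(row["sale_date"]),
        "region": row["region"],
        "amount": Decimal(row["amount"]),
    }


def partition_path(base, row):
    """Build a Hive-style partition directory from the sale year (an assumed key)."""
    return f"{base}/sale_year={row['sale_date'].year}"


def partition_rows(base, rows):
    """Cast each raw row, then group the typed rows by partition directory."""
    partitions = defaultdict(list)
    for raw in rows:
        typed = cast_row(raw)
        partitions[partition_path(base, typed)].append(typed)
    return dict(partitions)


if __name__ == "__main__":
    for path, rows in sorted(partition_rows("/data/sales", RAW_ROWS).items()):
        total = sum(r["amount"] for r in rows)
        print(path, len(rows), total)
```

Typed values make range filters and sums behave as numbers and dates rather than strings, and the per-year directories let a reader skip years it does not need.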
While getting the data lake up and running was
priority number one, changing internal behaviors
and the manner in which queries were written was
an equally important challenge. Antuit established
a new set of guidelines for internal users, directing
them as to how to retrieve desired data from the
lake without bringing the entire system to a halt.
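
The case study does not spell out those guidelines. One guideline commonly given to users of partitioned tables is to filter on the partition key so that a query reads only the partitions it needs. The Python sketch below illustrates that idea under stated assumptions: the partition paths, the yearly partitioning scheme, and the `prune_partitions` helper are hypothetical.

```python
import re

# Hypothetical partition directories, one per sale year.
PARTITIONS = [f"/data/sales/sale_year={year}" for year in range(2005, 2017)]

_YEAR_RE = re.compile(r"sale_year=(\d{4})$")


def prune_partitions(partitions, first_year, last_year):
    """Keep only the partitions whose year falls in the inclusive range requested."""
    selected = []
    for path in partitions:
        match = _YEAR_RE.search(path)
        if match and first_year <= int(match.group(1)) <= last_year:
            selected.append(path)
    return selected


if __name__ == "__main__":
    needed = prune_partitions(PARTITIONS, 2015, 2016)
    print(f"scanning {len(needed)} of {len(PARTITIONS)} partitions")
```

A query without such a filter would have to touch every partition, which is the kind of full scan the guidelines were meant to discourage.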
Program Highlights
Challenge
A global electronics component manufacturer
launched a sales and marketing data hub, only
to find that it could not scale to handle the
volume or velocity that its one terabyte of data
presented. They needed an infrastructure that
was easier, faster, and more reliable.
Solution
Designed and implemented a new data lake
design in Hadoop that was stable and scalable
by reformatting data types, partitioning and
compressing data, and using new file formats.
Outcome
New data lake architecture is fast, reliable, and
easily accessible to business decision makers.
We’ve successfully delivered
projects to our clients globally.
Learn more about our capabilities
www.antuit.com
About Antuit
Antuit solves business problems with analytics solutions that deliver
measurable and sustainable business value. By combining data, science,
technology and industry expertise, Antuit helps companies gain a
competitive advantage in their respective industries. Antuit’s global
team of dreamers, hackers, and hustlers works across a wide range
of industry sectors, developing and deploying analytics solutions that
help clients anticipate customer needs, predict business outcomes, and
quickly respond and react to business changes. Armed with Antuit’s
analytics-powered sales, pricing, marketing and supply chain solutions,
organizations are better equipped to achieve exceptional business success
in a rapidly changing environment. Founded in 2013 and backed by
Goldman Sachs, Antuit has offices in New York, Chicago, Dallas, London,
Hong Kong, Singapore, Tokyo, Auckland, Melbourne, Bangalore, and Pune.
With the solid data architecture
designed by Antuit in place, the
organization now has accessible data
at their fingertips. With a scalable
data lake architecture and sales and
marketing data models in place, the
company is now able to utilize data
as a strategic asset to help transform
the way the business is run through
advanced analytics. Antuit continues
to work with the organization to build
new analytical models that solve
business problems and extract value from
their growing data sets.
4 Key Steps to Improve Hadoop Performance
1. Partitioned larger data sets by an effective key.
2. Implemented a splittable file format such as SequenceFile, Avro, or RCFile.
3. Utilized data compression techniques such as Snappy or bzip2.
4. Used proper data types in Hive tables.
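
To make steps 2 and 3 more concrete, the Python sketch below compresses data in independent blocks, which is roughly the idea behind splittable container formats: each block can be decompressed by a separate worker without reading the rest of the file. It uses the standard-library `bz2` module (Snappy is not in the standard library). The block size, the sample data, and the helper names are illustrative assumptions, and this is not the on-disk layout of any Hadoop file format.

```python
import bz2

# Hypothetical sample: repetitive CSV-like sales lines (1,400 lines in total).
LINES = [f"2015-01-{day:02d},EMEA,{day * 10}.00\n".encode() for day in range(1, 29)] * 50


def compress_in_blocks(lines, lines_per_block):
    """Compress each group of lines separately so blocks are independently readable."""
    blocks = []
    for start in range(0, len(lines), lines_per_block):
        chunk = b"".join(lines[start:start + lines_per_block])
        blocks.append(bz2.compress(chunk))
    return blocks


def read_block(blocks, index):
    """Decompress a single block without touching the others."""
    return bz2.decompress(blocks[index]).splitlines(keepends=True)


if __name__ == "__main__":
    blocks = compress_in_blocks(LINES, lines_per_block=200)
    raw_size = sum(len(line) for line in LINES)
    packed_size = sum(len(block) for block in blocks)
    print(f"{len(blocks)} blocks, {raw_size} bytes raw, {packed_size} bytes compressed")
    print(read_block(blocks, 3)[0])
```

By contrast, a single compressed stream in a non-splittable codec has to be read from the start, so one worker ends up processing the whole file.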
[Figure: data lake architecture. Sources: structured data (ERPs, CRMs, enterprise data warehouse) and semistructured and unstructured data (machine data, real-time, third-party, social media, geospatial, etc.). Enterprise Data Hub (Hadoop): landing zone; Spark ETL, data processing, and DQ; staging; published (inventory, payables, sales, purchasing); labs; hot and warm tiers. Consumption: guided analytics, standard reporting, data discovery/visualization, predictive analytics, machine learning.]
