Jim Forrester
Director, Information Services
jim.forrester@lifeway.com
@jrforrester
HADOOP @ LIFEWAY
ABOUT LIFEWAY
• Founded in 1891
• 4,200+ employees
• One of the world’s largest providers of Christian products
ABOUT LIFEWAY (CONT.)
• Operate more than 185 LifeWay Christian Stores across the United States
• Trade publishing through B&H; direct publishing and events through LifeWay Resources
OUR ISSUES
• Long ETL process in EDW
• Disjointed data warehouse
[Figure: "Business Intelligence Technology and Integration Flowchart" (dated 1/24/2014). The diagram shows the Enterprise Data Warehouse (EDW) on BISQLPROD, with staging server BISQLPROD2 used for running SSIS load packages against it, exchanging data with Business Objects, SAS, Unica, Epiphany, Maximizer, UID/Customer Insight, JDA, Aptify, Oracle eBus, and other source and reporting systems.]
OUR ISSUES (CONT.)
• Analysts spend 80% of their time gathering data sources
• Data is limited to structured sources
• A lot of HiPPOs (highest paid person's opinions)
“Without data you’re just another person with an opinion.”
– W. Edwards Deming
WHY HADOOP?
• ETL -> ELT
• Centralized schema development
• Analysts get to be analysts
• More and different types of data
• “Win with data” – new analytics culture
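The ETL -> ELT shift above can be illustrated with a small sketch. This is not LifeWay's pipeline: it uses Python's built-in sqlite3 as a stand-in for Hive, and the table names (raw_sales, sales_by_store) and sample rows are invented for illustration. Raw rows are landed unchanged first, and the transformation runs afterwards as SQL inside the data store.

```python
import sqlite3

# Hypothetical raw extract; in an ELT flow these rows are loaded as-is,
# including the messy ones, and cleaned up later inside the data store.
RAW_ROWS = [
    ("store_001", "2014-01-24", " 19.99"),
    ("store_001", "2014-01-24", "5.00"),
    ("store_002", "2014-01-24", "n/a"),   # bad amount stays in the raw table
    ("store_002", "2014-01-25", "12.50"),
]


def load_raw(conn, rows):
    """E + L: land the extract unchanged, every column as text."""
    conn.execute(
        "CREATE TABLE raw_sales (store_id TEXT, sale_date TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", rows)


def transform(conn):
    """T: build a cleaned, aggregated table with SQL after loading."""
    conn.execute(
        """
        CREATE TABLE sales_by_store AS
        SELECT store_id,
               COUNT(*) AS sale_count,
               ROUND(SUM(CAST(TRIM(amount) AS REAL)), 2) AS total_amount
        FROM raw_sales
        WHERE TRIM(amount) GLOB '[0-9]*'   -- skip rows that are not numeric
        GROUP BY store_id
        """
    )
    return conn.execute(
        "SELECT store_id, sale_count, total_amount "
        "FROM sales_by_store ORDER BY store_id"
    ).fetchall()


def run_elt():
    """Load first, transform second, and return the aggregated rows."""
    conn = sqlite3.connect(":memory:")
    load_raw(conn, RAW_ROWS)
    result = transform(conn)
    conn.close()
    return result
```

Because the raw table keeps every row, including the malformed one, the cleaning rule can be changed later and the transform rerun without going back to the source system.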
FIRST STEPS
• POCs on several distributions
• Ran POCs on a 5-node VM cluster
• Chose Hortonworks
  • Pure-play distribution
  • Engineering expertise
  • Support model
INVESTMENT
• Started with a 12-node physical cluster
• Cisco UCS hardware to meet infrastructure standards
• Training for 1 administrator and 2 developers
• Two-week professional services engagement
  • Got environment up
  • Successful ETL offload for two data marts
  • Templated framework for remaining data marts
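The deck does not describe how the templated framework works. As one possible reading, the sketch below generates Sqoop import command lines for each data mart from a single template. The JDBC URL, user, paths, and mart and table names are all hypothetical, and nothing is executed; the options shown (--connect, --username, --password-file, --table, --target-dir, --num-mappers) are standard Sqoop import arguments.

```python
from dataclasses import dataclass
import shlex

# Everything below is hypothetical: host, database, user, paths, mart names.
JDBC_URL = "jdbc:sqlserver://edw.example.com:1433;databaseName=edw"
HDFS_ROOT = "/data/raw"


@dataclass(frozen=True)
class DataMart:
    """One data mart to offload: its name, source tables, and parallelism."""
    name: str
    tables: tuple
    mappers: int = 4


def sqoop_import_args(mart, table):
    """Fill the shared template for one table of one data mart."""
    return [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", "/user/etl/.sqoop.pwd",
        "--table", table,
        "--target-dir", f"{HDFS_ROOT}/{mart.name}/{table}",
        "--num-mappers", str(mart.mappers),
    ]


def plan(marts):
    """Return one shell-quoted command line per table, in mart order."""
    return [
        shlex.join(sqoop_import_args(mart, table))
        for mart in marts
        for table in mart.tables
    ]
```

Under this reading, adding a data mart means adding one DataMart entry rather than writing a new load job by hand.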
EMPLOYEE SKILLS
• SQL skills translate well for Hive
• Pig can be picked up quickly through training
• Forward-thinking DBA for administration
HADOOP ECOSYSTEM TECHNOLOGIES
• Currently using:
  • Hive
  • Pig
  • Sqoop
  • Oozie
  • Ranger
TYPES OF DATA
• Structured Systems
  • ERP
  • Logistics
  • POS
  • Merchandising
• Unstructured
  • Wi-Fi analytics
  • Price API
  • Clickstream
  • International weather
  • CPI
  • Census
ARCHITECTURE
[Diagram. Labels: Enterprise Data Hub, Cloud Data, Structured Data, Unstructured Data, API, Apps]
ARCHITECTURE (CONT.)
BUSINESS COLLABORATION
• Socializing impacts of Hadoop
  • Data gathering time for analysts
  • New data sets
  • Improved schemas
• Successful implementation of Hadoop
  • Lays the groundwork for new BI&A tooling
  • Creates an Agile BI framework
USE CASES
• Segmentation/Targeting of customers
• Omnichannel customer views
• Pricing optimization
• Product clustering
• Supply chain optimization
• Cannibalization of products and customers
• Fraud detection on AP
NEXT STEPS
• Continued growth of cluster
• HA/DR planning
• Cloud vs. on-premises
• EMC Isilon
FUTURE OF HADOOP AT LIFEWAY
• Data science
• Machine learning
• Process optimization
• Heat mapping for store optimization
• Event log and sensor aggregation – predicting failure
