2 JUNE 2016
BUILDING A CLOUD BASED DATA WAREHOUSE
GILDAS BAH, BRENT BENSON, & RYAN FRAZIER
Presenters
• Ryan Frazier – Director, Systems Engineering and Operations
• Brent Benson – Enterprise Architect
• Gildas Bah – Data Analyst Engineer
Outline
• About HBX
• HBX Data Management Initiative
• Architecture & Implementation
• Challenges
• Data Driven Culture
• Impacts
• What’s Next?
What is HBX?
• Harvard Business School’s newest division, tasked with reimagining business education for the digital age
• Launched in June 2014
• Located in Allston, five minutes from HBS campus
• Moving from start-up to enterprise mode
• The teaching model sets HBX apart from many online learning options and is reflective of the HBS in-person classroom approach
HBX Platforms
HBX Online Platform
• Mainly asynchronous online business education
• Engagement through student interaction in cohorts of ~400
• Case-based learning with highly interactive teaching elements and peer help
HBX Live
• WGBH studio-based virtual classroom
• Synchronous audio/video with chat, polls, boards
• Up to 60 global students on studio wall, hundreds or more observers
Building a Data Management Practice
Why Build a Data-Driven Culture?
Enhance Outcomes (Students)
• Proactively support struggling students
• Identify challenging content
• Evaluate and improve interactive content, social engagement, and retention
Improve Effectiveness (Staff)
• Scale data-intensive activities like marketing, admissions, & grading
• Use data to test ideas and improve quality of decisions
Refine Pedagogy (Faculty)
• Evaluate new pedagogical approaches
• Optimize evaluation approaches
• Support pedagogical research activities and innovation
Foster Innovation & Continuous Improvement
• Identify and evaluate innovation opportunities
• Drive continuous improvement
Data Management Program Objectives
• Integrate Data Sources into Comprehensive Data Warehouse
• Build Reports and Dashboards
• Enable Self Service
• Ensure Data Quality and Integrity
Tool and Vendor Selection
Data Warehouse
• Options considered: standard relational DB, Redshift
• Chose Redshift because of scalability and performance
• Aligns with AWS platform focus
ETL
• Options considered: Informatica, Talend
• Chose Informatica because of the university relationship and its many pluggable connectors
Reporting/Analytics
• Options considered: MicroStrategy, Qlik, Tableau
• Chose Tableau because of feature set and industry adoption
HBX Data Ecosystem
[Architecture diagram. Labels recoverable from the slide:]
• Source systems: Course Platform Ver. A (MongoDB, MySQL), Course Platform Ver. B (MongoDB, MySQL), Historical Data (MongoDB, MySQL), Admin System (MySQL), Salesforce
• Replication: Reporting Copy, sync
• Connectivity and ETL: Progress ODBC for MongoDB, Informatica Secure Agent
• Data warehouse: Redshift
• Reporting: Tableau Server
HBX Data Management by the Numbers
Source Systems
• 35 databases
• 887 tables
• 5,844 fields
• 109,751,902 rows
Data Warehouse
• 4 Redshift clusters
• 8 databases
• 404 tables
• 5,674 fields
• 400,794,679 rows
Daily ETL Process
• 300 jobs
• 6,515,599 rows
* Updated 6/1/2016
HBX Data Models
{
  "_id": ObjectId("556f25ab662a9b059ea8df8b"),
  "tei_id": "554a607b241b5a3f0e09eefe",
  "course_instance_id": "556dcf55b7431f414d87f06f",
  "user_id": "8701",
  "comments": [
    {
      "id": "ce59a25a-ce69-47-c534611f7ebf",
      "text": "This is a great response…",
      "author_id": "6411",
      "date_created": "2015-09-10",
MySQL (relational)
• Course offerings
• Student demographics
• Applications & registration
• Limited course content
MongoDB (semi-structured)
• Course structure
• Course content
• Student course state
• Metric (timing) data
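A document like the MongoDB example above, where every comment carries the same fields, maps naturally onto a parent table plus a child table. The sketch below illustrates that translation in Python. The function name, the output column names, and the completed sample document are assumptions made for illustration (the slide's document is truncated), and `_id` is written as a plain string rather than an ObjectId.

```python
from typing import Any


def flatten_comments(doc: dict[str, Any]) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Split one response document into a parent row and one child row per comment."""
    response_id = str(doc["_id"])
    parent = {
        "response_id": response_id,
        "tei_id": doc["tei_id"],
        "course_instance_id": doc["course_instance_id"],
        "user_id": doc["user_id"],
    }
    children = [
        {
            "response_id": response_id,  # foreign key back to the parent row
            "comment_id": comment["id"],
            "author_id": comment["author_id"],
            "date_created": comment["date_created"],
            "text": comment["text"],
        }
        for comment in doc.get("comments", [])
    ]
    return parent, children


# Hypothetical, completed version of the truncated slide document.
example_doc = {
    "_id": "556f25ab662a9b059ea8df8b",
    "tei_id": "554a607b241b5a3f0e09eefe",
    "course_instance_id": "556dcf55b7431f414d87f06f",
    "user_id": "8701",
    "comments": [
        {
            "id": "comment-1",  # placeholder id, not taken from the slide
            "text": "This is a great response",
            "author_id": "6411",
            "date_created": "2015-09-10",
        }
    ],
}

parent_row, comment_rows = flatten_comments(example_doc)
```

Each returned dictionary corresponds to one row, so the parent and child lists could be loaded into two relational tables joined on `response_id`.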
Challenges
• Immature data connection support
• Large object storage limitations in Redshift
• Difficulty flattening complex/polymorphic data structures
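The slide does not say which Redshift storage limit caused trouble. One plausible case is long free-text fields: the sketch below assumes the 65,535-byte maximum that Redshift documented for VARCHAR columns around the time of the talk, and flags string values that would not fit. The constant, function name, and sample row are all illustrative assumptions rather than part of the HBX pipeline.

```python
# Assumed limit: maximum size in bytes of a Redshift VARCHAR column (circa 2016).
REDSHIFT_VARCHAR_MAX_BYTES = 65535


def oversized_fields(row: dict[str, object], limit: int = REDSHIFT_VARCHAR_MAX_BYTES) -> dict[str, int]:
    """Return {field_name: byte_length} for string values larger than the limit."""
    too_big = {}
    for name, value in row.items():
        if isinstance(value, str):
            # The limit is counted in bytes, so measure the UTF-8 encoding, not characters.
            size = len(value.encode("utf-8"))
            if size > limit:
                too_big[name] = size
    return too_big


# Hypothetical row with one very long free-text answer.
sample_row = {"user_id": "3312", "answer": "x" * 70000}
flagged = oversized_fields(sample_row)
```

A check like this could run before loading, so that oversized values are truncated, split, or stored elsewhere instead of failing the load.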
Document-Structured Data Challenges
{"_id": ObjectId("2804c514e4c20e6d"),
 "course_instance_id": "2804c51563c9c772",
 "tei_id": "241b5a14b75fac83",
 "user_id": "3312",
 "date_created": datetime.datetime(2015, 10, 14, 10, 56, 59, 137000),
 "category": "timespan",
 "metric": {"interaction_time": 180,
            "is_interaction_time": True}}
{"_id": ObjectId("2804c514f92f1f64"),
 "course_instance_id": "2804c51563c9c772",
 "tei_id": "241b5a14b75fab70",
 "user_id": "3312",
 "date_created": datetime.datetime(2015, 10, 14, 17, 11, 56, 967000),
 "category": "view_user_response",
 "metric": {"viewed_user_id": "9212"}}
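The two metric documents share their top-level fields but carry different keys inside `metric`. A minimal sketch of one way to flatten such simple polymorphic documents into a single wide table follows: each `metric` key becomes its own column, and rows that lack a key get `None`. The function names and the `metric_` column prefix are assumptions, and `_id` is written as a plain string so the example runs without a MongoDB driver.

```python
import datetime
from typing import Any


def flatten_metric(doc: dict[str, Any]) -> dict[str, Any]:
    """Copy the shared fields and promote each metric key to a prefixed column."""
    row = {
        "id": str(doc["_id"]),
        "course_instance_id": doc["course_instance_id"],
        "tei_id": doc["tei_id"],
        "user_id": doc["user_id"],
        "date_created": doc["date_created"].isoformat(),
        "category": doc["category"],
    }
    for key, value in doc["metric"].items():
        row[f"metric_{key}"] = value
    return row


def to_table(docs: list[dict[str, Any]]) -> tuple[list[str], list[list[Any]]]:
    """Build a column list (first-seen order) and rows padded with None."""
    rows = [flatten_metric(d) for d in docs]
    columns = list(dict.fromkeys(col for row in rows for col in row))
    return columns, [[row.get(col) for col in columns] for row in rows]


metric_docs = [
    {"_id": "2804c514e4c20e6d", "course_instance_id": "2804c51563c9c772",
     "tei_id": "241b5a14b75fac83", "user_id": "3312",
     "date_created": datetime.datetime(2015, 10, 14, 10, 56, 59, 137000),
     "category": "timespan",
     "metric": {"interaction_time": 180, "is_interaction_time": True}},
    {"_id": "2804c514f92f1f64", "course_instance_id": "2804c51563c9c772",
     "tei_id": "241b5a14b75fab70", "user_id": "3312",
     "date_created": datetime.datetime(2015, 10, 14, 17, 11, 56, 967000),
     "category": "view_user_response",
     "metric": {"viewed_user_id": "9212"}},
]

columns, table = to_table(metric_docs)
```

The wide-table layout keeps every metric category in one place at the cost of sparse columns; a separate table per `category` would be the alternative.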
Document-Structured Data Challenges
{"_id": ObjectId("562666868c58dab88be84345"),
"course_instance_id": "561829c02804c51563c9c772",
"tei_id": "55c21b88241b5a14b75fab8d",
"user_id": "3312",
"state": {"answer": "The case really drove home..."}}
{"_id": ObjectId("56414b498c58dab88bf10873"),
"course_instance_id": "561829c02804c51563c9c772",
"tei_id": "5639ef402804c509af1d2721",
"user_id": "3312",
"state": {"summary":
[{"content": "Incorrect: Being quick to market...",
"correct": False,
"id": "5bc32d05-1173-452c-801b-34c2368ea4b6"},
{"content": "Correct: In the early stages...",
"correct": True,
"id": "88893c55-3e30-4f50-8495-a6fe1f1cef94"},
{"content": "Incorrect: Customization becomes...",
"correct": False,
"id": "4afbf29f-d23d-43a3-8266-e41df3defa69"}]}}
User state documents for reflection and multiple choice
Document-Structured Data Challenges
• Documents with simple and consistent structure are easy
to translate into relational form
• Documents with simple, but polymorphic structure are
handled by modern MongoDB drivers (metric example)
• Documents with complicated and polymorphic structure
(user state example) push the boundaries of current
drivers and declarative tools
• Current solution: copy like-typed documents into
separate collections
• Preferred solution: copy all documents into
warehouse and do post-copy transforms for summary
and detailed information in relational form
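The preferred solution can be pictured as a transform that runs after the raw documents are copied: each user state document yields one summary row and, where it contains a list, one detail row per element. The sketch below handles only the two shapes shown on the earlier slide (a reflection with `answer`, and a multiple-choice `summary` list). The function name, column names, and `state_type` labels are assumptions for illustration, not the HBX implementation.

```python
from typing import Any


def transform_state(doc: dict[str, Any]) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Return (summary_row, detail_rows) for one user state document."""
    state = doc["state"]
    summary = {
        "state_id": str(doc["_id"]),
        "course_instance_id": doc["course_instance_id"],
        "tei_id": doc["tei_id"],
        "user_id": doc["user_id"],
        "state_type": "unknown",  # kept so unrecognized shapes are still loaded
        "answer": None,
        "choice_count": None,
        "correct_count": None,
    }
    details: list[dict[str, Any]] = []
    if "answer" in state:
        # Reflection: a single free-text answer, no detail rows.
        summary["state_type"] = "reflection"
        summary["answer"] = state["answer"]
    elif "summary" in state:
        # Multiple choice: one detail row per listed choice.
        choices = state["summary"]
        summary["state_type"] = "multiple_choice"
        summary["choice_count"] = len(choices)
        summary["correct_count"] = sum(1 for c in choices if c["correct"])
        details = [
            {"state_id": summary["state_id"], "choice_id": c["id"],
             "correct": c["correct"], "content": c["content"]}
            for c in choices
        ]
    return summary, details


reflection_doc = {
    "_id": "562666868c58dab88be84345",
    "course_instance_id": "561829c02804c51563c9c772",
    "tei_id": "55c21b88241b5a14b75fab8d",
    "user_id": "3312",
    "state": {"answer": "The case really drove home..."},
}
choice_doc = {
    "_id": "56414b498c58dab88bf10873",
    "course_instance_id": "561829c02804c51563c9c772",
    "tei_id": "5639ef402804c509af1d2721",
    "user_id": "3312",
    "state": {"summary": [
        {"content": "Incorrect: Being quick to market...", "correct": False,
         "id": "5bc32d05-1173-452c-801b-34c2368ea4b6"},
        {"content": "Correct: In the early stages...", "correct": True,
         "id": "88893c55-3e30-4f50-8495-a6fe1f1cef94"},
        {"content": "Incorrect: Customization becomes...", "correct": False,
         "id": "4afbf29f-d23d-43a3-8266-e41df3defa69"},
    ]},
}

reflection_summary, reflection_details = transform_state(reflection_doc)
choice_summary, choice_details = transform_state(choice_doc)
```

Because unrecognized shapes still produce a summary row marked `unknown`, new interaction types can be copied first and given their own branch later.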
Creating a Data-Driven Culture
Creating a Data Driven Culture
People
• Technical
• Partners
• Leadership
• Staff
Technology
• Self-service
• Eliminate Complexity
• Experimentation
Process
• Process Governance
• Data Governance
• Education
Enablers for Building Data Driven Culture at HBX
Strong Partners
• Use off-shore partner Mindtree to accelerate
• Active engagement of vendors on technology challenges
Education
• Short Presentations to staff
• Data Analysis Exercise at all-staff team meeting
Program Governance
• Active interest & involvement from Business Areas
• Alignment to organizational priorities
Experimentation
• HBX willingness to try new things
• Helps drive engagement with vendors
Organizational Impacts
Enablement of real-time data-driven decision making
• Dashboards for Registration Pipeline and Demographics
• Application Forecasting Dashboard
A move from spreadsheets to dashboards and configurable business processes
• Development of grading automation data pipeline
• Reporting for B2B Participants
A move from individually handled data requests to dashboards and self-service reporting
• Self-service marketing data extract
What’s Next?
• Streaming Data?
• Native JSON Data Warehouse?
• Analytics?
• Additional Data Sources?
www.hbx.hbs.edu
Questions?


Editor's Notes

  • #7: Add link to the actual dashboard
  • #11: AWS for core infrastructure
      • Administrative system: migrating to Salesforce over the next year
      • Multiple production environments: with each new release, ensure stability
      • Reporting Copy plus archive for historical data
      • ETL: reviewed several options; picked Informatica to align with HU and HBS data management architecture; use the cloud version
      • Redshift for EDW: aligns with cloud and AWS
      • Tableau for reporting
  • #19: Enablers by area
      • People: technical staff to lead and execute; partners and vendors who work collaboratively to support you; leadership to support investments and organizational priority; staff support to learn new skills and build data literacy
      • Process: process governance (prioritization, policies, business involvement); data governance (data quality, data lineage, data model definitions); education (data literacy and awareness)
      • Technology: emphasize self-service to enhance data literacy; try to make tools and data models simple for business teams to use and understand; experiment with new technology to be more efficient, solve actual problems, and generate more value