Data Wrangling Questionnaire
Case Study: Vertically Integrated Food Manufacturer
Getting Started: Map Your Data and its Life Cycle
Create
Store
Use
Share
Archive
Destroy
Getting Started: Tell me more.
● Where will data come from? What sources are available?
● Where does data go?
● Do you use version control?
● What is the size of each dataset, and how much do you need to get from each one?
● Does your system have an API? (QuickBooks, Google Analytics, etc.)
● Do you use the cloud? databases? spreadsheets? hard copies?
And more...
● Do you store your data in one place, like a data warehouse?
● Do you need outside data from external sources?
● Do you need all of the data for more granular analysis, or do you need a subset to ensure faster performance?
● Will the data need to be standardized?
● How frequently will you need to import/export data?
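To make the standardization question concrete, here is a minimal sketch of mapping rows from two silos into one shared layout. The column names, the two sample sources, and the `standardize` helper are all hypothetical; real exports from your systems will look different.

```python
from datetime import datetime

# Hypothetical per-source rules: column renames and the date format each export uses.
SOURCE_RULES = {
    "online": {
        "columns": {"Order Date": "date", "SKU": "sku", "Total": "revenue"},
        "date_format": "%m/%d/%Y",
    },
    "wholesale": {
        "columns": {"po_date": "date", "item": "sku", "amount_usd": "revenue"},
        "date_format": "%Y-%m-%d",
    },
}


def standardize(row, source):
    """Return a new row with shared column names, ISO dates, and numeric revenue."""
    rules = SOURCE_RULES[source]
    out = {new: row[old] for old, new in rules["columns"].items()}
    out["date"] = datetime.strptime(out["date"], rules["date_format"]).date().isoformat()
    out["sku"] = out["sku"].strip().upper()
    out["revenue"] = float(str(out["revenue"]).replace("$", "").replace(",", ""))
    out["source"] = source  # keep track of which silo the row came from
    return out


# Made-up sample rows, one per source.
online_row = {"Order Date": "03/07/2024", "SKU": " gran-01 ", "Total": "$1,250.00"}
wholesale_row = {"po_date": "2024-03-09", "item": "GRAN-01", "amount_usd": 4800}

combined = [standardize(online_row, "online"), standardize(wholesale_row, "wholesale")]
for record in combined:
    print(record)
```

Keeping the rules in one table per source means a new silo can be added by describing its columns rather than writing new logic.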
Data Silos
● Online Sales Data (Shopify, Squarespace, etc.)
● Grocery Sales Data (IRI, Nielsen, etc.)
● Manufacturing and Ops Data (Timesheets, Excel, Google Sheets, etc.)
● Wholesale Sales Data (Purchase Orders, etc.)
● Accounting Data (QuickBooks, Xero, SAP, etc.)
● Supplier Production Data (Excel, Google Sheets, etc.)
● Website Metrics (Google Analytics, Squarespace, etc.)
● HR Data (Zenefits, Google Sheets, Asana, etc.)
● Customer Service Data (Zendesk, Salesforce, etc.)
● Inventory and Warehouse Data (QuickBooks, Excel, etc.)
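One possible way to record the answers for each silo is a small inventory, sketched below. The `DataSource` fields and the sample answers (API availability, storage, refresh cadence, life cycle stage) are placeholders for illustration, not findings from the case study.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """Life cycle stages from the mapping slide."""
    CREATE = 1
    STORE = 2
    USE = 3
    SHARE = 4
    ARCHIVE = 5
    DESTROY = 6


@dataclass
class DataSource:
    silo: str      # e.g. "Accounting Data"
    system: str    # e.g. "QuickBooks"
    has_api: bool  # answer to "Does your system have an API?"
    storage: str   # "cloud", "database", "spreadsheet", or "hard copy"
    refresh: str   # how often data is imported/exported
    stage: Stage   # where the data currently sits in its life cycle


# Placeholder answers; fill these in from the questionnaire.
inventory = [
    DataSource("Accounting Data", "QuickBooks", True, "cloud", "monthly", Stage.USE),
    DataSource("Manufacturing and Ops Data", "Excel", False, "spreadsheet", "weekly", Stage.STORE),
    DataSource("Wholesale Sales Data", "Purchase Orders", False, "hard copy", "weekly", Stage.CREATE),
]

# Sources without an API will likely need manual export or data entry.
manual_sources = [s.system for s in inventory if not s.has_api]
print(manual_sources)
```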
The Hard (and Longest) Part: Cleaning, ugh.
● How can you ensure data quality?
● Who is responsible for developing and maintaining reports?
● What people and data resources are needed?
● For each source of data, is it complete, accurate, and up to date?
● Can I use the data in its current state?
● If there are inconsistencies or redundant values, what do I need to do to clean the data?
● Is it a matter of manually changing a few values or will a more systematic approach be necessary?
● Will I need to change the data in its original location or in a secondary environment?
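A more systematic approach might look like the sketch below: profile the data for missing values and duplicates, then clean a copy so the original stays untouched (the "secondary environment" option). The required fields and sample rows are hypothetical, and real cleaning rules will depend on each source.

```python
import copy

REQUIRED_FIELDS = ("sku", "date", "units")  # hypothetical required columns


def is_incomplete(row):
    """True if any required field is missing or empty."""
    return any(row.get(field) in (None, "") for field in REQUIRED_FIELDS)


def profile(rows):
    """Count rows, rows with missing required values, and exact duplicate rows."""
    seen, duplicates = set(), 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {
        "rows": len(rows),
        "incomplete_rows": sum(1 for row in rows if is_incomplete(row)),
        "duplicate_rows": duplicates,
    }


def clean(rows):
    """Return a cleaned copy; the input list is not modified."""
    cleaned, seen = [], set()
    for row in copy.deepcopy(rows):
        row["sku"] = (row.get("sku") or "").strip().upper()  # normalize inconsistent SKUs
        key = tuple(sorted(row.items()))
        if key in seen or is_incomplete(row):
            continue  # drop duplicates and incomplete rows
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Made-up rows showing an inconsistent SKU and a missing date.
raw = [
    {"sku": "gran-01", "date": "2024-03-07", "units": 12},
    {"sku": "GRAN-01 ", "date": "2024-03-07", "units": 12},
    {"sku": "BAR-02", "date": "", "units": 5},
    {"sku": "BAR-02", "date": "2024-03-08", "units": 5},
]

print(profile(raw))
tidy = clean(raw)
print(profile(tidy))
```

Note that the two granola rows only show up as duplicates after the SKU is normalized, which is why profiling before and after cleaning is useful.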
Whatcha looking at?
● Do you currently analyze your data?
● What tools do you use and how often do you use them? (Excel, Tableau, etc.)
● Do you analyze mostly financial, operational, or customer data?
● Do you have standard metrics or KPIs that you already review?
Sales KPIs
● Average deal size
● Average revenue per product
● Customer acquisition cost as a percent of sales value
● Customer churn ratio
● Customer purchase frequency
● Customer loyalty
● Customer satisfaction
● Gross margin per product
● Gross margin per sales person
● Number of units sold per day/week/month/quarter/year
● Percentage of online, wholesale, store sales revenue
● Pipeline by sales stage
● Revenue per sales person
● Sales growth
● Win/loss ratio percentage
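As a rough illustration, a few of these sales KPIs can be computed from a list of deals. The deal records below are made up, and the definitions used (win rate as won deals over all closed deals, gross margin as revenue minus cost over revenue) are common conventions rather than the only ones.

```python
from collections import defaultdict

# Hypothetical closed deals; all values are made up for illustration.
deals = [
    {"rep": "Ana", "product": "Granola", "revenue": 1200.0, "cost": 700.0, "won": True},
    {"rep": "Ana", "product": "Bars", "revenue": 800.0, "cost": 500.0, "won": True},
    {"rep": "Ben", "product": "Granola", "revenue": 2000.0, "cost": 1200.0, "won": True},
    {"rep": "Ben", "product": "Bars", "revenue": 0.0, "cost": 0.0, "won": False},
]

won = [d for d in deals if d["won"]]

# Average deal size: total revenue of won deals divided by the number of won deals.
average_deal_size = sum(d["revenue"] for d in won) / len(won)

# Win rate: won deals as a share of all closed deals.
win_rate = len(won) / len(deals)

# Revenue per salesperson and revenue/cost totals per product.
revenue_per_rep = defaultdict(float)
totals_by_product = defaultdict(lambda: {"revenue": 0.0, "cost": 0.0})
for d in won:
    revenue_per_rep[d["rep"]] += d["revenue"]
    totals_by_product[d["product"]]["revenue"] += d["revenue"]
    totals_by_product[d["product"]]["cost"] += d["cost"]

# Gross margin per product as a fraction of that product's revenue.
gross_margin = {
    product: (t["revenue"] - t["cost"]) / t["revenue"]
    for product, t in totals_by_product.items()
}

print(f"Average deal size: {average_deal_size:.2f}")
print(f"Win rate: {win_rate:.0%}")
print(dict(revenue_per_rep))
print(gross_margin)
```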
Marketing KPIs
● Ad click-through ratio
● Brand awareness percentage
● Column inches of media coverage
● Cost per lead
● Leads generated
● Number of client visits
● Number of product focus groups
● Number of trade shows attended
● Q score (a way to measure the familiarity and appeal of a brand, etc.)
● ROI of brand
● Return on marketing investment
● Website click-throughs
● Website hits
● Website leads generated
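Several of the marketing KPIs are simple ratios. The sketch below uses widely used definitions, with a made-up campaign; the `attributed_margin` figure in particular assumes you already have some way of attributing margin to the campaign, which is often the hard part.

```python
def click_through_rate(clicks, impressions):
    """Share of ad impressions that resulted in a click."""
    return clicks / impressions


def cost_per_lead(spend, leads):
    """Marketing spend divided by the number of leads generated."""
    return spend / leads


def return_on_marketing_investment(attributed_margin, spend):
    """Margin attributed to marketing, net of spend, per unit of spend."""
    return (attributed_margin - spend) / spend


# Hypothetical campaign numbers, for illustration only.
campaign = {
    "spend": 5000.0,
    "impressions": 200_000,
    "clicks": 4_000,
    "leads": 250,
    "attributed_margin": 12_500.0,
}

ctr = click_through_rate(campaign["clicks"], campaign["impressions"])
cpl = cost_per_lead(campaign["spend"], campaign["leads"])
romi = return_on_marketing_investment(campaign["attributed_margin"], campaign["spend"])

print(f"CTR: {ctr:.2%}, cost per lead: {cpl:.2f}, ROMI: {romi:.2f}")
```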
Finance KPIs
● AP and AR turnover ratios
● Accounts receivable collection period
● Average customer receivable
● Average monetary value of invoices outstanding
● Budget variance
● Capital expenditures
● Cash conversion cycle
● Cost of Goods Sold
● Compound Annual Growth Rate (CAGR)
● Days payable
● Discounted cash flow
● Distribution costs as a percent of revenue
● EBIT and EBITDA
● Fixed Costs
● Gross profit and GP margin
● Inventory turnover
● Inventory value
● Internal rate of return
● Net change in cash
● Net income
● Number of invoices outstanding
● Quick ratio
● Variable costs
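A few of the finance KPIs reduce to short formulas, sketched here with made-up figures. The growth rate is computed as a compound annual growth rate, and the quick ratio uses the simple form (current assets minus inventory, over current liabilities); stricter definitions exist, so check which one your accounting team uses.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def quick_ratio(current_assets, inventory, current_liabilities):
    """Liquid assets available per unit of short-term liabilities."""
    return (current_assets - inventory) / current_liabilities


def inventory_turnover(cogs, opening_inventory, closing_inventory):
    """Cost of goods sold divided by average inventory for the period."""
    average_inventory = (opening_inventory + closing_inventory) / 2
    return cogs / average_inventory


def cash_conversion_cycle(days_inventory, days_receivable, days_payable):
    """Days of cash tied up: inventory days plus receivable days minus payable days."""
    return days_inventory + days_receivable - days_payable


# Hypothetical figures, for illustration only.
growth = compound_annual_growth_rate(1_000_000, 1_331_000, 3)
quick = quick_ratio(500_000, 200_000, 150_000)
turnover = inventory_turnover(600_000, 180_000, 220_000)
ccc = cash_conversion_cycle(60, 45, 30)

print(f"CAGR: {growth:.1%}")
print(f"Quick ratio: {quick:.2f}")
print(f"Inventory turnover: {turnover:.2f}")
print(f"Cash conversion cycle: {ccc} days")
```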
Operations KPIs
● Asset utilization
● Comparative analytics for products, plants, divisions
● Cycle time
● Demand forecasting
● Downtime to operating time ratio
● Job, product costing
● Labor as a percentage of cost
● Maintenance cost per unit
● Manufacturing cost per unit
● Number of non-compliance events (HACCP, government regulations)
● On-time orders and shipping
● Open orders
● Overtime as a percentage of total hours
● Products per machine, unit, line, or plant per shift and per day
● Time from order to shipment
● Yield
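Here is a minimal sketch of how some operations KPIs could be derived from shift logs and order dates. The record layouts and numbers are hypothetical; a real plant would pull these from timesheets, production sheets, and order systems.

```python
from datetime import date

# Hypothetical shift logs: units started, good units, and hours.
shifts = [
    {"started": 1000, "good": 950, "operating_h": 7.0, "downtime_h": 1.0,
     "regular_h": 64.0, "overtime_h": 8.0},
    {"started": 1000, "good": 970, "operating_h": 7.5, "downtime_h": 0.5,
     "regular_h": 64.0, "overtime_h": 0.0},
]

# Hypothetical orders: (ordered, shipped, promised ship date).
orders = [
    (date(2024, 3, 1), date(2024, 3, 4), date(2024, 3, 5)),
    (date(2024, 3, 2), date(2024, 3, 9), date(2024, 3, 6)),
]


def total(field):
    """Sum one numeric field across all shift logs."""
    return sum(shift[field] for shift in shifts)


# Yield: good units as a share of units started.
yield_rate = total("good") / total("started")

# Downtime to operating time ratio.
downtime_ratio = total("downtime_h") / total("operating_h")

# Overtime as a percentage of total labor hours.
overtime_share = total("overtime_h") / (total("regular_h") + total("overtime_h"))

# Time from order to shipment, averaged in days.
days_to_ship = [(shipped - ordered).days for ordered, shipped, _ in orders]
average_days_to_ship = sum(days_to_ship) / len(days_to_ship)

# On-time shipping: shipped on or before the promised date.
on_time_rate = sum(shipped <= promised for _, shipped, promised in orders) / len(orders)

print(f"Yield: {yield_rate:.1%}")
print(f"Downtime/operating ratio: {downtime_ratio:.3f}")
print(f"Overtime share: {overtime_share:.1%}")
print(f"Average days to ship: {average_days_to_ship:.1f}")
print(f"On-time rate: {on_time_rate:.0%}")
```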
Who cares?
● Who are the data users/key stakeholders?
○ Board members
○ Sales reps
○ Customers
○ Employees
○ Suppliers/Vendors
○ Support teams - Accounting, Legal
● What are their needs? Expected findings? Technical skills? Time constraints?
● Who should be able to access the information? (confidentiality/security concerns)
The Big Ask.
● What is the business requesting vs. what does it actually need?
● What do you want to find? Data mining vs Data analysis
● How will the results be used? (business decisions, invest in new products, identify risks...)
● What information will be on the report?
● What KPIs will show you the results you need? (ability to filter segments, data across time, drill-downs, etc.)
● What do you want to change?
● When will each report be delivered? How frequently does it need to be updated?
