But how do I GET the data?
Transparency Camp 2014
Shooju is a Web-Based Data Platform
• Consolidate your internal and external data sources
• Make all data searchable from one place
• Provide continuous updating
• Seamlessly integrate with tools and applications
• Share data across your entire organization
• Save time and energy while reducing errors and problems with version control
Shooju saves time, improves data quality and enhances data sharing across your entire organization
The Analytical Process
[Diagram, built up over six slides: many separate "Data" sources feed into "some place", then "your tool of choice", then "your product". The last two builds label parts of this pipeline "The Fun Part ☺" and "The Not Fun Part ☹".]
Big data vs. small data
A boring 2 x 2
The harsh 80/20 reality
Most organizations spend more time collecting, cleaning, downloading, managing and wrangling data than they do conducting analysis
Three ways to get data
• API
– Good
– Bad
• Scraping
• Manual
Defined as ETL (Extract, Transform, Load) process
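To make the Extract, Transform, Load steps concrete, here is a minimal Python sketch that runs the same small dataset through an API-style extract and a scraping-style extract. The payloads, field names, and helper functions are all hypothetical stand-ins, not anything from the talk or from Shooju; a real API or web page would be fetched over the network.

```python
import json
from html.parser import HTMLParser

# Hypothetical payloads standing in for a real API response and a real web page.
API_PAYLOAD = '[{"date": "2014-05-01", "value": "1,200"}, {"date": "2014-05-02", "value": "1,350"}]'
HTML_PAGE = """
<table>
  <tr><td>2014-05-01</td><td>1,200</td></tr>
  <tr><td>2014-05-02</td><td>1,350</td></tr>
</table>
"""


def extract_from_api(payload):
    """Extract: an API hands back structured records directly."""
    return [(rec["date"], rec["value"]) for rec in json.loads(payload)]


class TableScraper(HTMLParser):
    """Extract: scraping has to rebuild rows from the page markup."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(tuple(self._row))
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())


def extract_by_scraping(page):
    scraper = TableScraper()
    scraper.feed(page)
    return scraper.rows


def transform(rows):
    """Transform: turn display strings such as "1,200" into numbers."""
    return [(date, float(value.replace(",", ""))) for date, value in rows]


def load(rows, store):
    """Load: the destination here is just a dict keyed by date."""
    for date, value in rows:
        store[date] = value
    return store


api_store = load(transform(extract_from_api(API_PAYLOAD)), {})
scraped_store = load(transform(extract_by_scraping(HTML_PAGE)), {})
print(api_store == scraped_store)
```

Both paths end in the same store. The difference is how much code the extract step needs, and how likely that code is to break when the page layout changes.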
Method comparison
[Chart: the three methods (Manual, Scraping, API) placed on two axes, "Technical expertise required" and "Time (and annoyance)".]
Average cost curve of data collection
[Chart, built up over three slides: average cost against number of times data is collected, with curves added in turn for Manual Collection, Scraping, and API.]
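The slides do not give the numbers behind these curves. As a rough illustration only, the sketch below assumes a simple model in which each method has a one-time setup cost plus a cost per collection run, so the average cost per run is (setup + per-run cost × runs) / runs. The hours assigned to each method are invented for the example.

```python
def average_cost(setup_cost, cost_per_run, runs):
    """Average cost per collection = (one-time setup + per-run cost * runs) / runs."""
    return (setup_cost + cost_per_run * runs) / runs


# Hypothetical (setup hours, hours per run); none of these numbers come from the slides.
METHODS = {
    "Manual": (0.0, 2.0),
    "Scraping": (8.0, 0.25),
    "API": (16.0, 0.05),
}

for runs in (1, 10, 100):
    costs = {
        name: round(average_cost(setup, per_run, runs), 2)
        for name, (setup, per_run) in METHODS.items()
    }
    print(runs, costs)
```

Under these made-up inputs, manual collection is cheapest for a one-off pull, while the methods with higher setup cost become cheaper per run as the number of collections grows.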
How do you get your data?
What do you like?
What don’t you like?
Once the data is scraped, where can it go?
• CSV
• XLS
• DBF
• SQL
• NoSQL
• Many others
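As a small illustration of two of these destinations, the sketch below writes the same rows to CSV and to a SQL table using only the Python standard library. The rows, field names, and table name are hypothetical; an in-memory buffer and an in-memory SQLite database stand in for a real file and a real database server.

```python
import csv
import io
import sqlite3

# Hypothetical scraped rows; the field names are illustrative only.
rows = [
    {"source": "example-site", "date": "2014-05-01", "value": 1200.0},
    {"source": "example-site", "date": "2014-05-02", "value": 1350.0},
]

# CSV: write to an in-memory buffer; use open("data.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["source", "date", "value"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# SQL: load the same rows into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (source TEXT, date TEXT, value REAL)")
conn.executemany(
    "INSERT INTO observations VALUES (:source, :date, :value)", rows
)
total = conn.execute("SELECT SUM(value) FROM observations").fetchone()[0]

print(csv_text)
print(total)
```

CSV is the easiest to hand to a colleague, while the SQL table can be queried and updated in place, which matters once the same data is collected repeatedly.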
Where does your data go when you collect it?
Appendix
Shooju Value Added
Cost Savings
By saving analyst time and energy, Shooju allows analysts to do more with less,
reducing data management costs and putting more focus on high-value analysis.
Added Quality
Automating data processes internally will ensure that your data is accurate, up-to-date
and consistent across your entire organization.
Enhanced Decision Making
Having more accurate data available faster with more analyst time left for analysis
leads to enhanced decision making.
Sample Cost Savings
[Diagram: cost savings broken down by Shooju source, with the drivers and sample values behind each figure.]
Shooju sources:
• Excel Add-In & Other Tools
• Custom BI Apps
• Web Search
• Auto-Import
Drivers:
• # of analysts retrieving, time saved in retrieval, # of sources, frequency of retrieval
• # of analysts refreshing, time saved in tool refresh, # of sources, frequency of refresh
• time to integrate data, analysts contributing data, # of tools created, analyst upload time
• # of analysts searching, time saved in search, # of sources, frequency of search
Sample values:
• 5 analysts, 65 min / source, 22 sources, 18 times / year
• 11 analysts, 74 min / source, 22 sources, 14 times / year
• 13 analysts, 9 min / source, 22 sources, 32 times / year
• 14 wk of dev. saved, 8 analysts contributing, 2 apps created, 40 min, 10 times / year
Cost savings, USD (%):
• $97k (14%)
• $73k (10%)
• $248k (35%)
• $284k (41%)
• Total: $702k
Shooju speeds up custom BI application development by making all data natively accessible and continuously updated in any BI tool or custom app.
$410k savings equivalent to 10% of HR spend*
* Based on real 40-person organization. Assumed annual wages vary between $30k and $140k.
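The arithmetic behind the savings figures on this slide can be checked with a few lines of Python. The four amounts and their percentages are taken from the slide. The `hours_per_year` helper is an assumption about how the drivers combine (analysts × minutes per source × sources × frequency); the slide does not state that formula.

```python
# Savings figures as printed on the slide, in thousands of USD, with the stated shares.
stated = [(97, 14), (73, 10), (248, 35), (284, 41)]

total = sum(amount for amount, _ in stated)
shares = [round(100 * amount / total, 1) for amount, _ in stated]
print(total, shares)


def hours_per_year(analysts, minutes_per_source, sources, times_per_year):
    """Assumed model: every analyst saves the stated minutes on every source, every time."""
    return analysts * minutes_per_source * sources * times_per_year / 60


# One of the driver sets from the slide: 5 analysts, 65 min / source, 22 sources, 18 times / year.
print(hours_per_year(5, 65, 22, 18))
```

The four amounts sum to the stated $702k total, and the recomputed shares fall within one percentage point of the printed ones. The small gap is presumably because the slide's amounts are rounded to the nearest thousand.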
Added Quality: The Three “Cs”
Consistency
Shooju ensures that all analysts are using the same data across all their tools and applications. Because analysts can upload their own data to the platform, internal and external data flow together seamlessly, without messy spreadsheet links.
Currency
By automatically pulling in the latest source data through the Shooju importer layer, Shooju ensures that all of your spreadsheets and models are populated with the latest data. Our native plugins for Excel, Access and all your other tools allow data to flow through directly, with no need for the analyst to download or copy and paste.
Correctness
The more data is touched by human hands, the more prone it is to errors. By streamlining workflows and automating work processes, Shooju eliminates most of these errors, saving time and making the data you rely on more accurate.
We support any data source
Ask us about non-mainstream data sources that traditional data providers don’t carry.
Shooju Data Process
Shooju vs. Custom Data Warehouse
                        Custom Data Warehouse    Shooju
Design                  Custom                   “Plug-and-play”
Cost                    7+ digits                5-6 digits
Rollout timeline        Months / Years           Hours
Scalability             Minimal                  Infinite
Flexibility             Low                      High
Maintenance             High                     Low
Stakeholders            IT controlled            Analyst run / IT maintained
Tool and app support    Clunky, requiring IT     Native tool support
Data warehouse projects are costly and time-consuming, and result in inflexible systems with low adoption rates
Shooju vs. Off-the-shelf Data Management*
                             Off-the-shelf Data Management*    Shooju
Service focus                Data provision/management         Process improvement
Prepackaged data feeds       Many                              None
Custom data feeds            None (not natively supported)     Included (all feeds are custom)
Internal data integration    Weeks (high consulting fees)      Days (included in service)
Process flexibility          Low                               High
Analyst learning curve       Weeks                             Hours
Ease of migrating off        Very difficult/impossible         Easy
Annual fee                   6-7 digits                        5-6 digits
Data management* solutions focus on generic data provision rather than process improvement and limit analysts to a closed and inflexible data ecosystem.
* Top-ranked providers in the EnergyRisk Data Management category include: Morningstar, ZE Power Group, SunGard, Allegro, Pioneer Solutions, SAS, and InteractiveData. See http://www.slideshare.net/Allegrodev/energy-risk-magazines-etrm-software-rankings-2013
