Introducing BigSheets
Spreadsheet-Style Tool
for IBM InfoSphere BigInsights

Cynthia M. Saracco
Senior Solution Architect
IBM Silicon Valley Lab
What is BigSheets?

Browser-based analytics tool for business users.

Why BigSheets?
• Business users need a non-technical approach for analyzing Big Data.
• Translating untapped data into actionable business insights is a common requirement.
• Visualizing and drilling down into enterprise and Web data promotes new business intelligence.

How can BigSheets help?
• Built-in “readers” can work with data in several common formats (JSON, CSV, TSV, …)
• Spreadsheet-like interface enables business users to gather and analyze data easily.
• Users can combine and explore various types of data to identify “hidden” insights.

© 2013 IBM Corporation
What you can do with BigSheets
Model “big data” collected from various sources in spreadsheet-like structures
Filter and enrich content with built-in functions
Combine data in different workbooks
Visualize results through spreadsheets, charts
Export data into common formats (if desired)
No programming knowledge needed!

Sample Scenario

Data gathering
• WebCrawler app
• DBMS import app
• BoardReader app
• Accelerators
• Flume
• Hadoop commands
• ...

Data storage
• Distributed file system
• Web-based file browser and administration

Data exploration, manipulation, and analysis
• BigSheets

InfoSphere BigInsights

Blue italics = IBM technology

Technology

Working with BigSheets
Create workbook (spreadsheet-style structure) to model target data
Customize workbook through graphical editor and built-in functions
– Filter data
– Manipulate data (e.g., concatenate fields)
– Combine data from multiple workbooks

“Run” workbook: apply work to full data set
Explore results in spreadsheet format and/or create charts
Optionally, export your data
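
For readers who think in code, the filter, manipulate, and combine steps above can be pictured with a small Python sketch. This is a conceptual analogue only: BigSheets performs these steps through its graphical editor, and every name and sample row below (`posts`, `regions`, `filter_rows`, `concat_fields`, `combine`) is hypothetical and not part of any BigSheets or BigInsights API.

```python
# Conceptual analogue only: BigSheets does this through its graphical editor,
# not through Python. All names and sample rows here are hypothetical.

# Two tiny "workbooks", each modeled as a list of rows (dicts).
posts = [
    {"author_id": 1, "first": "Ana", "last": "Diaz", "lang": "en"},
    {"author_id": 2, "first": "Bo", "last": "Chen", "lang": "fr"},
    {"author_id": 3, "first": "Cy", "last": "Ng", "lang": "en"},
]
regions = [
    {"author_id": 1, "region": "West"},
    {"author_id": 3, "region": "East"},
]


def filter_rows(rows, column, value):
    """Filter data: keep only rows whose column equals value."""
    return [row for row in rows if row[column] == value]


def concat_fields(rows, left, right, new_column, sep=" "):
    """Manipulate data: add a column that concatenates two fields."""
    return [{**row, new_column: f"{row[left]}{sep}{row[right]}"} for row in rows]


def combine(rows, other, key):
    """Combine data from two workbooks on a shared key (inner join)."""
    lookup = {row[key]: row for row in other}
    return [{**row, **lookup[row[key]]} for row in rows if row[key] in lookup]


english = filter_rows(posts, "lang", "en")
named = concat_fields(english, "first", "last", "full_name")
result = combine(named, regions, "author_id")

for row in result:
    print(row["full_name"], row["region"])
```

Each helper returns a new list rather than modifying its input, which loosely mirrors how a workbook is defined first and only later "run" against the full data set.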

What are Workbooks?
Spreadsheet-like structures defined by user
Based on data accessible in BigInsights

Creating a Workbook (one approach)
From BigSheets tab of Web console, click New Workbook button
Supply input
– Workbook name
– Source file (select from file system directory tree)
– Appropriate “reader” (data format translator)
• Built-in readers for Web data, JSON, CSV, TSV, Hive, etc.
• User-written plug-ins supported

Save the workbook
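
The idea of a "reader" as a data format translator can be illustrated with a short Python sketch. It is not how BigSheets implements readers; the `READERS` registry, the function names, and the choice of one JSON object per line are all assumptions made for illustration. The point is only that different input formats are translated into the same row-and-column structure.

```python
import csv
import io
import json


def read_delimited(text, delimiter):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_json_lines(text):
    """Parse one JSON object per non-empty line (an assumed layout)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical registry: format name -> reader function.
READERS = {
    "csv": lambda text: read_delimited(text, ","),
    "tsv": lambda text: read_delimited(text, "\t"),
    "json": read_json_lines,
}


def load_rows(text, reader_name):
    """Translate raw text into rows using the selected reader."""
    return READERS[reader_name](text)


csv_rows = load_rows("name,city\nAna,Lima\n", "csv")
tsv_rows = load_rows("name\tcity\nAna\tLima\n", "tsv")
json_rows = load_rows('{"name": "Ana", "city": "Lima"}\n', "json")

# All three inputs yield the same rows once translated.
print(csv_rows == tsv_rows == json_rows)
```

A user-written plug-in would, in this picture, correspond to registering one more function under a new format name.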

Customizing a workbook
Work with built-in editor
Add / delete columns
Filter data
Specify formulas to compute new values using spreadsheet-style syntax
Apply built-in or custom macro functions
– Supplied text analytic functions for popular business entities: person, location, phone number, etc.

...

Visualizing results
Built-in charting facility aids analysis
Pie charts, bar charts, tag clouds, maps, etc.
Hover over sections to reveal details

Exporting data
Useful for sharing with downstream applications
Several common formats supported
Save to distributed file system or display in browser (Save As -> local file)
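
As a rough picture of what exported data looks like to a downstream application, the Python sketch below serializes the same rows to CSV and JSON text. The slide does not list which export formats BigSheets supports, so CSV and JSON are chosen here purely for illustration, and the sample rows and function names are hypothetical.

```python
import csv
import io
import json

# Hypothetical result rows, standing in for a workbook's output.
rows = [
    {"full_name": "Ana Diaz", "region": "West"},
    {"full_name": "Cy Ng", "region": "East"},
]


def to_csv(rows):
    """Serialize rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to a JSON array of objects."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(rows)
json_text = to_json(rows)
print(csv_text)
print(json_text)
```

Writing either string to a file is then an ordinary file write; a downstream tool only needs to agree on the format.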

On-demand videos
Available on YouTube’s IBM Big Data Channel at http://www.youtube.com/user/ibmbigdata
“Analyzing Social Media for IBM Watson”
“Big Data Patent Analysis with BigSheets”
“Big Data for Business Users”
“BigSheets in Action”
See the full list of videos at http://tinyurl.com/biginsights

Supplemental

Inspecting runtime statistics

Displaying the workflow diagram

Built-in text analysis functions
Included with BigInsights Version 2.1
BigSheets functions for extracting common business entities from text-based columns
– Address, EmailAddress, Country, Person, etc.
– Based on pre-built text extractor library provided with BigInsights

Add Sheet -> Function -> Categories -> entities
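
To give a feel for what extracting an entity from a text-based column means, here is a deliberately simplified Python sketch that pulls email addresses out of free text with a regular expression. It is a stand-in, not the BigInsights text extractor library: the pattern, the `extract_emails` helper, and the sample rows are assumptions for illustration, and a production extractor would handle many more cases.

```python
import re

# Simplified pattern for illustration; real email syntax is broader than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical rows with a free-text column.
rows = [
    {"comment": "Contact ana@example.com or bo@example.org for details."},
    {"comment": "No address here."},
]

# Add a derived column holding the extracted entities for each row.
enriched = [{**row, "emails": extract_emails(row["comment"])} for row in rows]

for row in enriched:
    print(row["emails"])
```

The result is a new column derived from an existing text column, which is the same shape of operation the slide describes for Address, EmailAddress, Country, and Person.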



Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights
