The Big Picture on Big Data and Cognos
www.senturus.com/blog/big-picture-big-data-cognos/
August 1, 2016 Business Strategy & Perspectives
IBM has a long history of supporting major open source projects and the most widely adopted open standards. Its
enterprise customers have benefited from the flexibility, choice, and innovation that come with the open source
philosophy. Major projects include SOA (Service-Oriented Architecture), Linux, Eclipse, and now Hadoop. The big
data analytics open source offering is known as the IBM Open Platform with Apache Hadoop. The commercial side
of this platform, announced in early 2015, is a suite of products for the enterprise branded as BigInsights.
To better understand IBM's big data offerings around Hadoop and its open data platform, it is helpful to put them in
the context of the overall vision for the platform and the three phases of the IBM Big Data Analytics lifecycle:
1. Pull in all types of data from disparate sources
2. Put the data into a business context
3. Produce intelligent, data-driven business outcomes, such as operational efficiency, customer
engagement, or risk management
IBM endeavors to cover a lot of business territory with its analytics platform. For the enterprise IT department, the
technology enables data integration, governance, security, and regulatory compliance. For line-of-business
managers, the analytics environment is the home of customer and operational intelligence. While analytics play an
important role in increasing operational efficiency and eliminating business process bottlenecks, it is the customer-
centric analytics that have captured the imagination of business executives. Big data analytics offers many
opportunities for improving customer relationships and increasing engagement across marketing channels.
A common big data use case is delivering relevant promotions to customers. We all share the experience of
receiving credit card offers in the mail from the bank and tossing the envelope directly into the recycling bin without
even thinking about it. Despite the dismal response rate, it was cost-effective for the bank to send the same direct
mail piece to everyone. With a big data platform, it is possible to develop customer profiles and create targeted
offers for each segment. For example, customers who have a single account and a short customer history would be
candidates for a different array of promotions than customers who have been with the bank for decades. The cost of
amassing enough data and having the processing power to crunch the numbers in a timely fashion has dropped
enough to make it profitable to do so.
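
As a rough illustration of the segmentation idea above, the following minimal Python sketch assigns customers to segments and looks up offers for each segment. Everything in it is an assumption made for illustration: the `Customer` fields, the thresholds, the segment names, and the offer catalogue do not come from IBM or from any bank. A real platform would derive segments from far richer data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    customer_id: str
    num_accounts: int
    years_as_customer: float


def assign_segment(c: Customer) -> str:
    """Hypothetical rule-based segmentation; the thresholds are illustrative only."""
    if c.num_accounts == 1 and c.years_as_customer < 2:
        return "new_single_account"
    if c.years_as_customer >= 10:
        return "long_tenure"
    return "established"


# Hypothetical offer catalogue keyed by segment name.
OFFERS = {
    "new_single_account": ["savings account bonus", "starter credit card"],
    "long_tenure": ["loyalty rate on a home equity line", "premium rewards card"],
    "established": ["balance transfer offer"],
}


def offers_for(c: Customer) -> List[str]:
    """Return the promotions targeted at the customer's segment."""
    return OFFERS[assign_segment(c)]
```

Calling `offers_for(Customer("c-001", 1, 0.5))` would return the offers for the hypothetical `new_single_account` segment, while a customer of 25 years would receive the `long_tenure` offers instead.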
Digital advertising and social media require analysis of huge amounts of unstructured data. A couple
of years ago this was experimental at best, but now Hadoop software enables capturing and processing
unprecedented amounts of data. It complements the enterprise data warehouse and is an integral part of the
business intelligence ecosystem.
Open Data Platform ODPi
The ODPi open data platform is a consortium of IBM and 18 other enterprise software vendors working together to
maximize the adoption of technologies based on Apache Hadoop. The goal of ODPi is to accelerate software
development by providing a standard Hadoop solution on which applications can run, whether they are
commercial software, open source, or custom code developed in-house. This gives enterprise customers assurance
that they are not locking themselves into a single vendor's Hadoop solution. It also permits using a Hadoop
implementation with products from multiple vendors. For Hadoop to fulfill its role as an enterprise data source, it
must accommodate a broad audience who will be using many different applications.
To that end, the ODPi provides a core platform of agreed-upon, tested big data Apache Hadoop modules. This is
the ODPi standard, on which the vendors build their applications. For example, Hortonworks, IBM Open Platform
4.0 with Apache Hadoop, EMC Pivotal HD 3.0, and Infosys IIP all adhere to the ODPi standard. Analytics software
vendors or in-house development shops can concentrate on developing applications further up the stack, knowing
that the Hadoop core adheres to a standard and that their applications will interoperate with any compliant Hadoop system.
This accelerates development, promotes code re-use, and simplifies the technical architecture. Implementing a
Hadoop distribution that adheres to the ODPi standard means not being locked into a proprietary technology.
Only time will tell whether the ODPi will have a lasting impact as a standard. The organization has been criticized as being
nothing more than a joint marketing effort for vendors pushing their own commercial flavors of Hadoop. Also notable
are the big data vendors that are conspicuous by their absence: Cloudera, MapR, and Amazon (AWS Elastic
MapReduce, or EMR).
IBM BigInsights and Cognos
On top of Hadoop, IBM has developed a suite of big data and analytics tools under the BigInsights brand. There are
tools for scaling and managing the platform (BigInsights Enterprise Management), a machine learning engine
(BigInsights Data Scientist – Decision Trees, PageRank, Clustering) and a data exploration and discovery tool
(BigSheets). Of particular interest to Cognos customers is BigSQL, which runs SQL queries against Hadoop. In
other words, BigSQL permits Cognos to use Hadoop as a data source.
This matters because data stored in Hadoop only becomes useful when it is put into a business context. Cognos
Analytics (V11) is well suited for this role. It is a powerful tool for BI developers and business power users, enabling
the presentation of Hadoop data in a visually appealing format for executives, managers, and line-of-business
staffers. Big data becomes much more valuable when it can be interpreted and understood by non-technical users.
Cognos supports connecting to Hadoop using Hive, which translates code from SQL to MapReduce to get results
from Hadoop. There will always be some latency as Hive cannot change the nature of MapReduce, which
distributes processing work across Hadoop nodes. The query is split into discrete chunks of work and the results are
assembled as they are returned. SQL join conditions, which are commonplace in Cognos-generated SQL, create an
additional layer of complexity for MapReduce. This further increases the query processing time and will prevent
some queries from running at all.
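
To make the cost of joins more concrete, here is a conceptual sketch in plain Python of how a query such as `SELECT c.region, SUM(o.amount) FROM orders o JOIN customers c ON o.customer_id = c.customer_id GROUP BY c.region` might be broken into map, shuffle, and reduce steps. It is not Hive's actual execution plan and uses no Hadoop API. The table contents, the function names, and the two-stage layout are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for two Hadoop-resident tables.
customers = [
    {"customer_id": 1, "region": "West"},
    {"customer_id": 2, "region": "East"},
]
orders = [
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 2, "amount": 40.0},
    {"customer_id": 1, "amount": 60.0},
]


def shuffle(pairs):
    """Group (key, value) pairs by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def map_join(customers, orders):
    """Stage 1 map: tag each row with its source table and emit it under the join key."""
    for c in customers:
        yield c["customer_id"], ("C", c["region"])
    for o in orders:
        yield o["customer_id"], ("O", o["amount"])


def reduce_join(key, values):
    """Stage 1 reduce: pair every customer row with every order row sharing the key."""
    regions = [v for tag, v in values if tag == "C"]
    amounts = [v for tag, v in values if tag == "O"]
    for region in regions:
        for amount in amounts:
            yield region, amount


def reduce_sum(key, values):
    """Stage 2 reduce: GROUP BY region, SUM(amount)."""
    return key, sum(values)


def run_query(customers, orders):
    # Stage 1: the join needs its own map, shuffle, and reduce pass.
    joined = []
    for key, values in shuffle(map_join(customers, orders)).items():
        joined.extend(reduce_join(key, values))
    # Stage 2: the aggregation needs a second shuffle on a different key.
    return dict(reduce_sum(k, v) for k, v in shuffle(joined).items())
```

In this sketch the join forces an extra pass over the data before the aggregation can begin, which is one way to picture why join-heavy SQL adds processing time on a MapReduce engine.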
IBM addresses these problems with BigSQL. It works on the same Hive metastore, but produces faster and more
reliable results. BigSQL is not just about performance, but also about ensuring that the SQL query will run. It optimizes
SQL for MapReduce so that it runs faster and avoids the need to modify the Cognos Framework Manager model
or hand-code SQL inside Cognos. An alternative to Hive and BigSQL is Impala, which makes similar
performance claims.
Success with Big Data requires getting key pieces to work together. With BigInsights and BigSQL, IBM is providing
tools for facilitating Hadoop adoption, including interoperability with existing Cognos infrastructure and functionality.
To stay on top of business intelligence topics, read other Senturus blog posts at: http://www.senturus.com/blog/.
Resources
Senturus webinar Running Cognos on Hadoop:
http://www.senturus.com/resources/running-cognos-on-hadoop/
Video of Hive and BigSQL performance test results:
https://developer.ibm.com/hadoop/blog/2015/10/23/hive-and-big-sql-performance-test-update/
IBM BigSQL technology sandbox demo cloud environment for Hadoop and BigSQL:
https://my.imdemocloud.com/projects/3467
Thanks to David Currie for contributing this article. David is a long-time business analytics consultant. He blogs
about business intelligence and big data at davidpcurrie.com.