OPEN SOURCE BIG DATA
PROJECTS - EMERGENCE OF THE CONVERGED
DATA PLATFORM
This presentation is a summary by
InsightBrief of the 451 Research webinar:
The Big Data Blender: Converging Hadoop,
Spark, Streaming, and More
Insights for busy professionals
Read in less than 10 mins
Knowledge without the fluff
As of Jan 2016, 451 Research identified
over 275 vendors and products in the
data platform and analytics landscape.
Growth is expected, as is convergence
of data platforms.
Open Source Big Data Projects - Emergence of the Converged Data Platform
Convergence, or the ‘blending’, of data management
platforms is being driven by:
1. Different data types - the need to accommodate data
stored in different types of databases
2. Operational efficiencies - reducing maintenance by
converging multiple data silos into one
3. Demands of variable workloads - some data stores
are better at addressing certain types of applications
and their workloads
Open source technology is increasingly used in
ever more complex Big Data projects.
Although it can handle the scale and complexity of
modern data types, a converged data platform
allows it all to be delivered on a unified platform,
providing centralized management, security, high
availability, fault tolerance and disaster recovery.
A converged data platform is the
convergence or blending of two or
more processes, frameworks or
technologies into a
unified whole.
By 2019 it is anticipated that the value
created in the Hadoop, Event/Stream
processing, NewSQL and NoSQL markets
will be about $10b. Open source Hadoop
will account for about $3.6b and NoSQL
some $4.7b. 451 Research expects continued
growth in these markets.
NoSQL’s emergence, growth and appeal have
been attributed to its different data stores for
different workloads (polyglot persistence). However,
the operational complexity and inflexibility
of driving applications from multiple databases are
being addressed by the growth of multi-model
databases, which support a combination of various
NoSQL data models. NoSQL’s future is predicted to
rest with this approach.
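To make the contrast concrete, here is a minimal Python sketch, not any vendor's API. The `MultiModelStore` class and its method names are hypothetical. It keeps one copy of each record and exposes it through both a key-value lookup and a document-style query, where polyglot persistence would need two separate databases kept in sync.

```python
from typing import Any, Dict, List


class MultiModelStore:
    """Hypothetical in-memory store: one data set, two access models."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}

    # Key-value model: lookup by key only.
    def put(self, key: str, document: Dict[str, Any]) -> None:
        self._records[key] = dict(document)

    def get(self, key: str) -> Dict[str, Any]:
        return self._records[key]

    # Document model: query by field values, returning matching keys.
    def find(self, **criteria: Any) -> List[str]:
        return sorted(
            key
            for key, doc in self._records.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        )


store = MultiModelStore()
store.put("sensor-1", {"type": "thermometer", "site": "plant-a"})
store.put("sensor-2", {"type": "barometer", "site": "plant-a"})
store.put("sensor-3", {"type": "thermometer", "site": "plant-b"})

by_key = store.get("sensor-2")                # key-value access
thermometers = store.find(type="thermometer")  # document access
```

Both calls read the same underlying records, so there is a single system to secure, back up and administer.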
With increasing complexity in data workloads,
actions and data sets are joined at
many management points, each requiring different
consoles, different hardware utilization, different
demands for security and different fault tolerance
and disaster recovery. These pressures drive the
creation of a converged data platform.
In the past, operational and
analytical data workloads were
more or less separated. Today
they often are not, and this
demands data convergence.
Convergence is also being driven
by the idea that some in-memory
databases can also serve as memory
caches. Thus platforms are being
developed that can run a database
with a cache or simply as a cache
solution.
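The two deployment modes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the design of any product named in the webinar: `InMemoryStore` and `CachedDatabase` are hypothetical names, and a plain dictionary stands in for a slower disk-based database.

```python
from typing import Dict, Optional


class InMemoryStore:
    """Hypothetical in-memory engine, usable on its own as a pure cache."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


class CachedDatabase:
    """The same engine placed in front of a backing store (read-through)."""

    def __init__(self, cache: InMemoryStore, backing: Dict[str, str]) -> None:
        self.cache = cache
        self.backing = backing  # stands in for a disk-based database
        self.backing_reads = 0

    def get(self, key: str) -> Optional[str]:
        value = self.cache.get(key)
        if value is None:
            # Cache miss: read from the backing store and remember the result.
            self.backing_reads += 1
            value = self.backing.get(key)
            if value is not None:
                self.cache.put(key, value)
        return value


# Mode 1: pure cache solution.
pure_cache = InMemoryStore()
pure_cache.put("session:42", "alice")

# Mode 2: database with a cache.
db = CachedDatabase(InMemoryStore(), {"user:1": "alice"})
first = db.get("user:1")   # miss, served from the backing store
second = db.get("user:1")  # hit, served from memory
```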
The expansion of streaming
technology has opened up yet
another area for convergence
where diverse applications
are being developed to serve
expanding demands for analysis
of real-time data streams.
Total Data Warehouse
is a term used by 451
Research to describe
all the components that
make data platform
convergence possible.
Converged data platforms
are transforming businesses,
demonstrating significant cost
savings and helping drive additional
revenues. Six case studies are cited
by MapR in their recent webinar with
451 Research.
The key benefits of a converged data
platform:
1. Real-time operation with reduced latency -
improves responsiveness of applications
2. Improved reliability - drives greater
business value and reduced costs
Converging data technologies in one
place yields a 30% to 50%
reduction in the overall total cost of
ownership. Savings are evident in the
type of hardware utilized, the data center,
operating costs, and the cost of
data movement.
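As a worked example of what that range means, the short Python sketch below applies the quoted 30% and 50% reductions to a baseline figure. The baseline of $2,000,000 is purely hypothetical and the function name is an illustration, not something from the webinar.

```python
from typing import Tuple


def tco_after_convergence(
    baseline: int, low_pct: int = 30, high_pct: int = 50
) -> Tuple[int, int]:
    """Return (smallest saving case, largest saving case) TCO in dollars."""
    smallest_saving = baseline - baseline * low_pct // 100
    largest_saving = baseline - baseline * high_pct // 100
    return smallest_saving, largest_saving


baseline_tco = 2_000_000  # hypothetical annual TCO before convergence
tco_high, tco_low = tco_after_convergence(baseline_tco)
```

Under this assumed baseline, the converged platform would cost between `tco_low` and `tco_high` per year.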
In a survey of MapR
customers, it was reported
that 98% were running more
than one application on a
single cluster and 18% were
running over 50 applications.
Different data types (structured,
unstructured), operational efficiencies
(converging multiple silos of data) and
variable workloads (dependent on
applications) are all driving convergence
of data storage platforms.
NoSQL and SQL vendors are starting to adopt
certain traits of each other. For example, IBM
and Oracle are adding JSON capabilities to
their databases, searchable by SQL. NoSQL
vendors are adding SQL querying capabilities.
This trend illustrates NoSQL and SQL
convergence.
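The idea of JSON documents searchable by SQL can be sketched with Python's standard `sqlite3` and `json` modules. This is only an illustration of the concept, not how IBM, Oracle or any NoSQL vendor implements it: the `readings` table, its contents and the `json_get` helper are all assumptions made for the example.

```python
import json
import sqlite3


def json_get(document: str, field: str):
    """Return one top-level field of a JSON document, or None if absent."""
    return json.loads(document).get(field)


conn = sqlite3.connect(":memory:")
# Register the Python helper so it can be called from SQL statements.
conn.create_function("json_get", 2, json_get)
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, doc TEXT)")
conn.executemany(
    "INSERT INTO readings (doc) VALUES (?)",
    [
        (json.dumps({"sensor": "s1", "temp": 21}),),
        (json.dumps({"sensor": "s2", "temp": 35}),),
        (json.dumps({"sensor": "s3", "temp": 40}),),
    ],
)

# A relational query whose predicate reaches inside the JSON documents.
hot_sensors = [
    row[0]
    for row in conn.execute(
        "SELECT json_get(doc, 'sensor') FROM readings "
        "WHERE json_get(doc, 'temp') > 30 ORDER BY id"
    )
]
```

The documents keep their schema-free shape, while the query language stays familiar to SQL users.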
Operational and analytical data
workloads are starting to converge. Until
recently, separate databases have been
used for each. Emerging databases
take advantage of in-memory and
advanced processing to deliver combined
operational and analytical processing.
Several NoSQL players, among
them MapR, have taken up the
multi-model approach and it is
anticipated that this is the future
direction of this sector.
Hadoop-based convergence is taking place in:
1. Cache (Apache Geode, Apache Ignite)
2. Operational databases (NoSQL) - (MapR-DB,
Apache HBase, Splice Machine, Apache
Trafodion)
3. Analytic databases, made possible by
connectors, SQL-on-Hadoop and federated
query - (e.g. Pivotal, Teradata and IBM)
Convergence is also occurring with
cache and grid databases where
databases can be run with a cache
or used as a pure cache solution.
Customers’ growing need for more
frequent analysis of real-time data
streams is pushing data stream
processing into the mainstream. The
blending of streaming technologies
(e.g. Storm, Spark, Kafka and MapR
Streams) with Hadoop illustrates the
convergence in this space.
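A minimal Python sketch of this blending follows. It does not use the Storm, Spark, Kafka or MapR Streams APIs; the `process_stream` function and the sample values are hypothetical. Each event is analysed as it arrives (a rolling mean over the last few events) and is also appended to a history list that stands in for stored data available to later batch analysis.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple


def process_stream(
    events: Iterable[float], history: List[float], window: int = 3
) -> Iterator[Tuple[float, float]]:
    """Yield (event, rolling mean) while persisting each event to history."""
    recent: Deque[float] = deque(maxlen=window)
    for value in events:
        history.append(value)  # stored copy, available for batch analysis
        recent.append(value)   # live window, analysed as events arrive
        yield value, sum(recent) / len(recent)


history: List[float] = []
rolling = [mean for _, mean in process_stream([10.0, 20.0, 30.0, 40.0], history)]
batch_total = sum(history)  # a batch-style computation over the stored data
```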
JSON is an extremely popular
format among application
developers and it is emerging as a
de facto standard for storing data,
particularly around the growth of
sensors for IoT use cases.
MapR provides the industry’s only converged data platform that
integrates the power of Hadoop and Spark with global event
streaming, real-time database capabilities, and enterprise storage,
enabling customers to harness the enormous power of their
data. Organizations with the most demanding production needs,
including sub-second response for fraud prevention, secure and
highly available data-driven insights for better healthcare, petabyte
analysis for threat detection, and integrated operational and analytic
processing for improved customer experiences, run on MapR.
(Key point on company offering made by MapR in the webinar)
MapR’s approach is to be as open as possible,
supporting as many APIs as possible.
One of the very popular features of MapR’s
converged data platform is the seamless
movement of data and files through drag and
drop without having to write special code
to batch-load data into and out of a Hadoop
environment.
(Key point on company offering made by MapR in the webinar)
MapR’s converged data platform
brings together open source engines
and tools (e.g. Hadoop, Spark, Apache
Drill) and can also support commercial
engines and applications (e.g. Vertica,
SAP, MySQL).
(Key point on company offering made by MapR in the webinar)
MapR’s converged data platform handles
multi-model databases. JSON is a recently
added standard native format on which
applications can also be built. MapR is
completely integrated with Hadoop and can
operate across datacenters in a multi-master
type of environment and setup.
(Key point on company offering made by MapR in the webinar)
In the area of security management
across a big data environment, MapR
has pushed down access control to the
granular level. This access control can
be spread across different processing
engines and all types of structured
data, files, tables and streams.
(Key point on company offering made by MapR in the webinar)
The MapR system
is a multi-tenant
environment supporting
many different
applications on a single
cluster.
(Key point on company offering made by
MapR in the webinar)
In real-time applications, MapR’s data
platform allows a single system to
operate simultaneously on data at rest
and data in motion, reducing
latency.
(Key point on company offering made by MapR in the webinar)
MapR’s converged data platform permits an
optimized table setup and streaming that
work directly with storage hardware
resources. It operates natively against
the hardware, allowing fast, efficient,
direct I/O on that system, thereby
improving performance.
(Key point on company offering made by MapR in the webinar)
MapR Streams is a global streaming service
that extends the capacity of Spark, Storm and
other stream processing engines. It pushes
billions of messages per second in a reliable
way thus allowing information consumption to
be virtually real time.
(Key point on company offering made by MapR in the webinar)
The fulfillment of the
promise of Docker
containers and the ability
to move workloads on
the fly to different data
center nodes is handled
seamlessly in MapR.
(Key point on company offering made by
MapR in the webinar)
The largest healthcare provider in the United
States (UnitedHealth Group) run what they
call a ‘big data as-a-service’ platform using
MapR’s converged data platform and have
saved hundreds of millions of dollars by
improving efficiency and reducing waste in
how they manage and process insurance
claims.
The business value of converging
different data workloads (e.g. batch,
interactive) and streaming them
across the business manifests itself
in significantly improved customer
experiences and value.
There is considerable cost reduction
with data convergence where data
sprawl and data duplication are
controlled. Administrative costs
associated with overall enterprise data
architecture are reduced.
The converged data platform
not only simplifies app
development but also the way
apps are run in a data center.
With new customer
demands and
improving
technology
capabilities, all roads
lead to a converged
data platform.
MapR has created a simple process of
getting started with its converged data
platform. Free on-demand training is
available for Hadoop, Spark and SQL engines.
Quick start solutions using blueprints and
templates are available so that MapR clients
can be operating a 6-node environment
in 4 to 6 weeks.
Questions to ask your Big Data vendor:
1.	Is my data highly available? Can we plug in existing enterprise
systems?
2.	Can we properly identify users?
3.	Is multi-tenancy supported?
4.	Is my data correct and supported?
5.	Can we authorise access to data?
6.	Are apps supported across geographies and data centers?
7.	Is my data governed?
8.	Is there a proper paper trail?
ABOUT 451 RESEARCH
With a core focus on technology innovation
and market disruption, 451 Research
provides essential insight for leaders of the
digital economy. More than 100 analysts
and consultants deliver that insight via
syndicated research, advisory services and
live events to over 1,000 client organizations
in North America, Europe and around the
world. 451 Research and its customers
benefit from the combined assets and talent
of The 451 Group and its two divisions: 451
Research and Uptime Institute.
ABOUT MAPR
MapR provides the industry’s only converged
data platform that integrates the power
of Hadoop and Spark with global event
streaming, real-time database capabilities,
and enterprise storage, enabling customers
to harness the enormous power of their data.
Organizations with the most demanding
production needs, including sub-second
response for fraud prevention, secure and
highly available data-driven insights for
better healthcare, petabyte analysis for threat
detection, and integrated operational and
analytic processing for improved customer
experiences, run on MapR.
ABOUT INSIGHTBRIEF
Our team produces short documents for busy
professionals, summarising longer reports
and research papers so that readers can swiftly
become acquainted with a large body of
knowledge and decide whether or not to read the
full source document(s).
We vet and qualify reports for relevancy and
value to their intended audience before creating
an InsightBrief document. Our editorial team is
independent from the originator of the report,
ensuring that the insights exclude sales or vendor
centric messaging, thereby creating real value for
our time-poor readers.
The InsightBrief team, in conjunction with technology analysts’ content, summarise existing reports and events independently of input from the source originator. We assume no responsibility for the
content or implied advice from any of the summaries / insights. InsightBrief and iBrief.ly are registered trademarks of InsightBrief. All other trademarks are the property of their respective owners.
COPYRIGHT ©2016 INSIGHTBRIEF. ALL RIGHTS RESERVED

Ss eb29

  • 1. OPEN SOURCE BIG DATA PROJECTS - EMERGENCE OF THE CONVERGED DATA PLATFORM
  • 2. This presentation is a summary by InsightBrief of the 451 Research webinar: The Big Data Blender: Converging Hadoop, Spark, Streaming, and More Insights for busy professionals Read in less than 10 mins Knowledge without the fluff
  • 3. As of Jan 2016, 451 Research identified over 275 vendors and products in the data platform and analytics landscape. Growth is expected, as is convergence of data platforms. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 4. Convergence or the ‘blending’ of data management platforms is being driven by 1. Different data types - the need to adapt some data stored in different types of databases 2. Operational efficiencies - reducing maintenance by converging multiple data silos into one 3. Demands of variable workloads – some data stores are better at addressing certain types of applications and their workloads Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 5. Open source technology is increasingly used in the growing complexity of Big Data environment projects. Although able to handle scale and complexity of the new modern data types, a converged data platform allows it all to be delivered on a unified platform, providing centralized management, security, high availability, fault tolerance and disaster recovery. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 6. A converged data platform is the convergence or blending of two or more processes, frameworks or technology - coming together in a unified whole. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 7. By 2019 it is anticipated that the value created in the Hadoop, Event/Stream processing, New SQL and NoSQL market will be about $10b. Open source Hadoop will account for about $3.6b and NoSQL some $4.7b. 451 Research expect continued growth in this market. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 8. NoSQLs emergence, growth and appeal have been attributed to its different data stores for different workloads (polyglot persistence). However, limitations of operational complexity and inflexibility from multiple databases driving applications have been addressed by the growth of multi-model databases - which support a combination of various NoSQL data models. NoSQL’s future is predicted to rest with this approach. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 9. With increasing complexity in data workloads, a number of actions and data sets get joined at many management points that require different consoles, different hardware utilization, different demands for security and different fault tolerance and disaster recovery. These concepts drive the creation of a converged data platform. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 10. In the past operational and analytical data workloads were more or less separated. Today they often are not and this demands data convergence. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 11. Convergence is also being driven by the idea that some in-memory databases can also serve as memory caches. Thus platforms are being developed that can run a database with a cache or simply as a cache solution. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 12. The expansion of streaming technology has opened up yet another area for convergence where diverse applications are being developed to serve expanding demands for analysis of real-time data streams. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 13. Total Data Warehouse is a term used by 451 Research to describe all the components that make data platform convergence possible. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 14. Converged data platforms are transforming businesses, demonstrating significant cost savings and helping drive additional revenues. Six cases studies are cited by MapR in their recent webinar with 451 Research. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 15. The key benefits of a converged data platform: 1. It’s in real time with reduced latency - improves responsiveness of applications 2. Improved reliability drives greater business value and reduced costs Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 16. Converging data technologies in one place creates between 30% to 50% reduction in the overall total cost of ownership. Savings are evident on the type of hardware utilized, the data center, operating costs, and reduced costs of data movement. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 17. In a survey of MapR customers, it was reported that 98% were running more than one application on a single cluster and 18% were running over 50 applications. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 18. Different data types (structured, unstructured), operational efficiencies (converging multiple silos of data) and variable workloads (dependant on applications) are all driving convergence of data storage platforms. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 19. NoSQL and SQL vendors are starting to adopt certain traits of each other. For example, IBM and Oracle are adding JSON capabilities to their databases, searchable by SQL. NoSQL vendors are adding SQL querying capabilities. This trend illustrates NoSQL and SQL convergence. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 20. Operational and analytical data workloads are starting to converge. Until recently, separate databases have been used for each. Emerging databases take advantage of in-memory and advanced processing to deliver combined operational and analytical processing. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 21. Several NoSQL players, among them MapR, have taken up the multi-model approach and it is anticipated that this is the future direction of this sector. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 22. Hadoop based convergence is taking place in 1. Cache (Apache Geode, Apache Ignite) 2. Operational databases (NoSQL) - (MapR- DB, Apache HBase, Splice Machine, Apache Trafodian) 3. Analytic databases, made possible by connectors, SQL-on-Hadoop, Federated query - (e.g. Pivotal, Teradata and IBM) Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 23. Convergence is also occurring with cache and grid databases where databases can be run with a cache or used as a pure cache solution. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 24. Customers’ growing need for more frequent analysis of real-time data streams is pushing data stream processing into the mainstream. The blending of streaming technologies (e.g. Storm, Spark, Kafka and MapR Streams) with Hadoop illustrate the convergence in this space. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 25. JSON is an extremely popular format among application developers and it is emerging as a de facto standard for storing data, particularly around the growth of sensors for IoT use cases. Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 26. MapR provides the industry’s only converged data platform that integrates the power of Hadoop and Spark with global event streaming, real-time database capabilities, and enterprise storage, enabling customers to harness the enormous power of their data. Organizations with the most demanding production needs, including sub-second response for fraud prevention, secure and highly available data-driven insights for better healthcare, petabyte analysis for threat detection, and integrated operational and analytic processing for improved customer experiences, run on MapR. (Key point on company offering made by MapR in the webinar) Open Source Big Data Projects - Emergence of the Converged Data Platform
  • 27. MapR’s approach is to be as open as possible by supporting as many APIs as possible. One of the most popular features of MapR’s converged data platform is the seamless movement of data and files through drag and drop, without having to write special code to batch-load data into and out of a Hadoop environment. (Key point on company offering made by MapR in the webinar)
  • 28. MapR’s converged data platform brings together open source engines and tools (e.g. Hadoop, Spark, Apache Drill) and can also support commercial engines and applications (e.g. Vertica, SAP, MySQL). (Key point on company offering made by MapR in the webinar)
  • 29. MapR’s converged data platform handles multi-model databases. JSON is a recently added native format on which applications can also be built. MapR is completely integrated with Hadoop and can operate across datacenters in a multi-master setup. (Key point on company offering made by MapR in the webinar)
  • 30. In the area of security management across a big data environment, MapR has pushed access control down to a granular level. This access control can be applied across different processing engines and all types of structured data, files, tables and streams. (Key point on company offering made by MapR in the webinar)
  • 31. The MapR system is a multi-tenant environment supporting many different applications on a single cluster. (Key point on company offering made by MapR in the webinar)
  • 32. In real-time applications, MapR’s data platform allows a single system to operate simultaneously on data-at-rest and data-in-motion, reducing latency. (Key point on company offering made by MapR in the webinar)
  • 33. MapR’s converged data platform permits an optimized setup in which tables and streams work directly with storage hardware resources. It operates natively against the hardware, allowing fast, efficient, direct I/O on that system and thereby improving performance. (Key point on company offering made by MapR in the webinar)
  • 34. MapR Streams is a global streaming service that extends the capacity of Spark, Storm and other stream processing engines. It pushes billions of messages per second in a reliable way, thus allowing information consumption to be virtually real time. (Key point on company offering made by MapR in the webinar)
  • 35. MapR seamlessly fulfils the promise of Docker containers: the ability to move workloads on the fly to different data center nodes. (Key point on company offering made by MapR in the webinar)
  • 36. The largest healthcare provider in the United States (UnitedHealth Group) run what they call a ‘big data as-a-service’ platform using MapR’s converged data platform and have saved hundreds of millions of dollars by improving efficiency and reducing waste in how they manage and process insurance claims.
  • 37. The business value of converging different data workloads (e.g. batch, interactive) and streaming them across the business manifests itself in significantly improved customer experiences and value.
  • 38. Data convergence brings considerable cost reduction because data sprawl and data duplication are controlled. Administrative costs associated with the overall enterprise data architecture are also reduced.
  • 39. The converged data platform simplifies not only app development but also the way apps are run in a data center.
  • 40. With new customer demands and improving technology capabilities, all roads lead to a converged data platform.
  • 41. MapR has created a simple process for getting started with its converged data platform. Free on-demand training is available for Hadoop, Spark and SQL engines. Quick start solutions using blueprints and templates are available so that MapR clients can be operating with a 6-node environment in 4 to 6 weeks.
  • 42. Questions to ask your Big Data vendor: 1. Is my data highly available? Can we plug in existing enterprise systems? 2. Can we properly identify users? 3. Is multi-tenancy supported? 4. Is my data correct and supported? 5. Can we authorise access to data? 6. Are apps supported across geographies and data centers? 7. Is my data governed? 8. Is there a proper paper trail?
  • 43. ABOUT 451 RESEARCH: With a core focus on technology innovation and market disruption, 451 Research provides essential insight for leaders of the digital economy. More than 100 analysts and consultants deliver that insight via syndicated research, advisory services and live events to over 1,000 client organizations in North America, Europe and around the world. 451 Research and its customers benefit from the combined assets and talent of The 451 Group and its two divisions: 451 Research and Uptime Institute. ABOUT MAPR: MapR provides the industry’s only converged data platform that integrates the power of Hadoop and Spark with global event streaming, real-time database capabilities, and enterprise storage, enabling customers to harness the enormous power of their data. Organizations with the most demanding production needs, including sub-second response for fraud prevention, secure and highly available data-driven insights for better healthcare, petabyte analysis for threat detection, and integrated operational and analytic processing for improved customer experiences, run on MapR.
  • 44. ABOUT INSIGHTBRIEF: Our team produces short documents for busy professionals, summarising longer reports and research papers so that readers can swiftly become acquainted with a large body of knowledge and decide whether or not to read the full source document(s). We vet and qualify reports for relevancy and value to their intended audience before creating an InsightBrief document. Our editorial team is independent from the originator of the report, ensuring that the insights exclude sales or vendor-centric messaging, thereby creating real value for our time-poor readers. The InsightBrief team, working with technology analysts’ content, summarises existing reports and events independently of input from the source originator. We assume no responsibility for the content or implied advice from any of the summaries / insights. InsightBrief and iBrief.ly are registered trademarks of InsightBrief. All other trademarks are the property of their respective owners.
  • 45. GET THE INSIGHTS IN AN EASY TO READ FORMAT: OPEN SOURCE BIG DATA PROJECTS - EMERGENCE OF THE CONVERGED DATA PLATFORM. COPYRIGHT ©2016 INSIGHTBRIEF. ALL RIGHTS RESERVED.