Metrics  Weightage  Sub-Metrics  Criteria  Sub-Weightage  CDH  HW/HDP  MapR  Pivotal HD
Scalability /Fault tolerance Yes Yes Yes Yes
Multi-tenancy- Resource Pooling
1 - groups and resource pooling without YARN
2 - groups and resource pooling and YARN
3 - groups and resource pooling and YARN (significant
contributor) / groups and resource pooling and YARN
+other prop 3% 2 3 1 3
Open source Hadoop based on
products introduced
1 - 1 products introduced
2 - 2-3 products introduced
3 - >=4 products introduced 7% 2 3 0 3
Closed source products built or Closed
source products made open including
portability
1 - 1 products
2 - 2-3 products
3 - >=4 products 2% 3 2 3 3
Cloud based products introduced
1- 1-2 products introduced
2 - >2 products introduced
3 - hadoop integration products+other prop products
introduced 1% 1 1 2 3
No. of committer seats including PMC
1 - 0-25 committers
2 - >25 and <=50 committers
3 - >50 and 25+PMC committers 3% 3 3 1 2
Support and training provided
1 - OK
2 - good support and training
3 - Excellent support and training 3% 3 2 3 2
Revisions after release
0 - Multiple even after GA
2- Makes the product available only after suitable testing 3% 0 2 2 2
SQL Focus : Open source /Closed
source
0 - Closed source
2 - Open source 3% 0 2 2 2
Subtotal 0.45 0.56 0.29 0.57
Data management -
data lifecycle management, data
replication between HDFS and Hive,
governance, lineage, traceability and
data discovery, process coordination
and scheduling, leveraging existing
products like Oozie and Zookeeper
100% open source framework.
Allow other plug ins.
workflow orchestration /automation (using Oozie
underneath).
Dataset replication.
Dataset retention.
Hive /Hcat integration.
Dashboard /entity viewing.
Integration with system management tool. 2% 2 2 0 1
Data Ingestion - Tools offered etc
1 - Sqoop, Sqoop2 and Flume
2 - Additional 2% 2 2 2 1
Data storage - own, with other systems
1 - HDFS
2 - HDFS and others/prop 2% 1 1 2 2
Realtime Data or OLTP - using Storm,
Spark, or Gemfire, SQLfire
1 - Not sure
2 -Spark or Storm or Prop 2% 2 2 1 2
Streaming Data like Spark Streaming,
Storm
1 - Not sure
2- Spark
3 - Spark+storm 1% 2 3 1 3
Workload Management via Oozie,
Hawq or other tools
1 - Only oozie or only HAWQ
2 - Oozie+integration 3% 1 2 1 1
Data Frameworks working together and
contribution eg: Datastax, Databricks,
MS REEF
1 - very few or through few partnerships
2 - Multiple 1% 2 2 1 1
Data Analytics like Acunu, Rev R,
0 - only tieups
2 - tieups+prop 3% 0 0 0 2
Search - Integration with Search Tool
etc
1 - Prop or external
2 - Prop+external 3% 2 1 1 1
Batch Data Processing-MapReduce and
YARN
1 - Own MR
2 - Only MR+YARN
3 - MapReduce innovation and YARN+Tez or MR
innovation+YARN 5% 3 3 1 3
Multi-cluster management using prop
tools built
1 - good
2 - better
3 - best 2% 3 1 1 2
Monitoring and Managing cluster - like
Cloudera manager, Ambari, Command
Center
1 - Closed source /proprietary
2 - Open sourced
3 - Open sourced and better monitoring product / Closed
source and better monitoring 7% 3 3 2 2
Backup and Recovery / DR: Availability
and replication
1 - Restart required
2 - Autorecovery of nodes or XDR
3 - Autorecovery and XDR 5% 2 2 3 2
CBO on SQL product (cost based
optimizer)
0 - No or not in current version
1 - Yes 2% 0 1 0 1
Security: Data security - Internal
1 - Not sure or None
2 - Good or prop
3 - Better 3% 3 3 2 1
Security: Access/Authentication
Security
External Security:
0 - Not sure and only Kerberos, LDAP, AD
1 - Tie-ups with vendors = Kerberos, LDAP, AD 3% 1 1 1 1
Security: System management
1- Good and prop
2- Better and prop 1% 2 1 1 2
Security: Data governance and audit
1 - Not sure
2 - Good and prop
3 - Better and prop 3% 2 3 2 1
Subtotal 0.99 1.00 0.70 0.84
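The summary row above (0.99 1.00 0.70 0.84) can be reproduced from the rows of this block by multiplying each score by its sub-weightage and summing the products. The following is a minimal Python sketch under that assumption; the source does not state the formula. The row values are transcribed from the scorecard, the row labels are shortened, and the names `ROWS` and `weighted_subtotal` are illustrative only.

```python
# Column order used throughout the scorecard.
VENDORS = ("CDH", "HW/HDP", "MapR", "Pivotal HD")

# (shortened row label, sub-weightage in percent, scores per vendor),
# transcribed from the rows of this block of the scorecard.
ROWS = [
    ("Data management",            2, (2, 2, 0, 1)),
    ("Data ingestion",             2, (2, 2, 2, 1)),
    ("Data storage",               2, (1, 1, 2, 2)),
    ("Realtime data or OLTP",      2, (2, 2, 1, 2)),
    ("Streaming data",             1, (2, 3, 1, 3)),
    ("Workload management",        3, (1, 2, 1, 1)),
    ("Data frameworks",            1, (2, 2, 1, 1)),
    ("Data analytics",             3, (0, 0, 0, 2)),
    ("Search",                     3, (2, 1, 1, 1)),
    ("Batch data processing",      5, (3, 3, 1, 3)),
    ("Multi-cluster management",   2, (3, 1, 1, 2)),
    ("Monitoring and managing",    7, (3, 3, 2, 2)),
    ("Backup and recovery / DR",   5, (2, 2, 3, 2)),
    ("CBO on SQL product",         2, (0, 1, 0, 1)),
    ("Security: data (internal)",  3, (3, 3, 2, 1)),
    ("Security: access/auth",      3, (1, 1, 1, 1)),
    ("Security: system mgmt",      1, (2, 1, 1, 2)),
    ("Security: governance/audit", 3, (2, 3, 2, 1)),
]


def weighted_subtotal(rows):
    """Sum weight * score per vendor, then convert percent points to a fraction."""
    points = [0] * len(rows[0][2])
    for _label, weight, scores in rows:
        for i, score in enumerate(scores):
            points[i] += weight * score
    return [p / 100 for p in points]


subtotals = weighted_subtotal(ROWS)
for vendor, value in zip(VENDORS, subtotals):
    print(f"{vendor}: {value:.2f}")
```

The sub-weightages in `ROWS` add up to 50, and the computed values agree with the printed row for all four vendors.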
NoSQL vendors like Cassandra, Redis
1 - <3 or not sure
2 - Prop
3 - >=3 2% 3 3 1 3
Document DBs like MongoDB,
CouchBase
1 - <3 or not sure
2 - few
3 - >=3 2% 3 3 1 3
Graphical DBs like GraphX, InfiniDB,
Giraph
1 - <3 or not sure
2 - Prop
3 - >=3 1% 3 3 1 2
Inmemory DBs like gridgain, Hana
1 - not sure
2 - no specific integration
3 - prop and specific integration 4% 2 2 1 3
MPP Databases like Greenplum,
Vertica, Netezza
1 - not sure
2 - integrates with others
3 - Prop 5% 2 2 1 3
Analytics Databases like Marklogic
1 - <3 or not sure
2 - Prop
3 - >=3 3% 1 1 1 2
Messaging tech like Kafka, Trident, Kinesis, Spark Streaming.
BI tools like Cognos, Business Objects.
ETL tools like Syncsort, Talend.
Data visualization, dashboard and reporting tech like Tableau, Datameer, Ayasdi.
Analytical products/libraries like R, SAS, Weka.
Data security like Protegrity, Dataguise, Vormetric.
Configuration management like Chef, Puppet (for cluster and XDR replication), etc.
Search tools like Solr, ElasticSearch.
RDBMS and other integration like Oracle, DB2, etc.
List of connectors, drivers, APIs.
1 - integrates with fewer technologies
2 - prop and integrates with a few other technologies where
a prop option is not there
3 - integrates with most of the better-known technologies 8% 3 3 1 3
Subtotal 0.60 0.60 0.25 0.71
Cost and Licensing Policy +
Relationship we have
Not included, to remove bias on price/relationship, so all
are 0. 0% 0 0 0 0
TOTAL 100% 100% 2.04 2.16 1.24 2.04
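The TOTAL row can be cross-checked by adding the three summary rows that close each block of the scorecard. The following is a minimal Python sketch, assuming the total is the plain sum of those rows; the source does not state this. Values are held as integers in hundredths to avoid floating-point noise, and the names `SUBTOTALS`, `PRINTED_TOTAL` and `column_sums` are illustrative only.

```python
# Column order used throughout the scorecard.
VENDORS = ("CDH", "HW/HDP", "MapR", "Pivotal HD")

# Summary rows as printed in the scorecard, in hundredths.
SUBTOTALS = [
    (45, 56, 29, 57),    # first block (sub-weightages sum to 25%)
    (99, 100, 70, 84),   # second block (sub-weightages sum to 50%)
    (60, 60, 25, 71),    # third block (sub-weightages sum to 25%)
]

# TOTAL row as printed, in hundredths.
PRINTED_TOTAL = (204, 216, 124, 204)


def column_sums(rows):
    """Add the rows column by column."""
    return tuple(sum(column) for column in zip(*rows))


sums = column_sums(SUBTOTALS)
for vendor, summed, printed in zip(VENDORS, sums, PRINTED_TOTAL):
    note = "" if summed == printed else "  (differs from printed total)"
    print(f"{vendor}: summed {summed / 100:.2f}, printed {printed / 100:.2f}{note}")
```

With the figures as printed, the CDH, HW/HDP and MapR sums match the TOTAL row, while the Pivotal HD summary rows add up to 2.12 against a printed 2.04.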
Industry Speak / Industry Norm
Our take: No one size fits all.

The industry norm is to run two implementations, e.g. Cloudera and Hortonworks, Hortonworks and Pivotal, or Cloudera and Pivotal, depending on requirements. This also reduces dependency on any one vendor and avoids being tied to one set of technologies.

Since we are looking at the entire stack/suite of products, note that Pivotal has a suite of products and technologies in its data lake: Pivotal CommandCenter, Cloud Foundry, GPDB, HAWQ, MADlib, SQLFire, GemFire, GemFire XD, Spring support, and HAMSTER. Pivotal adheres to open-source Hadoop and has added CommandCenter and features around the Hadoop ecosystem. It did not previously have Hadoop committers but has recently hired numerous professionals in this area. Cloudera is becoming more and more closed source, having introduced EDH and Impala. Hortonworks believes in the open-source philosophy, which is great.

After speaking with Cloudera and Hortonworks executives, the open question is their vision, roadmap and go-forward strategy: can they move beyond building a wrapper around Hadoop? Cloudera and Hortonworks currently do not have the deep pockets or the capability to go beyond Hadoop. MapR offers tremendous advantages since it bypasses MapReduce and hits the proprietary MapR engine (auto-node feature), but new features take one or two months to be incorporated because it is closed and proprietary. Also, supporting legacy versions can be a challenge with Cloudera and MapR where customization has been done.

Metrics and weightage:
Architectural philosophy / open source / proprietary: 25%
Hadoop framework, feature set comparison, performance and management: 50%
Integration with other technologies or proprietary technologies provided, connectors, and partnership / vendor strategic relationship: 25%

Hadoop comparative scorecard nick kabra sr mgmt 04042014 and stack integration demo
