Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Operations With Apache Ambari
We Do Hadoop.
Apache Ambari
Apache Ambari is the open source operational platform to provision, manage and monitor Hadoop clusters
How Do People Use Ambari?
•  Monitoring: Health Checks, Alerts
•  Extensibility: Stacks, Views
•  Service Management: Lifecycle controls, Rolling Restarts, Decommission/Re-commission
•  Config Management: Host Groups, Versioning, Compare, Revert, Recommendations, Security Setup
•  Cluster Provisioning: Install Wizard (UI), Blueprints (API)
Recent Ambari Releases
•  Ambari 1.6.0 (May 2014): Introduced Ambari Blueprints
•  Ambari 1.7.0 (Dec 2014): Introduced Ambari Views; HDP 2.2 GA
•  Ambari 2.0.0 (Apr 2015)
What’s New in Ambari 2.0
Core Platform
•  Simplified Kerberos Setup (AMBARI-7204)
•  Ambari Alerts (AMBARI-6354)
•  Ambari Metrics (AMBARI-5707)
•  Automated (Rolling) Upgrade (AMBARI-7804)
Stack Support
•  HDP 2.2: Ranger, Spark, Phoenix
•  Hive Metastore HA (AMBARI-6684)
•  HiveServer2 HA (AMBARI-8906)
•  Oozie HA (AMBARI-6683)
Ambari Platform
•  Handle umask 027 setting (AMBARI-7796)
•  Ambari Agent non-root (AMBARI-1596)
Blueprints API
•  Add Host (AMBARI-8458)
For a complete list of changes: https://issues.apache.org/jira/browse/AMBARI/fixforversion/12327486
Lab Setup
Ambari 2.0 Security Lab Steps – 4 node cluster
•  Detailed steps available at: http://bit.ly/1J4IbIs
•  Install Ambari server and agents
•  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
•  Use blueprint/API to provision a minimal Hadoop cluster with custom services
•  Use Add Service wizard to also install Hive
•  Configure Ambari to sync/recognize business users in OpenLDAP
•  Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
•  Install Ranger as Ambari service and configure it to recognize LDAP users
•  Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption
Core Platform
Stack Components Support
Stack versions: HDP 2.2, HDP 2.1, HDP 2.0 (install/manage/monitor)
•  HDFS, YARN, MapReduce, Hive, HBase, Pig, ZooKeeper, Oozie, Sqoop
•  Tez, Storm, Falcon, Flume
•  Knox, Slider, Kafka
•  Ranger, Spark, Phoenix (NEW in Ambari 2.0)
Admin > Stack and Versions
List of Stack Services
Installed or Add Service
Ambari 2.0.0 High Availability Support
High Availability Mode Ambari 1.6.1 Ambari 1.7.0 Ambari 2.0.0
HDFS: NameNode HDP 2.0+ Active/Standby
YARN: ResourceManager HDP 2.1+ Active/Standby
HBase: HBaseMaster HDP 2.1+ Multi-master
Hive: HiveServer2 HDP 2.1+ Multi-instance
Hive: Hive Metastore HDP 2.1+ Multi-instance
Oozie: Oozie Server* HDP 2.1+ Multi-instance
* Oozie Server needs external load balancer to complete HA solution
Hive HA
Services > Hive > Service Actions
+ Add Hive Metastore
+ Add HiveServer2
Ambari Agent Non-Root
Non-Root Ambari Agent
Agent Runs Commands From the Ambari Server
•  Configuration Change
•  Service Start
•  Service Stop
Some commands require root-level access
•  /bin/su hdfs -l -s /bin/bash -c /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
Sudo Leveraged
•  Configuration for:
–  Customizable Users (su hdfs, yarn, etc.)
–  Non-Customizable Users (su mysql)
–  Commands (yum, mkdir, touch, test, etc.)
Diagram: Ambari Agents (python)
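A minimal sketch of how a command like the one on this slide could be assembled as an argument list. The function name `build_su_command` and the `use_sudo` flag are hypothetical and only illustrate the idea of putting sudo in front of `su` for a non-root agent; this is not the agent's actual code.

```python
import shlex


def build_su_command(user, command, use_sudo=False):
    """Build the argv list for running `command` as `user` through /bin/su."""
    argv = ["/bin/su", user, "-l", "-s", "/bin/bash", "-c", command]
    # Assumption for illustration: a non-root agent prefixes the call with sudo.
    return (["sudo"] + argv) if use_sudo else argv


# The DataNode start command shown on the slide.
DATANODE_START = (
    "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh "
    "--config /etc/hadoop/conf start datanode"
)

root_argv = build_su_command("hdfs", DATANODE_START)
sudo_argv = build_su_command("hdfs", DATANODE_START, use_sudo=True)

# shlex.join quotes the -c argument so the line can be pasted into a shell.
print(shlex.join(sudo_argv))
```

Keeping the command as a list rather than one string avoids quoting mistakes when the `-c` argument itself contains spaces.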
Configuring Agent for Non-Root
1.  Create and configure a sudoer account
2.  Manually bootstrap Ambari Agents
3.  Set run_as_user in ambari-agent.ini for the sudoer account
Details: http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.0.0/bk_ambari_reference_guide/content/ch_amb_ref_configuring_ambari_for_non-root.html
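Step 3 can be sketched with Python's standard `configparser`. The sample file content, the `[agent]` section name and the account name `ambari` are assumptions for illustration; check the reference guide for the exact location of `run_as_user` in your ambari-agent.ini.

```python
import configparser
import io

# Hypothetical, trimmed-down ambari-agent.ini content used only for this sketch.
SAMPLE_INI = """\
[server]
hostname=ambari.example.com

[agent]
prefix=/var/lib/ambari-agent/data
"""


def set_run_as_user(ini_text, user, section="agent"):
    """Return ini_text with run_as_user set to `user` in `section`."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "run_as_user", user)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


updated = set_run_as_user(SAMPLE_INI, "ambari")
print(updated)
```

The sketch works on a string so it can be tried without touching a real agent host.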
Updated umask Handling
What about umask?
Unix Permissions Basics: (user, group, other)
4 – read
2 – write
1 – execute
rwxr-xr-x == 755
Previous Behavior:
•  If (umask > 022); Warning during agent pre-req check
•  Installations would fail if ignored
New Behavior:
•  If (umask > 027); Warning during agent pre-req check
•  Installation will fail if ignored
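The permission arithmetic above can be worked through in a few lines. This is a sketch: `default_mode` and `umask_triggers_warning` are hypothetical helper names, and the numeric comparison simply mirrors the "umask > 027" wording on the slide rather than Ambari's actual pre-req check.

```python
def default_mode(umask, base=0o777):
    """Permissions a new directory gets: the base mode with the umask bits cleared."""
    return base & ~umask


def umask_triggers_warning(umask, threshold=0o027):
    """Mirror of the slide's rule: warn when the umask is above the threshold."""
    return umask > threshold


for mask in (0o022, 0o027, 0o077):
    print(
        "umask %03o -> dirs %03o, files %03o, warning=%s"
        % (mask, default_mode(mask), default_mode(mask, 0o666),
           umask_triggers_warning(mask))
    )
```

With umask 022 new directories come out as 755 (rwxr-xr-x); with 027 they come out as 750, which removes all access for "other".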
Ambari Alerts
Ambari Alerts
Summary
•  Migrated away from Nagios as the Ambari alerting system
•  No longer offer option to install or manage a Nagios service
•  Replaced with built-in alerting system
Motivation
•  Avoids Nagios package conflicts in customer environments
•  More flexibility with alerts in Ambari Stacks
•  Platform independence
Ambari Alerts
•  Ambari Alerts are installed and configured by default
•  Ambari Web provides centralized management of Health Alerts
Modifying Alerts
•  Control thresholds, check intervals and response text
Alert Groups
•  Create and manage groups of alerts
–  Grouping alerts further controls which alerts are dispatched to which notifications
•  Assign group to notifications
–  Only dispatch to interested parties
Alert Notifications
•  What: Create and manage multiple notification targets
–  Control who gets notified when
•  Why: Filter by severity
–  Send only certain notifications to certain targets based on severity
•  How: Control dispatch method
–  Support for EMAIL + SNMP
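How groups, severities and dispatch methods combine can be sketched as a small filter. The `Target` class, its field names and the sample targets are hypothetical and are not Ambari's data model; the sketch only shows the selection logic described above.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """A notification target (illustrative model only)."""
    name: str
    method: str            # "EMAIL" or "SNMP"
    severities: frozenset  # severities this target wants to hear about
    groups: frozenset      # alert groups assigned to this target


def targets_for(alert_group, severity, targets):
    """Return the targets that should be notified for this group and severity."""
    return [
        t for t in targets
        if alert_group in t.groups and severity in t.severities
    ]


TARGETS = [
    Target("hdfs-admins", "EMAIL", frozenset({"WARNING", "CRITICAL"}), frozenset({"HDFS"})),
    Target("noc", "SNMP", frozenset({"CRITICAL"}), frozenset({"HDFS", "YARN"})),
]

notified = [t.name for t in targets_for("HDFS", "WARNING", TARGETS)]
print(notified)
```

Only targets that match both the group and the severity are returned, which is the "only dispatch to interested parties" idea.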
Ambari Alerting System
1.  User creates or modifies cluster
2.  Ambari reads alert definitions from Stack
3.  Ambari sends alert definitions to Agents, and each Agent schedules instance checks
4.  Agents report alert instance status in the heartbeat
5.  Ambari responds to alert instance status changes and dispatches notifications (if applicable)
Diagram: Ambari Server, Stack definition (alerts.json), Ambari Agent(s), email and snmp notifications, with arrows numbered 1 to 5 for the steps above
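Steps 4 and 5 boil down to comparing each reported status with the last known one and acting only on changes. The sketch below assumes a simple dictionary keyed by (host, definition name); the function name and data shapes are hypothetical and not taken from the Ambari code base.

```python
def process_heartbeat(last_states, reported):
    """Update last_states from a heartbeat and return the status changes.

    last_states: dict mapping (host, definition_name) -> state, mutated in place
    reported:    dict with the same keys, as carried by one heartbeat
    """
    changes = []
    for key, state in reported.items():
        previous = last_states.get(key)
        if previous != state:
            changes.append((key, previous, state))
            last_states[key] = state
    return changes


states = {}
key = ("host1", "zookeeper_server_process")

first = process_heartbeat(states, {key: "OK"})
second = process_heartbeat(states, {key: "OK"})
third = process_heartbeat(states, {key: "CRITICAL"})

# Only `first` and `third` contain entries; those would be candidates for dispatch.
print(first, second, third)
```

In this sketch a first-seen status counts as a change; whether that should trigger a notification is a policy decision left open here.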
Notable Alert REST APIs
REST Endpoint: Description
•  /api/v1/clusters/:clusterName/alert_definitions
–  The list of alert definitions for the cluster.
•  /api/v1/clusters/:clusterName/alerts
–  The list of alert instances for the cluster.
–  Example: find all alert instances that are CRITICAL
   /api/v1/clusters/c1/alerts?Alert/state.in(CRITICAL)
–  Example: find all alert instances for “ZooKeeper Process” alert def
   /api/v1/clusters/c1/alerts?Alert/definition_name=zookeeper_server_process
•  /api/v1/clusters/:clusterName/alert_groups
–  The list of alert groups.
•  /api/v1/clusters/:clusterName/alert_history
–  The list of alert instance status changes.
•  /api/v1/alert_targets/
–  The list of configured alert notification targets for Ambari.
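The two example queries can be built programmatically. This sketch only constructs URLs and sends nothing; the base address `http://ambari.example.com:8080` and the helper name `alerts_url` are assumptions, while the predicates are the ones shown on the slide.

```python
from urllib.parse import quote

# Hypothetical Ambari Server address, used only for illustration.
BASE = "http://ambari.example.com:8080"


def alerts_url(cluster, predicate=None):
    """Build the alert-instances URL for a cluster, with an optional query predicate."""
    url = "%s/api/v1/clusters/%s/alerts" % (BASE, quote(cluster, safe=""))
    if predicate:
        url += "?" + predicate
    return url


critical = alerts_url("c1", "Alert/state.in(CRITICAL)")
zookeeper = alerts_url("c1", "Alert/definition_name=zookeeper_server_process")

print(critical)
print(zookeeper)
```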
Ambari Metrics
Ambari Metrics
Summary
•  Migrated from Ganglia as the Ambari metrics collection system
•  No longer offer option to install or manage a Ganglia service
•  Replaced with built-in metrics system “Ambari Metrics”
Motivation
•  Avoids Ganglia package conflicts in customer environments
•  More flexibility to retain metrics in Hadoop
•  Platform independence
Terminology
Term: Definition
•  Ambari Metrics (“AMS”): The built-in metrics collection system for Ambari.
•  Metrics Collector: The standalone server that collects, aggregates and serves metrics from the Hadoop service sinks and the Metrics Monitor. Analogous to gmetad.
•  Metrics Monitor: Installed on each host in the cluster to collect system-level metrics and forward them to the Collector. Analogous to gmond.
•  Metrics Hadoop Sinks: Plug into the Service sinks to send Hadoop metrics to the Collector.
Ambari Metrics Collection System
1.  Metrics Monitors send system-level metrics to Collector
2.  Sinks send Hadoop-level metrics to Collector
3.  Metrics Collector service stores and aggregates metrics
4.  Ambari exposes REST API for metrics retrieval
Diagram: Metrics Monitor and Sink(s) on each host, Metrics Collector, Ambari Server, with arrows numbered 1 to 4 for the steps above
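To make "stores and aggregates" in step 3 concrete, here is a generic time-bucket averaging sketch. It illustrates the idea of aggregation only and is not how the Metrics Collector is implemented; the sample tuples and the 60-second bucket size are assumptions.

```python
from collections import defaultdict


def aggregate(samples, bucket_seconds=60):
    """Average samples per (bucket_start, metric_name).

    samples: iterable of (timestamp_seconds, metric_name, value)
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, sample count]
    for ts, name, value in samples:
        bucket = ts - (ts % bucket_seconds)
        acc = sums[(bucket, name)]
        acc[0] += value
        acc[1] += 1
    return {k: total / count for k, (total, count) in sums.items()}


SAMPLES = [
    (0, "cpu_user", 10.0),
    (30, "cpu_user", 30.0),
    (60, "cpu_user", 50.0),
    (15, "mem_free", 4.0),
]

averages = aggregate(SAMPLES)
print(averages)
```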
Metrics Collector
Built using Hadoop technologies: HBase, ATS, Phoenix
By default, uses the local filesystem for metrics storage (“embedded”) **
** Tech Preview: “distributed” storage option to use existing HDFS
Ambari Automated Rolling Upgrade
For HDP Stack
Rolling vs. In Place Upgrades
•  In Place Upgrades: Upgrade Stack with one or more service disruptions. Explicit stop of all services.
•  Rolling Upgrades (Ambari 2.0): Upgrade Stack with minimized service disruption and degradation.
Upgrading the Stack with Ambari 2.0
Source HDP Version: Target HDP Versions (Method)
•  HDP 2.0.x: HDP 2.0.x, HDP 2.1.x, HDP 2.2.x (In Place)
•  HDP 2.1.x: HDP 2.1.x, HDP 2.2.x (In Place)
•  HDP 2.2.x: HDP 2.2.x (Rolling, NEW!!!)
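The source/target matrix can be expressed as a lookup. This is a sketch with hypothetical names (`UPGRADE_METHODS`, `upgrade_method`); it encodes only the combinations listed on this slide and returns None for anything else.

```python
# (source line, target line) -> upgrade method, as listed on the slide.
UPGRADE_METHODS = {
    ("2.0", "2.0"): "In Place",
    ("2.0", "2.1"): "In Place",
    ("2.0", "2.2"): "In Place",
    ("2.1", "2.1"): "In Place",
    ("2.1", "2.2"): "In Place",
    ("2.2", "2.2"): "Rolling",
}


def release_line(version):
    """Reduce a version such as '2.2.4' to its release line '2.2'."""
    return ".".join(version.split(".")[:2])


def upgrade_method(source, target):
    """Return 'In Place', 'Rolling', or None if the combination is not listed."""
    return UPGRADE_METHODS.get((release_line(source), release_line(target)))


print(upgrade_method("2.2.0", "2.2.4"))
print(upgrade_method("2.1.3", "2.2.0"))
```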
Rolling Upgrade Process
•  Pre-requisites → Prepare → Rolling Upgrade → Finalize
•  Rolling Downgrade
•  Rollback: NOT Rolling. Shut down all services.
Process: Rolling Upgrade
HDP has a certified process for Rolling Upgrades
Services are switched over to new version in rolling fashion
Upgrade order: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume, Finalize
Diagram callouts: “HDFS, YARN, MR, Tez, HBase, Pig, Hive” and “HDFS, YARN, HBase”
Process: Rolling Downgrade
Downgrade order: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume, Finalize
Wizard Driven Experience
Register → Install → Perform Upgrade → Finalize
With verification and validation
Process: Service Disruption by Component
Component: Service Disruption
•  ZooKeeper: No Service Disruption
•  Ranger: No Service Disruption
•  HDFS: No Service Disruption
•  YARN: No Service Disruption
•  HBase: No Service Disruption
•  Hive: No Service Disruption
•  Oozie: No Service Disruption
•  Falcon: Yes – Requires Stop/Start
•  Kafka: Yes – Requires Stop/Start
•  Knox: Yes – Requires Stop/Start
•  Storm: Yes – Requires Stop/Start
•  Flume: No Service Disruption
•  Slider applications: Yes – Requires Stop/Start
•  Hue: Yes – Requires Stop/Start
•  Accumulo: Yes – Requires Stop/Start
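When planning a maintenance window it can help to have this table as data. The sketch below copies the table into a dictionary; the names `REQUIRES_STOP_START` and `needs_stop_start` are hypothetical.

```python
# True means the component requires a stop/start during a rolling upgrade,
# per the table on this slide.
REQUIRES_STOP_START = {
    "ZooKeeper": False, "Ranger": False, "HDFS": False, "YARN": False,
    "HBase": False, "Hive": False, "Oozie": False, "Flume": False,
    "Falcon": True, "Kafka": True, "Knox": True, "Storm": True,
    "Slider applications": True, "Hue": True, "Accumulo": True,
}


def needs_stop_start(installed):
    """Return the installed components that will see a service disruption.

    Raises KeyError for a component that is not in the table, so that an
    unknown component is never silently treated as safe.
    """
    return [name for name in installed if REQUIRES_STOP_START[name]]


disrupted = needs_stop_start(["HDFS", "Kafka", "Hive", "Storm"])
print(disrupted)
```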
Ambari Extensibility
Stacks, Blueprints and Views
Extensibility Features
Stacks
•  To add new Services (ISV or otherwise) beyond HDP Stack
•  To customize a Stack for customer specific environments
Blueprints
•  To use Ambari for automating cluster installations
•  To share best practices on layout and cluster configuration
Views
•  To extend and customize the Ambari Web UI
•  Add new capabilities, customize existing capabilities
Goal: Extend Ambari without hard-coding in Ambari
New in Ambari 2.0 - Blueprints Add Host
Add hosts to a cluster based on a host group from a Blueprint
Add one or more hosts with a single call
POST /api/v1/clusters/MyCluster/hosts
{
"blueprint" : "myblueprint",
"host_group" : "workers",
"host_name" : "c6403.ambari.apache.org"
}
POST /api/v1/clusters/MyCluster/hosts
[
{
"blueprint" : "myblueprint",
"host_group" : "workers",
"host_name" : "c6403.ambari.apache.org"
},
{
"blueprint" : "myblueprint",
"host_group" : "workers",
"host_name" : "c6404.ambari.apache.org"
}
]
https://issues.apache.org/jira/browse/AMBARI-8458
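The two request shapes above (a single object, or an array for several hosts) can be generated from one helper. This sketch builds a `urllib.request.Request` without sending it. The server address and the function name `add_hosts_request` are assumptions; the `X-Requested-By` header is included because Ambari expects it on modifying requests, and its value here is arbitrary.

```python
import json
import urllib.request


def add_hosts_request(base_url, cluster, blueprint, host_group, host_names):
    """Build (but do not send) the POST that adds hosts from a Blueprint host group."""
    if not host_names:
        raise ValueError("at least one host name is required")
    entries = [
        {"blueprint": blueprint, "host_group": host_group, "host_name": name}
        for name in host_names
    ]
    # One host is sent as a single object, several hosts as an array.
    payload = entries[0] if len(entries) == 1 else entries
    return urllib.request.Request(
        url="%s/api/v1/clusters/%s/hosts" % (base_url, cluster),
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )


req = add_hosts_request(
    "http://ambari.example.com:8080",  # hypothetical server address
    "MyCluster", "myblueprint", "workers",
    ["c6403.ambari.apache.org", "c6404.ambari.apache.org"],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` together with credentials would submit it; that part is left out so the sketch has no side effects.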
More on Ambari Extensibility
http://hortonworks.com/partners/learn/#ambari
Simplified Kerberos Setup
Quick Kerberos Overview
REALM
•  EXAMPLE.COM
Principals (Humans)
•  paul@EXAMPLE.COM
Service Principals (Services)
•  hbase@EXAMPLE.COM
•  hbase/r2u3s1.example.local@EXAMPLE.COM
Tickets
•  “paul@EXAMPLE.COM is authenticated and can access the HBASE service”
KDC – Key Distribution Center
•  Grants tickets to authenticated users
Client
•  r1u2m1.example.com (.example.com maps to realm EXAMPLE.COM)
•  EXAMPLE.COM’s KDC is hosted on r1u2m3.example.com
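The naming scheme in this overview (primary, optional instance, realm) and the domain-to-realm mapping can be sketched as follows. `Principal`, `parse_principal` and `realm_for_host` are hypothetical helper names, and the parser handles only the simple forms shown on the slide.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]  # host part of a service principal, if any
    realm: str


def parse_principal(text):
    """Split 'primary[/instance]@REALM' into its parts."""
    name, sep, realm = text.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError("not a principal: %r" % text)
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


def realm_for_host(hostname, domain_realm):
    """Map a host to a realm using the longest matching domain suffix."""
    matches = [domain for domain in domain_realm if hostname.endswith(domain)]
    if not matches:
        return None
    return domain_realm[max(matches, key=len)]


user = parse_principal("paul@EXAMPLE.COM")
service = parse_principal("hbase/r2u3s1.example.local@EXAMPLE.COM")
realm = realm_for_host("r1u2m1.example.com", {".example.com": "EXAMPLE.COM"})
print(user, service, realm)
```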
KDC Implementation Options
•  Microsoft Active Directory
•  Users
•  Service Principals
•  MIT Kerberos
•  Users
•  Service Principals
•  MIT Kerberos + Microsoft Active Directory (Trust Relationship)
•  Users in Active Directory
•  Service Principals in MIT
What Ambari 2.0 will do
•  Step-by-step wizard to set up Kerberos
•  Supports existing MIT KDC and Active Directory (AD) infrastructure
•  Deploys and manages Kerberos clients and configuration
•  First Time Setup as well as New Service/Host/Component
•  Automated creation of principals
•  Automated generation of keytabs
•  Automated distribution of keytabs
•  Support for regeneration and distribution of keytabs
Prerequisites
Category Requirements
General
•  Ambari Server must be part of cluster
•  Ambari Server and all hosts must have JCE installed
•  Ambari Server and all hosts must have network access to the KDC
KDC Admin
•  KDC admin account credentials are on-hand
•  !!! Ambari does not retain KDC admin credentials !!!
Active Directory
•  Secure LDAP (LDAPS) connectivity has been configured
•  User container for principals has been created and is on-hand
•  Admin account has delegated control of “Create, delete and manage user accounts”
Terminology
Term: Definition
•  Service Principals: Principals required for HDP Service Components.
•  Ambari Principals: Headless principals used by Ambari to perform “smoke tests” and “health alert checks”.
•  KDC Admin Account: An administrative account that will be used by Ambari to create principals and generate keytabs in KDC.
Principal and Keytab Generation and Distribution
1.  User provides KDC Admin Account credentials to Ambari
2.  Ambari connects to KDC, creates principals (Service and Ambari) needed for cluster
3.  Ambari generates keytabs for the principals
4.  Ambari distributes keytabs to Ambari Server and cluster hosts
5.  Ambari discards the KDC Admin Account credentials
Diagram: Ambari Server, KDC and HDP Cluster, with arrows numbered 1 to 5 for the steps above
Ambari + Service Keytab Files
•  Ambari Server: Keytabs for Ambari Principals
•  HDP Cluster Hosts: Keytabs for Service + Ambari Principals
•  KDC: Ambari and Service Principals
Wizard Driven and Automated
KDC and KDC Admin Information
Customizable Principal Attributes
Kerberos Clients
Ambari installs Kerberos clients on cluster hosts
Optionally, Ambari can be set not to manage the krb5.conf client config
OS: Client
•  RHEL/CentOS/OEL: krb5-workstation
•  SLES 11: krb5-client
•  Ubuntu 12: krb5-user, krb5-config
Configure Ambari and Service Identities
Post-Kerberos Scenarios
Ambari does not retain KDC admin credentials
User is prompted for KDC Admin credentials:
•  Add/Delete Host
•  Add Service
•  Add/Delete Component
•  Regenerate Keytabs
•  Disable Kerberos
Security Lab
Security today in Hadoop with HDP
HDP2.1, Ranger: Centralized Security Administration
•  Authentication (Who am I/prove it?)
–  Kerberos in native Apache Hadoop
–  HTTP/REST API secured with Apache Knox Gateway
•  Authorization (Restrict access to explicit data)
–  HDFS, Hive and HBase, Storm and Knox
–  Fine grain access control
•  Audit (Understand who did what)
–  Centralized audit reporting
–  Policy and access history
•  Data Protection (Encrypt data at rest & in motion)
–  Wire encryption in Hadoop
–  Orchestrated encryption with partner tools
More on Security: http://hortonworks.com/partners/learn/#secure
Ambari 2.0 Security Lab Steps
•  Detailed steps available at: http://bit.ly/1J4IbIs
•  Install Ambari server and agents
•  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
•  Use blueprint/API to provision a minimal Hadoop cluster with custom services
•  Use Add Service wizard to also install Hive
•  Configure Ambari to sync/recognize business users in OpenLDAP
•  Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
•  Install Ranger as Ambari service and configure it to recognize LDAP users
•  Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption

More Related Content

PDF
Hortonworks Technical Workshop: What's New in HDP 2.3
PDF
Hortonworks Technical Workshop: Apache Ambari
PPTX
Apache Ambari: Managing Hadoop and YARN
PPTX
Apache Ambari: Past, Present, Future
PPTX
YARN Ready - Integrating to YARN using Slider Webinar
PPTX
A First-Hand Look at What's New in HDP 2.3
PDF
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
PPTX
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: Apache Ambari
Apache Ambari: Managing Hadoop and YARN
Apache Ambari: Past, Present, Future
YARN Ready - Integrating to YARN using Slider Webinar
A First-Hand Look at What's New in HDP 2.3
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters

What's hot (20)

PDF
HDF: Hortonworks DataFlow: Technical Workshop
PPTX
Apache Ambari - What's New in 2.4
PPTX
Authoring and Hosting Applications on YARN using Slider
PDF
Hortonworks tech workshop in-memory processing with spark
PPTX
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
PPTX
Get Started Building YARN Applications
PDF
Hortonworks Technical Workshop: Interactive Query with Apache Hive
PPTX
Securing Hadoop with Apache Ranger
PPTX
Deploying Docker applications on YARN via Slider
PPTX
Enabling Diverse Workload Scheduling in YARN
PPTX
Hadoop crashcourse v3
PDF
What s new in spark 2.3 and spark 2.4
PPTX
Internet of things Crash Course Workshop
PPTX
Apache Ambari - What's New in 2.2
PPTX
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
PPTX
Hortonworks Hadoop summit 2011 keynote - eric14
PPTX
Apache Hadoop YARN: Past, Present and Future
PPTX
Double Your Hadoop Hardware Performance with SmartSense
PDF
Hortonworks Technical Workshop - build a yarn ready application with apache ...
PDF
Discover.hdp2.2.h base.final[2]
HDF: Hortonworks DataFlow: Technical Workshop
Apache Ambari - What's New in 2.4
Authoring and Hosting Applications on YARN using Slider
Hortonworks tech workshop in-memory processing with spark
Its Finally Here! Building Complex Streaming Analytics Apps in under 10 min w...
Get Started Building YARN Applications
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Securing Hadoop with Apache Ranger
Deploying Docker applications on YARN via Slider
Enabling Diverse Workload Scheduling in YARN
Hadoop crashcourse v3
What s new in spark 2.3 and spark 2.4
Internet of things Crash Course Workshop
Apache Ambari - What's New in 2.2
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Hortonworks Hadoop summit 2011 keynote - eric14
Apache Hadoop YARN: Past, Present and Future
Double Your Hadoop Hardware Performance with SmartSense
Hortonworks Technical Workshop - build a yarn ready application with apache ...
Discover.hdp2.2.h base.final[2]
Ad

Similar to Hortonworks technical workshop operations with ambari (20)

PPTX
Apache Ambari - What's New in 2.1
PDF
Past, Present and Future of Apache Ambari
PPTX
Managing Enterprise Hadoop Clusters with Apache Ambari
PPTX
Managing Enterprise Hadoop Clusters with Apache Ambari
PPTX
What's new in Ambari
PPTX
Apache Ambari - What's New in 2.0.0
PPTX
Apache Ambari - What's New in 1.7.0
PPTX
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
PPTX
Hortonworks Data In Motion Series Part 3 - HDF Ambari
PPTX
Manage Add-On Services with Apache Ambari
PPTX
Running Cloudbreak on Kubernetes
PPTX
Running Cloudbreak on Kubernetes
PPTX
Ambari metrics system - Apache ambari meetup (DataWorks Summit 2017)
PPTX
Manage Add-on Services in Apache Ambari
PDF
Pivotal cf for_devops_mkim_20141209
PPTX
Why is My Hadoop Job Slow?
PPTX
Why is my Hadoop cluster slow?
PDF
Why is My Hadoop Job Slow?
PDF
Hadoop Operations - Past, Present, and Future
PDF
Hadoop Operations – Past, Present, and Future
Apache Ambari - What's New in 2.1
Past, Present and Future of Apache Ambari
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache Ambari
What's new in Ambari
Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 1.7.0
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Why is my Hadoop cluster s...
Hortonworks Data In Motion Series Part 3 - HDF Ambari
Manage Add-On Services with Apache Ambari
Running Cloudbreak on Kubernetes
Running Cloudbreak on Kubernetes
Ambari metrics system - Apache ambari meetup (DataWorks Summit 2017)
Manage Add-on Services in Apache Ambari
Pivotal cf for_devops_mkim_20141209
Why is My Hadoop Job Slow?
Why is my Hadoop cluster slow?
Why is My Hadoop Job Slow?
Hadoop Operations - Past, Present, and Future
Hadoop Operations – Past, Present, and Future
Ad

More from Hortonworks (20)

PDF
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
PDF
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
PDF
Getting the Most Out of Your Data in the Cloud with Cloudbreak
PDF
Johns Hopkins - Using Hadoop to Secure Access Log Events
PDF
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
PDF
HDF 3.2 - What's New
PPTX
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
PDF
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
PDF
IBM+Hortonworks = Transformation of the Big Data Landscape
PDF
Premier Inside-Out: Apache Druid
PDF
Accelerating Data Science and Real Time Analytics at Scale
PDF
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
PDF
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
PDF
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
PDF
Making Enterprise Big Data Small with Ease
PDF
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
PDF
Driving Digital Transformation Through Global Data Management
PPTX
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
PDF
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
PDF
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Johns Hopkins - Using Hadoop to Secure Access Log Events
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
HDF 3.2 - What's New
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
IBM+Hortonworks = Transformation of the Big Data Landscape
Premier Inside-Out: Apache Druid
Accelerating Data Science and Real Time Analytics at Scale
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Making Enterprise Big Data Small with Ease
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Driving Digital Transformation Through Global Data Management
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Unlock Value from Big Data with Apache NiFi and Streaming CDC

Recently uploaded (20)

PDF
Hindi spoken digit analysis for native and non-native speakers
PDF
Microsoft Solutions Partner Drive Digital Transformation with D365.pdf
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Web App vs Mobile App What Should You Build First.pdf
PPTX
TechTalks-8-2019-Service-Management-ITIL-Refresh-ITIL-4-Framework-Supports-Ou...
PDF
ENT215_Completing-a-large-scale-migration-and-modernization-with-AWS.pdf
PPTX
cloud_computing_Infrastucture_as_cloud_p
PPTX
A Presentation on Artificial Intelligence
PDF
Transform Your ITIL® 4 & ITSM Strategy with AI in 2025.pdf
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
From MVP to Full-Scale Product A Startup’s Software Journey.pdf
PDF
1 - Historical Antecedents, Social Consideration.pdf
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
PPTX
A Presentation on Touch Screen Technology
PPTX
Tartificialntelligence_presentation.pptx
PDF
Accuracy of neural networks in brain wave diagnosis of schizophrenia
PPTX
TLE Review Electricity (Electricity).pptx
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Enhancing emotion recognition model for a student engagement use case through...
Hindi spoken digit analysis for native and non-native speakers
Microsoft Solutions Partner Drive Digital Transformation with D365.pdf
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Web App vs Mobile App What Should You Build First.pdf
TechTalks-8-2019-Service-Management-ITIL-Refresh-ITIL-4-Framework-Supports-Ou...
ENT215_Completing-a-large-scale-migration-and-modernization-with-AWS.pdf
cloud_computing_Infrastucture_as_cloud_p
A Presentation on Artificial Intelligence
Transform Your ITIL® 4 & ITSM Strategy with AI in 2025.pdf
Digital-Transformation-Roadmap-for-Companies.pptx
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
From MVP to Full-Scale Product A Startup’s Software Journey.pdf
1 - Historical Antecedents, Social Consideration.pdf
Group 1 Presentation -Planning and Decision Making .pptx
A Presentation on Touch Screen Technology
Tartificialntelligence_presentation.pptx
Accuracy of neural networks in brain wave diagnosis of schizophrenia
TLE Review Electricity (Electricity).pptx
MIND Revenue Release Quarter 2 2025 Press Release
Enhancing emotion recognition model for a student engagement use case through...

Hortonworks technical workshop operations with ambari

  • 1. Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Operations With Apache Ambari We Do Hadoop.
  • 2. Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Apache Ambari Apache Ambari is the open source operational platform to provision, manage and monitor Hadoop clusters
  • 3. Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved How Do People Use Ambari? Health Checks, Alerts Stacks, Views Lifecycle controls, Rolling Restarts, Decommission/ Re-commission Host Groups, Versioning, Compare, Revert, Recommendations, Security Setup Install Wizard (UI), Blueprints (API) Config Management ExtensibilityMonitoring Service Management Cluster Provisioning
  • 4. Page 4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Recent Ambari Releases Ambari 1.7.0 Dec 2014 Ambari 1.6.0 May 2014 Introduced Ambari Blueprints Introduced Ambari Views Ambari 2.0.0 Apr 2014 HDP 2.2 GA
  • 5. Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved What’s New in Ambari 2.0 Core Platform Simplified Kerberos Setup (AMBARI-7204) Ambari Alerts (AMBARI-6354) Ambari Metrics (AMBARI-5707) Automated (Rolling) Upgrade (AMBARI-7804) Stack Support HDP 2.2: Ranger, Spark, Phoenix Hive Metastore HA (AMBARI-6684) HiveServer2 HA (AMBARI-8906) Oozie HA (AMBARI-6683) Ambari Platform Handle umask 027 setting (AMBARI-7796) Ambari Agent non-root (AMBARI-1596) Blueprints API Add Host (AMBARI-8458) For a complete list of changes https://issues.apache.org/jira/browse/AMBARI/fixforversion/12327486
  • 6. Page 6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Lab Setup
  • 7. Page 7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Ambari 2.0 Security Lab Steps – 4 node cluster •  Detailed steps available at: http://bit.ly/1J4IbIs •  Install Ambari server and agents •  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD •  Use blueprint/API to provision a minimal Hadoop cluster with custom services •  Use Add service wizard to also install Hive •  Configure Ambari to sync/recognize business users in OpenLDAP •  Authentication: Run Ambari security wizard to enable kerberos using KDC/kadmin service •  Install Ranger as Ambari service and configure it to recognize LDAP users •  Authorization/Audit: Setup HDFS/Hive plugins to allow users to set policies to control access and audit consumption
  • 8. Page 8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Core Platform
  • 9. Page 9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Stack Components Support HDP 2.2 HDP 2.1 HDP 2.0 HDFS, YARN, MapReduce, Hive, HBase, Pig, ZooKeeper, Oozie, Sqoop Tez, Storm, Falcon, Flume Knox, Slider, Kafka Ranger, Spark, Phoenix NEW in Ambari 2.0 install/manage/monitor
  • 10. Page 10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Admin > Stack and Versions List of Stack Services Installed or Add Service
  • 11. Page 11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Ambari 2.0.0 High Availability Support High Availability Mode Ambari 1.6.1 Ambari 1.7.0 Ambari 2.0.0 HDFS: NameNode HDP 2.0+ Active/Standby YARN: ResourceManager HDP 2.1+ Active/Standby HBase: HBaseMaster HDP 2.1+ Multi-master Hive: HiveServer2 HDP 2.1+ Multi-instance Hive: Hive Metastore HDP 2.1+ Multi-instance Oozie: Oozie Server* HDP 2.1+ Multi-instance * Oozie Server needs external load balancer to complete HA solution
  • 12. Page 12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Hive HA Services > Hive > Service Actions + Add Hive Metastore + Add HiveServer2
  • 13. Page 13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved Ambari Agent Non-Root
  • 14. Non-Root Ambari Agent
      Agent Runs Commands From the Ambari Server
      •  Configuration Change
      •  Service Start
      •  Service Stop
      Some commands require root-level access
      •  /bin/su hdfs -l -s /bin/bash -c /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
      Sudo Leveraged
      •  Configuration for:
         –  Customizable Users (su hdfs, yarn, etc.)
         –  Non-Customizable Users (su mysql)
         –  Commands (yum, mkdir, touch, test, etc.)
  • 15. Configuring Agent for Non-Root
      1.  Create and configure a sudoer account
      2.  Manually bootstrap Ambari Agents
      3.  Set run_as_user in ambari-agent.ini for the sudoer account
      Details: http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.0.0/bk_ambari_reference_guide/content/ch_amb_ref_configuring_ambari_for_non-root.html
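Step 3 on this slide can be scripted. The following is a minimal sketch rather than an official procedure: it assumes that run_as_user belongs in an [agent] section of ambari-agent.ini, and the sample file contents, host name and the account name "ambari" are all hypothetical placeholders.

```python
import configparser
import io

# Hypothetical stand-in for ambari-agent.ini; a real file has more settings.
SAMPLE_INI = """[server]
hostname=ambari.example.com

[agent]
prefix=/var/lib/ambari-agent/data
"""


def set_run_as_user(ini_text, user):
    """Return ini_text with run_as_user set in the (assumed) [agent] section."""
    # interpolation=None keeps any literal '%' characters in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("agent"):
        parser.add_section("agent")
    parser.set("agent", "run_as_user", user)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # "ambari" is a placeholder for the sudoer account created in step 1.
    print(set_run_as_user(SAMPLE_INI, "ambari"))
```

Note that configparser drops comments when it rewrites a file, so for a hand-maintained ambari-agent.ini a manual edit may be preferable.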
  • 16. Updated umask Handling
  • 17. What about umask?
      Unix Permissions Basics: (user, group, other)
      •  4 – read
      •  2 – write
      •  1 – execute
      •  rwxr-xr-x == 755
      Previous Behavior:
      •  If (umask > 022); Warning during agent pre-req check
      •  Installations would fail if ignored
      New Behavior:
      •  If (umask > 027); Warning during agent pre-req check
      •  Installation will fail if ignored
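The permission arithmetic behind the two umask values can be worked through in a few lines. This is an illustrative sketch only; the helper names are made up for the example and nothing here is Ambari code.

```python
def effective_mode(requested, umask):
    """Permission bits left after the umask clears its bits from the request."""
    return requested & ~umask & 0o777


def mode_string(mode):
    """Render nine permission bits in rwx notation, e.g. 0o755 -> rwxr-xr-x."""
    flags = "rwxrwxrwx"
    return "".join(
        flag if mode & (1 << (8 - position)) else "-"
        for position, flag in enumerate(flags)
    )


if __name__ == "__main__":
    for umask in (0o022, 0o027):
        mode = effective_mode(0o777, umask)
        print(f"umask {umask:03o}: {mode:03o} {mode_string(mode)}")
```

With umask 022 a directory requested as 777 ends up 755 (rwxr-xr-x); with umask 027 it ends up 750 (rwxr-x---), so "other" loses all access.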
  • 18. Ambari Alerts
  • 19. Ambari Alerts
      Summary
      •  Migrated away from Nagios as the Ambari alerting system
      •  No longer offer option to install or manage a Nagios service
      •  Replaced with built-in alerting system
      Motivation
      •  Avoids Nagios package conflicts in customer environments
      •  More flexibility with alerts in Ambari Stacks
      •  Platform independence
  • 20. Ambari Alerts
      •  Ambari Alerts are installed and configured by default
      •  Ambari Web provides centralized management of Health Alerts
  • 21. Modifying Alerts
      •  Control thresholds, check intervals and response text
  • 22. Alert Groups
      •  Create and manage groups of alerts
      •  Grouping alerts further controls which alerts are dispatched to which notifications
      •  Assign groups to notifications
      •  Only dispatch to interested parties
  • 23. Alert Notifications
      •  What: Create and manage multiple notification targets
         –  Control who gets notified when
      •  Why: Filter by severity
         –  Send only certain notifications to certain targets based on severity
      •  How: Control dispatch method
         –  Support for EMAIL + SNMP
  • 24. Ambari Alerting System
      1.  User creates or modifies cluster
      2.  Ambari reads alert definitions from Stack
      3.  Ambari sends alert definitions to Agents and each Agent schedules instance checks
      4.  Agents report alert instance status in the heartbeat
      5.  Ambari responds to alert instance status changes and dispatches notifications (if applicable)
      Diagram: Ambari Server, Stack definition (alerts.json), Ambari Agent(s), email and snmp notifications
  • 25. Notable Alert REST APIs
      REST Endpoint, followed by its Description
      •  /api/v1/clusters/:clusterName/alert_definitions
         The list of alert definitions for the cluster.
      •  /api/v1/clusters/:clusterName/alerts
         The list of alert instances for the cluster.
         –  Example: find all alert instances that are CRITICAL
            /api/v1/clusters/c1/alerts?Alert/state.in(CRITICAL)
         –  Example: find all alert instances for “ZooKeeper Process” alert def
            /api/v1/clusters/c1/alerts?Alert/definition_name=zookeeper_server_process
      •  /api/v1/clusters/:clusterName/alert_groups
         The list of alert groups.
      •  /api/v1/clusters/:clusterName/alert_history
         The list of alert instance status changes.
      •  /api/v1/alert_targets/
         The list of configured alert notification targets for Ambari.
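The two query examples on this slide follow one pattern: the alerts endpoint plus a predicate after the "?". A small sketch of how such URLs could be assembled is shown here; the helper names and the base URL http://ambari.example.com:8080 are hypothetical, and the code only builds strings, it does not contact a server.

```python
from urllib.parse import quote


def alerts_url(base_url, cluster, predicate=""):
    """Build the alert-instances URL for a cluster, with an optional predicate."""
    url = f"{base_url.rstrip('/')}/api/v1/clusters/{quote(cluster)}/alerts"
    return f"{url}?{predicate}" if predicate else url


def state_in(*states):
    """Predicate matching alert instances in any of the given states."""
    return f"Alert/state.in({','.join(states)})"


def definition_is(name):
    """Predicate matching alert instances of one alert definition."""
    return f"Alert/definition_name={name}"


if __name__ == "__main__":
    base = "http://ambari.example.com:8080"  # hypothetical Ambari Server address
    print(alerts_url(base, "c1", state_in("CRITICAL")))
    print(alerts_url(base, "c1", definition_is("zookeeper_server_process")))
```

The resulting URLs can be passed to any HTTP client together with Ambari credentials.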
  • 26. Ambari Metrics
  • 27. Ambari Metrics
      Summary
      •  Migrated from Ganglia as the Ambari metrics collection system
      •  No longer offer option to install or manage a Ganglia service
      •  Replaced with built-in metrics system “Ambari Metrics”
      Motivation
      •  Avoids Ganglia package conflicts in customer environments
      •  More flexibility to retain metrics in Hadoop
      •  Platform independence
  • 28. Terminology
      Term: Definition
      •  Ambari Metrics (“AMS”): The built-in metrics collection system for Ambari.
      •  Metrics Collector: The standalone server that collects, aggregates and serves metrics from the Hadoop service sinks and the Metrics Monitors. Analogous to gmetad.
      •  Metrics Monitor: Installed on each host in the cluster to collect system-level metrics and forward them to the Collector. Analogous to gmond.
      •  Metrics Hadoop Sinks: Plug into the Hadoop services to send Hadoop metrics to the Collector.
  • 29. Ambari Metrics Collection System
      1.  Metric Monitors send system-level metrics to Collector
      2.  Sinks send Hadoop-level metrics to Collector
      3.  Metrics Collector service stores and aggregates metrics
      4.  Ambari exposes REST API for metrics retrieval
      Diagram: Ambari Server, Metrics Collector, and a Metrics Monitor plus Sink(s) on each host
  • 30. Metrics Collector
      •  Built using Hadoop technologies
      •  Default uses local filesystem for metrics storage (“embedded”)
      Diagram: Local Filesystem **, HBase, ATS, Phoenix
      ** Tech Preview “distributed” storage option to use existing HDFS
  • 31. Ambari Automated Rolling Upgrade For HDP Stack
  • 32. Rolling vs. In Place Upgrades
      •  In Place Upgrades: Upgrade Stack with one or more service disruptions. Explicit stop all services.
      •  Rolling Upgrades (Ambari 2.0): Update Stack with minimized service disruption and degradation.
  • 33. Upgrading the Stack with Ambari 2.0
      Source HDP Version; Target HDP Versions; Method
      •  HDP 2.0.x; HDP 2.0.x, HDP 2.1.x, HDP 2.2.x; In Place
      •  HDP 2.1.x; HDP 2.1.x, HDP 2.2.x; In Place
      •  HDP 2.2.x; HDP 2.2.x; Rolling (NEW!!!)
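The source/target/method table on this slide can be expressed as a simple lookup. This sketch encodes only the combinations the slide lists; the function names and the convention of returning None for unlisted paths are choices made for the example, not Ambari behavior.

```python
# Upgrade methods by HDP release line, exactly as listed on the slide.
UPGRADE_PATHS = {
    "2.0": {"2.0": "In Place", "2.1": "In Place", "2.2": "In Place"},
    "2.1": {"2.1": "In Place", "2.2": "In Place"},
    "2.2": {"2.2": "Rolling"},
}


def release_line(version):
    """Reduce a version such as '2.2.4.2' to its release line '2.2'."""
    return ".".join(version.split(".")[:2])


def upgrade_method(source, target):
    """Return the listed upgrade method, or None if the slide lists no path."""
    return UPGRADE_PATHS.get(release_line(source), {}).get(release_line(target))


if __name__ == "__main__":
    print(upgrade_method("2.2.0.0", "2.2.4.2"))
    print(upgrade_method("2.1.3", "2.2.0"))
    print(upgrade_method("2.2.0", "2.1.0"))
```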
  • 34. Rolling Upgrade Process
      •  Stages: Prerequisites, Prepare, Rolling Upgrade, Finalize
      •  Rolling Downgrade
      •  Rollback: NOT Rolling. Shutdown all services.
  • 35. Process: Rolling Upgrade
      •  HDP has a certified process for Rolling Upgrades
      •  Services are switched over to new version in rolling fashion
      Sequence: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume, Finalize
      Core: HDFS, YARN, HBase
      Clients: HDFS, YARN, MR, Tez, HBase, Pig, Hive
  • 36. Process: Rolling Downgrade
      Sequence: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume
      Outcomes: Downgrade, Finalize
  • 37. Wizard Driven Experience
      Steps: Register, Install, Perform Upgrade, Finalize
      With verification and validation
  • 38. Process: Service Disruption by Component
      Component: Service Disruption
      •  Zookeeper: No Service Disruption
      •  Ranger: No Service Disruption
      •  HDFS: No Service Disruption
      •  YARN: No Service Disruption
      •  HBase: No Service Disruption
      •  Hive: No Service Disruption
      •  Oozie: No Service Disruption
      •  Falcon: Yes – Requires Stop/Start
      •  Kafka: Yes – Requires Stop/Start
      •  Knox: Yes – Requires Stop/Start
      •  Storm: Yes – Requires Stop/Start
      •  Flume: No Service Disruption
      •  Slider applications: Yes – Requires Stop/Start
      •  Hue: Yes – Requires Stop/Start
      •  Accumulo: Yes – Requires Stop/Start
  • 39. Ambari Extensibility: Stacks, Blueprints and Views
  • 40. Extensibility Features
      Stacks
      •  To add new Services (ISV or otherwise) beyond HDP Stack
      •  To customize a Stack for customer specific environments
      Blueprints
      •  To use Ambari for automating cluster installations
      •  To share best practices on layout and cluster configuration
      Views
      •  To extend and customize the Ambari Web UI
      •  Add new capabilities, customize existing capabilities
      Goal: Extend Ambari without hard-coding in Ambari
  • 41. New in Ambari 2.0 - Blueprints Add Host
      •  Add hosts to a cluster based on a host group from a Blueprint
      •  Add one or more hosts with a single call
      Single host:
         POST /api/v1/clusters/MyCluster/hosts
         {
           "blueprint" : "myblueprint",
           "host_group" : "workers",
           "host_name" : "c6403.ambari.apache.org"
         }
      Multiple hosts:
         POST /api/v1/clusters/MyCluster/hosts
         [
           {
             "blueprint" : "myblueprint",
             "host_group" : "workers",
             "host_name" : "c6403.ambari.apache.org"
           },
           {
             "blueprint" : "myblueprint",
             "host_group" : "workers",
             "host_name" : "c6403.ambari.apache.org"
           }
         ]
      https://issues.apache.org/jira/browse/AMBARI-8458
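The request bodies on this slide can be generated rather than typed by hand. The sketch below builds the JSON for one or several hosts in the shape the slide shows; the function name is hypothetical, the second host name c6404.ambari.apache.org is an assumed example value, and sending the POST is left to whatever HTTP client is in use.

```python
import json


def add_hosts_payload(blueprint, host_group, host_names):
    """Build the body for POST /api/v1/clusters/<cluster>/hosts.

    One host yields a single JSON object, several hosts yield a JSON array,
    mirroring the two request shapes shown on the slide.
    """
    if not host_names:
        raise ValueError("at least one host name is required")
    entries = [
        {"blueprint": blueprint, "host_group": host_group, "host_name": name}
        for name in host_names
    ]
    body = entries[0] if len(entries) == 1 else entries
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # c6404 is an assumed second host, used here only for illustration.
    hosts = ["c6403.ambari.apache.org", "c6404.ambari.apache.org"]
    print(add_hosts_payload("myblueprint", "workers", hosts))
```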
  • 42. More on Ambari Extensibility
      http://hortonworks.com/partners/learn/#ambari
  • 43. Simplified Kerberos Setup
  • 44. Quick Kerberos Overview
      REALM
      •  EXAMPLE.COM
      Principals (Humans)
      •  paul@EXAMPLE.COM
      Service Principals (Services)
      •  hbase@EXAMPLE.COM
      •  hbase/r2u3s1.example.local@EXAMPLE.COM
      Tickets
      •  “paul@EXAMPLE.COM is authenticated and can access the HBASE service”
      KDC – Key Distribution Center
      •  Grants tickets to authenticated users
      Client
      •  r1u2m1.example.com (.example.com maps to realm EXAMPLE.COM)
      •  EXAMPLE.COM’s KDC is hosted on r1u2m3.example.com
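The principal names on this slide share one layout: a primary name, an optional instance after "/", and a realm after "@". A minimal parsing sketch follows; it is a simplification that ignores the escaping rules of real Kerberos principal names, and the type and function names are invented for the example.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # user or service name, e.g. "paul" or "hbase"
    instance: Optional[str]  # host part of a service principal, if present
    realm: str               # Kerberos realm, e.g. "EXAMPLE.COM"


def parse_principal(text):
    """Split 'primary[/instance]@REALM' into its parts (no escape handling)."""
    name, separator, realm = text.rpartition("@")
    if not separator or not name or not realm:
        raise ValueError(f"not a principal of the form name@REALM: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


if __name__ == "__main__":
    for example in ("paul@EXAMPLE.COM", "hbase/r2u3s1.example.local@EXAMPLE.COM"):
        print(parse_principal(example))
```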
  • 45. KDC Implementation Options
      •  Microsoft Active Directory
         –  Users
         –  Service Principals
      •  MIT Kerberos
         –  Users
         –  Service Principals
      •  MIT Kerberos + Microsoft Active Directory (Trust Relationship)
         –  Users in Active Directory
         –  Service Principals in MIT
  • 46. What Ambari 2.0 will do
      •  Step-by-step wizard to set up Kerberos
      •  Supports existing MIT KDC and Active Directory (AD) infrastructure
      •  Deploys and manages Kerberos clients and configuration
      •  First Time Setup as well as New Service/Host/Component
      •  Automated creation of principals
      •  Automated generation of keytabs
      •  Automated distribution of keytabs
      •  Support for regeneration and distribution of keytabs
  • 47. Prerequisites
      Category: Requirements
      General
      •  Ambari Server must be part of cluster
      •  Ambari Server and all hosts must have JCE installed
      •  Ambari Server and all hosts must have network access to the KDC
      KDC Admin
      •  KDC admin account credentials are on-hand
      •  !!! Ambari does not retain KDC admin credentials !!!
      Active Directory
      •  Secure LDAP (LDAPS) connectivity has been configured
      •  User container for principals has been created and is on-hand
      •  Admin account has delegated control of “Create, delete and manage user accounts”
  • 48. Terminology
      Term: Definition
      •  Service Principals: Principals required for HDP Service Components.
      •  Ambari Principals: Headless principals used by Ambari to perform “smoke tests” and “health alert checks”.
      •  KDC Admin Account: An administrative account that will be used by Ambari to create principals and generate keytabs in KDC.
  • 49. Principal and Keytab Generation and Distribution
      1.  User provides KDC Admin Account credentials to Ambari
      2.  Ambari connects to KDC, creates principals (Service and Ambari) needed for cluster
      3.  Ambari generates keytabs for the principals
      4.  Ambari distributes keytabs to Ambari Server and cluster hosts
      5.  Ambari discards the KDC Admin Account credentials
      Diagram: Ambari Server, KDC, HDP Cluster
  • 50. Ambari + Service Keytab Files
      •  Ambari Server: Keytabs for Ambari Principals
      •  HDP Cluster Hosts: Keytabs for Service + Ambari Principals
      •  KDC: Service Principals, Ambari Principals (Ambari and Service Principals)
  • 51. Wizard Driven and Automated
  • 52. KDC and KDC Admin Information
  • 53. Customizable Principal Attributes
  • 54. Kerberos Clients
      •  Ambari installs Kerberos clients on cluster hosts
      •  Optionally, Ambari can be set not to manage the krb5.conf client config
      OS: Client
      •  RHEL/CentOS/OEL: krb5-workstation
      •  SLES 11: krb5-client
      •  Ubuntu 12: krb5-user, krb5-config
  • 55. Configure Ambari and Service Identities
  • 56. Post-Kerberos Scenarios
      Ambari does not retain KDC admin credentials
      User is prompted for KDC Admin credentials:
      •  Add/Delete Host
      •  Add Service
      •  Add/Delete Component
      •  Regenerate Keytabs
      •  Disable Kerberos
  • 57. Security Lab
  • 58. Security today in Hadoop with HDP
      Authentication: Who am I/prove it?
      •  Kerberos in native Apache Hadoop
      •  HTTP/REST API Secured with Apache Knox Gateway
      Authorization: Restrict access to explicit data
      •  HDFS, Hive and HBase, Storm and Knox
      •  Fine grain access control
      Audit: Understand who did what
      •  Centralized audit reporting
      •  Policy and access history
      Data Protection: Encrypt data at rest & in motion
      •  Wire encryption in Hadoop
      •  Orchestrated encryption with partner tools
      HDP2.1 Ranger: Centralized Security Administration
      More on Security: http://hortonworks.com/partners/learn/#secure
  • 59. Ambari 2.0 Security Lab Steps
      •  Detailed steps available at: http://bit.ly/1J4IbIs
      •  Install Ambari server and agents
      •  Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
      •  Use blueprint/API to provision a minimal Hadoop cluster with custom services
      •  Use Add Service wizard to also install Hive
      •  Configure Ambari to sync/recognize business users in OpenLDAP
      •  Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
      •  Install Ranger as Ambari service and configure it to recognize LDAP users
      •  Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption