#CLUS
Craig Hyps, Principal Engineer
BRKSEC-3699
Designing ISE for Scale & High Availability
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
Session Abstract
Cisco Identity Services Engine (ISE) delivers context-based access control for every
endpoint that connects to your network. This session will show you how to design ISE
to deliver scalable and highly available access control services for wired, wireless, and
VPN from a single campus to a global deployment.
Focus is on design guidance for distributed ISE architectures including high availability
for all ISE nodes and their services as well as strategies for survivability and fallback
during service outages. Methodologies for increasing scalability and redundancy will
be covered such as load distribution with and without load balancers, optimal profiling
design, and the use of Anycast.
Attendees of this session will gain knowledge on how to best deploy ISE to ensure
peak operational performance, stability, and to support large volumes of authentication
activity. Various deployment architectures will be discussed including ISE platform
selection, sizing, and network placement.
ISE Sessions @Live Orlando 2018
• TECSEC-2672: Identity Services Engine 2.4 Best Practices. Jesse Dubois, Eugene Korneychuk, Kevin Redmon, Vivek Santuka. Monday 9:00-6:00
• BRKSEC-2059: Deploying ISE in a Dynamic Environment. Clark Gambrel. Monday 1:30-3:30
• BRKSEC-3697: Advanced ISE Services, Tips & Tricks. Craig Hyps. Wednesday 8:00-10:00
• BRKCOC-2018: Inside Cisco IT: How Cisco Deployed ISE and Group Based Policies throughout the Enterprise. Raj Kumar, David Iacobacci. Wednesday 8:30-10:00
• BRKSEC-2464: Let's get practical with your network security by using Cisco ISE. Imran Bashir. Wednesday 10:30-12:00
• BRKSEC-2695: Building an Enterprise Access Control Architecture using ISE and Group Based Policies. Imran Bashir. Wednesday 1:30-3:30
• BRKSEC-3699 (you are here): Designing ISE for Scale & High Availability. Craig Hyps. Thursday 8:00-10:00
• BRKSEC-2038: Security for the Manufacturing Floor - The New Frontier. Shaun Muller. Thursday 10:30-12:00
• BRKSEC-2039: Cisco Medical Device Segmentation. Tim Lovelace, Mark Bernard. Thursday 1:00-2:30
Important: Hidden Slide Alert
Look for the “For Your Reference” symbol in your PDFs.
There is a tremendous amount of hidden content for you to use later!
~500 +/- slides in the Session Reference PDF, available on ciscolive.com
Cisco Webex Teams
Questions?
Use Cisco Webex Teams (formerly Cisco Spark) to chat with the speaker after the session.
How
1. Find this session in the Cisco Events App
2. Click “Join the Discussion”
3. Install Webex Teams or go directly to the team space
4. Enter messages/questions in the team space
Webex Teams will be moderated by the speaker until June 18, 2018.
cs.co/ciscolivebot#BRKSEC-3699
Where can I get help after Cisco Live?
• ISE Public Community: http://cs.co/ise-community
  Questions answered by ISE TMEs and other Subject Matter Experts, the same people who support your local Cisco and Partner SEs!
• ISE Compatibility Guides: http://cs.co/ise-compatibility
• ISE Design Guides: http://cs.co/ise-guides
Courtesy of Thomas Howard
Agenda
• Sizing Deployments and Nodes
• Bandwidth and Latency
• Scaling ISE Services
  • RADIUS, Guest, Web Services, Compliance, TACACS+
  • Profiling and Database Replication
  • MnT (Optimize Logging and Noise Suppression)
• High Availability
  • Appliance Redundancy
  • Admin, MnT, and pxGrid Nodes
  • PSN Redundancy with and without Load Balancing
  • NAD Fallback and Recovery
• Monitoring Load and System Health (time permitting)
Sizing Guidance for ISE Nodes
ISE 2.4 Scaling by Deployment/Platform/Persona
Max Concurrent Session Counts by Deployment Model and Platform
• By Deployment
• By PSN
Deployment Model | Platform | Max Active Sessions per Deployment | Max # Dedicated PSNs / PXGs | Min # Nodes (no HA) / Max # Nodes (w/ HA)
Standalone (all personas on same node) | 3515 | 7,500 | 0 | 1 / 2
Standalone (all personas on same node) | 3595 | 20,000 | 0 | 1 / 2
Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3515 as PAN+MNT | 7,500 | 5 / 2* | 2 / 7
Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3595 as PAN+MNT | 20,000 | 5 / 2* | 2 / 7
Dedicated (dedicated PAN and MnT nodes) | 3595 as PAN and MNT | 500,000 | 50 / 2 | 3 / 58
Dedicated (dedicated PAN and MnT nodes) | 3595 as PAN and Large MNT | 500,000 | 50 / 4 | 3 / 58

Scaling per PSN | Platform | Max Active Sessions per PSN
Dedicated Policy nodes (max sessions gated by total deployment size) | SNS-3515 | 7,500
Dedicated Policy nodes (max sessions gated by total deployment size) | SNS-3595 | 40,000

* Each dedicated pxGrid node reduces PSN count by 1 (Medium deployment only)
Max Active Sessions != Max Endpoints; ISE 2.1+ supports 1.5M Endpoints
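As a rough illustration of how the per-PSN and per-deployment limits interact, the sketch below estimates a PSN count for a target number of concurrent sessions. The limits are copied from the table above; the function name `min_psns` and the `spare` allowance for redundancy are hypothetical and are not Cisco sizing guidance.

```python
import math

# Limits copied from the ISE 2.4 scaling table in this deck.
MAX_SESSIONS_PER_PSN = {"SNS-3515": 7_500, "SNS-3595": 40_000}
MAX_SESSIONS_PER_DEPLOYMENT = 500_000  # dedicated PAN and MnT on 3595
MAX_DEDICATED_PSNS = 50


def min_psns(peak_sessions, platform, spare=1):
    """Estimate PSNs needed for peak_sessions on one platform.

    spare is a hypothetical redundancy allowance (N+1 by default),
    not a figure taken from the slides.
    """
    if peak_sessions > MAX_SESSIONS_PER_DEPLOYMENT:
        raise ValueError("exceeds max active sessions for one deployment")
    needed = math.ceil(peak_sessions / MAX_SESSIONS_PER_PSN[platform]) + spare
    if needed > MAX_DEDICATED_PSNS:
        raise ValueError("exceeds max dedicated PSNs for one deployment")
    return needed


if __name__ == "__main__":
    print(min_psns(100_000, "SNS-3595"))
    print(min_psns(100_000, "SNS-3515"))
```

The same load needs far fewer SNS-3595 nodes than SNS-3515 nodes, and a target near the deployment maximum cannot be met with SNS-3515 PSNs alone because the 50-PSN ceiling is reached first.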
Sizing Production VMs to Physical Appliances
Summary
Appliance used for sizing comparison | CPU # Cores | CPU Clock Rate (GHz)* | Memory (GB) | Physical Disk (GB)**
SNS-3415 | 4 | 2.4 | 16 | 600
SNS-3495 | 8 | 2.4 | 32 | 600
SNS-3515 | 6 | 2.3 | 16 | 600
SNS-3595 | 8 | 2.6 | 64 | 1,200
* Minimum VM processor clock rate = 2.0GHz per core (same as OVA).
** Actual disk requirement is dependent on persona(s) deployed and other factors.
See slide on Disk Sizing.
Warning: # Cores not always = # Logical processors / vCPUs due to Hyper Threading
ISE Platform Properties
Minimum VM Resource Allocation
Minimum CPUs | Minimum RAM (GB) | Minimum Disk | Platform Profile
2 | 4 | 100GB | EVAL
4 | 4 | 200GB | IBM_SMALL_MEDIUM
4 | 4 | 200GB | IBM_LARGE
4 | 16 | 200GB | UCS_SMALL
8 | 32 | 200GB | UCS_LARGE
12 | 16 | 200GB | SNS_3515
16 | 64 | 200GB | SNS_3595
16 | 256 | 200GB | SNS_3595 <large>
• Least Common Denominator used to set platform.
• Example: 4 cores + 32GB RAM = UCS_SMALL
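The "least common denominator" rule can be sketched as picking the last table row whose CPU and RAM minimums are both met. This is only one reading of the slide: the selection logic, the function name `guess_profile`, and the decision to ignore disk size are assumptions, and ISE's actual detection may differ.

```python
# (min CPUs, min RAM in GB, profile) rows copied from the table above, in order.
PROFILES = [
    (2, 4, "EVAL"),
    (4, 4, "IBM_SMALL_MEDIUM"),
    (4, 4, "IBM_LARGE"),
    (4, 16, "UCS_SMALL"),
    (8, 32, "UCS_LARGE"),
    (12, 16, "SNS_3515"),
    (16, 64, "SNS_3595"),
    (16, 256, "SNS_3595 <large>"),
]


def guess_profile(cpus, ram_gb):
    """Return the last profile whose CPU and RAM minimums are both satisfied.

    Assumption: the scarcest resource decides the profile, so extra RAM
    cannot compensate for too few CPUs (and vice versa).
    """
    best = None
    for min_cpus, min_ram, name in PROFILES:
        if cpus >= min_cpus and ram_gb >= min_ram:
            best = name
    return best


if __name__ == "__main__":
    # The slide's example: 4 cores with 32GB RAM is held back by the CPU count.
    print(guess_profile(4, 32))
```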
Why Do I Care?
Because memory, max sessions, and other table spaces are based on Persona and Platform Profile.
ISE OVA Templates
Summary
OVA Template | # CPUs | Clock Rate (GHz) | Total CPU (MHz) | Virtual Memory (GB) | Virtual NICs | Virtual Disk Size by Target Node Type
Eval | 2 | 2.3 | 4,600 | 8 | 4 | 200GB (EVAL)
SNS3415 | 4 | 2.0 | 8,000 | 16 | 4 | 200GB (PSN/PXG); 600GB (PAN/MnT)
SNS3495 | 8 | 2.0 | 16,000 | 32 | 4 | 200GB (PSN/PXG); 600GB (PAN/MnT)
SNS3515 | 6 | 2.0 | 12,000 | 16 | 6 | 200GB (PSN/PXG); 600GB (PAN/MnT)
SNS3595 | 8 | 2.0 | 16,000 | 64 | 6 | 200GB (PSN/PXG); 1.2TB (PAN/MnT)
For 35x5 ISE VMs, HyperThreading is Mandatory (slide callouts on the # CPUs column: 12 for SNS3515, 16 for SNS3595)
CSCvh71644 - VMware OVA templates for SNS-35xx are not detected correctly…
ISE Platform Properties
Verify ISE Detects Proper VM Resource Allocation
• From CLI:
  ise-node/admin# show tech | begin PlatformProperties
• From Admin UI (ISE 2.2+):
  Operations > Reports > Diagnostics > ISE Counters > [node] (under ISE Profile column)
• Example value shown: UCS_SMALL
ISE VM Disk Storage Requirements
Minimum Disk Sizes by Persona
Persona | Disk (GB)
Standalone | 200+*
Administration Only | 200-300**
Monitoring Only | 200+*
Policy Service Only | 200
PAN + MnT | 200+*
PAN + MnT + PSN | 200+*
• Upper range sets #days MnT log retention
• Min recommended disk for MnT = 600GB
• Max hardware appliance disk size = 1.2TB
• Max virtual appliance disk size = 2TB
  CSCvb75235 - DOC ISE VM installation can't be done if disk is greater than or equals to 2048 GB or 2 TB
** Variations depend on where backups are saved or upgrade files are staged (local or repository), debug, local logging, and data retention requirements.
VM Disk Allocation
CSCvc57684 Incorrect MnT allocations if setup with VM disk resized to larger without ISO re-image
• ISE OVAs prior to ISE 2.2 are sized to 200GB. Often sufficient for PSNs or pxGrid nodes but not MnT.
• Misconception: Just get a bigger tank and ISE will grow into it!
• No auto-resize of ISE partitions when disk space is added after initial software install
• Requires re-image using .iso
• Alternatively: Start with larger OVA (ISE 2.2)
[Figure: MnT node built from the 200GB ISE OVA. Total ISE disk = 200GB; an added 400GB VM disk is accessible to the VM but not to ISE.]
MnT Node Log Storage Requirements for RADIUS
Days Retention Based on # Endpoints and Disk Size (ISE 2.2)
Days of retention by total disk space allocated to MnT node:
Total Endpoints | 200 GB | 400 GB | 600 GB | 1024 GB | 2048 GB
5,000 | 504 | 1007 | 1510 | 2577 | 5154
10,000 | 252 | 504 | 755 | 1289 | 2577
25,000 | 101 | 202 | 302 | 516 | 1031
50,000 | 51 | 101 | 151 | 258 | 516
100,000 | 26 | 51 | 76 | 129 | 258
150,000 | 17 | 34 | 51 | 86 | 172
200,000 | 13 | 26 | 38 | 65 | 129
250,000 | 11 | 21 | 31 | 52 | 104
500,000 | 6 | 11 | 16 | 26 | 52
Assumptions:
• 10+ auths/day per endpoint
• Log suppression enabled
Based on 60% allocation of MnT disk to RADIUS logging (prior to ISE 2.2, only 30% allocation)
• ISE 2.2 = 50% days increase over 2.0/2.1
• ISE 2.3 = 25-33% increase over 2.2
• ISE 2.4 = 40-60% increase over 2.2
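The retention table scales almost linearly: days grow with disk size and shrink with endpoint count. A minimal sketch that interpolates from a single anchor cell (200 GB, 5,000 endpoints, 504 days) reproduces most entries to within about 1%. The linear model and the function name `estimate_retention_days` are assumptions fitted to the table, not a published ISE formula.

```python
# Anchor cell taken from the ISE 2.2 retention table: 200 GB, 5,000 endpoints -> 504 days.
ANCHOR_DISK_GB = 200
ANCHOR_ENDPOINTS = 5_000
ANCHOR_DAYS = 504


def estimate_retention_days(disk_gb, endpoints):
    """Approximate RADIUS log retention by scaling the anchor cell linearly.

    Assumes the table's conditions (10+ auths/day per endpoint, log
    suppression enabled, 60% of MnT disk allocated to RADIUS logging).
    """
    scale = (disk_gb / ANCHOR_DISK_GB) * (ANCHOR_ENDPOINTS / endpoints)
    return round(ANCHOR_DAYS * scale)


if __name__ == "__main__":
    for disk, eps in [(200, 5_000), (600, 10_000), (2048, 500_000)]:
        print(disk, eps, estimate_retention_days(disk, eps))
```

For sizes not listed in the table this gives a starting estimate only; the per-release increases noted on the slide (ISE 2.3, 2.4) are not modeled.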
RADIUS and TACACS+
MnT Log Allocation
• Administration > System > Maintenance > Operational Data Purging
• 60% of total disk allocated to both RADIUS and TACACS+ for logging (previously fixed at 30% and 20%)
• Purge @ 80% (First In-First Out)
• Optional archive of CSV to repository
• Default retention reduced from 90 -> 30 days
[Figure: Total Log Allocation of 384 GB shared by RADIUS and T+, with purge at 80%. Example node M&T_PRIMARY: Radius 67 GB, Days 24.]
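To make the percentages concrete, the sketch below computes the log allocation and the purge threshold for a given MnT disk. It assumes the 80% purge point applies to the log allocation rather than to the whole disk, which is one reading of the slide; the function name `log_allocation_gb` is hypothetical.

```python
LOG_SHARE_PCT = 60  # share of MnT disk for RADIUS + TACACS+ logs (from the slide)
PURGE_AT_PCT = 80   # purge threshold (from the slide); assumed relative to the allocation


def log_allocation_gb(disk_gb):
    """Return (log allocation, purge threshold) in GB for an MnT disk size."""
    allocation = disk_gb * LOG_SHARE_PCT / 100
    purge_threshold = allocation * PURGE_AT_PCT / 100
    return allocation, purge_threshold


if __name__ == "__main__":
    # Under this reading, a 640 GB disk yields the 384 GB allocation shown on the slide.
    print(log_allocation_gb(640))
```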
ISE VM Disk Provisioning Guidance
• Please! No Snapshots!
  • Snapshots NOT supported; no option to quiesce database prior to snapshot.
• VMotion supported but storage motion not QA tested.
  • Recommend avoiding VMotion due to snapshot restrictions.
• Thin Provisioning supported
  • Thick Provisioning highly recommended, especially for PAN and MnT
• No specific storage media and file system restrictions.
  • For example, VMFS is not required and NFS is allowed provided storage is supported by VMware and meets ISE IO performance requirements.
IO Performance Requirements:
• Read 300+ MB/sec
• Write 50+ MB/sec
Recommended disk/controller:
• 10k RPM+ disk drives
• Supercharge with SSD!
• Caching RAID Controller
• RAID mirroring (slower writes using RAID 5*)
*RAID performance levels:
http://www.datarecovery.net/articles/raid-level-comparison.html
http://docs.oracle.com/cd/E19658-01/820-4708-13/appendixa.html
ISE VM Provisioning Guidance
• Use reservations (built into OVAs)
• Do not oversubscribe!
Customers with VMware expertise may choose to disable resource reservations and over-subscribe, but do so at their own risk.
Introducing “Super” MnT
For Any Deployment where High-Perf MnT Operations Required
• Virtual Appliance Only option in ISE 2.4
• Requires Large VM License
• 3595 specs + 256 GB
• 8 cores @ 2GHz min (16000+ MHz)
= 16 logical processors
• 256GB RAM
• Up to 2TB* disk w/ fast I/O
• Fast I/O Recommendations:
• Disk Drives (10k/15k RPM or SSD)
• Fast RAID w/Caching (ex: RAID 10)
• More disks (ex: 8 vs 4)
* CSCvb75235 - DOC ISE VM installation can't be done if disk is greater than or equals to 2048 GB or 2 TB
ISE 2.4 MnT -- Fast Access to Logs and Reports
• Live Logs / Live Sessions
• Reports
ISE 2.4 MnT Vertical Scaling Enhancements
Faster Live Log Access
• Run session directory tables from pinned memory
• Tables optimized for faster queries
Faster Report & Export Performance
• Report related tables pinned into memory for faster retrieval.
• Optimize tables based on platform capabilities.
Collector Throughput improvement
• Added Multithreaded processing capability to collector.
• Increased collector socket buffer size to avoid packet drops.
Major Data Reduction
• Remove detailed BLOB data > 7 days old (beyond 2.3 reductions)
• Database optimizations resulting in up to 80% efficiencies
Benefits MnT on ALL ISE platforms
Flash Removal (ISE 2.4)
• “No Flash”
• C’mon, you mean just a
little bit of flash, right?
• No. I’m Saying No
Flash! There is no
Flash in this product!
And no Yahoo! User Interface Library (YUI)
Bandwidth and Latency
[Figure: distributed ISE deployment with PAN, MnT, and PSN nodes; WLC and switch send RADIUS to the PSNs]
• Starting in ISE 2.1: 300ms max round-trip (RT) latency between any two ISE nodes
• Bandwidth most critical between:
  • PSNs and Primary PAN (DB Replication)
  • PSNs and MnT (Audit Logging)
• Latency most critical between PSNs and Primary PAN.
• RADIUS generally requires much less bandwidth and is more tolerant of higher latencies. Actual requirements are based on many factors including # endpoints, auth rate, and protocols.
Have I Told You My Story Over Latency Yet?
“Over Latency?” “No. I Don’t Think I’ll Ever Get Over Latency.”
• Latency guidance is not a “fall off the cliff” number, but a guard rail based on what
QA has tested.
• Not all customers have issues with > 300ms while others may have issues with <
100ms latency due to overall ISE design and deployment.
• Profiler config is primary determinant in replication requirements between PSNs and
PAN which translates to latency.
• When providing guidance, max 300ms roundtrip latency is the correct response
from SEs for their customers to design against.
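A design review can apply the 300ms guard rail mechanically to measured round-trip times between node pairs. The sketch below flags pairs that exceed it; the node names, the measurements, and the function name `links_over_guardrail` are hypothetical, and a pair under 300ms is not a guarantee of a healthy deployment, as the bullets above point out.

```python
MAX_RTT_MS = 300  # guard rail from the slide (ISE 2.1+), not a hard failure point


def links_over_guardrail(rtt_ms):
    """Return the node pairs whose measured round-trip time exceeds the guard rail.

    rtt_ms maps (node_a, node_b) tuples to a round-trip time in milliseconds.
    """
    return sorted(pair for pair, rtt in rtt_ms.items() if rtt > MAX_RTT_MS)


if __name__ == "__main__":
    # Hypothetical measurements for illustration only.
    measured = {
        ("pan-1", "psn-emea"): 140.0,
        ("pan-1", "psn-apac"): 320.5,
        ("mnt-1", "psn-apac"): 310.0,
    }
    for pair in links_over_guardrail(measured):
        print(pair)
```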
What if Distributed PSNs > 300ms RTT Latency?
[Figure: sites within < 300 ms of the central nodes and remote sites at > 300 ms]
Option #1: Deploy Separate ISE Instances
Per-Instance Latency < 300ms
[Figure: separate ISE instances, each with its own PAN, MnT, and PSN nodes; WLCs and switches send RADIUS to their own instance. Latency is < 300 ms within each instance and > 300 ms between instances.]
ISE Bandwidth Calculator – Updated for ISE 2.1+
Note: Bandwidth required for RADIUS traffic is not included. Calculator is focused on inter-ISE node bandwidth requirements.
Available to customers @ https://communities.cisco.com/docs/DOC-64317
Scaling ISE Services
Scaling ISE Services Agenda
• Auth Policy and Service Scale
• Guest and Web Authentication and Location Services
• Compliance Services: Posture and MDM
• Scaling TACACS+
• Profiling and Database Replication
• MnT (Optimize Logging and Noise Suppression)
ISE Personas and Services
Enable Only What Is Needed !!
• ISE Personas:
  • PAN
  • MNT
  • PSN
  • pxGrid
• PSN Services:
  • Session (includes base user services such as RADIUS, Guest, Posture, MDM, BYOD/CA)
  • Profiling
  • TC-NAC
  • ISE SXP
  • Device Admin (TACACS+)
  • Passive Identity (Easy Connect)
• Avoid unnecessary overload of PSN services
• Some services should be dedicated to one or more PSNs
Scaling RADIUS, Web, Profiling, and TACACS+ w/LB
• Policy Service nodes can be configured in a cluster behind a load balancer (LB).
• Access Devices send RADIUS and TACACS+ AAA requests to LB virtual IP.
[Figure: network access devices and VPN send requests to the load balancers' virtual IP, which distributes them to the PSNs (User Services)]
Load Balancing covered under the High Availability Section
Auth Policy Optimization
ISE 2.3 Bad Example
1. AD Groups
2. AD Attributes
3. MDM
4. Certificate
5. ID Group
6. SQL Attributes
7. Auth Method
8. Endpoint Profile
9. Location
• Policy Logic:
o First Match, Top Down
o Skip Rule on first negative match
• More specific rules generally at top
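The policy logic above (first match, top down, skip a rule on its first negative condition) is why condition order matters: a cheap local check placed first can spare an expensive external lookup. The sketch below models that with short-circuit evaluation. It is an illustration of the principle only; the rule names, session fields, and helper functions are hypothetical and do not represent ISE internals.

```python
def first_match(rules, session):
    """Return the name of the first rule whose conditions all hold.

    rules is a list of (name, [condition, ...]); each condition takes the
    session dict and returns a bool. all() stops at the first False, which
    mirrors "skip rule on first negative match".
    """
    for name, conditions in rules:
        if all(cond(session) for cond in conditions):
            return name
    return None


lookups = {"ad": 0}


def in_employee_group(session):
    lookups["ad"] += 1  # stands in for a costly external AD query
    return session.get("ad_group") == "Employees"


def is_wireless(session):
    return session.get("location") == "wireless"  # cheap, local attribute


RULES = [
    ("Wireless-Employees", [is_wireless, in_employee_group]),
    ("Default", []),
]

if __name__ == "__main__":
    print(first_match(RULES, {"location": "wired", "ad_group": "Employees"}))
    print("AD lookups performed:", lookups["ad"])
```

With the location check first, a wired session falls through to the default rule without any AD lookup; reversing the two conditions would trigger the lookup for every session.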
Auth Policy Optimization
ISE 2.3 Better Example!
Conditions grouped into Blocks 1-4 and evaluated in this order:
1. Location
2. Auth Method
3. Endpoint Profile
4. AD Groups
5. AD Attributes
6. ID Group
7. Certificate
8. SQL Attributes
9. MDM
ISE 2.4 Auth Policy Scale
• Max Policy Sets = 200
(up from 100 in 2.2; up from 40 in 2.1)
• Max Authentication Rules = 1000
(up from 200 in 2.2; up from 100 in 2.1)
• Max Authorization Rules = 3000
(up from 700 in 2.2; up from 600 in 2.1)
• Max Authorization Profiles = 3200
(up from 1000 in 2.2; up from 600 in 2.1)
Dynamic Variable Substitution
Rule Reduction
• Authorization Policy Conditions
• Authorization Profile Conditions ID Store Attribute
• Match conditions to unique values stored per-
User/Endpoint in internal or external ID stores
(AD, LDAP, SQL, etc)
• ISE supports custom User and Endpoint attributes
Enable EAP Session Resume / Fast Reconnect
Major performance boost, but not complete auth so avoid excessive timeout value
Skip inner method
Cache TLS session
Cache TLS (TLS Handshake Only/Skip Cert)
Note: Both Server
and Client must
be configured for
Fast Reconnect
Win 7 Supplicant
ISE Stateless Session Resume
Allows Session Resume Across All PSNs
• Session ticket extension per RFC 5077
[Transport Layer Security (TLS) Session Resumption without Server-Side State]
• ISE issues TLS client a session ticket that can be presented to any PSN to
shortcut reauth process (Default = Disabled)
Time until session
ticket expires
Policy > Policy Elements > Results > Authentication > Allowed Protocols
Allows resume with
Load Balancers
Scaling Guest and Web Authentication Services
Scaling Global Sponsor / MyDevices
Anycast Example
DNS SERVER: DOMAIN = COMPANY.COM
SPONSOR 10.1.0.100
MYDEVICES 10.1.0.101
ISE-PSN-1 10.1.1.1
ISE-PSN-2 10.1.1.2
ISE-PSN-3 10.1.1.3
ISE-PSN-4 10.2.1.4
ISE-PSN-5 10.2.1.5
ISE-PSN-6 10.2.1.6
ISE-PSN-7 10.3.1.7
ISE-PSN-8 10.3.1.8
ISE-PSN-9 10.3.1.9
Use Global Load Balancer or Anycast (example shown)
to direct traffic to closest VIP. Web Load-balancing
distributes request to single PSN.
Load Balancing also helps to scale Web Portal Services
[Figure: DNS servers resolve the portal names; the same address 10.1.0.100 appears at three sites]
Scaling Guest Authentications Using 802.1X
“Activated Guest” allows guest accounts to be used without ISE web auth portal
• Guests auth with 802.1X using EAP methods like PEAP-MSCHAPv2 / EAP-GTC
• 802.1X auth performance generally much higher than web auth
Note: AUP and Password Change cannot be enforced since guest bypasses portal flow.
Warning: Watch for expired guest accounts, else high # auth failures!
Scaling Web Auth
“Remember Me” Guest Flows
• User logs in to Hotspot/CWA portal and MAC address auto-registered into
GuestEndpoint group
• AuthZ Policy for GuestEndpoints ID Group grants access until device purged
New in ISE 2.4
Work Centers > Guest Access > Settings > Logging
Scaling Posture & MDM
Posture Lease
Once Compliant, user may leave/reconnect multiple times before re-posture
MDM Scalability and Survivability
What Happens When the MDM Server is Unreachable?
• Scalability ≈ 30 Calls per second per PSN.
• Cloud-Based deployment typically built for scale and redundancy
• For cloud-based solutions, Internet bandwidth and latency must be considered.
• Premise-Based deployment may leverage load balancing
• ISE 1.4+ supports multiple MDM servers – could be same or different vendors.
• Authorization permissions can be set based on MDM connectivity status:
• MDM:MDMServerReachable Equals UnReachable
MDM:MDMServerReachable Equals Reachable
• All attributes retrieved & reachability determined by single API call on each new session.
Scaling MDM
Prepopulate MDM Enrollment and/or Compliance via ERS API
<groupId>groupId</groupId>
<identityStore>identityStore</identityStore>
<identityStoreId>identityStoreId</identityStoreId>
<mac>00:01:02:03:04:05</mac>
<mdmComplianceStatus>false</mdmComplianceStatus>
<mdmEncrypted>false</mdmEncrypted>
<mdmEnrolled>true</mdmEnrolled>
<mdmIMEI>IMEI</mdmIMEI>
<mdmJailBroken>false</mdmJailBroken>
<mdmManufacturer>Apple Inc.</mdmManufacturer>
<mdmModel>iPad</mdmModel>
<mdmOS>iOS</mdmOS>
<mdmPhoneNumber>Phone Number</mdmPhoneNumber>
<mdmPinlock>true</mdmPinlock>
<mdmReachable>true</mdmReachable>
<mdmSerial>AB23D0E45BC01</mdmSerial>
<mdmServerName>AirWatch</mdmServerName>
<portalUser>portalUser</portalUser>
<profileId>profileId</profileId>
<staticGroupAssignment>true</staticGroupAssignment>
<staticProfileAssignment>false</staticProfileAssignment>
<customAttributes>
<customAttributes>
<entry>
<key>MDM_Registered</key>
<value>true</value>
</entry>
<entry>
<key>MDM_Compliance</key>
<value>false</value>
</entry>
<entry>
<key>Attribute_XYZ</key>
<value>Value_XYZ</value>
</entry>
</customAttributes>
</customAttributes>
ISE 2.4 adds support for managing
MDM Attributes via ERS API
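A payload like the one above can be generated per device rather than written by hand. The sketch below builds the MDM fields and the nested customAttributes block with Python's standard `xml.etree.ElementTree`. The wrapper element name `endpoint`, the function name `build_endpoint_xml`, and the sample values are assumptions for illustration; the slide shows only the inner fields, so check the ERS documentation for the exact root element, namespace, and request URL before sending anything to ISE.

```python
import xml.etree.ElementTree as ET


def build_endpoint_xml(mac, mdm_fields, custom_attributes):
    """Build an endpoint XML fragment with MDM fields and custom attributes.

    The root tag "endpoint" is a placeholder (assumption); the field names
    come from the slide. Booleans are written as lowercase true/false.
    """
    root = ET.Element("endpoint")
    ET.SubElement(root, "mac").text = mac
    for tag, value in mdm_fields.items():
        text = str(value).lower() if isinstance(value, bool) else str(value)
        ET.SubElement(root, tag).text = text
    # The slide shows customAttributes nested inside customAttributes.
    outer = ET.SubElement(root, "customAttributes")
    inner = ET.SubElement(outer, "customAttributes")
    for key, value in custom_attributes.items():
        entry = ET.SubElement(inner, "entry")
        ET.SubElement(entry, "key").text = key
        ET.SubElement(entry, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_text = build_endpoint_xml(
        "00:01:02:03:04:05",
        {"mdmEnrolled": True, "mdmComplianceStatus": False, "mdmServerName": "AirWatch"},
        {"MDM_Registered": "true", "MDM_Compliance": "false"},
    )
    print(xml_text)
```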
TACACS+ Scaling
Options for Deploying Device Admin
https://communities.cisco.com/docs/DOC-63930
Priorities according to Policy and Business Goals
Deployment options: Separate Deployment (TACACS+ only) | Separate PSNs (RADIUS PSNs and TACACS+ PSNs) | Mixed PSNs (RADIUS/TACACS+ on the same PSNs)
• Separation of Configuration/Duty
  • Yes: Specialization for TACACS+
  • No: Shared resources/Reduced $$
• Independent Scaling of Services
  • Yes: Scale as needed/No impact on Device Admin from RADIUS services
  • No: Avoid underutilized PSNs
• Suitable for high-volume Device Admin
  • Yes: Services dedicated to TACACS+
  • No: Focus on “human” device admins
• Separation of Logging Store
  • Yes: Optimize log retention VM
  • No: Centralized monitoring
ISE 2.4 TACACS+ Multi-Service Scaling (RADIUS and T+)
Max Concurrent RADIUS + TACACS+ TPS by Deployment Model and Platform
• By Deployment
• By PSN
Deployment Model | Platform | Max # Dedicated PSNs | Max RADIUS Sessions per Deployment | Max TACACS+ TPS per Deployment
Standalone (all personas on same node) | 3515 | 0 | 7,500 | 100
Standalone (all personas on same node) | 3595 | 0 | 20,000 | 100
Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3515 as PAN+MNT | * 5 / 3+2 | 7,500 | 250 / 2,000
Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3595 as PAN+MNT | * 5 / 3+2 | 20,000 | 250 / 3,000
Dedicated (each persona on dedicated node) | 3595 as PAN and MNT | * 50 / 47+3 | 500,000 | 2,500 / 4,000
Dedicated (each persona on dedicated node) | 3595 as PAN and Large MNT | * 50 / 47+3 | 500,000 | 2,500 / 6,000

Scaling per PSN | Platform | Max RADIUS Sessions per PSN | Max TACACS+ TPS per PSN
Dedicated Policy nodes (max sessions gated by total deployment size) | SNS-3515 | 7,500 | 2,000
Dedicated Policy nodes (max sessions gated by total deployment size) | SNS-3595 | 40,000 | 3,000

Each dedicated T+ PSN node reduces dedicated RADIUS PSN count by 1
* Device Admin service enabled on same PSNs also used for RADIUS OR Split RADIUS and T+ PSNs
ISE 2.4 TACACS+ Multi-Service Scaling (TACACS+ Only)
Max Concurrent TACACS+ TPS by Deployment Model and Platform
• By Deployment
• By PSN
Deployment Model | Platform | Max # Dedicated PSNs | Max RADIUS Sessions per Deployment | Max TACACS+ TPS per Deployment
Standalone (all personas on same node) | 3515 | 0 | N/A | 1,000
Standalone (all personas on same node) | 3595 | 0 | N/A | 1,500
Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3515 as PAN+MNT | * 5 / 2 | N/A | 2,000 / 2,000
Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3595 as PAN+MNT | * 5 / 2 | N/A | 3,000 / 3,000
Dedicated (each persona on dedicated node) | 3595 as PAN and MNT | * 50 / 4 | N/A | 5,000 / 5,000
Dedicated (each persona on dedicated node) | 3595 as PAN and Large MnT | * 50 / 5 | N/A | 10,000 / 10,000

Scaling per PSN | Platform | Max RADIUS Sessions per PSN | Max TACACS+ TPS per PSN
Dedicated Policy nodes (max sessions gated by total deployment size) | SNS-3515 | 7,500 | 2,000
Dedicated Policy nodes (max sessions gated by total deployment size) | SNS-3595 | 40,000 | 3,000

* Device Admin service can be enabled on each PSN; minimally 2 for redundancy.
** Max log capacity for MNT
TACACS+ MnT Scaling
Human Versus Automated Device Administration
• Consider the “average” size syslog from TACACS+ based on following guidance:
• “Human” Device Admin Example:
• For a normal “human” session we may expect to see 10 commands, so a session would be
approximately: [5kB + (10 * 3kB)] = 35kB. Suppose a maximum of 50 such sessions per admin per
day from 50 admins (and few organizations have > 50 admins)
• 50 human admins would generate < 1 TPS average, ~60k logs/day, or ~90MB/day.
• Automated/Script Device Admin Example:
• Consider a script that runs 4 times a day against 30,000 devices, (for example, to backup config on
all devices). Generally the interaction will be short, say 5 commands:
• Storage = 30,000 * 4 * [5kB + (5 * 3kB)] = ~2.4 GB/day
• Total TPS = 30k * 4 * [3 + (5 * 2)] = 1.56M logs = 18 TPS average; 1300 TPS peak.
Each TACACS+ Session | Size
Authentication | 2kB
Session authorization | 2kB
Session accounting | 1kB
Each Command Authorization (per session) | Size
Command authorization | 2kB
Command accounting | 1kB
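The arithmetic in the two examples above can be reproduced with a short calculation. The per-log sizes and counts come from this slide, and the 5-minute burst used for the peak figure is the assumption stated with this deck's TPS table; the function name `tacacs_load` and its parameter names are hypothetical.

```python
SESSION_LOGS = 3         # authentication, session authorization, session accounting
SESSION_KB = 5           # 2kB + 2kB + 1kB
COMMAND_LOGS = 2         # command authorization + command accounting
COMMAND_KB = 3           # 2kB + 1kB
SECONDS_PER_DAY = 86_400
BURST_SECONDS = 300      # each batch assumed to complete in a 5-minute burst


def tacacs_load(sources, sessions_per_day, commands_per_session):
    """Return (logs/day, storage kB/day, average TPS, peak TPS).

    sources is the number of admins or NADs opening sessions; peak TPS
    spreads one batch (one session per source) over the burst window.
    """
    sessions = sources * sessions_per_day
    logs = sessions * (SESSION_LOGS + commands_per_session * COMMAND_LOGS)
    storage_kb = sessions * (SESSION_KB + commands_per_session * COMMAND_KB)
    avg_tps = logs / SECONDS_PER_DAY
    peak_tps = logs / sessions_per_day / BURST_SECONDS
    return logs, storage_kb, avg_tps, peak_tps


if __name__ == "__main__":
    # Script example from the slide: 30,000 devices, 4 runs/day, 5 commands each.
    print(tacacs_load(30_000, 4, 5))
    # Human example from the slide: 50 admins, 50 sessions/day, 10 commands each.
    print(tacacs_load(50, 50, 10))
```

The script case yields 1.56M logs and about 2.4 GB per day, averaging roughly 18 TPS with a 1,300 TPS peak; the human case yields 57,500 logs and about 87.5 MB per day, which the slide rounds to ~60k logs and ~90MB.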
TACACS+ Multi-Service Scaling
Required TACACS+ TPS by # Admins and # NADs
Three scenarios, each reported as Avg TPS | Peak TPS | Logs/Day | Storage/day:
(A) Session Authentication and Accounting Only
(B) Command Accounting Only (10 Commands / Session)
(C) Command Authorization + Acctg (10 Commands / Session)

# Admins (human admin), based on 50 admin sessions per day
# Admins | A: Avg TPS | Peak TPS | Logs/Day | Storage/day | B: Avg TPS | Peak TPS | Logs/Day | Storage/day | C: Avg TPS | Peak TPS | Logs/Day | Storage/day
1 | < 1 | < 1 | 150 | < 1MB | < 1 | < 1 | 650 | 1MB | < 1 | < 1 | 1.2k | 2MB
5 | < 1 | < 1 | 750 | 1MB | < 1 | < 1 | 3.3k | 4MB | < 1 | < 1 | 5.8k | 9MB
10 | < 1 | < 1 | 1.5k | 3MB | < 1 | < 1 | 6.5k | 8MB | < 1 | 1 | 11.5k | 17MB
25 | < 1 | < 1 | 3.8k | 7MB | < 1 | 1 | 16.3k | 19MB | < 1 | 2 | 28.8k | 43MB
50 | < 1 | 1 | 7.5k | 13MB | < 1 | 2 | 32.5k | 37MB | 1 | 4 | 57.5k | 86MB
100 | < 1 | 1 | 15k | 25MB | 1 | 4 | 65k | 73MB | 2 | 8 | 115k | 171MB

# NADs (script admin), based on 4 scripted sessions per day
# NADs | A: Avg TPS | Peak TPS | Logs/Day | Storage/day | B: Avg TPS | Peak TPS | Logs/Day | Storage/day | C: Avg TPS | Peak TPS | Logs/Day | Storage/day
500 | < 1 | 5 | 6k | 10MB | < 1 | 22 | 26k | 30MB | 1 | 38 | 46k | 70MB
1,000 | < 1 | 10 | 12k | 20MB | 1 | 43 | 52k | 60MB | 1 | 77 | 92k | 140MB
5,000 | < 1 | 50 | 60k | 100MB | 3 | 217 | 260k | 300MB | 5 | 383 | 460k | 700MB
10,000 | 1 | 100 | 120k | 200MB | 6 | 433 | 520k | 600MB | 11 | 767 | 920k | 1.4GB
20,000 | 3 | 200 | 240k | 400MB | 12 | 867 | 1.04M | 1.2GB | 21 | 1.5k | 1.84M | 2.7GB
30,000 | 5 | 300 | 480k | 600MB | 18 | 1.3k | 1.56M | 1.7GB | 32 | 2.3k | 2.76M | 4.0GB
50,000 | 7 | 500 | 600k | 1GB | 30 | 2.2k | 2.6M | 2.9GB | 53 | 3.8k | 4.6M | 6.7GB

Peak values based on 5-minute burst to complete each batch request.
Single Connect Mode
Scaling TACACS+ for High-Volume NADs
• Multiplexes T+ requests over single TCP connection
• All T+ requests between NAD and ISE occur over single connection rather than separate
connections for each request.
• Recommended for TACACS+ “Top Talkers”
• Note: TCP sockets locked to NADs, so limit use to NADs with highest activity.
Administration > Network Resources > Network Devices > (NAD)
Internal User Cache for T+ Authorization
Scaling TACACS+ for High-Volume Admin Users
First authorization caches
1) User Name
2) User Specific Attributes (Ex: Group ID,
custom attributes)
Successive requests served from cache
Default = 0 <<Cache Disabled>>
Global Setting for Single Connect
Mode (enabled by default)
Scaling Profiling and
Database Replication
Endpoint Attribute Filter and Whitelist Attributes
Reduces Data Collection and Replication to Subset of Profile-Specific Attributes
• Endpoint Attribute Filter – aka “Whitelist filter”
• Disabled by default. If enabled, only these attributes are collected or replicated.
• Whitelist Filter limits profile attribute collection to those required to support
default (Cisco-provided) profiles and critical RADIUS operations.
• Filter must be disabled to collect and/or replicate other attributes.
• Attributes used in custom conditions are automatically added to whitelist.
Administration > System Settings > Profiling
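As a rough illustration of what the filter does to an incoming profiling update, the sketch below keeps only whitelisted attribute names when the filter is enabled. The whitelist here is a small subset of the attributes listed in this deck, and the function name and sample values are hypothetical.

```python
# Small illustrative subset of the whitelist attributes listed in this deck.
WHITELIST_SUBSET = {"MACAddress", "OUI", "host-name", "dhcp-class-identifier", "User-Agent"}

def apply_attribute_filter(attributes, filter_enabled, whitelist=WHITELIST_SUBSET):
    """Return the attributes that would be kept for collection/replication."""
    if not filter_enabled:
        return dict(attributes)            # filter disabled: everything is kept
    return {name: value for name, value in attributes.items() if name in whitelist}

update = {
    "MACAddress": "00:11:22:33:44:55",
    "host-name": "lab-printer",
    "dhcp-parameter-request-list": "1, 3, 6, 15",
    "hops": "0",
}
print(sorted(apply_attribute_filter(update, filter_enabled=True)))
# ['MACAddress', 'host-name']
```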
Whitelist Attributes vs Significant Attributes
Sampling of All Endpoint Attributes
PolicyVersion
OUI
EndPointMACAddress
MatchedPolicy
EndPointMatchedProfile
EndPointPolicy
Total Certainty Factor
EndPointProfilerServer
EndPointSource
StaticAssignment
StaticGroupAssignment
UpdateTime
Description
IdentityGroup
ElapsedDays
InactiveDays
NetworkDeviceGroups
Location
Device Type
IdentityAccessRestricted
IdentityStoreName
ADDomain
AuthState
ISEPolicySetName
IdentityPolicyMatchedRule
AllowedProtocolMatchedRule
SelectedAccessService
SelectedAuthenticationIdentityStores
AuthenticationIdentityStore
AuthenticationMethod
AuthorizationPolicyMatchedRule
SelectedAuthorizationProfiles
CPMSessionID
AAA-Server
OriginalUserName
DetailedInfo
EapAuthentication
NasRetransmissionTimeout
TotalFailedAttempts
TotalFailedTime
UseCase
UserType
GroupsOrAttributesProcessFailure
ExternalGroups
Called-Station-ID
Calling-Station-ID
DestinationIPAddress
DestinationPort
Device IP Address
MACAddress
MessageCode
NADAddress
NAS-IP-Address
NAS-Port
NAS-Port-Id
NAS-Port-Type
NetworkDeviceName
RequestLatency
Service-Type
Timestamp
User-Name
Egress-VLANID
Egress-VLAN-Name
Airespace-Wlan-Id
Device Port
EapTunnel
Framed-IP-Address
NAS-Identifier
RadiusPacketType
Vlan
VlanName
cafSessionAuthUserName
cafSessionAuthVlan
cafSessionAuthorizedBy
cafSessionDomain
cafSessionStatus
dot1dBasePort
dot1xAuthAuthControlledPortControl
dot1xAuthAuthControlledPortStatus
dot1xAuthSessionUserName
ifDescr
ifIndex
ifOperStatus
cdpCacheAddress
cdpCacheCapabilities
cdpCacheDeviceId
cdpCachePlatform
cdpCacheVersion
lldpSystemDescription
lldpSystemName
lldpCapabilitiesMapSupported
lldpChassisId
cLApIfMacAddress
cLApName
cLApNameServerAddress
cLApSshEnable
cLApTelnetEnable
cLApTertiaryControllerAddress
cLApTertiaryControllerAddressType
cLApUpTime
cLApWipsEnable
cldcAssociationMode
cldcClientAccessVLAN
cldcClientIPAddress
cldcClientStatus
BYODRegistration
DeviceRegistrationStatus
PortalUser
AUPAccepted
LastAUPAcceptanceHours
PostureAssessmentStatus
FQDN
OpenSSLErrorMessage
OpenSSLErrorStack
User-Agent
attribute-52
attribute-53
chaddr
ciaddr
client-fqdn
host-name
domain-name
dhcp-class-identifier
dhcp-client-identifier
dhcp-message-type
dhcp-parameter-request-list
dhcp-requested-address
dhcp-user-class-id
dhcp-vendor-class
flags
giaddr
hlen
hops
htype
ip
op
secs
yiaddr
sysName
sysDescr
sysContact
sysLocation
hrDeviceDescr
LastNmapScanTime
NmapScanCount
operating-system
CLASS_ID
DIRECTION
DST_MASK
FIRST_SWITCHED
FLOW_SAMPLER_ID
FragmentOffset
INPUT_SNMP
IN_BYTES
IN_PKTS
OUT_BYTES
IPV4_DST_ADDR
IPV4_NEXT_HOP
IPV4_SRC_ADDR
IPV4_IDENT
L4_DST_PORT
L4_SRC_PORT
LAST_SWITCHED
OUTPUT_SNMP
PROTOCOL
SRC_MASK
SRC_TOS
TCP_FLAGS
SRC_VLAN
DST_VLAN
IN_SRC_MAC
OUT_DST_MAC
MAX_TTL
MIN_TTL
dst_as
src_as
count
flow_sequence
source_id
sys_uptime
unix_secs
version
MDMServerName
MDMUdid
MDMImei
MDMMeid
MDMManufacturer
MDMModel
MDMOSVersion
MDMPhoneNumber
MDMSerialNumber
MDMCompliant
MDMJailBroken
MDMPinLockSet
MDMDiskEncrypted
h323DeviceName
h323DeviceVendor
h323DeviceVersion
mdns_VSM_class_identifier
mdns_VSM_srv_identifier
mdns_VSM_txt_identifier
sipDeviceName
sipDeviceVendor
sipDeviceVersion
device-platform
device-platform-version
device-type
AD-Host-Exists
AD-Join-Point
AD-Operating-System
AD-OS-Version
AD-Service-Pack
iotAssetDeviceType
iotAssetProductCode
iotAssetProductName
iotAssetRetrievedFrom
iotAssetSerialNumber
iotAssetTrustLevel
iotAssetVendorID
80-tcp
110-tcp
135-tcp
139-tcp
143-tcp
443-tcp
445-tcp
515-tcp
3306-tcp
3389-tcp
5900-tcp
8080-tcp
9100-tcp
53-udp
67-udp
68-udp
123-udp
135-udp
137-udp
138-udp
139-udp
Whitelist Attributes
161-udp
AAA-Server
AC_User_Agent
AUPAccepted
BYODRegistration
CacheUpdateTime
Calling-Station-ID
cdpCacheAddress
cdpCacheCapabilities
cdpCacheDeviceId
cdpCachePlatform
cdpCacheVersion
Certificate Expiration Date
Certificate Issue Date
Certificate Issuer Name
Certificate Serial Number
ciaddr
CreateTime
Description
DestinationIPAddress
Device Identifier
Device Name
DeviceRegistrationStatus
dhcp-class-identifier
dhcp-requested-address
EndPointPolicy
EndPointPolicyID
EndPointProfilerServer
EndPointSource
FirstCollection
FQDN
Framed-IP-Address
host-name
hrDeviceDescr
IdentityGroup
IdentityGroupID
IdentityStoreGUID
IdentityStoreName
ifIndex
ip
L4_DST_PORT
LastNmapScanTime
lldpCacheCapabilities
lldpCapabilitiesMapSupported
lldpSystemDescription
MACAddress
MatchedPolicy
MatchedPolicyID
MDMCompliant
MDMCompliantFailureReason
MDMDiskEncrypted
MDMEnrolled
MDMImei
MDMJailBroken
MDMManufacturer
MDMModel
MDMOSVersion
MDMPhoneNumber
MDMPinLockSet
MDMProvider
MDMSerialNumber
MDMServerReachable
MDMUpdateTime
NADAddress
NAS-IP-Address
NAS-Port-Id
NAS-Port-Type
NmapScanCount
NmapSubnetScanID
operating-system
OS Version
OUI
PhoneID
PhoneIDType
PolicyVersion
PortalUser
PostureApplicable
PreviousDeviceRegistrationStatus
ProductRegistrationTimeStamp
StaticAssignment
StaticGroupAssignment
sysDescr
TimeToProfile
Total Certainty Factor
UpdateTime
User-Agent
Significant Attributes
MACADDRESS
MATCHEDVALUE
ENDPOINTPOLICY
ENDPOINTPOLICYVERSION
STATICASSIGNMENT
STATICGROUPASSIGNMENT
NMAPSUBNETSCANID
PORTALUSER
DEVICEREGISTRATIONSTATUS
Inter-Node Communications
JGroup Connections – Global Cluster
• All Secondary nodes* establish
connection to Primary PAN (JGroup
Controller) over tunneled connection
(TCP/12001) for config/database
sync.
• Secondary Admin also listens on
TCP/12001 but no connection
established unless primary
fails/secondary promoted
• All Secondary nodes participate in
the Global JGroup cluster.
*Secondary node = All nodes
except Primary Admin node;
includes PSNs, MnT, pxGrid, and
Secondary Admin nodes
TCP/12001 JGroups Tunneled
GLOBAL
JGROUP
CONTROLLER
Admin (P) Admin (S)
MnT (S)
MnT (P)
PSN1 PSN2
PSN3
Inter-Node Communications
Local JGroups and Node Groups
• Node Groups can be used to define
local JGroup* clusters where
members exchange heartbeat and
sync profile data over SSL (TLS v1.2).
LOCAL
JGROUP
CONTROLLER NODE GROUP A
(JGROUP A)
*JGroups: Java toolkit for reliable multicast
communications between group/cluster members.
TCP/7800 JGroup Peer Communication
JGroup Failure Detection
TCP/12001 JGroups Tunneled
GLOBAL
JGROUP
CONTROLLER
Fetch Attributes
Change
Ownership
PSN1 is current endpoint owner
– no database replication even if
whitelist attribute changes
DHCP
Update
t=0
DHCP
Update
t=1
• PSN claims endpoint ownership only
if change in whitelist attribute;
triggers ownership update to local
PSNs. Whitelist check always occurs
regardless of global whitelist filter.
PSN2 gets more current update
for same endpoint and takes
ownership – fetches all
attributes from PSN1
• Replication to PAN occurs if
significant attribute changes, then
sync all attributes via PAN; if whitelist
filter enabled, only whitelist attributes
synced to all nodes.
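The ownership and replication rules on this slide can be read as a small decision function. The sketch below is one interpretation under stated assumptions: the attribute sets are tiny subsets of the deck's whitelist and significant attribute lists, and the function and action names are hypothetical.

```python
# Illustrative subsets only; the full lists appear elsewhere in this deck.
WHITELIST = {"host-name", "dhcp-class-identifier", "User-Agent", "EndPointPolicy"}
SIGNIFICANT = {"EndPointPolicy", "StaticAssignment", "PortalUser"}

def plan_update(changed, is_owner):
    """Return the actions a PSN would take for a set of changed attribute names."""
    changed = set(changed)
    actions = []
    if not is_owner and changed & WHITELIST:
        # A non-owner sees a whitelist attribute change: take ownership and
        # fetch the endpoint's attributes from the previous owner.
        actions.append("claim-ownership")
    if changed & SIGNIFICANT:
        # A significant attribute changed: replicate globally via the PAN.
        actions.append("replicate-via-pan")
    return actions

print(plan_update({"host-name"}, is_owner=True))        # []
print(plan_update({"host-name"}, is_owner=False))       # ['claim-ownership']
print(plan_update({"EndPointPolicy"}, is_owner=False))  # ['claim-ownership', 'replicate-via-pan']
```

The first call mirrors the PSN1 callout: the current owner sees a whitelist attribute change, and nothing is replicated because no significant attribute changed.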
Admin
(P)
Admin (S)
MnT (S)
MnT (P)
PSN1 PSN2
PSN3
Profile
Change
NODE GROUP A
(JGROUP A)
L2 or L3
PSN4 PSN5
PSN6
Inter-Node Communications
Local JGroups and Node Groups
NODE GROUP B
(JGROUP B)
• Profiling sync leverages JGroup channels
• All replication outside node group must
traverse PAN—including Ownership Change!
• If Local JGroup fails, then nodes fall back to
Global JGroup communication channel.
PSN1 PSN2
PSN3
TCP/7800 JGroup Peer Communication
JGroup Failure Detection
TCP/12001 JGroups Tunneled
Admin (P) Admin (S)
MnT (S)
MnT (P)
Node Groups and Session Recovery
Dynamic Clean Up for Orphaned URL-Redirected Sessions
Primary
PAN
Primary
MnT
PSN1 PSN2
PSN3
RADIUS
Portal Redirect
to PSN3
JGroup “Master”
PSN3 not responding!
Hey Primary MnT!
Did PSN3 have any
active sessions with
pending redirect?
CoA Session
Terminate
Query Attributes
ISE 2.4 Node Communications
Guest: tcp/8443
Discovery: tcp/8443, tcp/8905
Agent Install: tcp/8443
Posture Agent: tcp/8905; udp/8905
PRA/KA: tcp/8905
DNS: udp/53; DHCP:udp/67
WMI Client Probe: tcp/135, tcp/445
Kerberos (SPAN): tcp/88
SCEP Proxy: tcp/80, tcp/443
EST: tcp/8084
PIP
Endpoint
Posture Updates/Smart
Licensing: tcp/443
Profiler Feed: tcp/8443
Logging
HTTPS: tcp/443
Syslog: udp/20514,
tcp/1468
Secure Syslog: tcp/6514
CoA (REST API): udp/1700
RADIUS Auth: udp/1645,1812
RADIUS Acct: udp/1646,1813
RADIUS CoA: udp/1700,3799
RADSEC DTLS: udp/2083
RADIUS/IPsec: udp/500
TACACS+: tcp/49 (configurable)
WebAuth: tcp/443,8443
SNMP: udp/161
SNMP Trap: udp/162
NetFlow: udp/9996
DHCP:udp/67, udp/68
DHCPv6: udp/547
SPAN:tcp/80,8080
SXP: tcp/64999
OCSP: tcp/2560
CA SCEP: tcp/9090
NADs
DNS: tcp-udp/53
NTP: udp/123
Repository: FTP, SFTP, NFS,
HTTP, HTTPS
File Copy: FTP, SCP, SFTP, TFTP
LDAP: tcp-udp/389, tcp/3268
SMB:tcp/445
KDC:tcp-udp/88; KPASS: tcp/464
SCEP: tcp/80, tcp/443; EST: tcp/8084
OCSP: tcp/80;
CRL: tcp/80, tcp/443, tcp/389
ODBC (configurable):
Microsoft SQL: tcp/1433
Sybase: tcp/2638
PostgreSQL: tcp/5432
Oracle: tcp/1521
TS-Agent: tcp/9094
AD Agent: tcp/9095
WMI: tcp/135
Syslog: udp/40514, tcp/11468
HTTPS: tcp/443
JGroups: tcp/12001 (PSN to PAN)
CoA (Admin/Guest Limit): udp/1700
Admin(P) - Admin(S): tcp/443,
tcp/12001(JGroups)
Monitor(P) - Monitor(S): tcp/443,
udp/20514 (Syslog)
Policy - Policy:
Node Groups/JGroups: tcp/7800
Proxy CoA: udp/1700
PSN-SXPSN: tcp/443
pxGrid - pxGrid: tcp/5222
Syslog: udp/20514,
tcp/1468
Secure Syslog: tcp/6514
SNMP Traps: udp/162
SMTP: tcp/25
(PPAN: email
expiry notify)
Email/SMS
Gateways
Inter-Node Communications
Cloud Services
Cisco.com/Perfigo.com
Profiler Feed Service
MDM & App Stores
Push Notification
Smart Licensing
GUI: tcp/80,443
SSH: tcp/22
Sponsor (PSN): tcp/8443
SNMP: udp/161
REST API (MnT): tcp/443
ERS API: tcp/9060
Admin /
Sponsor
SMTP:
tcp/25
MDM
Partner
pxGrid: tcp/5222
JGroups: tcp/12001
pxGrid: tcp/5222
pxGrid
Subscriber/
Publisher
pxGrid: tcp/5222
pxGrid (Bulk Download): tcp/8910
Syslog: udp/20514, tcp/1468
Secure Syslog: tcp/6514
NetFlow for TS: udp/9993
Threat/VA
Server
MDM API: tcp/XXX
(vendor specific)
TC-NAC: tcp/443
IdP SSO Server
IdP: tcp/XXX
(Vendor specific)
Admin->Sponsor:
tcp/9002
Wireless Setup
Wizard: tcp/9103
HTTPS: tcp/443
Syslog: udp/20514, tcp/1468
Secure Syslog: tcp/6514
Oracle DB (Secure JDBC): tcp/1528
JGroups: tcp/12001 (MnT to PAN)
PSN
PXG
MNT
PAN
Profiling and Data Replication
Before Tuning
Node Group = DC1-group Node Group = DC2-group
RADIUS Auth
RADIUS Acctng
DHCP 1 DHCP 2
NMAP
pxGrid
#
Ownership
Change
Global
Replication
1
3 4
2 5
PAN(S)
MNT(S)
MNT(P)
PAN(Primary)
PSN Clusters PSN
Node Group = DC1-group Node Group = DC2-group
PAN(Primary)
PSN Clusters
Impact of Ownership Changes
Before Tuning
RADIUS Auth
RADIUS Acctng
DHCP 1 DHCP 2
NMAP
pxGrid
Owner? Owner? Owner? Owner? Owner?
PSN
Profiling and Data Replication
After Tuning
RADIUS Auth
RADIUS Acctng
DHCP 1
NMAP
pxGrid
#
Ownership
Change
Global
Replication
Node Group = DC1-group Node Group = DC2-group
PAN(S)
MNT(S)
MNT(P)
PAN(Primary)
PSN Clusters
2
1
PSN
Impact of Ownership Changes
After Tuning
pxGrid
RADIUS Auth
RADIUS Acctng
DHCP 1
NMAP
Node Group = DC1-group Node Group = DC2-group
PAN(Primary)
PSN Clusters
Owner
PSN
ISE Profiling Best Practices
Whenever Possible…
• Use Device Sensor on Cisco switches & Wireless Controllers to optimize data collection.
• Ensure profile data for a given endpoint is sent to a single PSN (or maximum of 2)
• Sending same profile data to multiple PSNs increases inter-PSN traffic and contention for endpoint ownership.
• For redundancy, consider Load Balancing and Anycast to support a single IP target for RADIUS or profiling using…
• DHCP IP Helpers
• SNMP Traps
• DHCP/HTTP with ERSPAN (Requires validation)
• Ensure profile data for a given endpoint is sent to the same PSN
• Same issue as above, but not always possible across different probes
• Use node groups and ensure profile data for a given endpoint is sent to same node
group.
• Node Groups reduce inter-PSN communications and need to replicate endpoint changes outside of node group.
• Avoid probes that collect the same endpoint attributes
• Example: Device Sensor + SNMP Query/IP Helper
• Enable Profiler Attribute Filter
Do NOT send profile data to multiple PSNs !
DO send profile data to single and same PSN or
Node Group !
DO use Device Sensor !
DO enable the Profiler Attribute Filter !
ISE Profiling Best Practices
General Guidelines for Probes
• HTTP Probe:
• Use URL Redirects instead of SPAN to centralize collection and reduce traffic load related to SPAN/RSPAN.
• Avoid SPAN. If used, look for key traffic chokepoints such as Internet edge or WLC connection; use intelligent
SPAN/tap options or VACL Capture to limit amount of data sent to ISE. Also difficult to provide HA for SPAN.
• DHCP Probe:
• Use IP Helpers when possible. Be aware that an L3 device serving DHCP for a subnet will not also relay DHCP for that subnet!
• Avoid DHCP SPAN. If used, make sure probe captures traffic to central DHCP Server. HA challenges.
• SNMP Probe:
• For polled SNMP queries, avoid short polling intervals. Be sure to set optimal PSN for polling in ISE NAD config.
• SNMP Traps primarily useful for non-RADIUS deployments like NAC Appliance—Avoid SNMP Traps w/RADIUS auth.
• NetFlow Probe:
• Use only for specific use cases in centralized deployments—Potential for high load on network devices and ISE.
• pxGrid Probe:
• Limit # PSNs enabled for pxGrid as each becomes a Subscriber to same data. 2 needed for redundancy.
• Dedicate PSNs for pxGrid Probe if high-volume data from Publishers.
Do NOT enable all probes by default !
Avoid SPAN, SNMP Traps, and NetFlow probes !
Limit pxGrid probe to two PSNs max for HA – possibly dedicated !
Profiling Redundancy – Duplicating Profile Data
Different DHCP Addresses
- Provides Redundancy but Leads to Contention for Ownership = Replication
• Common config is to duplicate IP helper
data at each NAD to two different PSNs
or PSN LB Clusters
• Different PSNs receive data
PSN3 (10.1.99.7)
PSN2 (10.1.99.6)
PSN1 (10.1.99.5)
interface Vlan10
ip helper-address <real_DHCP_Server>
ip helper-address 10.1.98.8
ip helper-address 10.2.100.2
PSN3 (10.2.101.7)
PSN2 (10.2.101.6)
PSN1 (10.2.101.5)
PSN-CLUSTER2
PSN-CLUSTER1
DC #2
DC #1
DHCP Request
Load
Balancer
Load
Balancer
Note: LB depicted, but NOT required
(10.1.98.8)
(10.2.100.2)
User
int Vlan10
PSN3 (10.1.99.7)
PSN2 (10.1.99.6)
PSN1 (10.1.99.5)
PSN3 (10.2.101.7)
PSN2 (10.2.101.6)
PSN1 (10.2.101.5)
PSN-CLUSTER2
PSN-CLUSTER1
Load
Balancer
Load
Balancer
Scaling Profiling and Replication
Single DHCP VIP Address using Anycast
- Limit Profile Data to a Single PSN and Node Group
• Different PSNs or Load Balancer VIPs host
same target IP for DHCP profile data
• Routing metrics determine which PSN
or LB VIP receives DHCP from NAD
User
interface Vlan10
ip helper-address <real_DHCP_Server>
ip helper-address 10.1.98.8
DHCP Request
Note: LB depicted, but NOT required
(10.1.98.8)
(10.1.98.8)
DC #2
DC #1
int Vlan10
Profiler Tuning for Polled SNMP Query Probe
• Set specific PSNs to
periodically poll access
devices for SNMP data.
• Choose PSN closest
to access device.
28,800 sec (8 hours)
*Minimum recommended
polling interval
SNMP Polling
(Auto)
RADIUS
PSN1
(Amer)
PSN2
(Asia)
Switch
Auto-Recovery when PSN
fails fixed in ISE 2.4
pxGrid Profiler Probe (Context In)
First Integration with Cisco Industrial Network Director (IND)
• IND communicates with Industrial Switches and Security Devices and collects detailed
information about the connected manufacturing devices.
• IND v1.3 adds pxGrid Publisher interface to communicate IoT attributes to ISE.
Subscriber
ISE Profiler Attributes
Custom Attributes
Supported !!!
iotIpAddress
iotMacAddress
iotName
iotVendor
iotProductId
iotSerialNumber
iotDeviceType
iotSwRevision
iotHwRevision
iotProtocol
iotConnectedLinks
iotCustomAttributes
Publisher
pxGrid
IND Asset Inventory
pxGrid Profiler Probe
Recommend limit probe
to two PSNs (2 for HA).
Each PSN becomes a
pxGrid Subscriber to
IND Asset topic
Profiler Conditions Based on Custom Attributes
Profiling Based on Custom Attributes
Performance Hit so Disabled By Default
• Global Setting MUST
be enabled
• If disabled:
• Custom Attributes are
NOT updated over
pxGrid
• Profiler ignores any
conditions based on
Custom Attributes,
even if Custom
Attribute is populated.
New and Updated IoT Profile Libraries
• 700+ Automation and Control
• Industrial / Manufacturing
• Building Automation
• Power / Lighting
• Transportation / Logistics
• Financial (ATM, Vending, PoS, eCommerce)
• IP Camera / Audio-Video / Surveillance and Access Control
• Other (Defense, HVAC, Elevators, etc)
• Windows Embedded
• 300+ Profiles in Medical NAC Profile Library
Delivered via ISE Community: https://communities.cisco.com/docs/DOC-66340
Why Do I Care about # Profiles?
• ISE 2.1+ supports a MAX of 2000 profiles
• Let’s Do the Math…
• ~600 Base Profiles
• 600+ New Feed Profiles (2.4)
• 300+ Medical NAC Profiles
• 700+ Automation & Control Profiles
--------------------------------------
2300+ Profiles
• No restrictions on profile import, so you must
check the # of profiles in the library before importing
a large batch of new profiles.
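Because ISE does not block an oversized import, a quick pre-import check is worth doing by hand or in a script. The sketch below uses the rounded lower bounds from this slide, which on their own already exceed the limit; the function name and dictionary keys are hypothetical, and the current profile count would have to come from your own deployment.

```python
MAX_PROFILES = 2000  # ISE 2.1+ limit quoted on this slide

def can_import(current_profiles, batch_size, limit=MAX_PROFILES):
    """True if importing batch_size profiles keeps the library within the limit."""
    return current_profiles + batch_size <= limit

# Rounded lower bounds from the slide ("600+", "300+", "700+").
libraries = {"base": 600, "feed (2.4)": 600, "medical NAC": 300, "automation & control": 700}
total = sum(libraries.values())
print(total, total > MAX_PROFILES)                       # 2200 True
print(can_import(current_profiles=1200, batch_size=700)) # True
print(can_import(current_profiles=1500, batch_size=700)) # False
```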
Scaling MnT
(Optimize Logging and
Noise Suppression)
The Fall Out From the Mobile Explosion and IoT
• Explosion in number and type of endpoints on the network.
• High auth rates from mobile devices, many personal (unmanaged).
– Short-lived connections: Continuous sleep/hibernation to conserve battery power, roaming, …
• Misbehaving supplicants: Unmanaged endpoints from numerous mobile vendors may be
misconfigured, missing root CA certificates, or running less-than-optimal OS versions
• Misconfigured NADs. Often timeouts too low & misbehaving clients go unchecked/not throttled.
• Misconfigured Load Balancers: Suboptimal persistence and excessive RADIUS health probes.
• Increased logging from Authentication, Profiling, NADs, Guest Activity, …
• System not originally built to scale to new loads.
• End user behavior when above issues occur.
• Bugs in client, NAD, or ISE.
Clients Misbehave!
• Example education customer:
• ONLY 6,000 Endpoints (all BYOD style)
• 10M Auths / 9M Failures in 24 hours!
• 42 Different Failure Scenarios – all related to
clients dropping TLS (both PEAP & EAP-TLS).
• Supplicant List:
• Kyocera, Asustek, Murata, Huawei, Motorola, HTC, Samsung, ZTE, RIM, SonyEric, ChiMeiCo,
Apple, Intel, Cybertan, Liteon, Nokia, HonHaiPr, Palm, Pantech, LgElectr, TaiyoYud, Barnes&N
• 5411 No response received during 120 seconds on last EAP message sent to the client
• This error has been seen at a number of Escalation customers
• Typically the result of a misconfigured or misbehaving supplicant not completing the EAP process.
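To put the education customer's numbers in perspective, the arithmetic below converts the daily totals from this slide into per-endpoint and per-second rates. It assumes the load is spread evenly across the day, which understates real peaks.

```python
endpoints = 6_000
auths_per_day = 10_000_000
failures_per_day = 9_000_000

failure_ratio = failures_per_day / auths_per_day   # share of all auths that failed
auths_per_endpoint = auths_per_day / endpoints     # auth attempts per endpoint per day
avg_auths_per_second = auths_per_day / 86_400      # sustained average rate

print(f"{failure_ratio:.0%} failed, {auths_per_endpoint:.0f} auths/endpoint/day, "
      f"{avg_auths_per_second:.0f} auths/sec average")
# 90% failed, 1667 auths/endpoint/day, 116 auths/sec average
```

Roughly 1,667 attempts per endpoint per day works out to more than one attempt per minute from every device, around the clock.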
Challenge: How to reduce
the flood of log messages
while increasing PSN and
MNT capacity and tolerance
MnT
Getting More Information With Less Data
Scaling to Meet Current and Next Generation Logging Demands
Reauth period
Quiet-period 5 min
Held-period / Exclusion 5 min
Load
Balancer
Misbehaving supplicant
Roaming
supplicant
Unknown users
Reauth phones
LB Health
probes
Detect and reject
misbehaving clients
Log Filter
Heartbeat
frequency
Count and discard
repeated events
Count and discard
untrusted events
PSN MNT
Switch
WLC
Rate Limiting at Source Filtering at Receiving Chain
Count and discard
repeats and unknown
NAD events
Filter health
probes from
logging
Reject
bad
supplicant
Client Exclusion
Quiet period
Quiet
Period
Tune NAD Configuration
Rate Limiting at Wireless Source
Wireless (WLC)
• RADIUS Server Timeout: Increase from default of 2 to 5 sec
• RADIUS Aggressive-Failover: Disable aggressive failover
• RADIUS Interim Accounting: v7.6: Disable; v8.0+: Enable with
interval of 0. (Update auto-sent on DHCP lease or Device Sensor)
• Idle Timer: Increase to 1 hour (3600 sec) for secure SSIDs
• Session Timeout: Increase to 2+ hours (7200+ sec)
• Client Exclusion: Enable and set exclusion timeout to 180+ sec
• Roaming: Enable CCKM / SKC / 802.11r (when feasible)
• Bugfixes: Upgrade WLC software to address critical defects
Reauth period
Quiet-period 5 min
Held-period / Exclusion 5 min
Misbehaving
supplicant
Roaming
supplicant
Unknown users
Reauth phones
Quiet
Period
Prevent Large-Scale Wireless RADIUS Network Melt Downs
http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/118703-technote-wlc-00.html
BRKSEC-2059 Deploying ISE in a
Dynamic Environment - Clark Gambrel
Monday, June 11 @ 1:30pm
WLC
Client Exclusion
One-Click Setup for ISE Best Practice Config
• Checkbox to auto-
configure WLAN and
associated RADIUS
Servers to ISE best
practice.
Which WLC Software Should I Deploy?
• 8.0.152.0 – Currently the most mature and reliable release.
• 8.2.167.6 – Mature – Recommended when new feature/hardware support is needed.
• 8.3.141.0 – Less mature – Recommended if new features in 8.3.x are required.
• 8.5.124.55 – Cutting edge – Recommended if new features in 8.5.x are required.
• 8.6.101.0 – Bleeding edge – Only if new features in 8.6.x are absolutely required.
• 8.7.102.0 – Only if new features in 8.7.x are absolutely required.
https://www.cisco.com/c/en/us/support/docs/wireless/wireless-lan-controller-software/200046-TAC-Recommended-AireOS.html
• Example critical defects resolved in maintenance and new releases:
CDETS Title
CSCul83594 Session-id is not synchronized across mobility, if the network is open (fixed in 8.6)
CSCuu82607 Evaluation of all for OpenSSL June 2015
CSCuu68490 duplicate radius-acct update message sent while roaming
CSCus61445 DNS ACL on wlc is not working - AP not Send DTLS to WLC
CSCuq48218 Cisco WLC cannot process multiple sub-attributes in single RADIUS VSA
CSCuo09947 RADIUS AVP #44 (Acct-Session-ID) to be sent in RADIUS authentication messages
Tune NAD Configuration
Rate Limiting at Wired Source
Wired (IOS / IOS-XE)
• RADIUS Interim Accounting: Use newinfo parameter with long
interval (for example, 24-48 hrs), if available. Otherwise, set
15 mins. If LB present, set shorter than RADIUS persist time.
• 802.1X Timeouts
• held-period: Increase to 300+ sec
• quiet-period: Increase to 300+ sec
• ratelimit-period: Increase to 300+ sec
• Inactivity Timer: Disable or increase to 1+ hours (3600+ sec)
• Session Timeout: Disable or increase to 2+ hours (7200+ sec)
• Reauth Timer: Disable or increase to 2+ hours (7200+ sec)
• Bugfixes: Upgrade software to address critical defects.
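The wired timer recommendations above lend themselves to a simple audit. The sketch below compares configured values (in seconds) against the minimums from this slide; the dictionary keys and the function are hypothetical, and how you extract the configured values from a switch is left out. A value of `None` stands for a disabled or unset timer and is skipped.

```python
# Minimums in seconds, taken from the wired recommendations on this slide.
# Key names are illustrative, not IOS keywords.
RECOMMENDED_MINIMUMS = {
    "held_period": 300,
    "quiet_period": 300,
    "ratelimit_period": 300,
    "inactivity_timer": 3600,
    "session_timeout": 7200,
    "reauth_timer": 7200,
}

def audit_timers(configured):
    """Return {name: (configured, minimum)} for timers below the recommended minimum."""
    findings = {}
    for name, minimum in RECOMMENDED_MINIMUMS.items():
        value = configured.get(name)       # None = disabled or not configured
        if value is not None and value < minimum:
            findings[name] = (value, minimum)
    return findings

print(audit_timers({"held_period": 60, "quiet_period": 300, "reauth_timer": 3600}))
# {'held_period': (60, 300), 'reauth_timer': (3600, 7200)}
```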
Reauth period
Held-period 5 min
Quiet-period / Exclusion 5 min
Misbehaving supplicant
Roaming
supplicant
Unknown users
Reauth phones
Switch
Quiet period
RADIUS Test Probes
Reduce Frequency of RADIUS Server Health Checks
Misbehaving supplicant
Roaming supplicant
Unknown users
Reauth phones
Heartbeat
frequency
Quiet
Period
• Wired NAD: RADIUS test probe interval
set with idle-time parameter in radius-
server config; Default is 60 minutes
• No action required
• Wireless NAD: If configured, WLC only
sends “active” probe when server
marked as dead.
• No action required
• Load Balancers: Set health probe
intervals and retry values short enough to
ensure prompt failover to another server
in cluster occurs prior to NAD RADIUS
timeout (typically 20-60 sec.) but long
enough to avoid excessive test probes.
Load
Balancer
LB Health
probes
Switch
WLC
Client Exclusion
Quiet period
Load Balancer RADIUS Test Probes
Citrix Example
• Probe frequency and retry settings:
– Time interval between probes:
interval seconds # Default: 5
– Number of retries:
retries number # Default: 3
• Sample Citrix probe configuration:
add lb monitor PSN-Probe RADIUS -respCode 2
-userName citrix_probe -password citrix123
-radKey cisco123 -LRTM ENABLED -interval 10
-retries 3 -destPort 1812
F5 Example
• Probe frequency and retry settings:
– Time interval between probes:
Interval seconds # Default: 10
– Timeout before failure = 3*(interval)+1:
Timeout seconds # Default: 31
• Sample F5 RADIUS probe configuration:
Name PSN-Probe
Type RADIUS
Interval 10
Timeout 31
Manual Resume No
Check Util Up Yes
User Name f5-probe
Password f5-ltm123
Secret cisco123
Alias Address * All Addresses
Alias Service Port 1812
Debug No
• Recommended setting: Failover must occur
before RADIUS timeout (typically 15-35 sec)
while avoiding excessive probing
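The probe settings can be sanity-checked against the NAD RADIUS timeout with a few lines of arithmetic. The F5 formula is the one quoted on this slide; treating Citrix detection time as interval × retries is a simplifying assumption, and the 35-second NAD timeout is just the top of the slide's typical range.

```python
def f5_timeout(interval):
    """F5 rule quoted on the slide: timeout = 3 * interval + 1."""
    return 3 * interval + 1

def citrix_detection_time(interval, retries):
    """Assumed approximation: server marked down after `retries` failed probes."""
    return interval * retries

def fails_over_in_time(detection_seconds, nad_radius_timeout):
    """True if the load balancer reacts before the NAD gives up on the request."""
    return detection_seconds < nad_radius_timeout

print(f5_timeout(10))                                   # 31, as in the sample F5 monitor
print(citrix_detection_time(interval=10, retries=3))    # 30
print(fails_over_in_time(f5_timeout(10), nad_radius_timeout=35))   # True
print(fails_over_in_time(f5_timeout(10), nad_radius_timeout=15))   # False
```

The last line shows the trade-off: with a 15-second NAD timeout, a 10-second probe interval is too slow, so either the probe interval must drop or the NAD timeout must rise.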
PSN Noise Suppression and Smarter Logging
Filter Noise and Provide Better Feedback on Authentication Issues
• PSN Collection Filters
• PSN Misconfigured Client Dynamic
Detection and Suppression
• PSN Accounting Flood Suppression
• Detect Slow Authentications
• Enhanced Handling for EAP sessions
dropped by supplicant or Network Access
Server (NAS)
• Failure Reason Message and Classification
• Identify RADIUS Request From Session
Started on Another PSN
• Improved Treatment for Empty NAK List
Detect and reject
misbehaving clients
Log Filter
PSN
Filter health
probes
from
logging
Reject
bad
supplicant
PSN
PSN - Collection Filters
Static Client Suppression
• PSN static filter based on
single attribute:
• User Name
• Policy Set Name
• NAS-IP-Address
• Device-IP-Address
• MAC (Calling-Station-ID)
• Filter Messages Based on Auth Result:
• All (Passed/Fail)
• All Failed
• All Passed
• Select Messages to Disable Suppression
for failed auth @PSN and successful auth @MnT
Administration > System > Logging > Collection Filters
PSN Filtering and Noise Suppression
Dynamic Client Suppression
Administration > System > Settings > Protocols >
RADIUS
Flag misconfigured supplicants for
same auth failure within specified
interval and stop logging to MnT
Send alarm with failure statistics
Valid Time ranges displayed by default
Each endpoint tracked by:
• Calling-Station-ID (MAC Address)
• NAS-IP-Address (NAD address)
• Failure reason
PSN Filtering and Noise Suppression
Dynamic Client Suppression
Administration > System > Settings > Protocols > RADIUS
Flag misconfigured supplicants for
same auth failure within specified
interval and stop logging to MnT
Send alarm with failure statistics
Send immediate Access-Reject
(do not even process request) IF:
1) Flagged for suppression
2) Failed auth a total of X times for the
same failure reason (including the 2 previous failures)
Fully process next request after
rejection period expires.
Hard-coded @
5 in ISE 2.0
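The suppress-then-reject behavior can be modeled as a small per-client state machine. This is a simplified reading of the slide, not ISE's internal logic: the class and method names are hypothetical, the intervals are caller-supplied, and the rule that the first two failures are logged before suppression begins is an assumption. The key would be the tuple the deck says each endpoint is tracked by: Calling-Station-ID, NAS-IP-Address, and failure reason.

```python
class ClientSuppressor:
    """Illustrative model only; timers and rules are assumptions, not ISE internals."""

    def __init__(self, detect_interval, reject_interval, reject_after=5):
        self.detect_interval = detect_interval   # max gap between "same" failures
        self.reject_interval = reject_interval   # how long requests are rejected
        self.reject_after = reject_after         # total failures before rejecting
        self.state = {}                          # key -> count / last / reject_until

    def on_request(self, key, now):
        """Return 'reject' to refuse without processing, else 'process'."""
        s = self.state.get(key)
        if s is not None and s["reject_until"] is not None:
            if now < s["reject_until"]:
                return "reject"
            del self.state[key]                  # rejection period over: start fresh
        return "process"

    def on_failure(self, key, now):
        """Record a failed auth; return 'log' or 'suppress' for MnT logging."""
        s = self.state.get(key)
        if s is None or now - s["last"] > self.detect_interval:
            self.state[key] = {"count": 1, "last": now, "reject_until": None}
            return "log"
        s["count"] += 1
        s["last"] = now
        if s["count"] >= self.reject_after:
            s["reject_until"] = now + self.reject_interval
        return "log" if s["count"] == 2 else "suppress"

sup = ClientSuppressor(detect_interval=300, reject_interval=3600)
key = ("00:11:22:33:44:55", "10.1.1.1", "12321")
print([sup.on_failure(key, t) for t in (0, 10, 20, 30, 40)])
# ['log', 'log', 'suppress', 'suppress', 'suppress']
print(sup.on_request(key, 50))     # reject
print(sup.on_request(key, 3640))   # process
```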
PSN Noise Suppression
Drop Excessive RADIUS Accounting Updates from “Misconfigured NADs”
Administration > System > Settings > Protocols > RADIUS
Allow 2 RADIUS Accounting
Updates for same session in
specified interval, then drop.
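A sliding-window counter is one way to picture the "allow 2, then drop" rule. The sketch below is an illustration under assumptions: the class name is hypothetical, the interval is supplied by the caller because the slide does not state a value, and ISE's actual windowing may differ.

```python
from collections import defaultdict, deque

class AccountingThrottle:
    """Allow at most `max_updates` accounting updates per session per interval."""

    def __init__(self, interval, max_updates=2):
        self.interval = interval               # window length in seconds
        self.max_updates = max_updates
        self.seen = defaultdict(deque)         # session id -> accepted timestamps

    def accept(self, session_id, now):
        times = self.seen[session_id]
        while times and now - times[0] >= self.interval:
            times.popleft()                    # forget updates outside the window
        if len(times) >= self.max_updates:
            return False                       # drop: too many updates in the window
        times.append(now)
        return True

throttle = AccountingThrottle(interval=60)
print([throttle.accept("session-1", t) for t in (0, 10, 20, 59, 60, 61)])
# [True, True, False, False, True, False]
```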
MnT Log Suppression and Smarter Logging
Drop and Count Duplicates / Provide Better Monitoring Tools
• Drop duplicates and increment counter in Live Log for “matching” passed
authentications
• Display repeat counter in Live Sessions entries.
• Update session, but do not log RADIUS Accounting Interim Updates
• Log RADIUS Drops and EAP timeouts to separate table for reporting
purposes and display as counters on Live Log Dashboard along with
Misconfigured Supplicants and NADs
• Alarm enhancements
• Revised guidance to limit syslog at the source.
• MnT storage allocation and data retention limits
• More aggressive purging
• Allocate larger VM disks to increase logging capacity and retention.
Count and discard
repeated events
Count and discard
untrusted events
Count and discard
repeats and unknown
NAD events
MNT
MnT Noise Suppression
Suppress Storage of Repeated Successful Auth Events
Administration > System > Settings > Protocols > RADIUS
Suppress Successful Reports
= Do not save repeated successful
auth events for the same session
to MnT DB
These events will not display in
Live Authentications Log but do
increment Repeat Counter.
MnT Noise Suppression
Suppress Storage of Repeated Successful Auth Events
Administration > System > Settings > Protocols >
RADIUS
Detect NAD retransmission timeouts
and Log auth steps > threshold.
Step latency is visible in
Live Logs details
12304 Extracted EAP-Response containing PEAP challenge-response
11808 Extracted EAP-Response containing EAP-MSCHAP challenge-
response for inner method
15041 Evaluating Identity Policy (Step latency=1048 ms)
15006 Matched Default Rule
15013 Selected Identity Source - Internal Users
24430 Authenticating user against Active Directory
24454 User authentication against Active Directory failed because of a
timeout error (Step latency=30031 ms)
24210 Looking up User in Internal Users IDStore - test1
24212 Found User in Internal Users IDStore
22037 Authentication Passed
11824 EAP-MSCHAP authentication attempt passed
12305 Prepared EAP-Request with another PEAP challenge
11006 Returned RADIUS Access-Challenge
5411 Supplicant stopped responding to ISE (Step latency=120001 ms)
BRKSEC-3699
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Client Suppression and Reject Timers
[Figure: timeline from t = T0 to t = T10 across Endpoint, Access Device, PSN, and MnT.]
• Ts = Failed Suppression Interval, Tr = Report Interval, Tx = Rejection Interval
• Failure detection: the endpoint repeatedly fails 802.1X requests (12321 Cert Rejected) and MAB requests (22056 Subject not found); the early failures each produce a Failed Auth Log to MnT.
• Suppression: after 2 failures of the same type within Ts (T2 < Ts), a Suppression Report (5434) is sent once per Tr.
• Rejection: after a total of 5 failures of the same type (t = T8), auth requests receive an Access-Reject for the duration of Tx, and a Reject Report (5449) is sent once per Tr.
• Release: shown at the end of the timeline alongside a Successful Auth Log.
BRKSEC-3699 101
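The suppression and rejection timers can be sketched as a small per-endpoint state machine. This is a minimal Python illustration under stated assumptions, not ISE's implementation: the class and method names are hypothetical, the thresholds of 2 and 5 failures come from the diagram callouts, the interval values are arbitrary, and the Ts window is assumed to be measured from the previous failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientSuppressor:
    """Toy model of failed-auth suppression and rejection for one endpoint."""
    ts: float                   # Ts: failed suppression interval, seconds (assumed value)
    tx: float                   # Tx: rejection interval, seconds (assumed value)
    suppress_after: int = 2     # same-type failures before suppression (from the diagram)
    reject_after: int = 5       # same-type failures before rejection (from the diagram)
    failures: int = 0
    last_failure: Optional[float] = None
    reject_until: float = 0.0

    def on_failed_auth(self, now: float) -> str:
        """Return 'log', 'suppress', or 'reject' for a failure seen at time `now`."""
        if now < self.reject_until:
            return "reject"     # still inside Tx: answer with Access-Reject
        if self.last_failure is not None and now - self.last_failure > self.ts:
            self.failures = 0   # previous failure is older than Ts: start counting again
        self.failures += 1
        self.last_failure = now
        if self.failures >= self.reject_after:
            self.reject_until = now + self.tx
            return "reject"
        if self.failures >= self.suppress_after:
            return "suppress"   # counted, reported periodically instead of per event
        return "log"


endpoint = ClientSuppressor(ts=60, tx=300)
outcomes = [endpoint.on_failed_auth(t) for t in (0, 10, 20, 30, 40, 50)]
print(outcomes)
```

The periodic reports (5434 and 5449, once per Tr) are left out of the sketch; it only shows how the failure count and the two intervals decide what happens to each failed request.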
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Client Suppression and Reject Timers: Rejection
[Figure: Endpoint, Access Device, PSN, and MnT timeline during rejection. Report 5449 is repeated once per report interval (Tr) across successive rejection intervals (Tx).]
• Tr = Report Interval
• Tx = Rejection Interval
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
ISE Log Suppression
“Good”-put Versus “Bad”-put
[Figure: traffic arriving at the PSN versus what is logged to MnT. Labels shown:]
• “Bad” auth requests: incomplete auth requests, rejected, failed auth suppressed, RADIUS drops
• “Good” auth requests: successful auth suppressed
• RADIUS Accounting: accounting updates suppressed (RADIUS Accounting updates that are not an IP change)
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Typical Load Example
[Figure: IN versus OUT log volume under typical load.]
BRKSEC-3699 104
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Extreme Noise Load Example
[Figure: IN versus OUT log volume under extreme noise load.]
BRKSEC-3699 105
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
WLC – Client Exclusion
Blacklist Misconfigured or Malicious Clients
• Excessive Authentication Failures—Clients are excluded on the fourth authentication
attempt, after three consecutive failures.
• Client excluded for Time Value specified in WLAN settings. Recommend increase to
1-5 min (60-300 sec). 3 min is a good start.
Note: Diagrams show default values
BRKSEC-3699 106
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 107
BRKSEC-3699
Live Authentications and Sessions
Blue entry = Most current Live Sessions entry with repeated successful auth counter
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 108
BRKSEC-3699
Authentication Suppression
Enable/Disable
• Global Suppression Settings: Administration > System > Settings > Protocols > RADIUS
Caution: Do not disable suppression in deployments with very high auth rates.
It is highly recommended to keep Auth Suppression enabled to reduce MnT logging
• Selective Suppression using Collection Filters: Administration > System > Logging >
Collection Filters
Configure specific traffic to bypass
Successful Auth Suppression
Useful for troubleshooting authentication for a
specific endpoint or group of endpoints, especially
in high auth environments where global suppression
is always required.
Failed Auth Suppression Successful Auth Suppression
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 109
BRKSEC-3699
Per-Endpoint Time-Constrained Suppression
Right
Click
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 110
Visibility into Reject Endpoints!
BRKSEC-3699
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 111
Releasing Rejected Endpoints
BRKSEC-3699
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 112
Releasing Rejected Endpoints
Query/Release Rejected
also available via ERS API!
BRKSEC-3699
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 113
No Log Suppression With Log Suppression Distributed Logging
BRKSEC-3699
High Availability
Agenda
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
• ISE Appliance Redundancy
• ISE Node Redundancy
• Administration Nodes
• Monitoring Nodes
• pxGrid Nodes
• HA for Certificate Services
• Policy Service Node
Redundancy
• Load Balancing
• Non-LB Options
• NAD Fallback and Recovery
BRKSEC-3699 115
High Availability
Agenda
ISE Appliance
Redundancy
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 117
BRKSEC-3699
Appliance Redundancy
In-Box High Availability
Platform | SNS-3415 (34x5 Small) | SNS-3495 (34x5 Large) | SNS-3515 (35x5 Small) | SNS-3595 (35x5 Large)
Drive Redundancy | No, (1) 600GB disk | Yes, (2) 600GB disks | No, (1) 600GB disk | Yes, (4) 600GB disks
Controller Redundancy | No | Yes (RAID 1) | No (1GB FBWC Controller Cache) | Yes (RAID 10) (1GB FBWC Cache)
Ethernet Redundancy | Yes*, 4 GE NICs = up to 2 bonded NICs | Yes*, 4 GE NICs = up to 2 bonded NICs | Yes*, 6 GE NICs = up to 3 bonded NICs | Yes*, 6 GE NICs = up to 3 bonded NICs
Redundant Power | No (2nd PSU optional), UCSC-PSU-650W | Yes | No (2nd PSU optional), UCSC-PSU1-770W | Yes
* ISE 2.1 introduced NIC Teaming support for High Availability only (not active/active)
SNS-3500 Series
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
NIC Bonding
Network Card Redundancy
Bond 0: GE0 (Primary) + GE1 (Backup)
Bond 1: GE2 (Primary) + GE3 (Backup)
Bond 2: GE4 + GE5
BRKSEC-3699 118
• For Redundancy only–NOT
for increasing bandwidth.
• Up to (3) bonds in ISE 2.1
• Bonded Interfaces Preset–
Non-Configurable
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Bonded Interfaces for Redundancy
When GE0 is Down, GE1 Takes Over
GE0 GE1
Same MAC Address
• Both interfaces assume the
same L2 address.
• When GE0 fails, GE1 assumes
the IP address and keeps the
communications alive.
• Based on Link State of the
Primary Interface
• Every 100 milliseconds the link
state of the Primary is
inspected.
BRKSEC-3699 119
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 120
BRKSEC-3699
NIC Teaming
NIC Teaming / Interface Bonding
• Configured using CLI only!
• GE0 + GE1 Bonding Example:
admin(config-GigabitEthernet0)# backup interface GigabitEthernet 1
• Requires service restart. After restart, ISE recognizes bonded interfaces for
Deployment and Profiling; Guest requires manual config of eligible interfaces.
ISE Node/Persona
Redundancy
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Admin Node HA and Synchronization
PAN Steady State Operation
• Changes made to Primary Administration DB are automatically synced to all nodes.
122
BRKSEC-3699
[Figure: the admin user connects to the Primary Admin Node, which syncs policy to the Secondary Admin Node, the Primary and Secondary Monitoring Nodes, the PSNs, and the PXG node.]
• Maximum two PAN nodes per deployment
• Active / Standby
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Admin Node HA and Synchronization
Primary PAN Outage and Recovery
• Prior to ISE 1.4, upon Primary PAN failure, admin user must connect to Secondary PAN and
manually promote Secondary to Primary; new Primary syncs all new changes.
• PSNs buffer endpoint
updates if Primary PAN
unavailable; buffered
updates sent once PAN
available.
123
BRKSEC-3699
• Promoting the Secondary Admin node may take 10-15 minutes before the process is complete.
• New Guest Users or Registered Endpoints cannot be added or connect to the network when the Primary Administration node is unavailable!
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Policy Service Survivability When Admin Down/Unreachable
Which User Services Are Available if Primary Admin Node Is Unavailable?
Service | Use case | Works (Y / N)
RADIUS Auth | Generally all RADIUS auth should continue provided access to ID stores. | Y
Guest | All existing guests can be authenticated, but new guests, self-registered guests, or guest flows relying on device registration will fail. | N
Profiler | Previously profiled endpoints can be authenticated with existing profile. New endpoints or updates to existing profile attributes received by owner should apply, but not profile data received by PSN in foreign node group. | Y
Posture | Provisioning/Assessment work, but Posture Lease unable to fetch timer. | Y
Device Reg | Device Registration fails if unable to update endpoint record in central DB. | N
BYOD/NSP | BYOD/NSP relies on device registration. Additionally, any provisioned certificate cannot be saved to database. | N
MDM | MDM fails on update of endpoint record. | N
CA/Cert Services | See BYOD/NSP use case; certificates can be issued but will not be saved and thus fail. OCSP functions using last replicated version of database. | N
pxGrid | Clients that are already authorized for a topic and connected to controller will continue to operate, but new registrations and connections will fail. | N
TACACS+ | TACACS+ requests can be locally processed per ID store availability. | Y
BRKSEC-3699 124
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 125
BRKSEC-3699
Automatic PAN Switchover
Introduced ISE 1.4
1. Primary PAN (PAN-1) down or network link down.
• If Health Check Node is unable to reach PAN-1 but can reach PAN-2 → trigger failover.
2. Secondary PAN (PAN-2) is promoted by Health Check Node.
• PAN-2 becomes Primary and takes over PSN replication.
[Figure: two data centers, DC-1 and DC-2, connected over a WAN. Nodes shown: PAN-1 (Primary), MNT-1 (Primary), PAN-2 (Secondary), MNT-2 (Secondary), a Primary PAN Health Check Node, and a Secondary PAN Health Check Node.]
Note: Switchover is NOT immediate. Total time based on polling intervals and promotion time.
Expect ~15 - 30 minutes.
Don’t forget, after switchover
admin must connect to PAN-2
for ISE management!
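The failover trigger and the switchover-time note can be expressed as two small functions. This is a hedged Python sketch: the function names, the idea of counting consecutive failed polls against a threshold, and every numeric value below are illustrative assumptions, not ISE defaults.

```python
def should_promote_secondary(pan1_reachable: bool, pan2_reachable: bool,
                             failed_polls: int, threshold: int) -> bool:
    """Health check node decision: promote PAN-2 only if PAN-1 is unreachable,
    PAN-2 is reachable, and enough consecutive polls of PAN-1 have failed."""
    return (not pan1_reachable) and pan2_reachable and failed_polls >= threshold


def estimated_switchover_minutes(poll_interval_min: float,
                                 failed_polls_before_failover: int,
                                 promotion_min: float) -> float:
    """Total time = detection time (polling) + promotion time."""
    return poll_interval_min * failed_polls_before_failover + promotion_min


# Hypothetical values: 2-minute polls, 5 failed polls, 15 minutes to promote.
print(should_promote_secondary(False, True, failed_polls=5, threshold=5))
print(estimated_switchover_minutes(2, 5, 15))
```

With these assumed inputs the estimate lands inside the ~15 - 30 minute range quoted on the slide, which is why switchover should not be treated as immediate.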
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
PAN Failover
Health Check Node Configuration
• Configuration using GUI only under Administration > System > Deployment > PAN Failover
126
BRKSEC-3699
Health Check Node
CANNOT be a PAN !!
Requires Minimum of 3
nodes – 3rd node is
independent observer
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 127
BRKSEC-3699
HA for Monitoring and Troubleshooting
Steady State Operation
• MnT nodes concurrently receive logging from PAN, PSN, IPN*, NAD, and ASA
• PAN retrieves log/report data from Primary MnT node when available
• Syslog from ISE nodes is sent for session tracking and reporting.
• Syslog from access devices is correlated with user/device session.
• Syslog from firewall (or other user logging device) is correlated with guest session for activity logging.
• Maximum two MnT nodes per deployment
• Active / Active
[Figure: PAN, PSN, PXG, NADs, and FW send syslog (20514) to both the Primary and Secondary Monitoring Nodes; the admin user views MnT data through the PAN.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 128
BRKSEC-3699
HA for Monitoring and Troubleshooting
Primary MnT Outage and Recovery
• Upon MnT node failure, PAN, PSN, NAD, and ASA continue to send logs to remaining MnT node
• PAN auto-detects Active MnT failure and retrieves log/report data from Secondary MnT node.
• Full failover to Secondary MnT may take from 5-15 min depending on type of failure.
• PSN logs are not locally buffered when MnT is down unless TCP/Secure syslog is used.
• Log DB is not synced between MnT nodes.
• Upon return to service, recovered MnT node will not include data logged during outage
• Backup/Restore required to re-sync MnT database
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 129
BRKSEC-3699
Log Buffering
TCP and Secure Syslog Targets
• Default UDP-based
audit logging does not
buffer data when MnT
is unavailable.
• TCP and Secure Syslog
options can be used to
buffer logs locally
• Note: Overall log performance will decrease if these acknowledged options are used.
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 130
HA for pxGrid v1
Steady State
• pxGrid clients can be configured with up to 2 servers for redundancy.
• Clients connect to single active controller for given domain.
• Max two pxGrid v1 nodes per deployment (Active/Standby)
• PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
• MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
[Figure: pxGrid clients (subscribers) and pxGrid clients (publishers: Primary/Secondary PAN and Primary/Secondary MnT) connect to the active pxGrid controller, with a standby pxGrid controller alongside. Ports shown: TCP/5222 and TCP/12001.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 131
HA for pxGrid v1
Failover and Recovery
• If active pxGrid Controller fails, clients automatically attempt connection to standby controller.
• Max two pxGrid v1 nodes per deployment (Active/Standby)
• PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
• MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
[Figure: pxGrid client (subscriber) and pxGrid clients (publishers: Primary/Secondary PAN and Primary/Secondary MnT) with active and standby pxGrid controllers. Ports shown: TCP/5222 and TCP/12001.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 132
HA for pxGrid v2 (ISE 2.3+)
Steady State
• pxGrid clients can be configured with multiple servers for redundancy.
• Clients connect to single active controller for given domain.
• 2.3: Max two pxGrid v2 nodes per deployment (Active/Active)
• 2.4: Max 4 nodes (All Active)
• PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
• MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
[Figure: pxGrid Client #1 and Client #2 (subscribers) and pxGrid clients (publishers: Primary/Secondary PAN and Primary/Secondary MnT) connect to Active pxGrid Controller #1 and Active pxGrid Controller #2. Ports shown: TCP/5222 and TCP/12001.]
PSN Load Balancing
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Load Balancing RADIUS, Web, and Profiling Services
• Policy Service nodes can be configured in a cluster behind a load balancer (LB).
• Access Devices send RADIUS and TACACS+ AAA requests to LB virtual IP.
[Figure: network access devices (including VPN) send requests to a virtual IP on the load balancers, which front the PSNs (user services).]
• N+1 node redundancy
assumed to support total
endpoints during:
–Unexpected server outage
–Scheduled maintenance
–Scaling buffer
• HA for LB itself assumed
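The N+1 assumption amounts to simple arithmetic: size the cluster for the full endpoint count, then add one node. The Python sketch below is illustrative only; the function name and the per-PSN capacity figure are hypothetical and should be replaced with the published scaling number for the platform in use.

```python
import math


def psn_count_n_plus_1(total_endpoints: int, endpoints_per_psn: int) -> int:
    """Nodes needed to carry all endpoints (N), plus one spare for outage,
    maintenance, or scaling headroom."""
    n = math.ceil(total_endpoints / endpoints_per_psn)
    return n + 1


# Hypothetical sizing: 50,000 endpoints, 20,000 endpoints per PSN.
print(psn_count_n_plus_1(50_000, 20_000))
```

The point of the extra node is that the remaining N nodes can still carry the total endpoint load when any single PSN is down.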
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Configure Node Groups for LB Cluster
Place all PSNs in LB Cluster in Same Node Group
• Administration > System > Deployment
1) Create node group
2) Assign name
3) Add individual PSNs to node group
• Node group members can be L2 or L3
• Multicast not required
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
High-Level Load Balancing Diagram (For Your Reference)
[Figure: end user/device connects through an access device (NAS IP: 10.1.50.2) to the load balancer. VLAN 98 (10.1.98.0/24) carries the VIP 10.1.98.8; VLAN 99 (10.1.99.0/24) carries the LB address 10.1.99.1 and the PSNs: ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN-1, ISE-PAN-2, ISE-MNT-1, ISE-MNT-2, external logger, AD, LDAP, MDM, DNS, NTP, SMTP.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Traffic Flow (Fully Inline): Physical Separation
Physical Network Separation Using Separate LB Interfaces
• Load Balancer is directly inline between PSNs and rest of network.
• All traffic flows through Load Balancer including RADIUS, PAN/MnT, Profiling, Web Services, Management, Feed Services, MDM, AD, LDAP…
• Fully Inline Traffic Flow recommended (physical or logical).
[Figure: end user/device and access device (NAS IP: 10.1.50.2) reach the load balancer on VLAN 98 (External, 10.1.98.0/24, VIP: 10.1.98.8). The PSNs sit behind it on VLAN 99 (Internal, 10.1.99.0/24, LB: 10.1.99.1): ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN, ISE-MNT, external logger, AD, LDAP, MDM, DNS, NTP, SMTP.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Traffic Flow (Fully Inline): VLAN Separation
Logical Network Separation Using Single LB Interface and VLAN Trunking
• LB is directly inline between ISE PSNs and rest of network.
• All traffic flows through LB including RADIUS, PAN/MnT, Profiling, Web Services, Management, Feed Services, MDM, AD, LDAP…
[Figure: end user/device and access device (NAS IP: 10.1.50.2) reach a network switch that trunks VLAN 98 (External) and VLAN 99 (Internal) to the load balancer. Addresses shown: 10.1.98.1, 10.1.98.2, 10.1.99.1, VIP: 10.1.98.8. PSNs: ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7). Also shown: ISE-PAN, ISE-MNT, external logger, AD, LDAP, MDM, DNS, NTP, SMTP.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Partially Inline: Layer 2/Same VLAN (One PSN Interface)
Direct PSN Connections to LB and Rest of Network
• All inbound LB traffic such as RADIUS, Profiling, and directed Web Services is sent to LB VIP.
• Other inbound non-LB traffic bypasses LB, including redirected Web Services, PAN/MnT, Management, Feed Services, MDM, AD, LDAP…
• All outbound traffic from PSNs is sent to LB as default gateway (DFGW).
• LB must be configured to allow asymmetric traffic.
• Generally NOT RECOMMENDED due to traffic flow complexity: must fully understand path of each flow to ensure proper handling by routing, LB, and end stations.
[Figure: end user/device and access device (NAS IP: 10.1.50.2) reach an L3 switch; the load balancer and ISE-PSN-1, ISE-PSN-2, and ISE-PSN-3 share VLAN 98. Addresses shown: 10.1.98.1, 10.1.98.2, VIP: 10.1.98.8, and PSN addresses 10.1.98.5, 10.1.98.6, 10.1.98.7. Also shown: ISE-PAN, ISE-MNT, external logger, AD, LDAP, MDM, DNS, NTP, SMTP.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
PSN Load Balancing
Sample Topology and Flow (For Your Reference)
• Request for service at single host ‘psn-cluster’.
• DNS request sent to resolve psn-cluster FQDN: DNS lookup = psn-vip.company.com, DNS response = 10.1.98.8.
• Request sent to Virtual IP Address (VIP) 10.1.98.8 (request to psn-vip.company.com).
• Response returned from real server ise-psn-3 @ 10.1.99.7, then Source NAT’ed back to VIP @ 10.1.98.8 (response from psn-vip.company.com).
[Figure: user, access device, DNS server, and load balancer (PSN-VIP, VIP: 10.1.98.8 on VLAN 98, 10.1.98.0/24) in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7) on VLAN 99 (10.1.99.0/24).]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 141
BRKSEC-3699
Load Balancing Policy Services
• RADIUS AAA Services
Packets sent to LB virtual IP are load-balanced to real PSN based on configured algorithm. Sticky algorithm determines
method to ensure same Policy Service node services same endpoint.
• Web Services:
• URL-Redirected: Posture (CPP) / Central WebAuth (CWA) / Native Supplicant Provisioning (NSP) /
Hotspot / Device Registration WebAuth (DRW), Partner MDM.
No LB Required! PSN that terminates RADIUS returns URL Redirect with its own certificate CN name substituted for
‘ip’ variable in URL.
• Direct HTTP/S: Local WebAuth (LWA) / Sponsor / MyDevices Portal, OCSP
Single web portal domain name should resolve to LB virtual IP for http/s load balancing.
• Profiling Services: DHCP Helper / SNMP Traps / Netflow / RADIUS
LB VIP is the target for one-way Profile Data (no response required). VIP can be same or different than one used by
RADIUS LB; Real server interface can be same or different than one used by RADIUS
• TACACS+ AAA Services: (Session and Command Auth and Accounting)
LB VIP is target for TACACS+ requests. T+ not session based like RADIUS, so not required that requests go to same PSN
Load Balancing
RADIUS
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Load Balancing RADIUS
Sample Flow
1. NAD has single RADIUS Server defined (10.1.98.8): radius-server host 10.1.98.8
2. RADIUS Auth requests sent to VIP @ 10.1.98.8
3. Requests for same endpoint load balanced to same PSN via sticky based on RADIUS Calling-Station-ID and Framed-IP-Address
4. RADIUS response received from VIP @ 10.1.98.8 (originated by real server ise-psn-3 @ 10.1.99.7 and source translated by LB)
5. RADIUS Accounting sent to/from same PSN based on sticky
[Figure: user and access device exchange RADIUS AUTH and ACCTG requests and responses with the load balancer (PSN-CLUSTER, VIP: 10.1.98.8 on VLAN 98, 10.1.98.0/24), which fronts ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7) on VLAN 99 (10.1.99.0/24).]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 144
BRKSEC-3699
Load Balancer Persistence (Stickiness) Guidelines
Persistence Attributes
• Common RADIUS Sticky Attributes
o Client Address
▪ Calling-Station-ID
▪ Framed-IP-Address
o NAD Address
▪ NAS-IP-Address
▪ Source IP Address
o Session ID
▪ RADIUS Session ID
▪ Cisco Audit Session ID
o Username
• Best Practice Recommendations (depends on LB support and design)
1. Calling-Station-ID for persistence across NADs and sessions
2. Source IP or NAS-IP-Address for persistence for all endpoints connected to same NAD
3. Audit Session ID for persistence across re-authentications
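Persistence on Calling-Station-ID can be modeled as a lookup table from client MAC address to PSN. The Python sketch below is a conceptual illustration, not any load balancer's actual implementation: the `StickyTable` class, the round-robin choice for new clients, and the timeout value are assumptions.

```python
import itertools
import re


def normalize_mac(value: str) -> str:
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", value)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {value!r}")
    return digits.lower()


class StickyTable:
    """Maps Calling-Station-ID to a PSN and keeps the mapping until it times out."""

    def __init__(self, servers, timeout_s):
        self._next_server = itertools.cycle(servers)  # round-robin for new clients
        self._timeout_s = timeout_s
        self._entries = {}  # normalized MAC -> (server, last_seen)

    def pick(self, calling_station_id: str, now: float) -> str:
        key = normalize_mac(calling_station_id)
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] <= self._timeout_s:
            server = entry[0]                  # existing persistence entry wins
        else:
            server = next(self._next_server)   # new or expired: choose next PSN
        self._entries[key] = (server, now)
        return server


lb = StickyTable(["ise-psn-1", "ise-psn-2", "ise-psn-3"], timeout_s=300)
print(lb.pick("00:C0:FF:1A:2B:3C", now=0))
print(lb.pick("00-c0-ff-1a-2b-3c", now=100))   # same endpoint, different notation
```

Normalizing the MAC before the lookup matters because different NADs may format Calling-Station-ID differently for the same endpoint.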
[Figure: user (Username=jdoe@company.com) on a device with MAC Address=00:C0:FF:1A:2B:3C and IP Address=10.1.10.101 connects through access device 10.1.50.2 (Session: 00aa…99ff) to the load balancer VIP 10.1.98.8, which fronts ISE-PSN-1, ISE-PSN-2, and ISE-PSN-3.]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 145
BRKSEC-3699
Load Balancer Stickiness Guidelines
Config Examples Based on Calling-Station-ID (MAC Address)
• Cisco ACE Example:
    sticky radius framed-ip calling-station-id RADIUS-STICKY
      serverfarm ise-psn
• F5 LTM iRule Example:
    ltm rule RADIUS_iRule {
      when CLIENT_ACCEPTED {
        persist uie [RADIUS::avp 31]
      }
    }
• Citrix NetScaler Example:
    add lb vserver radius-auth RADIUS 172.16.0.16 1812 -rule "CLIENT.UDP.RADIUS.ATTR_TYPE(31)" -cltTimeout 120
    add lb vserver radius-acct RADIUS 172.16.0.16 1813 -rule "CLIENT.UDP.RADIUS.ATTR_TYPE(31)" -cltTimeout 120
    set lb group RADIUS-Calling-Station-ID -persistenceType RULE -rule "CLIENT.UDP.RADIUS.ATTR_TYPE(31)"
Be sure to monitor load balancer resources when performing advanced parsing.
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 146
BRKSEC-3699
LB Fragmentation and Reassembly
Be aware of load balancers that do not reassemble RADIUS fragments!
• Example: EAP-TLS with large certificates
• Need to address path fragmentation or persist on source IP
• ACE reassembles RADIUS packet.
• F5 LTM reassembles packets by default except for FastL4 Protocol
• Must be manually enabled under the FastL4 Protocol Profile
• Citrix NetScaler fragmentation defect—Resolved in NetScaler 10.5 Build 50.10
• Issue ID 429415 addresses fragmentation and the reassembly of large/jumbo frames
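[Figure: a RADIUS packet carrying a large certificate is split into two IP fragments. Fragment #1 holds the Calling-Station-ID plus certificate part 1, so the LB can balance on Calling-Station-ID. Fragment #2 holds only certificate part 2 (no Calling-Station-ID in the packet), so the LB balances on source IP.]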
Also watch for fragmented packets that are too small. LBs have min allowed frag size and will drop !!!
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
LB Fragmentation and Reassembly
Watch for packet fragments smaller than LB will accept!
• Example: Intermediate switch/gateway fragments packets below LB minimum
• Need to address path fragmentation or change LB min fragment size
• ACE: fragment min-mtu <bytes> (default 576 bytes)
• F5 LTM: # tmsh modify sys db tm.minipfragsize value 1
• Pre-11.6: Default = 576 bytes
• 11.6.0+: Default = 566 bytes
[Figure: a RADIUS packet with a large certificate passes through a switch with low MTU and arrives as four IP fragments (Frag1 to Frag4) of <= 512 bytes each, below the LB min frag size of 576 bytes.]
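The fragment-size problem can be checked with a little arithmetic. The following Python sketch computes IPv4 fragment sizes for a given path MTU and flags those below a load balancer minimum. The function names are hypothetical, the 1800-byte payload and 532-byte MTU are made-up inputs, and applying the minimum only to non-final fragments is an assumption; check the load balancer documentation for its exact rule.

```python
def ipv4_fragment_sizes(ip_payload_len: int, mtu: int, ip_header_len: int = 20):
    """Return total size (header + data) of each IPv4 fragment.
    Every fragment except the last carries a multiple of 8 data bytes."""
    if ip_header_len + ip_payload_len <= mtu:
        return [ip_header_len + ip_payload_len]      # fits, no fragmentation
    max_data = (mtu - ip_header_len) // 8 * 8
    sizes, remaining = [], ip_payload_len
    while remaining > 0:
        data = min(max_data, remaining)
        sizes.append(ip_header_len + data)
        remaining -= data
    return sizes


def fragments_below_minimum(sizes, lb_min_fragment: int):
    """Non-final fragments smaller than the LB minimum (assumed drop candidates)."""
    return [s for s in sizes[:-1] if s < lb_min_fragment]


# Hypothetical EAP-TLS exchange: 1800 bytes of UDP + RADIUS over a 532-byte MTU hop.
sizes = ipv4_fragment_sizes(1800, mtu=532)
print(sizes, fragments_below_minimum(sizes, lb_min_fragment=576))
```

Running the same payload with a 1500-byte MTU yields one full-size fragment and a final remainder, which passes a 576-byte minimum under the stated assumption.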
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 148
BRKSEC-3699
NAT Restrictions for RADIUS Load Balancing
Why Source NAT (SNAT) Fails for NADs
• With SNAT, LB appears as the Network
Access Device (NAD) to PSN.
• CoA sent to wrong IP address
• SNAT results in less visibility as all requests appear sourced from LB, which makes troubleshooting more difficult.
• NAS IP Address is correct, but not currently used for CoA.
• User Story 8601: CoA support for NAT'ed load balanced environments
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 149
BRKSEC-3699
SNAT of NAD Traffic: Live Log Example
Auth Succeeds/CoA Fails: CoA Sent to Load Balancer and Dropped
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 150
BRKSEC-3699
Allow Source NAT for PSN CoA Requests
Simplifying Switch CoA Configuration
• Match traffic from PSNs to UDP/1700 or UDP/3799 (RADIUS CoA) and translate to PSN cluster VIP.
• Access switch config:
• Before:
    aaa server radius dynamic-author
     client 10.1.99.5 server-key cisco123
     client 10.1.99.6 server-key cisco123
     client 10.1.99.7 server-key cisco123
     client 10.1.99.8 server-key cisco123
     client 10.1.99.9 server-key cisco123
     client 10.1.99.10 server-key cisco123
     <…one entry per PSN…>
• After:
    aaa server radius dynamic-author
     client 10.1.98.8 server-key cisco123
[Figure: CoA leaves a PSN (ISE-PSN-X, 10.1.99.x) with SRC=10.1.99.5 and reaches the access switch with SRC=10.1.98.8 after translation by the load balancer. PSNs shown: ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7).]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 151
BRKSEC-3699
Allow Source NAT for PSN CoA Requests
Simplifying WLC CoA Configuration
• Before: One RADIUS Server entry required per PSN that may send CoA from behind load balancer.
• After: One RADIUS Server entry required per load balancer VIP.
• Simplifies config and reduces # ACL entries required to permit access to each PSN.
Load Balancing
ISE Web Services
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Load Balancing with URL-Redirection
URL Redirect Web Services: Hotspot/DRW, CWA, BYOD, Posture, MDM
1. RADIUS Authentication requests sent to VIP @ 10.1.98.8 (RADIUS request to psn-vip.company.com).
2. Requests for same endpoint load balanced to same PSN via RADIUS sticky.
3. RADIUS Authorization received from VIP @ 10.1.98.8 (originated by ise-psn-3 @ 10.1.99.7) with URL Redirect to https://ise-psn-3.company.com:8443/...
4. Client browser redirected and resolves FQDN in URL to real server address (DNS Lookup = ise-psn-3.company.com, DNS Response = 10.1.99.7).
5. User sends web request directly to same PSN that serviced RADIUS request (HTTPS response from ise-psn-3.company.com).
• ISE Certificate Subject CN = ise-psn-3.company.com
[Figure: user, access device, DNS server, and load balancer (PSN-CLUSTER, VIP: 10.1.98.8) in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7).]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 154
BRKSEC-3699
Load Balancing URL-Redirected Services
When and How to Override Default URL Redirection from Client to PSN
• Use Cases for LB to Terminate redirected HTTPS Requests
• Obfuscate PSN node names/IP addresses. (Do not want PSN name exposed to browser)
• Ability to use a different certificate for user facing connection
• Apply security inspections on web-based requests
• As a way to secure PSN interfaces in DMZ.
• Requires Authorization Profile be configured with Static Hostname option.
• Load Balancer must be able to persist web request to same PSN that serviced RADIUS session. Common methods (else rely on ISE policy logic):
• LB includes Framed-IP-Address with RADIUS sticky; correlates Framed-IP to HTTPS source IP
• LB includes Session Id with RADIUS sticky; correlates Session Id in web request
Note: Since ISE assumes HTTPS for web access, offload cannot be used to increase SSL performance.
Load Balancer must reestablish SSL connection to real PSN servers.
url-redirect=https://<PSN_CN>:8443/guestportal/gateway?sessionId=SessionIdValue&action=cwa
F5 LTM loadbalancing Radius and HTTP traffic for ISE
http://www.cisco.com/c/en/us/support/docs/security/identity-services-engine/200317-F5-LTM-loadbalancing-Radius-and-HTTP-tra.html
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 155
BRKSEC-3699
URL Redirection Using Static IP/Hostname
Overriding Automatic Redirection to PSN IP Address/FQDN
• Allows static IP or FQDN value to be returned for CWA or other URL-Redirected Flows
• Common use case: Public DNS or IP address (no DNS available) must be used while
preserving variable substitution for port and sessionId variables.
Policy > Policy Elements > Results > Authorization > Authorization Profiles
DMZ PSN Certificate must match IP/Static FQDN
Specified IP Address/Hostname MUST point to the
same PSN that terminates the RADIUS session.
If multiple PSNs, requires LB persistence or AuthZ
Policy logic to ensure redirect occurs to correct PSN.
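The effect of the Static IP/Hostname option on the redirect URL can be shown with a short Python sketch. The function name and the example session ID and hostnames are hypothetical; the URL shape (port 8443, /guestportal/gateway, sessionId and action parameters) follows the url-redirect example shown earlier in this deck.

```python
from typing import Optional
from urllib.parse import urlencode


def build_redirect_url(session_id: str, psn_fqdn: str,
                       static_host: Optional[str] = None,
                       port: int = 8443, action: str = "cwa") -> str:
    """Build a CWA-style redirect URL. By default the PSN's own FQDN is used;
    a static IP/hostname, when set, replaces it while port and sessionId
    are still substituted per session."""
    host = static_host or psn_fqdn
    query = urlencode({"sessionId": session_id, "action": action})
    return f"https://{host}:{port}/guestportal/gateway?{query}"


# Default behavior: redirect to the PSN that terminated RADIUS.
print(build_redirect_url("0A0132020000001A", "ise-psn-3.company.com"))
# Static override (hypothetical public name that must still reach the same PSN).
print(build_redirect_url("0A0132020000001A", "ise-psn-3.company.com",
                         static_host="guest.company.com"))
```

Because the session ID is only known to the PSN that handled RADIUS, the static name has to resolve (directly or through LB persistence) to that same node.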
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS BRKSEC-3699
“Universal Certs”
UCC or Wildcard SAN Certificates
Universal Cert options:
• UCC / Multi-SAN
• Wildcard SAN
Certificate fields:
• CN must also exist in SAN
• Other FQDNs or wildcard as “DNS Names”
• IP Address is also an option
• Check box to use wildcards
[Screenshot values: ise-psn/Admin, ise-psn, ise-psn.company.com, mydevices.company.com, sponsor.company.com, *.ise.company.com, psn.ise.company.com]
Load Balancing
ISE Profiling Services
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Load Balancing Profiling Services
Sample Flow
1. Client OS sends DHCP Request.
2. Next hop router with IP Helper configured forwards DHCP request to real DHCP server (DHCP Request to Helper IP 10.1.1.10) and to secondary entry = LB VIP (DHCP Request to Helper IP 10.1.98.8).
3. Real DHCP server responds and provides client a valid IP address.
4. DHCP request to VIP is load balanced to PSN @ 10.1.99.7 based on source IP sticky (L3 gateway) or DHCP field parsed from request.
[Figure: user, access device, DHCP server, and load balancer (PSN-CLUSTER, VIP: 10.1.98.8) in front of ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), and ISE-PSN-3 (10.1.99.7).]
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS 159
BRKSEC-3699
Load Balancing Simplifies Device Configuration
L3 Switch Example for DHCP Relay
• Before
    !
    interface Vlan10
     description EMPLOYEE
     ip address 10.1.10.1 255.255.255.0
     ip helper-address 10.1.100.100 <--- Real DHCP Server
     ip helper-address 10.1.99.5 <--- ISE-PSN-1
     ip helper-address 10.1.99.6 <--- ISE-PSN-2
    !
• After
    !
    interface Vlan10
     description EMPLOYEE
     ip address 10.1.10.1 255.255.255.0
     ip helper-address 10.1.100.100 <--- Real DHCP Server
     ip helper-address 10.1.98.8 <--- LB VIP
    !
Settings apply to each L3 interface servicing DHCP endpoints.
© 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public
#CLUS
Load Balancing Sticky Guidelines
Ensure DHCP and RADIUS for a Given Endpoint Use Same PSN
VIP: 10.1.98.8
1. RADIUS Authentication request sent to VIP @ 10.1.98.8.
2. Request is Load Balanced to PSN-3, and entry added to Persistence Cache
3. DHCP Request is sent to VIP @ 10.1.98.8
4. Load Balancer uses the same “Sticky” as RADIUS based on client MAC address
5. DHCP is received by same PSN, thus optimizing endpoint replication
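For the DHCP and RADIUS flows to share one persistence entry, both must reduce to the same key. The Python sketch below derives that key from each protocol. It is an illustration, not load balancer code: the function names are hypothetical, and the DHCP offsets follow the standard BOOTP layout (hlen at byte 2, chaddr starting at byte 28).

```python
import re


def dhcp_client_mac(payload: bytes) -> str:
    """Extract the client hardware address (chaddr) from a BOOTP/DHCP payload."""
    if len(payload) < 44:
        raise ValueError("payload too short to contain chaddr")
    hlen = payload[2]                       # hardware address length, 6 for Ethernet
    return payload[28:28 + hlen].hex()


def radius_calling_station_mac(calling_station_id: str) -> str:
    """Normalize a Calling-Station-ID value to bare lowercase hex digits."""
    return re.sub(r"[^0-9A-Fa-f]", "", calling_station_id).lower()


# Minimal fake DHCP payload: op, htype, hlen, hops, 24 bytes of other fields,
# then chaddr (6 bytes used, padded). Real packets continue with options.
fake_dhcp = bytes([1, 1, 6, 0]) + bytes(24) + bytes.fromhex("112233445566") + bytes(10)
print(dhcp_client_mac(fake_dhcp) == radius_calling_station_mac("11-22-33-44-55-66"))
```

With both flows keyed on the same normalized MAC, a persistence cache entry such as 11:22:33:44:55:66 -> PSN-3 created by RADIUS also steers the DHCP packet to PSN-3.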
[Diagram (F5 LTM): RADIUS request to VIP (MAC: 11:22:33:44:55:66); Persistence Cache: 11:22:33:44:55:66 -> PSN-3; RADIUS response from PSN-3; DHCP Request; IP Helper sends DHCP to VIP]
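The sticky behavior in the steps above can be sketched in Python. This is an illustration only, not F5 LTM configuration: the class name, the round-robin choice for endpoints seen for the first time, and the MAC normalization are all assumptions. The point is that RADIUS and DHCP consult one table keyed on the endpoint MAC address.

```python
import itertools
import re


def normalize_mac(mac: str) -> str:
    """Reduce any common MAC notation to 12 lowercase hex digits."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())


class PersistenceCache:
    """Hypothetical sticky table: endpoint MAC -> PSN."""

    def __init__(self, psns):
        # Assumption: endpoints seen for the first time are spread round-robin.
        self._next_psn = itertools.cycle(psns)
        self._table = {}

    def select(self, mac: str) -> str:
        key = normalize_mac(mac)
        if key not in self._table:
            self._table[key] = next(self._next_psn)
        return self._table[key]


cache = PersistenceCache(["PSN-1", "PSN-2", "PSN-3"])
radius_target = cache.select("11:22:33:44:55:66")  # MAC taken from the RADIUS request
dhcp_target = cache.select("1122.3344.5566")       # same MAC parsed from the DHCP request
print(radius_target, dhcp_target)
```

Because both lookups normalize to the same key, the DHCP packet follows the RADIUS session to the same PSN regardless of how each protocol formats the MAC address.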
F5 iRule to Drop DHCP Informs courtesy of
when RULE_INIT {
    set static::DDIP_debug 1
}
when CLIENT_ACCEPTED {
    if { [UDP::payload length] > 200 } {
        # skip the 240-byte fixed header and read the options as a hex string
        binary scan [UDP::payload] x240H* dhcp_option_payload
        set option_hex 0
        set options_length [expr {([UDP::payload length] -240) * 2 }]
        for {set i 0} {$i < $options_length} {incr i [expr { $length * 2 + 2 }]} {
            # extract option value and convert into decimal
            # for human readability
            binary scan $dhcp_option_payload x[expr { $i } ]a2 option_hex
            set tmpvalue1 0x$option_hex
            set option [expr { $tmpvalue1 }]
            # move index to get length field
            incr i 2
            # extract length value and convert length from Hex string to decimal
            binary scan $dhcp_option_payload x[expr { $i } ]a2 length_hex
            set tmpvalue2 0x$length_hex
            set length [expr { $tmpvalue2 }]
            # extract value field in hexadecimal format
            binary scan $dhcp_option_payload x[expr { $i + 2} ]a[expr { $length * 2 }] value_hex
            if { $static::DDIP_debug } { log local0. "DHCP option is $option, value is $value_hex" }
            switch $option {
                53 {
                    # DHCP Message Type
                    switch $value_hex {
                        08 {
                            if { $static::DDIP_debug } {
                                log local0. "Dropping DHCP Inform packet: $value_hex"
                            }
                            drop
                            return
                        }
                        default { }
                    }
                }
            }
        }
    }
}
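The option walk performed by the iRule can be sketched in Python for offline testing against captured payloads. This is an illustrative sketch, not F5 code: the function names are hypothetical, and unlike the iRule it also skips Pad (0) options and stops at the End (255) option, neither of which carries a length byte.

```python
DHCP_OPTIONS_OFFSET = 240   # 236-byte BOOTP header + 4-byte magic cookie
OPT_PAD, OPT_END, OPT_MESSAGE_TYPE = 0, 255, 53
DHCP_INFORM = 8


def dhcp_message_type(payload: bytes):
    """Return the value of option 53, or None if it is absent or malformed."""
    i = DHCP_OPTIONS_OFFSET
    while i < len(payload):
        code = payload[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:          # single-byte option, no length field
            i += 1
            continue
        if i + 1 >= len(payload):    # truncated: no length byte present
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == OPT_MESSAGE_TYPE and len(value) == 1:
            return value[0]
        i += 2 + length              # code byte + length byte + value bytes
    return None


def should_drop(payload: bytes) -> bool:
    """Mirror the iRule's decision: drop only DHCP Inform messages."""
    return dhcp_message_type(payload) == DHCP_INFORM


def sample_payload(options: bytes) -> bytes:
    """Hypothetical test helper: zeroed BOOTP header + magic cookie + options."""
    return bytes(236) + bytes([0x63, 0x82, 0x53, 0x63]) + options


inform = sample_payload(bytes([53, 1, 8, 255]))
discover = sample_payload(bytes([0, 61, 3, 1, 2, 3, 53, 1, 1, 255]))
print(should_drop(inform), should_drop(discover))
```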
Load Balancing TACACS+
ISE-PSN-3
ISE-PSN-2
ISE-PSN-1
10.1.99.5
10.1.99.6
10.1.99.7
Device Admin Access Device
Load Balancer
Load Balancing TACACS+
Session Authentication, Authorization, and Accounting
TACACS+ Session AUTHC reply from 10.1.98.18
TACACS+ Session AUTHC request to 10.1.98.18
VIP: 10.1.98.18
ISE-CLUSTER
VLAN 99 (10.1.99.0/24)
VLAN 98 (10.1.98.0/24)
TACACS+ Session AUTHZ request to 10.1.98.18
1. NAD has single TACACS+ Server defined (10.1.98.18)
2. TACACS+ Session Authentication requests sent to VIP @ 10.1.98.18
3. Requests from same Admin user load balanced to same PSN via sticky based on
Source IP (NAD IP Address)
4. TACACS+ response received from VIP @ 10.1.98.18
(originated by real server ise-psn-3 @ 10.1.99.7 and source translated by LB)
5. TACACS+ Session Authorization & Accounting sent to/from same PSN per sticky
[Diagram callouts: (1) tacacs-server host 10.1.98.18; TACACS+ Session AUTHZ reply from 10.1.98.18; TACACS+ Session ACCTG request to 10.1.98.18; TACACS+ Session ACCTG reply from 10.1.98.18]
• Virtual IP = TACACS+ Server
• VIP listens on TCP/49
• Sticky based on source IP
Load Balancing TACACS+
General Recommendations
• Load balance based on TCP/49.
• Source NAT (SNAT) can be used; unlike RADIUS, there is no CoA to account for.
• Recommend placing the LB inline with TACACS+ traffic; otherwise TCP asymmetry must be addressed.
• Without SNAT, make sure PSNs set their default gateway to the LB internal interface IP.
• Persistence: recommend source IP address.
  • Based on the assumption that the number of TACACS+ clients is high and requests per client are low.
• Health monitoring options:
  • Simple response to TCP/49
  • 3-way handshake expected response
  • Scripts can be used to validate full auth flow.
Packet format: http://www.cisco.com/warp/public/459/tac-rfc.1.76.txt
Packet capture (encrypted): https://www.cloudshark.org/captures/1a9c284c49b0
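The simplest health monitor listed above, a TCP/49 response check, can be sketched as follows. This is a minimal sketch with a hypothetical function name. It only confirms that the three-way handshake completes and does not validate a TACACS+ authentication flow.

```python
import socket


def tcp_probe(host: str, port: int = 49, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A monitor would call something like `tcp_probe("10.1.99.5")` against each real server rather than the VIP, and mark a PSN down after a number of consecutive failures chosen by the operator.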
LDAP Server Redundancy and Load Balancing
Per-PSN LDAP Servers
• Assign unique Primary and Secondary to each PSN
• Allows each PSN to use local or regional LDAP Servers
Load Balancing LDAP Servers
[Diagram: PSN resolves ldap.company.com; LDAP servers ldap1.company.com, ldap2.company.com, ldap3.company.com (10.1.95.5, 10.1.95.6, 10.1.95.7)]
Lookup1 = ldap.company.com; Response = 10.1.95.6; LDAP Query to 10.1.95.6; LDAP Response from 10.1.95.6
Lookup2 = ldap.company.com; Response = 10.1.95.7; LDAP Query to 10.1.95.7; LDAP Response from 10.1.95.7
15 minute reconnect timer
Vendor-Specific LB Configurations
• F5 LTM
• Citrix NetScaler
• Cisco ACE
• Cisco ITD (Note)
https://communities.cisco.com/docs/DOC-64434
PSN HA Without Load Balancers
Load Balancing Web Requests Using DNS
Client-Based Load Balancing/Distribution Based on DNS Response
• Examples:
• Cisco Global Site Selector (GSS) / F5 BIG-IP GTM / Microsoft’s DNS Round-Robin feature
• Useful for web services that use static URLs including LWA, Sponsor, My Devices, OCSP.
sponsor IN A 10.1.99.5
sponsor IN A 10.1.99.6
sponsor IN A 10.2.100.7
sponsor IN A 10.2.100.8
[Diagram: clients 10.2.5.221 and 10.1.60.105 each ask the DNS SOA for company.com "What is IP address for sponsor.company.com?"; the responses shown are 10.1.99.5 and 10.2.100.8; PSNs at 10.1.99.5, 10.1.99.6, 10.2.100.7, 10.2.100.8]
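DNS round-robin for the sponsor records above can be sketched as a rotating record set. This is a simplified model with a hypothetical class name. Real DNS servers and resolvers differ in how they order and cache answers, and it assumes clients connect to the first address returned.

```python
from collections import deque


class RoundRobinRRSet:
    """Hypothetical model of a name with several A records served in rotation."""

    def __init__(self, addresses):
        self._addrs = deque(addresses)

    def answer(self):
        """Return the current record order, then rotate for the next query."""
        result = list(self._addrs)
        self._addrs.rotate(-1)   # move the first address to the end
        return result


sponsor = RoundRobinRRSet(["10.1.99.5", "10.1.99.6", "10.2.100.7", "10.2.100.8"])
first_choices = [sponsor.answer()[0] for _ in range(5)]
print(first_choices)
```

Plain rotation has no health awareness: a failed PSN keeps receiving its share of clients until its record is removed, which is one reason GSS- or GTM-style products add probes on top of the DNS response.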
ISE Configuration for Anycast
On each PSN that will participate in Anycast…
1. Configure PSN probes to profile
DHCP (IP Helper), SNMP Traps, or
NetFlow on dedicated interface
2. From CLI, configure dedicated interface
with same IP address on each PSN node.
ISE-PSN-1 Example:
#ise-psn-1/admin# config t
#ise-psn-1/admin (config)# int GigabitEthernet1
#ise-psn-1/admin (config-GigabitEthernet)# ip address 10.10.10.10 255.255.255.0
ISE-PSN-2 Example:
#ise-psn-2/admin# config t
#ise-psn-2/admin (config)# int GigabitEthernet1
#ise-psn-2/admin (config-GigabitEthernet)# ip address 10.10.10.10 255.255.255.0
Anycast address should only be applied
to ISE secondary interfaces, or LB VIP, but
never to ISE GE0 management interface.
Sample Routing Configuration for Anycast
Real-World Customer Example using Anycast with RADIUS:
http://www.networkworld.com/article/3074954/security/how-to-use-anycast-to-provide-high-availability-to-a-radius-server.html
• Access Switch 1
interface gigabitEthernet 1/0/23
 no switchport
 ip address 10.10.10.50 255.255.255.0
!
router eigrp 100
 no auto-summary
 redistribute connected route-map CONNECTED-2-EIGRP
!
route-map CONNECTED-2-EIGRP permit 10
 match ip address prefix-list 5
 set metric 1000 100 255 1 1500
 set metric-type internal
!
route-map CONNECTED-2-EIGRP permit 20
ip prefix-list 5 seq 5 permit 10.10.10.0/24
• Access Switch 2 (less preferred)
interface gigabitEthernet 1/0/23
 no switchport
 ip address 10.10.10.51 255.255.255.0
!
router eigrp 100
 no auto-summary
 redistribute connected route-map CONNECTED-2-EIGRP
!
route-map CONNECTED-2-EIGRP permit 10
 match ip address prefix-list 5
 set metric 500 50 255 1 1500
 set metric-type external
!
route-map CONNECTED-2-EIGRP permit 20
ip prefix-list 5 seq 5 permit 10.10.10.0/24
Both switches advertise the same network used for profiling, but with different metrics.
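A worked calculation shows why the second switch's seed metric is less preferred. It assumes the classic EIGRP composite metric with default K values (K1 = K3 = 1, others 0), and that the first two `set metric` arguments are bandwidth in kbps and delay in tens of microseconds. The function name is hypothetical, and path selection in a real topology also depends on the cumulative metric along each path.

```python
def eigrp_classic_metric(bandwidth_kbps: int, delay_tens_usec: int) -> int:
    """Classic EIGRP composite metric with default K values (K1 = K3 = 1)."""
    scaled_bandwidth = 10_000_000 // bandwidth_kbps
    return 256 * (scaled_bandwidth + delay_tens_usec)


switch1 = eigrp_classic_metric(1000, 100)  # set metric 1000 100 255 1 1500
switch2 = eigrp_classic_metric(500, 50)    # set metric 500 50 255 1 1500
print(switch1, switch2, "switch 1 preferred" if switch1 < switch2 else "switch 2 preferred")
```

Under these assumptions the lower bandwidth value on Access Switch 2 outweighs its lower delay value, so its advertisement of 10.10.10.0/24 carries the higher metric and is used only when the Access Switch 1 route is withdrawn.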
NAD-Based RADIUS Server Redundancy (IOS)
Multiple RADIUS Servers Defined in Access Device
• Configure Access Devices with multiple RADIUS Servers.
• Fallback to secondary servers if primary fails
PSN3 (10.7.8.9)
PSN2 (10.4.5.6)
PSN1 (10.1.2.3)
RADIUS Auth
User
radius-server host 10.1.2.3 auth-port 1812 acct-port 1813
radius-server host 10.4.5.6 auth-port 1812 acct-port 1813
radius-server host 10.7.8.9 auth-port 1812 acct-port 1813
IOS-Based RADIUS Server Load Balancing
Switch Dynamically Distributes Requests to Multiple RADIUS Servers
• RADIUS LB feature distributes batches of AAA transactions to servers within a group.
• Each batch assigned to server with least number of outstanding transactions.
[Diagram: User 1 and User 2 authenticate through the NAD, which sends RADIUS to PSN1 (10.1.2.3), PSN2 (10.4.5.6), and PSN3 (10.7.8.9)]
radius-server host 10.1.2.3 auth-port 1812 acct-port 1813
radius-server host 10.4.5.6 auth-port 1812 acct-port 1813
radius-server host 10.7.8.9 auth-port 1812 acct-port 1813
radius-server load-balance method least-outstanding batch-size 5
NAD controls the load distribution of AAA requests to all PSNs in the RADIUS group without a dedicated LB.
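The batch behavior described above can be sketched in Python. This is a simplified model and not the IOS implementation: the class and method names are hypothetical, and ties are broken by configuration order as an assumption. Each batch of transactions goes to the server with the fewest outstanding transactions at the moment the batch starts.

```python
class LeastOutstandingBalancer:
    """Hypothetical model of least-outstanding load balancing with batches."""

    def __init__(self, servers, batch_size=5):
        self.outstanding = {server: 0 for server in servers}
        self.batch_size = batch_size
        self._current = None
        self._left_in_batch = 0

    def assign(self) -> str:
        """Pick the server for the next transaction."""
        if self._left_in_batch == 0:
            # New batch: choose the server with the fewest outstanding
            # transactions (first configured server wins a tie).
            self._current = min(self.outstanding, key=self.outstanding.get)
            self._left_in_batch = self.batch_size
        self._left_in_batch -= 1
        self.outstanding[self._current] += 1
        return self._current

    def complete(self, server: str) -> None:
        """Record that a server answered one transaction."""
        self.outstanding[server] -= 1


lb = LeastOutstandingBalancer(["10.1.2.3", "10.4.5.6", "10.7.8.9"], batch_size=5)
assignments = [lb.assign() for _ in range(15)]
print(assignments[0], assignments[5], assignments[10])
```

A larger batch size reduces how often the NAD re-evaluates server load, at the cost of coarser distribution.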
NAD-Based RADIUS Redundancy (WLC)
Wireless LAN Controller
• Multiple RADIUS Auth & Accounting Server Definitions
• RADIUS Fallback options: none, passive, or active
http://www.cisco.com/en/US/products/ps6366/products_configuration_example09186a008098987e.shtml
Off = Continue exhaustively through list; never preempt to preferred server (entry with lowest index)
Passive = Quarantine failed RADIUS server for interval then return to active list w/o validation; always preempt.
Active = Mark failed server dead then actively probe status per interval w/username until succeed before return to list; always preempt.
Password=Username
NAD Fallback and Recovery
NAD Fallback and Recovery Sequence
[Diagram: Endpoint, Layer 2 Point-to-Point, Access Switch, Layer 3 Link, Policy Service Node; port starts in Access VLAN 10 (or Authorized VLAN)]
Dead Detection: Auth Request, then three Retries; each waits 15 sec (Auth-Timeout) with no response; result is SERVER DEAD
 radius-server dead-criteria time 15 tries 3
Deadtime: Authorize Critical VLAN 11; traffic permitted on Critical VLAN per port ACL; Wait Deadtime = 2 minutes; Deadtime Test requests are sent, followed by a Deadtime Test reply
 radius-server deadtime 2
 authentication event server dead action reinitialize vlan 11
Recovery: SERVER ALIVE; Reinitialize Port / Set Access VLAN per Recovery Interval; traffic permitted per RADIUS authorization; Idle-Time Test request sent per 60 minute Idle-Time
 authentication event server alive action reinitialize
 authentication critical recovery delay 1000
 radius-server host ... test username radtest idle-time 60 key cisco123
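The timing of the sequence above can be worked through with a short sketch. It assumes one initial request plus three retries, each waiting the full 15 second timeout, followed by the 2 minute deadtime. The function name is hypothetical, and actual IOS dead-criteria evaluation may differ from this simplification.

```python
def dead_detection_timeline(timeout_s=15, retries=3, deadtime_min=2):
    """Return (seconds, event) pairs for a server that never answers."""
    events = []
    t = 0
    for attempt in range(1 + retries):
        label = "Auth Request" if attempt == 0 else f"Retry {attempt}"
        events.append((t, label))
        t += timeout_s               # wait the full Auth-Timeout, no response
    events.append((t, "SERVER DEAD"))
    events.append((t + deadtime_min * 60, "Deadtime expires"))
    return events


for seconds, event in dead_detection_timeline():
    print(f"{seconds:>4}s  {event}")
```

With the values shown on the slide, an endpoint waits about a minute before the port moves to the critical VLAN, which is worth weighing against supplicant timeouts when tuning these timers.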
RADIUS Test User Account
Which User Account Should Be Used?
• Does NAD uniformly treat Auth Fail and Success the same for detecting server health?
IOS treats them the same; F5 RADIUS probe treats Auth Fail = “server down”. Check your LB behavior.
• Do I use an Internal or External ID store account?
If goal is to validate backend ID store, then Auth Fail may not detect external ID store failure.
• IOS Example: Failover on AD failure. Solution: Drop auth requests when external ID store is down.
• Identity Source Sequence > Advanced Settings
• Authentication Policy > ID Source custom processing based on authentication results
• ACE Example: If auth fails, then PSN declared down. Solution: Create valid user account so ACE test probes return Access-Accept.
• Could this present a potential security risk?
Inaccessible Authentication Bypass (IAB)
Also Known As “Critical Auth VLAN” for Data
• Switch detects PSN unavailable by one of two methods
• Periodic probe
• Failure to respond to AAA request
• Enables port in critical VLAN
• Existing sessions retain authorization status
• Recovery action can re-initialize port when AAA returns
[Diagram: Client, Access Switch, WAN / Internet, PSN; on "WAN or PSN Down" the port moves from Access VLAN to Critical VLAN]
Critical Data VLAN can be anything:
• Same as default access VLAN
• Same as guest/auth-fail VLAN
• New VLAN
authentication event server dead action authorize vlan 100
authentication event server alive action reinitialize
authentication event server dead action authorize voice <--- Critical Voice VLAN
Critical Auth for Data and Voice
interface GigabitEthernet 3/48
 dot1x pae authenticator
 authentication port-control auto
 authentication event server dead action authorize vlan x <--- Data VLAN Enabled
 authentication event server dead action authorize voice <--- Voice VLAN Enabled
# show authentication sessions interface fa3/48
…
Critical Authorization is in effect for domain(s) DATA and VOICE
Default Port ACL Issues with Critical VLAN
Limited Access Even After Authorization to New VLAN!
• Data VLAN reassigned to critical auth VLAN, but new (or reinitialized) connections are still restricted by existing port ACL!
[Diagram: WAN or PSN Down; port Gi1/0/2 moves from Access VLAN to Critical VLAN; Voice VLAN; Default ACL]
interface GigabitEthernet1/0/2
 switchport access vlan 10
 switchport voice vlan 13
 ip access-group ACL-DEFAULT in
 authentication event server dead action reinitialize vlan 11
 authentication event server dead action authorize voice
 authentication event server alive action reinitialize
ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp
Only DHCP/DNS/PING/TFTP allowed!
Using Embedded Event Manager with Critical VLAN
Modify or Remove/Add Static Port ACLs Based on PSN Availability
• Allows scripted actions to occur based on various conditions and triggers
track 1 ip route 10.1.98.0 255.255.255.0 reachability
event manager applet default-acl-fallback
 event track 1 state down maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "ip access-list extended ACL-DEFAULT"
 action 3.0 cli command "1 permit ip any any"
 action 4.0 cli command "end"
event manager applet default-acl-recovery
 event track 1 state up maxrun 5
 action 1.0 cli command "enable"
 action 1.1 cli command "conf t" pattern "CNTL/Z."
 action 2.0 cli command "ip access-list extended ACL-DEFAULT"
 action 3.0 cli command "no 1 permit ip any any"
 action 4.0 cli command "end"
EEM available on Catalyst 3k/4k/6k switches
https://supportforums.cisco.com/document/117596/cisco-eem-basic-overview-and-sample-configurations
https://supportforums.cisco.com/document/48891/cisco-eem-best-practices
Critical ACL using Service Policy Templates
Apply ACL, VLAN, or SGT on RADIUS Server Failure!
• Critical Auth ACL applied on Server Down
[Diagram: WAN or PSN Down; port Gi1/0/2 moves from Access VLAN to Critical VLAN; Voice VLAN; Default ACL]
interface GigabitEthernet1/0/2
 switchport access vlan 10
 switchport voice vlan 13
 ip access-group ACL-DEFAULT in
 access-session port-control auto
 mab
 dot1x pae authenticator
 service-policy type control subscriber ACCESS-POLICY
ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp
Only DHCP/DNS/PING/TFTP allowed!
Critical ACL
Deny PCI networks; Permit Everything Else !
policy-map type control subscriber ACCESS-POLICY
 event authentication-failure match-first
  10 class AAA_SVR_DOWN_UNAUTHD do-until-failure
   10 activate service-template CRITICAL_AUTH_VLAN
   20 activate service-template DEFAULT_CRITICAL_VOICE_TEMPLATE
   30 activate service-template CRITICAL-ACCESS
!
service-template CRITICAL-ACCESS
 access-group ACL-CRITICAL
!
service-template CRITICAL_AUTH_VLAN
 vlan 10
!
service-template DEFAULT_CRITICAL_VOICE_TEMPLATE
 voice vlan
!
class-map type control subscriber match-all AAA_SVR_DOWN_UNAUTHD
 match result-type aaa-timeout
 match authorization-status unauthorized
!
ip access-list extended ACL-DEFAULT
 permit udp any eq bootpc any eq bootps
 permit udp any any eq domain
 permit icmp any any
 permit udp any any eq tftp
!
ip access-list extended ACL-CRITICAL
 remark Deny access to PCI zone scopes
 deny tcp any 172.16.8.0 255.255.240.0
 deny udp any 172.16.8.0 255.255.240.0
 deny ip any 192.168.0.0 255.255.0.0
 permit ip any any
2k/3k/4k: 15.2(1)E; 3k IOS-XE: 3.3.0SE; 4k: IOS-XE 3.5.0E; 6k: 15.2(1)SY
Critical MAB
Local Authentication During Server Failure
policy-map type control subscriber ACCESS-POL
 ...
 event authentication-failure match-first
  10 class AAA_SVR_DOWN_UNAUTHD_HOST do-until-failure
   10 terminate mab
   20 terminate dot1x
   30 authenticate using mab aaa authc-list mab-local authz-list mab-local
 ...
[Diagram: hosts 000c.293c.8dca and 000c.293c.331e; WAN unavailable]
• Additional level of check to authorize hosts during a critical condition.
• EEM scripts could be used for dynamic update of whitelist MAC addresses.
• Sessions re-initialize once the server connectivity resumes.
username 000c293c8dca password 0 000c293c8dca
username 000c293c8dca aaa attribute list mab-local
!
aaa local authentication default authorization mab-local
aaa authorization credential-download mab-local local
!
aaa attribute list mab-local
 attribute type tunnel-medium-type all-802
 attribute type tunnel-private-group-id "150"
 attribute type tunnel-type vlan
 attribute type inacl "CRITICAL-V4"
!
Monitoring Load and System Health
Home Dashboard - High-Level Server Health
Server Health/Utilization Reports
Operations > Reports > Diagnostics > Health Summary
Key Performance Metrics (KPM)
• KPM Reports added in ISE 2.2: Operations > Reports > Diagnostics > KPM
• Also available from CLI (# application configure ise) since ISE 1.4
• Provide RADIUS Load, Latency, and Suppression Stats
Serviceability Counter Framework (CF)
The Easy Way: MnT auto-collects key metrics from each node!
• Enable/disable from ‘app configure ise’
• Enabled by default
• Thresholds are hard set by platform size
• Alarm sent when a threshold is exceeded
• Running count displayed per collection interval
[Screenshot callouts: Detected platform size; Thresholds; Node specific report]
Key Takeaway Points
• CHECK ISE Virtual Appliances for proper resources and platform detection!
• Avoid excessive auth activity through proper NAD / supplicant tuning and Log Suppression
• Minimize data replication by implementing node groups and profiling best practices
• Leverage load balancers for scale, high availability, and simplifying network config changes
• Be sure to have a local fallback plan on your network access devices
Cisco Community Page on Sizing and Scalability
https://communities.cisco.com/docs/DOC-68347
ISE Performance & Scale Resources
https://communities.cisco.com/docs/DOC-65625
• Community Page
• Cisco Live: BRKSEC-3699 Reference version
• ISE Load Balancing Design Guide (be sure to read customer notes at bottom of download page: guide errata!)
• Calculators for Bandwidth and Logging
Thank you
Designing ISE for Scale & High Availability.pdf

  • 2. #CLUS Craig Hyps, Principal Engineer BRKSEC-3699 Designing ISE for Scale & High Availability
  • 3. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Session Abstract Cisco Identity Services Engine (ISE) delivers context-based access control for every endpoint that connects to your network. This session will show you how to design ISE to deliver scalable and highly available access control services for wired, wireless, and VPN from a single campus to a global deployment. Focus is on design guidance for distributed ISE architectures including high availability for all ISE nodes and their services as well as strategies for survivability and fallback during service outages. Methodologies for increasing scalability and redundancy will be covered such as load distribution with and without load balancers, optimal profiling design, and the use of Anycast. Attendees of this session will gain knowledge on how to best deploy ISE to ensure peak operational performance, stability, and to support large volumes of authentication activity. Various deployment architectures will be discussed including ISE platform selection, sizing, and network placement. BRKSEC-3699 3 Cisco Identity Services Engine (ISE) delivers context-based access control for every endpoint that connects to your network. This session will show you how to design ISE to deliver scalable and highly available access control services for wired, wireless, and VPN from a single campus to a global deployment. Focus is on design guidance for distributed ISE architectures including high availability for all ISE nodes and their services as well as strategies for survivability and fallback during service outages. Methodologies for increasing scalability and redundancy will be covered such as load distribution with and without load balancers, optimal profiling design, and the use of Anycast. Attendees of this session will gain knowledge on how to best deploy ISE to ensure peak operational performance, stability, and to support large volumes of authentication activity. 
Various deployment architectures will be discussed including ISE platform selection, sizing, and network placement.
  • 4. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 4 ISE Sessions @Live Orlando 2018 BRKSEC-2059 Deploying ISE in a Dynamic Environment Clark Gambrel Monday 1:30-3:30 BRKSEC-3699 Designing ISE for Scale & High Availability Craig Hyps Thursday 8:00-10:00 You are here TECSEC-2672 Identity Services Engine 2.4 Best Practices Jesse Dubois, Eugene Korneychuk, Kevin Redmon, Vivek Santuka Monday 9:00-6:00 Monday Wednesday Thursday Sunday BRKSEC-3697 Advanced ISE Services, Tips & Tricks Craig Hyps, Wednesday 8:00-10:00 BRKCOC-2018 Inside Cisco IT: How Cisco Deployed ISE and Group Based Policies throughout the Enterprise Raj Kumar, David Iacobacci Wednesday 8:30-10:00 BRKSEC-2464 Lets get practical with your network security by using Cisco ISE Imran Bashir, Wednesday 10:30-12:00 BRKSEC-2695 Building an Enterprise Access Control Architecture using ISE and Group Based Policies Imran Bashir, Wednesday 1:30-3:30 BRKSEC-2039 Cisco Medical Device Segmentation Tim Lovelace, Mark Bernard Thursday 1:00-2:30 BRKSEC-2038 Security for the Manufacturing Floor - The New Frontier Shaun Muller Thursday 10:30-12:00 You Are Here BRKSEC-3699
  • 5. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Important: Hidden Slide Alert Look for this “For Your Reference” Symbol in your PDF’s There is a tremendous amount of hidden content, for you to use later! ~500 +/- Slides in Session Reference PDF Available on ciscolive.com For Your Reference BRKSEC-3699 5
  • 6. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Cisco Webex Teams Questions? Use Cisco Webex Teams (formerly Cisco Spark) to chat with the speaker after the session Find this session in the Cisco Events App Click “Join the Discussion” Install Webex Teams or go directly to the team space Enter messages/questions in the team space How Webex Teams will be moderated by the speaker until June 18, 2018. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 6 1 2 3 4 6 cs.co/ciscolivebot#BRKSEC-3699 BRKSEC-3699
  • 7. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Where can I get help after Cisco Live? BRKSEC-3699 7 ISE Public Community http://cs.co/ise-community Questions answered by ISE TMEs and other Subject Matter Experts – the same persons that support your local Cisco and Partner SEs! ISE Compatibility Guides http://cs.co/ise-compatibility ISE Design Guides http://cs.co/ise-guides Courtesy of Thomas Howard
  • 8. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS • Sizing Deployments and Nodes • Bandwidth and Latency • Scaling ISE Services • RADIUS, Guest, Web Services, Compliance, TACACS+ • Profiling and Database Replication • MnT (Optimize Logging and Noise Suppression) Agenda BRKSEC-3699 8 • High Availability • Appliance Redundancy • Admin, MnT, and pxGrid Nodes • PSN Redundancy with and without Load Balancing • NAD Fallback and Recovery • Monitoring Load and System Health Time Permitting
  • 10. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS ISE 2.4 Scaling by Deployment/Platform/Persona Max Concurrent Session Counts by Deployment Model and Platform • By Deployment • By PSN Deployment Model Platform Max Active Sessions per Deployment Max # Dedicated PSNs / PXGs Min # Nodes (no HA) / Max # Nodes (w/ HA) Stand- alone All personas on same node 3515 7,500 0 1 / 2 3595 20,000 0 1 / 2 Hybrid PAN+MnT+PXG on same node; Dedicated PSN 3515 as PAN+MNT 7,500 5 / 2* 2 / 7 3595 as PAN+MNT 20,000 5 / 2* 2 / 7 Dedicated Dedicated PAN and MnT nodes 3595 as PAN and MNT 500,000 50 / 2 3 / 58 3595 as PAN and Large MNT 500,000 50 / 4 3 / 58 Scaling per PSN Platform Max Active Sessions per PSN Dedicated Policy nodes (Max Sessions Gated by Total Deployment Size) SNS-3515 7,500 SNS-3595 40,000 Each dedicated pxGrid node reduces PSN count by 1 (Medium deployment only) * BRKSEC-3699 10 Max Active Sessions != Max Endpoints; ISE 2.1+ supports 1.5M Endpoints
  • 11. Sizing Production VMs to Physical Appliances: Summary
      Appliance used for sizing comparison | # Cores | Clock Rate* (GHz) | Memory (GB) | Physical Disk (GB)**
      SNS-3415 | 4 | 2.4 | 16 | 600
      SNS-3495 | 8 | 2.4 | 32 | 600
      SNS-3515 | 6 | 2.3 | 16 | 600
      SNS-3595 | 8 | 2.6 | 64 | 1,200
      * Minimum VM processor clock rate = 2.0GHz per core (same as OVA).
      ** Actual disk requirement is dependent on persona(s) deployed and other factors. See slide on Disk Sizing.
      Warning: # Cores is not always equal to # Logical processors / vCPUs due to Hyper Threading
  • 12. ISE Platform Properties: Minimum VM Resource Allocation
      Minimum CPUs | Minimum RAM (GB) | Minimum Disk | Platform Profile
      2 | 4 | 100GB | EVAL
      4 | 4 | 200GB | IBM_SMALL_MEDIUM
      4 | 4 | 200GB | IBM_LARGE
      4 | 16 | 200GB | UCS_SMALL
      8 | 32 | 200GB | UCS_LARGE
      12 | 16 | 200GB | SNS_3515
      16 | 64 | 200GB | SNS_3595
      16 | 256 | 200GB | SNS_3595 <large>
      • Least Common Denominator used to set platform.
      • Example: 4 cores 32GB RAM = UCS_SMALL
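The "least common denominator" rule on this slide can be sketched as a table lookup. This is a minimal illustration, not ISE's actual detection logic: the function name `detect_profile` is hypothetical, and the tie-break (take the last row in slide order whose CPU and RAM minimums are both met) is an assumption that happens to reproduce the slide's example.

```python
# Rows copied from the slide, in slide order: (min CPUs, min RAM in GB, profile).
PROFILE_TABLE = [
    (2, 4, "EVAL"),
    (4, 4, "IBM_SMALL_MEDIUM"),
    (4, 4, "IBM_LARGE"),
    (4, 16, "UCS_SMALL"),
    (8, 32, "UCS_LARGE"),
    (12, 16, "SNS_3515"),
    (16, 64, "SNS_3595"),
    (16, 256, "SNS_3595 <large>"),
]


def detect_profile(cpus, ram_gb):
    """Return the last row (assumed tie-break) whose minimums are both met."""
    matched = None
    for min_cpus, min_ram, name in PROFILE_TABLE:
        if cpus >= min_cpus and ram_gb >= min_ram:
            matched = name
    return matched


# Slide example: 4 cores with 32 GB RAM. The CPU count is the limiting
# resource, so the extra RAM does not lift the node to a larger profile.
print(detect_profile(4, 32))
```

The point of the sketch is that over-allocating one resource does not help: the scarcer of CPU and RAM decides the profile.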
  • 13. Why Do I Care?
      Because memory, max sessions, and other table spaces are based on Persona and Platform Profile.
  • 14. ISE OVA Templates: Summary
      OVA Template | # CPUs | Clock Rate (GHz) | Total CPU (MHz) | Virtual Memory (GB) | Virtual NICs (GB) | Virtual Disk Size (Target Node Type)
      Eval | 2 | 2.3 | 4,600 | 8 | 4 | 200GB (EVAL)
      SNS3415 | 4 | 2.0 | 8,000 | 16 | 4 | 200GB (PSN/PXG); 600GB (PAN/MnT)
      SNS3495 | 8 | 2.0 | 16,000 | 32 | 4 | 200GB (PSN/PXG); 600GB (PAN/MnT)
      SNS3515 | 6 | 2.0 | 12,000 | 16 | 6 | 200GB (PSN/PXG); 600GB (PAN/MnT)
      SNS3595 | 8 | 2.0 | 16,000 | 64 | 6 | 200GB (PSN/PXG); 1.2TB (PAN/MnT)
      • For 35x5 ISE VMs, HyperThreading is mandatory.
      • CSCvh71644 - VMware OVA templates for SNS-35xx are not detected correctly…
      Overlay values shown on slide: 12, 16
  • 15. ISE Platform Properties: Verify ISE Detects Proper VM Resource Allocation
      • From CLI:
          ise-node/admin# show tech | begin PlatformProperties
      • From Admin UI (ISE 2.2+):
          Operations > Reports > Diagnostics > ISE Counters > [node] (under ISE Profile column)
      Example value shown: UCS_SMALL
  • 16. ISE VM Disk Storage Requirements: Minimum Disk Sizes by Persona
      Persona | Disk (GB)
      Standalone | 200+*
      Administration Only | 200-300**
      Monitoring Only | 200+*
      Policy Service Only | 200
      PAN + MnT | 200+*
      PAN + MnT + PSN | 200+*
      • Upper range sets #days MnT log retention
      • Min recommended disk for MnT = 600GB
      • Max hardware appliance disk size = 1.2TB
      • Max virtual appliance disk size = 2TB
      CSCvb75235 - DOC ISE VM installation can't be done if disk is greater than or equals to 2048 GB or 2 TB
      ** Variations depend on where backups are saved or upgrade files staged (local or repository), debug, local logging, and data retention requirements.
  • 17. VM Disk Allocation
      CSCvc57684 Incorrect MnT allocations if setup with VM disk resized to larger without ISO re-image
      • ISE OVAs prior to ISE 2.2 sized to 200GB. Often sufficient for PSNs or pxGrid nodes but not MnT.
      • Misconception: Just get bigger tank and ISE will grow into it!
      • No auto-resize of ISE partitions when disk space is added after initial software install
      • Requires re-image using .iso
      • Alternatively: Start with larger OVA (ISE 2.2)
      [Diagram: MnT on ISE 200GB OVA, total ISE disk = 200GB; added 400GB VM disk is accessible to VM but not ISE]
  • 18. MnT Node Log Storage Requirements for RADIUS
      Days Retention Based on # Endpoints and Disk Size (ISE 2.2)
      Total Endpoints | Total Disk Space Allocated to MnT Node: 200 GB | 400 GB | 600 GB | 1024 GB | 2048 GB
      5,000 | 504 | 1007 | 1510 | 2577 | 5154
      10,000 | 252 | 504 | 755 | 1289 | 2577
      25,000 | 101 | 202 | 302 | 516 | 1031
      50,000 | 51 | 101 | 151 | 258 | 516
      100,000 | 26 | 51 | 76 | 129 | 258
      150,000 | 17 | 34 | 51 | 86 | 172
      200,000 | 13 | 26 | 38 | 65 | 129
      250,000 | 11 | 21 | 31 | 52 | 104
      500,000 | 6 | 11 | 16 | 26 | 52
      Assumptions:
      • 10+ auths/day per endpoint
      • Log suppression enabled
      Based on 60% allocation of MnT disk to RADIUS logging (prior to ISE 2.2, only 30% allocation)
      ISE 2.2 = 50% days increase over 2.0/2.1
      ISE 2.3 = 25-33% increase over 2.2
      ISE 2.4 = 40-60% increase over 2.2
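The retention figures in this table follow a simple proportional pattern, which the sketch below tries to capture. Only the 60% RADIUS allocation comes from the slide. The 50 KB of RADIUS log data per endpoint per day and the round-up to whole days are assumptions inferred by fitting the published numbers, and `radius_retention_days` is a hypothetical helper name.

```python
def radius_retention_days(disk_gb, endpoints,
                          kb_per_endpoint_day=50, radius_share_pct=60):
    """Estimate days of RADIUS log retention on an MnT node.

    kb_per_endpoint_day is an inferred value, not a documented ISE constant.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    # KB available to RADIUS logging, scaled by 100 to keep the percentage exact.
    radius_kb_x100 = disk_gb * 1024 * 1024 * radius_share_pct
    daily_kb_x100 = endpoints * kb_per_endpoint_day * 100
    # Ceiling division: partial days are rounded up (assumed).
    return -(-radius_kb_x100 // daily_kb_x100)


print(radius_retention_days(200, 5000))      # smallest disk, fewest endpoints
print(radius_retention_days(2048, 500000))   # largest disk, most endpoints
```

Passing `radius_share_pct=30` gives a rough feel for the pre-2.2 allocation mentioned on the slide, though the per-endpoint log size for older releases is not given in the source.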
  • 19. RADIUS and TACACS+ MnT Log Allocation
      Administration > System > Maintenance > Operational Data Purging
      • 60% of total disk allocated to both RADIUS and TACACS+ for logging (previously fixed at 30% and 20%)
      • Purge @ 80% (First In-First Out)
      • Optional archive of CSV to repository
      • Default retention reduced from 90 to 30 days
      Screenshot values: Total Log Allocation 384 GB; 80% Purge; M&T_PRIMARY: Radius: 67 GB, Days: 24
  • 20. ISE VM Disk Provisioning Guidance
      • Please! No Snapshots!
          • Snapshots NOT supported; no option to quiesce database prior to snapshot.
          • VMotion supported but storage motion not QA tested.
          • Recommend avoiding VMotion due to snapshot restrictions.
      • Thin Provisioning supported
      • Thick Provisioning highly recommended (especially for PAN and MnT)
      • No specific storage media and file system restrictions.
          • For example, VMFS is not required and NFS is allowed, provided storage is supported by VMware and meets ISE IO performance requirements.
      IO Performance Requirements:
      • Read 300+ MB/sec
      • Write 50+ MB/sec
      Recommended disk/controller:
      • 10k RPM+ disk drives
      • Supercharge with SSD!
      • Caching RAID Controller
      • RAID mirroring (slower writes using RAID 5*)
      * RAID performance levels:
      http://www.datarecovery.net/articles/raid-level-comparison.html
      http://docs.oracle.com/cd/E19658-01/820-4708-13/appendixa.html
  • 21. ISE VM Provisioning Guidance
      • Use reservations (built into OVAs)
      • Do not oversubscribe!
      Customers with VMware expertise may choose to disable resource reservations and over-subscribe, but do so at their own risk.
  • 22. Introducing “Super” MnT: For Any Deployment where High-Perf MnT Operations Required
      • Virtual Appliance Only option in ISE 2.4
      • Requires Large VM License
      • 3595 specs + 256 GB
          • 8 cores @ 2GHz min (16000+ MHz) = 16 logical processors
          • 256GB RAM
          • Up to 2TB* disk w/ fast I/O
      • Fast I/O Recommendations:
          • Disk Drives (10k/15k RPM or SSD)
          • Fast RAID w/Caching (ex: RAID 10)
          • More disks (ex: 8 vs 4)
      * CSCvb75235 - DOC ISE VM installation can't be done if disk is greater than or equals to 2048 GB or 2 TB
  • 23. ISE 2.4 MnT: Fast Access to Logs and Reports
      [Screenshots: Live Logs / Live Sessions; Reports]
  • 24. ISE 2.4 MnT Vertical Scaling: Scaling Enhancements
      Faster Live Log Access
      • Run session directory tables from pinned memory
      • Tables optimized for faster queries
      Faster Report & Export Performance
      • Report related tables pinned into memory for faster retrieval.
      • Optimize tables based on platform capabilities.
      Collector Throughput Improvement
      • Added multithreaded processing capability to collector.
      • Increased collector socket buffer size to avoid packet drops.
      Major Data Reduction
      • Remove detailed BLOB data > 7 days old (beyond 2.3 reductions)
      • Database optimizations resulting in up to 80% efficiencies
      Benefits MnT on ALL ISE platforms
  • 25. Flash Removal (ISE 2.4)
      • “No Flash”
      • C’mon, you mean just a little bit of flash, right?
      • No. I’m Saying No Flash! There is no Flash in this product!
      And no Yahoo! User Interface Library (YUI)
  • 27. Bandwidth and Latency
      • Starting in ISE 2.1: 300ms max round-trip (RT) latency between any two ISE nodes
      • Bandwidth most critical between:
          • PSNs and Primary PAN (DB Replication)
          • PSNs and MnT (Audit Logging)
      • Latency most critical between PSNs and Primary PAN.
      • RADIUS generally requires much less bandwidth and is more tolerant of higher latencies. Actual requirements are based on many factors including # endpoints, auth rate, and protocols.
      [Diagram: PAN and MnT node pairs connected to distributed PSNs; WLC and Switch send RADIUS to the PSNs]
  • 28. Have I Told You My Story Over Latency Yet?
      “Over Latency?” “No. I Don’t Think I’ll Ever Get Over Latency.”
      • Latency guidance is not a “fall off the cliff” number, but a guard rail based on what QA has tested.
      • Not all customers have issues with > 300ms, while others may have issues with < 100ms latency due to overall ISE design and deployment.
      • Profiler config is the primary determinant of replication requirements between PSNs and PAN, which translates to latency.
      • When providing guidance, max 300ms round-trip latency is the correct response from SEs for their customers to design against.
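As a design-time check against the 300ms guard rail, measured round-trip times between node pairs can be screened with a few lines. This is only a sketch: the node names and RTT values below are invented for illustration, and `pairs_over_guardrail` is a hypothetical helper, not an ISE tool.

```python
MAX_RTT_MS = 300  # guard rail from the slides (ISE 2.1+), not a hard failure point


def pairs_over_guardrail(rtt_ms, limit_ms=MAX_RTT_MS):
    """Return the node pairs whose measured round-trip time exceeds the limit."""
    return sorted(pair for pair, rtt in rtt_ms.items() if rtt > limit_ms)


# Hypothetical measurements in milliseconds between ISE nodes.
measured = {
    ("pan-1", "psn-emea"): 140,
    ("pan-1", "psn-apac"): 340,
    ("mnt-1", "psn-apac"): 310,
}

for pair in pairs_over_guardrail(measured):
    print("review design for", pair)
```

Pairs flagged here are candidates for the options discussed on the following slides, such as a separate ISE instance for the distant region.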
  • 29. What if Distributed PSNs > 300ms RTT Latency?
      [Diagram: sites at < 300 ms and > 300 ms]
  • 30. Option #1: Deploy Separate ISE Instances
      Per-Instance Latency < 300ms
      [Diagram: separate ISE instances, each with its own PAN, MnT, and PSNs, serving WLCs and switches over RADIUS; links labeled < 300 ms and > 300 ms]
  • 31. ISE Bandwidth Calculator: Updated for ISE 2.1+
      Note: Bandwidth required for RADIUS traffic is not included. Calculator is focused on inter-ISE node bandwidth requirements.
      Available to customers @ https://communities.cisco.com/docs/DOC-64317
  • 33. Scaling ISE Services: Agenda
      • Auth Policy and Service Scale
      • Guest and Web Authentication and Location Services
      • Compliance Services: Posture and MDM
      • Scaling TACACS+
      • Profiling and Database Replication
      • MnT (Optimize Logging and Noise Suppression)
  • 34. ISE Personas and Services: Enable Only What Is Needed!!
      • ISE Personas:
          • PAN
          • MNT
          • PSN
          • pxGrid
      • PSN Services:
          • Session
          • Profiling
          • TC-NAC
          • ISE SXP
          • Device Admin (TACACS+)
          • Passive Identity (Easy Connect)
      • Avoid unnecessary overload of PSN services
      • Some services should be dedicated to one or more PSNs
      Session Services includes base user services such as RADIUS, Guest, Posture, MDM, BYOD/CA
  • 35. Scaling RADIUS, Web, Profiling, and TACACS+ w/LB
      • Policy Service nodes can be configured in a cluster behind a load balancer (LB).
      • Access Devices send RADIUS and TACACS+ AAA requests to LB virtual IP.
      Load Balancing is covered under the High Availability section.
      [Diagram: Network Access Devices and VPN > Load Balancers (Virtual IP) > PSNs (User Services)]
  • 36. Auth Policy Optimization (ISE 2.3): Bad Example
      Condition order:
      1. AD Groups
      2. AD Attributes
      3. MDM
      4. Certificate
      5. ID Group
      6. SQL Attributes
      7. Auth Method
      8. Endpoint Profile
      9. Location
      • Policy Logic:
          • First Match, Top Down
          • Skip Rule on first negative match
      • More specific rules generally at top
  • 37. Auth Policy Optimization (ISE 2.3): Better Example!
      Condition order (grouped into Blocks 1 to 4 on the slide):
      1. Location
      2. Auth Method
      3. Endpoint Profile
      4. AD Groups
      5. AD Attributes
      6. ID Group
      7. Certificate
      8. SQL Attributes
      9. MDM
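One way to read the reordering between the "bad" and "better" examples is that conditions answerable from the request itself (Location, Auth Method) are checked before conditions that need an external lookup (AD, SQL, MDM), so a rule can be skipped on its first negative match without leaving the PSN. The sketch below illustrates that reading only; it is not ISE's policy engine, and the condition names, session values, and `evaluate_rule` helper are hypothetical.

```python
def evaluate_rule(conditions, session, external_calls):
    """Evaluate conditions in order and skip the rule on the first negative match."""
    for name, check, needs_external_lookup in conditions:
        if needs_external_lookup:
            external_calls.append(name)  # record each simulated external query
        if not check(session):
            return False
    return True


session = {"location": "Branch", "auth_method": "MAB", "ad_group": "Employees"}

# (name, predicate, whether it needs an external identity store)
location = ("Location", lambda s: s["location"] == "HQ", False)
auth_method = ("Auth Method", lambda s: s["auth_method"] == "802.1X", False)
ad_group = ("AD Groups", lambda s: s["ad_group"] == "Employees", True)

bad_calls, better_calls = [], []
evaluate_rule([ad_group, auth_method, location], session, bad_calls)
evaluate_rule([location, auth_method, ad_group], session, better_calls)

print("external lookups, AD first:", bad_calls)
print("external lookups, Location first:", better_calls)
```

Neither ordering matches this session, but only the first one pays for a directory lookup before finding that out.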
  • 38. ISE 2.4 Auth Policy Scale (For Your Reference)
      • Max Policy Sets = 200 (up from 100 in 2.2; up from 40 in 2.1)
      • Max Authentication Rules = 1000 (up from 200 in 2.2; up from 100 in 2.1)
      • Max Authorization Rules = 3000 (up from 700 in 2.2; up from 600 in 2.1)
      • Max Authorization Profiles = 3200 (up from 1000 in 2.2; up from 600 in 2.1)
  • 39. Dynamic Variable Substitution: Rule Reduction
      • Authorization Policy Conditions
      • Authorization Profile Conditions
      ID Store Attribute:
      • Match conditions to unique values stored per-User/Endpoint in internal or external ID stores (AD, LDAP, SQL, etc.)
      • ISE supports custom User and Endpoint attributes
  • 40. Enable EAP Session Resume / Fast Reconnect (For Your Reference)
      Major performance boost, but not a complete auth, so avoid excessive timeout values.
      • Cache TLS session: skip inner method
      • Cache TLS (TLS Handshake Only/Skip Cert)
      Note: Both Server and Client must be configured for Fast Reconnect
      [Screenshot: Win 7 Supplicant]
  • 41. ISE Stateless Session Resume: Allows Session Resume Across All PSNs
      • Session ticket extension per RFC 5077 [Transport Layer Security (TLS) Session Resumption without Server-Side State]
      • ISE issues TLS client a session ticket that can be presented to any PSN to shortcut reauth process (Default = Disabled)
      • Allows resume with Load Balancers
      Policy > Policy Elements > Results > Authentication > Allowed Protocols
      [Screenshot callout: Time until session ticket expires]
  • 42. Scaling Guest and Web Authentication Services
  • 43. Scaling Global Sponsor / MyDevices: Anycast Example
      DNS server (domain = COMPANY.COM):
      SPONSOR    10.1.0.100
      MYDEVICES  10.1.0.101
      ISE-PSN-1  10.1.1.1
      ISE-PSN-2  10.1.1.2
      ISE-PSN-3  10.1.1.3
      ISE-PSN-4  10.2.1.4
      ISE-PSN-5  10.2.1.5
      ISE-PSN-6  10.2.1.6
      ISE-PSN-7  10.3.1.7
      ISE-PSN-8  10.3.1.8
      ISE-PSN-9  10.3.1.9
      • Use Global Load Balancer or Anycast (example shown) to direct traffic to closest VIP.
      • Web load balancing distributes each request to a single PSN.
      • Load balancing also helps to scale Web Portal Services.
      [Diagram: DNS servers; VIP 10.1.0.100 shown at three locations]
  • 44. Scaling Guest Authentications Using 802.1X
      “Activated Guest” allows guest accounts to be used without ISE web auth portal
      • Guests auth with 802.1X using EAP methods like PEAP-MSCHAPv2 / EAP-GTC
      • 802.1X auth performance generally much higher than web auth
      Note: AUP and Password Change cannot be enforced since guest bypasses portal flow.
      Warning: Watch for expired guest accounts, or expect a high number of auth failures!
  • 45. Scaling Web Auth: “Remember Me” Guest Flows
      • User logs in to Hotspot/CWA portal and MAC address is auto-registered into GuestEndpoint group
      • AuthZ Policy for GuestEndpoints ID Group grants access until device purged
      New in ISE 2.4: Work Centers > Guest Access > Settings > Logging
  • 47. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 47 BRKSEC-3699 Posture Lease Once Compliant, user may leave/reconnect multiple times before re-posture 7
  • 48. MDM Scalability and Survivability: What Happens When the MDM Server is Unreachable?
      • Scalability ≈ 30 calls per second per PSN.
      • Cloud-based deployment typically built for scale and redundancy
          • For cloud-based solutions, Internet bandwidth and latency must be considered.
      • Premise-based deployment may leverage load balancing
      • ISE 1.4+ supports multiple MDM servers, which could be the same or different vendors.
      • Authorization permissions can be set based on MDM connectivity status:
          • MDM:MDMServerReachable Equals UnReachable
          • MDM:MDMServerReachable Equals Reachable
      • All attributes retrieved and reachability determined by single API call on each new session.
  • 49. Scaling MDM: Prepopulate MDM Enrollment and/or Compliance via ERS API
      ISE 2.4 adds support for managing MDM Attributes via ERS API
      <groupId>groupId</groupId>
      <identityStore>identityStore</identityStore>
      <identityStoreId>identityStoreId</identityStoreId>
      <mac>00:01:02:03:04:05</mac>
      <mdmComplianceStatus>false</mdmComplianceStatus>
      <mdmEncrypted>false</mdmEncrypted>
      <mdmEnrolled>true</mdmEnrolled>
      <mdmIMEI>IMEI</mdmIMEI>
      <mdmJailBroken>false</mdmJailBroken>
      <mdmManufacturer>Apple Inc.</mdmManufacturer>
      <mdmModel>iPad</mdmModel>
      <mdmOS>iOS</mdmOS>
      <mdmPhoneNumber>Phone Number</mdmPhoneNumber>
      <mdmPinlock>true</mdmPinlock>
      <mdmReachable>true</mdmReachable>
      <mdmSerial>AB23D0E45BC01</mdmSerial>
      <mdmServerName>AirWatch</mdmServerName>
      <portalUser>portalUser</portalUser>
      <profileId>profileId</profileId>
      <staticGroupAssignment>true</staticGroupAssignment>
      <staticProfileAssignment>false</staticProfileAssignment>
      <customAttributes>
        <customAttributes>
          <entry>
            <key>MDM_Registered</key>
            <value>true</value>
          </entry>
          <entry>
            <key>MDM_Compliance</key>
            <value>false</value>
          </entry>
          <entry>
            <key>Attribute_XYZ</key>
            <value>Value_XYZ</value>
          </entry>
        </customAttributes>
      </customAttributes>
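A payload like the one on this slide could be generated per device rather than written by hand. The sketch below only builds an XML fragment using element names copied from the slide. The `endpoint` wrapper element, the omission of XML namespaces, and the `build_endpoint_fragment` helper are assumptions for illustration; the request URL, headers, and authentication needed to submit it are not shown in the source and are left out.

```python
import xml.etree.ElementTree as ET


def build_endpoint_fragment(mac, enrolled, compliant, server_name, custom):
    """Build an endpoint XML fragment with MDM fields and nested custom attributes."""
    root = ET.Element("endpoint")  # hypothetical wrapper; real schema may differ
    ET.SubElement(root, "mac").text = mac
    ET.SubElement(root, "mdmComplianceStatus").text = str(compliant).lower()
    ET.SubElement(root, "mdmEnrolled").text = str(enrolled).lower()
    ET.SubElement(root, "mdmServerName").text = server_name
    # The slide shows customAttributes nested inside customAttributes.
    outer = ET.SubElement(root, "customAttributes")
    inner = ET.SubElement(outer, "customAttributes")
    for key, value in custom.items():
        entry = ET.SubElement(inner, "entry")
        ET.SubElement(entry, "key").text = key
        ET.SubElement(entry, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_body = build_endpoint_fragment(
    "00:01:02:03:04:05", True, False, "AirWatch",
    {"MDM_Registered": "true", "MDM_Compliance": "false"},
)
print(xml_body)
```

Generating the body from an inventory export is one way to prepopulate enrollment and compliance state in bulk, which is the scaling benefit the slide points to.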
  • 51. Options for Deploying Device Admin
      https://communities.cisco.com/docs/DOC-63930
      Deployment options: Separate Deployment | Separate PSNs | Mixed PSNs
      Priorities according to Policy and Business Goals:
      Priority | Yes | No
      Separation of Configuration/Duty | Specialization for TACACS+ | Shared resources/Reduced $$
      Independent Scaling of Services | Scale as needed/No impact on Device Admin from RADIUS services | Avoid underutilized PSNs
      Suitable for high-volume Device Admin | Services dedicated to TACACS+ | Focus on “human” device admins
      Separation of Logging Store | Optimize log retention VM | Centralized monitoring
      [Diagram: TACACS and RADIUS node placement for each option]
  • 52. ISE 2.4 TACACS+ Multi-Service Scaling (RADIUS and T+)
      Max Concurrent RADIUS + TACACS+ TPS by Deployment Model and Platform

      By Deployment
      Deployment Model | Platform | Max # Dedicated PSNs | Max RADIUS Sessions per Deployment | Max TACACS+ TPS per Deployment
      Standalone (all personas on same node) | 3515 | 0 | 7,500 | 100
      Standalone (all personas on same node) | 3595 | 0 | 20,000 | 100
      Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3515 as PAN+MNT | *5 / 3+2 | 7,500 | 250 / 2,000
      Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3595 as PAN+MNT | *5 / 3+2 | 20,000 | 250 / 3,000
      Dedicated (each persona on dedicated node) | 3595 as PAN and MNT | *50 / 47+3 | 500,000 | 2,500 / 4,000
      Dedicated (each persona on dedicated node) | 3595 as PAN and Large MNT | *50 / 47+3 | 500,000 | 2,500 / 6,000
      * Device Admin service enabled on same PSNs also used for RADIUS OR split RADIUS and T+ PSNs
      Each dedicated T+ PSN node reduces dedicated RADIUS PSN count by 1

      By PSN (dedicated Policy nodes; max sessions gated by total deployment size)
      Platform | Max RADIUS Sessions per PSN | Max TACACS+ TPS per PSN
      SNS-3515 | 7,500 | 2,000
      SNS-3595 | 40,000 | 3,000
  • 53. ISE 2.4 TACACS+ Multi-Service Scaling (TACACS+ Only)
      Max Concurrent TACACS+ TPS by Deployment Model and Platform

      By Deployment
      Deployment Model | Platform | Max # Dedicated PSNs | Max RADIUS Sessions per Deployment | Max TACACS+ TPS per Deployment
      Standalone (all personas on same node) | 3515 | 0 | N/A | 1,000
      Standalone (all personas on same node) | 3595 | 0 | N/A | 1,500
      Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3515 as PAN+MNT | *5 / 2 | N/A | 2,000 / 2,000
      Hybrid (PAN+MnT+PXG on same node; dedicated PSN) | 3595 as PAN+MNT | *5 / 2 | N/A | 3,000 / 3,000
      Dedicated (each persona on dedicated node) | 3595 as PAN and MNT | *50 / 4 | N/A | 5,000 / 5,000
      Dedicated (each persona on dedicated node) | 3595 as PAN and Large MnT | *50 / 5 | N/A | 10,000 / 10,000
      * Device Admin service can be enabled on each PSN; minimally 2 for redundancy.
      ** Max log capacity for MNT

      By PSN (dedicated Policy nodes; max sessions gated by total deployment size)
      Platform | Max RADIUS Sessions per PSN | Max TACACS+ TPS per PSN
      SNS-3515 | 7,500 | 2,000
      SNS-3595 | 40,000 | 3,000
  • 54. TACACS+ MnT Scaling: Human Versus Automated Device Administration
      • Consider the “average” size syslog from TACACS+ based on the following guidance:
          • Each TACACS+ session: Authentication 2kB; Session authorization 2kB; Session accounting 1kB
          • Each command authorization (per session): Command authorization 2kB; Command accounting 1kB
      • “Human” Device Admin Example:
          • For a normal “human” session we may expect to see 10 commands, so a session would be approximately [5kB + (10 * 3kB)] = 35kB. Suppose a maximum of 50 such sessions per admin per day from 50 admins (few organizations have > 50 admins).
          • 50 human admins would generate < 1 TPS average, ~60k logs/day, or ~90MB/day.
      • Automated/Script Device Admin Example:
          • Consider a script that runs 4 times a day against 30,000 devices (for example, to back up the config on all devices). Generally the interaction will be short, say 5 commands:
          • Storage = 30,000 * 4 * [5kB + (5 * 3kB)] = ~2.4 GB/day
          • Total logs = 30k * 4 * [3 + (5 * 2)] = 1.56M logs/day = 18 TPS average; 1300 TPS peak.
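The arithmetic on this slide can be reproduced with a small calculator, sketched below. The per-session and per-command sizes and log counts come from the slide, and the 5-minute burst window comes from the note on the following table slide. The function name `tacacs_daily_load` and its parameters are hypothetical.

```python
KB_PER_SESSION = 5      # authentication 2 kB + session authorization 2 kB + session accounting 1 kB
KB_PER_COMMAND = 3      # command authorization 2 kB + command accounting 1 kB
LOGS_PER_SESSION = 3    # one log per session-level message
LOGS_PER_COMMAND = 2    # one authorization log and one accounting log per command
SECONDS_PER_DAY = 86400


def tacacs_daily_load(sessions_per_day, commands_per_session,
                      batches_per_day=None, burst_seconds=300):
    """Estimate daily TACACS+ log count, storage, and TPS for an MnT node."""
    logs = sessions_per_day * (LOGS_PER_SESSION + commands_per_session * LOGS_PER_COMMAND)
    storage_kb = sessions_per_day * (KB_PER_SESSION + commands_per_session * KB_PER_COMMAND)
    peak_tps = None
    if batches_per_day:
        # Each scripted batch is assumed to complete within one burst window.
        peak_tps = logs / batches_per_day / burst_seconds
    return {
        "logs_per_day": logs,
        "storage_kb_per_day": storage_kb,
        "avg_tps": logs / SECONDS_PER_DAY,
        "peak_tps": peak_tps,
    }


human = tacacs_daily_load(sessions_per_day=50 * 50, commands_per_session=10)
script = tacacs_daily_load(sessions_per_day=30000 * 4, commands_per_session=5,
                           batches_per_day=4)
print(human)
print(script)
```

The contrast between the two results is the point of the slide: the scripted case, not the number of human administrators, is what drives MnT sizing.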
  • 55. TACACS+ Multi-Service Scaling: Required TACACS+ TPS by # Admins and # NADs
      Column groups (each lists Avg TPS / Peak TPS / Logs per Day / Storage per day):
      A = Session Authentication and Accounting Only
      B = Command Accounting Only (10 Commands / Session)
      C = Command Authorization + Acctg (10 Commands / Session)

      Human Admin: # Admins, based on 50 admin sessions per day
      # Admins | A | B | C
      1 | < 1 / < 1 / 150 / < 1MB | < 1 / < 1 / 650 / 1MB | < 1 / < 1 / 1.2k / 2MB
      5 | < 1 / < 1 / 750 / 1MB | < 1 / < 1 / 3.3k / 4MB | < 1 / < 1 / 5.8k / 9MB
      10 | < 1 / < 1 / 1.5k / 3MB | < 1 / < 1 / 6.5k / 8MB | < 1 / 1 / 11.5k / 17MB
      25 | < 1 / < 1 / 3.8k / 7MB | < 1 / 1 / 16.3k / 19MB | < 1 / 2 / 28.8k / 43MB
      50 | < 1 / 1 / 7.5k / 13MB | < 1 / 2 / 32.5k / 37MB | 1 / 4 / 57.5k / 86MB
      100 | < 1 / 1 / 15k / 25MB | 1 / 4 / 65k / 73MB | 2 / 8 / 115k / 171MB

      Script Admin: # NADs, based on 4 scripted sessions per day
      # NADs | A | B | C
      500 | < 1 / 5 / 6k / 10MB | < 1 / 22 / 26k / 30MB | 1 / 38 / 46k / 70MB
      1,000 | < 1 / 10 / 12k / 20MB | 1 / 43 / 52k / 60MB | 1 / 77 / 92k / 140MB
      5,000 | < 1 / 50 / 60k / 100MB | 3 / 217 / 260k / 300MB | 5 / 383 / 460k / 700MB
      10,000 | 1 / 100 / 120k / 200MB | 6 / 433 / 520k / 600MB | 11 / 767 / 920k / 1.4GB
      20,000 | 3 / 200 / 240k / 400MB | 12 / 867 / 1.04M / 1.2GB | 21 / 1.5k / 1.84M / 2.7GB
      30,000 | 5 / 300 / 360k / 600MB | 18 / 1.3k / 1.56M / 1.7GB | 32 / 2.3k / 2.76M / 4.0GB
      50,000 | 7 / 500 / 600k / 1GB | 30 / 2.2k / 2.6M / 2.9GB | 53 / 3.8k / 4.6M / 6.7GB

      Peak values based on 5-minute burst to complete each batch request.
  • 57. Single Connect Mode: Scaling TACACS+ for High-Volume NADs
      • Multiplexes T+ requests over a single TCP connection
          • All T+ requests between NAD and ISE occur over a single connection rather than separate connections for each request.
      • Recommended for TACACS+ “Top Talkers”
      • Note: TCP sockets are locked to NADs, so limit use to NADs with highest activity.
      Administration > Network Resources > Network Devices > (NAD)
  • 58. Internal User Cache for T+ Authorization: Scaling TACACS+ for High-Volume Admin Users
      • First authorization caches:
          1) User Name
          2) User Specific Attributes (Ex: Group ID, custom attributes)
      • Successive requests served from cache
      • Default = 0 (cache disabled)
      • Global Setting for Single Connect Mode (enabled by default)
  • 60. Endpoint Attribute Filter and Whitelist Attributes
      Reduces Data Collection and Replication to Subset of Profile-Specific Attributes
      • Endpoint Attribute Filter, aka “Whitelist filter”
          • Disabled by default. If enabled, only these attributes are collected or replicated.
      • Whitelist Filter limits profile attribute collection to those required to support default (Cisco-provided) profiles and critical RADIUS operations.
          • Filter must be disabled to collect and/or replicate other attributes.
          • Attributes used in custom conditions are automatically added to whitelist.
      Administration > System Settings > Profiling
  • 61. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Sampling of All Endpoint Attributes 61 Whitelist Attributes vs Significant Attributes PolicyVersion OUI EndPointMACAddress MatchedPolicy EndPointMatchedProfile EndPointPolicy Total Certainty Factor EndPointProfilerServer EndPointSource StaticAssignment StaticGroupAssignment UpdateTime Description IdentityGroup ElapsedDays InactiveDays NetworkDeviceGroups Location Device Type IdentityAccessRestricted IdentityStoreName ADDomain AuthState ISEPolicySetName IdentityPolicyMatchedRule AllowedProtocolMatchedRule SelectedAccessService SelectedAuthenticationIdentityStore s AuthenticationIdentityStore AuthenticationMethod AuthorizationPolicyMatchedRule SelectedAuthorizationProfiles CPMSessionID AAA-Server OriginalUserName DetailedInfo EapAuthentication NasRetransmissionTimeout TotalFailedAttempts TotalFailedTime UseCase UserType GroupsOrAttributesProcessFailure ExternalGroups Called-Station-ID Calling-Station-ID DestinationIPAddress DestinationPort Device IP Address MACAddress MessageCode NADAddress NAS-IP-Address NAS-Port NAS-Port-Id NAS-Port-Type NetworkDeviceName RequestLatency Service-Type Timestamp User-Name Egress-VLANID Egress-VLAN-Name Airespace-Wlan-Id Device Port EapTunnel Framed-IP-Address NAS-Identifier RadiusPacketType Vlan VlanName cafSessionAuthUserName cafSessionAuthVlan cafSessionAuthorizedBy cafSessionDomain cafSessionStatus dot1dBasePort dot1xAuthAuthControlledPortContr ol dot1xAuthAuthControlledPortStatus dot1xAuthSessionUserName ifDescr ifIndex ifOperStatus cdpCacheAddress cdpCacheCapabilities cdpCacheDeviceId cdpCachePlatform cdpCacheVersion lldpSystemDescription lldpSystemName lldpCapabilitiesMapSupported lldpChassisId cLApIfMacAddress cLApName cLApNameServerAddress cLApSshEnable cLApTelnetEnable cLApTertiaryControllerAddress cLApTertiaryControllerAddressType cLApUpTime cLApWipsEnable cldcAssociationMode cldcClientAccessVLAN cldcClientIPAddress cldcClientStatus 
BYODRegistration DeviceRegistrationStatus PortalUser AUPAccepted LastAUPAcceptanceHours PostureAssessmentStatus FQDN OpenSSLErrorMessage OpenSSLErrorStack User-Agent attribute-52 attribute-53 chaddr ciaddr client-fqdn host-name domain-name dhcp-class-identifier dhcp-client-identifier dhcp-message-type dhcp-parameter-request-list dhcp-requested-address dhcp-user-class-id dhcp-vendor-class flags giaddr hlen hops htype ip op secs yiaddr sysName sysDescr sysContact sysLocation hrDeviceDescr LastNmapScanTime NmapScanCount operating-system CLASS_ID DIRECTION DST_MASK FIRST_SWITCHED FLOW_SAMPLER_ID FragmentOffset INPUT_SNMP IN_BYTES IN_PKTS OUT_BYTES IPV4_DST_ADDR IPV4_NEXT_HOP IPV4_SRC_ADDR IPV4_IDENT L4_DST_PORT L4_SRC_PORT LAST_SWITCHED OUTPUT_SNMP PROTOCOL SRC_MASK SRC_TOS TCP_FLAGS SRC_VLAN DST_VLAN IN_SRC_MAC OUT_DST_MAC MAX_TTL MIN_TTL dst_as src_as count flow_sequence source_id sys_uptime unix_secs version MDMServerName MDMUdid MDMImei MDMMeid MDMManufacturer MDMModel MDMOSVersion MDMPhoneNumber MDMSerialNumber MDMCompliant MDMJailBroken MDMPinLockSet MDMDiskEncrypted h323DeviceName h323DeviceVendor h323DeviceVersion mdns_VSM_class_identifier mdns_VSM_srv_identifier mdns_VSM_txt_identifier sipDeviceName sipDeviceVendor sipDeviceVersion device-platform device-platform-version device-type AD-Host-Exists AD-Join-Point AD-Operating-System AD-OS-Version AD-Service-Pack iotAssetDeviceType iotAssetProductCode iotAssetProductName iotAssetRetrievedFrom iotAssetSerialNumber iotAssetTrustLevel iotAssetVendorID 80-tcp 110-tcp 135-tcp 139-tcp 143-tcp 443-tcp 445-tcp 515-tcp 3306-tcp 3389-tcp 5900-tcp 8080-tcp 9100-tcp 53-udp 67-udp 68-udp 123-udp 135-udp 137-udp 138-udp 139-udp FirstCollection FQDN Framed-IP-Address host-name hrDeviceDescr IdentityGroup IdentityGroupID IdentityStoreGUID IdentityStoreName ifIndex ip L4_DST_PORT LastNmapScanTime lldpCacheCapabilities lldpCapabilitiesMapSupported lldpSystemDescription MACAddress MatchedPolicy MatchedPolicyID MDMCompliant 
MDMCompliantFailureReason MDMDiskEncrypted MDMEnrolled MDMImei MDMJailBroken MDMManufacturer MDMModel MDMOSVersion MDMPhoneNumber MDMPinLockSet MDMProvider MDMSerialNumber MDMServerReachable MDMUpdateTime NADAddress NAS-IP-Address NAS-Port-Id NAS-Port-Type NmapScanCount NmapSubnetScanID operating-system OS Version OUI PhoneID PhoneIDType PolicyVersion PortalUser PostureApplicablePrevious DeviceRegistrationStatus ProductRegistrationTimeStamp StaticAssignment StaticGroupAssignment sysDescr TimeToProfile Total Certainty Factor UpdateTime User-Agent 161-udp AAA-Server AC_User_Agent AUPAccepted BYODRegistration CacheUpdateTime Calling-Station-ID cdpCacheAddress cdpCacheCapabilities cdpCacheDeviceId cdpCachePlatform cdpCacheVersion Certificate Expiration Date Certificate Issue Date Certificate Issuer Name Certificate Serial Number ciaddr CreateTime Description DestinationIPAddress Device Identifier Device Name DeviceRegistrationStatus dhcp-class-identifier dhcp-requested-address EndPointPolicy EndPointPolicyID EndPointProfilerServer EndPointSource MACADDRESS MATCHEDVALUE ENDPOINTPOLICY ENDPOINTPOLICYVERSION STATICASSIGNMENT STATICGROUPASSIGNMENT NMAPSUBNETSCANID PORTALUSER DEVICEREGISTRATIONSTATUS Significant Attributes Whitelist Attributes BRKSEC-3699
  • 62. Inter-Node Communications: JGroup Connections, Global Cluster
      • All Secondary nodes* establish connection to Primary PAN (JGroup Controller) over tunneled connection (TCP/12001) for config/database sync.
      • Secondary Admin also listens on TCP/12001 but no connection is established unless primary fails/secondary promoted
      • All Secondary nodes participate in the Global JGroup cluster.
      * Secondary node = All nodes except Primary Admin node; includes PSNs, MnT, pxGrid, and Secondary Admin nodes
      [Diagram: Admin (P) as GLOBAL JGROUP CONTROLLER; Admin (S), MnT (P), MnT (S), PSN1, PSN2, PSN3 connected over TCP/12001 JGroups Tunneled]
  • 63. Inter-Node Communications: Local JGroups and Node Groups • Node Groups can be used to define local JGroup* clusters where members exchange heartbeats and sync profile data over SSL (TLS v1.2). • A PSN claims endpoint ownership only if a whitelist attribute changes; this triggers an ownership update to local PSNs. The whitelist check always occurs regardless of the global whitelist filter. • Replication to the PAN occurs if a significant attribute changes; all attributes are then synced via the PAN. If the whitelist filter is enabled, only whitelist attributes are synced to all nodes. *JGroups: Java toolkit for reliable multicast communications between group/cluster members. [Diagram: Node Group A (JGroup A) with a local JGroup controller and PSN1, PSN2, PSN3; Admin (P), Admin (S), MnT (P), MnT (S); global JGroup controller. TCP/7800: JGroup peer communication and JGroup failure detection. TCP/12001: JGroups tunneled. DHCP update at t=0: PSN1 is the current endpoint owner, so there is no database replication even if a whitelist attribute changes. DHCP update at t=1: PSN2 gets a more current update for the same endpoint and takes ownership (Change Ownership), then fetches all attributes from PSN1 (Fetch Attributes). Profile Change.]
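The ownership and replication rules on this slide can be sketched roughly as follows. This is an illustrative Python model only, not ISE code: the function name `handle_profile_update`, the two small attribute sets, and the returned action strings are all hypothetical and chosen for readability.

```python
# Illustrative model of the endpoint-ownership rules described on the slide.
# All names and attribute subsets here are assumptions, not ISE internals.

WHITELIST_ATTRS = {"dhcp-class-identifier", "host-name", "User-Agent"}  # assumed subset
SIGNIFICANT_ATTRS = {"EndPointPolicy", "StaticAssignment"}              # assumed subset


def handle_profile_update(receiving_psn, owner_psn, old_attrs, new_attrs):
    """Return the list of actions a PSN would take for one profile update."""
    changed = {k for k, v in new_attrs.items() if old_attrs.get(k) != v}
    actions = []

    # The whitelist check always runs, regardless of the global filter setting.
    whitelist_changed = bool(changed & WHITELIST_ATTRS)

    if receiving_psn != owner_psn and whitelist_changed:
        # A non-owner claims ownership only on a whitelist attribute change,
        # then fetches the full attribute set from the previous owner.
        actions.append(f"take ownership from {owner_psn}")
        actions.append(f"fetch all attributes from {owner_psn}")

    if changed & SIGNIFICANT_ATTRS:
        # Only a significant attribute change triggers replication via the PAN.
        actions.append("replicate to PAN")

    return actions
```

In this model, an update that changes only a whitelist attribute on the current owner produces no actions at all, which mirrors the "no database replication" note for PSN1 at t=0.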
  • 64. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS NODE GROUP A (JGROUP A) L2 or L3 PSN4 PSN5 PSN6 64 Inter-Node Communications Local JGroups and Node Groups NODE GROUP B (JGROUP B) • Profiling sync leverages JGroup channels • All replication outside node group must traverse PAN—including Ownership Change! • If Local JGroup fails, then nodes fall back to Global JGroup communication channel. BRKSEC-3699 PSN1 PSN2 PSN3 TCP/7800 JGroup Peer Communication JGroup Failure Detection TCP/12001 JGroups Tunneled Admin (P) Admin (S) MnT (S) MnT (P)
  • 65. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 65 Node Groups and Session Recovery Dynamic Clean Up for Orphaned URL-Redirected Sessions Primary PAN Primary MnT PSN1 PSN2 PSN3 RADIUS Portal Redirect to PSN3 JGroup “Master” PSN3 not responding! Hey Primary MnT! Did PSN3 have any active sessions with pending redirect? BRKSEC-3699 CoA Session Terminate
  • 66. ISE 2.4 Node Communications Query Attributes Guest: tcp/8443 Discovery: tcp/8443, tcp/8905 Agent Install: tcp/8443 Posture Agent: tcp/8905; udp/8905 PRA/KA: tcp/8905 DNS: udp/53; DHCP: udp/67 WMI Client Probe: tcp/135, tcp/445 Kerberos (SPAN): tcp/88 SCEP Proxy: tcp/80, tcp/443 EST: tcp/8084 PIP Endpoint Posture Updates/Smart Licensing: tcp/443 Profiler Feed: tcp/8443 Logging HTTPS: tcp/443 Syslog: udp/20514, tcp/1468 Secure Syslog: tcp/6514 CoA (REST API): udp/1700 RADIUS Auth: udp/1645,1812 RADIUS Acct: udp/1646,1813 RADIUS CoA: udp/1700,3799 RADSEC DTLS: udp/2083 RADIUS/IPsec: udp/500 TACACS+: tcp/49 (configurable) WebAuth: tcp/443,8443 SNMP: udp/161 SNMP Trap: udp/162 NetFlow: udp/9996 DHCP: udp/67, udp/68 DHCPv6: udp/547 SPAN: tcp/80,8080 SXP: tcp/64999 OCSP: tcp/2560 CA SCEP: tcp/9090 NADs DNS: tcp-udp/53 NTP: udp/123 Repository: FTP, SFTP, NFS, HTTP, HTTPS File Copy: FTP, SCP, SFTP, TFTP LDAP: tcp-udp/389, tcp/3268 SMB: tcp/445 KDC: tcp-udp/88; KPASS: tcp/464 SCEP: tcp/80, tcp/443; EST: tcp/8084 OCSP: tcp/80; CRL: tcp/80, tcp/443, tcp/389 ODBC (configurable): Microsoft SQL: tcp/1433 Sybase: tcp/2638 PostgreSQL: tcp/5432 Oracle: tcp/1521 TS-Agent: tcp/9094 AD Agent: tcp/9095 WMI: tcp/135 Syslog: udp/40514, tcp/11468 HTTPS: tcp/443 JGroups: tcp/12001 (PSN to PAN) CoA (Admin/Guest Limit): udp/1700 Admin(P) - Admin(S): tcp/443, tcp/12001 (JGroups) Monitor(P) - Monitor(S): tcp/443, udp/20514 (Syslog) Policy - Policy: Node Groups/JGroups: tcp/7800 Proxy CoA: udp/1700 PSN-SXPSN: tcp/443 pxGrid - pxGrid: tcp/5222 Syslog: udp/20514, tcp/1468 Secure Syslog: tcp/6514 SNMP Traps: udp/162 SMTP: tcp/25 (PPAN: email expiry notify) Email/SMS Gateways Inter-Node Communications Cloud Services Cisco.com/Perfigo.com Profiler Feed Service MDM & App Stores Push Notification Smart Licensing GUI: tcp/80,443 SSH: tcp/22 Sponsor (PSN): tcp/8443 SNMP: udp/161 REST API (MnT): tcp/443 ERS API: tcp/9060
Admin / Sponsor SMTP: tcp/25 MDM Partner pxGrid: tcp/5222 JGroups: tcp/12001 pxGrid: tcp/5222 pxGrid Subscriber/Publisher pxGrid: tcp/5222 pxGrid (Bulk Download): tcp/8910 Syslog: udp/20514, tcp/1468 Secure Syslog: tcp/6514 NetFlow for TS: udp/9993 Threat/VA Server MDM API: tcp/XXX (vendor specific) TC-NAC: tcp/443 IdP SSO Server IdP: tcp/XXX (vendor specific) Admin->Sponsor: tcp/9002 Wireless Setup Wizard: tcp/9103 HTTPS: tcp/443 Syslog: udp/20514, tcp/1468 Secure Syslog: tcp/6514 Oracle DB (Secure JDBC): tcp/1528 JGroups: tcp/12001 (MnT to PAN) PSN PXG MNT PAN
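One way to make a port matrix like this usable is to keep it as data and generate allow-list entries from it. The sketch below transcribes only a handful of NAD-to-PSN services from the slide; the dictionary layout, the helper name `allow_rules`, and the rule string format are illustrative assumptions and do not correspond to any firewall vendor's syntax.

```python
# A few services that NADs send to PSNs, transcribed from the ISE 2.4 slide.
# The data layout and the generic rule format are assumptions for illustration.
PSN_LISTEN_PORTS = {
    "RADIUS Auth": [("udp", 1645), ("udp", 1812)],
    "RADIUS Acct": [("udp", 1646), ("udp", 1813)],
    "TACACS+": [("tcp", 49)],
    "SNMP Trap": [("udp", 162)],
}


def allow_rules(src, dst, services):
    """Build generic 'permit' strings for the requested services."""
    rules = []
    for name in services:
        for proto, port in PSN_LISTEN_PORTS[name]:
            rules.append(f"permit {proto} {src} -> {dst} port {port} ({name})")
    return rules


for rule in allow_rules("NADs", "PSNs", ["RADIUS Auth", "RADIUS Acct"]):
    print(rule)
```

Keeping the matrix in one place makes it easier to review when a port is marked "configurable" on the slide and has been changed locally.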
  • 67. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Profiling and Data Replication Before Tuning Node Group = DC1-group Node Group = DC2-group RADIUS Auth RADIUS Acctng DHCP 1 DHCP 2 NMAP pxGrid # Ownership Change Global Replication BRKSEC-3699 67 1 3 4 2 5 PAN(S) MNT(S) MNT(P) PAN(Primary) PSN Clusters PSN
  • 68. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Node Group = DC1-group Node Group = DC2-group PAN(Primary) PSN Clusters Impact of Ownership Changes Before Tuning RADIUS Auth RADIUS Acctng DHCP 1 DHCP 2 NMAP pxGrid Owner? Owner? Owner? Owner? Owner? BRKSEC-3699 68 PSN
  • 69. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Profiling and Data Replication After Tuning RADIUS Auth RADIUS Acctng DHCP 1 NMAP pxGrid # Ownership Change Global Replication BRKSEC-3699 69 Node Group = DC1-group Node Group = DC2-group PAN(S) MNT(S) MNT(P) PAN(Primary) PSN Clusters 2 1 PSN
  • 70. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Impact of Ownership Changes After Tuning pxGrid RADIUS Auth RADIUS Acctng DHCP 1 NMAP BRKSEC-3699 70 Node Group = DC1-group Node Group = DC2-group PAN(Primary) PSN Clusters Owner PSN
  • 71. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 71 BRKSEC-3699 ISE Profiling Best Practices Whenever Possible… • Use Device Sensor on Cisco switches & Wireless Controllers to optimize data collection. • Ensure profile data for a given endpoint is sent to a single PSN (or maximum of 2) • Sending same profile data to multiple PSNs increases inter-PSN traffic and contention for endpoint ownership. • For redundancy, consider Load Balancing and Anycast to support a single IP target for RADIUS or profiling using… • DHCP IP Helpers • SNMP Traps • DHCP/HTTP with ERSPAN (Requires validation) • Ensure profile data for a given endpoint is sent to the same PSN • Same issue as above, but not always possible across different probes • Use node groups and ensure profile data for a given endpoint is sent to same node group. • Node Groups reduce inter-PSN communications and need to replicate endpoint changes outside of node group. • Avoid probes that collect the same endpoint attributes • Example: Device Sensor + SNMP Query/IP Helper • Enable Profiler Attribute Filter Do NOT send profile data to multiple PSNs ! DO send profile data to single and same PSN or Node Group ! DO use Device Sensor ! DO enable the Profiler Attribute Filter !
  • 72. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 72 BRKSEC-3699 ISE Profiling Best Practices General Guidelines for Probes • HTTP Probe: • Use URL Redirects instead of SPAN to centralize collection and reduce traffic load related to SPAN/RSPAN. • Avoid SPAN. If used, look for key traffic chokepoints such as Internet edge or WLC connection; use intelligent SPAN/tap options or VACL Capture to limit amount of data sent to ISE. Also difficult to provide HA for SPAN. • DHCP Probe: • Use IP Helpers when possible—be aware that L3 device serving DHCP will not relay DHCP for same! • Avoid DHCP SPAN. If used, make sure probe captures traffic to central DHCP Server. HA challenges. • SNMP Probe: • For polled SNMP queries, avoid short polling intervals. Be sure to set optimal PSN for polling in ISE NAD config. • SNMP Traps primarily useful for non-RADIUS deployments like NAC Appliance—Avoid SNMP Traps w/RADIUS auth. • NetFlow Probe: • Use only for specific use cases in centralized deployments—Potential for high load on network devices and ISE. • pxGrid Probe: • Limit # PSNs enabled for pxGrid as each becomes a Subscriber to same data. 2 needed for redundancy. • Dedicate PSNs for pxGrid Probe if high-volume data from Publishers. Do NOT enable all probes by default ! Avoid SPAN, SNMP Traps, and NetFlow probes ! Limit pxGrid probe to two PSNs max for HA – possibly dedicated !
  • 73. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 73 BRKSEC-3699 Profiling Redundancy – Duplicating Profile Data Different DHCP Addresses - Provides Redundancy but Leads to Contention for Ownership = Replication • Common config is to duplicate IP helper data at each NAD to two different PSNs or PSN LB Clusters • Different PSNs receive data PSN3 (10.1.99.7) PSN2 (10.1.99.6) PSN1 (10.1.99.5) interface Vlan10 ip helper-address <real_DHCP_Server> ip helper-address 10.1.98.8 ip helper-address 10.2.100.2 PSN3 (10.2.101.7) PSN2 (10.2.101.6) PSN1 (10.2.101.5) PSN-CLUSTER2 PSN-CLUSTER1 DC #2 DC #1 DHCP Request Load Balancer Load Balancer Note: LB depicted, but NOT required (10.1.98.8) (10.2.100.2) User int Vlan10
  • 74. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS PSN3 (10.1.99.7) PSN2 (10.1.99.6) PSN1 (10.1.99.5) PSN3 (10.2.101.7) PSN2 (10.2.101.6) PSN1 (10.2.101.5) PSN-CLUSTER2 PSN-CLUSTER1 Load Balancer Load Balancer 74 BRKSEC-3699 Scaling Profiling and Replication Single DHCP VIP Address using Anycast - Limit Profile Data to a Single PSN and Node Group • Different PSNs or Load Balancer VIPs host same target IP for DHCP profile data • Routing metrics determine which PSN or LB VIP receives DHCP from NAD User interface Vlan10 ip helper-address <real_DHCP_Server> ip helper-address 10.1.98.8 DHCP Request Note: LB depicted, but NOT required (10.1.98.8) (10.1.98.8) DC #2 DC #1 int Vlan10
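The Anycast behavior described here, where routing metrics decide which site receives the DHCP helper traffic, can be modeled in a few lines. This is a simplified sketch: the site names, metric values, and the function `pick_target` are hypothetical, and real route selection depends on the routing protocol in use.

```python
ANYCAST_IP = "10.1.98.8"  # shared helper target from the slide


def pick_target(advertisements):
    """Pick the site with the lowest metric among those still advertising the route.

    advertisements maps site name -> metric, or None if the route is withdrawn.
    """
    reachable = {site: m for site, m in advertisements.items() if m is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


# Normal operation: the closer data center wins.
print(pick_target({"DC1": 10, "DC2": 20}))
# DC1 withdraws the route (for example, its PSN or LB VIP fails): DC2 takes over.
print(pick_target({"DC1": None, "DC2": 20}))
```

Because only one site receives a given NAD's DHCP copies at any time, profile data for an endpoint stays within a single PSN or node group.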
  • 75. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 75 Profiler Tuning for Polled SNMP Query Probe • Set specific PSNs to periodically poll access devices for SNMP data. • Choose PSN closest to access device. 28,800 sec (8 hours) *Minimum recommended polling interval SNMP Polling (Auto) RADIUS PSN1 (Amer) PSN2 (Asia) Switch Auto-Recovery when PSN fails fixed in ISE 2.4 BRKSEC-3699
  • 76. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS pxGrid Profiler Probe (Context In) First Integration with Cisco Industrial Network Director (IND) • IND communicates with Industrial Switches and Security Devices and collects detailed information about the connected manufacturing devices. • IND v1.3 adds pxGrid Publisher interface to communicate IoT attributes to ISE. 76 BRKSEC-3699 Subscriber ISE Profiler Attributes Custom Attributes Supported !!! iotIpAddress iotMacAddress iotName iotVendor iotProductId iotSerialNumber iotDeviceType iotSwRevision iotHwRevision iotProtocol iotConnectedLinks iotCustomAttributes Publisher pxGrid IND Asset Inventory
  • 77. pxGrid Profiler Probe: Recommendation: limit the probe to two PSNs (2 for HA). Each PSN becomes a pxGrid Subscriber to the IND Asset topic.
  • 78. Profiler Conditions Based on Custom Attributes
  • 79. Profiling Based on Custom Attributes: Performance Hit so Disabled By Default • Global Setting MUST be enabled • If disabled: • Custom Attributes are NOT updated over pxGrid • Profiler ignores any conditions based on Custom Attributes, even if the Custom Attribute is populated.
  • 80. New and Updated IoT Profile Libraries • 700+ Automation and Control • Industrial / Manufacturing • Building Automation • Power / Lighting • Transportation / Logistics • Financial (ATM, Vending, PoS, eCommerce) • IP Camera / Audio-Video / Surveillance and Access Control • Other (Defense, HVAC, Elevators, etc) • Windows Embedded • 300+ Profiles in Medical NAC Profile Library Delivered via ISE Community: https://communities.cisco.com/docs/DOC-66340
  • 81. Why Do I Care about # Profiles? • ISE 2.1+ supports a MAX of 2000 profiles • Let’s Do the Math… • ~600 Base Profiles • 600+ New Feed Profiles (2.4) • 300+ Medical NAC Profiles • 700+ Automation & Control Profiles = 2200+ Profiles • There are no restrictions on profile import, so check the # of profiles in the library before importing a large batch of new profiles.
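The arithmetic on this slide can be turned into a quick pre-import check. The sketch below treats the slide's "600+" style figures as lower bounds; the helper name `import_headroom` and the example counts passed to it are hypothetical.

```python
MAX_PROFILES = 2000  # ISE 2.1+ limit stated on the slide

# Lower bounds taken from the slide; "+" figures are treated as minimums.
profile_sources = {
    "Base profiles": 600,
    "New feed profiles (2.4)": 600,
    "Medical NAC profiles": 300,
    "Automation & Control profiles": 700,
}


def import_headroom(current_count, batch_size, limit=MAX_PROFILES):
    """Profiles left after an import; a negative value means the limit is exceeded."""
    return limit - (current_count + batch_size)


total = sum(profile_sources.values())
print(f"All libraries combined: at least {total} profiles (limit {MAX_PROFILES})")
print("Headroom for 700 more on top of 1500:", import_headroom(1500, 700))
```

Since ISE does not block the import itself, a check like this has to happen before the batch is loaded.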
  • 82. Scaling MnT (Optimize Logging and Noise Suppression)
  • 83. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 83 The Fall Out From the Mobile Explosion and IoT BRKSEC-3699  Explosion in number and type of endpoints on the network.  High auth rates from mobile devices—many personal (unmanaged). – Short-lived connections: Continuous sleep/hibernation to conserve battery power, roaming, …  Misbehaving supplicants: Unmanaged endpoints from numerous mobile vendors may be misconfigured, missing root CA certificates, or running less-than-optimal OS versions  Misconfigured NADs. Often timeouts too low & misbehaving clients go unchecked/not throttled.  Misconfigured Load Balancers—Suboptimal persistence and excessive RADIUS health probes.  Increased logging from Authentication, Profiling, NADs, Guest Activity, …  System not originally built to scale to new loads.  End user behavior when above issues occur.  Bugs in client, NAD, or ISE.
  • 84. Clients Misbehave! • Example education customer: • ONLY 6,000 Endpoints (all BYOD style) • 10M Auths / 9M Failures in 24 hours! • 42 Different Failure Scenarios, all related to clients dropping TLS (both PEAP & EAP-TLS). • Supplicant List: • Kyocera, Asustek, Murata, Huawei, Motorola, HTC, Samsung, ZTE, RIM, SonyEric, ChiMeiCo, Apple, Intel, Cybertan, Liteon, Nokia, HonHaiPr, Palm, Pantech, LgElectr, TaiyoYud, Barnes&N • 5411 No response received during 120 seconds on last EAP message sent to the client • This error has been seen at a number of Escalation customers • Typically the result of a misconfigured or misbehaving supplicant not completing the EAP process.
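To put the customer example in perspective, the figures on the slide can be converted into rates. The calculation below uses only the numbers quoted above (6,000 endpoints, 10M auths, 9M failures, 24 hours); the variable names are illustrative.

```python
endpoints = 6_000
auths_per_day = 10_000_000
failures_per_day = 9_000_000
seconds_per_day = 24 * 60 * 60

auths_per_second = auths_per_day / seconds_per_day          # deployment-wide rate
auths_per_endpoint = auths_per_day / endpoints              # per endpoint, per day
failure_ratio = failures_per_day / auths_per_day
seconds_between_auths = seconds_per_day / auths_per_endpoint

print(f"{auths_per_second:.1f} auths/sec across the deployment")
print(f"{auths_per_endpoint:.0f} auths per endpoint per day")
print(f"one auth every {seconds_between_auths:.1f} sec per endpoint on average")
print(f"{failure_ratio:.0%} of all auths failed")
```

On average each endpoint attempts authentication more often than once a minute, and nine out of ten attempts fail, which is the load pattern the suppression features in the following slides are meant to absorb.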
  • 85. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS Challenge: How to reduce the flood of log messages while increasing PSN and MNT capacity and tolerance 85 BRKSEC-3699 MnT
  • 86. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 86 Getting More Information With Less Data Scaling to Meet Current and Next Generation Logging Demands Reauth period Quiet-period 5 min Held-period / Exclusion 5 min Load Balancer Misbehaving supplicant Roaming supplicant Unknown users Reauth phones LB Health probes Detect and reject misbehaving clients Log Filter Heartbeat frequency Count and discard repeated events Count and discard untrusted events PSN MNT Switch WLC Rate Limiting at Source Filtering at Receiving Chain Count and discard repeats and unknown NAD events Filter health probes from logging Reject bad supplicant Client Exclusion Quiet period Quiet Period BRKSEC-3699
  • 87. Tune NAD Configuration: Rate Limiting at Wireless Source. Wireless (WLC) • RADIUS Server Timeout: Increase from default of 2 to 5 sec • RADIUS Aggressive-Failover: Disable aggressive failover • RADIUS Interim Accounting: v7.6: Disable; v8.0+: Enable with interval of 0. (Update auto-sent on DHCP lease or Device Sensor) • Idle Timer: Increase to 1 hour (3600 sec) for secure SSIDs • Session Timeout: Increase to 2+ hours (7200+ sec) • Client Exclusion: Enable and set exclusion timeout to 180+ sec • Roaming: Enable CCKM / SKC / 802.11r (when feasible) • Bugfixes: Upgrade WLC software to address critical defects. Reference: Prevent Large-Scale Wireless RADIUS Network Melt Downs, http://www.cisco.com/c/en/us/support/docs/wireless-mobility/wireless-lan-wlan/118703-technote-wlc-00.html. Related session: BRKSEC-2059 Deploying ISE in a Dynamic Environment - Clark Gambrel, Monday, June 11 @ 1:30pm. [Diagram labels: WLC; Client Exclusion; Reauth period; Quiet-period 5 min; Held-period / Exclusion 5 min; Misbehaving supplicant; Roaming supplicant; Unknown users; Reauth phones; Quiet Period]
  • 88. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS One-Click Setup for ISE Best Practice Config • Checkbox to auto- configure WLAN and associated RADIUS Servers to ISE best practice. BRKSEC-3699 88
  • 89. Which WLC Software Should I Deploy? https://www.cisco.com/c/en/us/support/docs/wireless/wireless-lan-controller-software/200046-TAC-Recommended-AireOS.html • 8.0.152.0: Currently the most mature and reliable release. • 8.2.167.6: Mature. Recommended when new feature/hardware support is needed. • 8.3.141.0: Less mature. Recommended if new features in 8.3.x are required. • 8.5.124.55: Cutting edge. Recommended if new features in 8.5.x are required. • 8.6.101.0: Bleeding edge. Only if new features in 8.6.x are absolutely required. • 8.7.102.0: Only if new features in 8.7.x are absolutely required. • Example critical defects resolved in maintenance and new releases (CDETS: Title): • CSCul83594: Session-id is not synchronized across mobility, if the network is open (fixed in 8.6) • CSCuu82607: Evaluation of all for OpenSSL June 2015 • CSCuu68490: duplicate radius-acct update message sent while roaming • CSCus61445: DNS ACL on wlc is not working - AP not Send DTLS to WLC • CSCuq48218: Cisco WLC cannot process multiple sub-attributes in single RADIUS VSA • CSCuo09947: RADIUS AVP #44 (Acct-Session-ID) to be sent in RADIUS authentication messages
  • 90. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 90 BRKSEC-3699 Tune NAD Configuration Rate Limiting at Wired Source Wired (IOS / IOS-XE) • RADIUS Interim Accounting: Use newinfo parameter with long interval (for example, 24-48 hrs), if available. Otherwise, set 15 mins. If LB present, set shorter than RADIUS persist time. • 802.1X Timeouts • held-period: Increase to 300+ sec • quiet-period: Increase to 300+ sec • ratelimit-period: Increase to 300+ sec • Inactivity Timer: Disable or increase to 1+ hours (3600+ sec) • Session Timeout: Disable or increase to 2+ hours (7200+ sec) • Reauth Timer: Disable or increase to 2+ hours (7200+ sec) • Bugfixes: Upgrade software to address critical defects. Reauth period Held-period 5 min Quiet-period / Exclusion 5 min Misbehaving supplicant Roaming supplicant Unknown users Reauth phones Switch Quiet period
  • 91. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 91 BRKSEC-3699 RADIUS Test Probes Reduce Frequency of RADIUS Server Health Checks Misbehaving supplicant Roaming supplicant Unknown users Reauth phones Heartbeat frequency Quiet Period • Wired NAD: RADIUS test probe interval set with idle-time parameter in radius- server config; Default is 60 minutes • No action required • Wireless NAD: If configured, WLC only sends “active” probe when server marked as dead. • No action required • Load Balancers: Set health probe intervals and retry values short enough to ensure prompt failover to another server in cluster occurs prior to NAD RADIUS timeout (typically 20-60 sec.) but long enough to avoid excessive test probes. Load Balancer LB Health probes Switch WLC Client Exclusion Quiet period
  • 92. Load Balancer RADIUS Test Probes. Recommended setting: failover must occur before the RADIUS timeout (typically 15-35 sec) while avoiding excessive probing. Citrix Example: • Probe frequency and retry settings: time interval between probes: interval seconds (default: 5); number of retries: retries number (default: 3). • Sample Citrix probe configuration: add lb monitor PSN-Probe RADIUS -respCode 2 -userName citrix_probe -password citrix123 -radKey cisco123 -LRTM ENABLED -interval 10 -retries 3 -destPort 1812 F5 Example: • Probe frequency and retry settings: time interval between probes: Interval seconds (default: 10); timeout before failure = 3*(interval)+1: Timeout seconds (default: 31). • Sample F5 RADIUS probe configuration: Name: PSN-Probe; Type: RADIUS; Interval: 10; Timeout: 31; Manual Resume: No; Check Until Up: Yes; User Name: f5-probe; Password: f5-ltm123; Secret: cisco123; Alias Address: * All Addresses; Alias Service Port: 1812; Debug: No
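The probe guidance above boils down to one comparison: the time a load balancer needs to declare a PSN down must be shorter than the NAD's overall RADIUS timeout. The sketch below encodes the F5 formula from the slide; the Citrix estimate (interval times retries) and the example NAD timeouts are simplifying assumptions, and all function names are hypothetical.

```python
def f5_timeout_s(interval_s: int) -> int:
    """F5 rule from the slide: timeout before failure = 3 * interval + 1."""
    return 3 * interval_s + 1


def citrix_detection_s(interval_s: int, retries: int) -> int:
    """Assumed worst case: one probe per interval, and every retry must fail."""
    return interval_s * retries


def fails_over_in_time(detection_s: int, nad_radius_timeout_s: int) -> bool:
    """True if the LB marks the server down before the NAD gives up."""
    return detection_s < nad_radius_timeout_s


print(f5_timeout_s(10))                              # slide's sample settings
print(citrix_detection_s(10, 3))                     # slide's sample settings
print(fails_over_in_time(f5_timeout_s(10), 35))      # NAD timeout of 35 sec (example)
print(fails_over_in_time(f5_timeout_s(10), 20))      # NAD timeout of 20 sec (example)
```

If the check returns False for the NAD timeouts in use, either the probe interval needs to shrink or the NAD timeout needs to grow, keeping in mind the slide's warning about excessive probing.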
  • 93. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 93 BRKSEC-3699 PSN Noise Suppression and Smarter Logging Filter Noise and Provide Better Feedback on Authentication Issues • PSN Collection Filters • PSN Misconfigured Client Dynamic Detection and Suppression • PSN Accounting Flood Suppression • Detect Slow Authentications • Enhanced Handling for EAP sessions dropped by supplicant or Network Access Server (NAS) • Failure Reason Message and Classification • Identify RADIUS Request From Session Started on Another PSN • Improved Treatment for Empty NAK List Detect and reject misbehaving clients Log Filter PSN Filter health probes from logging Reject bad supplicant PSN
  • 94. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 94 BRKSEC-3699 PSN - Collection Filters Static Client Suppression • PSN static filter based on single attribute: • User Name • Policy Set Name • NAS-IP-Address • Device-IP-Address • MAC (Calling-Station-ID) • Filter Messages Based on Auth Result: • All (Passed/Fail) • All Failed • All Passed • Select Messages to Disable Suppression for failed auth @PSN and successful auth @MnT Administration > System > Logging > Collection Filters
  • 95. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS PSN Filtering and Noise Suppression Dynamic Client Suppression Administration > System > Settings > Protocols > RADIUS Flag misconfigured supplicants for same auth failure within specified interval and stop logging to MnT Send alarm with failure statistics Valid Time ranges displayed by default Each endpoint tracked by: • Calling-Station-ID (MAC Address) • NAS-IP-Address (NAD address) • Failure reason BRKSEC-3699 95
  • 96. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS PSN Filtering and Noise Suppression Dynamic Client Suppression Administration > System > Settings > Protocols > RADIUS Flag misconfigured supplicants for same auth failure within specified interval and stop logging to MnT Send alarm with failure statistics Send immediate Access-Reject (do not even process request) IF: 1) Flagged for suppression 2) Fail auth total X times for same failure reason (inc 2 prev) Fully process next request after rejection period expires. Hard-coded @ 5 in ISE 2.0 BRKSEC-3699 96
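The two-stage behavior on these slides (first stop logging repeated failures, then reject outright) can be modeled as a small state machine. This is a rough sketch under stated assumptions, not ISE's implementation: the class and method names are hypothetical, the "two failures within the interval" trigger is taken from the timer diagram that follows, and the counter reset after rejection is a simplification.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Endpoints are tracked by (Calling-Station-ID, NAS-IP-Address, failure reason).
Key = Tuple[str, str, str]


@dataclass
class ClientState:
    failures: int = 0
    last_failure: Optional[float] = None
    suppressed: bool = False
    reject_until: float = 0.0


class SuppressionTracker:
    def __init__(self, suppress_interval: float, reject_after: int, reject_period: float):
        self.suppress_interval = suppress_interval  # window for "same failure" detection
        self.reject_after = reject_after            # total failures before rejection
        self.reject_period = reject_period          # how long to reject without processing
        self.states: Dict[Key, ClientState] = {}

    def on_request(self, key: Key, now: float) -> str:
        """Decide whether to process a request or reject it immediately."""
        st = self.states.get(key)
        if st is not None and now < st.reject_until:
            return "access-reject"
        return "process"

    def on_failure(self, key: Key, now: float) -> str:
        """Record a failed auth; return 'log' or 'suppress' for MnT logging."""
        st = self.states.setdefault(key, ClientState())
        recent = st.last_failure is not None and now - st.last_failure <= self.suppress_interval
        st.failures = st.failures + 1 if recent else 1
        if not recent:
            st.suppressed = False
        st.last_failure = now
        if st.failures >= 2:
            st.suppressed = True  # repeated same-reason failure: stop logging to MnT
        if st.suppressed and st.failures >= self.reject_after:
            st.reject_until = now + self.reject_period
            st.failures = 0       # simplification: start counting afresh
        return "suppress" if st.suppressed else "log"
```

A usage example: with `SuppressionTracker(suppress_interval=60, reject_after=5, reject_period=300)`, the first failure is logged, the second is suppressed, and after the fifth the endpoint receives immediate rejects until the rejection period expires.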
  • 97. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS PSN Noise Suppression Drop Excessive RADIUS Accounting Updates from “Misconfigured NADs” Administration > System > Settings > Protocols > RADIUS Allow 2 RADIUS Accounting Updates for same session in specified interval, then drop. BRKSEC-3699 97
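The accounting flood rule, "allow 2 updates for the same session in the interval, then drop", resembles a per-session rate limiter. The sketch below uses a sliding window; whether ISE uses a sliding or fixed window is not stated on the slide, so that choice, along with the class name `AccountingThrottle`, is an assumption.

```python
from collections import defaultdict, deque


class AccountingThrottle:
    """Accept at most max_updates accounting updates per session per interval."""

    def __init__(self, interval_s: float, max_updates: int = 2):
        self.interval_s = interval_s
        self.max_updates = max_updates
        self._seen = defaultdict(deque)  # session id -> timestamps of accepted updates

    def accept(self, session_id: str, now: float) -> bool:
        window = self._seen[session_id]
        # Forget accepted updates that have aged out of the interval.
        while window and now - window[0] >= self.interval_s:
            window.popleft()
        if len(window) >= self.max_updates:
            return False  # drop: too many updates for this session
        window.append(now)
        return True


throttle = AccountingThrottle(interval_s=300)
for ts in (0, 10, 20):
    print(ts, throttle.accept("session-1", ts))
```

Sessions are throttled independently, so one chatty NAD session does not cause updates for other sessions to be dropped.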
  • 98. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 98 BRKSEC-3699 MnT Log Suppression and Smarter Logging Drop and Count Duplicates / Provide Better Monitoring Tools • Drop duplicates and increment counter in Live Log for “matching” passed authentications • Display repeat counter to Live Sessions entries. • Update session, but do not log RADIUS Accounting Interim Updates • Log RADIUS Drops and EAP timeouts to separate table for reporting purposes and display as counters on Live Log Dashboard along with Misconfigured Supplicants and NADs • Alarm enhancements • Revised guidance to limit syslog at the source. • MnT storage allocation and data retention limits • More aggressive purging • Allocate larger VM disks to increase logging capacity and retention. Count and discard repeated events Count and discard untrusted events Count and discard repeats and unknown NAD events MNT
  • 99. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS MnT Noise Suppression Suppress Storage of Repeated Successful Auth Events Administration > System > Settings > Protocols > RADIUS Suppress Successful Reports = Do not save repeated successful auth events for the same session to MnT DB These events will not display in Live Authentications Log but do increment Repeat Counter. BRKSEC-3699 99
  • 100. MnT Noise Suppression: Suppress Storage of Repeated Successful Auth Events. Administration > System > Settings > Protocols > RADIUS. Detect NAD retransmission timeouts and log auth steps > threshold. Step latency is visible in Live Logs details: • 12304 Extracted EAP-Response containing PEAP challenge-response • 11808 Extracted EAP-Response containing EAP-MSCHAP challenge-response for inner method • 15041 Evaluating Identity Policy (Step latency=1048 ms) • 15006 Matched Default Rule • 15013 Selected Identity Source - Internal Users • 24430 Authenticating user against Active Directory • 24454 User authentication against Active Directory failed because of a timeout error (Step latency=30031 ms) • 24210 Looking up User in Internal Users IDStore - test1 • 24212 Found User in Internal Users IDStore • 22037 Authentication Passed • 11824 EAP-MSCHAP authentication attempt passed • 12305 Prepared EAP-Request with another PEAP challenge • 11006 Returned RADIUS Access-Challenge • 5411 Supplicant stopped responding to ISE (Step latency=120001 ms) Suppress Successful Reports = Do not save repeated successful auth events for the same session to MnT DB. These events will not display in Live Authentications Log but do increment Repeat Counter.
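When reviewing exported Live Log details, the step latency annotations shown above can be pulled out with a small parser. The sketch below assumes each step is on its own line in the `<id> <text> (Step latency=<n> ms)` form seen on the slide; the function name `slow_steps` and the threshold values are illustrative.

```python
import re

# Matches lines such as: "15041 Evaluating Identity Policy (Step latency=1048 ms)"
STEP_LATENCY = re.compile(r"^(\d+)\s+(.*?)\s*\(Step latency=(\d+) ms\)$")


def slow_steps(lines, threshold_ms):
    """Return (step_id, description, latency_ms) for steps slower than the threshold."""
    slow = []
    for line in lines:
        match = STEP_LATENCY.match(line.strip())
        if match and int(match.group(3)) > threshold_ms:
            slow.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return slow


steps = [
    "15041 Evaluating Identity Policy (Step latency=1048 ms)",
    "15006 Matched Default Rule",
    "24454 User authentication against Active Directory failed because of a timeout error (Step latency=30031 ms)",
    "5411 Supplicant stopped responding to ISE (Step latency=120001 ms)",
]

for step_id, text, latency in slow_steps(steps, threshold_ms=5000):
    print(f"{step_id}: {latency} ms - {text}")
```

In the slide's example, the Active Directory timeout and the unresponsive supplicant account for nearly all of the elapsed time.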
  • 101. Client Suppression and Reject Timers [Timing diagram between Endpoint, Access Device, PSN, and MnT, from t = T0 to t = T10. Ts = Failed Suppression Interval; Tr = Report Interval; Tx = Rejection Interval. The endpoint alternates failing 802.1X requests (12321 Cert Rejected) and MAB requests (22056 Subject not found); the first failures each produce a Failed Auth Log. Failure Detection: 2 failures of the same type within Ts (T2 < Ts) start Suppression, reported to MnT as Report 5434. Total 5 failures of the same type start Rejection: subsequent auth requests receive Access-Reject during Tx, reported to MnT as Report 5449. A later Successful Auth Log releases the endpoint (Release 5449).]
  • 102. Client Suppression and Reject Timers [Timing diagram between Endpoint, Access Device, PSN, and MnT. Tr = Report Interval; Tx = Rejection Interval. Across successive Rejection periods (Tx), Report 5449 is sent to MnT at each Report Interval (Tr).]
  • 103. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS RADIUS Accounting “Bad” Auth Requests ISE Log Suppression “Good”-put Versus “Bad”-put “Good” Auth Requests Incomplete Auth Requests Rejected Failed Auth Suppressed Successful Auth Suppressed RADIUS Accounting updates (not IP change) Accounting Updates Suppressed RADIUS Drops BRKSEC-3699 103 PSN MnT
  • 104. Typical Load Example [Figure: IN versus OUT]
  • 105. Extreme Noise Load Example [Figure: IN versus OUT]
  • 106. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS WLC – Client Exclusion Blacklist Misconfigured or Malicious Clients • Excessive Authentication Failures—Clients are excluded on the fourth authentication attempt, after three consecutive failures. • Client excluded for Time Value specified in WLAN settings. Recommend increase to 1-5 min (60-300 sec). 3 min is a good start. Note: Diagrams show default values BRKSEC-3699 106
  • 107. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 107 BRKSEC-3699 Live Authentications and Sessions Blue entry = Most current Live Sessions entry with repeated successful auth counter
  • 108. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS 108 BRKSEC-3699 Authentication Suppression Enable/Disable • Global Suppression Settings: Administration > System > Settings > Protocols > RADIUS Caution: Do not disable suppression in deployments with very high auth rates. It is highly recommended to keep Auth Suppression enabled to reduce MnT logging • Selective Suppression using Collection Filters: Administration > System > Logging > Collection Filters Configure specific traffic to bypass Successful Auth Suppression Useful for troubleshooting authentication for a specific endpoint or group of endpoints, especially in high auth environments where global suppression is always required. Failed Auth Suppression Successful Auth Suppression
  • 109. Per-Endpoint Time-Constrained Suppression (Right Click)
  • 110. Visibility into Rejected Endpoints!
  • 111. Releasing Rejected Endpoints
  • 112. Releasing Rejected Endpoints: Query/Release Rejected also available via ERS API!
  • 113. Distributed Logging: No Log Suppression vs. With Log Suppression
  • 115. Agenda © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS • ISE Appliance Redundancy • ISE Node Redundancy • Administration Nodes • Monitoring Nodes • pxGrid Nodes • HA for Certificate Services • Policy Service Node Redundancy • Load Balancing • Non-LB Options • NAD Fallback and Recovery BRKSEC-3699 115 High Availability Agenda
  • 117. Appliance Redundancy: In-Box High Availability
      Platform: SNS-3415 (34x5 Small) | SNS-3495 (34x5 Large) | SNS-3515 (35x5 Small) | SNS-3595 (35x5 Large)
      Drive Redundancy: No, (1) 600GB disk | Yes, (2) 600GB disks | No, (1) 600GB disk | Yes, (4) 600GB disks
      Controller Redundancy: No | Yes (RAID 1) | No (1GB FBWC Controller Cache) | Yes (RAID 10) (1GB FBWC Cache)
      Ethernet Redundancy: Yes*, 4 GE NICs = up to 2 bonded NICs | Yes*, 4 GE NICs = up to 2 bonded NICs | Yes*, 6 GE NICs = up to 3 bonded NICs | Yes*, 6 GE NICs = up to 3 bonded NICs
      Redundant Power: No (2nd PSU optional), UCSC-PSU-650W | Yes | No (2nd PSU optional), UCSC-PSU1-770W | Yes
      * ISE 2.1 introduced NIC Teaming support for High Availability only (not active/active).
      Note: SNS-3515 and SNS-3595 are the SNS-3500 Series.
  • 118. © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public #CLUS NIC Bonding Network Card Redundancy GE0 GE1 Primary Backup Bond 0 GE2 GE3 Primary Backup Bond 1 Bond 2 GE4 GE5 BRKSEC-3699 118 • For Redundancy only–NOT for increasing bandwidth. • Up to (3) bonds in ISE 2.1 • Bonded Interfaces Preset– Non-Configurable
  • 119. Bonded Interfaces for Redundancy: When GE0 Is Down, GE1 Takes Over
      • Both interfaces assume the same L2 (MAC) address.
      • When GE0 fails, GE1 assumes the IP address and keeps communications alive.
      • Failover is based on the link state of the primary interface.
      • The link state of the primary is inspected every 100 milliseconds.
  • 120. NIC Teaming / Interface Bonding
      • Configured using CLI only!
      • GE0 + GE1 bonding example:
          admin(config-GigabitEthernet0)# backup interface GigabitEthernet 1
      • Requires service restart. After restart, ISE recognizes bonded interfaces for Deployment and Profiling; Guest requires manual configuration of eligible interfaces.
  • 122. Admin Node HA and Synchronization: PAN Steady State Operation
      • Changes made to the Primary Administration DB are automatically synced to all nodes.
      • Maximum of two PAN nodes per deployment
      • Active / Standby
      [Diagram: Admin User; Admin Node (Primary); Admin Node (Secondary); Monitoring Node (Primary); Monitoring Node (Secondary); PSNs; PXG; Policy Sync arrows]
  • 123. Admin Node HA and Synchronization: Primary PAN Outage and Recovery
      • Prior to ISE 1.4, upon Primary PAN failure, the admin user must connect to the Secondary PAN and manually promote Secondary to Primary; the new Primary syncs all new changes.
      • PSNs buffer endpoint updates if the Primary PAN is unavailable; buffered updates are sent once a PAN is available.
      • Promoting the Secondary Admin node may take 10-15 minutes before the process is complete.
      • New Guest Users or Registered Endpoints cannot be added or connect to the network when the Primary Administration node is unavailable!
      [Diagram: Admin User; Admin Node (Primary); Admin Node (Secondary); Monitoring Node (Primary); Monitoring Node (Secondary); PSNs; PXG; Policy Sync arrows]
  • 124. Policy Service Survivability When Admin Down/Unreachable: Which User Services Are Available if the Primary Admin Node Is Unavailable?
      Service | Use case | Works (Y / N)
      RADIUS Auth | Generally all RADIUS auth should continue, provided access to ID stores. | Y
      Guest | All existing guests can be authenticated, but new guests, self-registered guests, or guest flows relying on device registration will fail. | N
      Profiler | Previously profiled endpoints can be authenticated with existing profile. New endpoints or updates to existing profile attributes received by owner should apply, but not profile data received by PSN in foreign node group. | Y
      Posture | Provisioning/Assessment work, but Posture Lease unable to fetch timer. | Y
      Device Reg | Device Registration fails if unable to update endpoint record in central DB. | N
      BYOD/NSP | BYOD/NSP relies on device registration. Additionally, any provisioned certificate cannot be saved to database. | N
      MDM | MDM fails on update of endpoint record. | N
      CA/Cert Services | See BYOD/NSP use case; certificates can be issued but will not be saved and thus fail. OCSP functions using last replicated version of database. | N
      pxGrid | Clients that are already authorized for a topic and connected to controller will continue to operate, but new registrations and connections will fail. | N
      TACACS+ | TACACS+ requests can be locally processed per ID store availability. | Y
  • 125. Automatic PAN Switchover (Introduced in ISE 1.4)
      • Primary PAN (PAN-1) down or network link down.
      • If the Health Check Node is unable to reach PAN-1 but can reach PAN-2, it triggers failover.
      • Secondary PAN (PAN-2) is promoted by the Health Check Node.
      • PAN-2 becomes Primary and takes over PSN replication.
      • Note: Switchover is NOT immediate. Total time is based on polling intervals and promotion time. Expect ~15-30 minutes.
      • Don't forget: after switchover, the admin must connect to PAN-2 for ISE management!
      [Diagram: DC-1 (PAN-1 Primary, MNT-1 Primary, Primary PAN Health Check Node) and DC-2 (PAN-2 Secondary, MNT-2 Secondary, Secondary PAN Health Check Node) connected over WAN]
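The switchover rule on this slide can be sketched as a small decision function. This is a minimal illustration, not ISE's implementation: the names `HealthCheckState` and `should_promote`, and the `failed_polls_before_failover` threshold, are hypothetical, introduced only to show why total switchover time depends on the polling interval multiplied by the number of missed polls, plus promotion time.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckState:
    """Hypothetical per-health-check-node state."""
    failed_polls: int = 0  # consecutive polls in which PAN-1 was unreachable


def should_promote(state, primary_reachable, secondary_reachable,
                   failed_polls_before_failover):
    """Return True when the health check node should promote PAN-2.

    Failover is triggered only if PAN-1 has been unreachable for the
    configured number of consecutive polls AND PAN-2 is reachable.
    """
    if primary_reachable:
        state.failed_polls = 0  # any successful poll resets the counter
        return False
    state.failed_polls += 1
    return secondary_reachable and state.failed_polls >= failed_polls_before_failover


# Usage: one call per polling interval.
state = HealthCheckState()
for reachable in (True, False, False, False):
    decision = should_promote(state, reachable, True, 3)
```

A health check node that loses sight of both PANs never promotes, which matches the slide's condition that PAN-2 must be reachable.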
  • 126. PAN Failover: Health Check Node Configuration
      • Configuration using GUI only, under Administration > System > Deployment > PAN Failover
      • Health Check Node CANNOT be a PAN!
      • Requires a minimum of 3 nodes; the 3rd node is an independent observer.
  • 127. HA for Monitoring and Troubleshooting: Steady State Operation
      • MnT nodes concurrently receive logging from PAN, PSN, IPN*, NAD, and ASA.
      • PAN retrieves log/report data from the Primary MnT node when available.
      • Maximum of two MnT nodes per deployment
      • Active / Active
      • Syslog (port 20514) from ISE nodes is sent for session tracking and reporting.
      • Syslog from access devices is correlated with the user/device session.
      • Syslog from a firewall (or other user logging device) is correlated with the guest session for activity logging.
      [Diagram: Admin User; PAN; PSN; PXG; NADs; FW; Monitoring Node (Primary); Monitoring Node (Secondary); MnT data]
  • 128. HA for Monitoring and Troubleshooting: Primary MnT Outage and Recovery
      • Upon MnT node failure, PAN, PSN, NAD, and ASA continue to send logs to the remaining MnT node.
      • PAN auto-detects Active MnT failure and retrieves log/report data from the Secondary MnT node.
      • Full failover to the Secondary MnT may take 5-15 minutes depending on the type of failure.
      • PSN logs are not locally buffered when MnT is down unless TCP or Secure syslog is used.
      • Log DB is not synced between MnT nodes.
      • Upon return to service, the recovered MnT node will not include data logged during the outage.
      • Backup/Restore is required to re-sync the MnT database.
      • Syslog (port 20514) from ISE nodes is sent for session tracking and reporting.
      • Syslog from access devices is correlated with the user/device session.
      • Syslog from a firewall (or other user logging device) is correlated with the guest session for activity logging.
      [Diagram: Admin User; PAN; PSN; PXG; NADs; FW; Monitoring Node (Primary); Monitoring Node (Secondary); MnT data]
  • 129. Log Buffering: TCP and Secure Syslog Targets
      • Default UDP-based audit logging does not buffer data when MnT is unavailable.
      • TCP and Secure Syslog options can be used to buffer logs locally.
      • Note: Overall log performance will decrease if these acknowledged options are used.
  • 130. HA for pxGrid v1: Steady State
      • pxGrid clients can be configured with up to 2 servers for redundancy.
      • Clients connect to a single active controller for a given domain.
      • Max two pxGrid v1 nodes per deployment (Active/Standby)
      • PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
      • MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
      [Diagram: Primary PAN, Secondary PAN, Primary MnT, Secondary MnT (pxGrid Clients, Publishers); Active pxGrid Controller; Standby pxGrid Controller; pxGrid Client (Subscriber); connections on TCP/5222 and TCP/12001]
  • 131. HA for pxGrid v1: Failover and Recovery
      • If the active pxGrid Controller fails, clients automatically attempt connection to the standby controller.
      • Max two pxGrid v1 nodes per deployment (Active/Standby)
      • PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
      • MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
      [Diagram: Primary PAN, Secondary PAN, Primary MnT, Secondary MnT (pxGrid Clients, Publishers); Active pxGrid Controller; Standby pxGrid Controller; pxGrid Client (Subscriber); connections on TCP/5222 and TCP/12001]
  • 132. HA for pxGrid v2 (ISE 2.3+): Steady State
      • pxGrid clients can be configured with multiple servers for redundancy.
      • Clients connect to a single active controller for a given domain.
      • 2.3: Max two pxGrid v2 nodes per deployment (Active/Active)
      • 2.4: Max 4 nodes (All Active)
      • PAN Publisher Topics: Controller Admin, TrustSec/SGA, Endpoint Profile
      • MnT Publisher Topics: Session Directory, Identity Group, ANC (EPS)
      [Diagram: Primary PAN, Secondary PAN, Primary MnT, Secondary MnT (pxGrid Clients, Publishers); Active pxGrid Controller #1; Active pxGrid Controller #2; pxGrid Client #1 (Subscriber); pxGrid Client #2 (Subscriber); connections on TCP/5222 and TCP/12001]
  • 134. Load Balancing RADIUS, Web, and Profiling Services
      • Policy Service nodes can be configured in a cluster behind a load balancer (LB).
      • Access Devices send RADIUS and TACACS+ AAA requests to the LB virtual IP.
      • N+1 node redundancy is assumed to support total endpoints during:
          - Unexpected server outage
          - Scheduled maintenance
          - Scaling buffer
      • HA for the LB itself is assumed.
      [Diagram: Network Access Devices and VPN; Load Balancers with Virtual IP; PSNs (User Services)]
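The N+1 assumption above amounts to simple arithmetic: size the cluster for the full endpoint count, then add one spare node. The sketch below is illustrative only; `psns_required` is a hypothetical helper, and the per-PSN capacity figure used in the usage line is a made-up placeholder, not a platform rating.

```python
import math


def psns_required(total_endpoints, endpoints_per_psn, spare_nodes=1):
    """Number of PSNs for N+1 redundancy.

    N is the node count that carries the full load; spare_nodes (default 1)
    covers an unexpected outage, scheduled maintenance, or scaling buffer.
    """
    if total_endpoints < 0 or endpoints_per_psn <= 0:
        raise ValueError("endpoint counts must be positive")
    base_nodes = math.ceil(total_endpoints / endpoints_per_psn)
    return base_nodes + spare_nodes


# Usage with placeholder numbers: 50,000 endpoints, 20,000 per PSN (assumed).
cluster_size = psns_required(50_000, 20_000)
```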
  • 135. Configure Node Groups for LB Cluster: Place All PSNs in the LB Cluster in the Same Node Group
      • Administration > System > Deployment
          1) Create node group
          2) Assign name
          3) Add individual PSNs to node group
      • Node group members can be L2 or L3.
      • Multicast not required
  • 136. High-Level Load Balancing Diagram (For Your Reference)
      [Diagram: End User/Device; Access Device (NAS IP: 10.1.50.2); Load Balancer with VIP 10.1.98.8 on VLAN 98 (10.1.98.0/24) and LB address 10.1.99.1 on VLAN 99 (10.1.99.0/24); ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); ISE-PAN-1, ISE-PAN-2, ISE-MNT-1, ISE-MNT-2; External Logger, AD, LDAP, MDM, DNS, NTP, SMTP]
  • 137. Traffic Flow, Fully Inline: Physical Network Separation Using Separate LB Interfaces
      • Load Balancer is directly inline between PSNs and the rest of the network.
      • All traffic flows through the Load Balancer, including RADIUS, PAN/MnT, Profiling, Web Services, Management, Feed Services, MDM, AD, LDAP…
      • Fully inline traffic flow is recommended, physical or logical.
      [Diagram: End User/Device; Access Device (NAS IP: 10.1.50.2); Load Balancer with VIP 10.1.98.8 on VLAN 98 (External, 10.1.98.0/24) and LB address 10.1.99.1 on VLAN 99 (Internal, 10.1.99.0/24); ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); ISE-PAN, ISE-MNT, External Logger, AD, LDAP, MDM, DNS, NTP, SMTP]
  • 138. Traffic Flow, Fully Inline: Logical Network Separation Using Single LB Interface and VLAN Trunking
      • LB is directly inline between ISE PSNs and the rest of the network.
      • All traffic flows through the LB, including RADIUS, PAN/MnT, Profiling, Web Services, Management, Feed Services, MDM, AD, LDAP…
      [Diagram: End User/Device; Access Device (NAS IP: 10.1.50.2); Network Switch; Load Balancer (10.1.98.1, 10.1.98.2, 10.1.99.1) with VIP 10.1.98.8; VLAN 98 (External); VLAN 99 (Internal); ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); ISE-PAN, ISE-MNT, External Logger, AD, LDAP, MDM, DNS, NTP, SMTP]
  • 139. Partially Inline: Layer 2/Same VLAN (One PSN Interface), Direct PSN Connections to LB and Rest of Network
      • All inbound LB traffic, such as RADIUS, Profiling, and directed Web Services, is sent to the LB VIP.
      • Other inbound non-LB traffic bypasses the LB, including redirected Web Services, PAN/MnT, Management, Feed Services, MDM, AD, LDAP…
      • All outbound traffic from PSNs is sent to the LB as default gateway (DFGW).
      • LB must be configured to allow asymmetric traffic.
      • Generally NOT RECOMMENDED due to traffic flow complexity: you must fully understand the path of each flow to ensure proper handling by routing, LB, and end stations.
      [Diagram: End User/Device; Access Device (NAS IP: 10.1.50.2); L3 Switch; Load Balancer with VIP 10.1.98.8; VLAN 98 addresses 10.1.98.1, 10.1.98.2, 10.1.98.5, 10.1.98.6, 10.1.98.7; ISE-PSN-1, ISE-PSN-2, ISE-PSN-3; ISE-PAN, ISE-MNT, External Logger, AD, LDAP, MDM, DNS, NTP, SMTP]
  • 140. PSN Load Balancing: Sample Topology and Flow (For Your Reference)
      • Request for service at single host 'psn-cluster'
      • DNS request sent to resolve psn-cluster FQDN (DNS Lookup = psn-vip.company.com; DNS response = 10.1.98.8)
      • Request sent to Virtual IP Address (VIP) 10.1.98.8 (Request to psn-vip.company.com)
      • Response returned from real server ise-psn-3 @ 10.1.99.7, then Source NAT'ed back to VIP @ 10.1.98.8 (Response from psn-vip.company.com)
      [Diagram: User; Access Device; DNS Server; Load Balancer with VIP 10.1.98.8 (PSN-VIP) on VLAN 98 (10.1.98.0/24); VLAN 99 (10.1.99.0/24) with ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7)]
  • 141. Load Balancing Policy Services
      • RADIUS AAA Services: Packets sent to the LB virtual IP are load-balanced to a real PSN based on the configured algorithm. The sticky algorithm determines the method used to ensure the same Policy Service node services the same endpoint.
      • Web Services:
          • URL-Redirected: Posture (CPP) / Central WebAuth (CWA) / Native Supplicant Provisioning (NSP) / Hotspot / Device Registration WebAuth (DRW), Partner MDM. No LB required! The PSN that terminates RADIUS returns a URL Redirect with its own certificate CN name substituted for the 'ip' variable in the URL.
          • Direct HTTP/S: Local WebAuth (LWA) / Sponsor / MyDevices Portal, OCSP. A single web portal domain name should resolve to the LB virtual IP for HTTP/S load balancing.
      • Profiling Services: DHCP Helper / SNMP Traps / NetFlow / RADIUS. The LB VIP is the target for one-way profile data (no response required). The VIP can be the same as or different from the one used by RADIUS LB; the real server interface can be the same as or different from the one used by RADIUS.
      • TACACS+ AAA Services (Session and Command Auth and Accounting): The LB VIP is the target for TACACS+ requests. T+ is not session based like RADIUS, so requests are not required to go to the same PSN.
  • 143. Load Balancing RADIUS: Sample Flow
      1. NAD has a single RADIUS server defined (10.1.98.8):
          radius-server host 10.1.98.8
      2. RADIUS Auth requests are sent to VIP @ 10.1.98.8.
      3. Requests for the same endpoint are load balanced to the same PSN via sticky based on RADIUS Calling-Station-ID and Framed-IP-Address.
      4. RADIUS response is received from VIP @ 10.1.98.8 (originated by real server ise-psn-3 @ 10.1.99.7 and source translated by LB).
      5. RADIUS Accounting is sent to/from the same PSN based on sticky.
      [Diagram: User; Access Device; Load Balancer with VIP 10.1.98.8 (PSN-CLUSTER) on VLAN 98 (10.1.98.0/24); VLAN 99 (10.1.99.0/24) with ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); RADIUS AUTH and ACCTG requests to, and responses from, 10.1.98.8]
  • 144. Load Balancer Persistence (Stickiness) Guidelines: Persistence Attributes
      • Common RADIUS sticky attributes:
          o Client Address: Calling-Station-ID, Framed-IP-Address
          o NAD Address: NAS-IP-Address, Source IP Address
          o Session ID: RADIUS Session ID, Cisco Audit Session ID
          o Username
      • Best practice recommendations (depends on LB support and design):
          1. Calling-Station-ID for persistence across NADs and sessions
          2. Source IP or NAS-IP-Address for persistence for all endpoints connected to the same NAD
          3. Audit Session ID for persistence across re-authentications
      [Diagram: User Device (MAC Address=00:C0:FF:1A:2B:3C, IP Address=10.1.10.101, Username=jdoe@company.com); Access Device 10.1.50.2 (Session: 00aa…99ff); Load Balancer VIP: 10.1.98.8; ISE-PSN-1, ISE-PSN-2, ISE-PSN-3]
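Persistence on Calling-Station-ID can be pictured as a lookup table keyed on the endpoint MAC address. The sketch below is a simplified model, not any vendor's persistence engine: `normalize_mac` and `StickyTable` are hypothetical names, new endpoints are placed round-robin as a stand-in for whatever algorithm the LB is configured with, and entry timeouts are omitted.

```python
import re


def normalize_mac(calling_station_id):
    """Reduce 00:C0:FF:1A:2B:3C, 00-c0-ff-1a-2b-3c, or 00c0.ff1a.2b3c to 12 hex digits."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", calling_station_id).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {calling_station_id!r}")
    return digits


class StickyTable:
    """Maps each endpoint MAC to the PSN that first served it."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.table = {}
        self.next_index = 0

    def pick(self, calling_station_id):
        key = normalize_mac(calling_station_id)
        if key not in self.table:
            # New endpoint: choose the next server round-robin, then remember it.
            self.table[key] = self.servers[self.next_index % len(self.servers)]
            self.next_index += 1
        return self.table[key]


# Usage: the same endpoint seen through different NADs (and MAC formats)
# lands on the same PSN.
lb = StickyTable(["ise-psn-1", "ise-psn-2", "ise-psn-3"])
first = lb.pick("00:C0:FF:1A:2B:3C")
again = lb.pick("00c0.ff1a.2b3c")
```

Normalizing the key matters because different NADs may format the same MAC differently in Calling-Station-ID.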
  • 145. Load Balancer Stickiness Guidelines: Config Examples Based on Calling-Station-ID (MAC Address)
      • Cisco ACE example:
          sticky radius framed-ip calling-station-id RADIUS-STICKY
            serverfarm ise-psn
      • F5 LTM iRule example:
          ltm rule RADIUS_iRule {
            when CLIENT_ACCEPTED {
              persist uie [RADIUS::avp 31]
            }
          }
      • Citrix NetScaler example:
          add lb vserver radius-auth RADIUS 172.16.0.16 1812 -rule "CLIENT.UDP.RADIUS.ATTR_TYPE(31)" -cltTimeout 120
          add lb vserver radius-acct RADIUS 172.16.0.16 1813 -rule "CLIENT.UDP.RADIUS.ATTR_TYPE(31)" -cltTimeout 120
          set lb group RADIUS-Calling-Station-ID -persistenceType RULE -rule "CLIENT.UDP.RADIUS.ATTR_TYPE(31)"
      • Be sure to monitor load balancer resources when performing advanced parsing.
  • 146. LB Fragmentation and Reassembly: Be Aware of Load Balancers That Do Not Reassemble RADIUS Fragments!
      • Example: EAP-TLS with large certificates
      • Need to address path fragmentation or persist on source IP.
      • ACE reassembles the RADIUS packet.
      • F5 LTM reassembles packets by default except for the FastL4 Protocol.
          • Must be manually enabled under the FastL4 Protocol Profile.
      • Citrix NetScaler fragmentation defect: resolved in NetScaler 10.5 Build 50.10.
          • Issue ID 429415 addresses fragmentation and the reassembly of large/jumbo frames.
      • Also watch for fragmented packets that are too small. LBs have a minimum allowed fragment size and will drop smaller fragments!
      [Diagram: RADIUS packet with a large certificate split into two IP fragments. Fragment #1 carries Calling-Station-ID + Certificate Part 1 (LB on Calling-Station-ID); Fragment #2 carries Certificate Part 2 (no Calling-Station-ID in the packet, so LB on source IP).]
  • 147. LB Fragmentation and Reassembly: Watch for Packet Fragments Smaller Than the LB Will Accept!
      • Example: Intermediate switch/gateway fragments packets below the LB minimum.
      • Need to address path fragmentation or change the LB minimum fragment size.
      • ACE: fragment min-mtu <bytes> (default 576 bytes)
      • F5 LTM: # tmsh modify sys db tm.minipfragsize value 1
          • Pre-11.6: Default = 576 bytes
          • 11.6.0+: Default = 566 bytes
      [Diagram: RADIUS packet with a large certificate passes a switch with low MTU and arrives as Frag1 through Frag4, each <= 512 bytes; LB min frag size = 576 bytes]
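To see why a low-MTU hop causes drops, it helps to compute the fragment sizes it produces and compare them with the LB minimum. The following is a rough sketch under stated assumptions: `fragment_sizes` and `undersized_fragments` are hypothetical helpers, a 20-byte IPv4 header without options is assumed, and treating the final fragment as exempt from the minimum is an assumption that should be verified against the specific load balancer.

```python
def fragment_sizes(ip_payload_len, mtu, ip_header_len=20):
    """Return the total size (header + payload) of each IPv4 fragment.

    Every fragment except the last carries a payload that is a multiple
    of 8 bytes, as IPv4 fragment offsets are expressed in 8-byte units.
    """
    max_payload = (mtu - ip_header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    remaining = ip_payload_len
    while remaining > max_payload:
        sizes.append(ip_header_len + max_payload)
        remaining -= max_payload
    sizes.append(ip_header_len + remaining)
    return sizes


def undersized_fragments(sizes, min_frag_size=576, exempt_last=True):
    """Indexes of fragments below the LB minimum (last one exempt by assumption)."""
    candidates = sizes[:-1] if exempt_last else sizes
    return [i for i, size in enumerate(candidates) if size < min_frag_size]


# Usage: an 1800-byte UDP datagram crossing a hop with a 512-byte MTU.
low_mtu = fragment_sizes(1800, 512)
at_risk = undersized_fragments(low_mtu)
```

With a 1500-byte MTU the same datagram yields one full-size fragment plus a short final one, so nothing falls below a 576-byte minimum under the stated assumption.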
  • 148. NAT Restrictions for RADIUS Load Balancing: Why Source NAT (SNAT) Fails for NADs
      • With SNAT, the LB appears as the Network Access Device (NAD) to the PSN.
      • CoA is sent to the wrong IP address.
      • SNAT results in less visibility, as all requests appear sourced from the LB; this makes troubleshooting more difficult.
      • NAS IP Address is correct, but not currently used for CoA.
      • User Story 8601: CoA support for NAT'ed load balanced environments
  • 149. SNAT of NAD Traffic: Live Log Example. Auth Succeeds/CoA Fails: CoA Sent to Load Balancer and Dropped
  • 150. Allow Source NAT for PSN CoA Requests: Simplifying Switch CoA Configuration
      • Match traffic from PSNs to UDP/1700 or UDP/3799 (RADIUS CoA) and translate to the PSN cluster VIP.
      • Access switch config, before:
          aaa server radius dynamic-author
           client 10.1.99.5 server-key cisco123
           client 10.1.99.6 server-key cisco123
           client 10.1.99.7 server-key cisco123
           client 10.1.99.8 server-key cisco123
           client 10.1.99.9 server-key cisco123
           client 10.1.99.10 server-key cisco123
           <…one entry per PSN…>
      • Access switch config, after:
          aaa server radius dynamic-author
           client 10.1.98.8 server-key cisco123
      [Diagram: ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7), ISE-PSN-X (10.1.99.x); CoA leaves the PSN with SRC=10.1.99.5 and reaches the Access Switch via the Load Balancer with SRC=10.1.98.8]
  • 151. Allow Source NAT for PSN CoA Requests: Simplifying WLC CoA Configuration
      • Before: One RADIUS Server entry required per PSN that may send CoA from behind the load balancer.
      • After: One RADIUS Server entry required per load balancer VIP. Simplifies config and reduces the number of ACL entries required to permit access to each PSN.
  • 153. Load Balancing with URL-Redirection: URL Redirect Web Services (Hotspot/DRW, CWA, BYOD, Posture, MDM)
      1. RADIUS Authentication requests are sent to VIP @ 10.1.98.8.
      2. Requests for the same endpoint are load balanced to the same PSN via RADIUS sticky.
      3. RADIUS Authorization is received from VIP @ 10.1.98.8 (originated by ise-psn-3 @ 10.1.99.7) with URL Redirect to https://ise-psn-3.company.com:8443/...
      4. Client browser is redirected and resolves the FQDN in the URL to the real server address (DNS Lookup = ise-psn-3.company.com; DNS Response = 10.1.99.7).
      5. User sends the web request directly to the same PSN that serviced the RADIUS request.
      • ISE Certificate Subject CN = ise-psn-3.company.com
      [Diagram: User; Access Device; DNS Server; Load Balancer with VIP 10.1.98.8 (PSN-CLUSTER); ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); RADIUS request to and response from psn-vip.company.com; HTTPS response from ise-psn-3.company.com]
  • 154. Load Balancing URL-Redirected Services: When and How to Override Default URL Redirection from Client to PSN
      • Use cases for the LB to terminate redirected HTTPS requests:
          • Obfuscate PSN node names/IP addresses (do not want PSN name exposed to browser).
          • Ability to use a different certificate for the user-facing connection.
          • Apply security inspections on web-based requests.
          • As a way to secure PSN interfaces in the DMZ.
      • Requires the Authorization Profile to be configured with the Static Hostname option.
      • Load Balancer must be able to persist the web request to the same PSN that serviced the RADIUS session. Common methods (else rely on ISE policy logic):
          • LB includes Framed-IP-Address with RADIUS sticky; correlates Framed-IP to HTTPS source IP.
          • LB includes Session Id with RADIUS sticky; correlates Session Id in web request.
            url-redirect=https://<PSN_CN>:8443/guestportal/gateway?sessionId=SessionIdValue&action=cwa
      • Note: Since ISE assumes HTTPS for web access, offload cannot be used to increase SSL performance. The Load Balancer must reestablish the SSL connection to the real PSN servers.
      • Reference: F5 LTM loadbalancing Radius and HTTP traffic for ISE, http://www.cisco.com/c/en/us/support/docs/security/identity-services-engine/200317-F5-LTM-loadbalancing-Radius-and-HTTP-tra.html
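The url-redirect template on this slide can be illustrated with a short helper that substitutes either the PSN's own certificate CN (the default behavior described earlier) or a static hostname (the override case). This is a sketch for illustration only: `build_redirect_url` is a hypothetical function, not an ISE API, and the session ID value is a placeholder.

```python
from urllib.parse import urlencode


def build_redirect_url(psn_cn, session_id, static_host=None, port=8443):
    """Build a CWA redirect URL following the slide's template.

    By default the host is the CN of the PSN that terminated RADIUS;
    static_host models the Static Hostname override, in which case the LB
    (or policy logic) must still deliver the request to that same PSN.
    """
    host = static_host or psn_cn
    query = urlencode({"sessionId": session_id, "action": "cwa"})
    return f"https://{host}:{port}/guestportal/gateway?{query}"


# Usage: default redirect vs. static hostname pointing at an LB-fronted name.
default_url = build_redirect_url("ise-psn-3.company.com", "00aa99ff")
static_url = build_redirect_url("ise-psn-3.company.com", "00aa99ff",
                                static_host="psn-vip.company.com")
```

Because the session ID travels in the query string either way, an LB that recorded it during RADIUS sticky can use it to route the web request.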
  • 155. URL Redirection Using Static IP/Hostname: Overriding Automatic Redirection to PSN IP Address/FQDN
      • Allows a static IP or FQDN value to be returned for CWA or other URL-redirected flows.
      • Common use case: Public DNS or IP address (no DNS available) must be used while preserving variable substitution for port and sessionId variables.
      • Configured under Policy > Policy Elements > Results > Authorization > Authorization Profiles
      • DMZ PSN certificate must match the IP/static FQDN.
      • Specified IP address/hostname MUST point to the same PSN that terminates the RADIUS session. If there are multiple PSNs, LB persistence or AuthZ Policy logic is required to ensure the redirect occurs to the correct PSN.
  • 156. "Universal Certs": UCC or Wildcard SAN Certificates
      • Universal Cert options:
          • UCC / Multi-SAN
          • Wildcard SAN
      • CN must also exist in SAN.
      • Other FQDNs or wildcard as "DNS Names"
      • IP Address is also an option.
      • Check box to use wildcards.
      [Screenshot labels: ise-psn/Admin; ise-psn; ise-psn.company.com; mydevices.company.com; sponsor.company.com; psn.ise.company.com; *.ise.company.com]
  • 158. Load Balancing Profiling Services: Sample Flow
      1. Client OS sends DHCP Request.
      2. Next-hop router with IP Helper configured forwards the DHCP request to the real DHCP server (Helper IP 10.1.1.10) and to the secondary entry = LB VIP (Helper IP 10.1.98.8).
      3. Real DHCP server responds and provides the client a valid IP address.
      4. DHCP request to the VIP is load balanced to PSN @ 10.1.99.7 based on source IP sticky (L3 gateway) or a DHCP field parsed from the request.
      [Diagram: User; Access Device; DHCP Server; Load Balancer with VIP 10.1.98.8 (PSN-CLUSTER); ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7)]
  • 159. Load Balancing Simplifies Device Configuration: L3 Switch Example for DHCP Relay
      • Before:
          !
          interface Vlan10
           description EMPLOYEE
           ip address 10.1.10.1 255.255.255.0
           ip helper-address 10.1.100.100   <--- Real DHCP Server
           ip helper-address 10.1.99.5      <--- ISE-PSN-1
           ip helper-address 10.1.99.6      <--- ISE-PSN-2
          !
      • After:
          !
          interface Vlan10
           description EMPLOYEE
           ip address 10.1.10.1 255.255.255.0
           ip helper-address 10.1.100.100   <--- Real DHCP Server
           ip helper-address 10.1.98.8      <--- LB VIP
          !
      • Settings apply to each L3 interface servicing DHCP endpoints.
  • 160. Load Balancing Sticky Guidelines: Ensure DHCP and RADIUS for a Given Endpoint Use the Same PSN
      1. RADIUS Authentication request is sent to VIP @ 10.1.98.8.
      2. Request is load balanced to PSN-3, and an entry is added to the Persistence Cache.
      3. DHCP Request is sent to VIP @ 10.1.98.8 (IP Helper sends DHCP to VIP).
      4. Load Balancer uses the same "sticky" as RADIUS, based on client MAC address.
      5. DHCP is received by the same PSN, thus optimizing endpoint replication.
      [Diagram: User (MAC: 11:22:33:44:55:66); NAD; F5 LTM with VIP 10.1.98.8; Persistence Cache: 11:22:33:44:55:66 -> PSN-3; ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); RADIUS request to VIP and RADIUS response from PSN-3]
  • 161. F5 iRule to Drop DHCP Informs (courtesy of …)
      when RULE_INIT {
          set static::DDIP_debug 1
      }
      when CLIENT_ACCEPTED {
          if { [UDP::payload length] > 200 } {
              binary scan [UDP::payload] x240H* dhcp_option_payload
              set option_hex 0
              set options_length [expr {([UDP::payload length] -240) * 2 }]
              for {set i 0} {$i < $options_length} {incr i [expr { $length * 2 + 2 }]} {
                  # extract option value and convert into decimal
                  # for human readability
                  binary scan $dhcp_option_payload x[expr { $i } ]a2 option_hex
                  set tmpvalue1 0x$option_hex
                  set option [expr { $tmpvalue1 }]
                  # move index to get length field
                  incr i 2
                  # extract length value and convert length from Hex string to decimal
                  binary scan $dhcp_option_payload x[expr { $i } ]a2 length_hex
                  set tmpvalue2 0x$length_hex
                  set length [expr { $tmpvalue2 }]
                  # extract value field in hexadecimal format
                  binary scan $dhcp_option_payload x[expr { $i + 2} ]a[expr { $length * 2 }] value_hex
                  # iRule Continued
                  if { $static::DDIP_debug } {
                      log local0. "DHCP option is $option, value is $value_hex"
                  }
                  switch $option {
                      53 {
                          # DHCP Message Type
                          switch $value_hex {
                              08 {
                                  if { $static::DDIP_debug } {
                                      log local0. "Dropping DHCP Inform packet: $value_hex"
                                  }
                                  drop
                                  return
                              }
                              default { }
                          }
                      }
                  }
              }
          }
      }
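For readers less familiar with Tcl, the same option walk can be sketched in Python. This is an illustrative equivalent rather than a translation of the iRule: `dhcp_message_type` and `should_drop` are hypothetical names, and unlike the iRule this version also handles the single-byte pad (0) and end (255) options. The 240-byte offset is the fixed 236-byte BOOTP header plus the 4-byte magic cookie, and message type 8 is DHCPINFORM.

```python
DHCP_OPTIONS_OFFSET = 240   # 236-byte BOOTP header + 4-byte magic cookie
OPT_PAD, OPT_MESSAGE_TYPE, OPT_END = 0, 53, 255
DHCPINFORM = 8


def dhcp_message_type(payload: bytes):
    """Return the value of DHCP option 53, or None if it is absent."""
    options = payload[DHCP_OPTIONS_OFFSET:]
    i = 0
    while i < len(options):
        code = options[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:          # pad has no length byte
            i += 1
            continue
        if i + 1 >= len(options):    # truncated option header
            break
        length = options[i + 1]
        value = options[i + 2:i + 2 + length]
        if code == OPT_MESSAGE_TYPE and len(value) == 1:
            return value[0]
        i += 2 + length              # skip code, length, and value
    return None


def should_drop(payload: bytes) -> bool:
    """Mirror the iRule's decision: drop only DHCP Inform packets."""
    return dhcp_message_type(payload) == DHCPINFORM


# Usage: a minimal synthetic packet carrying option 53 = 8 (Inform).
header = bytes(236) + b"\x63\x82\x53\x63"
inform = header + bytes([53, 1, 8, 255])
drop_it = should_drop(inform)
```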
  • 163. Load Balancing TACACS+: Session Authentication, Authorization, and Accounting
      1. NAD has a single TACACS+ server defined (10.1.98.18):
          tacacs-server host 10.1.98.18
      2. TACACS+ Session Authentication requests are sent to VIP @ 10.1.98.18.
      3. Requests from the same Admin user are load balanced to the same PSN via sticky based on source IP (NAD IP address).
      4. TACACS+ response is received from VIP @ 10.1.98.18 (originated by real server ise-psn-3 @ 10.1.99.7 and source translated by LB).
      5. TACACS+ Session Authorization and Accounting are sent to/from the same PSN per sticky.
      • Virtual IP = TACACS+ Server
      • VIP listens on TCP/49
      • Sticky based on source IP
      [Diagram: Device Admin; Access Device; Load Balancer with VIP 10.1.98.18 (ISE-CLUSTER) on VLAN 98 (10.1.98.0/24); VLAN 99 (10.1.99.0/24) with ISE-PSN-1 (10.1.99.5), ISE-PSN-2 (10.1.99.6), ISE-PSN-3 (10.1.99.7); TACACS+ Session AUTHC, AUTHZ, and ACCTG requests to, and replies from, 10.1.98.18]
  • 164. Load Balancing TACACS+: General Recommendations
      • Load balance based on TCP/49.
      • Source NAT (SNAT) can be used; there is no CoA as with RADIUS.
      • Recommend LB inline with TACACS traffic; otherwise TCP asymmetry must be addressed.
      • Without SNAT, make sure PSNs set their default gateway to the LB internal interface IP.
      • Persistence: recommend source IP address.
          • Based on the assumption that the number of T+ clients is high and requests per client is low.
      • Health monitoring options:
          • Simple response to TCP/49
          • 3-way handshake expected response
          • Scripts can be used to validate the full auth flow.
            Packet format: http://www.cisco.com/warp/public/459/tac-rfc.1.76.txt
            Packet capture (encrypted): https://www.cloudshark.org/captures/1a9c284c49b0
  • 165. LDAP Server Redundancy and Load Balancing
  • 166. Per-PSN LDAP Servers
      • Assign a unique Primary and Secondary to each PSN.
      • Allows each PSN to use local or regional LDAP servers.
  • 167. Load Balancing LDAP Servers
      • Lookup1 = ldap.company.com; Response = 10.1.95.6; LDAP Query to 10.1.95.6; LDAP Response from 10.1.95.6
      • Lookup2 = ldap.company.com; Response = 10.1.95.7; LDAP Query to 10.1.95.7; LDAP Response from 10.1.95.7
      • 15 minute reconnect timer
      [Diagram: PSN; LDAP servers ldap1.company.com, ldap2.company.com, ldap3.company.com at 10.1.95.5, 10.1.95.6, 10.1.95.7]
  • 168. Vendor-Specific LB Configurations: https://communities.cisco.com/docs/DOC-64434
      • F5 LTM
      • Citrix NetScaler
      • Cisco ACE
      • Cisco ITD (Note)
  • 169. PSN HA Without Load Balancers
  • 170. Load Balancing Web Requests Using DNS: Client-Based Load Balancing/Distribution Based on DNS Response
      • Examples: Cisco Global Site Selector (GSS) / F5 BIG-IP GTM / Microsoft's DNS Round-Robin feature
      • Useful for web services that use static URLs, including LWA, Sponsor, My Devices, OCSP.
      • Sample records on the DNS SOA for company.com:
          sponsor IN A 10.1.99.5
          sponsor IN A 10.1.99.6
          sponsor IN A 10.2.100.7
          sponsor IN A 10.2.100.8
      [Diagram: two clients (10.1.60.105 and 10.2.5.221) each ask "What is IP address for sponsor.company.com?"; one answer is 10.1.99.5 and the other is 10.2.100.8; PSNs at 10.1.99.5, 10.1.99.6, 10.2.100.7, 10.2.100.8]
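Plain DNS round-robin, the simplest of the options named on this slide, can be modeled as rotating the record set after every query so that successive clients see a different first address. The sketch below is a toy model, not a DNS server: `RoundRobinZone` is a hypothetical class, and site-aware products such as GSS or GTM apply additional logic (for example, client location or server health) that is not shown here.

```python
from collections import deque


class RoundRobinZone:
    """Toy authoritative zone that rotates A records per query."""

    def __init__(self, records):
        # records: mapping of name -> list of IPv4 address strings
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)   # snapshot of the current order
        addrs.rotate(-1)       # next query starts with the next address
        return answer


# Usage with the slide's sample records; most clients use the first answer.
zone = RoundRobinZone({
    "sponsor.company.com": ["10.1.99.5", "10.1.99.6", "10.2.100.7", "10.2.100.8"],
})
first_client = zone.resolve("sponsor.company.com")[0]
second_client = zone.resolve("sponsor.company.com")[0]
```

Note that round-robin alone does not detect a failed PSN; a dead address keeps being handed out until the record is removed.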
  • 171. ISE Configuration for Anycast
      On each PSN that will participate in Anycast:
      1. Configure PSN probes to profile DHCP (IP Helper), SNMP Traps, or NetFlow on a dedicated interface.
      2. From the CLI, configure the dedicated interface with the same IP address on each PSN node.
      • ISE-PSN-1 example:
          #ise-psn-1/admin# config t
          #ise-psn-1/admin (config)# int GigabitEthernet1
          #ise-psn-1/admin (config-GigabitEthernet)# ip address 10.10.10.10 255.255.255.0
      • ISE-PSN-2 example:
          #ise-psn-2/admin# config t
          #ise-psn-2/admin (config)# int GigabitEthernet1
          #ise-psn-2/admin (config-GigabitEthernet)# ip address 10.10.10.10 255.255.255.0
      • The Anycast address should only be applied to ISE secondary interfaces, or an LB VIP, but never to the ISE GE0 management interface.
  • 172. Sample Routing Configuration for Anycast
      Real-World Customer Example using Anycast with RADIUS: http://www.networkworld.com/article/3074954/security/how-to-use-anycast-to-provide-high-availability-to-a-radius-server.html
      Both switches advertise same network used for profiling but different metrics
      • Access Switch 1
        interface gigabitEthernet 1/0/23
         no switchport
         ip address 10.10.10.50 255.255.255.0
        !
        router eigrp 100
         no auto-summary
         redistribute connected route-map CONNECTED-2-EIGRP
        !
        route-map CONNECTED-2-EIGRP permit 10
         match ip address prefix-list 5
         set metric 1000 100 255 1 1500
         set metric-type internal
        !
        route-map CONNECTED-2-EIGRP permit 20
        ip prefix-list 5 seq 5 permit 10.10.10.0/24
      • Access Switch 2 (less preferred)
        interface gigabitEthernet 1/0/23
         no switchport
         ip address 10.10.10.51 255.255.255.0
        !
        router eigrp 100
         no auto-summary
         redistribute connected route-map CONNECTED-2-EIGRP
        !
        route-map CONNECTED-2-EIGRP permit 10
         match ip address prefix-list 5
         set metric 500 50 255 1 1500
         set metric-type external
        !
        route-map CONNECTED-2-EIGRP permit 20
        ip prefix-list 5 seq 5 permit 10.10.10.0/24
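To see why Access Switch 2 is the less preferred path, the two `set metric` seed values can be compared with the classic EIGRP composite metric. The sketch below assumes classic (32-bit) metrics with default K values, so only bandwidth (kbps) and delay (tens of microseconds) contribute; the function name `eigrp_classic_metric` is hypothetical. It evaluates the redistributed seed metric only, before any link bandwidth or delay along the path is factored in.

```python
def eigrp_classic_metric(bandwidth_kbps, delay_tens_usec):
    """Classic EIGRP composite metric with default K values (K1 = K3 = 1).

    metric = 256 * (10^7 / bandwidth_kbps + delay), delay in tens of microseconds.
    Reliability, load, and MTU from 'set metric' do not affect the result
    under default K values.
    """
    return 256 * (10_000_000 // bandwidth_kbps + delay_tens_usec)


# Seed values from the two route-maps on the slide
switch1 = eigrp_classic_metric(1000, 100)  # set metric 1000 100 255 1 1500
switch2 = eigrp_classic_metric(500, 50)    # set metric 500 50 255 1 1500

print("Access Switch 1:", switch1)
print("Access Switch 2:", switch2)
print("Preferred:", "Access Switch 1" if switch1 < switch2 else "Access Switch 2")
```

Halving the advertised bandwidth roughly doubles the metric, which outweighs the smaller delay value on Access Switch 2, so routers prefer the anycast prefix via Access Switch 1 while it is reachable.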
  • 173. NAD-Based RADIUS Server Redundancy (IOS)
      Multiple RADIUS Servers Defined in Access Device
      • Configure Access Devices with multiple RADIUS Servers.
      • Fallback to secondary servers if primary fails
      [Diagram] User; RADIUS Auth to PSN1 (10.1.2.3), PSN2 (10.4.5.6), PSN3 (10.7.8.9)
        radius-server host 10.1.2.3 auth-port 1812 acct-port 1813
        radius-server host 10.4.5.6 auth-port 1812 acct-port 1813
        radius-server host 10.7.8.9 auth-port 1812 acct-port 1813
  • 174. IOS-Based RADIUS Server Load Balancing
      Switch Dynamically Distributes Requests to Multiple RADIUS Servers
      • RADIUS LB feature distributes batches of AAA transactions to servers within a group.
      • Each batch assigned to server with least number of outstanding transactions.
      • NAD controls the load distribution of AAA requests to all PSNs in RADIUS group without dedicated LB.
      [Diagram] User 1, User 2; RADIUS to PSN1 (10.1.2.3), PSN2 (10.4.5.6), PSN3 (10.7.8.9)
        radius-server host 10.1.2.3 auth-port 1812 acct-port 1813
        radius-server host 10.4.5.6 auth-port 1812 acct-port 1813
        radius-server host 10.7.8.9 auth-port 1812 acct-port 1813
        radius-server load-balance method least-outstanding batch-size 5
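The batch assignment described on this slide can be illustrated with a small simulation. This is a sketch of the idea only, not the IOS implementation: it assumes no transactions complete while batches are being assigned and that ties go to the first server in the configured list. The function name `distribute` and the labels PSN1 to PSN3 are used for illustration.

```python
def distribute(transactions, servers, batch_size=5):
    """Assign transactions in batches to the server with the fewest outstanding.

    Assumptions: nothing completes during the run, and ties are broken
    by the order of 'servers'.
    """
    outstanding = {server: 0 for server in servers}
    assignment = []
    for start in range(0, transactions, batch_size):
        size = min(batch_size, transactions - start)           # last batch may be short
        target = min(servers, key=lambda s: outstanding[s])    # least-outstanding server
        outstanding[target] += size
        assignment.append((target, size))
    return assignment, outstanding


assignment, outstanding = distribute(17, ["PSN1", "PSN2", "PSN3"], batch_size=5)
for server, size in assignment:
    print(f"batch of {size} -> {server}")
print(outstanding)
```

With 17 transactions and a batch size of 5, each PSN receives one full batch before the remaining two transactions go back to the first server, which shows how the batch size controls the granularity of the distribution.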
  • 175. NAD-Based RADIUS Redundancy (WLC)
      Wireless LAN Controller
      • Multiple RADIUS Auth & Accounting Server Definitions
      • RADIUS Fallback options: off, passive, or active
        • Off = Continue exhaustively through list; never preempt to preferred server (entry with lowest index).
        • Passive = Quarantine failed RADIUS server for interval, then return it to active list w/o validation; always preempt.
        • Active = Mark failed server dead, then actively probe its status per interval w/username (Password = Username) until a probe succeeds before returning it to the list; always preempt.
      http://www.cisco.com/en/US/products/ps6366/products_configuration_example09186a008098987e.shtml
  • 177. NAD Fallback and Recovery Sequence
      [Diagram] Endpoint, Access Switch (Layer 2 Point-to-Point), Policy Service Node (Layer 3 Link)
      Dead Detection:
      • Auth Request, then Retry, Retry, Retry: No response; 15 sec Auth-Timeout per attempt
      • SERVER DEAD: Authorize Critical VLAN 11; traffic permitted on Critical VLAN per port ACL
      Deadtime:
      • Wait Deadtime = 2 minutes, then Deadtime Test request; repeated while there is No response
      Recovery:
      • Deadtime Test reply: SERVER ALIVE; Reinitialize Port / Set Access VLAN 10 (or Authorized VLAN) per Recovery Interval; traffic permitted per RADIUS authorization
      • Idle-Time Test request sent per 60 minute Idle-Time
        radius-server dead-criteria time 15 tries 3
        radius-server deadtime 2
        radius-server host ... test username radtest idle-time 60 key cisco123
        authentication event server dead action reinitialize vlan 11
        authentication event server alive action reinitialize
        authentication critical recovery delay 1000
  • 178. RADIUS Test User Account
      Which User Account Should Be Used?
      • Does NAD uniformly treat Auth Fail and Success the same for detecting server health? IOS treats them the same; F5 RADIUS probe treats Auth Fail = “server down”. Check your LB behavior.
      • Do I use an Internal or External ID store account? If goal is to validate backend ID store, then Auth Fail may not detect external ID store failure.
      • IOS Example: Failover on AD failure. Solution: Drop auth requests when external ID store is down.
        • Identity Source Sequence > Advanced Settings
        • Authentication Policy > ID Source custom processing based on authentication results
      • ACE Example: If auth fails, then PSN declared down. Solution: Create valid user account so ACE test probes return Access-Accept.
        • Could this present a potential security risk?
  • 179. Inaccessible Authentication Bypass (IAB)
      Also Known As “Critical Auth VLAN” for Data
      • Switch detects PSN unavailable by one of two methods
        • Periodic probe
        • Failure to respond to AAA request
      • Enables port in critical VLAN
      • Existing sessions retain authorization status
      • Recovery action can re-initialize port when AAA returns
      Critical Data VLAN can be anything:
      • Same as default access VLAN
      • Same as guest/auth-fail VLAN
      • New VLAN
        authentication event server dead action authorize vlan 100
        authentication event server alive action reinitialize
      Critical Voice VLAN:
        authentication event server dead action authorize voice
      [Diagram] Client, Access Switch, WAN / Internet, PSN; WAN or PSN Down; Access VLAN, Critical VLAN
  • 180. Critical Auth for Data and Voice
        interface GigabitEthernet 3/48
         dot1x pae authenticator
         authentication port-control auto
      Data VLAN Enabled:
         authentication event server dead action authorize vlan x
      Voice VLAN Enabled:
         authentication event server dead action authorize voice
      Verification:
        # show authentication sessions interface gi3/48
        …
        Critical Authorization is in effect for domain(s) DATA and VOICE
  • 181. Default Port ACL Issues with Critical VLAN
      Limited Access Even After Authorization to New VLAN!
      • Data VLAN reassigned to critical auth VLAN, but new (or reinitialized) connections are still restricted by existing port ACL!
      [Diagram] WAN or PSN Down; port Gi1/0/2; Access VLAN, Critical VLAN, Voice VLAN; Default ACL: Only DHCP/DNS/PING/TFTP allowed!
        interface GigabitEthernet1/0/2
         switchport access vlan 10
         switchport voice vlan 13
         ip access-group ACL-DEFAULT in
         authentication event server dead action reinitialize vlan 11
         authentication event server dead action authorize voice
         authentication event server alive action reinitialize
        ip access-list extended ACL-DEFAULT
         permit udp any eq bootpc any eq bootps
         permit udp any any eq domain
         permit icmp any any
         permit udp any any eq tftp
  • 182. Using Embedded Event Manager with Critical VLAN
      Modify or Remove/Add Static Port ACLs Based on PSN Availability
      • Allows scripted actions to occur based on various conditions and triggers
      • EEM available on Catalyst 3k/4k/6k switches
        track 1 ip route 10.1.98.0 255.255.255.0 reachability
        event manager applet default-acl-fallback
         event track 1 state down maxrun 5
         action 1.0 cli command "enable"
         action 1.1 cli command "conf t" pattern "CNTL/Z."
         action 2.0 cli command "ip access-list extended ACL-DEFAULT"
         action 3.0 cli command "1 permit ip any any"
         action 4.0 cli command "end"
        event manager applet default-acl-recovery
         event track 1 state up maxrun 5
         action 1.0 cli command "enable"
         action 1.1 cli command "conf t" pattern "CNTL/Z."
         action 2.0 cli command "ip access-list extended ACL-DEFAULT"
         action 3.0 cli command "no 1 permit ip any any"
         action 4.0 cli command "end"
      https://supportforums.cisco.com/document/117596/cisco-eem-basic-overview-and-sample-configurations
      https://supportforums.cisco.com/document/48891/cisco-eem-best-practices
  • 183. Critical ACL using Service Policy Templates
      Apply ACL, VLAN, or SGT on RADIUS Server Failure!
      • Critical Auth ACL applied on Server Down
      [Diagram] WAN or PSN Down; port Gi1/0/2; Access VLAN, Critical VLAN, Voice VLAN; Default ACL: Only DHCP/DNS/PING/TFTP allowed!
        interface GigabitEthernet1/0/2
         switchport access vlan 10
         switchport voice vlan 13
         ip access-group ACL-DEFAULT in
         access-session port-control auto
         mab
         dot1x pae authenticator
         service-policy type control subscriber ACCESS-POLICY
        ip access-list extended ACL-DEFAULT
         permit udp any eq bootpc any eq bootps
         permit udp any any eq domain
         permit icmp any any
         permit udp any any eq tftp
  • 184. Critical ACL using Service Policy Templates
      Apply ACL, VLAN, or SGT on RADIUS Server Failure!
      • Critical Auth ACL applied on Server Down
      [Diagram] WAN or PSN Down; port Gi1/0/2; Access VLAN, Critical VLAN, Voice VLAN; Default ACL: Only DHCP/DNS/PING/TFTP allowed! Critical ACL: Deny PCI networks; Permit Everything Else!
      2k/3k/4k: 15.2(1)E; 3k IOS-XE: 3.3.0SE; 4k IOS-XE: 3.5.0E; 6k: 15.2(1)SY
        policy-map type control subscriber ACCESS-POLICY
         event authentication-failure match-first
          10 class AAA_SVR_DOWN_UNAUTHD do-until-failure
           10 activate service-template CRITICAL_AUTH_VLAN
           20 activate service-template DEFAULT_CRITICAL_VOICE_TEMPLATE
           30 activate service-template CRITICAL-ACCESS
        !
        service-template CRITICAL-ACCESS
         access-group ACL-CRITICAL
        !
        service-template CRITICAL_AUTH_VLAN
         vlan 10
        service-template DEFAULT_CRITICAL_VOICE_TEMPLATE
         voice vlan
        class-map type control subscriber match-all AAA_SVR_DOWN_UNAUTHD
         match result-type aaa-timeout
         match authorization-status unauthorized
        ip access-list extended ACL-DEFAULT
         permit udp any eq bootpc any eq bootps
         permit udp any any eq domain
         permit icmp any any
         permit udp any any eq tftp
        ip access-list extended ACL-CRITICAL
         remark Deny access to PCI zone scopes
         deny tcp any 172.16.8.0 0.0.15.255
         deny udp any 172.16.8.0 0.0.15.255
         deny ip any 192.168.0.0 0.0.255.255
         permit ip any any
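Cisco IOS extended ACLs match on wildcard (inverse) masks rather than subnet masks, which is an easy detail to get wrong when writing a critical ACL such as ACL-CRITICAL. The following sketch uses Python's standard `ipaddress` module to convert a subnet mask to its wildcard form and to normalize the network address; the helper names `wildcard_mask` and `acl_destination` are hypothetical.

```python
import ipaddress


def wildcard_mask(netmask):
    """Invert a dotted-decimal subnet mask into an IOS wildcard mask."""
    inverted = int(ipaddress.IPv4Address(netmask)) ^ 0xFFFFFFFF
    return str(ipaddress.IPv4Address(inverted))


def acl_destination(network, netmask):
    """Return 'network wildcard' as used in an IOS extended ACL entry.

    strict=False lets the module normalize an address that is not on
    the subnet boundary instead of raising an error.
    """
    net = ipaddress.ip_network(f"{network}/{netmask}", strict=False)
    return f"{net.network_address} {net.hostmask}"


print(wildcard_mask("255.255.240.0"))
print(acl_destination("172.16.8.0", "255.255.240.0"))
print(acl_destination("192.168.0.0", "255.255.0.0"))
```

Note that 172.16.8.0 is not on a /20 boundary, so the helper normalizes it to 172.16.0.0; it is worth confirming the intended PCI address range before deploying an entry like this.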
  • 185. Critical MAB
      Local Authentication During Server Failure
      • Additional level of check to authorize hosts during a critical condition.
      • EEM Scripts could be used for dynamic update of whitelist MAC addresses
      • Sessions re-initialize once the server connectivity resumes.
      [Diagram] Hosts 000c.293c.8dca and 000c.293c.331e; WAN down
        policy-map type control subscriber ACCESS-POL
         ...
         event authentication-failure match-first
          10 class AAA_SVR_DOWN_UNAUTHD_HOST do-until-failure
           10 terminate mab
           20 terminate dot1x
           30 authenticate using mab aaa authc-list mab-local authz-list mab-local
         ...
        username 000c293c8dca password 0 000c293c8dca
        username 000c293c8dca aaa attribute list mab-local
        !
        aaa local authentication default authorization mab-local
        aaa authorization credential-download mab-local local
        !
        aaa attribute list mab-local
         attribute type tunnel-medium-type all-802
         attribute type tunnel-private-group-id "150"
         attribute type tunnel-type vlan
         attribute type inacl "CRITICAL-V4"
        !
  • 187. Home Dashboard - High-Level Server Health
  • 188. Server Health/Utilization Reports
      Operations > Reports > Diagnostics > Health Summary
  • 189. Key Performance Metrics (KPM)
      • KPM Reports added in ISE 2.2: Operations > Reports > Diagnostics > KPM
      • Also available from CLI (# application configure ise) since ISE 1.4
      • Provide RADIUS Load, Latency, and Suppression Stats
  • 190. Serviceability Counter Framework (CF)
      The Easy Way: MnT auto-collects key metrics from each node!
      • Enable/disable from ‘app configure ise’
      • Enabled by default
      • Thresholds are hard set by platform size
      • Alarm sent when a threshold is exceeded
      • Running count displayed per collection interval
      [Screenshot callouts] Detected platform size; Thresholds; Node specific report
  • 191. Key Takeaway Points
      • CHECK ISE Virtual Appliances for proper resources and platform detection!
      • Avoid excessive auth activity through proper NAD / supplicant tuning and Log Suppression
      • Minimize data replication by implementing node groups and profiling best practices
      • Leverage load balancers for scale, high availability, and simplifying network config changes
      • Be sure to have a local fallback plan on your network access devices
  • 192. Cisco Community Page on Sizing and Scalability https://communities.cisco.com/docs/DOC-68347
  • 193. ISE Performance & Scale Resources https://communities.cisco.com/docs/DOC-65625
      • Community Page
      • Cisco Live: BRKSEC-3699 Reference version
      • ISE Load Balancing Design Guide (be sure to read customer notes at bottom of download page for guide errata!)
      • Calculators for Bandwidth and Logging
  • 194. Complete your online session evaluation
      Give us your feedback to be entered into a Daily Survey Drawing. Complete your session surveys through the Cisco Live mobile app or on www.CiscoLive.com/us.
      Don’t forget: Cisco Live sessions will be available for viewing on demand after the event at www.CiscoLive.com/Online.
  • 195. Continue your education
      • Demos in the Cisco campus
      • Walk-in self-paced labs
      • Meet the engineer 1:1 meetings
      • Related sessions