G D P S A N D S Y S T E M C O M P L E X
M A I N F R A M E C L U S T E R I N G
Najmi Mansoor Ahmed
Principal Architect PSS (IBM ALCS v241 under z/OS 2.1)
Presented on 23-Nov-2016
BASICS
• A system is made up of hardware products including a central processor (CPU), and
software products, with the primary one being an operating system such as z/OS.
• The CPU and other system hardware, such as channels and storage (RAM), make up a Central Processor Complex (CPC), or in general terms, a mainframe box.
[Diagram: a mainframe with attached disks]
• It is possible to run a mainframe with a single processor, or uniprocessor (a single CP), but this is not a typical system.
• When all the CPs share central storage and a single OS image manages the processing,
work is assigned to a CP that is available to do the work. If a CP fails, work can be routed
to another CP.
[Diagram: a mainframe running z/OS on a single CP (CP1), with attached disks]
Multiprocessor and Uniprocessor
 The ability to partition a large system into multiple smaller systems, called logical partitions or LPARs, is now a core requirement in practically all mainframe installations. It allows you to build virtual clusters of CPUs, operating systems, and applications within a single box.
Clustering
• The mainframe has three major clustering techniques:
• Basic shared DASD
• CTC rings
• Parallel Sysplex
[Diagram: two LPARs, each running z/OS, connected through their channels to a shared disk control unit]
A channel is a high-speed data bus. Today's mainframes use FICON (FIber CONnection) channels.
Basic Shared Storage (DASD)
Basic shared DASD is typically used when operations staff control which jobs go to which system and ensure that there is no conflict (both systems trying to update the same data at the same time), as sketched below.
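The update conflict the operations staff must avoid can be illustrated with a coarse per-device lock, loosely analogous to the hardware RESERVE/RELEASE serialization used with shared DASD. A minimal Python sketch (all names invented; this is not actual z/OS behaviour):

```python
import threading

class SharedDasd:
    """Toy model of a shared DASD volume: one reserve lock per device."""
    def __init__(self):
        self._reserve = threading.Lock()  # stands in for hardware RESERVE/RELEASE
        self.dataset = {}

    def update(self, system, key, delta):
        with self._reserve:               # RESERVE: the whole device is held
            # While reserved, the other system cannot touch the volume,
            # so this read-modify-write cannot interleave and lose an update.
            current = self.dataset.get(key, 0)
            self.dataset[key] = current + delta
        # lock released: RELEASE

dasd = SharedDasd()
threads = [threading.Thread(target=dasd.update, args=(f"SYS{i}", "REC1", 1))
           for i in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(dasd.dataset["REC1"])  # always 2: no lost update
```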
Clustering
• Channel-to-Channel (CTC) ring
[Diagram: two LPARs, each running z/OS with its own channels and disk control unit, connected to each other via a CTC ring]
Channel-to-Channel (CTC)
A CTC connection simulates an I/O device that can be used by one system to communicate with another, and it provides the data path and synchronization for the data transfer. When a CTC is used to connect two channels, a loosely coupled multiprocessing system is established.
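To make the idea concrete, here is a rough user-space analogue in Python: a socket pair stands in for the CTC link, giving two "systems" a synchronized point-to-point data path. The message format is invented; a real CTC is a channel-level device, not a socket:

```python
import socket
import threading

# socketpair stands in for the CTC link: a synchronized,
# bidirectional point-to-point data path between two "systems".
sys_a, sys_b = socket.socketpair()

def system_b():
    msg = sys_b.recv(1024)            # read side of the simulated CTC
    sys_b.sendall(b"ACK:" + msg)      # reply over the same link

t = threading.Thread(target=system_b)
t.start()
sys_a.sendall(b"JOB7 COMPLETE")       # write side: looks like ordinary device I/O
print(sys_a.recv(1024).decode())      # -> ACK:JOB7 COMPLETE
t.join()
```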
• A loosely coupled configuration has more than one mainframe, managed by more than one z/OS image.
• Although a loosely coupled configuration increases system capacity, it is not as easy to
manage as either a uniprocessor or a tightly coupled multiprocessor.
• Each system must be managed separately, often by a human operator, who monitors
product-specific messages on a set of consoles for each system.
• Products and applications that need to communicate and are running on separate
systems have to create their own communication mechanism.
Loosely coupled multiprocessors
[Diagram: two z/OS images, each managing four CPs (CP1-CP4)]
• To help solve the difficulties of managing many z/OS systems, IBM introduced the z/OS systems complex, or sysplex.
• A sysplex is a collection of z/OS systems that cooperate, using certain hardware and software products, to
process work.
Sysplex
[Diagram: four mainframes (Mainframe 1-4) sharing disks]
• SYSPLEX (System Complex) is the clustering of multiple systems for availability, workload sharing, recovery, and resource and data sharing.
• A system complex can be built in the same data centre or across two different data centres, with from 2 to 32 mainframes.
Sysplex
[Diagram: Mainframe 1 through Mainframe 4 clustered in a sysplex]
SYSPLEX
A sysplex is a clustering technique:
• Every server (node) / LPAR has access to the data resources
• Every cloned, sysplex-enabled application can run on every LPAR
• The sysplex appears as a single large system with a single operating interface to control it
Types of Sysplex
Base Sysplex
• Joining systems through Channel-to-Channel (CTC) connections
[Diagram: two LPARs, each running z/OS with its own channels and disk control unit, connected via a CTC ring]
PARALLEL SYSPLEX
A Parallel Sysplex is a cluster of IBM mainframes acting together
as a single system image with z/OS.
Used for disaster recovery, Parallel Sysplex combines data
sharing and parallel computing to allow a cluster of up to 32
systems to share a workload for high performance and high
availability.
PARALLEL SYSPLEX
Parallel Sysplex = Base Sysplex + Coupling Facility (CF)
[Diagram: Mainframe 1 at Site A and Mainframe 2 at Site B forming a system complex (sysplex)]
Types of Sysplex
Parallel Sysplex
• An enhancement to Base Sysplex: systems are joined through a Coupling Facility (CF)
[Diagram: two LPARs, each running z/OS with its own channels and disk control unit, connected via a CTC ring and via CF channels to a Coupling Facility]
Types of Sysplex
[Diagram: as above, with disks attached to each disk control unit]
 Parallel Sysplex technology is an enabling technology with two critical capabilities:
 Parallel processing
 Read/write data sharing across multiple systems with full data integrity: "shared data" (as opposed to "shared nothing")
Parallel Sysplex
A key component in any Parallel Sysplex is the Coupling Facility (CF) infrastructure. The coupling facility enables the sharing of central memory between systems.
Parallel Sysplex is analogous in concept to a UNIX cluster: it allows the customer to operate multiple copies of the operating system as a single system. This allows systems to be added or removed as needed, while applications continue to run.
[Diagram: primary and secondary system complexes (clustering), each hosting production, standby, and non-production systems]
Coupling Facility (CF) structure
• A coupling facility is a special logical partition that runs the coupling facility control code (CFCC) and provides high-speed caching, list processing, and locking functions in a sysplex.
• A CF functions largely as a fast scratch pad. It is used for three purposes:
 Locking information that is shared among all attached systems
 Cache information (such as for a database) that is shared among all attached systems
 Data list information that is shared among all attached systems
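The three structure types can be pictured as a shared in-memory scratch pad. The following Python sketch models that idea with invented class and method names; the real CF is driven by CFCC and accessed through z/OS services, not an API like this:

```python
class CouplingFacility:
    """Toy scratch pad mirroring the CF's three structure types."""
    def __init__(self):
        self.locks = {}    # lock structures: resource -> owning system
        self.cache = {}    # cache structures: e.g. shared database buffers
        self.lists = {}    # list structures: shared queues and lists

    def acquire_lock(self, resource, system):
        owner = self.locks.setdefault(resource, system)
        return owner == system          # True if this system now holds the lock

    def release_lock(self, resource, system):
        if self.locks.get(resource) == system:
            del self.locks[resource]

    def cache_put(self, key, value):
        self.cache[key] = value         # visible to all attached systems

    def list_push(self, name, item):
        self.lists.setdefault(name, []).append(item)

cf = CouplingFacility()
print(cf.acquire_lock("DB2.TBL1", "SYSA"))  # True
print(cf.acquire_lock("DB2.TBL1", "SYSB"))  # False: SYSA already holds it
```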
Characteristics of Parallel Sysplex
 A common time source to synchronize all mainframe systems' clocks
 Coupling Facility (CF): sharing of central memory between systems for high-performance data sharing
 Cross System Coupling Facility (XCF): allows systems to communicate peer-to-peer
 Global Resource Serialization (GRS): allows multiple systems to access the same resources concurrently, serializing where necessary to ensure exclusive access and prevent conflicting updates to the same data (see the sketch after the diagram below)
 Couple Data Sets (CDS): required by the sysplex to store information about its systems
[Diagram: two z/OS LPARs connected by FICON, each running XCF and GRS, sharing a Sysplex Timer, a Coupling Facility, and Couple Data Sets (CDS)]
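The GRS semantics sketched below are those of a classic shared/exclusive lock: many concurrent readers, or exactly one writer. This toy Python model uses invented names (enq/deq mimic the spirit of the ENQ/DEQ macros, not their real interface):

```python
import threading

class GlobalResourceSerializer:
    """Toy ENQ/DEQ: many shared holders OR one exclusive holder."""
    def __init__(self):
        self._cond = threading.Condition()
        self._shared = 0
        self._exclusive = False

    def enq(self, exclusive=False):
        with self._cond:
            if exclusive:
                while self._shared or self._exclusive:
                    self._cond.wait()       # wait until no one holds the resource
                self._exclusive = True
            else:
                while self._exclusive:
                    self._cond.wait()       # readers only wait for a writer
                self._shared += 1

    def deq(self, exclusive=False):
        with self._cond:
            if exclusive:
                self._exclusive = False
            else:
                self._shared -= 1
            self._cond.notify_all()

grs = GlobalResourceSerializer()
grs.enq()                    # SYSA reads
grs.enq()                    # SYSB reads concurrently: allowed
grs.deq(); grs.deq()
grs.enq(exclusive=True)      # SYSA updates: now the sole holder
grs.deq(exclusive=True)
```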
Characteristics of Parallel Sysplex
• The best practice for any data-sharing Parallel Sysplex is to implement at least one failure-isolated CF.
• It is critical that every Parallel Sysplex has at least two CFs connected to every member of the sysplex.
[Diagram: the same configuration with two Coupling Facilities (CF) for redundancy]
Characteristics of Parallel Sysplex
 A timer is a mandatory hardware requirement for a parallel sysplex consisting of more than one zSeries server.
 It provides synchronization for the time-of-day (TOD) clocks of multiple servers, thereby allowing events started by different servers to be properly sequenced in time.
 When multiple servers update the same database, all updates must be time-stamped in the proper sequence (illustrated in the sketch below).
 The Server Time Protocol feature is designed to provide the capability for multiple servers and Coupling Facilities to maintain time synchronization with each other.
 Redundancy of timers allows the sysplex to stay up if either timer has a planned or unplanned outage.
[Diagram: two z/OS LPARs connected by FICON, each running XCF and GRS, with redundant CFs, Couple Data Sets, and an STP timer]
Server Time Protocol (STP) is a time synchronization architecture designed to provide the capability for multiple servers (CPCs) to maintain time synchronization with each other and to form a Coordinated Timing Network (CTN). To maintain time accuracy, the STP facility supports connectivity to an External Time Source (ETS). STP is implemented in IBM's Licensed Internal Code (LIC).
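The following small Python illustration shows why synchronized TOD clocks matter: recovery must be able to sort updates from different servers by timestamp. The log records and values are invented:

```python
from datetime import datetime, timezone

# Log records from two servers updating the same database record.
# With STP keeping the TOD clocks in sync, sorting by timestamp
# recovers the true update order; with drifting clocks it would not.
log = [
    ("SYSB", datetime(2016, 11, 23, 10, 0, 0, 500, tzinfo=timezone.utc), "bal=80"),
    ("SYSA", datetime(2016, 11, 23, 10, 0, 0, 200, tzinfo=timezone.utc), "bal=100"),
]
for system, ts, update in sorted(log, key=lambda rec: rec[1]):
    print(ts.time(), system, update)   # SYSA's update replays first
```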
Parallel Sysplex - Benefits
Continuous Availability
 With a Parallel Sysplex cluster it is possible to construct a parallel processing environment with no single points of failure.
 Because of the redundancy in the configuration, there is a significant reduction in the number of single points of failure.
 Hardware and software maintenance and installations can be performed in a non-disruptive manner. Through data sharing and dynamic workload management, servers can be dynamically removed from or added to the cluster, allowing installation and maintenance activities to be performed while the remaining systems continue to work.
Capacity
 A Parallel Sysplex environment can scale nearly linearly from 2 to 32 systems.
Dynamic Workload Balancing
 The entire Parallel Sysplex cluster can be viewed as a single logical resource to end users and business applications.
GDPS
Geographically Dispersed Parallel Sysplex (GDPS) is an extension of Parallel Sysplex, with mainframes located, potentially, in different cities and/or data centres.
GDPS is an end-to-end application availability solution.
Geographically Dispersed Parallel Sysplex - GDPS
• It is the ultimate disaster recovery and continuous availability solution for a multi-site enterprise.
• GDPS is a combination of storage replication and Parallel Sysplex technology.
• It automates Parallel Sysplex operational tasks and performs failure recovery from a single point of control.
• Types of GDPS configurations:
• GDPS/PPRC is based on synchronous data mirroring technology (PPRC) and can be used with mainframes up to 200 kilometres (120 mi) apart.
• GDPS/XRC is based on asynchronous Extended Remote Copy (XRC) technology, with no restrictions on distance.
• GDPS/Global Mirror is based on asynchronous IBM Global Mirror technology, with no restrictions on distance.
• GDPS/Active-Active is a disaster recovery / continuous availability solution based on two or more sites, separated by unlimited distances, running the same applications and having the same data, to provide cross-site workload balancing.
GDPS ACTIVE/ACTIVE
To achieve a GDPS Active/Active configuration:
• All critical data must be PPRCed and HyperSwap-enabled
• All critical CF structures must be duplexed
• Applications must be Parallel Sysplex enabled
GDPS/PPRC
GDPS/PPRC is a metro-area Continuous Availability (CA) and Disaster Recovery (DR) solution, based upon:
 A multi-site Parallel Sysplex
 Synchronous disk replication
It supports two configurations:
 Active/Standby, or single-site workload
 Active/Active, or multi-site workload
[Diagram: GDPS/PPRC with active disks at Site 1 mirrored via PPRC / Metro Mirror to warm disks at Site 2]
• Even with the multi-path and RAID architecture within DASD subsystems, the single copy of the data continues to be a single point of failure (SPOF).
• A failure of a disk subsystem, or even a single disk array, can take down major applications, the system, or even the sysplex.
• GDPS/PPRC is IBM's disk replication technology for removing this SPOF.
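Synchronous mirroring (PPRC / Metro Mirror) means a write is not acknowledged to the application until the secondary copy is hardened, which is what bounds the supported distance. A toy Python model of that write path (invented names; real PPRC runs inside the disk subsystems):

```python
class Volume:
    """Toy disk volume."""
    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data
        return True                      # device acknowledges the write

class PprcPair:
    """Synchronous mirror: the primary write completes only after the
    secondary has hardened its copy (remote latency bounds distance)."""
    def __init__(self, primary, secondary):
        self.primary, self.secondary = primary, secondary

    def write(self, block, data):
        ok_primary = self.primary.write(block, data)
        ok_secondary = self.secondary.write(block, data)   # remote I/O
        if not (ok_primary and ok_secondary):
            raise IOError("mirror out of sync: suspend the pair")
        return True                      # only now acknowledge the application

pair = PprcPair(Volume(), Volume())
pair.write("CYL0001", b"payroll")
assert pair.primary.blocks == pair.secondary.blocks   # zero data loss (RPO 0)
```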
GDPS Hyperswap
A Parallel Sysplex environment is designed to reduce outages by replicating hardware, operating systems, and application components. In spite of this redundancy, having only one copy of the data is an exposure.
If there is a problem writing to or accessing the primary disks, there is a need to swap I/O from the primary disks to the secondary disks.
HyperSwap, a feature of GDPS, enhances resilience by immediately switching I/O operations from the primary to the secondary disks, thereby providing near-continuous access to data.
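The essence of HyperSwap is a transparent re-pointing of I/O from the primary to the secondary devices when the primary fails. A minimal Python sketch of that control flow, with invented names and a deliberately failing volume:

```python
class Volume:
    """Toy disk volume that can be forced to fail."""
    def __init__(self, fail=False):
        self.blocks, self.fail = {}, fail

    def write(self, block, data):
        if self.fail:
            raise IOError("disk subsystem failure")
        self.blocks[block] = data

class HyperSwapRouter:
    """Routes application I/O to whichever mirror copy is primary."""
    def __init__(self, primary, secondary):
        self.primary, self.secondary = primary, secondary

    def write(self, block, data):
        try:
            self.primary.write(block, data)
        except IOError:
            # Swap: re-point I/O at the synchronized secondary copy.
            # The failure is masked; the application never sees it.
            self.primary, self.secondary = self.secondary, self.primary
            self.primary.write(block, data)

router = HyperSwapRouter(Volume(fail=True), Volume())
router.write("CYL0001", b"booking")       # transparently served by Site 2
print(router.primary.blocks)              # {'CYL0001': b'booking'}
```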
[Diagram: primary disks mirrored to secondary disks via PPRC / Metro Mirror]
GDPS Hyperswap
[Diagram: GDPS controlling systems at the primary and secondary sites]
HyperSwap provides continuous availability of data by masking disk outages; it automates switching between the two copies of the data in real time, without causing an application outage.
[Diagram: on a failure, HyperSwap redirects I/O to the secondary copy; controlling systems K1 and K2 share the CDS]
GDPS Controlling System
• In order for GDPS to operate, there must be a separate, isolated z/OS system known as the Controlling system.
• GDPS environments without a Controlling system are not supported.
• IBM strongly recommends that 2 Controlling systems are set up per sysplex.
• The idea is for one to act as a Primary and the other as a Backup.
[Diagram: GDPS controlling systems K1 (primary site) and K2 (secondary site), with PPRC / Metro Mirror between the sites]
What does the GDPS controlling system do?
1. It performs situation analysis (after an unplanned event) to determine the status of the production systems and/or disks.
2. It drives automated recovery actions.
The controlling system must be in the same sysplex so that it can see all the messages from the systems in the sysplex and communicate with them.
[Diagram: controlling systems K1 and K2 as z/OS LPARs at the primary and secondary sites, each with XCF, GRS, and a CF, sharing Couple Data Sets over PPRC / Metro Mirror]
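The controlling system's two responsibilities map naturally onto a monitor-and-react loop: classify the failure, then drive a scripted recovery action. A simplified Python sketch with invented states and actions (real GDPS works from system messages and automation scripts):

```python
def analyze(status):
    """Situation analysis: classify the unplanned event."""
    if not status["primary_disks_ok"] and status["production_ok"]:
        return "DISK_FAILURE"
    if not status["production_ok"]:
        return "SITE_FAILURE"
    return "HEALTHY"

def recover(situation):
    """Drive automated recovery actions from a single point of control."""
    actions = {
        "DISK_FAILURE": ["hyperswap to secondary disks"],
        "SITE_FAILURE": ["hyperswap to secondary disks",
                         "restart production systems at Site 2"],
        "HEALTHY": [],
    }
    for action in actions[situation]:
        print("K-system drives:", action)

# Controlling system observing sysplex status (values invented):
recover(analyze({"primary_disks_ok": False, "production_ok": True}))
# -> K-system drives: hyperswap to secondary disks
```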
[Diagram: the same configuration with ALCS running alongside K1 and K2 at each site]
Why does a GDPS configuration need a controlling system?
The availability of the controlling system is fundamental to GDPS.
The GDPS controlling system is designed to survive a failure at the opposite site from the primary disks. Primary disks are normally in Site 1, and the controlling system in Site 2 is designed to survive if Site 1 or the disks in Site 1 fail.
Final view - Combining the jigsaw puzzle
[Diagram: the final GDPS-managed configuration: primary and secondary sites 40 km apart, each with ALCS, XCF, GRS, a CF, and a controlling system (K1/K2); CF links (timer links), PPRC links, and ISL channels run over ADVA/DWDM; TIBCO applications at Site 1 and Site 2 connect via the ESW network]
Conclusion
Mainframe physical clustering (system complex / sysplex) between dispersed data centres (GDPS) provides enterprise-level disaster recovery, data sharing, and parallel computing capability to share workload for high performance and high availability.
G D P S A N D S Y S T E M C O M P L E X
N A J M I M A N S O O R A H M E D