Distributed and Cloud Computing – Unit 1: 22MCN1B2T
RV College of Engineering
Go, change the world
By Prof. Srividya M S, Department of CSE, RVCE
05/04/25 1
Course Outcomes for Distributed and Cloud Computing
CO1: Apply the distributed and cloud computing concepts to solve problems in the computing domain.
CO2: Analyse various architectures, workflow models, and algorithms used to implement cloud and distributed systems.
CO3: Design solutions using modern tools to solve applicable problems in cloud and distributed systems.
CO4: Demonstrate effective communication, report writing, and usage of modern tools for implementing cloud and distributed systems applications.
UNIT-1:
Distributed System Models & Enabling Technologies:
Scalable computing over the Internet,
Technologies for network-based systems,
System models for distributed and cloud computing,
Software environments for distributed systems and clouds,
Performance, security, and energy efficiency
SCALABLE COMPUTING OVER THE INTERNET
• Over the past 60 years, computing technology has undergone a series of platform
and environment changes.
• Evolutionary changes in machine architecture, operating system platform,
network connectivity, and application workload.
• Parallel and distributed computing systems use multiple computers to solve large-
scale problems
• Distributed computing has become data-intensive and network-centric
• High-performance computing (HPC) and high-throughput computing (HTC)
systems built with parallel and distributed computing technologies are the need of
the hour
The Platform Evolution
• Computer technology has gone through five generations of development, with
each generation lasting from 10 to 20 years. Successive generations are
overlapped in about 10 years.
• 1950 to 1970, a handful of mainframes, including the IBM 360 and CDC 6400.
• 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX
Series became popular
• 1970 to 1990, we saw widespread use of personal computers built with VLSI
microprocessors
• 1980 to 2000, massive numbers of portable computers and pervasive devices
appeared in both wired and wireless applications
• Since 1990, the use of both HPC and HTC systems hidden in clusters, grids, or
Internet clouds has proliferated
HPC and HTC systems
• On the HPC side,
• Supercomputers (massively parallel processors or MPPs)
• The cluster is often a collection of homogeneous compute nodes that are physically connected in close range
to one another.
• On the HTC side,
• Peer-to-peer (P2P) networks are formed for distributed file sharing and content delivery applications.
• P2P, cloud computing, and web service platforms are more focused on HTC applications
• There is a strategic change from an HPC paradigm to an HTC paradigm.
• The performance goal thus shifts to measure high throughput or the number of tasks completed per unit of time.
• HTC technology needs to improve not only in terms of batch processing speed, but also in cost savings,
energy savings, security, and reliability at many data and enterprise computing centers.
HPC versus HTC
Stands for:
HPC: High-Performance Computing. HTC: High-Throughput Computing.
Definition:
HPC is a type of computing that uses multiple processors to perform complex computations in parallel. HTC is a type of computing that executes a large number of simple, computationally independent tasks in parallel.
Workload:
HPC runs large-scale, complex, computationally intensive applications that need significant resources and memory. HTC runs a large number of small, independent tasks that do not require large amounts of memory or resources.
Processing power:
HPC is designed to provide maximum performance and speed for large tasks. HTC is designed to maximize the number of tasks completed in a given amount of time.
Resource management:
HPC uses job schedulers and resource managers. HTC uses distributed resource management.
Fault tolerance:
HPC systems have complex fault-tolerance mechanisms to reduce the risk of data loss and corruption. In HTC systems, the failure of an individual task does not affect other running tasks.
Scaling:
HPC systems typically scale up (vertically) for large jobs. HTC systems scale horizontally across many simple tasks and require less per-task computational speed.
Applications:
HPC: engineering design, weather forecasting, drug discovery, etc. HTC: bioinformatics, research applications, etc.
Three New Computing Paradigms
• The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor
technologies has triggered the development of the Internet of Things (IoT).
• With the introduction of SOA, Web 2.0 services become available.
• Advances in virtualization make it possible to see the growth of Internet clouds
Computing Paradigm Distinctions
• In general, distributed computing is the opposite of centralized computing.
• The field of parallel computing overlaps with distributed computing to a great extent,
and cloud computing overlaps with distributed, centralized, and parallel computing.
 Centralized computing :
1. This is a computing paradigm by which all computer resources are centralized in one physical system.
2. All resources (processors, memory, and storage) are fully shared and tightly coupled within one integrated
OS.
3. Many data centers and supercomputers are centralized systems, but they are used in parallel, distributed,
and cloud computing applications
 Parallel computing :
1. In parallel computing, all processors are either tightly coupled with centralized shared memory or loosely
coupled with distributed memory.
2. Interprocessor communication is accomplished through shared memory or via message passing
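The two coupling styles above can be sketched in plain Python. This is only an analogy: threads in one process stand in for tightly coupled processors sharing memory, and a queue stands in for loosely coupled processors exchanging messages; real systems use hardware shared memory or a network, respectively.

```python
import threading
import queue

# 1) Shared memory: workers coordinate through one shared counter,
#    protected by a lock in the common address space.
counter = 0
lock = threading.Lock()

def shared_memory_worker():
    global counter
    for _ in range(1000):
        with lock:            # coordination via shared state
            counter += 1

threads = [threading.Thread(target=shared_memory_worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
assert counter == 4000

# 2) Message passing: workers share nothing and communicate only by
#    sending their partial results as messages.
msgs = queue.Queue()

def message_passing_worker(n):
    msgs.put(n * 1000)        # no shared state; send a message instead

threads = [threading.Thread(target=message_passing_worker, args=(i,)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
total = sum(msgs.get() for _ in range(4))
assert total == 6000          # 0 + 1000 + 2000 + 3000
```

The lock in the first half is exactly the kind of synchronization shared-memory machines must provide; the queue in the second half plays the role of the interconnection network in a message-passing system.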
Computing Paradigm Distinctions
 Distributed computing :
1. A distributed system consists of multiple autonomous computers, each having its own private memory,
communicating through a computer network.
2. Information exchange in a distributed system is accomplished through message passing.
 Cloud computing :
1. An Internet cloud of resources can be either a centralized or a distributed computing system.
2. The cloud applies parallel or distributed computing, or both.
3. Clouds can be built with physical or virtualized resources over large data centers that are centralized or
distributed
Scalable Computing Trends and New Paradigms
Includes,
Degrees of Parallelism
Innovative Applications
The Trend toward Utility Computing
The Hype Cycle of New Technologies
Degrees of Parallelism
Fifty years ago, when hardware was bulky and expensive, most computers were designed in a bit-serial fashion.
Bit-level parallelism (BLP) gradually converted bit-serial processing into word-level processing, and was
followed by instruction-level parallelism (ILP). Data-level parallelism (DLP) was made popular through SIMD
(single instruction, multiple data) and vector machines using vector or array types of instructions. DLP requires
even more hardware support and compiler assistance to work properly.
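The DLP idea can be illustrated in pure Python (illustrative only: real DLP relies on SIMD or vector hardware instructions, not Python loops):

```python
# Data-level parallelism applies one operation across many data
# elements at once. Contrast an element-at-a-time loop with a
# whole-vector expression that a SIMD/vector machine could execute
# as a single instruction stream over all elements.

def add_serial(a, b):
    # Serial style: process one element per step.
    out = []
    for x, y in zip(a, b):
        out.append(x + y)
    return out

def add_data_parallel(a, b):
    # DLP style: express the operation over the whole vector at once.
    return [x + y for x, y in zip(a, b)]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
assert add_serial(a, b) == add_data_parallel(a, b) == [11, 22, 33, 44]
```

Both produce the same result; the point is that the data-parallel form exposes the whole-vector operation to hardware and compilers, which is what SIMD and vector instructions exploit.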
Innovative Applications
Both HPC and HTC systems desire transparency in many application aspects. For example, data access,
resource allocation, process location, concurrency in execution, job replication, and failure recovery should be
made transparent to both users and system management
The Trend toward Utility Computing
Utility computing focuses on a business model in which customers receive
computing resources from a paid service provider. All grid/cloud platforms are
regarded as utility service providers
The Trend toward Utility Computing
Major computing paradigms have been identified to facilitate the study of distributed systems
and their applications. These paradigms share some common characteristics.
1. First, they are all ubiquitous in daily life. Reliability and scalability are two major
design objectives in these computing models.
2. Second, they are aimed at autonomic operations that can be self-organized to
support dynamic discovery
The Hype Cycle of New Technologies
The Internet of Things and Cyber-Physical Systems
•The IoT refers to the networked interconnection of everyday objects, tools, devices, or
computers
•With the IPv6 protocol, 2^128 IP addresses are available to distinguish all the objects on Earth,
including all computers and pervasive devices
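The size of that address space is easy to check with a little arithmetic:

```python
# IPv6 uses 128-bit addresses, so the address space holds 2**128
# distinct values, vastly more than the number of objects on Earth.
ipv6_addresses = 2 ** 128
print(ipv6_addresses)           # 340282366920938463463374607431768211456
print(f"{ipv6_addresses:.2e}")  # about 3.40e+38
```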
•The IoT researchers have estimated that every human being will be surrounded by 1,000 to
5,000 objects
•In the IoT era, all objects and devices are instrumented, interconnected, and interacting with
each other intelligently
•Three communication patterns co-exist: namely H2H (human-to-human), H2T (human-to-
thing), and T2T (thing-to-thing)
•A cyber-physical system (CPS) is the result of interaction between computational processes
and the physical world
•A CPS merges the “3C” technologies of computation, communication, and control into an
intelligent closed feedback system between the physical world and the information world
•CPS emphasizes exploration of virtual reality (VR)
TECHNOLOGIES FOR NETWORK-BASED SYSTEMS
Multicore CPUs and Multithreading Technologies
Advances in CPU Processors
Multicore CPU and Many-Core GPU Architectures
Multithreading Technology
Memory, Storage, and Wide-Area Networking
•Memory chips have
experienced a 4x increase in
capacity every three years
•Memory access time did not
improve much in the past
•Faster processor speed and
larger memory capacity result in
a wider gap between processors
and memory
•Disk storage capacity has grown by 7 orders
of magnitude in 33 years
System-Area Interconnects
• LAN typically is used to connect client
hosts to big servers.
• A storage area network (SAN) connects
servers to network storage such as disk
arrays.
• Network attached storage (NAS) connects
client hosts directly to the disk arrays.
Virtual Machines and Virtualization Middleware
• A virtual machine is an application environment installed on software that
imitates dedicated hardware
• A VM is a virtual environment that functions as a virtual computer system with its
own CPU, memory, network interface, and storage, created on a physical hardware
system (on- or off-premises)
• The end user has the same experience on a virtual machine as they would have on
dedicated hardware
Need for VM
• A conventional computer has a single OS image. This offers an inflexible architecture that
tightly couples application software to a specific hardware platform
• Some software running well on one machine may not be executable on another platform with
a different instruction set under fixed OS
• Virtual machines (VMs) offer novel solutions to underutilized resources, application
inflexibility, software manageability, and security concerns in existing physical machines
• To build large clusters, grids, and clouds, we need to access large amounts of computing,
storage, and networking resources in a virtualized manner
• We need to aggregate those resources, and offer a single system image (SSI)
• A cloud of provisioned resources must rely on virtualization of processors, memory, and I/O
facilities dynamically
Virtual Machines and Virtualization Middleware
Physical Machine
a. The host machine is equipped with the physical hardware.
An example is an x86 architecture desktop running its installed Windows OS
Native VM
• The VM can be provisioned for any hardware system.
• The VM is built with virtual resources managed by a guest OS to run
a specific application.
• Between the VMs and the host platform, one needs to deploy a
middleware layer called a virtual machine monitor (VMM) or
hypervisor, which handles the bare-metal hardware (CPU, memory, and I/O)
directly.
• The guest OS could be a Linux system and the hypervisor is the XEN
system
• This hypervisor approach is also called bare-metal VM
Hosted VM
• The VMM runs in nonprivileged mode
• The host OS need not be modified
Dual Mode VM
• Part of the VMM runs at the user level and another part runs at the
supervisor level.
• In this case, the host OS may have to be modified to some extent.
• Multiple VMs can be ported to a given hardware system to support
the virtualization process.
• The VM approach offers hardware independence of the OS and
applications.
• The user application running on its dedicated OS could be bundled
together as a virtual appliance that can be ported to any hardware
platform.
• The VM could run on an OS different from that of the host computer
VM Primitive Operations
1. The VMM provides the VM abstraction to the guest OS
2. With full virtualization, the VMM exports a VM abstraction identical to the physical machine
3. So that a standard OS such as Windows 2000 or Linux can run just as it would on the physical
hardware
4. Basic VM operations: VM multiplexing, suspension, provision, and migration in a distributed
computing environment.
VM Primitive Operations
a. First, the VMs can be multiplexed between hardware machines.
b. Second, a VM can be suspended and stored in stable storage
c. Third, a suspended VM can be resumed or provisioned to a new hardware platform
d. Finally, a VM can be migrated from one hardware platform to another
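The four primitive operations above can be sketched as a tiny state machine. All names and states here are illustrative, not any real hypervisor's API:

```python
# Sketch of the VM primitive operations: multiplex/provision onto a
# host, suspend to stable storage, resume on a (possibly different)
# platform, and migrate between hosts while running.

class VM:
    def __init__(self, name):
        self.name = name
        self.state = "stopped"
        self.host = None

    def provision(self, host):
        # Resume or provision the VM on a (new) hardware platform.
        self.host = host
        self.state = "running"

    def suspend(self):
        # Save the VM to stable storage; it no longer occupies a host.
        self.state = "suspended"
        self.host = None

    def migrate(self, new_host):
        # Move the running VM to another hardware platform.
        assert self.state == "running"
        self.host = new_host

vm = VM("vm1")
vm.provision("host-A")   # multiplexed onto host-A
vm.migrate("host-B")     # moved to host-B without stopping
vm.suspend()             # stored in stable storage
vm.provision("host-C")   # resumed on a different platform
assert vm.state == "running" and vm.host == "host-C"
```

The point of the sketch is that a suspended VM is just data, so it can be provisioned onto any compatible hardware platform, which is what enables the flexibility described next.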
VM Primitive Operations
1. These VM operations enable a VM to be provisioned to any available hardware platform
2. They also enable flexibility in porting distributed application executions
3. VMware claimed that server utilization could be increased from its current 5–15 percent to 60–
80 percent
4. Multiple server functions can be consolidated on the same hardware platform to achieve higher
system efficiency.
VM Migration
• The process of moving a running VM or application between
different physical machines without disconnecting the client
or application
• Memory, storage, and network connectivity of the VM are
transferred from the original host machine to the destination
machine
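One common way to do this without a long pause is pre-copy migration (the slide does not name a specific technique, so this is one illustrative approach): copy all memory pages while the VM keeps running, then repeatedly re-copy only the pages dirtied in the meantime, and finally stop the VM briefly to copy the last small dirty set. A toy simulation with made-up numbers:

```python
# Toy simulation of pre-copy live migration. Each round copies the
# outstanding pages; meanwhile the running VM dirties some of them,
# which must be re-copied. When the dirty set is small enough, a
# short stop-and-copy finishes the migration.

def precopy_migrate(pages, dirty_per_round, max_rounds=10, stop_threshold=5):
    """Return (rounds, final_dirty) for a simulated migration.

    pages: total memory pages to move.
    dirty_per_round(n): pages dirtied while n pages are being copied.
    """
    to_copy = pages
    for rounds in range(1, max_rounds + 1):
        dirtied = dirty_per_round(to_copy)   # VM keeps running, dirtying pages
        to_copy = dirtied
        if to_copy <= stop_threshold:        # small enough: stop-and-copy
            return rounds, to_copy
    return max_rounds, to_copy

# Suppose each copy round dirties 10% of the pages being copied:
rounds, final_dirty = precopy_migrate(10_000, lambda n: n // 10)
print(rounds, final_dirty)  # 4 1: four rounds, one page left for stop-and-copy
```

The geometric shrinking of the dirty set is why clients stay connected: only the tiny final copy requires the VM to pause.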
Data Center Virtualization for Cloud Computing
• Data Centers:
•A data center is designed based on a network of huge computing and storage devices
•Data centers enable the handling and delivery of shared applications and data
• Modern Data Centers:
•Data centers shifted to virtual networks
•Multicloud environment is connected across multiple data centers, public and private
clouds
Data Center Growth and Cost Breakdown
SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING
 Distributed and cloud computing systems are built over a large number of autonomous
computer nodes.
 These node machines are interconnected by SANs, LANs, or WANs in a hierarchical
manner
 Massive systems are considered highly scalable, and can reach web-scale
connectivity, either physically or logically
 Massive systems are classified into four groups: clusters, P2P networks, computing
grids, and Internet clouds over huge data centers
 These machines work collectively, cooperatively, or collaboratively at various levels
 From the application perspective, clusters are most popular in supercomputing applications.
 In 2009, 417 of the Top 500 supercomputers were built with cluster architecture.
 It is fair to say that clusters have laid the necessary foundation for building large-scale grids and
clouds.
 P2P networks appeal most to business applications.
 Potential advantages of cloud computing include its low cost and simplicity for both providers
and users.
Clusters of Cooperative Computers
Cluster Architecture:
• The architecture of a typical server cluster is built around a low-latency, high-bandwidth interconnection network.
• The cluster is connected to the Internet via a virtual private network (VPN) gateway.
• The gateway IP address locates the cluster.
• The system image of a computer is decided by the way the OS manages the shared cluster resources.
• Most clusters have loosely coupled node computers.
• Most clusters have multiple system images as a result of having many autonomous nodes under different OS control
Single-System Image
 An ideal cluster should merge multiple system images into a single-system image (SSI).
 Cluster designers desire a cluster operating system or some middleware to support SSI at
various levels, including the sharing of CPUs, memory, and I/O across all cluster nodes.
 An SSI is an illusion created by software or hardware that presents a collection of resources
as one integrated, powerful resource.
 SSI makes the cluster appear like a single machine to the user.
 A cluster with multiple system images is nothing but a collection of independent computers
Hardware, Software, and Middleware Support
 Clusters exploring massive parallelism are commonly known as MPPs.
 Almost all HPC clusters in the Top 500 list are also MPPs.
 The computer nodes are interconnected by a high-bandwidth network (such as Gigabit Ethernet,
Myrinet, InfiniBand, etc.)
 Special cluster middleware supports are needed to create SSI or high availability (HA).
 Both sequential and parallel applications can run on the cluster, and special parallel environments
are needed to facilitate use of the cluster resources
 Distributed memory has multiple images. Users may want all distributed memory to be shared by
all servers by forming distributed shared memory (DSM).
 Many SSI features are expensive or difficult to achieve at various cluster operational levels
Major Cluster Design Issues
Computational Grids
 Computing grid offers an infrastructure that couples computers, software/middleware, special
instruments, and people and sensors together.
 The grid is often constructed across LAN, WAN, or Internet backbone networks at a regional,
national, or global scale.
 Enterprises or organizations present grids as integrated computing resources.
 They can also be viewed as virtual platforms to support virtual organizations.
 The computers used in a grid are primarily workstations, servers, clusters, and supercomputers.
 Personal computers, laptops, and PDAs can be used as access devices to a grid system
 Computational grid built over multiple resource sites owned by different organizations.
 The resource sites offer complementary computing resources, including workstations, large servers, a mesh of
processors, and Linux clusters to satisfy a chain of computational needs.
 The grid is built across various IP broadband networks including LANs and WANs already used by enterprises or
organizations over the Internet.
 At the server end, the grid is a network.
 At the client end, we see wired or wireless terminal devices.
 The grid integrates the computing, communication, contents, and transactions as rented services.
 Enterprises and consumers form the user base, which then defines the usage trends and service characteristics.
Cloud Computing over the Internet
 Gordon Bell, Jim Gray, and Alex Szalay [5] have advocated: “Computational science is changing
to be data-intensive. Supercomputers must be balanced systems, not just CPU farms but also
petascale I/O and networking arrays.”
 In the future, working with large data sets will typically mean sending the computations
(programs) to the data, rather than copying the data to the workstations.
 This reflects the trend in IT of moving computing and data from desktops to large data centers,
where there is on-demand provision of software, hardware, and data as a service.
 This data explosion has promoted the idea of cloud computing.
 IBM, a major player in cloud computing, has defined it as follows: “A cloud is a pool of virtualized
computer resources. A cloud can host a variety of different workloads, including batch-style
backend jobs and interactive and user-facing applications.”
Internet Clouds
 A cloud allows workloads to be deployed and scaled out quickly through
rapid provisioning of virtual or physical machines.
 The cloud supports redundant, self-recovering, highly scalable programming
models that allow workloads to recover from many unavoidable
hardware/software failures.
 Finally, the cloud system should be able to monitor resource use in real time
to enable rebalancing of allocations when needed.
 Virtualized resources from data centers form an Internet cloud,
provisioned with hardware, software, storage, network, and services for paid
users to run their applications.
 The cloud ecosystem must be designed to be secure, trustworthy, and dependable.
 Some computer users think of the cloud as a centralized resource pool.
 Others consider the cloud to be a server cluster which practices distributed computing
over all the servers used
 Traditional systems have encountered several performance bottlenecks: constant
system maintenance, poor utilization, and increasing costs associated with
hardware/software upgrades.
 Cloud computing as an on-demand computing paradigm resolves or relieves us from
these problems
Software as a Service (SaaS):
• This refers to browser-initiated application software over thousands of paid cloud customers.
• The SaaS model applies to business processes, industry applications, customer relationship management
(CRM), enterprise resource planning (ERP), human resources (HR), and collaborative applications.
• On the customer side, there is no upfront investment in servers or software licensing.
• On the provider side, costs are rather low, compared with conventional hosting of user applications.
Platform as a Service (PaaS):
• This model enables the user to deploy user-built applications onto a virtualized cloud platform.
• PaaS includes middleware, databases, development tools, and some runtime support such as Web 2.0 and
Java.
• The platform includes both hardware and software integrated with specific programming interfaces.
• The provider supplies the API and software tools (e.g., Java, Python, Web 2.0, .NET).
• The user is freed from managing the cloud infrastructure.
The Cloud Landscape
• Internet clouds offer four deployment modes: private, public, managed, and hybrid.
• These modes involve different levels of security requirements.
• The different SLAs imply that the security responsibility is shared among all the cloud
providers, the cloud resource consumers, and the third party cloud-enabled software providers
• The following list highlights eight reasons to adopt the cloud for upgraded Internet
applications and web services:
1. Desired location in areas with protected space and higher energy efficiency
2. Sharing of peak-load capacity among a large pool of users, improving overall utilization
3. Separation of infrastructure maintenance duties from domain-specific application
development
4. Significant reduction in cloud computing cost, compared with traditional computing
paradigms
5. Cloud computing programming and application development
6. Service and data discovery and content/service distribution
7. Privacy, security, copyright, and reliability issues
8. Service agreements, business models, and pricing policies
SOFTWARE ENVIRONMENTS FOR DISTRIBUTED SYSTEMS AND CLOUDS
Service-Oriented Architecture (SOA): These architectures build on the traditional seven Open
Systems Interconnection (OSI) layers that provide the base networking abstractions.
Layered Architecture for Web Services and Grids:
 The entity interfaces correspond to the Web Services Description Language (WSDL), Java method, and
CORBA interface definition language (IDL) specifications in these example distributed systems.
 These interfaces are linked with customized, high-level communication systems:
 These communication systems support features including particular message patterns (such as Remote
Procedure Call or RPC), fault recovery, and specialized routing.
 Often, these communication systems are built on message-oriented middleware (enterprise bus) infrastructure
such as WebSphere MQ or Java Message Service (JMS), which provide rich functionality and support
virtualization of routing, senders, and recipients
 In the case of fault tolerance, the features in the Web Services Reliable Messaging (WSRM)
framework mimic the OSI layer capability (as in TCP fault tolerance) modified to match the different
abstractions (such as messages versus packets, virtualized addressing) at the entity levels.
 Here, one might get several models with, for example, JNDI (Jini and Java Naming and Directory Interface) illustrating
different approaches within the Java distributed object model.
 The CORBA Trading Service, UDDI (Universal Description, Discovery, and Integration), LDAP (Lightweight
Directory Access Protocol), and ebXML (Electronic Business using eXtensible Markup Language) are other examples
of discovery and information services.
 Management services include service state and lifetime support; examples include the CORBA Life Cycle and
Persistent states, the different Enterprise JavaBeans models, Jini’s lifetime model, and a suite of web services
specifications
 The distributed model has two critical advantages:
 Higher performance and
 Cleaner separation of software functions with clear software reuse and maintenance advantages
The Evolution of SOA
 Service-oriented architecture (SOA) has evolved over the years. SOA applies to building grids,
clouds, grids of clouds, clouds of grids, clouds of clouds (also known as interclouds), and systems
of systems in general.
 A large number of sensors provide data-collection services, denoted in the figure as SS (sensor
service).
 A sensor can be a ZigBee device, a Bluetooth device, a WiFi access point, a personal computer, a
GPS device, or a wireless phone, among other things.
 Raw data is collected by sensor services.
 All the SS devices interact with large or small computers, many forms of grids, databases, the
compute cloud, the storage cloud, the filter cloud, the discovery cloud, and so on.
 Filter services ( fs in the figure) are used to eliminate unwanted raw data, in order to respond to
specific requests from the web, the grid, or web services.
 A collection of filter services forms a filter cloud.
 SOA aims to search for, or sort out, the useful data from the massive amounts of raw data items.
 Processing this data will generate useful information, and subsequently, the knowledge for our daily
use.
 In fact, wisdom or intelligence is sorted out of large knowledge bases.
 Finally, we make intelligent decisions based on both biological and machine wisdom
Grids versus Clouds
• The boundary between grids and clouds has been getting blurred in recent years
• In general, a grid system applies static resources, while a cloud emphasizes elastic
resources.
• For some researchers, the differences between grids and clouds are limited only in
dynamic resource allocation based on virtualization and autonomic computing.
• One can build a grid out of multiple clouds.
• This type of grid can do a better job than a pure cloud, because it can explicitly support
negotiated resource allocation.
• Thus one may end up building with a system of systems: such as a cloud of clouds, a
grid of clouds, or a cloud of grids, or inter-clouds as a basic SOA architecture
Trends toward Distributed Operating Systems
Parallel and Distributed Programming Models
• A transparent computing environment that
separates the user data, application, OS, and
hardware in time and space – an ideal model for
cloud computing.
• In this section, we will explore three
programming models for distributed computing
with expected scalable performance and
application flexibility
Parallel and Distributed Programming Models
• MPI is the most popular programming model for message-passing systems.
• Google’s MapReduce and BigTable are for effective use of resources from Internet clouds and data centers.
• Service clouds demand extending Hadoop, EC2, and S3 to facilitate distributed computing over distributed storage systems.
• Message-Passing Interface (MPI):
• This is the primary programming standard used to develop parallel and concurrent programs to run on a distributed system.
• MPI is essentially a library of subprograms that can be called from C or FORTRAN to write parallel programs running on a
distributed system.
• The idea is to embody clusters, grid systems, and P2P systems with upgraded web services and utility computing
applications.
• Besides MPI, distributed programming can be also supported with low-level primitives such as the Parallel Virtual Machine
(PVM)
• MapReduce:
• This is a web programming model for scalable data processing on large clusters over large data sets
• The model is applied mainly in web-scale search and cloud computing applications.
• Hadoop: offers a software platform that was originally developed by a Yahoo! group.
• The package enables users to write and run applications over vast amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data in the web space.
• Also, Hadoop is economical in that it comes with an open source version of MapReduce that minimizes overhead
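The MapReduce model described above can be mimicked in a few lines of plain Python: a map phase emits key-value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. This single-process word count only illustrates the programming model; real MapReduce/Hadoop runs these phases in parallel across a cluster over distributed storage.

```python
from collections import defaultdict

def map_phase(doc):
    # Map: emit (word, 1) for every word in the document.
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["cloud computing", "distributed cloud systems"]
pairs = [p for doc in docs for p in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
assert counts == {"cloud": 2, "computing": 1, "distributed": 1, "systems": 1}
```

Because each map task touches only its own document and each reduce task only its own key group, the framework can scatter both phases across thousands of nodes, which is what makes the model suited to web-scale data sets.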
05/04/25 59
RV College of
Engineering Go, change the world
05/04/25 60
RV College of
Engineering Go, change the world
05/04/25 61
RV College of
Engineering Go, change the world
PERFORMANCE, SECURITY, AND ENERGY EFFICIENCY
Performance Metrics:
 CPU speed in MIPS and network bandwidth in Mbps are commonly used to estimate processor and network performance.
 In a distributed system, performance is attributed to a large number of factors.
 System throughput is often measured in MIPS, Tflops (tera floating-point operations per second), or TPS (transactions per
second).
 Other measures include job response time and network latency.
 An interconnection network that has low latency and high bandwidth is preferred.
 System overhead is often attributed to OS boot time, compile time, I/O data rate, and the runtime support system used.
 Other performance-related metrics include the QoS for Internet and web services; system availability and dependability; and
security resilience for system defense against network attacks.
05/04/25 62
RV College of
Engineering Go, change the world
Dimensions of Scalability
 Users want to have a distributed system that can achieve scalable performance.
 Any resource upgrade in a system should be backward compatible with existing hardware and software
resources.
 Overdesign may not be cost-effective.
 System scaling can increase or decrease resources depending on many practical factors.
 The following dimensions of scalability are characterized in parallel and distributed systems:
1. Size scalability
 This refers to achieving higher performance or more functionality by increasing the machine size.
 The word “size” refers to adding processors, cache, memory, storage, or I/O channels.
 One way to determine size scalability is simply to count the number of processors installed.
 Not all parallel computers or distributed architectures are equally size scalable.
 For example, the IBM S2 was scaled up to 512 processors in 1997. But in 2008, the IBM BlueGene/L system
scaled up to 65,000 processors.
2. Software scalability
• This refers to upgrades in the OS or compilers, adding mathematical and engineering libraries, porting new
application software, and installing more user-friendly programming environments.
• Some software upgrades may not work with large system configurations.
• Testing and fine-tuning of new software on larger systems is a nontrivial job.
05/04/25 63
RV College of
Engineering Go, change the world
3. Application scalability
• This refers to matching problem size scalability with machine size scalability.
• Problem size affects the size of the data set or the workload increase.
• Instead of increasing machine size, users can enlarge the problem size to enhance system efficiency
or cost-effectiveness.
4. Technology scalability
• This refers to a system that can adapt to changes in building technologies, such as the component
and networking technologies.
• When scaling a system design with new technology, one must consider three aspects: time, space,
and heterogeneity.
(1) Time refers to generation scalability. When changing to new-generation processors, one must
consider the impact on the motherboard, power supply, packaging and cooling, and so forth.
(2) Space is related to packaging and energy concerns. Technology scalability demands harmony and
portability among suppliers.
(3) Heterogeneity refers to the use of hardware components or software packages from different
vendors. Heterogeneity may limit scalability.
05/04/25 64
RV College of
Engineering Go, change the world
Scalability versus OS Image Count
• Scalable performance implies that the system can achieve higher speed by adding more processors or servers,
enlarging the physical node’s memory size, extending the disk capacity, or adding more I/O channels.
• The OS image is counted by the number of independent OS images observed in a cluster, grid, P2P network, or
the cloud.
05/04/25 65
RV College of
Engineering Go, change the world
Scalability versus OS Image Count
• An SMP (symmetric multiprocessor) server has a single system image, which could be a single node in a large cluster.
• The scalability of SMP systems is constrained primarily by packaging and the system interconnect used.
• NUMA (nonuniform memory access) machines are often made out of SMP nodes with distributed, shared memory.
• A NUMA machine can run with multiple operating systems, and can scale to a few thousand processors communicating
with the MPI library. For example, a NUMA machine may have 2,048 processors running 32 SMP operating systems,
resulting in 32 OS images in the 2,048-processor NUMA system.
• The cluster nodes can be either SMP servers or high-end machines that are loosely coupled together. Therefore, clusters
have much higher scalability than NUMA machines.
• The number of OS images in a cluster is based on the cluster nodes concurrently in use.
• The cloud could be a virtualized cluster. As of 2010, the largest cloud was able to scale up to a few thousand VMs
• Keeping in mind that many cluster nodes are SMP or multicore servers, the total number of processors or cores in a
cluster system is one or two orders of magnitude greater than the number of OS images running in the cluster.
• The grid node could be a server cluster, or a mainframe, or a supercomputer, or an MPP. Therefore, the number of OS
images in a large grid structure could be hundreds or thousands fewer than the total number of processors in the grid.
• A P2P network can easily scale to millions of independent peer nodes, essentially desktop machines
05/04/25 66
RV College of
Engineering Go, change the world
Amdahl’s Law
 On a uniprocessor workstation the total execution time is T minutes.
 The program has been parallelized or partitioned for parallel execution on a cluster of many processing nodes.
 Assume that a fraction α of the code must be executed sequentially, called the sequential bottleneck.
 Therefore, (1 − α) of the code can be compiled for parallel execution by n processors.
 The total execution time of the program is calculated by α T + (1 − α)T/n, where the first term is the sequential
execution time on a single processor and the second term is the parallel execution time on n processing nodes.
 Amdahl’s Law states that the speedup factor of using the n-processor system over the use of a single processor
is expressed by: S = T / [αT + (1 − α)T/n] = 1 / [α + (1 − α)/n]
 The maximum speedup of n is achieved only if the sequential bottleneck α is reduced to zero or the code is
fully parallelizable with α = 0.
 As the cluster becomes sufficiently large, that is, n → ∞, S approaches 1/α. Surprisingly, this upper bound is
independent of the cluster size n.
 Amdahl’s law teaches us that we should make the sequential bottleneck as small as possible. Increasing the
cluster size alone may not result in a good speedup in this case
 To achieve higher efficiency when using a large cluster, we must consider scaling the problem size to match
the cluster capability. This leads to the speedup law proposed by John Gustafson, referred to as scaled-workload
speedup.
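The contrast between the two laws can be sketched numerically (the function names and the sample α are ours, chosen for illustration):

```python
def amdahl_speedup(alpha, n):
    # Fixed-workload speedup: S = 1 / (alpha + (1 - alpha) / n)
    return 1.0 / (alpha + (1.0 - alpha) / n)

def gustafson_speedup(alpha, n):
    # Scaled-workload speedup: S' = alpha + (1 - alpha) * n
    return alpha + (1.0 - alpha) * n

# With a 5% sequential bottleneck (alpha = 0.05), Amdahl's speedup
# saturates near 1/alpha = 20 no matter how large the cluster grows,
# while the scaled-workload speedup keeps growing with n.
for n in (4, 64, 1024):
    print(n, round(amdahl_speedup(0.05, n), 2),
          round(gustafson_speedup(0.05, n), 2))
```

Running this shows Amdahl's speedup climbing from about 3.5 at n = 4 to only about 19.6 at n = 1,024, while Gustafson's scaled speedup grows almost linearly with n.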
05/04/25 67
RV College of
Engineering Go, change the world
Fault Tolerance and System Availability
• System availability: HA (high availability) is desired in all clusters, grids, P2P networks, and cloud systems.
• A system is highly available if it has a long mean time to failure (MTTF) and a short mean time to repair
(MTTR).
• System availability is formally defined as: Availability = MTTF / (MTTF + MTTR)
• The rule of thumb is to design a dependable computing system with no single point of failure.
• Adding hardware redundancy, increasing component reliability, and designing for testability will help to
enhance system availability and dependability.
• Both SMP and MPP are very vulnerable with centralized resources under one OS.
• NUMA machines have improved in availability due to the use of multiple OSes.
• Most clusters are designed to have HA with failover capability.
• Clusters, clouds, and grids have decreasing availability as the system increases in size.
• A P2P file-sharing network has the highest aggregation of client machines.
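The MTTF/MTTR relationship above can be checked with a short calculation (the sample figures below are illustrative, not taken from the slides):

```python
def availability(mttf_hours, mttr_hours):
    # Availability = MTTF / (MTTF + MTTR)
    return mttf_hours / (mttf_hours + mttr_hours)

# A node that fails on average every 1,000 hours and takes 2 hours
# to repair is up about 99.8% of the time.
print(f"{availability(1000, 2):.4%}")  # 99.8004%
```

The formula makes the design rule concrete: availability rises either by lengthening MTTF (component reliability, redundancy) or by shortening MTTR (failover, designing for testability).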
05/04/25 68
RV College of
Engineering Go, change the world
Network Threats and Data Integrity
05/04/25 69
RV College of
Engineering Go, change the world
Energy Efficiency in Distributed Computing
05/04/25 70
RV College of
Engineering Go, change the world
Department of Computer Science and
Engineering
Organizing
Invited Talk on
Distributed Systems with AWS
RV College of
Engineering
Go, change the
world
By
Mr Mohammad Hannan
SDE-2, CISCO Systems, India
On
1st June 2023, Thursday
9.30 am to 10.30 am
Meeting Link: https://meet.google.com/utx-gixg-tgv
Organized by: Prof Srividya M S and Dr. Anala M R
Department of CSE/ISE, RVCE
For 1st Sem MTech CSE and CNE students
Distributed_and_cloud_computing-unit-1.ppt

  • 1. Distributed and Cloud Computing –Unit1: 22MCN1B2T RV College of Engineering Go, change the world By Prof Srividya M S, Department of CSE, RVCE 05/04/25 1
  • 2. 05/04/25 2 CO1 : Apply the distributed and cloud computing concepts to solve problems in computing domain. CO2 : Analyse various architectures, work flow models and algorithms used to implement cloud and distributed systems. CO3 : Design solutions using modern tools to solve applicable problems in cloud and distributed systems. CO4 : Demonstrate effective communication , report writing and usage of modern tools for implementing cloud and distributed systems applications RV College of Engineering Go, change the world Course Outcomes for Distributed and Cloud Computing
  • 3. RV College of Engineering Go, change the world 05/04/25 3 UNIT-1: Distributed System Models & Enabling technology: Scalable computing over the internet, Technologies for network-based system, System models for distributed & cloud, Software environments for distributed & Cloud, Performance security and energy efficiency
  • 4. SCALABLE COMPUTING OVER THE INTERNET • Over the past 60 years, computing technology has undergone a series of platform and environment changes. • Evolutionary changes in machine architecture, operating system platform, network connectivity, and application workload. • Parallel and distributed computing system uses multiple computers to solve large- scale problems • Distributed computing becomes data-intensive and network-centric • High-performance computing (HPC) and High-throughput computing (HTC) systems built with parallel and distributed computing technologies is the need of the hour 05/04/25 4 RV College of Engineering Go, change the world
  • 5. 05/04/25 5 RV College of Engineering Go, change the world
  • 6. The Platform Evolution • Computer technology has gone through five generations of development, with each generation lasting from 10 to 20 years. Successive generations are overlapped in about 10 years. • 1950 to 1970, a handful of mainframes, including the IBM 360 and CDC 6400. • 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX Series became popular • 1970 to 1990, we saw widespread use of personal computers built with VLSI microprocessor • 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both wired and wireless applications • Since 1990, the use of both HPC and HTC systems hidden clusters, grids, or Internet clouds has proliferated 05/04/25 6 RV College of Engineering Go, change the world
  • 7. 05/04/25 7 RV College of Engineering Go, change the world
  • 8. HPC and HTC systems • On the HPC side, • Supercomputers (massively parallel processors or MPPs) • The cluster is often a collection of homogeneous compute nodes that are physically connected in close range to one another. • On the HTC side, • Peer-to-peer (P2P) networks are formed for distributed file sharing and content delivery applications. • P2P, cloud computing, and web service platforms are more focused on HTC applications • There is a strategic change from an HPC paradigm to an HTC paradigm. • The performance goal thus shifts to measure high throughput or the number of tasks completed per unit of time. • HTC technology needs to not only improve in terms of batch processing speed, but cost saving, energy savings, security, and reliability at many data and enterprise computing centers. 05/04/25 8 RV College of Engineering Go, change the world
  • 9. 05/04/25 9 Parameter HPC HTC Stands for HPC stands for High-Performance Computing HTC stands for High Throughput Computing Definition HPC is defined as the type of computing that makes use of multiple computer processors in order to perform complex computations parallelly. HTC is defined as a type of computing that parallelly executes a large number of simple and computationally independent tasks. Workload HPC consists of running large-scale, complex, and computationally intensive applications that need significant resources and memory. HPC consists of running a large number of tasks that are independent and small in size and does not require a large amount of memory and resources. Processing Power It is designed to provide maximum performance and speed for large tasks. HTC is designed to increase the number of tasks that needs to be completed in a given specific amount of time. Resource Management For resource management to processes, HPC makes use of job schedulers and resource managers. For the resource management to processes, HTC makes use of distributed management resources. Fault Tolerance To reduce the risk of data loss and data corruption HPC systems have a complex fault tolerance mechanism. HTC systems do not affect any other running processes due to the failure of an individual task. Scaling HPC scales up when few users are running together. HTC systems scale horizontally for simple tasks and require less computational speed. Applications HPC can be used in applications such as engineering design, weather forecasting, drug discovery etc. HTC can be used in applications such as bioinformatics, research applications, etc.
  • 10. Three New Computing Paradigms • The maturity of Radio-frequency Identification (Rfid), Global Positioning System (GPS), and sensor technologies has triggered the development of the Internet of Things (IoT). • With the introduction of SOA, Web 2.0 services become available. • Advances in virtualization make it possible to see the growth of Internet clouds 05/04/25 10 RV College of Engineering Go, change the world
  • 11. Computing Paradigm Distinctions • In general, distributed computing is the opposite of centralized computing. • The field of parallel computing overlaps with distributed computing to a great extent, • And cloud computing overlaps with distributed, centralized  Centralized computing : 1. This is a computing paradigm by which all computer resources are centralized in one physical system. 2. All resources (processors, memory, and storage) are fully shared and tightly coupled within one integrated OS. 3. Many data centers and supercomputers are centralized systems, but they are used in parallel, distributed, and cloud computing applications  Parallel computing : 1. In parallel computing, all processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory. 2. Inter processor communication is accomplished through shared memory or via message passing 05/04/25 11 RV College of Engineering Go, change the world
  • 12. Computing Paradigm Distinctions  Distributed computing : 1. A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network. 2. Information exchange in a distributed system is accomplished through message passing.  Cloud computing : 1. An Internet cloud of resources can be either a centralized or a distributed computing system. 2. The cloud applies parallel or distributed computing, or both. 3. Clouds can be built with physical or virtualized resources over large data centers that are centralized or distributed 05/04/25 12 RV College of Engineering Go, change the world
  • 13. Scalable Computing Trends and New Paradigms Includes, Degrees of Parallelism Innovative Applications The Trend toward Utility Computing The Hype Cycle of New Technologies Fifty years ago, when hardware was bulky and expensive, most computers were designed in a bit-serial fashion. Data-level parallelism (DLP) was made popular through SIMD (single instruction, multiple data) and vector machines using vector or array types of instructions. DLP requires even more hardware support and compiler assistance to work properly. Innovative Applications Both HPC and HTC systems desire transparency in many application aspects. For example, data access, resource allocation, process location, concurrency in execution, job replication, and failure recovery should be made transparent to both users and system management 05/04/25 13 RV College of Engineering Go, change the world
  • 14. 05/04/25 14 RV College of Engineering Go, change the world
  • 15. The Trend toward Utility Computing Utility computing focuses on a business model in which customers receive computing resources from a paid service provider. All grid/cloud platforms are regarded as utility service providers 05/04/25 15 RV College of Engineering Go, change the world
  • 16. The Trend toward Utility Computing Identifies major computing paradigms to facilitate the study of distributed systems and their applications. These paradigms share some common characteristics. 1.First, they are all ubiquitous in daily life. Reliability and scalability are two major design objectives in these computing models. 2.Second, they are aimed at autonomic operations that can be self- organized to support dynamic discovery 05/04/25 16 RV College of Engineering Go, change the world
  • 17. The Hype Cycle of New Technologies 05/04/25 17 RV College of Engineering Go, change the world
  • 18. The Internet of Things and Cyber-Physical Systems •The IoT refers to the networked interconnection of everyday objects, tools, devices, or computers •With IPv6 protocol, 2^128 IP addresses are available to distinguish all the objects on Earth, including all computers and pervasive devices •The IoT researchers have estimated that every human being will be surrounded by 1,000 to 5,000 objects •In the IoT era, all objects and devices are instrumented, interconnected, and interacted with each other intelligently •Three communication patterns co-exist: namely H2H (human-to-human), H2T (human-to- thing), and T2T (thing-to-thing) •A cyber-physical system (CPS) is the result of interaction between computational processes and the physical world •A CPS merges the “3C” technologies of computation, communication, and control into an intelligent closed feedback system between the physical world and theinformation world •CPS emphasizes exploration of virtual reality (VR) 05/04/25 18 RV College of Engineering Go, change the world
  • 19. TECHNOLOGIES FOR NETWORK-BASED SYSTEMS Multicore CPUs and Multithreading Technologies Advances in CPU Processors 05/04/25 19 RV College of Engineering Go, change the world
  • 20. Multicore CPU and Many-Core GPU Architectures 05/04/25 20 RV College of Engineering Go, change the world
  • 21. Multithreading Technology 05/04/25 21 RV College of Engineering Go, change the world
  • 22. Memory, Storage, and Wide-Area Networking •Memory chips have experienced a 4x increase in capacity every three years •Memory access time did not improve much in the past •Faster processor speed and larger memory capacity result in a wider gap between processors and memory •disk storage growth in 7 orders of magnitude in 33 years 05/04/25 22 RV College of Engineering Go, change the world
  • 23. System-Area Interconnects • LAN typically is used to connect client hosts to big servers. • A storage area network (SAN) connects servers to network storage such as disk arrays. • Network attached storage (NAS) connects client hosts directly to the disk arrays. 05/04/25 23 RV College of Engineering Go, change the world
  • 24. Virtual Machines and Virtualization Middleware • Virtual machine is an application environment, that is installed on software which imitates dedicated hardware • VM is a virtual environment that functions as a virtual computer system with its own CPU, memory, network interface and storage, created on physical hardware system (on/off premises) • The end user has the same experience on a virtual machine as they would have on dedicated hardware 05/04/25 24 RV College of Engineering Go, change the world
  • 25. Need for VM • Conventional computer has a single OS image. This offers an inflexible architecture that tightly couples application software to a specific hardware platform • Some software running well on one machine may not be executable on another platform with a different instruction set under fixed OS • Virtual machines (VMs) offer novel solutions to underutilized resources, application inflexibility, software manageability, and security concerns in existing physical machines • To build large clusters, grids, and clouds, we need to access large amounts of computing, storage, and networking resources in a virtualized manner • We need to aggregate those resources, and offer a single system image (SSI) • A cloud of provisioned resources must rely on virtualization of processors, memory, and I/O facilities dynamically 05/04/25 25 RV College of Engineering Go, change the world
  • 26. Virtual Machines and Virtualization Middleware 05/04/25 26 RV College of Engineering Go, change the world
  • 27. 05/04/25 27 Physical Machine a. The host machine is equipped with the physical hardware. An example is an x-86 architecture desktop running its installed Windows OS RV College of Engineering Go, change the world
  • 28. 05/04/25 28 Native VM • The VM can be provisioned for any hardware system. • The VM is built with virtual resources managed by a guest OS to run a specific application. • Between the VMs and the host platform, one needs to deploy a middleware layer called a virtual machine monitor (VMM) or hypervisor, handles the bare metal hardware (CPU, memory, and I/O) directly. • The guest OS could be a Linux system and the hypervisor is the XEN system • This hypervisor approach is also called bare-metal VM RV College of Engineering Go, change the world
  • 29. 05/04/25 29 Hosted VM • The VMM runs in nonprivileged mode • The host OS need not be modified RV College of Engineering Go, change the world
  • 30. 05/04/25 30 Dual Mode VM • Part of the VMM runs at the user level and another part runs at the supervisor level. • In this case, the host OS may have to be modified to some extent. • Multiple VMs can be ported to a given hardware system to support the virtualization process. • The VM approach offers hardware independence of the OS and applications. • The user application running on its dedicated OS could be bundled together as a virtual appliance that can be ported to any hardware platform. • The VM could run on an OS different from that of the host computer RV College of Engineering Go, change the world
  • 31. VM Primitive Operations 1. The VMM provides the VM abstraction to the guest OS 2. With full virtualization, the VMM exports a VM abstraction identical to the physical machine 3. So that a standard OS such as Windows 2000 or Linux can run just as it would on the physical hardware 4. Basic VM operations: VM multiplexing, suspension, provision, and migration in a distributed computing environment. 05/04/25 31 RV College of Engineering Go, change the world
  • 32. VM Primitive Operations a. First, the VMs can be multiplexed between hardware machines. b. Second, a VM can be suspended and stored in stable storage c. Third, a suspended VM can be resumed or provisioned to a new hardware platform d. Finally, a VM can be migrated from one hardware platform to another 05/04/25 32 RV College of Engineering Go, change the world
  • 33. VM Primitive Operations 1. These VM operations enable a VM to be provisioned to any available hardware platform 2. They also enable flexibility in porting distributed application executions 3. VMware claimed that server utilization could be increased from its current 5–15 percent to 60– 80 percent 4. Multiple server functions can be consolidated on the same hardware platform to achieve higher system efficiency. 05/04/25 33 RV College of Engineering Go, change the world
  • 34. VM Migration • The process of moving a running VM or application between different physical machines without disconnecting the client or application • Memory, storage and network connectivity of the VM are transferred from one original guest machine to the destination machine 05/04/25 34 RV College of Engineering Go, change the world
  • 35. Data Center Virtualization for Cloud Computing • Data Centers: •A data center is designed based on a network of huge computing and storage devices •Data centers enabled to handle and delivery of shared applications and data • Modern Data Centers: •Data centers shifted to virtual networks •Multicloud environment is connected across multiple data centers, public and private clouds 05/04/25 35 RV College of Engineering Go, change the world
  • 36. Data Center Growth and Cost Breakdown 05/04/25 36 RV College of Engineering Go, change the world
  • 37. SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING  Distributed and cloud computing systems are built over a large number of autonomous computer nodes.  These node machines are interconnected by SANs, LANs, or WANs in a hierarchical manner  Massive systems are considered highly scalable, and can reach web-scale connectivity, either physically or logically  Massive systems are classified into four groups: clusters, P2P networks, computing grids, and Internet clouds over huge data centers  These machines work collectively, cooperatively, or collaboratively at various levels 05/04/25 37 RV College of Engineering Go, change the world
  • 38. 05/04/25 38 RV College of Engineering Go, change the world
  • 39.  From the application perspective, clusters are most popular in supercomputing applications.  In 2009, 417 of the Top 500 supercomputers were built with cluster architecture.  It is fair to say that clusters have laid the necessary foundation for building large-scale grids and clouds.  P2P networks appeal most to business applications.  Potential advantages of cloud computing include its low cost and simplicity for both providers and users. 05/04/25 39 RV College of Engineering Go, change the world
  • 40. Clusters of Cooperative Computers Cluster Architecture: • The architecture of a typical server cluster built around a low-latency, high bandwidth interconnection network. • The cluster is connected to the Internet via a virtual private network (VPN) gateway. • The gateway IP address locates the cluster. • The system image of a computer is decided by the way the OS manages the shared cluster resources. • Most clusters have loosely coupled node computers. • Most clusters have multiple system images as a result of having many autonomous nodes under different OS control 05/04/25 40 RV College of Engineering Go, change the world
• 41. Single-System Image
• An ideal cluster should merge multiple system images into a single-system image (SSI).
• Cluster designers desire a cluster operating system or some middleware to support SSI at various levels, including the sharing of CPUs, memory, and I/O across all cluster nodes.
• An SSI is an illusion created by software or hardware that presents a collection of resources as one integrated, powerful resource.
• SSI makes the cluster appear like a single machine to the user.
• A cluster with multiple system images is nothing but a collection of independent computers.
• 42. Hardware, Software, and Middleware Support
• Clusters exploiting massive parallelism are commonly known as MPPs.
• Almost all HPC clusters in the Top 500 list are also MPPs.
• The computer nodes are interconnected by a high-bandwidth network (such as Gigabit Ethernet, Myrinet, or InfiniBand).
• Special cluster middleware support is needed to create SSI or high availability (HA).
• Both sequential and parallel applications can run on the cluster, and special parallel environments are needed to facilitate use of the cluster resources.
• Distributed memory has multiple images. Users may want all distributed memory to be shared by all servers by forming distributed shared memory (DSM).
• Many SSI features are expensive or difficult to achieve at various cluster operational levels.
• 43. Major Cluster Design Issues
• 44. Computational Grids
• A computing grid offers an infrastructure that couples computers, software/middleware, special instruments, people, and sensors together.
• The grid is often constructed across LAN, WAN, or Internet backbone networks at a regional, national, or global scale.
• Enterprises or organizations present grids as integrated computing resources.
• They can also be viewed as virtual platforms to support virtual organizations.
• The computers used in a grid are primarily workstations, servers, clusters, and supercomputers.
• Personal computers, laptops, and PDAs can be used as access devices to a grid system.
• 45.
• A computational grid is built over multiple resource sites owned by different organizations.
• The resource sites offer complementary computing resources, including workstations, large servers, a mesh of processors, and Linux clusters to satisfy a chain of computational needs.
• The grid is built across various IP broadband networks, including LANs and WANs already used by enterprises or organizations over the Internet.
• At the server end, the grid is a network.
• At the client end, we see wired or wireless terminal devices.
• The grid integrates computing, communication, content, and transactions as rented services.
• Enterprises and consumers form the user base, which then defines the usage trends and service characteristics.
• 46. Cloud Computing over the Internet
• Gordon Bell, Jim Gray, and Alex Szalay [5] have advocated: “Computational science is changing to be data-intensive. Supercomputers must be balanced systems, not just CPU farms but also petascale I/O and networking arrays.”
• In the future, working with large data sets will typically mean sending the computations (programs) to the data, rather than copying the data to the workstations.
• This reflects the trend in IT of moving computing and data from desktops to large data centers, where there is on-demand provision of software, hardware, and data as a service.
• This data explosion has promoted the idea of cloud computing.
• IBM, a major player in cloud computing, has defined it as follows: “A cloud is a pool of virtualized computer resources. A cloud can host a variety of different workloads, including batch-style backend jobs and interactive and user-facing applications.”
• 47. Internet Clouds
• A cloud allows workloads to be deployed and scaled out quickly through rapid provisioning of virtual or physical machines.
• The cloud supports redundant, self-recovering, highly scalable programming models that allow workloads to recover from many unavoidable hardware/software failures.
• Finally, the cloud system should be able to monitor resource use in real time to enable rebalancing of allocations when needed.
• Virtualized resources from data centers form an Internet cloud, provisioned with hardware, software, storage, networks, and services for paid users to run their applications.
• 48.
• The cloud ecosystem must be designed to be secure, trustworthy, and dependable.
• Some computer users think of the cloud as a centralized resource pool.
• Others consider the cloud to be a server cluster that practices distributed computing over all the servers used.
• Traditional systems have encountered several performance bottlenecks: constant system maintenance, poor utilization, and increasing costs associated with hardware/software upgrades.
• Cloud computing, as an on-demand computing paradigm, resolves or relieves these problems.
• 49. Software as a Service (SaaS):
• This refers to browser-initiated application software delivered to thousands of paid cloud customers.
• The SaaS model applies to business processes, industry applications, customer relationship management (CRM), enterprise resource planning (ERP), human resources (HR), and collaborative applications.
• On the customer side, there is no upfront investment in servers or software licensing.
• On the provider side, costs are rather low, compared with conventional hosting of user applications.
Platform as a Service (PaaS):
• This model enables the user to deploy user-built applications onto a virtualized cloud platform.
• PaaS includes middleware, databases, development tools, and some runtime support such as Web 2.0 and Java.
• The platform includes both hardware and software integrated with specific programming interfaces.
• The provider supplies the API and software tools (e.g., Java, Python, Web 2.0, .NET).
• The user is freed from managing the cloud infrastructure.
• 50. The Cloud Landscape
• Internet clouds offer four deployment modes: private, public, managed, and hybrid.
• These modes have different security implications.
• The different SLAs imply that security responsibility is shared among the cloud providers, the cloud resource consumers, and the third-party cloud-enabled software providers.
• The following list highlights eight reasons to adopt the cloud for upgraded Internet applications and web services:
1. Desired location in areas with protected space and higher energy efficiency
2. Sharing of peak-load capacity among a large pool of users, improving overall utilization
3. Separation of infrastructure maintenance duties from domain-specific application development
4. Significant reduction in cloud computing cost, compared with traditional computing paradigms
5. Cloud computing programming and application development
6. Service and data discovery and content/service distribution
7. Privacy, security, copyright, and reliability issues
8. Service agreements, business models, and pricing policies
• 51. SOFTWARE ENVIRONMENTS FOR DISTRIBUTED SYSTEMS AND CLOUDS
Service-Oriented Architecture (SOA):
• These architectures build on the traditional seven Open Systems Interconnection (OSI) layers that provide the base networking abstractions.
Layered Architecture for Web Services and Grids:
• 52.
• The entity interfaces correspond to the Web Services Description Language (WSDL), Java method, and CORBA interface definition language (IDL) specifications in these example distributed systems.
• These interfaces are linked with customized, high-level communication systems.
• These communication systems support features including particular message patterns (such as Remote Procedure Call, or RPC), fault recovery, and specialized routing.
• Often, these communication systems are built on message-oriented middleware (enterprise bus) infrastructure such as WebSphere MQ or the Java Message Service (JMS), which provide rich functionality and support virtualization of routing, senders, and recipients.
• In the case of fault tolerance, the features in the Web Services Reliable Messaging (WSRM) framework mimic the OSI layer capability (as in TCP fault tolerance), modified to match the different abstractions (such as messages versus packets, and virtualized addressing) at the entity levels.
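The request/reply (RPC) message pattern mentioned above can be sketched in-process, with two queues standing in for a message broker's request and reply channels. This is a toy illustration under invented names, not a real WebSphere MQ or JMS client:

```python
import queue
import threading

# Two in-process queues stand in for a broker's request and reply channels.
request_q = queue.Queue()
reply_q = queue.Queue()

def service_worker():
    """Service endpoint: consume (correlation_id, payload), publish a reply."""
    while True:
        corr_id, payload = request_q.get()
        if corr_id == -1:              # poison pill: shut the worker down
            break
        reply_q.put((corr_id, payload * payload))

def rpc_call(payload, corr_id=1):
    """Client side: send a request, then block for the correlated reply."""
    request_q.put((corr_id, payload))
    rid, result = reply_q.get(timeout=5)
    assert rid == corr_id              # correlate reply with request
    return result

worker = threading.Thread(target=service_worker)
worker.start()
answer = rpc_call(7)                   # a synchronous call over async messaging
request_q.put((-1, 0))                 # stop the worker
worker.join()
```

The correlation identifier is what lets asynchronous messaging mimic a synchronous procedure call, the same idea middleware uses to virtualize senders and recipients.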
• 53.
• Here, one might get several models, with, for example, Jini and JNDI (Java Naming and Directory Interface) illustrating different approaches within the Java distributed object model.
• The CORBA Trading Service, UDDI (Universal Description, Discovery, and Integration), LDAP (Lightweight Directory Access Protocol), and ebXML (Electronic Business using eXtensible Markup Language) are other examples of discovery and information services.
• Management services include service state and lifetime support; examples include the CORBA Life Cycle and Persistent states, the different Enterprise JavaBeans models, Jini’s lifetime model, and a suite of web services specifications.
• The distributed model has two critical advantages:
• Higher performance, and
• Cleaner separation of software functions, with clear software reuse and maintenance advantages.
• 54. The Evolution of SOA
• 55.
• Service-oriented architecture (SOA) has evolved over the years. SOA applies to building grids, clouds, grids of clouds, clouds of grids, clouds of clouds (also known as interclouds), and systems of systems in general.
• A large number of sensors provide data-collection services, denoted in the figure as SS (sensor service).
• A sensor can be a ZigBee device, a Bluetooth device, a WiFi access point, a personal computer, a GPS device, or a wireless phone, among other things.
• Raw data is collected by sensor services.
• All the SS devices interact with large or small computers, many forms of grids, databases, the compute cloud, the storage cloud, the filter cloud, the discovery cloud, and so on.
• Filter services (fs in the figure) are used to eliminate unwanted raw data, in order to respond to specific requests from the web, the grid, or web services.
• A collection of filter services forms a filter cloud.
• SOA aims to search for, or sort out, the useful data from the massive amounts of raw data items.
• Processing this data generates useful information, and subsequently, the knowledge for our daily use.
• In fact, wisdom or intelligence is sorted out of large knowledge bases.
• Finally, we make intelligent decisions based on both biological and machine wisdom.
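The sensor-service and filter-service flow described above can be illustrated with a small, self-contained sketch. The function names and the readings are invented for illustration and are not taken from the figure:

```python
def sensor_service(n):
    """Simulate raw readings from n sensors (the SS role); some are invalid."""
    values = [21.5, -999.0, 23.1, 180.0, 22.4]
    return [{"sensor": i, "temp_c": t} for i, t in enumerate(values[:n])]

def filter_service(readings, lo=-40.0, hi=60.0):
    """The fs role: eliminate raw items outside a plausible range."""
    return [r for r in readings if lo <= r["temp_c"] <= hi]

def aggregate(readings):
    """Turn filtered data into information: here, a mean temperature."""
    temps = [r["temp_c"] for r in readings]
    return sum(temps) / len(temps)

raw = sensor_service(5)
clean = filter_service(raw)   # the -999.0 and 180.0 readings are discarded
info = aggregate(clean)       # information derived from filtered data
```

Each stage mirrors one layer of the SOA pipeline the slide describes: raw data, filtered data, then information.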
• 56. Grids versus Clouds
• The boundary between grids and clouds has been getting blurred in recent years.
• In general, a grid system applies static resources, while a cloud emphasizes elastic resources.
• For some researchers, the differences between grids and clouds are limited to dynamic resource allocation based on virtualization and autonomic computing.
• One can build a grid out of multiple clouds.
• This type of grid can do a better job than a pure cloud, because it can explicitly support negotiated resource allocation.
• Thus one may end up building a system of systems: a cloud of clouds, a grid of clouds, a cloud of grids, or inter-clouds, as a basic SOA architecture.
• 57. Trends toward Distributed Operating Systems
• 58. Parallel and Distributed Programming Models
• A transparent computing environment separates the user data, application, OS, and hardware in time and space: an ideal model for cloud computing.
• In this section, we explore three programming models for distributed computing with expected scalable performance and application flexibility.
• 59. Parallel and Distributed Programming Models
• MPI is the most popular programming model for message-passing systems.
• Google’s MapReduce and BigTable are for effective use of resources from Internet clouds and data centers.
• Service clouds demand extending Hadoop, EC2, and S3 to facilitate distributed computing over distributed storage systems.
Message-Passing Interface (MPI):
• This is the primary programming standard used to develop parallel and concurrent programs to run on a distributed system.
• MPI is essentially a library of subprograms that can be called from C or FORTRAN to write parallel programs running on a distributed system.
• The idea is to embody clusters, grid systems, and P2P systems with upgraded web services and utility computing applications.
• Besides MPI, distributed programming can also be supported with low-level primitives such as the Parallel Virtual Machine (PVM).
MapReduce:
• This is a web programming model for scalable data processing on large clusters over large data sets.
• The model is applied mainly in web-scale search and cloud computing applications.
Hadoop:
• Hadoop offers a software platform that was originally developed by a Yahoo! group.
• The package enables users to write and run applications over vast amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data in the web space.
• Also, Hadoop is economical in that it comes with an open source version of MapReduce that minimizes overhead.
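As a rough illustration of the MapReduce model (a single-process sketch, not Hadoop or Google's implementation), word counting is the canonical example: map emits (word, 1) pairs, a shuffle step groups pairs by key, and reduce sums each group:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit a (key, 1) pair for every word in one input split."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the cloud scales", "the grid and the cloud"]   # two input splits
pairs = chain.from_iterable(map_phase(d) for d in docs) # map each split
counts = reduce_phase(shuffle(pairs))                   # shuffle, then reduce
```

In a real cluster the map and reduce calls run in parallel on different nodes and the shuffle moves data over the network; the dataflow, however, is exactly this.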
• 62. PERFORMANCE, SECURITY, AND ENERGY EFFICIENCY
Performance Metrics:
• CPU speed in MIPS and network bandwidth in Mbps are used to estimate processor and network performance.
• In a distributed system, performance is attributed to a large number of factors.
• System throughput is often measured in MIPS, Tflops (tera floating-point operations per second), or TPS (transactions per second).
• Other measures include job response time and network latency.
• An interconnection network that has low latency and high bandwidth is preferred.
• System overhead is often attributed to OS boot time, compile time, I/O data rate, and the runtime support system used.
• Other performance-related metrics include the QoS for Internet and web services; system availability and dependability; and security resilience for system defense against network attacks.
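Two of the metrics named above, job response time and throughput (TPS), can be measured for a toy workload as follows; the job body is an arbitrary stand-in, not a real benchmark:

```python
import time

def run_job(n):
    """Stand-in for a real job: an arbitrary CPU-bound computation."""
    return sum(i * i for i in range(n))

jobs = 50
start = time.perf_counter()
for _ in range(jobs):
    run_job(10_000)
elapsed = time.perf_counter() - start

response_time = elapsed / jobs   # mean job response time, in seconds
throughput = jobs / elapsed      # jobs completed per second (TPS)
```

The same two quantities are what cluster monitoring tools report, just aggregated over many nodes and real workloads.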
• 63. Dimensions of Scalability
• Users want a distributed system that can achieve scalable performance.
• Any resource upgrade in a system should be backward compatible with existing hardware and software resources.
• Overdesign may not be cost-effective.
• System scaling can increase or decrease resources depending on many practical factors.
• The following dimensions of scalability are characterized in parallel and distributed systems:
1. Size scalability
• This refers to achieving higher performance or more functionality by increasing the machine size.
• The word “size” refers to adding processors, cache, memory, storage, or I/O channels.
• Size scalability is determined by simply counting the number of processors installed.
• Not all parallel computer or distributed architectures are equally size scalable.
• For example, the IBM S2 was scaled up to 512 processors in 1997, but in 2008, the IBM BlueGene/L system scaled up to 65,000 processors.
2. Software scalability
• This refers to upgrades in the OS or compilers, adding mathematical and engineering libraries, porting new application software, and installing more user-friendly programming environments.
• Some software upgrades may not work with large system configurations.
• Testing and fine-tuning of new software on larger systems is a nontrivial job.
• 64.
3. Application scalability
• This refers to matching problem size scalability with machine size scalability.
• Problem size affects the size of the data set or the workload increase.
• Instead of increasing machine size, users can enlarge the problem size to enhance system efficiency or cost-effectiveness.
4. Technology scalability
• This refers to a system that can adapt to changes in building technologies, such as component and networking technologies.
• When scaling a system design with new technology, one must consider three aspects: time, space, and heterogeneity.
(1) Time refers to generation scalability. When changing to new-generation processors, one must consider the impact on the motherboard, power supply, packaging and cooling, and so forth.
(2) Space is related to packaging and energy concerns. Technology scalability demands harmony and portability among suppliers.
(3) Heterogeneity refers to the use of hardware components or software packages from different vendors. Heterogeneity may limit scalability.
• 65. Scalability versus OS Image Count
• Scalable performance implies that the system can achieve higher speed by adding more processors or servers, enlarging the physical node’s memory size, extending the disk capacity, or adding more I/O channels.
• The OS image count is the number of independent OS images observed in a cluster, grid, P2P network, or cloud.
• 66. Scalability versus OS Image Count
• An SMP (symmetric multiprocessor) server has a single system image, which could be a single node in a large cluster.
• The scalability of SMP systems is constrained primarily by packaging and the system interconnect used.
• NUMA (nonuniform memory access) machines are often made out of SMP nodes with distributed, shared memory.
• A NUMA machine can run with multiple operating systems, and can scale to a few thousand processors communicating with the MPI library. For example, a NUMA machine may have 2,048 processors running 32 SMP operating systems, resulting in 32 OS images in the 2,048-processor NUMA system.
• The cluster nodes can be either SMP servers or high-end machines that are loosely coupled together. Therefore, clusters have much higher scalability than NUMA machines.
• The number of OS images in a cluster is based on the cluster nodes concurrently in use.
• The cloud could be a virtualized cluster. As of 2010, the largest cloud was able to scale up to a few thousand VMs.
• Keeping in mind that many cluster nodes are SMP or multicore servers, the total number of processors or cores in a cluster system is one or two orders of magnitude greater than the number of OS images running in the cluster.
• The grid node could be a server cluster, a mainframe, a supercomputer, or an MPP. Therefore, the number of OS images in a large grid structure could be hundreds or thousands of times fewer than the total number of processors in the grid.
• A P2P network can easily scale to millions of independent peer nodes, essentially desktop machines.
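The NUMA figures quoted above work out as simple arithmetic; the cluster numbers in the second half are invented solely to illustrate the "one or two orders of magnitude" claim:

```python
# NUMA example from the slide: 2,048 processors under 32 SMP operating systems.
numa_processors = 2048
numa_os_images = 32
processors_per_image = numa_processors // numa_os_images   # 64 per OS image

# Hypothetical cluster of multicore SMP nodes, one OS image per node:
nodes, cores_per_node = 500, 32
cluster_cores = nodes * cores_per_node        # total cores in the cluster
cluster_os_images = nodes                     # one image per node
ratio = cluster_cores / cluster_os_images     # cores outnumber OS images 32:1
```

With 32 cores per node the ratio is already well over one order of magnitude, consistent with the slide's claim for SMP and multicore cluster nodes.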
• 67. Amdahl’s Law
• On a uniprocessor workstation, the total execution time is T minutes.
• The program has been parallelized or partitioned for parallel execution on a cluster of many processing nodes.
• Assume that a fraction α of the code must be executed sequentially, called the sequential bottleneck.
• Therefore, (1 − α) of the code can be compiled for parallel execution by n processors.
• The total execution time of the program is αT + (1 − α)T/n, where the first term is the sequential execution time on a single processor and the second term is the parallel execution time on n processing nodes.
• Amdahl’s Law states that the speedup factor of using the n-processor system over the use of a single processor is expressed by: S = T / [αT + (1 − α)T/n] = 1 / [α + (1 − α)/n]
• The maximum speedup of n is achieved only if the sequential bottleneck α is reduced to zero, that is, the code is fully parallelizable with α = 0.
• As the cluster becomes sufficiently large, that is, n → ∞, S approaches 1/α; surprisingly, this upper bound is independent of the cluster size n.
• Amdahl’s law teaches us that we should make the sequential bottleneck as small as possible. Increasing the cluster size alone may not result in a good speedup in this case.
• To achieve higher efficiency when using a large cluster, we must consider scaling the problem size to match the cluster capability. This leads to the speedup law proposed by John Gustafson, referred to as scaled-workload speedup.
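The speedup expression above is easy to evaluate numerically. For example, with a 5% sequential bottleneck (α = 0.05), the speedup saturates near the 1/α = 20 ceiling no matter how many processors are added:

```python
def amdahl_speedup(alpha, n):
    """Speedup S = 1 / (alpha + (1 - alpha) / n) for sequential fraction alpha."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

s_16 = amdahl_speedup(0.05, 16)        # about 9.14 with 16 processors
s_1024 = amdahl_speedup(0.05, 1024)    # about 19.6, already near the ceiling
ceiling = 1 / 0.05                     # 20.0, the limit as n grows without bound
```

Going from 16 to 1,024 processors (a 64x increase in machine size) barely doubles the speedup, which is exactly the lesson the slide draws: shrink α rather than just enlarging the cluster.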
• 68. Fault Tolerance and System Availability
• HA (high availability) is desired in all clusters, grids, P2P networks, and cloud systems.
• A system is highly available if it has a long mean time to failure (MTTF) and a short mean time to repair (MTTR).
• System availability is formally defined as: Availability = MTTF / (MTTF + MTTR)
• The rule of thumb is to design a dependable computing system with no single point of failure.
• Adding hardware redundancy, increasing component reliability, and designing for testability will help to enhance system availability and dependability.
• Both SMP and MPP are very vulnerable, with centralized resources under one OS.
• NUMA machines have improved availability due to the use of multiple OSes.
• Most clusters are designed to have HA with failover capability.
• Clusters, clouds, and grids have decreasing availability as the system increases in size.
• A P2P file-sharing network has the highest aggregation of client machines.
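The availability formula can be applied directly; the MTTF and MTTR figures below are illustrative, not taken from the slide:

```python
HOURS_PER_YEAR = 8760

def availability(mttf_hours, mttr_hours):
    """System availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A node that fails on average every 1,000 hours and takes 2 hours to repair:
a = availability(1000, 2)                      # about 0.998 (99.8% available)
downtime_per_year = (1 - a) * HOURS_PER_YEAR   # roughly 17.5 hours per year
```

The formula makes the design trade-off explicit: availability improves either by lengthening MTTF (more reliable components, redundancy) or by shortening MTTR (faster failover and repair).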
• 69. Network Threats and Data Integrity
• 70. Energy Efficiency in Distributed Computing
• 71. Department of Computer Science and Engineering
Invited Talk on Distributed Systems with AWS
By Mr Mohammad Hannan, SDE-2, CISCO Systems, India
On 1st June 2023, Thursday, 9.30 am to 10.30 am
Meeting Link: https://meet.google.com/utx-gixg-tgv
Organized by: Prof Srividya M S and Dr. Anala M R, Department of CSE/ISE, RVCE
For 1st Sem MTech CSE and CNE students