1. Distributed and Cloud Computing – Unit 1 (22MCN1B2T)
RV College of Engineering
Go, change the world
By Prof. Srividya M S, Department of CSE, RVCE
2. Course Outcomes for Distributed and Cloud Computing
CO1 : Apply the distributed and cloud computing concepts to solve problems in
computing domain.
CO2 : Analyse various architectures, workflow models and algorithms used to
implement cloud and distributed systems.
CO3 : Design solutions using modern tools to solve applicable problems in
cloud and distributed systems.
CO4 : Demonstrate effective communication, report writing, and usage of
modern tools for implementing cloud and distributed systems applications.
3. UNIT-1:
Distributed System Models & Enabling technology:
Scalable computing over the internet,
Technologies for network-based system,
System models for distributed & cloud,
Software environments for distributed & Cloud,
Performance, security, and energy efficiency
4. SCALABLE COMPUTING OVER THE INTERNET
• Over the past 60 years, computing technology has undergone a series of platform
and environment changes.
• Evolutionary changes in machine architecture, operating system platform,
network connectivity, and application workload.
• Parallel and distributed computing systems use multiple computers to solve large-scale problems.
• Distributed computing has become data-intensive and network-centric.
• High-performance computing (HPC) and high-throughput computing (HTC) systems built with parallel and distributed computing technologies are the need of the hour.
6. The Platform Evolution
• Computer technology has gone through five generations of development, with each generation lasting from 10 to 20 years. Successive generations overlap by about 10 years.
• From 1950 to 1970, a handful of mainframes, including the IBM 360 and CDC 6400, were built.
• From 1960 to 1980, lower-cost minicomputers such as the DEC PDP 11 and VAX series became popular.
• From 1970 to 1990, we saw widespread use of personal computers built with VLSI microprocessors.
• From 1980 to 2000, massive numbers of portable computers and pervasive devices appeared in both wired and wireless applications.
• Since 1990, the use of both HPC and HTC systems hidden in clusters, grids, or Internet clouds has proliferated.
8. HPC and HTC systems
• On the HPC side,
• Supercomputers (massively parallel processors or MPPs)
• The cluster is often a collection of homogeneous compute nodes that are physically connected in close range
to one another.
• On the HTC side,
• Peer-to-peer (P2P) networks are formed for distributed file sharing and content delivery applications.
• P2P, cloud computing, and web service platforms are more focused on HTC applications
• There is a strategic change from an HPC paradigm to an HTC paradigm.
• The performance goal thus shifts to measure high throughput or the number of tasks completed per unit of time.
• HTC technology needs not only to improve batch processing speed, but also to address cost savings, energy savings, security, and reliability at many data and enterprise computing centers.
9. HPC versus HTC
• Stands for: HPC stands for High-Performance Computing; HTC stands for High-Throughput Computing.
• Definition: HPC is the type of computing that uses multiple computer processors to perform complex computations in parallel. HTC is a type of computing that executes a large number of simple, computationally independent tasks in parallel.
• Workload: HPC runs large-scale, complex, computationally intensive applications that need significant resources and memory. HTC runs a large number of small, independent tasks that do not require a large amount of memory or resources.
• Processing power: HPC is designed to provide maximum performance and speed for large tasks. HTC is designed to increase the number of tasks that can be completed in a given amount of time.
• Resource management: HPC uses job schedulers and resource managers to assign resources to processes. HTC uses distributed resource management.
• Fault tolerance: HPC systems have complex fault tolerance mechanisms to reduce the risk of data loss and corruption. In HTC systems, the failure of an individual task does not affect other running tasks.
• Scaling: HPC systems scale up (vertically) when a few large jobs run together. HTC systems scale out (horizontally) for simple tasks that require less computational speed.
• Applications: HPC is used in applications such as engineering design, weather forecasting, drug discovery, etc. HTC is used in applications such as bioinformatics, research applications, etc.
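Since HTC's performance goal is tasks completed per unit time, a small task farm makes the idea concrete. The following is a minimal, assumed sketch (not from the course text) that measures throughput over many small, independent tasks:

```python
# Illustrative HTC-style task farm: many small, independent tasks.
# Throughput = completed tasks / elapsed time (the HTC performance goal).
import time
from concurrent.futures import ProcessPoolExecutor

def small_task(n: int) -> int:
    """A simple, computationally independent task (an HTC workload)."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    tasks = [20_000] * 200                      # many small, independent tasks
    start = time.perf_counter()
    with ProcessPoolExecutor() as pool:         # a pool of worker processes
        results = list(pool.map(small_task, tasks))
    elapsed = time.perf_counter() - start
    print(f"throughput: {len(results) / elapsed:.1f} tasks/second")
```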
10. Three New Computing Paradigms
• The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor technologies has triggered the development of the Internet of Things (IoT).
• With the introduction of SOA, Web 2.0 services became available.
• Advances in virtualization have made the growth of Internet clouds possible as a new computing paradigm.
11. Computing Paradigm Distinctions
• In general, distributed computing is the opposite of centralized computing.
• The field of parallel computing overlaps with distributed computing to a great extent, and cloud computing overlaps with distributed, centralized, and parallel computing.
Centralized computing :
1. This is a computing paradigm by which all computer resources are centralized in one physical system.
2. All resources (processors, memory, and storage) are fully shared and tightly coupled within one integrated
OS.
3. Many data centers and supercomputers are centralized systems, but they are used in parallel, distributed,
and cloud computing applications
Parallel computing :
1. In parallel computing, all processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory.
2. Interprocessor communication is accomplished through shared memory or via message passing (both styles are sketched below).
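The two communication styles can be pictured in a few lines. This is a minimal, assumed sketch using Python's multiprocessing module (not any specific system from the text): one worker updates a shared-memory cell, the other sends a message over a queue.

```python
# Illustrative sketch: shared-memory vs. message-passing communication.
from multiprocessing import Process, Queue, Value

def shared_memory_worker(counter):
    with counter.get_lock():
        counter.value += 1          # tightly coupled: write to shared memory

def message_passing_worker(queue):
    queue.put("hello from worker")  # loosely coupled: send a message

if __name__ == "__main__":
    counter = Value("i", 0)         # shared-memory cell
    queue = Queue()                 # message channel
    p1 = Process(target=shared_memory_worker, args=(counter,))
    p2 = Process(target=message_passing_worker, args=(queue,))
    p1.start(); p2.start()
    p1.join(); p2.join()
    print("shared-memory value:", counter.value)
    print("received message:", queue.get())
```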
12. Computing Paradigm Distinctions
Distributed computing :
1. A distributed system consists of multiple autonomous computers, each having its own private memory,
communicating through a computer network.
2. Information exchange in a distributed system is accomplished through message passing.
Cloud computing :
1. An Internet cloud of resources can be either a centralized or a distributed computing system.
2. The cloud applies parallel or distributed computing, or both.
3. Clouds can be built with physical or virtualized resources over large data centers that are centralized or
distributed
13. Scalable Computing Trends and New Paradigms
Includes:
Degrees of Parallelism
Innovative Applications
The Trend toward Utility Computing
The Hype Cycle of New Technologies
Degrees of Parallelism
Fifty years ago, when hardware was bulky and expensive, most computers were designed in a bit-serial fashion. Bit-level parallelism (BLP) gradually converted bit-serial processing to word-level processing, and instruction-level parallelism (ILP) then allowed processors to execute multiple instructions simultaneously. Data-level parallelism (DLP) was made popular through SIMD (single instruction, multiple data) and vector machines using vector or array types of instructions. DLP requires even more hardware support and compiler assistance to work properly. A short DLP sketch follows below.
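As a concrete, assumed illustration (using NumPy, which is not mentioned in the text), a single vectorized statement applies one operation to many data elements at once, which is the SIMD idea behind DLP:

```python
# Illustrative DLP sketch: one operation applied to many data elements (SIMD style).
import numpy as np

a = np.arange(100_000, dtype=np.float64)
b = np.arange(100_000, dtype=np.float64)

# Serial style: one element per loop iteration.
c_serial = [x + y for x, y in zip(a, b)]

# Data-level parallel style: one vector operation over all elements,
# dispatched by NumPy to SIMD hardware where available.
c_vector = a + b

assert np.allclose(c_serial, c_vector)
```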
Innovative Applications
Both HPC and HTC systems desire transparency in many application aspects. For example, data access,
resource allocation, process location, concurrency in execution, job replication, and failure recovery should be
made transparent to both users and system management
15. The Trend toward Utility Computing
Utility computing focuses on a business model in which customers receive
computing resources from a paid service provider. All grid/cloud platforms are
regarded as utility service providers
16. The Trend toward Utility Computing
Major computing paradigms can be identified to facilitate the study of distributed systems and their applications. These paradigms share some common characteristics.
1. First, they are all ubiquitous in daily life. Reliability and scalability are two major design objectives in these computing models.
2. Second, they are aimed at autonomic operations that can be self-organized to support dynamic discovery.
17. The Hype Cycle of New Technologies
18. The Internet of Things and Cyber-Physical Systems
• The IoT refers to the networked interconnection of everyday objects, tools, devices, or computers.
• With the IPv6 protocol, 2^128 (about 3.4 × 10^38) IP addresses are available to distinguish all the objects on Earth, including all computers and pervasive devices.
• IoT researchers estimate that every human being will be surrounded by 1,000 to 5,000 objects.
• In the IoT era, all objects and devices are instrumented, interconnected, and interact with each other intelligently.
• Three communication patterns co-exist: H2H (human-to-human), H2T (human-to-thing), and T2T (thing-to-thing).
• A cyber-physical system (CPS) is the result of interaction between computational processes and the physical world.
• A CPS merges the "3C" technologies of computation, communication, and control into an intelligent closed feedback system between the physical world and the information world.
• CPS emphasizes exploration of virtual reality (VR) applications in the physical world.
19. TECHNOLOGIES FOR NETWORK-BASED SYSTEMS
Multicore CPUs and Multithreading Technologies
Advances in CPU Processors
20. Multicore CPU and Many-Core GPU Architectures
22. Memory, Storage, and Wide-Area Networking
• Memory chips have experienced a 4x increase in capacity every three years.
• Memory access time has not improved much in the past.
• Faster processor speed and larger memory capacity result in a wider gap between processors and memory.
• Disk storage capacity has grown by 7 orders of magnitude in 33 years.
23. System-Area Interconnects
• LAN typically is used to connect client
hosts to big servers.
• A storage area network (SAN) connects
servers to network storage such as disk
arrays.
• Network-attached storage (NAS) connects client hosts directly to the disk arrays.
24. Virtual Machines and Virtualization Middleware
• A virtual machine is an application environment installed on software that imitates dedicated hardware.
• A VM is a virtual environment that functions as a virtual computer system with its own CPU, memory, network interface, and storage, created on a physical hardware system (on- or off-premises).
• The end user has the same experience on a virtual machine as they would have on dedicated hardware.
25. Need for VM
• A conventional computer has a single OS image. This offers an inflexible architecture that tightly couples application software to a specific hardware platform.
• Some software that runs well on one machine may not be executable on another platform with a different instruction set under a fixed OS.
• Virtual machines (VMs) offer novel solutions to underutilized resources, application
inflexibility, software manageability, and security concerns in existing physical machines
• To build large clusters, grids, and clouds, we need to access large amounts of computing,
storage, and networking resources in a virtualized manner
• We need to aggregate those resources, and offer a single system image (SSI)
• A cloud of provisioned resources must rely on virtualization of processors, memory, and I/O
facilities dynamically
26. Virtual Machines and Virtualization Middleware
27. Physical Machine
a. The host machine is equipped with the physical hardware. An example is an x86 architecture desktop running its installed Windows OS.
28. Native VM
• The VM can be provisioned for any hardware system.
• The VM is built with virtual resources managed by a guest OS to run
a specific application.
• Between the VMs and the host platform, one needs to deploy a middleware layer called a virtual machine monitor (VMM) or hypervisor, which handles the bare-metal hardware (CPU, memory, and I/O) directly.
• The guest OS could be a Linux system and the hypervisor could be the Xen system.
• This hypervisor approach is also called a bare-metal VM.
29. Hosted VM
• The VMM runs in nonprivileged mode
• The host OS need not be modified
30. Dual-Mode VM
• Part of the VMM runs at the user level and another part runs at the
supervisor level.
• In this case, the host OS may have to be modified to some extent.
• Multiple VMs can be ported to a given hardware system to support
the virtualization process.
• The VM approach offers hardware independence of the OS and
applications.
• The user application running on its dedicated OS could be bundled
together as a virtual appliance that can be ported to any hardware
platform.
• The VM could run on an OS different from that of the host computer
31. VM Primitive Operations
1. The VMM provides the VM abstraction to the guest OS
2. With full virtualization, the VMM exports a VM abstraction identical to the physical machine
3. So that a standard OS such as Windows 2000 or Linux can run just as it would on the physical
hardware
4. Basic VM operations: VM multiplexing, suspension, provision, and migration in a distributed
computing environment.
32. VM Primitive Operations
a. First, the VMs can be multiplexed between hardware machines.
b. Second, a VM can be suspended and stored in stable storage
c. Third, a suspended VM can be resumed or provisioned to a new hardware platform
d. Finally, a VM can be migrated from one hardware platform to another (the four operations are sketched below)
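These four primitives can be pictured as a tiny state machine. The class below is purely hypothetical (the names `suspend`, `resume`, and `migrate` are illustrative, not a real hypervisor API):

```python
# Hypothetical sketch of the four VM primitive operations.
class VirtualMachine:
    def __init__(self, name: str, host: str):
        self.name, self.host, self.state = name, host, "running"

    def suspend(self):
        """(b) Suspend the VM and store its state in stable storage."""
        self.state = "suspended"

    def resume(self, host: str):
        """(c) Resume or provision a suspended VM on a new hardware platform."""
        self.host, self.state = host, "running"

    def migrate(self, host: str):
        """(d) Move the running VM to another platform without stopping it."""
        self.host = host

# (a) Multiplexing: several VMs share the same hardware machine.
vms = [VirtualMachine(f"vm{i}", host="host-A") for i in range(3)]
vms[0].suspend()
vms[0].resume(host="host-B")
vms[1].migrate(host="host-B")
print([(vm.name, vm.host, vm.state) for vm in vms])
```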
33. VM Primitive Operations
1. These VM operations enable a VM to be provisioned to any available hardware platform
2. They also enable flexibility in porting distributed application executions
3. VMware claimed that server utilization could be increased from its current 5–15 percent to 60–
80 percent
4. Multiple server functions can be consolidated on the same hardware platform to achieve higher
system efficiency.
34. VM Migration
• The process of moving a running VM or application between
different physical machines without disconnecting the client
or application
• Memory, storage, and network connectivity of the VM are transferred from the original host machine to the destination machine.
35. Data Center Virtualization for Cloud Computing
• Data Centers:
• A data center is designed based on a network of huge computing and storage devices.
• Data centers enable the handling and delivery of shared applications and data.
• Modern Data Centers:
• Data centers have shifted to virtual networks.
• A multicloud environment is connected across multiple data centers and public and private clouds.
36. Data Center Growth and Cost Breakdown
37. SYSTEM MODELS FOR DISTRIBUTED AND CLOUD COMPUTING
Distributed and cloud computing systems are built over a large number of autonomous
computer nodes.
These node machines are interconnected by SANs, LANs, or WANs in a hierarchical
manner
Massive systems are considered highly scalable, and can reach web-scale
connectivity, either physically or logically
Massive systems are classified into four groups: clusters, P2P networks, computing
grids, and Internet clouds over huge data centers
These machines work collectively, cooperatively, or collaboratively at various levels
39. From the application perspective, clusters are most popular in supercomputing applications.
In 2009, 417 of the Top 500 supercomputers were built with cluster architecture.
It is fair to say that clusters have laid the necessary foundation for building large-scale grids and
clouds.
P2P networks appeal most to business applications.
Potential advantages of cloud computing include its low cost and simplicity for both providers
and users.
40. Clusters of Cooperative Computers
Cluster Architecture:
• A typical server cluster is built around a low-latency, high-bandwidth interconnection network.
• The cluster is connected to the Internet via a virtual private network (VPN) gateway.
• The gateway IP address locates the cluster.
• The system image of a computer is decided by the way the OS manages the shared cluster resources.
• Most clusters have loosely coupled node computers.
• Most clusters have multiple system images as a result of having many autonomous nodes under different OS control
41. Single-System Image
An ideal cluster should merge multiple system images into a single-system image (SSI).
Cluster designers desire a cluster operating system or some middleware to support SSI at
various levels, including the sharing of CPUs, memory, and I/O across all cluster nodes.
An SSI is an illusion created by software or hardware that presents a collection of resources
as one integrated, powerful resource.
SSI makes the cluster appear like a single machine to the user.
A cluster with multiple system images is nothing but a collection of independent computers
42. Hardware, Software, and Middleware Support
Clusters exploring massive parallelism are commonly known as MPPs.
Almost all HPC clusters in the Top 500 list are also MPPs.
The computer nodes are interconnected by a high-bandwidth network (such as Gigabit Ethernet,
Myrinet, InfiniBand, etc.)
Special cluster middleware supports are needed to create SSI or high availability (HA).
Both sequential and parallel applications can run on the cluster, and special parallel environments
are needed to facilitate use of the cluster resources
Distributed memory has multiple images. Users may want all distributed memory to be shared by
all servers by forming distributed shared memory (DSM).
Many SSI features are expensive or difficult to achieve at various cluster operational levels
43. Major Cluster Design Issues
44. Computational Grids
A computing grid offers an infrastructure that couples computers, software/middleware, special instruments, and people and sensors together.
The grid is often constructed across LAN, WAN, or Internet backbone networks at a regional,
national, or global scale.
Enterprises or organizations present grids as integrated computing resources.
They can also be viewed as virtual platforms to support virtual organizations.
The computers used in a grid are primarily workstations, servers, clusters, and supercomputers.
Personal computers, laptops, and PDAs can be used as access devices to a grid system
45. A computational grid is built over multiple resource sites owned by different organizations.
The resource sites offer complementary computing resources, including workstations, large servers, a mesh of
processors, and Linux clusters to satisfy a chain of computational needs.
The grid is built across various IP broadband networks including LANs and WANs already used by enterprises or
organizations over the Internet.
At the server end, the grid is a network.
At the client end, we see wired or wireless terminal devices.
The grid integrates the computing, communication, contents, and transactions as rented services.
Enterprises and consumers form the user base, which then defines the usage trends and service characteristics.
46. Cloud Computing over the Internet
Gordon Bell, Jim Gray, and Alex Szalay [5] have advocated: “Computational science is changing
to be data-intensive. Supercomputers must be balanced systems, not just CPU farms but also
petascale I/O and networking arrays.”
In the future, working with large data sets will typically mean sending the computations
(programs) to the data, rather than copying the data to the workstations.
This reflects the trend in IT of moving computing and data from desktops to large data centers,
where there is on-demand provision of software, hardware, and data as a service.
This data explosion has promoted the idea of cloud computing.
IBM, a major player in cloud computing, has defined it as follows: “A cloud is a pool of virtualized
computer resources. A cloud can host a variety of different workloads, including batch-style
backend jobs and interactive and user-facing applications.”
47. Internet Clouds
A cloud allows workloads to be deployed and scaled out quickly through
rapid provisioning of virtual or physical machines.
The cloud supports redundant, self-recovering, highly scalable programming
models that allow workloads to recover from many unavoidable
hardware/software failures.
Finally, the cloud system should be able to monitor resource use in real time
to enable rebalancing of allocations when needed.
Virtualized resources from data centers form an Internet cloud, provisioned with hardware, software, storage, networks, and services for paid users to run their applications.
48. The cloud ecosystem must be designed to be secure, trustworthy, and dependable.
Some computer users think of the cloud as a centralized resource pool.
Others consider the cloud to be a server cluster which practices distributed computing
over all the servers used
Traditional systems have encountered several performance bottlenecks: constant
system maintenance, poor utilization, and increasing costs associated with
hardware/software upgrades.
Cloud computing as an on-demand computing paradigm resolves or relieves us from
these problems
49. Cloud Service Models: SaaS and PaaS
Software as a Service (SaaS):
• This refers to browser-initiated application software over thousands of paid cloud customers.
• The SaaS model applies to business processes, industry applications, customer relationship management (CRM), enterprise resource planning (ERP), human resources (HR), and collaborative applications.
• On the customer side, there is no upfront investment in servers or software licensing.
• On the provider side, costs are rather low, compared with conventional hosting of user applications.
Platform as a Service (PaaS):
• This model enables the user to deploy user-built applications onto a virtualized cloud platform.
• PaaS includes middleware, databases, development tools, and some runtime support such as Web 2.0 and
Java.
• The platform includes both hardware and software integrated with specific programming interfaces.
• The provider supplies the API and software tools (e.g., Java, Python, Web 2.0, .NET).
• The user is freed from managing the cloud infrastructure.
50. The Cloud Landscape
• Internet clouds offer four deployment modes: private, public, managed, and hybrid.
• These modes demand different levels of security.
• The different SLAs imply that the security responsibility is shared among all the cloud
providers, the cloud resource consumers, and the third party cloud-enabled software providers
• The following list highlights eight reasons to adopt the cloud for upgraded Internet applications and web services:
1. Desired location in areas with protected space and higher energy efficiency
2. Sharing of peak-load capacity among a large pool of users, improving overall utilization
3. Separation of infrastructure maintenance duties from domain-specific application
development
4. Significant reduction in cloud computing cost, compared with traditional computing
paradigms
5. Cloud computing programming and application development
6. Service and data discovery and content/service distribution
7. Privacy, security, copyright, and reliability issues
8. Service agreements, business models, and pricing policies
51. SOFTWARE ENVIRONMENTS FOR DISTRIBUTED SYSTEMS AND CLOUDS
Service-Oriented Architecture (SOA): These architectures build on the traditional seven Open
Systems Interconnection (OSI) layers that provide the base networking abstractions.
Layered Architecture for Web Services and Grids:
52. The entity interfaces correspond to the Web Services Description Language (WSDL), Java method, and
CORBA interface definition language (IDL) specifications in these example distributed systems.
These interfaces are linked with customized, high-level communication systems:
These communication systems support features including particular message patterns (such as Remote
Procedure Call or RPC), fault recovery, and specialized routing.
Often, these communication systems are built on message-oriented middleware (enterprise bus) infrastructure such as WebSphere MQ or the Java Message Service (JMS), which provide rich functionality and support virtualization of routing, senders, and recipients.
In the case of fault tolerance, the features in the Web Services Reliable Messaging (WSRM)
framework mimic the OSI layer capability (as in TCP fault tolerance) modified to match the different
abstractions (such as messages versus packets, virtualized addressing) at the entity levels.
53. Here, one might get several models with, for example, JNDI (Jini and Java Naming and Directory Interface) illustrating
different approaches within the Java distributed object model.
The CORBA Trading Service, UDDI (Universal Description, Discovery, and Integration), LDAP (Lightweight
Directory Access Protocol), and ebXML (Electronic Business using eXtensible Markup Language) are other examples
of discovery and information services.
Management services include service state and lifetime support; examples include the CORBA Life Cycle and
Persistent states, the different Enterprise JavaBeans models, Jini’s lifetime model, and a suite of web services
specifications
The distributed model has two critical advantages:
Higher performance and
Cleaner separation of software functions with clear software reuse and maintenance advantages
54. The Evolution of SOA
55. Service-oriented architecture (SOA) has evolved over the years. SOA applies to building grids,
clouds, grids of clouds, clouds of grids, clouds of clouds (also known as interclouds), and systems
of systems in general.
A large number of sensors provide data-collection services, denoted in the figure as SS (sensor
service).
A sensor can be a ZigBee device, a Bluetooth device, a WiFi access point, a personal computer, a GPS device, or a wireless phone, among other things.
Raw data is collected by sensor services.
All the SS devices interact with large or small computers, many forms of grids, databases, the
compute cloud, the storage cloud, the filter cloud, the discovery cloud, and so on.
Filter services (fs in the figure) are used to eliminate unwanted raw data, in order to respond to specific requests from the web, the grid, or web services (a small pipeline sketch follows below).
A collection of filter services forms a filter cloud.
SOA aims to search for, or sort out, the useful data from the massive amounts of raw data items.
Processing this data will generate useful information, and subsequently, the knowledge for our daily
use.
In fact, wisdom or intelligence is sorted out of large knowledge bases.
Finally, we make intelligent decisions based on both biological and machine wisdom
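The toy pipeline below mimics this flow under stated assumptions (function names like `sensor_service` and `filter_service` are illustrative, not part of any SOA standard): sensor services produce raw data, a filter service discards unwanted items, and a later stage turns the filtered data into summary information.

```python
# Illustrative sketch of the SOA data flow: sensor service -> filter service -> information.
import random

def sensor_service(n: int):
    """SS: produce raw readings (some of them out-of-range noise)."""
    return [random.uniform(-10, 120) for _ in range(n)]

def filter_service(raw, low=0.0, high=100.0):
    """fs: eliminate unwanted raw data before further processing."""
    return [r for r in raw if low <= r <= high]

def information_service(clean):
    """Turn filtered data into useful information (here, a simple summary)."""
    return {"count": len(clean), "mean": sum(clean) / len(clean)}

raw = sensor_service(1_000)
clean = filter_service(raw)
print(information_service(clean))
```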
56. Grids versus Clouds
• The boundary between grids and clouds has been getting blurred in recent years.
• In general, a grid system applies static resources, while a cloud emphasizes elastic
resources.
• For some researchers, the differences between grids and clouds are limited to dynamic resource allocation based on virtualization and autonomic computing.
• One can build a grid out of multiple clouds.
• This type of grid can do a better job than a pure cloud, because it can explicitly support
negotiated resource allocation.
• Thus one may end up building with a system of systems: such as a cloud of clouds, a
grid of clouds, or a cloud of grids, or inter-clouds as a basic SOA architecture
57. Trends toward Distributed Operating Systems
58. Parallel and Distributed Programming Models
• A transparent computing environment that separates user data, applications, the OS, and the hardware in time and space is an ideal model for cloud computing.
• In this section, we explore three programming models for distributed computing with expected scalable performance and application flexibility.
59. Parallel and Distributed Programming Models
• MPI is the most popular programming model for message-passing systems.
• Google’s MapReduce and BigTable are for effective use of resources from Internet clouds and data centers.
• Service clouds demand extending Hadoop, EC2, and S3 to facilitate distributed computing over distributed storage systems.
• Message-Passing Interface (MPI):
• This is the primary programming standard used to develop parallel and concurrent programs to run on a distributed system.
• MPI is essentially a library of subprograms that can be called from C or FORTRAN to write parallel programs running on a
distributed system.
• The idea is to embody clusters, grid systems, and P2P systems with upgraded web services and utility computing
applications.
• Besides MPI, distributed programming can also be supported with low-level primitives such as the Parallel Virtual Machine (PVM).
• MapReduce:
• This is a web programming model for scalable data processing on large clusters over large data sets
• The model is applied mainly in web-scale search and cloud computing applications.
• Hadoop: This offers a software platform that was originally developed by a Yahoo! group.
• The package enables users to write and run applications over vast amounts of distributed data.
• Users can easily scale Hadoop to store and process petabytes of data in the web space.
• Also, Hadoop is economical in that it comes with an open source version of MapReduce that minimizes overhead (a word-count sketch of the MapReduce pattern follows below)
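The snippet below is a single-machine sketch of the MapReduce idea (map, shuffle, reduce), not Hadoop's actual API; in a real cluster, the map and reduce calls would run in parallel across many nodes over distributed storage.

```python
# Single-machine sketch of the MapReduce word-count pattern.
from collections import defaultdict

def map_phase(document: str):
    """Map: emit (key, value) pairs -- here (word, 1)."""
    return [(word, 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all values by key across mapper outputs."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values independently (hence scalable)."""
    return {key: sum(values) for key, values in groups.items()}

documents = ["the cloud scales", "the grid couples resources", "the cloud"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
print(reduce_phase(shuffle_phase(pairs)))   # {'the': 3, 'cloud': 2, ...}
```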
62. PERFORMANCE, SECURITY, AND ENERGY EFFICIENCY
Performance Metrics:
CPU speed in MIPS and network bandwidth in Mbps are used to estimate processor and network performance.
In a distributed system, performance is attributed to a large number of factors.
System throughput is often measured in MIPS, Tflops (tera floating-point operations per second), or TPS (transactions per
second).
Other measures include job response time and network latency.
An interconnection network that has low latency and high bandwidth is preferred (see the worked example below).
System overhead is often attributed to OS boot time, compile time, I/O data rate, and the runtime support system used.
Other performance-related metrics include the QoS for Internet and web services; system availability and dependability; and
security resilience for system defense against network attacks.
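As a simple worked example (with assumed numbers, not from the text), end-to-end message time is often modeled as latency plus message size divided by bandwidth, which is why both low latency and high bandwidth matter:

```python
# Simple worked model: transfer_time = latency + message_size / bandwidth.
def transfer_time(size_bits: float, latency_s: float, bandwidth_bps: float) -> float:
    return latency_s + size_bits / bandwidth_bps

# A 1 MB message over an interconnect with 100 us latency and 10 Gbps bandwidth.
size = 8 * 1_000_000                                         # bits
print(f"{transfer_time(size, 100e-6, 10e9) * 1e3:.3f} ms")   # about 0.9 ms
```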
63. Dimensions of Scalability
Users want to have a distributed system that can achieve scalable performance.
Any resource upgrade in a system should be backward compatible with existing hardware and software
resources.
Overdesign may not be cost-effective.
System scaling can increase or decrease resources depending on many practical factors.
The following dimensions of scalability are characterized in parallel and distributed systems:
1. Size scalability
This refers to achieving higher performance or more functionality by increasing the machine size.
The word “size” refers to adding processors, cache, memory, storage, or I/O channels.
One way to determine size scalability is simply to count the number of processors installed.
Not all parallel computer or distributed architectures are equally size scalable.
For example, the IBM S2 was scaled up to 512 processors in 1997. But in 2008, the IBM BlueGene/L system
scaled up to 65,000 processors.
2. Software scalability
• This refers to upgrades in the OS or compilers, adding mathematical and engineering libraries, porting new
application software, and installing more user-friendly programming environments.
• Some software upgrades may not work with large system configurations.
• Testing and fine-tuning of new software on larger systems is a nontrivial job.
64. 3. Application scalability
• This refers to matching problem size scalability with machine size scalability.
• Problem size affects the size of the data set or the workload increase.
• Instead of increasing machine size, users can enlarge the problem size to enhance system efficiency
or cost-effectiveness.
4. Technology scalability
• This refers to a system that can adapt to changes in building technologies, such as the component
and networking technologies.
• When scaling a system design with new technology one must consider three aspects: time, space,
and heterogeneity.
(1)Time refers to generation scalability. When changing to new-generation processors, one must
consider the impact to the motherboard, power supply, packaging and cooling, and so forth.
(2)Space is related to packaging and energy concerns. Technology scalability demands harmony and
portability among suppliers.
(3)Heterogeneity refers to the use of hardware components or software packages from different
vendors. Heterogeneity may limit the scalability.
65. Scalability versus OS Image Count
• Scalable performance implies that the system can achieve higher speed by adding more processors or servers,
enlarging the physical node’s memory size, extending the disk capacity, or adding more I/O channels.
• The OS image is counted by the number of independent OS images observed in a cluster, grid, P2P network, or
the cloud.
66. Scalability versus OS Image Count
• An SMP (symmetric multiprocessor) server has a single system image, which could be a single node in a large cluster.
• The scalability of SMP systems is constrained primarily by packaging and the system interconnect used.
• NUMA (nonuniform memory access) machines are often made out of SMP nodes with distributed, shared memory.
• A NUMA machine can run with multiple operating systems, and can scale to a few thousand processors communicating
with the MPI library. For example, a NUMA machine may have 2,048 processors running 32 SMP operating systems,
resulting in 32 OS images in the 2,048-processor NUMA system.
• The cluster nodes can be either SMP servers or high-end machines that are loosely coupled together. Therefore, clusters
have much higher scalability than NUMA machines.
• The number of OS images in a cluster is based on the cluster nodes concurrently in use.
• The cloud could be a virtualized cluster. As of 2010, the largest cloud was able to scale up to a few thousand VMs
• Keeping in mind that many cluster nodes are SMP or multicore servers, the total number of processors or cores in a
cluster system is one or two orders of magnitude greater than the number of OS images running in the cluster.
• The grid node could be a server cluster, or a mainframe, or a supercomputer, or an MPP. Therefore, the number of OS
images in a large grid structure could be hundreds or thousands fewer than the total number of processors in the grid.
• A P2P network can easily scale to millions of independent peer nodes, essentially desktop machines
67. Amdahl’s Law
Consider a program whose total execution time on a uniprocessor workstation is T minutes.
The program has been parallelized or partitioned for parallel execution on a cluster of many processing nodes.
Assume that a fraction α of the code must be executed sequentially, called the sequential bottleneck.
Therefore, (1 − α) of the code can be compiled for parallel execution by n processors.
The total execution time of the program is αT + (1 − α)T/n, where the first term is the sequential execution time on a single processor and the second term is the parallel execution time on n processing nodes. Amdahl's Law states that the speedup factor of using the n-processor system over a single processor is:
S = T / [αT + (1 − α)T/n] = 1 / [α + (1 − α)/n]
The maximum speedup of n is achieved only if the sequential bottleneck α is reduced to zero or the code is
fully parallelizable with α = 0.
As the cluster becomes sufficiently large, that is, n → ∞, S approaches 1/α; surprisingly, this upper bound is independent of the cluster size n.
Amdahl’s law teaches us that we should make the sequential bottleneck as small as possible. Increasing the
cluster size alone may not result in a good speedup in this case
To achieve higher efficiency when using a large cluster, we must consider scaling the problem size to match the cluster capability. This leads to the speedup law proposed by John Gustafson, referred to as scaled-workload speedup (both laws are illustrated in the sketch below).
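A small worked example makes the bound concrete; the Gustafson formula used here, S' = α + (1 − α)n, is the standard scaled-workload form (the α value is an assumed illustration):

```python
# Worked example of Amdahl's Law and Gustafson's scaled-workload speedup.
def amdahl_speedup(alpha: float, n: int) -> float:
    """S = 1 / (alpha + (1 - alpha)/n); bounded above by 1/alpha as n grows."""
    return 1.0 / (alpha + (1.0 - alpha) / n)

def gustafson_speedup(alpha: float, n: int) -> float:
    """Scaled-workload speedup: S' = alpha + (1 - alpha) * n."""
    return alpha + (1.0 - alpha) * n

alpha = 0.05                      # assume a 5% sequential bottleneck
for n in (16, 256, 4096):
    print(f"n={n:5d}  Amdahl={amdahl_speedup(alpha, n):6.1f}  "
          f"Gustafson={gustafson_speedup(alpha, n):8.1f}")
print("Amdahl upper bound 1/alpha =", 1 / alpha)   # 20.0, independent of n
```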
68. Fault Tolerance and System Availability
• System availability: HA (high availability) is desired in all clusters, grids, P2P networks, and cloud systems.
• A system is highly available if it has a long mean time to failure (MTTF) and a short mean time to repair (MTTR).
• System availability is formally defined as: Availability = MTTF / (MTTF + MTTR).
• The rule of thumb is to design a dependable computing system with no single point of failure.
• Adding hardware redundancy, increasing component reliability, and designing for testability will help to
enhance system availability and dependability.
• Both SMP and MPP are very vulnerable with centralized resources under one OS.
• NUMA machines have improved in availability due to the use of multiple OSes.
• Most clusters are designed to have HA with failover capability.
• Clusters, clouds, and grids have decreasing availability as the system increases in size.
• A P2P file-sharing network has the highest aggregation of client machines.
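Plugging assumed numbers into the availability formula shows how MTTF and MTTR trade off:

```python
# Worked example: Availability = MTTF / (MTTF + MTTR).
def availability(mttf_hours: float, mttr_hours: float) -> float:
    return mttf_hours / (mttf_hours + mttr_hours)

# Assume a failure on average every 1,000 hours, repaired in 2 hours.
a = availability(1000, 2)
print(f"availability = {a:.4%}")                                   # 99.8004%
print(f"expected downtime per year = {(1 - a) * 8760:.1f} hours")  # ~17.5
```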
69. Network Threats and Data Integrity
70. Energy Efficiency in Distributed Computing
71. Department of Computer Science and Engineering
Organizing an Invited Talk on Distributed Systems with AWS
By Mr Mohammad Hannan, SDE-2, CISCO Systems, India
On 1st June 2023, Thursday, 9.30 am to 10.30 am
Meeting Link: https://meet.google.com/utx-gixg-tgv
Organized by: Prof Srividya M S and Dr. Anala M R, Department of CSE/ISE, RVCE
For 1st Sem MTech CSE and CNE students