2. Chapter 1 assesses the evolutionary changes in computing and IT trends over the past 30 years, driven by applications with variable workloads and large data sets. We study both high-performance computing (HPC) systems for scientific computing and high-throughput computing (HTC) systems for business computing.
We examine clusters/MPP, grids, P2P networks, and Internet clouds.
These systems are distinguished by their platform architectures, OS platforms, processing algorithms, communication protocols, security demands, and service models.
The study emphasizes scalability, performance, availability, security, energy efficiency, workload outsourcing, and data center protection.
3. SCALABLE COMPUTING OVER THE INTERNET
The Data Deluge Enabling New Challenges
4. From Desktop/HPC/Grids to Internet Clouds in 30 Years
◼ HPC has moved from centralized supercomputers to geographically distributed desktops, desksides, clusters, and grids, and on to clouds, over the last 30 years.
◼ R&D efforts on HPC, clusters, grids, P2P, and virtual machines have laid the foundation of cloud computing, which has been greatly advocated since 2007.
◼ Computing infrastructure is being located in areas with lower costs in hardware, software, datasets, space, and power requirements – moving from desktop computing to data-center-based computing.
6. i. The Age of Internet Computing
⚫ Billions of people use the Internet every day. As a result, supercomputer sites and large data centers must provide high-performance computing services to huge numbers of Internet users concurrently.
⚫ HPC/HTC
⚫ We have to upgrade data centers using fast servers, storage systems, and high-bandwidth networks. The purpose is to advance network-based computing and web services with the emerging new technologies.
7. ii. The Platform Evolution
⚫ Computer technology has gone through five generations of development:
⚫ 1950 to 1970: a handful of mainframes, including the IBM 360 and CDC 6400, were built to satisfy the demands of large businesses and government organizations.
⚫ 1960 to 1980: lower-cost minicomputers such as the DEC PDP 11 and VAX series became popular among small businesses and on college campuses.
⚫ 1970 to 1990: we saw widespread use of personal computers built with VLSI microprocessors.
⚫ 1980 to 2000: massive numbers of portable computers and pervasive devices appeared in both wired and wireless applications.
⚫ Since 1990: the use of both HPC and HTC systems hidden in clusters, grids, or Internet clouds.
9. ⚫ High-performance computing (HPC) is the use of parallel processing for running advanced application programs efficiently, reliably, and quickly. The term applies especially to systems that function above a teraflop, or 10^12 floating-point operations per second.
⚫ High-throughput computing (HTC) is a computer science term describing the use of many computing resources over long periods of time to accomplish a computational task.
⚫ The HTC paradigm pays more attention to high-flux computing. The main application of high-flux computing is Internet searches and web services used by millions or more users simultaneously.
⚫ The performance goal thus shifts to high throughput, measured as the number of tasks completed per unit of time (the two metrics are contrasted in the sketch below).
⚫ HTC technology needs not only to improve batch processing speed, but also to address the acute problems of cost, energy savings, security, and reliability at many data and enterprise computing centers.
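To make the two performance views concrete, here is a minimal, hypothetical sketch (the task sizes and counts are arbitrary, not from the text): it times a batch of independent jobs and reports tasks completed per second, the HTC metric, alongside floating-point operations per second for a single numerical kernel, the HPC metric.

import time
from concurrent.futures import ProcessPoolExecutor

def small_job(n):
    # A stand-in for one independent HTC task (e.g., serving a search query).
    return sum(i * 0.5 for i in range(n))

def numeric_kernel(flops):
    # A stand-in for an HPC kernel: roughly `flops` floating-point operations.
    acc = 0.0
    for i in range(flops // 2):
        acc += i * 1.000001      # one multiply + one add per iteration
    return acc

if __name__ == "__main__":
    # HTC view: how many independent tasks finish per second?
    tasks = 200
    start = time.perf_counter()
    with ProcessPoolExecutor() as pool:
        list(pool.map(small_job, [10_000] * tasks))
    elapsed = time.perf_counter() - start
    print(f"HTC metric: {tasks / elapsed:,.1f} tasks/second")

    # HPC view: how many floating-point operations per second for one kernel?
    flops = 2_000_000
    start = time.perf_counter()
    numeric_kernel(flops)
    elapsed = time.perf_counter() - start
    print(f"HPC metric: {flops / elapsed / 1e6:,.1f} MFLOPS (pure Python, far below a teraflop)")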
10. iii. Three New Computing Paradigms
⚫ With the introduction of SOA, Web 2.0 services become available.
⚫ Advances in virtualization make it possible to see the growth of Internet clouds as a new computing paradigm.
⚫ The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor technologies has triggered the development of the Internet of Things (IoT).
11. iv. Computing Paradigm Distinctions
In general, distributed computing is the opposite of centralized computing. The field of parallel computing overlaps with distributed computing to a great extent, and cloud computing overlaps with distributed, centralized, and parallel computing.
i. Centralized computing: a paradigm in which all computer resources are centralized in one physical system. All resources (processors, memory, and storage) are fully shared and tightly coupled within one integrated OS. Many data centers and supercomputers are centralized systems, but they are used in parallel, distributed, and cloud computing applications.
12. ii. Parallel computing: here, all processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory. Interprocessor communication is accomplished through shared memory or via message passing (both styles are sketched below).
A computer system capable of parallel computing is commonly known as a parallel computer. Programs running on a parallel computer are called parallel programs. The process of writing parallel programs is often referred to as parallel programming.
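As a minimal sketch of the two communication styles named above (an illustrative example only; the worker functions and values are invented), the first worker below communicates through shared memory, while the second sends its result back over a queue, i.e., by message passing.

from multiprocessing import Process, Queue, Value

def shared_memory_worker(counter):
    # Communicate by writing into memory that the parent process also sees.
    with counter.get_lock():
        counter.value += 10

def message_passing_worker(queue):
    # Communicate by sending a message; no memory is shared.
    queue.put(10)

if __name__ == "__main__":
    counter = Value("i", 0)
    queue = Queue()

    procs = [Process(target=shared_memory_worker, args=(counter,)),
             Process(target=message_passing_worker, args=(queue,))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print("via shared memory  :", counter.value)   # 10
    print("via message passing:", queue.get())     # 10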
13. iii. Distributed computing: the field of computer science/engineering that studies distributed systems.
A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network.
Information exchange in a distributed system is accomplished through message passing (see the socket sketch below).
A computer program that runs in a distributed system is known as a distributed program. The process of writing distributed programs is referred to as distributed programming.
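To make message passing between autonomous computers concrete, here is a minimal sketch (illustrative only; the loopback address, port, and message text are arbitrary) in which two processes, each with its own private memory, exchange information over a TCP socket instead of shared memory.

import socket
import threading
import time

HOST, PORT = "127.0.0.1", 50007   # arbitrary local address for the demo

def server():
    # One autonomous node: receives a message and replies over the network.
    with socket.create_server((HOST, PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024).decode()
            conn.sendall(f"echo: {request}".encode())

def client():
    # A second node: no shared memory, only messages over the socket.
    with socket.create_connection((HOST, PORT)) as sock:
        sock.sendall(b"hello from the client node")
        print(sock.recv(1024).decode())

if __name__ == "__main__":
    t = threading.Thread(target=server, daemon=True)
    t.start()
    time.sleep(0.2)   # crude startup synchronization for the demo
    client()
    t.join()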
14. iv. Cloud computing : An Internet cloud of resources
can be either a centralized or a distributed
computing system.
⚫ The cloud applies parallel or
distributed computing, or both.
⚫ Clouds can be built with physical or
virtualized resources over large data
centers that are centralized or distributed.
⚫ Some authors consider cloud computing to be a form of utility computing or service computing.
⚫ Some in the high-tech community prefer the terms concurrent computing or concurrent programming. These terms typically refer to the union of parallel computing and distributed computing.
15. ⚫ Ubiquitous computing refers to computing with pervasive devices at any place and time, using wired or wireless communication.
⚫ The Internet of Things (IoT) is a
networked connection of everyday objects
including computers, sensors, humans, etc.
⚫ The IoT is supported by Internet clouds to
achieve ubiquitous computing with any object at
any place and time.
⚫ Finally, the term Internet computing is even broader and covers all computing paradigms over the Internet.
16. v. Distributed System Families
⚫ Technologies used for building P2P networks and networks of clusters have been consolidated into many national projects designed to establish wide-area computing infrastructures, known as computational grids or data grids.
⚫ Internet clouds are the result of moving desktop computing to service-oriented computing using server clusters and huge databases at data centers.
⚫ In October 2010, the highest-performing cluster machine was built in China: the Tianhe-1A system, with 86,016 CPU processor cores and 3,211,264 GPU cores. The largest computational grid connects up to hundreds of server clusters.
<A graphics processing unit (GPU), also occasionally called a visual processing unit (VPU), is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display.>
17. ⚫ In the future, both HPC and HTC systems will
demand multicore or many-core processors that
can handle large numbers of computing threads
per core.
⚫ Both HPC and HTC systems emphasize
parallelism and distributed computing.
⚫ Future HPC and HTC systems must be able to
satisfy this huge demand in computing power in
terms of throughput, efficiency, scalability, and
reliability.
⚫ System efficiency is determined by speed, programming, and energy factors (i.e., throughput per watt of energy consumed).
18. Meeting these goals requires satisfying the following design objectives:
⚫ Efficiency measures the utilization rate of resources in an
execution model by exploiting massive parallelism in
HPC. For HTC, efficiency is more closely related
to job throughput, data access, storage, and power
efficiency.
⚫ Dependability measures the reliability and self-
management from the chip to the system and
application levels. The purpose is to provide high-
throughput service with Quality of Service (QoS)
assurance, even under failure conditions.
⚫ Adaptation in the programming model measures the ability
to support billions of job requests over massive data
sets and virtualized cloud resources under various
workload and service models.
⚫ Flexibility in application deployment measures the ability of
distributed systems to run well in both HPC (science
and engineering) and HTC (business) applications.
19. Scalable Computing Trends and New
Paradigms
⚫ Degrees of Parallelism
When hardware was bulky and expensive, most computers were designed in a bit-serial fashion.
Bit-level parallelism (BLP) converts bit-serial processing to word-level processing gradually. Over the years, users graduated from 4-bit microprocessors to 8-, 16-, 32-, and 64-bit CPUs.
This led us to the next wave of improvement, known as instruction-level parallelism (ILP), in which the processor executes multiple instructions simultaneously rather than only one instruction at a time.
For the past 30 years, we have practiced ILP through pipelining, superscalar computing, VLIW (very long instruction word) architectures, and multithreading. ILP requires branch prediction, dynamic scheduling, speculation, and compiler support to work efficiently.
20. Data-level parallelism (DLP) was made popular through SIMD (single instruction, multiple data) and vector machines using vector or array types of instructions. DLP requires even more hardware support and compiler assistance to work properly.
Ever since the introduction of multicore processors and chip multiprocessors (CMPs), we have been exploring task-level parallelism (TLP).
A modern processor explores all of the aforementioned parallelism types. In fact, BLP, ILP, and DLP are well supported by advances in hardware and compilers. However, TLP is far from being very successful due to difficulty in programming and compilation of code for efficient execution on multicore CMPs (see the sketch below).
As we move from parallel processing to distributed processing, we will see an increase in computing granularity to job-level parallelism (JLP). It is fair to say that coarse-grain parallelism is built on top of fine-grain parallelism.
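The sketch below (an illustrative example, not from the text) contrasts two of these levels in Python: a data-parallel step that applies one operation uniformly across a whole data set (the DLP/SIMD idea), and a task-parallel step that runs independent jobs concurrently in a process pool (the TLP idea).

from concurrent.futures import ProcessPoolExecutor

def scale(values, factor):
    # DLP in spirit: one operation applied uniformly across a data set.
    return [v * factor for v in values]

def independent_job(job_id):
    # TLP: a self-contained task that can run concurrently with others.
    return job_id, sum(range(100_000))

if __name__ == "__main__":
    # Data-level parallelism (conceptually what SIMD/vector hardware does).
    print(scale([1, 2, 3, 4], 10))

    # Task-level parallelism: independent jobs scheduled onto multiple cores.
    with ProcessPoolExecutor() as pool:
        for job_id, result in pool.map(independent_job, range(4)):
            print(f"job {job_id} -> {result}")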
21. ⚫ Innovative Applications
A few key applications have driven the development of parallel and distributed systems over the years. These applications spread across many important domains in science, engineering, business, education, health care, traffic control, Internet and web services, military, and government.
Almost all applications demand computing economics, web-scale data collection, system reliability, and scalable performance. For example, distributed transaction processing is often practiced in the banking and finance industry. Transactions represent 90 percent of the existing market for reliable banking systems.
24. Disillusionment: a feeling of disappointment resulting from the discovery that something is not as good as one believed it to be.
Inflated: excessively or unreasonably high.
Enlighten: greater knowledge and understanding about a subject or situation.
Trough: a low point; a short period of low activity, low prices, etc.
Cyber-Physical Systems
A cyber-physical system (CPS) is the result of interaction between computational processes and the physical world. A CPS integrates “cyber” (heterogeneous, asynchronous) with “physical” (concurrent and information-dense) objects. A CPS merges the “3C” technologies of computation, communication, and control into an intelligent closed feedback system between the physical world and the information world, a concept which is actively explored in the United States.
27. Multicore CPU and Many-Core GPU Architectures
⚫ Multicore CPUs may increase from the tens of cores today to hundreds or more in the future.
⚫ But the CPU has reached its limit in terms of exploiting massive DLP due to the aforementioned memory wall problem.
⚫ This has triggered the development of many-core GPUs with hundreds or more thin cores.
⚫ IA-32 and IA-64 instruction set architectures are built into commercial CPUs.
⚫ Now, x-86 processors have been extended to serve HPC and HTC systems in some high-end server processors.
⚫ Many RISC processors have been replaced with multicore x-86 processors and many-core GPUs in the Top 500 systems.
⚫ This trend indicates that x-86 upgrades will dominate in data centers and supercomputers.
⚫ The GPU has also been applied in large clusters to build supercomputers in MPPs.
⚫ In the future, processors are expected to house both fat CPU cores and thin GPU cores on the same chip.
30. ⚫ The superscalar processor is single-threaded with four functional units. Each of the three multithreaded processors is four-way multithreaded over four functional data paths. In the dual-core processor, assume two processing cores, each a single-threaded two-way superscalar processor.
⚫ Instructions from different threads are distinguished by specific shading patterns for instructions from five independent threads.
⚫ Typical instruction scheduling patterns are shown here.
⚫ Only instructions from the same thread are executed in a superscalar processor.
⚫ Fine-grain multithreading switches the execution of instructions from different threads per cycle.
⚫ Coarse-grain multithreading executes many instructions from the same thread for quite a few cycles before switching to another thread.
⚫ The multicore CMP executes instructions from different threads on its separate cores (a small scheduling simulation follows below).
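A tiny simulation (illustrative only; the thread names, cycle count, and block size are made up) makes the two switching policies concrete: fine-grain multithreading rotates to a different thread every cycle, while coarse-grain multithreading stays with one thread for several cycles before switching.

from itertools import cycle

THREADS = ["T1", "T2", "T3", "T4"]   # four hardware thread contexts (assumed)
CYCLES = 12

def fine_grain(threads, cycles):
    # Switch to a different thread every cycle (round-robin).
    rotation = cycle(threads)
    return [next(rotation) for _ in range(cycles)]

def coarse_grain(threads, cycles, block=3):
    # Issue from the same thread for `block` cycles before switching.
    schedule = []
    rotation = cycle(threads)
    while len(schedule) < cycles:
        schedule.extend([next(rotation)] * block)
    return schedule[:cycles]

if __name__ == "__main__":
    print("fine-grain  :", " ".join(fine_grain(THREADS, CYCLES)))
    print("coarse-grain:", " ".join(coarse_grain(THREADS, CYCLES)))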
31. GPU Computing to Exascale and Beyond
⚫ A GPU is a graphics coprocessor or accelerator mounted
on a computer’s graphics card or video card.
⚫ A GPU offloads the CPU from tedious graphics
tasks in video editing applications.
⚫ The world’s first GPU, the GeForce 256, was
marketed by NVIDIA in 1999.
⚫ These GPU chips can process a minimum of 10 million
polygons per second, and are used in nearly
every computer on the market today.
⚫ Some GPU features were also integrated into certain
CPUs.
⚫ Traditional CPUs are structured with only a few cores.
For example, the Xeon X5670 CPU has six
cores. However, a modern GPU chip can be built with
hundreds of processing cores.
32. How GPUs Work
⚫ Early GPUs functioned as coprocessors attached to the CPU.
⚫ Today, the NVIDIA GPU has been upgraded to 128 cores on a single chip.
⚫ Furthermore, each core on a GPU can handle eight threads of instructions.
⚫ This translates to having up to 1,024 threads executed concurrently on a single GPU.
⚫ This is true massive parallelism, compared to only a few threads that can be handled by a conventional CPU.
⚫ Modern GPUs are not restricted to accelerated graphics or video coding.
⚫ They are used in HPC systems to power supercomputers with massive parallelism at the multicore and multithreading levels.
⚫ GPUs are designed to handle large numbers of floating-point operations in parallel.
34. GPU Programming Model
⚫ The CPU instructs the GPU to perform
massive data processing.
⚫ The bandwidth must be matched between
the on-board main memory and the on-chip
GPU memory.
⚫ This process is carried out in NVIDIA’s CUDA programming model using the GeForce 8800, Tesla, or Fermi GPUs (see the sketch below).
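As a hedged illustration of this host/device model, the sketch below uses Numba's CUDA bindings: the CPU allocates data, copies it to GPU memory, launches a kernel across many GPU threads, and copies the results back. It assumes Numba with CUDA support, NumPy, and an NVIDIA GPU are available; the kernel, array size, and launch configuration are arbitrary examples, not NVIDIA's reference code.

# A minimal sketch of the CPU-instructs-GPU flow, assuming Numba (with CUDA
# support), NumPy, and an NVIDIA GPU are installed; otherwise read it as
# pseudocode for the host/device division of labor.
import numpy as np
from numba import cuda

@cuda.jit
def scale_kernel(data, factor, out):
    # Each GPU thread processes one element of the array.
    i = cuda.grid(1)
    if i < data.size:
        out[i] = data[i] * factor

if __name__ == "__main__":
    host_data = np.arange(1_000_000, dtype=np.float32)

    # The CPU explicitly moves data across the CPU-GPU memory boundary;
    # this transfer is where the bandwidth matching mentioned above matters.
    device_data = cuda.to_device(host_data)
    device_out = cuda.device_array_like(device_data)

    threads_per_block = 256
    blocks = (host_data.size + threads_per_block - 1) // threads_per_block
    scale_kernel[blocks, threads_per_block](device_data, 2.0, device_out)

    result = device_out.copy_to_host()   # copy results back to main memory
    print(result[:5])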
35. ⚫ Figure 1.8 shows the architecture of the
Fermi GPU, a next-generation GPU from
NVIDIA. This is a streaming multiprocessor
(SM) module.
⚫ Multiple SMs can be built on a single GPU
chip. The Fermi chip has 16 SMs implemented
with 3 billion transistors.
⚫ Each SM comprises up to 512
streaming processors (SPs), known as CUDA
cores.
⚫ In November 2010, three of the five fastest supercomputers in the world (the Tianhe-1A, Nebulae, and Tsubame) used large numbers of GPU chips to accelerate floating-point computations.
37. ⚫ There are 32 CUDA cores per SM. Only one SM is shown
in Figure 1.8.
⚫ Each CUDA core has a simple pipelined integer ALU and
an FPU that can be used in parallel.
⚫ Each SM has 16 load/store units allowing source and
destination addresses to be calculated for 16 threads
per clock.
⚫ There are four special function units (SFUs) for executing
transcendental instructions.
⚫ All functional units and CUDA cores are interconnected by
an NoC (network on chip) to a large number of SRAM
banks (L2 caches).
⚫ Each SM has a 64 KB L1 cache. The 768 KB unified L2
cache is shared by all SMs and serves all load, store, and
texture operations.
⚫ Memory controllers are used to connect to 6 GB of off-chip DRAMs.
38. ⚫ The SM schedules threads in groups of 32
parallel threads called warps.
⚫ In total, 256/512 FMA (fused multiply and
add) operations can be done in parallel to
produce 32/64-bit floating-point results.
⚫ The 512 CUDA cores in an SM can
work in parallel to deliver up to 515 Gflops of
double- precision results, if fully utilized.
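A back-of-envelope check shows where a figure of this magnitude comes from: FMAs per cycle × 2 floating-point operations per FMA × clock rate. Only the 256-FMA and 515-Gflops numbers come from the text; the roughly 1 GHz clock below is an assumption for illustration.

# Hedged back-of-envelope: peak double-precision throughput of one Fermi-class GPU.
fma_per_cycle = 256          # double-precision FMA operations per cycle (from the text)
flops_per_fma = 2            # a fused multiply-add counts as two floating-point ops
clock_ghz = 1.0              # assumed shader clock for illustration (~1 GHz)

peak_gflops = fma_per_cycle * flops_per_fma * clock_ghz
print(f"~{peak_gflops:.0f} Gflops double precision")   # close to the quoted 515 Gflops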
40. Memory, Storage, and Wide-Area Networking
⚫ Memory Technology
Memory access time did not improve much in the past. In fact, the memory wall problem is getting worse as the processor gets faster.
For hard drives, capacity increased from 260 MB in 1981 to 250 GB in 2004. The Seagate Barracuda XT hard drive reached 3 TB in 2011. This represents an approximately 10x increase in capacity every eight years (a quick check of this rate follows below).
⚫ Disks and Storage Technology
The rapid growth of flash memory and solid-state drives (SSDs) also impacts the future of HPC and HTC systems. A typical SSD can handle 300,000 to 1 million write cycles per block.
Eventually, power consumption, cooling, and packaging will limit large system development. Power increases linearly with respect to clock frequency.
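A quick arithmetic check of the growth rate implied by the disk capacities above (a minimal sketch; binary unit prefixes are assumed for the conversion):

import math

# Check the "~10x capacity every eight years" claim from the disk figures above.
capacity_1981_mb = 260
capacity_2011_mb = 3 * 1024 * 1024    # 3 TB expressed in MB (binary units assumed)
years = 2011 - 1981

growth = capacity_2011_mb / capacity_1981_mb
years_per_10x = years / math.log10(growth)
print(f"total growth ≈ {growth:,.0f}x over {years} years")
print(f"≈ 10x every {years_per_10x:.1f} years")   # roughly the quoted eight years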
43. Wide-Area Networking
⚫ Rapid growth of Ethernet bandwidth: from 10 Mbps in 1979 to 1 Gbps in 1999, and 40 to 100 GE (Gigabit Ethernet) in 2011. It has been speculated that 1 Tbps network links will become available by 2013.
⚫ An increase factor of two per year in network performance was reported, which is faster than Moore’s law of CPU speed doubling every 18 months. The implication is that more computers will be used concurrently in the future.
⚫ High-bandwidth networking increases the capability of building massively distributed systems.
45. ⚫ Virtual machines (VMs) offer novel
solutions to underutilized resources, application
inflexibility, software manageability, and security
concerns in existing physical machines.
⚫ Today, to build large clusters, grids, and
clouds, we need to access large amounts of
computing, storage, and networking
resources in a virtualized manner.
⚫ We need to aggregate those resources,
and hopefully, offer a single system image.
⚫ In particular, a cloud of provisioned resources must rely on the virtualization of processors, memory, and I/O facilities dynamically.
47. ⚫ The VM is built with virtual resources managed by a
guest OS to run a specific application. Between the
VMs and the host platform, one needs to deploy a
middleware layer called a virtual machine monitor
(VMM).
⚫ Figure 1.12(b) shows a native VM installed with the use of a VMM called a hypervisor running in privileged mode. For example, the hardware has the x-86 architecture running the Windows system; the guest OS could be a Linux system, and the hypervisor is the XEN system developed at Cambridge University. This hypervisor approach is also called a bare-metal VM, because the hypervisor handles the bare hardware (CPU, memory, and I/O) directly.
⚫ Another architecture is the host VM shown in Figure 1.12(c). Here the VMM runs in nonprivileged mode. The host OS need not be modified.
49. ⚫ First, the VMs can be multiplexed between hardware machines, as shown in Figure 1.13(a).
⚫ Second, a VM can be suspended and stored in stable storage, as shown in Figure 1.13(b).
⚫ Third, a suspended VM can be resumed or provisioned to a new hardware platform, as shown in Figure 1.13(c).
⚫ Finally, a VM can be migrated from one hardware platform to another, as shown in Figure 1.13(d).
⚫ These VM operations enable a VM to be provisioned to any available hardware platform. They also enable flexibility in porting distributed application executions.
⚫ Furthermore, they enhance the utilization of server resources: multiple server functions can be consolidated on the same hardware platform to improve system efficiency (a toy lifecycle model follows below).
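The four operations above can be viewed as transitions in a small lifecycle model. The sketch below is purely illustrative; the class, state, and method names are invented and do not correspond to any real hypervisor API.

from enum import Enum, auto

class VMState(Enum):
    RUNNING = auto()
    SUSPENDED = auto()

class VirtualMachine:
    """Toy model of the VM lifecycle operations described above."""

    def __init__(self, name, host):
        self.name = name
        self.host = host
        self.state = VMState.RUNNING

    def suspend(self):
        # Operation (b): freeze the VM and keep its image in stable storage.
        self.state = VMState.SUSPENDED

    def resume(self, host=None):
        # Operation (c): resume, possibly provisioned onto a new host.
        if host is not None:
            self.host = host
        self.state = VMState.RUNNING

    def migrate(self, new_host):
        # Operation (d): move the VM to another hardware platform.
        self.host = new_host

if __name__ == "__main__":
    vm = VirtualMachine("web-01", host="server-A")   # hypothetical names
    vm.suspend()
    vm.resume(host="server-B")     # provisioned to a new platform
    vm.migrate("server-C")
    print(vm.name, vm.state.name, "on", vm.host)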
50. Virtual Infrastructures
⚫ Virtual infrastructure is what connects
resources to distributed applications. It is a
dynamic mapping of system resources to
specific applications.
⚫ The result is decreased costs and
increased efficiency and responsiveness.
⚫ Virtualization for server consolidation
and containment is a good example
52. Data Center Growth and Cost Breakdown
⚫ A large data center may be built with thousands of servers. Smaller data centers are typically built with hundreds of servers.
⚫ The cost to build and maintain data center servers has increased over the years.
⚫ As shown in Figure 1.14, typically only 30 percent of data center costs goes toward purchasing IT equipment (such as servers and disks),
⚫ 33 percent is attributed to the chiller,
⚫ 18 percent to the uninterruptible power supply (UPS),
⚫ 9 percent to computer room air conditioning (CRAC), and
⚫ the remaining 7 percent to power distribution, lighting, and transformer costs.
⚫ Thus, about 60 percent of the cost to run a data center is allocated to management and maintenance (see the tally below).
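A quick tally of the Figure 1.14 percentages quoted above (a minimal sketch; category labels are paraphrased) shows where the roughly 60 percent figure comes from: the chiller, UPS, and CRAC alone account for 60 percent, and adding power distribution, lighting, and transformers brings the total non-IT share to 67 percent.

# Cost breakdown from Figure 1.14 (percentages as quoted above).
cost_share = {
    "IT equipment (servers, disks)": 30,
    "chiller": 33,
    "UPS": 18,
    "CRAC": 9,
    "power distribution, lighting, transformers": 7,
}

non_it = sum(v for k, v in cost_share.items() if "IT equipment" not in k)
print(f"non-IT (power, cooling, facilities) share: {non_it}%")   # 67%
print(f"chiller + UPS + CRAC alone: {33 + 18 + 9}%")             # the ~60% figure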
54. Convergence of Technologies
Cloud computing is enabled by the convergence of technologies in four areas:
(1) hardware virtualization and multi-core chips,
(2) utility and grid computing,
(3) SOA, Web 2.0, and WS mashups, and
(4) autonomic computing and data center automation.
• Hardware virtualization and multicore chips enable the existence of dynamic configurations in the cloud.
• Utility and grid computing technologies lay the necessary foundation for computing clouds.
• Recent advances in SOA, Web 2.0, and mashups of platforms are pushing the cloud another step forward.
• Finally, achievements in autonomic computing and automated data center operations contribute to cloud computing.
59. ⚫ Above shows the architecture of a typical server cluster built around a low-latency, high-bandwidth interconnection network.
⚫ This network can be as simple as a SAN (e.g., Myrinet) or a LAN (e.g., Ethernet).
⚫ To build a larger cluster with more nodes, the interconnection network can be built with multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand switches.
⚫ Through hierarchical construction using a SAN, LAN, or WAN, one can build scalable clusters with an increasing number of nodes.
⚫ The cluster is connected to the Internet via a virtual private network (VPN) gateway. The gateway IP address locates the cluster.
60. Single-System Image
⚫ An ideal cluster should merge multiple system images into a single-system image (SSI).
⚫ Cluster designers desire a cluster
operating system or some middleware to
support SSI at various levels, including the
sharing of CPUs, memory, and I/O across all
cluster nodes.
⚫ An SSI is an illusion created by
software or hardware that presents a collection
of resources as one integrated, powerful
resource.
⚫ SSI makes the cluster appear like a
single machine to the user.
61. Hardware, Software, and Middleware Support
⚫ The building blocks are computer nodes (PCs, workstations, servers, or SMPs), special communication software such as PVM or MPI (see the sketch below), and a network interface card in each computer node. Most clusters run under the Linux OS.
⚫ The computer nodes are interconnected by a high-bandwidth network (such as Gigabit Ethernet, Myrinet, InfiniBand, etc.).
⚫ Special cluster middleware support is needed to create SSI or high availability (HA).
⚫ Both sequential and parallel applications can run on the cluster, and special parallel environments are needed to facilitate use of the cluster resources.
⚫ DSM (distributed shared memory) offers virtualization (on demand) as a tool for parallel programming across cluster nodes.
⚫ Parallel Virtual Machine (PVM) is a software tool for the parallel networking of computers.
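For flavor, here is a minimal MPI-style message-passing sketch, assuming mpi4py and an MPI implementation such as MPICH or Open MPI are installed (run it with, for example, mpiexec -n 2 python demo.py). Rank 0 sends a Python object to rank 1, mirroring how cluster nodes cooperate through the communication middleware described above.

from mpi4py import MPI

comm = MPI.COMM_WORLD          # all processes launched by mpiexec
rank = comm.Get_rank()         # this process's ID within the communicator

if rank == 0:
    payload = {"task": "demo", "data": list(range(5))}
    comm.send(payload, dest=1, tag=11)        # node 0 sends work to node 1
    print("rank 0 sent:", payload)
elif rank == 1:
    received = comm.recv(source=0, tag=11)    # node 1 receives the message
    print("rank 1 received:", received)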
62. Major Cluster Design Issues
⚫ Without middleware, cluster nodes cannot
work together effectively to achieve
cooperative computing.
⚫ The software environments and applications
must rely on the middleware to achieve
high performance.
⚫ The cluster benefits come from
scalable performance, efficient message
passing, high system availability, seamless fault
tolerance, and cluster-wide job management.
64. Grid Computing Infrastructures
⚫ Internet services such as the Telnet command enable a local computer to connect to a remote computer.
⚫ A web service such as HTTP enables remote access to remote web pages.
⚫ Grid computing is envisioned to allow close interaction among applications running on distant computers simultaneously.
65. Grid computing is a form of distributed computing in which a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources.
It facilitates flexible, secure, coordinated, large-scale resource sharing among dynamic collections of individuals, institutions, and resources.
It enables communities ("virtual organizations") to share geographically distributed resources as they pursue common goals.
66. Criteria for a Grid:
Coordinates resources that are not subject to
centralized control.
Uses standard, open, general-purpose protocols
and interfaces.
Delivers nontrivial/ significant qualities of
service.
Benefits
▪ Exploit Underutilized resources
▪ Resource load Balancing
▪ Virtualize resources across an enterprise
▪ Data Grids, Compute Grids
▪ Enable collaboration for virtual organizations
67. Grid Applications
Data- and computationally intensive applications: this technology has been applied to computationally intensive scientific, mathematical, and academic problems such as drug discovery, economic forecasting, and seismic analysis, and to back-office data processing in support of e-commerce.
⚫ A chemist may utilize hundreds of processors to screen
thousands of compounds per hour.
⚫ Teams of engineers worldwide pool resources to
analyze terabytes of structural data.
⚫ Meteorologists seek to visualize and analyze petabytes
of
climate data with enormous computational demands.
Resource sharing
⮚ Computers, storage, sensors, networks, …
⮚ Sharing always conditional: issues of trust, policy, negotiation,
payment, …
Coordinated problem solving
⮚ distributed data analysis, computation, collaboration, …
68. Grid Topologies
• Intragrid
– Local grid within an organization
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organizations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the internet
– Trust based on certification
69. Computational Grid
“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
– “The Grid: Blueprint for a New Computing Infrastructure”, Kesselman & Foster
Example: Science Grid (US Department of Energy)
70. Data Grid
⚫ A data grid is a grid computing system that deals with data
— the controlled sharing and management of
large amounts of distributed data.
⚫ Data Grid is the storage component of a grid environment.
Scientific and engineering applications require
access to large amounts of data, and often this
data is widely distributed.
⚫ A data grid provides seamless access to the local or
remote data required to complete compute
intensive calculations.
Examples: the Biomedical Informatics Research Network (BIRN), the Southern California Earthquake Center (SCEC).
73. Distributed Supercomputing
⚫ Combining multiple high-capacity
resources on a computational grid into a
single, virtual distributed supercomputer.
⚫ Tackle problems that cannot be solved on a
single system.
74. High-Throughput Computing
⚫ Uses the grid to schedule large
numbers of loosely coupled or independent
tasks, with the goal of putting unused
processor cycles to work.
On-Demand Computing
🞆 Uses grid capabilities to meet short-term
requirements for resources that are not
locally accessible.
🞆 Models real-time computing demands.
75. Collaborative Computing
⚫ Concerned primarily with enabling and
enhancing human-to-human interactions.
⚫ Applications are often structured in terms of a
virtual shared space.
Data-Intensive Computing
🞆 The focus is on synthesizing new information
from data that is maintained in geographically
distributed repositories, digital libraries, and
databases.
🞆 Particularly useful for distributed data mining.
76. Logistical Networking
⚫ Logistical networks focus on exposing
storage resources inside networks by
optimizing the global scheduling of data
transport, and data storage.
⚫ Contrasts with traditional networking, which
does not explicitly model storage resources
in the network.
⚫ high-level services for Grid applications
called "logistical" because of the analogy
/similarity it bears with the systems of
warehouses, depots, and distribution channels.
77. P2P Computing vs. Grid Computing
⚫ They differ in their target communities.
⚫ A grid system deals with a more complex, more powerful, more diverse, and more highly interconnected set of resources than P2P.
⚫ Grids are organized around virtual organizations (VOs).
78. A Typical View of a Grid Environment
<Figure: User → Resource Broker → Grid Resources, supported by a Grid Information Service. Numbered flows: (1) computational jobs, (2) details of Grid resources, (3) computation results, (4) processed jobs.>
⚫ A user sends a computation- or data-intensive application to the Global Grid in order to speed up the execution of the application.
⚫ A resource broker distributes the jobs in an application to the Grid resources, based on the user’s QoS requirements and the details of available Grid resources, for further execution.
⚫ Grid resources (clusters, PCs, supercomputers, databases, instruments, etc.) in the Global Grid execute the user jobs.
⚫ The Grid Information Service system collects the details of the available Grid resources and passes the information to the resource broker (a toy simulation of this workflow follows below).
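A toy simulation of this workflow (purely illustrative; the class names, resource names, and scheduling policy are invented) captures the interaction order: resources register with the information service, the broker queries it, and the broker then dispatches jobs according to a simple policy.

class GridInformationService:
    """Collects details of available Grid resources for the broker."""
    def __init__(self):
        self.resources = {}

    def register(self, name, free_cores):
        self.resources[name] = free_cores

    def query(self):
        return dict(self.resources)

class ResourceBroker:
    """Distributes an application's jobs to resources reported by the GIS."""
    def __init__(self, gis):
        self.gis = gis

    def schedule(self, jobs):
        resources = self.gis.query()
        placement = {}
        for job in jobs:
            # Naive policy: send each job to the resource with the most free cores.
            target = max(resources, key=resources.get)
            placement[job] = target
            resources[target] -= 1
        return placement

if __name__ == "__main__":
    gis = GridInformationService()
    gis.register("cluster-A", free_cores=8)          # hypothetical resources
    gis.register("supercomputer-B", free_cores=64)

    broker = ResourceBroker(gis)
    print(broker.schedule(["job-1", "job-2", "job-3"]))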
79. Grid Middleware
Grids are typically managed by gridware, a special type of middleware that enables the sharing and management of grid components based on user requirements and resource attributes (e.g., capacity, performance).
⚫ Middleware is software that connects other software components or applications to provide the following functions:
Run applications on suitable available resources
– Brokering, Scheduling
Provide uniform, high-level access to resources
– Semantic interfaces
– Web Services, Service-Oriented Architectures
Address inter-domain issues of security, policy, etc.
– Federated / united / associated identities
Provide application-level status monitoring and control
80. Middleware
⚫ Globus – University of Chicago
⚫ Condor – University of Wisconsin – high-throughput computing
⚫ Legion – University of Virginia – virtual workspaces, collaborative computing
⚫ IBP (Internet Backplane Protocol) – University of Tennessee – logistical networking
⚫ NetSolve – solving scientific problems in heterogeneous environments – high-throughput and data-intensive
82. Some of the Major Grid Projects
(Name | URL/Sponsor | Focus)
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org, European Union | Create technology for remote access to supercomputer resources and simulation codes; in GRIP, integrate with Globus Toolkit™
Fusion Collaboratory | fusiongrid.org, DOE Office of Science | Create a national computational collaboratory for fusion research
Globus Project™ | globus.org; DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of the Globus Toolkit™; application and deployment
GridLab | gridlab.org, European Union | Grid technologies and applications
GridPP | gridpp.ac.uk, U.K. eScience | Create and apply an operational grid within the U.K. for particle physics research
Grid Research Integration Development & Support Center | grids-center.org, NSF | Integration, deployment, and support of the NSF Middleware Infrastructure for research and education
85. ⚫ There are two types of overlay networks: unstructured and structured.
An unstructured overlay network is characterized by a random graph.
⮚ There is no fixed route to send messages or files among the nodes.
⮚ Often, flooding is applied to send a query to all nodes in an unstructured overlay, resulting in heavy network traffic and nondeterministic search results (see the flooding sketch below).
Structured overlay networks follow certain connectivity topologies and rules for inserting and removing nodes (peer IDs) from the overlay graph.
⮚ Routing mechanisms are developed to take advantage of the structured overlay.
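As a small illustration of flooding in an unstructured overlay (a toy model; the topology, TTL value, file names, and node names are arbitrary), each node forwards a query to all of its neighbors until the time-to-live runs out, which is exactly why traffic grows quickly and search results are nondeterministic.

from collections import deque

# A tiny unstructured overlay as an adjacency list (arbitrary random-looking graph).
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D", "E"],
    "C": ["A", "F"],
    "D": ["B"],
    "E": ["B", "F"],
    "F": ["C", "E"],
}
files = {"D": ["song.mp3"], "F": ["song.mp3"]}   # which peers hold the file

def flood_query(start, wanted, ttl=3):
    """Flood a query from `start`; return nodes holding `wanted` within `ttl` hops."""
    hits, visited = [], {start}
    frontier = deque([(start, ttl)])
    messages = 0
    while frontier:
        node, remaining = frontier.popleft()
        if wanted in files.get(node, []):
            hits.append(node)
        if remaining == 0:
            continue
        for neighbor in overlay[node]:
            messages += 1                      # every forward is network traffic
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages

if __name__ == "__main__":
    found, traffic = flood_query("A", "song.mp3", ttl=2)
    print("found at:", found, "| messages sent:", traffic)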
87. Challenges – P2P
⚫ P2P computing faces three types of heterogeneity problems: in hardware, software, and network requirements.
⚫ There are too many hardware models and architectures to select from; incompatibility exists between software and the OS; and different network connections and protocols make it too complex to apply in real applications.
⚫ Fault tolerance, failure management, and overlay networks are additional challenges.
88. ⚫ Lack of trust among peers poses another problem; peers are strangers to one another.
⚫ Security, privacy, and copyright violations are major worries for those in the industry in terms of applying P2P technology in business applications.
⚫ Because the system is not centralized, managing it is difficult.
⚫ In addition, the system lacks security. Anyone can log on to the system and cause damage or abuse.
⚫ Further, not all client computers connected to a P2P network can be considered reliable or virus-free.
⚫ In summary, P2P networks are reliable for a small number of peer nodes.
89. The Cloud
⚫ Historical roots in today’s Internet apps
▪ Search, email, social networks
▪ File storage (Live Mesh, MobileMe, Flickr, …)
⚫ A cloud infrastructure provides a framework to manage scalable, reliable, on-demand access to applications.
⚫ A cloud is the “invisible” backend to many of our mobile applications.
⚫ A model of computation and data storage based on “pay as you go” access to “unlimited” remote data center capabilities.
93. The following list highlights eight reasons to adapt the cloud for upgraded Internet applications and web services:
1. Desired location in areas with protected space and higher energy efficiency
2. Sharing of peak-load capacity among a large pool of users, improving overall utilization
3. Separation of infrastructure maintenance duties from domain-specific application development
4. Significant reduction in cloud computing cost, compared with traditional computing paradigms
5. Cloud computing programming and application development
6. Service and data discovery and content/service distribution
7. Privacy, security, copyright, and reliability issues
8. Service agreements, business models, and pricing policies
#2: A cluster is a group of interconnected computers (nodes) that work together as a single system to perform tasks. MPP is a computing architecture in which many processors work on different parts of a task simultaneously. P2P networks are decentralized systems in which participants (peers) share resources directly with each other without relying on a central server; each peer acts as both a client and a server, enabling distributed communication and resource sharing. Internet clouds refer to cloud computing services delivered over the Internet. Cloud scalability refers to the ability to increase or decrease IT resources as needed to meet changing demand. Performance can be defined as how efficiently software accomplishes its tasks. Cloud security is an important concern that refers to the act of protecting cloud environments, data, information, and applications against unauthorized access, DDoS attacks, malware, hackers, and other similar attacks. High availability in cloud computing is the ability of a system or application to remain accessible and operational for a long period of time; it is a key characteristic of cloud services that ensures they can handle different load levels and recover quickly from any failures.