chapter 1-Introductionkkkclll;;;x;lc,.pdf

Lecture One
Introduction
Introduction to Distributed Systems
Kedist Meshesha
Faculty of Computing
Bahir Dar Institute of Technology
2/24/2021

Introduction and Definition
 Before the mid-80s, computers were
    very expensive (hundreds of thousands or even millions of dollars)
    very slow (a few thousand instructions per second)
    not connected among themselves
 After the mid-80s, two major developments occurred
    Cheap and powerful microprocessor-based computers appeared
    Computer networks
       LANs at speeds ranging from 10 to 1000 Mbps
       WANs at speeds ranging from 64 Kbps to gigabits/sec

 Consequence
   It became feasible to use a large network of computers working on the same application; this is in contrast to the old centralized systems, where there was a single computer with its peripherals

Definition
 A distributed system is:
   a collection of independent computers that appears to its users as a single coherent system, i.e., a single computer (Tanenbaum & Van Steen)
 This definition has two aspects:
   1. Hardware: autonomous machines
   2. Software: a single system view for the users

Other Definitions
 A distributed system is a system designed to support the development of applications and services which can exploit a physical architecture consisting of multiple, autonomous processing elements that do not share primary memory but cooperate by sending asynchronous messages over a communication network (Blair & Stefani)
 A distributed system is one that stops you getting any work done when a machine you've never even heard of crashes (Leslie Lamport)

Paradigmatic Examples
 The Web over the Internet.
 Mobile telephony over cellular networks.
 Electronic funds transfer systems over special
purpose networks
 For example, between bank accounts, on credit
card purchases, via cash machines.

 Other examples abound:
 Email,
 Instant messaging,
 Videoconferencing,
 Multiuser gaming,
 Home entertainment systems,
 Global positioning systems,
 etc.

Why Distribute Systems?
 Constructing a distributed system can be motivated in
many ways:
 By making continuously evolving, remote resources
accessible for sharing,
 By opening proprietary processes to external
interaction in order to foster cooperation,
 By leading to better performance/cost ratios,
 By scaling effectively and efficiently if demand for
resources changes significantly,
 By attaining high levels of reliability and availability.

The Benefit of Scale
 Ultimately, more is more.
 Interconnecting many systems has increased our
ability to tackle problems that centralized systems in
sequential mode cannot solve efficiently.
 More users can do more work of a more valuable
nature more effectively and more efficiently with
distributed systems than with centralized ones.

What is a Distributed System?
 A distributed system is one in which
– independent, self-sufficient,
– often heterogeneous and autonomous,
– spatially separated
components must use a common interconnection to exchange information in order to
• coordinate their actions and
• allow the whole to appear to its users as a single coherent system.

Independent, Self-Sufficient, Autonomous, Heterogeneous
 By independent, self-sufficient we mean that each component has its own
    processor
    state (i.e., memory)
    resource control and management (e.g., operating system)
 By autonomous we mean that each component may change or be changed of its own accord (i.e., without previous agreement or notification).
 By heterogeneous we mean that different components may have different capabilities (e.g., performance).

Independent, Self-Sufficient, Autonomous, Heterogeneous
 There are many sources of heterogeneity:
 Different hardware
 Different software
 Different software interface
 The above in combination, and more.
 Such differences may cause interacting components to drift
further apart in time.
 Also, failures cause components to have to deal with a gap in
their knowledge of the current system state.
 Given a system, the more spatially distant its components, the
more representative of a distributed system it is.

Characteristics of Distributed Systems
 differences between the computers and the way they
communicate are hidden from users.
 users and applications can interact with a distributed
system in a consistent and uniform way regardless of
location.
 distributed systems should be easy to expand and scale.
 a distributed system is normally continuously available,
even if there may be partial failures

Why Distributed?
 Resource and Data Sharing
 printers, databases, multimedia servers, ...
 Availability, Reliability
 the loss of some instances can be hidden
 Scalability, Extensibility
 the system grows with demand (e.g., extra servers)
 Performance
 huge power (CPU, memory, ...) available
 Inherent distribution, communication
 organizational distribution, e-mail, video

Problems of Distribution
 Concurrency, Security
 clients must not disturb each other
 Privacy
 e.g., when building a preference profile
 unwanted communication such as spam
 Partial failure
 we often do not know where the error is (e.g., RPC)
 Location, Migration, Replication
 clients must be able to find their servers
 Heterogeneity
 hardware, platforms, languages, management

Organization of a Distributed System
 To support heterogeneous computers and networks and to provide a single-system view, a distributed system is often organized by means of a layer of software called middleware that extends over multiple machines

A distributed system organized as middleware; note that the
middleware layer extends over multiple machines

Goals of a Distributed System
A distributed system should
 easily connect users with resources (printers, computers, storage facilities, data, files, Web pages, ...)
    reasons: economics, to collaborate and exchange information
 be transparent: hide the fact that the resources and processes are distributed across multiple computers
 be open
 be scalable

Transparency in a Distributed System
a distributed system that is able to present itself to users
and applications as if it were only a single computer
system is said to be transparent

Different forms of transparency in a distributed system
Transparency | Description
Access | Hide differences in data representation (endianness, file naming, ...) and how a resource is accessed
Location | Hide where a resource is physically located; e.g., where is http://www.prenhall.com/index.html? (naming)
Migration | Hide that a resource may move to another location
Relocation | Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops
Replication | Hide that a resource is replicated
Concurrency | Hide that a resource may be shared by several competing users; a resource must be left in a consistent state
Failure | Hide the failure and recovery of a resource
Persistence | Hide whether a (software) resource is in memory or on disk

Openness in a Distributed System
 A distributed system should be open.
    we need well-defined interfaces.
 Interoperability
    components of different origin can communicate
 Portability
    components work on different platforms
 Another goal of an open distributed system is that it should be flexible and extensible: it should be easy to configure the system out of different components, and easy to add new components or replace existing ones
 An Open Distributed System is a system that offers services according to standard rules that describe the syntax and semantics of those services; e.g., protocols in networks

Scalability in Distributed Systems
 a distributed system should be scalable
 size: adding more users and resources to the system
 geographically: users and resources may be far apart
 administratively: should be easy to manage even if it spans
many administrative organizations

Scalability problems
[Table: examples of scalability limitations]
 Scaling Techniques
• how to solve scaling problems
• the problem is mainly performance, and arises as a result of limitations in the capacity of servers and networks (for geographical scalability)
 three possible solutions: hiding communication latencies, distribution, and replication

a. Hide Communication Latencies
 try to avoid waiting for responses to remote service requests
 let the requester do other useful work in the meantime
 i.e., construct requesting applications that use only asynchronous communication instead of synchronous communication; when a reply arrives the application is interrupted
 good for batch processing and parallel applications but not for interactive applications
 for interactive applications, move part of the job to the client to reduce communication; e.g. filling a form and checking the entries
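As a rough illustration of hiding latency with asynchronous communication, the sketch below issues three requests at once so that their waiting times overlap. The `remote_request` function and the server names are hypothetical stand-ins (a sleep models the network delay), and Python's `asyncio` uses cooperative coroutines rather than the interrupt-style delivery mentioned on the slide.

```python
import asyncio
import time


async def remote_request(server: str, latency: float) -> str:
    """Hypothetical remote call; the sleep stands in for network latency."""
    await asyncio.sleep(latency)
    return f"reply from {server}"


async def fetch_all(latency: float) -> list[str]:
    # All three requests are issued before any reply is awaited, so the
    # waiting periods overlap instead of adding up.
    return await asyncio.gather(
        remote_request("server-a", latency),
        remote_request("server-b", latency),
        remote_request("server-c", latency),
    )


def timed_fetch(latency: float = 0.2) -> tuple[list[str], float]:
    """Run the three requests and return (replies, elapsed seconds)."""
    start = time.perf_counter()
    replies = asyncio.run(fetch_all(latency))
    return replies, time.perf_counter() - start
```

With synchronous calls the elapsed time would be about three times the latency; here it should stay close to a single latency, because the requester is never blocked on one reply while the others are outstanding.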

(a) a server checking the correctness of field entries
(b) a client doing the job
• e.g., shipping code is now supported in Web applications using Java
Applets
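The difference between cases (a) and (b) can be sketched as follows: the form is checked on the client, and a message goes to the server only when the entries look valid. The field names, the validation rules, and the `send` callback are assumptions made for this illustration.

```python
import re


def validate_form(fields: dict[str, str]) -> dict[str, str]:
    """Check entries locally; returns a mapping of field name -> error text."""
    errors: dict[str, str] = {}
    if not fields.get("name", "").strip():
        errors["name"] = "name is required"
    # Deliberately loose pattern: something@something.something
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", fields.get("email", "")):
        errors["email"] = "email looks malformed"
    return errors


def submit(fields: dict[str, str], send) -> dict[str, str]:
    """Case (b): the client validates first and contacts the server once."""
    errors = validate_form(fields)
    if errors:
        return errors      # no network round trip was needed
    send(fields)           # hypothetical transport to the server
    return {}
```

In case (a) every field check would cost a round trip to the server; in case (b) the only message sent is the final, already-checked form.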

b. Distribution
– e.g., DNS - Domain Name System
– divide the name space into zones
an example of dividing the DNS name space into zones
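A toy model of zone-based distribution is sketched below. The zone contents, the name `www.cs.example.`, and the address are made up for illustration, and real DNS resolution involves considerably more (caching, record types, referrals over the network). The point is only that no single zone holds the whole name space.

```python
# Hypothetical zone data: each zone holds its own records and knows which
# child zones it has delegated to another name server.
ZONES = {
    ".": {"delegations": {"example": "example."}, "records": {}},
    "example.": {"delegations": {"cs": "cs.example."}, "records": {}},
    "cs.example.": {
        "delegations": {},
        "records": {"www.cs.example.": "192.0.2.10"},
    },
}


def resolve(name: str) -> tuple[str, list[str]]:
    """Return (address, zones visited) for a host name."""
    fqdn = name.rstrip(".") + "."
    labels = fqdn.rstrip(".").split(".")
    zone = "."
    visited = [zone]
    # Walk the name from its rightmost label; each zone either holds the
    # record or refers the resolver to the child zone for the next label.
    for label in reversed(labels):
        if fqdn in ZONES[zone]["records"]:
            break
        child = ZONES[zone]["delegations"].get(label)
        if child is None:
            raise KeyError(fqdn)
        zone = child
        visited.append(zone)
    return ZONES[zone]["records"][fqdn], visited
```

Each zone answers only for its own part of the tree, so the lookup load is spread over several servers instead of landing on one.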

c. Replication
 Replicate components across a distributed system to increase availability and for load balancing, leading to better performance
    Decided by the owner of a resource
 Caching (a special form of replication) also reduces communication latency; decided by the user
 But, caching and replication may lead to consistency problems
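The consistency problem can be made concrete with a minimal sketch, assuming a single origin server and one client-side cache (both classes are hypothetical). The cache saves a remote read, but it keeps returning the old value after the origin is updated until the entry is invalidated.

```python
class OriginServer:
    """Holds the authoritative copy of each item."""

    def __init__(self) -> None:
        self.data: dict[str, int] = {}
        self.reads_served = 0

    def write(self, key: str, value: int) -> None:
        self.data[key] = value

    def read(self, key: str) -> int:
        self.reads_served += 1
        return self.data[key]


class CachingClient:
    """Keeps a local copy of every item it has read."""

    def __init__(self, origin: OriginServer) -> None:
        self.origin = origin
        self.cache: dict[str, int] = {}

    def read(self, key: str) -> int:
        if key not in self.cache:
            # Cache miss: one remote read, then the copy is kept locally.
            self.cache[key] = self.origin.read(key)
        return self.cache[key]

    def invalidate(self, key: str) -> None:
        self.cache.pop(key, None)
```

A usage example: after `origin.write("price", 10)` and a first `client.read("price")`, a later `origin.write("price", 12)` is not seen by the client until `client.invalidate("price")` is called. Deciding when and how to invalidate or update copies is exactly the consistency problem the slide refers to.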

Types of Distributed Systems
Three types of distributed systems
• Distributed computing systems
• Distributed information systems
• Distributed pervasive/embedded systems

1. Distributed Computing Systems
 Used for high-performance computing tasks
 Two types: Cluster computing and Grid computing
Cluster Computing
 A collection of similar workstations or PCs (homogeneous), closely connected by means of a high-speed LAN
 Each node runs the same operating system
 Used for parallel programming in which a single compute-intensive program is run in parallel on multiple machines

An example of a cluster computing system
 a master node runs middleware (containing libraries for parallel programs) and controls the other compute nodes;
 it allocates tasks, provides an interface to users, etc.

 Grid Computing
 “Resource sharing and coordinated problem solving in
dynamic, multi-institutional virtual organizations” (I. Foster)
 high degree of heterogeneity: no assumptions are made
concerning hardware, operating systems, networks,
administrative domains, security policies, etc.

2. Distributed Information Systems
 Problem: many networked applications with a problem
of interoperability
 At the lowest level: wrap a number of requests into a
single larger request and have it executed as a
distributed transaction
 How to let applications communicate directly with
each other, i.e., Enterprise Application Integration
(EAI)

 e.g., Assume the following banking operation
 withdraw an amount x from account 1
 deposit the amount x to account 2
 what happens if there is a problem after the first activity is
carried out?
 group the two operations into one transaction; either both
are carried out or neither
 we need a way to roll back when a transaction is not
completed
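The banking example can be tried out with Python's built-in `sqlite3` module. This is a single-database sketch rather than a distributed transaction; the table layout, the starting balances, and the `fail_midway` switch are assumptions made for the illustration.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory bank with two accounts (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()
    return conn


def transfer(conn, src: int, dst: int, amount: int, fail_midway: bool = False) -> bool:
    """Withdraw from src and deposit to dst as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure after the withdrawal")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except (RuntimeError, sqlite3.IntegrityError):
        return False


def balances(conn) -> dict[int, int]:
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

If the failure is injected after the withdrawal, the rollback restores account 1, so the money is never lost in between the two operations.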

 Properties of transactions, often referred to as ACID
1. Atomic: to the outside world, the transaction happens indivisibly; a transaction either happens completely or not at all; intermediate states are not seen by other processes
2. Consistent: the transaction does not violate system invariants; e.g., in an internal transfer in a bank, the amount of money in the bank must be the same as it was before the transfer (the law of conservation of money); an invariant may be violated for a brief period during the transaction, but the violation is not visible to other processes
3. Isolated or Serializable: concurrent transactions do not interfere with each other; if two or more transactions are running at the same time, the final result must look as though all transactions ran sequentially in some order
4. Durable: once a transaction commits, the changes are permanent

 Enterprise Application Integration
 how to integrate applications independent from their
databases
 transaction systems rely on request/reply
 how can applications communicate with each other
middleware as a communication facilitator in enterprise application
integration

 There are different communication models
    RPC (Remote Procedure Call)
    RMI (Remote Method Invocation)
    MOM (Message-Oriented Middleware)
3. Distributed Pervasive Systems
 The distributed systems discussed so far are characterized by their stability: fixed nodes with a high-quality connection to a network
 There are also mobile and embedded computing devices with wireless connections

 Three requirements for pervasive applications
 Embrace contextual changes: a device is aware that its
environment may change all the time
 Encourage ad hoc composition: devices are used in
different ways by different users
 Recognize sharing as the default: devices join a system to
access or provide information
 Examples of pervasive systems
 Home Systems
 Electronic Health Care Systems
 Sensor Networks

Hardware and Software Concepts
Hardware Concepts
• Different classification schemes exist
– multiprocessors - with shared memory
– multicomputers - that do not share memory
» can be homogeneous or heterogeneous

Parallel system?
 a single backbone

Multiprocessors - Shared Memory
 the shared memory has to be coherent - the same value
written by one processor must be read by another
processor
 performance problem for bus-based organization since the
bus will be overloaded as the number of processors
increases
 the solution is to add a high-speed cache memory
between the processors and the bus to hold the most
recently accessed words; may result in incoherent
memory

 bus-based multiprocessors are difficult to scale even with
caches
 two possible solutions: crossbar switch and omega network
a bus-based multiprocessor

 Crossbar switch
    divide memory into modules and connect them to the processors with a crossbar switch
    at every intersection, a crosspoint switch is opened and closed to establish connection
    problem: expensive; with n CPUs and n memories, n^2 switches are required

Omega network
 use switches with multiple input and output lines
 drawback: high latency because of several switching stages
between the CPU and memory
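To compare the two interconnects numerically, the sketch below counts switches for n CPUs and n memories. The omega-network figures assume the textbook construction from 2x2 switches with n a power of two, which the slide does not state explicitly: log2(n) stages of n/2 switches each.

```python
def crossbar_switches(n: int) -> int:
    """One crosspoint switch per (CPU, memory module) pair."""
    return n * n


def omega_network(n: int) -> tuple[int, int]:
    """Return (switches, stages) for an omega network built from 2x2 switches.

    Assumes n is a power of two, n >= 2.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    stages = n.bit_length() - 1          # log2(n)
    return (n // 2) * stages, stages
```

For n = 1024 the crossbar needs 1,048,576 crosspoints, while the omega network needs 5,120 switches but places 10 switching stages between a CPU and memory, which is the source of the latency drawback.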

Software Concepts
– OSs in relation to distributed systems
• Tightly-coupled systems, referred to as distributed OSs (DOS)
   – the OS tries to maintain a single, global view of the resources it manages
   – used for multiprocessors and homogeneous multicomputers
• Loosely-coupled systems, referred to as network OSs (NOS)
   – a collection of computers each running its own OS; they work together to make their services and resources available to others
   – used for heterogeneous multicomputers
– Middleware: enhances the services of NOSs so that better support for distribution transparency is provided

 Summary of main issues
an overview of DOSs, NOSs, and middleware
System | Description | Main Goal
DOS | Tightly-coupled operating system for multiprocessors and homogeneous multicomputers | Hide and manage hardware resources
NOS | Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN) | Offer local services to remote clients
Middleware | Additional layer atop of NOS implementing general-purpose services | Provide distribution transparency

 Distributed Operating Systems
 two types
    multiprocessor operating system: to manage the resources of a multiprocessor
    multicomputer operating system: for homogeneous multicomputers
 Uniprocessor Operating Systems
    separating applications from operating system code through a microkernel

• Multiprocessor Operating Systems
 extended uniprocessor operating systems to support
multiple processors having access to a shared memory
 A protection mechanism is required for concurrent access
to guarantee consistency.
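A lock is one common protection mechanism for shared data. The following sketch uses Python threads as stand-ins for processors that share memory; the class and function names are invented for the example.

```python
import threading


class SharedCounter:
    """A value in shared memory, guarded by a mutual-exclusion lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write must not interleave with another thread's
        # update, so it runs inside the critical section.
        with self._lock:
            self.value += 1


def count_concurrently(n_threads: int = 4, per_thread: int = 10_000) -> int:
    """Let several threads update one counter and return the final value."""
    counter = SharedCounter()

    def worker() -> None:
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Because every update happens while the lock is held, the final value always equals the number of threads times the increments per thread; without the lock, concurrent updates could be lost.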

• Multicomputer Operating Systems
general structure of a multicomputer operating system
 processors cannot share memory; instead, communication is through message passing
 each node has its own
    kernel for managing local resources
    separate module for handling interprocess communication
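Message passing between nodes can be sketched as below. Threads and in-process queues stand in for separate machines and the network, which is an assumption of this illustration. The nodes never read each other's `local_memory` directly; all interaction goes through each node's inbox.

```python
import queue
import threading


class Node:
    """A node with private memory and an inbox for incoming messages."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: queue.Queue = queue.Queue()
        self.local_memory: dict[str, int] = {}


def send(dst: Node, message: dict) -> None:
    dst.inbox.put(message)


def storage_loop(node: Node) -> None:
    """Serve put/get requests until a stop message arrives."""
    while True:
        msg = node.inbox.get()
        if msg["op"] == "stop":
            break
        if msg["op"] == "put":
            node.local_memory[msg["key"]] = msg["value"]
        elif msg["op"] == "get":
            value = node.local_memory.get(msg["key"])
            send(msg["reply_to"], {"op": "value", "value": value})


def demo() -> int:
    a, b = Node("A"), Node("B")
    server = threading.Thread(target=storage_loop, args=(b,))
    server.start()
    send(b, {"op": "put", "key": "x", "value": 42})
    send(b, {"op": "get", "key": "x", "reply_to": a})
    reply = a.inbox.get(timeout=5)   # node A blocks until B answers
    send(b, {"op": "stop"})
    server.join()
    return reply["value"]
```

Node A learns the value of `x` only from the reply message, which mirrors how processors without shared memory have to cooperate.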

 Distributed Shared Memory Systems
 how to emulate shared memory on distributed systems to provide a virtual shared memory
 page-based distributed shared memory (DSM) - use the virtual memory capabilities of each individual node
pages of address space distributed among four machines

situation if page 10 is read only and replication is used
situation after CPU 1 references page 10
 read-only pages can be easily replicated
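The two situations in the figure captions can be imitated with a toy model that only tracks which node holds each page. The class name and the initial placement of page 10 on node 2 are assumptions for illustration; a real DSM system works through the virtual memory hardware rather than explicit calls.

```python
class PagedDSM:
    """Toy page-based DSM: records which nodes hold a copy of each page."""

    def __init__(
        self,
        placement: dict[int, int],
        read_only: frozenset[int] = frozenset(),
    ) -> None:
        # placement maps page number -> node that initially holds the page
        self.holders: dict[int, set[int]] = {p: {n} for p, n in placement.items()}
        self.read_only = read_only
        self.page_faults = 0

    def reference(self, node: int, page: int) -> bool:
        """A node touches a page; returns True if this caused a page fault."""
        if node in self.holders[page]:
            return False                    # page is local, no network traffic
        self.page_faults += 1
        if page in self.read_only:
            self.holders[page].add(node)    # replicate: both nodes keep a copy
        else:
            self.holders[page] = {node}     # migrate: page moves to the faulting node
        return True
```

With a writable page, alternating references from two nodes make the page bounce back and forth and every access faults. Marking the page read-only lets both nodes keep a copy, so only the first access from each new node faults.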

• Network Operating Systems
– possibly heterogeneous underlying hardware
– constructed from a collection of uniprocessor systems, each with its own operating system and connected to each other in a computer network
general structure of a network operating system

• Services offered by network operating systems
– remote login (rlogin)
– remote file copy (rcp)
– shared file systems through file servers

 Middleware
 a distributed operating system is not intended to handle a collection of independent computers, but it provides transparency and ease of use
 a network operating system does not provide a view of a single coherent system, but it is scalable and open
 goal: combine the scalability and openness of network operating systems with the transparency and ease of use of distributed operating systems
 this is achieved through middleware, an additional layer of software

general structure of a distributed system as middleware

• different middleware models exist
– treat every resource as a file; just as in UNIX
– through Remote Procedure Calls (RPCs) - calling a procedure on
a remote machine
– distributed object invocation
middleware services
 access transparency: by hiding the low-level message
passing
 naming: such as a URL in the WWW
 distributed transactions: by allowing multiple read and
write operations to occur atomically
 security
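How an RPC layer hides the low-level message passing can be sketched with a client stub and a server skeleton. The JSON wire format and the class names are assumptions for this illustration, and the "network" is just a direct function call. The caller writes `client.add(2, 3)` as if the procedure were local.

```python
import json


class Skeleton:
    """Server side: unpacks a request message and calls the real procedure."""

    def __init__(self, procedures: dict) -> None:
        self.procedures = procedures

    def handle(self, request: bytes) -> bytes:
        call = json.loads(request)
        result = self.procedures[call["proc"]](*call["args"])
        return json.dumps({"result": result}).encode()


class Stub:
    """Client side: turns a local-looking call into a request message."""

    def __init__(self, transport) -> None:
        # transport is any callable that carries bytes to the server and
        # returns the reply bytes (hypothetical stand-in for the network).
        self.transport = transport

    def __getattr__(self, proc: str):
        def remote_call(*args):
            request = json.dumps({"proc": proc, "args": args}).encode()
            reply = self.transport(request)
            return json.loads(reply)["result"]

        return remote_call
```

A usage example: with `server = Skeleton({"add": lambda a, b: a + b})` and `client = Stub(server.handle)`, the expression `client.add(2, 3)` marshals the call, passes it through the transport, and returns the unmarshalled result, giving the access transparency listed above.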

 Middleware and Openness
 in an open middleware-based distributed system, the protocols
used by each middleware layer should be the same, as well as
the interfaces they offer to applications

Thanks for Your Attention!
