Chapter 18
Parallel Processing
(Multiprocessing)
It's All About Increasing Performance
• Processor performance can be measured by the
rate at which it executes instructions
MIPS rate = f * IPC (Millions of Instructions per Second)
— f is the processor clock frequency, in MHz
— IPC is the average Instructions Per Cycle
• Increase performance by
—increasing clock frequency and
—increasing the number of instructions that complete per cycle
– May be reaching practical limits
+ Complexity
+ Power consumption
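A quick worked example of the MIPS formula (the clock rate and IPC values below are assumed, chosen only for illustration):

```c
#include <stdio.h>

/* MIPS rate = f * IPC, with f in MHz.
 * The values below are hypothetical, used only to illustrate the formula. */
int main(void) {
    double f_mhz = 3000.0; /* assumed 3 GHz clock = 3000 MHz */
    double ipc   = 1.5;    /* assumed average instructions per cycle */
    double mips  = f_mhz * ipc;
    printf("MIPS rate = %.0f MHz * %.1f IPC = %.0f MIPS\n", f_mhz, ipc, mips);
    return 0;
}
```

With these assumed numbers the processor sustains 4,500 million instructions per second.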
Computer Organizations
Multiprogramming and Multiprocessing
Taxonomy of Parallel Processor Architectures
Multiple Processor Organization
• SISD - Single instruction, single data stream
• SIMD - Single instruction, multiple data stream
• MISD - Multiple instruction, single data stream
• MIMD - Multiple instruction, multiple data stream
SISD - Single Instruction, Single Data Stream
• Single processor
• Single instruction stream
• Data stored in single memory
• Uni-processor
(Structure: Control Unit → Processing Unit → Memory Unit)
SIMD - Single Instruction, Multiple Data Stream
• Single machine instruction
• Number of processing elements
• Each processing element has associated data memory
• Each instruction is simultaneously executed on a different set of data by different processors
Typical Application - Vector and Array processors
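As one concrete illustration (not from the slides), the sketch below uses x86 SSE intrinsics: a single instruction, `_mm_add_ps`, adds four pairs of floats at once, i.e. one instruction stream applied to multiple data elements. The arrays and values are invented for the example.

```c
#include <immintrin.h> /* x86 SSE intrinsics */
#include <stdio.h>

int main(void) {
    float a[4] = {1, 2, 3, 4};
    float b[4] = {10, 20, 30, 40};
    float c[4];

    __m128 va = _mm_loadu_ps(a);    /* load four floats into one register */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_add_ps(va, vb); /* four additions, one instruction */
    _mm_storeu_ps(c, vc);

    for (int i = 0; i < 4; i++)
        printf("%.0f ", c[i]);      /* prints: 11 22 33 44 */
    printf("\n");
    return 0;
}
```

Vector and array processors apply the same single-instruction, multiple-data idea at much larger scale.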
MISD - Multiple Instruction, Single Data Stream
• One sequence of data
• A set of processors
• Each processor executes a different instruction sequence
Not much practical application
MIMD - Multiple Instruction, Multiple Data Stream
• Set of processors
• Simultaneously execute different instruction sequences
• Different sets of data
— SMPs (Symmetric Multiprocessors)
— NUMA systems (Non-uniform Memory Access)
— Clusters (Groups of “partnering” computers)
Shared memory (SMP or NUMA) vs. distributed memory (Clusters)
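A minimal sketch of MIMD execution on a shared-memory machine using POSIX threads (the two tasks and their data sets are invented for illustration): each thread runs a different instruction sequence on a different data set at the same time.

```c
#include <pthread.h>
#include <stdio.h>

/* Two different instruction streams (sum vs. max) over two different
 * data sets, running simultaneously: the MIMD model in miniature. */
static int data_a[4] = {1, 2, 3, 4};
static int data_b[4] = {7, 5, 9, 2};

static void *sum_task(void *arg) {
    (void)arg;
    int s = 0;
    for (int i = 0; i < 4; i++) s += data_a[i];
    printf("sum of data_a = %d\n", s);
    return NULL;
}

static void *max_task(void *arg) {
    (void)arg;
    int m = data_b[0];
    for (int i = 1; i < 4; i++)
        if (data_b[i] > m) m = data_b[i];
    printf("max of data_b = %d\n", m);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, sum_task, NULL);
    pthread_create(&t2, NULL, max_task, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Compile with `-pthread`; on an SMP or NUMA system the two threads can run on different processors.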
MIMD - Overview
• General-purpose processors
• Each can execute all of the necessary instructions
• Further classified by method of processor
communication & memory access
MIMD - Tightly Coupled
• Processors share memory
• Communicate via that shared memory
Symmetric Multiprocessor (SMP)
- Share single memory or pool
- Shared bus to access memory
- Memory access time to a given area of memory is
approximately the same for each processor
Nonuniform memory access (NUMA)
- Access times to different regions of memory may differ
Block Diagram of Tightly Coupled Multiprocessor
MIMD - Loosely Coupled
Clusters
• Collection of independent
uniprocessors
• Interconnected to form a cluster
• Communication via fixed path or network
connections
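A minimal sketch of loosely coupled, message-based communication, assuming an MPI installation (the value exchanged is illustrative): the two processes may run on different cluster nodes and share no memory, so data moves only in explicit messages.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42; /* illustrative payload */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

Launched with, for example, `mpirun -np 2 ./a.out`, the two ranks may be placed on separate machines in the cluster.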
Symmetric Multiprocessors (SMP)
• A stand-alone “computer” with the following characteristics:
—Two or more similar processors of comparable
capacity
– All processors can perform the same functions (hence
symmetric)
—Processors share same memory and I/O access
– Memory access time is approximately the same for each
processor (time shared bus or multi-port memory)
—Processors are connected by a bus or other internal
connection
—System controlled by integrated operating system
– Providing interaction between processors
– Providing interaction at job, task, file and data element
levels
SMP Advantages
• Performance
— If some work can be done in parallel
• Availability
— Since all processors can perform the same
functions, failure of a single processor
does not halt the system
• Incremental growth
— User can enhance performance by adding
additional processors
• Scaling
— Vendors can offer range of products based
on number of processors
Symmetric Multiprocessor Organization
Time Shared Bus (vs Multiport memory)
• Simplest form
• Structure and interface similar to single
processor system
• The following features are provided:
—Addressing - distinguishes modules on the bus
—Arbitration - any module can temporarily be bus master
—Time sharing - if one module has the bus, others must
wait and may have to suspend (modeled in the sketch below)
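A rough software model of that time-sharing rule (my own sketch; a real bus arbiter is hardware, and the module behavior here is invented): a mutex stands in for the bus, so only the module holding it is master while the others wait.

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t bus = PTHREAD_MUTEX_INITIALIZER; /* "the bus" */

static void bus_transfer(int module_id) {
    pthread_mutex_lock(&bus);   /* request the bus; wait until arbitration is won */
    printf("module %d is bus master\n", module_id);
    /* ... address and data phases would happen here ... */
    pthread_mutex_unlock(&bus); /* release the bus for the next master */
}

static void *module(void *arg) {
    bus_transfer(*(int *)arg);
    return NULL;
}

int main(void) {
    pthread_t t[3];
    int id[3] = {0, 1, 2};
    for (int i = 0; i < 3; i++) pthread_create(&t[i], NULL, module, &id[i]);
    for (int i = 0; i < 3; i++) pthread_join(t[i], NULL);
    return 0;
}
```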
Time Shared Bus - Advantages and Disadvantages
Advantages:
• Simplicity
• Flexibility
• Reliability
Disadvantages
• Performance limited by bus cycle time
• Each processor must have local cache
— Reduce number of bus accesses
• Leads to problems with cache coherence
Operating System Issues
• Simultaneous concurrent processes
• Scheduling
• Synchronization
• Memory management
• Reliability and fault tolerance
• Cache Coherence
Cache Coherence
• Problem - multiple copies of same data
in different caches
• Can result in an inconsistent view of
memory
—Write back policy can lead to inconsistency
—Write through can also give problems unless
caches monitor memory traffic
• MESI Protocol (Modified - Exclusive - Shared - Invalid)
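A software analogy of the problem (my own illustration, not from the slides): each thread keeps a private "cached" copy of a shared location; with a delayed write-back and no coherence mechanism, the reader keeps an out-of-date value while memory has moved on.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int memory_x = 0; /* the shared "main memory" location */

/* Writer: loads x into its "cache", modifies only the cached copy
 * (write-back policy), and writes it back to memory later. */
static void *writer(void *arg) {
    (void)arg;
    int cached_x = memory_x;
    cached_x = 42;
    sleep(1);            /* write-back is delayed */
    memory_x = cached_x;
    return NULL;
}

/* Reader: cached x before the writer's update and keeps using it.
 * (Synchronization is deliberately omitted -- that is the point.) */
static void *reader(void *arg) {
    (void)arg;
    int cached_x = memory_x;
    sleep(2);
    printf("reader's cached x = %d, memory x = %d\n", cached_x, memory_x);
    return NULL;
}

int main(void) {
    pthread_t r, w;
    pthread_create(&r, NULL, reader, NULL);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(r, NULL);
    pthread_join(w, NULL);
    return 0;
}
```

The typical output, `reader's cached x = 0, memory x = 42`, is exactly the inconsistent view of memory described above.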
Software Solution to Cache Coherence
Compiler and operating system deal with the problem
• Overhead transferred to compile time
• Design complexity transferred from
hardware to software
— Analyze code to determine safe periods for
caching shared variables
— However, software tends to (must) make
conservative decisions
— Inefficient cache utilization
Hardware Solution to Cache Coherence
Cache coherence hardware protocols
• Dynamic recognition of potential
problems
• Run time solution
—More efficient use of cache
• Transparent to programmer / Compiler
Implemented with:
• Directory protocols
• Snoopy protocols
Directory & Snoopy Protocols
Directory Protocols
Effective in large scale systems with complex interconnection
schemes
• Collect and maintain information about copies of data in cache
— Directory stored in main memory
• Requests are checked against directory
— Appropriate transfers are performed
Creates central bottleneck
Snoopy Protocols
Suited to bus-based multiprocessors
• Distribute cache coherence responsibility among cache
controllers
• Cache recognizes that a line is shared
• Updates announced to other caches
Increases bus traffic
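A minimal sketch of the bookkeeping a directory protocol might keep per memory line (field names, the 32-processor limit, and the example values are assumptions for illustration, not any specific machine's format):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical directory entry, stored in main memory, for one line.
 * It records which caches hold a copy, so a write only needs to
 * invalidate (or fetch from) those caches. */
typedef struct {
    uint32_t sharers; /* bit i set => processor i's cache holds a copy   */
    bool     dirty;   /* true => exactly one cache holds a modified copy */
    uint8_t  owner;   /* valid only when dirty: which cache owns it      */
} directory_entry;

/* On a write by processor p, every other sharer must be invalidated. */
static uint32_t caches_to_invalidate(const directory_entry *e, int p) {
    return e->sharers & ~(1u << p);
}

int main(void) {
    directory_entry e = { .sharers = 0x0000000Bu, .dirty = false, .owner = 0 };
    /* Caches 0, 1 and 3 share the line; a write by processor 1 must
     * invalidate caches 0 and 3, i.e. mask 0x00000009. */
    printf("invalidate mask = 0x%08X\n", caches_to_invalidate(&e, 1));
    return 0;
}
```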
Snoopy Protocols
• Write Update Protocol (Write Broadcast)
—Multiple readers and writers
—Updated word is distributed to all other processors
• Write Invalidate Protocol (MESI)
—Multiple readers, one writer
—When a write is required, all other caches of the line
are invalidated
—Writing processor then has exclusive (cheap) access
until line is required by another processor
MESI Protocol - State of every line is marked as Modified,
Exclusive, Shared or Invalid
- two bits are included with each cache tag
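A simplified sketch of MESI state transitions for one cache line (the event names and simplifications are mine; for instance, a read miss is assumed to find the line in another cache and therefore loads it as Shared rather than Exclusive):

```c
#include <stdio.h>

/* The four MESI states -- encoded in two bits per line in hardware. */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_state;

/* A few representative events as seen by one cache (names are illustrative). */
typedef enum {
    LOCAL_READ,  /* this processor reads the line            */
    LOCAL_WRITE, /* this processor writes the line           */
    SNOOP_READ,  /* another cache's read appears on the bus  */
    SNOOP_WRITE  /* another cache's write/invalidate appears */
} bus_event;

static mesi_state next_state(mesi_state s, bus_event e) {
    switch (e) {
    case LOCAL_READ:  return (s == INVALID) ? SHARED : s;
    case LOCAL_WRITE: return MODIFIED;  /* other copies, if any, are invalidated */
    case SNOOP_READ:  return (s == MODIFIED || s == EXCLUSIVE) ? SHARED : s;
    case SNOOP_WRITE: return INVALID;   /* our copy is now stale */
    }
    return s;
}

int main(void) {
    mesi_state s = INVALID;
    s = next_state(s, LOCAL_READ);  /* INVALID  -> SHARED   */
    s = next_state(s, LOCAL_WRITE); /* SHARED   -> MODIFIED */
    s = next_state(s, SNOOP_READ);  /* MODIFIED -> SHARED (dirty data written back) */
    s = next_state(s, SNOOP_WRITE); /* SHARED   -> INVALID  */
    printf("final state = %d (0 = INVALID)\n", s);
    return 0;
}
```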
Processor Designs
• Pipelined ALU
—Within operations
—Across operations
• Parallel ALUs
• Parallel processors