2. Course Objective
Explain the basic subsystems of a computer, their organization, structure
and operation.
Illustrate the concept of programs as sequences of machine instructions.
Demonstrate different ways of communicating with I/O devices.
Describe the memory hierarchy and the concept of virtual memory.
Illustrate the organization of a simple pipelined processor and other computing
systems.
3. Textbooks
• “Computer Organization,” by Carl Hamacher, Zvonko
Vranesic and Safwat Zaky. Fifth Edition, McGraw-Hill,
2002.
• “SPARC Architecture, Assembly Language Programming
and C,” Richard P. Paul, Prentice Hall, 2000.
6. What is a computer?
A computer is a sophisticated electronic calculating machine
that:
Accepts input information,
Processes the information according to a list of
internally stored instructions and
Produces the resulting output information.
Functions performed by a computer are:
Accepting information to be processed as input.
Storing a list of instructions to process the information.
Processing the information according to the list of
instructions.
Providing the results of the processing as output.
What are the functional units of a computer?
8. Functional units of a computer
[Figure: functional units of a computer. Input and Output (I/O), Memory (holding Instr1, Instr2, Instr3, Data1, Data2), and the Processor (Arithmetic & Logic, Control).]
Input unit accepts
information:
•Human operators,
•Electromechanical devices (keyboard)
•Other computers
Output unit sends
results of processing:
•To a monitor display,
•To a printer
Arithmetic and logic unit (ALU):
•Performs the desired
operations on the input
information as determined
by instructions in the memory
Control unit coordinates
various actions
•Input,
•Output
•Processing
Memory unit stores
information:
•Instructions,
•Data
9. Information in a computer -- Instructions
Instructions specify commands to:
Transfer information within a computer (e.g., from memory to
ALU)
Transfer information between the computer and I/O devices
(e.g., from keyboard to computer, or computer to printer)
Perform arithmetic and logic operations (e.g., Add two numbers,
Perform a logical AND).
A sequence of instructions to perform a task is called a
program, which is stored in the memory.
Processor fetches instructions that make up a program from
the memory and performs the operations stated in those
instructions.
What do the instructions operate upon?
10. Information in a computer -- Data
Data are the “operands” upon which instructions operate.
Data could be:
Numbers,
Encoded characters.
Data, in a broad sense, means any digital information.
Computers use data that is encoded as a string of binary
digits called bits.
11. Input unit
[Figure: input devices in the real world (keyboard, audio input, ...) connect through the Input Unit to the computer's processor and memory.]
Binary information must be presented to a computer in a specific format. This
task is performed by the input unit:
- Interfaces with input devices.
- Accepts binary information from the input devices.
- Presents this binary information in a format expected by the computer.
- Transfers this information to the memory or processor.
12. Memory unit
Memory unit stores instructions and data.
Recall, data is represented as a series of bits.
To store data, memory unit thus stores bits.
Processor reads instructions and reads/writes data from/to
the memory during the execution of a program.
In theory, instructions and data could be fetched one bit at a
time.
In practice, a group of bits is fetched at a time.
The group of bits stored or retrieved at a time is termed a “word”.
The number of bits in a word is termed the “word length” of a
computer.
In order to read/write to and from memory, a processor
should know where to look:
“Address” is associated with each word location.
13. Memory unit (contd.)
Processor reads/writes to/from memory based on the
memory address:
Access any word location in a short and fixed amount of time
based on the address.
Random Access Memory (RAM) provides fixed access time
independent of the location of the word.
Access time is known as “Memory Access Time”.
Memory and processor have to “communicate” with each
other in order to read/write information.
In order to reduce “communication time”, a small amount of
RAM (known as Cache) is tightly coupled with the processor.
Modern computers have three to four levels of RAM units with
different speeds and sizes:
Fastest, smallest known as Cache
Slowest, largest known as Main memory.
14. Memory unit (contd.)
Primary storage of the computer consists of RAM units.
Fastest, smallest unit is Cache.
Slowest, largest unit is Main Memory.
Primary storage is insufficient to store large amounts of
data and programs.
Primary storage can be added, but it is expensive.
Store large amounts of data on secondary storage devices:
Magnetic disks and tapes,
Optical disks (CD-ROMS).
Access to the data stored in secondary storage is slower, but
we take advantage of the fact that some information may be
accessed infrequently.
Cost of a memory unit depends on its access time; a shorter
access time implies a higher cost.
15. Arithmetic and logic unit (ALU)
Operations are executed in the Arithmetic and Logic Unit
(ALU).
Arithmetic operations such as addition, subtraction.
Logic operations such as comparison of numbers.
In order to execute an instruction, operands need to be
brought into the ALU from the memory.
Operands are stored in general purpose registers available in
the ALU.
Access times of general purpose registers are shorter than those of the
cache.
Results of the operations are stored back in the memory or
retained in the processor for immediate use.
16. Output unit
•Computers represent information in a specific binary form. Output units:
- Interface with output devices.
- Accept processed results provided by the computer in specific binary form.
- Convert the information in binary form to a form understood by an
output device.
[Figure: the computer's processor and memory connect through the Output Unit to output devices in the real world (printer, graphics display, speakers, ...).]
17. Control unit
Operation of a computer can be summarized as:
Accepts information from the input units (Input unit).
Stores the information (Memory).
Processes the information (ALU).
Provides processed results through the output units (Output
unit).
Operations of Input unit, Memory, ALU and Output unit are
coordinated by Control unit.
Instructions control “what” operations take place (e.g. data
transfer, processing).
Control unit generates timing signals which determine
“when” a particular operation takes place.
18. How are the functional units connected?
•For a computer to achieve its operation, the functional units need to
communicate with each other.
•In order to communicate, they need to be connected.
[Figure: Input, Output, Memory, and Processor connected by a bus.]
•Functional units may be connected by a group of parallel wires.
•The group of parallel wires is called a bus.
•Each wire in a bus can transfer one bit of information.
•The number of parallel wires in a bus is equal to the word length of
a computer
19. Organization of cache and main memory
[Figure: main memory, cache memory, and processor connected by a bus.]
Why is the access time of the cache memory shorter than the
access time of the main memory?
29. Interrupt
Normal execution of programs may be interrupted if some
device requires urgent servicing
To deal with the situation immediately, the normal execution of
the current program must be interrupted
Procedure of interrupt operation
The device raises an interrupt signal
The processor provides the requested service by executing an
appropriate interrupt-service routine
The state of the processor is first saved before servicing the
interrupt
• Normally, the contents of the PC, the general registers, and some
control information are stored in memory
When the interrupt-service routine is completed, the state of the
processor is restored so that the interrupted program may
continue
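The save, service, and restore sequence described above can be sketched in a few lines of Python. This is only an illustrative model, not how real hardware is implemented: the names `run_with_interrupt`, `service_routine`, `pc`, and `acc` are hypothetical, and the "program" is just a list of numbers added to an accumulator.

```python
def run_with_interrupt(program_steps, interrupt_at, service_routine):
    """Add up program_steps, servicing one interrupt before step `interrupt_at`."""
    state = {"pc": 0, "acc": 0}  # hypothetical processor state: PC plus one register
    log = []
    while state["pc"] < len(program_steps):
        if state["pc"] == interrupt_at:
            saved = dict(state)           # save the processor state in "memory"
            service_routine(state, log)   # execute the interrupt-service routine
            state = saved                 # restore the state of the interrupted program
            interrupt_at = None           # the request has been serviced
        state["acc"] += program_steps[state["pc"]]  # normal execution resumes
        state["pc"] += 1
    return state["acc"], log


def service_routine(state, log):
    """A stand-in service routine that deliberately overwrites the registers."""
    state["acc"] = 999
    log.append("serviced")


total, log = run_with_interrupt([1, 2, 3], 1, service_routine)
print(total, log)
```

Because the state is saved before the routine runs and restored afterwards, the interrupted program produces the same total it would have produced without the interrupt.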
30. Classes of Interrupts
Program
Generated by some condition that occurs as a result of an
instruction execution such as arithmetic overflow, division
by zero, attempt to execute an illegal machine instruction,
or reference outside a user’s allowed memory space
Timer
Generated by a timer within the processor. This allows the
operating system to perform certain functions on a regular
basis
I/O
Generated by an I/O controller, to signal normal completion
of an operation or to signal a variety of error conditions
Hardware failure
Generated by a failure such as power failure or memory
parity error
31. Bus Structures
A group of lines that serves as a connecting path for several
devices is called a bus
In addition to the lines that carry the data, the bus must
have lines for address and control purposes
The simplest way to interconnect functional units is to use a
single bus, as shown below
32. Drawbacks of the Single Bus Structure
The devices connected to a bus vary widely in their speed of
operation
Some devices are relatively slow, such as printers and keyboards
Some devices are considerably faster, such as optical disks
Memory and processor units are the fastest parts of
a computer
An efficient transfer mechanism is thus needed to cope with this
problem
A common approach is to include buffer registers with the
devices to hold the information during transfers
This prevents a high-speed processor from being locked to a slow I/O
device during a sequence of operations.
Another approach is to use a two-bus structure and an
additional transfer mechanism
• A high-performance bus, a low-performance bus, and a bridge
for transferring the data between the two buses.
33. Software
In order for a user to enter and run an application
program, the computer must already contain some system
software in its memory
System software is a collection of programs that are
executed as needed to perform functions such as
Receiving and interpreting user commands
Running standard application programs such as word
processors or games
Managing the storage and retrieval of files in secondary
storage devices
Controlling I/O units to receive input information and
produce output results
34. Software
Translating programs from source form prepared by
the user into object form consisting of machine
instructions
Linking and running user-written application programs
with existing standard library routines, such as
numerical computation packages
System software is thus responsible for the
coordination of all activities in a computing system
35. Operating System
Operating system (OS)
This is a large program, or actually a collection of routines,
that is used to control the sharing of and interaction among
various computer units as they execute application programs
The OS routines perform the tasks required to assign computer
resources to individual application programs
These tasks include assigning memory and magnetic disk
space to program and data files, moving data between
memory and disk units, and handling I/O operations
In the following, a system with one processor, one disk, and one
printer is given to explain the basics of OS
Assume that part of the program’s task involves reading a
data file from the disk into the memory, performing some
computation on the data, and printing the results
38. Performance
The speed with which a computer executes programs
is affected by the design of its hardware and its
machine language instructions
Because programs are usually written in a high-level
language, performance is also affected by the
compiler that translates programs into machine
language
For best performance, the following factors must be
considered
Compiler
Instruction set
Hardware design
39. Performance
Processor circuits are controlled by a timing signal
called a clock
The clock defines regular time intervals, called clock cycles
To execute a machine instruction, the processor
divides the action to be performed into a sequence of
basic steps, such that each step can be completed in
one clock cycle
Let P be the length of one clock cycle; its inverse is the
clock rate, R = 1/P
Basic performance equation
T = (N × S)/R, where T is the processor time required to
execute a program, N is the number of instruction executions,
S is the average number of basic steps needed to execute
one machine instruction, and R is the clock rate
40. Performance Improvement
Pipelining and superscalar operation
Pipelining: overlapping the execution of successive
instructions
Superscalar: different instructions are concurrently
executed with multiple instruction pipelines. This means that
multiple functional units are needed
Clock rate improvement
Improving the integrated-circuit technology makes
logic circuits faster, which reduces the time needed
to complete a basic step
41. Problem 1:
A program contains 1000 instructions. Of these, 25% of the instructions require 4
clock cycles, 40% require 5 clock cycles, and the remaining 35% require 3
clock cycles for execution. Find the total time required to execute the program
on a 1 GHz machine.
Solution:
N = 1000
25% of N = 250 instructions require 4 clock cycles.
40% of N = 400 instructions require 5 clock cycles.
35% of N = 350 instructions require 3 clock cycles.
T = (N × S)/R = (250 × 4 + 400 × 5 + 350 × 3)/(1 × 10^9)
  = (1000 + 2000 + 1050)/(1 × 10^9) = 4.05 μs.
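The arithmetic in this solution can be checked with a short Python sketch. The function name `execution_time` and the representation of the instruction mix as `(count, cycles)` pairs are assumptions made for illustration only.

```python
def execution_time(mix, clock_rate_hz):
    """Return the run time in seconds for a list of (instruction_count, cycles) pairs."""
    total_cycles = sum(count * cycles for count, cycles in mix)  # this is N x S
    return total_cycles / clock_rate_hz                          # T = (N x S) / R


# Problem 1: 250 instructions at 4 cycles, 400 at 5 cycles, 350 at 3 cycles, 1 GHz clock.
seconds = execution_time([(250, 4), (400, 5), (350, 3)], 1e9)
print(f"{seconds * 1e6:.2f} microseconds")
```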
42. Problem 2:
For the following processor, obtain the performance.
Clock rate = 800 MHz
No. of instructions executed = 1000
Average no of steps needed / machine instruction = 20
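One way to approach this problem, assuming "performance" here means the processor time T from the basic performance equation T = (N × S)/R, is sketched below. The function name `processor_time` is hypothetical.

```python
def processor_time(n_instructions, steps_per_instruction, clock_rate_hz):
    """Basic performance equation: T = (N x S) / R, in seconds."""
    return (n_instructions * steps_per_instruction) / clock_rate_hz


# Problem 2: N = 1000, S = 20, R = 800 MHz.
t = processor_time(1000, 20, 800e6)
print(f"{t * 1e6:.1f} microseconds")
```

Under that reading, 1000 × 20 = 20,000 basic steps at 800 million steps per second take 25 μs.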
44. NUMBERS, ARITHMETIC OPERATIONS AND CHARACTERS
NUMBER REPRESENTATION
Numbers can be represented in 3 formats:
1) Sign and magnitude
2) 1's complement
3) 2's complement
In all three formats, MSB = 0 for positive numbers and MSB = 1 for negative numbers.
In the sign-and-magnitude system,
a negative value is obtained by changing the MSB of the corresponding
positive value from 0 to 1.
For example, +5 is represented by 0101 and
-5 is represented by 1101.
45. 1's complement system
In the 1's complement system, negative values are obtained by complementing each bit
of the corresponding positive number.
For example, -5 is obtained by complementing each bit in 0101 to yield 1010.
2's complement system
In the 2's complement system, negative values are obtained by complementing each bit
of the corresponding positive number and then adding 1.
For example, -5 is obtained by complementing each bit in 0101 and then adding 1 to
yield 1011. (In other words, the 2's complement of a number is obtained by
adding 1 to the 1's complement of that number.)
The 2's complement system yields the most efficient way to carry out
addition/subtraction operations.
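The three representations can be compared side by side with a small Python sketch. The function names are illustrative, and the 4-bit default width is chosen only to match the examples above.

```python
def sign_magnitude(value, bits=4):
    """Sign bit followed by the magnitude in the remaining bits."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def ones_complement(value, bits=4):
    """Negative values: complement every bit of the positive pattern."""
    mask = (1 << bits) - 1
    pattern = value if value >= 0 else ~abs(value) & mask
    return format(pattern, f"0{bits}b")


def twos_complement(value, bits=4):
    """Negative values: complement every bit of the positive pattern, then add 1."""
    mask = (1 << bits) - 1
    pattern = value if value >= 0 else (~abs(value) + 1) & mask
    return format(pattern, f"0{bits}b")


for encode in (sign_magnitude, ones_complement, twos_complement):
    print(encode.__name__, encode(5), encode(-5))
```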
46. ADDITION OF POSITIVE NUMBERS
Consider adding two 1-bit numbers.
The sum of 1 & 1 requires the 2-bit vector 10 to represent the value 2. We say
that the sum is 0 and the carry-out is 1.
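The four possible 1-bit additions can be enumerated with a minimal Python sketch. It assumes the usual logic formulation, in which the sum bit is the XOR of the inputs and the carry-out is their AND; the function name `add_bits` is illustrative.

```python
def add_bits(a, b):
    """Add two 1-bit numbers; return (sum_bit, carry_out)."""
    return a ^ b, a & b


for a in (0, 1):
    for b in (0, 1):
        sum_bit, carry_out = add_bits(a, b)
        print(f"{a} + {b}: sum = {sum_bit}, carry-out = {carry_out}")
```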
48. ADDITION & SUBTRACTION OF SIGNED NUMBERS
Subtraction using 2's complement
• In the first step, find the 2's complement of the subtrahend.
• Add this complement to the minuend.
• If the addition produces a carry, discard the carry; the result is positive.
Otherwise, the result is negative, and its magnitude is the 2's
complement of the sum.
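These steps can be traced with a short Python sketch. It assumes both operands are unsigned magnitudes that fit in `bits` bits; the function name `subtract` and the 4-bit default are illustrative. The complement is kept one bit wider than the operands so that a subtrahend of 0 still produces the carry the rule relies on.

```python
def subtract(minuend, subtrahend, bits=4):
    """Compute minuend - subtrahend using 2's complement addition."""
    mask = (1 << bits) - 1
    complement = (~subtrahend & mask) + 1  # step 1: 2's complement of the subtrahend
    total = minuend + complement           # step 2: add it to the minuend
    if total >> bits:                      # step 3: a carry means a positive result
        return total & mask                # discard the carry
    return -((~total & mask) + 1)          # no carry: negate the 2's complement of the sum


print(subtract(7, 5))  # 0111 + 1011 = 1 0010, carry discarded
print(subtract(5, 7))  # 0101 + 1001 = 1110, no carry
```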
50. MEMORY-LOCATIONS & ADDRESSES
Memory consists of many millions of storage cells (flip-flops).
Each cell can store a bit of information i.e. 0 or 1 (Figure 2.1).
Each group of n bits is referred to as a word of information, and n is called the word
length.
The word length can vary from 8 to 64 bits.
A unit of 8 bits is called a byte.
Accessing the memory to store or retrieve a single item of information (word/byte)
requires distinct addresses for each item location. (It is customary to use numbers from
0 through 2^k - 1 as the addresses of successive locations in the memory.)
If 2^k = no. of addressable locations, then the 2^k addresses constitute the address space of
the computer.
For example, a 24-bit address generates an address space of 2^24 (16,777,216) locations,
i.e. 16 MB if each location holds one byte.
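The relationship between address width and address-space size can be checked with a few lines of Python. The function name `address_space` is illustrative, and the conversion to megabytes assumes one byte per addressable location.

```python
def address_space(k):
    """Number of distinct locations addressable with a k-bit address."""
    return 2 ** k


locations = address_space(24)
print(locations)             # addresses run from 0 through locations - 1
print(locations // 2 ** 20)  # size in MB, assuming one byte per location
```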
52. BYTE-ADDRESSABILITY
In byte-addressable memory, successive addresses refer to successive byte
locations in the memory.
Byte locations have addresses 0, 1, 2, ...
If the word-length is 32 bits, successive words are located at addresses 0, 4,
8, ... with each word having 4 bytes.
53. BIG-ENDIAN & LITTLE-ENDIAN ASSIGNMENTS
There are two ways in which byte-addresses are arranged (Figure 2.3).
1) Big-Endian: Lower byte-addresses are used for the more significant bytes of
the word.
2) Little-Endian: Lower byte-addresses are used for the less significant bytes of
the word
• In both cases, byte-addresses 0, 4, 8, ... are taken as the addresses of successive
words in the memory.
54. Example: Consider a 32-bit integer (in hex), 0x12345678, which consists of 4
bytes: 12, 34, 56, and 78.
Hence this integer will occupy 4 bytes in memory.
Assume we store it starting at memory address 1000.
On little-endian, memory will look like
Address Value
1000 78
1001 56
1002 34
1003 12
On big-endian, memory will look like
Address Value
1000 12
1001 34
1002 56
1003 78
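The two layouts above can be reproduced in Python with the built-in `int.to_bytes` method. The helper name `memory_layout` and the dictionary representation of memory are assumptions made for illustration.

```python
def memory_layout(value, base_address, byteorder, size=4):
    """Map each byte address to the byte (as two hex digits) stored there."""
    data = value.to_bytes(size, byteorder=byteorder)
    return {base_address + offset: f"{byte:02X}" for offset, byte in enumerate(data)}


for order in ("little", "big"):
    print(order, memory_layout(0x12345678, 1000, order))
```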
55. WORD ALIGNMENT
Words are said to be aligned in memory if they begin at a byte-address that is a
multiple of the number of bytes in a word.
For example, if the word length is 16 bits (2 bytes), aligned words begin at
byte-addresses 0, 2, 4, ...
If the word length is 64 bits (8 bytes), aligned words begin at byte-addresses 0, 8, 16, ...
Words are said to have unaligned addresses if they begin at an arbitrary
byte-address.
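The alignment rule amounts to a single remainder check, sketched below in Python. The function name `is_aligned` is illustrative, and the sketch assumes the word length is a whole number of bytes.

```python
def is_aligned(byte_address, word_length_bits):
    """True if byte_address is a multiple of the number of bytes in a word."""
    word_bytes = word_length_bits // 8
    return byte_address % word_bytes == 0


print([address for address in range(20) if is_aligned(address, 16)])
print([address for address in range(20) if is_aligned(address, 64)])
```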
ACCESSING NUMBERS, CHARACTERS & CHARACTER STRINGS
A number usually occupies one word. It can be accessed in the memory by
specifying its word address. Similarly, individual characters can be accessed by their
byte-address.
There are two ways to indicate the length of the string:
1) A special control character with the meaning "end of string" can be used as the
last character in the string.
2) A separate memory word location or register can contain a number indicating the
length of the string in bytes.
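Both conventions can be illustrated with a brief Python sketch that treats a `bytes` object as memory. The choice of a zero byte as the "end of string" character and of a single length byte stored just before the string are assumptions for illustration; the slides do not fix either detail.

```python
def length_by_terminator(memory, start, terminator=0):
    """Count bytes from `start` until the end-of-string control character."""
    length = 0
    while memory[start + length] != terminator:
        length += 1
    return length


def length_by_prefix(memory, length_address):
    """Read the length from a separate location holding the byte count."""
    return memory[length_address]


terminated = b"CAT\x00"        # method 1: the string followed by a terminator
prefixed = bytes([3]) + b"CAT"  # method 2: a length byte, then the string
print(length_by_terminator(terminated, 0), length_by_prefix(prefixed, 0))
```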
56. MEMORY OPERATIONS
• Two memory operations are:
1) Load (Read/Fetch) &
2) Store (Write).
• The Load operation transfers a copy of the contents of a specific memory-location to the
processor. The memory contents remain unchanged.
Steps for Load operation:
1) Processor sends the address of the desired location to the memory.
2) Processor issues a “read” signal to the memory to fetch the data.
3) Memory reads the data stored at that address.
4) Memory sends the read data to the processor.
• The Store operation transfers information from a processor register to the specified
memory-location. This overwrites the original contents of that memory-location.
Steps for Store operation:
1) Processor sends the address of the memory-location where it wants to store data.
2) Processor issues a “write” signal to the memory to store the data.
3) Content of the register (MDR) is written into the specified memory-location.
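The difference between the two operations can be modelled with a minimal Python class. This is a sketch only: the class name `Memory` and its methods are hypothetical, and the address and control signals are reduced to ordinary method arguments.

```python
class Memory:
    """A toy word-addressable memory."""

    def __init__(self, size):
        self.cells = [0] * size

    def load(self, address):
        """Read: return a copy of the contents; the memory is left unchanged."""
        return self.cells[address]

    def store(self, address, value):
        """Write: overwrite the previous contents of the location."""
        self.cells[address] = value


memory = Memory(8)
memory.store(3, 42)           # the old contents of location 3 are lost
register = memory.load(3)     # the processor receives a copy
print(register, memory.load(3))
```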
57. INSTRUCTIONS & INSTRUCTION SEQUENCING
A computer must have instructions capable of performing 4 types of operations:
1) Data transfers between the memory and the registers (MOV, PUSH, POP,
XCHG).
2) Arithmetic and logic operations on data (ADD, SUB, MUL, DIV, AND, OR,
NOT).
3) Program sequencing and control (CALL, RET, LOOP, INT).
4) I/O transfers (IN, OUT).
58. REGISTER TRANSFER NOTATION (RTN)
The possible locations in which transfer of information occurs are:
1) Memory-location
2) Processor register
3) Registers in I/O device
59. ASSEMBLY LANGUAGE NOTATION
To represent machine instructions and programs, assembly language
format is used.
61. INSTRUCTION EXECUTION & STRAIGHT LINE SEQUENCING
There are 2 phases for Instruction Execution:
1) Fetch Phase: The instruction is fetched from the memory-location whose address
is in the PC and placed in the IR (instruction register).
2) Execute Phase: The contents of the IR are examined to determine which operation
is to be performed. The specified operation is then performed by the
processor.
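The two phases can be sketched as a loop over a toy program in Python. The instruction names `LOADI`, `ADD`, and `HALT`, the tuple encoding, and the register names are all hypothetical and do not correspond to any real instruction set.

```python
def run(program, registers):
    """Execute a toy program with straight-line sequencing."""
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        ir = program[pc]    # fetch phase: copy the instruction into the IR
        pc += 1             # the PC now points at the next instruction
        op, *operands = ir  # execute phase: examine the contents of the IR
        if op == "LOADI":
            destination, value = operands
            registers[destination] = value
        elif op == "ADD":
            destination, source = operands
            registers[destination] += registers[source]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown operation: {op}")
    return registers


program = [("LOADI", "R0", 2), ("LOADI", "R1", 3), ("ADD", "R0", "R1"), ("HALT",)]
print(run(program, {"R0": 0, "R1": 0}))
```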
64. CONDITION CODES
• The processor keeps track of information about the results of various operations.
This is accomplished by recording the required information in individual bits, called
Condition Code Flags.
• These flags are grouped together in a special processor-register called the condition
code register (or status register).
• Four commonly used flags are:
1) N (negative) set to 1 if the result is negative, otherwise cleared to 0.
2) Z (zero) set to 1 if the result is 0; otherwise, cleared to 0.
3) V (overflow) set to 1 if arithmetic overflow occurs; otherwise, cleared to 0.
4) C (carry) set to 1 if a carry-out results from the operation; otherwise cleared to 0.
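How the four flags might be derived for an addition is sketched below in Python. The function name `add_with_flags` and the 8-bit default width are assumptions; the overflow test uses the common rule that overflow occurs when both operands have the same sign and the result's sign differs.

```python
def add_with_flags(a, b, bits=8):
    """Add two `bits`-wide values; return (result, flags) with N, Z, V, C."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "N": int(bool(result & sign)),                      # MSB of the result is 1
        "Z": int(result == 0),                              # result is zero
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),     # signed overflow
        "C": int(total > mask),                             # carry out of the MSB
    }
    return result, flags


print(add_with_flags(0x7F, 0x01))  # largest positive value plus one
print(add_with_flags(0xFF, 0x01))  # wraps around to zero
```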
Editor's Notes
#13: Since the instructions and data need to be fetched from the memory in order to perform a task, the time it takes to access and fetch this information is one factor influencing how fast a given task will complete. One way to increase the speed of performing a task is to reduce the amount of time it takes to fetch the data and the instructions. This time is called the “access time”.
Suppose we want to fetch the data at the memory location with address 10. In the case of sequential access, we have to access locations 1-9 and then access location 10. Clearly, with sequential access the access time increases as memory locations with higher addresses are accessed. We need some kind of memory that provides a fixed and short access time irrespective of the memory location being accessed. That is, it provides random access.
Why is the access time faster for the Cache than it is for primary storage? I haven’t yet discussed how the various units communicate with each other. In a few minutes I will discuss that, and it will become clear.
#18:What is a word? What is a word length? During the discussion of which functional unit did we come across this concept?