Microprocessor
Microcomputers
In a microcomputer, the central processing unit (CPU) is fabricated on a single
integrated circuit, and is called a microprocessor. A microcomputer uses other
microcircuits, but the microprocessor is the most complex. The use of microprocessors
meant that the computer manufacturer no longer had to design the CPU. It was a
standard, off-the-shelf component.
As newer, more powerful microprocessors become available, they are quickly
implemented into smaller, more powerful, low-cost microcomputers. The microprocessor
provides a complete computer processor, with a standardized instruction set and
standardized signals. The semiconductor manufacturers standardized on TTL signal
levels.
Microprocessor Components
The four major components in a microprocessor are listed as follows:
1. A bank of registers for holding information
2. An arithmetic logic unit (ALU) for processing the information
3. A bus interface for moving information into and out of the microprocessor
4. Control logic for managing the operation of the microprocessor (the control logic
instructs the bus interface to get an instruction)
The microprocessor connects to the rest of the system through three buses:
1. A bi-directional data bus, which is implemented with tri-state logic devices to allow
the use of a direct-memory-access controller or other similar chips.
2. A unidirectional address bus, connected internally, within the microprocessor,
to the address pointers and the program counter. The address bus is also implemented
in tristate logic.
3. A control bus, which carries the various synchronization signals to and from the
microprocessor. Control lines are not necessarily tristate.
All the usual system components are connected to these three buses. The basic
components are shown in Fig. They include the ROM, the RAM, and the I/O chips.
The ROM is the Read-Only Memory. It stores the programs that the microprocessor
needs to power-up. The RAM is the Random-Access Memory. It is a read-write MOS
memory which stores temporary data and programs. The input-output chips are used for
such functions as multiplexing the data bus for two or more input-output ports. These
ports may be connected directly to input-output devices, or to device controllers, which
may require the use of interface circuits.
The interface circuits or interface chips required to connect this basic system to the I/O
devices are attached to these buses, which include the microprocessor buses or
special input-output buses.
Interfacing techniques are the methods required to connect this system to the various
input-output devices. The basic interfacing techniques required to connect any
microprocessor system to input-output devices are similar.
At the level of the microprocessor itself, the logical and electrical interface required is
similar. Many standard microprocessors of the same data width have essentially the
same data bus and the same address bus. The main difference is the control bus. It is
the specific characteristics of the control bus which make input-output interface chips
compatible or incompatible from one microprocessor to the next.
Microprocessor Control Signals
We have seen that the microprocessor uses three buses: a bi-directional data bus, a
unidirectional address bus, and a control bus with the needed signals. The data bus
is essentially identical for microprocessors with the same data width. It is a bi-directional
bus, normally implemented in tristate logic. The address bus is a unidirectional bus,
used to select memory or a device external to the microprocessor. The third bus, the
control bus, is the most complex. It carries the microprocessor control signals (the
interface signals). The control bus has four functions:
1. Memory synchronization
2. Input-output synchronization
3. Microprocessor scheduling (interrupts)
4. Utilities, including clock and reset
Memory and input-output synchronization are similar. A handshake procedure is used. In
a read operation, a ready status or signal is needed to indicate the availability of data.
Data can then be transferred on the data bus. In some types of input-output devices, an
acknowledge signal is generated to confirm the receipt of data. In a write operation, the
availability of the external device is verified through a status bit or signal, and the data is
then placed on the data bus. An acknowledge signal may also be generated by the
device to confirm the receipt of data.
The use of an acknowledge signal or handshake is typical in an asynchronous
procedure. In a synchronous procedure, all events take place in a specified period of
time; so there is no need to acknowledge a transmission. In an asynchronous system,
an acknowledge signal is needed to verify a transmission.
The use of synchronous versus asynchronous communication depends on a number of
factors. A synchronous bus has the potential for higher speed and a lower number of
control lines, but it places speed requirements on the external devices. An
asynchronous bus is more complex and requires more logic, but allows more flexibility
for device speeds in the system.
Data Bus Operation
The microprocessor data bus is a bi-directional, three-state bus. It is the same as a
single-line bus except that there are eight lines instead of one. To use all of the data bus
lines, each talker must have eight drivers, one for each line, and each
listener must have eight inputs.
The microprocessor and RAM act as both talkers and listeners. Input ports act as talkers
since they take inputs from outside the system and place them on the bus. Output ports
act as listeners since they take data off the bus and send it outside the system. A ROM
acts only as a talker.
The microprocessor, ROM, RAM, and input ports use three-state drivers on their
outputs. Chip Select (CS) inputs are used to enable the drivers and allow the data from
the selected device to appear on the data bus.
The microprocessor acts as the main controller for the system. It allows only one device
to use the bus at any given time. When the microprocessor needs to read data from
ROM, it first disables its own data outputs and then generates the control signals needed
to enable the ROM. The ROM's outputs then appear on the data bus and the
microprocessor reads the data. Reading RAM or an input port is done in a similar way.
To write data to a device, such as RAM or an output port, the microprocessor first places
the data to be written onto the data bus. It then generates control signals that send a
write pulse to the device. This write pulse causes the device to internally latch the data.
In most cases, data flows through the microprocessor. In a data transfer from an input
port to RAM, the microprocessor will first read the data from the input port and then write
it to RAM. Since the data cannot be transferred directly from the input port to the RAM, it
is temporarily stored in the microprocessor.
To summarize, the data bus is used for all transfers of data within the microprocessor
system. All devices share the same bus. The control logic, operating from signals
generated by the microprocessor, directs each device as to when it should place data on
the bus or read data from the bus.
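The bus-sharing rule summarized above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class name TriStateBus, the helper copy_port_to_ram, and the device names are hypothetical, and a value of None stands in for the high-impedance state of a bus that nobody is driving.

```python
class TriStateBus:
    """Shared 8-bit data bus: at most one talker may drive it at a time."""

    def __init__(self):
        self.driver = None   # name of the device whose drivers are enabled
        self.value = None    # None models the high-impedance (undriven) state

    def drive(self, device, byte):
        # Two enabled drivers at once would be bus contention.
        if self.driver is not None and self.driver != device:
            raise RuntimeError(f"bus contention: {self.driver} vs {device}")
        self.driver = device
        self.value = byte & 0xFF

    def release(self, device):
        # Disabling the drivers returns the bus to high impedance.
        if self.driver == device:
            self.driver = None
            self.value = None


def copy_port_to_ram(bus, input_port_value, ram, address):
    """Input port -> CPU register -> RAM, as two separate bus cycles."""
    bus.drive("input_port", input_port_value)  # read cycle: the port talks
    register = bus.value                       # the CPU latches the byte
    bus.release("input_port")
    bus.drive("cpu", register)                 # write cycle: the CPU talks
    ram[address] = bus.value                   # RAM latches on the write pulse
    bus.release("cpu")
    return register
```

The two-step transfer mirrors the text: the data passes through the microprocessor because the input port and the RAM are never enabled in the same bus cycle.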
Address Bus Operation
A set of lines is used by the microprocessor to specify where information will come from
or go to on the data bus. This is the address bus used by the microprocessor. The
memory is divided into locations and each data storage location has a unique address.
This address is specified on the address bus when the microprocessor needs to get or
place information at that location.
The address bus is controlled by the microprocessor. Addresses are never sent to the
microprocessor over the address bus by another device. However, the microprocessor may
be asked to release the address bus so that other devices may use it. This capability is
required for direct memory access (DMA). DMA is an input/output technique designed to
speed up certain types of data operations.
Eight-bit processors usually have 16-bit address buses made up of 16 individual address
lines. These 16 bits give the processor an addressing capability of 65,536 locations. In
an 8-bit processor, this allows a storage capacity of 65,536 bytes of information. In the
early microprocessors, memory was expensive and this was more than most
microprocessor systems could afford to use. This amount of memory is very inexpensive
today and many current programs require much more than this to function properly.
Sixteen-bit processors offer a greater addressing range. Some 16-bit processors, such
as the Intel 8086, use 20 address lines, giving them an addressing capability of
1,048,576 locations, while others have 24 address lines and provide 16,777,216
locations. Although these seemed more than adequate when these processors were
introduced, PCs with several megabytes of memory are common.
Thirty-two-bit processors with 32-bit address buses have address spaces in excess of 4
billion locations.
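The figures quoted in this section all follow from the rule that n address lines select 2^n locations. A short check of the arithmetic (the function name address_space is only illustrative):

```python
def address_space(address_lines):
    """Number of distinct locations selectable with the given address lines."""
    return 2 ** address_lines


for lines in (16, 20, 24, 32):
    print(f"{lines} address lines -> {address_space(lines):,} locations")
# 16 address lines -> 65,536 locations
# 20 address lines -> 1,048,576 locations
# 24 address lines -> 16,777,216 locations
# 32 address lines -> 4,294,967,296 locations
```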
Control Bus Operation
The control bus uses the timing information to synchronize the other devices with the
internal operation of the microprocessor. Memory and I/O devices are told when the address
bus has a valid address and when to place data on the data bus. The processor can
then read this data. The memory and I/O devices also need to know when the processor
has placed information on the data bus so it can be accepted.
Some lines of the control bus are bi-directional, while others are not. Some of the signal
lines on the control bus are driven by the microprocessor, while other signal lines are
driven by other devices in the system. The control bus is not as uniform as the data lines
in the data bus or the address lines in the address bus. It is a mixture of timing, data
direction, and other functions.
Each microprocessor operates with its own set of control lines, but most have two
groups of signals: the interrupt lines and the DMA lines. These two groups of lines on the control bus have
control over the microprocessor in certain situations. They are able to take over control
of the microprocessor when they are called upon.
Interrupts allow external devices to stop the normal operation of the processor so that
another task can be started. This task can be the transfer of small amounts of data or
large routines or programs.
DMA (direct memory access) is a hardware technique where special hardware takes
control of the bus from the processor for the required period of time to complete a data
transfer. This special hardware can usually perform the data transfer much faster than
the general-purpose processor, so this technique is often used for high speed
input/output transfers.
Microprocessor Bus Characteristics
The microprocessor component level bus is made up of the three sub buses: data bus,
address bus, and control bus. Each of these three sub buses has a critical task which is
needed for the proper operation of the microprocessor.
All data entering or leaving the microprocessor does so over this bus, so a data bus is
needed. This bus is used to move information to be processed into the microprocessor
and move the processed information out of the microprocessor. The data bus has bi-
directional data lines so data may flow either into or out of the processor, but only in one
direction at a time. The direction that the information flows is controlled by the control
bus.
Microprocessors can be characterized by the size of their data buses. If the data bus of
a microprocessor is 8 bits wide, the microprocessor is known as an 8-bit microprocessor.
The Intel 8080 and 8085, the Motorola 6800, 6801, and 6802, and the Zilog Z80
microprocessor are all 8-bit microprocessors since they have 8-bit data buses and
internally they process information in 8-bit chunks.
The 32-bit chips include Motorola's 68020 and 68030 and Intel's 80386 and 80486. The
Motorola 68000 series is used in the Apple Macintosh. The Macintosh IIfx, which was
introduced in 1990, uses a 68030 running at 40 MHz. The Intel Pentium has a 64-bit
data bus and a 32-bit address bus. There are also smaller microprocessors which have
4-bit data buses. These are often used in dedicated control applications.
The size of the data bus used by the microprocessor usually indicates the size of the
data word the processor is designed to manipulate. An 8-bit data bus generally indicates
that the microprocessor processes data 8 bits at a time.
Some microprocessors like the Motorola 6809 and Intel 8088 have 8-bit data buses but
internally they are designed to work with 16-bit data chunks.
The reason for limiting the size of the data bus while the processor works with larger
quantities internally is cost. The chips are less expensive to make.
Microprocessor I/O Techniques
The three types of I/O techniques which the microprocessor uses to communicate with
the external world are programmed I/O, interrupt I/O, and Direct Memory Access (DMA).
Programmed I/O is a microprocessor-initiated I/O transfer. The data transfer between the
microprocessor and an external device is controlled by the microprocessor. A program
must be executed by the microprocessor to accomplish this.
Interrupt I/O is device-initiated. An external device is connected to the interrupt pin of the
microprocessor. In order to transfer data, the device changes the state on the interrupt
pin. The microprocessor completes execution of the current instruction, saves its place in
the program (typically the program counter), and executes an interrupt service routine to complete the
transfer.
Direct memory access is also device-initiated. Data transfer between the microprocessor
memory and the I/O device occurs without the microprocessor. Special DMA control
circuits are used to complete the transfer.
Addressing
The data bus is used by different devices to exchange data, so a method is needed by
the microprocessor to select the particular device that communicates with the data bus.
The address bus (with the aid of the control bus) provides this function.
The address bus is unidirectional, so its operation is simpler than the data bus. Each
memory location has a unique address. Before a data transfer can take place on the
data bus, the microprocessor sends out an address. This address specifies the memory
location that the processor needs to access. This allows the microprocessor to select
any part of the system it needs to communicate with.
An address bus with 16 lines allows direct addressing of 2^16, or 65,536, memory locations
and I/O ports. The 16 lines are usually labeled from the least significant bit to the most
significant as
A0, A1, A2, A3, ..., A15.
Address Decoders
An address decoder, which is a part of the control logic, generates device-select signals
when a certain address or range of addresses is present on the address bus. Figure
shows an address decoder for address 3000 hex (0011 0000 0000 0000 binary). The
output of the decoder is TRUE only when this address is present on the address bus.
This output is then used to enable the port that is assigned to address 3000.
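The behavior of such a decoder can be sketched in a few lines. The function make_decoder and its mask parameter are hypothetical names. The single-address case matches the 3000 hex example above, while the masked case is an assumed illustration of decoding a range of addresses, with only the upper four address lines compared.

```python
def make_decoder(base, mask=0xFFFF):
    """Return a function that is True when the masked address matches base.

    mask selects which address lines the decoder actually examines;
    0xFFFF means all 16 lines, so exactly one address is decoded.
    """
    def select(address):
        return (address & mask) == (base & mask)
    return select


port_select = make_decoder(0x3000)                # one address: 3000 hex
block_select = make_decoder(0x3000, mask=0xF000)  # range 3000-3FFF hex (assumed)

print(port_select(0x3000), port_select(0x3001))    # True False
print(block_select(0x3ABC), block_select(0x4000))  # True False
```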
The address bus selects the memory location or I/O port and the data bus carries the
data. The entire process is coordinated by the control bus with its control signals.
The microprocessor will use signals like READ* and WRITE* (the asterisk marks an active-low signal). When READ* is low, it
indicates a read operation is taking place, and the microprocessor signals the addressed
device to place data on the data bus. If WRITE* is low, then a write operation is taking
place, and the microprocessor places data on the data bus and signals the addressed
device to store this data.
Each wire in the control bus has a unique function; in the address and data buses, each
line carries the same type of information: 1 bit of the address or data.
The actual control signals in some microprocessors can differ, but the data transfers are
the same. They are just achieved in different ways.
Input and Output Ports
Suppose an output port latch has an address of 2000. The latch is clocked when
address 2000 is present on the address bus and a low-to-high switching takes place on
the WRITE* control signal. When the latch is clocked the data from the data bus is
stored on it. The microprocessor causes data specified by the software program to be at
the output of the latch by writing the data to address 2000.
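The clocking rule for the output latch can be modeled as follows. This is a minimal sketch: the class name OutputPort and the convention of passing WRITE* as 0 (low, active) or 1 (high, inactive) are assumptions made for the example.

```python
class OutputPort:
    """Output latch clocked by a low-to-high transition on WRITE*."""

    def __init__(self, address):
        self.address = address
        self.latched = 0
        self._prev_write_n = 1  # WRITE* idles high (inactive)

    def clock(self, address_bus, data_bus, write_n):
        # The latch captures the data bus only when its address is decoded
        # and WRITE* goes from low back to high.
        rising_edge = self._prev_write_n == 0 and write_n == 1
        if rising_edge and address_bus == self.address:
            self.latched = data_bus & 0xFF
        self._prev_write_n = write_n
        return self.latched


port = OutputPort(0x2000)
port.clock(0x2000, 0xA5, write_n=0)              # WRITE* goes low: nothing latched yet
print(hex(port.clock(0x2000, 0xA5, write_n=1)))  # 0xa5
```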
Input ports are handled in a similar way: the output of the address decoder is ANDed
with READ instead of WRITE to generate the port enable. The input port is usually an
eight-line three-state driver. It places the input signals on the data bus when enabled
(Fig. 2.10b). The microprocessor reads these input signals by performing a read operation
from the proper address. The processor can then store this data in one of its registers.
MEMORIES
ROM Addressing
A ROM can be viewed as a device with many 8-bit input ports on a single chip, with one
port for each memory location. When the ROM is programmed, the ROM memory
locations are permanently set in a pattern of 1s and 0s. In a RAM, each memory location
can be viewed as having both an input port and an output port combined.
In a ROM system, the low-order bits of the address bus are connected to an address
decoder that selects one location within the ROM. The high-order 8 bits of the address bus
are decoded by another address decoder to enable the ROM when a desired range of
addresses is present on the upper half of the address bus. In a system with a 16-bit address bus, the lower
8 bits would go to one address decoder and the upper 8 bits to the other address
decoder.
The READ signal is ANDed with the address decoder output to generate the ROM
enable. This is the same technique used for input ports.
The address lines must indicate which memory chip should be selected and which word
within that chip should be addressed. The addresses are shown in binary. The lower 8
bits of address designate the location within each chip, and the upper 8 bits designate
which chip is being addressed. Bits 8 and 9 are used to distinguish one chip from
another. The lower 8 bits of address would be connected to the address lines of all four
ROMs, since these bits specify the location within the chip. The address decoder then
checks the upper 8 bits of address and generates the chip selects.
There are variations to this approach, but the basic principles remain the same:
1. The low-order address bits are connected to the memory's address lines.
2. The high-order bits are decoded to generate the chip selects. No more than one
chip can be selected at any given time.
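The two principles above can be sketched for the four-ROM example. The function name decode_rom_address is hypothetical, and because the figure is not reproduced here, the value that address bits 10 to 15 must hold to select this bank of ROMs (bank_bits, all zeros by default) is an assumption.

```python
def decode_rom_address(address, bank_bits=0b000000):
    """Split a 16-bit address into four chip selects and an in-chip offset.

    A0-A7 go straight to every ROM, A8-A9 pick one of four chips, and
    A10-A15 must equal bank_bits (an assumed value) for any chip to be selected.
    """
    offset = address & 0xFF          # location within the chip
    chip = (address >> 8) & 0b11     # which of the four ROMs
    upper = (address >> 10) & 0x3F   # remaining high-order bits
    selects = [upper == bank_bits and chip == n for n in range(4)]
    return selects, offset


selects, offset = decode_rom_address(0x0234)
print(selects, hex(offset))  # [False, False, True, False] 0x34
```

Because the chip number comes from a single 2-bit field, at most one entry of selects can be True for any address.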
RAM Addressing
RAMs are decoded in a similar way, but additional control signals are needed to write
input to the RAM and to read output from the RAM. RAMs have a WRITE* input in
addition to the CS* input (chip select). In order for the RAM to be controlled, CS* must
be low for either a read or write to take place. If WRITE* is high (not TRUE) when CS* is
low, the RAM outputs data to the data bus so the processor can read it. This is done
when CS* enables the RAM's three-state output drivers. If WRITE* is low, CS* will not
turn on the RAM's output drivers. Instead, the data on the data bus is written into the
memory at the location designated by the address bus.
The bus gating is shown in Fig. 2.10e. Chip select (CS*) is low when the RAM address
select and either READ* or WRITE* are low. The WRITE* line is connected to the RAM's
WRITE* input. The WRITE* input is internally gated with the CS* input, so it will be
ignored unless CS* is low.
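The CS* and WRITE* rules for a RAM can be summarized in a small behavioral model. The class name Ram, its size, and the use of None to represent disabled (high-impedance) outputs are assumptions made for this sketch.

```python
class Ram:
    """Behavioral model of a RAM with active-low CS* and WRITE* inputs."""

    def __init__(self, size=256):
        self.cells = [0] * size

    def cycle(self, address, cs_n, write_n, data_in=None):
        if cs_n == 1:
            # Not selected: WRITE* is ignored and the outputs stay disabled.
            return None
        if write_n == 0:
            # Selected with WRITE* low: store the data bus, outputs stay off.
            self.cells[address] = data_in & 0xFF
            return None
        # Selected with WRITE* high: drive the stored byte onto the data bus.
        return self.cells[address]


ram = Ram()
ram.cycle(5, cs_n=0, write_n=0, data_in=0x42)  # write cycle
print(hex(ram.cycle(5, cs_n=0, write_n=1)))    # 0x42
```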
Static and Dynamic RAMs
RAMs are used for holding data and running programs in microcomputer systems.
ROMs provide a means of storing programs and data permanently, whereas RAMs lose their contents when
power is removed.
The two different types of RAMs are static and dynamic. Static RAMs (SRAMs) use a
flip-flop circuit for each memory element. Each 1K of RAM has 1024 flip-flops. Each flip-
flop can be set to store a 1 or reset to store a 0.
Address decoding circuits in the RAM chip select the particular flip-flop specified by the
address lines. The state of the flip-flop does not change until different data is stored in it
or power to the RAM is interrupted.
Dynamic RAMs (DRAMs) use a storage capacitor. When a charge is stored in the
capacitor, this indicates a 1; no charge indicates a 0. This technique reduces the size of
the storage cell, and allows denser memory chips. Since the charge leaks off the
capacitor, it must be refreshed. Refreshing consists of rewriting the data. All of the 1 bits
are restored to full charge and the 0 bits to no charge.
Cache Basics
High-performance processors like the 68020 and 68030 place a great demand on the
bandwidth of memory systems. The newer integrated circuit implementations have
greatly reduced the cycle times of processors. Although computer memories have
improved in performance, the ratio of memory access time to processor cycle time has
continued to increase.
Shared-memory, multiple-processor systems have also become more common and have
increased the bandwidth requirements of system memories and the buses that connect
them. This has resulted in increasing use of cache memories.
Caches are high-speed buffer memories that hold the most frequently used instructions
and data for quick access by the processor. Caches exploit the locality of memory
references. Computer programs tend to execute instructions stored in close proximity to
each other. Programs also exhibit a locality in that they tend to access a small subset of
the entire data set a number of times in any time period. The cache hardware tracks the
data accessed by the processor and saves it on the likelihood that it will be requested
again. Typically, the cache memory is 20 to 1000 times smaller in size than the system
memory and 5 to 20 times faster.
Cache memories have been used in many computers. A cache memory was used in the
IBM System 360, and the DEC PDP-11/70 followed. The Motorola 68030 processor uses
two on-chip caches.
The cache memory is made up of a number of lines or blocks that can hold the contents
of the corresponding elements of the system memory. When the processor issues a
memory reference, it is checked against the contents of the cache. If the data is already in
the cache due to a previous access to system memory, it is sent back to the processor.
This is called a cache hit. A cache miss requires that the data be fetched from the system
memory and sent to the cache.
The cache is made up of two parts. One memory array is for the cached data, and each
element in the array is a cache line or block. The other array is used for the cache
directory. The memory addresses from the processor have three fields: tag, index, and byte
number. The index allows access to both the directory and data arrays.
The contents of a previously stored tag are compared with the present tag. A match
indicates a hit and a nonmatch indicates a miss. In a miss the address is forwarded to
the system memory and the data returned from memory overwrites the data in the
cache. The tag in the directory is also updated.
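The tag comparison described above corresponds to a direct-mapped cache, which can be sketched as follows. The class name DirectMappedCache and the field widths (4 index bits, 2 byte-number bits) are hypothetical values chosen to keep the example small, and system memory is modeled as a plain list.

```python
class DirectMappedCache:
    """Direct-mapped cache with separate directory (tags) and data arrays."""

    def __init__(self, index_bits=4, byte_bits=2):
        self.index_bits = index_bits
        self.byte_bits = byte_bits
        lines = 1 << index_bits
        self.directory = [None] * lines  # stored tag for each line
        self.data = [None] * lines       # cached bytes for each line

    def split(self, address):
        """Break an address into its tag, index, and byte-number fields."""
        byte = address & ((1 << self.byte_bits) - 1)
        index = (address >> self.byte_bits) & ((1 << self.index_bits) - 1)
        tag = address >> (self.byte_bits + self.index_bits)
        return tag, index, byte

    def read(self, address, memory):
        """Return (byte, hit). On a miss the line and its tag are replaced."""
        tag, index, byte = self.split(address)
        hit = self.directory[index] == tag
        if not hit:
            base = address - byte  # first address of the line
            line_len = 1 << self.byte_bits
            self.data[index] = [memory[base + i] for i in range(line_len)]
            self.directory[index] = tag
        return self.data[index][byte], hit


memory = list(range(256))
cache = DirectMappedCache()
print(cache.read(0x13, memory))  # (19, False): first access misses
print(cache.read(0x11, memory))  # (17, True): same line, so a hit
```

An address that shares the index but carries a different tag (0x53 in this configuration) evicts the line, so a later read of 0x13 misses again.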
CPU ORGANIZATION (PENTIUM FAMILY) 64 bit ARCHITECTURE
Pentium
The Pentium is the next generation of the 386 and 486 microprocessor family. It is binary
compatible with the 8086/88, 80286, 386 DX, 386 SX, 486 DX, 486 SX and DX2. The
Pentium processor has all of the features of the 486 with the following enhancements
and additions: superscalar architecture, dynamic branch prediction, pipelined floating-point
unit, improved instruction execution, separate code and data caches, write-back
data cache, 64-bit data bus, bus cycle pipelining, and address parity and internal parity
checking.
The instruction set of the Pentium includes the 486 instruction set with extensions for the
additional functions of the Pentium. Software written for the 386 and 486 can run on the
Pentium. The on-chip memory management unit (MMU) is also compatible with the 386
and 486.
The Pentium has two instruction pipelines and a floating-point unit that are capable of
independent operation. Each pipeline issues frequently used instructions in a single
clock; the two pipelines can issue two integer instructions in one clock or one to two
floating point instructions in one clock. Branch prediction is accomplished in the Pentium
with two prefetch buffers. The floating-point unit has faster algorithms to speed some
math operations up to 10 times.
Caches
The Pentium processor has separate code and data caches on the chip. Each cache is 8
Kbytes, with a 32-byte line size and 2-way set associative. Each cache uses a
Translation Lookaside Buffer (TLB) to translate linear addresses to physical addresses.
The data cache uses write-back or write-through on a line-by-line basis. The cache tags
are triple-ported to support two data transfers and an inquire cycle in the same clock.
The data bus is 64 bits which improves the data transfer rate. Burst read and write-back
cycles are supported as well as bus cycle pipelining which allows two bus cycles to take
place simultaneously.
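The cache parameters quoted above (8 Kbytes, 32-byte lines, 2-way set associative) determine how an address divides into fields. The following sketch works out that arithmetic. The function name cache_geometry is illustrative, all sizes are assumed to be powers of two, and the 32-bit address width is taken from the Pentium's address bus.

```python
def cache_geometry(size_bytes, line_bytes, ways, address_bits=32):
    """Derive line count, set count, and address field widths for a cache."""
    lines = size_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1  # log2 for a power of two
    index_bits = sets.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


print(cache_geometry(8 * 1024, 32, 2))
# {'lines': 256, 'sets': 128, 'offset_bits': 5, 'index_bits': 7, 'tag_bits': 20}
```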
Test Functions
The Pentium processor uses functional redundancy checking for error detection. This is
done for the processor and the interface to the processor. In functional redundancy
checking, a second processor acts as the checker. It runs in parallel with the processor
being tested. The checker samples the processor's outputs and compares them for a
match. It signals an error condition if a match does not occur.
Since more functions have been placed on the chip, board-level testing becomes
difficult. So, the Pentium processor has increased test and debug capability. Like
some 486 CPUs, the Pentium uses IEEE boundary scan (Standard 1149.1). There are four
breakpoint pins for the debug registers. These can be used for a breakpoint match.
Signal Functions
Figure shows a block diagram of the Pentium processor, which is a 32-bit
microprocessor with 32-bit addressing and a 64-bit data bus. A 273-pin grid array
package is used.
Pentium Processor Registers
(a) Integer Unit
Type                 Number  Length (bits)  Purpose
General              8       32             General-purpose user registers
Segment              6       16             Contain segment selectors
Flags                1       32             Status and control bits
Instruction Pointer  1       32             Instruction pointer
(b) Floating-Point Unit
Type                 Number  Length (bits)  Purpose
Numeric              8       80             Hold floating-point numbers
Control              1       16             Control bits
Status               1       16             Status bits
Tag Word             1       16             Specifies contents of numeric registers
Instruction Pointer  1       48             Points to instruction interrupted by exception
Data Pointer         1       48             Points to operand interrupted by exception
• General: There are eight 32-bit general-purpose registers.
These may be used for all types of Pentium instructions; they can also hold operands for
address calculations. In addition, some of these registers also serve special purposes.
For example, string instructions use the contents of the ECX, ESI, and EDI registers as
operands without having to explicitly reference these registers in the instruction. As a
result, a number of instructions can be encoded more compactly.
• Segment: The six 16-bit segment registers contain segment selectors, which index
into segment tables. The code segment (CS) register references the
segment containing the instruction being executed. The stack segment (SS)
register references the segment containing a user-visible stack. The remaining
segment registers (DS, ES, FS, GS) enable the user to reference up to four
separate data segments at a time.
• Flags: The EFLAGS register contains condition codes and various mode bits.
• Instruction Pointer: Contains the address of the current instruction.
There are also registers specifically devoted to the floating-point unit.
• Numeric: Each register holds an extended-precision 80-bit floating-point number.
There are eight registers that function as a stack, with push and pop operations
available in the instruction set.
• Control: The 16-bit control register contains bits that control the operation of the
floating-point unit, including the type of rounding control; single, double, or extended
precision; and bits to enable or disable various exception conditions.
• Status: The 16-bit status register contains bits that reflect the current state of the
floating-point unit, including a 3-bit pointer to the top of the stack; condition codes
reporting the outcome of the last operation; and exception flags.
• Tag Word: This 16-bit register contains a 2-bit tag for each floating-point numeric
register, which indicates the nature of the contents of the corresponding register.
The four possible values are valid, zero, special (NaN, infinity, denormalized) and
empty. These tags enable programs to check the contents of a numeric register
without performing complex decoding of the actual data in the register.
The use of most of the above registers is easily understood. Let us elaborate briefly on
several of the registers.
EFLAGS Register
The EFLAGS register indicates the condition of the processor and
helps to control its operation. It includes the six condition codes defined in Table 9.8
(carry, parity, auxiliary, zero, sign, overflow), which report the results of an integer
operation. In addition, there are bits in the register that may be referred to as control bits;
these are
ID   = Identification Flag          DF = Direction Flag
VIP  = Virtual Interrupt Pending    IF = Interrupt Enable Flag
VIF  = Virtual Interrupt Flag       TF = Trap Flag
AC   = Alignment Check              SF = Sign Flag
VM   = Virtual 8086 Mode            ZF = Zero Flag
RF   = Resume Flag                  AF = Auxiliary Carry Flag
NT   = Nested Task Flag             PF = Parity Flag
IOPL = I/O Privilege Level          CF = Carry Flag
OF   = Overflow Flag
• Trap Flag (TF): When set, causes an interrupt after the execution of each
instruction. This is used for debugging.
• Interrupt Enable Flag (IF): When set, the processor will recognize external
interrupts.
• Direction Flag (DF): Determines whether string processing instructions increment
or decrement the 16-bit half-registers SI and DI (for 16-bit operations) or the 32-bit
registers ESI and EDI (for 32-bit operations).
• I/O Privilege Flag (IOPL): When set, causes the processor to generate an
exception on all accesses to I/O devices during protected-mode operation.
• Resume Flag (RF): Allows the programmer to disable debug exceptions so the
instruction can be restarted after a debug exception without immediately causing
another debug exception.
• Alignment Check (AC): Activates if a word or doubleword is addressed on a
nonword or nondoubleword boundary.
• Identification Flag (ID): If this bit can be set and cleared, that indicates that this
processor supports the CPUID instruction. This instruction provides information
about the vendor, family, and model.
In addition, there are four bits that relate to operating mode. The nested task (NT) flag
indicates that the current task is nested within another task in protected mode operation.
The virtual mode (VM) bit allows the programmer to enable or disable virtual 8086 mode,
which determines whether the processor runs as an 8086 machine. The virtual interrupt
flag (VIF) and virtual interrupt pending (VIP) flag are used in a multitasking environment.
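The flags listed in this section occupy fixed bit positions in the 32-bit EFLAGS value, so a register dump can be decoded with shifts and masks. The bit positions below follow the standard x86 layout as an assumption, since the text does not list them. The names EFLAGS_BITS and decode_eflags are illustrative.

```python
# Assumed bit positions (standard x86 EFLAGS layout, not given in the text).
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8, "IF": 9,
    "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17, "AC": 18,
    "VIF": 19, "VIP": 20, "ID": 21,
}


def decode_eflags(value):
    """Return each single-bit flag as a bool and IOPL as a 2-bit integer."""
    flags = {name: bool((value >> bit) & 1) for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0b11  # IOPL is a 2-bit field (bits 12-13)
    return flags


flags = decode_eflags(0x00000246)
print([name for name, is_set in flags.items() if is_set is True])
# ['PF', 'ZF', 'IF']
```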
Control Registers
The Pentium employs four 32-bit control registers (register CR1 is unused) to control
various aspects of processor operation (Figure ). The CR0 register contains
system control flags, which control modes or indicate states that apply generally to the
processor rather than to the execution of an individual task. The flags are:
MCE = Machine Check Enable                NW = Not Write Through
PSE = Page Size Extensions                AM = Alignment Mask
DE  = Debugging Extensions                WP = Write Protect
TSD = Time Stamp Disable                  NE = Numeric Error
PVI = Protected-Mode Virtual Interrupts   ET = Extension Type
VME = Virtual-8086 Mode Extensions        TS = Task Switched
PCD = Page-level Cache Disable            EM = Emulation
PWT = Page-level Writes Transparent       MP = Monitor Coprocessor
PG  = Paging                              PE = Protection Enable
CD  = Cache Disable
 Protection Enable (PE): Enable/disable protected mode of operation.
 Monitor Coprocessor (MP): Only of interest when running programs from earlier
machines on the Pentium; it relates to the presence of an arithmetic coprocessor.
 Emulation (EM): Set when the processor does not have a floating-point unit, and
causes an interrupt when an attempt is made to execute floating-point instructions.
 Task Switched (TS): Indicates that the processor has switched tasks.
 Extension Type (ET): Not used on the Pentium; used to indicate support of math
coprocessor instructions on earlier machines.
 Numeric Error (NE): Enables the standard mechanism for reporting floating-point
errors on external bus lines.
 Write Protect (WP): When this bit is clear, read-only user-level pages can be written
by a supervisor process. This feature is useful for supporting process creation in
some operating systems.
 Alignment Mask (AM): Enables/disables alignment checking.
 Not Write Through (NW): Selects mode of operation of the data cache. When this
bit is set, the data cache is inhibited from cache write-through operations.
 Cache Disable (CD): Enables/disables the internal cache fill mechanism.
 Paging (PG): Enables/disables paging.
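The CR0 flags listed above can be treated as named bits of a 32-bit value. The following is a minimal sketch in Python, not processor code: the bit positions are not given in the text and are assumed here from the Intel architecture documentation, and the helper names decode_cr0 and set_flag are hypothetical.

```python
# Assumed bit positions of the CR0 flags named above (not stated in the text).
CR0_BITS = {
    "PE": 0, "MP": 1, "EM": 2, "TS": 3, "ET": 4, "NE": 5,
    "WP": 16, "AM": 18, "NW": 29, "CD": 30, "PG": 31,
}


def decode_cr0(value):
    """Return the names of the CR0 flags that are set in a 32-bit value."""
    return [name for name, bit in CR0_BITS.items() if (value >> bit) & 1]


def set_flag(value, name, enabled):
    """Return a copy of value with one named flag set or cleared."""
    mask = 1 << CR0_BITS[name]
    if enabled:
        return value | mask
    return value & ~mask & 0xFFFFFFFF


# Enable protected mode, then paging, starting from an all-zero register.
cr0 = set_flag(set_flag(0, "PE", True), "PG", True)
print(hex(cr0), decode_cr0(cr0))
```

On a real machine CR0 can only be read and written by privileged code; the sketch only illustrates how the individual flags combine into one register value.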
When paging is enabled, the CR2 and CR3 registers are valid. The CR2 register holds
the 32-bit linear address of the last page accessed before a page fault interrupt. The
leftmost 20 bits of CR3 hold the 20 most significant bits of the base address of the page
directory; the remainder of the address contains zeros. Two bits of CR3 are used to drive
pins that control the operation of an external cache. The page-level cache disable (PCD)
bit enables or disables the external cache, and the page-level writes transparent (PWT) bit
controls write-through in the external cache.
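As an illustration of the CR3 layout just described, the sketch below splits a CR3 value into the page directory base and the two cache-control bits. The positions of PCD (bit 4) and PWT (bit 3) are an assumption taken from the Intel documentation, since the text does not state them, and split_cr3 is a hypothetical helper name.

```python
def split_cr3(cr3):
    """Split a 32-bit CR3 value into (page directory base, PCD, PWT)."""
    # The upper 20 bits are the base address; the low 12 address bits are zero.
    directory_base = cr3 & 0xFFFFF000
    pcd = (cr3 >> 4) & 1  # assumed position of page-level cache disable
    pwt = (cr3 >> 3) & 1  # assumed position of page-level writes transparent
    return directory_base, pcd, pwt


base, pcd, pwt = split_cr3(0x00123018)
print(hex(base), pcd, pwt)
```

Because the page directory is aligned on a 4-Kbyte boundary, masking off the low 12 bits is enough to recover its full physical address.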
Six additional control bits are defined in CR4:
 Virtual-8086 Mode Extension (VME): Enables support for the virtual interrupt flag in
virtual-8086 mode.
 Protected-Mode Virtual Interrupts (PVI): Enables support for the virtual interrupt flag in
protected mode.
 Time Stamp Disable (TSD): Disables the read from time stamp counter (RDTSC)
instruction, which is used for debugging purposes.
 Debugging Extensions (DE): Enables I/O breakpoints; this allows the processor to interrupt
on I/O reads and writes.
 Page Size Extensions (PSE): Enables the use of 4-Mbyte pages.
 Machine Check Enable (MCE): Enables the machine check interrupt, which occurs
when a data parity error occurs during a read bus cycle or when a bus cycle is not
successfully completed.
32 BIT ARCHITECTURE
INTEL 80386
The 80386 includes separate 32-bit internal and external data paths along with eight
general-purpose 32-bit registers. The processor can handle 8-, 16-, and 32-bit data
types. It has separate 32-bit data and address pins and generates a 32-bit physical
address. The 80386 can directly address up to four gigabytes of physical memory and
64 terabytes (2^46 bytes) of virtual memory. The 80386 can be operated from a 12.5-, 16-, 20-,
25-, or 33-MHz clock. The chip has 132 pins housed in a Pin Grid Array (PGA) package.
The 80386 is designed using high-speed CHMOS III technology.
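The two address-space figures quoted above follow directly from the address widths, as the short calculation below shows. The breakdown of the virtual space into 2^14 segments of up to 2^32 bytes each is an assumption added for illustration (it is the usual explanation, but the text does not derive the number).

```python
GIGABYTE = 2 ** 30
TERABYTE = 2 ** 40

physical_bytes = 2 ** 32               # 32-bit physical address
virtual_bytes = 2 ** 46                # figure quoted in the text

# Assumed breakdown: 2^14 selectable segments, each up to 2^32 bytes long.
segments = 2 ** 14
bytes_per_segment = 2 ** 32

print(physical_bytes // GIGABYTE, "gigabytes of physical memory")
print(virtual_bytes // TERABYTE, "terabytes of virtual memory")
print(segments * bytes_per_segment == virtual_bytes)
```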
The 80386 is highly pipelined and can perform instruction fetching,
decoding, execution, and memory management functions in parallel. The on-chip
memory management and protection hardware translates logical addresses to physical
addresses and provides the protection rules required in a multitasking environment.
The internal architecture of the 80386 includes six functional units that operate in
parallel. The parallel operation is known as pipelined processing. Fetching, decoding,
execution, memory management, and bus access for several instructions are performed
simultaneously. The six functional units of the 80386 are –
 Bus interface unit
 Code prefetch unit
 Execution unit
 Segmentation unit
 Paging unit
 Decode unit
The bus interface unit interfaces the 80386 with memory and I/O. Based on
internal requests for fetching instructions and transferring data from the code prefetch
unit, the 80386 generates the address and control signals for the current bus cycles.
The code prefetch unit prefetches instructions when the bus interface unit is not executing
bus cycles. It then stores them in a 16-byte instruction queue for execution by the
instruction decode unit.
The instruction decode unit translates instructions from the prefetch queue into
microcode. The decoded instructions are then stored in an instruction queue (FIFO) for
processing by the execution unit.
The execution unit processes the instructions from the instruction queue. It contains a
control unit, a data unit, and a protection test unit.
The control unit contains microcode and parallel hardware for fast multiply, divide, and
effective address calculation.
The data unit includes an ALU, 8 general-purpose registers, and a 64-bit barrel shifter for
performing multiple bit shifts in one clock. The data unit carries out data operations
requested by the control unit. The protection test unit checks for segmentation violations
under the control of microcode.
The segmentation unit translates logical addresses into linear addresses at the request
of the execution unit.
The translated linear address is sent to the paging unit. Upon enabling of the paging
mechanism, the 80386 translates these linear addresses into physical addresses. If
paging is not enabled, the physical address is identical to the linear address and no
translation is necessary.
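The two translation steps described above (logical to linear in the segmentation unit, linear to physical in the paging unit) can be sketched as follows. This is a simplified model, not the hardware algorithm: it assumes 4-Kbyte pages with a 10-bit directory index, a 10-bit table index, and a 12-bit offset, represents the page tables as nested Python dictionaries, and ignores limit and protection checks. All names and the sample mapping are hypothetical.

```python
def to_linear(segment_base, offset):
    """Segmentation unit: linear address = segment base + offset (32-bit wrap)."""
    return (segment_base + offset) & 0xFFFFFFFF


def to_physical(linear, page_directory, paging_enabled=True):
    """Paging unit: translate a linear address through a two-level table."""
    if not paging_enabled:
        return linear                      # physical address equals linear address
    dir_index = (linear >> 22) & 0x3FF     # top 10 bits select a page table
    table_index = (linear >> 12) & 0x3FF   # next 10 bits select a page
    page_offset = linear & 0xFFF           # low 12 bits are kept unchanged
    frame_base = page_directory[dir_index][table_index]
    return frame_base + page_offset


# Hypothetical mapping: directory entry 0, table entry 1 -> frame at 0x0009A000.
page_directory = {0: {1: 0x0009A000}}

linear = to_linear(0x00001000, 0x234)
physical = to_physical(linear, page_directory)
print(hex(linear), hex(physical))
```

With paging disabled, to_physical returns the linear address unchanged, which matches the last sentence of the paragraph above.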
80386 REGISTERS
Figure shows the 80386 registers. The 80386 has 16 registers, classified as general, segment,
status, and instruction.
The eight general registers are the 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP,
ESI, and EDI. The low-order word of each of these eight registers has the
8086/80186/80286 register name AX (AH or AL), BX (BH or BL), CX (CH or CL), DX
(DH or DL), BP, SP, SI, and DI. They are useful for making the 80386 compatible with the
8086, 80186, and 80286 processors.
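The relationship between a 32-bit register and its 16-bit and 8-bit names can be shown with simple masking. The sketch below models EAX as a Python integer; the function names views and write_al are hypothetical.

```python
def views(eax):
    """Return (AX, AH, AL) as seen through a 32-bit EAX value."""
    ax = eax & 0xFFFF          # low-order word of EAX
    ah = (ax >> 8) & 0xFF      # high byte of AX
    al = ax & 0xFF             # low byte of AX
    return ax, ah, al


def write_al(eax, value):
    """Writing AL changes only the lowest byte; the other 24 bits are kept."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


eax = 0x12345678
print([hex(v) for v in views(eax)])
print(hex(write_al(eax, 0xFF)))
```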
The six 16-bit segment registers (CS, SS, DS, ES, FS, and GS) allow systems software
designers to select either a flat or segmented model of memory organization. The
purpose of CS, SS, DS, and ES is obvious. Two additional data segment registers, FS
and GS, are included in the 80386. The four data segment registers (DS, ES, FS, GS)
can access four separate data areas and allow programs to access different types of
data structures. For example, one data segment register can point to the data structures
of the current module, another to the exported data of a higher level module, another to
dynamically created data structure, and another to data shared with another task.
The flag register is a 32-bit register named EFLAGS. The flags are grouped into three
types: the status flags, the control flags, and the system flags.
The status flags include CF, PF, AF, ZF, SF, and OF.
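To make the status flags concrete, the sketch below computes all six for an 8-bit addition. The text does not define the flags individually, so the rules used here (carry out of bit 7, even parity of the result byte, carry out of bit 3, zero result, sign bit, signed overflow) are the conventional x86 definitions and should be read as an assumption. The function name add8_flags is hypothetical.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, status flags)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                              # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),           # even number of 1 bits
        "AF": int(((a ^ b ^ result) & 0x10) != 0),            # carry out of bit 3
        "ZF": int(result == 0),                               # result is zero
        "SF": (result >> 7) & 1,                              # sign bit of result
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0), # signed overflow
    }
    return result, flags


print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```

The first call overflows the signed range (127 + 1), so OF and SF are set; the second wraps the unsigned range (255 + 1), so CF and ZF are set.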
Instruction Format
The op code byte field varies depending on the class of operation. This field defines
information such as direction of the operation, displacement sizes, register encoding, or
sign extension.
The displacement field can be 8, 16, or 32 bits if the addressing mode includes a
displacement. The last field of the instruction is the immediate data which can be 8, 16,
or 32 bits if the addressing mode is immediate.
Component                                 Description
80386 Microprocessor                      32-bit high-performance microprocessor with on-chip
                                          memory management and protection.
80287 or 80387 Numeric Coprocessor        Performs numeric instructions in parallel with the 80386;
                                          expands the instruction set.
82384 Clock Generator                     Generates system clock and RESET signal.
8259A Programmable Interrupt Controller   Provides interrupt control and management.
82258 Advanced DMA Controller             Performs direct memory access (DMA).
FIGURE 80386 system block diagram.
INSTRUCTION EXECUTION CYCLE
REGISTERS :
As the instructions are interpreted and executed by the CPU, there is a movement of
information between the various units of the computer system. In order to handle this
process satisfactorily and to speed up the rate of information transfer, the computer uses
a number of special memory units called registers. These registers are not considered
a part of the main memory and are used to retain information on a temporary basis.
The number of registers varies among computers, as does the data-flow pattern. Most
computers use several types of registers, each designed to perform a specific function.
Each of these registers possesses the ability to receive information, to hold it temporarily,
and to pass it on as directed by the control unit. The length of a register equals the
number of bits it can store. Thus a register that can store 8 bits is normally referred to as
an 8-bit register. Although the number of registers varies from computer to computer, there
are some registers that are common to all computers. The functions of these registers are
described below.
1. Memory Address Register (MAR) : It holds the address of the active memory
location. It is loaded from the program control register when an instruction is read
from memory.
2. Memory Buffer Register (MBR) : It holds the contents of the memory word read
from, or written in, memory. An instruction word placed in this register is transferred
to the instruction register. A data word placed in this register is accessible for
operation with the accumulator register or for transfer to the I/O register. A word to be
stored in a memory location must first be transferred to the MBR, from where it is
written in memory.
3. Program Control Register (PC) : It holds the address of the next instruction to be
executed. This register goes through a step-by-step counting sequence and causes
the computer to read successive instructions previously stored in memory. It is
assumed that instruction words are stored in consecutive memory locations and read
and executed in sequence unless a branch instruction is encountered. A branch
instruction is an operation that calls for a transfer to a non-consecutive instruction.
The address part of a branch instruction is transferred to the PC register to become
the address of the next instruction. To read an instruction, the contents of the PC
register are transferred to the MAR and a memory read cycle is initiated. The
instruction placed in the MBR is then transferred to the instruction register.
4. Accumulator Register (A) : This register holds the initial data to be operated upon,
the intermediate results, and also the final results of processing operations. It is used
during the execution of most instructions. The results of arithmetic operations are
returned to the accumulator register for transfer to main storage through the memory
buffer register. In many computers there is more than one accumulator register.
5. Instruction Register (I) : It holds the current instruction that is being executed. As
soon as the instruction is stored in this register, the operation part and the address
part of the instruction are separated. The address part of the instruction is sent to the
MAR while its operation part is sent to the control section where it is decoded and
interpreted and ultimately command signals are generated to carry out the task
specified by the instruction.
6. Input / Output Register (I/O) : This register is used to communicate with the input /
output devices. All input information such as instructions and data is transferred to
this register by an input device. Similarly, all output information to be transferred to
an output device is found in this register.
Sr. No.  Name of Register        Function
1.       Memory Address (MAR)    Holds the address of the active memory location.
2.       Memory Buffer (MBR)     Holds information on its way to and from memory.
3.       Program Control (PC)    Holds the address of the next instruction to be executed.
4.       Accumulator (A)         Accumulates results and data to be operated upon.
5.       Instruction (I)         Holds an instruction while it is being executed.
6.       Input / Output (I/O)    Communicates with the I/O devices.
MACHINE INSTRUCTIONS : Instructions in a form which can be used directly by the
control unit are called machine instructions, and programs written in the form of machine
instructions are said to be written in machine language. A special register is used to hold
the machine instruction which is currently being interpreted by the control unit, and this
register is called the Current Instruction Register (CIR).
THE STEPS IN EXECUTING INSTRUCTIONS
1. The control unit decodes the function part of the instruction.
2. The control unit signals to appropriate hardware to pass the operand address part of
the instructions to a decoder which passes the operand address into the MAR.
3. Then the control unit signals main memory to perform a read which results in the
data at that address being copied into the MDR (memory data register, the MBR described above).
4. The control unit signals the ALU to do an operation.
5. Once the control unit finishes the execution of one instruction it must fetch the next
instruction from the memory into the CIR (via the MAR). The address of the next instruction
is held in a special register called the sequence control register (SCR or PC).
This operation is in two stages (called the Fetch-Execute Cycle).
a) First it fetches the requisite instruction from main storage via the MDR and places it in
the CIR.
b) Then it interprets the instruction in the CIR and causes the instruction to be executed by
sending command signals to the appropriate hardware device.
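The fetch-execute cycle and the register roles described above can be sketched with a toy machine. This is an illustrative model only: the four opcodes (LOAD, ADD, STORE, HALT), the tuple encoding of instructions, and the function name run are all assumptions made for the example and do not correspond to any real instruction set.

```python
def run(memory):
    """Run a toy program; return the final accumulator and memory."""
    pc = 0    # address of the next instruction (PC / SCR)
    acc = 0   # accumulator
    while True:
        # Fetch: PC -> MAR, memory read -> MDR, MDR -> CIR, then advance PC.
        mar = pc
        mdr = memory[mar]
        cir = mdr
        pc += 1
        # Decode: separate the operation part from the address part.
        opcode, address = cir
        if opcode == "HALT":
            return acc, memory
        # Execute: the operand address goes to the MAR, data moves via the MDR.
        mar = address
        if opcode == "LOAD":
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":
            mdr = memory[mar]
            acc = acc + mdr
        elif opcode == "STORE":
            mdr = acc
            memory[mar] = mdr
        else:
            raise ValueError("unknown opcode: " + str(opcode))


# Locations 0-3 hold instructions; locations 4-6 hold data.
program = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
acc, memory = run(program)
print(acc, memory[6])
```

A branch instruction would be modelled by copying the address part into pc instead of letting it count up, as described for the Program Control Register above.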
PIPELINING
Pipelining is an implementation technique where multiple instructions are overlapped in
execution. The computer pipeline is divided in stages. Each stage completes a part of an
instruction in parallel. The stages are connected one to the next to form a pipeline:
instructions enter at one end, progress through the stages, and exit at the other end.
Pipelining does not decrease the time for individual instruction execution. Instead, it
increases instruction throughput. The throughput of the instruction pipe is determined by
how often an instruction exits the pipeline.
Because the pipe stages are hooked together, all the stages must be ready to proceed
at the same time. We call the time required to move an instruction one step further in the
pipeline a machine cycle. The length of the machine cycle is determined by the time
required for the slowest pipe stage.
The pipeline designer's goal is to balance the length of each pipeline stage. If the stages
are perfectly balanced, then the time per instruction on the pipelined machine is equal to
(Time per instruction on nonpipelined machine) / (Number of pipe stages).
Under these conditions, the speedup from pipelining equals the number of pipe stages.
Usually, however, the stages will not be perfectly balanced; furthermore, pipelining itself
involves some overhead.
Pipelining leads to dramatic improvements in system performance, as you can well
imagine, compared to allowing much of the processor circuitry to lie idle as with
sequential execution. The more stages that you can break the pipeline into, the more
theoretical speed you can get from it. For example, let's suppose it takes 12 clock cycles
to handle all the steps to process an instruction. In theory, if you use a 4-stage pipeline,
your maximum throughput is 1 instruction every 3 cycles. But if you use a 6-stage
pipeline, maximum throughput is 1 instruction every 2 cycles. (This is of course highly
simplified).
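The 12-cycle example above can be checked with a few lines of arithmetic. The sketch assumes perfectly balanced stages and no stalls or overhead, which is the same idealization the text makes; the function names are hypothetical.

```python
def cycles_per_instruction(total_cycles, stages):
    """Ideal pipelined throughput: one instruction exits every stage time."""
    return total_cycles / stages


def total_time(n_instructions, total_cycles, stages):
    """Cycles to finish n instructions on an ideal, stall-free pipeline."""
    stage_time = total_cycles / stages
    # The first instruction needs the full latency; each later one exits
    # one stage time after its predecessor.
    return total_cycles + (n_instructions - 1) * stage_time


print(cycles_per_instruction(12, 4))   # 4-stage pipeline
print(cycles_per_instruction(12, 6))   # 6-stage pipeline
print(100 * 12 / total_time(100, 12, 4))  # speedup over sequential execution
```

For 100 instructions the 4-stage speedup comes out slightly below 4, because the pipeline has to fill before the first instruction exits.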
Pipelining also has some drawbacks of course. One of these is complexity; there is a lot
more work for the processor to do to keep the pipeline moving. Other problems relate to
data dependencies. Let's take a very simple 2-line program as an example.
A = A + 1 (Add 1 to the value at memory location A).
B = B + A (Add the value of memory location A to the value at memory location B).
Can you see how pipelining would cause a problem with this (very common kind of) code
fragment? The processor will start executing the second instruction before the first one is
finished, but it needs the results from the first instruction in order to execute the second
one! A pipelining processor will of course detect and handle this condition, but in
doing so it must wait for the first instruction to finish before
proceeding with the second one. This condition is called a pipeline stall and leads to
reduced performance. Newer processors have special performance-enhancing features
to partially eliminate this sort of problem. In general the processor wants to keep the
pipeline ''flowing'' as much as possible, since when the pipeline stalls performance
decreases.
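The effect of such a dependency can be sketched with a deliberately simple timing model. It assumes one instruction may enter the pipeline per cycle, every instruction takes one cycle per stage, and a dependent instruction may not start until the instruction producing its input has completely finished (no forwarding). Real processors are more refined than this; the function name schedule and the tuple format are hypothetical.

```python
def schedule(program, stages):
    """Return (start cycle, finish cycle) for each instruction.

    program is a list of (destination, [sources]) pairs.
    """
    finished_at = {}   # location -> cycle in which its new value is complete
    timeline = []
    next_start = 1
    for dest, sources in program:
        start = next_start
        for src in sources:
            if src in finished_at:
                # Stall until the producing instruction has left the pipeline.
                start = max(start, finished_at[src] + 1)
        finish = start + stages - 1
        finished_at[dest] = finish
        timeline.append((start, finish))
        next_start = start + 1
    return timeline


dependent = [("A", ["A"]), ("B", ["B", "A"])]   # A = A + 1; B = B + A
independent = [("A", ["A"]), ("B", ["B"])]      # A = A + 1; B = B + 1

print(schedule(dependent, stages=4))
print(schedule(independent, stages=4))
```

In this model the dependent pair finishes in cycle 8 and the independent pair in cycle 5; the three-cycle difference is the stall.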
SUPERSCALAR
The evolution of microprocessors has reached the point where architectural concepts
pioneered in vector processors and mainframe computers of the 1970s (most notably the
CDC-6600 and Cray-1) are starting to appear in RISC processors. Early RISC machines
were very simple single-chip processors. As VLSI technology improved, more room
became available on the chip. Rather than increase the complexity of the architecture,
most designers decided to use this room on techniques to improve the execution of their
current architecture. The two principal techniques are on-chip caches and instruction
pipelines.
The latest step in this evolutionary process is the superscalar processor. The name
means these processors are scalar processors that are capable of executing more than
one instruction in each cycle. The keys to superscalar execution are an instruction
fetching unit that can fetch more than one instruction at a time from cache; instruction
decoding logic that can decide when instructions are independent and thus can be executed
simultaneously; and sufficient execution units to be able to process several instructions
at one time. Note that the execution units may be pipelined, e.g. they may be floating
point adders or multipliers, in which case the cycle time for each stage matches the
cycle times on the fetching and decoding logic. In many systems the high level
architecture is unchanged from earlier scalar designs. The superscalar designs use
instruction level parallelism for improved implementation of these architectures.
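The job of the decoding logic, deciding which instructions are independent and can be issued together, can be sketched as a greedy grouping pass. This is a simplified illustration rather than how any particular processor works: it issues in program order, checks only whether an instruction reads or writes a register written earlier in the same group, and ignores execution-unit limits. The function name issue_packets and the register names are hypothetical.

```python
def issue_packets(program, width):
    """Group instructions into packets that could issue in the same cycle.

    program is a list of (name, destination, [sources]) tuples.
    """
    packets = []
    current = []      # instructions in the packet being built
    written = set()   # registers written by instructions in that packet
    for name, dest, sources in program:
        conflict = dest in written or any(src in written for src in sources)
        if current and (len(current) == width or conflict):
            # Close the packet: it is full, or this instruction depends on it.
            packets.append(current)
            current, written = [], set()
        current.append(name)
        written.add(dest)
    if current:
        packets.append(current)
    return packets


program = [
    ("i1", "r1", ["r2", "r3"]),
    ("i2", "r4", ["r5", "r6"]),   # independent of i1
    ("i3", "r7", ["r1", "r4"]),   # needs results of i1 and i2
    ("i4", "r8", ["r7"]),         # needs result of i3
]
print(issue_packets(program, width=2))
```

Only i1 and i2 can share a cycle here; i3 and i4 each wait for an earlier result, which is why a good compiler schedule matters for superscalar machines.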
A good example of a superscalar processor is the IBM RS/6000. There are three major
subsystems in this processor: the instruction fetch unit, an integer processor, and a
floating point processor. The instruction fetch unit is a 2-stage pipeline; during the first
stage a packet of four instructions is fetched from an instruction cache, and in the
second stage instructions are routed to the integer processor and/or floating point
processor. An interesting feature of this instruction unit is that it executes branch
instructions itself so that in a tight loop there is effectively no overhead from branching
since the instruction unit executes branches while the data units are computing values.
The integer unit is a four-stage pipeline. In addition to executing data processing
instructions this unit does some preprocessing for the floating point unit. The floating
point unit itself is six stages deep.
The advantage of the superscalar approach is that it does not rely on a vectorizing
compiler to detect loops and turn them into vector instructions. A superscalar machine still
requires a very sophisticated compiler to allocate resources and schedule operations in
an order that will best take advantage of the resources of the machine, but in the long
run the superscalar approach may be more flexible and applicable to a wider range of
applications than vector processing.
RISC ARCHITECTURE
RISC stands for Reduced Instruction Set Computer. PA-RISC is the name for Hewlett-
Packard's standard hardware that runs both the MPE/iX and HP-UX operating systems.
The classic HP, DEC VAX, and IBM 360 machines all used CISC processors: Complex Instruction
Set Computer. The instructions that programmers use on those machines are not the real
hardware instructions. Each complex instruction is implemented by a hidden
microprogram written in the real instructions.
On RISC computers there are no microprograms. Machine instructions are implemented
directly in hardware. Any task too complex for the hardware to execute in a single cycle
with large, complex instruction sets, the simple, often executed instructions incur a
performance penalty by the overhead of additional instruction decoding, the use of
microcode, and the longer cycle time resulting from increased functionality.
PA-RISC Machine Instructions
RISC machines use an instruction set that is based on 32 general-purpose registers.
Here are some tips to help you guess the function of an instruction from the mnemonic:
Arithmetic : ADD@ and SUB@
Branches : B@ as in BL Branch and Link, BV Branch Vectored.
Compare and Branch : C@ as in COMIBF, COMpare Immediate and Branch If False.
Extract : EXTRS for signed and EXTRU for unsigned.
Load : L@ as in LDH load halfword, LDO load offset.
Shift : SH@ as in SH2ADD Shift 2 and Add.
Store : ST@ as in STB Store Byte, STW Store Word.
CISC (Contemporary complex instruction set architecture)
We have noted the trend to richer instruction sets, which include a larger number of
instructions and more-complex instructions. Two principal reasons have motivated this
trend: a desire to simplify compilers and a desire to improve performance. Underlying
both of these reasons was the shift to high-level languages (HLL) on the part of
programmers; architects attempted to design machines that provided better support for
HLLs.
The first of the reasons cited, compiler simplification, seems obvious. The task of the
compiler writer is to generate a sequence of machine instructions for each HLL statement.
If there are machine instructions that resemble HLL statements, this task is simplified.
The task of optimizing the generated code to minimize code size, reduce instruction
execution count, and enhance pipelining is much more difficult with a complex instruction
set. As evidence of this, studies cited earlier in this chapter indicate that most of the
instructions in a compiled program are the relatively simple ones.
The other major reason cited is the expectation that a CISC will yield smaller, faster
programs. Let us examine both aspects of this assertion : that programs will be smaller
and that they will execute faster.
CISC Versus RISC Characteristics
After the initial enthusiasm for RISC machines, there has been a growing realization that
(1) RISC designs may benefit from the inclusion of some CISC features and that (2)
CISC designs may benefit from the inclusion of some RISC features. The result is that
the more recent RISC designs, notably the PowerPC, are no longer "pure" RISC and the
more recent CISC designs, notably the Pentium, do incorporate some RISC
characteristics.
For purposes of this comparison, the following are considered typical of a RISC:
1. A single instruction size.
2. That size is typically 4 bytes.
3. A small number of data addressing modes, typically fewer than five. This parameter is
difficult to pin down. In the table, register and literal modes are not counted and
different formats with different offset sizes are counted separately.
4. No indirect addressing that requires you to make one memory access to get the
address of another operand in memory.
5. No operations that combine load/store with arithmetic (e.g., add from memory, add to
memory).
6. No more than one memory-addressed operand per instruction.
7. Does not support arbitrary alignment of data for load/store operations.
8. Maximum number of uses of the memory management unit (MMU) for a data
address in an instruction.
9. Number of bits for integer register specifier equal to five or more. This means that at
least 32 integer registers can be explicitly referenced at a time.
10. Number of bits for floating-point register specifier equal to four or more. This means
that at least 16 floating-point registers can be explicitly referenced at a time.
Items 1 through 3 are an indication of instruction decode complexity. Items 4 through 8
suggest the ease or difficulty of pipelining, especially in the presence of virtual-memory
requirements. Items 9 and 10 are related to the ability to take good advantage of
compilers.
BUSES FOR INTERFACING
The first IBM PC-XT used the Intel 8088 microprocessor running at a clock rate of 4.77
MHz. The bus had the following characteristics:
1. A set of eight bidirectional data lines
2. A set of 20 address lines
3. Six interrupt lines
4. Three sets of Direct-Memory-Access control lines
5. A group of lines for data control and status
6. Power supply and ground lines
The 62 pins that make up the bus are divided into two rows (A1-A31 and B1-B31) for the
edge connectors. These connectors, which are sometimes called expansion connectors,
are located on the main circuit board of the computer. The microprocessor along with
some of the I/O circuits and memory are also on this board.
In the IBM PC-XT, there are five edge connectors for the expansion boards. Data moves
over the bus on the data lines A2-A9. Addresses for data transfers are specified on the
20 address lines A12-A31.
The use of the 8-bit version of the Intel 8086 16-bit processor in the PC and XT reduced
the number of bus lines needed and resulted in a smaller bus and lower hardware costs.
The six interrupt pins B21-B25 and B4 are connected to an interrupt controller on the
system board. This controller generates the addresses needed for interrupt servicing.
There are six interrupt lines, IRQ2 through IRQ7. These are connected to an interrupt
controller on the processor board which automatically generates vectors for the interrupt
service routine. As a result, there is no explicit interrupt-acknowledge signal on the IBM
PC bus. Interrupts are acknowledged by data transactions with the processor.
ISA and EISA Systems
The ISA bus refers to the bus used in Industry-Standard-Architecture compatible
computers. This is the same as the IBM AT 16-bit bus. In an EISA system, which is the
extended version of ISA with a 32-bit bus, it refers to the ISA subset of the EISA bus.
The EISA bus is a superset of the ISA bus. It has all of the ISA bus features, along with
extensions to enhance performance and capabilities. The host CPU is the main system
processor with its separate host bus.
An EISA master is a 16-bit or 32-bit bus master that uses the EISA signal set to
generate memory or I/O cycles. A bus controller is used to convert the EISA control
signals to ISA signals. An ISA master is a 16-bit bus master that uses the ISA subset of
the EISA bus to generate memory or I/O cycles. This master must communicate with
8-bit or 16-bit ISA slaves, and route data to the proper paths. It is not used to handle any of
the signals associated with the extended section of the EISA bus.
The EISA slaves can be 8-bit, 16-bit, or 32-bit memory or I/O slave devices that use
the extended signal set of the EISA bus to accept cycles from the different masters. They
handle information on the type and width of data using both extended and ISA signals.
The ISA slaves are 8- or 16-bit slave devices that use the ISA subset of the EISA bus to
accept cycles from the different masters. They use ISA signals to indicate the type and
width of data. A DMA slave is an I/O device that uses DMA signals like DREQ or DACK*
to perform a direct memory access.
Assembly and disassembly are needed when the master/slave data bus sizes are
mismatched. Multiple cycles are used to route bytes to the proper byte paths. When a
32-bit CPU accesses an 8-bit slave, four cycles will be used to route the bytes.
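The assembly and disassembly just described can be sketched as splitting a wide value into slave-width pieces and joining them again. The sketch assumes the least significant piece is transferred first; that ordering and the function names are illustrative assumptions, not something the text specifies.

```python
def bus_cycles(transfer_bits, slave_bits):
    """Number of bus cycles needed to move one transfer to a narrower slave."""
    return max(1, -(-transfer_bits // slave_bits))  # ceiling division


def disassemble(value, transfer_bits, slave_bits):
    """Split a value into slave-width pieces, least significant piece first."""
    mask = (1 << slave_bits) - 1
    return [(value >> shift) & mask
            for shift in range(0, transfer_bits, slave_bits)]


def assemble(pieces, slave_bits):
    """Rebuild the original value from pieces produced by disassemble."""
    value = 0
    for index, piece in enumerate(pieces):
        value |= piece << (index * slave_bits)
    return value


pieces = disassemble(0x12345678, transfer_bits=32, slave_bits=8)
print(bus_cycles(32, 8), [hex(p) for p in pieces])
print(hex(assemble(pieces, slave_bits=8)))
```

A 32-bit transfer to a 16-bit slave needs two cycles by the same calculation, and a matched master and slave need only one.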
A cycle translation is performed when the master and slave are on different buses. The
master protocol is translated to the slave protocol.
EISA Systems
The Extended Industry Standard Architecture (EISA) is a 32-bit architecture based upon
the Industry Standard Architecture (ISA) for the PC AT. EISA's capabilities and 32-bit
architecture are needed to get the maximum performance out of the 386 and 486 CPUs.
The EISA consortium defined the EISA bus as a 32-bit high-performance ISA-compatible
system. This open industry standard allows industry-wide compatibility.
EISA provides 32-bit memory addressing and data transfers for CPU, DMA, and bus
masters. It allows a 33 Mbyte/second transfer rate for DMA and bus masters on the EISA
bus. EISA provides automatic configuration of add-in cards that eliminates the need for
jumpers and switches. Interrupts are both shareable and programmable. Figure 4.3
shows the type of buses used in an EISA system. The bus-arbitration scheme allows
intelligent bus master add-in cards.
Since the EISA system is compatible with the ISA 8 and 16-bit expansion boards and
software, ISA cards can be plugged into the EISA connector slots. The EISA slots are
defined as ISA or EISA for compatibility during configuration. The EISA connector set is
a superset of the ISA connector set, so full compatibility with ISA add-in boards is
maintained under the automatic system and expansion board configuration scheme.
EISA Chips
The Intel 82350 EISA chip set is an EISA/ISA compatible chip set. It supports the 386 or
486 CPU, 82385 cache controller, and 80387 numerics coprocessor. The 82350 chip set
is designed for PCs and PC compatible workstations. The chip set also supports a
buffered configuration for extended architectures with SCSI and LAN functions on the
system board. The chip set includes the 82352 EISA Bus Buffers (EBB), 82357
Integrated System Peripheral (ISP) and 82358 EISA Bus Controller (EBC).
The EBB supports three buses when used in an EISA system. These are called the A, B,
and S buses in the EBB and correspond to the host system bus and the LA and SA
buses in the EISA system.
The ISP handles the DMA functions of the system. It has seven 32-bit DMA channels,
five 16-bit timer/counters, two eight-channel interrupt controllers, and provides the NMI
control and generation. It also provides refresh address generation and keeps track of
the refresh requests when the bus is not available. The ISP supports multiple EISA bus
masters using a system arbitration scheme which grants the bus on a rotating basis.
The EBC acts as the EISA engine, since it works as an intelligent bus controller for the
8, 16 and 32-bit bus masters and slaves. It provides the state machine interface to host
ISA/EISA buses and the other ICs in the chip set.
It provides the interface to the 386/486 CPUs and the EISA bus. The EBC acts as a
bridge between the EISA and ISA devices. The data bus size differences are handled by
the EBC, including the assembly and disassembly. The 82355 Bus Master Interface
(BMIC) is a device for add-in cards that makes use of the EISA bus master capabilities.
ICs like these support the EISA bus in 386 and 486 processors at various clock speeds.
These chips use a CPU to memory protocol which allows the memory subsection to
operate independently of the CPU clock. The CPU protocol is translated to this
CPU-speed-independent protocol.
Bus Architecture
Three buses are used : a host bus, the EISA bus, and a peripheral bus called the X-bus.
The host bus connects the CPU or host master to the memory system. The EISA bus
interfaces the system board resources to expansion bus resources. The Peripheral bus
supports the system board IO.
Host Bus
The host bus provides the connection between the CPU and the memory system. Zero-wait-
state burst cycles are implemented using a 64/128-bit interleaved memory interface.
There are zero-wait-state posted writes with the posted write buffer of the 82353.
CPU frequency independence is due to the 82359's delay line and the programmable
state tracker function. The programmable delay line allows the DRAM cycle sequence to
be tuned to DRAM parameters. Even though the memory
interface handshake is clockless, it is synchronous, since the CPU wait state counts
match those needed by the DRAM access.
EISA Bus
The EISA bus connects the masters to memory and acts as a path for CPU accesses to
system resources. The interface of EISA masters to memory is optimized for the full
memory bandwidth defined by the EISA specifications. This is done with the
synchronous tracking of the EISA master cycles by the 82359. The 82359 is always
synchronous to the EISA master talking to it.
When the CPU accesses the system, the EBC converts the 82359 handshake into the
EISA protocol. The EBC performs any required cycle control for byte
assembly/disassembly, and controls the latches and transceivers for the 82359 DRAM
controller and the 82353 data path chip. The EBC runs back-to-back read cycles to
support CPU to system bursts and coordinates posted system writes.
Peripheral Bus (X-Bus)
This is an 8-bit bus that supports the system board I/O functions such as the keyboard,
floppy, and the LIOE chip, which contains a parallel port and supports an external real-time
clock and serial ports. The peripheral bus is a buffered version of the 8-bit ISA bus.
THE PCI LOCAL BUS - A BUS BUILT BY INTEL
The PCI-Bus (Peripheral Component Interconnect) was originally designed to speed up
the display of graphics on Intel-based personal computers, but the standard itself is
processor independent and suitable for other hardware add-ons that require high
bandwidth, including network, video and SCSI adapters. PCI was developed by Intel,
but it took some time to get it to work reliably. By the middle of 1993 the VESA-Bus
had become firmly entrenched in the marketplace and almost all DOS computer systems
had VESA-Bus slots as standard. The wide acceptance of local bus technology
took only a few months and, by default, the VESA-Bus became the first local bus standard.
For a while, many people in the computer industry saw a local-bus war between the two
competing local-bus standards.
The PCI-Bus has some attractive features, such as concurrent bus-mastering, a full
burst mode, and a type of pipelining queue that can reduce the number of potential wait
states compared to the VESA-Bus design.
The PCI-Bus uses three elegant techniques to resolve local bus problems. The first,
known as reflected-wave signaling, reduces the amount of electrical amplification
required on the signal paths and thus reduces noise and loading problems. The second
is multiplexing. Multiplexing allows two different signals to use the same electrical path,
reducing the number of pins required for peripheral chips and lowering manufacturing
costs. The third is a protocol letting the PCI controller receive specific configuration
information from the PCI devices themselves. Intel did not define a standard adapter
connector for the bus, leaving that job up to a PCI-Bus special-interest group, which settled
on the white 112 pin connector.
PCI the Universal Bus
PCI is platform independent and was soon used in computers built around the PowerPC
chip. This is one of the few times a standard I/O bus has been used across platforms,
and this has to be a big feature in its favour. The various companies involved in
PowerPC development, including Apple and IBM, adopted the PCI-Bus for PowerPC
based computers. Apple had been using the Macintosh NuBus for many years, but
switched to the PCI-Bus for its PowerPC products. It is ironic that the largest user of
Motorola based processors lined up to buy bus technology from Intel.
Other computer manufacturers are also using the PCI-Bus in their computer platforms:
Digital Equipment Corp. (DEC), with its Alpha RISC-based systems, Hewlett-Packard,
and SUN Microsystems all include PCI-Bus slots in their products. Intel
licensed its patents on the PCI Bus free of royalties to all who wished to use it.
By adopting an established industry standard, the manufacturers of the other computer
platforms are ensuring lower costs and more options for both users and developers, who
are no longer locked into their own proprietary options. The wide range of cards that
has followed the use of the PCI-Bus on PC systems is available for the first time to
users of other hardware. All that should be required is alternative driver software for the
various platforms.
The Characteristics of the various busses
Bus type   Bus data width   Bus speed          Data transfer rate (Mbytes/s)
PC/XT      8 bits           4.7-8 MHz          3.25
ISA        16 bits          8 MHz              6.5
EISA       32 bits          8 MHz              32
MCA        32 bits          8 MHz              0
VESA       32 bits          33 MHz to 50 MHz   132 and above
PCI        32 bits          33 MHz             132
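The EISA and PCI figures in the table match a simple peak-rate estimate: the bus width in bytes multiplied by the clock rate, assuming one transfer on every clock. A 32-bit bus at 33 MHz moves 4 bytes x 33 million transfers, or 132 Mbytes per second. The Python sketch below illustrates only that arithmetic; the function name and the clocks_per_transfer parameter are illustrative assumptions, not part of any bus specification.

```python
def peak_rate(width_bits, clock_mhz, clocks_per_transfer=1):
    """Estimate the peak transfer rate in Mbytes per second.

    Assumes one transfer of width_bits every clocks_per_transfer clocks.
    """
    return (width_bits / 8) * clock_mhz / clocks_per_transfer


print(peak_rate(32, 33))     # 32-bit bus at 33 MHz (PCI row)
print(peak_rate(32, 8))      # 32-bit bus at 8 MHz (EISA row)
print(peak_rate(16, 8, 2))   # a 16-bit bus needing two clocks per transfer
```

Buses that need more than one clock per transfer fall below the one-per-clock estimate, which is one reason the ISA figure in the table is lower than 16 bits x 8 MHz would suggest.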
I/O Interfacing
The 386 DX microprocessor supports 8-bit, 16-bit and 32-bit I/O devices. These can be
mapped into either the 64-kilobyte I/O address space or the 4-gigabyte physical memory
address space. I/O mapping and memory mapping of I/O devices have the following
characteristics: (1) The address decoding needed to generate chip selects for I/O
mapping is usually simpler than memory mapping, (2) I/O mapped devices reside in the
I/O space of the 386 microprocessor (64 kilobytes), and (3) memory mapped devices
reside in the much larger memory space of 4 gigabytes.
Memory mapped devices can be accessed using any 386 microprocessor instruction.
I/O mapped devices can be accessed only through the IN, OUT, INS, and OUTS
instructions. Memory mapped devices are protected by the memory management and
protection features.
The interface to a peripheral device depends not only upon data width, but also on the
signal requirements of the device and its location within the memory space or I/O space.
Address Decoding
Address decoding to generate chip selects is required whether the I/O devices are I/O mapped
or memory mapped. One technique for decoding memory mapped I/O addresses is to
map the entire I/O space of the 386 into a 64 kilobyte region of the memory space. The
address decoding logic ensures that each I/O device responds to both a memory
address and an I/O address. Addresses can be assigned to I/O devices arbitrarily within
the I/O space or memory space.
Eight-bit I/O devices can be connected to any of the four 8-bit sections of the data bus. If
the addresses of two devices lie within the same doubleword boundaries, BE3-BE0 are
decoded to provide a chip-select signal that prevents a write to one device from
erroneously performing a write to the other. This chip select is generated with an
address decoder.
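The role of the byte enables in generating a chip select can be sketched in Python. This is a minimal model, assuming byte lane 0 corresponds to BE0 and that an access never crosses a doubleword boundary; the function names byte_enables and chip_select are hypothetical and stand in for decoder hardware.

```python
def byte_enables(addr, size):
    """Return the set of active byte lanes (0-3) for an access.

    The access must stay inside one doubleword (4 bytes).
    """
    first = addr & 3
    if first + size > 4:
        raise ValueError("access crosses a doubleword boundary")
    return set(range(first, first + size))


def chip_select(dev_addr, addr, size):
    """True when both the device's doubleword and its byte lane are addressed."""
    same_doubleword = (dev_addr >> 2) == (addr >> 2)
    return same_doubleword and (dev_addr & 3) in byte_enables(addr, size)


# Two 8-bit devices at 0x100 and 0x101 share a doubleword.
print(chip_select(0x100, 0x100, 1))  # byte write to 0x100 selects the first device
print(chip_select(0x101, 0x100, 1))  # the same write does not select the second
```

Because the second device's lane is not enabled, a byte write to the first device cannot reach it, which is the protection the paragraph describes.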
In most systems, the same control logic, address latches, and data buffers are used to
access both memory and I/O devices. Latches hold the address for the duration of the
bus cycle. If 74LS373 latches are used, the Latch Enable (LE) input is controlled by the
Address Latch Enable (ALE) signal from the bus control logic. It goes active at the start
of each bus cycle.
The address decoder converts the 386 address into chip-select signals. It can be located
before the address latches or after the latches. If it is placed before the latches, the chip-
select signal becomes valid as early as possible but it must be latched along with the
address. The chip-select signals are sent to the bus control logic to set the correct
number of wait states for the accessed device.
The decoder may be made up of two one-of-four decoders. One is used for memory
address decoding and one for I/O address decoding. An output of the memory address
decoder will activate the I/O address decoder for I/O accesses.
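A rough Python model of the two one-of-four decoders follows. The address bits chosen here (bits 31-30 for the memory decoder, bits 3-2 for the I/O decoder) and the use of memory output 3 as the I/O enable are hypothetical choices made only for illustration; a real design picks them to suit its memory map.

```python
def one_of_four(enable, select):
    """Model a 1-of-4 decoder: four outputs, at most one of them True."""
    return [enable and select == i for i in range(4)]


def decode(addr):
    """Return (memory_selects, io_selects) for a 32-bit address.

    Hypothetical map: bits 31-30 pick a memory region, and region 3
    enables the I/O decoder, which decodes bits 3-2.
    """
    mem = one_of_four(True, (addr >> 30) & 3)
    io = one_of_four(mem[3], (addr >> 2) & 3)
    return mem, io


mem, io = decode(0xC0000008)
print(mem)  # region 3 is selected
print(io)   # I/O select 2 is active
```

With any address outside region 3, the I/O decoder stays disabled and none of its outputs are active.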
I/O Interface
The peripheral (I/O) interface is an essential part of a microprocessor system. It supports
communications between the microprocessor and the peripherals. The peripheral
system must allow a variety of interfaces. An important factor is the buses which
connect the major parts of the system. Devices like disks must be able to transfer data to
a memory with minimal CPU overhead or interaction.
I/O devices may be accessed by dedicated I/O instructions for I/O mapped devices, or
by memory operand instructions for memory mapped devices. The 486 microprocessor
synchronizes I/O instruction execution with external bus activity. The previous
instructions are completed before an I/O operation begins. All writes in the write buffers
will be completed before an I/O read or write is performed.
All microprocessor systems include a microprocessor, memory, and I/O devices which
are linked by the address, data, and control buses. Figure shows the configuration of a
typical 486 microprocessor based system.
In most systems, the same control logic and data paths can access memory as well as I/O devices.
The bus interface consists of bus control, data transceiver, byte swap logic, and address
decoder.
A typical peripheral device has address inputs which the processor uses to select the
device's internal registers. It also has a chip-select (CS#) signal which enables it to read
data from and write data to the data bus as controlled by the READ (RD) and WRITE
(WR) control signals. If the microprocessor has separate memory and I/O addressing,
either memory or I/O read and write signals can be used.
Many peripheral devices also generate an interrupt output which is asserted when a
response is required from the microprocessor. Here, the microprocessor must generate
a low interrupt acknowledge (INTA) signal.
Transceivers
An 8-bit transceiver like the 74LS245 provides isolation and additional drive for the 386
data bus. Transceivers are used to prevent any contention on the data bus that may
occur if some devices take too long to remove read data from the data bus after a read
cycle.
If a write cycle follows a read cycle, the 386 can drive the data bus before a slow device
has removed its output from the bus, resulting in bus contention problems. Transceivers
are not used when the device is fast enough.
The bus interface must have enough transceivers to handle the device with the most
inputs and outputs on the bus. If the widest device has 16 data bits and all devices can
be connected only to the lower half of the data bus, only two 8-bit transceivers are used.
The 74LS245 transceiver is controlled with two input signals. Data Transmit/Receive
(DT/R) is switched high to enable the transceiver for a write cycle. When it is switched
low, it enables the transceiver for a read cycle. This signal is a latched version of the
386 W/R# output. The second input, Data Enable (DEN), enables the transceiver outputs
and is generated by the bus control logic.
DMA CONTROLLERS
Direct-Memory-Access Techniques
When I/O devices must transfer large amounts of data too quickly to be controlled by the
microprocessor, the transfer can be made directly between the device and the memory
of the microprocessor system using direct-memory-access (DMA). Here, the transfer is
under the control of a DMA controller which is usually a dedicated chip or circuit that
operates independently of the microprocessor.
In a DMA data transfer, the DMA controller can take over control of the microprocessor
buses using one of several methods. An external control line can stop the
microprocessor after the current bus cycle is completed. The microprocessor's memory-
control signals are disabled while the DMA controller initiates the data transfer. When
the DMA transfer is completed, the controller resets the halt line so the microprocessor
can resume execution.
In the cycle-stealing scheme, external control lines are used to stop the operation of the
microprocessor by suspending the execution of an instruction cycle. The microprocessor
is halted while the memory control lines are disabled.
The controller takes over operation and steals several machine cycles to implement the
data transfer. When the transfer is complete, the control lines are reset, the clock is
started, and the microprocessor continues execution of the instruction that had been
delayed. Microprocessors using dynamic memory may need to restrict the number of
machine cycles that can be stolen, so that status conditions are not lost between
refreshing.
Another DMA technique is memory sharing where the microprocessor is allowed to
access the memory only at certain times during a machine cycle. Thus, the memory is
available for other devices at the other times. This requires synchronizing the DMA
controller with the microprocessor clock. This type of interleaved DMA can reduce the
delays in microprocessor processing that can occur when cycle stealing is used.
In a typical DMA controller chip like the Intel 8257, the acquisition of the bus for the DMA
operation is done with the hold function for the microprocessor. The priority logic in the
controller is used to resolve conflicts and issue the hold request to the microprocessor.
The controller also keeps track of the cycles used and notifies the peripheral when the
number of cycles used for the data transfer is complete.
8237A DMA Controller
The Intel 8237A provides enable/disable control of individual DMA requests for four
independent DMA channels with independent auto initialization of all channels. It allows
transfers up to 1.6 Mbytes/second with the 5-MHz version and is directly expandable to
any number of channels. The controller comes in a 40-lead Cerdip or plastic package.
The 8237A peripheral interface circuit is designed to improve system performance by
allowing external devices to directly transfer information from the system memory.
Memory-to-memory transfer capability is also provided.
The 8237A is used with an external 8-bit address latch. The channels are expanded by
cascading additional controller chips (fig. 8.5). There are three basic transfer modes.
Registers
The 8237A has 344 bits of internal memory in its registers as shown in the following:
4 Base address registers of 16 bits each
4 Base word count registers of 16 bits each
4 Current address registers of 16 bits each
4 Current word count registers of 16 bits each
1 Temporary address register of 16 bits
1 Temporary word count register of 16 bits
1 Status register of 8 bits
1 Command register of 8 bits
1 Temporary register of 8 bits
4 Mode registers of 6 bits each
1 Mask register of 4 bits
1 Request register of 4 bits
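The 344-bit total can be checked by adding up the list. The short Python sketch below does only that arithmetic; the tuple layout (name, number of registers, bits per register) is just a convenient way to restate the list.

```python
# (register name, how many, bits each), restating the list above
REGISTERS = [
    ("Base address", 4, 16),
    ("Base word count", 4, 16),
    ("Current address", 4, 16),
    ("Current word count", 4, 16),
    ("Temporary address", 1, 16),
    ("Temporary word count", 1, 16),
    ("Status", 1, 8),
    ("Command", 1, 8),
    ("Temporary", 1, 8),
    ("Mode", 4, 6),
    ("Mask", 1, 4),
    ("Request", 1, 4),
]

total_bits = sum(count * bits for _, count, bits in REGISTERS)
print(total_bits)  # 344
```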
DMA Operation
The 8237A uses two major cycles called the idle and active cycles. Each cycle is made
up of several states. The chip can be in one of seven states. Each state lasts for one
clock period. The inactive state is used when there are no DMA requests pending. In this
state the DMA controller is inactive but it may be in the program condition, being
programmed by the processor.
In the first state of DMA servicing, called the S0 state, the 8237A has requested a hold
but the processor has not yet returned an acknowledge. The chip may still be
programmed until it receives an HLDA from the CPU. An acknowledge from the CPU
allows DMA transfers to begin.
S1, S2, S3 and S4 are the working states of DMA servicing. If more time is needed to
complete a transfer, wait states (SW) are inserted before S4. Data passes directly between
the I/O device and memory, with IOR* and MEMW* or MEMR* and IOW* active low at the same time.
Memory-to-memory transfers use a read-from and a write-to-memory for each transfer.
These states are similar to the normal working states and use a two digit number for
identification. Eight states are needed for a transfer. The first four states, S11 to S14, are
used for the read-from-memory and the last four states, S21 to S24, for the
write-to-memory part of the transfer.
Idle Cycle
When there are no requests for service, the chip goes into the idle cycle and performs
inactive (SI) states. In this cycle the chip samples the DREQ lines every clock cycle to check if
any channel is requesting service. The device also looks at CS*, which indicates when the
microprocessor tries to read or write any internal registers of the 8237A. When CS* is
low and HLDA is low, the 8237A enters the Program Condition. In this mode the CPU
can set or check the condition of the internal registers. Address lines A0 through A3
select the registers to be read or written to.
The IOR* and IOW* lines select and time the reads and writes. An internal flip-flop is
used to generate an additional bit of address. This bit determines the upper or lower byte
of the 16-bit address and word count registers. The flip-flop is reset by a Master Clear,
Reset or a software command.
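The way the flip-flop lets an 8-bit port reach a 16-bit register can be sketched in Python. This is a simplified model, not the chip's logic: it assumes the low byte is accessed first after the flip-flop is reset, and it gives each register its own flip-flop for clarity. The class and method names are hypothetical.

```python
class WordRegister:
    """A 16-bit register written one byte at a time through an 8-bit port."""

    def __init__(self):
        self.value = 0
        self.high_byte_next = False  # models the internal flip-flop

    def clear_flip_flop(self):
        """Models a reset: the next access targets the low byte again."""
        self.high_byte_next = False

    def write(self, byte):
        byte &= 0xFF
        if self.high_byte_next:
            self.value = (self.value & 0x00FF) | (byte << 8)
        else:
            self.value = (self.value & 0xFF00) | byte
        # Each access toggles the flip-flop to point at the other byte.
        self.high_byte_next = not self.high_byte_next


reg = WordRegister()
reg.write(0x34)  # low byte
reg.write(0x12)  # high byte
print(hex(reg.value))
```

Resetting the flip-flop before programming a register is a cheap way to make sure the first byte written lands in the expected half.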
Active Cycle
When the 8237A is in the idle cycle and a non-masked channel requests a service, the
device sends an HRQ to the microprocessor and goes into the Active cycle. The DMA
servicing will take place, using one of the following four modes.
Single transfer
In the single transfer mode the device is programmed to make one transfer only. The
word count will be decremented and the address decremented or incremented following
each transfer. When the word count goes through zero, a Terminal Count (TC) will cause
an auto initialize if the channel has been programmed for this.
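The bookkeeping for one channel can be sketched in Python. This is a simplified model under stated assumptions: "goes through zero" is taken to mean the 16-bit count wrapping from 0 to FFFFH, and auto initialization is modeled as reloading the current registers from the base registers. The class name DmaChannel and its parameters are hypothetical.

```python
class DmaChannel:
    """Simplified model of one channel's address and word count registers."""

    def __init__(self, base_address, base_count, auto_init=False, decrement=False):
        self.base_address = base_address & 0xFFFF
        self.base_count = base_count & 0xFFFF
        self.auto_init = auto_init
        self.step = -1 if decrement else 1
        self.current_address = self.base_address
        self.current_count = self.base_count

    def transfer(self):
        """Account for one transfer; return True when terminal count occurs."""
        self.current_address = (self.current_address + self.step) & 0xFFFF
        self.current_count = (self.current_count - 1) & 0xFFFF
        terminal_count = self.current_count == 0xFFFF  # wrapped through zero
        if terminal_count and self.auto_init:
            # Auto initialize: restore the current registers from the base values.
            self.current_address = self.base_address
            self.current_count = self.base_count
        return terminal_count


channel = DmaChannel(0x1000, 2, auto_init=True)
print([channel.transfer() for _ in range(3)])
```

Under this model a programmed count of 2 yields three transfers before terminal count, after which the channel is ready to repeat the same block.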
Block transfer
In the block transfer mode the device is activated by a DREQ and continues with the
transfer until a TC, caused by the word count ending, or an external End of Process
(EOP) occurs. An auto initialization will occur at the end of the service if the channel has been
programmed for it.
Demand transfer
In the demand transfer mode the device continues with the transfer until a TC or external
EOP occurs or until DREQ goes inactive. This allows the transfer to continue until the I/O
device has no more data ready to transfer. When the I/O device has more data to
transfer, the DMA service is reestablished with a DREQ. During the time between
services, the values of address and word count are stored in the Current Address and
Current Word Count registers. An EOP is needed to Auto initialize at the end of the
service. The EOP may be generated by either a TC or by an external signal.
Cascade
The cascade mode is used to cascade more than one device for system expansion. The
HRQ and HLDA signals from the additional device are connected to the DREQ and
DACK signals of a channel of the initial device. This allows the DMA requests of the
additional device to propagate through the priority circuits of the preceding device.
The priority chain is preserved and the new device must wait for its turn to acknowledge
requests. Since the cascade channel of the initial device is used only for prioritizing the
additional device, it does not output any address or control signals of its own. These
could conflict with the outputs of the active channel in the added device. The 8237A will
respond to DREQ and DACK but all other outputs except HRQ are disabled. The ready
input is ignored.
This results in a two-level DMA system. More devices could be added at the second
level by using the remaining channels of the first level. Additional devices may also be
added by cascading into the channels of a second-level device to form a third level.
Controlling I/O devices (Polling and interrupts)
The three basic scheduling techniques used for controlling the input/output devices and
synchronizing the data transfers are as follows :
1. Polling or program-control
2. Interrupt-control
3. Direct-memory-access control
The one that is used in the microcomputer system depends on three factors: the rate
at which the data is to be transferred, the time delay between the I/O device becoming ready
and the actual data transfer, and the feasibility of overlapping or interleaving I/O operations.
The polling or programmed I/O technique is the simplest to implement. The I/O devices
are connected to the system bus with some connections to the control lines. The basic
principle is to implement a procedure in hardware or software for determining which
input/output device requires service. The polling technique is synchronous in nature as
the microprocessor periodically questions each device if it requires service. Each device
then answers with a yes or no. If a no is received, the microprocessor will advance to the
next device and question it. In this way the microprocessor checks each I/O device
successively to determine if service is required.
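One pass of the polling sequence can be sketched in Python. The Device class and its needs_service and service methods are hypothetical stand-ins for reading a device's status and performing its transfer; a real system would read status registers instead.

```python
class Device:
    """Hypothetical I/O device with a single ready flag."""

    def __init__(self, name, ready=False):
        self.name = name
        self.ready = ready

    def needs_service(self):
        return self.ready  # the device's yes or no answer

    def service(self):
        self.ready = False  # servicing clears the request


def poll_once(devices):
    """Question each device in turn and service those that answer yes."""
    serviced = []
    for device in devices:
        if device.needs_service():
            device.service()
            serviced.append(device.name)
    return serviced


devices = [Device("keyboard"), Device("printer", ready=True), Device("disk", ready=True)]
print(poll_once(devices))
```

The order of the list fixes the order of service, so a device near the end waits for every earlier device that answered yes.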
In a polling microcomputer the asynchronous inputs are detected by an instruction which
checks if the input has occurred. The sequence of instructions in the polling loop tests
the various input lines at a rate which will provide the desired system response time.
In a microcomputer system that communicates with several I/O devices, the periodic
status checks that must be made on each device can result in considerable time lags for
some devices since they indicate when they are ready to transfer data and then they
wait for the actual transfer.
In some microcomputer systems, the time spent checking the device status may be
reduced with a common test line, which signals when a device needs attention. The
microprocessor can periodically check the status of this line without having to poll the
individual devices until one of them signals for service. Then, a polling loop is used to
find which device requested service.
Polling takes minimal hardware since usually no special lines are required. It is also
synchronous with the program execution, so it is easy to find when a device is being
interrogated and how long it takes to service it. No events can occur that
disrupt the scheduled polling sequence. In contrast, interrupts and DMA are
asynchronous and cannot be predicted.
Interrupts
In applications where the polling technique does not provide fast enough response, or
uses too much microprocessor time, interrupts should be considered.
An interrupt line is connected to the microprocessor, and each of the devices is
connected to this line. Each one of the devices which may need to get service has the
option of using this line to request service.
When a device requests service, it generates an interrupt pulse or level on this line. The
microprocessor can then sense this change on the line. The microprocessor must accept
the interrupt, identify it, and service it. Accepting the interrupt may be done with an
internal mask bit called either an interrupt mask, interrupt inhibit, or interrupt-enable.
This bit is normally stored in the flag or status register. After the interrupt is accepted, the
microprocessor must then determine which device originated it.
Several devices may generate interrupts simultaneously. When multiple devices are
connected to the same interrupt line, priorities must be assigned.
After the interrupt has been accepted and the device identified, the service requested by
the device is performed. The microprocessor may suspend the program it was executing
and branch to the interrupt routine. The required branching address will need to be
available. The software that does this is called the interrupt routine or interrupt handler.
The execution of the interrupt routine is similar in some ways to that of a polling
system. The termination of this routine allows the program which had been suspended
by the interrupt to continue its execution. This may require several instructions.
Multiple Interrupts
The details of the interrupt-servicing procedures discussed so far apply only when a
single device is generating interrupts. Most microcomputer systems have more than one
source and more than one type of interrupt. The types of interrupts fall into three
general classes: external, internal, and simulated. External interrupts are generated from
the peripheral devices. Internal interrupts are generated by the microcomputer system to
indicate error conditions such as a power failure, system malfunction, or transmission
break. Simulated interrupts are generated by the software for interrupt testing and
debugging.
The different sources of interrupts can have different service requirements. Some may
require immediate attention, while others can wait until the task underway is completed.
The interrupt procedure must differentiate between the various sources and determine
the order in which the interrupts are serviced when more than one occurs at the same
time. Finally, the contents of registers must be saved and restored so the program can
continue after the multiple interrupts.
If several interrupt-request lines are used, each can have its own interrupt-trap address.
Then, with one source of interrupt assigned to each line, the system can distinguish
between internal, external, and simulated interrupts. When several I/O devices use the
same interrupt-request line, the interrupt may be recognized either by polling using
software or by vectored interrupts using hardware. In the polling technique as we have
discussed, the interrupt produces a jump to the service program using the interrupt trap
address. The service program will check the status word of each I/O device to determine
which one caused the interrupt. The interrupt status bit will indicate if a device has
generated an interrupt request as it is checked for each device. The device status word
is read into the status register of the microprocessor and if the bit is set, a jump is then
made to the service program.
Vectored Interrupts
A vectored interrupt system allows the microprocessor to recognize the interrupting
device since each I/O device is assigned a unique interrupt address. This address then
generates an interrupt trap address for the device. The trap addresses are normally
located sequentially in the program memory in order to form the interrupt vector. Each
location contains the starting address of a device-service program. The contents of the
interrupt vector are loaded into the program counter and program control transferred to
the correct device-service program.
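The lookup through the interrupt vector can be sketched in Python. The vector base address, the two-byte entry size, and the low-byte-first storage order are all hypothetical values chosen for illustration; each processor family defines its own layout.

```python
VECTOR_BASE = 0x0040  # hypothetical start of the interrupt vector in memory
ENTRY_SIZE = 2        # hypothetical: one 16-bit start address per device


def trap_address(device_number):
    """Trap addresses are located sequentially, one entry per device."""
    return VECTOR_BASE + device_number * ENTRY_SIZE


def load_program_counter(memory, device_number):
    """Fetch the service program's start address from the vector entry.

    Assumes the low byte is stored first.
    """
    address = trap_address(device_number)
    return memory[address] | (memory[address + 1] << 8)


# Device 2's entry holds the start address 0x2000 of its service program.
memory = {0x0044: 0x00, 0x0045: 0x20}
print(hex(load_program_counter(memory, 2)))
```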
Some vectored systems do not transmit an address. They use an I/O device to transmit
an instruction to the microprocessor after the request has been acknowledged. Next, the
system loads the instruction into an instruction register. Normal operation continues after
this instruction is executed.
The vectoring is achieved by a jump instruction which derives the jump address from
part of the instruction. A unique jump address is defined for each I/O device in the
system.
In systems with several sources of interrupt, one or more interrupt requests can occur
during the servicing of an earlier request. In the simplest way to handle this, the interrupt
mask bit is set when the first request is recognized. Then the following requests are
placed in a queue, waiting until the service of the first interrupt is complete before they
are recognized and serviced.
The order in which the queued interrupts are recognized will determine the individual
delay before service. This order, or priority, is set either by software or by hardware using
a priority scheme. After recognizing an interrupt request, the service program can poll the
devices in the desired order. The devices which are polled first will be serviced first.
In systems with hardware priority, the interrupt logic sends an external signal to the
request logic in each of the I/O devices. This signal indicates the state of the interrupt
mask bit and is passed to each device according to its priority. When the mask is set,
the signal prevents any device from generating an interrupt request. If the mask is reset
and the device has no interrupt request pending, the signal is passed on to the next
device. If a request is pending, the interrupt logic in the device will generate the interrupt
request and prevent the signal from passing on. If more than one device requires interrupt
service, the device receiving the control signal first will be serviced first.
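The hardware priority chain can be modeled in a few lines of Python. This is a behavioral sketch only: the list is assumed to be ordered from the first device to receive the signal to the last, and the function name daisy_chain is hypothetical.

```python
def daisy_chain(mask_set, pending):
    """Return the index of the device that gets service, or None.

    pending lists each device's request flag, ordered from the first
    device to receive the control signal to the last.
    """
    if mask_set:
        return None  # the signal prevents every device from requesting
    for index, has_request in enumerate(pending):
        if has_request:
            return index  # this device requests and blocks the signal
        # no request pending: the signal passes on to the next device
    return None


print(daisy_chain(False, [False, True, True]))  # the second device wins
print(daisy_chain(True, [True, True, True]))    # masked: no request is generated
```

Position in the chain is the priority, so changing priorities in this scheme means rewiring the order in which devices see the signal.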
Software and hardware priority schemes may be slow to respond to a high-priority
interrupt if it occurs during the servicing of a low-priority interrupt. Individual interrupt
mask bits for each interrupt request line or each I/O device can then be used. By setting
and resetting these individual mask bits under software control, the interrupt priorities
are changed to fit the needs of the application.
Some microprocessors use one interrupt-request line that has software-controlled mask
bits and another line that is permanently enabled. This nonmaskable interrupt request
line has the higher priority and is used when service is required immediately.
Some vectored systems use the microprocessor to define and control the priorities.
When an interrupt request occurs, it is transferred to the microprocessor and the
vectored address is compared with an interrupt-enabling mask. When the vectored
address is equal to or less than the mask, the request is recognized. The mask is then
set to one less than this address and servicing starts. If the vectored address is greater
than the mask, the request is simply queued.
Note that only interrupts issued from a device with an address lower than that of the
device being serviced are recognized. The lower the address, the higher the priority.
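The comparison against the interrupt-enabling mask can be sketched in Python. The class name PriorityMask and the use of a simple list as the queue are assumptions for illustration; restoring the mask after a service routine finishes is not described in the text and is left out of the model.

```python
class PriorityMask:
    """Model of a microprocessor-controlled interrupt priority mask."""

    def __init__(self, mask):
        self.mask = mask
        self.queue = []  # requests waiting to be recognized

    def request(self, vectored_address):
        """Return True if the request is recognized, False if it is queued."""
        if vectored_address <= self.mask:
            # Recognized: only lower addresses may interrupt from now on.
            self.mask = vectored_address - 1
            return True
        self.queue.append(vectored_address)
        return False


priority = PriorityMask(7)
print(priority.request(5))  # recognized, mask becomes 4
print(priority.request(6))  # address above the mask: queued
print(priority.request(3))  # lower address, so it interrupts the service of 5
```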
Interrupt Types
The types of interrupts include vectored, nonvectored, maskable, and nonmaskable. A
maskable interrupt can be turned off by the processor and is used when a software
operation cannot be interrupted. In these cases, the processor is instructed to disable
the maskable interrupts. There is usually a disable-interrupt instruction in the
processor's instruction set to do this. A nonmaskable interrupt cannot be turned off. This
type of interrupt is designed for critical events such as a power failure.
When an interrupt occurs, the processor can branch to a location that contains the first
instruction of the interrupt-service routine. Another approach is to use a location in
memory for the starting address of the service routine. In most newer processors, a
single interrupt service routine is not enough. These processors have several memory
locations reserved for the addresses of the interrupt service routines.
The hardware-selected interrupt scheme uses separate lines for each memory location
and its service routine. This is called a hardware-vectored interrupt. Alternatively, the
processor can use an interrupt acknowledge cycle, where the hardware supplies additional
information to the processor. This guides the processor to the proper service routine and is
called a software-vectored interrupt.
Interrupt Routines
When an interrupt branch occurs for an I/O request for service, either data is waiting for
input or output or there is a problem with the data transfer. The processor is much faster
than the peripheral devices, so instead of waiting for the device to get ready for the
transfer, interrupts are used.
Most I/O devices use buffers to hold the information to be transferred. When the output
buffer becomes empty or the input buffer becomes full, the interrupt service routine
signals that the data transfer is complete. The main program will create a buffer and then
fill it if it is an output buffer. Next, the interrupts are enabled.
The interrupts may be enabled by the processor, if they are maskable interrupts, or they
may be enabled by the I/O chip. In some cases hardware between the I/O chip and the
processor may have an interrupt enable facility.
High-Level Interrupts
The low or machine-level interrupts discussed previously are supported by the interrupt
circuits built into the processor. When a microcomputer is executing a program written in
a high-level language such as BASIC, thousands of machine instructions are being
executed and the registers in the processor are changing frequently.
The Microsoft BASIC used in the IBM PC supports user interrupt servicing through its
subroutines. The subroutines are invoked with the GOSUB (go subroutine) statement,
and control is returned to the main program with RETURN.
The user interrupt service routines are a variation of the subroutine. After the interrupts
are enabled, the subroutine is invoked when a peripheral causes an interrupt. The
subroutine is written in the high level language of the computer and is terminated with an
interrupt return statement such as RETURN.
If the interface is not busy before interrupts are enabled, the interrupt is immediate. The
enable is canceled as the service routine is invoked; otherwise, the interrupt
service routine could itself be interrupted. The interrupts must therefore be re-enabled
in the interrupt service routine if the transfer is not done.
Types of Peripheral Interfaces
The three major types of peripheral interfaces are parallel, serial, and analog. Each type
also has a number of different variations. Parallel interfaces are like microprocessor
buses. These are often used to interface personal computers to printers. Data is
transferred over a set of wires called data lines, like the microprocessor data bus.
Parallel interfaces vary in the number of data lines used and the number
of signals used for handshaking. Handshaking is a technique used to control the rate at
which information moves from one device to another.
Serial interfaces use a single line to transmit one bit at a time. The two types of serial
interface are asynchronous and synchronous. The asynchronous interface is more
common in microcomputers. The serial interface is often used to interface a mouse or
keyboard to a personal computer.
Analog interfaces are different from both serial and parallel interfaces since they do not
use digital signals (zero or one). Microprocessor buses use digital signals and serial and
parallel interfaces use digital signals to communicate with peripherals. Analog interfaces
must convert digital signals into signals that vary continuously or convert continuous
signals into digital signals.
Serial Interface Standards
A common serial interface standard is RS-232. The main RS-232 signal lines are those
used to transmit and receive data (BA and BB). These lines are used to send the serial
information between the two communicating systems. The following bit rates are
common: 19,200, 9,600, 4,800 and 2,400. Other rates have also been used in the past.
The other signals of RS-232 are used to indicate the status of the
modulator-demodulator (modem) communications link. Signals such as request-to-send,
clear-to-send, data-set-ready, and data-terminal-ready are used to control the modem link. The
signals between the modem (communications equipment) and the computer (or
terminal) implement a handshake similar to that used in other buses.
The difference in RS-232 is that the handshake is used only at the beginning and end of
a block of serial data. RS-232 has been popular in larger computers and this popularity
has migrated to PC peripheral communications.
PINOUT of the SERIAL PORT
(--> direction is out of PC)
(Note DCD is sometimes labeled CD)
Pin#     Pin#      Acronym  Full-Name            Direction  What-it-May-Do/Mean
(9-pin)  (25-pin)
3        2         TxD      Transmit Data        -->        Transmits bytes out of PC
2        3         RxD      Receive Data         <--        Receives bytes into PC
7        4         RTS      Request To Send      -->        RTS/CTS flow control
8        5         CTS      Clear To Send        <--        RTS/CTS flow control
6        6         DSR      Data Set Ready       <--        I'm ready to communicate
4        20        DTR      Data Terminal Ready  -->        I'm ready to communicate
1        8         DCD      Data Carrier Detect  <--        Modem connected to another
9        22        RI       Ring Indicator       <--        Telephone line ringing
5        7                  Signal Ground
Only 3 of the 9 pins have a fixed assignment: transmit, receive and signal ground. This is
fixed by the hardware and you can't change it. But the other signal lines are controlled by
software and may do (and mean) almost anything at all. However they can only be in
one of two states: asserted (+12 volts) or negated (-12 volts). Asserted is "on" and
negated is "off". For example, software may command that DTR be negated and the
hardware only carries out this command and puts -12 volts on the DTR pin. A modem (or
other device) that receives this DTR signal may do various things. If a modem has been
configured a certain way it will hang up the telephone line when DTR is negated. In other
cases it may ignore this signal or do something else when DTR is negated (turned off).
It's like this for all the 6 signal lines. The hardware only sends and receives the signals,
but what action (if any) they perform is up to the software and the configuration/design of
devices that you connect to the serial port. However, most pins have certain functions
which they normally perform but this may vary with the operating system and the device
driver configuration.
Serial and Parallel Ports
PC communications depend heavily on serial and parallel ports. The serial and parallel
ports are the connectors (plug-ins) on the back of the computer. The parallel and serial
ports are an important part of interfacing. They allow the computer to talk to the outside
world. A port is simply a connection, or plug-in, that gives access to the computer. The
computer peripherals, which are the devices that extend the usefulness of the computer,
such as printers, mice, and modems, all talk to the computer through the
communications ports.
When you connect peripherals and communications ports, first you must determine the
type of communications that the device uses, either parallel or serial. Then, you make
the physical connection; this requires the proper port(s) on your computer, and the right
cable(s). The next step is to inform the software of the connections that were made. The
connectors on the computer are referred to as ports. These can be thought of as the
passageway through which the signals are sent and received. Sending signals from the
computer is referred to as output and receiving signals is referred to as input. A printer
is an output device, a keyboard and a mouse are input devices, and a modem is both.
The two basic methods that PCs use to communicate with the outside world are the
serial and parallel communication techniques. The main difference between parallel and
serial is the way they transmit signals over their cables. Internally, the computer
recognizes each character as an 8-bit code. In serial communications the bits are
sent one at a time, over a single wire or a pair of wires.
In parallel transmission, the eight bits are sent at the same time over eight different wires. Parallel
communications are like an eight-lane expressway with automobiles next to each other
in the lanes. Serial communications, on the other hand, are more like railroad cars
travelling down a single track. Parallel ports are like an expressway since they can
handle larger volumes of traffic. But, just as vehicles in adjacent lanes can interfere with
each other, so can the wires in parallel transmission. The longer the distance becomes,
the greater is the chance of interference. Because of this potential for interference in
parallel communications, it is normally used only to communicate over short distances. Parallel
cables are usually no longer than 15 feet. Serial communications are more like a train of
cars connected together and riding on rails; it does not have the same potential for
interference between the signals. This allows serial cables to be used for distances up to
50 feet. Serial communications cannot transmit the same volume as parallel
communications because of the number of signal paths used. Another difference
between serial and parallel communications is the ease of use. Parallel communications
are more straightforward to the user. All that is needed is to plug them in; there are no
transmission parameters to configure and match between the sender and the receiver.
Parallel communications always occur in the same way. If a printer has a parallel
connector, it just needs to be plugged into a parallel port. Serial communications are more
flexible since they allow a variety of settings. This increases the potential uses but
requires the proper settings for transmission.
One setting is the speed. Effective communication is not possible if the sender uses one
speed and the receiver a different speed. The speed of serial communications is called
the baud rate, and common speeds are 300, 1200, 2400, 4800, and 9600 baud.
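As a rough illustration of what these speeds mean in practice, the sketch below converts a bit rate into characters per second. It assumes the common framing of one start bit, eight data bits, no parity and one stop bit (10 bits on the wire per character). The function name chars_per_second is only illustrative.

```python
def chars_per_second(bits_per_second, data_bits=8, parity_bits=0, stop_bits=1):
    """Return how many characters per second an asynchronous link can carry."""
    # Each character costs one start bit plus its data, parity and stop bits.
    bits_per_char = 1 + data_bits + parity_bits + stop_bits
    return bits_per_second / bits_per_char


for rate in (300, 1200, 2400, 4800, 9600):
    print(rate, "bps carries", chars_per_second(rate), "characters per second")
```

Under these assumptions a 9600 bps link carries 960 characters per second, because every 8-bit character occupies 10 bit times on the line.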
Because of their ease of use, parallel communications have become the method of
choice for the majority of IBM-compatible printers. Parallel communications ports are
easier to hook up and ready to be used after the connection is made. (Serial printers are
used when the distance to the printer is greater than 15 feet.)
Port Characteristics
Most PCs include one or more parallel ports, which are also called printer ports. These
two terms are interchangeable. Serial ports are normally used for a mouse. Interface
cards for both are available on expansion boards that can be installed in the expansion
slots inside the computer. These cards are usually inexpensive and can be installed
easily. Up to four parallel ports can be installed in a PC. These ports are designated
LPT1 to LPT4. Serial ports are numbered COM1 to COM4. If you already have two serial
ports in use, be sure any serial cards you add can be configured for COM3
(Communications Port 3) or COM4 (Communications Port 4). Many serial ports are not
designed to work as COM3 and COM4. The serial ports in these cards can only serve as
either COM1 or COM2.
Serial ports are RS-232 ports. This terminology is more common in larger computers.
The RS means recommended standard and the 232 is the identification number for the
standard that the Electronic Industries Association (EIA) uses. So if someone refers to an
RS-232 port they are talking about a serial port or a COM port. The serial ports on the
computer are either 9 or 25 pin male connectors. The parallel ports are always 25-pin
female connectors.
Identifying the Ports
In the back of your IBM personal computer or compatible, there are several different
connectors. The parallel ports are the 25-pin female connectors. These are called DB-25
connectors. On many video cards, the parallel printer port is usually under the smaller 9-
pin video output port.
Serial ports have the pins reversed to keep you from plugging a cable into the wrong plug.
On most newer computers (from the 286 on) the serial ports have a 9-pin male
connector (with the pins showing). This type of connector is known as a DB-9 connector.
On most XT computers, the serial port is a male 25-pin (DB-25) connector. The newer
computers use the 9-pin serial ports to save space. The smaller ports allow two serial
ports in the same amount of space. PC serial ports use only 9 pins, so the other 16 pins
of the DB-25 connector are unused.
There are adapters if you want to install a device that has a 9-pin serial connector into a
PC that has a 25-pin serial connector. This can occur when you try to hook up a serial
mouse that needs to be connected to a serial port. These adapters that allow such a
connection to be made have a 9-pin connector on one end and a 25-pin connector on
the other. These adapters are often included with a mouse or they can be purchased
separately.
The other connectors on the computer include a 9-pin video port connector that
connects to the monitor. A larger game port connector may also be used to connect a
joystick or game paddle.
When you add or change the devices that are hooked up to the computer, you may need
to change the software configurations. This is minimal if the device uses parallel
communications to one of the LPT parallel ports. You may be asked to confirm the LPT port
that the computer selected for you. But, if you want to have a serial communications port
for a printer or other device connected to one of the COM serial ports, you will need to
set the variable communications parameters.
The communication parameters instruct the computer which serial port to use, how fast it
is, and other parameters such as parity, data bits, and stop bits. These parameters can
either be set by an application software package or by the DOS MODE command.
MODE COM2:9600,n,8,1
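To make the meaning of each field explicit, here is a small sketch that assembles a MODE command line from named parameters. It assumes the usual DOS field order of bit rate, parity, data bits, stop bits. The helper mode_command is hypothetical and only builds the text; it does not configure any port.

```python
def mode_command(port, bps, parity="n", data_bits=8, stop_bits=1):
    """Build a DOS MODE command line for a COM port (illustrative helper)."""
    parity = parity.lower()
    if parity not in ("n", "o", "e"):
        raise ValueError("parity must be n (none), o (odd) or e (even)")
    if stop_bits not in (1, 2):
        raise ValueError("stop_bits must be 1 or 2")
    # Assumed field order: bit rate, parity, data bits, stop bits.
    return "MODE COM%d:%d,%s,%d,%d" % (port, bps, parity, data_bits, stop_bits)


print(mode_command(2, 9600))
```

Both ends of the link must be given the same four values, otherwise the receiver will misread the incoming bits.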
Receiver / Transmitter ICs
The early serial interfaces used integrated circuit shift registers for the parallel-to-serial
and serial-to-parallel conversions. Shift registers for synchronous transmission use a
clock to indicate when the next data bit is to be shifted in or out. In asynchronous
transmission the start and stop bits are loaded into the shift register and shifted out like
the data bits. When shift registers are used for asynchronous reception, circuits are
needed to synchronize the receiving shift register with the incoming bits.
Receiver / transmitter chips were introduced during the 1970s. These integrated circuits
usually provide two channels of asynchronous and synchronous receivers and
transmitters along with bit-rate generators, buffers and status, interrupt, and DMA control
lines.
These receiver / transmitter integrated circuits include the General Instrument
AY-3-1015D UART, Motorola 6850 ACIA, National Semiconductor 8250 ACE, and Intel
8251A USART. The following definitions are used for these receiver / transmitter chips:
UART Universal Asynchronous Receiver / Transmitter
ACIA Asynchronous Communications Interface Adapter
ACE Asynchronous Communications Element
USART Universal Synchronous / Asynchronous Receiver / Transmitter
Each of these integrated circuits performs the same task, but there are different
capabilities among these chips.
UART Chips
One of the first receiver / transmitter integrated circuits to become popular was the
Universal Asynchronous Receiver / Transmitter or UART (pronounced "you-art"). It
combines the transmitter and receiver shift registers with other features to simplify serial
interfacing.
The UART has separate receiver, transmitter and control sections. The transmitter and
receiver sections operate independently but they share the control and status pins.
The UART is like four separate shift registers that have their own control signal line. The
two write registers are a transmit-buffer register and a control register and the two read
registers are a received data buffer register and a status register. Each register has its
own data lines and control signal line.
UART CHIPS
UARTs (Universal Asynchronous Receivers and Transmitters) are chips inside
communication devices that are responsible for serially transmitting and receiving
information. One chip at each end of a serial communication channel can do the
communication. These chips convert bytes into bits and send each bit down a line,
where another chip transforms the bits back into bytes. UART chips are usually the
brains behind communicating serially on a personal computer. The CPU gets an
interrupt every time a byte is sent or received. The CPU then moves a received byte out
of the UART's register and into memory somewhere, or gives the UART another byte to
send.
The most recent UART chip is the 16550A. Common chips in a PC are the 2450,
16450, 8250, or 16550A. The 8250 and 16450 UARTs have only a 1-byte buffer. This
means that every time a byte is sent or received, the CPU gets an interrupt. For slow
communication up to 19200 bps, this is acceptable. However, at high communication
rates, the CPU might not have time to service the interrupts sent from the UART and
receive the information. The 16550A UART chip is important for high-speed communications
because it comes with a 16-byte FIFO. This means the 16550A chip can send and receive
up to 16 bytes of information before it has to interrupt the CPU. The CPU can then
transfer all 16 bytes at a time. Although the interrupt trigger level for this chip is seldom set right at 16,
the chip sends fewer interrupts and relieves congestion at the CPU.
When data is lost due to the CPU being unable to service the request of the UART chip,
correction algorithms detect the loss and send again. This results in slow communication,
but not always failed communication. That is why it may not be readily obvious to a user
that something is wrong.
UARTs (Universal Asynchronous Receiver Transmitter) are serial chips on your PC
motherboard (or on an internal modem card). The UART function may also be done on a
chip that does other things as well. On older computers like many 486's, the chips were
on the disk I/O controller card. Still older computers have dedicated serial boards.
The UART's purpose is to convert bytes from the PC's parallel bus to a serial bit-stream.
The cable going out of the serial port is serial and has only one wire for each direction of
flow. The serial port sends out a stream of bits, one bit at a time. Conversely, the bit
stream that enters the serial port via the external cable is converted to parallel bytes that
the computer can understand. UARTs deal with data in byte-sized pieces, which is
conveniently also the size of ASCII characters. Say you have a terminal hooked up to
your PC. When you type a character, the terminal gives that character to its transmitter
(also a UART). The transmitter sends that byte out onto the serial line, one bit at a time, at
a specific rate. On the PC end, the receiving UART takes all the bits and rebuilds the
(parallel) byte and puts it in a buffer.
Along with converting between serial and parallel, the UART does some other things as
a byproduct (side effect) of its primary task. The voltage used to represent bits is also
converted (changed). Extra bits (called start and stop bits) are added to each byte before
it is transmitted. Also, while the flow rate (in bytes / sec.) on the parallel bus inside the
computer is very high, the flow rate out the UART on the serial port side of it is much
lower. The UART has a fixed set of rates (speeds) which it can use at its serial port
interface.
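The parallel-to-serial conversion and the added start and stop bits can be sketched in a few lines. This is only a model of the bit sequence, not of any particular chip: it assumes a start bit of 0, data bits sent least significant bit first, an optional parity bit, and stop bits of 1. The names frame_byte and unframe are illustrative.

```python
def frame_byte(value, data_bits=8, parity=None, stop_bits=1):
    """Return the list of line bits for one asynchronous character."""
    # Data bits go out least significant bit first.
    data = [(value >> i) & 1 for i in range(data_bits)]
    bits = [0] + data                      # start bit is 0
    if parity == "even":
        bits.append(sum(data) % 2)         # makes the total number of 1s even
    elif parity == "odd":
        bits.append((sum(data) + 1) % 2)   # makes the total number of 1s odd
    bits += [1] * stop_bits                # stop bits are 1
    return bits


def unframe(bits, data_bits=8):
    """Rebuild the byte from a no-parity frame, checking start and stop bits."""
    if bits[0] != 0 or bits[data_bits + 1] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:data_bits + 1]))


frame = frame_byte(ord("A"))
print(frame, unframe(frame))
```

The receiving side performs the reverse step: it finds the start bit, collects the data bits, and rejects the character if the stop bit is not where it expects it.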
Two Types of UARTs
There are two basic types of UARTs: dumb UARTs and FIFO UARTs. Dumb UARTs
are the 8250, 16450, early 16550, and early 16650. They are obsolete but if you
understand how they work it's easy to understand how the modern FIFO UARTs work
(late 16550, 16550A, 16C552, late 16650, 16750 and 16C950).
There is some confusion regarding 16550. Early models had a bug and worked properly
only as 16450's (no FIFO). Later models with the bug fixed were named 16550A but
many manufacturers did not accept the name change and continued calling it a 16550.
Almost all 16550's in use today are like 16550A's. Linux will report it as being a 16550A
even though your hardware manual (or a label note) says it's a 16550. A similar situation
exists for the 16650 (only it's worse since the manufacturer allegedly didn't admit
anything was wrong). Linux will report a late 16650 as being a 16650V2. If it reports it as
a 16650 it is bad news: the chip is then used only as if it had a one-byte buffer.
FIFOs
To understand the differences between dumb and FIFO (First In, First Out queue
discipline) UARTs, first let's examine what happens when a UART has sent or received a byte.
The UART itself can't do anything with the data passing through it; it just receives and sends
it. For the obsolete dumb UARTs, the CPU gets an interrupt from the serial device every
time a byte has been sent or received. The CPU then moves the received byte out of the
UART's buffer and into memory somewhere, or gives the UART another byte to send.
The obsolete 8250 and 16450 UARTs only have a 1-byte buffer. That means that every
time 1 byte is sent or received, the CPU is interrupted. At low transfer rates, this is OK.
But, at high transfer rates, the CPU gets so busy dealing with the UART that it doesn't
have time to adequately tend to other tasks. In some cases, the CPU does not get
around to servicing the interrupt in time, and the byte is overwritten, because they are
coming in so fast. This is called an "overrun" or "overflow".
FIFO UARTs help solve this problem. The 16550A (or 16550) FIFO chips come with
16-byte FIFO buffers. This means that it can receive up to 14 bytes (or send 16 bytes) before
it has to interrupt the CPU. Not only can it wait for more bytes, but the CPU then can
transfer all (14 to 16) bytes at a time. This is a significant advantage over the obsolete
UARTs, which only had 1-byte buffers.
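A back-of-the-envelope calculation shows how much the FIFO reduces the interrupt load. The sketch assumes a steady stream of characters, 10 bits on the wire per character, and an illustrative line rate of 115,200 bps. None of these figures come from a measurement, and the function name is made up for the example.

```python
def interrupts_per_second(bits_per_second, bytes_per_interrupt, bits_per_char=10):
    """Estimate the receive interrupt rate for a continuously busy line."""
    chars_per_second = bits_per_second / bits_per_char
    return chars_per_second / bytes_per_interrupt


rate = 115200  # assumed line speed for the comparison
print("1-byte buffer:", interrupts_per_second(rate, 1), "interrupts per second")
print("trigger level 14:", interrupts_per_second(rate, 14), "interrupts per second")
```

With these assumptions the 1-byte UART interrupts the CPU 11,520 times per second, while a FIFO that interrupts after 14 bytes does so roughly 823 times per second.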
The CPU receives fewer interrupts, and is free to do other things. Data is rarely lost. Note
that the interrupt threshold of FIFO buffers (trigger level) may be set at less than 14: 1, 4
and 8 are other possible choices. As of late 2000 there was no way the Linux user could
set these directly (setserial can't do it). While many PCs only have a 16550 with 16-byte
buffers, better UARTs have even larger buffers.
Note that the interrupt is issued slightly before the buffer gets full (at say a "trigger level"
of 14 bytes for a 16-byte buffer). This allows room for a couple more bytes to be received
before the interrupt service routine is able to actually fetch all these bytes. The trigger
level may be set to various permitted values by kernel software. A trigger level of 1 will
be almost like an obsolete UART (except that it still has room for 15 more bytes after it
issues the interrupt).
Now consider the case where you're on the Internet. It's just sent you a short webpage
of text. All of this came in thru the serial port. If you had a 16-byte buffer on the serial
port which held back characters until it had 14 of them, some of the last several
characters on the screen might be missing as the FIFO buffer waited to get the 14th
character. But the 14th character doesn't arrive since you've been sent the entire page
(over the phone line) and there are no more characters to send to you. It could be that
these last characters are part of the HTML formatting, etc. and are not characters to
display on the screen but you don't want to lose format either.
There is a "timeout" to prevent the above problem. The "timeout" works like this for the
receive UART buffer. If characters arrive one after another, then an interrupt is issued
only when say the 14th character reaches the buffer. But if a character arrives and the
next character doesn't arrive soon thereafter, then an interrupt is issued anyway. This
results in fetching all of the characters in the FIFO buffer, even if only a few (or only one)
are present. There is also a "timeout" for the transmit buffer.
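The interplay of trigger level and timeout can be modelled with a toy receive buffer. This is a simplified sketch, not driver code: the class ReceiveFifo is hypothetical, time is counted in whole character times, and the timeout of four idle character times is an assumption chosen for the example.

```python
class ReceiveFifo:
    """Toy model of a FIFO UART receive buffer (not a real driver)."""

    def __init__(self, size=16, trigger_level=14, timeout=4):
        self.size = size
        self.trigger_level = trigger_level
        self.timeout = timeout   # idle character times before a timeout interrupt
        self.buffer = []
        self.idle = 0
        self.overruns = 0

    def receive(self, byte):
        """A byte arrives; return True if the trigger level has been reached."""
        self.idle = 0
        if len(self.buffer) < self.size:
            self.buffer.append(byte)
        else:
            self.overruns += 1   # buffer full: the new byte is lost
        return len(self.buffer) >= self.trigger_level

    def tick(self):
        """One character time passes with no arrival; return True on timeout."""
        if not self.buffer:
            return False
        self.idle += 1
        return self.idle >= self.timeout

    def drain(self):
        """The interrupt service routine fetches everything in the buffer."""
        data, self.buffer = self.buffer, []
        self.idle = 0
        return data


fifo = ReceiveFifo()
raised = [fifo.receive(b) for b in b"Hello"]   # only 5 bytes arrive
timed_out = [fifo.tick() for _ in range(4)]    # then the line goes quiet
print(raised, timed_out, bytes(fifo.drain()))
```

In the usage lines the five bytes never reach the trigger level, so no interrupt is raised on arrival; the fourth idle tick produces the timeout, and the service routine then fetches all five bytes.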
UART Model Numbers
Here's a list of UARTs. TL is Trigger Level.
8250, 16450, early 16550 : Obsolete with 1-byte buffers.
16550, 16550A, 16c552 : 16-byte buffers, TL = 1, 4, 8, 14.
16650 : 32-byte buffers. Speed up to 460.8 kbps.
16750 : 64-byte buffer for send, 56-byte for receive. Speed up to 921.6 kbps.
Hayes ESP : 1k-byte buffers.
The obsolete ones are only good for modems no higher than 14.4k (DTE speeds up to
38400 bps). For modern modems you need at least a 16550 (and not an early 16550).
For V.90 56k modems, it may be several percent faster with a 16650 (especially if you
are downloading large uncompressed files). The main advantage of the 16650 is its
larger buffer size as the extra speed isn't needed unless the modem compression ratio is
high. Some 56k internal modems may come with a 16650.
Many 486 PCs (old) and all Pentiums (or the like) should have 16550As (usually called
just 16550's) with FIFOs. Some better motherboards today (2000) even have 16650s.
The 8250 SERIAL COMMUNICATIONS CHIP
The 8250 and compatible chips provide nine I/O registers. Certain upwards compatible
devices provide a tenth register as well. These registers consume eight I/O port
addresses in the PC's address space. The hardware and locations of the addresses for
these devices are the following :
COM Port   Physical Base Address (in hex)   BIOS variable
COM1:      3F8                              40:0
COM2:      2F8                              40:2
The base address is the first of eight I/O locations consumed by the 8250. The exact
purpose of these eight I/O locations appears in the following table :
I/O Address (hex)   Description
3F8/2F8             Receive / Transmit data register. Also the L.O. byte of the Baud Rate Divisor Latch register.
3F9/2F9             Interrupt Enable Register. Also the H.O. byte of the Baud Rate Divisor Register.
3FA/2FA             Interrupt Identification Register (read only).
3FB/2FB             Line Control Register.
3FC/2FC             Modem Control Register.
3FD/2FD             Line Status Register (read only).
3FE/2FE             Modem Status Register (read only).
3FF/2FF             Shadow Receive Register (read only, not available on original PCs).
The following sections describe the purpose of each of these registers.
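The table can be captured as a base address plus a register offset, which is how software usually refers to these locations. The sketch below encodes only what the table states; the names COM_BASE, Reg and port_address are illustrative.

```python
from enum import IntEnum

# Base I/O addresses from the table above.
COM_BASE = {"COM1": 0x3F8, "COM2": 0x2F8}


class Reg(IntEnum):
    """Offset of each 8250 register from the base address."""
    DATA = 0      # receive / transmit data, or divisor low byte
    IER = 1       # interrupt enable, or divisor high byte
    IIR = 2       # interrupt identification (read only)
    LCR = 3       # line control
    MCR = 4       # modem control
    LSR = 5       # line status (read only)
    MSR = 6       # modem status (read only)
    SHADOW = 7    # shadow receive register (later devices only)


def port_address(com, reg):
    """Return the I/O address of a register on the given COM port."""
    return COM_BASE[com] + reg


print(hex(port_address("COM1", Reg.LSR)), hex(port_address("COM2", Reg.IER)))
```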
The Data Register (Transmit / Receive Register)
The data register is actually two separate registers: the transmit register and the receive
register. You select the transmit register by writing to I/O addresses 3F8h or 2F8h; you
select the receive register by reading from these addresses. Assuming the transmit
register is empty, writing to the transmit register begins a data transmission across the
serial line. Assuming the receive register is full, reading the receive register returns the
data. To determine if the transmitter is empty or the receiver is full, see the Line Status
Register.
The Interrupt Enable Register (IER)
When operating in interrupt mode, the 8250 SCC provides four sources of interrupt: the
character received interrupt, the transmitter empty interrupt, the communication error
interrupt, and the status change interrupt. You can individually enable or disable these
interrupt sources by writing ones or zeros to the 8250 IER (Interrupt Enable Register).
Writing a zero to a corresponding bit disables that particular interrupt. Writing a one
enables that interrupt. This register is read / write, so you can interrogate the current
settings at any time (for example, if you want to mask in a particular interrupt without
affecting the others). The layout of this register is
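The layout figure is not reproduced in this text. As a hedged sketch, the bit assignments below follow the standard 8250 data sheet (bit 0 character received, bit 1 transmitter empty, bit 2 communication error, bit 3 status change) rather than anything shown here. The constant and function names are illustrative.

```python
# Assumed IER bit positions, taken from the standard 8250 layout.
IER_RX_DATA = 0x01       # character received interrupt
IER_TX_EMPTY = 0x02      # transmitter empty interrupt
IER_LINE_ERROR = 0x04    # communication error interrupt
IER_MODEM_STATUS = 0x08  # status change interrupt


def enable_interrupt(ier, mask):
    """Return the IER value with the given source enabled, others unchanged."""
    return ier | mask


def disable_interrupt(ier, mask):
    """Return the IER value with the given source disabled, others unchanged."""
    return ier & ~mask & 0x0F


current = IER_RX_DATA                       # value read back from the IER
current = enable_interrupt(current, IER_LINE_ERROR)
print(hex(current))
```

Because the register is read / write, software can read the current value, change one bit as above, and write the result back without disturbing the other sources.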
The Baud Rate Divisor
The Baud Rate Divisor Register is a 16 bit register that shares I/O locations 3F8h / 2F8h
and 3F9h / 2F9h with the data and interrupt enable registers. Bit seven of the Line
Control Register selects the divisor register or the data / interrupt enable registers. The
Baud Rate Divisor register lets you select the data transmission rate (properly called bits
per second, or bps, not baud). The following table lists the values you should write to
these registers to control the transmission / reception rate :
Baud Rate Divisor Register Values
Bits Per Second   3F9/2F9 Value   3F8/2F8 Value
110               4               17h
300               1               80h
600               0               C0h
1200              0               60h
1800              0               40h
2400              0               30h
3600              0               20h
4800              0               18h
9600              0               0Ch
19.2K             0               6
38.4K             0               3
56K               0               1
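Most rows of the table can be reproduced with a single division. The sketch assumes the 1.8432 MHz UART clock used on the PC serial adapter and the chip's fixed divide-by-16, so the divisor is 115,200 divided by the bit rate. The clock value and the function name baud_divisor are assumptions, not taken from the text.

```python
UART_CLOCK_HZ = 1_843_200  # assumed clock frequency of the PC serial adapter


def baud_divisor(bits_per_second, clock_hz=UART_CLOCK_HZ):
    """Return (high_byte, low_byte) of the 16-bit divisor for a bit rate."""
    divisor = clock_hz // (16 * bits_per_second)
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("bit rate out of range for a 16-bit divisor")
    return divisor >> 8, divisor & 0xFF


for bps in (110, 300, 1200, 9600, 19200, 38400):
    high, low = baud_divisor(bps)
    print("%6d bps: high byte %02Xh, low byte %02Xh" % (bps, high, low))
```

With the assumed clock, a divisor of 1 works out to 115,200 bps and a divisor of 2 to 57,600 bps.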
The Interrupt Identification Register (IIR)
The Interrupt Identification Register is a read-only register that specifies whether an
interrupt is pending and which of the four interrupt sources requires attention. This
register has the following layout :
Since the IIR can only report one interrupt at a time, and it is certainly possible to have
two or more pending interrupts, the 8250 SCC prioritizes the interrupts. Interrupt source
00 (status change) has the lowest priority and interrupt source 11 (error or break) has
the highest priority, i.e., the interrupt source number provides the priority (with three
being the highest priority).
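A short decoder shows how an interrupt handler might use this register. The bit positions are an assumption based on the standard 8250 layout (bit 0 is 1 when nothing is pending, bits 1 and 2 hold the source number), since the layout figure is not reproduced here. The names are illustrative.

```python
# Source numbers as described above; a higher number means higher priority.
IIR_SOURCES = {
    0b00: "status change",
    0b01: "transmitter empty",
    0b10: "character received",
    0b11: "error or break",
}


def decode_iir(iir):
    """Return the pending interrupt source, or None if nothing is pending."""
    if iir & 0x01:               # assumed: bit 0 set means no interrupt pending
        return None
    return IIR_SOURCES[(iir >> 1) & 0b11]


for value in (0x01, 0x00, 0x02, 0x04, 0x06):
    print(hex(value), decode_iir(value))
```

A handler would typically call such a decoder in a loop, servicing the reported source and re-reading the register until it returns None.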
The Line Control Register
The Line Control Register lets you specify the transmission parameters for the SCC.
This includes setting the data size, number of stop bits, parity, forcing a break, and
selecting the Baud Rate Divisor Register. The Line Control Register is laid out as follows
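As an illustration, the sketch below assembles a Line Control Register value from the parameters just listed. The bit positions are assumptions based on the standard 8250 layout (bits 0 and 1 word length, bit 2 stop bits, bits 3 and 4 parity, bit 6 break, bit 7 divisor select), and the function name line_control is made up for the example.

```python
def line_control(data_bits=8, stop_bits=1, parity="n", set_break=False, dlab=False):
    """Assemble an 8250 Line Control Register value (assumed standard layout)."""
    if data_bits not in (5, 6, 7, 8):
        raise ValueError("data_bits must be 5, 6, 7 or 8")
    value = data_bits - 5                 # bits 0-1: 00=5, 01=6, 10=7, 11=8 bits
    if stop_bits == 2:
        value |= 0x04                     # bit 2: two stop bits
    value |= {"n": 0x00, "o": 0x08, "e": 0x18}[parity]  # bits 3-4: parity
    if set_break:
        value |= 0x40                     # bit 6: force a break
    if dlab:
        value |= 0x80                     # bit 7: select the Baud Rate Divisor
    return value


print(hex(line_control()), hex(line_control(dlab=True)))
```

Setting the divisor select bit before writing the divisor bytes, and clearing it afterwards, is what switches the first two I/O locations between the divisor and the data / interrupt enable registers.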
The Modem Control Register
The 8250’s Modem Control Register contains five bits that let you directly control various
output pins on the 8250 as well as enable the 8250’s loopback mode. The following
diagram displays the contents of this register :
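Since the diagram is not reproduced here, the following sketch uses the bit assignments of the standard 8250 data sheet as an assumption: DTR, RTS, OUT1, OUT2 and loopback in bits 0 through 4. The constant and function names are illustrative.

```python
# Assumed Modem Control Register bits (standard 8250 layout).
MCR_DTR = 0x01       # Data Terminal Ready output
MCR_RTS = 0x02       # Request To Send output
MCR_OUT1 = 0x04      # general purpose output 1
MCR_OUT2 = 0x08      # general purpose output 2
MCR_LOOPBACK = 0x10  # loopback (self-test) mode


def modem_control(dtr=False, rts=False, out1=False, out2=False, loopback=False):
    """Assemble a Modem Control Register value from individual flags."""
    value = 0
    for flag, mask in ((dtr, MCR_DTR), (rts, MCR_RTS), (out1, MCR_OUT1),
                       (out2, MCR_OUT2), (loopback, MCR_LOOPBACK)):
        if flag:
            value |= mask
    return value


print(hex(modem_control(dtr=True, rts=True)))
```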
The Line Status Register (LSR)
The Line Status Register (LSR) is a read-only register that returns the current
communication status. The bit layout for this register is the following :
The data available bit is set if there is data available in the Receive Register. This also
generates an interrupt. Reading the data in the Receive Register clears this bit.
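A polling routine normally tests individual bits of this register. The sketch below names the bits according to the standard 8250 layout, which is an assumption because the layout figure is missing from this text. The table LSR_BITS and the function decode_lsr are illustrative.

```python
# Assumed Line Status Register bits (standard 8250 layout).
LSR_BITS = [
    (0x01, "data available"),
    (0x02, "overrun error"),
    (0x04, "parity error"),
    (0x08, "framing error"),
    (0x10, "break detected"),
    (0x20, "transmitter holding register empty"),
    (0x40, "transmitter shift register empty"),
]


def decode_lsr(lsr):
    """Return the names of all status bits that are set in an LSR value."""
    return [name for mask, name in LSR_BITS if lsr & mask]


print(decode_lsr(0x61))
```

In polled operation the program would wait for "data available" before reading the receive register, and for the transmitter holding register to be empty before writing the next byte.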
The Modem Status Register (MSR)
The Modem Status Register (MSR) reports the status of the handshake and other
modem signals. Four bits provide the instantaneous values of these signals; the 8250
sets the other four bits if any of these signals has changed since the last time the CPU
interrogated the MSR. The MSR has the following layout :
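The split into four "changed" bits and four "current value" bits can be shown with a small decoder. It assumes the standard 8250 arrangement, with the change flags for CTS, DSR, RI and DCD in bits 0 to 3 and the current levels in bits 4 to 7. The names MSR_SIGNALS and decode_msr are illustrative.

```python
# Signal order assumed from the standard 8250 layout.
MSR_SIGNALS = ("CTS", "DSR", "RI", "DCD")


def decode_msr(msr):
    """Return, per signal, whether it changed and what its current level is."""
    return {
        name: {
            "changed": bool(msr & (1 << index)),       # bits 0-3: change flags
            "level": bool(msr & (1 << (index + 4))),   # bits 4-7: current values
        }
        for index, name in enumerate(MSR_SIGNALS)
    }


status = decode_msr(0x11)
print(status["CTS"], status["DCD"])
```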
The Auxiliary Input Register
The auxiliary input register is available only on later model 8250 compatible devices.
This is a read-only register that returns the same value as reading the data register. The
difference between reading this register and reading the data register is that reading the
auxiliary input register does not affect the data available bit in the LSR. This allows you
to test the incoming data value without removing it from the input register.
82050 Asynchronous Communications Controller
This Intel chip allows asynchronous operation in a 5-bit to 8-bit character format. Odd-,
even-, or no-parity generation and detection is allowed with a bit rate to 56 Kb/s. There is
a programmable, 16-bit baud rate generator with an on-chip crystal oscillator. It is
available in a 28-lead DIP or PLCC package.
This CHMOS 82050 asynchronous communications controller is a low-cost alternative to
the INS 16450. It emulates INS 16450 and is compatible with IBM PC software. The
82050 is also used in modems when combined with modem chips like Intel’s 89024.
82050 Signals
The three address pins (A2-A0) or pins 24-22 interface with the system address bus to
select one of the internal registers for read or write operations. D7-D0 on pins 1-4 and
25-28 make up the bi-directional, three state, 8-bit data bus. It allows the transfer of
bytes between the microprocessor and the 82050. A RESET input on pin 17 resets the
82050. CS* is the chip select on pin 18. A low on this input pin enables the 82050 and
allows the following read or write operations :
1. RD* on pin 20 allows the microprocessor to read data or status from the chip.
2. WR* on pin 19 allows the microprocessor to write data or control bytes to the 82050.
INTERRUPT is on pin 5. A high on this output indicates an interrupt request to the
microprocessor. The source and cause of the interrupt can be found by reading the
status registers.
CLK/X1 on pin 9 is used for the internal system clock. In the CLK mode an
externally generated clock is used. In the X1 mode the clock is generated by a crystal
that is connected between the X1 and X2 pin.
OUT2*/X2 on pin 8 is another dual-function pin. OUT2* is a general purpose
output that is available when the CLK/X1 pin is driven by an externally generated clock. X2 is an
output pin for the crystal oscillator. The configuration of this pin takes place during a
hardware reset.
TXD on pin 6 is the TRANSMIT DATA pin. The serial data is transmitted on this
output pin starting at the least significant bit. RXD on pin 13 is used as the RECEIVE
DATA pin. The serial data is received on this pin starting at the least significant bit.
RI* on pin 10 is used as a ring indicator input. DTR* on pin 15 is the DATA
TERMINAL READY output. During a hardware reset, this pin is an input used to set the
system clock mode.
DSR* on pin 11 is the DATA SET READY input. RTS* on pin 16 is the REQUEST
TO SEND output. During a hardware reset, this pin is an input used to set the system
clock mode.
CTS* on pin 14 is the CLEAR TO SEND input. DCD* on pin 12 is the DATA
CARRIER DETECTED input. Pin 21 is the device power supply and pin 7 is ground.
System Interface
The 82050 uses a demultiplexed bus interface made up of a bi-directional, three-state,
8-bit data bus and a 3-bit address bus. The Reset, Chip-Select, Read, and Write pins,
along with the Interrupt pin, are the other signals needed to interface to the
microprocessor.
The system clock can be generated externally and sent to the CLK pin. The on-chip
crystal oscillator is used by connecting a crystal to the X1 and X2 pins. The 82050 chip
along with a transceiver, address decoder, and crystal, completes the interface to the IBM
PC bus.
Transmitting and Receiving
In the 82050, the transmission mechanism involves a section in the chip called the TX
machine along with the TXD register. The TX machine reads characters from the TXD
register, serializes the bits, and transmits them over the TXD pin according to signals
provided by the baud rate generator. It also generates the parity and break
transmissions.
Receiving involves a section called the RX machine along with the RXD register. The RX
machine assembles the incoming characters and loads them onto the RXD register. The
RX machine also synchronizes the data, passes it through a digital filter to filter out
spikes, and generates the bit polarity. The falling edge of the start bit triggers the RX
machine, which then samples the RXD input. When a start bit is detected, the RX
machine samples for data bits.
If the RXD input is low for an entire character time, then the RX machine sets Break
Detect and Framing Error bits in the Line Status Register (LSR) and loads a NULL
character into the RXD register. The RX machine then goes into an idle state until it
senses a one, and then it resumes normal operation.
Like other I/O-based peripherals, the 82050 is programmed through its registers. The
82050 register set is the same as the 16450 register set to provide compatibility with
previous software written for the IBM PC.
Parallel Interfacing Techniques
Microcomputer interfaces are designed to link microprocessor buses with peripheral
devices. They can take the form of a board plugged into the microprocessor bus or they
can be built into the main circuit board. Built into the interface board or the main circuit
board is a connector to link the interface to the peripheral. The interface depends on the
signals that are passed through this cable and the circuits on the interface board or main
board that generate these signals.
A simple parallel interface can be built with a single TTL integrated circuit. Other parallel
interfacing techniques, such as IEEE-488 or SCSI, require complex circuitry.
Parallel interfaces have two major distinguishing features. There is the data path or
width, which is the number of bits transferred in parallel by the interface. In addition to
the data path, there is the type of handshake used to co-ordinate the movement of these
bits between the computer and the peripheral.
The data width can range from a single bit to 128 bits or wider. The most common size
for microcomputers is an 8-bit data path. This allows the microprocessor to transfer an
8-bit data word over the interface during each transfer. The 8-bit parallel interface is also
used by 16- and 32-bit microprocessors. This is because of the large number of 8-bit
peripherals available. These devices, such as printers, were originally designed for 8-bit
computers. Another reason is that ASCII, the most common character code, requires at
least a 7-bit interface. Larger data words are also transferred where higher speeds are
required.
Handshaking
The type of handshake used to move information over the data line can be classified by
the number of wires dedicated to the handshaking operation. This results in zero-wire
handshakes, one-wire handshakes, two-wire handshakes and three-wire handshakes.
Within these classifications there are variations on how the wires are actually used for
the handshake. This includes how they are pulsed or interlocked.
The zero-wire handshake is the simplest of these interfaces. It uses an 8-bit latch to
store the state of the processor's data bus. On the rising edge of a write signal, the latch
takes the states of the data bus lines and stores them. The states are
reproduced on the output lines of the latch after the write signal occurs. A 16-bit interface
can be built by adding a second 8-bit latch.
This type of zero-wire-handshake parallel-output interface can be used to drive simple
outputs for lights or relays. Devices such as these do not have handshaking
requirements. Each of the output lines from the interface can be used to drive a light or
relay.
The single wire handshake requires adding another wire to indicate when data is valid
on the data lines. This signal has the effect of stretching the write pulse from the
microprocessor. This allows slower devices to respond to the write pulse and provide
some settling time.
A two-wire handshake adds another line so the receiving device can indicate when it is
ready for data. This provides a true handshake using this acknowledge line, and full
interlocking is possible. The two-wire handshake is adequate for interfacing a single
peripheral, but some interfaces use a third wire to create a protocol that allows several
peripherals to use the interface. An example of this is the IEEE-488 bus.
Zero-Wire Handshake
In this relatively simple interface, an 8-bit latch stores the signals from the
microprocessor's data bus. An address circuit is also used to interface to the
microprocessor. This can take the form of a NAND gate, as shown in the figure. The gate
has two inputs. The address valid input goes true when the address of the circuit
appears on the microprocessor address bus. The generation of this address may require
a comparison of the states of address and control lines in the microprocessor. This is
usually done with exclusive-OR gates.
The write input may be generated by the microprocessor or it may be decoded from the
microprocessor address and control-line states. When both the write and address valid
signals are high, the output of the NAND gate is low. When the write input goes low, the
output of the NAND gate goes high, which is the idle state.
The zero-wire-handshake parallel-output interface is used to drive simple peripherals
such as lights or relays. Each of the output lines from the interface is used to drive a
relay, incandescent lamp, or light-emitting diode (LED). If the light or relay is connected
between the latch outputs and a +5 volt power supply, then the lamp or relay will be on
when that bit in the latch is a zero. Most TTL circuits use this approach since the TTL
outputs are a better sink for current than a source for current.
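Because a lamp wired to +5 volts lights when its latch bit is zero, software has to write
an inverted pattern to the latch. The following sketch shows one way to compute that byte.
The function name and the idea of numbering the output lines 0 through 7 are assumptions
made for illustration only.

```python
def latch_value_for_lamps(lamps_on, width=8):
    """Return the value to write to the latch so the listed lines sink current.

    A 0 bit turns the lamp on (active low); a 1 bit leaves it off.
    """
    value = (1 << width) - 1            # all ones: every lamp off
    for line in lamps_on:
        if not 0 <= line < width:
            raise ValueError("output line out of range: %r" % line)
        value &= ~(1 << line)           # clear the bit to light this lamp
    return value


if __name__ == "__main__":
    # Light the lamps on output lines 0 and 3
    print(hex(latch_value_for_lamps([0, 3])))
```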
A zero-wire-handshake parallel-input interface can also be used. It is similar to the zero-
wire handshake output interface, since a NAND gate is used with one of the inputs
connected to the address valid line. The other input is connected to a microprocessor
read line.
When both inputs to the NAND gate are true, then the latch or buffer is enabled and the
data at the buffer inputs is placed on the microprocessor data bus. This occurs during
the read cycle of the microprocessor.
This type of zero-wire-handshake parallel interface can be used to read the status of a
bank of switches. Each of the buffer inputs goes through a switch and a resistor. These
resistors act as current limiters and pull-up resistors. They pull the buffer inputs up
closer to +5 volts when the switch is open. The buffer has some resistance to ground so
the resistor is used to swamp this resistance. These resistors also limit the current when
the switch is closed since the +5 volts would be grounded.
A read at the desired address by the microprocessor allows the states of the input lines on
the buffer to be placed on the microprocessor data bus so that the processor may read
them. The input lines are controlled by the states of the switches.
The IBM PC and clones use an encoded keyboard which is connected to the system unit
with a 5-pin input-output plug. Two of the pins provide the power (+5 volts and ground).
The other three are left to provide the interface between the keyboard and the system
board. This is done with serial transmission through the keyboard cable.
In the keyboard, a key depression causes the encoding circuits to generate a code
for the key. The keyboard feeds this code to the system unit. Most
keyboards use a keyboard processor, like an 8048 microprocessor. The 8048 has an 8-
bit microprocessor and 2 Kbytes of ROM. The ROM is preloaded with a character code
known as a scan code.
Debouncing
The mechanical contact that occurs when you strike a key can generate oscillations.
When a key is pressed and makes the metallic connection, there is a short period of
oscillation until the connection is completed. This usually lasts for a few milliseconds.
During this time the keyswitch voltage is not stable and it oscillates between the two
switching voltages. The same type of oscillations occur when the key is released.
In nonencoded keyboards a resistor and capacitor can be used as a filter to reduce
these oscillations. In the encoded keyboards used in the IBM PCs and clones, a delay of
a few milliseconds is used before the keystrike is encoded. The delay is usually made
with a programmed loop that inserts the delay. This inhibiting of the key action during the
switch bouncing is called debouncing. The 8048 microprocessor performs this
debouncing by generating an interrupt during the time the keyboard voltage is bouncing.
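The delay-based approach can be illustrated with a small software debouncer. This is a
generic sketch, not the routine used in the 8048 keyboard processor. It assumes the key
line is sampled at a fixed interval (for example once per millisecond) and that a change is
accepted only after stable_count consecutive samples disagree with the reported state; both
the function name and the parameters are hypothetical.

```python
def debounce(samples, stable_count=5, initial=0):
    """Return the debounced key state for each raw sample (0 or 1)."""
    state = initial      # state currently reported to the rest of the system
    run = 0              # consecutive samples that differ from the reported state
    out = []
    for s in samples:
        if s == state:
            run = 0                  # bounce returned to the old level; ignore it
        else:
            run += 1
            if run >= stable_count:  # new level has been stable long enough
                state = s
                run = 0
        out.append(state)
    return out


if __name__ == "__main__":
    raw = [0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1]   # bouncing key press
    print(debounce(raw, stable_count=3))
```

The bouncing samples at the start are suppressed, and the key is reported as pressed only
once the contact has settled.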
One-Wire Handshake
Most peripheral devices like printers have timing requirements for the various
operations. A single-handshake wire can be used to indicate when information is valid on
the data lines.
The one-wire handshake is the next step up from the zero-wire handshake. In the zero-
wire handshake interface, the output latch or input buffer was controlled by a single wire
tied to the clock input on the latch or the enable input on the buffer.
A one-wire parallel-output interface can be built from a zero-wire parallel-output
interface. A new signal for the peripheral-write-pulse handshake signal can be generated
with flip-flops.
When the microprocessor write signal ends, this is used to toggle the first D flip-flop. The
D input to this flip-flop is always high, so the flip-flop is set when a positive going signal
appears at its clock input. This flip-flop indicates that an output operation has been
started.
A one-wire-handshake parallel-input interface can also be built from a zero-wire-handshake
input interface. The peripheral needs to send the microprocessor some information at a
particular time, so it places the information on the peripheral data lines and then sends
the peripheral strobe. This allows the latch to hold the information on the peripheral data
lines and sets the interrupt flip-flop.
The interrupt takes the microprocessor into an interrupt service routine, which forces the
processor to read the input interface. The read operation places the contents of the latch
on the data bus and resets the interrupt flip-flop.
Two-Wire Handshake
The single-handshake interface does not indicate if the peripheral device is ready for a
data transfer. The single handshake presents the message and it assumes the
peripheral is ready to accept the data. Multiple-wire handshakes are usually
implemented with integrated circuits designed for the interface instead of using latches,
buffers, flip-flops, and gates.
One type of two-wire handshake interface for parallel output ports uses a pulsed
handshake. The interface places the data to be output on the data lines and then a
strobe pulse is sent. This is the same as a one-wire handshake. The additional line is
used by the peripheral as an acknowledge signal. It indicates that the peripheral has
accepted the information. It is also used to signal that the peripheral is ready for another
data transfer. Both of these are signaled by the falling edge of the acknowledge pulse.
An interlocked handshake with unique state conditions is needed. If the strobe and
acknowledge are overlapped as shown in the figure, then the two handshake lines are
interlocked. The strobe and acknowledge timings start with data being placed on the
data bus and then the strobe is turned on, starting the transfer. The strobe is held on
while acknowledge is switched on. Then strobe is turned off, followed by acknowledge, to
end the cycle.
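The interlocked sequence described above visits four distinct line states in a fixed order.
As a rough illustration, the sketch below checks whether a recorded trace of (strobe,
acknowledge) samples follows that order. Representing "on" as 1 and "off" as 0, and the
function name itself, are assumptions made for this example.

```python
# One full interlocked cycle as (strobe, acknowledge) pairs, 1 = on, 0 = off
INTERLOCKED_ORDER = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]


def is_interlocked_cycle(trace):
    """Return True if the sampled trace is exactly one interlocked transfer."""
    collapsed = []
    for sample in trace:
        # Drop repeated samples so only the state changes remain
        if not collapsed or collapsed[-1] != sample:
            collapsed.append(sample)
    return collapsed == INTERLOCKED_ORDER


if __name__ == "__main__":
    interlocked = [(0, 0), (1, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
    pulsed = [(0, 0), (1, 0), (0, 0), (0, 1), (0, 0)]
    print(is_interlocked_cycle(interlocked), is_interlocked_cycle(pulsed))
```

A pulsed handshake fails the check because the strobe drops before acknowledge rises, so
the two signals never overlap.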
The two-wire handshake parallel-input interface is similar to the two-wire output
interface. In a pulsed two-wire parallel input handshake, the interface asks for
information from the peripheral by sending the strobe. The peripheral places the
information on the data lines and sends acknowledge. The interface will use a data latch
to hold the data during the acknowledge pulse.
In an interlocked, two-wire input handshake, the interface asks for information from the
peripheral by sending the strobe. The strobe remains on and overlaps the acknowledge
which is turned on when the peripheral places the information on the data lines. These
lines remain valid until strobe is turned off. This allows the microprocessor enough time
to read the data lines, so no latch is needed. When the data is accepted by the
processor, the strobe is turned off. The peripheral then turns off acknowledge and
completes the transfer.
Centronics Parallel Printer Interface
The Centronics printer interface is an 8-bit parallel connection that uses a three-wire
handshake. This interface does not support device addresses, so only one device can
be connected to the output port. The following signals are used in this interface.
Signal        Function
STROBE        Starts the reading of data, initiated by the computer.
ACK           Indicates that the printer has received data and is ready to accept the next data.
BUSY          Indicates that the printer cannot receive data.
PE            Indicates that the printer is out of paper.
SELECT        Indicates that the printer is online.
DEMAND        Inverse of the BUSY signal.
INPUT PRIME   A pulse from the computer that initializes the printer.
FAULT         Indicates that the printer is in the error mode.
Types of Parallel Interfaces
The nonprogrammable parallel interface is used for the simplest applications. It performs
the basic bus-interface functions and although it can include some interrupt-request
control logic, it operates as a simple parallel I/O port.
A hardware-programmable interface includes decoding logic, addressable parallel I/O
ports, and interrupt-control logic. External wiring or switches are used to determine the
address, data direction, and width of each port, and to control the operations of the
interface.
The type of general-purpose parallel interface that is most popular is software-
programmable. Here the computer software will determine how the interface is
structured. The interface is controlled from the contents of a control register which is
loaded and updated by the software program. This type of interface can also include
another control register called the data-direction register. It allows the input or output
function of individual I/O lines to be selected by the program.
The programmable input/output interface chips are not based on an industry standard.
Since no standards have been established for these devices, the component
manufacturers use various names for them. PIO is sometimes used to designate this
general class of programmable I/O devices. The various differences are found in the
manufacturer's literature.
These PIO programmable interface devices provide the basic input and output functions
for a parallel data interface. In order to connect an input or output device to a
microprocessor data bus, the minimum connection requires latches for the inputs and
outputs. An input latch must hold the data long enough for the microprocessor to read
the data and it also isolates the signals from the bus.
The output latches must hold the output data long enough for the output device to make
use of it. The data on a typical microprocessor bus may be valid for a time period that is
too short for many input/output devices to react and make use of it.
This basic type of general-purpose parallel I/O interface thus requires at least one input
register, one output register, status bits, and some interrupt control. There are at least 16 or
24 I/O lines in these general purpose interface chips to provide a number of channels.
These channels, which are also called ports, usually provide an 8-bit single-byte
connection which is configured as some combination of inputs or outputs. One or more
command registers may be used to specify the configuration of the ports and the
operation of the control logic.
The use of a data-direction register allows you to define each bit of these ports as an
input or an output in any combination desired. Each bit of the data-
direction register specifies if a corresponding bit of the PIO port will be an input or an
output. For example, a 0 in the data-direction register may specify an input, while a 1
specifies an output.
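The effect of a data-direction register can be modeled with simple bit masking. The sketch
below assumes the convention mentioned above (0 selects input, 1 selects output), which
varies between devices, and uses a hypothetical function name. It computes the level
present on each pin of an 8-bit port given the direction register, the output latch, and the
levels driven by external devices.

```python
def port_pin_levels(ddr, output_latch, external_levels):
    """Combine output and input bits of an 8-bit port into the levels on its pins.

    ddr bit = 1: the pin is an output and follows the output latch.
    ddr bit = 0: the pin is an input and follows the external device.
    """
    outputs = output_latch & ddr
    inputs = external_levels & ~ddr & 0xFF
    return outputs | inputs


if __name__ == "__main__":
    # Upper four lines are outputs, lower four lines are inputs
    print(hex(port_pin_levels(ddr=0xF0, output_latch=0xAB, external_levels=0x05)))
```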
The typical PIO multiplexes its connections to the microprocessor data bus into two or
more of the 8-bit ports. The maximum is three, because of the control and address lines
for the I/O devices, using a 40-pin package for the PIO. A typical PIO configuration is
shown in the figure. The device has two ports and each has its own buffer and function or
direction register. A status or mode register is used to indicate the status of each 8-bit
port.
Using the PIO
In order to use a PIO, the microprocessor must execute the following operations: (1)
load the control registers to specify the mode in which the control signals operate and (2)
load the direction registers to specify the direction in which the lines (which make up the
ports) will operate.
These operations must be done for every port in the interface. To load a register, the
microprocessor selects one of the internal registers by placing the appropriate pattern on
the address bus, and then supplies the 8 data bits to be transferred into that register on
the data bus. The multiplexer in the chip will gate the 8-bit data to the register.
The microprocessor must also generate the read or write signal on the control bus. To
read the status from the chip, the contents of the status register are read. After the chip
has been configured with its control and direction registers loaded, no additional
changes are normally necessary and the microprocessor will communicate with the data
buffers using a single instruction.
The trend in interfaces has been toward more programmed functions. Higher levels of
integration result in more functions per chip. This trend has been occurring in most of the
newer microprocessor chips.
Programmable Interfaces
A programmable interface to connect three L.E.D. display digits would use three I/O
ports, one for each digit. A scanned multiplex system could also be used which would
use two ports but more software. There are a number of chips that can be used for this
simple application and, depending which type is used, it will tend to have certain
characteristics that may differ from other chips. The chip designed for the 6800
microprocessor family is the 6820 peripheral interface adapter (PIA). Each 6820 is a
double port device with two sets of eight output lines.
In an application like this requiring three digits, one-and-a-half PIA chips are needed to
service the display. Each chip has two data registers which are called peripheral
registers in the PIA. One of these registers is used for each set of input/output lines.
There are also two other types of registers used with each peripheral register. This gives
a total of six registers in each chip. One of these is the data direction register which
controls the directions of the input/output lines. Each data direction register has eight
bits, one for each input/output line.
Since the PIA has six registers and only two register select (RS) pins, the data and data
direction registers in each port share the same address. They differ by the value of bit 2
of the control register. Table 6.3 indicates how the registers are selected using the RS1
and RS0 pins and the state of the internal bit 2 of the control register.
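The register selection just described can be written out as a small lookup. The mapping
below follows the commonly published 6820 register-select scheme rather than Table 6.3
itself, so it should be treated as an assumption: RS0 = 1 selects a control register, and
with RS0 = 0 bit 2 of the matching control register chooses between the peripheral register
(bit 2 = 1) and the data direction register (bit 2 = 0). The function name is hypothetical.

```python
def pia_register(rs1, rs0, cra_bit2=0, crb_bit2=0):
    """Return the name of the 6820 register addressed by RS1, RS0 and CR bit 2."""
    if rs0 == 1:
        # Control registers have their own addresses
        return "Control Register A" if rs1 == 0 else "Control Register B"
    if rs1 == 0:
        # Side A: peripheral and data direction registers share one address
        return "Peripheral Register A" if cra_bit2 else "Data Direction Register A"
    # Side B works the same way, using bit 2 of control register B
    return "Peripheral Register B" if crb_bit2 else "Data Direction Register B"


if __name__ == "__main__":
    print(pia_register(0, 0, cra_bit2=0))   # direction register while CRA bit 2 = 0
    print(pia_register(0, 0, cra_bit2=1))   # peripheral register once bit 2 is set
```

A typical initialization would clear bit 2, write the direction register, then set bit 2 so
that later accesses at the same address reach the peripheral register.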
Since the PIA cannot drive a heavily loaded data bus with many connections, it is
sometimes required to buffer the data bus to this chip using a tristate buffer.
8255 Programmable Peripheral Interface (PPI)
This is a parallel I/O chip with four registers. The interface to the microprocessor is made
up of a chip-select pin (CS), two address pins (A0 and A1), three control pins [READ (RD),
WRITE (WR), and RESET] and eight bidirectional data pins (D0 through D7). The I/O
pins are grouped into four ports : Port A, Port B, Port C Upper, and Port C Lower. Ports A
and B are 8 bits wide, while ports C Upper and C Lower have 4 bits each. This provides
a total of 24 I/O pins.
Control of the registers depends on the state of the inputs shown as follows :
A1  A0  RD*  WR*  CS*  Operation
0   0   0    1    0    Read Port A
0   1   0    1    0    Read Port B
1   0   0    1    0    Read Port C
0   0   1    0    0    Write to Port A
0   1   1    0    0    Write to Port B
1   0   1    0    0    Write to Port C
1   1   1    0    0    Write to Control Register
X   X   X    X    1    Not recognized
1   1   0    1    0    Illegal
X   X   1    1    0    Not recognized
X indicates that the pin may assume either level.
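The selection logic in this table can be expressed as a short decoding function. The
sketch assumes that the two address bits, taken as the two-bit value A1 A0, select Port A,
Port B, Port C, and the control register in ascending order, and that RD*, WR*, and CS*
are active low. The function name is hypothetical.

```python
def decode_8255(a1, a0, rd_n, wr_n, cs_n):
    """Return the operation selected by the 8255 bus-interface pins."""
    if cs_n == 1:
        return "Not recognized"          # chip not selected
    if rd_n == 1 and wr_n == 1:
        return "Not recognized"          # selected, but no read or write strobe
    if rd_n == 0 and wr_n == 0:
        raise ValueError("RD* and WR* must not be active together")
    targets = ["Port A", "Port B", "Port C", "Control Register"]
    target = targets[(a1 << 1) | a0]
    if rd_n == 0:
        if target == "Control Register":
            return "Illegal"             # the table marks this read as illegal
        return "Read " + target
    return "Write to " + target


if __name__ == "__main__":
    print(decode_8255(a1=0, a0=1, rd_n=0, wr_n=1, cs_n=0))
    print(decode_8255(a1=1, a0=1, rd_n=1, wr_n=0, cs_n=0))
```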
There are three modes of operation :
Mode 0 The basic input and output mode for all 24 I/O pins, also called the bit I/O
mode.
Mode 1 Provides a strobed input/output (Port C is used for control and status)
Mode 2 The bidirectional data bus mode (five bits on Port C are used for
handshaking)
Ports A and B can be set to the various modes as needed, but Port C Upper depends on
how Port A is set and Port C Lower depends on how Port B is set. Programming takes
place by sending a control word from the microprocessor through the 8255 data bus.
In the mode definition control word, the bits are used as follows. Bit 7 is set to "1" to trip
the mode-active flag. Bits 5 and 6 are used to set the Port A mode. Bit 4 is the Port A
data-direction bit; it determines if the Port A pins are inputs or outputs. Bit 3 determines
the direction of the Port C Upper pins.
Bit 2 is the mode-select bit for Port B. Port B cannot be used in mode 2, which is the
bidirectional-bus mode. Bit 1 determines the direction of the Port B pins and Bit 0
determines the direction of the Port C Lower pins.
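The bit assignments of the mode definition control word can be assembled in code as
follows. The text does not state which level of a direction bit means input, so the sketch
assumes the convention of the Intel data sheet, where 1 selects input and 0 selects output.
The function and parameter names are hypothetical.

```python
def mode_word(port_a_mode=0, port_a_input=False, c_upper_input=False,
              port_b_mode=0, port_b_input=False, c_lower_input=False):
    """Build the 8255 mode definition control word (bit 7 = 1)."""
    if port_a_mode not in (0, 1, 2):
        raise ValueError("Port A mode must be 0, 1 or 2")
    if port_b_mode not in (0, 1):
        raise ValueError("Port B cannot be used in mode 2")
    word = 0x80                          # bit 7: mode-set flag
    word |= port_a_mode << 5             # bits 6-5: Port A mode
    word |= int(port_a_input) << 4       # bit 4: Port A direction (assumed 1 = input)
    word |= int(c_upper_input) << 3      # bit 3: Port C Upper direction
    word |= port_b_mode << 2             # bit 2: Port B mode
    word |= int(port_b_input) << 1       # bit 1: Port B direction
    word |= int(c_lower_input)           # bit 0: Port C Lower direction
    return word


if __name__ == "__main__":
    # Mode 0 everywhere: Port A as input, Port B and Port C as outputs
    print(hex(mode_word(port_a_input=True)))
```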
Along with this control word, a bit set/reset control word is used. This allows the Port C
pins to be set or reset for status and control for Ports A and B. Bit 7 is set to "0" for the
bit control. It is "1" for mode select operations. Bits 4, 5, and 6 are not used in the bit
control word. Bits 1 through 3 specify which Port C bit is to be used and bit 0 sets the
state of the bit.
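A bit set/reset control word built from these fields might look like the sketch below. It
assumes that a 1 in bit 0 sets the selected Port C bit and a 0 resets it, and it leaves the
unused bits 4 through 6 at zero. The function name is hypothetical.

```python
def bit_set_reset_word(bit, set_bit):
    """Build the 8255 bit set/reset control word for one Port C bit (bit 7 = 0)."""
    if not 0 <= bit <= 7:
        raise ValueError("Port C bit number must be 0..7")
    # Bits 3-1 select the Port C bit; bit 0 gives the new state
    return (bit << 1) | int(bool(set_bit))


if __name__ == "__main__":
    print(hex(bit_set_reset_word(3, True)))    # set Port C bit 3
    print(hex(bit_set_reset_word(7, False)))   # reset Port C bit 7
```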
Disk Drives
Most disk drives are thought of as parallel interfaces. Actually, the floppy disk drive has a
parallel interface for the control signals, but the data is transferred in serial mode. Hard
disk drives are a true parallel interface since both the control and data signals use
individual connections.
Floppy Disk Drives
In a floppy disk drive there is a circuit board that handles the drive's mechanical operations
and interprets inputs from the drive's sensors. Signals to and from the computer's main
board take place over a single, 34-pin ribbon cable. The 34-pin configuration is standard
for IBM PCs and compatibles. A separate four-conductor cable supplies power to the
drive.
The floppy drive needs to control three main mechanisms : the R/W heads, stepping
motor, and spindle motor. This is done with a disk controller IC which handles the
communications with the motherboard as well as with the drive's sensors. Older drives
use several ICs for these operations but newer drives use more integrated devices that
combine most or all of the functions into a single IC chip.
The floppy drive uses four sensors : index sensor, disk-in-place sensor, write protect
sensor and track 00 sensor. The index sensor is an optical sensor that monitors the
diskette's rotation. An index wheel spins along with the disk and causes a pulsed signal
to be sent to the controller chip. If the pulses indicate that the disk speed is not correct,
the controller changes the spindle motor speed to hold the speed at 300 or 360 rpm.
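The relationship between the index pulses and the spindle speed is simple arithmetic. The
sketch below assumes one index pulse per revolution, so the time between pulses is the
rotation period. The function names are hypothetical, and the calculation is an illustration
rather than the controller's actual regulation method.

```python
def rpm_from_index_period(period_ms):
    """Convert the time between index pulses (ms) into spindle speed (rpm)."""
    return 60000.0 / period_ms          # 60,000 ms in one minute


def speed_error_percent(period_ms, nominal_rpm=300):
    """Return how far the measured speed is from nominal, in percent."""
    measured = rpm_from_index_period(period_ms)
    return (measured - nominal_rpm) / nominal_rpm * 100.0


if __name__ == "__main__":
    print(rpm_from_index_period(200.0))      # 200 ms per revolution
    print(speed_error_percent(204.0))        # slightly slow disk
```

At 300 rpm the index pulses arrive 200 ms apart; at 360 rpm they arrive about 166.7 ms
apart.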
The disk-in-place sensor provides a signal to indicate that a disk is in the drive. This
keeps the drive from operating without a disk. The write protect sensor checks the write-
protect notch. When this notch is uncovered, the drive does not allow any write
operations to take place; the disk can only be read. The track 00 sensor is used to
generate a signal when the R/W heads are in the track 00 position. This is done to
initialize the heads to a fixed starting location.
The transfer of information in or out of a drive involves the interaction of the
microprocessor and the floppy drive controller. The overall operation of the drive is
handled by the floppy drive controller which may plug into or be a part of the system
board. The microprocessor does not interact directly with the floppy drive. It directs the
controller to start the data transfer in or out of the floppy drive. The instructions or
routines needed to operate the floppy drive are fetched by the microprocessor from the
BIOS ROM on the system board.
Data being loaded into a floppy drive is taken one byte at a time from system RAM by
the floppy disk controller and converted into serial form. They are sent as serial data
over the drive cable. Other control signals are needed to handle the drive's motors and
sensors. When the data bits arrive at the floppy drive, they are converted into magnetic
recording signals so they can be written to the disk.
When data is read from the floppy disk, the process involves finding the desired program
or file. The floppy disk controller must seek the track and sector with the recorded data.
After the starting location is found, the disk's read-write head produces signals from the
recorded data. These low-level signals are amplified and then converted into standard
digital logic levels. The digital data is sent in serial format over the cable to the floppy
disk controller. The controller converts the serial data into parallel words while deleting
the housekeeping information, and sends the data to RAM.
Drive Interface
The connections between the floppy disk controller and the floppy drive unit make up the drive
interface. This is a standard set of connections used by most floppy drives and controllers.
The standard interface allows any floppy drive to operate in the computer as long as it
uses the standard interface.
The interface is made up of two cables, power and signal. The signal cable pinout is
shown in Table 6.4. The power connector is a 4-pin, mate-n-lock-type connector. The
digital signals use +5.0 Vdc (pin 4, pin 3 return) in most desktop systems although +3.3
or +3.0 Vdc are used in some portable computers. The motors normally operate on +12
Vdc (pin 1, pin 2 return). A return or ground lead is provided for each supply in the
connector. In the 34-pin signal connector, the odd-numbered pins are ground lines, while
even-numbered pins are used for the signals.
Up to four drive-selection inputs, DRIVE SELECT 0* through DRIVE SELECT 3*, are
used to determine which drive in the system is active. Smaller computers will not use all
of these lines. A MOTOR ON* signal is used to start the drive spindle motor turning. This
signal must be negative true before a read or write operation can take place.
The head direction is controlled by a DIRECTION SELECT* signal that tells the head
stepping motor to move in toward the center of the disk or out toward the edge of the
disk. A STEP* pulse controls the number of steps that the head stepping motor must
take. Both STEP* and DIRECTION* position the R/W heads on the disk.
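How STEP* and DIRECTION* work together can be shown with a small seek planner. The
sketch assumes that track 00 is the outermost track, so moving to a higher track number
means stepping in toward the center, and that one STEP* pulse moves the head by one track.
The function name is hypothetical.

```python
def plan_seek(current_track, target_track):
    """Return (direction, step_pulses) needed to move the head to target_track.

    direction is "in" (toward the center), "out" (toward the edge), or None.
    """
    delta = target_track - current_track
    if delta > 0:
        direction = "in"
    elif delta < 0:
        direction = "out"
    else:
        direction = None                 # already on the requested track
    return direction, abs(delta)         # one STEP* pulse per track


if __name__ == "__main__":
    print(plan_seek(0, 39))
    print(plan_seek(20, 5))
```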
A WRITE DATA line records information on the disk, and a WRITE GATE* signal is used
to enable the drive to accept data on the WRITE DATA line. The IN USE/HEAD LOAD*
signal indicates that the read/write head is busy. The WRITE PROTECT* output
prevents writing to the disk if the write protection notch is covered. When a read takes
place, the data is sent on the READ DATA line. The DISK CHANGE/READY signal tells
when the disk is ready for a read or write operation. The SIDE SELECT* input
determines which side of the disk is written to or read from.
The output signals include a NORMAL/HIGH-DENSITY* signal that tells the floppy drive
controller IC what type of media is currently in use.
The INDEX* signal is actually a stream of negative indexing pulses. These are sent to
the floppy drive controller to regulate the spindle speed at the proper value. The TRACK
00* signal indicates that the head is at track 00 on the disk.
Pinlist for IBM PC Floppy Drive Interface
Pin  Function                Pin  Function
2    Normal / high density*  1    Ground
4    In use / head load*     3    Ground
6    Drive select 3*         5    Ground
8    Index*                  7    Ground
10   Drive select 0*         9    Ground
12   Drive select 1*         11   Ground
14   Drive select 2*         13   Ground
16   Motor ON*               15   Ground
18   Direction*              17   Ground
20   Step*                   19   Ground
22   Write data              21   Ground
24   Write gate*             23   Ground
26   Track 00*               25   Ground
28   Write protect*          27   Ground
30   Read data               29   Ground
32   Side select*            31   Ground
34   Disk change / ready*    33   Ground
Floppy Disk Controllers
The Intel 82077 is a single-chip floppy disk and tape drive controller for the PC-AT and
PS/2 buses. The 82077 needs only a 24-MHz crystal, resistor array, and chip select
circuits to implement the floppy-disk controller. The drive control signals are decoded
and buffered. There is an analog data separator for motor speed control and a 16-byte
FIFO (First-In-First-Out) register. All command parameters and data transfers go through
the FIFO.
Controller Interface
The following signals make up the controller interface. CS* on pin 6 is used to decode
the base address range. A0, A1, and A2 on pins 7, 8 and 10 are used to select one of
the chip's registers as shown in the following :
A2  A1  A0  Read / Write   Register
0   0   0   Read           Status Register A
0   0   1   Read           Status Register B
0   1   0   Read / Write   Digital Output Register
0   1   1   Read / Write   Tape Drive Register
1   0   0   Read           Main Status Register
1   0   0   Write          Data Rate Select Register
1   0   1   Read / Write   Data (FIFO)
1   1   0                  Reserved
1   1   1   Read           Digital Input Register
1   1   1   Write          Configuration Control Register
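Because some addresses select a different register for reads and for writes, a lookup keyed
on both the address and the access type is a natural way to model this table. The sketch
below simply transcribes the table; the dictionary and function names are hypothetical, and
returning None for reserved or unlisted combinations is a choice made for this example.

```python
# (address offset A2 A1 A0, access type) -> register name, taken from the table
REGISTERS = {
    (0, "read"): "Status Register A",
    (1, "read"): "Status Register B",
    (2, "read"): "Digital Output Register",
    (2, "write"): "Digital Output Register",
    (3, "read"): "Tape Drive Register",
    (3, "write"): "Tape Drive Register",
    (4, "read"): "Main Status Register",
    (4, "write"): "Data Rate Select Register",
    (5, "read"): "Data (FIFO)",
    (5, "write"): "Data (FIFO)",
    (7, "read"): "Digital Input Register",
    (7, "write"): "Configuration Control Register",
}


def select_register(a2, a1, a0, access):
    """Return the register selected by A2..A0 for a 'read' or 'write', or None."""
    offset = (a2 << 2) | (a1 << 1) | a0
    return REGISTERS.get((offset, access))


if __name__ == "__main__":
    print(select_register(1, 0, 0, "read"))
    print(select_register(1, 0, 0, "write"))
```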
The following pins are used for the data bus :
DB0 - 11 DB4 - 17
DB1 - 13 DB5 - 19
DB2 - 14 DB6 - 20
DB3 - 15 DB7 - 22
RD* on pin 4 is the READ control input and WR* on pin 5 is the WRITE control input.
RDDATA on pin 41 is the READ DATA input. It provides serial data from the disk.
INVERT affects the polarity of this signal. WP on pin 1 is the WRITE PROTECT input. It
indicates if the disk drive is write-protected. DSKCHG on pin 31 indicates a DISK
CHANGE has occurred. This means that the disk is now ready for a read or write.
DRQ on pin 24 is the DMA REQUEST signal, which is sent out to request service from a
DMA controller. DACK* on pin 3 is the DMA ACKNOWLEDGE control input used in DMA
cycles.
TC on pin 25 is the TERMINAL COUNT control signal sent from a DMA controller to end
the disk transfer. DACK* must be active to use this signal. INT on pin 23 is the
INTERRUPT output. It signals a data transfer in the non-DMA mode.
DENSEL on pin 49 is used as the DENSITY SELECT. It indicates if a low (250/300
Kbps) or high (500 Kbps/1 Mbps) data rate is selected. The polarity of the DENSEL pin
is controlled with the IDENT pin, after a hardware reset.
DRV2 on pin 30 indicates if a second drive is installed and its state is reflected in Status
Register A. DRATE0 and DRATE1 on pins 28 and 29 indicate the contents of bits 0 and
1 of the Data Rate Register.
INDX on pin 26 is the INDEX input. It indicates the beginning of the track. TRK0 on pin 2
stands for the TRACK0 control line. It indicates that the head is on track 0.
The chip runs on +5 volts on pins 18, 40, 60, and 68. The ground pins are 9, 12, 16, 21,
36, 50, 54, 59, and 65. AVCC on pin 46 is used for the analog supply and AVSS on pin
45 is used for the analog ground.
Hard Drives
Hard drives usually require a read/write controller, a head actuator/driver, a spindle
motor controller, and a disk interface controller. Data enters and leaves the hard drive
through the disk interface controller. This controller is designed for the drive's interface.
Most early drives used the Seagate ST-506 interface for drives which were under 40 MB. The
ESDI (Enhanced Small Device Interface) doubled the transfer rate to 10 Mbits per second
which allowed more data on the hard disk. Both of these use a 34-pin cable for the drive
control signals, similar to a floppy drive, and a 20-pin cable for the parallel data transfers.
The IDE and SCSI interfaces are later standards used in most current hard drives. The
disk interface controller also controls the head actuator driver circuit and spindle motor
driver.
The read/write controller works with the head preamplifier and drive circuits to convert the
analog waveforms from the read heads into standard logic levels. The read/write
controller separates the clock and synchronization signals from the actual binary data.
When data is written to the disks, the read/write controller generates the write signals
that are amplified by the write drive circuits.
Built into the hard drive circuitry is a small microprocessor that coordinates the drive's
operations by synchronizing the disk interface controller and the read/write controller.
This microprocessor is also used for disk spinup and spindown, as well as other safety
control features that the drive might have. Some drives use a custom version of a
microprocessor called a micro-controller. Other hard drives use a standard
microprocessor. For the small drives, such as the 1.3-in units in small portable
computers, these circuits are integrated onto one or two complex surface-mount ICs.
A data transfer starts when the main board microprocessor initiates a command to the
hard drive controller. In many systems a system controller chip actually drives the hard
drive controller. Any parameters that are needed to control the hard drive are taken by
the microprocessor from the BIOS ROM.
The hard drive controller interfaces the system buses (control, address, and data) to the
drive's interface. Data and commands from the drive are converted into computer bus
signals by the hard drive controller. The control circuits on the hard drive are used to
operate the drive's mechanical functions and to convert the digital information from the
interface into magnetic flux patterns that are recorded on the disk. This process of
recording and data transfer is reversed for a read operation, where the flux patterns are
amplified and interpreted for the microprocessor.
IDE Drives
IDE, which stands for Intelligent Drive Electronics or Integrated Drive Electronics, is a
popular interface in personal computers for connecting hard drives, especially the newer,
smaller drives. The circuits needed to operate an IDE drive are on a circuit board which is
part of the hard drive assembly. The software routines needed to communicate with the
IDE drive are stored in the BIOS ROM on the system board.
The IDE interface connects the hard drive to the system board with a 40-pin connector.
The signal cable typically uses a 40-pin insulation displacement connector (IDC). All
signals on the IDE interface are TTL-compatible: a logic zero is 0.0 to +0.8 Vdc, and a
logic one is +2.0 Vdc to Vcc.
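As a small illustration of the thresholds above, the following C sketch sorts a measured voltage into logic zero, logic one, or the undefined region between them. The names `ttl_level` and `classify_ttl` are invented for this example, and real inputs also have timing and noise-margin requirements that are not modeled here.

```c
#include <assert.h>

typedef enum { LOGIC_ZERO, LOGIC_ONE, LOGIC_UNDEFINED } ttl_level;

/* Classify a voltage using the TTL-compatible levels quoted in the text:
   0.0 to +0.8 V is a logic zero, +2.0 V to Vcc is a logic one. */
ttl_level classify_ttl(double volts, double vcc)
{
    assert(vcc >= 2.0);              /* supply must reach the logic-one threshold */
    if (volts >= 0.0 && volts <= 0.8)
        return LOGIC_ZERO;
    if (volts >= 2.0 && volts <= vcc)
        return LOGIC_ONE;
    return LOGIC_UNDEFINED;          /* between the bands, or outside the rails */
}
```

For example, `classify_ttl(1.5, 5.0)` falls between the two bands and is reported as undefined.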
There is also a 4-pin power cable in addition to the 40-pin signal cable. The signal cable
pinouts are shown in Table 6.5. The power connector is a 4-pin mate-n-lock-type
connector. IDE hard drives normally use +5 Vdc (pin 4) and +12 Vdc (pin 1). In some
low-voltage systems, +3.0 or +3.3 Vdc is used instead of +5.0 Vdc. The return lines for
each supply are also part of the power connector (+5V return, pin 3, +12V return, pin 2).
The IDE interface provides sixteen bidirectional data lines (DD0 to DD15) to move data
bits in and out of the drive. IORDY (I/O Ready) is used to indicate to the drive that a data
transfer is needed. The direction of the data transfer is set with DIOR* and DIOW*.
IOCS16 is the 16-bit I/O control signal. It tells the microprocessor that the drive is ready
to send or receive data.
Pinlist for IDE Hard Drive Interface
Pin   Function        Pin   Function
 2    Ground           1    Reset*
 4    DD8              3    DD7
 6    DD9              5    DD6
 8    DD10             7    DD5
10    DD11             9    DD4
12    DD12            11    DD3
14    DD13            13    DD2
16    DD14            15    DD1
18    DD15            17    DD0
20    Connector key   19    Ground
22    Ground          21    DMARQ
24    Ground          23    DIOW*
26    Ground          25    DIOR*
28    Reserved        27    IORDY
30    Ground          29    DMACK*
32    IOCS16*         31    INTQ
34    PDIAG*          33    DA1
36    DA2             35    DA0
38    CS3FX*          37    CS1FX*
40    Ground          39    DASP*
The outputs to the system board include a Direct Memory Access Request (DMARQ),
which is used to start the transfer of data to or from the drive. When a data transfer is
finished, a DMA ACKNOWLEDGE (DMACK*) is sent to the drive from the hard disk
controller. A Drive Interrupt Request (INTQ) is used by the drive when there is an
interrupt pending. A Drive Active (DASP*) signal is used when the hard drive is busy.
A Passed Diagnostic (PDIAG*) pin indicates the results of a diagnostic command or
reset. If PDIAG* is negative true, the microprocessor knows that the drive is okay to use.
A negative true signal on the RESET* line forces the drive to its initial condition during
power-on or reboot.
SCSI Drives
The Small Computer System Interface (SCSI, pronounced "scuzzy") was developed
as a hard disk drive interface. It differs from other disk interfaces in that it is intelligent.
Rather than using a hard drive controller that controls the drive, SCSI drives use a host
adapter that allows the computer to send commands to the drive. A SCSI drive has an
instruction set of commands, and up to eight SCSI devices can be connected on a single
computer.
SCSI is a computer bus which uses its own protocol (sequence of events) to
communicate between devices. The system microprocessor is not required for the
particular conditions of the drive; the hard drive system has enough intelligence to
complete each task.
The original specification for SCSI appeared in 1986. Other enhanced versions, SCSI-2
and SCSI-3, were released after this. It is a complex parallel interface. The Small
Computer System Interface, or SCSI, has its roots in a disk-drive interface developed by
Shugart Associates. The Shugart interface was called the SASI (pronounced "sassy")
bus for Shugart Associates System Interface. It was intended primarily for disk drives.
In both SASI and SCSI systems, a controller board handles data transfers over the SASI
or SCSI bus. Like the IDE interface, a SCSI drive needs only to be connected to a
system board using a standard cable. The SCSI bus uses a 50-pin connector even
though it is an 8-bit interface.
The SCSI standard defines the way peripherals are connected to the computer system
and how they communicate with the system. It is often grouped together with other hard
disk interfaces, but it is more than a hard disk interface.
SCSI provides a common bus for many types of peripherals, such as CD-ROMs, optical
memory devices, modems, and printers. The common bus allows the connection of up to
seven other peripherals to one port on the back of the PC.
A SCSI hard disk drive system gives you high performance and automatic error
correction. You also get an external port which allows daisy chaining of up to seven
peripherals. SCSI drives for the PC cost more to install due to the additional cost of the
SCSI interface. In the 30- to 60-MB range, the lowest cost solution is usually MFM or
RLL ST-506 technology in a drive kit or a drive with an integrated controller. The Apple
Macintosh is an exception, being completely SCSI due to the built-in SCSI interfaces on
Macintosh Computers. SCSI offers the easiest plug-in installation when executed
properly.
Usually when you connect peripherals to your system you need controller boards for
each device. A CD-ROM drive, a modem, and a printer might require three controller
boards and three expansion slots for any functions not provided on the system board.
You would also have to be sure that each of these devices is compatible with your
system, and compatible with each other. Whenever there are multiple controller boards
in the system, there are possible conflicts.
...
The address lines (A0 through A2) are used with the I/O read (IOR*) and I/O write (IOW*)
pins to allow the microprocessor to address the 16 registers used in the chip.
Switching the chip select (CS) signal low makes the chip ready for microprocessor bus
communications. The RESET signal resets the chip by placing it into a known, stable
state. The data lines (D0 through D7) make up the data port.
The other microprocessor bus signal lines are used for interrupt and DMA operations.
IRQ is the interrupt request line and it is used to signal an error or the completion of a
command. DRQ is the DMA request line; it indicates that an internal data register should
be read or written to for a SCSI data transfer. DACK is used by the DMA controller to
signal that it has responded to a DMA request. DACK allows access to the data
registers without using the address lines.
Other DMA signals include the EOP input which allows a DMA controller to tell the 5380
that the current transfer cycle on the microprocessor bus is the last data transfer of a
block. READY allows the 5380 to control when the DMA controller moves data in and
out. READY indicates when the 5380 is prepared to take another transfer, based on the
activity on the SCSI bus. The READY signal is used with DRQ for block-mode DMA
transfers. The DMA controller can be set to take control of the microprocessor bus for a
specified number of transfers.
Functional Signals of the 5380 SCSI Adapter Chip
Microprocessor bus side
Group                 Signals                         Direction
DMA Control           EOP*                            Input
                      READY                           Output
                      DRQ                             Output
                      DACK*                           Input
Register Addressing   CS*, IOR*, IOW*, A0, A1, A2     Inputs
Data Bus              D0-D7                           Bidirectional
                      RESET*                          Input
                      IRQ                             Output
SCSI bus side
Group                 Signals                         Direction
SCSI Data Bus         DB0-7, DBP*                     Bidirectional
SCSI Controls         BSY*, SEL*, RST*, ATN*, ACK*,   Bidirectional
                      REQ*, MSG*, C/D*, I/O*
Analog Interfaces
Analog interfaces are used in data acquisition systems. Analog or continuous signals are
used by many devices as inputs or outputs to computer systems. An example of an input
would be temperature measurement and an example of an output would be motor
control. These signals generally use voltage or current variations, but frequency and
pulse width are among the other techniques used. No matter what form the analog
signal has, it must be converted to a digital representation to be used by the computer.
Any digital control signals that are sent to analog control devices must be converted to
the proper analog format. The devices for accomplishing this are analog-to-digital and
digital-to-analog converters. Analog interfaces use a different set of codes to represent
numbers. The analog or continuous values in physical control and measurement
applications can be represented by digital numbers. The presence or absence of fixed
voltage levels characterizes these numbers. These digital representations are binary
since each bit or unit of information can have one of two possible states: TRUE or
FALSE, ON or OFF, ONE or ZERO, HIGH or LOW.
A binary code is used to interpret the analog value. The different bits represent different
portions or weights of the digital number. The bit with the most weight is the first bit in the
leftmost position. This is called the most significant bit or MSB. The bit with the least
weight is the last bit in the rightmost position and is called the least significant bit or LSB.
An analog-to-digital converter is used to change the analog values to their digital
equivalents. The resolution of an analog-to-digital converter is determined by the number
of bits. The coding used is the set of coefficients representing the fractional parts of full
scale.
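To make the idea of bit weights and fractional parts of full scale concrete, here is a minimal C sketch for a natural binary code. The function names `code_to_fraction` and `lsb_fraction`, and the 16-bit upper limit on the word length, are assumptions made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Fraction of full scale represented by a natural binary code of `bits` bits.
   The MSB carries a weight of 1/2, the next bit 1/4, and so on to the LSB. */
double code_to_fraction(uint16_t code, int bits)
{
    assert(bits >= 1 && bits <= 16);
    double fraction = 0.0;
    double weight = 0.5;                  /* weight of the MSB */
    for (int i = bits - 1; i >= 0; i--) {
        if (code & (1u << i))
            fraction += weight;
        weight /= 2.0;                    /* each lower bit has half the weight */
    }
    return fraction;
}

/* Resolution: the size of one LSB as a fraction of full scale, 1 / 2^bits. */
double lsb_fraction(int bits)
{
    assert(bits >= 1 && bits <= 16);
    return 1.0 / (double)(1u << bits);
}
```

With 12 bits, the MSB alone (code 0x800) represents half of full scale, and the all-ones code falls one LSB short of full scale.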
Coding Methods
Natural Binary Code
Bipolar Codes
Offset Binary
Polarity
Sign Magnitude
One's Complement
Two's Complement
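The bipolar codes listed above differ only in how a signed value is mapped onto the bit pattern. The following C sketch shows one way to produce each code for a 12-bit word; the 12-bit width and all function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BITS 12
#define MASK ((1u << BITS) - 1u)       /* 0xFFF */
#define SIGN (1u << (BITS - 1))        /* 0x800, the MSB */

/* Offset binary: zero sits at mid-scale, so add half the code range. */
uint16_t to_offset_binary(int value)
{
    assert(value >= -2048 && value <= 2047);
    return (uint16_t)((unsigned)(value + 2048) & MASK);
}

/* Two's complement: the same pattern as offset binary with the MSB inverted. */
uint16_t to_twos_complement(int value)
{
    return (uint16_t)(to_offset_binary(value) ^ SIGN);
}

/* Sign magnitude: the MSB is the sign, the remaining bits hold |value|. */
uint16_t to_sign_magnitude(int value)
{
    assert(value >= -2047 && value <= 2047);
    unsigned mag = (unsigned)(value < 0 ? -value : value);
    return (uint16_t)((value < 0 ? SIGN : 0u) | mag);
}

/* One's complement: a negative value is the bitwise inverse of |value|. */
uint16_t to_ones_complement(int value)
{
    assert(value >= -2047 && value <= 2047);
    unsigned mag = (unsigned)(value < 0 ? -value : value);
    return (uint16_t)((value < 0 ? ~mag : mag) & MASK);
}
```

For the value -1, these functions give 0x7FF in offset binary, 0xFFF in two's complement, 0x801 in sign magnitude, and 0xFFE in one's complement.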
D/A Converters
The R-2R ladder circuit is often used in digital-to-analog (D/A) conversion. The basic
circuit is shown in the figure. Notice that it is used with an inverting operational amplifier.
When all bits but the MSB are off (grounded), the output equals -(R/2R)Vref, or -Vref/2.
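The MSB case can be generalized to any input code. The sketch below assumes an ideal ladder and an ideal inverting amplifier, so that each bit contributes a binary-weighted share of Vref; the function name `r2r_output` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Ideal output of an n-bit R-2R ladder feeding an inverting amplifier.
   Output = -Vref * code / 2^bits, so the MSB alone gives -Vref/2. */
double r2r_output(uint16_t code, int bits, double vref)
{
    assert(bits >= 1 && bits <= 16);
    uint32_t full = 1u << bits;          /* number of codes, 2^bits */
    assert(code < full);
    return -vref * (double)code / (double)full;
}
```

With a 10 V reference and 12 bits, the MSB alone (code 0x800) gives -5 V under these ideal assumptions.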
A D/A converter with buffer storage can be used as a sample-hold with digital input and
analog output and an infinite hold time. The register is under control of a strobe which
causes the converter to update.
The rate at which the strobe may update is determined by the settling time of the
converter and the response time of the logic.
If bipolar current-switching D/A conversion is used with offset binary or two's
complement codes, an offset current equal and opposite to the MSB current is summed
with the converter output. This is usually taken from a resistor divider network rather
than a separate offset reference. This is done in order to minimize errors due to
temperature changes.
If the gain of the output inverting amplifier is doubled, this increases the output range
from 0-10 V to ±10 V. When the amplifier is connected for sign inversion, conversion is
negative reference. In a non-inverting application, the same values of offset voltage and
resistance are used, but the value of the output voltage scale factor will depend on the
load.
Some bipolar D/A converters with R-2R ladder networks and offset binary or two's
complement coding have switches that are normally grounded for unipolar operation. If
the LSB node is grounded, the output will be symmetrical. For sign-magnitude
conversion, the converter's current output can be inverted.
The analog output in a parallel-input D/A converter circuit will follow the state of the logic
inputs. If the converter is preceded by a register, it will respond
only when the inputs are gated into it. This is done in some data distribution systems,
where the data may be continually changing, but samples are needed on a periodic
basis.
DAC811
This is a single-chip integrated circuit microcomputer-compatible 12-bit digital-to-analog
converter. The Burr-Brown DAC811 chip includes a precision voltage reference, interface
logic, buffered latch, and a 12-bit D/A converter with a voltage output amplifier. Fast
current switches and a laser-trimmed thin-film resistor network are used to provide an
accurate and fast D/A converter. Laser trimming is done at the wafer level to maintain a
1/4 LSB linearity error at 25°C and a 1/2 LSB error over the temperature range.
The DAC811 is available in a 28-pin plastic molded package, a 28-pin 0.6 inch wide
dual-in-line ceramic side-brazed package, and a 28-terminal 0.45 inch-square ceramic
leadless chip carrier.
Interface
The microprocessor interface uses a double-buffered latch which is divided into three
4-bit nybbles for interfacing to 4-, 8-, 12-, or 16-bit buses and for handling right- or
left-justified data. The 12-bit data in the input latches is moved to the D/A latch which holds
the output value. Loading the last nybble or byte of data can be done simultaneously
with the transfer of data between latches. This avoids spurious analog output values and
saves computer instructions.
Most interfaces require a base address decoder, but if blocks of memory are not used,
the base address decoder is simplified or not needed. For example, if half the memory
space is not used, address line A15 of the microprocessor may be used as the chip select
control.
The control logic allows interfacing to right- or left-justified data formats. When a 12-bit D/A
converter is loaded from an 8-bit bus, 2 bytes of data are required. The base address is
decoded from the high-order address bits, and A0 and A1 are used to address the latches.
Adjacent addresses are used.
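A brief sketch of how a 12-bit code might be split into the two bytes written over an 8-bit bus, for both justifications. The function names are hypothetical, and which byte goes to which of the adjacent addresses depends on how the latches are wired, which is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Right-justified: the low byte holds bits 7..0 and the high byte holds
   bits 11..8 in its low nybble. */
void split_right_justified(uint16_t code, uint8_t *high, uint8_t *low)
{
    assert(code <= 0x0FFF);
    *high = (uint8_t)(code >> 8);
    *low  = (uint8_t)(code & 0xFF);
}

/* Left-justified: the high byte holds bits 11..4 and the low byte holds
   bits 3..0 in its high nybble. */
void split_left_justified(uint16_t code, uint8_t *high, uint8_t *low)
{
    assert(code <= 0x0FFF);
    *high = (uint8_t)(code >> 4);
    *low  = (uint8_t)((code & 0x0F) << 4);
}
```

For the code 0xABC, the right-justified bytes are 0x0A and 0xBC, and the left-justified bytes are 0xAB and 0xC0.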
Analog-to-Digital Converters
Most of the data acquisition boards for personal computers use successive
approximation conversion. These A/D converters are built around a D/A converter and
use a comparison technique. When a conversion command is applied, the D/A
converter's MSB output (1/2 full scale) is compared with the input.
Then if the input is greater than the MSB, it remains on and the next bit is tested. But, if
the input is less than the MSB, it is turned off, and the next bit is tested.
If the second bit does not have enough weight to exceed the input, it is left on and the
third bit is tested. But, if the second bit exceeds the input, it is turned off. The bit testing
continues until the last bit has been tested.
When the bit tests are complete, a status line indicates that a valid conversion has
occurred. An output register is used to hold the digital code corresponding to the input
signal.
The figure is a block diagram of a successive approximation A/D converter. Internally, the
converter operates as follows. When a true signal is applied to the command input, the
D/A switches are set to their off state, except for the most significant bit, which is set to
logic "1". This turns on the corresponding D/A switch to apply the analog equivalent of the
MSB to the comparator. If the analog input voltage is less than the MSB weight, the
MSB is switched off at the first edge of the clock pulse. If the analog input is greater than
the MSB, the "1" remains in the register.
During the second pulse, the sum of the first result and second bit is compared with the
analog input voltage. The comparator is gated by the next clock pulse. It will cause the
register to either accept or reject that bit. Successive clock pulses will cause all the bits,
in order of decreasing significance, to be tested until the LSB is accepted or rejected.
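The bit-by-bit procedure described above can be sketched in software. This is a simplified model, assuming an ideal internal D/A converter and an ideal comparator that accepts a bit when the trial output does not exceed the input; the function name `sar_convert` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Successive approximation: test each bit from MSB to LSB, keeping it
   only if the trial D/A output does not exceed the analog input. */
uint16_t sar_convert(double vin, double full_scale, int bits)
{
    assert(bits >= 1 && bits <= 16 && full_scale > 0.0);
    uint32_t code = 0;
    for (int i = bits - 1; i >= 0; i--) {
        uint32_t trial = code | (1u << i);            /* turn the bit on */
        double dac = full_scale * (double)trial / (double)(1u << bits);
        if (vin >= dac)
            code = trial;                             /* bit is accepted */
        /* otherwise the bit is rejected and stays off */
    }
    return (uint16_t)code;
}
```

One comparison is made per bit, so a 12-bit conversion always takes 12 steps regardless of the input value.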
A/D Converter Considerations.
The following considerations are important to A/D conversion:
1. The analog input range.
2. The resolution required for the signal to be measured.
3. The requirements for linearity error, relative accuracy, and stability of calibration.
4. The changes in the various sources of error as temperature changes.
5. The conditions for missed codes, if allowable.
6. The time allowed for a complete conversion.
7. The stability of the system power supply.
8. Errors due to power supply variations.
9. The character of the input signal: noisy, sampled, filtered, frequency.
10. The types of preprocessing needed or desired.
Other A/D conversion circuits may be more acceptable for the application instead of
successive approximation. These include the integration and counter comparator types.
The integrating types are generally better for converting noisy input signals at relatively
slow rates.
Successive approximation is best suited for converting sampled or filtered inputs at rates
up to the MHz range. Counter comparator types allow low cost, but can be slow and
noise-susceptible. They are useful for peak followers and sample holds for digital storage
applications.
Converter Parameters
When the converter's full-scale range is adjusted, it will be set with respect to the
reference voltage, which can be traced to some recognized voltage standard. The
absolute accuracy error is the tolerance of the full-scale point referred to this absolute
voltage standard. Offset is measured at zero and is usually a function of time and
temperature. Monotonicity is the ability to include all code numbers in actual
operation. Nonlinearity is the amount by which the plot of output versus input deviates from a
straight line. Settling time is the time required for the output to attain a final value within a
specified fraction of full scale, usually ½ LSB.
In the A/D conversion process, an error from the quantization uncertainty of ½ LSB
exists along with other conversion processing errors. The way to reduce this
quantization uncertainty error is to increase the number of bits.
Statistical interpolation can be used during processing or filtering after the conversion.
This tends to fill in missing analog values for rapidly changing signals, but it will not
reduce errors due to any variations within ±½ LSB.
It is usually easier to determine the location of a transition than to determine a midrange
value, so errors and settings of A/D converters are normally defined in terms of the
analog values when actual transitions occur in relation to the ideal transition values.
ADC674 Analog-to-Digital Converter
This Burr-Brown chip is a complete 12-bit A/D converter with reference, clock and 8-,
12-, or 16-bit microprocessor bus interface. It has a 15-microsecond maximum
conversion time and is specified for operation with no missing codes.
The chip contains a 12-bit successive approximation analog-to-digital converter. It has a
self-contained +10 V reference, internal clock, digital interface for microprocessor
control, and three-state outputs.
The reference circuit uses a buried zener and is laser trimmed. The clock oscillator is
current-controlled, and full-scale and offset errors may be externally trimmed. Internal
scaling resistors are provided for selecting the following analog input signal ranges:
0 to +10 V, 0 to +20 V, ±5 V, ±10 V
The converter can be externally programmed to provide 8- or 12-bit resolution. The
output data is available in a parallel format from TTL-compatible three-state output
buffers. The output data is coded in straight binary for unipolar input signals and bipolar
offset binary for bipolar input signals. It is packaged in a 28-pin ceramic DIP.
Calibration Techniques
Both digital-to-analog (D/A) and analog-to-digital (A/D) converters have offset errors
since the first transition will not always occur at exactly ½ LSB. Scale factor or gain
errors can cause a difference between the values at which the first transition and the last
transition occur since this is not always equal to ½ LSB. Linearity errors can exist since
the differences between transition values are not all equal or uniform in changing. When
the differential linearity error becomes too large, it is possible for codes to be missed.
Offset and full-scale errors are trimmed using external offset and full-scale trim
potentiometers connected to the reference and offset terminals. If adjustments for
unipolar offset and full scale are not required, a 50-ohm 1 percent metal film resistor is
connected between pin 10 (Reference In) and pin 8 (Reference Out). Pin 12 (Bipolar
Offset) is connected to pin 9 (Analog Common), grounding the offset adjustment.
If adjustment is required, one 100-ohm potentiometer is connected between pins 10 and
8, and another is connected between pins 8 and 12. Then the input is varied through the
end-point transition voltage, 0 V + ½ LSB: +1.22 mV for the 10-V range, +2.44 mV for the
20-V range.
This causes the output code to be DB0 ON (high). Then the potentiometer between pins
12 and 8 is adjusted until DB0 just switches off with all other bits off. Next, an input voltage of
full-scale value minus 3/2 LSB is applied to cause all bits to be on. This value is +9.9963 V
for the 10-V range and +19.9927 V for the 20-V range. The potentiometer between pins
8 and 10 is adjusted until bits DB1 through DB11 are on and DB0 is switching between on
and off.
If external adjustments of full-scale and bipolar offset are not required, the
potentiometers are replaced with 50-ohm metal film resistors. If adjustments are
required, the calibration procedure is similar to that used for unipolar operation, except
that the offset adjustment is performed with an input voltage which is ½ LSB above the
minus full-scale value: -4.9988 V for the ±5-V range, -9.9976 V for the ±10-V range.
Then the pot between pins 8 and 12 is adjusted for DB0 to switch between on and off
with all other bits off.
To adjust full scale, a DC input signal is used which is 3/2 LSB below the nominal plus
full-scale value. This is +4.9963 V for the ±5-V range and +9.9927 V for the ±10-V
range. Then the pot between pins 8 and 10 is adjusted for DB0 to switch between on and
off with all other bits on.
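The calibration voltages quoted in this section follow from the LSB size of a 12-bit converter. The short C sketch below reproduces the arithmetic for any input range; the function names are hypothetical, and ideal transition points are assumed.

```c
#include <assert.h>

/* Size of one LSB in volts for a converter whose input spans `span` volts. */
double lsb_volts(double span, int bits)
{
    assert(bits >= 1 && bits <= 16 && span > 0.0);
    return span / (double)(1u << bits);
}

/* Input for the offset adjustment: 1/2 LSB above the low end of the range. */
double offset_cal_input(double low, double high, int bits)
{
    return low + 0.5 * lsb_volts(high - low, bits);
}

/* Input for the full-scale adjustment: 3/2 LSB below the high end. */
double full_scale_cal_input(double low, double high, int bits)
{
    return high - 1.5 * lsb_volts(high - low, bits);
}
```

For the 0 to +10 V range, one LSB is 10 V / 4096, or about 2.44 mV, so the offset input is about +1.22 mV and the full-scale input is about +9.9963 V. For the ±5 V range the span is also 10 V, giving about -4.9988 V and +4.9963 V.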
Interfacing
The ADC674 is designed to be interfaced to microprocessor systems and other digital
systems. The microprocessor can have full control of the conversions, or the converter
can operate in a stand-alone mode, controlled by the R/C* input. Full control involves the
following:
1. Setting up an 8- or 12-bit conversion cycle
2. Initiating the conversion
3. Reading the output data
This can be done by reading the 12 bits all at once, or 8 bits followed by 4 bits in a
left-justified format. There are five control inputs: 12/8*, CS*, A0, R/C*, and CE. These are all
TTL/CMOS-compatible. The functions of the control inputs are shown in Table 7.2.
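When the result is read as 8 bits followed by 4 bits, software has to reassemble the 12-bit value. The sketch below assumes the second read carries the four least significant bits in its upper nybble, with the lower nybble ignored; that byte layout and the function name `combine_left_justified` are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 8-bit reads into a 12-bit result.
   first  : the 8 most significant bits (DB11..DB4)
   second : the 4 least significant bits (DB3..DB0) in its upper nybble */
uint16_t combine_left_justified(uint8_t first, uint8_t second)
{
    uint16_t result = (uint16_t)(((uint16_t)first << 4) | (second >> 4));
    assert(result <= 0x0FFF);            /* always fits in 12 bits */
    return result;
}
```

For example, reads of 0xAB and then 0xC0 combine to the 12-bit value 0xABC.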
The stand-alone mode is used in systems with dedicated input ports. In stand-alone
operation, control of the converter is done with a single control line connected to R/C*. In
this mode CS* and A0 are tied to digital common and CE and 12/8* are tied to +5 V. The
output will be in 12-bit words.
The conversion is initiated by forcing R/C* low. The three-state data output buffers are
enabled when R/C* is high and STATUS is low. The conversion can be initiated with
either positive or negative pulses. The R/C* pulse must be low for at least 50
nanoseconds.
DEVICE DRIVERS
What is a Device Driver?
A device driver is a program that controls a device. Every device, whether a printer, disk drive, or
keyboard, must have a driver program to interact with the OS. Many drivers, like the
keyboard driver, are built into the operating system itself. So a "driver" is a piece of
software that lets your PC talk to peripherals, components, and other hardware. It
translates OS commands into the specific form the device needs.
Where Are the Drivers?
Some of the essential device drivers like the keyboard driver and floppy disk driver are
built into the "ROM" (Read Only Memory) or "BIOS" (Basic Input Output System) of the
computer system itself. There are also drivers built into the operating system to control
memory, cache, and other basic devices of the PC. For all other devices you may need
to load a new driver when you connect the device to your computer. In Windows, drivers
often have a DRV extension.
How Does the Driver Work?
A driver acts like a translator between the device that it controls and programs that use
the device. For example, a mouse driver translates the "actions" of the mouse to
something more understandable by the OS. Each device has its own set of specialized
commands that only its driver knows. The driver, therefore, accepts generic commands
from a program and then translates them into specialized commands for the device.
PORTS AND SOCKETS
A socket is an endpoint used by a process for bi-directional communication with a socket
associated with another process. Sockets, introduced in Berkeley Unix, are a basic
mechanism for interprocess communication (IPC) on a computer system, or on different
computer systems connected by local or wide area networks. This section shows how to
program with sockets to create communication channels. The communication channel
created with sockets can be like a telephone line
(connection oriented), with the sockets as telephones over which a conversation can
take place. Or the channel can be as when we send mail (datagram oriented), with the
sockets as mailboxes.
A socket appears to the user to be like a file descriptor on which we can read, write, and
ioctl. In the connection oriented mode, the file is like a sequence of characters that we
can read with as many read operations as we like. In the connectionless mode we have
to get a whole message in a single read operation.
If we don't, what is left over of the message is lost. Though sockets can be used in a
single computer system for interprocess communication (the Unix domain), we will only
consider their use for communication across computer systems (the Internet domain). It
is possible to send messages on a socket that take precedence over other undelivered
messages. These priority messages are called out-of-band messages.
A problem in communication is how to identify interlocutors. In the case of phones we
have telephone numbers, for mail we have addresses. For communicating between
sockets we identify an interlocutor with a pair: IP address and port.
Ports are 16-bit unsigned integers. The first 1024 port numbers (0 to 1023) are reserved for
services such as HTTP (port 80); these ports are called well-known ports. Ports from 49152
to 65535 are private and can be dynamically allocated (ephemeral ports). The interval 1024
to 49151 consists of registered ports.
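The three port ranges can be expressed as a small classifier. This is only an illustrative sketch; the type `port_class` and the function `classify_port` are names invented for the example.

```c
#include <assert.h>

typedef enum { PORT_WELL_KNOWN, PORT_REGISTERED, PORT_DYNAMIC } port_class;

/* Classify a port number according to the ranges given in the text. */
port_class classify_port(int port)
{
    assert(port >= 0 && port <= 65535);  /* ports are 16-bit unsigned values */
    if (port <= 1023)
        return PORT_WELL_KNOWN;          /* 0 to 1023, for example HTTP on 80 */
    if (port <= 49151)
        return PORT_REGISTERED;          /* 1024 to 49151 */
    return PORT_DYNAMIC;                 /* 49152 to 65535, ephemeral ports */
}
```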
Client-Server Architecture
A standard way of using sockets and communication channels is between clients and
servers. A server is a process that is able to carry out some function, called a service,
like transferring files, translating host names to IP addresses, or inverting a matrix. A
client is a process that requests a server to do a service (say, "translate
snowhite.cis.temple.edu"). Typically the server will be at a known IP address and will respond to
requests sent to a known port.
In some cases that port is not universally known, so the server will advertise the port it is
currently using (it may advertise the port by printing out its value, or sending email, or
having inetd, a special process, know about it, etc.). In some cases the IP address of the
server is not known and one may have a "standard" server that responds to requests of
the form "where can I find service Moo" by responding with an appropriate IP address.
The client requests the kernel to obtain a free port to be used for communication with the
server.
The server does not have to know in advance the identity of its clients. It is ready to
accept a message from any interlocutor. When it receives a message from a client, the
message itself contains the IP and the port of the client, so that the server knows whom
to answer to.
An address, host+port, can be used for multiplexing more than one communication
channel. So one server can communicate simultaneously with more than one client.
Each communication channel on the server will have its own socket bound to the same
address. In other words, each connection on the internet is identified by a socket pair,
(client IP, client port) + (server IP, server port), plus the protocol being used (say TCP or
UDP).
Summary On Socket Functions
The following is a summary of the basic socket functions as they are used for datagram
and connection oriented service by clients and servers. In the following section we will
go in greater detail over these functions.
Datagram Service
Client
socket => ([bind =>] [connect =>] {write => read}*) | {sendto => recvfrom}*
=> close | shutdown
In words: create a socket, then bind it to a local port [if bind is not used, the kernel will
select a free local port], establish the address of the server, write and read from it, or just
sendto and recvfrom it; then terminate. In the case that the client is not interested in a
response, it does not need to use bind. Connect is worth using when we send many
datagrams to the same server.
Server
socket => bind => {read | recvfrom => write | sendto}* => close | shutdown
In words: create a socket, bind it to a local port, accept and reply to messages from
clients, terminate. In the case that the server does not need to reply to the client, it can just
use read instead of recvfrom.
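The datagram sequences above can be tried out in a single process. The following C sketch is an assumption-laden illustration rather than a real client and server: both sockets live in one function (the hypothetical `udp_round_trip`), the server binds to the loopback address with port 0 so the kernel picks a free port, and the server simply echoes the message back.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends `msg` from a client socket to a server socket over UDP on the
   loopback interface and copies the echoed reply into `reply`.
   Returns the reply length, or -1 on error. */
int udp_round_trip(const char *msg, char *reply, size_t reply_size)
{
    assert(msg != NULL && reply != NULL && reply_size > 0);
    struct sockaddr_in srv, from;
    socklen_t len = sizeof(srv);
    char buf[256];
    ssize_t n = -1;

    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);
    if (server < 0 || client < 0)
        goto done;

    /* Server: socket => bind. Port 0 asks the kernel for a free port. */
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = 0;
    if (bind(server, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;
    if (getsockname(server, (struct sockaddr *)&srv, &len) < 0)
        goto done;

    /* Client: sendto without bind, so the kernel selects the local port. */
    if (sendto(client, msg, strlen(msg), 0,
               (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;

    /* Server: recvfrom supplies the client's address for the reply. */
    len = sizeof(from);
    n = recvfrom(server, buf, sizeof(buf), 0, (struct sockaddr *)&from, &len);
    if (n < 0)
        goto done;
    if (sendto(server, buf, (size_t)n, 0, (struct sockaddr *)&from, len) < 0) {
        n = -1;
        goto done;
    }

    /* Client: receive the reply as one whole datagram. */
    n = recvfrom(client, reply, reply_size - 1, 0, NULL, NULL);
    if (n >= 0)
        reply[n] = '\0';

done:
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    return (int)n;
}
```

Note that the server learns where to send its reply only from the address filled in by recvfrom, which is why a server that must answer cannot use plain read.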
Connection Oriented Service
Client
socket => [bind =>] connect => {write | sendto => read | recvfrom}* =>
close | shutdown
In words: create a socket, bind it to a local port (we usually do not call bind), establish
the address of the server, communicate with it, terminate. If bind is not used, the kernel
will select a free local port.
Server
socket => bind => listen => {accept => {read | recvfrom => write | sendto}* }* =>
close | shutdown
In words: create a socket, bind it to a local port, set up service with indication of
maximum number of concurrent services, accept requests from connection oriented
clients, receive messages and reply to them, terminate.
/* A simple server in the internet domain using TCP.
   The port number is passed as an argument. */
#include <stdio.h>
#include <stdlib.h>      /* exit, atoi */
#include <strings.h>     /* bzero */
#include <unistd.h>      /* read, write, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Print the system error message for the last failed call and exit. */
void error(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int sockfd, newsockfd, portno;
    socklen_t clilen;
    char buffer[256];
    struct sockaddr_in serv_addr, cli_addr;
    int n;

    if (argc < 2) {
        fprintf(stderr, "ERROR, no port provided\n");
        exit(1);
    }
    /* Create a TCP (stream) socket. */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        error("ERROR opening socket");
    /* Bind the socket to the given port on any local interface. */
    bzero((char *) &serv_addr, sizeof(serv_addr));
    portno = atoi(argv[1]);
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = INADDR_ANY;
    serv_addr.sin_port = htons(portno);
    if (bind(sockfd, (struct sockaddr *) &serv_addr,
             sizeof(serv_addr)) < 0)
        error("ERROR on binding");
    /* Allow up to 5 pending connections, then wait for a client. */
    listen(sockfd, 5);
    clilen = sizeof(cli_addr);
    newsockfd = accept(sockfd,
                       (struct sockaddr *) &cli_addr,
                       &clilen);
    if (newsockfd < 0)
        error("ERROR on accept");
    /* Read at most 255 bytes so the zeroed buffer stays null-terminated. */
    bzero(buffer, 256);
    n = read(newsockfd, buffer, 255);
    if (n < 0) error("ERROR reading from socket");
    printf("Here is the message: %s\n", buffer);
    n = write(newsockfd, "I got your message", 18);
    if (n < 0) error("ERROR writing to socket");
    close(newsockfd);
    close(sockfd);
    return 0;
}
/* A simple client in the internet domain using TCP.
   The host name and port number are passed as arguments. */
#include <stdio.h>
#include <stdlib.h>      /* exit, atoi */
#include <string.h>      /* strlen */
#include <strings.h>     /* bzero, bcopy */
#include <unistd.h>      /* read, write, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>       /* gethostbyname */

/* Print a message describing the last system error and terminate. */
void error(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int sockfd, portno, n;
    struct sockaddr_in serv_addr;
    struct hostent *server;
    char buffer[256];

    if (argc < 3) {
        fprintf(stderr, "usage %s hostname port\n", argv[0]);
        exit(1);
    }
    portno = atoi(argv[2]);

    /* Create a TCP (stream) socket in the internet domain. */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        error("ERROR opening socket");

    /* Look up the server's address by host name. */
    server = gethostbyname(argv[1]);
    if (server == NULL) {
        fprintf(stderr, "ERROR, no such host\n");
        exit(1);
    }

    /* Fill in the server address and connect to it. */
    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *) server->h_addr_list[0],
          (char *) &serv_addr.sin_addr.s_addr,
          server->h_length);
    serv_addr.sin_port = htons(portno);
    if (connect(sockfd, (struct sockaddr *) &serv_addr,
                sizeof(serv_addr)) < 0)
        error("ERROR connecting");

    /* Send one line typed by the user. */
    printf("Please enter the message: ");
    bzero(buffer, 256);
    fgets(buffer, 255, stdin);
    n = write(sockfd, buffer, strlen(buffer));
    if (n < 0)
        error("ERROR writing to socket");

    /* Read and print the server's reply. */
    bzero(buffer, 256);
    n = read(sockfd, buffer, 255);
    if (n < 0)
        error("ERROR reading from socket");
    printf("%s\n", buffer);

    close(sockfd);
    return 0;
}
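One caveat about the two listings: on a stream socket, read() and write() may transfer fewer bytes than requested, so a single call is not guaranteed to move a whole message. The following is a minimal sketch of a helper that keeps writing until everything has been sent. The name write_all is hypothetical; it is not part of the sockets API.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of the sockets API): keep calling write()
   until all len bytes have been written. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = write(fd, buf + sent, len - sent);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error; errno describes it */
        }
        sent += (size_t) n;
    }
    return 0;
}
```

In the client, a call such as write_all(sockfd, buffer, strlen(buffer)) could stand in for the single write(). A matching loop around read() would be needed on the receiving side if messages had a known length.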
BENCHMARKS
What is comp.benchmarks?
Comp.benchmarks is a USENET newsgroup for discussing computer benchmarks and
publishing benchmark results and source code. If it's about benchmarks, this is the place
to post or cross post it.
What is a benchmark?
A benchmark is a test that measures the performance of a system or subsystem on a
well-defined task or set of tasks.
How are benchmarks used?
Benchmarks are commonly used to predict the performance of an unknown system on a
known, or at least well-defined, task or workload.
Benchmarks can also be used as monitoring and diagnostic tools. By running a
benchmark and comparing the results against a known configuration, one can potentially
pinpoint the cause of poor performance. Similarly, a developer can run a benchmark
after making a change that might impact performance to determine the extent of the
impact.
Benchmarks are frequently used to ensure the minimum level of performance in a
procurement specification. Rarely is performance the most important factor in a
purchase, though. One must never forget that it's more important to be able to do the job
correctly than it is to get the wrong answer in half the time.
What kinds of performance do benchmarks measure?
Benchmarks are often used to measure general areas of performance such as graphics,
I/O, and compute (integer and floating point), but most measure more specific tasks like
rendering polygons, reading and writing files, or performing operations on matrices.
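As a rough illustration of such a task-specific measurement, the sketch below times a small matrix multiplication using the standard C clock() function, which reports CPU time. The matrix size N and the function names are assumptions made for this example. Real benchmark suites take far more care with repetition, warm-up, and compiler effects.

```c
#include <time.h>

#define N 64   /* assumed matrix dimension for this example */

/* Multiply two N x N matrices: c = a * b. This is the task being measured. */
void matmul(double a[N][N], double b[N][N], double c[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Run matmul() reps times and return the CPU time used, in seconds. */
double time_matmul(double a[N][N], double b[N][N], double c[N][N], int reps)
{
    clock_t start = clock();

    for (int r = 0; r < reps; r++)
        matmul(a, b, c);

    return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

Each multiplication performs about 2 x N^3 floating-point operations, so dividing that count (times reps) by the returned time gives an operations-per-second figure that can be compared before and after a change.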
SPEC
SPEC stands for "Standard Performance Evaluation Corporation", a non-profit
organization with the goal to "establish, maintain and endorse a standardized set of
relevant benchmarks that can be applied to the newest generation of high-performance
computers" (from SPEC's bylaws).
The current SPEC benchmark suites are
CINT92 (CPU intensive integer benchmarks)
CFP92 (CPU intensive floating point benchmarks)
SDM (UNIX Software Development Workloads)
SFS (System level file server (NFS) workload)
In August 1995, SPEC introduced the SPEC95 CPU benchmarks as a replacement for
the older SPEC92 CPU benchmarks (see below, section 4.2). These benchmarks
measure the performance of CPU, memory system, and compiler code generation. They
normally use UNIX as the portability vehicle, but they have been ported to other
operating systems as well. The percentage of time spent in operating system and I/O
functions is generally negligible.
Throughput (Rate) Measurement: With this method, called the "homogeneous capacity
method", several copies of a given benchmark are executed. This method is particularly
suitable for multiprocessor systems. The results, called SPEC rate, express how many
jobs of a particular type (characterized by the individual benchmark) can be executed in
a given time. (The SPEC reference time happens to be a week; the execution times are
normalized with respect to a SPEC reference machine.)
The SPEC rates therefore characterize the capacity of a system for compute-intensive
jobs of similar characteristics.
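To make the idea of a throughput metric concrete, here is a small sketch of how a rate figure could be computed from a homogeneous-capacity run. The formula used (copies x reference factor x seconds per week / elapsed time) is a simplification assumed for illustration; SPEC's run rules give the official definition. The function and parameter names are hypothetical.

```c
#define SECONDS_PER_WEEK 604800.0   /* 7 days * 24 hours * 3600 seconds */

/* Assumed, simplified throughput metric: how many normalized jobs of one
   benchmark type the system could complete in a week.
   copies           - number of benchmark copies run concurrently
   reference_factor - normalization against the reference machine
   elapsed_seconds  - wall-clock time until the last copy finished */
double rate_per_week(int copies, double reference_factor, double elapsed_seconds)
{
    if (copies <= 0 || elapsed_seconds <= 0.0)
        return 0.0;   /* no meaningful rate for an empty or invalid run */
    return copies * reference_factor * SECONDS_PER_WEEK / elapsed_seconds;
}
```

Under these assumptions, 4 copies that all finish within 6,048 seconds, with a reference factor of 1.0, give 4 x 604,800 / 6,048 = 400 jobs per week.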
TRANSACTION PROCESSING BENCHMARK
TPC Benchmark™ W (TPC-W) is a transactional web benchmark. The workload is
performed in a controlled internet commerce environment that simulates the activities of
a business oriented transactional web server. The workload exercises a breadth of
system components associated with such environments, which are characterized by:
Multiple on-line browser sessions
Dynamic-page generation with database access and update
Consistent web objects
The simultaneous execution of multiple transaction types that span a breadth of
complexity.
On-line transaction execution modes
Databases consisting of many tables with a wide variety of sizes, attributes, and
relationships
Transaction integrity (ACID properties)
Contention on data access and update
The Transaction Processing Performance Council (TPC) is a non-profit corporation
founded to define transaction processing and database benchmarks and to disseminate
objective, verifiable TPC performance data to the industry. While TPC benchmarks
certainly involve the measurement and evaluation of computer functions and operations,
the TPC regards a transaction in the same way as it is commonly understood in the
business world: as a commercial exchange of goods, services, or money. A typical
transaction, as defined by the TPC, would include the updating of a database system for
such things as inventory control (goods), airline reservations (services), or banking
(money).
The TPC-D benchmark is an accepted industry-standard measure for decision support
system performance. The TPC-D consists of a suite of business-oriented ad-hoc queries
and concurrent data modifications. The queries and the data populating the database
were chosen for their broad industry-wide relevance and relative ease of
implementation. This benchmark illustrates decision support systems that examine large
volumes of data, execute queries with a high degree of complexity, and provide
answers to critical business questions.
The TPC-D benchmark evaluates the performance of various decision support systems
by the execution of sets of queries against a standard database under controlled
conditions. TPC-D queries measure both the server and the storage performance of a
complete solution. Since this is an industry-standard benchmark, vendors can be ranked
based on their TPC-D performance and price/performance results.
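A price/performance ranking comes down to a simple ratio: the cost of the tested system divided by its measured performance figure, with lower values being better. The sketch below shows one way such a ranking could be computed with the standard qsort() function. The struct layout, field names, and function names are assumptions for illustration and do not reflect any TPC-defined format.

```c
#include <stdlib.h>

/* Hypothetical record for one published result; field names are assumptions. */
struct result {
    const char *vendor;
    double system_cost;   /* total price of the tested configuration */
    double performance;   /* measured benchmark metric, higher is better */
};

/* Price/performance: cost per unit of performance, lower is better. */
double price_performance(const struct result *r)
{
    return r->system_cost / r->performance;
}

/* qsort() comparator: best (lowest) price/performance first. */
int by_price_performance(const void *a, const void *b)
{
    double pa = price_performance((const struct result *) a);
    double pb = price_performance((const struct result *) b);
    return (pa > pb) - (pa < pb);
}
```

Calling qsort(results, count, sizeof results[0], by_price_performance) on an array of such records orders it from the most to the least cost-effective system.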
  • 1. Microprocessor Microcomputers In a microcomputer, the central processing unit (CPU) is fabricated on a single integrated circuit, and is called a microprocessor. A microcomputer uses other microcircuits, but the microprocessor is the most complex. The use of microprocessors meant that the computer manufacturer no longer had to design the CPU. It was a standard, off the-shelf component. As newer, more powerful microprocessors become available, they are quickly implemented into smaller, more powerful, low-cost microcomputers. The microprocessor provides a complete computer processor, with a standardized instruction set and standardized signals. The semiconductor manufacturers standardized on TTL signal levels. Microprocessor Components The four major components in a microprocessor are listed as follows: 1. A bank of registers for holding information 2. An arithmetic logic unit (ALU) for processing the information 3. A bus interface for moving information into and out of the microprocessor 4. Control logic for managing the operation of the microprocessor ( The control logic instructs the bus interface to get an instruction.) 1. A bi-directional data bus which is implemented with tri-state logic devices to allow the use of a direct-memory-access controller, or other similar chips. 2. A mono directional address bus, connected internally, within the A microprocessor, to address pointers and the program counter The address bus is also implemented in tristate logic. 3. A control bus, which carries the various synchronization signals to, from the microprocessor. Control lines are not necessarily tristate. All the usual system components are connected to these three busses. The basic components are shown in Fig. They include the ROM, the RAM, and the I/0 chips. The ROM is the Read-Only Memory. It stores the programs that the microprocessor needs to power-up. The RAM is the Random-Access Memory. It is a read-write MOS memory which stores temporary data and programs. 
The input-output chips are used for 1
  • 2. such functions as multiplexing the data bus for two or more input-output ports. These ports may be connected directly to input-output devices, or to device controllers, which may require the use of interface circuits. The interface circuits or interface chips required to interface this basic system to the I/0 devices, will be connected to these buses, which include the microprocessor buses or special input-output buses. Interfacing techniques are the methods required to connect this system to the various input-output devices. The basic interfacing techniques required to connect any microprocessor system to input-output devices are similar. At the level of the microprocessor itself, the logical and electrical interface required is similar. Many standard microprocessors of the same data width have essentially the same data bus and the same address bus. The main difference is the control bus. It is the specific characteristics of the control bus which make input-output interface chips compatible or incompatible from one microprocessor to the next. Microprocessor Control Signals We have seen that the microprocessor uses three buses: a bi-directional data bus, a mono directional address bus, and a control bus with the needed signals. The data bus is essentially identical for microprocessors with the same data width. It is a bi-directional bus, normally implemented in tristate logic. The address bus is a mono directional bus, used to select memory or a device external to the microprocessor. The third bus, the control bus, is the most complex. It carries the microprocessor control signals (the interface signals). The control bus has four functions: 1. Memory synchronization 2. Input-output synchronization 3. Microprocessor scheduling-interrupts 4. Utilities, including clock and reset Memory and input-output synchronization are similar A handshake procedure is used. In a read-operation, a ready status or signal is needed to indicate the availability of data. 
Data can then be transferred on the data bus. In some types of input-output devices, an acknowledge signaI is generated to confirm the receipt, of data. In a write operation, the availability of the external device is verified through a status bit or signal, and the data is then placed on the data bus. An acknowledge signal may also be generated by the device to confirm the receipt of data. The use of an acknowledge signal or handshake is typical in an asynchronous procedure. In a synchronous procedure, all events take place in a specified period of time; so there is no need to acknowledge a transmission. In an asynchronous system, an acknowledge signal is needed to verify a transmission. The use of synchronous versus asynchronous communication depends on a number of factors. A synchronous bus has the potential for higher speed and a lower number of control lines, but, it places speed requirements on the external devices. An asynchronous bus is more complex and requires more logic, but allows more flexibility for device speeds in the system. 2
  • 3. Data Bus Operation The microprocessor data bus is a bi-directional, three-state bus. It is the same as a single-line bus except that there are eight lines instead of one. To use all of the data bus lines, each talker must have eight drivers, there must be one for each line, and each listener must have eight inputs. The microprocessor and RAM act as both talkers and listeners. Input ports act as talkers since they take inputs from outside the system and place them on the bus. Output ports act as listeners since they take data off the bus and send it outside the system. A ROM acts like only a talker. The microprocessor, ROM, RAM, and input ports use three-state drivers on their outputs. Chip Select (CS) inputs. are used to enable the drivers and allow the data from the selected device to appear on the data bus. The microprocessor acts as the main controller for the system. It allows only one device to use the bus at any given time. When the microprocessor needs to read data from ROM, it first disables its own data outputs and then generates the control signals needed to enable the ROM. The ROM's outputs then appear on the data bus and the micro- processor reads the data. Reading RAM or an input port is done in a similar way. To write data to a device, such as RAM or an output port, the microprocessor first places the data to be written onto the data bus. It then generates control signals that send a write pulse to the device. This write pulse causes the device to internally latch the data. In most cases, data flows through the microprocessor. In a data transfer from an input port to RAM, the microprocessor will first read the data from the input port and then write it to RAM. Since the data cannot be transferred directly from the input port to the RAM, it is temporarily stored in the microprocessor. To summarize, the data bus is used for all transfers of data within the microprocessor system. All devices share the same bus. 
The control logic, operating from signals generated by the microprocessor, directs each device as to when it should place data on the bus or read data from the bus. Address Bus Operation A set of lines is used by the microprocessor to specify where information will come from or go to on the data bus. This is the address bus used by the microprocessor. The memory is divided into locations and each data storage location has a unique address. This address is specified on the address bus when the microprocessor needs to get or place information at that location. The address bus is controlled by the microprocessor Addresses are never sent to the microprocessor over the address bus by another device. But, the microprocessor may be asked to release the address bus so that other devices may use it. This capability is required for direct memory access (DMA). DMA is an input/output technique designed to speed up certain types of data operations. Eight-bit processors usually have 16-bit address buses made up of 16 individual address lines. These 16 bits give the processor an addressing capability of 65,536 locations. In an 8-bit processor, this allows a storage capacity of 65,536 bytes of information. In the early microprocessors, memory was expensive and this was more than most microprocessor systems could afford to use. This amount of memory is very inexpensive today and many current programs require much more than this to function properly. Sixteen-bit processors offer a greater addressing range. Some 16-bit processors, such as the Intel 8086, use 20 address lines, giving them an addressing capability of 1,048,576 locations, while others have 24 address lines and provide 16,777,216 locations. Although these seemed more than adequate when these processors were introduced, PCs with several megabytes of memory are common. . Thirty-two-bit processors with 32-bit address buses have address spaces in excess of 4 billion locations. 3
  • 4. Control Bus Operation The control bus uses the timing information to synchronize the other devices with the internal operation of the microprocessor. Memory, devices are told when the address bus has a valid address and when to place data on the data bus. The processor can then read this data. The memory and I/O devices also need to know when the processor has placed information on the data bus so it can be accepted. Some lines of the control bus are bi-directional, white others are not. Some of the signal lines on the control bus are driven by the microprocessor, while other signal lines are driven by other devices in the system: The control bus is not as uniform as the data lines in the data bus or the address tines in the address bus. It is a mixture of timing, data direction, and functions. Each microprocessor operates with its own set of control lines, but most have two signals, the interrupt and DMA lines. These two groups of lines on the control bus have control over the microprocessor in certain situations. They are able to take over control of the microprocessor when they are called upon. Interrupts allow external devices to stop the normal operation of the processor so that another task can be started. This task can be the transfer of small amounts of data or large routines or programs. DMA (direct memory access) is a hardware technique where special hardware takes control of the bus from the processor for the required period of time to complete a data transfer. This special hardware can usually perform the data transfer much faster than the general-purpose processor, so this technique is often used for high speed input/output transfers. Microprocessor Bus Characteristics The microprocessor component level bus is made up of the three sub buses: data bus, address bus, and control bus. Each of these three sub buses has a critical task which is needed for the proper operation of the microprocessor. 
All data entering or leaving the microprocessor does so over this bus, so a data bus is needed. This bus is used to move information to be processed into the microprocessor and move the processed information out of the microprocessor. The data bus has bi- directional data lines so data may flow either into or out of the processor, but only in one direction at a time. The direction that the information flows is controlled by the control bus. Microprocessors can be characterized by the size of their data buses. If the data bus of a microprocessor is 8 bits wide, the microprocessor is known as an 8·bit microprocessor. The Intel 8080 and 8085, the Motorola 6800, 6801, and 6802, and the Zilog Z80 microprocessor are all 8-bit microprocessors since they have 8-bit data buses and internally they process information in 8-bit chunks. The 32-bit chips include Motorola's 68020 and 68030 and Intel's 80386 and 80486. The Motorola 68000 series is used in the Apple Macintosh. The Macintosh IIfx, which was introduced' in 1990, uses a 68030 running at 40 MHz: The Intel Pentium has a 64-bit data bus and a 32-bit address bus. There are also smaller microprocessor which have 4- bit data buses. These are often used in dedicated control applications. The size of the data bus used by the microprocessor usually indicates the size of the data word the processor is designed to manipulate. An 8-bit data bus generally indicates that the microprocessor processes data 8 bits at a time. Some microprocessors like the Motorola 6809 and Intel 8088 have 8-bit data buses but internally they are designed to work with 16-bit data chunks. The reason for limiting the size of the data bus while the processor works with larger quantities internally is cost. The chips are less expensive to make. Microprocessor I/O Techniques The three types of I/O techniques which the microprocessor uses to communicate with the external world are programmed I/O, interrupt I/O, and Direct Memory Access (DMA). 
Programmed I/O is a microprocessor-initiated I/O transfer. The data transfer between the 4
  • 5. microprocessor and an external device is controlled by the microprocessor. A program must be executed by the microprocessor to accomplish this. Interrupt I/O is device-initiated. An external device is connected to the interrupt pin of the microprocessor. In order to transfer data, the device changes the state on the interrupt pin. The microprocessor completes execution of the current instruction, saves the rest of the program in its memory, and executes an interrupt service, routine to complete the transfer. Direct memory access is also device-initiated. Data transfer between the microprocessor memory and the I/O device occurs without the microprocessor. Special DMA control circuits are used to complete the transfer. Addressing The data bus is used by different devices to exchange data, so a method is needed by the microprocessor to select the particular device that communicates with the data bus. The address bus (with the aid of the control bus) provides this function. The address bus is unidirectional, so its operation is simpler than the data bus. Each memory location has a unique address. Before a data transfer can take place on the data bus, the microprocessor sends out an address. This address specifies the memory location that the processor needs to access. This allows the microprocessor to select any part of the system it needs to communicate with. An address bus with 16 lines allows direct addressing of 216 or 65,536 memory locations and I/O ports. The 16 lines are usually labeled from the least significant bit to the most significant as A0, A1, A2, A3 .... A15. Address Decoders An address decoder, which is a part of the control logic, generates device-select signals when a certain address or range of addresses is present on the address bus. Figure shows an address decoder for address 3000 hex (0011 0000 0000 0000 binary). The output of the decoder is TRUE . only when this address is present on the address bus. 
This output is then used to enable the port that is assigned to address 3000. The address bus selects the memory location or I/O port and the data bus carries the data. The entire process is coordinated by the control bus with its control signals. The microprocessor will use signals like READ and WRITE.* When READ* is low, it, indicates a read operation is taking place, and the microprocessor signals the addressed device to place data on the data bus. If WRITE* is low, then a write operation is taking place, and the microprocessor places data on the data bus and signals the addressed device to store this data. Each wire in the control bus has a unique function; in the address and data buses, each line carries the same type of information-1 bit of the address or data. The actual control signals in some microprocessors can differ, but the data transfers are the same. They are just achieved m different ways. Input and Output Ports Suppose an output port latch has an address of 2000. The latch is clocked when address 2000 is present on the address bus and a low-to-high switching takes place on the WRITE* control signal. When the latch is clocked the data from the data bus is stored on it. The microprocessor causes data specified by the software program to be at the output of the latch by writing the data to address 2000. Input ports are handled in a similar way: the output of the address decoder is ANDed with READ instead of WRITE to generate the port enable. The input port is usually an eight-line three-state driver. It places the input signals on the data bus when enabled (Fig. 2.10b). The microprocessor can read these input signals on the data bus when enabled. The microprocessor reads these input signals by performing a read operation from the proper address The processor can then store this data in one of its registers. 5
  • 6. MEMORIES ROM Addressing A ROM can be viewed as a device with many 8-bit input ports on a single chip, with one port for each memory location. When the ROM is programmed, the ROM memory locations are permanently set in a pattern of 1s and 0s. In a RAM, each memory location can be viewed with having both input and an output ports combined. In a ROM system the low-order bits of the address bus are connected to an address decoder that selects one of low-order locations. The high order 8 bits of the address bus are decoded by another address decoder to enable the ROM when a desired range of addresses is present on the upper half of the address bus. In a 16-bit system, the lower 8 bits would go to one address decoder and the upper 8 bits to the other address decoder . The READ signal is ANDed with the address decoder output to generate the ROM enable. This is the same technique used for input ports. 6
  • 7. The address lines must indicate which memory chip should be selected and which word within that chip should be addressed. The addresses are shown in binary. The lower 8 bits of address designate the location within each chip, and the upper 8 bits designate which chip is being addressed. Bits 8 and 9 are used for decoding one chip from the other The lower 8 bits of address would be connected to the address lines of all four ROMs, since these bits specify the location within the chip. The address decoder then checks the upper 8 bits of address and generates the chip selects. There are variations to this approach, but the basic principles remain the same: 1. The low-order address bits are connected to the memory's address lines. 2. The high-order bits are decoded to generate the chip selects. No more than one chip can be selected at any given time. RAM Addressing RAMs are decode in a similar way, but additional control signals are needed to write input to the RAM and to read output from the RAM. RAMs have a WRITE* input in addition to the CS* input (chip select). In order for the RAM to be controlled, CS* must be low for either a read or write to take place. If WRITE* is high (not TRUE) when CS* is Iow, the RAM outputs data to the data bus so the processor can read it. This is done when CS* enables the RAMs three-state output drivers. If WRITE* is low, CS* will not turn on the RAM's output drivers. Instead, the data on the data bus is written into the memory at the location designated by the address bus. The bus gating is shown in Fig. 2.10e. Chip select (CS) is low when the RAM address select and either READ or WRITE are low. The WRITE* line is connected to the RAM's WRITE* input. The WRITE* input is internally gated with the CS* input, so it will be ignored unless CS is low. Static and Dynamic RAMs RAMs are used for holding data and running programs in microcomputer systems. ROMs provide a means of storing programs and data. 
RAMs lose their contents when power is removed. The two different types of RAMs are static and dynamic. Static RAMs (SRAMs) use a flip-flop circuit for each memory element. Each 1K of RAM has 1024 flip-flops. Each flip- flop can be set to store a 1 or reset to store a 0. Address decoding circuits in the RAM chip select the particular flip-flop specified by the address lines. The state of the flip-flop does not change until different data is stored in it or power to the RAM is interrupted. 7
  • 8. Dynamic RAMs (DRAMs) use a storage capacitor. When a charge is stored in the capacitor, this indicates a 1; no charge indicates a 0. This technique reduces the size of the storage cell, and allows denser memory chips. Since the charge leaks off the capacitor, it must be refreshed. Refreshing consists of rewriting the data. All of the 1 bits are restored to full charge and the 0 bits to no charge. Cache Basics (High-performance processors like the 68020 and 68030 place a great demand on the bandwidth of memory systems.) The newer integrated circuit implementations have greatly reduced the cycle times of processors. Although computer memories have improved in performance, the ratio of memory speed to processor cycle time has continued to increase. Shared-memory, multiple-processor systems have also become more common and have increased the bandwidth requirements of system memories and the buses that connect them. This has resulted in increasing use of cache memories. (Caches are high-speed buffer memories that hold the most frequently used instructions and data for quick access by the processor.) Caches operate the locality of memory references. Computer programs tend to execute instructions stored in close proximity to each other. Programs also exhibit a locality in that they tend to access a small subset of the entire data set a number of times in any time period. The cache hardware tracks the data accessed by the processor and saves it with the likelihood that it will lie requested again.(Typically, the cache memory is 20 to 1000 times smaller in size than the system memory and 5 to 20 times faster.) Cache memories have been used in many computers. A cache memory was used in the IBM System 360, and the DEC PDP-11/70 followed. The Motorola 68030 processor uses two on-chip caches. The cache memory is made up of a number of lines or blocks that can hold the contents of the corresponding elements of the system memory. 
When the processor issues a memory reference, it is checked against the contents of the cache. If the data is already in the cache due to a previous access to system memory, it is sent back to the processor. This is called a cache hit. A cache miss requires that the data be fetched from the system memory and sent to the cache.
The cache is made up of two parts. One memory array holds the cached data, and each element in the array is a cache line or block. The other array is used for the cache directory. The memory addresses from the processor have three fields: tag, index, and byte number. The index allows access to both the directory and data arrays. The tag previously stored in the directory is compared with the tag of the present address. A match indicates a hit and a mismatch indicates a miss. On a miss the address is forwarded to the system memory and the data returned from memory overwrites the data in the cache. The tag in the directory is also updated.
CPU ORGANIZATION (PENTIUM FAMILY)
64 bit ARCHITECTURE
Pentium
This is the next generation of the 386 and 486 microprocessor family. It is binary compatible with the 8086/88, 80286, 386 DX, 386 SX, 486 DX, 486 SX, and 486 DX2. The Pentium processor has all of the features of the 486 with the following enhancements and additions: superscalar architecture, dynamic branch prediction, pipelined floating-point unit, improved instruction execution, separate code and data caches, write-back data cache, 64-bit data bus, bus cycle pipelining, and address parity and internal parity checking. The instruction set of the Pentium includes the 486 instruction set with extensions for the additional functions of the Pentium. Software written for the 386 and 486 can run on the Pentium. The on-chip memory management unit (MMU) is also compatible with the 386 and 486. The Pentium has two instruction pipelines and a floating-point unit that are capable of independent operation.
Each pipeline issues frequently used instructions in a single clock; together the two pipelines can issue two integer instructions in one clock, or one or two floating-point instructions in one clock. Branch prediction is accomplished in the Pentium
  • 9. with two prefetch buffers. The floating-point unit has faster algorithms that speed up some math operations by up to 10 times.
Caches
The Pentium processor has separate code and data caches on the chip. Each cache is 8 Kbytes, with a 32-byte line size, and is 2-way set associative. Each cache uses a Translation Lookaside Buffer (TLB) to translate linear addresses to physical addresses. The data cache uses write-back or write-through on a line-by-line basis. The cache tags are triple-ported to support two data transfers and an inquire cycle in the same clock. The data bus is 64 bits wide, which improves the data transfer rate. Burst read and write-back cycles are supported, as well as bus cycle pipelining, which allows two bus cycles to be in progress simultaneously.
Test Functions
The Pentium processor uses functional redundancy checking for error detection. This is done for the processor and the interface to the processor. In functional redundancy checking, a second processor acts as the checker. It runs in parallel with the processor being tested. The checker samples the processor's outputs and compares them with its own internally computed values. It signals an error condition if a match does not occur.
Since more functions have been placed on the chip, board-level testing becomes difficult, so the Pentium processor has increased test and debug capability. Like many 486 CPUs, the Pentium uses IEEE Standard 1149.1 boundary scan. There are four breakpoint pins for the debug registers. These can be used to signal a breakpoint match.
Signal Functions
The figure shows a block diagram of the Pentium processor, which is a 32-bit microprocessor with 32-bit addressing and a 64-bit data bus. A 273-pin grid array package is used.
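The cache figures quoted above (8 Kbytes, 32-byte lines, 2-way set associative, 64-bit data bus) determine how many sets each cache has and how a 32-bit address divides into fields. The following sketch only works through that arithmetic; the constant names and the split_address helper are illustrative, and the conventional tag/index/offset split is assumed rather than taken from Intel documentation.

```python
CACHE_BYTES = 8 * 1024    # size of each on-chip cache, from the text
LINE_BYTES = 32           # line size, from the text
WAYS = 2                  # 2-way set associative, from the text
ADDRESS_BITS = 32         # 32-bit addressing, from the text
DATA_BUS_BYTES = 64 // 8  # 64-bit data bus, from the text

lines = CACHE_BYTES // LINE_BYTES            # total lines in one cache
sets = lines // WAYS                         # lines are grouped in pairs
offset_bits = LINE_BYTES.bit_length() - 1    # log2(32): byte within the line
index_bits = sets.bit_length() - 1           # log2(sets): which set
tag_bits = ADDRESS_BITS - index_bits - offset_bits
burst_transfers = LINE_BYTES // DATA_BUS_BYTES  # bus transfers to fill one line


def split_address(addr):
    """Split an address into (tag, set index, byte offset), assuming the usual layout."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


print(lines, sets, offset_bits, index_bits, tag_bits, burst_transfers)
print([hex(field) for field in split_address(0x12345678)])
```

With these numbers each cache has 256 lines in 128 sets, so an address carries a 5-bit byte offset, a 7-bit set index, and a 20-bit tag, and one line is filled by four transfers on the 64-bit bus.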
  • 10. Pentium Processor Registers
(a) Integer Unit
Type                  Number  Length (bits)  Purpose
General               8       32             General-purpose user registers
Segment               6       16             Contain segment selectors
Flags                 1       32             Status and control bits
Instruction Pointer   1       32             Instruction pointer
(b) Floating-Point Unit
Type                  Number  Length (bits)  Purpose
Numeric               8       80             Hold floating-point numbers
Control               1       16             Control bits
Status                1       16             Status bits
Tag Word              1       16             Specifies contents of numeric registers
Instruction Pointer   1       48             Points to instruction interrupted by exception
Data Pointer          1       48             Points to operand interrupted by exception
- General: There are eight 32-bit general-purpose registers. These may be used for all types of Pentium instructions; they can also hold operands for address calculations. In addition, some of these registers also serve special purposes. For example, string instructions use the contents of the ECX, ESI, and EDI registers as operands without having to explicitly reference these registers in the instruction. As a result, a number of instructions can be encoded more compactly.
- Segment: The six 16-bit segment registers contain segment selectors, which index into segment tables. The code segment (CS) register references the segment containing the instruction being executed. The stack segment (SS) register references the segment containing a user-visible stack. The remaining segment registers (DS, ES, FS, GS) enable the user to reference up to four separate data segments at a time.
- Flags: The EFLAGS register contains condition codes and various mode bits.
- Instruction Pointer: Contains the address of the current instruction.
There are also registers specifically devoted to the floating-point unit.
- Numeric: Each register holds an extended-precision 80-bit floating-point number. There are eight registers that function as a stack, with push and pop operations available in the instruction set.
- Control: The 16-bit control register contains bits that control the operation of the floating-point unit, including the type of rounding control; single, double, or extended precision; and bits to enable or disable various exception conditions.
- Status: The 16-bit status register contains bits that reflect the current state of the floating-point unit, including a 3-bit pointer to the top of the stack; condition codes reporting the outcome of the last operation; and exception flags.
- Tag Word: This 16-bit register contains a 2-bit tag for each floating-point numeric register, which indicates the nature of the contents of the corresponding register. The four possible values are valid, zero, special (NaN, infinity, denormalized), and empty. These tags enable programs to check the contents of a numeric register without performing complex decoding of the actual data in the register.
The use of most of the above registers is easily understood. Let us elaborate briefly on several of the registers.
  • 11. EFLAGS Register
The EFLAGS register indicates the condition of the processor and helps to control its operation. It includes the six condition codes defined in Table 9.8 (carry, parity, auxiliary, zero, sign, overflow), which report the results of an integer operation. In addition, there are bits in the register that may be referred to as control bits. The flags are:
ID = Identification Flag
VIP = Virtual Interrupt Pending
VIF = Virtual Interrupt Flag
AC = Alignment Check
VM = Virtual 8086 Mode
RF = Resume Flag
NT = Nested Task Flag
IOPL = I/O Privilege Level
OF = Overflow Flag
DF = Direction Flag
IF = Interrupt Enable Flag
TF = Trap Flag
SF = Sign Flag
ZF = Zero Flag
AF = Auxiliary Carry Flag
PF = Parity Flag
CF = Carry Flag
The control bits are:
- Trap Flag (TF): When set, causes an interrupt after the execution of each instruction. This is used for debugging.
- Interrupt Enable Flag (IF): When set, the processor will recognize external interrupts.
- Direction Flag (DF): Determines whether string processing instructions increment or decrement the 16-bit half-registers SI and DI (for 16-bit operations) or the 32-bit registers ESI and EDI (for 32-bit operations).
- I/O Privilege Flag (IOPL): When set, causes the processor to generate an exception on all accesses to I/O devices during protected-mode operation.
- Resume Flag (RF): Allows the programmer to disable debug exceptions so that the instruction can be restarted after a debug exception without immediately causing another debug exception.
- Alignment Check (AC): Activates if a word or doubleword is addressed on a nonword or nondoubleword boundary.
- Identification Flag (ID): If this bit can be set and cleared, this processor supports the CPUID instruction. This instruction provides information about the vendor, family, and model.
In addition, there are four bits that relate to operating mode. The nested task (NT) flag indicates that the current task is nested within another task in protected-mode operation.
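As an aside on the flag names listed above, a saved EFLAGS value can be decoded by testing individual bits. The bit positions in this sketch follow Intel's published EFLAGS layout and are not given in the text above; in that layout IOPL occupies two bits (12 and 13) and so is returned as a number. The names EFLAGS_BITS and decode_eflags are illustrative.

```python
# Bit positions as documented by Intel (an addition to the text above).
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8, "IF": 9,
    "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17, "AC": 18,
    "VIF": 19, "VIP": 20, "ID": 21,
}
IOPL_SHIFT = 12
IOPL_MASK = 0b11   # IOPL is a 2-bit field, not a single flag


def decode_eflags(value):
    """Return (set of flag names that are set, IOPL value) for a 32-bit EFLAGS image."""
    flags = {name for name, bit in EFLAGS_BITS.items() if (value >> bit) & 1}
    iopl = (value >> IOPL_SHIFT) & IOPL_MASK
    return flags, iopl


flags, iopl = decode_eflags(0x246)
print(sorted(flags), iopl)
```

For the sample value 0x246 the decoder reports PF, ZF, and IF set with IOPL 0; bit 1, which is also set in that value, is a reserved bit and is deliberately not in the table.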
The virtual mode (VM) bit allows the programmer to enable or disable virtual 8086 mode, which determines whether the processor runs as an 8086 machine. The virtual interrupt flag (VIF) and virtual interrupt pending (VIP) flag are used in a multitasking environment.
Control Registers
The Pentium employs four 32-bit control registers (register CR1 is unused) to control various aspects of processor operation (see the figure). The CR0 register contains system control flags, which control modes or indicate states that apply generally to the processor rather than to the execution of an individual task. The flags are
  • 12.
MCE = Machine Check Enable
PSE = Page Size Extensions
DE = Debugging Extensions
TSD = Time Stamp Disable
PVI = Protected-Mode Virtual Interrupts
VME = Virtual-8086 Mode Extensions
PCD = Page-level Cache Disable
PWT = Page-level Writes Transparent
PG = Paging
CD = Cache Disable
NW = Not Write Through
AM = Alignment Mask
WP = Write Protect
NE = Numeric Error
ET = Extension Type
TS = Task Switched
EM = Emulation
MP = Monitor Coprocessor
PE = Protection Enable
- Protection Enable (PE): Enables/disables protected mode of operation.
- Monitor Coprocessor (MP): Only of interest when running programs from earlier machines on the Pentium; it relates to the presence of an arithmetic coprocessor.
- Emulation (EM): Set when the processor does not have a floating-point unit; causes an interrupt when an attempt is made to execute floating-point instructions.
- Task Switched (TS): Indicates that the processor has switched tasks.
- Extension Type (ET): Not used on the Pentium; used to indicate support of math coprocessor instructions on earlier machines.
- Numeric Error (NE): Enables the standard mechanism for reporting floating-point errors on external bus lines.
- Write Protect (WP): When this bit is clear, read-only user-level pages can be written by a supervisor process. This feature is useful for supporting process creation in some operating systems.
- Alignment Mask (AM): Enables/disables alignment checking.
- Not Write Through (NW): Selects the mode of operation of the data cache. When this bit is set, the data cache is inhibited from cache write-through operations.
- Cache Disable (CD): Enables/disables the internal cache fill mechanism.
- Paging (PG): Enables/disables paging.
When paging is enabled, the CR2 and CR3 registers are valid. The CR2 register holds the 32-bit linear address of the last page accessed before a page fault interrupt. The
  • 13. leftmost 20 bits of CR3 hold the 20 most significant bits of the base address of the page directory; the remainder of the address contains zeros. Two bits of CR3 are used to drive pins that control the operation of an external cache. The page-level cache disable (PCD) bit enables or disables the external cache, and the page-level writes transparent (PWT) bit controls write-through in the external cache.
Six additional control bits are defined in CR4:
- Virtual-8086 Mode Extension (VME): Enables support for the virtual interrupt flag in virtual-8086 mode.
- Protected-Mode Virtual Interrupts (PVI): Enables support for the virtual interrupt flag in protected mode.
- Time Stamp Disable (TSD): Disables the read from time stamp counter (RDTSC) instruction, which is used for debugging purposes.
- Debugging Extensions (DE): Enables I/O breakpoints; this allows the processor to interrupt on I/O reads and writes.
- Page Size Extensions (PSE): Enables the use of 4-Mbyte pages.
- Machine Check Enable (MCE): Enables the machine check interrupt, which occurs when a data parity error occurs during a read bus cycle or when a bus cycle is not successfully completed.
32 BIT ARCHITECTURE
INTEL 80386
The 80386 includes separate 32-bit internal and external data paths along with eight general-purpose 32-bit registers. The processor can handle 8-, 16-, and 32-bit data types. It has separate 32-bit data and address pins and generates a 32-bit physical address. The 80386 can directly address up to four gigabytes of physical memory and 64 terabytes (2^46 bytes) of virtual memory. The 80386 can be operated from a 12.5-, 16-, 20-, 25-, or 33-MHz clock. The chip has 132 pins housed in a Pin Grid Array (PGA) package. The 80386 is designed using high-speed CHMOS III technology. The 80386 is highly pipelined and can perform instruction fetching, decoding, execution, and memory management functions in parallel.
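The two address-space figures quoted above for the 80386 follow from simple powers of two. The short calculation below assumes the usual explanation for the 46-bit virtual space, namely 14 bits of a segment selector that choose a segment plus a 32-bit offset within it; that breakdown is not stated in the text above, and the constant names are illustrative.

```python
PHYSICAL_ADDRESS_BITS = 32   # from the text: 32-bit physical address
SEGMENT_SELECT_BITS = 14     # assumption: 13-bit index plus 1 table-indicator bit
OFFSET_BITS = 32             # assumption: 32-bit offset within a segment

GIB = 2 ** 30
TIB = 2 ** 40

physical_bytes = 2 ** PHYSICAL_ADDRESS_BITS
virtual_bytes = 2 ** (SEGMENT_SELECT_BITS + OFFSET_BITS)

print(physical_bytes // GIB, "gigabytes of physical memory")
print(virtual_bytes // TIB, "terabytes of virtual memory")
print(2 ** SEGMENT_SELECT_BITS, "segments of up to", 2 ** OFFSET_BITS // GIB, "gigabytes each")
```

The result is 4 gigabytes of physical memory and 64 terabytes of virtual memory, matching the figures in the text.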
The on-chip memory management and protection hardware translates logical addresses to physical addresses and provides the protection rules required in a multitasking environment. The internal architecture of the 80386 includes six functional units that operate in parallel. The parallel operation is known as pipelined processing. Fetching, decoding, execution, memory management, and bus access for several instructions are performed simultaneously. The six functional units of the 80386 are:
  • 15.
- Bus interface unit
- Code prefetch unit
- Execution unit
- Segmentation unit
- Paging unit
- Decode unit
The bus interface unit interfaces the 80386 with memory and I/O. Based on internal requests for fetching instructions from the code prefetch unit and for transferring data, the bus interface unit generates the address and control signals for the current bus cycles.
The code prefetch unit prefetches instructions when the bus interface unit is not executing bus cycles. It then stores them in a 16-byte instruction queue for use by the instruction decode unit.
The instruction decode unit translates instructions from the prefetch queue into microcode. The decoded instructions are then stored in an instruction queue (FIFO) for processing by the execution unit.
The execution unit processes the instructions from the instruction queue. It contains a control unit, a data unit, and a protection test unit. The control unit contains microcode and parallel hardware for fast multiply, divide, and effective address calculation. The data unit includes an ALU, 8 general-purpose registers, and a 64-bit barrel shifter for performing multiple bit shifts in one clock. The data unit carries out data operations requested by the control unit. The protection test unit checks for segmentation violations under the control of microcode.
The segmentation unit translates logical addresses into linear addresses at the request of the execution unit. The translated linear address is sent to the paging unit. When the paging mechanism is enabled, the paging unit translates these linear addresses into physical addresses. If paging is not enabled, the physical address is identical to the linear address and no translation is necessary.
80386 REGISTERS
  • 16. The figure shows the 80386 registers. The 80386 has 16 registers, classified as general, segment, status, and instruction pointer registers. The eight general registers are the 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, and EDI. The low-order word of each of these eight registers has the 8086/80186/80286 register names AX (AH or AL), BX (BH or BL), CX (CH or CL), DX (DH or DL), BP, SP, SI, and DI. They are useful for making the 80386 compatible with the 8086, 80186, and 80286 processors.
The six 16-bit segment registers (CS, SS, DS, ES, FS, and GS) allow systems software designers to select either a flat or segmented model of memory organization. CS, SS, DS, and ES serve the same purposes as on the earlier processors. Two additional data segment registers, FS and GS, are included in the 80386. The four data segment registers (DS, ES, FS, GS) can access four separate data areas and allow programs to access different types of data structures. For example, one data segment register can point to the data structures of the current module, another to the exported data of a higher-level module, another to a dynamically created data structure, and another to data shared with another task.
The flag register is a 32-bit register named EFLAGS. The flags are grouped into three types: the status flags, the control flags, and the system flags. The status flags include CF, PF, AF, ZF, SF, and OF.
Instruction Format
The opcode byte field varies depending on the class of operation. This field defines information such as direction of the operation, displacement sizes, register encoding, or sign extension. The displacement field can be 8, 16, or 32 bits if the addressing mode includes a displacement. The last field of the instruction is the immediate data, which can be 8, 16, or 32 bits if the addressing mode is immediate.
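Returning to the register naming described above, the way AX, AH, and AL overlay the 32-bit EAX register can be shown with a few masks and shifts. This is only a sketch in Python of how the names map onto bits; the helper names sub_registers and write_al are hypothetical.

```python
def sub_registers(eax):
    """Return (AX, AH, AL) as seen through a 32-bit EAX value."""
    eax &= 0xFFFFFFFF          # keep the value within 32 bits
    ax = eax & 0xFFFF          # AX is the low-order word of EAX
    ah = (ax >> 8) & 0xFF      # AH is the high byte of AX
    al = ax & 0xFF             # AL is the low byte of AX
    return ax, ah, al


def write_al(eax, value):
    """Writing AL replaces only the lowest byte; the other 24 bits of EAX are kept."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


ax, ah, al = sub_registers(0x12345678)
print(hex(ax), hex(ah), hex(al))
print(hex(write_al(0x12345678, 0xFF)))
```

The same overlay applies to EBX, ECX, and EDX; the registers EBP, ESP, ESI, and EDI have 16-bit names (BP, SP, SI, DI) but no byte-sized halves in the list above.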
  • 17. 
Component                                  Description
80386 Microprocessor                       32-bit high-performance microprocessor with on-chip memory management and protection.
80287 or 80387 Numeric Coprocessor         Performs numeric instructions in parallel with the 80386; expands the instruction set.
82384 Clock Generator                      Generates the system clock and RESET signal.
8259A Programmable Interrupt Controller    Provides interrupt control and management.
82258 Advanced DMA Controller              Performs direct memory access (DMA).
FIGURE 80386 system block diagram.
  • 18. INSTRUCTION EXECUTION CYCLE
REGISTERS :
As the instructions are interpreted and executed by the CPU, there is a movement of information between the various units of the computer system. In order to handle this process satisfactorily and to speed up the rate of information transfer, the computer uses a number of special memory units called registers. These registers are not considered a part of the main memory and are used to retain information on a temporary basis.
The number of registers varies among computers, as does the data-flow pattern. Most computers use several types of registers, each designed to perform a specific function. Each of these registers possesses the ability to receive information, to hold it temporarily, and to pass it on as directed by the control unit. The length of a register equals the number of bits it can store. Thus a register that can store 8 bits is normally referred to as an 8-bit register.
Although the number of registers varies from computer to computer, there are some registers that are common to all computers. The function of these registers is described below.
1. Memory Address Register (MAR) : It holds the address of the active memory location. It is loaded from the program control register when an instruction is read from memory.
2. Memory Buffer Register (MBR) : It holds the contents of the memory word read from, or written in, memory. An instruction word placed in this register is transferred to the instruction register. A data word placed in this register is accessible for operation with the accumulator register or for transfer to the I/O register. A word to be stored in a memory location must first be transferred to the MBR, from where it is written in memory.
3. Program Control Register (PC) : It holds the address of the next instruction to be executed. This register goes through a step-by-step counting sequence and causes the computer to read successive instructions previously stored in memory.
It is assumed that instruction words are stored in consecutive memory locations and read and executed in sequence unless a branch instruction is encountered. A branch instruction is an operation that calls for a transfer to a non-consecutive instruction. The address part of a branch instruction is transferred to the PC register to become the address of the next instruction. To read an instruction, the contents of the PC register are transferred to the MAR and a memory read cycle is initiated. The instruction placed in the MBR is then transferred to the instruction register.
4. Accumulator Register (A) : This register holds the initial data to be operated upon, the intermediate results, and also the final results of processing operations. It is used during the execution of most instructions. The results of arithmetic operations are returned to the accumulator register for transfer to main storage through the memory buffer register. In many computers there is more than one accumulator register.
5. Instruction Register (I) : It holds the current instruction that is being executed. As soon as the instruction is stored in this register, the operation part and the address part of the instruction are separated. The address part of the instruction is sent to the MAR, while its operation part is sent to the control section, where it is decoded and interpreted and ultimately command signals are generated to carry out the task specified by the instruction.
6. Input / Output Register (I/O) : This register is used to communicate with the input / output devices. All input information such as instructions and data is transferred to this register by an input device. Similarly, all output information to be transferred to an output device is found in this register.
Sr. No.  Name of Register        Function
1.       Memory Address (MAR)    Holds the address of the active memory location.
2.       Memory Buffer (MBR)     Holds information on its way to and from memory.
3.       Program Control (PC)    Holds the address of the next instruction to be executed.
4.       Accumulator (A)         Accumulates results and data to be operated upon.
5.       Instruction (I)         Holds an instruction while it is being executed.
  • 19. 6.       Input / Output (I/O)    Communicates with the I/O devices.
MACHINE INSTRUCTIONS :
Instructions in a form which can be used directly by the control unit are called machine instructions, and programs written in the form of machine instructions are said to be written in machine language. A special register is used to hold the machine instruction which is currently being interpreted by the control unit, and this register is called the Current Instruction Register (CIR).
THE STEPS IN EXECUTING INSTRUCTIONS
1. The control unit decodes the function part of the instruction.
2. The control unit signals the appropriate hardware to pass the operand address part of the instruction to a decoder, which passes the operand address into the MAR.
3. The control unit then signals main memory to perform a read, which results in the data at that address being copied into the memory data register (MDR, called the MBR above).
4. The control unit signals the ALU to perform the operation.
5. Once the control unit finishes the execution of one instruction, it must fetch the next instruction from memory into the CIR (via the MAR). The address of the next instruction is held in a special register called the sequence control register (SCR or PC). This operation is in two stages (called the fetch-execute cycle).
a) First it fetches the required instruction from main storage via the MDR and places it in the CIR.
b) Then it interprets the instruction in the CIR and causes the instruction to be executed by sending command signals to the appropriate hardware device.
PIPELINING
Pipelining is an implementation technique in which multiple instructions are overlapped in execution. The computer pipeline is divided into stages. Each stage completes a part of an instruction in parallel with the other stages. The stages are connected one to the next to form a pipe: instructions enter at one end, progress through the stages, and exit at the other end. Pipelining does not decrease the time for individual instruction execution. Instead, it increases instruction throughput.
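The statement that pipelining leaves the time for one instruction unchanged while raising throughput can be checked by counting cycles. The sketch below is an idealized model under stated assumptions: every stage takes exactly one cycle, the stages are perfectly balanced, and no stalls occur. The function names are illustrative.

```python
def cycles_nonpipelined(n_instructions, n_stages):
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """The first instruction needs n_stages cycles to fill the pipe;
    after that one instruction completes every cycle (ideal case, no stalls)."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


stages = 5
for n in (1, 10, 100):
    slow = cycles_nonpipelined(n, stages)
    fast = cycles_pipelined(n, stages)
    print(n, "instructions:", slow, "cycles unpipelined,", fast, "cycles pipelined")
```

A single instruction still takes 5 cycles either way, but 100 instructions finish in 104 cycles instead of 500, so the speedup approaches the number of stages as the instruction count grows.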
The throughput of the instruction pipe is determined by how often an instruction exits the pipeline. Because the pipe stages are hooked together, all the stages must be ready to proceed at the same time. We call the time required to move an instruction one step further in the pipeline a machine cycle. The length of the machine cycle is determined by the time required for the slowest pipe stage.
The pipeline designer's goal is to balance the length of each pipeline stage. If the stages are perfectly balanced, then the time per instruction on the pipelined machine is equal to
(Time per instruction on nonpipelined machine) / (Number of pipe stages)
Under these conditions, the speedup from pipelining equals the number of pipe stages. Usually, however, the stages will not be perfectly balanced; furthermore, pipelining itself involves some overhead.
Pipelining leads to dramatic improvements in system performance compared to sequential execution, which leaves much of the processor circuitry idle. The more stages the pipeline is broken into, the greater the theoretical speed. For example, suppose it takes 12 clock cycles to handle all the steps needed to process an instruction. In theory, with a 4-stage pipeline the maximum throughput is 1 instruction every 3 cycles, but with a 6-stage pipeline the maximum throughput is 1 instruction every 2 cycles. (This is of course highly simplified.)
Pipelining also has some drawbacks. One of these is complexity; there is a lot more work for the processor to do to keep the pipeline moving. Other problems relate to data dependencies. Let's take a very simple 2-line program as an example.
  • 20. A = A + 1 (Add 1 to the value at memory location A).
B = B + A (Add the value of memory location A to the value at memory location B).
Can you see how pipelining would cause a problem with this (very common kind of) code fragment? The processor will start executing the second instruction before the first one is finished, but it needs the result of the first instruction in order to execute the second one! A pipelining processor will of course detect and handle this condition, but in doing so it must wait for the first instruction to finish before proceeding with the second one. This condition is called a pipeline stall and leads to reduced performance. Newer processors have special performance-enhancing features to partially eliminate this sort of problem. In general the processor wants to keep the pipeline "flowing" as much as possible, since performance decreases whenever the pipeline stalls.
SUPERSCALAR
The evolution of microprocessors has reached the point where architectural concepts pioneered in the vector processors and mainframe computers of the 1970s (most notably the CDC-6600 and Cray-1) are starting to appear in RISC processors. Early RISC machines were very simple single-chip processors. As VLSI technology improved, more room became available on the chip. Rather than increase the complexity of the architecture, most designers decided to use this room for techniques that improve the execution of their current architecture. The two principal techniques are on-chip caches and instruction pipelines.
The latest step in this evolutionary process is the superscalar processor. The name means that these processors are scalar processors that are capable of executing more than one instruction in each cycle. The keys to superscalar execution are an instruction fetching unit that can fetch more than one instruction at a time from cache; instruction decoding logic that can decide when instructions are independent and can thus be executed simultaneously; and sufficient execution units to be able to process several instructions at one time.
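The independence test mentioned above is the same check that exposes the stall in the two-line A/B program earlier in this section. One simple way to sketch it, assuming each instruction is described only by the set of locations it reads and the set it writes, is to compare those sets. The Instr class and depends_on function are hypothetical names; real decode logic works on register numbers in hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str
    reads: frozenset    # locations the instruction reads
    writes: frozenset   # locations the instruction writes


def depends_on(later, earlier):
    """True if `later` cannot be issued together with (or ahead of) `earlier`."""
    read_after_write = later.reads & earlier.writes    # needs the earlier result
    write_after_write = later.writes & earlier.writes  # both write the same place
    write_after_read = later.writes & earlier.reads    # would clobber an input
    return bool(read_after_write or write_after_write or write_after_read)


i1 = Instr("A = A + 1", frozenset({"A"}), frozenset({"A"}))
i2 = Instr("B = B + A", frozenset({"A", "B"}), frozenset({"B"}))
i3 = Instr("C = C + 1", frozenset({"C"}), frozenset({"C"}))

print(depends_on(i2, i1))  # second line reads A, which the first line writes
print(depends_on(i3, i1))  # unrelated locations, so the pair is independent
```

In this model the A/B pair must be serialized (a stall in a simple pipeline), while the A/C pair could be issued in the same cycle by a superscalar machine with enough execution units.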
Note that the execution units may be pipelined, e.g. they may be floating-point adders or multipliers, in which case the cycle time for each stage matches the cycle times of the fetching and decoding logic. In many systems the high-level architecture is unchanged from earlier scalar designs. The superscalar designs use instruction-level parallelism for improved implementation of these architectures.
A good example of a superscalar processor is the IBM RS/6000. There are three major subsystems in this processor: the instruction fetch unit, an integer processor, and a floating-point processor. The instruction fetch unit is a 2-stage pipeline; during the first stage a packet of four instructions is fetched from an instruction cache, and in the second stage instructions are routed to the integer processor and/or floating-point processor. An interesting feature of this instruction unit is that it executes branch instructions itself, so that in a tight loop there is effectively no overhead from branching, since the instruction unit executes branches while the data units are computing values. The integer unit is a four-stage pipeline. In addition to executing data processing instructions, this unit does some preprocessing for the floating-point unit. The floating-point unit itself is six stages deep.
The advantage of the superscalar approach is that it does not rely on a vectorizing compiler to detect loops and turn them into vector instructions. A superscalar machine still requires a very sophisticated compiler to allocate resources and schedule operations in an order that will best take advantage of the resources of the machine, but in the long run the superscalar approach may be more flexible and applicable to a wider range of applications than vector processing.
RISC ARCHITECTURE
RISC stands for Reduced Instruction Set Computer. PA-RISC is the name for Hewlett-Packard's standard hardware that runs both the MPE/iX and HP-UX operating systems.
The classic HP 3000, DEC VAX, and IBM 360 all used CISC processors: Complex Instruction Set Computer. The instructions that programmers use on those machines are not the real hardware instructions. Each complex instruction is implemented by a hidden microprogram written in the real instructions. On RISC computers there are no microprograms. Machine instructions are implemented directly in hardware. Any task too complex for the hardware to execute in a single cycle
  • 21. with large, complex instruction sets, the simple, often-executed instructions incur a performance penalty from the overhead of additional instruction decoding, the use of microcode, and the longer cycle time resulting from increased functionality.
PA-RISC Machine Instructions
PA-RISC machines use an instruction set that is based on 32 general-purpose registers. Here are some tips to help you guess the function of an instruction from the mnemonic:
Arithmetic : ADD@ and SUB@.
Branches : B@ as in BL Branch and Link, BV Branch Vectored.
Compare and Branch : C@ as in COMIBF, COMpare Immediate and Branch If False.
Extract : EXTRS for signed and EXTRU for unsigned.
Load : L@ as in LDH load halfword, LDO load offset.
Shift : SH@ as in SH2ADD Shift 2 and Add.
Store : ST@ as in STB Store Byte, STW Store Word.
CISC (Contemporary complex instruction set architecture)
We have noted the trend to richer instruction sets, which include a larger number of instructions and more complex instructions. Two principal reasons have motivated this trend: a desire to simplify compilers and a desire to improve performance. Underlying both of these reasons was the shift to high-level languages (HLLs) on the part of programmers; architects attempted to design machines that provided better support for HLLs.
The first of the reasons cited, compiler simplification, seems obvious. The task of the compiler writer is to generate a sequence of machine instructions for each HLL statement. If there are machine instructions that resemble HLL statements, this task is simplified. However, the task of optimizing the generated code to minimize code size, reduce instruction execution count, and enhance pipelining is much more difficult with a complex instruction set. As evidence of this, studies cited earlier in this chapter indicate that most of the instructions in a compiled program are the relatively simple ones.
The other major reason cited is the expectation that a CISC will yield smaller, faster programs. Let us examine both aspects of this assertion: that programs will be smaller and that they will execute faster.
CISC Versus RISC Characteristics
After the initial enthusiasm for RISC machines, there has been a growing realization that (1) RISC designs may benefit from the inclusion of some CISC features and that (2) CISC designs may benefit from the inclusion of some RISC features. The result is that the more recent RISC designs, notably the PowerPC, are no longer "pure" RISC and the more recent CISC designs, notably the Pentium, do incorporate some RISC characteristics.
For purposes of this comparison, the following are considered typical of a RISC:
1. A single instruction size.
2. That size is typically 4 bytes.
3. A small number of data addressing modes, typically less than five. This parameter is difficult to pin down. In the table, register and literal modes are not counted and different formats with different offset sizes are counted separately.
4. No indirect addressing that requires you to make one memory access to get the address of another operand in memory.
5. No operations that combine load/store with arithmetic (e.g., add from memory, add to memory).
6. No more than one memory-addressed operand per instruction.
7. Does not support arbitrary alignment of data for load/store operations.
8. Maximum number of uses of the memory management unit (MMU) for a data address in an instruction.
  • 22. 9. Number of bits for integer register specifier equal to five or more. This means that at least 32 integer registers can be explicitly referenced at a time.
10. Number of bits for floating-point register specifier equal to four or more. This means that at least 16 floating-point registers can be explicitly referenced at a time.
Items 1 through 3 are an indication of instruction decode complexity. Items 4 through 8 suggest the ease or difficulty of pipelining, especially in the presence of virtual-memory requirements. Items 9 and 10 are related to the ability to take good advantage of compilers.
BUSES FOR INTERFACING
The first IBM PC-XT used the Intel 8088 microprocessor running at a clock rate of 4.77 MHz. The bus had the following characteristics:
1. A set of eight bidirectional data lines
2. A set of 20 address lines
3. Six interrupt lines
4. Three sets of Direct-Memory-Access control lines
5. A group of lines for data control and status
6. Power supply and ground lines
The 62 pins that make up the bus are divided into two rows (A1-A31 and B1-B31) for the edge connectors. These connectors, which are sometimes called expansion connectors, are located on the main circuit board of the computer. The microprocessor along with some of the I/O circuits and memory are also on this board. In the IBM PC-XT, there are five edge connectors for the expansion boards. Data moves over the bus on the data lines A2-A9. Addresses for data transfers are specified on the 20 address lines A12-A31. The use of the 8-bit version of the Intel 8086 16-bit processor in the PC and XT reduced the number of bus lines needed and resulted in a smaller bus and lower hardware costs.
The six interrupt pins B21-B25 and B4 are connected to an interrupt controller on the system board. This controller generates the addresses needed for interrupt servicing. There are six interrupt lines, IRQ2 through IRQ7.
These are connected to an interrupt controller on the processor board which automatically generates vectors for the interrupt service routine. As a result, there is no explicit interrupt-acknowledge signal on the IBM PC bus. Interrupts are acknowledged by data transactions with the processor.
ISA and EISA Systems
The ISA bus refers to the bus used in Industry-Standard-Architecture compatible computers. This is the same as the IBM AT 16-bit bus. In an EISA system, which is the extended version of ISA with a 32-bit bus, the term refers to the ISA subset of the EISA bus. The EISA bus is a superset of the ISA bus. It has all of the ISA bus features, along with extensions to enhance performance and capabilities.
The host CPU is the main system processor with its separate host bus. An EISA master is a 16-bit or 32-bit bus master that uses the EISA signal set to generate memory or I/O cycles. A bus controller is used to convert the EISA control signals to ISA signals. An ISA master is a 16-bit bus master that uses the ISA subset of the EISA bus to generate memory or I/O cycles. This master must communicate with 8-bit or 16-bit ISA slaves, and route data to the proper paths. It is not used to handle any of the signals associated with the extended section of the EISA bus.
The EISA slaves can be 8-bit, 16-bit or 32-bit memory or I/O slave devices that use the extended signal set of the EISA bus to accept cycles from the different masters. They handle information on the type and width of data using both extended and ISA signals. The ISA slaves are 8- or 16-bit slave devices that use the ISA subset of the EISA bus to accept cycles from the different masters. They use ISA signals to indicate the type and width of data. A DMA slave is an I/O device that uses DMA signals like DREQ or DACK* to perform a direct memory access.
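When a master is wider than the slave it addresses, one access has to be broken into several narrower cycles and the bytes routed to the proper lanes. The following Python is only a sketch of that idea, not a model of the EISA bus controller: the function names `split_access` and `assemble` are hypothetical, and little-endian byte ordering is assumed.

```python
def split_access(address, data, master_bytes, slave_bytes):
    """Break one master access into the cycles a narrower slave needs.

    Returns a list of (address, chunk) pairs, lowest address first.
    Little-endian ordering is an assumption of this sketch.
    """
    if master_bytes <= slave_bytes:
        return [(address, data)]          # widths match: a single cycle
    chunk_mask = (1 << (8 * slave_bytes)) - 1
    cycles = []
    for i in range(master_bytes // slave_bytes):
        chunk = (data >> (8 * slave_bytes * i)) & chunk_mask
        cycles.append((address + i * slave_bytes, chunk))
    return cycles


def assemble(cycles, slave_bytes):
    """Rebuild the wide value from the narrow cycles (the read direction)."""
    value = 0
    for i, (_, chunk) in enumerate(cycles):
        value |= chunk << (8 * slave_bytes * i)
    return value


# A 32-bit master writing to an 8-bit slave needs four cycles.
byte_cycles = split_access(0x100, 0x11223344, master_bytes=4, slave_bytes=1)
# The same write to a 16-bit slave needs two.
word_cycles = split_access(0x100, 0x11223344, master_bytes=4, slave_bytes=2)
print(len(byte_cycles), len(word_cycles))
```

The same splitting logic covers both directions: a write is disassembled into narrow cycles, and a read is assembled from them.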
  • 23. Assembly and disassembly are needed when the master and slave data bus sizes are mismatched. Multiple cycles are used to route bytes to the proper byte paths. When a 32-bit CPU accesses an 8-bit slave, four cycles will be used to route the bytes. A cycle translation is performed when the master and slave are on different buses. The master protocol is translated to the slave protocol.
EISA Systems
The Extended Industry Standard Architecture (EISA) is a 32-bit architecture based upon the Industry Standard Architecture (ISA) for the PC AT. EISA's capabilities and 32-bit architecture are needed to get the maximum performance out of the 386 and 486 CPUs. The EISA consortium defined the EISA bus as a 32-bit high-performance ISA-compatible system. This open industry standard allows industry-wide compatibility. EISA provides 32-bit memory addressing and data transfers for CPU, DMA, and bus masters. It allows a 33 Mbyte/second transfer rate for DMA and bus masters on the EISA bus. EISA provides automatic configuration of add-in cards that eliminates the need for jumpers and switches. Interrupts are both shareable and programmable. Figure 4.3 shows the type of buses used in an EISA system. The bus-arbitration scheme allows intelligent bus master add-in cards.
Since the EISA system is compatible with the ISA 8 and 16-bit expansion boards and software, ISA cards can be plugged into the EISA connector slots. The EISA slots are defined as ISA or EISA for compatibility during configuration. The EISA connector set is a superset of the ISA connector set, so full compatibility with ISA add-in boards is retained under the automatic system and expansion board configuration scheme.
EISA Chips
The Intel 82350 EISA chip set is an EISA/ISA compatible chip set. It supports the 386 or 486 CPU, 82385 cache controller, and 80387 numerics coprocessor. The 82350 chip set is designed for PCs and PC compatible workstations.
The chip set also supports a buffered configuration for extended architectures with SCSI and LAN functions on the system board. The chip set includes the 82352 EISA Bus Buffers (EBB), 82357 Integrated System Peripheral (ISP) and 82358 EISA Bus Controller (EBC).
The EBB supports three buses when used in an EISA system. These are called the A, B, and S buses in the EBB and correspond to the host system bus and the LA and SA buses in the EISA system.
The ISP handles the DMA functions of the system. It has seven 32-bit DMA channels, five 16-bit timer/counters, two eight-channel interrupt controllers, and provides the NMI control and generation. It also provides refresh address generation and keeps track of the refresh requests when the bus is not available. The ISP supports multiple EISA bus masters using a system arbitration scheme which grants the bus on a rotating basis.
The EBC acts as the EISA engine, since it works as an intelligent bus controller for the 8, 16 and 32-bit bus masters and slaves. It provides the state machine interface to host ISA/EISA buses and the other ICs in the chip set. It provides the interface to the 386/486 CPUs and the EISA bus. The EBC acts as a bridge between the EISA and ISA devices. The data bus size differences are handled by the EBC, including the assembly and disassembly.
The 82355 Bus Master Interface (BMIC) is a device for add-in cards that makes use of the EISA bus master capabilities. ICs like these support the EISA bus in 386 and 486 processors at various clock speeds. These chips use a CPU-to-memory protocol which allows the memory subsection to operate independently of the CPU clock. The CPU protocol is translated to this CPU-speed-independent protocol.
Bus Architecture
Three buses are used: a host bus, the EISA bus, and a peripheral bus called the X-bus. The host bus connects the CPU or host master to the memory system. The EISA bus interfaces the system board resources to expansion bus resources.
The peripheral bus supports the system board I/O.
Host Bus
The host bus provides the connection between the CPU and the memory system. Zero-wait-state burst cycles are implemented using a 64/128-bit interleaved memory interface. There are zero-wait-state posted writes with the posted write buffer of the 82353.
  • 24. CPU frequency independence is due to the 82359's delay line and the programmable state tracker function. The programmable delay line allows the DRAM cycle sequence to be tuned to DRAM parameters. Even though the memory interface handshake is clockless, it is synchronous, since the CPU wait state counts match those needed by the DRAM access.
EISA Bus
The EISA bus connects the masters to memory and acts as a path for CPU accesses to system resources. The interface of EISA masters to memory is optimized for the full memory bandwidth defined by the EISA specifications. This is done with the synchronous tracking of the EISA master cycles by the 82359. The 82359 is always synchronous to the EISA master talking to it. When the CPU accesses the system, the EBC converts the 82359 handshake into the EISA protocol. The EBC performs any required cycle control for byte assembly/disassembly, and controls the latches and transceivers for the 82359 DRAM controller and the 82353 data path chip. The EBC runs back-to-back read cycles to support CPU to system bursts and coordinates posted system writes.
Peripheral Bus (X-Bus)
This is an 8-bit bus that supports the system board I/O functions such as the keyboard, floppy, and the LIOE chip, which contains a parallel port and supports an external real-time clock and serial ports. The peripheral bus is a buffered version of the 8-bit ISA bus.
THE PCI LOCAL BUS - A BUS BUILT BY INTEL
The PCI-Bus (Peripheral Component Interconnect) was originally designed to speed up the display of graphics on Intel-based personal computers, but the standard itself is processor independent and suitable for other hardware add-ons that require high bandwidth, including network, video and SCSI adapters. PCI was developed by Intel but it did take some time to get it to work reliably. By the middle of 1993 the VESA-Bus became firmly entrenched in the market place and almost all DOS computer systems had VESA-Bus slots as standard.
The wide acceptance of local bus technology only took a few months and by default, VESA-Bus became the first Local Bus standard. For a while, many people in the computer industry saw a local-bus war between the two competing local-bus standards. The PCI-Bus has some attractive features, such as concurrent bus-mastering, a full burst mode, and a type of pipelining queue that can reduce the number of potential wait states compared to the VESA-Bus design.
The PCI-Bus uses three elegant techniques to resolve local bus problems. The first, known as reflective wave signaling, reduces the amount of electrical amplification required on the signal paths and thus reduces noise and loading problems. The second is multiplexing, which allows two different signals to use the same electrical path, reducing the number of pins required for peripheral chips and lowering manufacturing costs. The third is a protocol letting the PCI controller receive specific configuration information from the PCI devices themselves. Intel did not define a standard adapter connector for the bus, leaving that job up to a PCI-Bus special-interest group who settled on the white 112-pin connector.
PCI the Universal Bus
PCI is platform independent and was soon used in computers built around the PowerPC chip. This is one of the few times a standard I/O bus has been used across platforms and so this has to be a big feature in its favour. The various companies involved in the PowerPC development, including Apple and IBM, adopted the PCI-Bus for PowerPC-based computers. Apple had been using the Macintosh NuBus for many years, but switched to the PCI-Bus for its PowerPC products. It is ironic that the largest user of Motorola-based processors lined up to buy bus technology from Intel.
Other computer manufacturers are also using the PCI-Bus in their computer platforms, with Digital Equipment Corp. (DEC), with its Alpha RISC-based systems, Hewlett-Packard, and Sun Microsystems all including PCI-Bus slots in their products. Intel licensed its patents on the PCI Bus free of royalties to all who wished to use it.
  • 25. By adopting an established industry standard the manufacturers of the other computer platforms are ensuring lower costs and more options for both users and developers who are no longer locked into their own proprietary options. The wide range of cards that have followed the use of the PCI-Bus on PC systems are available for the first time to users of other hardware. All that should be required is alternative driver software for the various platforms.
The Characteristics of the Various Buses
Bus type   Bus data width   Bus speed          Data transfer rate (Mbytes/sec)
PC/XT      8 bits           4.77-8 MHz         3.25
ISA        16 bits          8 MHz              6.5
EISA       32 bits          8 MHz              32
MCA        32 bits          8 MHz              0
VESA       32 bits          33 MHz to 50 MHz   132 and above
PCI        32 bits          33 MHz             132
I/O Interfacing
The 386 DX microprocessor supports 8-bit, 16-bit and 32-bit I/O devices. These can be mapped into either the 64-kilobyte I/O address space or the 4-gigabyte physical memory address space. I/O mapping and memory mapping of I/O devices have the following characteristics: (1) the address decoding needed to generate chip selects for I/O mapping is usually simpler than for memory mapping, (2) I/O mapped devices reside in the I/O space of the 386 microprocessor (64 kilobytes), and (3) memory mapped devices reside in the much larger memory space of 4 gigabytes.
Memory mapped devices can be accessed using any 386 microprocessor instruction. I/O mapped devices can be accessed only through the IN, OUT, INS, and OUTS instructions. Memory mapped devices are protected by the memory management and protection features. The interface to a peripheral device depends not only upon data width, but also on the signal requirements of the device and its location within the memory space or I/O space.
Address Decoding
Address decoding to generate chip selects is required whether the I/O devices are I/O mapped or memory mapped.
One technique for decoding memory mapped I/O addresses is to map the entire I/O space of the 386 into a 64 kilobyte region of the memory space. The address decoding logic ensures that each I/O device responds to both a memory address and an I/O address. Addresses can be assigned to I/O devices arbitrarily within the I/O space or memory space.
Eight-bit I/O devices can be connected to any of the four 8-bit sections of the data bus. If the addresses of two devices lie within the same doubleword boundaries, BE3-BE0 are decoded to provide a chip-select signal that prevents a write to one device from erroneously performing a write to the other. This chip select is generated with an address decoder.
In most systems, the same control logic, address latches, and data buffers are used to access both memory and I/O devices. Latches hold the address for the duration of the bus cycle. If 74LS373 latches are used, the Latch Enable (LE) input is controlled by the Address Latch Enable (ALE) signal from the bus control logic. It goes active at the start of each bus cycle.
The address decoder converts the 386 address into chip-select signals. It can be located before the address latches or after the latches. If it is placed before the latches, the chip-select signal becomes valid as early as possible but it must be latched along with the address. The chip-select signals are sent to the bus control logic to set the correct number of wait states for the accessed device.
The decoder may be made up of two one-of-four decoders. One is used for memory address decoding and one for I/O address decoding. An output of the memory address decoder will activate the I/O address decoder for I/O accesses.
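The role of the byte enables in separating two 8-bit devices that share a doubleword can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular decoder circuit: the port addresses 0x300 and 0x301, the names `CS_A` and `CS_B`, and the functions `decode_byte_enables` and `chip_select` are all hypothetical, and the enables are modeled as active-low with lane 0 on the lowest byte of the data bus.

```python
def decode_byte_enables(address, size=1):
    """Return (doubleword address, (BE0, BE1, BE2, BE3)) for one access.

    Enables are modeled as active-low: 0 means the byte lane is selected.
    """
    offset = address & 0x3
    if offset + size > 4:
        raise ValueError("access crosses a doubleword boundary")
    doubleword = address & ~0x3
    enables = tuple(0 if offset <= lane < offset + size else 1
                    for lane in range(4))
    return doubleword, enables


# Hypothetical map: two 8-bit devices inside the same doubleword.
CHIP_SELECTS = {
    (0x300, 0): "CS_A",   # device on data bits D7-D0
    (0x300, 1): "CS_B",   # device on data bits D15-D8
}


def chip_select(address):
    """Pick the chip select for a byte access, or None if nothing is mapped."""
    doubleword, enables = decode_byte_enables(address)
    lane = enables.index(0)               # the single active byte lane
    return CHIP_SELECTS.get((doubleword, lane))


print(decode_byte_enables(0x301))
print(chip_select(0x300), chip_select(0x301), chip_select(0x302))
```

Because the lane is part of the lookup key, a byte write to one device never selects its neighbour, which is the protection the text attributes to decoding BE3-BE0.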
  • 26. I/O Interface
The peripheral (I/O) interface is an essential part of a microprocessor system. It supports communications between the microprocessor and the peripherals. The peripheral system must allow a variety of interfaces. An important factor is the buses which connect the major parts of the system. Devices like disks must be able to transfer data to a memory with minimal CPU overhead or interaction.
I/O devices may be accessed by dedicated I/O instructions for I/O mapped devices, or by memory operand instructions for memory mapped devices. The 486 microprocessor synchronizes I/O instruction execution with external bus activity. The previous instructions are completed before an I/O operation begins. All writes in the write buffers will be completed before an I/O read or write is performed.
All microprocessor systems include a microprocessor, memory, and I/O devices which are linked by the address, data, and control buses. The figure shows the configuration of a typical 486 microprocessor based system. In most systems, the same control logic and data buffers can access memory as well as I/O devices. The bus interface consists of bus control, data transceiver, byte swap logic, and address decoder.
A typical peripheral device has address inputs which the processor uses to select the device's internal registers. It also has a chip-select (CS#) signal which enables it to read data from and write data to the data bus as controlled by the READ (RD) and WRITE (WR) control signals. If the microprocessor has separate memory and I/O addressing, either memory or I/O read and write signals can be used.
  • 27. Many peripheral devices also generate an interrupt output which is asserted when a response is required from the microprocessor. Here, the microprocessor must generate a low interrupt acknowledge (INTA) signal. Transceivers An 8-bit transceiver like the 74LS245 provides isolation and additional drive for the 386 data bus. Transceivers are used to prevent any contention on the data bus that may occur if some devices take too long to remove read data from the data bus after a read cycle. If a write cycle follows a read cycle, the 386 can drive the data bus before a slow device has removed its output from the bus, resulting in bus contention problems. Transceivers are not used when the device is fast enough. The bus interface must have enough transceivers to handle the device with the most inputs and outputs on the bus. If the widest device has 16 data bits and all devices can be connected only to the lower half of the data bus, only two 8-bit transceivers are used. The 74LS245 transceiver is controlled with two input signals. A Data Transmit/ Receive (DT/R) is switched high to enable the transceiver for a write cycle. When it is switched low, it enables the transceiver for a read cycle. This signal is a latched version of the transceiver outputs. This signal is generated by the bus control logic. DMA CONTROLLERS Direct-Memory-Access Techniques When I/O devices must transfer large amounts of data too quickly to be controlled by the microprocessor, the transfer can be made directly between the device and the memory of the microprocessor system using direct-memory-access (DMA). Here, the transfer is under the control of a DMA controller which is usually a dedicated chip or circuit that operates independently of the microprocessor. In a DMA data transfer, the DMA controller can take over control of the microprocessor buses using one of several methods. An external control line can stop the microprocessor after the current bus cycle is completed. 
The microprocessor's memory- control signals are disabled while the DMA controller initiates the data transfer. When the DMA transfer is completed, the controller resets the halt line so the microprocessor can resume execution. In the cycle-stealing scheme, external control lines are used to stop the operation of the microprocessor by suspending the execution of an instruction cycle. The microprocessor is halted while the memory control lines are disabled. The controller takes over operation and steals several machine cycles to implement the data transfer. When the transfer is complete, the control lines are reset, the clock is started, and the microprocessor continues execution of the instruction that had been delayed. Microprocessors using dynamic memory may need to restrict the number of machine cycles that can be stolen, so that status conditions are not lost between refreshing. Another DMA technique is memory sharing where the microprocessor is allowed to access the memory only at certain times during a machine cycle. Thus, the memory is available for other devices at the other times. This requires synchronizing the DMA controller with the microprocessor clock. This type of interleaved DMA can reduce the delays in microprocessor processing that can occur when cycle stealing is used. In a typical DMA controller chip like the Intel 8257, the acquisition of the bus for the DMA operation is done with the hold function for the microprocessor. The priority logic in the controller is used to resolve conflicts and issue the hold request to the microprocessor. The controller also keeps track of the cycles used and notifies the peripheral when the number of cycles used for the data transfer is complete. 8237A DMA Controller The Intel 8237A provides enable/disable control of individual DMA requests for four independent DMA channels with independent auto initialization of all channels. 
It allows transfers up to 1.6 Mbytes/second with the 5-MHz version and is directly expandable to any number of channels. The controller comes in a 40-lead Cerdip or plastic package.
  • 28. The 8237A peripheral interface circuit is designed to improve system performance by allowing external devices to directly transfer information from the system memory. Memory-to-memory transfer capability is also provided. The 8237A is used with an external 8-bit address latch. The channels are expanded by cascading additional controller chips (fig. 8.5). There are three basic transfer modes.
  • 29. Registers
The 8237A has 344 bits of internal memory in its registers as shown in the following:
4 Base address registers of 16 bits each
4 Base word count registers of 16 bits each
4 Current address registers of 16 bits each
4 Current word count registers of 16 bits each
1 Temporary address register of 16 bits
1 Temporary word count register of 16 bits
1 Status register of 8 bits
1 Command register of 8 bits
1 Temporary register of 8 bits
4 Mode registers of 6 bits each
1 Mask register of 4 bits
1 Request register of 4 bits
DMA Operation
The 8237A uses two major cycles called the idle and active cycles. Each cycle is made up of several states. The chip can be in one of seven states. Each state lasts for one clock period. The inactive state is used when there are no DMA requests pending. In this state the DMA controller is inactive but it may be in the program condition, being programmed by the processor.
In the first state of DMA servicing, called the S0 state, the 8237A has requested a hold but the processor has not yet returned an acknowledge. The chip may still be programmed until it receives an HLDA from the CPU. An acknowledge from the CPU allows DMA transfers to begin. S1, S2, S3 and S4 are the working states of DMA servicing. If more time is needed to complete a transfer, wait states (SW) are inserted. Data passes directly between the I/O device and memory, with IOR* and MEMW* or MEMR* and IOW* active low at the same time.
Memory-to-memory transfers use a read-from-memory and a write-to-memory for each transfer. These states are similar to the normal working states and use a two-digit number for identification. Eight states are needed for a transfer. The first four states, S11 to S14, are used for the read-from-memory and the last four states, S21 to S24, for the write-to-memory part of the transfer.
Idle Cycle
When there are no requests for service, the chip goes into the idle cycle and performs SI states.
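As a quick check on the register inventory listed earlier on this slide, the 344-bit figure can be reproduced by multiplying each register count by its width and summing. The short Python below does only that arithmetic; the list name `REGISTERS` is introduced here for illustration.

```python
# (register name, how many, width in bits), copied from the list above
REGISTERS = [
    ("Base address", 4, 16),
    ("Base word count", 4, 16),
    ("Current address", 4, 16),
    ("Current word count", 4, 16),
    ("Temporary address", 1, 16),
    ("Temporary word count", 1, 16),
    ("Status", 1, 8),
    ("Command", 1, 8),
    ("Temporary", 1, 8),
    ("Mode", 4, 6),
    ("Mask", 1, 4),
    ("Request", 1, 4),
]

total_bits = sum(count * width for _, count, width in REGISTERS)
print(total_bits)
```

The sixteen 16-bit channel registers account for 256 of the bits; the remaining registers make up the other 88.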
In this cycle the chip samples the DREQ lines every clock cycle to check if any channel is requesting service. The device will look at CS*, which indicates when the microprocessor tries to read or write any internal registers of the 8237A. When CS* is low and HLDA is low, the 8237A enters the Program Condition. In this mode the CPU can set or check the condition of the internal registers. Address lines A0 through A3 select the registers to be read or written to. The IOR* and IOW* lines select and time the reads and writes. An internal flip-flop is used to generate an additional bit of address. This bit determines the upper or lower byte of the 16-bit address and word count registers. The flip-flop is reset by a Master Clear, Reset or a software command.
Active Cycle
When the 8237A is in the idle cycle and a non-masked channel requests a service, the device sends an HRQ to the microprocessor and goes into the Active cycle. The DMA servicing will take place, using one of the following four modes.
Single transfer
In the single transfer mode the device is programmed to make one transfer only. The word count will be decremented and the address decremented or incremented following
  • 30. each transfer. When the word count goes through zero, a Terminal Count (TC) will cause an auto initialize if the channel has been programmed for this.
Block transfer
In the block transfer mode the device is activated by a DREQ and continues with the transfer until a TC, caused by the word count ending, or an external End of Process (EOP) occurs. An auto initialization will occur at the end of the service if the channel has been programmed for it.
Demand transfer
In the demand transfer mode the device continues with the transfer until a TC or external EOP occurs or until DREQ goes inactive. This allows the transfer to continue until the I/O device has no more data ready to transfer. When the I/O device has more data to transfer, the DMA service is reestablished with a DREQ. During the time between services, the values of address and word count are stored in the Current Address and Current Word Count registers. An EOP is needed to auto initialize at the end of the service. The EOP may be generated by either a TC or by an external signal.
Cascade
The cascade mode is used to cascade more than one device for system expansion. The HRQ and HLDA signals from the additional device are connected to the DREQ and DACK signals of a channel of the initial device. This allows the DMA requests of the additional device to propagate through the priority circuits of the preceding device. The priority chain is preserved and the new device must wait for its turn to acknowledge requests.
Since the cascade channel of the initial device is used only for prioritizing the additional device, it does not output any address or control signals of its own. These could conflict with the outputs of the active channel in the added device. The 8237A will respond to DREQ and DACK but all other outputs except HRQ are disabled. The ready input is ignored.
This results in a two-level DMA system. More devices could be added at the second level by using the remaining channels of the first level.
Additional devices may also be added by cascading into the channels of a second-level device to form a third level.
Controlling I/O devices (Polling and interrupts)
The three basic scheduling techniques used for controlling the input-output devices and synchronizing the data transfers are as follows:
1. Polling or program-control
2. Interrupt-control
3. Direct-memory-access control
The one that is used in the microcomputer system depends on three factors: the rate at which the data is to be transmitted, the time delays between the I/O device and the actual data transfer, and the feasibility of overlapping or interleaving I/O operations.
The polling or programmed I/O technique is the simplest to implement. The I/O devices are connected to the system bus with some connections to the control lines. The basic principle is to implement a procedure in hardware or software for determining which input/output device requires service. The polling technique is synchronous in nature as the microprocessor periodically questions each device if it requires service. Each device then answers with a yes or no. If a no is received, the microprocessor will advance to the next device and question it. In this way the microprocessor checks each I/O device successively to determine if service is required.
In a polling microcomputer the asynchronous inputs are detected by an instruction which checks if the input has occurred. The sequence of instructions in the polling loop tests the various input lines at a rate which will provide the desired system response time. In a microcomputer system that communicates with several I/O devices, the periodic status checks that must be made on each device can result in considerable time lags for some devices since they indicate when they are ready to transfer data and then they wait for the actual transfer.
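The polling loop described above can be sketched in Python. This is a simplified software model, not code for any real microcomputer: the `Device` class, its `ready` and `service` methods, and the device names are all hypothetical. Each pass asks every device in turn whether it needs service and services at most one item per device.

```python
class Device:
    """Toy I/O device holding a list of data items waiting for transfer."""

    def __init__(self, name, pending=None):
        self.name = name
        self.pending = list(pending or [])

    def ready(self):
        # The yes/no answer the microprocessor gets when it questions the device.
        return bool(self.pending)

    def service(self):
        # Transfer one item and return it.
        return self.pending.pop(0)


def poll_once(devices):
    """Question each device in order; service those that answer yes."""
    serviced = []
    for device in devices:
        if device.ready():
            serviced.append((device.name, device.service()))
    return serviced


devices = [
    Device("keyboard", ["k"]),
    Device("printer"),
    Device("disk", ["d1", "d2"]),
]
first_pass = poll_once(devices)
second_pass = poll_once(devices)
third_pass = poll_once(devices)
print(first_pass, second_pass, third_pass)
```

The second disk item has to wait a full pass before it is transferred, which is the kind of time lag the text attributes to polling several devices.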
In some microcomputer systems, the time spent checking the device status may be reduced with a common test line, which signals when a device needs attention. The
  • 31. microprocessor can periodically check the status of this line without having to poll the individual devices until one of them signals for service. Then, a polling loop is used to find which device requested service.
Polling takes minimal hardware since usually no special lines are required. It is also synchronous with the program execution, so it is easy to find when a device is being interrogated and how long it takes to service it. No events can occur that would disrupt the scheduled polling sequence. In contrast, interrupts and DMA are asynchronous and cannot be predicted.
Interrupts
In applications where the polling technique does not provide fast enough response, or uses too much microprocessor time, interrupts should be considered. An interrupt line is connected to the microprocessor, and each of the devices is connected to this line. Each one of the devices which may need to get service has the option of using this line to request service. When a device requests service, it generates an interrupt pulse or level on this line. The microprocessor can then sense this change on the line.
The microprocessor must accept the interrupt, identify it, and service it. Accepting the interrupt may be done with an internal mask bit called either an interrupt mask, interrupt inhibit, or interrupt-enable. This bit is normally stored in the flag or status register. After the interrupt is accepted, the microprocessor must then determine which device originated it. Several devices may generate interrupts simultaneously. When multiple devices are connected to the same interrupt line, priorities must be assigned.
After the interrupt has been accepted and the device identified, the service requested by the device is performed. The microprocessor may suspend the program it was executing and branch to the interrupt routine. The required branching address will need to be available. The software that does this is called the interrupt routine or interrupt handler.
The execution of the interrupt handler is similar in some ways to that of a polling system. The termination of this routine allows the program which had been suspended by the interrupt to continue its execution. This may require several instructions.
Multiple Interrupts
The details of the interrupt-servicing procedures discussed so far apply only when a single device is generating interrupts. Most microcomputer systems have more than one source and more than one type of interrupt. The types of interrupts fall into three general classes: external, internal, and simulated. External interrupts are generated from the peripheral devices. Internal interrupts are generated by the microcomputer system to indicate error conditions such as a power failure, system malfunction, or transmission break. Simulated interrupts are generated by the software for interrupt testing and debugging.
The different sources of interrupts can have different service requirements. Some may require immediate attention, while others can wait until the task underway is completed. The interrupt procedure must differentiate between the various sources and determine the order in which the interrupts are serviced when more than one occurs at the same time. Finally, the contents of registers must be saved and restored so the program can continue after the multiple interrupts.
If several interrupt-request lines are used, each can have its own interrupt-trap address. Then, with one source of interrupt assigned to each line, the system can distinguish between internal, external, and simulated interrupts. When several I/O devices use the same interrupt-request line, the interrupt may be recognized either by polling using software or by vectored interrupts using hardware. In the polling technique as we have discussed, the interrupt produces a jump to the service program using the interrupt trap address.
The service program will check the status word of each I/O device to determine which one caused the interrupt. As each device is checked, its interrupt status bit indicates whether that device has generated an interrupt request. The device status word is read into the status register of the microprocessor, and if the bit is set, a jump is made to the service program for that device.
Vectored Interrupts
A vectored interrupt system allows the microprocessor to recognize the interrupting device, since each I/O device is assigned a unique interrupt address. This address then
  • 32. generates an interrupt trap address for the device. The trap addresses are normally located sequentially in the program memory in order to form the interrupt vector. Each location contains the starting address of a device-service program. The contents of the selected interrupt vector location are loaded into the program counter, and program control is transferred to the correct device-service program.
Some vectored systems do not transmit an address. They use an I/O device to transmit an instruction to the microprocessor after the request has been acknowledged. Next, the system loads the instruction into an instruction register. Normal operation continues after this instruction is executed. The vectoring is achieved by a jump instruction which derives the jump address from part of the instruction. A unique jump address is defined for each I/O device in the system.
In systems with several sources of interrupt, one or more interrupt requests can occur during the servicing of an earlier request. In the simplest way to handle this, the interrupt mask bit is set when the first request is recognized. The following requests are then placed in a queue, waiting until the service of the first interrupt is complete before they are recognized and serviced. The order in which the queued interrupts are recognized determines the individual delay before service. This order, or priority, is set either by software or by hardware using a priority scheme.
With software priority, after recognizing an interrupt request, the service program can poll the devices in the desired order. The devices which are polled first will be serviced first. In systems with hardware priority, the interrupt logic sends an external signal to the request logic in each of the I/O devices. This signal indicates the state of the interrupt mask bit and is passed to each device according to its priority. When the mask is set, the signal prevents any device from generating an interrupt request.
If the mask is reset and the device has no interrupt request pending, the signal is passed on to the next device. If the device does have a request pending, its interrupt logic generates the interrupt request and prevents the signal from passing on. If more than one device requires interrupt service, the device receiving the control signal first will be serviced first.
Software and hardware priority schemes may be slow to respond to a high-priority interrupt if it occurs during the servicing of a low-priority interrupt. Individual interrupt mask bits for each interrupt request line or each I/O device can then be used. By setting and resetting these individual mask bits under software control, the interrupt priorities are changed to fit the needs of the application. Some microprocessors use one interrupt-request line that has software-controlled mask bits and another line that is permanently enabled. This nonmaskable interrupt request line has the higher priority and is used when service is required immediately.
Some vectored systems use the microprocessor to define and control the priorities. When an interrupt request occurs, it is transferred to the microprocessor and the vectored address is compared with an interrupt-enabling mask. When the vectored address is equal to or less than the mask, the request is recognized. The mask is then set to one less than this address and servicing starts. If the vectored address is greater than the mask, the request is simply queued. Note that only interrupts issued from a device with an address lower than that of the device being serviced are recognized. The lower the address, the higher the priority.
Interrupt Types
The types of interrupts include vectored, nonvectored, maskable, and nonmaskable. A maskable interrupt can be turned off by the processor and is used when a software operation cannot be interrupted. In these cases, the processor is instructed to disable the maskable interrupts.
There is usually a disable-interrupt instruction in the processor's instruction set to do this. A nonmaskable interrupt cannot be turned off. This type of interrupt is designed for critical events such as a power failure.
When an interrupt occurs, the processor can branch to a location that contains the first instruction of the interrupt-service routine. Another approach is to use a location in memory to hold the starting address of the service routine. In most newer processors, a single interrupt service routine is not enough. These processors have several memory locations reserved for the addresses of the interrupt service routines.
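The vectored scheme with an interrupt-enabling mask described above can be sketched as a small simulation. This is only an illustrative model, not the logic of any particular processor: the class name, the handler names, and the device-to-address assignment (disk = 1, serial = 2, printer = 3) are assumptions made for the example. A request is recognized when its vector address is at or below the mask, the mask is then set to one less than that address, and anything else is queued until the mask is restored.

```python
# Illustrative model only: class, handler, and device names are hypothetical.
class VectoredInterruptController:
    """Vector addresses double as priorities: the lower the address, the higher the priority."""

    def __init__(self, vector_table, lowest_priority):
        self.vector_table = vector_table  # vector address -> service routine
        self.mask = lowest_priority       # requests at or below the mask are recognized
        self.pending = []                 # queued requests awaiting service

    def request(self, vector):
        if vector > self.mask:
            self.pending.append(vector)   # lower priority than the current work: queue it
            return
        saved_mask = self.mask
        self.mask = vector - 1            # only lower addresses may interrupt this routine
        self.vector_table[vector](self)   # stands in for loading the program counter
        self.mask = saved_mask            # restore the mask when the routine returns
        # Service queued requests that the restored mask now admits, highest priority first.
        while any(v <= self.mask for v in self.pending):
            ready = min(v for v in self.pending if v <= self.mask)
            self.pending.remove(ready)
            self.request(ready)


def demo():
    log = []

    def disk(ctl):
        log.append("disk")

    def printer(ctl):
        log.append("printer")

    def serial(ctl):
        log.append("serial start")
        ctl.request(3)  # printer requests service: lower priority, so it is queued
        ctl.request(1)  # disk requests service: higher priority, so it preempts
        log.append("serial end")

    ctl = VectoredInterruptController({1: disk, 2: serial, 3: printer}, lowest_priority=3)
    ctl.request(2)
    return log


print(demo())
```

In this run the disk request interrupts the serial routine, while the printer request waits in the queue until the serial routine has finished.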
  • 33. The hardware-selected interrupt scheme uses separate lines for each memory location and its service routine. This is called a hardware-vectored interrupt. The processor can also use an interrupt acknowledge cycle, where the hardware supplies additional information to the processor. This guides the processor to the proper service routine and is called a software-vectored interrupt.
Interrupt Routines
When an interrupt branch occurs for an I/O request for service, either data is waiting for input or output, or there is a problem with the data transfer. The processor is much faster than the peripheral devices, so instead of waiting for the device to get ready for the transfer, interrupts are used. Most I/O devices use buffers to hold the information to be transferred. When the output buffer becomes empty or the input buffer becomes full, the interrupt service routine signals that the data transfer is complete.
The main program will create a buffer and then fill it if it is an output buffer. Next, the interrupts are enabled. The interrupts may be enabled by the processor, if they are maskable interrupts, or they may be enabled by the I/O chip. In some cases hardware between the I/O chip and the processor may have an interrupt enable facility.
High-Level Interrupts
The low-level or machine-level interrupts discussed previously are supported by the interrupt circuits built into the processor. When a microcomputer is executing a program written in a high-level language such as BASIC, thousands of machine instructions are being executed and the registers in the processor are changing frequently. The Microsoft BASIC used in the IBM PC supports user interrupt servicing through its subroutines. The subroutines are invoked with the GOSUB (go subroutine) statement, and control is returned to the main program with RETURN. The user interrupt service routines are a variation of the subroutine. After the interrupts are enabled, the subroutine is invoked when a peripheral causes an interrupt.
The subroutine is written in the high-level language of the computer and is terminated with an interrupt return statement such as RETURN. If the interface is not busy before interrupts are enabled, the interrupt is immediate. The interrupts must be re-enabled in the interrupt service routine if the transfer is not done. This is necessary because the enable is canceled when the routine is invoked; otherwise, the interrupt service routine could itself be interrupted.
Types of Peripheral Interfaces
The three major types of peripheral interfaces are parallel, serial, and analog. Each type also has a number of different variations.
Parallel interfaces are like microprocessor buses. These are often used to interface personal computers to printers. Data is transferred over a set of wires called data lines, like the microprocessor data bus. Parallel interfaces vary in the number of data lines used and the number of signals used for handshaking. Handshaking is a technique used to control the rate at which information moves from one device to another.
Serial interfaces use a single line to transmit one bit at a time. The two types of serial interface are asynchronous and synchronous. The asynchronous interface is more common in microcomputers. The serial interface is often used to interface a mouse or keyboard to a personal computer.
Analog interfaces are different from both serial and parallel interfaces since they do not use digital signals (zero or one). Microprocessor buses use digital signals, and serial and parallel interfaces use digital signals to communicate with peripherals. Analog interfaces must convert digital signals into signals that vary continuously, or convert continuous signals into digital signals.
Serial Interface Standards
A common serial interface standard is RS-232. The main RS-232 signal lines are those used to transmit and receive data (BA and BB). These lines are used to send the serial
  • 34. information between the two communicating systems. The following bit rates (in bits per second) are common: 19,200, 9,600, 4,800 and 2,400. Other rates have also been used in the past.
The other signals of RS-232 are used to indicate the status of the modulator-demodulator (modem) communications link. Signals such as request-to-send, clear-to-send, data-set-ready, and data-terminal-ready are used to control the modem link. The signals between the modem (communications equipment) and the computer (or terminal) implement a handshake similar to that used in other buses. The difference in RS-232 is that the handshake is used only at the beginning and end of a block of serial data. RS-232 has been popular in larger computers, and this popularity has migrated to PC peripheral communications.
PINOUT of the SERIAL PORT (--> direction is out of PC)
(Note DCD is sometimes labeled CD)
Pin#   Pin#    Acronym  Full-Name            Direction  What-it-May-Do/Mean
9-pin  25-pin
3      2       TxD      Transmit Data        -->        Transmits bytes out of PC
2      3       RxD      Receive Data         <--        Receives bytes into PC
7      4       RTS      Request To Send      -->        RTS/CTS flow control
8      5       CTS      Clear To Send        <--        RTS/CTS flow control
6      6       DSR      Data Set Ready       <--        I'm ready to communicate
4      20      DTR      Data Terminal Ready  -->        I'm ready to communicate
1      8       DCD      Data Carrier Detect  <--        Modem connected to another
9      22      RI       Ring Indicator       <--        Telephone line ringing
5      7                Signal Ground
Only 3 of the 9 pins have a fixed assignment: transmit, receive and signal ground. This is fixed by the hardware and you can't change it. But the other signal lines are controlled by software and may do (and mean) almost anything at all. However they can only be in
  • 35. one of two states: asserted (+12 volts) or negated (-12 volts). Asserted is "on" and negated is "off". For example, software may command that DTR be negated, and the hardware only carries out this command and puts -12 volts on the DTR pin. A modem (or other device) that receives this DTR signal may do various things. If a modem has been configured a certain way, it will hang up the telephone line when DTR is negated. In other cases it may ignore this signal or do something else when DTR is negated (turned off).
It's like this for all the 6 signal lines. The hardware only sends and receives the signals, but what action (if any) they perform is up to the software and the configuration/design of devices that you connect to the serial port. However, most pins have certain functions which they normally perform, but this may vary with the operating system and the device driver configuration.
Serial and Parallel Ports
PC communications depend heavily on serial and parallel ports. The serial and parallel ports are the connectors (plug-ins) on the back of the computer. They are an important part of interfacing, since they allow the computer to talk to the outside world. A port is simply a connection, or plug-in, that gives access to the computer. The computer peripherals, which are the devices that extend the usefulness of the computer, such as printers, mice, and modems, all talk to the computer through the communications ports.
When you connect peripherals to communications ports, you must first determine the type of communications that the device uses, either parallel or serial. Then you make the physical connection. This requires the proper port(s) on your computer and the right cable(s). The next step is to inform the software of the connections that were made. The connectors on the computer are referred to as ports. These can be thought of as the passageway through which the signals are sent and received.
Sending signals from the computer is referred to as output, and receiving signals is referred to as input. A printer is an output device, a keyboard and a mouse are input devices, and a modem carries both input and output.
The two basic methods that PCs use to communicate with the outside world are the serial and parallel communication techniques. The main difference between parallel and serial is the way they transmit signals over their cables. Internally, the computer recognizes each character as an 8-bit code. In serial communications the bits are sent one at a time, over a single wire or a pair of wires. In parallel transmission, the bits are sent at the same time over eight different wires. Parallel communications are like an eight-lane expressway with automobiles next to each other in the lanes. Serial communications, on the other hand, are more like railroad cars travelling down a single track.
Parallel ports are like an expressway since they can handle larger volumes of traffic. But, just as vehicles in adjacent lanes can interfere with each other, so can the wires in parallel transmission. The longer the distance becomes, the greater is the chance of interference. Because of this potential for interference, parallel communication is normally used only over short distances. Parallel cables are usually no longer than 15 feet.
Serial communications are more like a train of cars connected together and riding on rails; they do not have the same potential for interference between the signals. This allows serial cables to be used for distances up to 50 feet. Serial communications cannot transmit the same volume as parallel communications because of the number of signal paths used.
Another difference between serial and parallel communications is the ease of use. Parallel communications
  • 36. are more straightforward for the user. All that is needed is to plug them in; there are no transmission parameters to configure and match between the sender and the receiver. Parallel communications always occur in the same way. If a printer has a parallel connector, it just needs to be plugged into a parallel port.
Serial communications are more flexible since they allow a variety of settings. This increases the potential uses but requires the proper settings for transmission. One setting is the speed: effective communication is not possible if the sender uses one speed and the receiver a different speed. The speed of serial communications is called the baud rate, and common speeds are 300, 1200, 2400, 4800, and 9600 baud.
Because of their ease of use, parallel communications have become the method of choice for the majority of IBM-compatible printers. Parallel communications ports are easier to hook up and are ready to be used after the connection is made. (Serial printers are used when the distance to the printer is greater than 15 feet.)
Port Characteristics
Most PCs include one or more parallel ports, which are also called printer ports. These two terms are interchangeable. Serial ports are normally used for a mouse. Interface cards for both are available as expansion boards that can be installed in the expansion slots inside the computer. These cards are usually inexpensive and can be installed easily.
Up to four parallel ports can be installed in a PC. These ports are designated LPT1 to LPT4. Serial ports are numbered COM1 to COM4. If you already have two serial ports in use, be sure any serial cards you add can be configured for COM3 (Communications Port 3) or COM4 (Communications Port 4). Many serial cards are not designed to work as COM3 and COM4. The serial ports on these cards can only serve as either COM1 or COM2.
Serial ports are RS-232 ports. This terminology is more common in larger computers.
The RS means recommended standard, and the 232 is the identification number for the standard that the Electronic Industries Association (EIA) uses. So if someone refers to an RS-232 port, they are talking about a serial port or a COM port. The serial ports on the computer are either 9-pin or 25-pin male connectors. The parallel ports are always 25-pin female connectors.
Identifying the Ports
On the back of your IBM personal computer or compatible, there are several different connectors. The parallel ports are the 25-pin female connectors. These are called DB-25 connectors. On many video cards, the parallel printer port is under the smaller 9-pin video output port. Serial ports use the opposite gender (male pins) to keep you from plugging a cable into the wrong port.
On most newer computers (from the 286 on) the serial ports have a 9-pin male connector (with the pins showing). This type of connector is known as a DB-9 connector. On most XT computers, the serial port is a male 25-pin (DB-25) connector. The newer computers use the 9-pin serial ports to save space. The smaller ports allow two serial ports in the same amount of space. PC serial ports use only the 9 signals listed in the serial port pinout, so the other 16 pins of the DB-25 connector are unused.
There are adapters if you want to connect a device that has a 9-pin serial connector to a PC that has a 25-pin serial connector. This can occur when you try to hook up a serial mouse. These adapters have a 9-pin connector on one end and a 25-pin connector on the other. They are often included with a mouse, or they can be purchased separately.
The other connectors on the computer include a 9-pin video port connector that connects to the monitor. A larger game port connector may also be used to connect a joystick or game paddle.
When you add or change the devices that are hooked up to the computer, you may need to change the software configurations.
This is minimal if the device uses parallel communications on one of the LPT parallel ports. You may be asked to confirm the LPT port that the computer selected for you. But if you want to use a serial communications port for a printer or other device connected to one of the COM serial ports, you will need to set the variable communications parameters.
  • 37. The communication parameters tell the computer which serial port to use, how fast it is, and other parameters such as parity, data bits, and stop bits. These parameters can be set either by an application software package or by the DOS MODE command:
MODE COM2:9600,n,8,1
Receiver / Transmitter ICs
The early serial interfaces used integrated circuit shift registers for the parallel-to-serial and serial-to-parallel conversions. Shift registers for synchronous transmission use a clock to indicate when the next data bit is to be shifted in or out. In asynchronous transmission the start and stop bits are loaded into the shift register and shifted out like the data bits. When shift registers are used for asynchronous reception, circuits are needed to synchronize the receiving shift register with the incoming bits.
Receiver / transmitter chips were introduced during the 1970s. These integrated circuits usually provide two channels of asynchronous and synchronous receivers and transmitters along with bit-rate generators, buffers, and status, interrupt, and DMA control lines. These receiver / transmitter integrated circuits include the General Instrument AY-3-1015D UART, Motorola 6850 ACIA, National Semiconductor 8250 ACE, and Intel 8251A USART. The following definitions are used for these receiver / transmitter chips:
UART   Universal Asynchronous Receiver / Transmitter
ACIA   Asynchronous Communications Interface Adapter
ACE    Asynchronous Communications Element
USART  Universal Synchronous / Asynchronous Receiver / Transmitter
Each of these integrated circuits performs the same task, but the chips differ in their capabilities.
UART Chips
One of the first receiver / transmitter integrated circuits to become popular was the Universal Asynchronous Receiver / Transmitter or UART (pronounced "you-art"). It combines the transmitter and receiver shift registers with other features to simplify serial interfacing.
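The asynchronous framing mentioned above, where start and stop bits are shifted out along with the data bits, can be illustrated with a short sketch. The function names are hypothetical, and the code assumes the usual asynchronous conventions that the text does not spell out: the start bit is a 0, stop bits are 1s, and data bits go out least significant bit first. The parity letters follow the n/e/o style of the MODE command.

```python
def frame_character(value, data_bits=8, parity="n", stop_bits=1):
    """Return the bits placed on the line for one character, in transmission order."""
    if not 0 <= value < (1 << data_bits):
        raise ValueError("value does not fit in the data bits")
    data = [(value >> i) & 1 for i in range(data_bits)]  # least significant bit first
    frame = [0] + data                                   # start bit is a 0
    if parity == "e":
        frame.append(sum(data) % 2)        # make the total count of ones even
    elif parity == "o":
        frame.append(1 - sum(data) % 2)    # make the total count of ones odd
    elif parity != "n":
        raise ValueError("parity must be 'n', 'e', or 'o'")
    frame += [1] * stop_bits               # stop bits are 1s
    return frame


def characters_per_second(bps, data_bits=8, parity="n", stop_bits=1):
    """Character throughput once the framing overhead is counted."""
    bits_per_frame = len(frame_character(0, data_bits, parity, stop_bits))
    return bps / bits_per_frame


print(frame_character(ord("A")))    # frame for the letter A with 8 data bits, no parity, 1 stop bit
print(characters_per_second(9600))  # throughput for the 9600,n,8,1 setting
```

With 8 data bits, no parity, and 1 stop bit, each character occupies 10 bit times on the line, so a 9600 bps link carries at most 960 characters per second.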
The UART has separate receiver, transmitter and control sections. The transmitter and receiver sections operate independently, but they share the control and status pins. The UART is like four separate registers that each have their own control signal line. The two write registers are a transmit-buffer register and a control register, and the two read registers are a received-data buffer register and a status register. Each register has its own data lines and control signal line.
UART CHIPS
UARTs (Universal Asynchronous Receiver and Transmitter) are chips inside communication devices that are responsible for serially transmitting and receiving information. One chip at each end of a serial communication channel handles the communication. These chips convert bytes into bits and send each bit down a line, where another chip transforms the bits back into bytes. UART chips are usually the
  • 38. brains behind communicating serially on a personal computer. The CPU gets an interrupt every time a byte is sent or received. The CPU then moves a received byte out of the UART's register and into memory somewhere, or gives the UART another byte to send.
The most recent UART chip is the 16550A. Some normal chips in a PC are the 2450, 16450, 8250, or 16550A. The 8250 and 16450 UARTs only have a 1-byte buffer. This means that every time a byte is sent or received, the CPU gets an interrupt. For slow communication up to 19200 bps, this is acceptable. However, at high communication rates, the CPU might not have time to service the interrupts sent from the UART in order to receive the information.
The 16550A UART chip is important for high-speed communications because it comes with a 16-byte FIFO. This means the 16550A chip can send and receive up to 16 bytes before it has to interrupt the CPU. The CPU can then transfer all 16 bytes at a time. Although the interrupt threshold on this chip is seldom set right at 16, the chip sends fewer interrupts and relieves congestion at the CPU.
When data is lost because the CPU is unable to service the request of the UART chip, correction algorithms detect the loss and send the data again. This results in slow communication, but not always failed communication. That is why it may not be readily obvious to a user that something is wrong.
UARTs (Universal Asynchronous Receiver Transmitter) are serial chips on your PC motherboard (or on an internal modem card). The UART function may also be done on a chip that does other things as well. On older computers like many 486s, the chips were on the disk I/O controller card. Still older computers have dedicated serial boards.
The UART's purpose is to convert bytes from the PC's parallel bus to a serial bit stream. The cable going out of the serial port is serial and has only one wire for each direction of flow. The serial port sends out a stream of bits, one bit at a time.
Conversely, the bit stream that enters the serial port via the external cable is converted to parallel bytes that the computer can understand. UARTs deal with data in byte-sized pieces, which is conveniently also the size of ASCII characters.
Say you have a terminal hooked up to your PC. When you type a character, the terminal gives that character to its transmitter (also a UART). The transmitter sends that byte out onto the serial line, one bit at a time, at a specific rate. On the PC end, the receiving UART takes all the bits, rebuilds the (parallel) byte, and puts it in a buffer.
Along with converting between serial and parallel, the UART does some other things as a byproduct (side effect) of its primary task. The voltage used to represent bits is also converted (changed). Extra bits (called start and stop bits) are added to each byte before it is transmitted. Also, while the flow rate (in bytes / sec.) on the parallel bus inside the computer is very high, the flow rate out of the UART on the serial port side is much lower. The UART has a fixed set of rates (speeds) which it can use at its serial port interface.
Two Types of UARTs
There are two basic types of UARTs: dumb UARTs and FIFO UARTs. Dumb UARTs are the 8250, 16450, early 16550, and early 16650. They are obsolete, but if you understand how they work it's easy to understand how the modern FIFO UARTs work (late 16550, 16550A, 16c552, late 16650, 16750 and 16C950).
There is some confusion regarding the 16550. Early models had a bug and worked properly only as 16450s (no FIFO). Later models with the bug fixed were named 16550A, but many manufacturers did not accept the name change and continued calling it a 16550. Almost all 16550s in use today are like 16550As. Linux will report it as being a 16550A even though your hardware manual (or a label note) says it's a 16550. A similar situation exists for the 16650 (only it's worse since the manufacturer allegedly didn't admit anything was wrong).
Linux will report a late 16650 as being a 16650V2. If it reports it as a 16650, that is bad news, and the chip is used only as if it had a one-byte buffer.
FIFOs
To understand the differences between dumb and FIFO (First In, First Out queue discipline) UARTs, first let's examine what happens when a UART has sent or received a byte. The UART itself can't do anything with the data passing through it; it just receives and sends it. For the obsolete dumb UARTs, the CPU gets an interrupt from the serial device every time a byte has been sent or received. The CPU then moves the received byte out of the UART's buffer and into memory somewhere, or gives the UART another byte to send.
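To put rough numbers on the per-byte interrupt load just described, the following sketch estimates how many interrupts per second the CPU must service. The function name is hypothetical, and the figures rest on two assumptions: each character takes 10 bit times on the line (8 data bits plus one start and one stop bit), and a FIFO UART raises one interrupt per group of bytes, with the group size passed as the trigger level (14 is used only as an illustrative value).

```python
def interrupts_per_second(bps, bits_per_character=10, trigger_level=1):
    """Estimate the steady-state interrupt rate for a continuous data stream.

    trigger_level=1 models a UART with a one-byte buffer (one interrupt per byte);
    a larger value models a FIFO UART that interrupts once per group of bytes.
    """
    characters_per_second = bps / bits_per_character
    return characters_per_second / trigger_level


for bps in (19200, 115200):
    one_byte = interrupts_per_second(bps)
    fifo = interrupts_per_second(bps, trigger_level=14)
    print(f"{bps} bps: {one_byte:.0f} interrupts/s with a 1-byte buffer, "
          f"{fifo:.0f} interrupts/s with a trigger level of 14")
```

Under these assumptions a one-byte buffer at 19,200 bps means 1,920 interrupts per second, or about one every 520 microseconds, and the load grows in direct proportion to the bit rate.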
  • 39. The obsolete 8250 and 16450 UARTs only have a 1-byte buffer. That means that every time 1 byte is sent or received, the CPU is interrupted. At low transfer rates, this is OK. But at high transfer rates, the CPU gets so busy dealing with the UART that it doesn't have time to adequately tend to other tasks. In some cases, the CPU does not get around to servicing the interrupt in time, and the byte is overwritten because the bytes are coming in so fast. This is called an "overrun" or "overflow".
FIFO UARTs help solve this problem. The 16550A (or 16550) FIFO chip comes with 16-byte FIFO buffers. This means that it can receive up to 14 bytes (or send 16 bytes) before it has to interrupt the CPU. Not only can it wait for more bytes, but the CPU can then transfer all (14 to 16) bytes at a time. This is a significant advantage over the obsolete UARTs, which only had 1-byte buffers. The CPU receives fewer interrupts and is free to do other things. Data is rarely lost. Note that the interrupt threshold of FIFO buffers (trigger level) may be set at less than 14. 1, 4 and 8 are other possible choices. As of late 2000 there was no way the Linux user could set these directly (setserial can't do it). While many PCs only have a 16550 with 16-byte buffers, better UARTs have even larger buffers.
Note that the interrupt is issued slightly before the buffer gets full (at, say, a "trigger level" of 14 bytes for a 16-byte buffer). This allows room for a couple more bytes to be received before the interrupt service routine is able to actually fetch all these bytes. The trigger level may be set to various permitted values by kernel software. A trigger level of 1 will be almost like an obsolete UART (except that it still has room for 15 more bytes after it issues the interrupt).
Now consider the case where you're on the Internet. It has just sent you a short webpage of text. All of this came in through the serial port.
If you had a 16-byte buffer on the serial port which held back characters until it had 14 of them, some of the last several characters on the screen might be missing as the FIFO buffer waited to get the 14th character. But the 14th character doesn't arrive, since you've been sent the entire page (over the phone line) and there are no more characters to send to you. It could be that these last characters are part of the HTML formatting, etc., and are not characters to display on the screen, but you don't want to lose the formatting either.
There is a "timeout" to prevent the above problem. The "timeout" works like this for the receive UART buffer. If characters arrive one after another, then an interrupt is issued only when, say, the 14th character reaches the buffer. But if a character arrives and the next character doesn't arrive soon thereafter, then an interrupt is issued anyway. This results in fetching all of the characters in the FIFO buffer, even if only a few (or only one) are present. There is also a "timeout" for the transmit buffer.
UART Model Numbers
Here's a list of UARTs. TL is Trigger Level.
8250, 16450, early 16550: Obsolete with 1-byte buffers.
16550, 16550A, 16c552: 16-byte buffers, TL = 1, 4, 8, 14.
16650: 32-byte buffers. Speed up to 460.8 kbps.
16750: 64-byte buffer for send, 56-byte for receive. Speed up to 921.6 kbps.
Hayes ESP: 1k-byte buffers.
The obsolete ones are only good for modems no higher than 14.4k (DTE speeds up to 38400 bps). For modern modems you need at least a 16550 (and not an early 16550). For V.90 56k modems, it may be several percent faster with a 16650 (especially if you are downloading large uncompressed files). The main advantage of the 16650 is its larger buffer size, as the extra speed isn't needed unless the modem compression ratio is high. Some 56k internal modems may come with a 16650. Many 486 PCs (old) and all Pentiums (or the like) should have 16550As (usually called just 16550s) with FIFOs.
Some better motherboards today (2000) even have 16650s.
The 8250 SERIAL COMMUNICATIONS CHIP
The 8250 and compatible chips provide nine I/O registers. Certain upward-compatible devices provide a tenth register as well. These registers consume eight I/O port addresses in the PC's address space. The base addresses and BIOS variable locations for these devices are the following:
  • 40.
COM Port   Physical Base Address (in hex)   BIOS variable
COM1:      3F8                              40:0
COM2:      2F8                              40:2
The base address is the first of eight I/O locations consumed by the 8250. The exact purpose of these eight I/O locations appears in the following table:
  • 41.
I/O Address (hex)   Description
3F8/2F8             Receive / Transmit data register. Also the L.O. byte of the Baud Rate Divisor Latch register.
3F9/2F9             Interrupt Enable Register. Also the H.O. byte of the Baud Rate Divisor Register.
3FA/2FA             Interrupt Identification Register (read only).
3FB/2FB             Line Control Register.
3FC/2FC             Modem Control Register.
3FD/2FD             Line Status Register (read only).
3FE/2FE             Modem Status Register (read only).
3FF/2FF             Shadow Receive Register (read only, not available on original PCs).
The following sections describe the purpose of each of these registers.
The Data Register (Transmit / Receive Register)
The data register is actually two separate registers: the transmit register and the receive register. You select the transmit register by writing to I/O address 3F8h or 2F8h, and you select the receive register by reading from these addresses. Assuming the transmit register is empty, writing to the transmit register begins a data transmission across the serial line. Assuming the receive register is full, reading the receive register returns the data. To determine if the transmitter is empty or the receiver is full, see the Line Status Register.
The Interrupt Enable Register (IER)
When operating in interrupt mode, the 8250 SCC provides four sources of interrupt: the character received interrupt, the transmitter empty interrupt, the communication error interrupt, and the status change interrupt. You can individually enable or disable these interrupt sources by writing ones or zeros to the 8250 IER (Interrupt Enable Register). Writing a zero to a bit disables the corresponding interrupt. Writing a one enables that interrupt. This register is read / write, so you can interrogate the current settings at any time (for example, if you want to mask in a particular interrupt without affecting the others).
The layout of this register is shown in a figure that is not reproduced here.
The Baud Rate Divisor
The Baud Rate Divisor Register is a 16 bit register that shares I/O locations 3F8h / 2F8h and 3F9h / 2F9h with the data and interrupt enable registers. Bit seven of the Line Control Register selects the divisor register or the data / interrupt enable registers. The Baud Rate Divisor register lets you select the data transmission rate (properly called bits per second, or bps, not baud). The following table lists the values you should write to these registers to control the transmission / reception rate:
Baud Rate Divisor Register Values
Bits Per Second    3F9/2F9 Value    3F8/2F8 Value
110                04h              17h
300                01h              80h
600                00h              C0h
1200               00h              60h
1800               00h              40h
2400               00h              30h
3600               00h              20h
4800               00h              18h
9600               00h              0Ch
19.2K              00h              06h
38.4K              00h              03h
56K                00h              01h
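The table entries can be reproduced with a short calculation. The sketch below assumes the baud-rate generator is fed by a 1.8432 MHz clock and that the chip divides it by 16, which is the usual PC serial port arrangement but is not stated in the text above. The names `BASE_RATE` and `divisor_bytes` are illustrative only.

```python
# Assumption: 1.8432 MHz UART clock, divided by 16 inside the chip.
UART_CLOCK_HZ = 1_843_200
BASE_RATE = UART_CLOCK_HZ // 16  # 115200 bits per second with a divisor of 1


def divisor_bytes(bps):
    """Return (high_byte, low_byte) to write to 3F9/2F9 and 3F8/2F8."""
    divisor = BASE_RATE // bps
    if not 1 <= divisor <= 0xFFFF:
        raise ValueError("bit rate out of range for a 16-bit divisor")
    return divisor >> 8, divisor & 0xFF


if __name__ == "__main__":
    for bps in (110, 300, 1200, 9600, 19200, 38400):
        hi, lo = divisor_bytes(bps)
        print(f"{bps:>6} bps: 3F9/2F9 = {hi:02X}h, 3F8/2F8 = {lo:02X}h")
```

Under these assumptions the calculation matches the table rows from 110 through 38.4K bits per second. Remember that bit seven of the Line Control Register must select the divisor register before the two bytes are written.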
  • 42.
The Interrupt Identification Register (IIR)
The Interrupt Identification Register is a read-only register that specifies whether an interrupt is pending and which of the four interrupt sources requires attention. The layout of this register is shown in a figure that is not reproduced here. Since the IIR can only report one interrupt at a time, and it is certainly possible to have two or more pending interrupts, the 8250 SCC prioritizes the interrupts. Interrupt source 00 (status change) has the lowest priority and interrupt source 11 (error or break) has the highest priority, i.e., the interrupt source number provides the priority (with three being the highest priority).
The Line Control Register
The Line Control Register lets you specify the transmission parameters for the SCC. This includes setting the data size, number of stop bits, and parity, forcing a break, and selecting the Baud Rate Divisor Register. The layout of the Line Control Register is shown in a figure that is not reproduced here.
The Modem Control Register
The 8250's Modem Control Register contains five bits that let you directly control various output pins on the 8250 as well as enable the 8250's loopback mode. The contents of this register are shown in a diagram that is not reproduced here.
  • 43.
The Line Status Register (LSR)
The Line Status Register (LSR) is a read-only register that returns the current communication status. The bit layout for this register is shown in a figure that is not reproduced here. The data available bit is set if there is data available in the Receive Register. This also generates an interrupt. Reading the data in the Receive Register clears this bit.
The Modem Status Register (MSR)
The Modem Status Register (MSR) reports the status of the handshake and other modem signals. Four bits provide the instantaneous values of these signals; the 8250 sets the other four bits if any of these signals has changed since the last time the CPU interrogated the MSR. The layout of the MSR is shown in a figure that is not reproduced here.
The Auxiliary Input Register
  • 44.
The auxiliary input register is available only on later model 8250 compatible devices. This is a read-only register that returns the same value as reading the data register. The difference between reading this register and reading the data register is that reading the auxiliary input register does not affect the data available bit in the LSR. This allows you to test the incoming data value without removing it from the input register.
82050 Asynchronous Communications Controller
This Intel chip allows asynchronous operation in a 5-bit to 8-bit character format. Odd-, even-, or no-parity generation and detection is allowed at bit rates up to 56 Kb/s. There is a programmable, 16-bit baud rate generator with an on-chip crystal oscillator. It is available in 28-lead DIP and PLCC packages. This CHMOS 82050 asynchronous communications controller is a low-cost alternative to the INS 16450. It emulates the INS 16450 and is compatible with IBM PC software. The 82050 is also used in modems when combined with modem chips like Intel's 89024.
82050 Signals
The three address pins (A2-A0) on pins 24-22 interface with the system address bus to select one of the internal registers for read or write operations.
D7-D0 on pins 1-4 and 25-28 make up the bi-directional, three-state, 8-bit data bus. It allows the transfer of bytes between the microprocessor and the 82050.
A RESET input on pin 17 resets the 82050.
CS* is the chip select on pin 18. A low on this input pin enables the 82050 and allows the following read or write operations:
1. RD* on pin 20 allows the microprocessor to read data or status from the chip.
2. WR* on pin 19 allows the microprocessor to write data or control bytes to the 82050.
INTERRUPT is on pin 5. A high on this output indicates an interrupt request to the microprocessor. The source and cause of the interrupt can be found by reading the status registers.
CLK/X1 on pin 9 is used for the internal system clock.
In the CLK mode an externally generated clock is used. In the X1 mode the clock is generated by a crystal that is connected between the X1 and X2 pins.
OUT2*/X2 on pin 8 is another dual-function pin. OUT2* is a general purpose output that is available when the CLK/X1 pin is driven by an externally generated clock. X2 is an output pin for the crystal oscillator. The configuration of this pin takes place during a hardware reset.
TXD on pin 6 is the TRANSMIT DATA pin. The serial data is transmitted on this output pin starting at the least significant bit.
RXD on pin 13 is used as the RECEIVE DATA pin. The serial data is received on this pin starting at the least significant bit.
RI* on pin 10 is used as a ring indicator input.
DTR* on pin 15 is the DATA TERMINAL READY output. During a hardware reset, this pin is an input used to set the system clock mode.
DSR* on pin 11 is the DATA SET READY input.
RTS* on pin 16 is the REQUEST TO SEND output. During a hardware reset, this pin is an input used to set the system clock mode.
CTS* on pin 14 is the CLEAR TO SEND input.
DCD* on pin 12 is the DATA CARRIER DETECTED input.
Pin 21 is the device power supply and pin 7 is ground.
  • 45. System Interface The 82050 uses a demultiplexed bus interface made up of a bi-directional, three-state, 8-bit data bus and a 3-bit address bus. The Reset, Chip-Select, Read, and Write pins, along with the Interrupt pin, are the other signals needed to interface to the microprocessor. The system clock can be generated externally and sent to the CLK pin. The on-chip crystal oscillator is used by connecting a crystal to the X1 and X2 pins. The 82050 chip along with a transceiver, address decoder, and crystal, complete the interface to the IBM PC bus. Transmitting and Receiving In the 82050, the transmission mechanism involves a section in the chip called the TX machine along with the TXD register. The TX machine reads characters from the TXD register, serializes the bits, and transmits them over the TXD pin according to signals provided by the baud rate generator. It also generates the parity and break transmissions. Receiving involves a section called the RX machine along with the RXD register. The RX machine assembles the incoming characters and loads them onto the RXD register. The RX machine also synchronizes the data, passes it through a digital filter to filter out spikes, and generates the bit polarity. The falling edge of the start bit triggers the RX machine, which then samples the RXD input. When a start bit is detected, the RX machine samples for data bits. If the RXD input is low for an entire character time, then the RX machine sets Break Detect and Framing Error bits in the Line Status Register (LSR) and loads a NULL character into the RXD register. The RX machine then goes into an idle state until it senses a one and it resumes normal operation. Like other I/O-based peripherals, the 82050 is programmed through its registers. The 82050 register set is the same as the 16450 register set to provide compatibility with previous software written for the IBM PC. 
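The work of the TX machine described above, serializing a character least significant bit first and adding parity, can be illustrated with a small model. This is only a sketch: the function name `frame_bits` is hypothetical, and it assumes a single stop bit and an idle-high line with a low start bit, as is conventional for asynchronous serial links.

```python
def frame_bits(char, data_bits=8, parity="even"):
    """Return the list of bit values sent on TXD for one character.

    Assumptions: start bit = 0, one stop bit = 1, data sent LSB first.
    parity may be "even", "odd", or "none".
    """
    if not 5 <= data_bits <= 8:
        raise ValueError("character format must be 5 to 8 bits")

    # Data bits, least significant bit first.
    data = [(char >> i) & 1 for i in range(data_bits)]
    bits = [0] + data  # start bit, then data

    if parity == "even":
        bits.append(sum(data) % 2)        # makes the total count of ones even
    elif parity == "odd":
        bits.append((sum(data) + 1) % 2)  # makes the total count of ones odd
    elif parity != "none":
        raise ValueError("parity must be 'even', 'odd', or 'none'")

    bits.append(1)  # stop bit
    return bits


if __name__ == "__main__":
    print(frame_bits(0x41))  # the letter 'A' with even parity
```

The RX machine performs the reverse operation: after the falling edge of the start bit it samples the data bits in the same order and checks the parity bit against the received data.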
Parallel Interfacing Techniques
Microcomputer interfaces are designed to link microprocessor buses with peripheral devices. They can take the form of a board plugged into the microprocessor bus or they can be built into the main circuit board. Built into the interface board or the main circuit board is a connector to link the interface to the peripheral. The interface depends on the signals that are passed through the connecting cable and the circuits on the interface board or main board that generate these signals. A simple parallel interface can be built with a single TTL integrated circuit. Other parallel interfacing techniques, such as IEEE-488 or SCSI, require complex circuitry.
Parallel interfaces have two major distinguishing features. The first is the data path or width, which is the number of bits transferred in parallel by the interface. The second is the type of handshake used to co-ordinate the movement of these bits between the computer and the peripheral.
The data width can range from a single bit to 128 bits or wider. The most common size for microcomputers is an 8-bit data path. This allows the microprocessor to transfer an 8-bit data word over the interface during each transfer. The 8-bit parallel interface is also used by 16- and 32-bit microprocessors. This is because of the large number of 8-bit peripherals available. These devices, such as printers, were originally designed for 8-bit computers. Another reason is that ASCII, the most common character code, requires at least a 7-bit interface. Larger data words are also transferred where higher speeds are required.
Handshaking
The type of handshake used to move information over the data lines can be classified by the number of wires dedicated to the handshaking operation. This results in zero-wire handshakes, one-wire handshakes, two-wire handshakes and three-wire handshakes. Within these classifications there are variations on how the wires are actually used for the handshake.
This includes how they are pulsed or interlocked.
The zero-wire handshake is the simplest of these interfaces. It uses an 8-bit latch to store the state of the processor's data bus. On the rising edge of a write signal, the latch takes the states of the data bus lines and stores them.
  • 46.
The states are reproduced on the output lines of the latch after a read signal occurs. A 16-bit interface can be built by adding a second 8-bit latch. This type of zero-wire handshake, parallel-output interface can be used to drive simple outputs for lights or relays. Devices such as these do not have handshaking requirements. Each of the output lines from the interface can be used to drive a light or relay.
The single-wire handshake requires adding another wire to indicate when data is valid on the data lines. This signal has the effect of stretching the write pulse from the microprocessor. This allows slower devices to respond to the write pulse and provides some settling time.
A two-wire handshake adds another line so the receiving device can indicate when it is ready for data. This provides a true handshake using this acknowledge line, and full interlocking is possible. The two-wire handshake is adequate for interfacing a single peripheral, but some interfaces use a third wire to create a protocol that allows several peripherals to use the interface. An example of this is the IEEE-488 bus.
Zero-Wire Handshake
In this relatively simple interface, an 8-bit latch stores the signals from the microprocessor's data bus. An address circuit is also used to interface to the microprocessor. This can take the form of a NAND gate, as shown in the figure. The gate has two inputs. The address valid input goes true when the address of the circuit appears on the microprocessor address bus. The generation of this address may require a comparison of the states of address and control lines in the microprocessor. This is usually done with exclusive-OR gates. The write input may be generated by the microprocessor or it may be decoded from the microprocessor address and control-line states. When both the write and address valid signals are high, the output of the NAND gate is low.
When the write input goes low, the output of the NAND gate goes high, which is the idle state.
The zero-wire-handshake parallel-output interface is used to drive simple peripherals such as lights or relays. Each of the output lines from the interface is used to drive a relay, incandescent lamp, or light-emitting diode (LED). If the light or relay is connected between the latch outputs and a +5 volt power supply, then the lamp or relay will be on when that bit in the latch is a zero.
  • 47.
Most TTL circuits use this approach since the TTL outputs are a better sink for current than a source for current.
A zero-wire-handshake parallel-input interface can also be used. It is similar to the zero-wire handshake output interface, since a NAND gate is used with one of the inputs connected to the address valid line. The other input is connected to a microprocessor read line. When both inputs to the NAND gate are true, the latch or buffer is enabled and the data at the buffer inputs is placed on the microprocessor data bus. This occurs during the read cycle of the microprocessor.
This type of zero-wire-handshake parallel interface can be used to read the status of a bank of switches. Each of the buffer inputs goes through a switch and a resistor. These resistors act as current limiters and pull-up resistors. They pull the buffer inputs up closer to +5 volts when the switch is open. The buffer has some resistance to ground, so the resistor is used to swamp this resistance. These resistors also limit the current when the switch is closed, since the +5 volts would otherwise be grounded. A read at the desired address by the microprocessor allows the states of the input lines on the buffer to be placed on the microprocessor data bus so that the processor may read them. The input lines are controlled by the states of the switches.
The IBM PC and clones use an encoded keyboard which is connected to the system unit with a 5-pin input-output plug. Two of the pins provide the power (+5 volts and ground). The other three are left to provide the interface between the keyboard and the system board. This is done with serial transmission through the keyboard cable. In the keyboard, a key depression causes the encoded circuits to generate the ASCII code for the key. The keyboard feeds its ASCII output to the system unit. Most keyboards use a keyboard processor, like an 8048 microprocessor.
The 8048 has an 8-bit microprocessor and 2 Kbytes of ROM. The ROM is preloaded with a character code known as a scan code.
Debouncing
The mechanical contact that occurs when you strike a key can generate oscillations. When a key is pressed and makes the metallic connection, there is a short period of oscillation until the connection is completed. This usually lasts for a few milliseconds. During this time the keyswitch voltage is not stable and it oscillates between the two switching voltages. The same type of oscillations occur when the key is released. (In nonencoded keyboards a resistor and capacitor can be used as a filter to reduce these oscillations. In the encoded keyboards used in the IBM PCs and clones, a delay of a few milliseconds is used before the keystrike is encoded. The delay is usually made with a programmed loop. This inhibiting of the key action during the switch bouncing is called debouncing.) The 8048 microprocessor performs this debouncing by generating an interrupt during the time the keyboard voltage is bouncing.
One-Wire Handshake
Most peripheral devices like printers have timing requirements for the various operations. A single handshake wire can be used to indicate when information is valid on the data lines.
The one-wire handshake is the next step up from the zero-wire handshake. In the zero-wire handshake interface the output latch or input buffer was controlled by a single wire tied to the clock input on the latch or the enable input on the buffer.
A one-wire parallel-output interface can be built from a zero-wire parallel-output interface. A new signal for the peripheral-write-pulse handshake can be generated with flip-flops.
  • 48.
The end of the microprocessor write signal is used to clock the first D flip-flop. The D input to this flip-flop is always high, so the flip-flop is set when a positive-going signal appears at its clock input. This flip-flop indicates that an output operation has been started.
A one-wire-handshake parallel-input interface can also be built from a zero-wire-handshake input interface. The peripheral needs to send the microprocessor some information at a particular time, so it places the information on the peripheral data lines and then sends the peripheral strobe. This allows the latch to hold the information on the peripheral data lines and sets the interrupt flip-flop. The interrupt takes the microprocessor into an interrupt service routine, which forces the processor to read the input interface. The read operation places the contents of the latch on the data bus and resets the interrupt flip-flop.
Two-Wire Handshake
The single-handshake interface does not indicate if the peripheral device is ready for a data transfer. The single handshake presents the message and assumes the peripheral is ready to accept the data. Multiple-wire handshakes are usually implemented with integrated circuits designed for the interface instead of using latches, buffers, flip-flops, and gates.
One type of two-wire handshake interface for parallel output ports uses a pulsed handshake. The interface places the data to be output on the data lines and then a strobe pulse is sent. This is the same as a one-wire handshake. The additional line is used by the peripheral as an acknowledge signal. It indicates that the peripheral has accepted the information. It is also used to signal that the peripheral is ready for another data transfer. Both of these are signaled by the falling edge of the acknowledge pulse.
An interlocked handshake with unique state conditions is needed. If the strobe and acknowledge are overlapped as shown in the figure, then the two handshake lines are interlocked.
The strobe and acknowledge timings start with data being placed on the data bus and then the strobe is turned on, starting the transfer. The strobe is held on while acknowledge is switched on. Then strobe is turned off, followed by acknowledge, to end the cycle.
The two-wire handshake parallel-input interface is similar to the two-wire output interface. In a pulsed two-wire parallel input handshake, the interface asks for information from the peripheral by sending the strobe. The peripheral places the information on the data lines and sends acknowledge. The interface uses a data latch to hold the data during the acknowledge pulse.
In an interlocked, two-wire input handshake, the interface asks for information from the peripheral by sending the strobe. The strobe remains on and overlaps the acknowledge, which is turned on when the peripheral places the information on the data lines. These lines remain valid until strobe is turned off. This allows the microprocessor enough time to read the data lines, so no latch is needed. When the data is accepted by the processor, the strobe is turned off. The peripheral then turns off acknowledge and completes the transfer.
Centronics Parallel Printer Interface
The Centronics printer interface is an 8-bit parallel connection that uses a three-wire handshake. This interface does not support device addresses, so only one device can be connected to the output port. The following signals are used in this interface.
  • 49.
Signal         Function
STROBE         Starts the reading of data, initiated by the computer.
ACK            Indicates that the printer has received data and is ready to accept the next data.
BUSY           Indicates that the printer cannot receive data.
PE             Indicates that the printer is out of paper.
SELECT         Indicates that the printer is online.
DEMAND         Inverse of the BUSY signal.
INPUT PRIME    A pulse from the computer that initializes the printer.
FAULT          Indicates that the printer is in the error mode.
Types of Parallel Interfaces
The nonprogrammable parallel interface is used for the simplest applications. It performs the basic bus-interface functions and although it can include some interrupt-request control logic, it operates as a simple parallel I/O port.
A hardware-programmable interface includes decoding logic, addressable parallel I/O ports, and interrupt-control logic. External wiring or switches are used to determine the address, data direction, and width of each port, and to control the operations of the interface.
The most popular type of general-purpose parallel interface is software-programmable. Here the computer software determines how the interface is structured. The interface is controlled by the contents of a control register which is loaded and updated by the software program. This type of interface can also include another control register called the data-direction register. It allows the input or output function of individual I/O lines to be selected by the program.
The programmable input/output interface chips are not based on an industry standard. Since no standards have been established for these devices, the component manufacturers use various names for them. PIO is sometimes used to designate this general class of programmable I/O devices. The various differences are found in the manufacturer's literature. These PIO programmable interface devices provide the basic input and output functions for a parallel data interface.
In order to connect an input or output device to a microprocessor data bus, the minimum connection requires latches for the inputs and outputs. An input latch must hold the data long enough for the microprocessor to read the data and it also isolates the signals from the bus. The output latches must hold the output data long enough for the output device to make use of it. The data on a typical microprocessor bus may be valid for a time period that is too short for many input/output devices to react and make use of it.
This basic type of general-purpose parallel I/O interface thus requires at least one input register, one output register, status bits, and some interrupt control. There are at least 16 or 24 I/O lines in these general purpose interface chips to provide a number of channels.
  • 50.
These channels, which are also called ports, usually provide an 8-bit single-byte connection which is configured as some combination of inputs or outputs. One or more command registers may be used to specify the configuration of the ports and the operation of the control logic. The use of a data-direction register allows you to define these ports bit by bit, with each bit configured as an input or output in the combination desired. Each bit of the data-direction register specifies if a corresponding bit of the PIO port will be an input or an output. The use of a 0 in the data-direction register may specify an input, while a 1 specifies an output.
The typical PIO multiplexes its connections to the microprocessor data bus into two or more of the 8-bit ports. The maximum is three, because of the control and address lines for the I/O devices, using a 40-pin package for the PIO. A typical PIO configuration is shown in the figure. The device has two ports and each has its own buffer and function or direction register. A status or mode register is used to indicate the status of each 8-bit port.
Using the PIO
In order to use a PIO, the microprocessor must execute the following operations: (1) load the control registers to specify the mode in which the control signals operate and (2) load the direction registers to specify the direction which the lines (which make up the ports) will use. These operations must be done for every port in the interface. To load a register, the microprocessor selects one of the internal registers with the appropriate pattern on the address bus and then supplies the 8 data bits to be transferred into that register on the data bus. The multiplexer in the chip will gate the 8-bit data to the register. The microprocessor must also generate the read or write signal on the control bus. To read the status from the chip, the contents of the status register are read.
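The effect of a data-direction register can be shown with a toy software model of one 8-bit port. This is a sketch rather than a model of any particular chip: the class and attribute names are hypothetical, it uses the convention mentioned above (0 = input, 1 = output), and it assumes that output bits read back from the output latch.

```python
class PioPort:
    """Toy model of one 8-bit PIO port with a data-direction register."""

    def __init__(self):
        self.ddr = 0x00           # assumed power-up state: all lines are inputs
        self.output_latch = 0x00  # value last written by the microprocessor
        self.pins_in = 0x00       # levels driven onto the pins by the peripheral

    def write_ddr(self, value):
        """Load the direction register: 1 = output, 0 = input (per bit)."""
        self.ddr = value & 0xFF

    def write_data(self, value):
        """Load the output latch; only bits configured as outputs reach the pins."""
        self.output_latch = value & 0xFF

    def read_data(self):
        """Output bits come from the latch, input bits from the pins."""
        from_latch = self.output_latch & self.ddr
        from_pins = self.pins_in & ~self.ddr & 0xFF
        return from_latch | from_pins


if __name__ == "__main__":
    port = PioPort()
    port.write_ddr(0xF0)   # upper four lines are outputs, lower four are inputs
    port.write_data(0xA5)
    port.pins_in = 0x0C    # peripheral drives the lower lines
    print(hex(port.read_data()))
```

Once the direction register has been loaded, a single read or write of the data register is enough to exchange data with the peripheral.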
After the chip has been configured with its control and direction registers loaded, no additional changes are normally necessary and the microprocessor will communicate with the data buffers using a single instruction.
The trend in interfaces has been toward more programmed functions. Higher levels of integration result in more functions per chip. This trend has been occurring in most of the newer microprocessor chips.
Programmable Interfaces
A programmable interface to connect three LED display digits would use three I/O ports, one for each digit. A scanned multiplex system could also be used which would use two ports but more software. There are a number of chips that can be used for this simple application and, depending on which type is used, it will tend to have certain characteristics that may differ from other chips.
  • 51.
The chip designed for the 6800 microprocessor family is the 6820 peripheral interface adapter (PIA). Each 6820 is a double port device with two sets of eight output lines. In an application like this requiring three digits, one-and-a-half PIA chips are needed to service the display.
Each chip has two data registers which are called peripheral registers in the PIA. One of these registers is used for each set of input/output lines. There are also two other types of registers used with each peripheral register. This gives a total of six registers in each chip. One of these is the data direction register which controls the directions of the input/output lines. Each data direction register has eight bits, one for each input/output line.
Since the PIA has six registers and only two register select (RS) pins, the data and data direction registers in each port share the same address. They differ by the value of bit 2 of the control register. Table 6.3 indicates how the registers are selected using the RS1 and RS0 pins and the state of the internal bit 2 of the control register.
Since the PIA cannot drive a heavily loaded data bus with many connections, it is sometimes required to buffer the data bus to this chip using a tristate buffer.
8255 Programmable Peripheral Interface (PPI)
This is a parallel I/O chip with four registers. The interface to the microprocessor is made up of a chip-select pin (CS), two address pins (A0 and A1), three control pins [READ (RD), WRITE (WR), and RESET] and eight bidirectional data pins (D0 through D7).
The I/O pins are grouped into four ports: Port A, Port B, Port C Upper, and Port C Lower. Ports A and B are 8 bits wide, while ports C Upper and C Lower have 4 bits each. This provides a total of 24 I/O pins.
Control of the registers depends on the state of the inputs shown as follows:
A1  A0  RD*  WR*  CS*  Operation
0   0   0    1    0    Read Port A
0   1   0    1    0    Read Port B
1   0   0    1    0    Read Port C
0   0   1    0    0    Write to Port A
0   1   1    0    0    Write to Port B
1   0   1    0    0    Write to Port C
1   1   1    0    0    Write to Control Register
X   X   X    X    1    Not recognized
1   1   0    1    0    Illegal
X   X   1    1    0    Not recognized
X indicates that the pin may assume either level.
There are three modes of operation:
Mode 0  The basic input and output mode for all 24 I/O pins, also called the bit I/O mode.
Mode 1  Provides a strobed input/output (Port C is used for control and status).
Mode 2  The bidirectional data bus mode (five bits on Port C are used for handshaking).
Ports A and B can be set to the various modes as needed, but Port C Upper depends on how Port A is set and Port C Lower depends on how Port B is set. Programming takes place by sending a control word from the microprocessor through the 8255 data bus.
In the mode definition control word, the bits are used as follows. Bit 7 is set to "1" to trip the mode-active flag. Bits 5 and 6 are used to set the Port A mode. Bit 4 is the Port A data-direction bit; it determines if the Port A pins are inputs or outputs. Bit 3 determines the direction of the Port C Upper pins. Bit 2 is the mode-select bit for Port B. Port B cannot be used in mode 2, which is the bidirectional-bus mode. Bit 1 determines the direction of the Port B pins and Bit 0 determines the direction of the Port C Lower pins.
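The bit assignments of the mode definition control word can be collected into a small helper. This is a sketch with a hypothetical function name. The text above does not give the polarity of the direction bits, so the code assumes the convention of the Intel 8255A data sheet, where a direction bit of 1 selects input and 0 selects output.

```python
def mode_word(a_mode=0, a_input=False, c_upper_input=False,
              b_mode=0, b_input=False, c_lower_input=False):
    """Build an 8255 mode definition control word.

    Assumption: a direction bit of 1 means input and 0 means output.
    """
    if a_mode not in (0, 1, 2):
        raise ValueError("Port A supports modes 0, 1 and 2")
    if b_mode not in (0, 1):
        raise ValueError("Port B supports modes 0 and 1 only")

    word = 0x80                       # bit 7 = 1 marks a mode definition word
    word |= a_mode << 5               # bits 6 and 5: Port A mode
    word |= int(a_input) << 4         # bit 4: Port A direction
    word |= int(c_upper_input) << 3   # bit 3: Port C Upper direction
    word |= b_mode << 2               # bit 2: Port B mode
    word |= int(b_input) << 1         # bit 1: Port B direction
    word |= int(c_lower_input)        # bit 0: Port C Lower direction
    return word


if __name__ == "__main__":
    # Mode 0 everywhere, Port A as output and Port B as input.
    print(hex(mode_word(b_input=True)))
```

The resulting byte is written to the control register, which the table above selects with both address pins high.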
  • 52.
Along with this control word, a bit set/reset control word is used. This allows the Port C pins to be set or reset for status and control for Ports A and B. Bit 7 is set to "0" for the bit control. It is "1" for mode select operations. Bits 4, 5 and 6 are not used in the bit control word. Bits 1 through 3 specify which Port C bit is to be used and bit 0 sets the state of the bit.
Disk Drives
Most disk drives are thought of as parallel interfaces. Actually, the floppy disk drive has a parallel interface for the control signals, but the data is transferred in serial mode. Hard disk drives are a true parallel interface since both the control and data signals use individual connections.
Floppy Disk Drives
In a floppy disk drive there is a circuit board that handles the drive's mechanical operations and interprets inputs from the drive's sensors. Signals to and from the computer's main board take place over a single, 34-pin ribbon cable. The 34-pin configuration is standard for IBM PCs and compatibles. A separate four-conductor cable supplies power to the drive.
The floppy drive needs to control three main mechanisms: the R/W heads, stepping motor, and spindle motor. This is done with a disk controller IC which handles the communications with the motherboard as well as with the drive's sensors. Older drives use several ICs for these operations but newer drives use more integrated devices that combine most or all of the functions into a single IC chip.
The floppy drive uses four sensors: index sensor, disk-in-place sensor, write-protect sensor and track 00 sensor. The index sensor is an optical sensor that monitors the diskette's rotation. An index wheel spins along with the disk and causes a pulsed signal to be sent to the controller chip. If the pulses indicate that the disk speed is not correct, the controller changes the spindle motor speed to hold the speed at 300 or 360 rpm. The disk-in-place sensor provides a signal to indicate that a disk is in the drive.
This keeps the drive from operating without a disk. The write-protect sensor checks the write-protect notch. When this notch is uncovered, the drive does not allow any write operations to take place; the disk can only be read. The track 00 sensor is used to generate a signal when the R/W heads are in the track 00 position. This is done to initialize the heads to a fixed starting location.
The transfer of information in or out of a drive involves the interaction of the microprocessor and the floppy drive controller. The overall operation of the drive is handled by the floppy drive controller, which may plug into or be a part of the system board. The microprocessor does not interact directly with the floppy drive. It directs the controller to start the data transfer in or out of the floppy drive. The instructions or routines needed to operate the floppy drive are fetched by the microprocessor from the BIOS ROM on the system board.
Data being loaded into a floppy drive is taken one byte at a time from system RAM by the floppy disk controller and converted into serial form. The bits are sent as serial data over the drive cable. Other control signals are needed to handle the drive's motors and sensors. When the data bits arrive at the floppy drive, they are converted into magnetic recording signals so they can be written to the disk.
When data is read from the floppy disk, the process involves finding the desired program or file. The floppy disk controller must seek the track and sector with the recorded data. After the starting location is found, the disk's read-write head produces signals from the recorded data. These low-level signals are amplified and then converted into standard digital logic levels. The digital data is sent in serial format over the cable to the floppy disk controller. The controller converts the serial data into parallel words while deleting the housekeeping information, and sends the data to RAM.
Drive Interface
The connections between the floppy disk controller and the floppy drive unit form the drive interface. This is a standard set of connections used by most floppy drives and controllers. The standard interface allows any floppy drive to operate in the computer as long as it uses the standard interface. The interface is made up of two cables, power and signal. The signal cable pinout is shown in Table 6.4. The power connector is a 4-pin, mate-n-lock-type connector. The digital signals use +5.0 Vdc (pin 4, pin 3 return) in most desktop systems, although +3.3 or +3.0 Vdc is used in some portable computers. The motors normally operate on +12 Vdc (pin 1, pin 2 return). A return or ground lead is provided for each supply in the connector.
In the 34-pin signal connector, the odd-numbered pins are ground lines, while the even-numbered pins are used for the signals. Up to four drive-selection inputs, DRIVE SELECT 0* through DRIVE SELECT 3*, are used to determine which drive in the system is active. Smaller computers will not use all of these lines. A MOTOR ON* signal is used to start the drive spindle motor turning. This signal must be negative true before a read or write operation can take place. The head direction is controlled by a DIRECTION SELECT* signal that tells the head stepping motor to move in toward the center of the disk or out toward the edge of the disk. A STEP* pulse controls the number of steps that the head stepping motor must take. Together, STEP* and DIRECTION* position the R/W heads on the disk.
A WRITE DATA line records information on the disk, and a WRITE GATE* signal is used to enable the drive to accept data on the WRITE DATA line. The IN USE/HEAD LOAD* signal indicates that the read/write head is busy. The WRITE PROTECT* output prevents writing to the disk if the write protection notch is covered. When a read takes place, the data is sent on the READ DATA line. The DISK CHANGE/READY signal tells when the disk is ready for a read or write operation. The SIDE SELECT* input determines which side of the disk is written to or read from.
The output signals include a NORMAL/HIGH-DENSITY* signal that tells the floppy drive controller IC what type of media is currently in use. The INDEX* signal is actually a stream of negative indexing pulses. These are sent to the floppy drive controller to regulate the spindle speed at the proper value. The TRACK 00* signal indicates that the head is at track 00 on the disk.
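How DIRECTION* and STEP* work together can be illustrated with a small Python sketch. It assumes that track numbers increase toward the center of the disk (so track 00 is the outermost track) and that one STEP* pulse moves the head by one track; the function name and the "in"/"out" labels are hypothetical and only serve the illustration.

```python
def plan_seek(current_track, target_track):
    """Return (direction, step_pulses) needed to move the head.

    direction is "in" (toward the center), "out" (toward the edge),
    or None when the head is already on the target track.
    Assumes one STEP* pulse per track and track 00 at the outer edge.
    """
    if current_track < 0 or target_track < 0:
        raise ValueError("track numbers must be non-negative")
    step_pulses = abs(target_track - current_track)
    if target_track > current_track:
        direction = "in"
    elif target_track < current_track:
        direction = "out"
    else:
        direction = None
    return direction, step_pulses
```

Under these assumptions, moving from track 10 to track 4 requires the direction line set for "out" and six step pulses. A controller would typically step outward until TRACK 00* goes active to establish a known starting position before the first seek.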
Pinlist for IBM PC Floppy Drive Interface
Pin  Function                 Pin  Function
 2   Normal / high density*    1   Ground
 4   In use / head load*       3   Ground
 6   Drive select 3*           5   Ground
 8   Index*                    7   Ground
10   Drive select 0*           9   Ground
12   Drive select 1*          11   Ground
14   Drive select 2*          13   Ground
16   Motor ON*                15   Ground
18   Direction*               17   Ground
20   Step*                    19   Ground
22   Write data               21   Ground
24   Write gate*              23   Ground
26   Track 00*                25   Ground
28   Write protect*           27   Ground
30   Read data                29   Ground
32   Side select*             31   Ground
34   Disk change / ready*     33   Ground
Floppy Disk Controllers
The Intel 82077 is a single-chip floppy disk and tape drive controller for the PC-AT and PS/2 buses. The 82077 needs only a 24-MHz crystal, a resistor array, and chip select circuits to implement the floppy-disk controller. The drive control signals are decoded and buffered. There is an analog data separator for motor speed control and a 16-byte FIFO (First-In-First-Out) register. All command parameters and data transfers go through the FIFO.
Controller Interface
The following signals make up the controller interface. CS* on pin 6 is used to decode the base address range. A0, A1, and A2 on pins 7, 8 and 10 are used to select one of the chip's registers as shown in the following table:
A2  A1  A0  Read / Write   Selected Register
0   0   0   Read           Status Register A
0   0   1   Read           Status Register B
0   1   0   Read / Write   Digital Output Register
0   1   1   Read / Write   Tape Drive Register
1   0   0   Read           Main Status Register
1   0   0   Write          Data Rate Select Register
1   0   1   Read / Write   Data (FIFO)
1   1   0                  Reserved
1   1   1   Read           Digital Input Register
1   1   1   Write          Configuration Control Register
The following pins are used for the data bus:
DB0 - 11    DB4 - 17
DB1 - 13    DB5 - 19
DB2 - 14    DB6 - 20
DB3 - 15    DB7 - 22
RD* on pin 4 is the READ control input and WR* on pin 5 is the WRITE control input. RDDATA on pin 41 is the READ DATA input. It provides serial data from the disk. INVERT affects the polarity of this signal. WP on pin 1 is the WRITE PROTECT input. It indicates if the disk drive is write-protected. DSKCHG on pin 31 indicates that a DISK CHANGE has occurred. This means that the disk is now ready for a read or write.
DRQ on pin 24 is the DMA REQUEST signal, which is sent out to request service from a DMA controller. DACK* on pin 3 is the DMA ACKNOWLEDGE control input used in DMA cycles. TC on pin 25 is the TERMINAL COUNT control signal sent from a DMA controller to end the disk transfer. DACK* must be active to use this signal. INT on pin 23 is the INTERRUPT output. It signals a data transfer in the non-DMA mode.
DENSEL on pin 49 is used as the DENSITY SELECT. It indicates if a low (250/300 Kbps) or high (500 Kbps/1 Mbps) data rate is selected. The polarity of the DENSEL pin is controlled with the IDENT pin after a hardware reset. DRV2 on pin 30 indicates if a second drive is installed, and its state is reflected in Status Register A. DRATE0 and DRATE1 on pins 28 and 29 indicate the contents of bits 0 and 1 of the Data Rate Register. INDX on pin 26 is the INDEX input. It indicates the beginning of the track. TRK0 on pin 2 stands for the TRACK0 control line. It indicates that the head is on track 0. The chip runs on +5 volts on pins 18, 40, 60, and 68.
The ground pins are 9, 12, 16, 21, 36, 50, 54, 59, and 65. AVCC on pin 46 is used for the analog supply and AVSS on pin 45 is used for the analog ground.
Hard Drives
Hard drives usually require a read/write controller, a head actuator/driver, a spindle motor controller, and a disk interface controller. Data enters and leaves the hard drive through the disk interface controller. This controller is designed for the drive's interface. Most early drives under 40 MB used the Seagate ST-506 interface. The ESDI (Enhanced Small Disk Interface) doubled the transfer rate to 10 megabits per second, which allowed more data on the hard disk. Both of these use a 34-pin cable for the drive control signals, similar to a floppy drive, and a 20-pin cable for the data transfers. The IDE and SCSI interfaces are later standards used in most current hard drives. The disk interface controller also controls the head actuator driver circuit and the spindle motor driver.
The read/write controller works with the head preamplifier and drive circuits to convert the analog waveforms from the read heads into standard logic levels. The read/write controller separates the clock and synchronization signals from the actual binary data. When data is written to the disks, the read/write controller generates the write signals that are amplified by the write drive circuits.
Built into the hard drive circuitry is a small microprocessor that coordinates the drive's operations by synchronizing the disk interface controller and the read/write controller. This microprocessor is also used for disk spinup and spindown, as well as other safety
control features that the drive might have. Some drives use a custom version of a microprocessor called a micro-controller. Other hard drives use a standard microprocessor. For the small drives, such as the 1.3-in units in small portable computers, these circuits are integrated onto one or two complex surface-mount ICs.
A data transfer starts when the main board microprocessor initiates a command to the hard drive controller. In many systems a system controller chip actually drives the hard drive controller. Any parameters that are needed to control the hard drive are taken by the microprocessor from the BIOS ROM. The hard drive controller interfaces the system buses (control, address, and data) to the drive's interface. Data and commands from the drive are converted into computer bus signals by the hard drive controller. The control circuits on the hard drive are used to operate the drive's mechanical functions and to convert the digital information from the interface into magnetic flux patterns that are recorded on the disk. This process of recording and data transfer is reversed for a read operation, where the flux patterns are amplified and interpreted for the microprocessor.
IDE Drives
IDE, which stands for Intelligent Drive Electronics or Integrated Drive Electronics, is a popular interface in personal computers for connecting hard drives, especially the newer, smaller drives. The circuits needed to operate an IDE drive are on a circuit board which is part of the hard drive assembly. The software routines needed to communicate with the IDE drive are stored in the BIOS ROM on the system board.
The IDE interface connects the hard drive to the system board with a 40-pin connector. The signal cable typically uses a 40-pin insulation displacement connector (IDC). All signals on the IDE interface are TTL-compatible: a logic zero is 0.0 to +0.8 Vdc, and a logic one is +2.0 Vdc to Vcc. There is also a 4-pin power cable in addition to the 40-pin signal cable.
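The TTL thresholds quoted above can be expressed as a small classification function. This is a minimal sketch: the function name is hypothetical, Vcc is assumed to be +5.0 V unless stated otherwise, and voltages between the two bands (or outside 0 to Vcc) are reported as undefined rather than guessed.

```python
def ttl_logic_level(voltage, vcc=5.0):
    """Classify a voltage using the levels given in the text.

    Returns 0 for 0.0 to +0.8 V, 1 for +2.0 V to Vcc,
    and None for anything else (undefined or out of range).
    """
    if 0.0 <= voltage <= 0.8:
        return 0
    if 2.0 <= voltage <= vcc:
        return 1
    return None
```

With these thresholds, 0.4 V reads as logic zero, 3.3 V reads as logic one, and 1.5 V falls in the undefined region between the two bands.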
The signal cable pinouts are shown in Table 6.5. The power connector is a 4-pin mate-n-lock-type connector. IDE hard drives normally use +5 Vdc (pin 4) and +12 Vdc (pin 1). In some low-voltage systems, +3.0 or +3.3 Vdc is used instead of +5.0 Vdc. The return lines for each supply are also part of the power connector (+5 V return, pin 3; +12 V return, pin 2).
The IDE interface provides sixteen bidirectional data lines (DD0 to DD15) to move data bits in and out of the drive. IORDY (I/O Ready) is used to indicate to the drive that a data transfer is needed. The direction of the data transfer is set with DIOR* and DIOW*. IOCS16* is the 16-bit I/O control signal. It tells the microprocessor that the drive is ready to send or receive data.
Pinlist for IDE Hard Drive Interface
Pin  Function        Pin  Function
 2   Ground           1   Reset*
 4   DD8              3   DD7
 6   DD9              5   DD6
 8   DD10             7   DD5
10   DD11             9   DD4
12   DD12            11   DD3
14   DD13            13   DD2
16   DD14            15   DD1
18   DD15            17   DD0
20   Connector key   19   Ground
22   Ground          21   DMARQ
24   Ground          23   DIOW*
26   Ground          25   DIOR*
28   Reserved        27   IORDY
30   Ground          29   DMACK*
32   IOCS16*         31   INTQ
34   PDIAG*          33   DA1
36   DA2             35   DA0
38   CS3FX*          37   CS1FX*
40   Ground          39   DASP*
The outputs to the system board include a Direct Memory Access Request (DMARQ), which is used to start the transfer of data to or from the drive. When a data transfer is finished, a DMA ACKNOWLEDGE (DMACK*) is sent to the drive from the hard disk controller. A Drive Interrupt Request (INTQ) is used by the drive when there is an interrupt pending. A Drive Active (DASP*) signal is used when the hard drive is busy. A Passed Diagnostic (PDIAG*) pin indicates the results of a diagnostic command or reset. If PDIAG* is negative true, the microprocessor knows that the drive is okay to use. A negative true signal on the RESET* line forces the drive to its initial condition during power-on or reboot.
SCSI Drives
The Small Computer System Interface (SCSI, pronounced "scuzzy") was developed as a hard disk drive interface. It differs from other disk interfaces in that it is intelligent. Rather than using a hard drive controller that controls the drive, SCSI drives use a host adapter that allows the computer to send commands to the drive. A SCSI drive has its own set of commands, and up to eight SCSI devices can be connected on a single computer. SCSI is a computer bus which uses its own protocol (sequence of events) to communicate between devices. The system microprocessor does not need to manage the detailed conditions of the drive; the hard drive system has enough intelligence to complete each task.
The original specification for SCSI appeared in 1986. Enhanced versions, SCSI-2 and SCSI-3, were released after this. It is a complex parallel interface. SCSI has its roots in a disk-drive interface developed by Shugart Associates. The Shugart interface was called the SASI (pronounced "sassy") bus, for Shugart Associates System Interface. It was intended primarily for disk drives. In both SASI and SCSI systems, a controller board moves data transfers over the SASI or SCSI bus.
Like the IDE interface, a SCSI drive needs only to be connected to a system board using a standard cable. The SCSI bus uses a 50-pin connector even though it is an 8-bit interface. The SCSI standard defines the way peripherals are connected to the computer system and how they communicate with the system. It is often grouped together with other hard disk interfaces, but it is more than a hard disk interface. SCSI provides a common bus for many types of peripherals, such as CD-ROMs, optical memory devices, modems, and printers. The common bus allows the connection of up to seven other peripherals to one port on the back of the PC.
A SCSI hard disk drive system gives you high performance and automatic error correction. You also get an external port which allows daisy chaining of up to seven peripherals. SCSI drives for the PC cost more to install due to the additional cost of the SCSI interface. In the 30- to 60-MB range, the lowest cost solution is usually MFM or RLL ST-506 technology in a drive kit or a drive with an integrated controller. The Apple Macintosh is an exception, being completely SCSI due to the built-in SCSI interfaces on Macintosh computers. SCSI offers the easiest plug-in installation when executed properly.
Usually when you connect peripherals to your system you need controller boards for each device. A CD-ROM drive, a modem, and a printer might require three controller boards and three expansion slots for any functions not provided on the system board. You would also have to be sure that each of these devices is compatible with your system, and compatible with each other. Whenever there are multiple controller boards in the system, there are possible con-
... used with the I/O read (IOR*) and I/O write (IOW*) pins to allow the microprocessor to address the 16 registers used in the chip. Switching the chip select (CS) signal low makes the chip ready for microprocessor bus communications.
The RESET signal resets the chip by placing it into a known, stable state. The data lines (D0 through D7) make up the data port. The other microprocessor bus signal lines are used for interrupt and DMA operations. IRQ is the interrupt request line and it is used to signal an error or the completion of a command. DRQ is the DMA request line; it indicates that an internal data register should be read or written to for a SCSI data transfer. DACK is used by the DMA controller to
signal that it has responded to a DMA request. DACK allows access to the data registers without using the address lines. Other DMA signals include the EOP input, which allows a DMA controller to tell the 5380 that the current transfer cycle on the microprocessor bus is the last data transfer of a block. READY allows the 5380 to control when the DMA controller moves data in and out. READY indicates when the 5380 is prepared to take another transfer, based on the activity on the SCSI bus. The READY signal is used with DRQ for block-mode DMA transfers. The DMA controller can be set to take control of the microprocessor bus for a specified number of transfers.
Functional Signals of the 5380 SCSI Adapter Chip
SCSI data bus (bidirectional): DB0-7, DBP*
SCSI controls (bidirectional): BSY*, SEL*, RST*, ATN*, ACK*, REQ*, MSG*, C/D*, I/O*
DMA control: EOP* (input), READY (output), DRQ (output), DACK* (input)
Register addressing (inputs): CS*, IOR*, IOW*, A0, A1, A2
Data bus (bidirectional): D0-D7
RESET* (input)
IRQ (output)
Analog Interfaces
Analog interfaces are used in data acquisition systems. Analog or continuous signals are used by many devices as inputs or outputs to computer systems. An example of an input would be temperature measurement and an example of an output would be motor control. These signals generally use voltage or current variations, but frequency and pulse width are among the other techniques used.
No matter what form the analog signal has, it must be converted to a digital representation to be used by the computer. Any digital control signals that are sent to analog control devices must be converted to the proper analog format. The devices for accomplishing this are analog-to-digital and digital-to-analog converters.
Analog interfaces use a different set of codes to represent numbers. The analog or continuous values in physical control and measurement applications can be represented by digital numbers. The presence or absence of fixed voltage levels characterizes these numbers.
These digital representations are binary since each bit or unit of information can have one of two possible states: TRUE or FALSE, ON or OFF, ONE or ZERO, HIGH or LOW. A binary code is used to interpret the analog value. The different bits represent different portions or weights of the digital number. The bit with the most weight is the first bit in the leftmost position. This is called the most significant bit or MSB. The bit with the least weight is the last bit in the rightmost position and is called the least significant bit or LSB. An analog-to-digital converter is used to change the analog values to their digital equivalents. The resolution of an analog-to-digital converter is determined by the number of bits. The coding used is the set of coefficients representing the fractional parts of full scale.
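The idea of bit weights as fractional parts of full scale can be made concrete with a short Python sketch for natural binary code. The function names are hypothetical. The MSB is taken to have a weight of 1/2, the next bit 1/4, and so on down to the LSB with a weight of 1/2^n, so the size of one LSB is the full-scale range divided by 2^n.

```python
def code_to_fraction(bits):
    """Fraction of full scale for a natural binary code, MSB first.

    bits is a sequence of 0/1 values; bit i (counting from 0 at the MSB)
    carries a weight of 2 ** -(i + 1).
    """
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
    return sum(bit * 2.0 ** -(position + 1) for position, bit in enumerate(bits))


def lsb_size(full_scale, n_bits):
    """Analog size of one least significant bit for an n-bit converter."""
    if n_bits < 1:
        raise ValueError("n_bits must be at least 1")
    return full_scale / (2 ** n_bits)
```

For a 4-bit code, `[1, 0, 0, 0]` is half of full scale and `[1, 1, 1, 1]` is 15/16 of full scale, one LSB short of the full-scale value. For a 12-bit converter with a 10-V range, one LSB is about 2.44 mV.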
Coding Methods
  Natural Binary Code
  Bipolar Codes
    Offset Binary
    Polarity Sign Magnitude
    One's Complement
    Two's Complement
D/A Converters
The R-2R ladder circuit is often used in digital-to-analog (D/A) conversion. The basic circuit is shown in the figure. Notice that it is used with an inverting operational amplifier. When all bits but the MSB are off (grounded), the output equals (-R/2R)Vref, that is, -Vref/2.
A D/A converter with buffer storage can be used as a sample-hold with digital input, analog output and an infinite hold time. The register is under control of a strobe which causes the converter to update. The rate at which the strobe may update is determined by the settling time of the converter and the response time of the logic.
If bipolar current-switching D/A conversion is used with offset binary or two's complement codes, an offset current equal and opposite to the MSB current is summed with the converter output. This is usually taken from a resistor divider network rather than a separate offset reference. This is done in order to minimize errors due to temperature changes. If the gain of the output inverting amplifier is doubled, this increases the output range from 0 to 10 V to ±10 V. When the amplifier is connected for sign inversion, conversion is negative reference. In a non-inverting application, the same values of offset voltage and resistance are used, but the value of the output voltage scale factor will depend on the load.
Some bipolar D/A converters with R-2R ladder networks and offset binary or two's complement coding have switches that are normally grounded for unipolar operation. If the LSB node is grounded, the output will be symmetrical. For sign-magnitude conversion, the converter's current output can be inverted.
The analog output in a parallel-input D/A converter circuit will follow the state of the logic inputs. If the converter is preceded by a register, the converter will respond only when the inputs are gated into it.
This is done in some data distribution systems, where the data may be continually changing, but samples are needed on a periodic basis.
DAC811
This is a single-chip, microcomputer-compatible, 12-bit digital-to-analog converter. The Burr-Brown DAC811 chip includes a precision voltage reference, interface logic, a buffered latch, and a 12-bit D/A converter with a voltage output amplifier. Fast
current switches and a laser-trimmed thin-film resistor network are used to provide an accurate and fast D/A converter. Laser trimming is done at the wafer level to maintain a 1/4 LSB linearity error at 25°C and a 1/2 LSB error over the temperature range. The DAC811 is available in a 28-pin plastic molded package, a 28-pin 0.6-inch-wide dual-in-line ceramic side-brazed package, and a 28-terminal 0.45-inch-square ceramic leadless chip carrier.
Interface
The microprocessor interface uses a double-buffered latch which is divided into three 4-bit nybbles for interfacing to 4-, 8-, 12-, or 16-bit buses and for handling right- or left-justified data. The 12-bit data in the input latches is moved to the D/A latch which holds the output value. Loading the last nybble or byte of data can be done simultaneously with the transfer of data between latches. This avoids spurious analog output values and saves computer instructions.
Most interfaces require a base address decoder, but if blocks of memory are not used, the base address decoder is simplified or not needed. For example, if half the memory space is not used, address line A15 of the microprocessor may be used as the chip select control. The control logic allows interfacing to right- or left-justified data formats. When a 12-bit D/A converter is loaded from an 8-bit bus, 2 bytes of data are required. The base address is decoded from the high-order address bits, and the low-order address lines are used to address the latches. Adjacent addresses are used.
Analog-to-Digital Converters
Most of the data acquisition boards for personal computers use successive approximation conversion. These A/D converters are built around a D/A converter and use a comparison technique. When a conversion command is applied, the D/A converter's MSB output (1/2 full scale) is compared with the input. If the input is greater than the MSB, the MSB remains on and the next bit is tested.
But if the input is less than the MSB, the MSB is turned off, and the next bit is tested. If the second bit does not have enough weight to exceed the input, it is left on and the third bit is tested. But if the second bit exceeds the input, it is turned off. The bit testing continues until the last bit has been tested.
When the bit tests are complete, a status line indicates that a valid conversion has occurred. An output register is used to hold the digital code corresponding to the input signal.
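The bit-by-bit test described above can be simulated in Python. This is a sketch of the general successive approximation technique, not a model of any particular chip: the internal D/A converter is assumed to be ideal and unipolar, the function name is hypothetical, and a trial bit is kept when the input is greater than or equal to the D/A output (the handling of an exact tie is an assumption).

```python
def successive_approximation(v_in, v_full_scale, n_bits):
    """Return the n-bit output code for v_in using successive approximation.

    Each bit is tried in turn from the MSB down to the LSB. The bit is kept
    if the ideal D/A output for the trial code does not exceed the input,
    otherwise it is turned off again.
    """
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)                         # turn the next bit on
        dac_out = v_full_scale * trial / (1 << n_bits)    # ideal D/A output
        if v_in >= dac_out:                               # comparator decision
            code = trial                                  # keep the bit
    return code
```

With a 10-V range and 12 bits, an input of exactly 5.0 V keeps only the MSB and yields code 2048, while an input just below full scale keeps every bit and yields 4095. The loop always runs exactly `n_bits` times, which is why the conversion time of this converter type is fixed regardless of the input value.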
The figure is a block diagram of a successive approximation A/D converter. Internally, the converter operates as follows. When a true signal is applied to the command input, the D/A switches are set to their off state, except for the most significant bit, which is set to logic "1". This turns on the corresponding D/A switch to apply the analog equivalent of the MSB to the comparator. If the analog input voltage is less than the MSB weight, the MSB is switched off at the first edge of the clock pulse. If the analog input is greater than the MSB, the "1" remains in the register. During the second pulse, the sum of the first result and the second bit is compared with the analog input voltage. The comparator is gated by the next clock pulse. It will cause the register to either accept or reject that bit. Successive clock pulses will cause all the bits, in order of decreasing significance, to be tested until the LSB is accepted or rejected.
A/D Converter Considerations
The following considerations are important to A/D conversion:
1. The analog input range
2. The resolution required for the signal to be measured
3. The requirements for linearity error, relative accuracy, and stability of calibration
4. The changes in the various sources of errors as temperature changes
5. Conditions for missed codes, if allowable
6. The time allowed for a complete conversion
7. Stability of the system power supply
8. Errors due to power supply variations
9. Character of the input signal: noisy, sampled, filtered, frequency
10. Types of preprocessing needed or desired
Other A/D conversion circuits may be more suitable for the application than successive approximation. These include the integrating and counter comparator types. The integrating types are generally better for converting noisy input signals at relatively slow rates. Successive approximation is best suited for converting sampled or filtered inputs to the MHz range.
Counter comparator types allow low cost, but can be slow and noise-susceptible. They are useful for peak followers and sample holds for digital storage applications.
Converter Parameters
When the converter's full-scale range is adjusted, it will be set with respect to the reference voltage, which can be traced to some recognized voltage standard. The absolute accuracy error is the tolerance of the full-scale point referred to this absolute voltage standard. Offset is measured for zero and is usually a function of time and temperature. Monotonicity is the ability to include all code numbers in actual operation. Nonlinearity is the amount by which the plot of output versus input deviates from a straight line. Settling time is the time required for the output to attain a final value within a specified fraction of full scale, usually ½ LSB.
In the A/D conversion process, an error from the quantization uncertainty of ½ LSB exists along with other conversion processing errors. The way to reduce this quantization uncertainty error is to increase the number of bits. Statistical interpolation can be used during processing or filtering after the conversion. This tends to fill in missing analog values for rapidly changing signals, but it will not reduce errors due to any variations within ±½ LSB. It is usually easier to determine the location of a transition than to determine a midrange value, so errors and settings of A/D converters are normally defined in terms of the analog values at which actual transitions occur in relation to the ideal transition values.
ADC674 Analog-to-Digital Converter
This Burr-Brown chip is a complete 12-bit A/D converter with reference, clock and 8-, 12-, or 16-bit microprocessor bus interface. It has a 15-microsecond maximum conversion time and is specified for operation with no missing codes. The chip contains a 12-bit successive approximation analog-to-digital converter. It has a self-contained +10 V reference, internal clock, digital interface for microprocessor control, and three-state outputs. The reference circuit uses a buried zener and is laser trimmed.
The clock oscillator is current-controlled, and full-scale and offset errors may be externally trimmed. Internal scaling resistors are provided for selecting the following analog input signal ranges:
0 to +10 V
0 to +20 V
±5 V
±10 V
The converter can be externally programmed to provide 8- or 12-bit resolution. The output data is available in a parallel format from TTL-compatible three-state output buffers. The output data is coded in straight binary for unipolar input signals and bipolar offset binary for bipolar input signals. It is packaged in a 28-pin ceramic DIP.
Calibration Techniques
Both digital-to-analog (D/A) and analog-to-digital (A/D) converters have offset errors since the first transition will not always occur at exactly ½ LSB. Scale factor or gain errors can cause a difference between the values at which the first transition and the last transition occur, since this is not always equal to the ideal value. Linearity errors can exist since the differences between transition values are not all equal or uniform in changing. When the differential linearity error becomes too large, it is possible for codes to be missed.
Offset and full-scale errors are trimmed using external offset and full-scale trim potentiometers connected to the reference and offset terminals. If adjustments for unipolar offset and full scale are not required, a 50-ohm 1 percent metal film resistor is connected between pin 10 (Reference In) and pin 8 (Reference Out). Pin 12 (Bipolar Offset) is connected to pin 9 (Analog Common), grounding the offset adjustment.
If adjustment is required, one 100-ohm potentiometer is connected between pins 10 and 8, and another is connected between pins 8 and 12. Then the input is varied through the end-point transition voltage (0 V + ½ LSB: +1.22 mV for the 10-V range, +2.44 mV for the 20-V range). This causes the output code to be DB0 on (high). Then the potentiometer between pins 12 and 8 is adjusted until DB0 just switches off with all other bits off.
Next, an input voltage of full-scale value minus 3/2 LSB is applied to cause all bits to be on. This value is +9.9963 V for the 10-V range and +19.9927 V for the 20-V range. The potentiometer between pins 8 and 10 is adjusted until bits DB1 through DB11 are on and DB0 is switching between on and off.
If external adjustments of full-scale and bipolar offset are not required, the potentiometers are replaced with 50-ohm metal film resistors. If adjustments are
required, the calibration procedure is similar to that used for unipolar operation, except that the offset adjustment is performed with an input voltage which is ½ LSB above the minus full-scale value: -4.9988 V for the ±5-V range, -9.9976 V for the ±10-V range. Then the pot between pins 8 and 12 is adjusted for DB0 to switch between on and off with all other bits off. To adjust full scale, a DC input signal is used which is 3/2 LSB below the nominal plus full-scale value. This is +4.9963 V for the ±5-V range and +9.9927 V for the ±10-V range. Then the pot between pins 8 and 10 is adjusted for DB0 to switch between on and off with all other bits on.
Interfacing
The ADC674 is designed to be interfaced to microprocessor systems and other digital systems. The microprocessor can have full control of the conversions, or the converter can operate in a stand-alone mode, controlled by the R/C* input. Full control involves the following:
1. Setting up an 8- or 12-bit conversion cycle
2. Initiating the conversion
3. Reading the output data
Reading can be done by taking the 12 bits all at once, or 8 bits followed by 4 bits in a left-justified format. There are five control inputs: 12/8*, CS*, A0, R/C*, and CE. These are all TTL/CMOS-compatible. The functions of the control inputs are shown in Table 7.2.
The stand-alone mode is used in systems with dedicated input ports. In stand-alone operation, control of the converter is done with a single control line connected to R/C*. In this mode CS* and A0 are tied to digital common and CE and 12/8* are tied to +5 V. The output will be in 12-bit words. The conversion is initiated by forcing R/C* low. The three-state data output buffers are enabled when R/C* is high and STATUS is low. The conversion can be initiated with either positive or negative pulses. The R/C* pulse must be low for at least 50 nanoseconds.
DEVICE DRIVERS
What is a Device Driver?
A device driver is a program that controls a device.
Every device, whether a printer, disk drive, or keyboard, must have a driver program to interact with the OS. Many drivers, like the keyboard driver, are built into the operating system itself. So a "driver" is a piece of software that lets your PC talk to peripherals, components, and other hardware. It translates OS commands to the specific needs of the device.
Where Are the Drivers?
Some of the essential device drivers, like the keyboard driver and the floppy disk driver, are built into the "ROM" (Read Only Memory) or "BIOS" (Basic Input Output System) of the computer system itself. There are also drivers built into the operating system to control memory, cache, and other basic devices of the PC. For all other devices you may need to load a new driver when you connect the device to your computer. In Windows, drivers often have a DRV extension.
How Does the Driver Work?
A driver acts like a translator between the device that it controls and the programs that use the device. For example, a mouse driver translates the "actions" of the mouse to something more understandable by the OS. Each device has its own set of specialized commands that only its driver knows. The driver, therefore, accepts generic commands from a program and then translates them into specialized commands for the device.
PORTS AND SOCKETS
A socket is an endpoint used by a process for bi-directional communication with a socket associated with another process. Sockets, introduced in Berkeley Unix, are a basic mechanism for IPC on a computer system, or on different computer systems connected by local or wide area networks. This section describes how to program with sockets to create communication channels.
The communication channel created with sockets can be like a telephone line (connection oriented), with the sockets as telephones over which a conversation can
  • 65. take place. Or the channel can be as when we send mail (datagram oriented), with the sockets as mailboxes.
A socket appears to the user to be like a file descriptor on which we can read, write, and ioctl. In the connection oriented mode, the file is like a sequence of characters that we can read with as many read operations as we like. In the connectionless mode we have to get a whole message in a single read operation. If we don't, what is left over of the message is lost.
Though sockets can be used in a single computer system for interprocess communication (the Unix domain), we will only consider their use for communication across computer systems (the Internet domain).
It is possible to send messages on a socket that take precedence over other undelivered messages. These priority messages are called out-of-band messages.
A problem in communication is how to identify interlocutors. In the case of phones we have telephone numbers, for mail we have addresses. For communicating between sockets we identify an interlocutor with a pair: IP address and port. Ports are 16-bit unsigned integers. The first 1024 port numbers (0 to 1023) are reserved for standard services such as HTTP (port 80). These ports are called well-known ports. Ports from 49152 to 65535 are private and can be dynamically allocated (ephemeral ports). The interval 1024 to 49151 consists of registered ports.
Client-Server Architecture
A standard way of using sockets and communication channels is between clients and servers. A server is a process that is able to carry out some function, called a service, like transferring files, translating host names to IP addresses, or inverting a matrix. A client is a process that requests a server to do a service (say, "translate snowhite.cis.temple.edu"). Typically the server will be at a known IP address and will respond to requests sent to a known port.
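The three port ranges given above (well-known, registered, and private/ephemeral) can be sketched as a small C function. This is only an illustration: the names port_class, classify_port, and port_class_name are made up for this example and are not part of the socket API.

```c
#include <stdint.h>

/* Illustrative names only; nothing here comes from a socket library. */
typedef enum {
    PORT_WELL_KNOWN,   /* 0 to 1023 */
    PORT_REGISTERED,   /* 1024 to 49151 */
    PORT_DYNAMIC       /* 49152 to 65535 (private / ephemeral) */
} port_class;

/* Map a 16-bit port number to the range it falls in. */
port_class classify_port(uint16_t port)
{
    if (port < 1024)
        return PORT_WELL_KNOWN;
    if (port < 49152)
        return PORT_REGISTERED;
    return PORT_DYNAMIC;
}

/* Human-readable label for a range. */
const char *port_class_name(port_class c)
{
    switch (c) {
    case PORT_WELL_KNOWN: return "well-known";
    case PORT_REGISTERED: return "registered";
    default:              return "dynamic";
    }
}
```

With these definitions, classify_port(80) falls in the well-known range, while a port the kernel hands out to a client, such as 49152 or above, falls in the dynamic range.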
In some cases that port is not universally known, so the server will advertise the port it is currently using (it may advertise the port by printing out its value, or sending email, or having inetd, a special process, know about it, etc.). In some cases the IP address of the server is not known and one may have a "standard" server that responds to requests of the form "where can I find service Moo" by responding with an appropriate IP address.
The client requests the kernel to obtain a free port to be used for communication with the server. The server does not have to know in advance the identity of its clients. It is ready to accept a message from any interlocutor. When it receives a message from a client, the message itself contains the IP address and the port of the client, so that the server knows whom to answer.
An address, host + port, can be used for multiplexing more than one communication channel. So one server can communicate simultaneously with more than one client. Each communication channel on the server will have its own socket bound to the same address. In other words, each connection on the internet is identified by a socket pair, (client IP, client port) + (server IP, server port), plus the protocol being used (say TCP or UDP).
Summary On Socket Functions
The following is a summary of the basic socket functions as they are used for datagram and connection oriented service by clients and servers. In the following section we will go in greater detail over these functions.
Datagram Service
Client
socket => ([bind =>] [connect =>] {write => read}*) | {sendto => recvfrom}* => close | shutdown
In words: create a socket, then bind it to a local port (if bind is not used, the kernel will select a free local port), establish the address of the server, write to and read from it, or just sendto and recvfrom it; then terminate. If the client is not interested in a response, it does not need to use bind.
Connect is worth using when we send many datagrams to the same server.
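The datagram pattern above (socket, optional bind, sendto, recvfrom, close) can be tried out inside a single process. The sketch below is a minimal illustration under stated assumptions, not a definitive implementation: udp_loopback_roundtrip is a hypothetical helper name, both endpoints live on the loopback interface, and the 2-second receive timeout is an arbitrary choice. The receiving socket binds to port 0 so that the kernel selects a free port, and the sending socket never calls bind at all.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper for illustration: sends msg as one datagram to a
   receiver socket on the loopback interface and reads it back.
   Returns the number of bytes received, or -1 on error. */
ssize_t udp_loopback_roundtrip(const char *msg, char *out, size_t outlen)
{
    ssize_t n = -1;
    struct sockaddr_in addr, from;
    socklen_t len = sizeof(addr), fromlen = sizeof(from);
    struct timeval tv = { 2, 0 };   /* assumed timeout: give up after 2 s */

    if (msg == NULL || out == NULL || outlen == 0)
        return -1;

    int rx = socket(AF_INET, SOCK_DGRAM, 0);   /* plays the server role */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);   /* plays the client role */
    if (rx < 0 || tx < 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);       /* port 0: the kernel selects a free port */

    if (bind(rx, (struct sockaddr *) &addr, sizeof(addr)) < 0)
        goto done;
    /* Ask the kernel which port it actually assigned to the receiver. */
    if (getsockname(rx, (struct sockaddr *) &addr, &len) < 0)
        goto done;
    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* The sender never calls bind; the kernel picks its local port. */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *) &addr, sizeof(addr)) < 0)
        goto done;

    /* One recvfrom must take the whole datagram; leftovers are lost. */
    n = recvfrom(rx, out, outlen - 1, 0, (struct sockaddr *) &from, &fromlen);
    if (n >= 0)
        out[n] = '\0';

done:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Calling udp_loopback_roundtrip("ping", buf, sizeof buf) with a 64-byte buffer should leave the text of the datagram in buf. After recvfrom returns, the from structure holds the sender's IP address and kernel-chosen port, which is how a real server learns whom to answer.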
  • 66. Server
socket => bind => {read | recvfrom => write | sendto}* => close | shutdown
In words: create a socket, bind it to a local port, accept and reply to messages from clients, terminate. If the server does not need to reply to the client, it can just use read instead of recvfrom.
Connection Oriented Service
Client
socket => [bind =>] connect => {write | sendto => read | recvfrom}* => close | shutdown
In words: create a socket, bind it to a local port (we usually do not call bind), establish the address of the server, communicate with it, terminate. If bind is not used, the kernel will select a free local port.
Server
socket => bind => listen => {accept => {read | recvfrom => write | sendto}* }* => close | shutdown
In words: create a socket, bind it to a local port, set up the service with an indication of the maximum number of pending connection requests, accept requests from connection oriented clients, receive messages and reply to them, terminate.
/* A simple server in the internet domain using TCP.
   The port number is passed as an argument. */
#include <stdio.h>
#include <stdlib.h>      /* exit, atoi */
#include <string.h>      /* memset */
#include <unistd.h>      /* read, write, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Print msg with the system error text, then exit with failure. */
void error(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int sockfd, newsockfd, portno;
    socklen_t clilen;
    char buffer[256];
    struct sockaddr_in serv_addr, cli_addr;
    int n;

    if (argc < 2) {
        fprintf(stderr, "ERROR, no port provided\n");
        exit(1);
    }
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        error("ERROR opening socket");
    memset(&serv_addr, 0, sizeof(serv_addr));
    portno = atoi(argv[1]);
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = INADDR_ANY;    /* accept on any local interface */
    serv_addr.sin_port = htons(portno);        /* port in network byte order */
    if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0)
        error("ERROR on binding");
    listen(sockfd, 5);                         /* queue up to 5 pending connections */
    clilen = sizeof(cli_addr);
    newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);
  • 67.
    if (newsockfd < 0)
        error("ERROR on accept");
    memset(buffer, 0, 256);
    n = read(newsockfd, buffer, 255);          /* leave room for the final '\0' */
    if (n < 0)
        error("ERROR reading from socket");
    printf("Here is the message: %s\n", buffer);
    n = write(newsockfd, "I got your message", 18);
    if (n < 0)
        error("ERROR writing to socket");
    close(newsockfd);
    close(sockfd);
    return 0;
}

/* A simple client in the internet domain using TCP.
   The host name and the port number are passed as arguments. */
#include <stdio.h>
#include <stdlib.h>      /* exit, atoi */
#include <string.h>      /* memset, memcpy, strlen */
#include <unistd.h>      /* read, write, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>       /* gethostbyname */

/* Print msg with the system error text, then exit with failure. */
void error(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int sockfd, portno, n;
    struct sockaddr_in serv_addr;
    struct hostent *server;
    char buffer[256];

    if (argc < 3) {
        fprintf(stderr, "usage %s hostname port\n", argv[0]);
        exit(1);
    }
    portno = atoi(argv[2]);
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        error("ERROR opening socket");
    server = gethostbyname(argv[1]);           /* translate host name to address */
    if (server == NULL) {
        fprintf(stderr, "ERROR, no such host\n");
        exit(1);
    }
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    memcpy(&serv_addr.sin_addr.s_addr, server->h_addr_list[0], server->h_length);
    serv_addr.sin_port = htons(portno);
    if (connect(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0)
        error("ERROR connecting");
    printf("Please enter the message: ");
    memset(buffer, 0, 256);
    fgets(buffer, 255, stdin);
    n = write(sockfd, buffer, strlen(buffer));
    if (n < 0)
        error("ERROR writing to socket");
    memset(buffer, 0, 256);
  • 68.
    n = read(sockfd, buffer, 255);             /* wait for the server's reply */
    if (n < 0)
        error("ERROR reading from socket");
    printf("%s\n", buffer);
    close(sockfd);
    return 0;
}

BENCHMARKS
What is comp.benchmarks?
Comp.benchmarks is a USENET newsgroup for discussing computer benchmarks and publishing benchmark results and source code. If it's about benchmarks, this is the place to post or cross-post it.
What is a benchmark?
A benchmark is a test that measures the performance of a system or subsystem on a well-defined task or set of tasks.
How are benchmarks used?
Benchmarks are commonly used to predict the performance of an unknown system on a known, or at least well-defined, task or workload. Benchmarks can also be used as monitoring and diagnostic tools. By running a benchmark and comparing the results against a known configuration, one can potentially pinpoint the cause of poor performance. Similarly, a developer can run a benchmark after making a change that might impact performance to determine the extent of the impact.
Benchmarks are frequently used to ensure the minimum level of performance in a procurement specification. Rarely is performance the most important factor in a purchase, though. One must never forget that it's more important to be able to do the job correctly than it is to get the wrong answer in half the time.
What kinds of performance do benchmarks measure?
Benchmarks are often used to measure general categories of performance such as graphics, I/O, and compute (integer and floating point), but most measure more specific tasks like rendering polygons, reading and writing files, or performing operations on matrices.
SPEC
SPEC stands for "Standard Performance Evaluation Corporation", a non-profit organization with the goal to "establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers" (from SPEC's bylaws).
The current SPEC benchmark suites are:
- CINT92 (CPU intensive integer benchmarks)
- CFP92 (CPU intensive floating point benchmarks)
- SDM (UNIX Software Development Workloads)
- SFS (System level file server (NFS) workload)
In August 1995, SPEC introduced the SPEC95 CPU benchmarks as a replacement for the older SPEC92 CPU benchmarks (see below, section 4.2). These benchmarks measure the performance of the CPU, the memory system, and compiler code generation. They normally use UNIX as the portability vehicle, but they have been ported to other operating systems as well. The percentage of time spent in operating system and I/O functions is generally negligible.
Throughput (Rate) Measurement
With this measurement method, called the "homogeneous capacity method", several copies of a given benchmark are executed. This method is particularly suitable for multiprocessor systems. The results, called SPEC rate, express how many jobs of a particular type (characterized by the individual benchmark) can be executed in a given time. (The SPEC reference time happens to be a week, and the execution times are normalized with respect to a SPEC reference machine.) The SPEC rates therefore characterize the capacity of a system for compute-intensive jobs of similar characteristics.
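The idea behind a throughput measurement, "how many jobs can be executed in a given time", comes down to simple arithmetic. The sketch below is only an illustration of that idea and is not SPEC's official rate formula: it omits the normalization against the reference machine, and the name jobs_per_period and the constant SECONDS_PER_WEEK are assumptions made for this example.

```c
/* One week expressed in seconds (the reference period mentioned above). */
#define SECONDS_PER_WEEK (7.0 * 24.0 * 60.0 * 60.0)

/* Illustrative only: raw throughput of a homogeneous run.
   copies          number of identical benchmark copies run concurrently
   elapsed_seconds wall-clock time until all copies have finished
   period_seconds  length of the reporting period (e.g. one week)
   Returns the number of jobs that would complete in one period at this
   pace, or 0.0 for invalid input. No reference-machine normalization. */
double jobs_per_period(int copies, double elapsed_seconds, double period_seconds)
{
    if (copies <= 0 || elapsed_seconds <= 0.0 || period_seconds <= 0.0)
        return 0.0;
    return copies * period_seconds / elapsed_seconds;
}
```

For example, if 4 copies finish together after one hour (3600 seconds), jobs_per_period(4, 3600.0, SECONDS_PER_WEEK) gives 4 x 604800 / 3600 = 672 jobs per week.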
  • 70. TRANSACTION PROCESSING BENCHMARK
TPC Benchmark™ W (TPC-W) is a transactional web benchmark. The workload is performed in a controlled internet commerce environment that simulates the activities of a business oriented transactional web server. The workload exercises a breadth of system components associated with such environments, which are characterized by:
- Multiple on-line browser sessions
- Dynamic page generation with database access and update
- Consistent web objects
- The simultaneous execution of multiple transaction types that span a breadth of complexity
- On-line transaction execution modes
- Databases consisting of many tables with a wide variety of sizes, attributes, and relationships
- Transaction integrity (ACID properties)
- Contention on data access and update
The Transaction Processing Performance Council (TPC) is a non-profit corporation founded to define transaction processing and database benchmarks and to disseminate objective, verifiable TPC performance data to the industry. While TPC benchmarks certainly involve the measurement and evaluation of computer functions and operations, the TPC regards a transaction in the same way as it is commonly understood in the business world: as a commercial exchange of goods, services, or money. A typical transaction, as defined by the TPC, would include the updating of a database system for such things as inventory control (goods), airline reservations (services), or banking (money).
The TPC-D benchmark is an accepted industry-standard measure for decision support system performance. The TPC-D consists of a suite of business-oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database were chosen for their broad industry-wide relevance and relative ease of implementation. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and provide answers to critical business questions.
The TPC-D benchmark evaluates the performance of various decision support systems by the execution of sets of queries against a standard database under controlled conditions. TPC-D queries measure both the server and the storage performance of a complete solution. Since this is an industry-standard benchmark, vendors can be ranked based on their TPC-D performance and price/performance results.