ARM PROCESSING BASICS PPT FOR 4TH SEM ENGINEERING

Unit - I
ARM Processor Fundamentals

Introduction
• ARM processors are a family of central processing units (CPUs) based on a reduced instruction set computer
(RISC) architecture. ARM originally stood for Acorn RISC Machine and was later renamed Advanced RISC Machines.
History of ARM processors:
• x86 is an older architectural approach (CISC, complex instruction set computer); the first x86 CPU design was
launched in 1978.
• As "microcomputers" (PCs) evolved, combining high performance with a smaller design became a challenge.
• In the early 1980s, Acorn Computers designed microcomputers and ran into performance limitations with existing chip designs.
• Around 1981, a University of California, Berkeley project studied resource usage in computer chips.
• Processing units have certain predefined operations, collectively called the instruction set.
• Most programs used only a small subset of the instruction set. By reducing the number of predefined
instructions, cutting out complex, hard-to-implement and little-used ones, the remaining simple
instructions would run faster and take up much less power and space on the chip. This is the RISC approach.
Dr. Shachi P, Dept. of ECE, BMSCE

History of ARM processors
• x86 (Intel) designs take a modular approach based on a motherboard with swappable
components. The CPU and other components, such as graphics cards and GPUs, memory
controllers, storage, or processing cores, are optimized for specific functions and can be
easily swapped out or expanded.
• However, such systems tend to have more homogeneous architectures, which can allow
attackers to breach many systems quickly with "write once, run anywhere" exploits.
• In an ARM-based processor, CPU cores and other hardware functions (such as I/O bus controllers,
for example peripheral component interconnect) are on the same physical platform, and all of
the different functions are integrated through an internal bus. This is a system on a chip (SoC).
• x86 chips are designed to optimize performance;
• ARM-based processors are designed to balance cost with smaller sizes, lower power
consumption, lower heat generation, speed, and potentially longer battery life.

ARM Partnership Model

Microprocessors vs. Microcontrollers
| Microprocessors | Microcontrollers |
| --- | --- |
| A silicon chip representing a Central Processing Unit (CPU), which is capable of performing arithmetic as well as logical operations according to a pre-defined set of instructions. | A highly integrated chip that contains a CPU, RAM, special and general purpose register arrays, on-chip ROM/Flash memory for program storage, timer and interrupt control units, and dedicated I/O ports. |
| It is a dependent unit. It requires the combination of other chips such as timers, program and data memory chips, interrupt controllers etc. for functioning. | It is a self-contained unit and doesn't require an external interrupt controller, timer, UART etc. for its functioning. |
| Most of the time general purpose in design and operation. | Mostly application oriented or domain specific. |
| Doesn't contain a built-in I/O port. The I/O port functionality needs to be implemented with the help of external Programmable Peripheral Interface chips. | Most microcontrollers contain multiple built-in I/O ports, which can be operated as a single 8, 16 or 32 bit port or as individual port pins. |
| Targeted for the high-end market where performance is important. | Targeted for the embedded market where performance is not so critical. |
| Limited power saving options. | Includes a lot of power saving features. |

RISC V/S CISC Processors/Controllers:
| RISC Processors/Controllers | CISC Processors/Controllers |
| --- | --- |
| Lesser no. of instructions. | Greater no. of instructions. |
| Instruction pipelining and increased execution speed. | Generally no instruction pipelining feature. |
| Operations are performed on registers only; the only memory operations are load and store. | Operations are performed on registers or memory, depending on the instruction. |
| Large number of registers are available. | Limited no. of general purpose registers. |
| Programmer needs to write more code to execute a task since the instructions are simpler ones. | A programmer can achieve the desired functionality with a single instruction. |
| Single, fixed length instructions. | Variable length instructions. |
| Less silicon usage and pin count. | More silicon usage. |
| With Harvard Architecture. | Harvard or Von-Neumann Architecture. |

CISC vs. RISC - Instruction set
• The terms CISC and RISC refer to design principles and techniques.
RISC: Reduced instruction set computers
• Simple instructions require a small number of basic steps to execute.
• For a processor that has only simple instructions, a large number of instructions may be
needed to perform a given programming task.
• This could lead to a large value of N and a small value of S, where N is the number of
instructions executed to complete the program and S is the average number of basic steps each
instruction execution requires.
• It is much easier to implement efficient pipelining in processors with simple instruction sets.
CISC: Complex instruction set computers
• Complex instructions involve a large number of steps.
• If individual instructions perform more complex operations, fewer instructions will be
needed, leading to a lower value of N and a larger value of S.
• Complex instructions combined with pipelining would achieve good performance.
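The N and S trade-off can be put into numbers with a minimal sketch. It takes N as the number of instructions executed and S as the average number of basic steps per instruction, so the total work is N × S. The instruction counts and step counts below are hypothetical and chosen only for illustration.

```python
# Hypothetical figures for one programming task; they are not measurements.
def total_basic_steps(n_instructions, avg_steps_per_instruction):
    """Total basic steps = N (instructions executed) * S (average steps each)."""
    return n_instructions * avg_steps_per_instruction

# RISC style: many simple instructions (large N, small S).
risc_steps = total_basic_steps(n_instructions=1200, avg_steps_per_instruction=4)
# CISC style: fewer, more complex instructions (small N, large S).
cisc_steps = total_basic_steps(n_instructions=500, avg_steps_per_instruction=10)

print("RISC total steps:", risc_steps)
print("CISC total steps:", cisc_steps)
```

Neither style wins automatically: the result depends on how far N grows when S shrinks, and on how well the steps can be overlapped by pipelining.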

Harvard V/s Von-Neumann
Processor/Controller Architecture
| Harvard Architecture (ARM) | Von-Neumann Architecture (x86) |
| --- | --- |
| Microprocessors/controllers based on the Harvard architecture have a separate data bus and instruction bus. This allows data transfer and program fetching to occur simultaneously on both buses. | Microprocessors/controllers based on the Von-Neumann architecture share a single bus for fetching both instructions and data. Program instructions and data are stored in a common main memory. |
| Separate buses for instruction and data fetching. | Single shared bus for instruction and data fetching. |
| Easier to pipeline, so high performance can be achieved. | Low performance compared to Harvard architecture. |
| Comparatively high cost. | Cheaper. |
| No memory alignment problems. | Allows self-modifying code. |

Unit - I
• Basic Structure of computers- Von Neumann and Harvard Architecture, Basic
Processing Unit, Bus Structure, RISC and CISC Architecture, RISC and ARM
Design philosophy, ARM core Dataflow model, programming model,
processor states and operating modes, ARM pipeline.

Computer Types
• Since their introduction in the 1940s, digital computers have evolved into
many different types that vary widely in size, cost, computational power, and
intended use.
• Modern computers can be divided roughly into four general
categories:
• 1. Embedded computers are integrated into a larger device or system in
order to automatically monitor and control a physical process or
environment.
• They are used for a specific purpose rather than for general
processing tasks.
• Ex: industrial and home automation, appliances, telecommunication
products, and vehicles

Computer Types
• 2. Personal computers have achieved widespread use in homes, educational
institutions, and business and engineering office settings, primarily for
dedicated individual use.
• They support a variety of applications such as general computation,
document preparation, computer-aided design, audio visual entertainment,
interpersonal communication, and Internet browsing.
A number of classifications are used for personal computers.
• Desktop computers serve general needs and fit within a typical personal
workspace.
• Workstation computers offer higher computational capacity and more
powerful graphical display capabilities for engineering and scientific work.
• Portable and Notebook computers provide the basic features of a personal
computer in a smaller lightweight package. They can operate on batteries to
provide mobility

Computer Types
3. Servers and Enterprise systems are large computers that are meant to be
shared by a potentially large number of users who access them from some form
of personal computer over a public or private network.
• Such computers may host large databases and provide information
processing for a government agency or a commercial organization.
4. Supercomputers and Grid computers normally offer the highest
performance. They are the most expensive and physically the largest
category of computers.
• Supercomputers are used for the highly demanding computations
needed in weather forecasting, engineering design and simulation,
and scientific work.

Functional Units of a Computer
A computer consists of five
functionally independent
main parts:
1. Input
2. Memory
3. Arithmetic and logic
4. Output
5. Control units

• The input unit accepts coded information from human operators using
devices such as keyboards, or from other computers over digital
communication lines.
• The information received is stored in the computer’s memory, either for later
use or to be processed immediately by the arithmetic and logic unit.
• The processing steps are specified by a program that is also stored in the
memory.
• Finally, the results are sent back to the outside world through the output unit.
• All of these actions are coordinated by the control unit.
• An interconnection network provides the means for the functional units to
exchange information and coordinate their actions.

• The information handled by a computer is categorized as either instructions or data.
• Instructions, or machine instructions, are explicit commands that
1. Govern the transfer of information within a computer as well as between the computer and
its I/O devices.
2. Specify the arithmetic and logic operations to be performed
• A program is a list of instructions which performs a task. Programs are stored in
the memory.
• The processor fetches the program instructions from the memory, one after another,
and performs the desired operations.
• The computer is controlled by the stored program, except for possible external
interruption by an operator or by I/O devices connected to it.
• Data are numbers and characters that are used as operands by the instructions. Data
are also stored in the memory.
• The instructions and data handled by a computer must be encoded in a suitable
format.

Memory Unit
• The function of the memory unit is to store programs and data. There are two
classes of storage, called primary and secondary.
Primary Memory
• Primary memory (main memory) is a fast memory that operates at electronic
speeds.
• Programs must be stored in this memory while they are being executed.
• The memory contains a large number of semiconductor storage cells, each capable of storing
one bit of information. These cells are handled in groups of fixed size called words.
• The memory is organized so that one word can be stored or retrieved in one basic
operation.
• The number of bits in each word is called the word length (typically 16, 32, or 64 bits).

• To provide easy access to any word in the memory, a distinct address is associated
with each word location.
• Addresses are consecutive numbers, starting from 0, that identify successive
locations.
• A memory in which any location can be accessed in a short and fixed amount of time
after specifying its address is called a random-access memory (RAM).
• The time required to access one word is called the memory access time.
• This time is independent of the location of the word being accessed. It typically
ranges from a few nanoseconds (ns) to about 100 ns for current RAM units.

Cache Memory
• As an adjunct to the main memory, a smaller, faster RAM unit, called a cache,
is used to hold sections of a program that are currently being executed, along
with any associated data.
• The cache is tightly coupled with the processor and is usually contained
on the same integrated-circuit chip.
• The purpose of the cache is to facilitate high instruction execution rates.
• As execution proceeds, instructions are fetched into the processor chip, and a
copy of each is placed in the cache.
• When the required data are located in the main memory, the data are fetched and
copies are also placed in the cache.

Secondary Storage
• Although primary memory is essential, it tends to be expensive and does not
retain information when power is turned off.
• Thus additional, less expensive, permanent secondary storage is used when
large amounts of data and many programs have to be stored, particularly for
information that is accessed infrequently.
• Access times for secondary storage are longer than for primary memory.
• Examples: magnetic disks, optical disks (DVD and CD), and flash memory
devices.
• https://www.youtube.com/watch?v=7J7X7aZvMXQ (up to 3:17)

• When operands are brought into the processor, they are stored in high-
speed storage elements called registers.
• Each register can store one word of data.
• Access times to registers are even shorter than access times to the cache unit
on the processor chip.
Control Unit
• Control circuits are responsible for generating the timing signals that govern
the transfers and determine when a given action is to take place.
• Data transfers between the processor and the memory are also
managed by the control unit through timing signals.

The Basic Operational Concepts of a Computer
• To perform a given task, an appropriate program consisting of a list of instructions is stored in the memory.
Individual instructions are brought from the memory into the processor, which executes the specified
operations. Data to be used as operands are also stored in the memory.
• Example: Add LOCA, R0
• This instruction adds the operand at memory location LOCA to the operand in register R0 and places the sum into
register R0. The instruction requires several steps:
1. First the instruction is fetched from the memory into the processor.
2. The operand at LOCA is fetched and added to the contents of R0.
3. Finally the resulting sum is stored in the register R0.
• The preceding Add instruction combines a memory access operation with an ALU operation. In
some other types of computers, these two types of operations are performed by separate instructions
for performance reasons:
• Load LOCA, R1
• Add R1, R0
• Transfers between the memory and the processor are started by sending the address of the memory location
to be accessed to the memory unit and issuing the appropriate control signals. The data are then transferred
to or from the memory.
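The two instruction styles above can be compared with a minimal sketch. The names LOCA, R0 and R1 follow the slide; the dictionary-based machine state, the helper function names and the operand values are assumptions made purely for illustration.

```python
# Toy machine state; the values 7 and 5 are made up.
memory = {"LOCA": 7}
regs = {"R0": 5, "R1": 0}

def add_mem(loc, reg):
    """Add LOCA, R0: memory operand + register, sum placed in the register."""
    regs[reg] = memory[loc] + regs[reg]

def load(loc, reg):
    """Load LOCA, R1: copy a memory operand into a register."""
    regs[reg] = memory[loc]

def add_reg(src, dst):
    """Add R1, R0: register + register, sum placed in the destination."""
    regs[dst] = regs[src] + regs[dst]

# Single instruction that combines the memory access with the ALU operation.
add_mem("LOCA", "R0")
single = regs["R0"]

# Reset the registers, then use the separate Load and Add instructions.
regs.update(R0=5, R1=0)
load("LOCA", "R1")
add_reg("R1", "R0")
two_step = regs["R0"]

print(single, two_step)
```

Both forms leave the same sum in R0; the two-instruction form uses one extra register (R1) but keeps memory access and arithmetic in separate instructions.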

Connections between the processor and the memory
• Besides the IR and PC, there are n general-purpose
registers, R0 through Rn-1.
• Memory Address Register (MAR): holds the address of the location to
be accessed.
• Memory Data Register (MDR): contains the data to be written into
or read out of the addressed location.
• Instruction Register (IR): holds the instruction that is currently being
executed. Its output is available to the control
circuits, which generate the timing signals that
control the various processing elements involved in
executing the instruction.
• Program Counter (PC): another
specialized register that keeps track of the execution
of a program. It contains the memory address of
the next instruction to be fetched and executed.

Operating steps for Program execution
1. Execution of the program (stored in memory) starts when the PC is set to point to the first instruction of the
program.
2. The contents of the PC are transferred to the MAR and a Read control signal is sent to the memory.
3. The addressed word is read out of the memory and loaded into the MDR. Next, the contents of the MDR
are transferred to the IR. At this point, the instruction is ready to be decoded and executed.
4. If the instruction involves an operation to be performed by the ALU, it is necessary to obtain the required
operands.
5. If an operand resides in memory (it could also be in a general purpose register in the processor), it has to be
fetched by sending its address to the MAR and initiating a Read cycle.
6. When the operand has been read from the memory into the MDR, it is transferred from the MDR to ALU.
7. After one or more operands are fetched in this way, the ALU can perform the desired operation.
8. If the result of the operation is to be stored in the memory, then the result is entered into the MDR.

9. The address of the location where the result is to be stored is sent to the MAR, and a Write cycle is initiated.
10. At some point during the execution of the current instruction, the contents of the PC are incremented so that the PC
points to the next instruction to be executed.
11. Thus, as soon as the execution of the current instruction is completed, a new instruction fetch may be started.
12. In addition to transferring data between the memory and the processor, the computer accepts data from input devices
and sends data to output devices. Thus, some machine instructions with the ability to handle I/O transfers are provided.
• Normal execution of a program may be preempted (temporarily interrupted) if some device requires urgent servicing.
To request this, the device raises an interrupt signal.
• An interrupt is a request signal from an I/O device for service by the processor. The processor provides the requested
service by executing an appropriate interrupt service routine.
• The diversion may change the internal state of the processor, so its state must be saved in memory before the
interrupt is serviced. When the interrupt service routine is completed, the state of the processor is restored so that the
interrupted program may continue.
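The fetch and execute steps listed on this slide (PC to MAR, memory to MDR, MDR to IR, operand fetch, ALU operation, PC increment) can be traced with a small simulation. This is only a sketch: the tuple encoding of instructions, the word-addressed memory and the HALT pseudo-instruction are assumptions made for illustration, not part of any real instruction set.

```python
# Word-addressed toy memory: addresses 0-2 hold instructions, 10-11 hold data.
memory = {
    0: ("LOAD", 10, "R0"),   # R0 <- [10]
    1: ("ADD", 11, "R0"),    # R0 <- [11] + R0
    2: ("HALT",),
    10: 4,
    11: 9,
}
cpu = {"PC": 0, "IR": None, "MAR": None, "MDR": None, "R0": 0}

def read(address):
    """One Read cycle: the address goes to MAR, the word comes back in MDR."""
    cpu["MAR"] = address
    cpu["MDR"] = memory[cpu["MAR"]]
    return cpu["MDR"]

def step():
    """Fetch one instruction, increment the PC, execute; False on HALT."""
    cpu["IR"] = read(cpu["PC"])       # PC -> MAR, memory -> MDR, MDR -> IR
    cpu["PC"] += 1                    # PC now points to the next instruction
    op = cpu["IR"][0]
    if op == "HALT":
        return False
    _, address, reg = cpu["IR"]
    operand = read(address)           # operand fetch through MAR and MDR
    if op == "LOAD":
        cpu[reg] = operand
    elif op == "ADD":
        cpu[reg] = cpu[reg] + operand  # ALU operation
    return True

while step():
    pass

print(cpu["R0"], cpu["PC"])
```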

Bus Structures
BUS: A group of lines(wires) that serves as a connecting path for several devices of a
computer is called a bus.
The following are different types of busses:
1. Address Bus 2. Data Bus 3. Control Bus
• The Data bus carries (transfers) data from one component (source) to another component
(destination) connected to it. The data bus consists of 8, 16, 32 or more parallel signal lines.
The data bus lines are bi-directional, i.e., the CPU can read data on these lines from memory or
from a port, as well as send data out on these lines to a memory location.
• The Address bus is the set of lines that carries the address of the memory location
to or from which data is to be transferred. It is a unidirectional bus. The
address bus consists of 16, 20, 24 or more parallel signal lines. On these lines the CPU sends out
the address of the memory location.
• The Control Bus carries the Control and timing information.

Bus Structures
Following are the other types of busses.
• System Bus: A System Bus is usually a combination of address bus, data
bus, and control bus.
• Internal Bus: The bus that operates only with the internal circuitry of the
CPU.
• External Bus: A bus that connects the computer to external devices
• I/O Bus: The bus used by I/O devices to communicate with the CPU
• Synchronous Bus: While using Synchronous bus, data transmission
between source and destination units takes place in a given timeslot which
is already known to these units.

Bus Structures
• Asynchronous Bus: In this case the data transmission is governed by
handshaking control signals.
• Handshaking (either software codes or hardware signals) is used to halt
transmission of data from the sending computer until the receiving
computer has emptied its buffer.
• Handshaking is an I/O control method to synchronize I/O devices with the
microprocessor.
• As many I/O devices accept or release information at a much slower rate
than the microprocessor, this method makes the microprocessor
work with an I/O device at the I/O device's data transfer rate.
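A rough way to picture handshaking is a sender that places the next item on the bus only when the slower device signals that it is ready. The sketch below is a simplified tick-based model, not a real bus protocol; the function name, the ready_at bookkeeping and the timing figures are all assumptions.

```python
def transfer(items, device_ticks_per_item):
    """Send items one at a time; the sender idles until the device is ready."""
    received = []
    tick = 0          # processor clock ticks elapsed
    ready_at = 0      # tick at which the device next signals READY
    index = 0
    while index < len(items):
        if tick >= ready_at:                  # device signals READY
            received.append(items[index])     # sender places data, device latches it
            index += 1
            ready_at = tick + device_ticks_per_item
        tick += 1                             # otherwise the sender waits this tick
    return received, tick

data, ticks_used = transfer(["a", "b", "c"], device_ticks_per_item=3)
print(data, ticks_used)
```

With a device that needs 3 ticks per item, the transfer is paced by the device rather than by the processor, which is the point of the handshake.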

The Bus interconnection Scheme
1. Bus is a connecting path for several devices of a computer
2. In addition to the lines that carry the data, the bus must have
lines for address and control purposes.

Single bus structure
• The simplest way to interconnect functional units is to use a single bus, as shown below.
• All units are connected to this bus. The bus can be used for only one transfer at a time. Bus
control lines are used to arbitrate multiple requests for use of the bus.
ADVANTAGE
• Low cost and flexibility for attaching peripheral devices
DISADVANTAGE
• Low performance, because only one transfer can take place at a time
• Scalability: As computer systems become more complex and require higher bandwidth for
data transfer, a single bus structure may struggle to scale efficiently.
• Contention: Contention for the bus can occur when multiple components attempt to access
it simultaneously, leading to delays and potential performance issues.

Traditional / Multiple bus Structure:
• Advantages: better performance, scalable, less contention
• Disadvantage: increased cost and complexity.

Traditional / Multiple bus Structure:
• There is a local bus that connects the processor to cache memory and that may
support one or more local devices.
• There is also a cache memory controller that connects this cache not only to this
local bus but also to the system bus. The main memory modules are attached to the
system bus.
• I/O transfers to and from the main memory across the system bus do not
interfere with the processor’s activity.
• An expansion bus interface buffers data transfers between the system bus and
the I/O controllers on the expansion bus.
• I/O devices that might be attached to the expansion bus include: Network cards
(LAN), SCSI (Small Computer System Interface), Modem, etc..

Basic Processing Unit
• A computing task consists of a series of operations specified by a sequence of machine-
language instructions that constitute a program.
• The processor fetches one instruction at a time and performs the operation specified.
Instructions are fetched from successive memory locations until a branch or a jump
instruction is encountered.
• The processor uses the program counter, PC, to keep track of the address of the next
instruction to be fetched and executed.
• After fetching an instruction, the contents of the PC are updated to point to the next
instruction in sequence. A branch instruction may cause a different value to be loaded
into the PC.
• When an instruction is fetched, it is placed in the instruction register, IR, from where it is
interpreted, or decoded, by the processor’s control circuitry. The IR holds the instruction
until its execution is completed.
• Consider a 32-bit RISC-style instruction set architecture.

• To execute an instruction, the processor has to perform the following steps:
1. Fetch the contents of the memory location pointed to by the PC. The
contents of this location are the instruction to be executed; hence they are
loaded into the IR.
• In register transfer notation, the required action is IR ← [[PC]]
2. Increment the PC to point to the next instruction. Assuming that the
memory is byte addressable, the PC is incremented by 4; that is, PC ← [PC] + 4
3. Carry out the operation specified by the instruction in the IR.
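The first two steps, IR ← [[PC]] and PC ← [PC] + 4, can be sketched for a byte-addressable memory holding 32-bit instruction words. Little-endian byte order and the two word values stored in memory are assumptions chosen for illustration.

```python
# Byte-addressable memory holding two 32-bit instruction words.
memory = bytearray(16)
memory[0:4] = (0xE0810002).to_bytes(4, "little")
memory[4:8] = (0xE0423001).to_bytes(4, "little")

def fetch(pc):
    """IR <- [[PC]], then PC <- [PC] + 4. Returns the pair (ir, new_pc)."""
    ir = int.from_bytes(memory[pc:pc + 4], "little")   # read 4 bytes = one word
    return ir, pc + 4                                   # next word is 4 bytes on

ir, pc = fetch(0)
ir2, pc2 = fetch(pc)
print(hex(ir), pc, hex(ir2), pc2)
```

The increment is 4 rather than 1 because each 32-bit instruction occupies four consecutive byte addresses.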

• The operation specified by an instruction can be carried out by performing one
or more of the following actions:
• Read the contents of a given memory location and load them into a processor register.
• Read data from one or more processor registers.
• Perform an arithmetic or logic operation and place the result into a processor register.
• Store data from a processor register into a given memory location.
• The processor communicates with the memory through the processor- memory
interface, which transfers data from and to the memory during Read and Write
operations.
• The instruction address generator updates the contents of the PC after every
instruction is fetched. The register file is a memory unit whose storage locations are
organized to form the processor’s general-purpose registers.

• During execution, the contents of the registers named in an
instruction that performs an arithmetic or logic operation are
sent to the arithmetic and logic unit (ALU), which performs the
required computation.
• The results of the computation are stored in a register in the
register file.
• The clock period, which is the time between two successive rising edges, must be long enough to allow
the combinational circuit to produce the correct result.

RISC and ARM Design Philosophy
The RISC Design Philosophy
• Instructions – reduced number and simpler
• Pipeline
• Registers – large number of general purpose registers (store data or address)
• Load/Store architecture – any data in memory that is to be processed is first
moved to one or more registers and then processed.
ARM Design Philosophy
• Power efficiency
• High code density
• Memory footprint/ Die area
• Hardware Debug technology

The RISC Design Philosophy

Nomenclature
• ARM7TDMI-S

ARM7TDMI Features
• 32 bit data bus/ ALU
• 32 bit instructions/ Address bus
• Aligned memory
• Von Neumann architecture
• 3-stage pipeline
• 37 registers- 32 bit each
• Load- store Model
• 7 operating modes
• 7 exceptions
• 7 addressing modes
• 3 data formats

ARM ISA Features
• ARM ISA differs from pure RISC
• Variable execution cycle for certain instructions
• In-line barrel shifter leading to more complex instructions.
• Thumb instruction set
• Conditional execution
• Enhanced instructions with DSP extension

Data Sizes and Instruction Sets
■ The ARM is a 32-bit architecture.
■ When used in relation to the ARM:
■ Byte means 8 bits
■ Halfword means 16 bits (two bytes)
■ Word means 32 bits (four bytes)
■ Most ARM cores implement two instruction sets
■ 32-bit ARM Instruction Set
■ 16-bit Thumb Instruction Set
■ Jazelle cores can also execute Java bytecode

ARM core Dataflow model
• MOVS r7, r5, LSL #2
• MLA{<cond>}{S} R0,R1,R2,R3
• LDR r0, [r1, #4]!
• STRH r0,[r1,#0x4]!
• LDRSB r0,[r1]

Registers
What is a register?
• data holding places that are part of the computer processor
• high-speed memory storing units.
• memory locations that can be accessed by the CPU directly
Difference between memory and register
• Registers hold the data and instructions that the CPU is currently processing.
• Memory stores the data and instructions that the processor may require
during operation.

Registers (contd.)
• ARM has 37 registers (all are 32-bits long)
• 1 dedicated program counter
• 1 dedicated current program status register (CPSR)
• 5 dedicated saved program status registers (SPSR)
• 30 general purpose registers
• Out of 37, up to 18 registers are active at any one time
• 16 data registers (r0-r15), which hold either data or an address
• 2 processor status registers (cpsr and spsr)
• r13 : stack pointer
• r14: link register
• r15: program counter

Registers (contd.)
• Register r13 :
• used as the stack pointer (sp)
• stores the head of the stack in the current processor mode.
• Register r14
• the link register (lr)
• where the core puts the return address whenever it calls a subroutine
• Register r15:
• is the program counter (pc)
• the address of the next instruction to be fetched by the processor
• These registers are distributed in several register banks, their usage depends on
the mode in which the ARM processor is operated

Banked Registers
• Banked registers are hidden from a program at different times; they are
identified by the shading in the diagram
• Available only when the processor is in a particular mode
• Mode can be selected by writing directly to the mode bits of the cpsr (the core must
be in a privileged mode)
• Mode can also be changed by hardware when the core responds to an exception
or interrupt
• A banked register maps one-to-one onto a user mode register
• If the processor mode is changed, a banked register from the new mode
replaces the existing register
• The Saved Program Status Register (SPSR) stores the value of the CPSR at the moment an
exception is taken, so that the CPSR can be restored after the exception is handled.
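The banking scheme can be checked by counting physical registers. The sketch below maps each register name a program sees onto a physical register for the current mode. The naming style (r13_irq, spsr_svc and so on) and the function name are illustrative assumptions; system mode is taken to share the user mode registers.

```python
SHARED = [f"r{i}" for i in range(8)]                # r0-r7: never banked
HIGH = [f"r{i}" for i in range(8, 13)]              # r8-r12: banked in FIQ only
BANKED_MODES = ["fiq", "irq", "svc", "abt", "und"]  # modes with own r13, r14, spsr
MODES = ["usr", "sys"] + BANKED_MODES

def physical(mode, reg):
    """Map a register name seen by the program to a physical register."""
    if reg in HIGH and mode == "fiq":
        return f"{reg}_fiq"
    if reg in ("r13", "r14") and mode in BANKED_MODES:
        return f"{reg}_{mode}"
    if reg == "spsr":
        if mode not in BANKED_MODES:
            raise ValueError("no spsr in user or system mode")
        return f"spsr_{mode}"
    return reg                                      # shared registers, r15, cpsr

visible = SHARED + HIGH + ["r13", "r14", "r15", "cpsr"]
all_physical = {physical(m, r) for m in MODES for r in visible}
all_physical |= {physical(m, "spsr") for m in BANKED_MODES}

print(len(all_physical))
```

The count comes out to 37, matching the register total quoted on the earlier slide: 30 general purpose registers, the program counter, the cpsr and five spsr copies.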

• Exceptions and interrupts suspend the normal execution of sequential instructions and jump to a specific
location.
• The following exceptions and interrupts cause a mode change:
• Reset
• Interrupt request
• Fast interrupt request
• Software interrupt
• Data abort
• Prefetch abort
• Undefined instruction
• A new register appears in interrupt request mode: the saved program status register (spsr), which
stores the previous mode's cpsr
• The spsr can only be modified and read in a privileged mode. There is no spsr available in user mode.
• The cpsr is not copied into the spsr when a mode change is forced by a program writing directly to the cpsr.
• The saving of the cpsr only occurs when an exception or interrupt is raised.

Current Program Status Register
• Used to monitor and control internal operations. It is a 32-bit register that resides in the register file.
• The CPSR is divided into four fields, each 8 bits wide:
• Flags
• Status
• Extension
• Control
• The control field contains the processor mode, state, and interrupt mask bits.
• The flags field contains the condition flags.
• Some ARM processor cores have extra bits allocated.
• The J bit, in the flags field, is used in Jazelle-enabled processors

Interrupt Masks
• Used to stop specific interrupt requests from interrupting the processor
• Two interrupt request levels in the ARM core
• Interrupt Request (IRQ)
• Fast Interrupt Request (FIQ)
• CPSR: 2 interrupt mask bits
• I: when set to 1, it masks requests made by IRQ
• F: when set to 1, it masks requests made by FIQ
Conditional Flags
There are four conditional flags in the ARM7TDMI.
They are present in the CPSR; the flag bits are
⚫ N: Result is Negative
⚫ Z: Zero flag
⚫ C: Carry Flag
⚫ V: Overflow Flag

Condition Flags
• Updated by comparisons and by the results of ALU operations
• Data processing instructions update the flags only when they carry the S suffix; comparison instructions always update them
• E.g.: SUBS, when executed, sets Z=1 if the result is zero
• Q: used in cores with DSP extensions
• Indicates an overflow/saturation caused by an enhanced DSP instruction
• It is a sticky flag: it can be set only by hardware
• It can be cleared only by writing to the CPSR directly
• ARM instructions support conditional execution
• It is based on the values stored in the condition flags [refer to the table on the next slide]
Note 1:
• When a bit is 1, it is shown as a capital letter
• When a bit is 0, it is shown as a lower-case letter
Figure: CPSR with both Jazelle and DSP extensions set
Note 2:
• Condition flags: a capital letter
indicates that the flag is set
• Interrupts: a capital letter indicates that the
interrupt is disabled/masked
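The capital/lower-case notation in the notes can be generated from a raw CPSR value. The bit positions used below (N=31, Z=30, C=29, V=28, Q=27, J=24, I=7, F=6, T=5, mode in bits 4 to 0) are the standard ARM layout; the output string format, the function name and the example value are assumptions made for this sketch.

```python
MODES = {0b10000: "USR", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "SVC",
         0b10111: "ABT", 0b11011: "UND", 0b11111: "SYS"}
# (letter, bit position) pairs, most significant first.
FLAG_BITS = [("n", 31), ("z", 30), ("c", 29), ("v", 28), ("q", 27), ("j", 24)]
CONTROL_BITS = [("i", 7), ("f", 6), ("t", 5)]

def describe_cpsr(cpsr):
    """Capital letter when the bit is 1, lower case when it is 0, then the mode."""
    letters = ""
    for letter, bit in FLAG_BITS + CONTROL_BITS:
        letters += letter.upper() if (cpsr >> bit) & 1 else letter
    return f"{letters}_{MODES[cpsr & 0x1F]}"

# Example value: C flag set, FIQ masked, supervisor mode.
example = (1 << 29) | (1 << 6) | 0b10011
print(describe_cpsr(example))
```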

Conditional Execution
• Controls whether or not the core will
execute an instruction
• Before execution, the processor compares the
condition attribute with the flags in the CPSR
• If they match, the instruction is executed
• If not, the instruction is ignored
• The condition attribute is postfixed to the
instruction mnemonic [refer to the table]
• If no condition attribute is present, the default is
AL (always)
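The check the core performs before executing an instruction can be sketched as a lookup from condition suffix to a test on the N, Z, C and V flags. The flag tests follow the standard ARM condition codes; the dictionary layout and the function name should_execute are illustrative assumptions.

```python
# Each entry tests the four condition flags (each 0 or 1).
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,             # equal
    "NE": lambda n, z, c, v: z == 0,             # not equal
    "CS": lambda n, z, c, v: c == 1,             # carry set
    "CC": lambda n, z, c, v: c == 0,             # carry clear
    "MI": lambda n, z, c, v: n == 1,             # negative
    "PL": lambda n, z, c, v: n == 0,             # positive or zero
    "VS": lambda n, z, c, v: v == 1,             # overflow
    "VC": lambda n, z, c, v: v == 0,             # no overflow
    "HI": lambda n, z, c, v: c == 1 and z == 0,  # unsigned higher
    "LS": lambda n, z, c, v: c == 0 or z == 1,   # unsigned lower or same
    "GE": lambda n, z, c, v: n == v,             # signed >=
    "LT": lambda n, z, c, v: n != v,             # signed <
    "GT": lambda n, z, c, v: z == 0 and n == v,  # signed >
    "LE": lambda n, z, c, v: z == 1 or n != v,   # signed <=
    "AL": lambda n, z, c, v: True,               # always
}

def should_execute(suffix, n, z, c, v):
    """True if an instruction with this condition suffix would execute."""
    return CONDITIONS[suffix or "AL"](n, z, c, v)

# ADDEQ executes only when Z is set; a plain ADD defaults to AL.
print(should_execute("EQ", n=0, z=1, c=0, v=0))
print(should_execute("", n=0, z=0, c=0, v=0))
```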

On power-up, the processor by default operates in supervisor mode, which is a privileged mode.

Processor Modes
The processor mode determines which registers are active and the access rights to
the cpsr register itself.
• Each processor mode is either
privileged or nonprivileged
• A privileged mode allows full
read-write access to the cpsr
• A nonprivileged mode allows only
read access to the
control field in the cpsr,
but still allows read-write access
to the condition flags

State and Instruction Sets
• State defines which instruction set is being executed
• Selected using the control bits of the CPSR register
• 3 states:
• ARM: default state, selected when T=J=0; 32-bit ARM instructions are executed
• Thumb: selected when T=1; 16-bit Thumb instructions are executed
• Jazelle: selected when J=1; the 8-bit Jazelle instruction set is selected; used to execute Java bytecodes
• States can be changed by executing a branch instruction

Pipelining in ARM7TDMI
• ARM devices rely on pipelining because the RISC philosophy keeps the hardware simple and places the emphasis on compiler complexity.
• Each stage is equivalent to 1 cycle, that is n stages = n cycles.
• ARM7 uses 3 stage pipeline
• Pipeline speeds up the execution;
• Next instruction is fetched while the other instructions are being decoded and executed
• The pipeline stages are
• FETCH: loads instruction from memory to instruction pipeline
• DECODE : identifies instruction to be executed
• EXECUTE :processes the instruction and writes the result back to a register

Pipelining in ARM7TDMI
Three instructions are in the pipeline. Instructions are placed in the pipeline
sequentially.
• Cycle 1: the core fetches the ADD instruction from memory and puts it in the instruction pipeline
• Cycle 2: the core fetches the SUB instruction and decodes the ADD instruction
• Cycle 3: the core fetches the CMP instruction, decodes the SUB instruction and
executes the ADD instruction
• This procedure is called FILLING THE PIPELINE
Once filled, the pipeline allows the core to execute an instruction every cycle.
Latency is 3 cycles, but throughput is one instruction per cycle.
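Filling the pipeline can be tabulated cycle by cycle. The sketch below assumes an ideal 3-stage pipeline with no stalls or branches; the function name and the tuple layout are illustrative.

```python
def pipeline_trace(instructions):
    """Return one (fetch, decode, execute) tuple per cycle for a 3-stage pipeline."""
    def at(index):
        # Instruction occupying a stage, or None when that stage is empty.
        return instructions[index] if 0 <= index < len(instructions) else None

    cycles = len(instructions) + 2      # two extra cycles drain the pipeline
    return [(at(c), at(c - 1), at(c - 2)) for c in range(cycles)]

trace = pipeline_trace(["ADD", "SUB", "CMP"])
for number, (fetch, decode, execute) in enumerate(trace, start=1):
    print(f"cycle {number}: fetch={fetch} decode={decode} execute={execute}")
```

The first instruction reaches the execute stage in cycle 3 (the latency); after that, one instruction completes in every cycle.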

EXTRAS

Barrel Shifter
A barrel shifter is a digital circuit that can shift a binary number by a specified number
of bits in one clock cycle.
• Barrel shifter can be implemented by a combination of multiplexers
• 2 types – arithmetic and logical shifter
A few examples of barrel shifter applications:
• In Digital Signal Processing, barrel shifters are used to perform fast multiplication
and division operations. For example, in a FIR filter implementation, a barrel shifter
can be used to shift the filter coefficients based on the filter order.
• In Cryptography, barrel shifters are used to perform bitwise operations, such as
encryption and decryption. For example, a barrel shifter can be used to perform a
circular shift on a binary value to improve the security of the encryption algorithm.
• In Microprocessor Architectures, barrel shifters are used to shift the contents of
registers, allowing for efficient data manipulation. For example, in the ARM
architecture, the barrel shifter is used to perform shift and rotate operations on the
contents of registers.
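The logical, arithmetic and rotate operations mentioned above can be modelled on 32-bit values as follows. This is a behavioural sketch for shift amounts from 0 to 31, not a description of the multiplexer hardware; the function names follow the usual ARM shift mnemonics.

```python
MASK = 0xFFFFFFFF   # 32-bit register width

def lsl(x, n):
    """Logical shift left: zeros enter from the right."""
    return (x << n) & MASK

def lsr(x, n):
    """Logical shift right: zeros enter from the left."""
    return (x & MASK) >> n

def asr(x, n):
    """Arithmetic shift right: the sign bit is copied into the vacated bits."""
    x &= MASK
    shifted = x >> n
    if x >> 31:                                # negative in two's complement
        shifted |= (MASK << (32 - n)) & MASK   # fill the top n bits with ones
    return shifted

def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    x &= MASK
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK

print(hex(lsl(5, 2)), hex(lsr(0x80000000, 4)),
      hex(asr(0x80000000, 4)), hex(ror(0x1, 1)))
```

The difference between the logical and arithmetic right shift shows up only for values with the top bit set, where the arithmetic form preserves the sign.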

Extras Load-store architecture
• A load-store architecture is a type of computer architecture where all data
processing operations (such as arithmetic, logical, and control operations) are
performed only on data that has been loaded from memory into registers; results are placed in
registers and written back to memory by explicit store operations. In other words, the only
operations that directly access memory are load and store operations.
• A CISC machine is typically a register-memory architecture, in which operands may come
from registers or memory; a RISC machine, by contrast, is a register-register (load-store)
architecture.
