ARM 2007 liangalei@sjtu.edu.cn
ARM Embedded System
Optimization Technique in Embedded System (ARM)
Textbook
• ARM System Developer’s Guide: Designing
and Optimizing System Software
– Chinese edition: ARM嵌入式系统开发:软件设计与优化
• AUTHORS
– Andrew N. Sloss, Dominic Symes, Chris Wright
– Chinese translation by Shen Jianhua (沈建华)
• PUBLISHER
– Beihang University Press (北京航空航天大学出版社), 2005
The Days of ARM
• ARM’s designers have come a long way from the first
ARM1 prototype in 1985.
• Over one billion ARM processors had been shipped
worldwide by the end of 2001.
– simple and powerful original design.
• In fact, the ARM core is not a single core, but a
whole family of designs sharing similar design
principles and a common instruction set.
– ARM7TDMI: one of ARM’s most successful cores.
» 120 Dhrystone MIPS (a small benchmarking program)
Current and Future of ARM
• Cortex-M3
– Thumb-2 instruction set
• MPCore
– Symmetric (balanced) multiprocessing, cache coherency, L2 cache
– 1 to 4 ARM11 cores
• OptimoDE technology
– Configurable VLIW engine that works alongside an ARM core
– Targets algorithms such as MPEG-4 and H.264
Brief History of ARM Core
• 1985, First ARM (ARM1)
• 1995, ARM7TDMI
– Most successful ARM core
– 3-stage pipeline, 120 Dhrystone MIPS
• 1997, ARM9
– 5-stage pipeline
– Harvard (I+D cache), MMU (OS’s VM)
• 1999, ARM10
– 6-stage pipeline
– VFP (Vector Floating Point), with a 7-stage pipeline
• 2003, ARM11
– 8-stage pipeline
Versions of ARM Architecture
• ARMv1
– 26-bit addressing
• ARMv2
– 32-bit multiplier, coprocessor support
• ARMv3
– 32-bit addressing, cpsr/spsr, MMU, undefined and abort modes
• ARMv4
– Signed, halfword, and byte load/store instructions; system mode
• ARMv5
– Superset of ARMv4T (Thumb), extended multiply and DSP instructions
• ARMv6
– Multiprocessor support instructions, unaligned and mixed-endian data access, SIMD media instructions
Others
• StrongARM
– Co-developed by ARM and Digital Semiconductor
– Now licensed to Intel
• XScale
– Up to 1 GHz, ARMv5TE
• SC100
– Security, low power
– Based on ARM7TDMI, with an MPU
Nomenclature of ARM
• E.g. ARM7TDMI
– T: Thumb instruction set
– D: Debug support (JTAG)
– M: Fast (enhanced) multiplier
– I: EmbeddedICE macrocell
– E: Enhanced instructions (implies TDMI)
– J: Jazelle
– F: Vector floating-point unit
– S: Synthesizable (soft core)
ARM, a RISC ?
• Philosophy of RISC design
– Instructions
» RISC processors have a reduced number of instruction
classes. These classes provide simple operations that can
each execute in a single cycle. The compiler or
programmer synthesizes complicated operations (e.g. a
divide operation) by combining several simple instructions.
– Pipelines
» The processing of instructions is broken down into smaller
units (stages) that can be executed in parallel by pipelines.
There is no need for an instruction to be executed by a
mini-program (microcode), as on CISC processors.
– Registers
» RISC machines have a large set of general-purpose registers (GPRs).
– Load/store architecture
» Memory accesses are separated from data processing.
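To make the "synthesized from simple instructions" point concrete, here is a minimal Python sketch of an unsigned 32-bit divide built only from shifts, compares, and subtracts, the kind of operations a RISC core executes in a single cycle. The function name `udiv32` is an illustrative assumption; real ARM run-time libraries use more heavily optimized routines.

```python
MASK32 = 0xFFFFFFFF

def udiv32(numerator, denominator):
    """Restoring (shift-and-subtract) division on 32-bit unsigned values.

    Returns (quotient, remainder). Illustrative model only.
    """
    numerator &= MASK32
    denominator &= MASK32
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    # Walk the numerator from its most significant bit down to bit 0.
    for bit in range(31, -1, -1):
        # Shift the next numerator bit into the running remainder.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        quotient <<= 1
        # Compare and conditionally subtract: one quotient bit per step.
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1
    return quotient, remainder
```

For example, `udiv32(100, 7)` produces the quotient and remainder of 100 / 7 using 32 shift-compare-subtract steps.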
ARM, a RISC ?
• The ARM Design Philosophy
– A number of physical constraints have
driven the ARM processor design:
» Low power consumption: small core;
» Limited memory: high code density;
» Limited die area: simple hardware execution units
– The ARM core is not a pure RISC architecture because
of the constraints of its primary application – the
embedded system.
• Simplicity favors regularity ?
– These design rules allow a RISC processor to be
simpler, and thus the core can operate at higher clock
frequencies.
Instruction Set for Embedded Systems
• The ARM instruction set differs from the pure RISC
definition in several ways
– that make the ARM suitable for embedded applications
» Variable cycle execution for certain instructions
• Not every ARM instruction executes in a single cycle.
» More complex instructions (inline barrel shifter)
• The barrel shifter expands the capability of many instructions, improving core
performance and code density.
» Thumb 16-bit instruction set
• Thumb instructions improve code density by about 30%.
» Conditional execution
• Improves performance and code density by reducing branches.
» Enhanced instructions
• DSP instructions were added to the standard ARM instruction set to support
fast 16x16-bit multiply operations and saturation.
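As a rough illustration of the inline barrel shifter, the Python sketch below models how the second operand of a data-processing instruction can be shifted before it reaches the ALU, so that something like `ADD r0, r1, r2, LSL #2` costs one instruction. The helper names are hypothetical, and the model assumes immediate shift amounts of 0 to 31 with 0 meaning "no shift"; the special encodings of the real instruction set are not modelled.

```python
MASK32 = 0xFFFFFFFF

def barrel_shift(value, kind, amount):
    """Shift a 32-bit value the way the operand-2 shifter would (simplified)."""
    value &= MASK32
    if not 0 <= amount <= 31:
        raise ValueError("this sketch only models shift amounts 0..31")
    if amount == 0:
        return value
    if kind == "LSL":   # logical shift left
        return (value << amount) & MASK32
    if kind == "LSR":   # logical shift right, zero fill
        return value >> amount
    if kind == "ASR":   # arithmetic shift right, sign fill
        shifted = value >> amount
        if value & 0x80000000:
            shifted |= (MASK32 << (32 - amount)) & MASK32
        return shifted
    if kind == "ROR":   # rotate right
        return ((value >> amount) | (value << (32 - amount))) & MASK32
    raise ValueError("unknown shift kind: " + kind)

def add_shifted(rn, rm, kind, amount):
    """Model of ADD Rd, Rn, Rm, <shift> #amount (flags not modelled)."""
    return (rn + barrel_shift(rm, kind, amount)) & MASK32
```

`add_shifted(10, 3, "LSL", 2)` computes 10 + (3 << 2), which is how an array index can be scaled and added to a base address in a single step.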
Example: SoC with ARM core
• [Figure: block diagram of an SoC built around an ARM core]
Units inside SoC
• SoC is an embedded device.
• We can separate the device into four main
components:
– ARM Processor: controls the embedded device.
» An ARM processor comprises a core (the execution
engine that processes instructions and manipulates
data), plus the surrounding components (MMU and
caches) that interface it with a bus.
– Controllers: coordinate important functional blocks (e.g.
interrupt and memory controllers)
– Peripherals: USB, LCD, etc.
– Bus: is used to communicate between different parts of
the device.
1.3.1 ARM Bus Technology
• Embedded systems use different bus technologies
than those designed for x86 PCs.
– Embedded devices use an on-chip bus
– The core is a bus master, which initiates data transfers.
• A bus has two architecture levels
– The first is a physical level that covers the electrical
characteristics and bus width (16, 32, or 64 bits).
– The second level deals with protocol: the logical rules
governing the communication between processor and peripheral.
• ARM seldom implements the electrical characteristics
of the bus, but it routinely specifies the bus
protocol.
1.3.2 AMBA
• AMBA: Advanced Microcontroller Bus Architecture
– Introduced in 1996 and widely adopted as the on-chip bus
architecture for ARM processors.
– The first AMBA buses introduced were
» ASB: Advanced System Bus, and
» APB: Advanced Peripheral Bus
– Later, ARM introduced another bus design
» AHB: Advanced High-performance Bus
• Using AMBA,
– peripheral designers can reuse the same design on multiple
projects (with different processor architectures).
– Peripherals become plug-and-play
AHB
• AHB
– provides higher data throughput than ASB because
» it uses a centralized multiplexed bus scheme
(rather than ASB’s bidirectional bus);
» this change allows the AHB bus to run at higher
clock speeds;
» it supports 64- and 128-bit widths.
• Two variations on the AHB bus
» Multi-layer AHB: allows multiple active bus masters
» AHB-Lite: only one master
1.3.3 Memory
• Memory is necessary
– An embedded system has to have some form of memory
to store and execute code.
• You have to consider
– price, performance, and power consumption
• Specific memory characteristics
– hierarchy, width, and type
Memory Hierarchy
• Cache
– is used to speed up data transfer between the core and
main memory (DRAM);
• But,
– it makes performance less predictable;
– it does not help real-time system response;
» Note that many small embedded systems do not
require the benefit of a cache.
• * Cache
– Elastic buffer (absorbs the speed difference between core and bus);
– Width adaptation (e.g., 32-bit core vs. 16-bit bus)
Memory Types
• DRAM
– the most commonly used RAM for devices;
– Dynamic: its storage cells must be refreshed and
given a new electronic charge every few milliseconds, so
you need to set up a DRAM controller before using the
memory.
• SRAM
– is faster than the more traditional DRAM (SRAM does
not require a pause between data accesses).
• SDRAM
– is one of many subcategories of DRAM.
– accesses are pipelined and data is transferred in bursts.
1.3.4 Peripherals
• Embedded systems that interact with the outside world
need some form of peripheral device.
– Peripherals range from a simple serial communication device to a more
complex 802.11 wireless device.
• All ARM peripherals are memory mapped – the programming
interface is a set of memory-addressed registers.
• Controllers are specialized peripherals that implement
higher levels of functionality within an embedded system.
– Two important types of controllers are
» Memory controller
» Interrupt controller
• Normal interrupt controller
• Vectored interrupt controller
– Priority
– Simple interrupt dispatch
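The idea of a memory-mapped programming interface can be sketched with a small Python model: a peripheral is just a block of 32-bit registers that software reads and writes at fixed addresses. The class name, the base address `UART_BASE`, and the register offsets are all hypothetical and chosen only for illustration; they do not describe any real device.

```python
import struct

class MemoryMappedDevice:
    """A block of little-endian 32-bit registers at a fixed base address."""

    def __init__(self, base, size):
        self.base = base
        self.regs = bytearray(size)

    def _offset(self, address):
        offset = address - self.base
        # Only word-aligned accesses inside the register window are allowed.
        if offset < 0 or offset + 4 > len(self.regs) or offset % 4 != 0:
            raise ValueError("bad register address: 0x%08X" % address)
        return offset

    def write32(self, address, value):
        struct.pack_into("<I", self.regs, self._offset(address), value & 0xFFFFFFFF)

    def read32(self, address):
        return struct.unpack_from("<I", self.regs, self._offset(address))[0]

# Hypothetical register map for an imaginary serial port.
UART_BASE = 0x10000000
UART_CTRL = UART_BASE + 0x00
UART_BAUD = UART_BASE + 0x04

uart = MemoryMappedDevice(UART_BASE, 16)
uart.write32(UART_BAUD, 115200)   # "configure" the device with a plain store
```

On real hardware the same accesses would be ordinary load and store instructions to the peripheral's address range.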
Memory Controllers
• Memory controllers connect different
types of memory to the processor bus.
– On power-up a memory controller is configured in
hardware to allow certain memory devices to be active.
These memory devices allow the initialization code to
be executed.
– Some memory devices must be set up by software.
» e.g. when using DRAM, you first have to set up
the memory timings and refresh rate before it can
be accessed.
Interrupt Controller
• When a peripheral or device requires attention,
– it raises an interrupt to the processor.
• An interrupt controller
– provides a programmable governing policy
• There are two types of interrupt controller available for
the ARM processor
– Standard interrupt controller
» Sends an interrupt signal to the core; can be programmed to ignore or mask
an individual device or a set of devices.
» Its interrupt handler determines which device requires service.
– Vector interrupt controller (VIC)
» Associates a “priority” and a “handler address” with each interrupt.
» Depending on its type, the VIC will either call the standard interrupt
exception handler (which loads the handler address from the VIC) or cause
the core to jump to the handler for the device directly.
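The difference between the two controller types can be sketched in Python. This is a behavioural model under stated assumptions, not a description of any specific ARM controller: the class and method names are invented, there are 32 interrupt sources, and a lower priority number is assumed to mean a more urgent interrupt.

```python
class StandardInterruptController:
    """Masks sources and reports which ones are pending; software decides."""

    def __init__(self):
        self.pending = 0   # one bit per interrupt source
        self.enabled = 0   # one bit per unmasked source

    def enable(self, source):
        self.enabled |= 1 << source

    def raise_irq(self, source):
        self.pending |= 1 << source

    def irq_asserted(self):
        return (self.pending & self.enabled) != 0

    def active_sources(self):
        # The handler has to scan this list to find the device to service.
        active = self.pending & self.enabled
        return [n for n in range(32) if (active >> n) & 1]


class VectoredInterruptController(StandardInterruptController):
    """Adds a priority and a handler address per source."""

    def __init__(self):
        super().__init__()
        self.vectors = {}  # source -> (priority, handler_address)

    def set_vector(self, source, priority, handler_address):
        self.vectors[source] = (priority, handler_address)

    def highest_priority_handler(self):
        candidates = [(self.vectors[s][0], s)
                      for s in self.active_sources() if s in self.vectors]
        if not candidates:
            return None
        _, source = min(candidates)   # assumption: lowest number wins
        return self.vectors[source][1]
```

With the standard controller the software handler walks `active_sources()`; with the vectored one it can read the handler address for the most urgent source directly.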
1.4 Embedded System Software
• An embedded system needs software to
drive it.
• There are four typical software components
required to control an embedded device.
» Each software component in the stack uses a
higher level of abstraction to separate the code
from the hardware device.
– Initialization code (e.g. boot loader)
– Operating system
– Device drivers
– Application
[Figure: software abstraction layers: Hardware, Initialization, Device Drivers, Operating System, Application]
Initialization (BOOT) Code
• Initialization code (or boot code)
– takes the processor from the reset state to a state in which
the operating system can run.
» Configures the memory controller and caches
» Initializes some devices
» * In a simple system, a debug monitor may replace the OS
• Three phases
– Initial hardware configuration
» Satisfy the requirements of the booted image
• e.g. re-organization of the memory map
– Diagnostics
» Fault identification and isolation
– Booting
» Loading an image and handing control over to the image
» The boot process may be complicated if the system must
boot different operating systems or different versions of
the same operating system.
Example: Memory Reorganization
• Boot from ROM
• Then remap RAM to address 0
– makes the interrupt vector table (IVT) easy to modify
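A minimal Python sketch of the remap idea follows. It is a deliberately simplified model with an invented class name: only the low region starting at address 0 is modelled, and the copy of the ROM contents into RAM that boot code would normally perform is folded into the `remap()` call.

```python
class RemappableMemory:
    """Low memory that decodes to ROM at reset and to RAM after remapping."""

    def __init__(self, boot_rom):
        self.rom = bytes(boot_rom)            # read-only boot image
        self.ram = bytearray(len(self.rom))   # same-sized RAM window
        self.remapped = False

    def remap(self):
        # Assumption: boot code copies the vectors into RAM, then flips
        # the remap control so RAM appears at address 0.
        self.ram[:] = self.rom
        self.remapped = True

    def read8(self, address):
        return self.ram[address] if self.remapped else self.rom[address]

    def write8(self, address, value):
        if not self.remapped:
            raise PermissionError("address decodes to ROM before remap")
        self.ram[address] = value & 0xFF
```

Before `remap()` a write to the vector area fails because it targets ROM; afterwards the same address is writable, which is what makes the vector table easy to change at run time.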
Operating System
• OS organizes the system resources
– peripherals, memory, and processing time
» With an OS controlling these resources, they can
be efficiently used by different applications running
within the OS environment.
• ARM processors support over 50 operating systems
– Two main categories: RTOS and platform OS
» RTOS: guarantees response times to events
» Platform OS: requires an MMU and tends to have
secondary storage (for large applications).
• N.B. these two categories of OS are not mutually
exclusive.
– ARM has developed a set of processor cores that
specifically target each category.
Applications
• The OS schedules applications
– code dedicated to handling a particular task.
• ARM processors are found in numerous
market segments, including
– networking, automotive, mobile and consumer devices,
mass storage, and imaging.
• In contrast, ARM processors are not found
in applications that require leading-edge
high performance.
1.5 Summary (1)
• Pure RISC is aimed at high performance
– But ARM uses a modified RISC design philosophy that also
targets
» good code density and low power consumption.
• The Key points in a RISC design philosophy are
– Reducing the complexity of instructions
» improve performance;
– Pipeline
» speed up instruction processing;
– Large register set
» store data near core;
– Load-store architecture;
1.5 Summary (2)
• The ARM design philosophy also incorporates
some non-RISC ideas
– Variable cycle execution for certain instructions
» saves power, area, and code size
– Barrel shifter
» expands the capability of certain instructions
– Thumb 16-bit instruction set
» improves code density
– Conditionally executed instructions
» improve code density and performance
– Enhanced instructions: e.g. DSP
1.5 Summary (3)
• Hardware Components in ARM Processor
– Peripherals
» accessed via memory-mapped registers
– Controllers (a special type of peripheral)
» Higher-level functions: e.g. memory and interrupts.
– AMBA bus
» connects the processor and peripherals
• Software Components
– Initialization Code
– Operating System
– Device Driver
– Application
Data-path in ARM
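• ALU
– Sources: Rn and Rm (optionally shifted);
– Destination: Rd
• Shifter
– Preprocesses Rm, extending the range of data values and
addresses an instruction can form
• Sign-extend
– Converts signed 8-bit and 16-bit values loaded from main memory to 32 bits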
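The sign-extend step on the load path can be written in a few lines of Python. The function name is illustrative; the behaviour corresponds to what a signed byte or halfword load (such as LDRSB or LDRSH) leaves in a 32-bit register.

```python
def sign_extend(value, bits):
    """Sign-extend a `bits`-wide value to a 32-bit register image."""
    value &= (1 << bits) - 1          # keep only the loaded bits
    sign = 1 << (bits - 1)            # position of the sign bit
    # XOR-then-subtract propagates the sign bit into the upper bits.
    return ((value ^ sign) - sign) & 0xFFFFFFFF
```

A byte 0x80 becomes 0xFFFFFF80, while 0x7F stays 0x0000007F.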
Register in ARM
• Orthogonal registers (cf. VAX, PDP-11)
– R0~R13 are orthogonal: if a given instruction
can use R0, it can equally use any of the others.
• Special-purpose registers
– R13 (sp), R14 (lr), R15 (pc)
• CPSR/SPSR
– Condition codes: N, Z, C, V
– Interrupt masks: I (IRQ), F (FIQ)
– Thumb state bit (T)
– Mode (5 bits)
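The CPSR fields listed above can be decoded with simple bit operations. The sketch below uses the classic ARM layout (N, Z, C, V in bits 31 to 28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4 to 0); the function name and the dictionary keys are illustrative choices.

```python
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its named fields."""
    return {
        "N": bool((cpsr >> 31) & 1),
        "Z": bool((cpsr >> 30) & 1),
        "C": bool((cpsr >> 29) & 1),
        "V": bool((cpsr >> 28) & 1),
        "irq_masked": bool((cpsr >> 7) & 1),   # I bit set = IRQ disabled
        "fiq_masked": bool((cpsr >> 6) & 1),   # F bit set = FIQ disabled
        "thumb": bool((cpsr >> 5) & 1),
        "mode": MODES.get(cpsr & 0x1F, "Unknown"),
    }
```

For instance, `decode_cpsr(0x600000D3)` describes Supervisor mode in ARM state with both interrupts masked and the Z and C flags set.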
Instruction Sets in ARM
• Three instruction sets in ARM
– ARM: 32-bit
– Thumb: 16-bit
– Jazelle (closed): 8-bit Java bytecodes (JVM)
» About 60% of bytecodes: executed in hardware
» Remaining 40%: handled in software
Condition Codes
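• Flags: N (negative), Z (zero), C (carry), V (signed
overflow), Q (sticky saturation/overflow).
• Notation: an uppercase letter means the flag is set, a lowercase letter means it is clear.
• EQ = Z, NE = z; HS = C, LO = c
• GE = NV or nv, LT = Nv or nV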
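The condition shorthand above maps directly onto boolean expressions over the flags. The Python sketch below evaluates the standard ARM condition mnemonics and includes a small helper that models the flags produced by a compare; both function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(a, b):
    """Flags (N, Z, C, V) as set by CMP a, b, i.e. by computing a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = bool(result >> 31)
    z = result == 0
    c = a >= b                                    # carry set = no borrow
    v = bool(((a ^ b) & (a ^ result)) >> 31)      # signed overflow
    return n, z, c, v

def condition_passed(cond, n, z, c, v):
    """Evaluate an ARM condition mnemonic against the four flags."""
    table = {
        "EQ": z,            "NE": not z,
        "HS": c,            "LO": not c,
        "MI": n,            "PL": not n,
        "VS": v,            "VC": not v,
        "HI": c and not z,  "LS": (not c) or z,
        "GE": n == v,       "LT": n != v,
        "GT": (not z) and n == v,
        "LE": z or n != v,
        "AL": True,
    }
    return bool(table[cond])
```

Comparing 0xFFFFFFFF with 1 shows why both signed and unsigned conditions exist: HS passes (unsigned 0xFFFFFFFF is larger) and LT also passes (signed -1 is smaller).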
Pipeline
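• ARM7
– 3 stages: Fetch, Decode, Execute
• More stages (deeper pipeline)
– mean more latency and more dependencies between stages
• ARM9 (about 13% more instruction throughput than ARM7)
– 5 stages: Fetch, Decode, Execute, Memory, Write-back
• ARM10 (about 34% more instruction throughput than ARM7)
– 6 stages: Fetch, Issue, Decode, Execute, Memory, Write-back
• Does ARM7 code run on ARM9/10?
– Yes: they keep the same pipeline execution characteristics as ARM7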
About Pipeline
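• Enabling IRQ
– MSR clears the I bit in the CPSR.
– IRQ is enabled only once the MSR instruction completes the third pipeline stage (Execute);
• PC
– In ARM state the pc always points to the instruction being fetched, two instructions (8 bytes) ahead of the one executing;
– This pipeline offset must be taken into account when calculating pc-relative offsets;
• On a branch or a direct update of the pc
– the ARM core flushes the whole pipeline;
– ARM10 uses branch prediction to reduce the cost of these flushes;
– When an IRQ is raised: the instruction in the execute stage runs to
completion; other instructions in the pipeline are flushed.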
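The pc offset matters whenever an address is computed relative to the pc. As a sketch, the Python below models the value an instruction sees when it reads the pc (its own address plus 8 in ARM state, plus 4 in Thumb state) and derives the signed 24-bit word offset stored in an ARM-state B instruction. The function names are illustrative.

```python
def pc_as_read(instruction_address, thumb=False):
    """Value of the pc as seen by the executing instruction."""
    return (instruction_address + (4 if thumb else 8)) & 0xFFFFFFFF

def encode_branch_offset(instruction_address, target):
    """24-bit offset field of an ARM-state B instruction (two's complement)."""
    delta = target - (instruction_address + 8)   # relative to the pc as read
    if delta % 4 != 0:
        raise ValueError("ARM branch targets must be word aligned")
    offset = delta >> 2                          # stored in units of words
    if not -(1 << 23) <= offset < (1 << 23):
        raise ValueError("target out of range for a 24-bit offset")
    return offset & 0xFFFFFF
```

A branch to its own address therefore encodes an offset of -2 words (0xFFFFFE), not 0, because the pc has already moved two instructions ahead.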
Vector Table
• Reset
– the first instruction executed after power-up;
• Undefined instruction (Undef)
– an instruction cannot be decoded;
• Software interrupt (SWI)
– an SWI instruction is executed;
• Prefetch Abort (PABT)
– an instruction fetch from an invalid address;
• Data Abort (DABT)
– a data access to an invalid address;
• IRQ
• FIQ
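The exceptions above occupy fixed, consecutive word slots in the vector table. The Python sketch below records the classic ARM layout (offsets 0x00 to 0x1C, with 0x14 reserved) and the mode entered for each exception; the optional high-vector base of 0xFFFF0000 exists only on some cores, and the function and table names are illustrative.

```python
# (exception, offset from the vector base, mode entered)
VECTOR_TABLE = [
    ("Reset",                 0x00, "Supervisor"),
    ("Undefined instruction", 0x04, "Undefined"),
    ("Software interrupt",    0x08, "Supervisor"),
    ("Prefetch abort",        0x0C, "Abort"),
    ("Data abort",            0x10, "Abort"),
    ("Reserved",              0x14, None),
    ("IRQ",                   0x18, "IRQ"),
    ("FIQ",                   0x1C, "FIQ"),
]

_OFFSETS = {name: offset for name, offset, _ in VECTOR_TABLE}

def vector_address(name, high_vectors=False):
    """Address the core jumps to when the named exception is taken."""
    base = 0xFFFF0000 if high_vectors else 0x00000000
    return base + _OFFSETS[name]
```

Each slot holds a single instruction, typically a branch to the real handler; FIQ sits last so its handler can start directly at 0x1C without a branch.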
Core Extension
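• Cache & TCM
– Unified vs. separate instruction/data caches
– TCM (tightly coupled memory): fast SRAM located close to the core (not accessed over the AMBA bus)
• Memory management interface
– No memory management: for simple embedded systems;
– MPU (Memory Protection Unit): protects a limited number of memory regions;
– MMU: translation tables, fine-grained protection;
• Coprocessor interface
– Extends the core through new instructions or through configuration registers;
– E.g.
» VFP instructions;
» CP15: controls the cache, TCM, and MMU via load/store-like instructions (MRC/MCR).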
ARM Instruction Set
• Data processing
– Data transfer (MOV), arithmetic/logic, compare (CMP), multiply (MUL);
• Branch
– e.g. if-then-else
• Load/Store
• SWI
• MRS/MSR
– Status register (CPSR or SPSR) <-> general-purpose register
• Coprocessor instructions
– CDP (coprocessor data processing), MRC/MCR, LDC/STC
• Constant loading
• ARMv5E extensions
• Conditionally executed instructions
– E.g. ADDEQ r0, r1, r2

More Related Content

PPSX
Lect 2 ARM processor architecture
PPTX
ARM Processor
PPTX
Unit vi (1)
PPTX
Arm architecture chapter2_steve_furber
PDF
Unit II Arm7 Thumb Instruction
PDF
ARM 32-bit Microcontroller Cortex-M3 introduction
PPTX
Advanced Pipelining in ARM Processors.pptx
PDF
Unit II arm 7 Instruction Set
Lect 2 ARM processor architecture
ARM Processor
Unit vi (1)
Arm architecture chapter2_steve_furber
Unit II Arm7 Thumb Instruction
ARM 32-bit Microcontroller Cortex-M3 introduction
Advanced Pipelining in ARM Processors.pptx
Unit II arm 7 Instruction Set

What's hot (20)

PPTX
ARM Processors
PDF
ARM CORTEX M3 PPT
PPT
ARM - Advance RISC Machine
PPTX
CISC & RISC Architecture
PDF
Unit II Arm 7 Introduction
PPTX
Introduction to arm processor
PPT
Pipelining & All Hazards Solution
DOCX
ARM7-ARCHITECTURE
PDF
Computer Organization Lecture Notes
PPTX
ARM- Programmer's Model
PDF
Embedded Systems (18EC62) - ARM Cortex-M3 Instruction Set and Programming (Mo...
DOCX
Hardware-Software Codesign
DOCX
Embedded System
PPTX
BCH Codes
PDF
RTOS for Embedded System Design
PDF
Arm instruction set
PDF
Introduction to arm architecture
PDF
ARM Architecture
PDF
Embedded Firmware Design and Development, and EDLC
PPSX
Lect 3 ARM PROCESSOR ARCHITECTURE
ARM Processors
ARM CORTEX M3 PPT
ARM - Advance RISC Machine
CISC & RISC Architecture
Unit II Arm 7 Introduction
Introduction to arm processor
Pipelining & All Hazards Solution
ARM7-ARCHITECTURE
Computer Organization Lecture Notes
ARM- Programmer's Model
Embedded Systems (18EC62) - ARM Cortex-M3 Instruction Set and Programming (Mo...
Hardware-Software Codesign
Embedded System
BCH Codes
RTOS for Embedded System Design
Arm instruction set
Introduction to arm architecture
ARM Architecture
Embedded Firmware Design and Development, and EDLC
Lect 3 ARM PROCESSOR ARCHITECTURE
Ad

Similar to Arm processor (20)

PPTX
mod1_arm_embedded_systems_ppt_2021_22_odd_oe.pptx
PPTX
ESD Module-4 ES.pptxModule-4 ES.pptxModule-4 ES.pptx
PPTX
Mces MOD 1.pptx
PPTX
18CS44-MODULE1-PPT.pptx
PDF
18CS44-MODULE1-PPT.pdf
PPTX
Microcontroller(18CS44) module 1
PPTX
Module-3 ADVANCED MICROCONTROLLER IMP.pptx
PPTX
MODULE 1 MES.pptx
PPTX
EMBEDDED SYSTEM AND INTERNET OF THINGS.pptx
PPTX
ARM Processor.pptxARM means Advanced RISC Machines.
PPTX
ARM Processor.pptxARM machines have a 32-bit Reduced Instruction Set Computer...
PPTX
18CS44-MES-Module-1.pptx
PPTX
PPT MES class.pptx
PPTX
MES PPT.pptx
PDF
Module-2 Instruction Set Cpus.pdf
PDF
ARM Processor Tutorial
PPTX
ARM Processor architecture
PPSX
LECT 1: ARM PROCESSORS
PPT
ARM-Introduction, registers and processor states.ppt
PPT
A block of logic or data that can be used in making application-specific inte...
mod1_arm_embedded_systems_ppt_2021_22_odd_oe.pptx
ESD Module-4 ES.pptxModule-4 ES.pptxModule-4 ES.pptx
Mces MOD 1.pptx
18CS44-MODULE1-PPT.pptx
18CS44-MODULE1-PPT.pdf
Microcontroller(18CS44) module 1
Module-3 ADVANCED MICROCONTROLLER IMP.pptx
MODULE 1 MES.pptx
EMBEDDED SYSTEM AND INTERNET OF THINGS.pptx
ARM Processor.pptxARM means Advanced RISC Machines.
ARM Processor.pptxARM machines have a 32-bit Reduced Instruction Set Computer...
18CS44-MES-Module-1.pptx
PPT MES class.pptx
MES PPT.pptx
Module-2 Instruction Set Cpus.pdf
ARM Processor Tutorial
ARM Processor architecture
LECT 1: ARM PROCESSORS
ARM-Introduction, registers and processor states.ppt
A block of logic or data that can be used in making application-specific inte...
Ad

More from SHREEHARI WADAWADAGI (18)

PPT
Chapter 15 software product metrics
PPT
Chapter 14 software testing techniques
PPT
Chapter 13 software testing strategies
PPT
Chapter 12 user interface design
PPT
Chapter 21 project management concepts
PPT
Ch 11-component-level-design
PPT
Ch 9-design-engineering
PPT
An introduction to software engineering
PPT
Architectural design
PPTX
Chapter 5 programming concepts iv
PPTX
Chapter 4 programming concepts III
PPTX
Chapter 1 archietecture of 8086
PDF
Brief description of all the interupts
PPTX
Chapter 7 memory & i/o
PPTX
Chapter 6 hardware structure of 8086
PPTX
Chapter 3 programming concepts-ii
PPTX
Chapter 2 programming concepts - I
PPTX
8086 complete guide
Chapter 15 software product metrics
Chapter 14 software testing techniques
Chapter 13 software testing strategies
Chapter 12 user interface design
Chapter 21 project management concepts
Ch 11-component-level-design
Ch 9-design-engineering
An introduction to software engineering
Architectural design
Chapter 5 programming concepts iv
Chapter 4 programming concepts III
Chapter 1 archietecture of 8086
Brief description of all the interupts
Chapter 7 memory & i/o
Chapter 6 hardware structure of 8086
Chapter 3 programming concepts-ii
Chapter 2 programming concepts - I
8086 complete guide

Recently uploaded (20)

PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PDF
Arduino robotics embedded978-1-4302-3184-4.pdf
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PPTX
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
Welding lecture in detail for understanding
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPTX
Lesson 3_Tessellation.pptx finite Mathematics
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PPTX
bas. eng. economics group 4 presentation 1.pptx
PPTX
additive manufacturing of ss316l using mig welding
PPTX
web development for engineering and engineering
PDF
Digital Logic Computer Design lecture notes
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PPT
Project quality management in manufacturing
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Arduino robotics embedded978-1-4302-3184-4.pdf
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Embodied AI: Ushering in the Next Era of Intelligent Systems
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Welding lecture in detail for understanding
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
Lesson 3_Tessellation.pptx finite Mathematics
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
bas. eng. economics group 4 presentation 1.pptx
additive manufacturing of ss316l using mig welding
web development for engineering and engineering
Digital Logic Computer Design lecture notes
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Project quality management in manufacturing

Arm processor

  • 1. ARM 2007 liangalei@sjtu.edu.cn ARM Embedded System Optimization Technique in Embedded System (ARM)
  • 2. ARM 2007 liangalei@sjtu.edu.cn TextBook • ARM System Developer’s Guide ---Designing and Optimizing System Software – ARM 嵌入式系统开发 - 件 与 化软 设计 优 • AUTHOR – Andrew N Sloss, Dominic Symes, Chris Wright – 沈建华 译 • PUBLISHER – 北京航空航天大学出版社, 2005
  • 3. ARM 2007 liangalei@sjtu.edu.cn The Days of ARM • ARM’s designers have come a long way from the first ARM1 prototype in 1985. • Over one billion ARM processors had been shipped worldwide by the end of 2001. – simple and powerful original design. • In fact, the ARM core is not a single core, but a whole family of designs sharing similar design principles and a common instruction set. – ARM7TDMI: one of ARM’s most successful cores. » 120 Dhrystone MIPS (a small benchmarking program)
  • 4. ARM 2007 liangalei@sjtu.edu.cn Current and Future of ARM • Cortext-M3 – Thumb-2 Instruction Set • MPCore – SMP(balanced), Cache consistency, L2 cache – 1~4 ARM11 • OptimoDE technique – Configurable VLIW, Co-work with ARM core – MPEG4, H.264 algorithm
  • 5. ARM 2007 liangalei@sjtu.edu.cn Brief History of ARM Core • 1985, First ARM (ARM1) • 1995, ARM7TDMI – Most successful ARM core – 3-stage pipeline, 120 Dhrystone MIPS • 1997, ARM9 – 5-stage pipeline – Harvard (I+D cache), MMU (OS’s VM) • 1999, ARM10 – 6-stage pipeline – VFP(Vector Float Point) (7-stage pipeline) • 2003, ARM11 – 8-stage pipeline
  • 6. ARM 2007 liangalei@sjtu.edu.cn Versions of ARM Architecture • ARMv1 – 26-bit address • ARMv2 – 32-bit Multiplier/coprocessor • ARMv3 – 32-bit address, cpsr/spsr, MMU, undef/abort Mode • ARMv4 – Load/store (sign/half/byte), sys Mode • ARMv5 – Superset ARMv4T (Thumb), extend Mul/DSP • ARMv6 – Multiprocessor support instr., unaligned/endian/MMX
  • 7. ARM 2007 liangalei@sjtu.edu.cn Others • StrongARM – ARM + Digital Semiconductor – Intel Patent • Xscale – 1GHz, V5TE • SC100 – Security, Low Power – ARM7TDMI, MPU
  • 8. ARM 2007 liangalei@sjtu.edu.cn Nomenclature of ARM • E.g. ARM7TDMI – T: Thumb – D: JTAG – M: Multiplier (extend) – I: ICE – E: Extend Instruction (above TDMI) – J: Jazelle – F: Float point – S: Synthetic (soft core)
  • 9. ARM 2007 liangalei@sjtu.edu.cn ARM, a RISC ? • Philosophy of RISC design – Instruction » RISC processor have a Reduced number of instruction classes. These classes provide simple operations that can each execute in a single cycle. The compiler or programmer synthesized complicated operations (e.g. a divide operation) by combine several simple instructions. – Pipeline » The processing of instructions is broken down into smaller units (stage) that can be executed in parallel by pipelines. There is no need for an instruction to be executed by a mini-program (microcode) as on CISC processor. – Register » RISC have a large General Purpose Registers (GPR) set. – Load/store architecture » Separating memory access from data processing.
  • 10. ARM 2007 liangalei@sjtu.edu.cn ARM, a RISC ? • The ARM Design Philosophy – There are a number of physical features that have driven the ARM processor design: » Low Power Consumption: Smallest Core; » Limited Memory: High code density; » Die density: Simple Hardware Executive Unit – The ARM core is not a pure RISC architecture because of the constraints of its primary application – the embedded system. • Simplicity favors regularity ? – These design rules allow a RISC processor to be simpler, and thus the core can operate at higher clock frequencies.
  • 11. ARM 2007 liangalei@sjtu.edu.cn Instruction Set for Embedded System • The ARM instruction set differs from the pure RISC definition in several ways – make the ARM suitable for embedded application » Variable cycle execution for certain instruction • Not every ARM instruction executes in a single cycle. » More complex instruction (inline barrel shifter) • This expands the capability of many instructions to improve the core performance and code density. » Thumb 16-bit instruction set • The Thumb instruction improve code density by about 30%. » Conditional execution • Improves performance and code density by reducing Branch. » Enhanced instruction • DSP instruction were added to the standard ARM instr-set to support fast 16x16-bit multiplier operations and saturation.
  • 13. ARM 2007 liangalei@sjtu.edu.cn Units inside SoC • SoC is an embedded device. • We can separate the device into four main components: – ARM Processor: controls the embedded device. » An ARM processor comprises a core (the execution engine that processes instructions and manipulates data), plus the surrounding components (MMU and caches) that interface it with a bus. – Controllers: coordinate important functional blocks (e.g. interrupt and memory controllers) – Peripherals: USB, LCD, etc. – Bus: is used to communicate between different parts of the device.
  • 14. ARM 2007 liangalei@sjtu.edu.cn 1.3.1 ARM Bus Technology • Embedded systems use different bus technologies than those designed for x86 PC. – Embedded device use an on-chip bus – Core is master who initiates a data transfer. • A Bus has two architecture levels – The First is a physical level that covers the electrical characteristics and bus width (16, 32, or 64 bits). – The Second level deals with protocol.– the logical rules governing the communication between processor and peripheral. • ARM seldom implements the electrical characteristics of the bus, but it routinely specifies the bus protocol.
  • 15. ARM 2007 liangalei@sjtu.edu.cn 1.3.2 AMBA • AMBA Advanced Micro controller Bus Architecture – 1996, it’s introduced and widely adopted as the on-chip bus architecture for ARM processors. – The first AMBA buses introduced were » ASB : ARM System Bus, and » APB : ARM Peripheral Bus – Later, ARM introduced another bus design » AHB: ARM High-performance Bus • Using AMBA, – peripheral designers can reuse the same design on multiple projects (with different processor architecture). – Plug-and-play
  • 16. ARM 2007 liangalei@sjtu.edu.cn AHB • AHB – provides higher data throughput than ASB. Because » It use a Centralized Multiplexed Bus Scheme (rather than ASB’s bidirection bus). » This change allows the AHB bus to run at higher clock speed. » 64/128 bits width. • Two variations on the AHB bus » Multi-layer AHB, and • allows multiple active bus masters, » AHB-Lite: only one master
  • 17. ARM 2007 liangalei@sjtu.edu.cn 1.3.3 Memory • Memory is necessary – An embedded system has to have some form of memory to store and execute code. • You have to consider – price, performance, and power consumption • Specific memory characteristics – hierarchy, width, and type
  • 18. ARM 2007 liangalei@sjtu.edu.cn Memory Hierarchy • Cache – is used to speed up data transfer between Core and Main Memory (DRAM); • But, – It makes the performance unpredicted; – It doesn’t help Real-Time system response; » Note that many small embedded systems do not require the benefit of a cache. • * Cache – Elastic buffer (different speed between Core and Bus); – Width adaptive (e.g., 32-bit Core vs. 16-bit BUS)
  • 19. ARM 2007 liangalei@sjtu.edu.cn Memory Types • DRAM – the most commonly used RAM for devices; – Dynamic: need to have its storage cells refreshed and given a new electronic charge every few milliseconds, so you need to set up a DRAM controller before using the memory. • SRAM – is faster than the more traditional DRAM (SRAM does not require a pause between data access). • SDRAM – is one of many subcategories of DRAM. – accessed pipelined, transferred in a burst.
  • 20. ARM 2007 liangalei@sjtu.edu.cn 1.3.4 Peripherals • Embedded system that interact with the outside world need some form of peripheral device. – Peripherals range from a simple serial communication device to a more complex 802.11 wireless device. • All ARM peripherals are memory mapped – the programming interface is a set of memory addressed register. • Controllers are specialized peripherals that implement higher level of functionality within an embedded system. – Two important types of controllers are » Memory Controller » Interrupt Controller • Normal IC • Vectoring IC – Priority – Simple Interrupt Dispatch
  • 21. ARM 2007 liangalei@sjtu.edu.cn Memory Controllers • Memory Controllers: Connect different types of memory to the processor bus. – On power-up a memory controller is configured in hardware to allow certain memory device to be active. These memory devices allow the initialization code to be executed. – Some memory devices must be set up by software. » e.g. When using DRAM, you first have to set up the memory timings and refresh rate before it can be accessed.
  • 22. ARM 2007 liangalei@sjtu.edu.cn Interrupt Controller • When a peripheral or device requires attention, – it raise an interrupt to the processor. • An interrupt controller – provides a programmable governing policy • There are two types of interrupt controller available for the ARM processor – Standard interrupt controller » Sends an interrupt signal; Can be programmed to ignore or mask an individual or set of devices. » It’s interrupt handler determines which device requiring service. – Vector interrupt controller (VIC) » Associate a “priority” and a “handler address” to each interrupt. » Depending on its type, VIC will either call the standard interrupt exception handler (loading the handler address from VIC) or cause core to jump to the handler for the device directly.
  • 23. 1.4 Embedded System Software • An embedded system needs software to drive it. • There are four typical software components required to control an embedded device. – Initialization Code (e.g. Boot loader) – Operating System – Device Drivers – Application » Each software component in the stack uses a higher level of abstraction to separate the code from the hardware device. • [Figure: software stack with layers Hardware, Initialization, Device Driver, Operating System, Application]
  • 24. Initialization (BOOT) Code • Initialization code (or boot code) – takes the processor from the reset state to a state in which the operating system can run. » Configuring the memory controller and caches » Initializing some devices » * Debug monitor (replaces the OS in a simple system) • Three phases – Initial hardware configuration » Satisfies the requirements of the booted image • e.g. reorganization of the memory map – Diagnostics » Fault identification and isolation – Booting » Loading an image and handing control over to it » The boot process may be complicated if the system must boot different operating systems or different versions of the same operating system.
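  • 25. Example: Memory Reorganization • Start from ROM – on power-up, ROM is mapped at address 0x00000000 so the initialization code can run • Remap to RAM – RAM is then mapped at 0x00000000 – easy IVT (interrupt vector table) modification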
  • 26. Operating System • The OS organizes the system resources – peripherals, memory, and processing time » With an OS controlling these resources, they can be used efficiently by different applications running within the OS environment. • ARM processors support over 50 OSes – Two main categories: RTOS, platform OS » RTOS: guarantees response times to events » Platform OS: requires an MMU and tends to have secondary storage (for large applications). • N.B. These two categories of OSes are not mutually exclusive. – ARM has developed a set of processor cores that specifically target each category.
  • 27. Applications • The OS schedules applications – code dedicated to handling a particular task. • ARM processors are found in numerous market segments, including – networking, automotive, mobile and consumer devices, mass storage, and imaging. • In contrast, ARM processors are not found in applications that require leading-edge high performance.
  • 28. 1.5 Summary (1) • Pure RISC is aimed at high performance – But ARM uses a modified RISC design philosophy that also targets » good code density and low power consumption. • The key points in a RISC design philosophy are – Reducing the complexity of instructions » improves performance; – Pipeline » speeds up instruction processing; – Large register set » stores data near the core; – Load-store architecture.
  • 29. 1.5 Summary (2) • The ARM design philosophy also incorporates some non-RISC ideas – Variable cycle execution for certain instructions » saves power, area and code size – Barrel shifter » expands the capability of certain instructions – Thumb 16-bit instruction set » improves code density – Conditionally executed instructions » improve code density and performance – Enhanced instructions: e.g. DSP
  • 30. 1.5 Summary (3) • Hardware Components in an ARM Processor – Peripherals » accessed via memory-mapped registers – Controller (a special type of peripheral) » Higher-level functions: e.g. memory and interrupts. – AMBA Bus » connects the processor and peripherals • Software Components – Initialization Code – Operating System – Device Driver – Application
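  • 31. Data-path in ARM • ALU – Sources: Rn, Rm (optionally shifted); – Destination: Rd • Shifter – Preprocesses Rm before it enters the ALU, which helps to extend the range of data values or addresses an instruction can form • Sign-extend – Signed 8-bit and 16-bit values loaded from main memory are sign-extended to 32 bits before being placed in a register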
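As a rough illustration of the data path above, the following Python sketch models the shifter feeding the second ALU operand, as in `ADD r0, r1, r2, LSL #2`, plus the sign-extension step for narrow loads. The function names are invented for this example, shift amounts are restricted to 0..31, and special encodings such as RRX are deliberately left out.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def barrel_shift(value, op, amount):
    """Shift or rotate a 32-bit value (simplified: amount must be 0..31)."""
    if not 0 <= amount <= 31:
        raise ValueError("this sketch only handles shift amounts 0..31")
    value &= MASK
    if amount == 0:
        return value
    if op == "LSL":                       # logical shift left
        return (value << amount) & MASK
    if op == "LSR":                       # logical shift right
        return value >> amount
    if op == "ASR":                       # arithmetic shift right keeps the sign
        signed = value - (1 << 32) if value & 0x80000000 else value
        return (signed >> amount) & MASK
    if op == "ROR":                       # rotate right
        return ((value >> amount) | (value << (32 - amount))) & MASK
    raise ValueError("unknown shift operation: " + op)


def add_with_shift(rn, rm, op, amount):
    """Model of ADD Rd, Rn, Rm, <op> #amount: Rm passes through the shifter."""
    return (rn + barrel_shift(rm, op, amount)) & MASK


def sign_extend(value, bits):
    """Sign-extend a `bits`-wide loaded value (8 or 16) to 32 bits."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK
```

For instance, `add_with_shift(0x1000, 3, "LSL", 2)` forms a base address plus a word index scaled by 4 in a single step, which is the typical use of the shifter for array addressing.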
  • 32. Registers in ARM • Orthogonal Registers (ref. VAX, PDP-11) – R0~R13 are called orthogonal: if a given instruction can use R0, it can equally use any of the others. • SPRs (special-purpose registers) – R13 (sp), R14 (lr), R15 (pc) • CPSR/SPSR – Condition codes: N, Z, C, V – Interrupt masks: I (IRQ), F (FIQ) – Thumb state bit (T) – Mode (5 bits)
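The CPSR fields listed above sit at fixed bit positions: N, Z, C, V in bits 31 to 28, I, F, T in bits 7, 6, 5, and the mode in bits 4 to 0. A small Python sketch of decoding them follows; the function name and the returned dictionary layout are choices made for this example only.

```python
# Mode field encodings (bits 4..0 of the CPSR).
MODES = {
    0b10000: "usr", 0b10001: "fiq", 0b10010: "irq", 0b10011: "svc",
    0b10111: "abt", 0b11011: "und", 0b11111: "sys",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, mask, state and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,   # negative
        "Z": (cpsr >> 30) & 1,   # zero
        "C": (cpsr >> 29) & 1,   # carry
        "V": (cpsr >> 28) & 1,   # signed overflow
        "I": (cpsr >> 7) & 1,    # 1 = IRQ masked (disabled)
        "F": (cpsr >> 6) & 1,    # 1 = FIQ masked (disabled)
        "T": (cpsr >> 5) & 1,    # 1 = Thumb state
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }
```

For example, the low byte `0xD3` decodes as supervisor mode in ARM state with both IRQ and FIQ masked.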
  • 33. Instruction Sets in ARM • Three instruction sets (IS) in ARM – ARM: 32-bit – Thumb: 16-bit – Jazelle (closed): 8-bit » 60%: Hardware (JVM) » 40%: Software
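  • 34. Condition Codes • Flags: N (negative), Z (zero), C (carry), V (signed overflow); Q (sticky saturation/overflow flag, not tested by the condition codes). • Notation: an upper-case letter means the flag is set, a lower-case letter means it is clear. • EQ = Z, NE = z; HS = C, LO = c • GE = NV or nv, LT = Nv or nV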
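To make the flag combinations concrete, here is a small Python sketch that computes the N, Z, C, V flags the way a `CMP a, b` (a subtraction that discards its result) would, and then evaluates a condition mnemonic against them. The function names are invented for this example; the flag equations follow the standard ARM definitions.

```python
def cmp_flags(a, b):
    """Return (N, Z, C, V) as set by CMP a, b on 32-bit operands."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    result = (a - b) & 0xFFFFFFFF
    n = result >> 31                        # sign bit of the result
    z = int(result == 0)
    c = int(a >= b)                         # C set means "no borrow"
    v = ((a ^ b) & (a ^ result)) >> 31      # signed overflow on subtract
    return n, z, c, v


def condition_passed(cond, n, z, c, v):
    """Evaluate an ARM condition mnemonic against the four flags."""
    table = {
        "EQ": z, "NE": not z,               # equal / not equal
        "HS": c, "LO": not c,               # unsigned >= / unsigned <
        "MI": n, "PL": not n,               # negative / positive or zero
        "VS": v, "VC": not v,               # overflow / no overflow
        "HI": c and not z, "LS": (not c) or z,
        "GE": n == v, "LT": n != v,         # signed >= / signed <
        "GT": (not z) and n == v, "LE": z or n != v,
        "AL": True,                         # always
    }
    return bool(table[cond])
```

Comparing `0xFFFFFFFF` with `1` shows why both signed and unsigned conditions exist: the same flags satisfy `HS` (unsigned, 4294967295 >= 1) and `LT` (signed, -1 < 1).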
  • 35. Pipeline • ARM7 – 3 stages: Fetch, Decode, Execute • More stages (deeper pipeline) – mean higher latency and more data dependencies between stages • ARM9 (about 13% more instruction throughput than ARM7) – 5 stages: Fetch, Decode, Execute, Memory, Write • ARM10 (about 34% more instruction throughput than ARM7) – 6 stages: Fetch, Issue, Decode, Execute, Memory, Write • Does ARM7 code run on ARM9/10? – Yes: although the pipelines differ, they keep the same pipeline execution characteristics as ARM7
  • 36. About Pipeline • Enabling IRQ – MSR clears the CPSR’s I bit. – On ARM7, IRQ is enabled only once the MSR instruction completes the third pipeline stage (Execute); • PC – The PC always points to the instruction currently being fetched, i.e. the address of the executing instruction plus 8 in ARM state (plus 4 in Thumb state); – This must be taken into account when calculating a PC-relative offset; • On a branch or a direct update of the PC – The ARM core flushes its pipeline; – ARM10 uses branch prediction to reduce the cost of the flush; – When an IRQ is raised: the instruction in the Execute stage completes, and the other instructions in the pipeline are abandoned.
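The PC-offset point above can be illustrated with the ARM-state branch instruction, whose 24-bit immediate is a word offset relative to the branch's own address plus 8. The Python sketch below computes and decodes that field; the function names are made up for this example, and it covers ARM state only.

```python
def encode_branch_offset(instr_addr, target_addr):
    """Return the 24-bit immediate for an ARM-state B/BL at instr_addr."""
    pc = instr_addr + 8                  # PC reads as two instructions ahead
    delta = target_addr - pc
    if delta % 4:
        raise ValueError("ARM-state branch targets must be word aligned")
    offset = delta >> 2                  # the field counts words, not bytes
    if not -(1 << 23) <= offset < (1 << 23):
        raise ValueError("target is outside the +/-32 MB branch range")
    return offset & 0xFFFFFF             # two's complement, 24 bits


def decode_branch_target(instr_addr, imm24):
    """Recover the branch target from the instruction address and field."""
    offset = imm24 - (1 << 24) if imm24 & (1 << 23) else imm24
    return (instr_addr + 8 + (offset << 2)) & 0xFFFFFFFF
```

A branch to itself therefore encodes an offset of -2 words (`0xFFFFFE`) rather than 0, and a branch to the instruction two slots ahead encodes 0.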
  • 37. Vector Table • Reset – the first instruction executed after power-up; • Undef – an instruction cannot be decoded; • Soft – an SWI instruction is executed; • Prefetch Abort (PABT) – an attempt to fetch an instruction from an invalid address; • Data Abort (DABT) – an attempt to access data at an invalid address; • IRQ • FIQ
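Each entry in the vector table lives at a fixed 4-byte slot. The following Python sketch lists the standard offsets and computes a vector address; the `high_vectors` parameter models the optional table base at 0xFFFF0000 that some cores support, and the function and dictionary names are chosen for this example.

```python
# Offsets of the exception vectors from the table base.
VECTOR_OFFSETS = {
    "Reset": 0x00,
    "Undefined instruction": 0x04,
    "Software interrupt": 0x08,
    "Prefetch abort": 0x0C,
    "Data abort": 0x10,
    "Reserved": 0x14,
    "IRQ": 0x18,
    "FIQ": 0x1C,
}


def vector_address(exception, high_vectors=False):
    """Return the address the core jumps to when `exception` is taken."""
    base = 0xFFFF0000 if high_vectors else 0x00000000
    return base + VECTOR_OFFSETS[exception]
```

FIQ occupies the last slot, so its handler can start at that address directly and continue into the following memory instead of branching elsewhere first.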
  • 38. Core Extension • Cache & TCM – Unified vs. separate I/D caches – TCM (tightly coupled memory): fast SRAM located close to the core (not accessed over the AMBA bus) • Memory management (MM) interface – No MM: for simple embedded systems; – MPU (Memory Protection Unit): region-based protection; – MMU: translation tables, fine-grained protection; • Coprocessor (CP) interface – Either extends the instruction set or is controlled through dedicated registers; – E.g. » VFP instructions; » CP15: controls the cache, TCM and MMU via load/store-like instructions.
  • 39. ARM Instruction Set • Data processing – Move (MOV), arithmetic/logic, compare (CMP), multiply (MUL); • Branch – If-then-else • Load/Store • SWI • MRS/MSR – Status register (CPSR or SPSR) <-> general register • Coprocessor instructions – CDP (CP Data Processing), MRC/MCR, LDC/STC • Constant loading • ARMv5E extensions • Conditionally executed instructions – E.g. ADDEQ r0, r1, r2 (the add is performed only if the Z flag is set)