Digital Design:
An Embedded Systems
Approach Using Verilog
Chapter 7
Processor Basics
Portions of this work are from the book, Digital Design: An Embedded
Systems Approach Using Verilog, by Peter J. Ashenden, published by Morgan
Kaufmann Publishers, Copyright 2007 Elsevier Inc. All rights reserved.
Verilog
Digital Design — Chapter 7 — Processor Basics 2
Embedded Computers
 A computer as part of a digital system
 Performs processing to implement or control the
system’s function
 Components
 Processor core
 Instruction and data memory
 Input, output, and input/output controllers
 For interacting with the physical world
 Accelerators
 High-performance circuit for specialized functions
 Interconnecting buses
Memory Organization
 Von Neumann architecture
 Single memory for instructions and data
 Harvard architecture
 Separate instruction and data memories
 Most common in embedded systems
[Figure: embedded computer block diagram showing CPU, accelerator, instruction memory, data memory, and input, output, and I/O controllers]
Bus Organization
 Single bus for low-cost low-performance
systems
 Multiple buses for higher performance
[Figure: bus organization diagram showing CPU, accelerator, instruction memory, data memory, and input, output, and I/O controllers]
Microprocessors
 Single-chip processor in a package
 External connections to memory and
I/O buses
 Most commonly seen in general purpose
computers
 E.g., Intel Pentium family, PowerPC, …
Microcontrollers
 Single chip combining
 Processor
 A small amount of instruction/data memory
 I/O controllers
 Microcontroller families
 Same processor, varying memory and I/O
 8-bit microcontrollers
 Operate on 8-bit data
 Low cost, low performance
 16-bit and 32-bit microcontrollers
 Higher performance
Processor Cores
 Processor as a component in an FPGA or
ASIC
 In FPGA, can be a fixed-function block
 E.g., PowerPC cores in some Xilinx FPGAs
 Or can be a soft core
 Implemented using programmable resources
 E.g., Xilinx MicroBlaze, Altera Nios-II
 In ASIC, provided as an IP block
 E.g., ARM, PowerPC, MIPS, Tensilica cores
 Can be customized for an application
Digital Signal Processors
 DSPs are processors optimized for
signal processing operations
 E.g., audio, video, sensor data; wireless
communication
 Often combined with a conventional
core for processing other data
 Heterogeneous multiprocessor
Instruction Sets
 A processor executes a program
 A sequence of instructions, each performing a
small step of a computation
 Instruction set: the repertoire of available
instructions
 Different processor types have different instruction
sets
 High-level languages: more abstract
 E.g., C, C++, Ada, Java
 Translated to processor instructions by a compiler
Instruction Execution
 Instructions are encoded in binary
 Stored in the instruction memory
 A processor executes a program by
repeatedly
 Fetching the next instruction
 Decoding it to work out what to do
 Executing the operation
 Program counter (PC)
 Register in the processor holding the
address of the next instruction
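The fetch, decode, execute cycle can be sketched as a short Python loop. This is a toy model, not the Gumnut core: the opcodes HALT, LOAD_IMM, and ADD_IMM, the single accumulator, and the tuple instruction format are all assumptions made for illustration.

```python
# Toy fetch-decode-execute loop (illustrative only; not a real instruction set).
# Each instruction is an (opcode, operand) tuple standing in for a binary word.
HALT, LOAD_IMM, ADD_IMM = 0, 1, 2   # hypothetical opcodes


def run(instruction_memory):
    pc = 0      # program counter: address of the next instruction
    acc = 0     # single 8-bit accumulator register (assumption)
    while True:
        opcode, operand = instruction_memory[pc]    # fetch
        pc += 1                                     # PC now addresses the next instruction
        if opcode == HALT:                          # decode, then execute
            return acc
        elif opcode == LOAD_IMM:
            acc = operand & 0xFF
        elif opcode == ADD_IMM:
            acc = (acc + operand) & 0xFF            # keep the result to 8 bits
        else:
            raise ValueError(f"unknown opcode {opcode}")


program = [(LOAD_IMM, 5), (ADD_IMM, 3), (HALT, 0)]
result = run(program)
print(result)
```

The point of the sketch is the role of the PC: it is advanced during the fetch, so it always holds the address of the instruction that will run next.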
Data and Endian-ness
 Instructions operate on data from the data memory
 Byte: 8-bit data
 Data memory is usually byte addressed
 16-bit, 32-bit, 64-bit words of data
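 Little endian: least significant byte at the lowest address
 16-bit data at address m: least sig. byte at m, most sig. byte at m + 1
 32-bit data at address n: least sig. byte at n, then n + 1, n + 2, most sig. byte at n + 3
 Big endian: most significant byte at the lowest address
 16-bit data at address m: most sig. byte at m, least sig. byte at m + 1
 32-bit data at address n: most sig. byte at n, then n + 1, n + 2, least sig. byte at n + 3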
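As a quick check of the two byte orders, the following Python sketch lays out a 16-bit and a 32-bit value in byte-addressed memory using the standard int.to_bytes method. The sample values 0x1234 and 0x12345678 are arbitrary choices for illustration.

```python
# Byte order of multi-byte values in byte-addressed memory.
value16 = 0x1234
value32 = 0x12345678

# Index 0 of each bytes object corresponds to the lowest address.
little16 = value16.to_bytes(2, "little")
big16 = value16.to_bytes(2, "big")
little32 = value32.to_bytes(4, "little")
big32 = value32.to_bytes(4, "big")

for label, data in [("little16", little16), ("big16", big16),
                    ("little32", little32), ("big32", big32)]:
    print(label, [hex(b) for b in data])

# Reading the bytes back with the matching byte order recovers the value.
roundtrip = int.from_bytes(little32, "little")
```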
The Gumnut Core
 A small 8-bit soft core
 Can be used in FPGA designs
 Instruction set illustrates features typical of
8-bit cores and processors in general
 Programs written in assembly language
 Each processor instruction written explicitly
 Translated to binary representation by an
assembler
 Resources available on companion web site
Gumnut Storage
Arithmetic Instructions
 Operate on register data and put result
in a register
 add, addc, sub, subc
 Can have immediate value operand
 Condition codes
 Z: 1 if result is zero, 0 if result is non-zero
 C: carry out of add/addc, borrow out of
sub/subc
 addc and subc include C bit in
operation
Arithmetic Instructions
 Examples
 add r3, r4, r1
 add r5, r1, 2
 sub r4, r4, 1
 Evaluate 2x + 1; x in r3, result in r4
 add r4, r3, r3 ; double x
add r4, r4, 1 ; then add 1
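The effect of 8-bit arithmetic on the Z and C condition codes can be modelled in Python as below. This is a minimal sketch under the rules stated on the slides (C is the carry out of an add and the borrow out of a subtract); the function names add8 and sub8 are hypothetical, and the model says nothing about how the core implements them.

```python
def add8(a, b, carry_in=0):
    """8-bit add; returns (result, z, c). carry_in models addc."""
    total = a + b + carry_in
    result = total & 0xFF
    return result, int(result == 0), int(total > 0xFF)


def sub8(a, b, borrow_in=0):
    """8-bit subtract; returns (result, z, c) with c as borrow. borrow_in models subc."""
    diff = a - b - borrow_in
    result = diff & 0xFF
    return result, int(result == 0), int(diff < 0)


# Evaluate 2x + 1 by adding x to itself, then adding 1.
x = 7
doubled, _, _ = add8(x, x)
answer, z, c = add8(doubled, 1)
print(answer, z, c)
```

For example, add8(200, 100) wraps to 44 with C set, and sub8(3, 5) gives 254 with C set to show the borrow.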
Logical Instructions
 Operate on register data and put result
in a register
 and, or, xor, mask (and not)
 Operate bitwise on 8-bit operands
 Can have immediate value operand
 Condition codes
 Z: 1 if result is zero, 0 if result is non-zero
 C: always 0
Logical Instructions
 Examples
 and r3, r4, r5
 or r1, r1, 0x80 ; set r1(7)
 xor r5, r5, 0xFF ; invert r5
 Set Z if least-significant 4 bits of r2 are 0101
 and r1, r2, 0x0F ; clear high bits
sub r0, r1, 0x05 ; compare with 0101
Shift Instructions
 Logical shift/rotate register data and
put result in a register
 shl, shr, rol, ror
 Count specified as a literal operand
 Condition codes
 Z: 1 if result is zero, 0 if result is non-zero
 C: the value of the last bit shifted/rotated
past the end of the byte
Shift Instructions
 Examples
 shl r4, r1, 3
 ror r2, r2, 4
 Multiply r4 by 8, ignoring overflow
 shl r4, r4, 3
 Multiply r4 by 10, ignoring overflow
 shl r1, r4, 1 ; multiply by 2
shl r4, r4, 3 ; multiply by 8
add r4, r4, r1
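The multiply-by-10 sequence can be checked with a small Python model of shl. This is a sketch only: shl8 and times10 are hypothetical helper names, and the C flag follows the slide's rule that it takes the last bit shifted past the end of the byte.

```python
def shl8(value, count):
    """Logical shift left of an 8-bit value; returns (result, z, c)."""
    shifted = value << count
    result = shifted & 0xFF
    z = int(result == 0)
    c = (shifted >> 8) & 1          # last bit shifted past bit 7
    return result, z, c


def times10(x):
    r1, _, _ = shl8(x, 1)           # shl r1, r4, 1 : 2x
    r4, _, _ = shl8(x, 3)           # shl r4, r4, 3 : 8x
    return (r4 + r1) & 0xFF         # add r4, r4, r1 : 10x, overflow ignored


print(times10(7))
print(times10(30))                  # 300 does not fit in 8 bits, so it wraps
```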
Memory Instructions
 Transfer data between registers and data
memory
 Compute address by adding an offset to a base
register value
 Load register from memory
 ldm r1, (r2)+5
 Store from register to memory
 stm r1, (r4)-2
 Use r0 if base address is 0
 ldm r3, 23 is equivalent to ldm r3, (r0)+23
 Condition codes not affected
Memory Instructions
 Increment a 16-bit integer in memory
 Little-endian: address of lsb in r2, msb in next
location
 ldm r1, (r2) ; increment lsb
add r1, r1, 1
stm r1, (r2)
ldm r1, (r2)+1 ; increment msb
addc r1, r1, 0 ; with carry
stm r1, (r2)+1
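The same 16-bit increment can be modelled in Python on a byte array to see how the carry moves from the low byte to the high byte. This is a sketch: the function name increment16 and the use of a bytearray as data memory are assumptions for illustration.

```python
def increment16(memory, address):
    """Increment a little-endian 16-bit integer stored at memory[address]."""
    low = memory[address] + 1               # ldm / add r1, r1, 1
    carry = low >> 8                        # C flag from the add
    memory[address] = low & 0xFF            # stm r1, (r2)
    high = memory[address + 1] + carry      # ldm / addc r1, r1, 0
    memory[address + 1] = high & 0xFF       # stm r1, (r2)+1


data = bytearray([0xFF, 0x00])              # the value 0x00FF, low byte first
increment16(data, 0)
print([hex(b) for b in data])
```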
Input/Output Instructions
 I/O controllers have registers that govern
their operation
 Each has an address, like data memory
 Gumnut has separate data and I/O address spaces
 Input from I/O register
 inp r3, 157 is equivalent to inp r3, (r0)+157
 Output to I/O register
 out r3, (r7) is equivalent to out r3, (r7)+0
 Condition codes not affected
 Further examples in Chapter 8
Branch Instructions
 Programs can evaluate conditions and take
alternate courses of action
 Condition codes (Z, C) represent outcomes of
arithmetic/logical/shift instructions
 Branch instructions examine Z or C
 bz, bnz, bc, bnc
 Add a displacement to PC if condition is true
 Specifies how many instructions forward or
backward to skip
 Counting from instruction after branch
Branch Example
 Elapsed seconds in location 100
 Increment, wrapping to 0 after 59
 ldm r1, 100
add r1, r1, 1
sub r0, r1, 60 ; Z set if r1 = 60
bnz +1 ; Skip to store if Z is 0
add r1, r0, 0 ; r1 = 60, so wrap to 0
stm r1, 100
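Two small Python helpers illustrate how a branch displacement is applied and what the seconds example computes. This is a sketch: the helper names are hypothetical, the example addresses are made up, and the 12-bit address mask is an assumption based on the 12-bit address field of the jump format.

```python
ADDRESS_MASK = 0xFFF    # assumption: 12-bit instruction addresses


def branch_target(branch_address, displacement):
    """Target of a taken branch: displacement counts from the instruction after the branch."""
    next_address = branch_address + 1
    return (next_address + displacement) & ADDRESS_MASK


def increment_seconds(seconds):
    """Model of the example: add 1, then reset to 0 when the count reaches 60."""
    r1 = (seconds + 1) & 0xFF
    if r1 == 60:        # sub r0, r1, 60 sets Z, so bnz +1 is not taken
        r1 = 0          # add r1, r0, 0
    return r1


print(branch_target(3, 1))      # a bnz +1 at address 3 skips the instruction at 4
print(increment_seconds(59))
```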
Jump Instruction
 Unconditionally skips forward or backward to
specified address
 Changes the PC to the address
 Example: if r1 = 0, clear data location 100 to
0; otherwise clear location 200 to 0
 Assume instructions start at address 10
 10: sub r0, r1, 0
11: bnz +2
12: stm r0, 100
13: jmp 15
14: stm r0, 200
15: ...
Subroutines
 A sequence of instructions that perform
some operation
 Can call them from different parts of a
program using a jsb instruction
 Subroutine returns with a ret instruction
Subroutine Example
 Subroutine to increment second count
 Address of count in r2
 ldm r1, (r2)
add r1, r1, 1
sub r0, r1, 60
bnz +1
add r1, r0, 0
stm r1, (r2)
ret
 Call to increment locations 100 and 102
 add r2, r0, 100
jsb 20
add r2, r0, 102
jsb 20
Return Address Stack
 The jsb saves the return address for
use by the ret
 But what if the subroutine includes a jsb?
 Gumnut core includes an 8-entry push-down
stack of return addresses
[Figure: return address stack shown twice: holding return addresses for the first and second calls, then for the first, second, and third calls]
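The behaviour of a push-down return address stack can be sketched in Python as follows. The class name ReturnStack and the example addresses are hypothetical, and raising an error when a ninth address is pushed is a choice made for this sketch; the slides do not say what the core does when the stack overflows.

```python
class ReturnStack:
    """Fixed-depth push-down stack of return addresses."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = []

    def push(self, return_address):
        """Called by jsb: save the address of the instruction after the call."""
        if len(self.entries) == self.depth:
            # Assumption: overflow is reported here; real hardware may differ.
            raise OverflowError("return address stack is full")
        self.entries.append(return_address)

    def pop(self):
        """Called by ret: resume at the most recently saved address."""
        return self.entries.pop()


stack = ReturnStack()
stack.push(0x011)       # first call
stack.push(0x025)       # second call, made inside the first subroutine
stack.push(0x030)       # third call, made inside the second subroutine
order = [stack.pop(), stack.pop(), stack.pop()]
print([hex(a) for a in order])
```

Returns unwind in the reverse order of the calls, which is why nested jsb instructions work.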
Miscellaneous Instructions
 Instructions supporting interrupts
 See Chapter 8
 reti Return from interrupt
 enai Enable interrupts
 disi Disable interrupts
 wait Wait for an interrupt
 stby Stand by in low power mode until
an interrupt occurs
The Gumnut Assembler
 Gasm: translates assembly programs
 Generates memory images for program
text (binary-coded instructions) and data
 See documentation on web site
 Write a program as a text file
 Instructions
 Directives
 Comments
 Use symbolic labels
Example Program
; Program to determine greater of value_1 and value_2
                 text
                 org 0x000            ; start here on reset
                 jmp main
; Data memory layout
                 data
value_1:         byte 10
value_2:         byte 20
result:          bss 1
; Main program
                 text
                 org 0x010
main:            ldm r1, value_1      ; load values
                 ldm r2, value_2
                 sub r0, r1, r2       ; compare values
                 bc value_2_greater
                 stm r1, result       ; value_1 is greater
                 jmp finish
value_2_greater: stm r2, result       ; value_2 is greater
finish:          jmp finish           ; idle loop
Gumnut Instruction Encoding
 Instructions are a form of information
 Can be encoded in binary
 Gumnut encoding
 18 bits per instruction
 Divided into fields representing different
aspects of the instruction
 Opcodes and function codes
 Register numbers
 Addresses
Gumnut Instruction Encoding
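Format                    Fields, most significant bit first (width in bits)
Arith/Logical Register    1110 (4) | rd (3) | rs (3) | rs2 (3) | unused (2) | fn (3)
Arith/Logical Immediate   0 (1) | fn (3) | rd (3) | rs (3) | immed (8)
Shift                     110 (3) | unused (1) | rd (3) | rs (3) | count (3) | unused (3) | fn (2)
Memory, I/O               10 (2) | fn (2) | rd (3) | rs (3) | offset (8)
Branch                    111110 (6) | fn (2) | unused (2) | disp (8)
Jump                      11110 (5) | fn (1) | addr (12)
Miscellaneous             1111110 (7) | fn (3) | unused (8)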
Encoding Examples
 Encoding for addc r3, r5, 24
 Arithmetic immediate, fn = 001
 Fields: 0 | fn = 001 | rd = 011 | rs = 101 | immed = 00011000
 Binary 00 0101 1101 0001 1000 = 05D18 (hexadecimal)
 Instruction encoded by 3ECFC
 Binary 11 1110 1100 1111 1100
 Fields: 111110 | fn = 11 | 00 | disp = 11111100
 Branch: bnc -4
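The field packing in these two examples can be reproduced with shifts and masks. The Python sketch below assumes the arithmetic immediate layout 0, fn, rd, rs, immed with widths 1, 3, 3, 3, 8 and the branch layout 111110, fn, two unused bits, disp with widths 6, 2, 2, 8; the function names are hypothetical, and fn = 3 for bnc is read off the bit pattern of the example.

```python
def encode_arith_immed(fn, rd, rs, immed):
    """Pack an arithmetic/logical immediate instruction into 18 bits."""
    return (fn << 14) | (rd << 11) | (rs << 8) | (immed & 0xFF)   # bit 17 is 0


def encode_branch(fn, disp):
    """Pack a branch instruction; disp is a signed 8-bit displacement."""
    return (0b111110 << 12) | (fn << 10) | (disp & 0xFF)


def decode_disp(word):
    """Recover the signed displacement from the low 8 bits of a branch."""
    disp = word & 0xFF
    return disp - 256 if disp & 0x80 else disp


addc_word = encode_arith_immed(fn=0b001, rd=3, rs=5, immed=24)    # addc r3, r5, 24
bnc_word = encode_branch(fn=0b11, disp=-4)                        # bnc -4
print(f"{addc_word:05X}", f"{bnc_word:05X}", decode_disp(bnc_word))
```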
Other Instruction Sets
 8-bit cores and microcontrollers
 Xilinx PicoBlaze: like Gumnut
 8051, and numerous like it
 Originated as 8-bit microprocessors
 Instructions encoded as one or more bytes
 Instruction set is more complex and irregular
 Complex instruction set computer (CISC)
 Cf. reduced instruction set computer (RISC)
 16-, 32- and 64-bit cores
 Mostly RISC
 E.g., PowerPC, ARM, MIPS, Tensilica, …
Instruction and Data Memory
 In embedded systems
 Instruction memory is usually ROM, flash,
SRAM, or combination
 Data memory is usually SRAM
 DRAM if large capacity needed
 Processor/memory interfacing
 Gluing the signals together
Example: Gumnut Memory
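[Figure: gumnut core connected to an instruction ROM and a data SRAM, with clk_i and rst_i inputs. Instruction port signals: inst_adr_o, inst_dat_i, inst_cyc_o, inst_stb_o, inst_ack_i. Data port signals: data_adr_o, data_dat_i, data_dat_o, data_cyc_o, data_stb_o, data_ack_i, data_we_o. ROM pins: adr, dat_o, en. SRAM pins: adr, dat_o, dat_i, en, we. Two D flip-flops clocked by clk also appear in the diagram]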
Example: Gumnut Memory
always @(posedge clk) // Instruction memory
  if (inst_cyc_o && inst_stb_o) begin
    // Registered read: instruction and acknowledge appear after this clock edge
    inst_dat_i <= inst_ROM[inst_adr_o[10:0]];
    inst_ack_i <= 1'b1;
  end
  else
    inst_ack_i <= 1'b0;
Example: Gumnut Memory
always @(posedge clk) // Data memory
  if (data_cyc_o && data_stb_o)
    if (data_we_o) begin
      // Write: store the data and pass it through to the read port
      data_RAM[data_adr_o] <= data_dat_o;
      data_dat_i <= data_dat_o;
      data_ack_i <= 1'b1;
    end
    else begin
      // Read: registered output from the addressed location
      data_dat_i <= data_RAM[data_adr_o];
      data_ack_i <= 1'b1;
    end
  else
    data_ack_i <= 1'b0;
Example: Microcontroller Memory
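[Figure: 8051 interfaced to an SRAM through an address latch. 8051 signals: P0, P2, ALE, PSEN, RD, WR. Latch pins: D, LE, Q. SRAM pins: A(16), A(15..8), A(7..0), D, CE, WE, OE]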
32-bit Memory
 Four bytes per memory word
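 Little-endian: least significant byte at lowest address
 Big-endian: most significant byte at lowest address
[Figure: byte addresses within successive 32-bit words: 0 1 2 3, then 4 5 6 7, then 8 9 10 11]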
 Partial-word read
 Read all bytes, processor selects those needed
 Partial-word write
 Use byte-enable signals
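A partial-word write with byte enables can be modelled in Python as below. This is a sketch under an assumed lane numbering in which lane i covers bits 8i to 8i+7 of the word; how lanes map to byte addresses depends on the processor's endianness, and the function name is hypothetical.

```python
def write_with_byte_enables(stored, data, enables):
    """Return the new 32-bit word after writing only the enabled byte lanes.

    stored, data: 32-bit integers; enables: four booleans, one per byte lane.
    """
    result = stored
    for lane, enabled in enumerate(enables):
        if enabled:
            mask = 0xFF << (8 * lane)
            result = (result & ~mask) | (data & mask)   # replace just this lane
    return result & 0xFFFFFFFF


word = write_with_byte_enables(0x11223344, 0xAABBCCDD, [True, False, False, False])
print(hex(word))
```

A partial-word read needs no such masking in the memory: the whole word is returned and the processor picks out the bytes it wants.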
Example: MicroBlaze Memory
[Figure: MicroBlaze connected to four SSRAMs (pins D_in, A, en, wr, D_out, clk), one for each of the data bit ranges 0:7, 8:15, 16:23, 24:31. MicroBlaze signals: Addr (bits labeled 2:16), Data_Write, Data_Read, AS, Read_Strobe, Write_Strobe, Ready, Clk, Byte_Enable(0) to Byte_Enable(3). A +V connection is also shown]
Cache Memory
 For high-performance processors
 Memory access time is several clock cycles
 Performance bottleneck
 Cache memory
 Small fast memory attached to a processor
 Stores most frequently accessed items,
plus adjacent items
 Locality: those items are most likely to be
accessed again soon
Cache Memory
 Memory contents divided into fixed-
sized blocks (lines)
 Cache copies whole lines from memory
 When processor accesses an item
 If item is in cache: hit - fast access
 Occurs most of the time
 If item is not in cache: miss
 Line containing item is copied from memory
 Slower, but less frequent
 May need to replace a line already in cache
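Hits, misses, and line replacement can be seen in a small Python model. The slides do not name a cache organization, so this sketch assumes the simplest one, a direct-mapped cache, with hypothetical sizes of 4 lines of 4 items each; the class and attribute names are also assumptions.

```python
class DirectMappedCache:
    """Read-only model of a direct-mapped cache in front of a list-based memory."""

    def __init__(self, memory, num_lines=4, line_size=4):
        self.memory = memory
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines      # which memory block each line holds
        self.lines = [None] * num_lines     # copies of whole lines
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // self.line_size   # memory block containing the item
        index = block % self.num_lines      # the one cache line it may occupy
        if self.tags[index] == block:
            self.hits += 1                  # hit: fast access
        else:
            self.misses += 1                # miss: copy the whole line from memory,
            start = block * self.line_size  # replacing whatever was there
            self.lines[index] = self.memory[start:start + self.line_size]
            self.tags[index] = block
        return self.lines[index][address % self.line_size]


memory = list(range(100, 164))              # 64 items of sample data
cache = DirectMappedCache(memory)
values = [cache.read(a) for a in (0, 1, 2, 3, 16, 0)]
print(values, cache.hits, cache.misses)
```

The first four reads touch one line (one miss, then three hits). Address 16 maps to the same cache line as address 0 in this configuration, so it replaces that line, and the final read of address 0 misses again.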
Fast Main Memory Access
 Optimize memory for line access by cache
 Wide memory
 Read a line in one access
 Burst transfers
 Send starting address, then read successive locations
 Pipelining
 Overlapping stages of memory access
 E.g., address transfer, memory operation, data transfer
 Double data rate (DDR), Quad data rate (QDR)
 Transfer on both rising and falling clock edges
Summary
 Embedded computer
 Processor, memory, I/O controllers, buses
 Microprocessors, microcontrollers, and
processor cores
 Soft-core processors for ASIC/FPGA
 Processor instruction sets
 Binary encoding for instructions
 Assembly language programs
 Memory interfacing

07 processor basics

  • 1. Digital Design: An Embedded Systems Approach Using Verilog Chapter 7 Processor Basics Portions of this work are from the book, Digital Design: An Embedded Systems Approach Using Verilog, by Peter J. Ashenden, published by Morgan Kaufmann Publishers, Copyright 2007 Elsevier Inc. All rights reserved.
  • 2. Verilog Digital Design — Chapter 7 — Processor Basics 2 Embedded Computers  A computer as part of a digital system  Performs processing to implement or control the system’s function  Components  Processor core  Instruction and data memory  Input, output, and input/output controllers  For interacting with the physical world  Accelerators  High-performance circuit for specialized functions  Interconnecting buses
  • 3. Verilog Digital Design — Chapter 7 — Processor Basics 3 Memory Organization  Von Neumann architecture  Single memory for instructions and data  Harvard architecture  Separate instruction and data memories  Most common in embedded systems CPU … Accelerator Instruction memory Input controller Output controller I/O controller Data memory
  • 4. Verilog Digital Design — Chapter 7 — Processor Basics 4 Bus Organization  Single bus for low-cost low-performance systems  Multiple buses for higher performance CPU Accelerator Instruction memory Input controller Output controller I/O controller Data memory
  • 5. Verilog Digital Design — Chapter 7 — Processor Basics 5 Microprocessors  Single-chip processor in a package  External connections to memory and I/O buses  Most commonly seen in general purpose computers  E.g., Intel Pentium family, PowerPC, …
  • 6. Verilog Digital Design — Chapter 7 — Processor Basics 6 Microcontrollers  Single chip combining  Processor  A small amount of instruction/data memory  I/O controllers  Microcontroller families  Same processor, varying memory and I/O  8-bit microcontrollers  Operate on 8-bit data  Low cost, low performance  16-bit and 32-bit microcontrollers  Higher performance
  • 7. Verilog Digital Design — Chapter 7 — Processor Basics 7 Processor Cores  Processor as a component in an FPGA or ASIC  In FPGA, can be a fixed-function block  E.g., PowerPC cores in some Xilinx FPGAs  Or can be a soft core  Implemented using programmable resources  E.g., Xilinx MicroBlaze, Altera Nios-II  In ASIC, provided as an IP block  E.g., ARM, PowerPC, MIPS, Tensilica cores  Can be customized for an application
  • 8. Verilog Digital Design — Chapter 7 — Processor Basics 8 Digital Signal Processors  DSPs are processors optimized for signal processing operations  E.g., audio, video, sensor data; wireless communication  Often combined with a conventional core for processing other data  Heterogeneous multiprocessor
  • 9. Verilog Digital Design — Chapter 7 — Processor Basics 9 Instruction Sets  A processor executes a program  A sequence of instructions, each performing a small step of a computation  Instruction set: the repertoire of available instructions  Different processor types have different instruction sets  High-level languages: more abstract  E.g., C, C++, Ada, Java  Translated to processor instructions by a compiler
  • 10. Verilog Digital Design — Chapter 7 — Processor Basics 10 Instruction Execution  Instructions are encoded in binary  Stored in the instruction memory  A processor executes a program by repeatedly  Fetching the next instruction  Decoding it to work out what to do  Executing the operation  Program counter (PC)  Register in the processor holding the address of the next instruction
  • 11. Verilog Digital Design — Chapter 7 — Processor Basics 11 Data and Endian-ness  Instructions operate on data from the data memory  Byte: 8-bit data  Data memory is usually byte addressed  16-bit, 32-bit, 64-bit words of data 0 least sig. byte Little endian Big endian 8-bit data 16-bit data 32-bit data most sig. byte least sig. byte most sig. byte m m + 1 n n + 2 n + 3 n + 1 0 least sig. byte 8-bit data 16-bit data 32-bit data most sig. byte least sig. byte most sig. byte m m + 1 n n + 2 n + 3 n + 1
  • 12. Verilog Digital Design — Chapter 7 — Processor Basics 12 The Gumnut Core  A small 8-bit soft core  Can be used in FPGA designs  Instruction set illustrates features typical of 8- bit cores and processors in general  Programs written in assembly language  Each processor instruction written explicitly  Translated to binary representation by an assembler  Resources available on companions web site
  • 13. Verilog Digital Design — Chapter 7 — Processor Basics 13 Gumnut Storage
  • 14. Verilog Digital Design — Chapter 7 — Processor Basics 14 Arithmetic Instructions  Operate on register data and put result in a register  add, addc, sub, subc  Can have immediate value operand  Condition codes  Z: 1 if result is zero, 0 if result is non-zero  C: carry out of add/addc, borrow out of sub/subc  addc and subc include C bit in operation
  • 15. Verilog Digital Design — Chapter 7 — Processor Basics 15 Arithmetic Instructions  Examples  add r3, r4, r1  add r5, r1, 2  sub r4, r4, 1  Evaluate 2x + 1; x in r3, result in r4  add r4, r4, r3 ; double x add r4, r4, 1 ; then add 1
  • 16. Verilog Digital Design — Chapter 7 — Processor Basics 16 Logical Instructions  Operate on register data and put result in a register  and, or, xor, mask (and not)  Operate bitwise on 8-bit operands  Can have immediate value operand  Condition codes  Z: 1 if result is zero, 0 if result is non-zero  C: always 0
  • 17. Verilog Digital Design — Chapter 7 — Processor Basics 17 Logical Instructions  Examples  and r3, r4, r5  or r1, r1, 0x80 ; set r1(7)  xor r5, r5, 0xFF ; invert r5  Set Z if least-significant 4 bits of r2 are 0101  and r1, r2, 0x0F ; clear high bits sub r0, r1, 0x05 ; compare with 0101
  • 18. Verilog Digital Design — Chapter 7 — Processor Basics 18 Shift Instructions  Logical shift/rotate register data and put result in a register  shl, shr, rol, ror  Count specified as a literal operand  Condition codes  Z: 1 if result is zero, 0 if result is non-zero  C: the value of the last bit shifted/rotated past the end of the byte
  • 19. Verilog Digital Design — Chapter 7 — Processor Basics 19 Shift Instructions  Examples  shl r4, r1, 3  ror r2, r2, 4  Multiply r4 by 8, ignoring overflow  shl r4, r4, 3  Multiply r4 by 10, ignoring overflow  shl r1, r4, 1 ; multiply by 2 shl r4, r4, 3 ; multiply by 8 add r4, r4, r1
  • 20. Verilog Digital Design — Chapter 7 — Processor Basics 20 Memory Instructions  Transfer data between registers and data memory  Compute address by adding an offset to a base register value  Load register from memory  ldm r1, (r2)+5  Store from register to memory  stm r1, (r4)-2  Use r0 if base address is 0  ldm r3, 23  ldm r3, (r0)+23  Condition codes not affected
  • 21. Verilog Digital Design — Chapter 7 — Processor Basics 21 Memory Instructions  Increment a 16-bit integer in memory  Little-endian: address of lsb in r2, msb in next location  ldm r1, (r2) ; increment lsb add r1, r1, 1 stm r1, (r2) ldm r1, (r2)+1 ; increment msb addc r1, r1, 0 ; with carry stm r1, (r2)+1
  • 22. Verilog Digital Design — Chapter 7 — Processor Basics 22 Input/Output Instructions  I/O controllers have registers that govern their operation  Each has an address, like data memory  Gumnut has separate data and I/O address spaces  Input from I/O register  inp r3, 157  inp r3, (r0)+157  Output to I/O register  out r3, (r7)  out r3, (r7)+0  Condition codes not affected  Further examples in Chapter 8
  • 23. Verilog Digital Design — Chapter 7 — Processor Basics 23 Branch Instructions  Programs can evaluate conditions and take alternate courses of action  Condition codes (Z, C) represent outcomes of arithmetic/logical/shift instructions  Branch instructions examine Z or C  bz, bnz, bc, bnc  Add a displacement to PC if condition is true  Specifies how many instructions forward or backward to skip  Counting from instruction after branch
  • 24. Verilog Digital Design — Chapter 7 — Processor Basics 24 Branch Example  Elapsed seconds in location 100  Increment, wrapping to 0 after 59  ldm r1, 100 add r1, r1, 1 sub r0, r1, 60 ; Z set if r1 = 60 bnz +1 ; Skip to store if add r1, r0, 0 ; Z is 0 stm r1, 100
  • 25. Verilog Digital Design — Chapter 7 — Processor Basics 25 Jump Instruction  Unconditionally skips forward or backward to specified address  Changes the PC to the address  Example: if r1 = 0, clear data location 100 to 0; otherwise clear location 200 to 0  Assume instructions start at address 10  10: sub r0, r1, 0 11: bnz +2 12: stm r0, 100 13: jmp 15 14: stm r0, 200 15: ...
  • 26. Verilog Digital Design — Chapter 7 — Processor Basics 26 Subroutines  A sequence of instructions that perform some operation  Can call them from different parts of a program using a jsb instruction  Subroutine returns with a ret instruction
  • 27. Verilog Digital Design — Chapter 7 — Processor Basics 27 Subroutine Example  Subroutine to increment second count  Address of count in r2  ldm r1, (r2) add r1, r1, 1 sub r0, r1, 60 bnz +1 add r1, r0, 0 stm r1, (r2) ret  Call to increment locations 100 and 102  add r2, r0, 100 jsb 20 add r2, r0, 102 jsb 20
  • 28. Verilog Digital Design — Chapter 7 — Processor Basics 28 Return Address Stack  The jsb saves the return address for use by the ret  But what if the subroutine includes a jsb?  Gumnut core includes an 8-entry push- down stack of return addresses return addr for first call return addr for second call return addr for first call return addr for second call return addr for third call
  • 29. Verilog Digital Design — Chapter 7 — Processor Basics 29 Miscellaneous Instructions  Instructions supporting interrupts  See Chapter 8  reti Return from interrupt  enai Enable interrupts  disi Disable interrupts  wait Wait for an interrupt  stby Stand by in low power mode until an interrupt occurs
  • 30. Verilog Digital Design — Chapter 7 — Processor Basics 30 The Gumnut Assembler  Gasm: translates assembly programs  Generates memory images for program text (binary-coded instructions) and data  See documentation on web site  Write a program as a text file  Instructions  Directives  Comments  Use symbolic labels
  • 31. Example Program

          ; Program to determine greater of value_1 and value_2

                  text
                  org 0x000            ; start here on reset
                  jmp main

          ; Data memory layout

                  data
          value_1:    byte 10
          value_2:    byte 20
          result:     bss 1

          ; Main program

                  text
                  org 0x010
          main:   ldm r1, value_1      ; load values
                  ldm r2, value_2
                  sub r0, r1, r2       ; compare values
                  bc value_2_greater
                  stm r1, result       ; value_1 is greater
                  jmp finish
          value_2_greater:
                  stm r2, result       ; value_2 is greater
          finish: jmp finish           ; idle loop
  • 32. Gumnut Instruction Encoding
      - Instructions are a form of information
          - Can be encoded in binary
      - Gumnut encoding
          - 18 bits per instruction
          - Divided into fields representing different aspects of the instruction
              - Opcodes and function codes
              - Register numbers
              - Addresses
  • 33. Gumnut Instruction Encoding
      Field layouts, most significant bit first. Widths in bits are given in parentheses and total 18 per format; "-" marks unused bits.

      Arith/Logical Register:   1110 (4) | rd (3) | rs (3) | rs2 (3) | - (2) | fn (3)
      Arith/Logical Immediate:  0 (1) | fn (3) | rd (3) | rs (3) | immed (8)
      Shift:                    110 (3) | - (1) | rd (3) | rs (3) | count (3) | - (3) | fn (2)
      Memory, I/O:              10 (2) | fn (2) | rd (3) | rs (3) | offset (8)
      Branch:                   111110 (6) | fn (2) | - (2) | disp (8)
      Jump:                     11110 (5) | fn (1) | addr (12)
      Miscellaneous:            1111110 (7) | fn (3) | - (8)
  • 34. Encoding Examples
      - Encoding for addc r3, r5, 24
          - Arithmetic immediate, fn = 001
          - Fields: 0 | fn | rd | rs | immed (widths 1, 3, 3, 3, 8)
          - Bits:   0 | 001 | 011 | 101 | 00011000
          - Hexadecimal: 05D18
      - Instruction encoded by 3ECFC
          - Branch format
          - Fields: 111110 | fn | unused | disp (widths 6, 2, 2, 8)
          - Bits:   111110 | 11 | 00 | 11111100
          - Instruction: bnc -4
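The field packing in these examples can be checked with shifts and masks. The sketch below is illustrative only: the function names are hypothetical, and the bit positions are assumptions read off the field widths on the encoding slides (immediate format 1, 3, 3, 3, 8; branch format 6, 2, 2, 8 with the two bits after fn unused). The function code 11 for bnc is likewise taken from the bit pattern shown for the branch example.

```python
def encode_alu_immed(fn, rd, rs, immed):
    """Pack an arithmetic/logical immediate instruction: 0 | fn | rd | rs | immed."""
    return (fn << 14) | (rd << 11) | (rs << 8) | (immed & 0xFF)


def encode_branch(fn, disp):
    """Pack a branch instruction: 111110 | fn | 2 unused bits | disp."""
    # Masking with 0xFF turns a negative displacement into 8-bit two's complement.
    return (0b111110 << 12) | (fn << 10) | (disp & 0xFF)


addc_word = encode_alu_immed(fn=0b001, rd=3, rs=5, immed=24)  # addc r3, r5, 24
bnc_word = encode_branch(fn=0b11, disp=-4)                    # bnc -4

addc_hex = f"{addc_word:05X}"
bnc_hex = f"{bnc_word:05X}"
```

Under these assumptions `addc_hex` is `05D18`, matching the slide. The branch word works out to `3ECFC`: with a 6-bit opcode of 111110, the two most significant bits of an 18-bit word are both 1, so the leading hexadecimal digit is 3.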
  • 35. Other Instruction Sets
      - 8-bit cores and microcontrollers
          - Xilinx PicoBlaze: like Gumnut
          - 8051, and numerous others like it
              - Originated as 8-bit microprocessors
              - Instructions encoded as one or more bytes
              - Instruction set is more complex and irregular
                  - Complex instruction set computer (CISC)
                  - Cf. reduced instruction set computer (RISC)
      - 16-, 32- and 64-bit cores
          - Mostly RISC
          - E.g., PowerPC, ARM, MIPS, Tensilica, …
  • 36. Instruction and Data Memory
      - In embedded systems
          - Instruction memory is usually ROM, flash, SRAM, or a combination
          - Data memory is usually SRAM
              - DRAM if large capacity is needed
      - Processor/memory interfacing
          - Gluing the signals together
  • 37. Example: Gumnut Memory
      [Figure: gumnut core connected to an instruction ROM and a data SRAM.
       Core clock and reset: clk_i, rst_i.
       Instruction port: inst_adr_o, inst_dat_i, inst_cyc_o, inst_stb_o, inst_ack_i.
       Data port: data_adr_o, data_dat_o, data_dat_i, data_cyc_o, data_stb_o, data_we_o, data_ack_i.
       Instruction ROM ports: adr, dat_o, en, clk_i. Data SRAM ports: adr, dat_o, dat_i, en, we, clk_i.
       Two D flip-flops clocked by clk also appear in the diagram.]
  • 38. Example: Gumnut Memory

          always @(posedge clk) // Instruction memory
            if (inst_cyc_o && inst_stb_o) begin
              inst_dat_i <= inst_ROM[inst_adr_o[10:0]];
              inst_ack_i <= 1'b1;
            end
            else
              inst_ack_i <= 1'b0;
  • 39. Example: Gumnut Memory

          always @(posedge clk) // Data memory
            if (data_cyc_o && data_stb_o)
              if (data_we_o) begin
                data_RAM[data_adr_o] <= data_dat_o;
                data_dat_i <= data_dat_o;
                data_ack_i <= 1'b1;
              end
              else begin
                data_dat_i <= data_RAM[data_adr_o];
                data_ack_i <= 1'b1;
              end
            else
              data_ack_i <= 1'b0;
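To see the handshake in the data memory block without a simulator, its clocked behaviour can be mimicked in software. This Python sketch is an approximation, not a replacement for the Verilog: the class name `DataMemoryModel`, the `clock` method, and the 256-location size are assumptions, and each call to `clock` stands for one rising clock edge.

```python
class DataMemoryModel:
    """Behavioural model of the clocked data memory process."""

    def __init__(self, size=256):      # size is an assumption of this sketch
        self.ram = [0] * size
        self.data_dat_i = 0            # data returned to the processor
        self.data_ack_i = 0            # acknowledge returned to the processor

    def clock(self, cyc, stb, we, adr, dat_o=0):
        """Apply one rising clock edge with the given processor outputs."""
        if cyc and stb:
            if we:
                self.ram[adr] = dat_o          # write the addressed location
                self.data_dat_i = dat_o        # written data is also reflected back
            else:
                self.data_dat_i = self.ram[adr]
            self.data_ack_i = 1                # acknowledge the access
        else:
            self.data_ack_i = 0                # no access in progress


mem = DataMemoryModel()
mem.clock(cyc=1, stb=1, we=1, adr=100, dat_o=42)   # write 42 to location 100
ack_after_write = mem.data_ack_i
mem.clock(cyc=0, stb=0, we=0, adr=0)               # idle cycle
ack_when_idle = mem.data_ack_i
mem.clock(cyc=1, stb=1, we=0, adr=100)             # read location 100
value_read = mem.data_dat_i
```

The model makes one property of the Verilog easy to see: the acknowledge is registered, so it is asserted on the clock edge that performs the access and cleared on the next edge unless another access is requested.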
  • 40. Example: Microcontroller Memory
      [Figure: 8051 microcontroller connected to an external SRAM through an address latch.
       8051 signals: P0, P2, ALE, PSEN, RD, WR.
       Latch ports: D, LE, Q.
       SRAM signals: A(16), A(15..8), A(7..0), D, CE, WE, OE.]
  • 41. 32-bit Memory
      - Four bytes per memory word
          - Little-endian: least significant byte at the lowest address
          - Big-endian: most significant byte at the lowest address
      [Figure: byte addresses 0 to 11, four per 32-bit word]
      - Partial-word read
          - Read all bytes; the processor selects those needed
      - Partial-word write
          - Use byte-enable signals
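Byte ordering and byte enables are easy to confuse, so a small Python sketch may help. It is illustrative only: the helper `write_word`, the example values, and the choice of little-endian lane numbering (lane 0 at the lowest address) are assumptions, not details from the slides.

```python
word = 0x12345678

# Byte stored at each of the four addresses of one word, lowest address first.
little_endian = word.to_bytes(4, "little")   # least significant byte first
big_endian = word.to_bytes(4, "big")         # most significant byte first


def write_word(memory, word_index, value, byte_enable):
    """Write only the enabled bytes of a 32-bit word into a bytearray."""
    base = 4 * word_index
    data = value.to_bytes(4, "little")       # assumed little-endian byte lanes
    for lane, enabled in enumerate(byte_enable):
        if enabled:
            memory[base + lane] = data[lane]


memory = bytearray(8)
# Partial-word write: update only byte lanes 0 and 3 of word 1.
write_word(memory, 1, 0xAABBCCDD, byte_enable=(True, False, False, True))
```

After the call, addresses 4 and 7 hold 0xDD and 0xAA while addresses 5 and 6 keep their old contents, which is the effect the byte-enable signals have on a partial-word write.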
  • 42. Example: MicroBlaze Memory
      [Figure: MicroBlaze connected to four byte-wide SSRAMs, one per byte lane (bits 0:7, 8:15, 16:23, 24:31).
       Each SSRAM has ports D_in, D_out, A, en, wr, clk; address bits 2:16 go to the SSRAM address inputs.
       MicroBlaze signals: Addr, Data_Write, Data_Read, AS, Read_Strobe, Write_Strobe, Byte_Enable(0) to Byte_Enable(3), Ready, Clk.
       One input in the diagram is tied to +V.]
  • 43. Cache Memory
      - For high-performance processors
          - Memory access time is several clock cycles
          - Performance bottleneck
      - Cache memory
          - Small fast memory attached to a processor
          - Stores most frequently accessed items, plus adjacent items
          - Locality: those items are most likely to be accessed again soon
  • 44. Cache Memory
      - Memory contents are divided into fixed-sized blocks (lines)
          - Cache copies whole lines from memory
      - When the processor accesses an item
          - If the item is in the cache: hit (fast access)
              - Occurs most of the time
          - If the item is not in the cache: miss
              - Line containing the item is copied from memory
              - Slower, but less frequent
              - May need to replace a line already in the cache
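The hit and miss behaviour can be illustrated with a toy model. The slides do not say how lines are placed in the cache, so the sketch below assumes a direct-mapped organization, in which each memory line can live in exactly one cache slot. The class name, the sizes, and the address trace are all made up for illustration.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only which lines are present."""

    def __init__(self, num_lines=4, line_size=4):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines   # tag of the line held in each slot
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        block = address // self.line_size    # which memory line holds the item
        index = block % self.num_lines       # the one slot that line may occupy
        tag = block // self.num_lines        # identifies the line within that slot
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag               # copy the line in, replacing any occupant
        self.misses += 1
        return False


cache = DirectMappedCache()
trace = [0, 1, 2, 3, 4, 0, 16, 0]
results = [cache.access(address) for address in trace]
```

In this trace the first access to address 0 misses and loads a whole line, so addresses 1 to 3 hit. Address 16 maps to the same slot as address 0 and replaces its line, so the final access to address 0 misses again.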
  • 45. Fast Main Memory Access
      - Optimize memory for line access by the cache
      - Wide memory
          - Read a line in one access
      - Burst transfers
          - Send the starting address, then read successive locations
      - Pipelining
          - Overlapping stages of memory access
          - E.g., address transfer, memory operation, data transfer
      - Double data rate (DDR), quad data rate (QDR)
          - Transfer on both rising and falling clock edges
  • 46. Summary
      - Embedded computer
          - Processor, memory, I/O controllers, buses
      - Microprocessors, microcontrollers, and processor cores
          - Soft-core processors for ASIC/FPGA
      - Processor instruction sets
          - Binary encoding for instructions
          - Assembly language programs
      - Memory interfacing
