MAHESH SINGH
ASSISTANT PROFESSOR
ADVANCED EDUCATIONAL INSTITUTE
www.advanced.edu.in
Understanding CPU Caching Concepts
Concept of Caching
The need for a cache arises for two reasons:
The concept of locality of reference.
-> Roughly 5 percent of the data is accessed 95 percent of the time, so it makes sense to cache that 5 percent of the data.
The gap between CPU and main memory speeds.
-> By analogy with the producer-consumer problem, the CPU is the consumer and RAM and hard disks act as producers. Slow producers limit the performance of the consumer.
Locality of Reference
Spatial locality: if a particular memory location, say the nth location, is referenced at a particular time, then it is likely that the (n+1)th memory location will be referenced in the near future.
The actual piece of data that was requested is called the critical word, and the surrounding group of bytes that gets fetched along with it is called a cache line or cache block.
Temporal locality: if a particular memory location is referenced at some time T, then it is likely that the same location will be referenced again at time T + delta.
This is very similar to the concept of the working set, i.e., the set of pages which the CPU frequently accesses.
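Spatial locality is what cache lines exploit: requesting one byte pulls in its neighbours. A minimal sketch, assuming an illustrative 64-byte line size (real line sizes vary by CPU):

```python
# Sketch: which bytes travel with the critical word, assuming a
# 64-byte cache line (the line size here is an illustrative choice).
LINE_SIZE = 64

def cache_line_range(address):
    """Return the (start, end) byte range of the cache line that
    would be fetched when `address` (the critical word) is requested."""
    start = (address // LINE_SIZE) * LINE_SIZE
    return start, start + LINE_SIZE - 1

# Requesting byte 100 pulls in bytes 64..127: neighbours of the
# critical word arrive alongside it, which is why sequential access
# (spatial locality) is fast.
print(cache_line_range(100))  # (64, 127)
```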
CPU Cache and its operation
A CPU cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations. Locality of reference drives the caching concept: we cache the most frequently used data and instructions for faster access.
A CPU cache may be a data cache or an instruction cache. Unlike RAM, cache is not expandable.
The CPU first checks the L1 cache for data; if it does not find it at L1, it moves on to L2 and finally L3. If the data is not found at L3, it is a cache miss, and RAM is searched next, followed by the hard drive.
If the CPU finds the requested data in the cache, it is a cache hit; if not, it is a cache miss.
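The lookup order above can be sketched as a chain of levels tried in turn. The dictionaries standing in for cache levels and their contents are purely illustrative:

```python
# Sketch of the L1 -> L2 -> L3 -> RAM lookup order: each level is tried
# in turn, and the first level holding the address wins. The dicts
# standing in for the cache levels are illustrative only.
def lookup(address, l1, l2, l3, ram):
    for name, level in (("L1", l1), ("L2", l2), ("L3", l3)):
        if address in level:
            return name, level[address]   # cache hit at this level
    return "RAM", ram[address]            # miss at every cache level

l1, l2, l3 = {0x10: "a"}, {0x20: "b"}, {0x30: "c"}
ram = {0x10: "a", 0x20: "b", 0x30: "c", 0x40: "d"}
print(lookup(0x40, l1, l2, l3, ram))  # ('RAM', 'd') - a cache miss
```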
Levels of caching: speed and size comparisons

| Level | Access Time | Typical Size | Technology | Managed By |
|---|---|---|---|---|
| Level 1 Cache (on-chip) | 2-8 ns | 8 KB - 128 KB | SRAM | Hardware |
| Level 2 Cache (off-chip) | 5-12 ns | 0.5 MB - 8 MB | SRAM | Hardware |
| Main Memory | 10-60 ns | 64 MB - 2 GB | DRAM | Operating System |
| Hard Disk | 3,000,000 - 10,000,000 ns | 100 GB - 2 TB | Magnetic | Operating System |
Cache organization
When the processor needs to read or write a location in main memory, it first checks whether that memory location is in the cache. This is accomplished by comparing the address of the memory location to all tags in the cache that might contain that address. If the processor finds that the memory location is in the cache, we say that a cache hit has occurred; otherwise, we speak of a cache miss.
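The tag comparison can be sketched as a search over candidate frames. The frame contents here are hypothetical, chosen only to show a hit and a miss:

```python
# Sketch of the tag check: each candidate frame holds a (valid, tag)
# pair, and a hit requires a valid frame whose stored tag matches the
# tag of the requested address. Frame contents are illustrative.
def is_hit(request_tag, candidate_frames):
    """candidate_frames: list of (valid, tag) pairs that might hold
    the requested address."""
    return any(valid and tag == request_tag
               for valid, tag in candidate_frames)

frames = [(True, 0x1A), (False, 0x2B), (True, 0x3C)]
print(is_hit(0x3C, frames))  # True  - cache hit
print(is_hit(0x2B, frames))  # False - frame holds that tag but is invalid
```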
Cache Entry structure
Cache row entries usually have the following structure:
Data block | Tag | Index | Displacement | Valid bit
The data block (cache line) contains the actual data fetched from main memory. The memory address is split into the tag, the index, and the displacement (offset), while the valid bit denotes that this particular entry holds valid data.
• The index length is log2(number of cache rows) bits and describes which row the data has been put in.
• The displacement (offset) length is log2(block size in bytes) bits and specifies which byte of the stored block we need.
• The tag length is: address length − index length − displacement length.
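The address split can be written as straightforward bit arithmetic. A minimal sketch, assuming illustrative sizes of 64-byte lines (6 offset bits) and 256 rows (8 index bits), not tied to any particular CPU:

```python
# Sketch: splitting a memory address into tag | index | offset, assuming
# 64-byte lines (6 offset bits) and 256 rows (8 index bits). Both sizes
# are illustrative choices.
OFFSET_BITS = 6   # log2(64-byte block)
INDEX_BITS = 8    # log2(256 rows)

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)            # low bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)            # remaining bits
    return tag, index, offset

tag, index, offset = split_address(0x12345678)
# Recombining the three fields reproduces the original address:
assert (tag << (OFFSET_BITS + INDEX_BITS)) | (index << OFFSET_BITS) | offset == 0x12345678
```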
Cache organization - 1
Cache is divided into blocks, which form the basic unit of cache organization. RAM is also organized into blocks of the same size as the cache's blocks.
When the CPU requests a byte from a particular RAM block, it needs to be able to determine three things very quickly:
• Whether or not the needed block is actually in the cache
• The location of the block within the cache
• The location of the desired byte within the block
Mapping RAM blocks to cache blocks
Fully associative: any RAM block can be stored in any available block frame. The problem with this scheme is that to retrieve a specific block from the cache, you have to check the tag of every single block frame in the entire cache, because the desired block could be in any of the frames.
Direct mapping: in a direct-mapped cache, each block frame can cache only a certain subset of the blocks in main memory. For example, every RAM block X for which X modulo the number of frames equals 1 is always stored in cache block 1.
The problem with this approach is that certain cache blocks could remain unused while others suffer frequent eviction of their entries.
N-way set associative: RAM block X can be mapped to any of the N frames in its set, e.g. either cache block X or cache block Y in a 2-way design.
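The three schemes differ only in how many frames a given RAM block is allowed to occupy. A sketch comparing them for an illustrative 8-frame cache with 2-way sets (both counts are arbitrary choices):

```python
# Sketch: legal frames for a RAM block under the three mapping schemes,
# assuming an 8-frame cache organized as four 2-way sets. The frame
# count and associativity are illustrative.
NUM_FRAMES = 8
WAYS = 2                                  # 2-way set associative
NUM_SETS = NUM_FRAMES // WAYS

def direct_mapped_frame(block):
    return block % NUM_FRAMES             # exactly one legal frame

def set_associative_frames(block):
    s = block % NUM_SETS                  # any frame within the block's set
    return [s * WAYS + w for w in range(WAYS)]

def fully_associative_frames(block):
    return list(range(NUM_FRAMES))        # any frame at all

print(direct_mapped_frame(13))      # 5: the single frame 13 mod 8
print(set_associative_frames(13))   # [2, 3]: the two frames of set 1
```

Direct mapping needs one tag comparison per lookup, set-associative needs N, and fully associative needs one per frame, which is the lookup-cost trade-off described above.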
Handling Cache Miss
In order to make room for the new entry on a cache miss, the cache has to evict one of the existing entries.
The heuristic it uses to choose the entry to evict is called the replacement policy. The fundamental problem with any replacement policy is that it must predict which existing cache entry is least likely to be used in the future. Some replacement policies are:
• Random eviction: remove any cache entry chosen at random.
• LIFO: evict the most recently added cache entry.
• FIFO: evict the oldest cache entry.
• LRU: evict the least recently used cache entry.
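LRU, the most common of these policies, can be sketched with an `OrderedDict` that keeps entries in recency order. A minimal illustration (the two-entry capacity is an arbitrary choice):

```python
from collections import OrderedDict

# Minimal LRU replacement sketch: the OrderedDict keeps keys in recency
# order, so eviction removes the first (least recently used) key.
# The capacity of 2 is illustrative.
class LRUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.entries = OrderedDict()

    def access(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)       # refresh recency on a hit
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used

cache = LRUCache()
cache.access("a", 1)
cache.access("b", 2)
cache.access("a", 1)        # "a" becomes the most recently used entry
cache.access("c", 3)        # evicts "b", the least recently used
print(list(cache.entries))  # ['a', 'c']
```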
Mirroring Cache to Main memory
If data are written to the cache, they must at some point be written to main memory (and to higher-order caches) as well. The timing of this write is controlled by what is known as the write policy.
In a write-through cache, every write to the cache causes a write to main memory and to higher-order caches such as L2 and L3.
In a write-back (or copy-back) cache, writes are not immediately mirrored to main memory. Instead, the cache tracks which locations have been written. Such entries are written to main memory and higher-order caches just before the cache entry is evicted.
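The write-back behaviour hinges on a dirty flag per entry: the memory write is deferred until eviction. A deliberately tiny single-entry sketch (the class and addresses are hypothetical):

```python
# Sketch of write-back behaviour: a write marks the entry dirty, and
# main memory is updated only when the dirty entry is evicted. The
# single-entry "cache" and addresses are illustrative.
class WriteBackEntry:
    def __init__(self, addr, value):
        self.addr, self.value, self.dirty = addr, value, False

    def write(self, value):
        self.value = value
        self.dirty = True                    # defer the memory write

    def evict(self, memory):
        if self.dirty:
            memory[self.addr] = self.value   # flush only on eviction
        self.dirty = False

memory = {0x10: 0}
entry = WriteBackEntry(0x10, 0)
entry.write(42)
print(memory[0x10])   # still 0: memory not yet updated
entry.evict(memory)
print(memory[0x10])   # 42: flushed at eviction time
```

A write-through version would update `memory` inside `write()` itself, trading extra memory traffic for never holding stale data below the cache.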
Stale data in cache
The data in main memory being cached may be changed by other entities (e.g. peripherals using direct memory access, or another core of a multi-core processor), in which case the copy in the cache may become out-of-date, or stale.
Conversely, when one core of a multi-core processor updates data in its cache, the copies of that data in caches associated with other cores become stale.
Communication protocols between the cache managers that keep the data consistent are known as cache coherence protocols, e.g. snoopy-based, directory-based, and token-based protocols.
State of the Art today
• Current research on cache design and on handling cache coherence is biased toward multicore architectures.
References
Wikipedia: http://en.wikipedia.org/wiki/CPU_cache
ArsTechnica: http://arstechnica.com/
Intel Software: http://software.intel.com
What Every Programmer Should Know About Memory - Ulrich Drepper, Red Hat, Inc.
Q/A