Multi-Core Computing
Osama Awwad
Department of Computer Science
Western Michigan University
Thursday, March 2, 2023
Multi-Core Computer
• A multi-core microprocessor combines two or more independent processors into a single package, often a single integrated circuit (IC).
• A dual-core device contains two independent microprocessors.
• In general, multi-core microprocessors allow a computing device to exhibit some form of thread-level parallelism (TLP) without including multiple microprocessors in separate physical packages.
Major Technology Providers
• The latest versions of many architectures use multi-core designs, including PA-RISC (PA-8800), IBM POWER (POWER7), SPARC (UltraSPARC IV), and various processors from Intel and AMD.
• There is some controversy as to whether multiple cores on a chip are the same thing as multiple processors. Major technology providers are divided on this issue.
• IBM considers its dual-core POWER4 and POWER5 to be two processors, just packaged together.
• Sun Microsystems, in contrast, considers its UltraSPARC IV to be a multi-threaded rather than multi-processor chip.
• Intel considers its multi-core designs to be a single processor.
• This is not an idle debate, because software is often more expensive when licensed for more processors.
   - Microsoft, Red Hat Linux, and SUSE Linux license their operating systems per chip, not per core.
Single-core computer
Multi-core architectures
• Replicate multiple processor cores on a single die.
[Figure: a multi-core CPU chip containing Core 1, Core 2, Core 3, and Core 4]
Multi-core CPU chip
• The cores fit on a single processor socket
• Also called CMP (Chip Multi-Processor)
[Figure: one chip containing core 1, core 2, core 3, and core 4]
The cores run in parallel
[Figure: core 1 to core 4, each running one thread (thread 1 to thread 4)]
Within each core, threads are time-sliced (just like on a uniprocessor)
[Figure: core 1 to core 4, each time-slicing several threads]
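Time-slicing on a single core can be sketched as a toy simulation. This assumes a simple round-robin policy, which the slides do not specify; the thread names, work amounts, and the `quantum` parameter are hypothetical.

```python
from collections import deque


def round_robin(work, quantum):
    """Return the order of (thread, time_units) slices run on one core.

    work: dict mapping a thread name to the time units it still needs.
    quantum: maximum time units a thread runs before the core switches
             to the next waiting thread (must be positive).
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queue = deque(work.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished yet: go to the back of the queue.
            queue.append((name, remaining - ran))
    return timeline


# Three threads sharing one core, one time unit per slice.
timeline = round_robin({"A": 3, "B": 1, "C": 2}, quantum=1)
```

Only one thread runs in any slice; the threads merely take turns. On a multi-core chip, each core would run its own queue like this one.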
Interaction with OS
• The OS perceives each core as a separate processor
• The OS scheduler maps threads/processes to different cores
• Most major operating systems support multi-core today
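A small sketch of how a program can ask the OS how many processors it perceives and then hand it one thread per processor to schedule. `os.cpu_count()` reports logical processors, so on SMT hardware the number may exceed the count of physical cores. The helper names are illustrative. In standard CPython builds the global interpreter lock keeps these threads from running Python bytecode in parallel, so this shows the thread-to-OS hand-off, not a speedup.

```python
import os
import threading


def logical_processors():
    """Number of processors the OS reports (os.cpu_count() may return None)."""
    return os.cpu_count() or 1


def run_one_thread_per_processor(task):
    """Start one OS thread per reported processor and collect the results.

    The OS scheduler, not this program, decides which core runs each thread.
    """
    n = logical_processors()
    results = [None] * n

    def worker(index):
        results[index] = task(index)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


squares = run_one_thread_per_processor(lambda i: i * i)
```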
Why multi-core?
• Difficult to make single-core clock frequencies even higher
• Many new applications are multithreaded
• General trend in computer architecture (shift towards more parallelism)
Instruction-level parallelism
• Parallelism at the machine-instruction level
• The processor can re-order and pipeline instructions, split them into microinstructions, do aggressive branch prediction, etc.
• Instruction-level parallelism enabled rapid increases in processor speeds over the last 15 years
Thread-level parallelism (TLP)
• This is parallelism on a coarser scale
• A server can serve each client in a separate thread (Web server, database server)
• A computer game can do AI, graphics, and physics in three separate threads
• Single-core superscalar processors cannot fully exploit TLP
• Multi-core architectures are the next step in processor evolution: explicitly exploiting TLP
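The game example above can be sketched with three threads. The subsystem bodies here are hypothetical stand-ins for real AI, graphics, and physics work, and the function name is illustrative.

```python
import threading


def run_game_frame():
    """Run three independent subsystems, each in its own thread."""
    results = {}
    lock = threading.Lock()

    def subsystem(name, work):
        value = work()
        with lock:  # protect the shared results dict
            results[name] = value

    # Placeholder workloads; a real game would do far more here.
    jobs = {
        "ai": lambda: sum(range(1000)),
        "graphics": lambda: max(range(1000)),
        "physics": lambda: 0.5 * 9.81 * 2.0 ** 2,
    }
    threads = [
        threading.Thread(target=subsystem, args=(name, work))
        for name, work in jobs.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


frame = run_game_frame()
```

Because the three tasks do not depend on each other, a multi-core chip can place each on its own core; a single core would have to time-slice them.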
General context: Multiprocessors
• A multiprocessor is any computer with several processors
• SIMD: single instruction, multiple data
   - Example: modern graphics cards
• MIMD: multiple instructions, multiple data
   - Example: the Lemieux cluster, Pittsburgh Supercomputing Center
Multiprocessor memory types
• Shared memory: in this model, there is one (large) common shared memory for all processors
• Distributed memory: in this model, each processor has its own (small) local memory, and its content is not replicated anywhere else
A multi-core processor is a special kind of multiprocessor: all processors are on the same chip
• Multi-core processors are MIMD: different cores execute different threads (Multiple Instructions), operating on different parts of memory (Multiple Data).
• Multi-core is a shared-memory multiprocessor: all cores share the same memory
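A minimal sketch of the shared-memory MIMD idea: several threads run their own instruction streams over different parts of one shared list. The function name and the chunking scheme are assumptions for illustration.

```python
import threading


def parallel_sum(data, workers):
    """Sum a shared list by giving each thread its own slice of it."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    partial = [0] * workers
    chunk = (len(data) + workers - 1) // workers  # ceiling division

    def work(index):
        # Every thread reads the same shared list but a different region,
        # and writes only its own slot in `partial`.
        start = index * chunk
        partial[index] = sum(data[start:start + chunk])

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


total = parallel_sum(list(range(100)), workers=4)
```

No data is copied between workers, which is what distinguishes this from the distributed-memory model, where each processor would hold only its own local part.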
What applications benefit from multi-core?
• Database servers
• Web servers (Web commerce)
• Telecommunication markets: 6WINDGate (datapath and control plane)
• Multimedia applications
• Scientific applications, CAD/CAM
• In general, applications with thread-level parallelism (as opposed to instruction-level parallelism)
[Callout: each can run on its own core]
More examples
• Editing a photo while recording a TV show through a digital video recorder
• Downloading software while running an anti-virus program
• “Anything that can be threaded today will map efficiently to multi-core”
• But some applications are difficult to parallelize
Simultaneous multithreading (SMT)
• Permits multiple independent threads to execute SIMULTANEOUSLY on the SAME core
• Weaving together multiple “threads” on the same core
• Example: if one thread is waiting for a floating-point operation to complete, another thread can use the integer units
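The floating-point/integer example can be illustrated with a toy cycle-count model. This is a deliberately simplified sketch, not a model of any real core: every operation takes one cycle, the first thread in the list gets priority, and the unit counts and operation lists are hypothetical.

```python
def cycles_needed(threads, units, smt):
    """Count cycles to finish all threads on one core.

    threads: list of per-thread operation lists, e.g. ["fp", "int"].
    units:   dict mapping an operation kind to its number of functional units.
    smt:     if False, only one thread may issue an operation per cycle.
    """
    for ops in threads:
        for kind in ops:
            if units.get(kind, 0) < 1:
                raise ValueError(f"no functional unit for {kind!r}")
    pending = [list(ops) for ops in threads]
    cycles = 0
    while any(pending):
        free = dict(units)  # all units are free at the start of a cycle
        issued = 0
        for ops in pending:
            if not ops:
                continue
            if not smt and issued == 1:
                break  # without SMT, one thread per cycle
            kind = ops[0]
            if free[kind] > 0:
                free[kind] -= 1
                ops.pop(0)
                issued += 1
        cycles += 1
    return cycles


one_of_each = {"int": 1, "fp": 1}
mixed = [["fp", "fp"], ["int", "int"]]      # threads use different units
same_unit = [["int", "int"], ["int", "int"]]  # threads compete for one unit

without_smt = cycles_needed(mixed, one_of_each, smt=False)
with_smt = cycles_needed(mixed, one_of_each, smt=True)
contended = cycles_needed(same_unit, one_of_each, smt=True)
```

In this model SMT halves the cycle count only when the two threads need different units; when both need the single integer unit, SMT gives no benefit.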
Without SMT, only a single thread can run at any given time
[Figure: block diagram of one core (BTB and I-TLB, decoder, trace cache, rename/alloc, uop queues, schedulers, integer and floating-point units, L1 D-cache and D-TLB, uCode ROM, BTB, L2 cache and control, bus); Thread 1 uses the floating-point unit]
Without SMT, only a single thread can run at any given time
[Figure: the same core block diagram; Thread 2 uses the integer unit]
SMT processor: both threads can run concurrently
[Figure: the same core block diagram; Thread 1 uses the floating-point unit while Thread 2 uses the integer unit]
But: can’t simultaneously use the same functional unit
[Figure: the same core block diagram; Thread 1 and Thread 2 both try to use the integer unit, marked IMPOSSIBLE]
This scenario is impossible with SMT on a single core (assuming a single integer unit)
SMT is not a “true” parallel processor
• Enables better threading (e.g. up to 30% higher performance)
• The OS and applications perceive each simultaneous thread as a separate “virtual processor”
• The chip has only a single copy of each resource
• Compare to multi-core: each core has its own copy of resources
Multi-core: threads can run on separate cores
[Figure: two complete core block diagrams side by side, each with its own pipeline, L2 cache and control, and bus; Thread 1 runs on one core and Thread 3 on the other]
Multi-core: threads can run on separate cores
[Figure: the same two core block diagrams; Thread 2 runs on one core and Thread 4 on the other]
Combining Multi-core and SMT
• Cores can be SMT-enabled (or not)
• The different combinations:
   - Single-core, non-SMT: standard uniprocessor
   - Single-core, with SMT
   - Multi-core, non-SMT
   - Multi-core, with SMT
• The number of SMT threads: 2, 4, or sometimes 8 simultaneous threads
• Intel calls them “hyper-threads”
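The four combinations differ in how many threads the hardware can run at once, which is simply cores times SMT threads per core. A short worked sketch, with an illustrative function name and a dual-core, 2-way SMT configuration chosen as the example:

```python
def hardware_threads(cores, smt_threads_per_core=1):
    """Threads that can run at the same time: cores x SMT threads per core."""
    if cores < 1 or smt_threads_per_core < 1:
        raise ValueError("both counts must be at least 1")
    return cores * smt_threads_per_core


combinations = {
    "single-core, non-SMT": hardware_threads(1),
    "single-core, 2-way SMT": hardware_threads(1, 2),
    "dual-core, non-SMT": hardware_threads(2),
    "dual-core, 2-way SMT": hardware_threads(2, 2),
}
```

The OS sees each of these hardware threads as a processor, so a dual-core chip with 2-way SMT appears as four processors even though SMT threads on the same core share its functional units.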
SMT dual-core: all four threads can run concurrently
[Figure: two complete core block diagrams side by side; Thread 1 and Thread 2 run on one core, Thread 3 and Thread 4 on the other]
Comparison: multi-core vs SMT
• Multi-core:
   - Since there are several cores, each is smaller and not as powerful (but also easier to design and manufacture)
   - However, great with thread-level parallelism
• SMT:
   - Can have one large and fast superscalar core
   - Great performance on a single thread
   - Mostly still only exploits instruction-level parallelism
The memory hierarchy
• If simultaneous multithreading only: all caches shared
• Multi-core chips:
   - L1 caches private
   - L2 caches private in some architectures and shared in others
• Memory is always shared
• Dual-core Intel Xeon processors
• Each core is hyper-threaded
• Private L1 caches
• Shared L2 caches
[Figure: CORE 0 and CORE 1, each running hyper-threads and each with its own L1 cache, above a shared L2 cache and memory]
Designs with private L2 caches
• Both L1 and L2 are private. Examples: AMD Opteron, AMD Athlon, Intel Pentium D
[Figure: CORE 0 and CORE 1, each with its own L1 cache and its own L2 cache, above memory]
• A design with L3 caches. Example: Intel Itanium 2
[Figure: CORE 0 and CORE 1, each with its own L1, L2, and L3 cache, above memory]
Windows Task Manager
[Figure: Windows Task Manager screenshot with core 1 and core 2 labeled]
Advantages / Disadvantages
Advantages
• Cache coherency circuitry can operate at a much higher clock rate than is possible if the signals have to travel off-chip
• Signals between different CPUs travel shorter distances, so they degrade less
• These higher-quality signals allow more data to be sent in a given time period, since individual signals can be shorter and do not need to be repeated as often
• A dual-core processor uses slightly less power than two coupled single-core processors
Disadvantages
• The ability of multi-core processors to increase application performance depends on the use of multiple threads within applications.
• Most current video games will run faster on a 3 GHz single-core processor than on a 2 GHz dual-core processor (of the same core architecture).
• Two processing cores sharing the same system bus and memory bandwidth limit the real-world performance advantage.
• If a single core is close to being memory-bandwidth limited, going to dual-core might give only a 30% to 70% improvement.
• If memory bandwidth is not a problem, a 90% improvement can be expected.
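The 3 GHz single-core versus 2 GHz dual-core comparison can be worked through with Amdahl's law. The slides do not name this law; it is brought in here as a standard way to estimate the effect of a program's parallel fraction. The sketch assumes performance scales linearly with clock frequency and ignores memory bandwidth limits.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def dual_vs_single(parallel_fraction, single_ghz=3.0, dual_ghz=2.0):
    """Performance of the dual-core chip relative to the single-core chip.

    Assumes the same core architecture, so one core's speed is
    proportional to its clock frequency.
    """
    return (dual_ghz / single_ghz) * amdahl_speedup(parallel_fraction, 2)


single_threaded = dual_vs_single(0.0)       # no parallel work at all
break_even = dual_vs_single(2.0 / 3.0)      # two thirds of the work is parallel
fully_parallel = dual_vs_single(1.0)        # everything is parallel
```

Under these assumptions the dual-core chip reaches only two thirds of the single-core chip's speed on single-threaded code, and it pulls ahead only once more than two thirds of the work can run in parallel.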
Conclusion
• Multi-core chips are an important new trend in computer architecture
• Several new multi-core chips are in design phases
• Parallel programming techniques are likely to gain importance
