Random Access @ The Salishan Conference
27 April 2016
Dileep Bhandarkar, Ph. D.
IEEE Life Fellow
Disclaimer
This presentation is based on personal experiences over the last 40+ years
in industry as a computer architect and is not presented on behalf of
current or past employers.
1958: Jack Kilby’s
Integrated Circuit
SSI -> MSI -> LSI -> VLSI -> OMGWLSI
In < 40 Years of Moore’s Law
From 2300 to >1 Billion Transistors
[Chart: millions of transistors (log scale, 0.001 to 10,000) versus year (’70 to ’10)]
Processors plotted: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® processor, Pentium® Pro, Pentium® 4, Itanium® 2
• Itanium® 2: 221M in 2002, 410M in 2003
• Penryn: 410M in 2007
• More than 1 Billion Transistors in 2006!
• Montecito: 1.7 Billion
• Tulsa: 1.3 Billion
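The growth quoted on this slide can be turned into an average doubling period. The sketch below is only a back-of-the-envelope check: it takes 2,300 transistors and "more than 1 billion in 2006" from the slide, and it assumes a 1971 start year for the 4004, which the slide itself does not state. It also assumes steady exponential growth between the two endpoints.

```python
import math

# Figures taken from the slide.
START_TRANSISTORS = 2_300
END_TRANSISTORS = 1_000_000_000
END_YEAR = 2006
# Assumption (not on the slide): the 4004 dates from 1971.
START_YEAR = 1971


def doubling_period_years(n_start, n_end, years):
    """Average years per doubling, assuming steady exponential growth."""
    doublings = math.log2(n_end / n_start)
    return years / doublings


growth = END_TRANSISTORS / START_TRANSISTORS
doublings = math.log2(growth)
period = doubling_period_years(
    START_TRANSISTORS, END_TRANSISTORS, END_YEAR - START_YEAR
)

print(f"Growth factor: {growth:,.0f}x")
print(f"Number of doublings: {doublings:.1f}")
print(f"Average doubling period: {period:.2f} years")
```

Under these assumptions the average comes out a little under two years per doubling, in line with the usual statement of Moore's Law.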
Dennard Scaling
Device or Circuit Parameter             Scaling Factor
Device dimension tox, L, W              1/K
Doping concentration Na                 K
Voltage V                               1/K
Current I                               1/K
Capacitance εA/t                        1/K
Delay time per circuit VC/I             1/K
Power dissipation per circuit VI        1/K^2
Power density VI/A                      1
Dennard’s 1974 paper summarizes transistor or circuit parameter changes under ideal MOSFET
device scaling conditions, where K is the unitless scaling constant.
The benefits of scaling: as transistors get smaller, they can switch faster and use less power.
Each new generation of process technology was expected to reduce minimum feature size by
approximately 0.7x (K ~1.4). A 0.7x reduction in linear feature size provided roughly a 2x
increase in transistor density.
Dennard scaling broke down around 2004 because of unscaled interconnect delays and our inability
to scale the voltage and the current due to reliability concerns.
But our ability to etch smaller transistors has continued, spawning multicore designs.
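The scaling table can be reproduced from its three primitive rules (dimensions, voltage, and current all scale by 1/K). The following is a minimal sketch, not Dennard's own formulation: the function name and dictionary keys are illustrative, and transistor density is added as the inverse of device area to match the "roughly 2x" figure quoted above.

```python
def dennard_scaling(k):
    """Ideal scaling factors for a unitless scaling constant k (> 1)."""
    dimension = 1 / k            # tox, L, W
    voltage = 1 / k
    current = 1 / k
    area = dimension * dimension
    capacitance = area / dimension             # epsilon*A/t
    delay = voltage * capacitance / current    # VC/I
    power = voltage * current                  # VI
    return {
        "dimension": dimension,
        "doping_concentration": k,
        "voltage": voltage,
        "current": current,
        "capacitance": capacitance,
        "delay": delay,
        "power_per_circuit": power,
        "power_density": power / area,         # VI/A
        "transistor_density": 1 / area,        # illustrative addition
    }


factors = dennard_scaling(1.4)
for name, factor in factors.items():
    print(f"{name:22s} x{factor:.2f}")
```

With K = 1.4, power per circuit falls to about half while density roughly doubles, so power density stays constant. That constant power density is what was lost once voltage and current stopped scaling.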
THE MULTICORE ERA
NEW DEVICE STRUCTURES & MATERIALS
ENERGY EFFICIENCY WITH POWER CONSTRAINTS
Post-Dennard Scaling
• Moore’s Law continued for 10 more years!
• Instruction Level Parallelism harder to find
• Increasing single-stream scalar performance often requires non-linear increase in design complexity, area, and power
• Vectorization for increasing floating point performance
Something New Needed Every Two Process Generations to Keep Moore’s Law Going
[Figure labels: 45 nm, 32 nm, 22 nm]
4 is Better Than 2! And 8 is Even Better!
22 nm Intel Ivy Bridge Xeon E5/E7 had 15 cores in 525 mm²
22 nm Intel Haswell Xeon E5/E7 had 18 cores in 662 mm²
14 nm Intel Broadwell Xeon E5/E7 has 24 cores in 456 mm²
FLOPS per core also doubled with each generation
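One rough way to read these figures is cores per unit of die area. The sketch below uses only the core counts and die sizes quoted on the slide. The metric itself (cores per 100 mm²) is an illustrative choice, and it ignores that die area also holds cache, I/O, and interconnect.

```python
# (process node, product, cores, die area in mm^2) as quoted on the slide
chips = [
    ("22 nm", "Ivy Bridge Xeon E5/E7", 15, 525),
    ("22 nm", "Haswell Xeon E5/E7", 18, 662),
    ("14 nm", "Broadwell Xeon E5/E7", 24, 456),
]


def cores_per_100mm2(cores, area_mm2):
    """Core count normalized to 100 mm^2 of die area."""
    return 100 * cores / area_mm2


density = {name: cores_per_100mm2(cores, area) for _, name, cores, area in chips}
for node, name, cores, area in chips:
    print(f"{node} {name}: {density[name]:.2f} cores per 100 mm^2")

# Compare the 14 nm part with the larger of the two 22 nm parts.
ratio = density["Broadwell Xeon E5/E7"] / density["Haswell Xeon E5/E7"]
print(f"Broadwell vs Haswell core density: {ratio:.2f}x")
```

By this measure the move from 22 nm to 14 nm gives a little under twice the cores per unit area, which is roughly what a 0.7x linear shrink would suggest.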
© 2013 Qualcomm Technologies, Inc. All Rights Reserved.
CPU scaling is reaching diminishing returns
[Diagram: timeline from the Single-Core Era through the Multi-Core Era to what comes next]
Single-Core Era: uniprocessor scaling (Single-Core CPU)
• Hitting a limit on:
  • Clock rate
  • Instructions per cycle
• Becomes energy inefficient
Multi-Core Era: multiprocessor scaling (Multi-Core CPU)
• Works well for scale-out and embarrassingly parallel applications
• Memory bandwidth lags core count increase
What is next?
• Heterogeneous Computing Era
• New Architectures
Thoughts about the Future?
• 14 nm is in production but ramping slower than previous generations
  – Future generations will be even harder!
• Costs per wafer increasing
  – Capital, more process steps, increased mask costs, EUV cost
  – Cost per transistor decreasing, but at a slower rate
• Moore’s Law is slowing down beyond 14 nm
  – New process generation every 30 months
  – Economics, Physics, Materials, Power, Lithography
  – What is the best use for increased transistor density?
  – Other architectures?
  – Heterogeneous Processing Engines?
• Is vectorized floating point sufficient?
• Can we truly exploit higher levels of parallelism in large “traditional” systems effectively & efficiently?
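The effect of a slower cadence compounds quickly. As a hedged illustration, the sketch below compares the 30-month cadence mentioned above with a 24-month cadence. The 24-month baseline is an assumption used for comparison, not a figure from these slides, and the 2x density gain per generation is the idealized value from the Dennard scaling slide.

```python
def density_growth(years, months_per_generation, gain_per_generation=2.0):
    """Cumulative density gain over `years` at a fixed generation cadence."""
    generations = years * 12 / months_per_generation
    return gain_per_generation ** generations


YEARS = 10
baseline = density_growth(YEARS, 24)   # assumed historical cadence
slower = density_growth(YEARS, 30)     # cadence quoted on the slide

print(f"24-month cadence over {YEARS} years: {baseline:.0f}x density")
print(f"30-month cadence over {YEARS} years: {slower:.0f}x density")
print(f"Annual growth: {density_growth(1, 24):.2f}x vs {density_growth(1, 30):.2f}x")
```

Under these assumptions, stretching each generation by six months halves the cumulative density gain over a decade.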
Thank You
dbhandarkar@outlook.com
[Figure labels: process nodes 65 nm, 45 nm, 32 nm, 22 nm, 14 nm, 10 nm, 7 nm, 5 nm]
