DEVELOP HIGH BANDWIDTH-LOW LATENCY
ELECTRONIC SYSTEMS FOR AI/ML APPLICATION
Deepak Shankar
Founder
Mirabilis Design Inc.
Email: dshankar@mirabilisdesign.com
Logistics
All attendees are set on mute.
To ask a question, click on the arrow to the left of Chat and
type the question. Folks are standing by to answer your
questions. There will also be time at the end for Q&A.
Agenda
Architecture Exploration of Electronic Systems
Introduction to System Modeling
VisualSim Libraries and Architecture Exploration requirements
VisualSim Demonstration and Analysis
◦ Software
◦ Semiconductor
◦ Power-Performance trade-off
Company profile
Architecture Exploration of Electronic
Systems
INTRODUCTION AND BENEFITS
Modeling Electronic Systems
Current approach
◦ Use of analytical models such as Spreadsheet and Worst-Case Execution Time
◦ Move from the high-level specification to building prototypes
◦ WCET and Spreadsheets are highly inaccurate
◦ Prototypes take too long to develop and also have limited exploration capacity
Proposed Approach
◦ Add a systems engineering layer after the analytical analysis
◦ Create a virtual prototype of the full system: hardware, software, RTOS and network connections
◦ Conduct trade-offs early in the design cycle with detailed knowledge of the system operation
Traditional Approach-
Spreadsheet-based Traffic and Power Analysis
Proposed Approach-
Full Braking System
Input Spreadsheet and Trace file
Generated Report and Plots
Reuse existing data to kickstart model development
Analysis and Experiment
Understand the connectivity between all the individual components and sub-systems
Evaluate timing, throughput, power, heat and functional correctness using a single model
Measure the latency between network interface and processed output
Identify opportunities for hardware acceleration
Partition applications across multi-core, multi-processor and multi-chassis
Exploration of emerging technology
◦ New processor family, new backplane technology and better integrated memory
Why Deploy the New Approach
Eliminate all surprises before integration
Gain visibility into system operation and requirements early in the
design process
Complete visibility into constraints for each packet/request,
protocol/control, and software/hardware
Determine requirements for hardware and network components
Identify bottlenecks, limitations and reusability
Introduction to System Modeling
DEMO MODELS
System Architecture Modeling Methods
Application and Software behavior
Network or backplane Modeling
Hardware architecture
Network Topology
Network Model with
Scheduler and Flow Control
Software Code for Scheduler Algorithm
/* Scan Queues based on receiving input, user algorithm here */
Select = 1
WAIT (1.0E-08)
while (true) {
    while (Select <= Ingress_Size) {
        if (getBlockStatus(Smart_Resource_Name,"length",Select) > 0 && getBlockStatus("Egress","length",Select) < Threshold) {
            token = getBlockStatus(Smart_Resource_Name,"copy",Select)
            WAIT ((token.Size) / Scan_Rate)
            SEND (pop,Select)
            Index = Select - 1
            InThru(Index) = InThru(Index) + token.Size
        }
        Select = Select + 1
    }
    Select = 1
    WAIT (1.0E-09)
}
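For readers unfamiliar with the scripting language, the scan loop above can be paraphrased in Python. This is a minimal sketch and not VisualSim code: the `deque`-based queues, the `scan_once` name and the example values are assumptions made for illustration, and the WAIT delays are only summed into a number rather than simulated.

```python
from collections import deque


def scan_once(ingress, egress, threshold, scan_rate, in_thru):
    """One pass over all ingress queues (the inner while loop of the script).

    A token moves only when its ingress queue is non-empty and the matching
    egress queue holds fewer than `threshold` tokens (the flow-control check).
    Returns the accumulated scan delay, i.e. the sum of token size / scan_rate.
    """
    elapsed = 0.0
    for i, queue in enumerate(ingress):
        if len(queue) > 0 and len(egress[i]) < threshold:
            token = queue.popleft()               # SEND (pop, Select)
            elapsed += token["Size"] / scan_rate  # WAIT (token.Size / Scan_Rate)
            egress[i].append(token)
            in_thru[i] += token["Size"]           # per-queue throughput counter
    return elapsed


# Hypothetical example: three queues, the third egress queue is already full.
ingress = [deque([{"Size": 64}, {"Size": 128}]), deque(), deque([{"Size": 256}])]
egress = [deque(), deque(), deque([{"Size": 1}] * 4)]
in_thru = [0, 0, 0]
elapsed = scan_once(ingress, egress, threshold=4, scan_rate=1.0e9, in_thru=in_thru)
print(in_thru, elapsed)
```

In this example only the first queue forwards a token: the second is empty and the third is held back by the egress threshold.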
Software Profiling of the Scheduler Code
Address  Number  Mean_Time        Script_RegEx_Statement
0        1       116.10900000 us  Select = 1
1        1       69.97000000 us   WAIT (1.0E-08)
2        404     206.66089 ns     if (true) false, expression plus 13, else plus 1.
3        6462    258.44181 ns     if (Select <= Ingress_Size) false, expression plus 9, else plus 1.
4        6059    8.07862948 us    if (getBlockStatus(Smart_Resource_Name,"length",Select) > 0 && getBlockStatus("Egress","length",Select) < Threshold) false, expression plus 6, else plus 1.
5        1168    6.47288699 us    token = getBlockStatus(Smart_Resource_Name,"copy",Select)
6        1168    20.36501199 us   WAIT ((token.Size) / Scan_Rate)
7        1167    1.59209769 us    SEND (pop,Select)
8        1167    891.31791 ns     Index = Select - 1
9        1167    4.95694859 us    InThru(Index) = InThru(Index) + token.Size
10       6058    318.42786 ns     Select = Select + 1
11       6058    85.02542 ns      GTO (-8)
12       403     289.43921 ns     Select = 1
13       403     44.19382630 us   WAIT (1.0E-09)
14       403     295.18114 ns     GTO (-12)
15       0       0.0000000        GTO (EndThread)
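One way to read the profile is to multiply each statement's execution count by its mean time and rank the totals. The sketch below does that with the figures from the table; it assumes that `Number` is the execution count and `Mean_Time` is the mean time per execution, which the slide does not state explicitly. The `PROFILE` and `rank_by_total_time` names are illustrative.

```python
# (address, execution count, mean time in seconds) copied from the profile table
PROFILE = [
    (0, 1, 116.109e-6),
    (1, 1, 69.97e-6),
    (2, 404, 206.66089e-9),
    (3, 6462, 258.44181e-9),
    (4, 6059, 8.07862948e-6),
    (5, 1168, 6.47288699e-6),
    (6, 1168, 20.36501199e-6),
    (7, 1167, 1.59209769e-6),
    (8, 1167, 891.31791e-9),
    (9, 1167, 4.95694859e-6),
    (10, 6058, 318.42786e-9),
    (11, 6058, 85.02542e-9),
    (12, 403, 289.43921e-9),
    (13, 403, 44.19382630e-6),
    (14, 403, 295.18114e-9),
    (15, 0, 0.0),
]


def rank_by_total_time(profile):
    """Return (address, total seconds) pairs, largest total first."""
    totals = [(addr, count * mean) for addr, count, mean in profile]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_total_time(PROFILE)
for addr, total in ranking[:3]:
    print(f"address {addr}: {total * 1e3:.2f} ms total")
```

Under that reading, the flow-control condition at address 4 accounts for the largest share of time, followed by the two WAIT statements at addresses 6 and 13.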
Mapping Scheduler code to Pseudo
Instructions
Instruction Sequence corresponding to the code execution
{"FXA_b", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH",
"LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT",
"ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH",
"LTE", "LT", "ADD", "BCH", "LTE", "LT", "ADD", "BCH", "LTE", "IMM", "WAIT_s"}
Software code address line execution order
0, 1, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 6
List of Pseudo Instructions
FXA_b = Function w/ Args, boolean
FXA_r = Function w/ Args, struct (record)
FXA_a = Function w/ Args, array
FXA_m = Function w/ Args, matrix
WAIT_s = WAIT string, event
WAIT_d = WAIT double, delay
INC = ++
DEC = --
ADD = Add
SUB = Subtract
MUL = Multiply
MOD = Modulo
POW = Power
SHIFT = >> or <<
GT = Greater than
LT = Less than
LTE = Less than or equal
BCH = Branch
SEND = Send to Label, Block or Port
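A pseudo-instruction sequence like the one above can be summarized as an instruction mix and, given per-instruction costs, turned into a rough cycle estimate. The sketch below uses a shortened excerpt of the sequence, and the cycle costs in `CYCLES` are hypothetical placeholders, not values from VisualSim or from any processor.

```python
from collections import Counter

# Shortened excerpt: the full listing repeats the LTE/LT/ADD/BCH group many
# more times before the closing LTE, IMM, WAIT_s.
SEQUENCE = ["FXA_b"] + ["LTE", "LT", "ADD", "BCH"] * 3 + ["LTE", "IMM", "WAIT_s"]

# Hypothetical cycle cost per pseudo instruction, for illustration only.
CYCLES = {"FXA_b": 10, "LTE": 1, "LT": 1, "ADD": 1, "BCH": 2, "IMM": 1, "WAIT_s": 0}


def instruction_mix(sequence):
    """Count how often each pseudo instruction occurs."""
    return Counter(sequence)


def estimated_cycles(sequence, cycles):
    """Sum the assumed cycle cost of every instruction in the sequence."""
    return sum(cycles[op] for op in sequence)


mix = instruction_mix(SEQUENCE)
total = estimated_cycles(SEQUENCE, CYCLES)
print(mix.most_common(), total)
```

Swapping in a different cost table is how the same instruction mix could be compared across candidate processors.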
Mapping of Two Applications
to a Single-Board Computer
Applications are a set of Complex tasks
• Variable rate input stream
• Tasks and transfer between tasks
Contention for resources by tasks
• Resources are the hardware blocks
• Assign tasks to Resources
• Transfer flows across Buses and bridges
Trade-off between process and transfer
• Efficient- More processing and less transfer
• Minimize power consumption
[Figure: task1, task2, task3 and task4 mapped onto the I/O, DSP, CPU1 and CPU2 resources]
Scheduling software tasks using limited resources
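The contention described on this slide can be illustrated with a very small mapping heuristic: each task goes to whichever resource would finish it earliest. This is a sketch under stated assumptions, not the mapping algorithm used in the demo; the `map_tasks` name, the work units and the resource speeds are hypothetical, and bus transfer time is ignored.

```python
def map_tasks(tasks, resources):
    """Greedy mapping: assign each task to the resource that finishes it earliest.

    tasks:     list of (name, work) in arrival order, work in arbitrary units
    resources: dict of resource name -> speed in work units per second
    Returns (assignment, busy_until), where busy_until is each resource's
    finish time after all tasks have been placed.
    """
    busy_until = {name: 0.0 for name in resources}
    assignment = {}
    for task, work in tasks:
        best = min(resources, key=lambda r: busy_until[r] + work / resources[r])
        busy_until[best] += work / resources[best]
        assignment[task] = best
    return assignment, busy_until


# Hypothetical workload and speeds: the DSP is twice as fast as either CPU.
tasks = [("task1", 4.0), ("task2", 3.0), ("task3", 2.0), ("task4", 2.0)]
resources = {"DSP": 2.0, "CPU1": 1.0, "CPU2": 1.0}
assignment, busy_until = map_tasks(tasks, resources)
print(assignment, max(busy_until.values()))
```

Even in this toy case the fastest resource is not always the best choice once it is busy, which is the trade-off a full model explores with transfers and power included.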
VisualSim Block Diagram
[Figure: VisualSim block diagram of the Single Board Computer Architecture, with callouts for the Library Folder, Parameters, Reports & Statistics, Application 1, Application 2, Workload Mapping and Power Data]
Run Simulations using two Parameter
Variations of the Bus Speed
System with faster Bus is slower in places
Unpredictable System Response
VisualSim Libraries and
Architecture Challenges
DEMO MODELS
Systems-Level Block Library
Largest library of traffic, resources, hardware, software and analysis
Traffic
• Distribution
• Sequence
• Trace file
• Instruction profile
Reports
• Timing and Buffer
• Throughput/Util
• Ave/peak power
• Statistics
Power
• State power table
• Power management
• Energy harvesters
• Battery
• RegEx operators
SoC Buses
• AMBA and CoreLink
• AHB, APB, AXI, ACE, CHI, CMN600
• Network-on-Chip
• TileLink
System Bus
• PCI/PCI-X/PCIe
• Rapid IO
• AFDX
• OpenVPX
• VME
• SPI 3.0
• 1553B
Processors
• GPU, DSP, µP and µC
• RISC-V
• Nvidia- Drive-PX
• PowerPC
• X86- Intel and AMD
• DSP- TI and ADI
• MIPS, Tensilica, SH
ARM
• M-, R-, 7TDMI
• A8, A53, A55, A72,
A76, A77
Custom Creator
• Script language
• 600 RegEx fn
• Task graph
• Tracer
• C/C++/Java
• Python
Support
• Listener and Trace
• Debuggers
• Assertions
Stochastic
• FIFO/LIFO Queue
• Time Queue
• Quantity Queue
• System Resource
• Schedulers
• Cyber Security
RTOS
• Template
• ARINC 653
• AUTOSAR
Memory
• Memory Controller
• DDR DRAM 2, 3, 4, 5
• LPDDR 2, 3, 4
• HBM, HMC
• SDR, QDR, RDRAM
Storage
• Flash & NVMe
• Storage Array
• Disk and SATA
• Fibre Channel
• FireWire
Networking
• Ethernet & GigE
• Audio-Video Bridging
• 802.11 and Bluetooth
• 5G
• Spacewire
• CAN-FD
• TTEthernet
• FlexRay
• TSN & IEEE802.1Q
FPGA
• Xilinx- Zynq, Virtex, Kintex
• Intel- Stratix, Arria
• Microsemi- Smartfusion
• Programmable logic template
• Interface traffic generator
Software
• GEM5
• Software code integration
• Instruction trace
• Statistical software model
• Task graph
Interfaces
• Virtual Channel
• DMA
• Crossbar
• Serial Switch
• Bridge
RTL-like
• Clock, Wire-Delay
• Registers, Latches
• Flip-flop
• ALU and FSM
• Mux, DeMux
• Lookup table
Application Template Library
VisualSim Modeling Library provides coverage over all applications using electronics
Electronic System Challenges
Systems Engineering
◦ Top-level view of the entire system without worrying about the exact implementation details
◦ Capture the data flow, application task sequence and mapping to System resources
◦ Generate statistics for response time, throughput and power consumed
Hardware-Software selection
◦ Select the appropriate hardware blocks including processor, memory and bus/network
◦ Determine the number of independent boards and chassis for symmetrical processing
◦ Experiment with different mapping strategies and select accelerators
◦ Reuse the systems engineering data flow and application task sequence
System level
◦ Develop the specification for integration and test cases
Mathematical function
allocation and partitioning
DEMO MODELS
Modeling Complex AI/ML processing
in an Image-based Application
Check Correctness of AI/ML Math
Network Planning
DEMO MODELS
Host to Data center using Ethernet AVB
6/4/2020
Analysis
Latency from gateway to gateway, client to server, master to slave or node to node
Effects of communication stack activity
Scheduling of different traffic classes for policing and shaping
Trade-off switch vs gateway
Effect of global vs. local multicast
Impact of clock jitters
Interface Modeling: Network on Chip
Block Diagram
VisualSim Model
VisualSim Software Architecture
and Mapping to Hardware
DEMO MODELS
Application Task Graph
(Implementation can be in HW or SW)
VisualSim Model of the Task Graph
Block Diagram of a Software System
Radar
Analyze system behavior with deterministic and non-deterministic workloads
Behavior Model of Radar Software
Mapping Radar Software Tasks to two
Hardware Architectures
x86-based ECU
DSP-based ECU
Comparing Mapping on x86 vs DSP
Key parameters are the latency, processing efficiency and the throughput
Failure Impact on RTOS and Scheduling
Without Faults
With Faults
Rapidly increasing time between Ready-to-Run and Run
VisualSim Semiconductor
DEMO MODELS
TPU v1
Tensor v3 in the Cloud
Designing for an SoC Block Diagram
Target
Power < 1.0W
Number of frames in 20 ms > 13K
Three Explorations
1. All tasks deployed in Software
2. Migrate a few tasks to Hardware accelerators
3. Add power management to reduce power
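The two targets translate directly into a per-frame time budget and a pass/fail check for each exploration. The sketch below shows that arithmetic; the constants come from the slide, while the function names and the result values in `explorations` are hypothetical placeholders, not simulation results.

```python
FRAME_WINDOW_S = 20e-3   # 20 ms window from the target
MIN_FRAMES = 13_000      # target: more than 13K frames per window
MAX_POWER_W = 1.0        # target: power below 1.0 W


def per_frame_budget(window_s=FRAME_WINDOW_S, frames=MIN_FRAMES):
    """Average time available per frame if the throughput target is just met."""
    return window_s / frames


def meets_targets(frames_in_window, average_power_w):
    """True only when both the throughput and the power target are satisfied."""
    return frames_in_window > MIN_FRAMES and average_power_w < MAX_POWER_W


# Hypothetical (frames in 20 ms, average power in W) per exploration.
explorations = {
    "all software": (9_000, 0.8),
    "with accelerators": (14_500, 1.2),
    "accelerators + power management": (14_000, 0.9),
}
verdicts = {name: meets_targets(*result) for name, result in explorations.items()}
print(f"budget per frame: {per_frame_budget() * 1e6:.2f} us", verdicts)
```

The budget works out to roughly 1.5 microseconds per frame, which is why moving tasks out of software is one of the explorations.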
[Figure: SoC block diagram; labels include ARM Cortex A77, ARM AMBA AXI, Corelink CMN600, AMBA and Controller]
VisualSim can handle any Processor architecture
Translate SoC Block Diagram into
VisualSim Model
[Figure: VisualSim model with callouts for Processor, Bus Topology, Memory Controller, Hardware Accelerators, Power management and Use Cases]
SoC design methodology provides lots of flexibility in level of detail and type of analysis
Comparing Power and Performance
across multiple Parameter Values
[Figure: power and performance plots with panels labeled SW and HW]
Post processor and Batch-mode simulation allow for easy comparison across simulations
Power and Thermal Analysis in
an Application
DEMO MODELS
VisualSim Model of a Braking System
Brake ECU
Power, Heat, Functional and Timing
ABOUT MIRABILIS DESIGN
VisualSim Aerospace
Simulator of the Year
Hardware Modeling
40th Customer
2003
Company
Incorporated
2005
Modeling Services
1st Customer
2008
Stochastic Modeling
Innovation Award
2010
Integration API
10th customer
2011
Network modeling
University program
2013
2015
2018
50th Customer
Best ESL at DAC
2nd at Arm TechCon
2019
VisualSim Automotive
250 products built
Started Europe operations
2020
VisualSim Functional
Analysis ISO/DO/IEC
Started Asia Operations
Continuous Innovation, Awards and World-Wide Presence
Company Milestone
VisualSim software with libraries
Training:
Training and modeling support- user builds
the components and models
Services:
Develop custom library- User assembles
the models
Develop custom libraries and models -
User conducts parameter study
Architecture evaluation- Will develop
model, analyse and provide feedback
Model-based Systems Engineering simplified and made easy-to-adopt
Mirabilis Design Software and Solutions
Engineering Benefits
Average increase in revenue per project = $??M
Using Alternate Design Methodology
Project Schedule
Model Creation (6)
Implementation (18)
Analysis (1.5)
Communication and Refinement (6)
Implementation (15)
Using VisualSim Model-Based Design Methodology
Note: All times in months
Communication and Refinement (4)
Analysis (2.5)
Model Creation (1)
Average gain for a 24-month project is 25%-30%
Ensuring Highest
Quality Product
Accelerate Model
development
