Multicore scheduling in automotive ECUs
 Aurélien Monot - PSA Peugeot Citroën, LORIA
 Nicolas Navet - INRIA, RealTime-at-Work
 Françoise Simonot - LORIA
 Bernard Bavoux - PSA Peugeot Citroën



  Talk at ERTS2 2010
  Toulouse, May 21st 2010




                         Outlook

Context:
New tools and methodologies are needed as multicore ECUs
are being introduced in the automotive EE architecture.


Problem:
How to address the scheduling of numerous runnables on a
multicore ECU in the context of the automotive domain?


Method:
Deployment of load balancing algorithms in ECU configuration
tools.



   The case of a generic car manufacturer



         Typical number of ECUs in a car in 2000 : 20
       Typical number of ECUs in a car in 2010 : over 40


The number of ECUs has more than doubled in 10 years

                       Other examples
            Between 60 and 80 ECUs in the Audi A8
                Over 100 ECUs in some Lexus models



Moving towards multicore architecture

  Decreasing the complexity of in-vehicle architecture:
   reduces EE design and verification efforts
   decreases number of network interfaces
   decreases traffic on CAN network
   reduces costs



         [Figure: tasks distributed across ECU 1 and ECU 2]



Moving towards multicore architecture

  Other use cases for the automotive domain
        Dealing with resource demanding applications
         ‣   engine control, image processing...
        Improving the safety
         ‣   segregation of multi-source software, ISO 26262...
        Dedicated use of core
         ‣   monitoring, event-triggered tasks



  General benefits of multi-core
       reduced power consumption
       reduced heat
       reduced EMC (electromagnetic compatibility) issues



        AutoSAR requirements

•   Static partitioning
•   Static cyclic scheduling using schedule tables
•   All BSW (basic software) modules are allocated to the same core



                                  Problem
Goal: schedule numerous runnables on a multicore ECU

Two sub-problems
     Partitioning
      ‣ 600 runnables on 2 cores : 2^600 possible allocations
     Build schedule table
      ‣ 300 runnables in 200 slots (expiry points) : 200^300 schedules
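
To give a feel for why exhaustive search is ruled out, the two counts can be evaluated directly. This is a minimal sketch: the helper names are hypothetical, and the counts assume every runnable can go on any core or in any slot, with no constraints.

```python
def allocation_count(runnables: int, cores: int) -> int:
    """Ways to assign each runnable to one core, ignoring constraints."""
    return cores ** runnables


def schedule_count(runnables: int, slots: int) -> int:
    """Ways to place each runnable in one slot, ignoring constraints."""
    return slots ** runnables


def order_of_magnitude(n: int) -> int:
    """Exponent k such that 10**k <= n < 10**(k + 1), for n >= 1."""
    return len(str(n)) - 1


allocations = allocation_count(600, 2)
schedules = schedule_count(300, 200)
print(f"allocations ~ 10^{order_of_magnitude(allocations)}")  # 10^180
print(f"schedules   ~ 10^{order_of_magnitude(schedules)}")    # 10^690
```

Both search spaces are far too large to enumerate, which is the motivation for heuristics.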

Sub-objectives and criteria
     Avoid load peaks
       ‣ Max
     Balance load over time
       ‣ Standard Deviation
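
The two criteria can be computed from the per-slot load of a schedule table. This is a minimal sketch: the slides do not say whether the population or the sample standard deviation is meant, so the population form is assumed here, and the slot loads are made-up values.

```python
from statistics import pstdev


def load_metrics(slot_loads):
    """Return (peak load, population standard deviation) over all slots."""
    return max(slot_loads), pstdev(slot_loads)


# Hypothetical per-slot loads (summed WCET of the runnables in each slot).
example_loads = [2, 4, 2, 3]
peak, spread = load_metrics(example_loads)
print(f"max = {peak}, std dev = {spread:.3f}")
```

A lower peak means more slack in the worst slot; a lower standard deviation means the load is spread more evenly over the cycle.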




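                                          Model

    Runnables are characterized by:
       •Period (P)
       •WCET (C)
       •Initial Offset (O)
       •Core allocation constraint
       •Colocation constraint

    Example runnable set:
       R1: P1=10, C1=2, O1=0
       R2: P2=10, C2=1, O2=5
       R3: P3=20, C3=3, O3=5
       R4: P4=20, C4=2, O4=15

Sequencer task

[Figure: timeline from 0 to 40 with a tick every 5, annotated with Ttic, slots and Tcycle.
 R1 runs at 0, 10, 20 and 30; R2 and R3 at 5 and 25; R2 and R4 at 15 and 35.]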
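
The sequencer timeline can be rebuilt from the runnable parameters. This is a minimal sketch under stated assumptions: `Runnable` and `build_schedule_table` are hypothetical names, Tcycle is taken as the least common multiple of the periods, and offsets and periods are required to be multiples of Ttic. It needs Python 3.9 or later for `math.lcm`.

```python
from dataclasses import dataclass
from math import lcm


@dataclass(frozen=True)
class Runnable:
    name: str
    period: int
    wcet: int
    offset: int


def build_schedule_table(runnables, t_tic):
    """Map each slot start time in one cycle to the runnables released there."""
    # Assumption: the cycle length is the LCM of all periods.
    t_cycle = lcm(*(r.period for r in runnables))
    for r in runnables:
        if r.period % t_tic or r.offset % t_tic or r.offset >= r.period:
            raise ValueError(f"{r.name}: period/offset not aligned with Ttic")
    table = {start: [] for start in range(0, t_cycle, t_tic)}
    for r in runnables:
        for release in range(r.offset, t_cycle, r.period):
            table[release].append(r.name)
    return t_cycle, table


# The four runnables of the model slide, with Ttic = 5.
runnable_set = [
    Runnable("R1", period=10, wcet=2, offset=0),
    Runnable("R2", period=10, wcet=1, offset=5),
    Runnable("R3", period=20, wcet=3, offset=5),
    Runnable("R4", period=20, wcet=2, offset=15),
]
t_cycle, table = build_schedule_table(runnable_set, t_tic=5)
for start, names in table.items():
    print(start, names)
```

With these inputs the table covers one cycle of length 20, and the pattern then repeats, which matches the timeline drawn from 0 to 40.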


                         Solution


Partitioning is dealt with as a bin packing problem
        Worst fit decreasing algorithm with fixed number of bins
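
A minimal sketch of worst-fit decreasing with a fixed number of bins follows. It is not the authors' implementation: the item size is assumed to be the runnable's utilization (WCET divided by period), the function name and example values are hypothetical, and core allocation and colocation constraints are ignored.

```python
def worst_fit_decreasing(utilizations, n_cores):
    """Assign each runnable to the currently least-loaded core.

    utilizations: dict mapping runnable name to its utilization.
    Returns (allocation dict name -> core index, list of core loads).
    """
    loads = [0.0] * n_cores
    allocation = {}
    # "Decreasing": handle the largest items first.
    ordered = sorted(utilizations.items(), key=lambda kv: kv[1], reverse=True)
    for name, utilization in ordered:
        # "Worst fit": the bin with the most remaining room, i.e. least loaded.
        core = min(range(n_cores), key=loads.__getitem__)
        allocation[name] = core
        loads[core] += utilization
    return allocation, loads


# Hypothetical utilizations on a dual-core ECU.
allocation, core_loads = worst_fit_decreasing(
    {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}, n_cores=2
)
print(allocation, core_loads)
```

With the number of bins fixed, worst fit tends to even out the load across cores rather than minimize the number of cores used, which suits the load balancing objective.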


Load balancing is done with the Least Loaded algorithm (LL),
  inspired by work in the CAN domain [Grenier and Navet ERTSS2008]
        Extended to handle non harmonic runnable sets (G-LL)
        Improved so as to further reduce load peaks (G-LLσ)
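
The idea behind a least-loaded heuristic can be sketched as a greedy offset assignment. This is a simplified illustration for the harmonic case only, not the exact LL, G-LL or G-LLσ algorithms, which the slides do not detail. Assumptions: runnables are processed by increasing period, every period is a multiple of Ttic and divides Tcycle, and ties go to the earliest offset. The function name is hypothetical.

```python
def least_loaded_offsets(runnables, t_tic, t_cycle):
    """Greedily pick, for each runnable, the offset with the lowest peak load.

    runnables: list of (name, period, wcet) tuples.
    Returns (dict name -> offset, list of per-slot loads).
    """
    n_slots = t_cycle // t_tic
    loads = [0] * n_slots
    offsets = {}
    for name, period, wcet in sorted(runnables, key=lambda r: r[1]):
        step = period // t_tic  # slots between two releases

        def peak(first_slot):
            # Highest current load among the slots this offset would use.
            return max(loads[s] for s in range(first_slot, n_slots, step))

        best = min(range(step), key=peak)
        for slot in range(best, n_slots, step):
            loads[slot] += wcet
        offsets[name] = best * t_tic
    return offsets, loads


offsets, slot_loads = least_loaded_offsets(
    [("R1", 10, 2), ("R2", 10, 1), ("R3", 20, 3), ("R4", 20, 2)],
    t_tic=5,
    t_cycle=20,
)
print(offsets)     # {'R1': 0, 'R2': 5, 'R3': 5, 'R4': 15}
print(slot_loads)  # [2, 4, 2, 3]
```

On this small input the sketch yields the same offsets as the example runnable set on the model slide (0, 5, 5 and 15).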

Implemented in a tool
      Freely available soon at http://www.realtimeatwork.com


Experiments with RTaW-ECU


                 Harmonic task sets

                                             LL
                                             Max: 4.79
                                             Min: 4.52
                                             StdDvt: 0.038

                                             G-LLσ
                                             Max: 4.75
                                             Min: 4.65
                                             StdDvt: 0.018




Generated load: 94%, Ttic=5ms, Tcycle = 1s




                      Non harmonic task sets

Schedulability bound in the harmonic case: 1 − Cmax/Ttic

Max WCET (μs)                                150      300      900
Schedulability bound in the harmonic case    97%      94%      82%
Success % of LL                              96%      96%      92%
Success % of G-LL                            100%     100%     100%

                                             Max WCET = 300μs    Max WCET = 900μs
Schedulability bound in the harmonic case    94%                 82%
Generated CPU load                           95%      97%        95%      97%
Success % of LL                              64%      18%        12%      1%
Success % of G-LL                            94%      94%        30%      5%
Success % of G-LL1σ                          100%     100%       97%      76%

      Statistics collected over 1000 generated runnable sets
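
The bound values in the tables can be reproduced from the formula 1 − Cmax/Ttic. This small check assumes Ttic = 5 ms, the value given on the harmonic task sets slide; that assumption is consistent with the figures reported here.

```python
def harmonic_bound(c_max_us: float, t_tic_us: float) -> float:
    """Schedulability bound in the harmonic case: 1 - Cmax / Ttic."""
    return 1 - c_max_us / t_tic_us


T_TIC_US = 5000  # assumed Ttic of 5 ms, expressed in microseconds

for c_max in (150, 300, 900):
    print(f"Cmax = {c_max} us -> bound = {harmonic_bound(c_max, T_TIC_US):.0%}")
# Cmax = 150 us -> bound = 97%
# Cmax = 300 us -> bound = 94%
# Cmax = 900 us -> bound = 82%
```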

 Multiple synchronized sequencer tasks per core




Incremental scheduling of three synchronized sequencer tasks with respective
   loads of 45%, 35% and 15%, adding up to 95% of the core capacity.
                        Tcycle=1000ms and Ttic=5ms


   Multiple non synchronized sequencer
              tasks per core

Case arises for sequencer tasks using different tic counters
      Engine control applications (standard time vs RPM)



Any offset between the sequencer tasks and any clock rate are
  possible at runtime
      each sequencer task needs to be balanced
      independently

Verification is possible considering maximum clock rates
       Multi-frame scheduling results can be used



                  Conclusion
Adoption of multicore ECU raises new challenges
     Evolution of software architecture design
     Scheduling of software components

We propose runnable scheduling heuristics for ECUs
     Fast and performant
     Easily adaptable for more advanced applications
     Compatible with AutoSAR R4.0 and its multicore
     extensions

Future work
      Precedence constraints
      Lockstep synchronization
      Distributed timing chains




Thank you for your attention
