Simplifying debugging for multi-core Linux devices and low-power Linux clusters
Embedded World Exhibition & Conference
February 24, 2015
Introduction
Embedded Linux development
Why?
– Reuse
– Community
– Memory constraints
– C and C++
– Device compatibility
– Cost
Where?
– Routers
– Media streaming
– POS
– Hardware control
– Sensor display
Free Electrons.com
© 2015 ROGUE WAVE SOFTWARE, INC. ALL RIGHTS RESERVED 3
Multi-core
• 2- or 4-core devices are now much more common
– Multi-core
– Many-core
• You have a choice
– Leave the core idle
– Run additional processes
– Write multithreaded code to utilize the additional cores
• Graphics Processing Unit (GPU) accelerators on the device?
How to use the additional cores?
Multi-thread
• Concurrency: execution proceeds asynchronously along two or more sequences
– Parallelism: concurrency with parallel execution
• Interdependencies
– Explicit is generally better than implicit
• Synchronization
– Race Conditions
– Deadlocks
– Livelocks
Taking advantage of parallelism in your
device
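A minimal sketch of the synchronization point above, written in Python for brevity (the deck's own targets are C and C++). Several threads increment a shared counter, and a lock makes each read-modify-write step atomic. The name `count_with_lock` and its parameters are illustrative and do not come from the slides.

```python
import threading


def count_with_lock(num_threads: int, increments: int) -> int:
    """Increment one shared counter from several threads, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal total
        for _ in range(increments):
            # The lock makes the read-modify-write of `total` one atomic step.
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(count_with_lock(num_threads=4, increments=1000))
```

Removing the lock turns the increment into an unsynchronized read-modify-write, which is the classic shape of a race condition.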
Multi-device
• Computationally challenging problem
– Algorithm is parallelizable
• Requirements
– Power
– Space
– Cooling
• Fault tolerance
• Off the shelf parallel runtime vs custom
Embedded clusters
High performance computing
• Typically Linux on x86
– Sometimes with GP-GPU or Intel Xeon Phi accelerators
• Programmed as sets of multi-core nodes
– Data is distributed with communication and synchronization as
needed
– Communication typically takes the form of message passing
• Entire system is optimized for app performance
– Low latency interconnect
– Parallel filesystem
• Access is via submitting batch jobs to a resource management system
Supercomputers and clusters with 100s – 1000s of nodes
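The programming model described on this slide (distributed data, communication by message passing) can be sketched in a few lines. This is only a stand-in: threads and a `queue.Queue` play the role of cluster nodes and the interconnect, whereas a real HPC code would typically use a message-passing library such as MPI. The names `rank_worker` and `distributed_sum` are hypothetical.

```python
import queue
import threading


def rank_worker(rank: int, chunk: list, outbox: queue.Queue) -> None:
    """Compute a partial result on local data and send it as a message."""
    outbox.put((rank, sum(chunk)))


def distributed_sum(data: list, num_ranks: int) -> int:
    """Split data across ranks, then gather the partial sums at the root."""
    outbox = queue.Queue()
    # Cyclic data decomposition: rank r owns elements r, r + num_ranks, ...
    chunks = [data[rank::num_ranks] for rank in range(num_ranks)]
    workers = [
        threading.Thread(target=rank_worker, args=(rank, chunks[rank], outbox))
        for rank in range(num_ranks)
    ]
    for worker in workers:
        worker.start()
    # The root receives exactly one message per rank, in arrival order.
    partials = dict(outbox.get() for _ in range(num_ranks))
    for worker in workers:
        worker.join()
    return sum(partials.values())


if __name__ == "__main__":
    print(distributed_sum(list(range(100)), num_ranks=4))
```

The workers share no mutable state; the only coupling is the message each one sends, which is what keeps the communication points easy to locate when debugging.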
Rogue Wave Software
Rogue Wave helps organizations simplify
complex software development, improve
code quality, and shorten cycle times
What we do
Capabilities
[Figure: Rogue Wave products mapped to capability areas. The area labels did not survive extraction. Products shown: Klocwork, OpenLogic, TotalView, IMSL, SourcePro, Visualization, Stingray, PV-WAVE, HydraExpress]
Used by 3,000 customers in over 57 countries across diverse
industries to develop mission-critical applications and software
Financial Services, Telecom, Gov’t / Defense, Technology, Other Verticals
Global, diversified customer base
Embedded use cases
Retail point of sale
• Highly connected
– Operations
– Ad and promotional services
– Sensors (scale, scanner)
• Modern C++
• Many threads
– 1 or more threads for each task
– Responsiveness requirements for the threads reading the sensor
data.
Industrial device controller
• Expensive equipment
• Used in production testing
• Controller software
– x86 Linux
– C++
– Multi-threaded
• Customized at each site
– Customization takes the form of C code that runs in a pre-compiled framework
Sonar console
• Runs on 64-bit Linux and ARM Linux
– 2G flash memory
• Monolithic C++ with millions of lines of code
– Qt interface (touch displays)
– 100s of threads
• Rich visual data
– Video streaming
– One or more sensors
Signal processing compute cluster
• Computationally demanding
• Sophisticated algorithms
– Translated from 4th generation languages/environments
• Need an answer quickly
• Using industry standard technologies
– C++
– MPI
– x86 processors for development
– Power processors for deployment
• Memory & Power constraints
Techniques and best practices
Debugging distributed applications
• Print debugging doesn’t scale
• You can debug 1 of N processes
– Do all processes exhibit the error?
– Needle in the haystack problem
– Passing the bad apple problem
• You can run N debuggers on N processes
– Frustrating with N=2, impractical above N=4
• You can use a parallel debugger
– One debugger controlling all N processes
Techniques for debugging distributed apps
(1/3)
Debugging distributed applications
• A parallel debugger will
– Let you focus on any process that fails and see its backtrace
– Allow you to synchronize your processes (if the code includes common execution pathways)
– Allow you to focus on any process
– Allow you to compare processes
– Give you ways to find outliers
– Give you ways to group processes and work with those groups
Techniques for debugging distributed apps
(2/3)
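Comparing processes and finding outliers boils down to grouping ranks by what they report. The sketch below assumes you already have one value per process (for example a loop counter read from each rank); `group_by_value` and `find_outliers` are hypothetical helper names, not the API of any particular debugger.

```python
from collections import defaultdict


def group_by_value(values_by_rank: dict) -> dict:
    """Map each distinct value to the sorted list of ranks that report it."""
    groups = defaultdict(list)
    for rank in sorted(values_by_rank):
        groups[values_by_rank[rank]].append(rank)
    return dict(groups)


def find_outliers(values_by_rank: dict) -> list:
    """Return the ranks whose value differs from the largest group."""
    groups = group_by_value(values_by_rank)
    majority = max(groups.values(), key=len)
    return sorted(
        rank
        for ranks in groups.values()
        if ranks is not majority
        for rank in ranks
    )


if __name__ == "__main__":
    # Hypothetical per-rank values of the same variable.
    iteration = {0: 10, 1: 10, 2: 10, 3: 7, 4: 10}
    print(group_by_value(iteration))
    print(find_outliers(iteration))
```

With hundreds of processes, a grouped view like this is usually easier to scan than one line per rank.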
Debugging distributed applications
• Re-run at different scales
– Debug at lowest scale that exhibits defect
– What is different at that scale?
• Compare program flow in working and non-working cases
• Follow bad data back from the symptom to the cause
• Look closely at communication points and data decomposition
• Racy bugs
– Try out different relative orders of execution
– Add synchronization
• Deadlocks & livelocks
– Examine sync points to make sure all assumptions are valid
– Examine flow control around sync points
• Take careful notes; there can be a lot of subtle factors
Techniques for debugging distributed apps
(3/3)
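As a small illustration of "debug at the lowest scale that exhibits the defect", the sketch below reruns a job at increasing process counts and stops at the first failure. The deliberately buggy `decomposed_sum` is a made-up example of a data decomposition error (it drops the remainder when the data does not divide evenly); both function names are assumptions for illustration.

```python
def decomposed_sum(data: list, num_procs: int) -> int:
    """Sum data in per-process chunks. Bug: the remainder is silently dropped."""
    chunk = len(data) // num_procs
    return sum(
        sum(data[rank * chunk:(rank + 1) * chunk]) for rank in range(num_procs)
    )


def smallest_failing_scale(passes_at_scale, scales):
    """Return the smallest scale at which the check fails, or None."""
    for scale in sorted(scales):
        if not passes_at_scale(scale):
            return scale
    return None


if __name__ == "__main__":
    data = list(range(12))
    expected = sum(data)
    failing = smallest_failing_scale(
        lambda n: decomposed_sum(data, n) == expected,
        scales=[1, 2, 3, 4, 5, 6, 8],
    )
    print(failing)
```

Here the first failure appears at 5 processes, the first scale that does not divide the 12 elements evenly, which answers the question "what is different at that scale?".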
Debugging multi-core applications
• Multithreaded applications and shared memory programming
– Data can be shared (higher memory efficiency)
• Shared memory programming
– Complexity: Only some memory is shared
• Multi-threaded programming
– All threads share the same heap and globals
– Separate stacks (but mutually readable)
• Concurrency is the same
– Many of the same challenges and many of the same techniques
• Communication (accidental and intentional) is not as localized
• Memory management (new/delete, malloc/free) is shared
Observations about multi-threaded
debugging
Debugging multi-core applications
• Print debugging can work for some bugs but can be very confusing for others
– Changes timing
• Look carefully at the thread capabilities of your debugger
• A good multi-thread debugger will give you
– An asynchronous interface
• Doesn’t assume a simple running/stopped state
– Easy access to all threads
– Complete control over threads
– Display of thread states
– Thread-aware breakpoints
– Ways to synchronize threads
– Ways to hold threads
– Thread groups
– Display of thread-private data
– Display of data across threads
Techniques for multi-threaded debugging
(1/2)
Debugging multi-core applications
• Try to reproduce problems without threads
• Vary the number of threads
• Try different interleaving patterns
• Look at thread synchronization points (mutexes, semaphores, barriers)
• Use watchpoints (aka data breakpoints)
• Make sure resources are cleaned up before thread termination
• Use record and deterministic replay to capture the exact thread execution pattern
Techniques for multi-threaded debugging
(2/2)
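To make "try different interleaving patterns" concrete, the sketch below models two threads that each perform an unsynchronized read followed by a write (an increment split into two steps) and replays every possible ordering of those steps deterministically. It is a toy model, not a debugger feature; `interleavings`, `run_schedule`, and the step tuples are hypothetical.

```python
from itertools import combinations

THREAD_A = [("A", "read"), ("A", "write")]
THREAD_B = [("B", "read"), ("B", "write")]


def interleavings(a_steps: list, b_steps: list):
    """Yield every merge of the two step lists that keeps each list's order."""
    total = len(a_steps) + len(b_steps)
    for a_slots in combinations(range(total), len(a_steps)):
        slots = set(a_slots)
        a_iter, b_iter = iter(a_steps), iter(b_steps)
        yield [next(a_iter) if i in slots else next(b_iter) for i in range(total)]


def run_schedule(schedule: list) -> int:
    """Execute one interleaving of 'shared += 1' split into read and write."""
    shared = 0
    local = {}  # each thread's private copy of the value it read
    for thread, op in schedule:
        if op == "read":
            local[thread] = shared
        else:
            shared = local[thread] + 1
    return shared


if __name__ == "__main__":
    for schedule in interleavings(THREAD_A, THREAD_B):
        result = run_schedule(schedule)
        status = "ok" if result == 2 else "lost update"
        print(schedule, result, status)
```

Of the six possible orderings, only the two that run one thread to completion before the other produce the expected value of 2; the remaining four lose an update.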
Log file debugging
• Recompile with print statements for a log file
• Compile in and toggle on/off with a runtime flag
• Trace with an external tool
– System call tracing
– Debugger assisted tracing: refocus experiments without a recompile
• Tension & Trade-off
– Capture enough context to understand what is happening
– Manage the large volume of output that may be required
• Tips & Techniques
– Binary search to find the site of the error
– Consider file system / file size
– Flush the output stream; otherwise writes are buffered and may appear late or be lost in a crash
– The presence of a call sometimes changes the behavior (compiler bugs, optimization, race conditions)
– Print debugging can be hairy with multi-thread or multi-process
• Externally driven tracing tools may be preferable to ensure logging happens
Narrow down the site and capture the context of the
bug
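A minimal sketch of two of the tips on this slide: logging that is compiled in but toggled by a runtime flag, and flushing after every line so that output is not stuck in a buffer when the program dies. The environment variable `APP_TRACE` and the helper names are assumptions made for this example.

```python
import os
import sys
import threading
import time


def tracing_enabled() -> bool:
    """Runtime flag: tracing is on only when APP_TRACE=1 (hypothetical name)."""
    return os.environ.get("APP_TRACE") == "1"


def make_tracer(stream, enabled: bool):
    """Return a trace(message) function that writes one flushed line per call."""
    lock = threading.Lock()

    def trace(message: str) -> None:
        if not enabled:
            return
        thread_name = threading.current_thread().name
        line = f"{time.monotonic():.6f} [{thread_name}] {message}\n"
        # The lock keeps lines from different threads from interleaving.
        with lock:
            stream.write(line)
            stream.flush()  # push the line out now rather than at exit

    return trace


if __name__ == "__main__":
    trace = make_tracer(sys.stderr, enabled=tracing_enabled())
    trace("sensor thread started")
```

Tagging each line with a timestamp and thread name captures some of the context the slide asks for, at the cost of more output and a change in timing.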
Dynamic memory analysis
• Dynamic memory tools help catch hard-to-identify bugs
– Memory leaks can lurk in a code base
– Bounds violations can corrupt data
• Can be an open door for malicious agents
– Dangling pointers lead to racy, hard-to-reproduce symptoms
• Dynamic memory tools can also be used to inspect what is happening in the heap memory
– Normally quite hard to visualize and understand
– Critical for optimizing for low memory environments
• Tips & techniques
– Maintain a policy of eliminating 100% of leaks
– Use with a testing system to make sure you exercise different kinds of input and
different code paths
– Compare heap behavior over time to make sure OS and library changes don’t
introduce problems
Pinpoint leaks and analyze memory use
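The idea of comparing heap behavior over time can be sketched with Python's standard `tracemalloc` module, used here purely as a stand-in for a C or C++ heap analysis tool. The helper `heap_growth` is a hypothetical name: it repeats an action and reports how many traced bytes remain allocated afterwards.

```python
import tracemalloc


def heap_growth(action, repetitions: int) -> int:
    """Return the growth in traced heap bytes after repeating `action`."""
    tracemalloc.start()
    try:
        action()  # warm-up, so one-time allocations are not counted
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(repetitions):
            action()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    retained = []  # simulated leak: buffers are kept and never released
    print("leaky:", heap_growth(lambda: retained.append(bytearray(10_000)), 50))
    print("clean:", heap_growth(lambda: bytearray(10_000), 50))
```

Run as part of a test suite, a check like this flags growth that scales with the number of repetitions, which is the usual signature of a leak.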
Reverse debugging
• Record and deterministically replay execution trajectory through the code
– Record non-deterministic inputs
– Replay those as needed to access any point in the execution
• If you can get a racy bug to reproduce you can examine it at leisure
– Give yourself the full benefit of hindsight
– What steps led to it happening?
– Where did the program go wrong?
• Tips & Techniques
– Use watchpoints (data breakpoints) to find the source of corrupt data
– Wait until you are close to the bug before activating the recording, to avoid paying recording overhead for the entire run
– Capture recordings and save them to a file as part of bug reports
– Review recordings of defects in unfamiliar parts of the code with subject
matter experts
Get “racy” bugs “on tape”
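The record-and-replay principle can be shown in miniature: log every non-deterministic input during one run, then feed the log back so a second run follows exactly the same trajectory. This is a conceptual sketch only, not how a reverse debugger is implemented; `RecordingSource`, `ReplaySource`, and `simulate` are hypothetical names.

```python
import random


class RecordingSource:
    """Wrap a non-deterministic input source and log every value it returns."""

    def __init__(self, source):
        self._source = source
        self.log = []

    def __call__(self):
        value = self._source()
        self.log.append(value)
        return value


class ReplaySource:
    """Return previously recorded values in their original order."""

    def __init__(self, log):
        self._values = iter(log)

    def __call__(self):
        return next(self._values)


def simulate(read_input, steps: int) -> list:
    """A deterministic program whose only non-determinism is read_input()."""
    state = 0
    trajectory = []
    for _ in range(steps):
        state = (state * 31 + read_input()) % 1009
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    recorder = RecordingSource(lambda: random.randint(0, 99))
    original = simulate(recorder, steps=20)
    replayed = simulate(ReplaySource(recorder.log), steps=20)
    print(original == replayed)
```

Because the log is plain data, it can be saved to a file and attached to a bug report, in the spirit of the tip above.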
Remote Debugging / Cross Debugging
• Remote Debugging
– Debugger core runs on your workstation (host) system
– Lightweight agent process runs on the device (target) system
• The agent process is very lightweight
• The debugger core holds all the complex analysis data structures
• Tips & Techniques
– Start with a debug target on the host machine
• Copy and strip the version that goes on the device
– You can start the server and then choose the target process
– Sources may need to be accessible on the host
– Use a tool that does the right thing with host/target library mismatch
– Be aware of security
Limit debugger resource utilization in the target system
Core file debugging
• The core file isn’t always sufficient
– It can be trashed
– It represents the consequence of the defect, but not the cause
• Examine the site of the crash
• Look for ‘suspicious’ variables
• Tips & Techniques
– Compile with debug information
– You can pair an unstripped copy of the executable with a core file generated by its stripped counterpart
– Check more than one stack frame
A core file is a good place to start
Static analysis
• Scan your code with a “sanity checker”
– Identifies patterns which may or will lead to errors
– Can check for compliance with coding standards
• Finds bugs that could lead to a crash, even if they don’t right away
• Finds certain kinds of resource leakage
• The sooner the better
– Faster feedback, easier to correct
– Ideally this should work like a spell checker
Catch defects early on
Rogue Wave solutions
• TotalView
– Asynchronous Thread Control
– Parallel Debugging
– Core file Debugging
– Reverse Debugging
– Dynamic Memory Analysis
• Klocwork
– C and C++ Static Code Analysis
• OpenLogic
– Manage Your Open Source Components
We can help!
Visit us at booth #4-139
Resource slides
