Thomas Moulard tmoulard@amazon.com
Raising the Bar on Robotics Code Quality
08/01/2019
Tooling and Methodology for Robotics Software Teams building critical ROS 2 Applications
Table of contents
• Raising the bar on open-source code quality?
• Code Instrumentation: ASAN/TSAN
• Clang Thread Safety Extensions
• Fuzzing ROS 2
What is AWS RoboMaker?
• AWS Cloud9 simplifies ROS development.
• Cloud Simulation accelerates robot validation.
• Fleet Management provides over-the-air update
capabilities to a robotic fleet.
• Cloud Extensions easily interface ROS with AWS
services such as Amazon Lex, Amazon Polly,
Amazon Kinesis Video Streams, Amazon
Rekognition, and Amazon CloudWatch.
aws.amazon.com/robomaker
Sample Applications:
• Hello world
• Navigation and person recognition
• Voice commands
• Robot monitoring
Testing Robots is hard
• Errors are critical: a single bug can break a robot.
• Software input is uncontrolled.
• Experimenting with hardware is slow.
• Software is tightly coupled to hardware.
• System behavior depends on a large number of
parameters which need to be tuned.
Finding bugs in a robotic system is time consuming and
bugs have a high impact.
…
(Any) Server
One robot serves a few users; deploying
software is hard.
One server serves lots of users;
deploying software is easier.
Raising the Bar on Open-Source Code Quality
Ensuring Code Quality for OSS is challenging:
• Shared Ownership
• Decision Making slower/harder
• Stakeholders are hard to identify
• End-to-End Testing?
Which strategy for your robotic team?
1. Fork?
2. Contribute back?
3. Both?
Are you facing difficulties running ROS 1/2 in production?
→ Talk to us!
Solution: better developer infrastructure!
1. We cannot review all PRs,
2. We cannot maintain all the packages
…but we can build tooling!
Automatic code analysis, run automatically in CI,
is crucial to code quality.
Enable the community to work together on eliminating
defects:
• Memory Issues
• Concurrency Issues
• Performance
AWS CodeBuild
Compiler Instrumentation
Automating C++ Code Defect Discovery
Tool       | ASAN/MSAN               | Valgrind                | Dr. Memory     | Mudflap        | Guard Page | gperftools
Technology | CTI                     | DBI                     | DBI            | CTI            | Library    | Library
ARCH       | x86, ARM, PPC           | x86, ARM, PPC, MIPS, …  | x86            | All (?)        | All (?)    | All (?)
OS         | Linux, OS X, Windows, … | Linux, OS X, Solaris, … | Windows, Linux | Linux, Mac (?) | All (?)    | Linux, Windows
Slowdown   | 2x                      | 20x                     | 10x            | 2x-40x         | ?          | ?
Heap OOB   | yes                     | yes                     | yes            | yes            | some       | some
Stack OOB  | yes                     | no                      | no             | some           | no         | no
Global OOB | yes                     | no                      | no             | ?              | no         | no
UAF        | yes                     | yes                     | yes            | yes            | yes        | yes
UAR        | yes                     | no                      | no             | no             | no         | no
UMR        | yes (MSAN)              | yes                     | yes            | ?              | no         | no
Leaks      | yes                     | yes                     | yes            | ?              | no         | yes
Source: https://github.com/google/sanitizers/wiki/AddressSanitizerComparisonOfMemoryTools
AddressSanitizer (ASan) Overview
Detect a large variety of memory defects:
• Out-of-bounds accesses to heap, stack and globals
• Use-after-free
• Use-after-return
• Use-after-scope
• Double-free, invalid free
Integrated with recent versions of Clang and GCC:
-fsanitize=address
Only finds bugs in executed code paths.
New! On ARM64, HWASAN is even more efficient.
Source: https://android-developers.googleblog.com/2017/08/android-bug-swatting-with-sanitizers.html
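To make the defect classes listed above concrete, here is a minimal sketch that is not taken from the slides. The names `sum_samples` and `read_before_release` are hypothetical. Each function is the corrected form of a bug that a build with -fsanitize=address would typically report at run time; the faulty variants are kept as comments only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: sums the first `count` samples of a buffer.
// The faulty loop below reads one element past the end of the storage,
// the kind of access ASan typically reports as a heap-buffer-overflow:
//
//   for (std::size_t i = 0; i <= count; ++i) total += samples[i];
//
int sum_samples(const std::vector<int>& samples, std::size_t count) {
  assert(count <= samples.size());  // precondition: stay inside the buffer
  int total = 0;
  for (std::size_t i = 0; i < count; ++i) {
    total += samples[i];
  }
  return total;
}

// Hypothetical helper: copies a heap value out before the object is freed.
// Keeping a raw pointer and dereferencing it after reset() would be a
// use-after-free:
//
//   int* raw = value.get(); value.reset(); return *raw;
//
int read_before_release() {
  std::unique_ptr<int> value(new int(42));
  const int copy = *value;  // read while the object is still alive
  value.reset();            // the heap object is freed here
  return copy;              // only the copy leaves the function
}
```

Because ASan only observes executed paths, helpers like these still need tests that actually exercise the boundary cases.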
ThreadSanitizer (TSan) Overview
Detect concurrency-related defects:
• Potential deadlocks
• Race conditions
• Unsafe signal handlers (see signal-safety(7))
Integrated with recent versions of Clang and GCC:
-fsanitize=thread
void signal_handler() {
  // Will fail and set errno to ABCD
  my_function_modifying_errno();
  if (errno == ABCD) { /* do something */ }
}

int main() {
  install_signal_handler(&signal_handler);
  // Will fail and set errno to EFGH:
  my_other_function_modifying_errno();
  // A signal is received!
  // signal_handler() gets executed here.
  // This gets executed:
  if (errno == ABCD) {
    /* do something */
  }
  // ...but this should have been executed:
  else if (errno == EFGH) {
    /* do something else */
  }
}
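One common remedy for the errno problem illustrated above is to save errno on entry to the handler and restore it before returning, as signal-safety(7) recommends. The sketch below is an illustration under assumptions, not code from the talk: `errno_preserving_handler`, `errno_after_signal` and `signal_was_seen` are hypothetical names, and SIGINT is raised synchronously only to keep the example deterministic.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>

namespace {
// Flag written by the handler; volatile std::sig_atomic_t is the type the
// standard allows a signal handler to write safely.
volatile std::sig_atomic_t g_signal_seen = 0;
}  // namespace

// Hypothetical handler: saves errno on entry and restores it on exit, so the
// interrupted code still sees the value it was about to inspect.
extern "C" void errno_preserving_handler(int /*signum*/) {
  const int saved_errno = errno;
  errno = ENOENT;  // stands in for a failing call made inside the handler
  g_signal_seen = 1;
  errno = saved_errno;
}

// Installs the handler, sets errno, raises SIGINT synchronously, and returns
// the errno value the interrupted code observes afterwards.
int errno_after_signal(int errno_before) {
  const auto previous = std::signal(SIGINT, errno_preserving_handler);
  assert(previous != SIG_ERR);
  errno = errno_before;
  std::raise(SIGINT);  // the handler runs before raise() returns
  const int observed = errno;
  std::signal(SIGINT, previous);  // put the previous disposition back
  return observed;
}

bool signal_was_seen() { return g_signal_seen != 0; }
```

Without the save/restore pair, the caller would observe ENOENT instead of its own value, which is the situation described on the slide.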
Compiling ROS 2 with ASAN / TSAN
# Initial Setup
sudo apt-get install python3-colcon-mixin
colcon mixin add default \
  https://raw.githubusercontent.com/colcon/colcon-mixin-repository/master/index.yaml
colcon mixin update default

# Workspace Compilation (ASAN)
cd ~/ros2_asan_ws
colcon build --build-base=build-asan --install-base=install-asan \
  --cmake-args \
    -DOSRF_TESTING_TOOLS_CPP_DISABLE_MEMORY_TOOLS=ON \
    -DINSTALL_EXAMPLES=OFF -DSECURITY=ON --no-warn-unused-cli \
    -DCMAKE_BUILD_TYPE=Debug \
  --mixin asan-gcc \
  --packages-up-to test_communication \
  --symlink-install

# Workspace Compilation (TSAN)
cd ~/ros2_tsan_ws
colcon build --build-base=build-tsan --install-base=install-tsan \
  --cmake-args -DOSRF_TESTING_TOOLS_CPP_DISABLE_MEMORY_TOOLS=ON \
    -DINSTALL_EXAMPLES=OFF -DSECURITY=ON --no-warn-unused-cli \
    -DCMAKE_BUILD_TYPE=Debug \
  --mixin tsan \
  --packages-up-to test_communication \
  --symlink-install
ROS 2 CI Integration
ci.ros2.org > Nightly > *_sanitizer
Catch regressions early!
Currently only runs the rcpputils and rcutils unit tests.
We will expand the scope of those jobs as more
and more packages get fixed!
We are looking for volunteers to help us fix
those bugs!
Thread Safety Annotations
Thread Safety Annotation
• Clang + libclangcxx required.
• Detect concurrency issues at compile time.
• Class attributes and functions need to be annotated.
• But does not require full instrumentation (can be
migrated progressively!)
• Need to pass specific flag: -Wthread-safety
Race conditions are hard to find during code reviews.
It can take a very long time before the bug is triggered on a
production platform.
Start annotating your code today!
Real life ROS 2 example:
rmw_fastrtps_shared_cpp/topic_cache.hpp
#include "mutex.h"

class BankAccount {
private:
  Mutex mu;
  int balance GUARDED_BY(mu);

  void depositImpl(int amount) {
    balance += amount;       // WARNING! Cannot write balance without locking mu.
  }

  void withdrawImpl(int amount) REQUIRES(mu) {
    balance -= amount;       // OK. Caller must have locked mu.
  }

public:
  void withdraw(int amount) {
    mu.Lock();
    withdrawImpl(amount);    // OK. We've locked mu.
  }                          // WARNING! Failed to unlock mu.

  void transferFrom(BankAccount& b, int amount) {
    mu.Lock();
    b.withdrawImpl(amount);  // WARNING! Calling withdrawImpl() requires locking b.mu.
    depositImpl(amount);     // OK. depositImpl() has no requirements.
    mu.Unlock();
  }
};
Source: https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
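For contrast with the warnings above, here is one possible corrected version, offered as a sketch rather than the form used in ROS 2. It assumes a small stand-in for "mutex.h" modelled on the macros in the Clang documentation. The `TSA` helper macro, the `MutexLocker` guard, the trailing-underscore member names and the free function `transfer` are all choices made for this illustration. The annotations expand to nothing on compilers other than Clang.

```cpp
#include <cassert>
#include <mutex>

// Annotation macros: active under Clang, no-ops elsewhere.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif
#define CAPABILITY(x) TSA(capability(x))
#define SCOPED_CAPABILITY TSA(scoped_lockable)
#define GUARDED_BY(x) TSA(guarded_by(x))
#define REQUIRES(...) TSA(requires_capability(__VA_ARGS__))
#define ACQUIRE(...) TSA(acquire_capability(__VA_ARGS__))
#define RELEASE(...) TSA(release_capability(__VA_ARGS__))
#define EXCLUDES(...) TSA(locks_excluded(__VA_ARGS__))
#define NO_THREAD_SAFETY_ANALYSIS TSA(no_thread_safety_analysis)

// Minimal stand-in for the Mutex wrapper that "mutex.h" provides on the slide.
class CAPABILITY("mutex") Mutex {
 public:
  void Lock() ACQUIRE() NO_THREAD_SAFETY_ANALYSIS { impl_.lock(); }
  void Unlock() RELEASE() NO_THREAD_SAFETY_ANALYSIS { impl_.unlock(); }

 private:
  std::mutex impl_;
};

// RAII guard: every exit path releases the mutex, which removes the
// "failed to unlock" class of warning.
class SCOPED_CAPABILITY MutexLocker {
 public:
  explicit MutexLocker(Mutex* mu) ACQUIRE(mu) : mu_(mu) { mu_->Lock(); }
  ~MutexLocker() RELEASE() { mu_->Unlock(); }

 private:
  Mutex* mu_;
};

class BankAccount {
 private:
  Mutex mu_;
  int balance_ GUARDED_BY(mu_);

  void withdrawImpl(int amount) REQUIRES(mu_) { balance_ -= amount; }

 public:
  explicit BankAccount(int initial) : balance_(initial) {}

  void deposit(int amount) EXCLUDES(mu_) {
    assert(amount >= 0);
    MutexLocker lock(&mu_);
    balance_ += amount;  // guarded write, performed with mu_ held
  }

  void withdraw(int amount) EXCLUDES(mu_) {
    MutexLocker lock(&mu_);
    withdrawImpl(amount);  // the REQUIRES(mu_) contract is satisfied here
  }

  int balance() EXCLUDES(mu_) {
    MutexLocker lock(&mu_);
    return balance_;
  }
};

// Moves money between two accounts, locking one account at a time so that
// no lock-ordering problem can arise between the two mutexes.
void transfer(BankAccount& from, BankAccount& to, int amount) {
  from.withdraw(amount);
  to.deposit(amount);
}
```

Locking one account at a time keeps the example simple, at the cost of the transfer not being atomic across both accounts. Whether that trade-off is acceptable depends on the application.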
Fuzzing ROS 2
ROS 2 Fuzzing
ROS 2 writes and loads lots of data:
• Config files: YAML, XML
• ROS bags
• URDFs
• Messages (serialization/deserialization)
• Etc.
Fuzzing is essential (and easy!).
This naive script, which relies on radamsa to generate ROS 2
messages, was able to crash the ros2 CLI!
#!/usr/bin/env bash
# Seed corpus: a few dictionary words wrapped as std_msgs/String YAML payloads.
i=0
for word in $(aspell -d en dump master | aspell -l en expand | head -n 5); do
  echo "{data: \"${word}\"}" > "/tmp/sample-${i}"
  i=$((i+1))
done

# Nothing to fuzz unless a listener node is running.
pgrep listener || exit 0

while true; do
  # Mutate the seed samples into a new candidate message.
  STR=$("$HOME/radamsa/bin/radamsa" /tmp/sample-*)
  echo "$STR"
  (ros2 topic pub --once /chatter \
    std_msgs/String "${STR}" 2>&1) > /dev/null
  test $? -gt 127 && break # break on segfaults
  pgrep listener || break
done
echo "SEGV"
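The script above fuzzes through the command line. A complementary in-process approach, which the slides do not cover, is a libFuzzer harness. The sketch below assumes a hypothetical parser, `parse_string_message`, for the `{data: "<text>"}` payloads. It is not a ROS 2 API and only stands in for whatever parsing code is under test. `LLVMFuzzerTestOneInput` is the entry point libFuzzer looks for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical parser for payloads of the form {data: "<text>"}.
// On success, stores the text between the quotes and returns true;
// returns false for any malformed input.
bool parse_string_message(const std::string& input, std::string* text) {
  assert(text != nullptr);
  static const std::string prefix = "{data: \"";
  static const std::string suffix = "\"}";
  if (input.size() < prefix.size() + suffix.size()) return false;
  if (input.compare(0, prefix.size(), prefix) != 0) return false;
  if (input.compare(input.size() - suffix.size(), suffix.size(), suffix) != 0) {
    return false;
  }
  *text = input.substr(prefix.size(),
                       input.size() - prefix.size() - suffix.size());
  return true;
}

// libFuzzer entry point: called repeatedly with mutated byte buffers.
// The harness only has to feed the bytes to the code under test; crashes
// and sanitizer reports are what the fuzzer looks for.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
  const std::string input =
      size > 0 ? std::string(reinterpret_cast<const char*>(data), size)
               : std::string();
  std::string text;
  (void)parse_string_message(input, &text);  // result is irrelevant here
  return 0;
}
```

With Clang, such a harness is typically built with -fsanitize=fuzzer,address so that the fuzzing engine and ASan run together.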
What’s next?
UndefinedBehaviorSanitizer (UBSan) integration:
• bool
• integer-divide-by-zero
• return
• returns-nonnull-attribute
• shift-exponent
• unreachable
• vla-bound
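To illustrate two of the checks listed above, integer-divide-by-zero and shift-exponent, here is a small sketch of code written so that those checks never fire. `checked_divide` and `checked_shift_left` are hypothetical helpers invented for this example, not part of ROS 2 or UBSan.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Signed division guarded against its two undefined cases:
// a zero divisor, and INT32_MIN / -1, whose result does not fit in 32 bits.
bool checked_divide(std::int32_t numerator, std::int32_t denominator,
                    std::int32_t* result) {
  assert(result != nullptr);
  if (denominator == 0) return false;
  if (numerator == std::numeric_limits<std::int32_t>::min() &&
      denominator == -1) {
    return false;
  }
  *result = numerator / denominator;
  return true;
}

// Left shift guarded against an exponent that is greater than or equal to
// the bit width of the shifted type, which is undefined behavior.
bool checked_shift_left(std::uint32_t value, unsigned exponent,
                        std::uint32_t* result) {
  assert(result != nullptr);
  if (exponent >= 32u) return false;
  *result = value << exponent;
  return true;
}
```

An unguarded `numerator / denominator` or `value << exponent` compiles without complaint. A UBSan-instrumented build reports the problem only when a bad value actually reaches the expression at run time.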
Integrate Clang Control-Flow Integrity?
Annotate ROS 2 code with the Thread Safety Annotations.
Need to fix the ROS 2 Linux Clang build with libclangcxx!
Expand testing to more than core packages!
Thank you!

Editor's Notes

  • #3: Talk about AWS RoboMaker and its main features (dev / simulation / fleet management). Those features integrate and extend open-source software.
  • #9: Abbreviations and notes for the memory-tool comparison:
      • DBI: dynamic binary instrumentation
      • CTI: compile-time instrumentation
      • UMR: uninitialized memory reads
      • UAF: use-after-free (aka dangling pointer)
      • UAR: use-after-return
      • OOB: out-of-bounds
      • x86: includes 32- and 64-bit.
      • Mudflap was removed in GCC 4.9, as it has been superseded by AddressSanitizer.
      • Guard Page: a family of memory error detectors (Electric Fence or DUMA on Linux, Page Heap on Windows, libgmalloc on OS X).
      • gperftools: various performance tools/error detectors bundled with TCMalloc. Heap checker (leak detector) is only available on Linux. Debug allocator provides both guard pages and canary values for more precise detection of OOB writes, so it's better than guard page-only detectors.