Analyzing the Evolution of Inter-package
Dependencies in Operating Systems: A
Case Study of Ubuntu
Victor Prokhorenko, Chadni Islam,
Muhammad Ali Babar
Overview
• Software dependencies and complexity
• Operating Systems context
• DepEx framework - Dependency extraction
• Ubuntu – Findings
• Future work directions and enhancements
Types of software dependencies
• Code libraries (source or binary)
• Network sockets
• UNIX domain sockets
• Pipes
• File-based
• Run-time code provisioning
• Downloading from external source
• Generated by external applications
• Discoverable and undiscoverable
Code libraries
• Available in source form at compile time
• Used through various forms of import- and include-like statements
• Can be in-lined or embedded, thus eliminating external dependency
• Dynamically loadable at run time
• Typically found as .so and .dll files
• Can have recursive dependencies of their own
Binary Dynamically Loadable Libraries
• Required to be present in the system for a given application to start
• Required to be located in a discoverable “place”
• Required to contain the necessary functionality
• List of exported functions
• Versioning considerations
• Source code may be inaccessible
• Sourced from:
  • Application bundle
  • Pre-existing Operating System
Binary Dynamically Loadable Libraries: example
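On Linux, one way to see which libraries a binary needs is to read the NEEDED entries in its dynamic section. The sketch below is illustrative and is not the example shown on the original slide: it parses text in the format printed by `readelf -d`. The function name `parse_needed` and the sample text are assumptions made for this illustration.

```python
import re

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def parse_needed(readelf_output: str) -> list[str]:
    """Return the library names listed as NEEDED, in order of appearance."""
    return NEEDED_RE.findall(readelf_output)


# Hypothetical sample in the style of `readelf -d` output.
SAMPLE = """\
Dynamic section at offset 0x2d88 contains 27 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libselinux.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x4000
"""

if __name__ == "__main__":
    print(parse_needed(SAMPLE))
```

Each name found this way is only a direct dependency; the libraries it names may in turn list NEEDED entries of their own.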
Complexity aspect: dependency-based metrics
• How do dependencies reflect application complexity?
• How do dependencies reflect library importance?
• Four metrics investigated: Presence, Coverage, Occurrence, Usage
• Developer-facing complexity vs. recursion
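The difference between developer-facing (direct) and recursive dependencies can be illustrated with a small graph walk. This is a minimal sketch on a made-up graph, not the DepEx implementation; the function `recursive_deps` and the `GRAPH` contents are hypothetical.

```python
def recursive_deps(graph: dict[str, set[str]], name: str) -> set[str]:
    """Return every library reachable from `name`, directly or indirectly."""
    seen: set[str] = set()
    stack = list(graph.get(name, ()))
    while stack:
        lib = stack.pop()
        if lib in seen:
            continue  # already visited; also protects against cycles
        seen.add(lib)
        stack.extend(graph.get(lib, ()))
    return seen


# Hypothetical toy graph: each key maps to the libraries it links directly.
GRAPH = {
    "app": {"libgtk", "libc"},
    "libgtk": {"libglib", "libc"},
    "libglib": {"libc"},
    "libc": set(),
}

direct = GRAPH["app"]
recursive = recursive_deps(GRAPH, "app")
print(len(direct), len(recursive))
```

Here the developer of `app` chooses two libraries directly, while the system has to provide three for it to start.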
Operating system context
• Single application development phase
• Testing through compilation and execution
• Multiple tools: compiler, IDE, tests, debuggers
• Multiple applications usage phase
• Automatic dependency installation (apt install)
• Library version conflicts: versioned names
• Bundling: archive, container, VM
• Base system inflation
System-wide dependencies observability
• Emergent high-level architecture appears as a result of combining multiple independently-developed applications
• Constant system modifications and updates lead to the lack of a stable picture
• Lack of bird’s eye view:
  • Are there any libraries missing that are required by executables in the system? Which ones?
  • Which executables would not be able to run due to the lack of required libraries?
  • What are the most popular/critical libraries (i.e. required by the largest number of executables)? Least popular?
  • Which libraries are present in the system but not required by any executable?
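Once dependencies are available as data, the questions above reduce to simple set operations. The sketch below assumes a mapping from each executable to the libraries it requires directly and a set of libraries present on the system. All names and sample values are hypothetical, and recursive requirements are ignored for brevity.

```python
from collections import Counter


def system_overview(requires: dict[str, set[str]], present: set[str]) -> dict:
    """Summarise library availability and popularity for a set of executables."""
    # Count how many executables require each library.
    uses = Counter(lib for libs in requires.values() for lib in libs)
    required = set(uses)
    return {
        # Libraries required by some executable but absent from the system.
        "missing": sorted(required - present),
        # Executables with at least one absent library.
        "broken": sorted(exe for exe, libs in requires.items() if libs - present),
        # (library, number of executables) pairs, most used first.
        "popularity": uses.most_common(),
        # Libraries on the system that no executable asks for.
        "unused": sorted(present - required),
    }


# Hypothetical sample data.
REQUIRES = {
    "editor": {"libc", "libgtk"},
    "shell": {"libc"},
    "player": {"libc", "libaudio"},
}
PRESENT = {"libc", "libgtk", "libold"}

if __name__ == "__main__":
    print(system_overview(REQUIRES, PRESENT))
```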
DepEx: Dependency Extractor framework
• Plugin-based architecture
• Presence and Coverage metrics
• Scans file system and stores discovered dependencies in a structured database
• Current development targeted at run-time file modifications tracking
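The general shape of a plugin-based scanner that writes to a structured database might look like the sketch below. It is not DepEx code: the registry, the `.deps` toy file format, the table layout, and the use of an in-memory dictionary in place of a real file system are all assumptions chosen to keep the example self-contained.

```python
import sqlite3
from typing import Callable, Iterable

# Hypothetical plugin registry: file suffix -> extractor function.
Extractor = Callable[[str, bytes], Iterable[str]]
PLUGINS: dict[str, Extractor] = {}


def plugin(suffix: str):
    """Register an extractor for files whose name ends with `suffix`."""
    def register(func: Extractor) -> Extractor:
        PLUGINS[suffix] = func
        return func
    return register


@plugin(".deps")
def text_list_plugin(path: str, data: bytes) -> Iterable[str]:
    # Toy format used only for this sketch: one dependency name per line.
    return [line for line in data.decode().splitlines() if line]


def scan(files: dict[str, bytes], db: sqlite3.Connection) -> None:
    """Run the matching plugin on each file and store (file, dependency) rows."""
    db.execute("CREATE TABLE IF NOT EXISTS deps (file TEXT, dependency TEXT)")
    for path, data in files.items():
        for suffix, extract in PLUGINS.items():
            if path.endswith(suffix):
                rows = [(path, dep) for dep in extract(path, data)]
                db.executemany("INSERT INTO deps VALUES (?, ?)", rows)
    db.commit()


# Stand-in for a file system walk: path -> file contents.
db = sqlite3.connect(":memory:")
scan({"/opt/demo/app.deps": b"libc\nlibm\n", "/opt/demo/readme.txt": b"hi"}, db)
print(db.execute("SELECT file, dependency FROM deps").fetchall())
```

Supporting a new executable type then means registering one more extractor, without touching the scanning or storage code.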
Ubuntu case study
• Why Ubuntu?
  • High popularity
  • Consistent archives
  • Detailed release notes
  • Long history – chance to find evolutionary patterns
• Technical challenges
  • Disk space requirements
  • Image format and compression changed over time
Ubuntu case study: statistics
• 84 consecutive versions (5.04 to 23.04)
• 18 years of history
• 114 GB of compressed images
• 9.8 million total files
• Over 408,000 total binaries and executables
• Almost 2 million library-level dependencies extracted
Ubuntu: libraries vs. executable vs. total files
Ubuntu: dependencies vs. files
Ubuntu: average dependencies
Ubuntu: maximum dependencies
Ubuntu: direct dependencies drift
Ubuntu: most popular libraries (direct)
Rank  Library     Direct uses
1     libc        4397
2     libpthread  1438
3     libglib     1037
4     libgobject  945
5     libm        836
6     librt       719
7     libgthread  660
8     libgmodule  658
9     libgtk-x11  656
10    libdl       601
[Bar chart: direct uses per library, axis from 0 to 5000]
Ubuntu: executable/library complexity (recursive)
Rank  Executable/library      Dependencies (recursive)
1     gnome-control-center    273
2     gnome-initial-setup     169
3     shotwell-publishing     164
4     libiradio               158
5     librilo                 158
6     evolution-alarm-notify  156
7     gnome-todo              155
8     smbd.x86_64-linux-gnu   155
9     empathy-call            154
10    libclplug_gtk3lo        154
[Bar chart: dependencies per executable/library, axis from 0 to 300]
Visualization challenges
Conclusion
• Executables with a high number of recursive dependencies can be removed
• “Complexity” comes from a large number of subsystems: image formats, setup
• Highly popular libraries are here to stay
• A large number of libraries (up to three quarters) are not explicitly required
• Plugins – discovered and loaded through a different mechanism
• Shipped “just in case” for applications that would likely be installed
• While periodic cleanups occur in practice, averages still steadily grow
• Developer-facing complexity tends to be controlled
Future work directions and enhancements
• More plugins for more executable types
• Network awareness
• Run-time file activity tracking and dependency graph updating
• Real-time system health monitoring
• Missing libraries, broken executables, missing links
• Recovery/fixing recommendations
• Most required missing library?
• Last file event impact?

ECSA 2023 Ubuntu Case Study
