Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Learning
Programs to Graph Execution
Raffi Khatchadourian¹,²
Tatiana Castro Vélez²
Mehdi Bagherzadeh³
Nan Jia²
Anita Raja¹,²
¹ CUNY Hunter College, USA (ponder@hunter.cuny.edu)
² CUNY Graduate Center, USA
³ Oakland University, USA
Introduction
As Deep Learning (DL) datasets grow,
efficiency becomes essential to support
responsiveness [16].
Traditionally, DL frameworks embraced
deferred execution-style DL code for fast
execution.
Hybrid approaches [2, 8, 13] execute
imperative DL programs quickly.
Hybridization
Figure 1: Screenshot of the Hybridize Functions refactoring
preview wizard.
In TensorFlow [1], AutoGraph [13] can
enhance run-time performance by decorating
(annotating) appropriate Python function(s)
with @tf.function (Fig. 1).
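As a minimal sketch of the decoration described above (the function
names dense_eager and dense_graph and the input shapes are illustrative
assumptions, not taken from the tool or its evaluation subjects), the
same computation can be written once as a plain eager function and once
as a hybridized one:

```python
import tensorflow as tf

def dense_eager(x, w, b):
    # Plain Python function: each TensorFlow op runs eagerly, one at a time.
    return tf.nn.relu(tf.matmul(x, w) + b)

@tf.function  # The body is traced into a graph on the first call.
def dense_graph(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

# Illustrative inputs (hypothetical shapes).
x = tf.ones((2, 3))
w = tf.ones((3, 4))
b = tf.zeros((4,))

eager_out = dense_eager(x, w, b)
graph_out = dense_graph(x, w, b)
```

Both calls should produce the same values; only the execution mode
differs. Whether the graph version is actually faster depends on the
function body and how often it is called.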
Problems with Hybrid Approaches
Require non-trivial metadata [12].
Exhibit limitations and known issues with
native program constructs [9].
Are difficult to use correctly and efficiently
(e.g., avoiding side-effects) [4].
Require developers to manually specify which
functions are converted.
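The side-effect problem in the list above can be illustrated with a
small sketch (trace_log and add_one are hypothetical names). It relies
on the behavior described in the tf.function guide [9]: Python side
effects in a decorated function run only while the function is being
traced, not on every call.

```python
import tensorflow as tf

trace_log = []  # Plain Python state, invisible to the TensorFlow graph.

@tf.function
def add_one(x):
    # Python side effect: executed during tracing, not on each call.
    trace_log.append("called")
    return x + 1

first = add_one(tf.constant(1))
second = add_one(tf.constant(2))  # Same dtype and shape: the trace is reused.

# The function was called twice, but the append ran only once.
calls_recorded = len(trace_log)
```

A developer who hybridizes such a function by hand may not notice that
the logging silently stops, which is one reason choosing functions to
convert is error-prone.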
Insight
Although imperative DL code typically
executes sequentially, hybridization resembles
parallelizing traditional sequential code.
Automated Tool
We design and implement a fully automated,
open-source refactoring tool named
Hybridize Functions [11] that transforms
otherwise eagerly-executed imperative
(Python) DL code for enhanced performance.
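The end result of such a transformation can be sketched with Python's
standard ast module. This is only an illustration under stated
assumptions: the actual tool is an Eclipse/PyDev plug-in that selects
functions through static analysis, whereas here the target names are
given by hand, and AddTfFunction, hybridize, and SOURCE are hypothetical
names. It needs Python 3.9 or later for ast.unparse.

```python
import ast

# Hypothetical input program, held as text so TensorFlow is not required.
SOURCE = '''
import tensorflow as tf

def train_step(x):
    return x * 2

def log_metrics(x):
    print(x)
'''

class AddTfFunction(ast.NodeTransformer):
    """Prepend @tf.function to the functions named in `targets`."""

    def __init__(self, targets):
        self.targets = set(targets)

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        # Skip functions that already carry a tf.function decorator.
        already = any(
            ast.unparse(d).startswith("tf.function")
            for d in node.decorator_list
        )
        if node.name in self.targets and not already:
            decorator = ast.parse("tf.function", mode="eval").body
            node.decorator_list.insert(0, decorator)
        return node

def hybridize(source, targets):
    """Return `source` with @tf.function added to the chosen functions."""
    tree = AddTfFunction(targets).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

refactored = hybridize(SOURCE, ["train_step"])
print(refactored)
```

Here log_metrics is deliberately left out of the targets because it
prints, a Python side effect that would behave differently under graph
execution.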
Contributions
Refactoring approach for automatically
converting imperative DL code to graphs.
Novel tensor analysis for imperative DL.
Fully automated, open-source tool
implemented as a PyDev [15] Eclipse [7]
IDE plug-in that integrates static analyses
from WALA [14] and Ariadne [6].
Architecture & Dependencies
Figure: Overall architecture.
Eclipse is leveraged for its existing,
well-documented, and integrated refactoring
framework and test engine [3], including
transformation APIs (e.g., ASTRewrite), the
refactoring preview pane (Fig. 1),
precondition checking (e.g.,
Refactoring.checkInitialConditions(),
Refactoring.checkFinalConditions()), and
refactoring testing (e.g., RefactoringTest).
PyDev is used for its efficient program entity
indexing and extensive refactoring support [3],
and because it is completely open-source for
all Python development.
WALA is used for static analyses, such as
ModRef, upon which we built our side-effect
analysis.
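To give a feel for what a side-effect analysis looks for, here is a
deliberately naive, purely syntactic sketch in Python. It is not ModRef
and not the tool's analysis, which the poster says is built on WALA; the
function has_python_side_effects and its rules are assumptions made for
illustration only. It misses many cases, such as mutation through method
calls like list.append.

```python
import ast

def has_python_side_effects(func_source):
    """Flag a few obvious side-effect patterns in a function's source."""
    tree = ast.parse(func_source)
    for node in ast.walk(tree):
        # Writes to names declared global or nonlocal.
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            return True
        # Console output through a direct call to print().
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "print"):
            return True
        # Assignments to attributes or subscripts (e.g., self.x, d[k]).
        if isinstance(node, (ast.Assign, ast.AugAssign)):
            targets = (node.targets if isinstance(node, ast.Assign)
                       else [node.target])
            if any(isinstance(t, (ast.Attribute, ast.Subscript))
                   for t in targets):
                return True
    return False

# Hypothetical inputs.
PURE = "def f(x):\n    y = x * 2\n    return y\n"
IMPURE = "def g(self, x):\n    self.total += x\n    return x\n"

pure_flagged = has_python_side_effects(PURE)      # False
impure_flagged = has_python_side_effects(IMPURE)  # True
```

A conservative refactoring would leave any flagged function
unconverted.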
Ariadne, which depends on WALA, is used
for its Python and tensor analysis,
including type inference and (TensorFlow)
library modeling.
Challenges Addressed
Reworked much of the existing Java (JDT)
refactoring tooling to work with Python.
Integrated Ariadne with PyDev, chosen for
PyDev's excellent and long-lived refactoring
support for Python, including its refactoring
preview pane, GUI element selection, and
refactoring undo history.
Augmented Ariadne to analyze imperative
Deep Learning (Python) code by vastly
expanding the XML summaries to support
a wide variety of popular TensorFlow 2
APIs.
Added support for Python constructs
commonly used in modern imperative DL
programs.
Correlated varying intermediate
representations (IRs) with the original
Python source code.
Modernizing Ariadne: New Enhancements
Python module packages.
Wildcard imports.
Intra-package references (relative imports;
from .. import X).
Package initialization scripts.
Automatic discovery of unit test entry points.
Non-scalar tensor dataset [10] iteration.
Modeling of additional libraries.
Static and class method analysis.
Analysis of custom decorators.
Callable object (functor) analysis (used in
Keras).
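Several of the constructs in the list above can be shown together in a
short, TensorFlow-free sketch. All names here (logged, Scale, forward)
are hypothetical; the point is only that a static analysis must resolve
layer(x) to Scale.__call__, follow the static and class methods, and see
through the custom decorator to build a correct call graph.

```python
import functools

def logged(func):
    """Custom decorator: wraps `func` without changing its result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

class Scale:
    """Callable object (functor): instances are invoked like functions."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        # layer(x) dispatches here, as with Keras layers and models.
        return self.apply(x, self.factor)

    @staticmethod
    def apply(x, factor):
        return x * factor

    @classmethod
    def doubling(cls):
        # Alternative constructor that returns Scale(2).
        return cls(2)

@logged
def forward(x):
    layer = Scale.doubling()
    return layer(x)  # Implicit call to Scale.__call__

result = forward(21)
```
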
Evaluation Summary
We applied our approach to 19 open-source
Python imperative DL programs of varying
size and domain, ranging from 0.12 to 36.72
thousand source lines of code (KLOC).
Our tool considered 766 Python functions,
automatically refactoring 42.56% despite
being highly conservative.
During a run-time performance evaluation,
we measured an average relative model
training speedup of 2.16 (memory
consumption measurement pending).
Differences in model accuracy and loss
before and after refactoring were negligible.
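The arithmetic behind these figures can be made explicit. In the sketch
below, the count of refactored functions is derived by rounding rather
than reported on the poster, the definition of relative speedup as
original time divided by refactored time is an assumption, and the
timings are hypothetical values chosen only to illustrate the ratio.

```python
# Figures reported on the poster.
functions_considered = 766
fraction_refactored = 0.4256

# Derived, not reported: the implied number of refactored functions.
functions_refactored = round(functions_considered * fraction_refactored)
percent_check = round(100 * functions_refactored / functions_considered, 2)

def relative_speedup(seconds_before, seconds_after):
    """Assumed definition: original training time over refactored time."""
    return seconds_before / seconds_after

# Hypothetical timings, not measurements from the evaluation.
example = relative_speedup(10.8, 5.0)
```

Under this definition, a speedup of 2.16 means training took a little
under half the original time.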
Conclusion
Open-source, automated refactoring PyDev
Eclipse plug-in, Hybridize Functions,
assists developers with writing optimal
imperative DL Python code.
Integrates an Eclipse refactoring with
WALA Ariadne Python static analyses.
Future Work
Explore incorporating advanced
container-based analyses.
Automatically split functions.
References
1. Abadi, M. et al.: TensorFlow: A System for Large-Scale Machine Learning. In: OSDI (2016)
2. Apache, Hybridize. Apache MXNet documentation. (2021). https://mxnet.apache.org/versions/1.8.0/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html (visited on 04/08/2021)
3. Bäumer, D. et al.: “Integrating refactoring support into a Java development tool”.
4. Castro Vélez, T. et al.: Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study. In: MSR. MSR ’22. ACM (2022). https://doi.org/10.1145/3524842.3528455
5. Chollet, F.: Deep Learning with Python. Manning (2020)
6. Dolby, J. et al.: Ariadne. Analysis for Machine Learning Programs. In: MAPL, pp. 1–10. ACM (2018)
7. Eclipse Foundation, Eclipse IDE. (2024). https://eclipseide.org/ (visited on 09/10/2024)
8. Facebook Inc., PyTorch. TorchScript. (2019). https://pytorch.org/docs/stable/jit.html
9. Google LLC, Better performance with tf.function. (2021). https://tensorflow.org/guide/function
10. Google LLC, tf.data.Dataset. TensorFlow. Version 2.9.3. (2023). https://www.tensorflow.org/versions/r2.9/api_docs/python/tf/data/Dataset (visited on 12/15/2023)
11. Hybridize-Functions-Refactoring. (2024). https://github.com/ponder-lab/Hybridize-Functions-Refactoring (visited on 09/30/2024).
12. Jeong, E. et al.: Speculative Symbolic Graph Execution of Imperative Deep Learning Programs. SIGOPS Oper. Syst. Rev. 53(1), 26–33 (2019). https://doi.org/10.1145/3352020.3352025
13. Moldovan, D. et al.: AutoGraph: Imperative-style Coding with Graph-based Performance. (2019). arXiv: 1810.08061 [cs.PL].
14. T.J. Watson Libraries for Analysis. (2024). https://github.com/wala/WALA (visited on 09/10/2024). original-date: 2012-04-05T18:57:03Z.
15. Zadrozny, F.: PyDev. (2023). https://www.pydev.org (visited on 05/31/2023)
16. Zhou, W. et al.: HARP: Holistic Analysis for Refactoring Python-Based Analytics Programs. In: ICSE (2020). https://doi.org/10.1145/3377811.3380434
Acknowledgments This material is supported in part
by the National Science Foundation under awards CCF
2200343, CNS 2213763, and CCF 2343750.
International Conference on Fundamental Approaches to Software Engineering, May 3–8, 2025, Hamilton, Canada

