Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Learning Programs to Graph Execution

Raffi Khatchadourian (1,2)
Tatiana Castro Vélez (2)
Mehdi Bagherzadeh (3)
Nan Jia (2)
Anita Raja (1,2)
(1) CUNY Hunter College, USA (ponder@hunter.cuny.edu)
(2) CUNY Graduate Center, USA
(3) Oakland University, USA
Introduction
As Deep Learning (DL) datasets grow,
efficiency becomes essential to support
responsiveness [16].
Traditionally, DL frameworks embraced
deferred execution-style DL code for fast
execution.
Hybrid approaches [2, 8, 13] execute
imperative DL programs quickly by converting
them to graphs at run-time.
Hybridization
Figure: Screenshot of the Hybridize Functions refactoring
preview wizard.
In TensorFlow [1], AutoGraph [13] can
enhance run-time performance by decorating
(annotating) appropriate Python function(s)
with @tf.function (Fig. 1).
Problems with Hybrid Approaches
Require non-trivial metadata [12].
Exhibit limitations and known issues with
native program constructs [9].
Are difficult to use correctly and efficiently
(e.g., avoiding side-effects) [4].
Require developers to manually specify which
functions are converted.
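The side-effect pitfall listed above can be illustrated without TensorFlow. The following is a toy sketch only: `Sym` and `toy_graph_function` are hypothetical names invented here, and the decorator merely mimics the trace-once, replay-afterwards behavior of hybridizing decorators. It is not how `@tf.function` is implemented.

```python
class Sym:
    """Symbolic placeholder that records arithmetic instead of computing it."""

    def __init__(self, evaluate):
        self.evaluate = evaluate  # callable: concrete input -> concrete value

    @staticmethod
    def _lift(other):
        # Wrap plain Python constants so both operands are symbolic.
        return other if isinstance(other, Sym) else Sym(lambda _x: other)

    def __add__(self, other):
        left, right = self, self._lift(other)
        return Sym(lambda x: left.evaluate(x) + right.evaluate(x))

    def __mul__(self, other):
        left, right = self, self._lift(other)
        return Sym(lambda x: left.evaluate(x) * right.evaluate(x))


def toy_graph_function(fn):
    """Hypothetical stand-in for a hybridizing decorator: trace once, then replay."""
    graph = None

    def wrapper(x):
        nonlocal graph
        if graph is None:
            # "Tracing": the Python body runs only on this first call.
            graph = fn(Sym(lambda value: value))
        return graph.evaluate(x)  # later calls replay the recorded arithmetic

    return wrapper


calls = []


@toy_graph_function
def scale_and_shift(x):
    calls.append("python body executed")  # Python side effect
    return x * 3 + 1


results = [scale_and_shift(v) for v in (1, 2, 3)]
print(results)     # [4, 7, 10]
print(len(calls))  # 1: the side effect happened only while tracing
```

The numeric results are correct for all three calls, yet the list append runs once. A developer who expects it to run on every call would see silently different behavior after hybridization, which is why side-effecting functions are poor candidates for conversion.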
Insight
Although imperative DL code typically
executes sequentially, hybridization resembles
parallelizing traditional sequential code.
Automated Tool
We design and implement a fully automated,
open-source refactoring tool named
Hybridize Functions [11] that transforms
otherwise eagerly-executed imperative
(Python) DL code for enhanced performance.
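The core transformation, adding the `@tf.function` decorator to selected functions, can be sketched with Python's standard `ast` module. This is an illustrative approximation and not the tool's implementation, which is an Eclipse/PyDev plug-in. The helper name `hybridize` and the sample source are assumptions made for this sketch, and the set of functions to decorate is supplied by hand here rather than inferred by analysis.

```python
import ast


def hybridize(source: str, function_names: set[str]) -> str:
    """Return `source` with @tf.function added to the named functions."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name in function_names:
            # Skip functions that are already hybridized.
            already = any(
                ast.unparse(d).startswith("tf.function")
                for d in node.decorator_list
            )
            if not already:
                decorator = ast.parse("tf.function", mode="eval").body
                node.decorator_list.insert(0, decorator)
    return ast.unparse(ast.fix_missing_locations(tree))


# Sample input; it is only parsed, never executed, so TensorFlow is not needed.
BEFORE = '''
import tensorflow as tf

def train_step(x):
    return tf.reduce_sum(x * 2)

def log_metrics(x):
    print(x)
'''

after = hybridize(BEFORE, {"train_step"})
print(after)
```

Only `train_step` gains the decorator; `log_metrics` is left eager. Note that `ast.unparse` (Python 3.9+) discards comments and original formatting, which is one reason a production refactoring would rely on a rewrite API that preserves the source layout.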
Contributions
Refactoring approach for automatically
converting imperative DL code to graphs.
Novel tensor analysis for imperative DL.
Fully automated, open-source tool
implemented as a PyDev [15] Eclipse [7]
IDE plug-in that integrates static analyses
from WALA [14] and Ariadne [6].
Architecture & Dependencies
Figure: Overall architecture.
Eclipse is leveraged for its existing,
well-documented, and integrated refactoring
framework and test engine [3], including
transformation APIs (e.g., ASTRewrite),
a refactoring preview pane (Fig. 1),
precondition checking (e.g.,
Refactoring.checkInitialConditions(),
Refactoring.checkFinalConditions()), and
refactoring testing (e.g., RefactoringTest).
PyDev is used for its efficient program entity
indexing, its extensive refactoring support [3],
and because it is completely open-source for
all Python development.
WALA is used for static analyses, such as
ModRef, upon which we built our side-effect
analysis.
Ariadne, which depends on WALA, is used
for its Python and tensor analysis,
including type inference and (TensorFlow)
library modeling.
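To give a feel for what a side-effect precondition guards against, here is a deliberately simple syntactic check written with Python's standard `ast` module. It is a stand-in only: the tool's analysis is built on WALA's ModRef and is interprocedural, whereas this sketch looks at a single function body. The function name `has_obvious_side_effects` and the `IMPURE_CALLS` denylist are assumptions made for illustration.

```python
import ast

# Assumption: a tiny, illustrative denylist of built-ins with external effects.
IMPURE_CALLS = {"print", "open", "input"}


def has_obvious_side_effects(func: ast.FunctionDef) -> bool:
    """Conservatively flag functions that visibly mutate outside state."""
    for node in ast.walk(func):
        # Rebinding names from enclosing scopes.
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            return True
        # Direct calls to known-impure built-ins.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in IMPURE_CALLS
        ):
            return True
        # Writes through attributes or subscripts (e.g., self.x = ..., a[i] = ...).
        if isinstance(node, (ast.Assign, ast.AugAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            if any(isinstance(t, (ast.Attribute, ast.Subscript)) for t in targets):
                return True
    return False


SOURCE = '''
counter = 0

def pure_step(x, w):
    y = x * w
    return y + 1

def impure_step(x):
    global counter
    counter += 1
    print(x)
    return x
'''

tree = ast.parse(SOURCE)
report = {
    node.name: has_obvious_side_effects(node)
    for node in tree.body
    if isinstance(node, ast.FunctionDef)
}
print(report)  # {'pure_step': False, 'impure_step': True}
```

A check like this misses effects that occur inside callees, which is the gap an interprocedural analysis such as ModRef is meant to close.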
Challenges Addressed
Reworked much of the existing Java (JDT)
refactoring tooling to work with Python.
Integrated Ariadne with PyDev, chosen for its
excellent and long-lived refactoring support
for Python, including a refactoring preview
pane, element GUI selection, and
refactoring undo history.
Augmented Ariadne to analyze imperative
Deep Learning (Python) code by vastly
expanding the XML summaries to support
a wide variety of popular TensorFlow 2
APIs.
Added support for Python constructs
commonly used in modern imperative DL
programs.
Correlated varying intermediate
representations (IRs) with the original
Python source code.
Modernizing Ariadne: New Enhancements
Python module packages.
Wild card imports.
Intra-package references (relative imports;
from .. import X).
Package initialization scripts.
Automatic discovery of unit test entry points.
Non-scalar tensor dataset [10] iteration.
Modeling of additional libraries.
Static and class method analysis.
Analysis of custom decorators.
Callable object (functor) analysis (used in
Keras).
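Several of the constructs listed above can appear together in a few lines of ordinary Python. The sketch below is an illustrative example of code a static analysis must handle, not code taken from the tool or its test suite. The names `traced` and `Scale` are hypothetical. Package-level features (relative imports, wildcard imports, `__init__.py` scripts) need multiple files and are therefore omitted.

```python
import functools


def traced(fn):
    """Custom decorator (hypothetical): counts calls made through the wrapper."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return fn(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


class Scale:
    """Callable object (functor), invoked the way Keras layers and models are."""

    def __init__(self, factor):
        self.factor = factor

    @traced
    def __call__(self, values):
        return [self.normalize(v) * self.factor for v in values]

    @staticmethod
    def normalize(value):
        return float(value)

    @classmethod
    def doubling(cls):
        return cls(2)


layer = Scale.doubling()   # class method used as a factory
output = layer([1, 2, 3])  # dispatches to the decorated __call__
print(output)              # [2.0, 4.0, 6.0]
```

The call `layer([1, 2, 3])` has no explicit method name at the call site, so an analysis must resolve it to `Scale.__call__` through the decorator wrapper before it can reason about the values flowing through it.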
Evaluation Summary
We applied our approach to 19 open-source
Python imperative DL programs of varying
size and domain, ranging from 0.12 to 36.72
thousand source lines of code (KLOC).
Our tool considered 766 Python functions,
automatically refactoring 42.56% of them
despite being highly conservative.
During a run-time performance evaluation,
we measured an average relative model
training speedup of 2.16 (memory
consumption measurement pending).
Differences in model accuracy and loss
before and after refactoring were negligible.
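The reported figures can be unpacked with a little arithmetic. The function count below is derived from the two reported numbers and is not itself stated on the poster. The speedup definition (time before divided by time after) is the conventional one and is assumed here, and the timings are hypothetical values chosen only to reproduce the reported ratio.

```python
FUNCTIONS_CONSIDERED = 766  # reported in the evaluation summary
REFACTORED_PERCENT = 42.56  # reported in the evaluation summary

# Implied number of refactored functions (derived, not reported).
refactored = round(FUNCTIONS_CONSIDERED * REFACTORED_PERCENT / 100)
print(refactored)  # 326


def relative_speedup(seconds_before: float, seconds_after: float) -> float:
    """Assumed definition: training time before refactoring over time after."""
    return seconds_before / seconds_after


# Hypothetical timings, picked only so the ratio matches the reported average.
print(round(relative_speedup(216.0, 100.0), 2))  # 2.16
```

Under this definition, a speedup of 2.16 means training after refactoring takes a little under half the original time.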
Conclusion
Open-source, automated refactoring PyDev
Eclipse plug-in, Hybridize Functions,
assists developers with writing optimal
imperative DL Python code.
Integrates an Eclipse refactoring with
WALA Ariadne Python static analyses.
Future Work
Explore incorporating advanced
container-based analyses.
Automatically split functions.
References
1. Abadi, M. et al.: TensorFlow: A System for Large-Scale Machine Learning. In: OSDI (2016)
2. Apache, Hybridize. Apache MXNet documentation. (2021). https://mxnet.apache.org/versions/1.8.0/api/python/docs/tutorials/packages/gluon/blocks/hybridize.html (visited on 04/08/2021)
3. Bäumer, D. et al.: “Integrating refactoring support into a Java development tool”.
4. Castro Vélez, T. et al.: Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study. In: MSR ’22. ACM (2022). https://doi.org/10.1145/3524842.3528455
5. Chollet, F.: Deep Learning with Python. Manning (2020)
6. Dolby, J. et al.: Ariadne: Analysis for Machine Learning Programs. In: MAPL, pp. 1–10. ACM (2018)
7. Eclipse Foundation, Eclipse IDE. (2024). https://eclipseide.org/ (visited on 09/10/2024)
8. Facebook Inc., PyTorch. TorchScript. (2019). https://pytorch.org/docs/stable/jit.html
9. Google LLC, Better performance with tf.function. (2021). https://tensorflow.org/guide/function
10. Google LLC, tf.data.Dataset. TensorFlow. Version 2.9.3. (2023). https://www.tensorflow.org/versions/r2.9/api_docs/python/tf/data/Dataset (visited on 12/15/2023)
11. Hybridize-Functions-Refactoring. (2024). https://github.com/ponder-lab/Hybridize-Functions-Refactoring (visited on 09/30/2024)
12. Jeong, E. et al.: Speculative Symbolic Graph Execution of Imperative Deep Learning Programs. SIGOPS Oper. Syst. Rev. 53(1), 26–33 (2019). https://doi.org/10.1145/3352020.3352025
13. Moldovan, D. et al.: AutoGraph: Imperative-style Coding with Graph-based Performance. (2019). arXiv: 1810.08061 [cs.PL]
14. T.J. Watson Libraries for Analysis. (2024). https://github.com/wala/WALA (visited on 09/10/2024)
15. Zadrozny, F.: PyDev. (2023). https://www.pydev.org (visited on 05/31/2023)
16. Zhou, W. et al.: HARP: Holistic Analysis for Refactoring Python-Based Analytics Programs. In: ICSE (2020). https://doi.org/10.1145/3377811.3380434
Acknowledgments
This material is supported in part by the National Science Foundation under awards CCF 2200343, CNS 2213763, and CCF 2343750.
International Conference on Fundamental Approaches to Software Engineering, May 3–8, 2025, Hamilton, Canada
