KIT – The Research University in the Helmholtz Association www.kit.edu
Easing the Reuse of ML Solutions by Interactive
Clustering-based Autotuning in Scientific Applications
Hamideh Hajiabadi, Lennart Hilbert, Anne Koziolek
Karlsruhe Institute of Technology, Karlsruhe, Germany
Institute of Information Security and Dependability
Institute for Biological and Chemical Systems
Hajiabadi et al. - Easing the Reuse of ML Solutions by Interactive Clustering-based
Autotuning in Scientific Applications
Karlsruhe Institute of Technology - Germany
Difficulties of using ML in scientific applications
2 8/27/2022
Challenges of supervised vs unsupervised ML reuse
supervised
• automatic tuning ☺
• annotations by experienced domain specialists ☹
• sufficiently comprehensive annotated data to avoid over-fitting ☹
• re-train on new data ☹
unsupervised
• manual tuning ☹
• no annotations required ☺
• re-tune on new data ☹
Scientific use case: biological image segmentation
RNA distribution in fixed zebrafish embryo
1. Supervised segmentation needs annotated data, which is very expensive to produce.
2. The segmentation algorithm cannot be used out of the box; it needs tuning.
3. Objects differ in size, intensity, and shape, and sometimes have unclear boundaries.
4. Different settings need to be applied to each object type.
Related work
Interactive methods:
• WEKA [1]
• Ilastik [2]
Drawbacks:
• Still need expert knowledge to provide annotations
• Mostly focus on cellular segmentation, not subcellular
[1] Kanuri N, Abdelkarim AZ, Rathore SA. Trainable WEKA (Waikato Environment for Knowledge Analysis) Segmentation Tool: Machine-Learning-Enabled Segmentation on Features of Panoramic Radiographs. Cureus. 2022 Jan 31;14(1).
[2] Berg S, Kutra D, Kroeger T, Straehle CN, Kausler BX, Haubold C, Schiegg M, Ales J, Beier T, Rudy M, Eren K. ilastik: interactive machine learning for (bio)image analysis. Nature Methods. 2019 Dec;16(12):1226-32.
Proposed approach
We propose a combination of auto-tuning and active learning for scenarios where no training data and no objective function are available, but a metric approximating the quality can be defined.
[Figure: proposed framework]
Benefit: It helps biologists choose the ML solution and tune its parameters with few interactions.
The proposed framework
Case study
We used input images recorded by STEDD microscopy in a previous study [1, 2].
The images show cell nuclei in fixed zebrafish embryos, in which RNA was labeled.
[1] Pancholi et al., “RNA polymerase II clusters form in line with surface condensation on regulatory chromatin,” Molecular Systems Biology, vol. 17, no. 9, p. e10272, 2021.
[2] L. Hilbert, “Analysis of RNA polymerase II phosphorylation in STimulated Emission Double Depletion (STEDD) microscopy images,” Jun. 2021.
First step: object detection
The framework detects the objects and prepares the object set, with each object image sized 60 × 60 pixels.
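A minimal sketch of how such an object set could be prepared, assuming objects are found by intensity thresholding followed by connected-component labeling. The slides do not name the detection method, so that choice is an assumption, and the names `detect_centroids`, `crop_patch`, `build_object_set` and the `threshold` parameter are hypothetical. Only the 60 × 60 patch size comes from the slide.

```python
from collections import deque

PATCH = 60  # patch edge length in pixels, as stated on the slide


def detect_centroids(image, threshold):
    """Return (row, col) centroids of 4-connected regions brighter than threshold.

    Assumption: thresholding plus flood fill stands in for the real detector.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill over one bright region.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            cy = sum(p[0] for p in pixels) // len(pixels)
            cx = sum(p[1] for p in pixels) // len(pixels)
            centroids.append((cy, cx))
    return centroids


def crop_patch(image, center, size=PATCH):
    """Cut a size x size window around center, zero-padding beyond the border."""
    rows, cols = len(image), len(image[0])
    top, left = center[0] - size // 2, center[1] - size // 2
    return [
        [image[y][x] if 0 <= y < rows and 0 <= x < cols else 0
         for x in range(left, left + size)]
        for y in range(top, top + size)
    ]


def build_object_set(image, threshold):
    """Detect objects and return one fixed-size patch per object."""
    return [crop_patch(image, c) for c in detect_centroids(image, threshold)]
```

Zero-padding at the border is one possible design choice; it keeps every patch the same size so that later steps can compare objects directly.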
Second step: object set clustering
The framework clusters the objects. The number of clusters is given to the framework.
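The clustering step can be illustrated with a small k-means sketch in which the number of clusters `k` is supplied by the caller, as the slide describes. The slides do not say which clustering algorithm or which object features the framework uses, so k-means, the feature vectors, and the names `kmeans` and `squared_distance` are assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iterations=50, seed=0):
    """Cluster feature vectors into k groups; returns (labels, centers).

    Assumption: plain k-means stands in for the framework's clustering.
    """
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: attach every object to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(f, centers[j]))
            for f in features
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [f for f, lab in zip(features, new_labels) if lab == j]
            if members:  # keep the old center if a cluster ran empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
    return labels, centers
```

Each resulting cluster can then be treated as one object type, so that settings are tuned per cluster rather than once for the whole object set.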
Third step: framework suggests some segmentations
• In each interaction, 4 images are selected based on the defined metric.
• The results of four segmentation algorithms are then shown.
• The user selects the best algorithm.
• The selected method is then adjusted in the same manner.
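One interaction of this step might look like the sketch below. It assumes that the four images are the ones with the lowest metric score and that the user's choice arrives through a callback; the slides specify neither, and the names `select_images`, `interaction_round`, `choose_per_cluster`, and `ask_user` are hypothetical.

```python
def select_images(objects, metric, n=4):
    """Pick the n objects with the lowest metric score (an assumed ranking rule)."""
    return sorted(objects, key=metric)[:n]


def interaction_round(objects, algorithms, metric, ask_user):
    """Run every candidate algorithm on the selected objects and record the user's pick.

    algorithms maps a name to a function that segments one object.
    ask_user stands in for the interactive step and returns the chosen name.
    """
    shown = select_images(objects, metric)
    results = {
        name: [segment(obj) for obj in shown]
        for name, segment in algorithms.items()
    }
    choice = ask_user(shown, results)
    if choice not in algorithms:
        raise ValueError(f"unknown algorithm: {choice!r}")
    return choice


def choose_per_cluster(clusters, algorithms, metric, ask_user):
    """Repeat the interaction once per cluster, giving one choice per object type."""
    return {
        cluster_id: interaction_round(objs, algorithms, metric, ask_user)
        for cluster_id, objs in clusters.items()
    }
```

The same loop could be reused for the adjustment stage by passing parameter variants of the chosen algorithm as the `algorithms` mapping.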
Evaluation
RQ1: How effective is our framework in improving the quality of segmentation?
RQ2: How many interactions does the framework need from users to reuse the ML solutions?
Evaluation process
A sample questionnaire provided to expert II
Results
RQ1: How effective is our framework in improving the quality of segmentation?
Framework
Choice                                             Percentage
Results obtained by our framework outperform       80%
Results obtained after expert tuning outperform    10%
Both results look almost the same                  10%
It depends on ….                                   0%
Results
RQ2: How many interactions does the framework need from users to reuse the ML solutions?
Framework
           Total number of interactions   Time spent   Simplicity (1: not simple, 2: maximally simple)
User 1     8                              6 minutes    6
User 2     10                             7 minutes    6
User 3     6                              5 minutes    7
Average    8                              6 minutes    6
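The Average row can be checked from the three user rows with a few lines of arithmetic. The sketch below assumes the averages were rounded to whole numbers, which the slide does not state; the variable and function names are illustrative.

```python
# (interactions, minutes spent, simplicity score) per user, copied from the table
users = {
    "User 1": (8, 6, 6),
    "User 2": (10, 7, 6),
    "User 3": (6, 5, 7),
}


def column_means(rows):
    """Mean of each column across all rows."""
    return tuple(sum(col) / len(col) for col in zip(*rows))


means = column_means(users.values())
rounded = tuple(round(m) for m in means)  # assumed rounding to whole numbers
print(means)
print(rounded)
```

The simplicity mean is 19/3, which is about 6.33 before rounding, so the reported value of 6 is consistent with rounding to the nearest integer.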
Threats to validity
• Only example data from one type of biological sample, recorded on one type of microscope, were used.
• Three users working in the same laboratory tested the framework by tuning segmentation on the object set.
• Only one expert visually assessed the quality of the segmentation results.
Summary and future work
• combination of autotuning and active learning
• object-type specific adjustment
• a few user interactions
Future work: improving the feature space representation of objects in an interactive fashion
