MEDICAL DATA MANAGEMENT:
COVID-19 DETECTION USING COUGH
RECORDINGS,
CHEST X-RAYS CLASSIFICATION AND
GENERATION
University of Milano-Bicocca
Master's Degree in Data Science
Digital Signal and Image Management
Academic Year 2022-2023
Authors:
Giorgio CARBONE, student ID 811974
Gianluca CAVALLARO, student ID 826049
Remo MARCONZINI, student ID 883256
PROCESSING OF
ONE-DIMENSIONAL
SIGNALS
Dataset:
 Crowdsource dataset
 Recordings collected between April 1st, 2020 and
December 1st, 2020
 34,434 recordings and their metadata
• One .json for each recording
• One .csv file containing all metadata
 Most relevant attributes
• uuid → Name of the recording
• cough_detected → Probability of being cough
sound
• status → Self-reported health condition
Attribute               Example / possible values
uuid                    00039425-7f3a-42aa-ac13-834aaa2b6b92
document                2020-04-13T21:30:59.801831+00:00
cough_detected          0.9754
age                     [0, …, 99, NaN]
gender                  [Male, Female, NaN]
respiratory_condition   [True, False, NaN]
fever_muscle_pain       [True, False, NaN]
status                  [Healthy, Symptomatic, COVID-19, NaN]
Data Cleaning
 Removing rows with unknown status
 Filter for recordings with cough_detected > 0.8
• Value recommended by the authors
 Number of recordings after cleaning: 12119
 Recordings distribution:
• Healthy: 9631
• Symptomatic: 2622
• COVID-19: 634
 The dataset is imbalanced
Class         N° recordings
Healthy       9167
Symptomatic   2339
COVID-19      613
Total         12119
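The two cleaning rules above can be sketched in a few lines. This is a minimal illustration using only the standard library; the sample rows are invented, and the helper name `clean_metadata` is hypothetical rather than taken from the project.

```python
import csv
import io

# Hypothetical excerpt of the metadata .csv (values invented for illustration).
SAMPLE_CSV = """uuid,cough_detected,status
a1,0.9754,Healthy
a2,0.4100,COVID-19
a3,0.9100,
a4,0.8800,COVID-19
a5,0.8000,Symptomatic
"""

def clean_metadata(rows, threshold=0.8):
    """Keep rows with a known status and cough_detected strictly above threshold."""
    kept = []
    for row in rows:
        status = (row.get("status") or "").strip()
        if not status:
            continue  # unknown status: dropped
        if float(row["cough_detected"]) <= threshold:
            continue  # not confidently a cough: dropped
        kept.append(row)
    return kept

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_metadata(rows)
print([r["uuid"] for r in cleaned])
```

Only rows that pass both filters survive; a recording at exactly 0.8 is dropped because the slide states a strict inequality.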
Preprocessing
 Noise reduction
• Spectral gating using noisereduce
 Silence removal
• To maintain only relevant audio patterns
• Silence > 1s is removed
• 0.5s of silence maintained at the beginning
and the end of the recording
 Length standardization
• Need for fixed dimensions of the audio
features
• Trade-off between information loss and amount
of sparse values
Duration   N° recordings
< 2s       1439
< 3s       3461
< 4s       5826
< 5s       7892
< 6s       9468
< 7s       10680
< 8s       11470
< 9s       11941
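Length standardization can be sketched as zero-padding or truncating each signal to a fixed duration. The 6-second target is the value used later in the slides (it covers 9468 of 12119 recordings, about 78%); the sampling rate of 22050 Hz and the function name `fix_duration` are assumptions made for this example.

```python
import numpy as np

def fix_duration(signal, sample_rate, target_seconds=6.0):
    """Zero-pad (at the end) or truncate a 1-D signal to exactly target_seconds."""
    target_len = int(round(target_seconds * sample_rate))
    if len(signal) >= target_len:
        return signal[:target_len]          # longer recordings lose their tail
    return np.pad(signal, (0, target_len - len(signal)))  # shorter ones gain zeros

SAMPLE_RATE = 22050  # assumption, not stated in the slides
short = np.ones(2 * SAMPLE_RATE)   # 2 s recording
long_ = np.ones(9 * SAMPLE_RATE)   # 9 s recording
print(fix_duration(short, SAMPLE_RATE).shape, fix_duration(long_, SAMPLE_RATE).shape)
```

A longer target keeps more recordings intact but adds more zero padding to the short ones, which is the trade-off the slide refers to.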
Noise reduction
[Figure: original recording vs. recording after noise reduction]
Silence removal
[Figure: recording after noise reduction vs. recording after silence removal]
Class imbalance problem
 Binary classification problem
• COVID-19 Positive vs. COVID-19 Negative
• 613 recordings vs. 11506 recordings
 Data augmentation to deal with class imbalance
• Generation of synthetic audio tracks
belonging to the minority class
 Data augmentation on raw signal
• Time Stretch
• Pitch Shift
• Shift
• Gain
Class         N° recordings   Binary label   N° recordings
Healthy       9167            Negative       11506 (Healthy + Symptomatic)
Symptomatic   2339            Negative
COVID-19      613             Positive       613
Total         12119
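Two of the four raw-signal augmentations (Shift and Gain) are simple enough to sketch with NumPy alone. The parameter ranges are not given in the slides, so the values below are placeholders; Time Stretch and Pitch Shift need resampling and are usually taken from an audio library (for example `librosa.effects.time_stretch` and `librosa.effects.pitch_shift`).

```python
import numpy as np

def apply_gain(signal, gain_db):
    """Scale the amplitude by a gain expressed in decibels."""
    return signal * (10.0 ** (gain_db / 20.0))

def apply_shift(signal, fraction):
    """Circularly shift the signal by a fraction of its length."""
    return np.roll(signal, int(round(fraction * len(signal))))

def augment(signal, rng):
    """Random gain and shift; the ranges are illustrative assumptions."""
    out = apply_gain(signal, rng.uniform(-6.0, 6.0))
    return apply_shift(out, rng.uniform(-0.2, 0.2))

rng = np.random.default_rng(0)
track = np.sin(np.linspace(0.0, 20.0, 1000))
augmented = augment(track, rng)
```

Each positive recording can be passed through `augment` several times to produce distinct synthetic positives.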
Data augmentation
[Figure: preprocessed track vs. augmented track]
Feature extraction
 Cough sounds contain more energy in lower
frequencies
 MFCCs are a suitable representation for
cough recordings
• 15 MFCCs per frame
 Audio samples have a duration of 6 seconds
• MFCC matrices 15x259
 Also MFCC-∆ and MFCC-∆∆ were considered
• Features dimension 3x15x259
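The 259 frames can be reproduced under the assumption that the default framing of a typical MFCC implementation was used (22050 Hz sampling rate, hop length 512, centred frames); the slides do not state these values. The sketch also shows how the 3x15x259 feature stack maps to the 259x15x3 network input. The MFCC values are random stand-ins, and `np.gradient` is only a simple finite-difference substitute for the real delta computation.

```python
import numpy as np

SAMPLE_RATE = 22050   # assumption: common default sampling rate
HOP_LENGTH = 512      # assumption: common default hop length
DURATION_S = 6        # from the slides
N_MFCC = 15           # from the slides

def n_frames(n_samples, hop_length):
    """Frame count for centred framing: one frame per hop, plus the first one."""
    return 1 + n_samples // hop_length

frames = n_frames(DURATION_S * SAMPLE_RATE, HOP_LENGTH)

# Stand-in MFCC matrix (random values), used only to illustrate the shapes.
mfcc = np.random.default_rng(0).normal(size=(N_MFCC, frames))
delta = np.gradient(mfcc, axis=1)      # finite-difference stand-in for MFCC-delta
delta2 = np.gradient(delta, axis=1)    # stand-in for MFCC-delta-delta
features = np.stack([mfcc, delta, delta2])     # channels first
network_input = features.transpose(2, 1, 0)    # frames x coefficients x channels
print(frames, features.shape, network_input.shape)
```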
Network architecture
 Convolution layer, 64 filters, kernel size
3x3, ReLU activation function, input shape
259x15x3
 Max pooling layer, pool size 2x2
 Convolution layer, 32 filters, kernel size
2x2, ReLU activation function
 Batch normalization layer
 Flatten layer
 Fully connected layer, 256 units, ReLU
activation function
 Dropout layer, rate 0.5
 Fully connected layer, 128 units, ReLU
activation function
 Dropout layer, rate 0.3
 Output layer, 1 neuron, Sigmoid activation
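To see how large the flattened tensor becomes, the layer list can be traced by hand. The sketch assumes no padding ("valid" convolutions) and a pooling stride equal to the pool size; the slides do not specify either, so the resulting numbers are illustrative.

```python
def conv_out(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1

h, w = 259, 15                                # input 259x15x3
h, w = conv_out(h, 3), conv_out(w, 3)         # Conv 3x3, 64 filters
after_conv1 = (h, w, 64)
h, w = conv_out(h, 2, 2), conv_out(w, 2, 2)   # MaxPool 2x2
after_pool = (h, w, 64)
h, w = conv_out(h, 2), conv_out(w, 2)         # Conv 2x2, 32 filters
after_conv2 = (h, w, 32)

flat = h * w * 32                             # Flatten
dense1_params = flat * 256 + 256              # weights + biases of the 256-unit layer
print(after_conv1, after_pool, after_conv2, flat, dense1_params)
```

Under these assumptions almost all parameters sit in the first fully connected layer, which is where the dropout rate of 0.5 is applied.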
Training & Results
 Standard procedure with augmentation only on
training set:
• Balanced training set (positive:negative =
1:3)
• Unbalanced validation and test set
 Very poor results on the validation and test sets
 The model fails to recognize actual positive
recordings
Metric      Val    Test
Loss        3.80   3.81
Accuracy    0.91   0.89
Precision   0.07   0.04
Recall      0.07   0.05
AUC         0.48   0.52
Confusion matrix on test set
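High accuracy combined with near-zero precision and recall is typical of a heavily imbalanced test set. The sketch below computes the three metrics from a confusion matrix; the counts are hypothetical, chosen only to mimic the pattern, and are not the project's confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical test set: 1000 negatives, 50 positives.
accuracy, precision, recall = binary_metrics(tp=3, fp=60, fn=47, tn=940)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Because negatives dominate, a model that labels almost everything as negative still scores high accuracy, so precision, recall and AUC are the informative figures here.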
Training & Results
 Procedure followed in various papers:
• Data augmentation on full dataset, before
splitting
 Much better performance
 Questions:
• Is the classifier recognizing the positives
or the augmented audio?
• Is this approach reliable in evaluating real
audio?
Metric      Val    Test
Loss        0.42   0.41
Accuracy    0.94   0.94
Precision   0.96   0.95
Recall      0.79   0.81
AUC         0.91   0.92
Confusion matrix on test set
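The questions above point at a possible leakage effect: if augmentation happens before the split, variants of the same source recording can land in both training and test sets. The following sketch, with made-up recording IDs and hypothetical helper names, counts how many source recordings are shared between the two sets under each procedure.

```python
import random

def augment_ids(source_ids, copies):
    """Return (source_id, variant) pairs: the original plus `copies` variants."""
    return [(sid, v) for sid in source_ids for v in range(copies + 1)]

def split(items, test_fraction, seed=0):
    """Shuffle and split into (train, test)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_fraction)
    return items[n_test:], items[:n_test]

def leaked_sources(train, test):
    """Source recordings that appear on both sides of the split."""
    return {s for s, _ in train} & {s for s, _ in test}

positives = range(100)  # hypothetical positive recording IDs

# Procedure A: augment the full dataset, then split.
train_a, test_a = split(augment_ids(positives, copies=3), 0.2)

# Procedure B: split first, then augment the training portion only.
train_src, test_src = split(positives, 0.2)
train_b = augment_ids(train_src, copies=3)
test_b = [(s, 0) for s in test_src]

print(len(leaked_sources(train_a, test_a)), len(leaked_sources(train_b, test_b)))
```

With procedure A the test set mostly contains variants of recordings the model has already seen, which may explain part of the improvement; procedure B shares no source recording between the sets.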
PROCESSING OF
BI-DIMENSIONAL
SIGNALS
Dataset: COVIDx CXR-3
 Created by the COVID-Net team
 8 different data sources
 Last release: 06/02/2022
 2 different datasets:
 Training Set
 Test Set
 3 classes: COVID-19, Pneumonia, Normal
 Two .txt files (train, test) containing metadata
• Patient ID
• File name
• Class
• Data Source
Field         Example
Patient ID    101
filename      pneumocystis-jirovecii-pneumonia-3-1.jpg
class         pneumonia
Data source   cohen
Data exploration
 Training set: 29,404 CXR images:
 COVID-19: 15,774 images
 Normal (no pathology): 8,085 images
 Pneumonia: 5,545 images
 Test set: 400 CXR images:
 COVID-19: 200 images
 Normal (no pathology): 100 images
 Pneumonia: 100 images
 The dataset is imbalanced
[Figures: training set class distribution; test set class distribution]
Images Exploration
[Figures: example CXR images for the «Normal», «COVID-19» and «Pneumonia» classes]
 Images are 1024x1024 pixels with 3 channels:
 Only Posterior-Anterior (PA) CXR
 Many images contain:
 Noise
 Undesirable parts
 Preliminary operations:
 Resized to 112x112x3
 Reduced computational cost
 Data split
 Data normalization
Image Pre-Processing
 Image Enhancement:
 Techniques used to improve the information
interpretability in images
 For radiologists and automated systems
 Pre-Processing
 Removal of textual information commonly
embedded in CXR images
Noisy CXR-image
Common textual items
Improved Adaptive Gamma Correction
 Adaptive Gamma Correction tool
 AGC (Adaptive Gamma Correction) is a tool
for image contrast enhancement
 AGC relates the gamma parameter with the
cumulative distribution function (CDF) of
the pixel gray levels
 good for most dimmed images, but fails for
globally bright images
 Improved Adaptive Gamma Correction
 new AGC algorithm
 enhance bright images with the use of
negative images
 enhance dimmed images with the use of gamma
correction modulated by truncated CDF
Flowchart of Improved AGC tool
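The two branches of the improved AGC can be sketched as follows. This is a simplified illustration of the idea (gamma taken from the intensity CDF, truncated from below, with bright images processed as negatives), not the exact algorithm of [14]: the weighting of the histogram is omitted, and the brightness test and the values of `tau` and `bright_threshold` are assumptions.

```python
import numpy as np

def agc(gray, tau=0.5):
    """Per-level gamma correction driven by the intensity CDF (simplified)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    cdf = np.cumsum(hist) / hist.sum()
    gamma = np.maximum(tau, 1.0 - cdf)       # truncation keeps gamma >= tau
    levels = np.arange(256) / 255.0
    lut = np.round(255.0 * levels ** gamma).astype(np.uint8)
    return lut[gray]                          # apply the look-up table

def improved_agc(gray, bright_threshold=0.5):
    """Bright images are enhanced through their negative; dim ones directly."""
    if gray.mean() / 255.0 > bright_threshold:
        return 255 - agc(255 - gray)
    return agc(gray)

rng = np.random.default_rng(0)
dim = rng.integers(0, 80, size=(32, 32), dtype=np.uint8)
bright = rng.integers(180, 256, size=(32, 32), dtype=np.uint8)
print(dim.mean(), improved_agc(dim).mean(), bright.mean(), improved_agc(bright).mean())
```

Since gamma never exceeds 1, the dim branch can only raise intensities, while the negative-image branch can only lower them.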
Improved Adaptive Gamma Correction
[Figure: no AGC applied; AGC applied (too bright image); AGC applied (too dim image)]
Pre-Processing:
 The CXR images were cropped
 Top 8% of the image removed
 Removes commonly embedded textual information
 Central crop
 To centre the cropped image
Some pre-processing examples
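The two cropping steps can be sketched with array slicing. The 8% figure comes from the slide; interpreting the central crop as a centred square crop is an assumption, as are the function names.

```python
import numpy as np

def crop_top(image, fraction=0.08):
    """Remove the top `fraction` of the rows, where text is often embedded."""
    rows = int(round(image.shape[0] * fraction))
    return image[rows:]

def center_crop_square(image):
    """Largest centred square crop (assumed meaning of 'central crop')."""
    h, w = image.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return image[top:top + side, left:left + side]

cxr = np.zeros((1024, 1024, 3), dtype=np.uint8)   # placeholder image
cropped = center_crop_square(crop_top(cxr))
print(cropped.shape)
```

The cropped image would then be resized to the network input size as in the preliminary operations.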
Class imbalance problem
 Different techniques explored to handle
unbalanced classes
 Under-sampling of the dataset
 Rebalancing with respect to the least
populated class
 Class-weights
 Assigns higher weights to samples from
underrepresented classes
 Over-sampling of the dataset
 Data augmentation on minority classes
 Positional-based Data Augmentation
 GAN
Classes     Nr. images
COVID-19    15,774
Pneumonia   5,545
Normal      8,085
Total       29,404
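For the training-set counts above (which sum to 29,404), under-sampling targets and class weights can be derived directly. The weight formula used here, n_samples / (n_classes * class_count), is the common "balanced" heuristic; the slides do not state which weighting was actually applied.

```python
counts = {"COVID-19": 15774, "Pneumonia": 5545, "Normal": 8085}
total = sum(counts.values())

# Under-sampling: every class is reduced to the size of the least populated one.
target = min(counts.values())
undersampled = {name: target for name in counts}

# Class weights: inversely proportional to class frequency ("balanced" heuristic).
weights = {name: total / (len(counts) * n) for name, n in counts.items()}

for name in counts:
    print(f"{name:10s} count={counts[name]:6d} weight={weights[name]:.3f}")
```

Under-sampling discards most COVID-19 images, whereas class weights keep all data and instead make errors on Pneumonia and Normal cost more.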
Data Augmentation
 Data augmentation was adopted to balance the
classes, in two settings:
 After under-sampling (applied to all
classes)
 To enlarge the minority classes (not
applied to the most populated class)
 The following types of augmentation were
used:
 Translation (± 10% in x and y directions)
 Rotation (± 10)
 Horizontal flip, zoom (± 15%)
 Intensity shift (± 10%)
Some augmentation examples
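Three of the listed transformations (horizontal flip, translation, intensity shift) can be sketched without an image library. Rotation and zoom are left out because they need interpolation. Treating the intensity shift as an additive offset on [0, 1] pixels and filling translated borders with zeros are assumptions of this example.

```python
import numpy as np

def translate(image, dy, dx):
    """Shift by (dy, dx) pixels, filling uncovered areas with zeros."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def random_augment(image, rng):
    """image: float array in [0, 1] with shape (H, W, C)."""
    h, w = image.shape[:2]
    out = image[:, ::-1] if rng.random() < 0.5 else image     # horizontal flip
    max_dy, max_dx = h // 10, w // 10                         # up to 10% shift
    out = translate(out,
                    int(rng.integers(-max_dy, max_dy + 1)),
                    int(rng.integers(-max_dx, max_dx + 1)))
    return np.clip(out + rng.uniform(-0.1, 0.1), 0.0, 1.0)    # intensity shift

rng = np.random.default_rng(0)
image = rng.random((112, 112, 3))
augmented = random_augment(image, rng)
```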
CNN: Network Architecture
 Input layer (112x112x3)
 2 convolutional blocks, with:
 Convolutional layers
 Batch Normalization layers
 ReLU
 2 convolutional blocks with: Convolutional layer, ReLU
 2 Max Pooling layers
 2 Dropout layers (rate 0.2)
 Output of feature extractor is passed to Flatten layer
 Fully connected layer (128 neurons), ReLU
 Dropout layer (rate 0.5)
 Output layer, 3 neurons, Softmax activation function
Parameters        Value
Max epochs        50
Optimizer         Adam
Learning rate     0.0001 (fixed)
Batch size        32
Steps per epoch   1035
Params: 2,416,611 (trainable: 2,416,451; non-trainable: 160)
Overview
Over-Sampling with Positional Augmentation: Results
 The best results were obtained:
 without preprocessing
 with over-sampling of the minority classes
through positional augmentation
Confusion matrix on
test set
[Figure: results of the other configurations: under-sampling, class weights, AC-GAN augmentation, image enhancement, image processing]
Explainable AI: Class activation Heat-Map
 We developed an explainability algorithm based on the use of Gradient-weighted
Class Activation Mapping (Grad-CAM)
 It provides a visual output of the most interesting areas found by the proposed
CNN models
 Grad-CAM uses the gradients of any target concept, flowing into the final
convolutional layer to produce a coarse localization map highlighting the
important regions in the image for predicting the concept.
[Figures: COVID-19 CXR with activation map; Pneumonia CXR with activation map]
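The core Grad-CAM computation can be sketched with NumPy, assuming the feature maps of the final convolutional layer and the gradients of the class score with respect to them have already been extracted from the model. The array shapes and random inputs are placeholders.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """feature_maps, gradients: (H, W, K) arrays from the final conv layer."""
    weights = gradients.mean(axis=(0, 1))             # one importance weight per channel
    cam = (feature_maps * weights).sum(axis=-1)       # weighted sum over channels
    cam = np.maximum(cam, 0.0)                        # ReLU: keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam            # normalize to [0, 1]

rng = np.random.default_rng(0)
maps = rng.random((7, 7, 8))          # placeholder activations
grads = rng.normal(size=(7, 7, 8))    # placeholder gradients
heatmap = grad_cam(maps, grads)
print(heatmap.shape)
```

The coarse heat map is then upsampled to the input resolution and overlaid on the CXR image.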
SYNTHETIC CHEST X-RAY
IMAGES GENERATION USING
AC-GAN
Conditional Generation of Synthetic Chest X-Ray Images
 Objectives:
 Train an AC-GAN to synthesize chest X-ray
images
 Conditional generation of X-rays of healthy,
COVID-19 and pneumonia patients
 Data augmentation on the class-imbalanced
COVIDx dataset to improve classification
performance
 Dataset → COVIDx
 Simple image pre-processing → 112x112
resizing and [0,1] pixel scaling
 Data augmentation → shearing and zooming
[Figure: sample images for the Normal, COVID-19 and Pneumonia classes]
Auxiliary Classifier Generative Adversarial Network (AC-GAN)
 AC-GAN → extension of the GAN architecture
 The generator is class conditional as with
cGANs
 Input → randomly sampled 100-dimensional
noise vector and a class label
 Output → a conditionally generated
112x112x3 image
 The classes → coded by integers (0, 1, 2)
 The discriminator → comes with an auxiliary
classifier
 trained to reconstruct the input image
class label.
 Input → 112x112x3 image (real or
synthesised)
 Output → predicts its source (real/fake)
and class (0,1,2)
1. Two inputs:
1. random 100-dimensional noise vector
2. integer class label c (0, 1, 2)
2. Class label → embedding layer → dense layer → 7
× 7 × 1
3. Noise vector → dense layer → 7 × 7 × 1024
4. These two tensors are then concatenated → 7 × 7 ×
1025
5. Four transposed convolutional layers (kernel size
= 5, stride = 2) → 112 × 112 × 3
• The first three are paired with batch
normalization and a Rectified Linear Unit
(ReLU) activation
• Last one with tanh activation
6. Output: fake image with size 112 × 112 × 3
Generator
Layer                                                        Output shape
Noise vector (100) → Dense 7 * 7 * 1024 → ReLU → Reshape     7 x 7 x 1024
Class label (1) → Embedding (100) → Dense 7 * 7 → Reshape    7 x 7 x 1
Concatenate                                                  7 x 7 x 1025
5x5 Conv2DTranspose → Batch Normalization → ReLU             14 x 14 x 512
5x5 Conv2DTranspose → Batch Normalization → ReLU             28 x 28 x 256
5x5 Conv2DTranspose → Batch Normalization → ReLU             56 x 56 x 128
5x5 Conv2DTranspose → Tanh activation                        112 x 112 x 3 (fake image)
𝑁(𝜇 = 0, 𝜎 = 0.02) (as annotated in the diagram)
Params: 22,303,108 (trainable: 22,301,316; non-trainable: 1,792)
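The reported generator parameter counts can be cross-checked from the layer description. The sketch assumes a 3 x 100 embedding table, bias terms in every dense and transposed-convolution layer, and batch normalization with two trainable and two non-trainable values per channel; these details are not stated in the slides, but under them the arithmetic reproduces the reported totals.

```python
def dense(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_transpose(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel transposed convolution."""
    return kernel * kernel * c_in * c_out + c_out

def batch_norm(channels):
    """Scale and offset (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

layers = [
    3 * 100,                          # embedding: 3 classes x 100 dimensions (assumed)
    dense(100, 7 * 7),                # label branch  -> 7 x 7 x 1
    dense(100, 7 * 7 * 1024),         # noise branch  -> 7 x 7 x 1024
    conv_transpose(5, 1025, 512), batch_norm(512),
    conv_transpose(5, 512, 256), batch_norm(256),
    conv_transpose(5, 256, 128), batch_norm(128),
    conv_transpose(5, 128, 3),        # final layer, tanh, no batch normalization
]
total = sum(layers)
non_trainable = 2 * (512 + 256 + 128)
trainable = total - non_trainable
print(total, trainable, non_trainable)
```

Most of the weight budget sits in the first transposed convolution, because it maps 1025 input channels to 512 output channels with a 5x5 kernel.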
Discriminator
1. Input: 112 × 112 × 3 image → dataset (real) or
synthetic (fake)
2. Four blocks:
 Sequence of: convolutional layer, batch
normalization layer, LeakyReLU activation
(slope = 0.2) and dropout layer (p = 0.5).
 Image size: 112 × 112 × 3 → 7 × 7 × 512
3. The tensor is flattened → fed into two dense
layers
4. First dense layer + sigmoid activation
 Binary classifier → outputs a probability
indicating whether the image is from the
original dataset (as "real") or generated by
the generator (as "fake").
5. Second dense layer + softmax activation
 Multiclass classifier → outputs a 1D tensor of
probabilities of each class
Discriminator diagram
Input layer: real / fake image, 112 x 112 x 3
Each block: 3x3 Conv2D (stride 2) → Batch Normalization → LeakyReLU → Dropout
Feature map sizes: 112 x 112 x 3 → 56 x 56 x 64 → 28 x 28 x 128 → 14 x 14 x 256 → 7 x 7 x 512
Flatten: 25088
Source output: Dense 1 → Sigmoid activation (FAKE 0 / REAL 1)
Auxiliary output: Dense 3 → Softmax activation (COVID-19 0, NORMAL 1, PNEUMONIA 2)
Params: 1,672,900 (trainable: 1,670,916; non-trainable: 1,984)
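The spatial sizes and the two output heads can be illustrated with a few lines of NumPy. The sketch assumes "same" padding for the stride-2 convolutions (so each block halves the resolution) and uses random weights and features purely as placeholders; it is not a trained discriminator.

```python
import numpy as np

def strided_size(size, stride=2):
    """Output size of a 'same'-padded convolution: ceil(size / stride)."""
    return -(-size // stride)

sizes = [112]
for _ in range(4):                       # four stride-2 blocks
    sizes.append(strided_size(sizes[-1]))
flat = sizes[-1] * sizes[-1] * 512       # flattened 7 x 7 x 512 tensor

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
features = rng.normal(size=flat)                     # placeholder flattened features
w_source = rng.normal(scale=0.01, size=flat)         # Dense 1 (source head)
w_class = rng.normal(scale=0.01, size=(flat, 3))     # Dense 3 (auxiliary head)

p_real = sigmoid(features @ w_source)    # probability that the image is real
p_class = softmax(features @ w_class)    # probabilities for classes 0, 1, 2
print(sizes, flat, p_real, p_class)
```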
Training and regularization
 Adam optimizer → both the generator and the
discriminator
 Two loss functions, one for each output layer of the
discriminator
 First output layer → binary cross-entropy loss
(source loss 𝑳𝒔)
 Second output layer → sparse categorical cross
entropy (auxiliary classifier loss 𝑳𝒄)
 Minimize the overall loss 𝑳 = 𝑳𝒔 + 𝑳𝒄 → during the
generator training as well as the discriminator
training
 Label flipping (generator training) → all the generated
fake (0) images are passed to the discriminator
labelled as real (1)
 Label smoothing (discriminator training) → applied to
the binary vectors describing the origin of the image
(0 = fake, 1 = real) as a regularization method
Parameters        Value
Max epochs        388
Optimizer         Adam
Learning rate     0.0002 (fixed)
Adam β1           0.5 (fixed)
Batch size        64
Steps per epoch   460
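The overall loss and the label-smoothing step can be sketched in NumPy. The smoothing amount of 0.1, the smoothing formula (targets pulled towards 0.5) and the example predictions are assumptions for illustration; the slides give only the loss types.

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    """Source loss: real (1) vs. fake (0)."""
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

def sparse_categorical_cross_entropy(labels, probs):
    """Auxiliary classifier loss with integer class labels."""
    probs = np.clip(probs, 1e-7, 1.0)
    return float(np.mean(-np.log(probs[np.arange(len(labels)), labels])))

def smooth_labels(y, amount=0.1):
    """Pull hard 0/1 targets towards 0.5 (assumed smoothing scheme)."""
    return y * (1.0 - amount) + 0.5 * amount

source_true = np.array([1.0, 1.0, 0.0, 0.0])      # two real, two fake images
source_pred = np.array([0.8, 0.6, 0.3, 0.1])      # made-up discriminator outputs
class_true = np.array([0, 2, 1, 0])
class_pred = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.2, 0.7],
                       [0.3, 0.5, 0.2],
                       [0.6, 0.3, 0.1]])

loss_s = binary_cross_entropy(smooth_labels(source_true), source_pred)
loss_c = sparse_categorical_cross_entropy(class_true, class_pred)
total_loss = loss_s + loss_c
print(loss_s, loss_c, total_loss)
```

During generator training the same two terms are used, but the fake images are presented with source label 1 (label flipping).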
[Figures: training and testing curves of the auxiliary loss 𝑳𝒄, source loss 𝑳𝒔 and total loss 𝑳 for discriminator and generator; discriminator accuracy (overall, real, fake)]
Choosing the best AC-GAN model weights for data augmentation
1. First set of models selection based on:
 ↑ visual quality: qualitative evaluation of the
sample images generated during each epoch
 ↓ generator losses
 ↓ discriminator accuracy in correctly
classifying fake images as fake.
2. Trained a classifier on synthetic images only →
evaluated the classification accuracy on real
COVIDx images
 epoch 288 → best model
3. Generated Images Quality Evaluation
 ↓ FID, ↓ Intra-FID and ↑ Inception Score (IS) →
InceptionV3
4. 2D t-SNE embedding visualization of generated and
real images
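As an illustration of the FID metric in step 3, the sketch below computes the Fréchet distance between two sets of feature vectors. In practice the features would come from an InceptionV3 network; here small random vectors stand in for them. The trace of the matrix square root is obtained from eigenvalues, which is one possible implementation choice rather than the project's code.

```python
import numpy as np

def fid(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr(sqrt(cov_a cov_b)) equals the sum of square roots of the product's eigenvalues.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(np.sum((mu_a - mu_b) ** 2)
                 + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))     # stand-in for real-image features
same = fid(real_feats, real_feats)
shifted = fid(real_feats, real_feats + 2.0)
print(same, shifted)
```

Intra-FID applies the same computation separately to each class and averages the per-class values, so it also penalizes images generated with the wrong class appearance.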
Evaluation
Metric                                  Value
Generator loss 𝑳                        0.44
Discriminator accuracy (fake images)    0.13
Qualitative appearance                  Realistic
CNN accuracy (on real images)           0.63
[Figures: t-SNE of real images; t-SNE of synthetic images]
Metric        Our AC-GAN         Paper AC-GAN [6]
IS ↑          2.71 (± 1.70)      2.51 (± 0.12)
FID ↓         123.26 (± 0.02)    50.67 (± 8.13)
Intra FID ↓   136 (± 0.02)
Real and synthetic chest X-ray samples
[Figure: real vs. fake images for the Normal, Pneumonia and COVID-19 classes]
Bibliography
1. Fakhry, A., Jiang, X., Xiao, J., Chaudhari, G., Han, A., & Khanzada, A. (2021). Virufy: A
multi-branch deep learning network for automated detection of COVID-19.
2. Hamdi, S., Oussalah, M., Moussaoui, A., & Saidi, M. (2022). Attention-based hybrid CNN-LSTM and
spectral data augmentation for COVID-19 diagnosis from cough sound. Journal of Intelligent
Information Systems, 59(2), 367-389.
3. Mahanta, S. K., Kaushik, D., Van Truong, H., Jain, S., & Guha, K. (2021, December). Covid-19
diagnosis from cough acoustics using convnets and data augmentation. In 2021 First
International Conference on Advances in Computing and Future Communication Technologies
(ICACFCT) (pp. 33-38). IEEE.
4. COUGHVID: A cough based COVID-19 fast screening project. https://c4science.ch/diffusion/10770/
5. Orlandic, L., Teijeiro, T., & Atienza, D. (2021). The COUGHVID crowdsourcing dataset, a corpus
for the study of large-scale cough analysis algorithms. Scientific Data, 8(1), 156.
6. Odena, A., Olah, C., & Shlens, J. (2017). Conditional Image Synthesis With Auxiliary Classifier
GANs (arXiv:1610.09585). arXiv. https://doi.org/10.48550/arXiv.1610.09585
7. Christi Florence, C. (2021). Detection of Pneumonia in Chest X-Ray Images Using Deep Transfer
Learning and Data Augmentation With Auxiliary Classifier Generative Adversarial Network. 14.
8. Karbhari, Y., Basu, A., Geem, Z. W., Han, G.-T., & Sarkar, R. (2021). Generation of Synthetic
Chest X-ray Images and Detection of COVID-19: A Deep Learning Based Approach. Diagnostics,
11(5), Article 5. https://doi.org/10.3390/diagnostics11050895
9. DeVries, T., Romero, A., Pineda, L., Taylor, G. W., & Drozdzal, M. (2019). On the Evaluation of
Conditional GANs (arXiv:1907.08175). arXiv. https://doi.org/10.48550/arXiv.1907.08175
10. Borji, A. (2018). Pros and Cons of GAN Evaluation Measures (arXiv:1802.03446). arXiv.
https://doi.org/10.48550/arXiv.1802.03446
11. Goel S, Kipp A, Goel N, et al. (November 22, 2022) COVID-19 vs. Influenza: A Chest X-ray
Comparison. Cureus 14(11): e31794. doi:10.7759/cureus.31794
12. Kim, S.-H.; Wi, Y.M.; Lim, S.; Han, K.-T.; Bae, I.-G. Differences in Clinical Characteristics
and Chest Images between Coronavirus Disease 2019 and Influenza-Associated Pneumonia.
Diagnostics 2021, 11, 261. https://doi.org/10.3390/diagnostics11020261
13. Wang, L., Lin, Z.Q. & Wong, A. COVID-Net: a tailored deep convolutional neural network design
for detection of COVID-19 cases from chest X-ray images. Sci Rep 10, 19549 (2020).
https://doi.org/10.1038/s41598-020-76550-z
14. Gang Cao, Lihui Huang, Huawei Tian, Xianglin Huang, Yongbin Wang, Ruicong Zhi, Contrast
enhancement of brightness-distorted images by improved adaptive gamma correction, Computers &
Electrical Engineering, Volume 66, 2018, Pages 569-582, ISSN 0045-7906,
https://doi.org/10.1016/j.compeleceng.2017.09.012
15. Ait Nasser, A.; Akhloufi, M.A. A Review of Recent Advances in Deep Learning Models for Chest
Disease Detection Using Radiography. Diagnostics 2023, 13, 159.
https://doi.org/10.3390/diagnostics13010159
16. Huang, W., Song, G., Li, M., Hu, W., Xie, K. (2013). Adaptive Weight Optimization for
Classification of Imbalanced Data. In: Sun, C., Fang, F., Zhou, ZH., Yang, W., Liu, ZY. (eds)
Intelligence Science and Big Data Engineering. IScIDE 2013. Lecture Notes in Computer Science,
vol 8261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42057-3_69
17. Elshennawy, N.M.; Ibrahim, D.M. Deep-Pneumonia Framework Using Deep Learning Models Based on
Chest X-Ray Images. Diagnostics 2020, 10, 649. https://doi.org/10.3390/diagnostics10090649
18. Chetoui, M.; Akhloufi, M.A.; Yousefi, B.; Bouattane, E.M. Explainable COVID-19 Detection on
Chest X-rays Using an End-to-End Deep Convolutional Neural Network Architecture. Big Data Cogn.
Auxiliary Loss 𝑳𝒄
Source Loss 𝑳𝒔 Total Loss 𝑳
Training
Testing
Generator
Discriminator

More Related Content

PDF
Математика 6 клас ІІ семестр
PPT
презентація
PPT
Презентація:Квадратний корінь з числа. Арифметичний квадратний корінь.
PPTX
Математика. 1 класс. Урок 79. Сравнение. Сложение и вычитание величин
PPTX
квадратні рівняння
PPT
Rotation in 3d Space: Euler Angles, Quaternions, Marix Descriptions
PPT
розв'язування вправ на всі дії зі звичайними та десятковими дробами
PDF
Задачі на розрізання
Математика 6 клас ІІ семестр
презентація
Презентація:Квадратний корінь з числа. Арифметичний квадратний корінь.
Математика. 1 класс. Урок 79. Сравнение. Сложение и вычитание величин
квадратні рівняння
Rotation in 3d Space: Euler Angles, Quaternions, Marix Descriptions
розв'язування вправ на всі дії зі звичайними та десятковими дробами
Задачі на розрізання

Similar to Medical data management: COVID-19 detection using cough recordings, chest X-rays classification and generation (20)

PDF
CXR-ACGAN: Auxiliary Classifier GAN for Conditional Generation of Chest X-Ray...
PDF
Covid Detection Using Lung X-ray Images
PPTX
Covid 19 detection using Transfer learning.pptx
PDF
Covid-19 Detection Using CNN MODEL
PPTX
harsh final ppt (2).pptx
PDF
Discriminating the Pneumonia-Positive Images from.pdf
PDF
covid 19 detection using x ray based on neural network
PPTX
lung disease detection using deep learning
PDF
X-Ray Based Quick Covid-19 Detection Using Raspberry-pi
PPTX
COVID-19 Disease Detection using Chest X-Ray Images.pptx
PPTX
BbbbbbbbbbbbbbbbbbbbbbElectrothonPPT.pptx
PDF
Qt7355g8v8
PDF
A deep learning approach for COVID-19 and pneumonia detection from chest X-r...
PDF
AI-Powered Detection of COVID-19 and Pneumonia: A Machine Learning Approach
PPTX
DETECTING COVID.pptx
PDF
Study and Analysis of Different CNN Architectures by Detecting Covid- 19 and ...
PDF
Realistic image synthesis of COVID-19 chest X-rays using depthwise boundary ...
PDF
Automatic COVID-19 lung images classification system based on convolution ne...
PDF
Predicting Covid-19 pneumonia Severity on Chest x-ray with deep learning
PPTX
covid 19 detection using lung x-rays.pptx.pptx
CXR-ACGAN: Auxiliary Classifier GAN for Conditional Generation of Chest X-Ray...
Covid Detection Using Lung X-ray Images
Covid 19 detection using Transfer learning.pptx
Covid-19 Detection Using CNN MODEL
harsh final ppt (2).pptx
Discriminating the Pneumonia-Positive Images from.pdf
covid 19 detection using x ray based on neural network
lung disease detection using deep learning
X-Ray Based Quick Covid-19 Detection Using Raspberry-pi
COVID-19 Disease Detection using Chest X-Ray Images.pptx
BbbbbbbbbbbbbbbbbbbbbbElectrothonPPT.pptx
Qt7355g8v8
A deep learning approach for COVID-19 and pneumonia detection from chest X-r...
AI-Powered Detection of COVID-19 and Pneumonia: A Machine Learning Approach
DETECTING COVID.pptx
Study and Analysis of Different CNN Architectures by Detecting Covid- 19 and ...
Realistic image synthesis of COVID-19 chest X-rays using depthwise boundary ...
Automatic COVID-19 lung images classification system based on convolution ne...
Predicting Covid-19 pneumonia Severity on Chest x-ray with deep learning
covid 19 detection using lung x-rays.pptx.pptx
Ad

Recently uploaded (20)

PDF
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
PPTX
Business Acumen Training GuidePresentation.pptx
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
PDF
Foundation of Data Science unit number two notes
PPTX
Introduction to Knowledge Engineering Part 1
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
PPT
Chapter 3 METAL JOINING.pptnnnnnnnnnnnnn
PDF
Lecture1 pattern recognition............
PPTX
Database Infoormation System (DBIS).pptx
PPTX
Bharatiya Antariksh Hackathon 2025 Idea Submission PPT.pptx
PPTX
Introduction to Basics of Ethical Hacking and Penetration Testing -Unit No. 1...
PDF
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
PPTX
Introduction to machine learning and Linear Models
PPTX
Data_Analytics_and_PowerBI_Presentation.pptx
PPT
Quality review (1)_presentation of this 21
PPTX
Business Ppt On Nestle.pptx huunnnhhgfvu
PPTX
Moving the Public Sector (Government) to a Digital Adoption
PPTX
Computer network topology notes for revision
PPTX
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
PPTX
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
Business Acumen Training GuidePresentation.pptx
STUDY DESIGN details- Lt Col Maksud (21).pptx
Foundation of Data Science unit number two notes
Introduction to Knowledge Engineering Part 1
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
Chapter 3 METAL JOINING.pptnnnnnnnnnnnnn
Lecture1 pattern recognition............
Database Infoormation System (DBIS).pptx
Bharatiya Antariksh Hackathon 2025 Idea Submission PPT.pptx
Introduction to Basics of Ethical Hacking and Penetration Testing -Unit No. 1...
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
Introduction to machine learning and Linear Models
Data_Analytics_and_PowerBI_Presentation.pptx
Quality review (1)_presentation of this 21
Business Ppt On Nestle.pptx huunnnhhgfvu
Moving the Public Sector (Government) to a Digital Adoption
Computer network topology notes for revision
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
Ad

Medical data management: COVID-19 detection using cough recordings, chest X-rays classification and generation

  • 1. MEDICAL DATA MANAGEMENT: COVID-19 DETECTION USING COUGH RECORDINGS, CHEST X-RAYS CLASSIFICATION AND GENERATION University of Milano-Bicocca Master's Degree in Data Science Digital Signal and Image Management Academic Year 2022-2023 Authors: Giorgio CARBONE matricola n. 811974 Gianluca CAVALLARO matricola n. 826049 Remo MARCONZINI matricola n. 883256
  • 3. Dataset:  Crowdsource dataset  Recordings collected between April 1st, 2020 and December 1st, 2020  34,434 recordings and their metadata • One .json for each recording • One .csv file containing all metadata  Most relevant attributes • uuid → Name of the recording • cough_detected → Probability of being cough sound • status → Self-reported health condition uuid 00039425-7f3a-42aa-ac13- 834aaa2b6b92 document 2020-04- 13T21:30:59.801831+00:00 cough_detected 0.9754 age [0, …, 99, NaN] gender [Male, Female, NaN] respiratory_condition [True, False, NaN] fever_muscle_pain [True, False, NaN] status [Healthy, Symptomatic, COVID-19, NaN]
  • 4. Data Cleaning  Removing rows with unknown status  Filter for recordings with cough_detected > 0.8 • Value recommended by the authors  Number of recordings after cleaning: 12119  Recordings distribution: • Healthy: 9631 • Symptomatic: 2622 • COVID-19: 634  The dataset is imbalanced N° recordings Healthy 9167 Symptomatic 2339 COVID-19 613 Total 12119
  • 5. Preprocessing  Noise reduction • Spectral gating using noisereduce  Silence removal • To maintain only relevant audio patterns • Silence > 1s is removed • 0.5s of silence maintained at the beginning and the end of the recording  Length standardization • Need for a fixed dimensions of the audio features • Trade-off between information loss and amount of sparse values Duration N° recordings < 2s 1439 <3s 3461 < 4s 5826 < 5s 7892 < 6s 9468 < 7s 10680 < 8s 11470 < 9s 11941
  • 8. Class imbalance problem  Binary classification problem • COVID-19 Positive vs. COVID-19 Negative • 613 recordings vs. 11506 recordings  Data augmentation to deal with class imbalance • Generation of synthetic audio tracks belonging to the minority class  Data augmentation on raw signal • Time Stretch • Pitch Shift • Shift • Gain N° recordings Healthy Negative 9167 11506 Symptomatic 2339 COVID-19 Positive 613 613 Total 12119
  • 10. Feature extraction  Cough sounds contain more energy in lower frequencies  MFCCs are a suitable representation for cough recordings • 15 MFCCs per frame  Audio samples have a duration of 6 seconds • MFCC matrices 15x259  Also MFCC-∆ and MFCC-∆∆ were considered • Features dimension 3x15x259
  • 11. Network architecture  Convolution layer, 64 filters, kernel size 3x3, ReLU activation function, input shape 259x15x3  Max pooling layer, pool size 2x2  Convolution layer, 32 filters, kernel size 2x2, ReLU activation function  Batch normalization layer  Flatten layer  Fully connected layer, 256 units, ReLU activation function  Dropout layer, rate 0.5  Fully connected layer, 128 units, ReLU activation function  Dropout layer, rate 0.3  Output layer, 1 neuron, Sigmoid activation
  • 12. Training & Results  Standard procedure with augmentation only on training set: • Balanced training set (positive:negative = 1:3) • Unbalanced validation and test set  Terrible results for validation and test set  The model don’t recognize actual positive recordings Loss Accuracy Precision Val Test Val Test Val Test 3.80 3.81 0.91 0.89 0.07 0.04 Recall AUC Val Test Val Test 0.07 0.05 0.48 0.52 Confusion matrix on test set
  • 13. Training & Results  Procedure followed in various papers: • Data augmentation on full dataset, before splitting  Much better performances  Questions: • Is the classifier recognizing the positives or the augmented audio? • Is this approach reliable in evaluating real audio? Loss Accuracy Precision Val Test Val Test Val Test 0.42 0.41 0.94 0.94 0.96 0.95 Recall AUC Val Test Val Test 0.79 0.81 0.91 0.92 Confusion matrix on test set
  • 15. Dataset: COVIDx CXR-3  Create by COVID-NET team  8 different data sources  Last release: 06/02/2022  2 different datasets:  Training Set  Test Set  3 classes: COVID-19, Pneumonia, Normal  Two .txt file (train, test) containing metadata • Patient ID • File name • Class • Data Source Patient ID 101 filename pneumocystis-jirovecii- pneumonia-3-1.jpg class pneumonia Data source cohen
  • 16. Data exploration  Training set: 29.404 CXR images:  COVID-19: 15.774 images  Normal (no pathology) : 8.085 images  Pneumonia: 5.545 images  Test set: 400 CXR images:  COVID-19: 200 images  Normal (no pathology) : 100 images  Pneumonia: 100 images  The dataset is imbalanced Training Set Distribution Test Set distribution
  • 17. Images Exploration CXR «Normal» CXR «COVID-19» CXR «Pneumonia»  Images are 1024x1024 pixels with 3 channel:  Only Posterior-Anterior (PA) CXR  Many images contain:  Noise  Undesirable parts  Preliminary operations:  Resized to 112x122x3  Reduced computational cost  Data Splitted  Data Normalization
  • 18. Image Pre-Processing  Image Enhancement:  Techniques used to improve the information interpretability in images  For radiologists and automated systems  Pre-Processing  Removal of textual information commonly embedded in CXR images Noisy CXR-image Common textual items
  • 19. Improved Adaptive Gamma Correction  Adaptive Gamma Correction tool  AGC (Adaptive Gamma Correction) is a tool for image contrast  AGC relates the gamma parameter with the cumulative distribution function (CDF) of the pixel gray levels  good for most dimmed images, but fails for globally bright images  Improved Adaptive Gamma Correction  new AGC algorithm  enhance bright images with the use of negative images  enhance dimmed images with the use of gamma correction modulated by truncated CDF Flowchart of Improved AGC tool
  • 20. Improved Adaptive Gamma Correction No ACG applied ACG applied (too bright) ACG applied (too dim)
  • 21. Pre-Processing:  The chest CXR images were cropped  top 8% of the image  Commonly embedded textual information  Central crop  To Centre the cropped image Some pre-processing examples
  • 22. Class imbalance problem  Different techniques explored to handle unbalanced classes  Under-sampling of the dataset  Rebalancing with respect to the least populated class  Class-weights  Assigns higher weights to samples from underrepresented classes  Over-sampling of the dataset  Data augmentation on minority classes  Positional-based Data Augmentation  GAN Classes Nr. images COVID-19 15.774 Pneumonia 5.545 Normal 8.085 Total 29.904
  • 23. Data Augmentation  A data augmentation technique was adopted to balance the classes, in particular was:  Implemented after under-sampling (performing it on all classes)  Implemented to increase minority classes (not performing it on the most populated class)  Data augmentation was exploited with the following types of augmentation:  Translation (± 10% in x and y directions)  Rotation (± 10)  Horizontal flip, zoom (± 15%)  Intensity shift (± 10%) Some augmentation examples
  • 24. CNN: Network Architecture  Input layer (112x112x3)  2 convolutional blocks, with:  Convolutional layers  Batch Normalization layers  ReLu  2 convolutional blocks with: Convolutional layer, ReLu  2 Max Pooling layers  2 Dropout layers (rate 0,2)  Output of feature extractor is passed to Flatten layer  Fully connected layer (128 neurons), ReLu  Dropout layer (rate 0,5)  Output layer, 3 neurons, Softmax activation function Parameters Value Max Epoch 50 Optimizer Adam Learning rate 0.0001 (fixed) Batch Size 32 Step per epoch 1035 Params: 2,416,611 Trainable: 2,416,451 Non-trainable: 160
  • 26. Over-Sampling wPositional Augmentation Results  The solution that produced the best results turned out to be the one:  without preprocessing  and Over-Sampling of minority classes with positional augmentation Confusion matrix on test set
  • 29. Explainable AI: Class activation Heat-Map  We developed an explainability algorithm based on the use of Gradient-weighted Class Activation Mapping (Grad-CAM)  It provides a visual output of the most interesting areas found by the proposed CNN models  Grad-CAM uses the gradients of any target concept, flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. COVID-19 CXR, Activation Map Pneumonia CXR, Activation Map
  • 30. SYNTHETIC CHEST X-RAY IMAGES GENERATION USING AC-GAN
  • 31. Conditional Generation of Synthetic Chest X-Ray Images  Objectives:  Train an AC-GAN to synthesize chest x-rays images  Conditional generation of healthy, covid- 19 and pneumonia patients x-rays  Data augmentation on the class-imbalanced COVIDx dataset to improve classification performances  Dataset → COVIDx  Simple image pre-processing →112𝑥112 resizing and [0,1] pixel scaling  Data augmentation → shearing and zooming Normal COVID-19 Pneumonia
  • 32. Auxiliary Classifier Generative Adversarial Network (AC-GAN)
      • AC-GAN → extension of the GAN architecture
      • The generator is class-conditional, as in cGANs
          • Input → randomly sampled 100-dimensional noise vector and a class label
          • Output → conditionally generated 112x112x3 image
          • The classes are coded as integers (0, 1, 2)
      • The discriminator comes with an auxiliary classifier, trained to reconstruct the class label of the input image
          • Input → 112x112x3 image (real or synthesized)
          • Output → predicted source (real/fake) and class (0, 1, 2)
  • 33. Generator
      1. Two inputs:
          1. random 100-dimensional noise vector
          2. integer class label c (0, 1, 2)
      2. Class label → embedding layer (100) → dense layer (7 * 7) → reshaped to 7 × 7 × 1
      3. Noise vector → dense layer (7 * 7 * 1024) with ReLU → reshaped to 7 × 7 × 1024
      4. The two tensors are concatenated → 7 × 7 × 1025
      5. Four 5 × 5 transposed convolutional layers (stride = 2): 7 × 7 × 1025 → 14 × 14 × 512 → 28 × 28 × 256 → 56 × 56 × 128 → 112 × 112 × 3
          • The first three are each followed by batch normalization and a Rectified Linear Unit (ReLU) activation
          • The last one uses a tanh activation
      6. Output: fake image of size 112 × 112 × 3
      Diagram annotation: 𝑁(𝜇 = 0, 𝜎 = 0.02)
      Params: 22,303,108 (trainable: 22,301,316; non-trainable: 1,792)
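As a sanity check on the generator description, the output size and parameter counts can be worked out by hand. The sketch below is plain Python arithmetic, not the authors' model code; it assumes a 3 × 100 embedding table, bias terms in every dense and transposed convolutional layer, "same" padding (so each stride-2 layer doubles the spatial size), and batch normalization with two trainable and two non-trainable values per channel. Under those assumptions it reproduces the totals reported on the slide.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_transpose_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel transposed convolution."""
    return kernel * kernel * c_in * c_out + c_out

trainable = 3 * 100                              # embedding: 3 classes x 100 dims (assumed)
trainable += dense_params(100, 7 * 7)            # label branch -> 7 x 7 x 1
trainable += dense_params(100, 7 * 7 * 1024)     # noise branch -> 7 x 7 x 1024
non_trainable = 0

channels = [1025, 512, 256, 128, 3]              # after concatenation, then each layer
size = 7
for i, (c_in, c_out) in enumerate(zip(channels, channels[1:])):
    trainable += conv_transpose_params(5, c_in, c_out)
    size *= 2                                    # stride 2 with "same" padding (assumed)
    if i < 3:                                    # batch norm after the first three layers
        trainable += 2 * c_out                   # gamma and beta
        non_trainable += 2 * c_out               # moving mean and variance

total = trainable + non_trainable
print(size, total, trainable, non_trainable)
```

The four doublings take the 7 × 7 tensor to 112 × 112, and the 1,792 non-trainable parameters come entirely from the moving statistics of the three batch normalization layers (512 + 256 + 128 channels).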
  • 34. Discriminator
      1. Input: 112 × 112 × 3 image → from the dataset (real) or synthetic (fake)
      2. Four blocks:
          • Sequence of: 3 × 3 convolutional layer (stride 2), batch normalization layer, LeakyReLU activation (slope = 0.2) and dropout layer (p = 0.5)
          • Image size: 112 × 112 × 3 → 56 × 56 × 64 → 28 × 28 × 128 → 14 × 14 × 256 → 7 × 7 × 512
      3. The tensor is flattened (25088 values) → fed into two dense layers
      4. First dense layer (1 unit) + sigmoid activation
          • Binary classifier (source) → outputs the probability that the image comes from the original dataset ("real", 1) rather than from the generator ("fake", 0)
      5. Second dense layer (3 units) + softmax activation
          • Multiclass classifier (auxiliary) → outputs a 1D tensor with the probability of each class (COVID-19 = 0, NORMAL = 1, PNEUMONIA = 2)
      Params: 1,672,900 (trainable: 1,670,916; non-trainable: 1,984)
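The shape progression through the four blocks and the size of the two output heads can be checked with a short calculation. This is a hedged sketch in plain Python: it assumes "same" padding (so a stride-2 convolution yields ceil(size / 2)) and takes the channel widths from the diagram. It covers only the shapes and the two dense heads and does not attempt to reproduce the total parameter count.

```python
import math

def conv_same_out(size, stride=2):
    """Spatial output size of a strided convolution with 'same' padding (assumed)."""
    return math.ceil(size / stride)

size = 112
shapes = []
for channels in (64, 128, 256, 512):             # channel widths read off the diagram
    size = conv_same_out(size)
    shapes.append((size, size, channels))

flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
source_head = flat * 1 + 1                       # dense(1) + sigmoid: real / fake
class_head = flat * 3 + 3                        # dense(3) + softmax: class label
print(shapes, flat, source_head, class_head)
```

Both heads read the same flattened 7 × 7 × 512 tensor, which is what lets the auxiliary classifier share all convolutional features with the real/fake output.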
  • 35. Training and regularization
      • Adam optimizer → used for both the generator and the discriminator
      • Two loss functions, one for each output layer of the discriminator:
          • First output layer → binary cross-entropy (source loss 𝑳𝒔)
          • Second output layer → sparse categorical cross-entropy (auxiliary classifier loss 𝑳𝒄)
      • The overall loss 𝑳 = 𝑳𝒔 + 𝑳𝒄 is minimized during both the generator training and the discriminator training
      • Label flipping (generator training) → all generated (fake, 0) images are passed to the discriminator labelled as real (1)
      • Label smoothing (discriminator training) → applied to the binary source labels (0 = fake, 1 = real) as a regularization method
      Parameters         Value
      Max epochs         388
      Optimizer          Adam
      Learning rate      0.0002 (fixed)
      Adam 𝜷𝟏            0.5 (fixed)
      Batch size         64
      Steps per epoch    460
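The combined objective 𝑳 = 𝑳𝒔 + 𝑳𝒄 can be illustrated numerically. The sketch below is a NumPy illustration with made-up discriminator outputs, not the training loop used for the project; the function names and the smoothing amount of 0.1 are assumptions, since the slide does not state how strongly the source labels were smoothed.

```python
import numpy as np

def binary_cross_entropy(y_true, p, eps=1e-7):
    """Source loss L_s: real/fake targets against sigmoid outputs."""
    p = np.clip(p, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def sparse_categorical_cross_entropy(labels, probs, eps=1e-7):
    """Auxiliary loss L_c: integer class labels against softmax outputs."""
    picked = np.clip(probs[np.arange(len(labels)), labels], eps, 1.0)
    return float(np.mean(-np.log(picked)))

def smooth(y, amount=0.1):
    """Label smoothing on binary source labels (amount is an assumed value)."""
    return y * (1 - amount) + 0.5 * amount

def ac_gan_loss(source_true, source_pred, class_true, class_pred):
    """Overall loss L = L_s + L_c."""
    return (binary_cross_entropy(source_true, source_pred)
            + sparse_categorical_cross_entropy(class_true, class_pred))

# Generator step with label flipping: fake images are presented as real (1)
flipped_targets = np.ones(2)
source_pred = np.array([0.5, 0.5])               # discriminator is undecided
class_true = np.array([0, 2])
class_pred = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
generator_loss = ac_gan_loss(flipped_targets, source_pred, class_true, class_pred)

# Discriminator step: smoothed source labels (0 = fake, 1 = real)
smoothed = smooth(np.array([0.0, 1.0]))
```

With an undecided source output and perfect class predictions, the generator loss reduces to the source term alone, ln 2 ≈ 0.693, which shows how the two terms contribute independently to the total.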
  • 36. Figure: training and testing curves of the auxiliary loss 𝑳𝒄, source loss 𝑳𝒔 and total loss 𝑳 for the discriminator and the generator, and discriminator accuracy (overall, real, fake)
  • 37. Choosing the best AC-GAN model weights for data augmentation
      1. First selection of candidate models based on:
          • ↑ visual quality (qualitative evaluation of the sample images generated at each epoch)
          • ↓ generator loss
          • ↓ discriminator accuracy in classifying fake images as fake
      2. A classifier was trained on synthetic images only and its classification accuracy was evaluated on real COVIDx images
          • epoch 288 → best model
      3. Quality evaluation of the generated images
          • ↓ FID, ↓ Intra-FID and ↑ Inception Score (IS) → computed with InceptionV3
      4. 2D t-SNE embedding visualization of generated and real images
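For reference, FID compares the mean and covariance of feature vectors extracted from real and generated images: FID = ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)). The sketch below is a minimal NumPy version under stated assumptions: the features would in practice come from InceptionV3, as on the slide, but random arrays stand in for them here; the trace of the matrix square root is computed from the eigenvalues of Σ_r Σ_g; and `intra_fid` assumes the usual definition of Intra-FID as the per-class FID averaged over classes.

```python
import numpy as np

def fid(feats_real, feats_fake):
    """Frechet distance between two sets of feature vectors (rows = samples)."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) = sum of square roots of the eigenvalues of cov_r @ cov_f
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(np.sum((mu_r - mu_f) ** 2)
                 + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)

def intra_fid(feats_real, labels_real, feats_fake, labels_fake):
    """Average of the per-class FID values (assumed definition of Intra-FID)."""
    classes = np.unique(labels_real)
    return float(np.mean([fid(feats_real[labels_real == c],
                              feats_fake[labels_fake == c]) for c in classes]))

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))                 # stand-in for InceptionV3 features
labels = rng.integers(0, 3, size=500)
fake = real + 3.0                                # same spread, shifted mean

same_score = fid(real, real)
shifted_score = fid(real, fake)
per_class_score = intra_fid(real, labels, fake, labels)
```

Identical feature sets give a distance of zero, while shifting every one of the 4 feature dimensions by 3 leaves the covariance unchanged and yields 4 × 3² = 36, so lower values indicate generated images whose features sit closer to the real ones.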
  • 38. Evaluation
      Evaluation metric                        Value
      Generator loss 𝑳                         0.44
      Discriminator accuracy (fake images)     0.13
      Qualitative appearance                   Realistic
      CNN accuracy (on real images)            0.63

      Metric         Our AC-GAN          Paper AC-GAN [6]
      IS ↑           2.71 (± 1.70)       2.51 (± 0.12)
      FID ↓          123.26 (± 0.02)     50.67 (± 8.13)
      Intra-FID ↓    136 (± 0.02)        -

      Figure: 2D t-SNE embeddings of real and synthetic images
  • 39. Real and synthetic chest X-ray samples
      Figure: real (top row) and fake (bottom row) images for the Normal, Pneumonia and COVID-19 classes
  • 40. Bibliography
      1. Fakhry, A., Jiang, X., Xiao, J., Chaudhari, G., Han, A., & Khanzada, A. (2021). Virufy: A multi-branch deep learning network for automated detection of COVID-19.
      2. Hamdi, S., Oussalah, M., Moussaoui, A., & Saidi, M. (2022). Attention-based hybrid CNN-LSTM and spectral data augmentation for COVID-19 diagnosis from cough sound. Journal of Intelligent Information Systems, 59(2), 367-389.
      3. Mahanta, S. K., Kaushik, D., Van Truong, H., Jain, S., & Guha, K. (2021, December). Covid-19 diagnosis from cough acoustics using convnets and data augmentation. In 2021 First International Conference on Advances in Computing and Future Communication Technologies (ICACFCT) (pp. 33-38). IEEE.
      4. COUGHVID: A cough based COVID-19 fast screening project. https://c4science.ch/diffusion/10770/
      5. Orlandic, L., Teijeiro, T., & Atienza, D. (2021). The COUGHVID crowdsourcing dataset, a corpus for the study of large-scale cough analysis algorithms. Scientific Data, 8(1), 156.
      6. Odena, A., Olah, C., & Shlens, J. (2017). Conditional Image Synthesis With Auxiliary Classifier GANs (arXiv:1610.09585). arXiv. https://doi.org/10.48550/arXiv.1610.09585
      7. Christi Florence, C. (2021). Detection of Pneumonia in Chest X-Ray Images Using Deep Transfer Learning and Data Augmentation With Auxiliary Classifier Generative Adversarial Network. 14.
  • 41. Bibliography
      8. Karbhari, Y., Basu, A., Geem, Z. W., Han, G.-T., & Sarkar, R. (2021). Generation of Synthetic Chest X-ray Images and Detection of COVID-19: A Deep Learning Based Approach. Diagnostics, 11(5), Article 5. https://doi.org/10.3390/diagnostics11050895
      9. DeVries, T., Romero, A., Pineda, L., Taylor, G. W., & Drozdzal, M. (2019). On the Evaluation of Conditional GANs (arXiv:1907.08175). arXiv. https://doi.org/10.48550/arXiv.1907.08175
      10. Borji, A. (2018). Pros and Cons of GAN Evaluation Measures (arXiv:1802.03446). arXiv. https://doi.org/10.48550/arXiv.1802.03446
      11. Goel, S., Kipp, A., Goel, N., et al. (November 22, 2022). COVID-19 vs. Influenza: A Chest X-ray Comparison. Cureus 14(11): e31794. doi:10.7759/cureus.31794
      12. Kim, S.-H.; Wi, Y.M.; Lim, S.; Han, K.-T.; Bae, I.-G. Differences in Clinical Characteristics and Chest Images between Coronavirus Disease 2019 and Influenza-Associated Pneumonia. Diagnostics 2021, 11, 261. https://doi.org/10.3390/diagnostics11020261
  • 42. Bibliography
      13. Wang, L., Lin, Z.Q. & Wong, A. COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci Rep 10, 19549 (2020). https://doi.org/10.1038/s41598-020-76550-z
      14. Gang Cao, Lihui Huang, Huawei Tian, Xianglin Huang, Yongbin Wang, Ruicong Zhi, Contrast enhancement of brightness-distorted images by improved adaptive gamma correction, Computers & Electrical Engineering, Volume 66, 2018, Pages 569-582, ISSN 0045-7906, https://doi.org/10.1016/j.compeleceng.2017.09.012
      15. Ait Nasser, A.; Akhloufi, M.A. A Review of Recent Advances in Deep Learning Models for Chest Disease Detection Using Radiography. Diagnostics 2023, 13, 159. https://doi.org/10.3390/diagnostics13010159
      16. Huang, W., Song, G., Li, M., Hu, W., Xie, K. (2013). Adaptive Weight Optimization for Classification of Imbalanced Data. In: Sun, C., Fang, F., Zhou, ZH., Yang, W., Liu, ZY. (eds) Intelligence Science and Big Data Engineering. IScIDE 2013. Lecture Notes in Computer Science, vol 8261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42057-3_69
      17. Elshennawy, N.M.; Ibrahim, D.M. Deep-Pneumonia Framework Using Deep Learning Models Based on Chest X-Ray Images. Diagnostics 2020, 10, 649. https://doi.org/10.3390/diagnostics10090649
      18. Chetoui, M.; Akhloufi, M.A.; Yousefi, B.; Bouattane, E.M. Explainable COVID-19 Detection on Chest X-rays Using an End-to-End Deep Convolutional Neural Network Architecture. Big Data Cogn.
  • 43. Figure: training and testing curves of the auxiliary loss 𝑳𝒄, source loss 𝑳𝒔 and total loss 𝑳 for the generator and the discriminator