Semantic Segmentation on
Satellite Imagery
Rahul Bhojwani, Nina Domingo,
Benjamin Mayhew, Christy Tsz-En Wang
Kaggle: Can you train an eye in the sky?
Challenge: The Defence Science and
Technology Laboratory (DSTL) is seeking
novel solutions to alleviate the burden on
their image analysts and challenges
kagglers to accurately identify and classify
objects in overhead satellite imagery.
Introduction Data Methods Results
What’s in a picture?
How is this useful?
Medical imaging Agriculture Surveillance
Data
Input: 25 1km x 1km satellite images in both 3-band and 16-band formats
● Format: GeoTIFF
● Images are taken from the same region but coordinates are transformed so the location is obscured
Object class: every class is provided in the form of a Multipolygon
● Format: GeoJSON or WKT
Object Class Types
Buildings Crops
Misc. Manmade Structures Waterway
Roads Standing Water
Track Vehicle Large
Trees Vehicle Small
Data Processing of Labels
Match [0,1] coordinates to
pixel coordinates
Compute projection factors for
multipolygon
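A minimal sketch of the two steps above, assuming the convention described in the competition's data tutorial: each image comes with two grid values, here called `x_max` (positive) and `y_min` (negative), and the label coordinates are divided by them before being stretched to the image size. The function and parameter names are illustrative and are not taken from the slides.

```python
def scale_factors(width, height, x_max, y_min):
    """Return (x_scale, y_scale) mapping grid coordinates to pixel coordinates.

    Assumption: uses the W' = W * W / (W + 1) adjustment from the
    competition's data-processing tutorial, with x_max > 0 and y_min < 0.
    """
    w_adj = width * (width / (width + 1))
    h_adj = height * (height / (height + 1))
    return w_adj / x_max, h_adj / y_min


def to_pixel(x, y, x_scale, y_scale):
    """Map one grid coordinate to an integer (column, row) pixel position."""
    return int(round(x * x_scale)), int(round(y * y_scale))
```

Because `y_min` is negative, `y_scale` is negative too, so the negative label y-coordinates map to positive pixel rows.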
Data Processing of Labels
Multipolygons to shapely objects
Project geometry to pixel coordinates
Shapely objects to shapefiles to TIFF files
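The end result of this pipeline is a per-class pixel mask. The sketch below is not the shapely/shapefile/TIFF route named on the slide; it is a dependency-free stand-in that shows what turning a polygon into a mask means, using an even-odd point-in-polygon test on pixel centers. The function names and the list-of-vertices polygon format are assumptions made for illustration.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd rule test for a point against a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges crossed by a horizontal ray going right from the point.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def rasterize(polygons, width, height):
    """Return a height x width 0/1 mask; a pixel is set if its center lies in any polygon."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5
            if any(point_in_polygon(cx, cy, p) for p in polygons):
                mask[row][col] = 1
    return mask
```

This loops over every pixel for every polygon, so it is only meant to make the idea concrete; a real pipeline would rely on a rasterization library.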
Data Processing
Original image Object mask Superimposed image
Object Class Type Distribution
Average Number of Polygons Distribution
More Data Processing
25 ~3300x3300 images
● DIRECT SCALING: 25 512x512 images
● PARTITION: 25 3072x3072 images, then 900 512x512 images (36 tiles per image)
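The partition numbers can be checked with a short sketch: a 3072x3072 image splits into a 6x6 grid of 512x512 tiles, and 25 images give 900 tiles. The helper names below are illustrative, and images are represented as plain lists of rows to keep the example dependency-free.

```python
def tile_offsets(size, tile):
    """Top-left (row, col) offsets of non-overlapping tile x tile patches."""
    if size % tile != 0:
        raise ValueError("image size must be a multiple of the tile size")
    steps = range(0, size, tile)
    return [(r, c) for r in steps for c in steps]


def crop(image, row, col, tile):
    """Cut one tile out of an image stored as a list of rows."""
    return [line[col:col + tile] for line in image[row:row + tile]]


offsets = tile_offsets(3072, 512)
print(len(offsets))        # 36 tiles per image
print(len(offsets) * 25)   # 900 tiles for 25 images
```

The divisibility check reflects why the images are first resized to 3072x3072, a multiple of 512, before being partitioned.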
Methods - Semantic Segmentation with Deep Learning
Important deep learning models for semantic segmentation:
● Fully Convolutional Network [Nov 2014]
● U-Net [May 2015]
● SegNet [Nov 2015]
Methods - Semantic Segmentation with Deep Learning
VGG-16:
Methods - Semantic Segmentation with Deep Learning
Fully Convolutional Network:
● No fully connected layers
● Skip connections
● VGG-16 backbone
Methods - Semantic Segmentation with Deep Learning
U-Net:
Methods - Semantic Segmentation with Deep Learning
U-Net:
● Encoder-decoder network.
● Every decoding stage is convolved with trainable filters.
● Copies the encoder feature maps to the corresponding decoder stage.
● Data augmentation [stretching and rotation].
● Weighted cross entropy, which forces the network to learn the border pixels.
Methods - Encode/Contracting path
Goal:
● Retain context and
localization accuracy.
Operations:
● Convolution
● Non Linearity (ReLU)
● Pooling
● But skip the fully connected
layers
3x3 Convolution with
no padding, stride of 2
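The spatial size produced by a convolution or pooling layer follows from the standard formula floor((n + 2p - k) / s) + 1. The sketch below applies it to the caption's setting (3x3 kernel, no padding, stride 2); the input sizes are illustrative, and 572 is the input tile size used in the original U-Net paper, not a value from these slides.

```python
def conv_output_size(n_in, kernel, stride=1, padding=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n_in + 2 * padding - kernel) // stride + 1


print(conv_output_size(5, 3, stride=2))    # 2
print(conv_output_size(572, 3))            # 570: one unpadded 3x3 convolution
print(conv_output_size(568, 2, stride=2))  # 284: one 2x2 pooling step
```

The same formula covers pooling, which is why each unpadded convolution trims the border and each pooling step halves the feature map along the contracting path.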
Methods - Semantic Segmentation with Deep Learning
Segnet Architecture:
Methods - Decode/Expansive path
Goal:
● To recover the object details and
spatial dimension
Operation:
● “Up-convolution”/ “upsampling”
● Concatenate with the corresponding
cropped encoder feature maps
● Convolution layers
● ReLU
Methods - Semantic Segmentation with Deep Learning
SegNet:
● The encoder is the 13 convolutional layers of VGG-16.
● Uses trained weights from VGG-16 [excluding the fully connected layers].
● The decoder uses the pooling indices from the max pooling step of the corresponding encoder.
● The upsampled maps are convolved with trainable filters.
● Unlike U-Net, it does not copy the entire encoder feature maps.
● Dropping the fully connected layers reduces the encoder's parameters from 134M to 14.7M.
Methods - Semantic Segmentation with Deep Learning
Segnet Unpooling:
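A minimal sketch of the idea, assuming non-overlapping 2x2 max pooling: the encoder remembers where each maximum came from, and the decoder puts each value back at that position and leaves zeros elsewhere. The function names and the list-of-rows feature map are illustrative.

```python
def max_pool_with_indices(fmap, size=2):
    """Non-overlapping max pooling that also records where each maximum was."""
    pooled, indices = [], []
    for r in range(0, len(fmap), size):
        pooled_row, index_row = [], []
        for c in range(0, len(fmap[0]), size):
            window = [(fmap[r + i][c + j], (r + i, c + j))
                      for i in range(size) for j in range(size)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def unpool(pooled, indices, height, width):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out
```

The unpooled map is sparse, which is why the decoder follows it with trainable convolutions to produce dense feature maps.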
Methods - Semantic Segmentation with Deep Learning
FCN vs Segnet:
Training U-Net
Pixel-wise softmax + cross-entropy loss function
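A minimal sketch of this loss, written without any deep learning framework so each step is visible: a softmax over the class scores of every pixel, followed by the negative log-probability of the true class, averaged over pixels. The optional per-class weights are a simplification assumed here for illustration; the U-Net paper uses a per-pixel weight map instead.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixelwise_cross_entropy(logit_map, label_map, class_weights=None):
    """Mean (optionally weighted) cross entropy over all pixels.

    logit_map[r][c] is a list of class scores, label_map[r][c] the true class.
    class_weights is an optional per-class weight list (hypothetical values).
    """
    total, count = 0.0, 0
    for logit_row, label_row in zip(logit_map, label_map):
        for logits, label in zip(logit_row, label_row):
            weight = 1.0 if class_weights is None else class_weights[label]
            total += -weight * math.log(softmax(logits)[label])
            count += 1
    return total / count
```

With two classes and equal scores, every pixel has probability 0.5 for the true class, so the loss is ln 2; confident correct predictions drive it toward 0.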
Methods: How does upsampling work?
Transposed convolution (fractionally strided
convolution/deconvolution)
● Reconstructs the spatial resolution
● The weights are learnable
● It is NOT the reverse of the convolution process (it does not recover the original input)
Transposed convolution of a 2x2 input with no padding, a stride of 2 and a 3x3 kernel
Convolution as matrix multiplication
● 4 x 4 input, 3 x 3 kernel
● (4 x 16) (16 x 1) = (4 x 1)
Transposed convolution as
matrix multiplication
(16 x 4) (4 x 1) = (16 x 1)
● Dimension of input and output swap
● Uses transpose of convolution matrix
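The shapes on these slides can be reproduced with a small sketch: an unpadded, stride-1 convolution of a 4 x 4 input with a 3 x 3 kernel is written as a 4 x 16 matrix, and multiplying by its transpose maps 4 values back to 16. The kernel values and helper names are illustrative assumptions.

```python
def conv_matrix(kernel, n_in):
    """Build the (n_out^2 x n_in^2) matrix of a stride-1, unpadded convolution."""
    k = len(kernel)
    n_out = n_in - k + 1
    rows = []
    for out_r in range(n_out):
        for out_c in range(n_out):
            row = [0] * (n_in * n_in)
            for i in range(k):
                for j in range(k):
                    row[(out_r + i) * n_in + (out_c + j)] = kernel[i][j]
            rows.append(row)
    return rows


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]  # illustrative values
C = conv_matrix(kernel, 4)                     # 4 x 16
x = list(range(16))                            # flattened 4 x 4 input
y = matvec(C, x)                               # flattened 2 x 2 output
up = matvec(transpose(C), y)                   # 16 values, the shape of a 4 x 4 map
print(len(C), len(C[0]), len(y), len(up))      # 4 16 4 16
```

Here `up` has the shape of the original input but not its values, which illustrates that transposed convolution restores resolution without inverting the convolution.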
Preliminary results: partitioned images [900x512x512]
Epoch  Loss         Acc
1      0.2356       0.9587
2      0.1763       0.9587
3      ETA: ~1 day
4      NA           NA
5      NA           NA
6      NA           NA
7      NA           NA
8      NA           NA
9      NA           NA
10     NA           NA
Next Steps
Actual Next Steps:
▫ Include more classes as part of our training.
▫ Tune the hyperparameters of the model.
▫ Get SegNet to work.
Future Works:
▫ Explore more recently published models, e.g. DeepLab v3 [2018].
▫ Use more computing resources to run the models faster.
References:
▫ Ronneberger, O. (2017). Invited Talk: U-Net Convolutional Networks for Biomedical Image Segmentation.
Informatik Aktuell Bildverarbeitung Für Die Medizin 2017, 3-3. doi:10.1007/978-3-662-54345-0_3
▫ Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. 2015 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR). doi:10.1109/cvpr.2015.7298965
▫ Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). SegNet: A Deep Convolutional Encoder-Decoder
Architecture for Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12),
2481-2495. doi:10.1109/tpami.2016.264461
▫ https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d
▫ https://www.cs.toronto.edu/~frossard/post/vgg16/
▫ https://medium.com/@wilburdes/semantic-segmentation-using-fully-convolutional-neural-networks-86e45336f99b
▫ https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection
Questions?
Extras:
Methods: dilated/atrous convolutions
Goal:
● Remove the need for pooling layers
Operations:
● Insert predefined gaps between the input pixels that the kernel samples
● Replace pooling layer from pretrained
classification system with dilated
convolution
e.g. 2-dilated convolution
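A minimal 1-D sketch of the idea: with dilation d, neighbouring kernel weights read input samples that are d positions apart, so a kernel of size k covers k + (k - 1)(d - 1) input positions without any extra weights. The function names are illustrative.

```python
def effective_kernel_size(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def dilated_conv1d(signal, weights, dilation=1):
    """Unpadded 1-D dilated convolution (cross-correlation form)."""
    span = effective_kernel_size(len(weights), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(weights))
        for start in range(len(signal) - span + 1)
    ]


print(effective_kernel_size(3, 2))                         # 5
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], 2))    # [9, 12]
```

A 2-dilated 3-tap kernel therefore sees as much context as a 5-tap kernel, which is how dilation widens the receptive field without pooling.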
Kaggle: Evaluation
Average Jaccard Index between the predicted multipolygons and actual
multipolygons. The Jaccard Index for two regions is the ratio of the area of the
intersection to the area of the union.
Jaccard =TP/(TP + FP + FN) = |A∩B|/|A∪B| = |A∩B|/(|A|+|B|−|A∩B|)
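The formula can be sketched on binary pixel masks. Note the assumption: the competition scores polygon areas, whereas this example counts pixels, which is a common approximation during training. The function name and the handling of two empty masks are choices made for illustration.

```python
def jaccard(pred_mask, true_mask):
    """Jaccard index TP / (TP + FP + FN) for two binary masks (lists of rows)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    union = tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else tp / union
```

For a prediction `[[1, 1], [0, 0]]` against a ground truth `[[1, 0], [1, 0]]`, there is one true positive, one false positive and one false negative, so the index is 1/3.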

Editor's Notes

  • #3: In December 2016, Kaggle hosted a 3-month competition in which the UK’s...
  • #5: But why try to do this? Medical imaging: detect the location of a tumor. Agriculture: improve precision agriculture, identify plant disease. Surveillance: general surveillance purposes.
  • #6: For this specific challenge, we were provided with…. Multipolygon is a collection of polygons and these polygons represent objects in an image
  • #7: There are 10 types of object classes kagglers were challenged to identify...
  • #11: We also wanted to show you a video of what the different object masks look like when superimposed on the original image...
  • #12: We also did a quick analysis of our object class distribution...
  • #13: I also mentioned that our object masks are provided in the form of multipolygons… A multipolygon of trees is made of a lot of polygon trees, and to a lesser extent...
  • #14: Ben Why did we have to scale down to 3072x3072? (multiple of 512)
  • #20: Convolved feature (feature map), number of features we want to extract (depth, number of filters), stride, zero-padding
  • #22: - Deconvolution layers allow the model to use every point in the small image to “paint” a square in the larger one. - “Upsampling”: use a 2x2 convolution to halve the number of feature maps. This is one important modification in U-Net: we have a large number of feature channels, which allows the network to propagate context information to higher-resolution layers. (This is the reason we can have a higher-resolution output.) - White boxes represent copied feature maps from the contracting path. Why do this? To localize, so that the following layers can learn to assemble a more precise output based on this information.
  • #27: Basically it is a one-to-many relationship.