Introduction to GluonCV

© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark

Why GluonCV?
• What is the biggest challenge you have ever encountered with deep
learning?
• “Reproducing the best claimed results from the latest papers” (SOTA: state-of-the-art)

Real-world Stories #1
• Back in 2016, ImageNet models trained with MXNet were on average 1% less accurate than the same models trained with Torch.
• We tried almost everything to debug it, and even developed a plugin that runs Torch code inside MXNet to make the results easier to compare.
=> Transcoding the training images at JPEG quality 95 rather than 85 solved the problem.
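
The fix in this story can be sketched as a small re-encoding helper. This is an illustration only, assuming the Pillow library is installed; the helper name `reencode_jpeg` and its path arguments are hypothetical and are not the script used in the story.

```python
def reencode_jpeg(src_path, dst_path, quality=95):
    """Re-save an image as JPEG at the given quality (hypothetical helper)."""
    # Pillow is imported lazily so this sketch loads even where it is not installed.
    from PIL import Image

    with Image.open(src_path) as img:
        # JPEG has no alpha channel, so convert to RGB before saving.
        img.convert("RGB").save(dst_path, format="JPEG", quality=quality)
```

Calling `reencode_jpeg("in.jpg", "out.jpg", quality=85)` on the same input would give the lower-quality variant for comparison.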

Real-world Stories #2
• With another open-source DL framework, the accuracy of the trained models could not match that of the previous internal version.
• We spent months trying to figure out why, with no clue.
=> The order of the data augmentation steps differed from the previous version.
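
Why augmentation order matters can be seen in a toy sketch. The story does not say which augmentations were involved, so the crop and flip below, and the one-dimensional "image", are assumptions chosen only to show that the two orders produce different training inputs.

```python
def left_crop(pixels, width):
    """Keep the first `width` pixels (a fixed, non-centered crop)."""
    return pixels[:width]


def horizontal_flip(pixels):
    """Reverse the pixel order."""
    return pixels[::-1]


image = [0, 1, 2, 3, 4, 5]

# Same two operations, applied in opposite orders.
crop_then_flip = horizontal_flip(left_crop(image, 4))
flip_then_crop = left_crop(horizontal_flip(image), 4)

print(crop_then_flip)  # [3, 2, 1, 0]
print(flip_then_crop)  # [5, 4, 3, 2]
```

The two pipelines see different parts of the image, so a model trained with one order is not trained on the same data as a model trained with the other.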

Common myth 1
• I will write clean and reusable code when I’m prototyping this time.
• Variant: I will write clean and reusable code next time.

Common myth 2
• My code will still run next year.
• Sometimes, it’s not our fault.

Common myth 3
• I will finish setting up the
baseline model this afternoon.
• Though it may not be our fault
again.

Starting from scratch can be hard
• Even the most talented researchers will get blocked by trivial things.
• Experience and instincts can be your enemies in certain circumstances.
• Training is time-consuming, initialization and augmentation are randomized, and many implementation details need to be taken care of.
=> Debugging deep learning models is extremely difficult.
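
One common way to make randomized steps comparable between runs is to fix the random seed. The sketch below uses only Python's standard `random` module; the function name `init_weights` and the Gaussian scale are hypothetical stand-ins for a framework's initializer, not GluonCV code.

```python
import random


def init_weights(n, seed=None):
    """Draw n small Gaussian 'weights'; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.gauss(0.0, 0.01) for _ in range(n)]


run_a = init_weights(4, seed=42)
run_b = init_weights(4, seed=42)
run_c = init_weights(4)  # unseeded: values differ from run to run

print(run_a == run_b)  # True
```

With the seed fixed, two runs start from identical values, so any remaining difference in results has to come from somewhere else.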

It’s not easy to embrace open-source implementations
• The quality of open-source implementations often varies.
• Languages, code styles, project structures, and DL frameworks are mixed.
• Personal projects tend to focus on a specific task with specific datasets. Adapting them to your use case requires significant engineering effort.
• Community projects are frequently abandoned.

What does GluonCV provide?
• Reproduction of important papers in recent years
• Model zoo with 80+ pre-trained models
• Training scripts (as well as tuned hyper-parameters) to
reproduce the results
• Full training script + Dataset download script
• Logs of training runs
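
Loading a model from the model zoo can be sketched as follows, assuming `mxnet` and `gluoncv` are installed and that the pre-trained weights can be downloaded on first use. The wrapper name `load_pretrained` is hypothetical, and `resnet50_v1b` is just one example model name.

```python
def load_pretrained(name="resnet50_v1b"):
    """Return a pre-trained network from the GluonCV model zoo."""
    # Imported lazily so this sketch loads even where GluonCV is not installed.
    from gluoncv import model_zoo

    # pretrained=True fetches the published weights for the named model.
    return model_zoo.get_model(name, pretrained=True)
```

A call such as `net = load_pretrained()` would return a network ready for inference or fine-tuning.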

What does GluonCV provide?
• Thoughtfully designed APIs and modules that are easy to follow and understand
• Avoid re-writing the same utilities again and again
• Pre-set data augmentation and transforms, visualization and training utilities
• Community support: feel free to ask and discuss
• User forum
• GitHub community and open roadmap

Image Classification
• More than 50 pre-trained ImageNet models (ResNet, MobileNet, …)
• With some of the most popular models (e.g., ResNet), we achieved the best accuracy compared with other frameworks
• Used as backbones in many downstream tasks => better accuracy

Semantic Segmentation
• FCN
• PSPNet
• Mask R-CNN
• DeepLab

Object Detection
• SSD and YOLOv3: fastest solutions
• Faster R-CNN, R-FCN and FPN: slower but more accurate, especially for tiny objects
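
Running one of these detectors on a single image might look like the sketch below. It assumes `mxnet` and `gluoncv` are installed; the wrapper name `detect` is hypothetical, and `ssd_512_resnet50_v1_voc` is one example of a pre-trained SSD model.

```python
def detect(image_path, model_name="ssd_512_resnet50_v1_voc"):
    """Run a pre-trained GluonCV detector on one image file."""
    # Imported lazily so this sketch loads even where GluonCV is not installed.
    from gluoncv import data, model_zoo

    net = model_zoo.get_model(model_name, pretrained=True)
    # load_test resizes and normalizes the image; it returns the network
    # input tensor and the resized image for plotting.
    x, img = data.transforms.presets.ssd.load_test(image_path, short=512)
    class_ids, scores, boxes = net(x)
    return class_ids, scores, boxes, img
```

Swapping `model_name` for a Faster R-CNN entry would trade speed for accuracy, but that family uses its own preset transform rather than the SSD one shown here.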

Instance Segmentation
• Mask R-CNN

Key Point Estimation
• SimplePose

Style Transfer
• MSGNet

GANs
• CycleGAN
• SRGAN
• WGAN

Re-identification
• Market1501

Coming Soon: Depth Estimation

Like GluonCV? Go build!
https://gluon-cv.mxnet.io
https://github.com/dmlc/gluon-cv
