Vote Aggregation Techniques in
the Geo-Wiki Crowdsourcing Game:
A Case Study
Michael Khachay, Oleg Nurmukhametov
Artem Baklanov
Krasovskii Institute of Mathematics and Mechanics
Postdoctoral Research Scholar, ASA, IIASA
Krasovskii Institute of Mathematics and Mechanics
Steffen Fritz, Carl Salk, Linda See, and Dmitry
Shchepashchenko
IIASA
Young Scientists Summer Program
Since 1977, IIASA’s annual 3-month Young Scientists Summer Program (YSSP)
has offered opportunities to talented young researchers.
Crowdsourcing = Crowd + Outsourcing
Crowdsourcing is an approach to performing tasks
in which a worldwide distributed group of people
can, collectively, substitute for an expert.
Example:
Geo-Wiki project, ESM, IIASA
Geo-Wiki: Cropland Capture Game
5 million votes
170 000 satellite images
3000 volunteers
6 months
Goal: land cover map
www.geo-wiki.org
Outline of presentation
1. Preprocessing of images (noise reduction): blur detection, duplicate detection
2. Benchmarking of algorithms for vote aggregation (prediction)
Blur detection algorithm [Tong et al., 2004].
Input: image
Output: coefficient of blur in [0, 100]
Detection of blurry images
[Figure: sample images ordered by blur coefficient on a scale from 0% to 100%. High quality images: 0%, 10%, 30%. Blurry images: 80%, 90%, 95%, 98%, 99%.]
The table and figure are from
Tong, Hanghang, et al. "Blur detection for digital images using wavelet transform." 2004 IEEE International Conference on Multimedia and Expo (ICME '04), Vol. 1. IEEE, 2004.
Blur detection algorithm
A key idea
• A = the number of Roof-Structure and Gstep-Structure edge points that have lost their sharpness.
• B = the total number of Roof-Structure and Gstep-Structure edge points.
• Blur coefficient = 100*A/B
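The ratio above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the edge-point counts A and B are assumed to come from an edge detector that is not shown, and the inclusive 80% cut-off is an assumption based on the "80% - 100%" band used for extremely bad images in this campaign.

```python
def blur_coefficient(blurred_edge_points: int, edge_points: int) -> float:
    """Return 100 * A / B, the percentage of edge points that lost sharpness."""
    if edge_points <= 0:
        raise ValueError("B must be positive: no edge points were detected")
    if not 0 <= blurred_edge_points <= edge_points:
        raise ValueError("A must lie between 0 and B")
    return 100.0 * blurred_edge_points / edge_points


def is_extremely_blurry(coefficient: float, threshold: float = 80.0) -> bool:
    # Assumption: images at or above the threshold are treated as extremely bad.
    return coefficient >= threshold


if __name__ == "__main__":
    c = blur_coefficient(45, 50)
    print(c, is_extremely_blurry(c))
```

An image with 45 blurred edge points out of 50 gets a coefficient of 90 and would fall into the removed band under this assumed threshold.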
Detection of blurry images
Results
Current campaign:
Noise reduction in dataset.
+ for future campaigns:
Less workload, more joy.
Blur coefficient                     Number of images
0% (Excellent images)                74 000 (38%)
80% - 100% (Extremely bad images)    3 200 (2%)
Extremely bad images were removed.
Detection of duplicates
Levels of comparison: Web Link, Binary file, Image content
Image unique identifier    Link to image
1    http://cg.tuwien.ac.at/~sturn/crop/img_78.1875_52.9958_1000.jpg
2    http://cg.tuwien.ac.at/~sturn/crop/img_78.1875_52.9958_500.jpg
3    http://cg.tuwien.ac.at/~sturn/crop/img_78.1875_52.9958_300.jpg
…    …
Detection of duplicates
Find the 10 differences
Detection of duplicates
The pixel-by-pixel differences
Links: different
Binary files: different
Image content: similar
pHash (perceptual hash) [Zauner, 2010].
Detection of duplicates
pHash. High-level description
1. Resize to 32x32 and convert to grayscale;
2. Perform the DCT-2 transformation and keep the top-left 8x8 block;
3. Calculate the average value of the alternating (AC) components;
4. Set each of the 64 hash bits to 0 (or 1) depending on whether the corresponding DCT value is more (or less) than the average value.
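The four steps above can be sketched in pure Python as follows. This is a simplified illustration, not the pHash library itself: nearest-neighbour resizing, the unnormalised DCT-II, and all function names are assumptions made for the example. The bit polarity follows the slide (0 when a coefficient exceeds the average); the opposite convention gives the same Hamming distances.

```python
import math


def resize_gray(pixels, size=32):
    """Shrink a 2-D grayscale image (list of rows) to size x size.

    Assumption: nearest-neighbour sampling; real implementations smooth first.
    """
    h, w = len(pixels), len(pixels[0])
    return [[pixels[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def dct2_top_left(block, keep=8):
    """Top-left keep x keep coefficients of the unnormalised 2-D DCT-II."""
    n = len(block)
    cos = [[math.cos(math.pi * (2 * x + 1) * u / (2 * n)) for x in range(n)]
           for u in range(keep)]
    # The transform is separable: rows first, then columns.
    rows = [[sum(block[y][x] * cos[v][x] for x in range(n)) for v in range(keep)]
            for y in range(n)]
    return [[sum(rows[y][v] * cos[u][y] for y in range(n)) for v in range(keep)]
            for u in range(keep)]


def phash(pixels):
    """Return a 64-bit perceptual hash as a Python int."""
    flat = [c for row in dct2_top_left(resize_gray(pixels)) for c in row]
    mean_ac = sum(flat[1:]) / (len(flat) - 1)  # flat[0] is the DC term
    bits = 0
    for c in flat:
        bits = (bits << 1) | (0 if c > mean_ac else 1)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two images would then be treated as copies when the Hamming distance between their hashes is below a chosen threshold; the threshold used in the campaign is not stated on the slides.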
Detection of copies
Method                              Copies count    Time to completion
Binary data comparison (MD5)        2 700           Fast (~10 min)
Image content comparison (pHash)    10 000 (6%)     Slow (~7 hours)
Results
Current campaign:
Increase in statistical significance;
Reduction of dimensionality.
+ for future campaigns:
Less workload, more joy.
Votes for duplicate images were merged.
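The binary comparison and the merging of votes can be sketched as below. The data layout is hypothetical: `files` maps an image identifier to its raw bytes and `votes` maps an image identifier to a list of (volunteer, answer) pairs. Choosing the smallest identifier as the representative of a group is also an assumption.

```python
import hashlib
from collections import defaultdict


def group_by_md5(files):
    """Map MD5 digest -> list of image ids with identical binary content."""
    groups = defaultdict(list)
    for image_id, blob in files.items():
        groups[hashlib.md5(blob).hexdigest()].append(image_id)
    return dict(groups)


def merge_votes(votes, groups):
    """Pool the votes of each duplicate group under one representative id."""
    merged = {}
    for ids in groups.values():
        representative = min(ids)  # assumption: smallest id represents the group
        merged[representative] = [v for i in sorted(ids) for v in votes.get(i, [])]
    return merged
```

Groups found by perceptual hashing could be merged with the same `merge_votes` function, since it only needs the lists of identifiers.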
Expert dataset
A part of the dataset was annotated by an expert after the campaign took place:
• 854 images;
• 1813 volunteers;
• 16 940 votes.
We sample two subsets, for training and testing, in a 70/30 ratio.
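A 70/30 split of this kind might look as follows. The slides do not say whether the split was made over images or over votes, nor how it was randomised; splitting image identifiers with a fixed seed is an assumption made for this sketch, and the function name is hypothetical.

```python
import random


def split_train_test(image_ids, train_share=0.7, seed=0):
    """Shuffle the ids reproducibly and cut them into training and testing parts."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split repeatable
    cut = round(len(ids) * train_share)
    return ids[:cut], ids[cut:]


if __name__ == "__main__":
    train, test = split_train_test(range(854))
    print(len(train), len(test))
```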
Baseline algorithms
We use SVD to reduce dimensionality and 10-fold cross-validation to fit parameters.
Publicly available code implements the KOS and EM algorithms; both are implemented in conjunction with a reputation algorithm (also called Hard penalty).
https://github.com/ashwin90/Penalty-based-clustering
[EM] Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied Statistics, pp. 20–28 (1979)
[Hard penalty] Jagabathula, S., et al.: Reputation-based worker filtering in crowdsourcing. In: Advances in Neural Information Processing Systems, pp. 2492–2500 (2014)
[KOS] Karger, D.R., Oh, S., Shah, D.: Iterative learning for reliable crowdsourcing systems. In: Advances in Neural Information Processing Systems, pp. 1953–1961 (2011)
Weighted MV Heuristic
• We use weighted MV (majority voting) with weights equal to the reliabilities of volunteers:
Reliability = 2 * P(correct answer) - 1
• The heuristic is combined with iterative removal of the volunteer with the highest penalty [Hard penalty]. We then recalculate the penalties and obtain new results for the weighted MV.
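The weighting step can be sketched as below. Several things here are assumptions: answers are encoded as +1 and -1, the probability of a correct answer per volunteer is supplied as a ready-made dictionary (the slides do not say how it is estimated), and ties are resolved in favour of +1. The iterative removal based on hard penalties is not reproduced.

```python
def reliability(p_correct):
    """Reliability = 2 * P(correct answer) - 1, ranging from -1 to 1."""
    return 2.0 * p_correct - 1.0


def weighted_majority_vote(votes, p_correct):
    """Aggregate votes per image, weighting each volunteer by reliability.

    votes: {image_id: [(volunteer, answer)]} with answer in {+1, -1}
    p_correct: {volunteer: estimated probability of a correct answer}
    """
    labels = {}
    for image_id, image_votes in votes.items():
        score = sum(reliability(p_correct[vol]) * answer
                    for vol, answer in image_votes)
        labels[image_id] = 1 if score >= 0 else -1  # assumption: ties go to +1
    return labels
```

A volunteer who is right half of the time gets weight 0 and is effectively ignored, while a volunteer who is mostly wrong gets a negative weight, so the vote counts in the opposite direction.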
Accuracy for ‘crowdsourcing’ algorithms
Baseline: AdaBoost (35 features)
91.08
Accuracy for ‘crowdsourcing’ algorithms with image thresholding.
Only images with at least 10 votes are left in the expert dataset: 404 images, 1777 volunteers.
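Image thresholding amounts to a simple filter over the vote table. In this sketch the `votes` layout ({image_id: [(volunteer, answer)]}) and the function names are hypothetical; only the rule "keep images with at least a given number of votes" comes from the slide.

```python
def filter_images_by_votes(votes, min_votes=10):
    """Keep only images that received at least min_votes votes."""
    return {img: v for img, v in votes.items() if len(v) >= min_votes}


def active_volunteers(votes):
    """Set of volunteers who voted on at least one remaining image."""
    return {vol for image_votes in votes.values() for vol, _ in image_votes}
```

Counting `active_volunteers` after filtering shows how many volunteers remain, which is why the number of volunteers drops together with the number of images.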
Possible explanations
[Figure: ROCs for all 1813 volunteers on the expert dataset, with regions labelled Spammers, Malicious Annotators, Good Annotators, and Biased Annotators.]
Raykar, V.C.: Eliminating Spammers and Ranking Annotators for Crowdsourced Labeling Tasks. JMLR 13, 491–518 (2012)
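Each volunteer's point in the ROC plane can be computed from the expert labels as sketched below. The data layout and function names are hypothetical, with labels encoded as 1 and 0. The score (TPR - FPR)^2 follows the binary spammer score of the cited paper as we understand it: it is close to 0 for spammers, whose points lie near the diagonal, and close to 1 for both good and malicious annotators, which differ in the sign of TPR - FPR.

```python
def roc_point(volunteer_votes, truth):
    """Return (false positive rate, true positive rate) for one volunteer.

    volunteer_votes: {image_id: answer}, truth: {image_id: label}, both in {1, 0}.
    """
    tp = fp = pos = neg = 0
    for image_id, answer in volunteer_votes.items():
        if truth[image_id] == 1:
            pos += 1
            tp += answer == 1
        else:
            neg += 1
            fp += answer == 1
    if pos == 0 or neg == 0:
        return None  # rates are undefined unless both classes were seen
    return fp / neg, tp / pos


def spammer_score(point):
    """(TPR - FPR)^2: near 0 on the diagonal, 1 at the corners off it."""
    fpr, tpr = point
    return (tpr - fpr) ** 2
```

Volunteers with very few votes give unreliable points, which is one reason to apply a threshold on the number of votes before plotting.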
Volunteers’ ROCs
[Figure: volunteers’ ROCs, with regions labelled Spammers, Malicious Annotators, Good Annotators, and Biased Annotators.]
Threshold = 12 votes, 262 volunteers.
Volunteers’ ROCs
[Figure: volunteers’ ROCs, with regions labelled Spammers, Malicious Annotators, Good Annotators, and Biased Annotators.]
Threshold = 100 votes, 24 volunteers.
Accuracy
Due to the image processing step:
• accuracy of MV increased by 2%,
• workload decreased by 6%.
Using a threshold for voters and the heuristic, we increase accuracy to 95.5%.
Conclusions
• Numerical experiments show that MV performs on a par with all the other algorithms.
• Possible explanation: high accuracy of frequently voting volunteers coupled with the absence of spammers.
• Image thresholding by number of votes helps to improve the results of all algorithms similarly.
• To summarize, good annotators eliminate any advantages of state-of-the-art algorithms over MV.
• Image preprocessing is the only way to improve accuracy!
Future plans
• Development of theory
• More case studies in this field, hopefully with spammers and malicious voters!
• Elaborate an approach to ‘Maybe’ votes
Artem Baklanov - Votes Aggregation Techniques in Geo-Wiki Crowdsourcing Game:  a Case Study
The baseline
• We apply SVD to the whole dataset to reduce dimensionality.
• We find an appropriate choice for the number of features: 5, 14, 35.
• We transform the feature space of the testing and training subsets accordingly.
• On the basis of 10-fold cross-validation on the training subset, we fit parameters for the AdaBoost and Random Forest algorithms. For Linear Discriminant Analysis (LDA), we use default parameters.
• The accuracy of the algorithms with fitted parameters was estimated using the testing subset.
Before and after merging of duplicates
          Votes per image    Unique votes    Repetition of images (average)
Before    [1 … 269]          [1 … 88]        4
After     [1 … 6042]         [1 … 942]       13
900 volunteers saw the same image 7 times
Accuracy for ‘crowdsourcing’ algorithms with image thresholding.
Only images with at least 4 votes are left in the expert dataset: 729 images, 1812 volunteers.
Volunteers’ ROCs
[Figure: volunteers’ ROCs, with regions labelled Spammers, Malicious Annotators, Good Annotators, and Biased Annotators.]
Threshold = 44 votes, 52 volunteers.
