Mechanical Cheat

Spamming Schemes and Adversarial Techniques on Crowdsourcing Platforms
Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux
University of Fribourg, Switzerland

Popularity and Monetary Incentives
 Micro-task crowdsourcing is growing in popularity.
 ~500k registered workers on AMT
 ~200k HITs available (April 2012)
 ~$20k in rewards (April 2012)

Spam could be a threat to crowdsourcing

Some Experimental Results:
Entity Link Selection (ZenCrowd – WWW 2012)

 Evidence of participation by dishonest workers, who spend less time, complete more tasks, and achieve lower quality.

Dishonest Answers on Crowdsourcing Platforms
 We define a dishonest answer in a crowdsourcing context as an answer that has been either:
 Randomly posted.
 Artificially generated.
 Duplicated from another source.

How can requesters perform quality control?
 Go over all the submissions?
 Blindly accept all submissions?
 Use selection and filtering algorithms.

Anti-adversarial techniques
 Pre-selection and dissuasion
 Use built-in controls (e.g., acceptance rate)
 Task design
 Qualification test

 Post processing
 Task repetition and aggregation
 Test questions
 Machine learning (e.g., probabilistic network in ZenCrowd)
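As an illustration of "task repetition and aggregation", the following is a minimal sketch of majority-vote aggregation over repeated tasks. The function name `majority_vote` and the data layout are assumptions made for this example; they are not taken from ZenCrowd, which the slides describe as using a probabilistic network instead.

```python
from collections import Counter

def majority_vote(answers_by_task):
    """Aggregate repeated answers per task by simple majority.

    answers_by_task: dict mapping task id -> list of (worker_id, answer).
    Returns a dict mapping task id -> winning answer, or None when the
    task has no answers or the top two answers are tied.
    """
    result = {}
    for task_id, submissions in answers_by_task.items():
        counts = Counter(answer for _, answer in submissions)
        ranked = counts.most_common(2)
        if not ranked:
            result[task_id] = None  # no submissions at all
        elif len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result[task_id] = None  # tie: no clear majority
        else:
            result[task_id] = ranked[0][0]
    return result

# Hypothetical example data: each task was assigned to several workers.
answers = {
    "t1": [("w1", "A"), ("w2", "A"), ("w3", "B")],
    "t2": [("w1", "C"), ("w2", "D")],
}
aggregated = majority_vote(answers)
```

Majority vote is the naive baseline; a coordinated group can outvote honest workers, which is why it is usually combined with other filters.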

Countering adversarial techniques
Organization

Countering adversarial techniques
Individual attacks
 Random Answers
 Target tasks designed with monetary incentive
 Counter with test questions
 Automated Answers
 Target tasks with simple submission mechanism
 Counter with test questions (especially captchas)
 Semi-Automated Answers
 Target easy HITs achievable with some AI.
 Can pass easy-to-answer test questions
 Can detect captchas and forward them to a human.
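The test-question countermeasure can be sketched as follows: score each worker on tasks whose correct answer is known in advance, and keep only workers above a threshold. The names `worker_accuracy` and `trusted_workers` and the 0.7 threshold are assumptions chosen for illustration.

```python
def worker_accuracy(submissions, gold):
    """Compute each worker's accuracy on test questions.

    submissions: list of (worker_id, task_id, answer).
    gold: dict mapping test-question task id -> correct answer.
    Workers who answered no test question are absent from the result.
    """
    correct, seen = {}, {}
    for worker, task, answer in submissions:
        if task in gold:
            seen[worker] = seen.get(worker, 0) + 1
            correct[worker] = correct.get(worker, 0) + int(answer == gold[task])
    return {w: correct[w] / seen[w] for w in seen}

def trusted_workers(submissions, gold, min_accuracy=0.7):
    """Return the set of workers whose test-question accuracy meets the threshold."""
    accuracy = worker_accuracy(submissions, gold)
    return {w for w, a in accuracy.items() if a >= min_accuracy}

# Hypothetical example: g1 and g2 are test questions, t1 is a real task.
gold = {"g1": "A", "g2": "B"}
submissions = [
    ("w1", "g1", "A"), ("w1", "g2", "B"), ("w1", "t1", "C"),
    ("w2", "g1", "B"), ("w2", "g2", "B"), ("w2", "t1", "D"),
]
trusted = trusted_workers(submissions, gold)
```

As the slide notes, semi-automated spammers can pass easy test questions, so a filter like this is unlikely to be sufficient on its own.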

Countering adversarial techniques
Group attacks
 Agree on Answers
 Target naïve aggregation schemes like majority vote.
 May discard valid answers!
 Counter by shuffling the options
 Answer Sharing
 Target repeated tasks
 Counter by creating multiple batches
 Artificial Clones
 Target repeated tasks
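Option shuffling can be sketched as below, assuming the colluding group coordinates on an option position (for example "always click the second choice") rather than on the option content. Each (worker, task) pair sees its own permutation, and the clicked position is mapped back to the original option index before aggregation. All function names are hypothetical.

```python
import random

def shuffled_order(num_options, worker_id, task_id):
    """Return a permutation of option indices for one (worker, task) pair.

    Seeding with the pair makes the order reproducible, so the same
    worker sees the same layout if the task page is reloaded.
    """
    rng = random.Random(f"{worker_id}:{task_id}")
    order = list(range(num_options))
    rng.shuffle(order)
    return order

def present(options, order):
    """Return the options in the order shown to the worker."""
    return [options[i] for i in order]

def canonical_answer(order, clicked_position):
    """Map the displayed position the worker clicked back to the original index."""
    return order[clicked_position]

# Hypothetical example with four options.
options = ["Paris", "Berlin", "Rome", "Madrid"]
order = shuffled_order(len(options), "w1", "t1")
shown = present(options, order)
```

With this layout, two workers who agree to click the same position will usually select different underlying options, so their agreement no longer inflates a single answer in a majority vote.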

Conclusions and future work
 We argue that some quality control tools are ineffective against resourceful spammers.
 Combine multiple techniques for post-filtering.
 Crowdsourcing platforms to provide more tools.
 Evaluation of future filtering algorithms must be repeatable and generic.
 Crowdsourcing benchmark.

Conclusions and future work
Benchmark proposal
 A collection of tasks with multiple choice options
 Each task is repeated multiple times
 Unpublished expert judgment for all the tasks
 Publish answers completed in a controlled environment with the following categories of workers:
 Honest workers
 Random clicks
 Semi-automated program
 Organized group

 Post-filtering methods are evaluated based on their ability to achieve a high precision score.

 Other parameters could include the money spent, etc.
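One possible reading of the evaluation criterion is sketched below. It assumes precision means the fraction of tasks answered by a post-filtering method whose output matches the expert judgment, and that money spent is the number of paid answers times a fixed reward. The slides do not define either measure exactly, so both definitions and all names here are assumptions.

```python
def precision(aggregated, expert):
    """Fraction of answered tasks whose aggregated answer matches the expert.

    aggregated: dict task id -> answer produced by the filtering method
                (None when the method declined to answer).
    expert: dict task id -> expert judgment.
    Returns 0.0 when the method answered no task that has an expert judgment.
    """
    answered = [t for t, a in aggregated.items() if a is not None and t in expert]
    if not answered:
        return 0.0
    hits = sum(aggregated[t] == expert[t] for t in answered)
    return hits / len(answered)

def money_spent(num_paid_answers, reward_per_answer):
    """Total reward paid, assuming a flat reward per accepted answer."""
    return num_paid_answers * reward_per_answer

# Hypothetical example: three tasks, one left unanswered by the method.
aggregated = {"t1": "A", "t2": "B", "t3": None}
expert = {"t1": "A", "t2": "C", "t3": "D"}
score = precision(aggregated, expert)
```

Reporting precision alongside cost would let two filtering methods be compared on the same published answer set.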
