
Runs - Crowdsourcing 2011

beta0

Participants | Proceedings

  • Run ID: beta0
  • Participant: qirdcsuog
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 2f8cbcb22b62f38ef0bb9e3c20059094
  • Run description: Uses a network-based algorithm to determine binary labels based on worker label agreements and worker trustworthiness. (A hedged sketch of this style of agreement-weighted aggregation follows this entry.)
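
The beta runs describe only a network-based algorithm driven by worker agreement and trustworthiness, so the following is a generic sketch of that style of aggregation rather than the participant's actual method; the data layout, the trust update, and all names are assumptions.

```python
# Hedged illustration: alternate between trust-weighted voting and
# re-estimating each worker's trust from agreement with the current consensus.
from collections import defaultdict

def aggregate(votes, n_iter=10):
    """votes: list of (worker, item, label) with label in {0, 1}."""
    trust = defaultdict(lambda: 1.0)              # start with uniform trust
    consensus = {}
    for _ in range(n_iter):
        # trust-weighted vote per item
        score = defaultdict(float)
        for w, item, label in votes:
            score[item] += trust[w] * (1 if label == 1 else -1)
        consensus = {item: int(s > 0) for item, s in score.items()}
        # re-estimate trust as agreement with the current consensus
        agree, total = defaultdict(int), defaultdict(int)
        for w, item, label in votes:
            total[w] += 1
            agree[w] += int(label == consensus[item])
        for w in total:
            trust[w] = agree[w] / total[w]
    return consensus

print(aggregate([("w1", "d1", 1), ("w2", "d1", 1), ("w3", "d1", 0)]))  # {'d1': 1}
```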

beta04

Participants | Proceedings

  • Run ID: beta04
  • Participant: qirdcsuog
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 9d9e22ae5dfcc68a32ed9a747639f218
  • Run description: Uses a network-based algorithm to determine binary labels based on worker label agreements and worker trustworthiness.

beta08

Participants | Proceedings

  • Run ID: beta08
  • Participant: qirdcsuog
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 2b48f1288b6f47335264dfe2a4e7c68e
  • Run description: Uses a network-based algorithm to determine binary labels based on worker label agreements and worker trustworthiness.

BUPTWildCat1

Participants | Proceedings

  • Run ID: BUPTWildCat1
  • Participant: BUPT_WILDCAT
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/14/2011
  • Task: task2
  • MD5: 6f204361d22a7544285d5a20a8e82793
  • Run description: BUPT-WILDCAT Task 2, Run 1 (primary): EM algorithms.

BUPTWildCat2

Participants | Proceedings

  • Run ID: BUPTWildCat2
  • Participant: BUPT_WILDCAT
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 36eadc43dd885ef9c17eb5123cdb6510
  • Run description: BUPT-WILDCAT Task 2, Run 2 (secondary): Gaussian model and EM algorithms.

DMIR1

Participants | Proceedings

  • Run ID: DMIR1
  • Participant: TUD_DMIR
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/12/2011
  • Task: task1
  • MD5: d98bd25d0f70b6ed5c9d8613a7794b1a
  • Run description: Crowdsourcing run with replacement of detected spam votes until at least 5 proper votes per query-document pair were obtained.

DMIR2

Participants | Proceedings

  • Run ID: DMIR2
  • Participant: TUD_DMIR
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/12/2011
  • Task: task2
  • MD5: fb75bc3c2cc7dd85792900b629002393
  • Run description: Aggregation of crowdsourcing results by first filtering out suspected random votes, choosing the vote made by the least random worker, and ranking by the probability that it is the correct outcome.

DMIR3

Participants | Proceedings

  • Run ID: DMIR3
  • Participant: TUD_DMIR
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/14/2011
  • Task: task2
  • MD5: fde2431c87c00ba054fbfda4bf724f53
  • Run description: Aggregation of crowdsourcing results by first filtering out suspected random votes, using simple maximum-likelihood estimation (MLE) to determine which label is supported by the most evidence from ethical workers, and ranking by the probability that it is the correct outcome. (A hedged sketch of this kind of per-item MLE follows this entry.)
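
The DMIR2 and DMIR3 descriptions outline a filter-then-MLE aggregation without further detail, so here is a minimal per-item maximum-likelihood sketch under assumed per-worker accuracies; the accuracy estimates, the spam threshold, and all names are illustrative assumptions, not TUD_DMIR's implementation.

```python
# Hedged illustration: drop suspected random voters, then combine the remaining
# votes by log-odds under an assumed per-worker accuracy model.
import math

def mle_label(item_votes, accuracy, min_accuracy=0.55):
    """item_votes: {worker: label in {0, 1}}; accuracy: {worker: P(correct)}."""
    log_odds = 0.0   # log P(votes | relevant) - log P(votes | non-relevant)
    for worker, label in item_votes.items():
        acc = accuracy.get(worker, 0.5)
        if acc < min_accuracy:           # filter suspected random voters
            continue
        delta = math.log(acc) - math.log(1.0 - acc)
        log_odds += delta if label == 1 else -delta
    p_relevant = 1.0 / (1.0 + math.exp(-log_odds))
    return int(p_relevant >= 0.5), p_relevant    # binary label plus ranking score

label, score = mle_label({"w1": 1, "w2": 1, "w3": 0},
                         {"w1": 0.9, "w2": 0.7, "w3": 0.5})
print(label, round(score, 3))   # 1 0.955
```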

G6T1R1

Participants | Proceedings

  • Run ID: G6T1R1
  • Participant: GeAnn
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: e9d4e816d908d51e0ae80d6c082812f2
  • Run description: We used a game to collect relevance judgements between paragraphs of the web pages and topics. CrowdFlower was only used to direct worker attention to our off-site game.

G6T2R1

Participants | Proceedings

  • Run ID: G6T2R1
  • Participant: GeAnn
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 9f3a9bef92d629c3e3eb0b0822efc377
  • Run description: Main run using gold labels and worker trust

G6T2R2

Participants | Proceedings

  • Run ID: G6T2R2
  • Participant: GeAnn
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 4ad8330b21f6ecba553fa2fae7376e67
  • Run description: Main run using worker trust, omitting gold training

G6T2R3

Participants | Proceedings

  • Run ID: G6T2R3
  • Participant: GeAnn
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: bcfdd90e8e6089da06461d888cf4ae77
  • Run description: Baseline run using only majority votes. (A minimal majority-vote sketch follows this entry.)
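
As a point of contrast with the trust-weighted runs above, this is what a plain majority-vote baseline looks like; the data layout and tie-breaking rule are assumptions made for illustration.

```python
# Hedged illustration of a majority-vote baseline over (worker, item, label) votes.
from collections import Counter, defaultdict

def majority_vote(votes):
    """Return the most frequent label per item (ties go to the label counted first)."""
    by_item = defaultdict(Counter)
    for _worker, item, label in votes:
        by_item[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in by_item.items()}

print(majority_vote([("w1", "d1", 1), ("w2", "d1", 1), ("w3", "d1", 0)]))  # {'d1': 1}
```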

LingPipeSBin

Participants | Proceedings

  • Run ID: LingPipeSBin
  • Participant: LingPipe
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/12/2011
  • Task: task2
  • MD5: e8bf2444ddf3eeb5e2df51586a8c4d00
  • Run description: Semisupervised hierarchical Bayesian model a la Dawid and Skene (1979) with binary estimates.

LingPipeSemi

Participants | Proceedings

  • Run ID: LingPipeSemi
  • Participant: LingPipe
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/12/2011
  • Task: task2
  • MD5: 4f3aeeed0640189bdb2583b501c38d94
  • Run description: Semisupervised hierarchical Bayesian model a la Dawid and Skene (1979).

LingPipeUn

Participants | Proceedings

  • Run ID: LingPipeUn
  • Participant: LingPipe
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/12/2011
  • Task: task2
  • MD5: 5be63e45d085215a9065835fe747333c
  • Run description: Unsupervised hierarchical Bayesian model a la Dawid and Skene (1979). (A minimal EM sketch of a Dawid and Skene-style model follows this entry.)
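
The three LingPipe runs (and the BUPT-WILDCAT EM runs above) refer to Dawid and Skene (1979)-style estimation. The sketch below is a stripped-down, unsupervised binary EM loop in that spirit: per-worker sensitivity/specificity and the label prevalence are re-estimated from soft item labels. The two-parameter worker model, the smoothing, and all names are assumptions of this illustration, not the submitted systems.

```python
# Hedged illustration: unsupervised EM for binary labels with a per-worker
# sensitivity/specificity model, in the spirit of Dawid & Skene (1979).
from collections import defaultdict

def dawid_skene(votes, n_iter=50):
    """votes: list of (worker, item, label) with label in {0, 1}."""
    items = {i for _, i, _ in votes}
    workers = {w for w, _, _ in votes}
    # initialise the per-item posterior with the mean vote (soft majority vote)
    sums, counts = defaultdict(float), defaultdict(int)
    for _, i, l in votes:
        sums[i] += l
        counts[i] += 1
    post = {i: sums[i] / counts[i] for i in items}
    for _ in range(n_iter):
        # M-step: worker sensitivity/specificity and label prevalence
        num1, den1 = defaultdict(float), defaultdict(float)
        num0, den0 = defaultdict(float), defaultdict(float)
        for w, i, l in votes:
            den1[w] += post[i]
            num1[w] += post[i] * l
            den0[w] += 1.0 - post[i]
            num0[w] += (1.0 - post[i]) * (1 - l)
        sens = {w: (num1[w] + 1.0) / (den1[w] + 2.0) for w in workers}  # smoothed
        spec = {w: (num0[w] + 1.0) / (den0[w] + 2.0) for w in workers}
        prev = sum(post.values()) / len(post)
        # E-step: posterior probability that each item is relevant
        like1 = {i: prev for i in items}
        like0 = {i: 1.0 - prev for i in items}
        for w, i, l in votes:
            like1[i] *= sens[w] if l == 1 else 1.0 - sens[w]
            like0[i] *= 1.0 - spec[w] if l == 1 else spec[w]
        post = {i: like1[i] / (like1[i] + like0[i]) for i in items}
    return {i: int(p >= 0.5) for i, p in post.items()}

print(dawid_skene([("w1", "d1", 1), ("w2", "d1", 1), ("w3", "d1", 0),
                   ("w1", "d2", 0), ("w2", "d2", 0), ("w3", "d2", 1)]))
```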

RMITT1

Participants | Proceedings

  • Run ID: RMITT1
  • Participant: RMIT
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: 61f438d52b3add4c3368d575a4c219ef
  • Run description: Total cost: $133. A total of 6,875 image judgments. Total time taken to gather judgments: ~3 hours. 23 different gold standards were used; 35% of the HITs presented to workers were gold standards.

uc3m.graded

Participants | Proceedings

  • Run ID: uc3m.graded
  • Participant: uc3m
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: a0f1a3b9f9b46741c6bdbb91f50dee87
  • Run description: Mechanical Turk, with 5 documents per HIT. A quality-control question per document asked workers to choose which of 2 sets of keywords best describes the document, in addition to a check on the minimum time spent per document. Relevance was collected on a 3-point graded scale.

uc3m.hterms

Participants | Proceedings

  • Run ID: uc3m.hterms
  • Participant: uc3m
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: 86dc3d3624b2abc39f28ad44fedb5e3b
  • Run description: Mechanical Turk, with 5 documents per HIT and topic terms highlighted inside the documents. A permissive quality-control question per document asked workers to choose which of 2 sets of keywords best describes the document, in addition to a check on the minimum time spent per document. Relevance was collected with an unbiased slider from bad to good document, providing a direct ranking from which binary labels can be computed.

uc3m.rule

Participants | Proceedings

  • Run ID: uc3m.rule
  • Participant: uc3m
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/16/2011
  • Task: task2
  • MD5: 855c695cab901aadd8754f5a66ea5fb6
  • Run description: Rule-based model learned with per-topic and per-worker confidence.

uc3m.slider

Participants | Proceedings

  • Run ID: uc3m.slider
  • Participant: uc3m
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: 3aec7d2065db7e69c5ec64a592aaa26a
  • Run description: Mechanical Turk, with 5 documents per HIT. A restrictive quality-control question per document asked workers to choose which of 2 sets of keywords best describes the document, in addition to a check on the minimum time spent per document. Relevance was collected with a biased slider from bad to good document, providing a direct ranking from which binary labels can be computed.

uc3m.svn

Participants | Proceedings

  • Run ID: uc3m.svn
  • Participant: uc3m
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/16/2011
  • Task: task2
  • MD5: 9e7c4c375853d4117d4bcf4698bc2acb
  • Run description: SVM model learned with per-topic and per-worker confidence. (An illustrative sketch of this kind of confidence-feature model follows this entry.)
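
The uc3m.rule and uc3m.svn descriptions only say that a model was learned from per-topic and per-worker confidence, so the sketch below is purely illustrative: it builds a hypothetical feature vector per vote from assumed confidence estimates and trains a small SVM to decide whether the vote should be trusted. The feature set, the trust target, and all names are assumptions, not uc3m's pipeline.

```python
# Hedged illustration: per-worker and per-(worker, topic) confidence features
# feeding a small SVM classifier (scikit-learn).
from sklearn.svm import SVC

def features(vote, worker_conf, topic_conf):
    """vote: (worker, topic, label); confidence dicts hold assumed estimates in [0, 1]."""
    worker, topic, label = vote
    return [worker_conf.get(worker, 0.5),
            topic_conf.get((worker, topic), 0.5),
            float(label)]

# toy training data: past votes labelled 1 if the vote matched the gold answer
train_votes = [("w1", "t1", 1), ("w2", "t1", 0), ("w1", "t2", 1), ("w2", "t2", 1)]
vote_was_correct = [1, 0, 1, 0]
worker_conf = {"w1": 0.9, "w2": 0.4}
topic_conf = {("w1", "t1"): 0.8, ("w2", "t1"): 0.3, ("w1", "t2"): 0.7, ("w2", "t2"): 0.3}

X = [features(v, worker_conf, topic_conf) for v in train_votes]
model = SVC(kernel="rbf").fit(X, vote_was_correct)
print(model.predict([features(("w2", "t1", 1), worker_conf, topic_conf)]))
```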

uc3m.wordnet

Participants | Proceedings

  • Run ID: uc3m.wordnet
  • Participant: uc3m
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/16/2011
  • Task: task2
  • MD5: 138e6cc1dcadf423c63d3a3028ae39c4
  • Run description: GetAnotherLabel software, splitting the topic set into hard and easy topics (according to WordNet) and using only the best workers per topic subset.

uogTrP1rg

Participants | Proceedings

  • Run ID: uogTrP1rg
  • Participant: uogTr
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: 9adfb8c26085e37b7274fdf60de0a594
  • Run description: MTurk crowdsourcing run using rendered pages and gold assessment.

uogTrP2O4teh

Participants | Proceedings

  • Run ID: uogTrP2O4teh
  • Participant: uogTr
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 5c9c8224e28274ec2bddc397ae9ef865
  • Run description: Learned run using a 4-parameter assessment confidence model and the highest-scoring vote.

uogTrP2O4wte

Participants | Proceedings

  • Run ID: uogTrP2O4wte
  • Participant: uogTr
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: ac6681702806ea705fa170a09aa3602c
  • Run description: Learned run using a 4-parameter assessment confidence model and voting.

uogTrP2O4wtr

Participants | Proceedings

  • Run ID: uogTrP2O4wtr
  • Participant: uogTr
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 77a2e89c9e348eb8fb7702f0f598501f
  • Run description: Learned run using a 4-parameter assessment confidence model and voting.

UWatCS1Human

Participants | Proceedings

  • Run ID: UWatCS1Human
  • Participant: UWaterlooMDS
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: 8086e552a540746c120a3c77fbfb9979
  • Run description: All labels were collected by a single human assessor using a home-grown relevance assessment platform designed for the TREC Crowdsourcing track. Because of bugs and topic misunderstandings, some topics were judged more than once, and the last collected judgments were submitted. (This description is for run UWatCS1Human.)

UWatCS2Semi

Participants | Proceedings

  • Run ID: UWatCS2Semi
  • Participant: UWaterlooMDS
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 3d55b4c6b127c1934de67687741b8282
  • Run description: Consensus based on worker quality as measured by d-prime squared (d'^2). Semi-supervised.

UWatCS2Unsup

Participants | Proceedings

  • Run ID: UWatCS2Unsup
  • Participant: UWaterlooMDS
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task2
  • MD5: 2e66e5d28c7533133a25960084b7ae83
  • Run description: Consensus based on worker quality as measured by d-prime squared (d'^2). Unsupervised. (This description is for run UWatCS2Unsup; a hedged sketch of d'^2-weighted voting follows this entry.)
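
Both UWatCS2 runs describe consensus weighted by d-prime squared. The sketch below shows one plausible reading: each worker's hit rate and false-alarm rate against a reference labelling give d' = z(hit rate) - z(false-alarm rate), and votes are then weighted by d'^2. The reference labels, the rate smoothing, and all names are assumptions for illustration, not UWaterlooMDS's implementation.

```python
# Hedged illustration: weight each worker's vote by d-prime squared, where
# d' is computed from hit and false-alarm rates against reference labels.
from collections import defaultdict
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    z = NormalDist().inv_cdf
    # smoothed rates keep the z-transform finite even for perfect workers
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return z(hit_rate) - z(fa_rate)

def weighted_consensus(votes, reference):
    """votes: list of (worker, item, label); reference: {item: known label}."""
    tallies = defaultdict(lambda: [0, 0, 0, 0])   # hits, misses, FAs, CRs per worker
    for w, item, label in votes:
        truth = reference.get(item)
        if truth is None:
            continue
        if truth == 1:
            tallies[w][0 if label == 1 else 1] += 1
        else:
            tallies[w][2 if label == 1 else 3] += 1
    weight = {w: d_prime(*t) ** 2 for w, t in tallies.items()}
    score = defaultdict(float)
    for w, item, label in votes:
        score[item] += weight.get(w, 0.0) * (1 if label == 1 else -1)
    return {item: int(s > 0) for item, s in score.items()}

# w2 votes 1 on everything, so its d' (and hence its weight) is roughly zero
votes = [("w1", "d1", 1), ("w1", "d2", 0), ("w2", "d1", 1), ("w2", "d2", 1)]
print(weighted_consensus(votes, {"d1": 1, "d2": 0}))   # {'d1': 1, 'd2': 0}
```

Note that the quadratic weight treats a reliably wrong worker (negative d') the same as a reliably right one; a signed weight would distinguish them, but the run descriptions only mention d'^2.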

wildcatrun

Participants | Proceedings

  • Run ID: wildcatrun
  • Participant: BUPT_WILDCAT
  • Track: Crowdsourcing
  • Year: 2011
  • Submission: 9/15/2011
  • Task: task1
  • MD5: fd109f7a1a32c9ef27bd3b1a329e50eb
  • Run description: First, we designed a job on CrowdFlower as a qualification test; workers were notified by email if the quality of their CrowdFlower assignments met our requirement. We then ran the HITs on Amazon Mechanical Turk, using an external webpage to load the HITs. Workers were asked to give a binary label and a rank over a set, and each HIT contained 6 sets (30 documents). We took several quality-control measures: we used additional gold sets to compute two kinds of scores for each assignment, and workers were notified automatically if they reached the threshold; otherwise, they were not allowed to submit their results. Workers could also review the reference answers for the gold set as instruction for further HITs. Assignments were automatically approved or rejected by our system; we collected 12,000 labels in about 10 days, and workers received $0.42 for every approved assignment.