Run description: Aggregation of crowdsourcing results by first filtering out suspected random votes, then choosing the vote made by the least random worker and ranking by the probability that it is the correct outcome.
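
A minimal Python sketch of this aggregation idea, assuming worker "randomness" is estimated as disagreement with the per-item majority; the threshold, data layout, and function names below are illustrative assumptions, not the submitted code:

    from collections import defaultdict

    def worker_agreement(votes_by_worker):
        # votes_by_worker: {worker: {item: label}}
        # A worker's agreement with the per-item majority vote is used as a
        # proxy for non-randomness; low agreement suggests random voting.
        counts = defaultdict(lambda: defaultdict(int))
        for votes in votes_by_worker.values():
            for item, label in votes.items():
                counts[item][label] += 1
        majority = {item: max(labels, key=labels.get) for item, labels in counts.items()}
        return {worker: sum(label == majority[item] for item, label in votes.items()) / len(votes)
                for worker, votes in votes_by_worker.items() if votes}

    def aggregate(votes_by_worker, random_threshold=0.55):
        agreement = worker_agreement(votes_by_worker)
        # 1) Filter out suspected random voters.
        kept = {w: v for w, v in votes_by_worker.items() if agreement.get(w, 0.0) >= random_threshold}
        # 2) For each item keep the vote of the least random (highest-agreement) worker:
        #    iterating workers in increasing agreement lets later workers overwrite earlier ones.
        results = {}
        for worker in sorted(kept, key=agreement.get):
            for item, label in kept[worker].items():
                results[item] = (label, agreement[worker])
        # 3) Rank items by the probability that the chosen label is the correct outcome.
        return sorted(results.items(), key=lambda kv: kv[1][1], reverse=True)
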
Run description: Aggregation of crowdsourcing results by first filtering out suspected random votes, then using a simple MLE to determine which label is supported by the most evidence from ethical workers, and ranking by the probability that it is the correct outcome.
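
A minimal sketch of a simple MLE-style aggregation, assuming each retained ("ethical") worker has an estimated accuracy and each vote is weighted by its log-odds; the accuracy estimates, default values, and names are illustrative assumptions:

    import math
    from collections import defaultdict

    def mle_aggregate(votes, worker_accuracy):
        # votes: list of (worker, item, label); worker_accuracy: {worker: accuracy in (0, 1)}.
        # Each vote contributes the log-odds of its worker's accuracy to that label.
        scores = defaultdict(lambda: defaultdict(float))
        for worker, item, label in votes:
            p = min(max(worker_accuracy.get(worker, 0.6), 1e-3), 1.0 - 1e-3)
            scores[item][label] += math.log(p / (1.0 - p))
        ranked = []
        for item, label_scores in scores.items():
            best = max(label_scores, key=label_scores.get)
            # Normalise the label weights as a proxy for P(best label is correct).
            z = sum(math.exp(s) for s in label_scores.values())
            ranked.append((item, best, math.exp(label_scores[best]) / z))
        # Rank items by the probability that the chosen label is the correct outcome.
        return sorted(ranked, key=lambda r: r[2], reverse=True)
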
Run description: We used a game to collect relevance judgements between paragraphs of the web pages and the topics. CrowdFlower was only used to direct worker attention to our off-site game.
Run description: Total cost: $133. A total of 6875 image judgments. Total time taken to gather the judgments: ~3 hours. 23 different gold standards were used. 35% of the HITs presented to workers were gold standards.
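
Back-of-the-envelope figures implied by the numbers above (treating "~3 hours" as exactly 3; values are rounded):

    total_cost_usd = 133.0
    judgments = 6875
    hours = 3.0
    print(f"cost per judgment: ${total_cost_usd / judgments:.3f}")  # ~$0.019
    print(f"judgments per hour: {judgments / hours:.0f}")           # ~2292
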
Run description: Mechanical Turk, with 5 documents per HIT. One quality-control question per document, asking workers to choose which of 2 sets of keywords best describes the document, in addition to enforcing a minimum time spent per document. Relevance was asked on a 3-point graded scale.
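
A minimal sketch of the per-document quality controls described in this run, assuming a hypothetical answer record per document; the minimum-time value and field names are illustrative assumptions:

    MIN_SECONDS_PER_DOC = 10  # assumed value; the actual minimum time is not stated

    def passes_quality_control(answer, correct_keyword_set):
        # answer: {'chosen_keywords': ..., 'seconds_spent': ...} for one document.
        # Accept the judgment only if the worker picked the keyword set that
        # actually describes the document and spent at least the minimum time on it.
        return (answer["chosen_keywords"] == correct_keyword_set
                and answer["seconds_spent"] >= MIN_SECONDS_PER_DOC)
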
Run description: Mechanical Turk, with 5 documents per HIT and topic terms highlighted inside the documents. A permissive quality-control question per document, asking workers to choose which of 2 sets of keywords best describes the document, in addition to enforcing a minimum time spent per document. Relevance was asked with an unbiased slider from bad to good document, providing a direct ranking from which binary labels can be computed.
Run description: Mechanical Turk, with 5 documents per HIT. A restrictive quality-control question per document, asking workers to choose which of 2 sets of keywords best describes the document, in addition to enforcing a minimum time spent per document. Relevance was asked with a biased slider from bad to good document, providing a direct ranking from which binary labels can be computed.
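
A minimal sketch of how the slider judgments in the two runs above could yield both a direct ranking and binary labels; the 0-100 scale and the cut-off value are assumptions, not taken from the run descriptions:

    def slider_to_outputs(slider_values, cutoff=50):
        # slider_values: {doc_id: position on the bad-to-good slider, assumed 0-100}.
        # The slider positions give a direct ranking; thresholding them gives binary labels.
        ranking = sorted(slider_values, key=slider_values.get, reverse=True)
        binary = {doc: int(value >= cutoff) for doc, value in slider_values.items()}
        return ranking, binary
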
Run description: GetAnotherLabel software, splitting the topic set into hard and easy topics (according to WordNet) and using only the best workers per topic subset.
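
A minimal sketch of the "best workers per topic subset" step, assuming worker-quality estimates come from GetAnotherLabel's output; the selection and voting code below is illustrative and is not the GetAnotherLabel API:

    from collections import Counter

    def label_with_best_workers(votes, worker_quality, top_k=3):
        # votes: {item: {worker: label}} for one topic subset (hard or easy);
        # worker_quality: {worker: estimated quality in [0, 1]}, e.g. from GetAnotherLabel.
        best = set(sorted(worker_quality, key=worker_quality.get, reverse=True)[:top_k])
        labels = {}
        for item, item_votes in votes.items():
            kept = [label for worker, label in item_votes.items() if worker in best]
            if kept:
                # Majority vote among the selected best workers only.
                labels[item] = Counter(kept).most_common(1)[0][0]
        return labels
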
Run description: All labels were collected by a single human individual with a home-grown relevance-assessment platform designed for the TREC Crowdsourcing track. Because of bugs and topic misunderstandings, some topics were judged more than once, and the last collected judgments were submitted. (This description is for run UWatCS1Human.)
Run description: First, we designed a job on CrowdFlower as the qualification test. Workers were notified by email if the quality of their assignments on CrowdFlower met our requirement. We ran the HITs on Amazon Mechanical Turk, using an external webpage to load the HITs. Workers were asked to give a binary label and a ranking over a set. Each HIT contained 6 sets (30 documents). We also took several quality-control measures. We used additional gold sets to compute two kinds of scores for each assignment; workers were notified automatically if they reached the threshold, and otherwise were prevented from submitting the result. Workers could also review the reference answers for the gold set as instruction for further HITs. Assignments were automatically approved or rejected by our system. We collected 12000 labels in about 10 days, and workers received $0.42 for every approved assignment.
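
A minimal sketch of the automatic approval step described above; the score definition (accuracy on gold documents) and the pass threshold are assumptions, and only the general mechanism of gold-set scoring with a threshold comes from the run description:

    def gold_score(answers, gold):
        # answers, gold: {doc_id: binary label}; the score is accuracy on the gold
        # documents that appear in this assignment.
        scored = [doc for doc in gold if doc in answers]
        if not scored:
            return 0.0
        return sum(answers[doc] == gold[doc] for doc in scored) / len(scored)

    def review_assignment(answers, gold, threshold=0.8):
        # Approve (and pay $0.42) if the worker meets the threshold on the gold set;
        # otherwise the assignment is rejected.
        return "approve" if gold_score(answers, gold) >= threshold else "reject"
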