Runs - Health Misinformation 2021¶

all_use_sup_cre¶

Run ID: all_use_sup_cre
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 52659ee5c32f45bd055be01d0bd6678f
Run description: Run 7: This automatic run was created using a rank fusion based on RRF of the individual models used for i) usefulness (5 individual models), ii) supportiveness (3 individual models), and iii) credibility (2 individual models).

baselineBM25¶

Run ID: baselineBM25
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: auto
Task: main
MD5: eae994bd8547ab2af61f6372c1e4d95f
Run description: Anserini's default BM25.

bm25¶

Results | Participants | Input | Summary | Appendix

Run ID: bm25
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: auto
Task: main
MD5: 4aaeeeaa7b75356abd4d4a48441a5b86
Run description: Pyserini's Default BM25

bm25_rob_rf¶

Run ID: bm25_rob_rf
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 13a4087bb7894504bfaa8a28e34a9853
Run description: Run 1: Baseline run. This automatic run was created using a rank fusion based on RRF of three models: i) usefulness, created using a default BM25, ii) supportiveness, created using a RoBERTa large model fine-tuned on the FEVER+SciFact corpus, and iii) credibility, created using a random forest model trained on the Microsoft Credibility dataset.
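
The fusion step named in this description can be illustrated with a minimal sketch of reciprocal rank fusion (RRF). The constant `k=60`, the function name, and the toy document ids are assumptions for illustration; the run description does not state the parameters that were actually used.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document ids with RRF.

    k=60 is a commonly used default and an assumption here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it ranks.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rankings standing in for the three component models.
usefulness = ["d1", "d2", "d3"]
supportiveness = ["d2", "d1", "d4"]
credibility = ["d2", "d3", "d1"]

fused = reciprocal_rank_fusion([usefulness, supportiveness, credibility])
print(fused)
```

Because RRF only looks at ranks, the three component models do not need comparable score scales.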

bow_sup_cred¶

Run ID: bow_sup_cred
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 891eec36e5c09a81ff21107fc3aea426
Run description: Run 2: This automatic run was created using a rank fusion based on RRF of three models: i) usefulness, created using a default BM25 combined with a fine-tuned BM25 model using known-item search, with the query and description generated using transfer learning from language models, ii) supportiveness, created using a combined rank of three transformer-based models fine-tuned on the FEVER+SciFact corpus, and iii) credibility, created using a random forest model trained on the Microsoft Credibility dataset combined with a list of credible sites.

citius.r1¶

Run ID: citius.r1
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: eb4216d22a4bb77699dc838323a55815
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 documents using a hand-crafted expression generated from the description and the stance fields.

citius.r10¶

Run ID: citius.r10
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: bb81d2d704f795cb3dce910a6d6f4390
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using a hand-crafted expression generated from the description and the stance field + sentence similarity between passages of the top 100 docs and the same hand-crafted expression using a RoBERTa Base model. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.
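
As a rough sketch of the CombSUM step mentioned here: each component assigns a score to every document, the scores are brought onto a common scale, and the per-document sums define the final order. The min-max normalization, the function names, and the toy scores are assumptions; the description does not say how the three scores were normalized.

```python
def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def combsum(score_maps):
    """Sum normalized scores per document and sort by the total."""
    fused = {}
    for scores in score_maps:
        for doc_id, s in min_max(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + s
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical scores for three documents from the three components.
bm25 = {"a": 12.0, "b": 9.0, "c": 6.0}
passage_rerank = {"a": 0.2, "b": 0.9, "c": 0.4}
sentence_sim = {"a": 0.5, "b": 0.6, "c": 0.9}

order = combsum([bm25, passage_rerank, sentence_sim])
print(order)
```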

citius.r2¶

Run ID: citius.r2
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: 526f0f45180a3ab31cc497c221bc6f4b
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + sentence similarity between passages of the top 100 docs and a hand-crafted expression generated from the description and the stance fields using a RoBERTa Large model. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.

citius.r3¶

Run ID: citius.r3
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: 451b13e581a40c33d1189a14ec806d39
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + sentence similarity between passages of the top 100 docs and a hand-crafted expression generated from the description and the stance fields using a RoBERTa Base model. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.

citius.r4¶

Run ID: citius.r4
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: 63b5e82473e3ed28be9577031a38a5c3
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + sentence similarity between passages of the top 100 docs and a hand-crafted expression generated from the description and the stance fields and its query variations using a RoBERTa Large model. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.

citius.r5¶

Run ID: citius.r5
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: d26f23444b291438bc86a0d8a3c95cc8
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + sentence similarity between passages of the top 100 docs and a hand-crafted expression generated from the description and the stance fields and its query variations using a RoBERTa Base model. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.

citius.r6¶

Run ID: citius.r6
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: 7a9caa3fc45c208b5d42c243488e9434
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + sentence similarity between passages of the top 100 docs and a hand-crafted expression generated from the description and the stance field using a RoBERTa Large model. Before performing the sentence similarity step, we applied a passage cleaning phase based on unsupervised NLP techniques. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.

citius.r7¶

Run ID: citius.r7
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: 5cc98a403480f8136906a62e86929fb2
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + passage reliability classifier of the top 100 docs trained with 2019 and 2020 data in the form correct sentence + passage + label. Finally, we reordered the top 100 documents using the Borda Count ranking fusion technique and all three scores.
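
The Borda Count fusion named here can be sketched as follows. This is one common formulation (a document at zero-based position p among n candidates earns n - p points, and unranked documents earn none); the exact variant used for the run is not stated, and the rankings below are made up.

```python
def borda_count(rankings):
    """Fuse rankings by summing positional points per document."""
    candidates = set().union(*rankings)
    n = len(candidates)
    points = {d: 0 for d in candidates}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking):
            # Top position earns n points, the next n - 1, and so on.
            points[doc_id] += n - position
    return sorted(points, key=lambda d: (-points[d], d))


# Hypothetical orderings from BM25, passage reranking, and the classifier.
by_bm25 = ["a", "b", "c"]
by_passage = ["b", "a", "c"]
by_classifier = ["b", "c", "a"]

fused = borda_count([by_bm25, by_passage, by_classifier])
print(fused)
```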

citius.r8¶

Run ID: citius.r8
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: 62acf8a73a2ee21eb6be92638b497c5f
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using the query field + passage reliability classifier of the top 100 docs trained with 2019 and 2020 data in the form query + passage + label. Finally, we reordered the top 100 documents using the Borda Count ranking fusion technique and all three scores.

citius.r9¶

Run ID: citius.r9
Participant: CiTIUS
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: manual
Task: main
MD5: 5a581a5f1e272266fd9d90115e119338
Run description: Initial BM25 retrieval based on the query field + passage reranking of the top 100 docs using a hand-crafted expression generated from the description and the stance field + sentence similarity between passages of the top 100 docs and the same hand-crafted expression using a RoBERTa Large model. Finally, we reordered the top 100 documents using the CombSUM ranking fusion technique and all three scores.

lin_use_sup_rf¶

Run ID: lin_use_sup_rf
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 42728fbbf230774478932487f88678df
Run description: Run 5: This automatic run was created using a rank fusion based on RRF of three models: i) usefulness, created using a combined BoW model with three transformer-based language models trained on the MS MARCO corpus, ii) supportiveness, created using a combined rank of three transformer-based models fine-tuned on the FEVER+SciFact corpus, and iii) credibility, created using a random forest model trained on the Microsoft Credibility dataset.

mdt5¶

Run ID: mdt5
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: auto
Task: main
MD5: e346edfa6eb6dd49e2cc14537b814000
Run description: Pyserini's Default BM25. MedMonoT5/DuoT5 using Description only

mdt5_r¶

Run ID: mdt5_r
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: cca9961fdad60c3ff1e302a3909e9c0b
Run description: Pyserini's Default BM25. MedMonoT5/DuoT5 using Description and Stance

mlm_sup_cred¶

Run ID: mlm_sup_cred
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 08e64a177aa31f73ba6b2560f98b77f2
Run description: Run 3: This automatic run was created using a rank fusion based on RRF of three models: i) usefulness, created using a combination of three transformer-based language models trained on the MS MARCO corpus, ii) supportiveness, created using a combined rank of three transformer-based models fine-tuned on the FEVER+SciFact corpus, and iii) credibility, created using a random forest model trained on the Microsoft Credibility dataset combined with a list of credible sites.

mt5¶

Run ID: mt5
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: auto
Task: main
MD5: e4338585930a63484a87f590a3fb1442
Run description: Pyserini's Default BM25. MedMonoT5 using Description only

mt5_r¶

Run ID: mt5_r
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: c855e8ad6635900e3cbb3d1afe73f438
Run description: Pyserini's Default BM25. MedMonoT5 using Description and Stance

upv_bm25¶

Run ID: upv_bm25
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: 549b3611315c7ebf9eb4d1e39805e346
Run description: A baseline obtained using Pyserini's BM25.

upv_fuse_10¶

Run ID: upv_fuse_10
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: aad92770286ac574baf690995c69d08b
Run description: A fusion of the results from Bio SBERT cosine similarity, BM25, and the credibility Kullback-Leibler divergence similarity from the base RoBERTa credibility classifier.

upv_fuse_2¶

Run ID: upv_fuse_2
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: f836bfe3963010fc08854bfce6c2ce56
Run description: We calculate the cosine similarity between documents (limited to 20 sentences) and the description using a Bio Sentence Transformer. The result is a fusion of BM25 and the cosine similarity using reciprocal rank fusion.

upv_fuse_3¶

Run ID: upv_fuse_3
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: 5e697f4599e033073735ad4488222031
Run description: We define a model that measures the credibility of an article, trained with the RoBERTa base model, and we set up a reference article that satisfies 4 criteria of credibility. We then calculate the cosine similarity between the document and reference vectors. Lastly, we fuse the results of BM25 and the cosine similarities.

upv_fuse_4¶

Run ID: upv_fuse_4
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: c6f9bd7393318abcd85b3033a6b2d353
Run description: We define a model that measures the credibility of an article, trained with the RoBERTa base model, and we set up a reference article that satisfies 4 criteria of credibility. We then calculate the Kullback-Leibler divergence score between the document and reference vectors. Lastly, we fuse the results of BM25 and the cosine similarities.

upv_fuse_5¶

Run ID: upv_fuse_5
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: af0e3cf82b98e6eebbba97d0532a506a
Run description: We define a model that measures the credibility of an article, trained with the RoBERTa large model, and we set up a reference article that satisfies 4 criteria of credibility. We then calculate the cosine similarity score between the document and reference vectors. Lastly, we fuse the results of BM25 and the cosine similarities.

upv_fuse_6¶

Run ID: upv_fuse_6
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: 05995987ec18520d765fc3597adae05d
Run description: We define a model that measures the credibility of an article, trained with the RoBERTa large model, and we set up a reference article that satisfies 4 criteria of credibility. We then calculate the Kullback-Leibler divergence score between the document and reference vectors. Lastly, we fuse the results of BM25 and the Kullback-Leibler divergence score.
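
The two comparison measures named in the UPV credibility runs can be sketched as below. The four-dimensional vectors, the all-ones reference, the normalization to a probability distribution, and the direction of the divergence (document relative to reference) are all assumptions for illustration; the descriptions do not specify how the classifier output is turned into a vector.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def normalize(v):
    """Scale non-negative values so they sum to 1."""
    total = sum(v)
    return [x / total for x in v]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores for 4 credibility criteria.
reference = [1.0, 1.0, 1.0, 1.0]   # reference article meeting all criteria
document = [0.9, 0.8, 0.1, 0.2]    # made-up classifier output for one page

cos = cosine_similarity(document, reference)
kl = kl_divergence(normalize(document), normalize(reference))
print(cos, kl)
```

Note the opposite orientation of the two measures: a higher cosine similarity means closer to the reference, while a lower divergence does, so a divergence has to be sorted in ascending order (or negated) before it is fused with other rankings.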

upv_fuse_7¶

Run ID: upv_fuse_7
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: 28fb5d99694866f814a582a1e2f74c58
Run description: A fusion of the results from Bio SBERT cosine similarity, BM25, and the credibility cosine similarity from the large RoBERTa credibility classifier.

upv_fuse_8¶

Run ID: upv_fuse_8
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: 2c59c03416503b4b560c827e07fbe88c
Run description: A fusion of the results from Bio SBERT cosine similarity, BM25, and the credibility Kullback-Leibler divergence similarity from the large RoBERTa credibility classifier.

upv_fuse_9¶

Run ID: upv_fuse_9
Participant: UPV
Track: Health Misinformation
Year: 2021
Submission: 8/31/2021
Type: auto
Task: main
MD5: e47f4f91564c53a73b163f49c0ef1099
Run description: A fusion of the results from Bio SBERT cosine similarity, BM25, and the credibility cosine similarity from the base RoBERTa credibility classifier.

use_rob_cred¶

Run ID: use_rob_cred
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 92626b95faf276cf7dcc5c44882ed552
Run description: Run 4: This automatic run was created using a rank fusion based on RRF of three models: i) usefulness, created using a combined BoW model with three transformer-based language models trained on the MS MARCO corpus, ii) supportiveness, created using a RoBERTa large model fine-tuned on the FEVER+SciFact corpus, and iii) credibility, created using a random forest model trained on the Microsoft Credibility dataset combined with a list of credible sites.

use_sup_cred¶

Run ID: use_sup_cred
Participant: DigiLab
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 804880792d09d7d31bb583aae6778055
Run description: Run 6: This automatic run was created using a rank fusion based on RRF of three models: i) usefulness, created using a combined BoW model with three transformer-based language models trained on the MS MARCO corpus, ii) supportiveness, created using a combined rank of three transformer-based models fine-tuned on the FEVER+SciFact corpus, and iii) credibility, created using a random forest model trained on the Microsoft Credibility dataset combined with a list of credible sites.

vera0¶

Run ID: vera0
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: e39b7425e9b5756ebf2d51a4bee8c69e
Run description: Pyserini's Default BM25. Vera - label prediction only

vera_mdt5_0.5¶

Run ID: vera_mdt5_0.5
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: d1bbd06f09411b658ff0fd285aef70a1
Run description: Pyserini's Default BM25. Linear combination of mono-duo-T5 with Vera (duo1, 0.5)

vera_mdt5_0.95¶

Run ID: vera_mdt5_0.95
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: 582abbf4859c8a7402d873d59826feff
Run description: Pyserini's Default BM25. Linear combination of mono-duo-T5 with Vera (duo1, 0.95)
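
A linear combination of two score sets, as named in the vera_* runs, might look like the sketch below. How the value in parentheses (0.5 or 0.95) is applied is not stated in the descriptions; treating it as the weight on the T5 reranker score, with the remainder on the Vera score, is purely an assumption, as are the function name and the toy scores.

```python
def linear_combination(rerank_scores, vera_scores, weight):
    """Interpolate two {doc_id: score} maps over their shared documents.

    `weight` is applied to rerank_scores and (1 - weight) to vera_scores;
    this assignment is an assumption, not taken from the run description.
    """
    shared = rerank_scores.keys() & vera_scores.keys()
    fused = {d: weight * rerank_scores[d] + (1 - weight) * vera_scores[d]
             for d in shared}
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical, already comparable scores for two documents.
t5_scores = {"a": 0.9, "b": 0.4}
vera_scores = {"a": 0.1, "b": 0.8}

balanced = linear_combination(t5_scores, vera_scores, weight=0.5)
t5_heavy = linear_combination(t5_scores, vera_scores, weight=0.95)
print(balanced, t5_heavy)
```

With these toy numbers the two weights produce opposite orderings, which shows how strongly the interpolation weight can change a run.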

vera_mt5_0.5¶

Run ID: vera_mt5_0.5
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: a0d43e3e208ef172467d0dd17f2e4427
Run description: Pyserini's Default BM25. Linear combination of mono-T5 with Vera (mono, 0.5)

vera_mt5_0.95¶

Run ID: vera_mt5_0.95
Participant: h2oloo
Track: Health Misinformation
Year: 2021
Submission: 9/3/2021
Type: manual
Task: main
MD5: 175f51667279d9ce992891471064ed1d
Run description: Pyserini's Default BM25. Linear combination of mono-T5 with Vera (mono, 0.95)

watbm25¶

Run ID: watbm25
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 51254f49ca93354638c5d30a48ba283e
Run description: BM25, no relevance feedback.

watbm25p¶

Run ID: watbm25p
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 59a878796b44f8ceddce89bd1489129b
Run description: BM25, additional search term "pubmed" added; no relevance feedback.

watgoog¶

Run ID: watgoog
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 2cd6a92a74bf8f3307f159603f9ea32d
Run description: Logistic regression, top Google hits as training docs.

watgoogp¶

Run ID: watgoogp
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 6dc2d9bfb3bd25c7fbb8e93350113ee2
Run description: Logistic regression, top Google hits as training docs. "Pubmed" as additional search term.

watmed¶

Run ID: watmed
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: a056990f607e319dbf9c3d06bb783d1b
Run description: Logistic regression, top medline BM25 hits as training docs.

watrrfall¶

Run ID: watrrfall
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 65d2a9b32c0133628d6b1df6def1359a
Run description: Reciprocal rank fusion of everything

watrrfg¶

Run ID: watrrfg
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 086a32cdd9118e564450dd0c1ead4a80
Run description: Reciprocal rank fusion of two Google-seeded classifiers

watrrfm¶

Run ID: watrrfm
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: b5efe126d6978c91447ddb0a52f1b557
Run description: Fusion of watbm25 and watmed

watrrfnp¶

Run ID: watrrfnp
Participant: Waterloo_Cormack
Track: Health Misinformation
Year: 2021
Submission: 8/30/2021
Type: auto
Task: main
MD5: 5c9346ca48482140958581e584e4b506
Run description: Reciprocal rank fusion of runs, minus pubmed seed.

WatSAE-BM25¶

Run ID: WatSAE-BM25
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 7eb278d49af0331957d5275f4eebe8fe
Run description: We use the Common Crawl host graph to find domains the (HONCode + handpicked) domains have linked to. We perform PageRank on this subset of the host graph and take the top 10000 hosts. We filter c4/no.clean for these hosts' documents. We filter these documents using a medical text classifier. With this collection we run BM25 with Anserini.
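
The host selection step described here (PageRank over a host graph, then keeping the top-ranked hosts) can be sketched with a small power-iteration implementation. The damping factor, iteration count, function names, and the three-host toy graph are assumptions; the run used a subset of the Common Crawl host graph and kept 10000 hosts.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a {host: [linked hosts]} graph."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {h: 1.0 / n for h in nodes}
    for _ in range(iterations):
        new_rank = {h: (1.0 - damping) / n for h in nodes}
        for h in nodes:
            targets = links.get(h, [])
            if targets:
                share = damping * rank[h] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Hosts without outgoing links spread their rank uniformly.
                share = damping * rank[h] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


def top_hosts(links, k):
    """Return the k hosts with the highest PageRank."""
    rank = pagerank(links)
    return sorted(rank, key=lambda h: (-rank[h], h))[:k]


# Hypothetical host graph: two hosts link to hub.org, which links back to one.
toy_links = {"a.org": ["hub.org"], "b.org": ["hub.org"], "hub.org": ["a.org"]}
selected = top_hosts(toy_links, k=2)
print(selected)
```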

WatSAE-BM25-RR¶

Run ID: WatSAE-BM25-RR
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: a8e14e7afe999b9c93fc62522ef0f939
Run description: We use the Common Crawl host graph to find domains the (HONCode + handpicked) domains have linked to. We perform PageRank on this subset of the host graph and take the top 10000 hosts. We filter c4/no.clean for these hosts' documents. We filter these documents using a medical text classifier. With this collection we run BM25 with Anserini. We rank with query independent features including pagerank ratio of medical pages on the host, and manual url features.

WatSAE-BM25RM3¶

Run ID: WatSAE-BM25RM3
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 058194e6e64fa9eb30af4b5a5bbd6254
Run description: We use the Common Crawl host graph to find domains the (HONCode + handpicked) domains have linked to. We perform PageRank on this subset of the host graph and take the top 10000 hosts. We filter c4/no.clean for these hosts' documents. We filter these documents using a medical text classifier. With this collection we run BM25 plus RM3 with Anserini.

WatSAM-BM25¶

Run ID: WatSAM-BM25
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: auto
Task: main
MD5: b0784fd5e374d5e7d4c476249582a8a4
Run description: Anserini's BM25 on a filtered collection of (HONCode + handpicked) domains. This collection only includes documents from domains that have an HONcode certification (see www.hon.ch for more details) or that are part of a small list of handpicked health-related websites (e.g. health.harvard.edu).

WatSMC-CAL¶

Run ID: WatSMC-CAL
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: ac633b766fc6c488e07fac826c784688
Run description: Documents are scored using a Continuous Active Learning (Logistic Regression) model trained with two rounds of judging: 10 minutes per topic on filtered HONCode collection and 5 minutes per topic on HONCode+10kBM25 collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed.

WatSMC-CALQA100¶

Run ID: WatSMC-CALQA100
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: 623c72dda2c396688beaa38deada9edb
Run description: Paragraphs are scored using a Continuous Active Learning (Logistic Regression) model trained with two rounds of judging: 10 minutes per topic on filtered HONCode collection and 5 minutes per topic on HONCode+10kBM25 collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. Top 100 scoring paragraphs are reranked based on RoBERTa, fine tuned on BoolQ dataset, with the paragraph as context and topic's description as the yes/no question. Reranking is done to match the topic's stance field.

WatSMC-CALQAAll¶

Run ID: WatSMC-CALQAAll
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: 8a60fd145feedcfcc95ed6846bcf228f
Run description: Paragraphs are scored using a Continuous Active Learning (Logistic Regression) model trained with two rounds of judging: 10 minutes per topic on filtered HONCode collection and 5 minutes per topic on HONCode+10kBM25 collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. All 1000 paragraphs are reranked based on RoBERTa, fine tuned on BoolQ dataset, with the paragraph as context and topic's description as the yes/no question. Reranking is done to match the topic's stance field.

WatSMC-CALQAHC1¶

Run ID: WatSMC-CALQAHC1
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: a4b41b7fa5e195e7f55e3c9740cc1053
Run description: Paragraphs are scored using a Continuous Active Learning (Logistic Regression) model trained with two rounds of judging: 10 minutes per topic on filtered HONCode collection and 5 minutes per topic on HONCode+10kBM25 collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. All 1000 paragraphs are reranked based on RoBERTa, fine tuned on BoolQ dataset, with the paragraph as context and topic's description as the yes/no question. Reranking is done based on the topic's stance field and harmonic centrality.

WatSMC-CALQAHC2¶

Run ID: WatSMC-CALQAHC2
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: 5745a60f7e1caa62e8a453a9d4568990
Run description: Paragraphs are scored using a Continuous Active Learning (Logistic Regression) model trained with two rounds of judging: 10 minutes per topic on filtered HONCode collection and 5 minutes per topic on HONCode+10kBM25 collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. All 1000 paragraphs are reranked based on RoBERTa, fine tuned on BoolQ dataset, with the paragraph as context and topic's description as the yes/no question. Aggressive reranking is done based on the topic's stance field and harmonic centrality.

WatSMC-Correct¶

Run ID: WatSMC-Correct
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: 1addced4013566793af13e9792c8c3bc
Run description: Correct documents are manually found using search and continuous active learning. Correct documents are placed first, followed by documents returned by a continuous active learning model trained using correct judgments only.

WatSMM-CAL¶

Run ID: WatSMM-CAL
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: b54cd092a78c834843be7b9909b79f67
Run description: Documents are scored using a Continuous Active Learning (Logistic Regression) model trained with one round of judging (maximum of 10 minutes per topic). Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. Used filtered collection (HONCode + handpicked) domains.

WatSMM-CALHC¶

Run ID: WatSMM-CALHC
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: 25c853b2b65abd0bba443c572a70f731
Run description: Documents are scored using a Continuous Active Learning (Logistic Regression) model trained with one round of judging (maximum of 10 minutes per topic). Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. Used filtered collection (HONCode + handpicked) domains. Top 50 documents are reranked based on model score and normalized harmonic centrality score.

WatSMM-CALPR¶

Run ID: WatSMM-CALPR
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: d1fac02651e334d03c5d6d0cbaf0c014
Run description: Documents are scored using a Continuous Active Learning (Logistic Regression) model trained with one round of judging (maximum of 10 minutes per topic). Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. Used filtered collection (HONCode + handpicked) domains. Top 50 documents are reranked based on model score and normalized pagerank score.

WatSMM-CALQA100¶

Run ID: WatSMM-CALQA100
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: f15e5aa3d45ceb4f42b773bed5cef971
Run description: Paragraphs are scored using a Continuous Active Learning (Logistic Regression) model trained with one round of judging: 10 minutes per topic on filtered HONCode collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. Top 100 scoring paragraphs are reranked based on RoBERTa, fine tuned on BoolQ dataset, with the paragraph as context and topic's description as the yes/no question. Reranking is done to match the topic's stance field.

WatSMM-CALQAAll¶

Run ID: WatSMM-CALQAAll
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: c2de0fff7c7f1bce74a14b9cbc2fcae9
Run description: Paragraphs are scored using a Continuous Active Learning (Logistic Regression) model trained with one round of judging: 10 minutes per topic on filtered HONCode collection. Training was focused on usefulness only. Each topic was initialized with the query as seed judgment. Interactive Search and Judging was allowed. All 1000 paragraphs are reranked based on RoBERTa, fine tuned on BoolQ dataset, with the paragraph as context and topic's description as the yes/no question. Reranking is done to match the topic's stance field.

WatSMM-Fused¶

Run ID: WatSMM-Fused
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/1/2021
Type: manual
Task: main
MD5: 2e260a14432a0feb7b05fbf33ff40905
Run description: Reciprocal rank fusion on runs WatSMM-CAL, WatSMM-CALHC, and WatSMM-CALPR.

WatSMT-SD-S1¶

Run ID: WatSMT-SD-S1
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: manual
Task: main
MD5: ede6c9947300b70ca3820e82b1eacda7
Run description: We fine-tune the T5-large model on a balanced subset of 2019 qrels to predict the stance of each document ("helpful" or "unhelpful"). We apply the stance detection model to re-rank the top 3k results from the BM25 baseline. To combine the BM25 scores and the stance prediction, we use the following fusion strategy: BM25_score * e^(probability - 0.5), where probability = helpful_probability if topic.stance == "helpful" else unhelpful_probability.
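
The fusion strategy stated in this description translates directly into a small function. The function and argument names are illustrative, and the probabilities in the example are made up; only the formula itself comes from the description.

```python
import math


def s1_fused_score(bm25_score, helpful_probability, unhelpful_probability,
                   topic_stance):
    """BM25_score * e^(probability - 0.5), as given in the run description."""
    if topic_stance == "helpful":
        probability = helpful_probability
    else:
        probability = unhelpful_probability
    return bm25_score * math.exp(probability - 0.5)


# Hypothetical document: BM25 score 10, predicted helpful with probability 0.9.
agrees = s1_fused_score(10.0, 0.9, 0.1, topic_stance="helpful")
disagrees = s1_fused_score(10.0, 0.9, 0.1, topic_stance="unhelpful")
print(agrees, disagrees)
```

Since the exponent lies between -0.5 and 0.5, the multiplier stays between roughly 0.61 and 1.65, so this strategy nudges the BM25 order rather than overriding it.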

WatSMT-SD-S2¶

Run ID: WatSMT-SD-S2
Participant: UWaterlooMDS
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: manual
Task: main
MD5: fafd396c8dc2c75c2729b92643176137
Run description: We fine-tune the T5-large model on a balanced subset of 2019 qrels to predict the stance of each document ("helpful" or "unhelpful"). We apply the stance detection model to re-rank the top 3k results from the BM25 baseline. To combine the BM25 scores and the stance prediction, we use the following fusion strategy: if probability > 0.75 then BM25_score * 10, else if probability < 0.25 then BM25_score * -1, where probability = helpful_probability if topic.stance == "helpful" else unhelpful_probability.
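
The threshold rule in this description can be sketched as below. The description does not say what happens when the probability lies between 0.25 and 0.75; leaving the BM25 score unchanged in that range is an assumption, as are the function and argument names.

```python
def s2_fused_score(bm25_score, helpful_probability, unhelpful_probability,
                   topic_stance):
    """Boost confident agreement, invert confident disagreement."""
    if topic_stance == "helpful":
        probability = helpful_probability
    else:
        probability = unhelpful_probability
    if probability > 0.75:
        return bm25_score * 10
    if probability < 0.25:
        return bm25_score * -1
    # Assumption: the middle range is not specified, so keep the BM25 score.
    return bm25_score


boosted = s2_fused_score(8.0, 0.9, 0.1, topic_stance="helpful")
demoted = s2_fused_score(8.0, 0.9, 0.1, topic_stance="unhelpful")
unchanged = s2_fused_score(8.0, 0.5, 0.5, topic_stance="helpful")
print(boosted, demoted, unchanged)
```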

webis-bm25¶

Run ID: webis-bm25
Participant: Webis
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: e3bf2564d6e39ba733443a2eda5fbafe
Run description: We retrieve the top 1000 documents with Anserini using BM25 (k1=0.9 and b=0.4), processing documents and queries with the porter stemmer and stopword removal.
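
To show where the parameters k1=0.9 and b=0.4 enter the ranking function, here is a minimal sketch of a textbook BM25 scorer over pre-tokenized documents. It is not Anserini's implementation (Lucene's scoring differs in details), it omits the Porter stemming and stopword removal mentioned in the description, and the tiny corpus is invented.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=0.9, b=0.4):
    """Score each tokenized document in `docs` against the query terms."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # b controls length normalization, k1 controls tf saturation.
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical three-document corpus.
corpus = [
    ["vitamin", "c", "cold"],
    ["exercise", "heart"],
    ["vitamin", "d", "bone", "health"],
]
scores = bm25_scores(["vitamin", "c"], corpus)
print(scores)
```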

webis-bm25-ax1¶

Run ID: webis-bm25-ax1
Participant: Webis
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 122205f6500824419eef86506da4f7db
Run description: We re-rank top 20 documents initially retrieved with Anserini BM25 (k1=0.9 and b=0.4) using three argumentative axioms, where the axiom weights are chosen such that either one or two axioms decide to swap document positions.

webis-bm25-ax3¶

Run ID: webis-bm25-ax3
Participant: Webis
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: e29c63a84bf6de4dd3deacfe46886b05
Run description: We re-rank top 20 documents initially retrieved with Anserini BM25 (k1=0.9 and b=0.4) using three argumentative axioms, where the axiom weights are chosen such that all three axioms decide to swap document positions.

webis-t5¶

Run ID: webis-t5
Participant: Webis
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: d0b50189e46970bfb73a752f01fd86ce
Run description: We rerank the top 100 results of webis-bm25 with the MonoT5 model "castorini/monot5-base-msmarco" available on Hugging Face with PyGaggle.

webis-t5-ax1¶

Run ID: webis-t5-ax1
Participant: Webis
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: 2850d40b78be05232d2d726ccec7aa96
Run description: We re-rank top 20 documents of the webis-t5.txt run using three argumentative axioms, where the axiom weights are chosen such that either one or two axioms decide to swap document positions.

webis-t5-ax3¶

Run ID: webis-t5-ax3
Participant: Webis
Track: Health Misinformation
Year: 2021
Submission: 9/2/2021
Type: auto
Task: main
MD5: d4330120b5164639297c617735241979
Run description: We re-rank top 20 documents of the webis-t5.txt run using three argumentative axioms, where the axiom weights are chosen such that all three axioms decide to swap document positions.