Runs - Round 3 2020

active_learning

  • Run ID: active_learning
  • Participant: risklick
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: d97154b5f0bccf440b14a9db2e553563
  • Run description: Manually judging retrieved publications returned by a basic IR model.

BioInfo-run1

  • Run ID: BioInfo-run1
  • Participant: BioinformaticsUA
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 8ec1fcf53f9db66cbabec5bbe25b1989
  • Run description: This run uses the open baseline of the UIowaS Team and applies a neural ranking model [1] to batches of 10 documents sequentially over the original ranking order. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77.

BioInfo-run2

  • Run ID: BioInfo-run2
  • Participant: BioinformaticsUA
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: e5158ce38bfc5530aca5c6e02d27e198
  • Run description: This run uses the Anserini relevance feedback baseline [2] and applies a neural ranking model [1] to batches of 50 documents sequentially over the original ranking order. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77. [2] https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

BioInfo-run3

  • Run ID: BioInfo-run3
  • Participant: BioinformaticsUA
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: ed6ac4a5a451b60d2b2e98ca67826802
  • Run description: This run uses reciprocal rank fusion (RRF) of runs 1 and 2; a minimal sketch of the fusion follows this entry.
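
For reference, a minimal Python sketch of reciprocal rank fusion as used in this and several later runs, assuming each run is simply a list of document IDs in rank order; the doc IDs below are hypothetical, and k=60 is a commonly used default rather than necessarily this team's setting:

    # Reciprocal rank fusion: each document earns 1/(k + rank) per run.
    def rrf(runs, k=60):
        scores = {}
        for run in runs:
            for rank, doc_id in enumerate(run, start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)  # best first

    run1 = ["doc_a", "doc_b", "doc_c"]  # hypothetical ranking from BioInfo-run1
    run2 = ["doc_a", "doc_c", "doc_d"]  # hypothetical ranking from BioInfo-run2
    print(rrf([run1, run2]))            # ['doc_a', 'doc_c', 'doc_b', 'doc_d']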

BITEM_AX0

  • Run ID: BITEM_AX0
  • Participant: BITEM
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 58b9c98446dde04b18c7503185dccf20
  • Run description: Automatic run based on the search results from BITEM_BL (ElasticSearch query based on the three fields query+question+narrative, normalization, token boosting). In this run, we tried to prioritize the documents according to the axes identified in the COVoc terminology.

BITEM_BL

  • Run ID: BITEM_BL
  • Participant: BITEM
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: c2a6597abdbde48827106e1d402381ba
  • Run description: Baseline run (same as previous rounds)

CincyMedIR-1

  • Run ID: CincyMedIR-1
  • Participant: CincyMedIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: a9eff5e749b8db2eb0a8f427015e1b66
  • Run description: Query expanded using synonyms from Lexigram's API and searched against the title, abstract, and MetaMap terms from the database; Elasticsearch was used as the search engine, and documents from previous qrels files were removed.

CincyMedIR-12

  • Run ID: CincyMedIR-12
  • Participant: CincyMedIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: b605b3548453e44b82f0311af281aecc
  • Run description: Query expanded using synonyms from Lexigram's API; MetaMap IDs were concatenated to the query string, and the search was done against the title, abstract, metamap_scaled_title, and metamap_scaled_abstract fields; Elasticsearch was used as the search engine, and documents from previous qrels were removed.

CincyMedIR-9

  • Run ID: CincyMedIR-9
  • Participant: CincyMedIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 68dc9de9f0095041d026eb7f22fae82b
  • Run description: Query expanded using synonyms from Lexigram's API; MetaMap IDs were concatenated to the query string, and the search was done against the title, abstract, metamap_scaled_title, and metamap_scaled_abstract fields; Elasticsearch was used as the search engine, and documents from previous qrels were removed.

combined

  • Run ID: combined
  • Participant: risklick
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 6f87a6bf35d77b05683cc862dfc013df
  • Run description: A combination of basic IR models such as BM25 and DFR on topics expanded with COVID-related ontologies.

CORD-19-LTR

  • Run ID: CORD-19-LTR
  • Participant: LTR_ESB_TEAM
  • Track: Round 3
  • Year: 2020
  • Submission: 6/1/2020
  • Type: automatic
  • MD5: 06fa088d26a569e66c6eb50c4b1b7a4c
  • Run description: Learning to rank using custom-made features.

cord19.vespa.ai-bm25

  • Run ID: cord19.vespa.ai-bm25
  • Participant: cord19.vespa.ai
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: fd9aa8fefd4cbbba201dce9a8b868d59
  • Run description: Retrieval against cord19.vespa.ai using entities extracted from the topic's query, question, and narrative. Results are ranked by bm25(title) + bm25(abstract) + bm25(body_text) + bm25(abstract_t5), where abstract_t5 is a T5 summary of the abstract. See https://github.com/vespa-engine/cord-19/tree/master/trec-covid

cord19.vespa.ai-gb-1

  • Run ID: cord19.vespa.ai-gb-1
  • Participant: cord19.vespa.ai
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: c37709ac3af59ac8ed690899ace1d090
  • Run description: Retrieval using entities extracted from the query, question, and narrative. First-phase ranking uses bm25(title) + bm25(abstract); the top 1000 hits from the first phase are re-ranked using a GBDT model (LightGBM with a lambdarank objective). The model is trained on judgements from rounds 1 and 2, using all topics except those in [1,3,5,7,10,15,18,29,25,32]. https://github.com/vespa-engine/cord-19/tree/master/trec-covid

cord19.vespa.ai-gb-2

  • Run ID: cord19.vespa.ai-gb-2
  • Participant: cord19.vespa.ai
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 06db85fb1f0824cde1c229deb1847d14
  • Run description: Retrieval using entities extracted from the query, question, and narrative. First-phase ranking uses bm25(title) + bm25(abstract); the top 1000 hits from the first phase are re-ranked using a GBDT model (LightGBM with a lambdarank objective; a minimal training sketch follows this entry). The model is trained on all available judgements from round 2 and round 3. https://github.com/vespa-engine/cord-19/tree/master/trec-covid
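
As a sketch of this kind of re-ranking step via LightGBM's scikit-learn API; the features, labels, and group sizes below are hypothetical stand-ins for the run's BM25 features and round judgements:

    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X_train = rng.random((200, 4))     # e.g. bm25(title), bm25(abstract), ...
    y_train = rng.integers(0, 3, 200)  # graded relevance labels from judgements
    groups = [100, 100]                # docs per topic for two training topics

    ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
    ranker.fit(X_train, y_train, group=groups)

    X_hits = rng.random((1000, 4))               # top-1000 first-phase hits
    order = np.argsort(-ranker.predict(X_hits))  # re-ranked indices, best first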

covidex.r3.duot5

  • Run ID: covidex.r3.duot5
  • Participant: covidex
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: ea38343d005b95b757131b11f539ca68
  • Run description: Pairwise reranker applied to the top-50 documents from the run covidex.r3.monot5.

covidex.r3.monot5

  • Run ID: covidex.r3.monot5
  • Participant: covidex
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 0bc7f366949f078df81bc992443a734b
  • Run description: Reciprocal rank fusion of two runs: 1) Anserini r3.fusion1 reranked with medT5-3B; 2) Anserini r3.fusion2 reranked with medT5-3B. Our reranker (medT5-3B) is a T5-3B reranker fine-tuned on MS MARCO and then fine-tuned again on the MS MARCO medical subset (from MacAvaney et al., 2020, "SLEDGE"). You can find the Anserini fusion baselines for round 3 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

covidex.r3.t5_lr

  • Run ID: covidex.r3.t5_lr
  • Participant: covidex
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 1ac91680feed06b8e3691cd0ff718fbe
  • Run description: Interpolation (alpha=0.5) of covidex.r3.monot5 scores and scores from a logistic regression classifier trained on the round 1 & 2 qrels with TF-IDF features as input; a minimal sketch follows this entry.
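
A minimal sketch of this interpolation step, assuming both score sources are on comparable scales; the texts, labels, and monoT5 scores below are hypothetical:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    docs = ["droplet transmission of covid", "unrelated economics paper",
            "airborne spread of the virus", "stock market trends"]
    qrels = [1, 0, 1, 0]               # hypothetical round 1 & 2 judgments

    vec = TfidfVectorizer()
    clf = LogisticRegression().fit(vec.fit_transform(docs), qrels)

    monot5 = [0.91, 0.10, 0.85, 0.05]  # hypothetical monoT5 reranker scores
    lr = clf.predict_proba(vec.transform(docs))[:, 1]

    alpha = 0.5
    final = [alpha * m + (1 - alpha) * p for m, p in zip(monot5, lr)]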

crowd

  • Run ID: crowd
  • Participant: VATech
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 36f256ec136459475784a69889448982
  • Run description: crowd relevance feedback

crowdPRF

  • Run ID: crowdPRF
  • Participant: VATech
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 3c3a54b7410c3a7b01a25f2be7363d3e
  • Run description: pseudo-relevance feedback using the run crowd as the first-round search

CSIROmedFusion

  • Run ID: CSIROmedFusion
  • Participant: CSIROmed
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 0a511d9312175b0b35b1ef642a3d0839
  • Run description: Score fusion of two other runs from this round.

CSIROmedNIR

  • Run ID: CSIROmedNIR
  • Participant: CSIROmed
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 27def6067d827eb333aa1b9511d465f4
  • Run description: Neural index with cosine-similarity retrieval over the question, narrative, and query, using mean sentence embeddings over the abstract and title fields, plus a BM25 score over the full text.

CSIROmedNIRR

  • Run ID: CSIROmedNIRR
  • Participant: CSIROmed
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 64550bfe62e2a49fb26a7a8b15c958eb
  • Run description: Neural index with cosine-similarity retrieval over the question, narrative, and query for the abstract and title fields, plus a BM25 score over the full text. The run is re-ranked using the scores of the top 3 sentences from each abstract.

Emory_IRLab_rnd3_r1

  • Run ID: Emory_IRLab_rnd3_r1
  • Participant: Emory_IRLab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: a5106ccac915657ffb124fd100806ad3
  • Run description: This run is based on a MART re-ranker trained on several self-constructed features, including BM25 matching scores calculated on different fields of the article (title, abstract, paragraph, and anchor text), the relevance probability from a SciBERT model fine-tuned on the round 1 and 2 qrels, as well as date and citation features.

Emory_IRLab_rnd3_r2

  • Run ID: Emory_IRLab_rnd3_r2
  • Participant: Emory_IRLab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 9f97144e650bbdbd2a7b5886a5ad7256
  • Run description: This run is based on a MART re-ranker trained on several self-constructed features, including BM25 matching scores calculated on different fields of the article (title, abstract, paragraph, and anchor text), the relevance probability from a SciBERT model fine-tuned on the round 1 and 2 qrels, as well as date and citation features. Same strategy as Emory_IRLab_rnd3_r1 with different duplicate removal.

factum-sparse

  • Run ID: factum-sparse
  • Participant: Factum
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 929fc7a778f1a97f9dc2863f2accbb29
  • Run description: Sparse retrieval (BM25) with fusion of ranks from document, abstract, and paragraph indexes. We used unigrams, with lemmatization and stopword removal.

factum-sparse-rerank

  • Run ID: factum-sparse-rerank
  • Participant: Factum
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: f2decf5f52baf285c7dc47d10e0502c3
  • Run description: Sparse retrieval (BM25) with fusion of scores from document, abstract, and paragraph indexes. We used unigrams, with lemmatization and stopword removal. The top 100 documents are re-ranked using a BERT-base scoring model fine-tuned on MS MARCO. The final score of every document is given by its highest-scoring paragraph under the ranking model; a minimal sketch of this aggregation follows this entry.
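
A minimal sketch of this final aggregation, where each document is scored by its best paragraph; the paragraph scores below are hypothetical:

    from collections import defaultdict

    para_scores = [("doc1", 0.21), ("doc1", 0.84), ("doc2", 0.55)]  # (doc, BERT score)
    doc_score = defaultdict(lambda: float("-inf"))
    for doc, s in para_scores:
        doc_score[doc] = max(doc_score[doc], s)   # keep highest-scoring paragraph

    ranking = sorted(doc_score, key=doc_score.get, reverse=True)  # ['doc1', 'doc2']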

fusionoffusion

  • Run ID: fusionoffusion
  • Participant: IRLabKU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 2cba11f73877b63d3dc94cde845a8992
  • Run description: We create 3 indexes: 1) title+abstract, 2) full text, and 3) title+abstract+paragraph (a document is split into para1, para2, ..., paraK, and we create K+1 documents of the form title+abstract, title+abstract+para1, ..., title+abstract+paraK). We create 4 query types: 1) query + named entities(question), 2) query + named entities(question) + named entities(narrative), 3) question + named entities(query), and 4) question + named entities(query) + named entities(narrative). For each query type we create 4 runs (default BM25 + default RM3): 1) searching the title+abstract index, 2) searching the full-text index, 3) searching the title+abstract+paragraph index while keeping only the first occurrence of a document and deleting subsequent duplicates, and 4) searching the title+abstract+paragraph index without removing duplicates; reciprocal rank fusion then combines each query type's 4 runs into 1. This yields 4 fusion runs, which we combine with a final reciprocal rank fusion (a fusion of fusions).

fusionofruns

  • Run ID: fusionofruns
  • Participant: IRLabKU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 9f5c6c47c013c20ee76b2447c137701e
  • Run description: We create 3 indexes: 1) title+abstract, 2) full text, and 3) title+abstract+paragraph (a document is split into para1, para2, ..., paraK, and we create K+1 documents of the form title+abstract, title+abstract+para1, ..., title+abstract+paraK). We create 4 query types: 1) query + named entities(question), 2) query + named entities(question) + named entities(narrative), 3) question + named entities(query), and 4) question + named entities(query) + named entities(narrative). For each query type we create 4 runs (default BM25 + default RM3): 1) searching the title+abstract index, 2) searching the full-text index, 3) searching the title+abstract+paragraph index while keeping only the first occurrence of a document and deleting subsequent duplicates, and 4) searching the title+abstract+paragraph index without removing duplicates. Finally, we combine these runs into 1 run using reciprocal rank fusion.

jlbase-QE-rnd3

  • Run ID: jlbase-QE-rnd3
  • Participant: julielab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 9ff12458bca4b3c6327b5daf73b2d753
  • Run description: In most queries the token "coronavirus" is present. However, coronaviridae are a family of viruses that is not limited to SARS-CoV-2, so many false positives are likely to be found. The same holds true for other terms, such as "animal model": this term does not occur often, as most researchers specify the animal they used as the model organism (such as mice or rats). We therefore created a list of synonyms to specify these general terms. We found that nouns are mostly the terms carrying the most information, so we used a part-of-speech tagger to isolate nouns and filtered them with a manually curated blacklist as well as a general stop word list. Our final query consists of four parts: several synonyms and spellings for the illness and the virus respectively, the query, the question, and finally a bag of words containing the filtered nouns from both the question and the query. We had planned to include the filtered nouns of the narrative as an optional part of the query, but evaluation on the previous round showed no benefit.

jlbasernd3

  • Run ID: jlbasernd3
  • Participant: julielab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: e5258b778fe007b777271aca8a80640a
  • Run description: Elasticsearch with default BM25 settings. The stop-word-filtered query is a mandatory clause; the stop-word-filtered question and narrative are optional clauses.

jlbasernd3-jlQErnd3

  • Run ID: jlbasernd3-jlQErnd3
  • Participant: julielab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 9748598ed073105a170fb021ebb463ad
  • Run description: Reciprocal Rank Fusion between jlbasernd3 and jlbase-QE-rnd3.

l2r

  • Run ID: l2r
  • Participant: risklick
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 580411c0b8e38c6e85cef28eed129f29
  • Run description: Learning to rank on a retrieval set from basic IR models, enriched with COVID-related ontologies.

mpiid5_run1

  • Run ID: mpiid5_run1
  • Participant: mpiid5
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 39b83780d80123cf878f5ea582da9259
  • Run description: Fusion of mpiid5_run2 and mpiid5_run3 using Reciprocal Rank Fusion.

mpiid5_run2

  • Run ID: mpiid5_run2
  • Participant: mpiid5
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: ed13e79a286de6c4f75dbcbc24c1fbb9
  • Run description: We re-rank the top-10k documents returned by BM25 using the queries produced by UDel's method. For re-ranking, we use an ELECTRA-Base model fine-tuned on the MS MARCO passage dataset. With a more complex attention mechanism, the model is later fine-tuned on the TREC-COVID round 1 & 2 full-text collection. We use the question queries for re-ranking.

mpiid5_run3

  • Run ID: mpiid5_run3
  • Participant: mpiid5
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 5a0685078be654fe14860cd799242475
  • Run description: We re-rank the top-10k documents returned by BM25 using the queries produced by UDel's method. For re-ranking, we use an ELECTRA-Base model fine-tuned on the MS MARCO passage dataset. With a simple attention mechanism, the model is later fine-tuned on the TREC-COVID round 1 & 2 full-text collection. We use the question queries for re-ranking.

OHSU_BCB_round3-bcb

  • Run ID: OHSU_BCB_round3-bcb
  • Participant: OHSU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: ee3b37a51591868d715a890535a4787f
  • Run description: Search queries were generated using the query, question, and narrative fields and executed using Anserini on the full-text index. The scores for these runs were normalized, and a document's total score is the sum of its scores across these queries.

OHSU_Fusion

  • Run ID: OHSU_Fusion
  • Participant: OHSU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: c64cf6a3cf115a452ec4af3bf40c9846
  • Run description: Tokenized queries were generated using the query, question, and narrative fields for each topic. These were run against abstract, full-text, and paragraph indexes, and RRF was performed to combine the 3 runs. These runs were further combined using RRF with UIowa's Round 3 baseline runs, which used Borda merging and Terrier.

OHSU_Rerank

  • Run ID: OHSU_Rerank
  • Participant: OHSU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: ec4264646faa4c2f530c3ef68b3dd7b9
  • Run description: Tokenized combinations of the narrative, query, and question fields for each topic were input to an Anserini BM25 searcher over 3 indexes: abstract only, full text only, and paragraphs+abstracts. RRF was performed to combine these 3 runs for the top 2000 documents per topic. BioBERT trained on PubMed was used as a re-ranker.

poznan_p3_2

  • Run ID: poznan_p3_2
  • Participant: POZNAN
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 047809c1ccc1925832d1abdf9985b50b
  • Run description: Work in progress. We use an LSTM-based neural network to compare sentences; the score function is based on the five most similar sentences.

poznan_run_p3_1

  • Run ID: poznan_run_p3_1
  • Participant: POZNAN
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: c6d5f129d601bc3f69fd6a8cee145409
  • Run description: Baseline run for our research. Aside from preprocessing, it contains no further improvements over the baseline. Elasticsearch is used as the baseline indexing system.

poznan_run_p3_3

  • Run ID: poznan_run_p3_3
  • Participant: POZNAN
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 563171130659b4fd8055a9469884a159
  • Run description: This is a run that we would like to have validated. We do not expect great results from it, but it will let us disambiguate several properties of our model.

PRF

  • Run ID: PRF
  • Participant: VATech
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 3559e4ac877a4c2f6d64cf116c6533ed
  • Run description: pseudo-relevance feedback baseline

r3.fusion1

  • Run ID: r3.fusion1
  • Participant: anserini
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: c1caf63a9c3b02f0b12e233112fc79a6
  • Run description: Anserini fusion run corresponding to row 7 in table for Round 3 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

r3.fusion2

  • Run ID: r3.fusion2
  • Participant: anserini
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 12679197846ed77306ecb2ca7895b011
  • Run description: Anserini fusion run corresponding to row 8 in table for Round 3 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

r3.rf

  • Run ID: r3.rf
  • Participant: anserini
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 7192a08c5275b59d5ef18395917ff694
  • Run description: Anserini run with abstract index, UDel query generator, BM25+RM3 relevance feedback (100 feedback terms).

R3QQMETRA

  • Run ID: R3QQMETRA
  • Participant: IRIT_LSIS_FR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 4cdb3e8eeca5845b2b9add4c51cd695a
  • Run description: Documents were pre-processed to extract phrases using the Tetralogie system, and topics were likewise pre-processed for key-phrase extraction. Documents were then indexed with Indri, and search was done using Indri's BM25 search core. The query field was used in this run, and we manually added some key phrases related to the query topic.

R3QQTETRA

  • Run ID: R3QQTETRA
  • Participant: IRIT_LSIS_FR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 716397b86e4b4ca589ea813132e6e60d
  • Run description: Documents were pre-processed to extract phrases using the Tetralogie system, and topics were likewise pre-processed for key-phrase extraction. Documents were then indexed with Indri, and search was done using Indri's BM25 search core. The query and question fields were used in this run.

R3QTETRA

  • Run ID: R3QTETRA
  • Participant: IRIT_LSIS_FR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 42f5871727067928501ce52ddc55984e
  • Run description: Documents were pre-processed to extract phrases using the Tetralogie system, and topics were likewise pre-processed for key-phrase extraction. Documents were then indexed with Indri, and search was done using Indri's BM25 search core. The query field was used in this run.

ruir-q2narr4task4

  • Run ID: ruir-q2narr4task4
  • Participant: RUIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: cd04ef189f17607c053ca187665b0e7f
  • Run description: Manual classification of the TREC topics into the original 10 Kaggle tasks. RM3 query expansion where terms are weighted as 0.2 * query terms + 0.4 * narrative terms + 0.4 * task terms (a minimal sketch of this mixing follows this entry); task terms were found by TF-IDF matching of the Kaggle task descriptions against a corpus of paper abstracts. Parameters were found by coarsely tuning towards NDCG after filtering documents with unknown qrels (which may be more stable than bpref; Sakai 2007).
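
A minimal sketch of the 0.2/0.4/0.4 term mixing described above (not the full RM3 pipeline); tokenization is simplified and the texts are hypothetical:

    from collections import Counter

    def weighted_terms(query, narrative, task_terms, weights=(0.2, 0.4, 0.4)):
        mix = Counter()
        for w, text in zip(weights, (query, narrative, task_terms)):
            tokens = text.lower().split()
            for tok in tokens:
                mix[tok] += w / len(tokens)  # spread each field's weight over its tokens
        return mix.most_common()

    print(weighted_terms("coronavirus origin",
                         "seeking studies on where the virus originated",
                         "animal host spillover"))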

sab20.3.dfo

  • Run ID: sab20.3.dfo
  • Participant: sabir
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 99b06fe9b3bbbe671bdcfd9b13fd2f87
  • Run description: SMART vector DFO run. Base Lnu.ltu weights, with doc indexing = 0.5 * metadoc_Lnu_weighting + 0.7 * JSON_Lnu_weighting if a JSON doc exists (= straight Lnu weighting if only metadata info). Run the DFO algorithm (the runs are described in my TREC 2005 Routing track paper and later ones, e.g., the 2017 Core track), using relevance info on the rounds 1+2 collections to expand and optimize weights on that collection. This is a conservative run, expanding by only the top 15 terms.

sab20.3.metadocs_m

  • Run ID: sab20.3.metadocs_m
  • Participant: sabir
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 18017fdb55232a5d2ce5c46264991c9b
  • Run description: Standard SMART vector run based on Lnu docs, ltu query weighting. Doc indexing: if only metadata info exists for a docid, that is used with Lnu weights. Each JSON doc is assigned final indexing as 0.5 * Metadata_Lnu_vector + 0.7 * JSON_Lnu_vector. After inverted retrieval, the highest similarity for each cord_uid is used.

sab20.3.rocchio

  • Run ID: sab20.3.rocchio
  • Participant: sabir
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 6735120d48af75db541a1eafd0aaa907
  • Run description: SMART vector Rocchio feedback run. Base Lnu.ltu weights, with doc indexing = 0.5 * metadoc_Lnu_weighting + 0.7 * JSON_Lnu_weighting if a JSON doc exists (= straight Lnu weighting if only metadata info). Run the Rocchio algorithm, using relevance info on the rounds 1+2 collections to expand and optimize query weights on that collection. Query term weights = 4 * original query weight + 6 * average weight in relevant docs - 8 * average weight in non-relevant docs, with expansion to the top 50 terms (a minimal sketch follows this entry).
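
A minimal sketch of the stated Rocchio weighting (coefficients 4, 6, 8; expansion to the top 50 terms); the term vectors below are hypothetical dicts of term weights:

    def rocchio(query, rel_docs, nonrel_docs, a=4.0, b=6.0, c=8.0, top_k=50):
        terms = set(query)
        for d in rel_docs + nonrel_docs:
            terms |= set(d)
        avg = lambda docs, t: (sum(d.get(t, 0.0) for d in docs) / len(docs)
                               if docs else 0.0)
        w = {t: a * query.get(t, 0.0) + b * avg(rel_docs, t) - c * avg(nonrel_docs, t)
             for t in terms}
        # keep the top_k positively weighted terms
        return dict(sorted(((t, v) for t, v in w.items() if v > 0),
                           key=lambda kv: -kv[1])[:top_k])

    print(rocchio({"virus": 1.0}, [{"virus": 0.5, "vaccine": 0.4}], [{"stock": 0.7}]))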

SFDC-e23-f12-re-tf3

  • Run ID: SFDC-e23-f12-re-tf3
  • Participant: SFDC
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 1313a7932a29aca3458f5e7c7bea408c
  • Run description: Round 3 encoder combined with round 2 encoder: a fusion run of Linear(Siamese-BERT paragraph-level semantic retrieval, TF-IDF) and Anserini BM25. Function: RRF(Anserini-Rnd2-1 (reranked), Anserini-Rnd2-2 (reranked), 0.7 * semantic-23 + 0.3 * TFIDF-3).

SFDC-fus12-enc23-tf3

  • Run ID: SFDC-fus12-enc23-tf3
  • Participant: SFDC
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: ef4f214c83137fc58c5e787662efbc94
  • Run description: Round 3 encoder combined with round 2 encoder: a fusion run of Linear(Siamese-BERT paragraph-level semantic retrieval, TF-IDF) and Anserini BM25. Function: RRF(Anserini-Rnd2-1, Anserini-Rnd2-2, 0.7 * semantic-23 + 0.3 * TFIDF-3).

SFDC-fus12-enc3-tf3

  • Run ID: SFDC-fus12-enc3-tf3
  • Participant: SFDC
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 5ebe6e92916864eba16d42effc359313
  • Run description: A fusion run of Linear(Siamese-BERT paragraph-level semantic retrieval, TF-IDF) and Anserini BM25. Function: RRF(Anserini-Rnd2-1, Anserini-Rnd2-2, 0.7 * semantic-3 + 0.3 * TFIDF-3). A minimal sketch of the linear component follows this entry.
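
A minimal sketch of the linear component that forms the third RRF input; the scores below are hypothetical and assumed to be normalized to [0, 1] beforehand (the RRF step itself is as in the BioInfo-run3 sketch above):

    semantic = {"d1": 0.9, "d2": 0.4}   # Siamese-BERT paragraph retrieval scores
    tfidf = {"d1": 0.2, "d2": 0.8}      # TF-IDF scores

    combined = {d: 0.7 * semantic.get(d, 0.0) + 0.3 * tfidf.get(d, 0.0)
                for d in set(semantic) | set(tfidf)}
    run = sorted(combined, key=combined.get, reverse=True)  # ['d1', 'd2']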

sim_run

  • Run ID: sim_run
  • Participant: UH_UAQ
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 9985510578ec612bb72ab3c665cf8ea5
  • Run description: Encoding documents with SciBERT and applying a classic information retrieval model. Unfortunately, we could not run over the whole dataset, just a small subset.

sparse-dense-0.45

  • Run ID: sparse-dense-0.45
  • Participant: CIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 8449ef9ebab8eba4deb5f7908f033637
  • Run description: Fusion of (1) the Anserini BM25 fusion baseline (RRF(2,4,6)) and (2) ANN dense-embedding retrieval trained on Med-MARCO, with tuned fusion weights. The language model was fine-tuned with an MLM task on CORD-19 Rnd3 documents.

sparse-dense-SBrr-2

  • Run ID: sparse-dense-SBrr-2
  • Participant: CIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 9249459f60add67485e7e349b40d2a13
  • Run description: Fusion of (1) the Anserini BM25 fusion baseline (RRF(2,4,6)), (2) Med-MARCO ANN dense-embedding retrieval, and (3) a Med-MARCO SciBERT re-ranker (top 1000), with tuned fusion weights. The language model was fine-tuned with an MLM task on CORD-19 Rnd3 documents.

sparse-dense-SBrr-3

  • Run ID: sparse-dense-SBrr-3
  • Participant: CIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 4731e16b0847af830fb2f9c54f454e2b
  • Run description: Fusion of (1) the Anserini BM25 fusion baseline (RRF(2,4,6)), (2) Med-MARCO ANN dense-embedding retrieval, and (3) a Med-MARCO SciBERT re-ranker (top 1000), with equal fusion weights. The language model was fine-tuned with an MLM task on CORD-19 Rnd3 documents.

TF_IDF_Feedback

  • Run ID: TF_IDF_Feedback
  • Participant: UB_BW
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 410d8011b5283ab35164ac4323ed9402
  • Run description: We indexed only the title and the abstract using Terrier-v5.2. For the final document ranking, we deployed the TF-IDF term weighting model. During retrieval, we used both the query and the question tags, which were then expanded with terms from the relevant documents in the qrels file.

ucd_cs_r1

  • Run ID: ucd_cs_r1
  • Participant: UCD_CS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 1b92ebe42c5f4f4c73469c49c6f04353
  • Run description: BM25 baseline with parameter settings from a previous study.

ucd_cs_r2

  • Run ID: ucd_cs_r2
  • Participant: UCD_CS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: b3da47e26b386d65ba99e6c97de19932
  • Run description: BM25 baseline with parameter settings from a previous study, re-ranked by an ELECTRA-Base model fine-tuned on MS MARCO.

ucd_cs_r3

  • Run ID: ucd_cs_r3
  • Participant: UCD_CS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: a9fac3513754c7a5af92936e834ee8da
  • Run description: BM25 first stage, then re-ranked by an ELECTRA-Base model fine-tuned on the TREC-COVID dataset.

udel_fang_FB1

  • Run ID: udel_fang_FB1
  • Participant: udel_fang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: d5c2f1a7f5fee1260f73ce669ab546a3
  • Run description: We build an index from the title and abstract fields of the metadata file. Non-stopwords in the query, together with entities tagged by SciSpacy in the question and narrative fields, are assigned a weight ratio of 2:3:1 to form the final query. We perform relevance feedback for the first 35 topics and pseudo-relevance feedback on the 5 new topics using BM25+RM3.

udel_fang_FB2

  • Run ID: udel_fang_FB2
  • Participant: udel_fang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: b6dc52e9790352db906edb78d65c8534
  • Run description: We build an index from the title and abstract fields of the metadata file. Non-stopwords in the query, together with entities tagged by SciSpacy in the question and narrative fields, are assigned a weight ratio of 2:3:1 to form the final query. We perform relevance feedback for the first 35 topics and pseudo-relevance feedback on the 5 new topics using F2EXP+axiom.

udel_fang_lambdarank

  • Run ID: udel_fang_lambdarank
  • Participant: udel_fang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: cee0b0c03cf864a7ddcab18ed0beaa8a
  • Run description: We build an index from the title and abstract fields of the metadata file. Non-stopwords in the query, together with entities tagged by SciSpacy in the question and narrative fields, are assigned a weight ratio of 2:3:1 to form the final query. We generate a run using relevance feedback on the first 35 topics and pseudo-relevance feedback on the last 5 topics. LambdaRank is then used to re-rank the first 100 results using features such as retrieval scores and recency.

UIowaS_Rd3Borda

  • Run ID: UIowaS_Rd3Borda
  • Participant: UIowaS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: fc4f67f16ec2af2041dbfecb090dccfd
  • Run description: Borda merge of two runs, both of which employ Terrier BM25 weighting with relevance feedback and use all query fields (we added the word Covid to the question field of all topics). In the first run we used 10 relevant documents to expand the query by 300 terms; in the second run we used 30 documents to expand the query by 1000 terms. A Borda merge was done on these two runs (a minimal sketch follows this entry). Retrieval was done against the metadata title and abstract fields. For the 5 new topics we did a Terrier TF_IDF retrieval-feedback run with 10 documents and 20 expansion terms; the queries were as above, but the dataset was limited to filtered documents from the metadata title and abstract, where each retained document must contain a word from a pre-defined list described in our run 1 documentation. The scores for the first 35 topics are Borda scores, while for the last 5 they are TF_IDF scores; ranks will probably be more useful than scores for re-ranking.
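
A minimal sketch of a Borda merge of two runs, where each document earns points inverse to its rank in each run; the doc IDs and depth below are hypothetical:

    def borda_merge(runs, depth=1000):
        points = {}
        for run in runs:
            for rank, doc in enumerate(run[:depth], start=1):
                points[doc] = points.get(doc, 0) + (depth - rank + 1)
        return sorted(points, key=points.get, reverse=True)

    run_300 = ["d1", "d2", "d3"]    # BM25 feedback run, 300 expansion terms
    run_1000 = ["d2", "d3", "d1"]   # BM25 feedback run, 1000 expansion terms
    print(borda_merge([run_300, run_1000]))  # ['d2', 'd1', 'd3']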

UIowaS_Rd3MLReRank

  • Run ID: UIowaS_Rd3MLReRank
  • Participant: UIowaS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 2dbf9f2267c0e06ecc4e378b1d697887
  • Run description: For the first 35 topics, re-rank the output of UIowaS_Rd3Borda, creating one linear SVM classifier per topic (using highly-relevant and non-relevant documents for the topic from the round 1 and round 2 qrels); a minimal sketch follows this entry. Re-ranking is done based on prediction probability. Preprocessing includes stop word removal and lowercasing; documents are represented as TF-IDF vectors over unigrams and bigrams. The last five topics are the same as in UIowaS_Rd3Borda.
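
A minimal sketch of this per-topic re-ranking with scikit-learn; the texts and labels are hypothetical, and it ranks by the SVM decision score as a simple stand-in for the prediction probability mentioned above:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    judged = ["virus spreads by respiratory droplets", "unrelated economics paper",
              "droplet transmission in hospitals", "stock market analysis"]
    labels = [1, 0, 1, 0]   # highly-relevant vs non-relevant from the qrels

    vec = TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2))
    svm = LinearSVC().fit(vec.fit_transform(judged), labels)

    borda_docs = ["droplet spread study", "trade policy review"]  # docs to re-rank
    scores = svm.decision_function(vec.transform(borda_docs))
    reranked = [d for _, d in sorted(zip(scores, borda_docs), reverse=True)]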

UIowaS_Run2

  • Run ID: UIowaS_Run2
  • Participant: UIowaS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 583b674ad3ce6e518cc4d5495e164bdd
  • Run description: For the first 35 topics: relevance feedback with the provided relevance judgements using BM25, with 30 documents and 1000 terms for expansion (on the unfiltered metadata dataset), limited to title and abstract. For the last 5 topics: retrieval feedback using TF_IDF with 10 documents and 20 terms (on the filtered metadata dataset), limited to title and abstract. The retrieved documents for each topic were then re-ranked with a biobert_msmarco model.

uogTrDPH_QE_QQN

  • Run ID: uogTrDPH_QE_QQN
  • Participant: uogTr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: fb6551d08a50d13e08205a84f759ff65
  • Run description: An automatic run using DFR query expansion, built on pyTerrier.

uogTrDPH_RF_QQN

  • Run ID: uogTrDPH_RF_QQN
  • Participant: uogTr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 20be888afcfa5aa677cfc767ac8eec0c
  • Run description: A feedback run using DFR query expansion, built on pyTerrier.

UPrrf20lgbert50-r3

  • Run ID: UPrrf20lgbert50-r3
  • Participant: unique_ptr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: e523138172346232293876bbacd9587c
  • Run description: LightGBM learning-to-rank framework combining (a) a reciprocal rank fusion of 20 retrieval runs using Anserini and Terrier, (b) TF-Ranking + BERT fine-tuned on MS MARCO, and (c) article-based features such as publication year.

UPrrf20lgprel50v2-r3

  • Run ID: UPrrf20lgprel50v2-r3
  • Participant: unique_ptr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: c61844396ad07a23bb2c644dae1f8502
  • Run description: LightGBM learning-to-rank framework combining (a) a reciprocal rank fusion of 20 retrieval runs using Anserini and Terrier, (b) TF-Ranking + BERT fine-tuned on MS MARCO, (c) TF-Ranking + BERT fine-tuned on the relevance judgments from rounds 1 & 2, and (d) article-based features such as publication year.

UPrrf20maxens-r3

  • Run ID: UPrrf20maxens-r3
  • Participant: unique_ptr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: de35e20b3cf83515b0a36457819b26de
  • Run description: A per-query ensemble of (a) a reciprocal rank fusion of 20 retrieval runs using Anserini and Terrier, (b) LightGBM learning-to-rank, and (c) TF-Ranking + BERT fine-tuned on the relevance judgments from rounds 1 & 2. The ensemble was created by weighting each method by its NDCG@10 for a given query.

xj4wang_run1

  • Run ID: xj4wang_run1
  • Participant: xj4wang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 28c085794682fae2a824a654c3c21370
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall Track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with 1 synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1], and SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2], and the stopping conditions for manual labeling largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

xj4wang_run2

  • Run ID: xj4wang_run2
  • Participant: xj4wang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 976a0da2c93ffbf976957506f63d2438
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall Track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with 1 synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1], and SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2], and the stopping conditions for manual labeling largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

xj4wang_run3

  • Run ID: xj4wang_run3
  • Participant: xj4wang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: d248e0bb9e27b9714d98cb454083029c
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall Track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with 1 synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1], and SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2], and the stopping conditions for manual labeling largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

Yibo_round3

  • Run ID: Yibo_round3
  • Participant: 0_214_wyb
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 94a6640ae7665b8038d2db49452aa990
  • Run description: Use metadata.csv to delete repeated documents and clean the data; use BM25 to retrieve 2000 documents; use Sentence-BERT to retrieve the top 100 documents from those 2000 (a minimal sketch follows this entry).
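
A minimal sketch of this two-stage pipeline, assuming the sentence-transformers library; the model name and texts below are placeholders, not necessarily what the team used:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

    query = "coronavirus origin"
    bm25_top = ["abstract about where the virus originated",
                "abstract about vaccine trials"]     # 2000 BM25 hits in practice

    q_emb = model.encode(query, convert_to_tensor=True)
    d_emb = model.encode(bm25_top, convert_to_tensor=True)
    sims = util.cos_sim(q_emb, d_emb)[0]             # cosine similarity per doc

    top100 = sorted(zip(sims.tolist(), bm25_top), reverse=True)[:100]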

Yibo_round3_test2

  • Run ID: Yibo_round3_test2
  • Participant: 0_214_wyb
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: ede2ca1b6cb6113a17fec8ca0340936f
  • Run description: Use metadata.csv to delete repeated documents and clean the data; use BM25 to retrieve 2000 documents; use Sentence-BERT to retrieve the top 1000 documents from those 2000.