Runs - Round 3 2020

active_learning

  • Run ID: active_learning
  • Participant: risklick
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: d97154b5f0bccf440b14a9db2e553563
  • Run description: Manually judging retrieved publications returned by a basic IR model.

BioInfo-run1

  • Run ID: BioInfo-run1
  • Participant: BioinformaticsUA
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 8ec1fcf53f9db66cbabec5bbe25b1989
  • Run description: This run uses the open baseline of the UIowaS Team and applies a neural ranking model [1] to batches of 10 documents sequentially over the original ranking order. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77.

BioInfo-run2

  • Run ID: BioInfo-run2
  • Participant: BioinformaticsUA
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: e5158ce38bfc5530aca5c6e02d27e198
  • Run description: This run uses the Anserini relevance feedback baseline [2] and applies a neural ranking model [1] to batches of 50 documents sequentially over the original ranking order. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77. [2] https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

BioInfo-run3

  • Run ID: BioInfo-run3
  • Participant: BioinformaticsUA
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: ed6ac4a5a451b60d2b2e98ca67826802
  • Run description: This run uses reciprocal rank fusion (RRF) of runs 1 and 2; a minimal sketch of the fusion follows this entry.
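
For reference, a minimal Python sketch of reciprocal rank fusion as used in this and several later runs, assuming each run is simply a list of document IDs in rank order; the doc IDs below are hypothetical, and k=60 is a commonly used default rather than necessarily this team's setting:

    # Reciprocal rank fusion: each document earns 1/(k + rank) per run.
    def rrf(runs, k=60):
        scores = {}
        for run in runs:
            for rank, doc_id in enumerate(run, start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)  # best first

    run1 = ["doc_a", "doc_b", "doc_c"]  # hypothetical ranking from BioInfo-run1
    run2 = ["doc_a", "doc_c", "doc_d"]  # hypothetical ranking from BioInfo-run2
    print(rrf([run1, run2]))            # ['doc_a', 'doc_c', 'doc_b', 'doc_d']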

BITEM_AX0

  • Run ID: BITEM_AX0
  • Participant: BITEM
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 58b9c98446dde04b18c7503185dccf20
  • Run description: Automatic run based on the search results from BITEM_BL (ElasticSearch query based on the three fields query+question+narrative, normalization, token boosting). In this run, we tried to prioritize the documents according to the axes identified in the COVoc terminology.

BITEM_BL

  • Run ID: BITEM_BL
  • Participant: BITEM
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: c2a6597abdbde48827106e1d402381ba
  • Run description: Baseline run (same as previous rounds)

CincyMedIR-1

  • Run ID: CincyMedIR-1
  • Participant: CincyMedIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: a9eff5e749b8db2eb0a8f427015e1b66
  • Run description: Query expanded using synonyms from Lexigram's API and searched against the title, abstract, and MetaMap terms from the database; Elasticsearch was used as the search engine, and documents from previous qrels files were removed.

CincyMedIR-12

  • Run ID: CincyMedIR-12
  • Participant: CincyMedIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: b605b3548453e44b82f0311af281aecc
  • Run description: Query expanded using synonyms from Lexigram's API; MetaMap IDs were concatenated to the query string, and the search was done against the title, abstract, metamap_scaled_title, and metamap_scaled_abstract fields; Elasticsearch was used as the search engine, and documents from previous qrels were removed.

CincyMedIR-9

  • Run ID: CincyMedIR-9
  • Participant: CincyMedIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 68dc9de9f0095041d026eb7f22fae82b
  • Run description: Query expanded using synonyms from Lexigram's API; MetaMap IDs were concatenated to the query string, and the search was done against the title, abstract, metamap_scaled_title, and metamap_scaled_abstract fields; Elasticsearch was used as the search engine, and documents from previous qrels were removed.

combined

  • Run ID: combined
  • Participant: risklick
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 6f87a6bf35d77b05683cc862dfc013df
  • Run description: A combination of basic IR models such as BM25 and DFR on topics expanded with COVID-related ontologies.

CORD-19-LTR

  • Run ID: CORD-19-LTR
  • Participant: LTR_ESB_TEAM
  • Track: Round 3
  • Year: 2020
  • Submission: 6/1/2020
  • Type: automatic
  • MD5: 06fa088d26a569e66c6eb50c4b1b7a4c
  • Run description: Learning to rank using custom-made features.

cord19.vespa.ai-bm25

  • Run ID: cord19.vespa.ai-bm25
  • Participant: cord19.vespa.ai
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: fd9aa8fefd4cbbba201dce9a8b868d59
  • Run description: Retrieval against cord19.vespa.ai using entities extracted from the topic's query, question, and narrative. Results are ranked by bm25(title) + bm25(abstract) + bm25(body_text) + bm25(abstract_t5), where abstract_t5 is a T5 summary of the abstract. See https://github.com/vespa-engine/cord-19/tree/master/trec-covid

cord19.vespa.ai-gb-1

  • Run ID: cord19.vespa.ai-gb-1
  • Participant: cord19.vespa.ai
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: c37709ac3af59ac8ed690899ace1d090
  • Run description: Retrieval using entities extracted from the query, question, and narrative. First-phase ranking uses bm25(title) + bm25(abstract); the top 1000 hits from the first phase are re-ranked using a GBDT model (LightGBM with a lambdarank objective). The model is trained on judgements from rounds 1 and 2, using all topics except those in [1,3,5,7,10,15,18,29,25,32]. https://github.com/vespa-engine/cord-19/tree/master/trec-covid

cord19.vespa.ai-gb-2

  • Run ID: cord19.vespa.ai-gb-2
  • Participant: cord19.vespa.ai
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 06db85fb1f0824cde1c229deb1847d14
  • Run description: Retrieval using entities extracted from the query, question, and narrative. First-phase ranking uses bm25(title) + bm25(abstract); the top 1000 hits from the first phase are re-ranked using a GBDT model (LightGBM with a lambdarank objective; a minimal training sketch follows this entry). The model is trained on all available judgements from round 2 and round 3. https://github.com/vespa-engine/cord-19/tree/master/trec-covid
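
As a sketch of this kind of re-ranking step via LightGBM's scikit-learn API; the features, labels, and group sizes below are hypothetical stand-ins for the run's BM25 features and round judgements:

    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X_train = rng.random((200, 4))     # e.g. bm25(title), bm25(abstract), ...
    y_train = rng.integers(0, 3, 200)  # graded relevance labels from judgements
    groups = [100, 100]                # docs per topic for two training topics

    ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50)
    ranker.fit(X_train, y_train, group=groups)

    X_hits = rng.random((1000, 4))               # top-1000 first-phase hits
    order = np.argsort(-ranker.predict(X_hits))  # re-ranked indices, best first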

covidex.r3.duot5

  • Run ID: covidex.r3.duot5
  • Participant: covidex
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: ea38343d005b95b757131b11f539ca68
  • Run description: Pairwise reranker applied to the top-50 documents from the run covidex.r3.monot5.

covidex.r3.monot5

  • Run ID: covidex.r3.monot5
  • Participant: covidex
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 0bc7f366949f078df81bc992443a734b
  • Run description: Reciprocal rank fusion of two runs: 1) Anserini r3.fusion1 reranked with medT5-3B; 2) Anserini r3.fusion2 reranked with medT5-3B. Our reranker (medT5-3B) is a T5-3B reranker fine-tuned on MS MARCO and then fine-tuned again on the MS MARCO medical subset (from MacAvaney et al., 2020, "SLEDGE"). You can find the Anserini fusion baselines for round 3 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

covidex.r3.t5_lr

  • Run ID: covidex.r3.t5_lr
  • Participant: covidex
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 1ac91680feed06b8e3691cd0ff718fbe
  • Run description: Interpolation (alpha=0.5) of covidex.r3.monot5 scores and scores from a logistic regression classifier trained on the round 1 & 2 qrels with TF-IDF features as input; a minimal sketch follows this entry.
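
A minimal sketch of this interpolation step, assuming both score sources are on comparable scales; the texts, labels, and monoT5 scores below are hypothetical:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    docs = ["droplet transmission of covid", "unrelated economics paper",
            "airborne spread of the virus", "stock market trends"]
    qrels = [1, 0, 1, 0]               # hypothetical round 1 & 2 judgments

    vec = TfidfVectorizer()
    clf = LogisticRegression().fit(vec.fit_transform(docs), qrels)

    monot5 = [0.91, 0.10, 0.85, 0.05]  # hypothetical monoT5 reranker scores
    lr = clf.predict_proba(vec.transform(docs))[:, 1]

    alpha = 0.5
    final = [alpha * m + (1 - alpha) * p for m, p in zip(monot5, lr)]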

crowd

  • Run ID: crowd
  • Participant: VATech
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 36f256ec136459475784a69889448982
  • Run description: crowd relevance feedback

crowdPRF

  • Run ID: crowdPRF
  • Participant: VATech
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 3c3a54b7410c3a7b01a25f2be7363d3e
  • Run description: pseudo-relevance feedback using the run crowd as the first-round search

CSIROmedFusion

  • Run ID: CSIROmedFusion
  • Participant: CSIROmed
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 0a511d9312175b0b35b1ef642a3d0839
  • Run description: Score fusion of two other runs from this round.

CSIROmedNIR

  • Run ID: CSIROmedNIR
  • Participant: CSIROmed
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 27def6067d827eb333aa1b9511d465f4
  • Run description: Neural index with cosine-similarity retrieval over the question, narrative, and query, using mean sentence embeddings over the abstract and title fields, plus a BM25 score over the full text.

CSIROmedNIRR

  • Run ID: CSIROmedNIRR
  • Participant: CSIROmed
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 64550bfe62e2a49fb26a7a8b15c958eb
  • Run description: Neural index with cosine-similarity retrieval over the question, narrative, and query for the abstract and title fields, plus a BM25 score over the full text. The run is re-ranked using the scores of the top 3 sentences from each abstract.

Emory_IRLab_rnd3_r1

  • Run ID: Emory_IRLab_rnd3_r1
  • Participant: Emory_IRLab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: a5106ccac915657ffb124fd100806ad3
  • Run description: This run is based on a MART re-ranker trained on several self-constructed features, including BM25 matching scores calculated on different fields of the article (title, abstract, paragraph, and anchor text), the relevance probability from a SciBERT model fine-tuned on the round 1 and 2 qrels, as well as date and citation features.

Emory_IRLab_rnd3_r2

  • Run ID: Emory_IRLab_rnd3_r2
  • Participant: Emory_IRLab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 9f97144e650bbdbd2a7b5886a5ad7256
  • Run description: This run is based on a MART re-ranker trained on several self-constructed features, including BM25 matching scores calculated on different fields of the article (title, abstract, paragraph, and anchor text), the relevance probability from a SciBERT model fine-tuned on the round 1 and 2 qrels, as well as date and citation features. Same strategy as Emory_IRLab_rnd3_r1 with different duplicate removal.

factum-sparse

  • Run ID: factum-sparse
  • Participant: Factum
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 929fc7a778f1a97f9dc2863f2accbb29
  • Run description: Sparse retrieval (BM25) with fusion of ranks from document, abstract, and paragraph indexes. We used unigrams, with lemmatization and stopword removal.

factum-sparse-rerank

  • Run ID: factum-sparse-rerank
  • Participant: Factum
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: f2decf5f52baf285c7dc47d10e0502c3
  • Run description: Sparse retrieval (BM25) with fusion of scores from document, abstract, and paragraph indexes. We used unigrams, with lemmatization and stopword removal. The top 100 documents are re-ranked using a BERT-base scoring model fine-tuned on MS MARCO. The final score of every document is given by its highest-scoring paragraph under the ranking model; a minimal sketch of this aggregation follows this entry.
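
A minimal sketch of this final aggregation, where each document is scored by its best paragraph; the paragraph scores below are hypothetical:

    from collections import defaultdict

    para_scores = [("doc1", 0.21), ("doc1", 0.84), ("doc2", 0.55)]  # (doc, BERT score)
    doc_score = defaultdict(lambda: float("-inf"))
    for doc, s in para_scores:
        doc_score[doc] = max(doc_score[doc], s)   # keep highest-scoring paragraph

    ranking = sorted(doc_score, key=doc_score.get, reverse=True)  # ['doc1', 'doc2']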

fusionoffusion

  • Run ID: fusionoffusion
  • Participant: IRLabKU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 2cba11f73877b63d3dc94cde845a8992
  • Run description: We create 3 indexes: 1) title+abstract, 2) full text, and 3) title+abstract+paragraph (a document is split into para1, para2, ..., paraK, and we create K+1 documents of the form title+abstract, title+abstract+para1, ..., title+abstract+paraK). We create 4 query types: 1) query + named entities(question), 2) query + named entities(question) + named entities(narrative), 3) question + named entities(query), and 4) question + named entities(query) + named entities(narrative). For each query type we create 4 runs (default BM25 + default RM3): 1) searching the title+abstract index, 2) searching the full-text index, 3) searching the title+abstract+paragraph index while keeping only the first occurrence of a document and deleting subsequent duplicates, and 4) searching the title+abstract+paragraph index without removing duplicates; reciprocal rank fusion then combines each query type's 4 runs into 1. This yields 4 fusion runs, which we combine with a final reciprocal rank fusion (a fusion of fusions).

fusionofruns

  • Run ID: fusionofruns
  • Participant: IRLabKU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 9f5c6c47c013c20ee76b2447c137701e
  • Run description: We create 3 indexes: 1) title+abstract, 2) full text, and 3) title+abstract+paragraph (a document is split into para1, para2, ..., paraK, and we create K+1 documents of the form title+abstract, title+abstract+para1, ..., title+abstract+paraK). We create 4 query types: 1) query + named entities(question), 2) query + named entities(question) + named entities(narrative), 3) question + named entities(query), and 4) question + named entities(query) + named entities(narrative). For each query type we create 4 runs (default BM25 + default RM3): 1) searching the title+abstract index, 2) searching the full-text index, 3) searching the title+abstract+paragraph index while keeping only the first occurrence of a document and deleting subsequent duplicates, and 4) searching the title+abstract+paragraph index without removing duplicates. Finally, we combine these runs into 1 run using reciprocal rank fusion.

jlbase-QE-rnd3

  • Run ID: jlbase-QE-rnd3
  • Participant: julielab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 9ff12458bca4b3c6327b5daf73b2d753
  • Run description: In most queries the token "coronavirus" is present. However, coronaviridae are a family of viruses that is not limited to SARS-CoV-2, so many false positives are likely to be found. The same holds true for other terms, such as "animal model": this term does not occur often, as most researchers specify the animal they used as the model organism (such as mice or rats). We therefore created a list of synonyms to specify these general terms. We found that nouns are mostly the terms carrying the most information, so we used a part-of-speech tagger to isolate nouns and filtered them with a manually curated blacklist as well as a general stop word list. Our final query consists of four parts: several synonyms and spellings for the illness and the virus respectively, the query, the question, and finally a bag of words containing the filtered nouns from both the question and the query. We had planned to include the filtered nouns of the narrative as an optional part of the query, but evaluation on the previous round showed no benefit.

jlbasernd3

  • Run ID: jlbasernd3
  • Participant: julielab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: e5258b778fe007b777271aca8a80640a
  • Run description: Elasticsearch with default BM25 settings. The stop-word-filtered query is a mandatory clause; the stop-word-filtered question and narrative are optional clauses.

jlbasernd3-jlQErnd3

  • Run ID: jlbasernd3-jlQErnd3
  • Participant: julielab
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 9748598ed073105a170fb021ebb463ad
  • Run description: Reciprocal Rank Fusion between jlbasernd3 and jlbase-QE-rnd3.

l2r

  • Run ID: l2r
  • Participant: risklick
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 580411c0b8e38c6e85cef28eed129f29
  • Run description: Learning to rank on a retrieval set from basic IR models, enriched with COVID-related ontologies.

mpiid5_run1

  • Run ID: mpiid5_run1
  • Participant: mpiid5
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 39b83780d80123cf878f5ea582da9259
  • Run description: Fusion of mpiid5_run2 and mpiid5_run3 using Reciprocal Rank Fusion.

mpiid5_run2

  • Run ID: mpiid5_run2
  • Participant: mpiid5
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: ed13e79a286de6c4f75dbcbc24c1fbb9
  • Run description: We re-rank the top-10k documents returned by BM25 using the queries produced by UDel's method. For re-ranking, we use an ELECTRA-Base model fine-tuned on the MS MARCO passage dataset. With a more complex attention mechanism, the model is later fine-tuned on the TREC-COVID round 1 & 2 full-text collection. We use the question queries for re-ranking.

mpiid5_run3

  • Run ID: mpiid5_run3
  • Participant: mpiid5
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 5a0685078be654fe14860cd799242475
  • Run description: We re-rank the top-10k documents returned by BM25 using the queries produced by UDel's method. For re-ranking, we use an ELECTRA-Base model fine-tuned on the MS MARCO passage dataset. With a simple attention mechanism, the model is later fine-tuned on the TREC-COVID round 1 & 2 full-text collection. We use the question queries for re-ranking.

OHSU_BCB_round3-bcb

  • Run ID: OHSU_BCB_round3-bcb
  • Participant: OHSU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: ee3b37a51591868d715a890535a4787f
  • Run description: Search queries were generated using the query, question, and narrative fields and executed using Anserini on the full-text index. The scores for these runs were normalized, and a document's total score is the sum of its scores across these queries.

OHSU_Fusion

  • Run ID: OHSU_Fusion
  • Participant: OHSU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: c64cf6a3cf115a452ec4af3bf40c9846
  • Run description: Tokenized queries were generated using the query, question, and narrative fields for each topic. These were run against abstract, full-text, and paragraph indexes, and RRF was performed to combine the 3 runs. These runs were further combined using RRF with UIowa's Round 3 baseline runs, which used Borda merging and Terrier.

OHSU_Rerank

  • Run ID: OHSU_Rerank
  • Participant: OHSU
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: ec4264646faa4c2f530c3ef68b3dd7b9
  • Run description: Tokenized combinations of the narrative, query, and question fields for each topic were input to an Anserini BM25 searcher over 3 indexes: abstract only, full text only, and paragraphs+abstracts. RRF was performed to combine these 3 runs for the top 2000 documents per topic. BioBERT trained on PubMed was used as a re-ranker.

poznan_p3_2

  • Run ID: poznan_p3_2
  • Participant: POZNAN
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 047809c1ccc1925832d1abdf9985b50b
  • Run description: Work in progress. We use an LSTM-based neural network to compare sentences; the score function is based on the five most similar sentences.

poznan_run_p3_1

  • Run ID: poznan_run_p3_1
  • Participant: POZNAN
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: c6d5f129d601bc3f69fd6a8cee145409
  • Run description: Baseline run for our research. Aside from preprocessing, it contains no further improvements over the baseline. Elasticsearch is used as the baseline indexing system.

poznan_run_p3_3

  • Run ID: poznan_run_p3_3
  • Participant: POZNAN
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 563171130659b4fd8055a9469884a159
  • Run description: This is a run that we would like to have validated. We do not expect great results from it, but it will let us disambiguate several properties of our model.

PRF

  • Run ID: PRF
  • Participant: VATech
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 3559e4ac877a4c2f6d64cf116c6533ed
  • Run description: pseudo-relevance feedback baseline

r3.fusion1

  • Run ID: r3.fusion1
  • Participant: anserini
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: c1caf63a9c3b02f0b12e233112fc79a6
  • Run description: Anserini fusion run corresponding to row 7 in table for Round 3 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

r3.fusion2

  • Run ID: r3.fusion2
  • Participant: anserini
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 12679197846ed77306ecb2ca7895b011
  • Run description: Anserini fusion run corresponding to row 8 in table for Round 3 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

r3.rf

  • Run ID: r3.rf
  • Participant: anserini
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 7192a08c5275b59d5ef18395917ff694
  • Run description: Anserini run with abstract index, UDel query generator, BM25+RM3 relevance feedback (100 feedback terms).

R3QQMETRA

  • Run ID: R3QQMETRA
  • Participant: IRIT_LSIS_FR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 4cdb3e8eeca5845b2b9add4c51cd695a
  • Run description: Documents were pre-processed to extract phrases using the Tetralogie system, and topics were likewise pre-processed for key-phrase extraction. Documents were then indexed with Indri, and search was done using Indri's BM25 search core. The query field was used in this run, and we manually added some key phrases related to the query topic.

R3QQTETRA

  • Run ID: R3QQTETRA
  • Participant: IRIT_LSIS_FR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 716397b86e4b4ca589ea813132e6e60d
  • Run description: Documents were pre-processed to extract phrases using the Tetralogie system, and topics were likewise pre-processed for key-phrase extraction. Documents were then indexed with Indri, and search was done using Indri's BM25 search core. The query and question fields were used in this run.

R3QTETRA

  • Run ID: R3QTETRA
  • Participant: IRIT_LSIS_FR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: manual
  • MD5: 42f5871727067928501ce52ddc55984e
  • Run description: Documents were pre-processed to extract phrases using the Tetralogie system, and topics were likewise pre-processed for key-phrase extraction. Documents were then indexed with Indri, and search was done using Indri's BM25 search core. The query field was used in this run.

ruir-q2narr4task4

  • Run ID: ruir-q2narr4task4
  • Participant: RUIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: cd04ef189f17607c053ca187665b0e7f
  • Run description: Manual classification of the TREC topics into the original 10 Kaggle tasks. RM3 query expansion where terms are weighted as 0.2 * query terms + 0.4 * narrative terms + 0.4 * task terms (a minimal sketch of this mixing follows this entry); task terms were found by TF-IDF matching of the Kaggle task descriptions against a corpus of paper abstracts. Parameters were found by coarsely tuning towards NDCG after filtering documents with unknown qrels (which may be more stable than bpref; Sakai 2007).
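
A minimal sketch of the 0.2/0.4/0.4 term mixing described above (not the full RM3 pipeline); tokenization is simplified and the texts are hypothetical:

    from collections import Counter

    def weighted_terms(query, narrative, task_terms, weights=(0.2, 0.4, 0.4)):
        mix = Counter()
        for w, text in zip(weights, (query, narrative, task_terms)):
            tokens = text.lower().split()
            for tok in tokens:
                mix[tok] += w / len(tokens)  # spread each field's weight over its tokens
        return mix.most_common()

    print(weighted_terms("coronavirus origin",
                         "seeking studies on where the virus originated",
                         "animal host spillover"))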

sab20.3.dfo

  • Run ID: sab20.3.dfo
  • Participant: sabir
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 99b06fe9b3bbbe671bdcfd9b13fd2f87
  • Run description: SMART vector DFO run. Base Lnu.ltu weights, with doc indexing = 0.5 * metadoc_Lnu_weighting + 0.7 * JSON_Lnu_weighting if a JSON doc exists (= straight Lnu weighting if only metadata info). Run the DFO algorithm (the runs are described in my TREC 2005 Routing track paper and later ones, e.g., the 2017 Core track), using relevance info on the rounds 1+2 collections to expand and optimize weights on that collection. This is a conservative run, expanding by only the top 15 terms.

sab20.3.metadocs_m

  • Run ID: sab20.3.metadocs_m
  • Participant: sabir
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 18017fdb55232a5d2ce5c46264991c9b
  • Run description: Standard SMART vector run based on Lnu docs, ltu query weighting. Doc indexing: if only metadata info exists for a docid, that is used with Lnu weights. Each JSON doc is assigned final indexing as 0.5 * Metadata_Lnu_vector + 0.7 * JSON_Lnu_vector. After inverted retrieval, the highest similarity for each cord_uid is used.

sab20.3.rocchio

  • Run ID: sab20.3.rocchio
  • Participant: sabir
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: 6735120d48af75db541a1eafd0aaa907
  • Run description: SMART vector Rocchio feedback run. Base Lnu.ltu weights, with doc indexing = 0.5 * metadoc_Lnu_weighting + 0.7 * JSON_Lnu_weighting if a JSON doc exists (= straight Lnu weighting if only metadata info). Run the Rocchio algorithm, using relevance info on the rounds 1+2 collections to expand and optimize query weights on that collection. Query term weights = 4 * original query weight + 6 * average weight in relevant docs - 8 * average weight in non-relevant docs, with expansion to the top 50 terms (a minimal sketch follows this entry).
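
A minimal sketch of the stated Rocchio weighting (coefficients 4, 6, 8; expansion to the top 50 terms); the term vectors below are hypothetical dicts of term weights:

    def rocchio(query, rel_docs, nonrel_docs, a=4.0, b=6.0, c=8.0, top_k=50):
        terms = set(query)
        for d in rel_docs + nonrel_docs:
            terms |= set(d)
        avg = lambda docs, t: (sum(d.get(t, 0.0) for d in docs) / len(docs)
                               if docs else 0.0)
        w = {t: a * query.get(t, 0.0) + b * avg(rel_docs, t) - c * avg(nonrel_docs, t)
             for t in terms}
        # keep the top_k positively weighted terms
        return dict(sorted(((t, v) for t, v in w.items() if v > 0),
                           key=lambda kv: -kv[1])[:top_k])

    print(rocchio({"virus": 1.0}, [{"virus": 0.5, "vaccine": 0.4}], [{"stock": 0.7}]))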

SFDC-e23-f12-re-tf3

  • Run ID: SFDC-e23-f12-re-tf3
  • Participant: SFDC
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 1313a7932a29aca3458f5e7c7bea408c
  • Run description: Round 3 encoder combined with round 2 encoder: a fusion run of Linear(Siamese-BERT paragraph-level semantic retrieval, TF-IDF) and Anserini BM25. Function: RRF(Anserini-Rnd2-1 (reranked), Anserini-Rnd2-2 (reranked), 0.7 * semantic-23 + 0.3 * TFIDF-3).

SFDC-fus12-enc23-tf3

  • Run ID: SFDC-fus12-enc23-tf3
  • Participant: SFDC
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: ef4f214c83137fc58c5e787662efbc94
  • Run description: Round 3 encoder combined with round 2 encoder: a fusion run of Linear(Siamese-BERT paragraph-level semantic retrieval, TF-IDF) and Anserini BM25. Function: RRF(Anserini-Rnd2-1, Anserini-Rnd2-2, 0.7 * semantic-23 + 0.3 * TFIDF-3).

SFDC-fus12-enc3-tf3

  • Run ID: SFDC-fus12-enc3-tf3
  • Participant: SFDC
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 5ebe6e92916864eba16d42effc359313
  • Run description: A fusion run of Linear(Siamese-BERT paragraph-level semantic retrieval, TF-IDF) and Anserini BM25. Function: RRF(Anserini-Rnd2-1, Anserini-Rnd2-2, 0.7 * semantic-3 + 0.3 * TFIDF-3). A minimal sketch of the linear component follows this entry.
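
A minimal sketch of the linear component that forms the third RRF input; the scores below are hypothetical and assumed to be normalized to [0, 1] beforehand (the RRF step itself is as in the BioInfo-run3 sketch above):

    semantic = {"d1": 0.9, "d2": 0.4}   # Siamese-BERT paragraph retrieval scores
    tfidf = {"d1": 0.2, "d2": 0.8}      # TF-IDF scores

    combined = {d: 0.7 * semantic.get(d, 0.0) + 0.3 * tfidf.get(d, 0.0)
                for d in set(semantic) | set(tfidf)}
    run = sorted(combined, key=combined.get, reverse=True)  # ['d1', 'd2']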

sim_run

  • Run ID: sim_run
  • Participant: UH_UAQ
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 9985510578ec612bb72ab3c665cf8ea5
  • Run description: Encoding documents with SciBERT and applying a classic information retrieval model. Unfortunately, we could not run over the whole dataset, just a small subset.

sparse-dense-0.45

  • Run ID: sparse-dense-0.45
  • Participant: CIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 8449ef9ebab8eba4deb5f7908f033637
  • Run description: Fusion of (1) the Anserini BM25 fusion baseline (RRF(2,4,6)) and (2) ANN dense-embedding retrieval trained on Med-MARCO, with tuned fusion weights. The language model was fine-tuned with an MLM task on CORD-19 Rnd3 documents.

sparse-dense-SBrr-2

  • Run ID: sparse-dense-SBrr-2
  • Participant: CIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 9249459f60add67485e7e349b40d2a13
  • Run description: Fusion of (1) the Anserini BM25 fusion baseline (RRF(2,4,6)), (2) Med-MARCO ANN dense-embedding retrieval, and (3) a Med-MARCO SciBERT re-ranker (top 1000), with tuned fusion weights. The language model was fine-tuned with an MLM task on CORD-19 Rnd3 documents.

sparse-dense-SBrr-3

  • Run ID: sparse-dense-SBrr-3
  • Participant: CIR
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 4731e16b0847af830fb2f9c54f454e2b
  • Run description: Fusion of (1) the Anserini BM25 fusion baseline (RRF(2,4,6)), (2) Med-MARCO ANN dense-embedding retrieval, and (3) a Med-MARCO SciBERT re-ranker (top 1000), with equal fusion weights. The language model was fine-tuned with an MLM task on CORD-19 Rnd3 documents.

TF_IDF_Feedback

  • Run ID: TF_IDF_Feedback
  • Participant: UB_BW
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 410d8011b5283ab35164ac4323ed9402
  • Run description: We indexed only the title and the abstract using Terrier-v5.2. For the final document ranking, we deployed the TF-IDF term weighting model. During retrieval, we used both the query and the question tags, which were then expanded with terms from the relevant documents in the qrels file.

ucd_cs_r1

  • Run ID: ucd_cs_r1
  • Participant: UCD_CS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: 1b92ebe42c5f4f4c73469c49c6f04353
  • Run description: BM25 baseline with parameter settings from a previous study.

ucd_cs_r2

  • Run ID: ucd_cs_r2
  • Participant: UCD_CS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: automatic
  • MD5: b3da47e26b386d65ba99e6c97de19932
  • Run description: BM25 baseline with parameter settings from a previous study, re-ranked by an ELECTRA-Base model fine-tuned on MS MARCO.

ucd_cs_r3

  • Run ID: ucd_cs_r3
  • Participant: UCD_CS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: a9fac3513754c7a5af92936e834ee8da
  • Run description: BM25 first stage, then re-ranked by an ELECTRA-Base model fine-tuned on the TREC-COVID dataset.

udel_fang_FB1

  • Run ID: udel_fang_FB1
  • Participant: udel_fang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: d5c2f1a7f5fee1260f73ce669ab546a3
  • Run description: We build an index from the title and abstract fields of the metadata file. Non-stopwords in the query, together with entities tagged by SciSpacy in the question and narrative fields, are assigned a weight ratio of 2:3:1 to form the final query. We perform relevance feedback for the first 35 topics and pseudo-relevance feedback on the 5 new topics using BM25+RM3.

udel_fang_FB2

  • Run ID: udel_fang_FB2
  • Participant: udel_fang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: b6dc52e9790352db906edb78d65c8534
  • Run description: We build an index from the title and abstract fields of the metadata file. Non-stopwords in the query, together with entities tagged by SciSpacy in the question and narrative fields, are assigned a weight ratio of 2:3:1 to form the final query. We perform relevance feedback for the first 35 topics and pseudo-relevance feedback on the 5 new topics using F2EXP+axiom.

udel_fang_lambdarank

  • Run ID: udel_fang_lambdarank
  • Participant: udel_fang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/2/2020
  • Type: feedback
  • MD5: cee0b0c03cf864a7ddcab18ed0beaa8a
  • Run description: We build an index from the title and abstract fields of the metadata file. Non-stopwords in the query, together with entities tagged by SciSpacy in the question and narrative fields, are assigned a weight ratio of 2:3:1 to form the final query. We generate a run using relevance feedback on the first 35 topics and pseudo-relevance feedback on the last 5 topics. LambdaRank is then used to re-rank the first 100 results using features such as retrieval scores and recency.

UIowaS_Rd3Borda

  • Run ID: UIowaS_Rd3Borda
  • Participant: UIowaS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: fc4f67f16ec2af2041dbfecb090dccfd
  • Run description: Borda merge of two runs, both of which employ Terrier BM25 weighting with relevance feedback and use all query fields (we added the word Covid to the question field of all topics). In the first run we used 10 relevant documents to expand the query by 300 terms; in the second run we used 30 documents to expand the query by 1000 terms. A Borda merge was done on these two runs (a minimal sketch follows this entry). Retrieval was done against the metadata title and abstract fields. For the 5 new topics we did a Terrier TF_IDF retrieval-feedback run with 10 documents and 20 expansion terms; the queries were as above, but the dataset was limited to filtered documents from the metadata title and abstract, where each retained document must contain a word from a pre-defined list described in our run 1 documentation. The scores for the first 35 topics are Borda scores, while for the last 5 they are TF_IDF scores; ranks will probably be more useful than scores for re-ranking.
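
A minimal sketch of a Borda merge of two runs, where each document earns points inverse to its rank in each run; the doc IDs and depth below are hypothetical:

    def borda_merge(runs, depth=1000):
        points = {}
        for run in runs:
            for rank, doc in enumerate(run[:depth], start=1):
                points[doc] = points.get(doc, 0) + (depth - rank + 1)
        return sorted(points, key=points.get, reverse=True)

    run_300 = ["d1", "d2", "d3"]    # BM25 feedback run, 300 expansion terms
    run_1000 = ["d2", "d3", "d1"]   # BM25 feedback run, 1000 expansion terms
    print(borda_merge([run_300, run_1000]))  # ['d2', 'd1', 'd3']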

UIowaS_Rd3MLReRank

  • Run ID: UIowaS_Rd3MLReRank
  • Participant: UIowaS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 2dbf9f2267c0e06ecc4e378b1d697887
  • Run description: For the first 35 topics, re-rank the output of UIowaS_Rd3Borda, creating one linear SVM classifier per topic (using highly-relevant and non-relevant documents for the topic from the round 1 and round 2 qrels); a minimal sketch follows this entry. Re-ranking is done based on prediction probability. Preprocessing includes stop word removal and lowercasing; documents are represented as TF-IDF vectors over unigrams and bigrams. The last five topics are the same as in UIowaS_Rd3Borda.
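
A minimal sketch of this per-topic re-ranking with scikit-learn; the texts and labels are hypothetical, and it ranks by the SVM decision score as a simple stand-in for the prediction probability mentioned above:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    judged = ["virus spreads by respiratory droplets", "unrelated economics paper",
              "droplet transmission in hospitals", "stock market analysis"]
    labels = [1, 0, 1, 0]   # highly-relevant vs non-relevant from the qrels

    vec = TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2))
    svm = LinearSVC().fit(vec.fit_transform(judged), labels)

    borda_docs = ["droplet spread study", "trade policy review"]  # docs to re-rank
    scores = svm.decision_function(vec.transform(borda_docs))
    reranked = [d for _, d in sorted(zip(scores, borda_docs), reverse=True)]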

UIowaS_Run2

  • Run ID: UIowaS_Run2
  • Participant: UIowaS
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 583b674ad3ce6e518cc4d5495e164bdd
  • Run description: For the first 35 topics: relevance feedback with the provided relevance judgements using BM25, with 30 documents and 1000 terms for expansion (on the unfiltered metadata dataset), limited to title and abstract. For the last 5 topics: retrieval feedback using TF_IDF with 10 documents and 20 terms (on the filtered metadata dataset), limited to title and abstract. The retrieved documents for each topic were then re-ranked with a biobert_msmarco model.

uogTrDPH_QE_QQN

  • Run ID: uogTrDPH_QE_QQN
  • Participant: uogTr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: fb6551d08a50d13e08205a84f759ff65
  • Run description: An automatic run using DFR query expansion, built on pyTerrier.

uogTrDPH_RF_QQN

  • Run ID: uogTrDPH_RF_QQN
  • Participant: uogTr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: 20be888afcfa5aa677cfc767ac8eec0c
  • Run description: A feedback run using DFR query expansion, built on pyTerrier.

UPrrf20lgbert50-r3

  • Run ID: UPrrf20lgbert50-r3
  • Participant: unique_ptr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: e523138172346232293876bbacd9587c
  • Run description: LightGBM learning-to-rank framework combining (a) a reciprocal rank fusion of 20 retrieval runs using Anserini and Terrier, (b) TF-Ranking + BERT fine-tuned on MS MARCO, and (c) article-based features such as publication year.

UPrrf20lgprel50v2-r3

  • Run ID: UPrrf20lgprel50v2-r3
  • Participant: unique_ptr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: c61844396ad07a23bb2c644dae1f8502
  • Run description: LightGBM learning-to-rank framework combining (a) a reciprocal rank fusion of 20 retrieval runs using Anserini and Terrier, (b) TF-Ranking + BERT fine-tuned on MS MARCO, (c) TF-Ranking + BERT fine-tuned on the relevance judgments from rounds 1 & 2, and (d) article-based features such as publication year.

UPrrf20maxens-r3

  • Run ID: UPrrf20maxens-r3
  • Participant: unique_ptr
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: feedback
  • MD5: de35e20b3cf83515b0a36457819b26de
  • Run description: A per-query ensemble of (a) a reciprocal rank fusion of 20 retrieval runs using Anserini and Terrier, (b) LightGBM learning-to-rank, and (c) TF-Ranking + BERT fine-tuned on the relevance judgments from rounds 1 & 2. The ensemble was created by weighting each method by its NDCG@10 for a given query.

xj4wang_run1

  • Run ID: xj4wang_run1
  • Participant: xj4wang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 28c085794682fae2a824a654c3c21370
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall Track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with 1 synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1], and SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2], and the stopping conditions for manual labeling largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

xj4wang_run2

  • Run ID: xj4wang_run2
  • Participant: xj4wang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: 976a0da2c93ffbf976957506f63d2438
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall Track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with 1 synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1], and SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2], and the stopping conditions for manual labeling largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

xj4wang_run3

  • Run ID: xj4wang_run3
  • Participant: xj4wang
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: manual
  • MD5: d248e0bb9e27b9714d98cb454083029c
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall Track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with 1 synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1], and SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2], and the stopping conditions for manual labeling largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

Yibo_round3

  • Run ID: Yibo_round3
  • Participant: 0_214_wyb
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: 94a6640ae7665b8038d2db49452aa990
  • Run description: Use metadata.csv to delete repeated documents and clean the data; use BM25 to retrieve 2000 documents; use Sentence-BERT to retrieve the top 100 documents from those 2000 (a minimal sketch follows this entry).
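
A minimal sketch of this two-stage pipeline, assuming the sentence-transformers library; the model name and texts below are placeholders, not necessarily what the team used:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

    query = "coronavirus origin"
    bm25_top = ["abstract about where the virus originated",
                "abstract about vaccine trials"]     # 2000 BM25 hits in practice

    q_emb = model.encode(query, convert_to_tensor=True)
    d_emb = model.encode(bm25_top, convert_to_tensor=True)
    sims = util.cos_sim(q_emb, d_emb)[0]             # cosine similarity per doc

    top100 = sorted(zip(sims.tolist(), bm25_top), reverse=True)[:100]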

Yibo_round3_test2

  • Run ID: Yibo_round3_test2
  • Participant: 0_214_wyb
  • Track: Round 3
  • Year: 2020
  • Submission: 6/3/2020
  • Type: automatic
  • MD5: ede2ca1b6cb6113a17fec8ca0340936f
  • Run description: Use metadata.csv to delete repeated documents and clean the data; use BM25 to retrieve 2000 documents; use Sentence-BERT to retrieve the top 1000 documents from those 2000.