Runs - Round 4 2020

active_learning

  • Run ID: active_learning
  • Participant: risklick
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: manual
  • MD5: e6f7e87ad9db5668a261c8c290a20262
  • Run description: Manually judging retrieved publications returned by a basic IR model.

aserini2000-t53

  • Run ID: aserini2000-t53
  • Participant: test_uma
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 32b50b9ad4afd61f1cb62aea0cc5eca7
  • Run description: We first use Anserini to retrieve the top 2000 documents per topic, then use T5 scores for reranking (a minimal sketch follows).
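
A minimal sketch of the retrieve-then-rescore pipeline this description outlines, assuming Pyserini as the interface to an Anserini index and a public monoT5 checkpoint; the index name and query below are illustrative stand-ins, not the team's actual artifacts.

```python
# Sketch: Anserini (via Pyserini) top-2000 retrieval, then T5 relevance scoring.
import torch
from pyserini.search.lucene import LuceneSearcher
from transformers import T5ForConditionalGeneration, T5Tokenizer

searcher = LuceneSearcher.from_prebuilt_index('cord19-abstract')  # illustrative index name
tok = T5Tokenizer.from_pretrained('castorini/monot5-base-msmarco')
model = T5ForConditionalGeneration.from_pretrained('castorini/monot5-base-msmarco')

def t5_score(query: str, text: str) -> float:
    # monoT5 scores a query-document pair by the probability of generating 'true'.
    prompt = f'Query: {query} Document: {text} Relevant:'
    inputs = tok(prompt, return_tensors='pt', truncation=True, max_length=512)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=1,
                             output_scores=True, return_dict_in_generate=True)
    logits = out.scores[0][0]
    true_id, false_id = tok.encode('true')[0], tok.encode('false')[0]
    return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()

query = 'coronavirus origin'
hits = searcher.search(query, k=2000)                  # first-stage BM25 candidates
ranked = sorted(hits, key=lambda h: t5_score(query, searcher.doc(h.docid).raw()),
                reverse=True)
```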

BioInfo-run1

  • Run ID: BioInfo-run1
  • Participant: BioinformaticsUA
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 950f2cdd623fd61f44869bd2cb3fa582
  • Run description: This run uses the open baseline anserini.final-r4.rf.txt and applies a neural ranking model [1] to rerank the top 15 documents. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77.

BioInfo-run2

  • Run ID: BioInfo-run2
  • Participant: BioinformaticsUA
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: f67d8080993adf2a00f4e32593d5d7f8
  • Run description: This run uses the open baseline anserini.final-r4.rf.txt and applies an RBF fusion of four runs produced by a neural ranking model [1] over the top 25 documents. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77.

BioInfo-run3

  • Run ID: BioInfo-run3
  • Participant: BioinformaticsUA
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: b75761f251a1ec285cad8cbbb935a1c0
  • Run description: This run applies an RBF fusion over run1, run2, and two Anserini baselines. REFs: [1] T. Almeida and S. Matos, "Calling Attention to Passages for Biomedical Question Answering," in Advances in Information Retrieval, 2020, pp. 69--77.

BITEM_BERT4

  • Run ID: BITEM_BERT4
  • Participant: BITEM
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: bd7f3552da58b082c5d053ab467da091
  • Run description: Reranking of Elasticsearch results (10k entries per query) using RoBERTa.

BITEM_COVOC4

  • Run ID: BITEM_COVOC4
  • Participant: BITEM
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 42ae53c960e860a404541b489373a960
  • Run description: Automatic run based on the search results from BITEMBL (ElasticSearch query based on the three fields query+question+narrative, normalization, token boosting). In this run, we tried to prioritize the documents according to the axes identified in the COVoc terminology.

bm25_bertsim_run4

  • Run ID: bm25_bertsim_run4
  • Participant: UH_UAQ
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 229d8a52fabb2fba52fbc27f8dbdcb3f
  • Run description: BM25 + BERT + similarity.

bm25_bl_run4

  • Run ID: bm25_bl_run4
  • Participant: UH_UAQ
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 1da7ce06d7b1bff63fe54241dec6656d
  • Run description: BM25 + minor changes + baseline

CincyMedIR-7

  • Run ID: CincyMedIR-7
  • Participant: CincyMedIR
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 7e8422fa7c812e5990179983f7bb1d84
  • Run description: Query expanded using Lexigram; documents occurring in previous qrels files removed; learning to rank implemented to rescore the initial ranks. Model used: Linear Regression, with Elasticsearch set to cross_fields and queries searched against title, abstract, and metamap_terms for title and abstract.

CincyMedIR-8

  • Run ID: CincyMedIR-8
  • Participant: CincyMedIR
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 551b7138d0b89fd99fda7fbc3f4f1087
  • Run description: Query expanded using Lexigram; documents occurring in previous qrels files removed; learning to rank implemented to rescore the initial ranks. Model used: AdaRank, with Elasticsearch set to cross_fields and queries searched against title, abstract, and metamap_terms for title and abstract.

CincyMedIR-9

  • Run ID: CincyMedIR-9
  • Participant: CincyMedIR
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 2ce3627604907434628053e06eea1655
  • Run description: Query expanded using Lexigram; documents occurring in previous qrels files removed; learning to rank implemented to rescore the initial ranks. Model used: RankBoost, with Elasticsearch set to cross_fields and queries searched against title, abstract, and metamap_terms for title and abstract.

combined

  • Run ID: combined
  • Participant: risklick
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: cb4e9389adac505b5b0a69ffc31ed3bd
  • Run description: A combination of basic IR models such as BM25 and DFR on topics expanded with COVID-related ontologies.

covidex.r4.d2q.duot5

  • Run ID: covidex.r4.d2q.duot5
  • Participant: covidex
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: ff3b5c2c8fe704f14e0b12744ec8d9bf
  • Run description: A pairwise reranker (duoT5) applied to the top 50 documents from a pointwise reranker (monoT5); see the aggregation sketch below. Documents were expanded with doc2query prior to indexing. Both rerankers were trained on MedMARCO (MacAvaney et al., SLEDGE, 2020). Initial runs are baselines 7 and 8 from https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md#round-4
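
The description composes a pointwise and a pairwise stage. As a rough illustration of the pairwise step, one simple aggregation from the duoT5 paper sums each document's win probabilities over all opponents; `pair_score` below stands in for the actual duoT5 model call.

```python
# Sketch: turning pairwise preferences over the top-k into a single ranking.
from itertools import permutations

def pairwise_rerank(query, docs, pair_score):
    # pair_score(query, d_i, d_j) -> p(d_i more relevant than d_j), duoT5-style
    scores = {d: 0.0 for d in docs}
    for d_i, d_j in permutations(docs, 2):  # every ordered pair of candidates
        scores[d_i] += pair_score(query, d_i, d_j)
    return sorted(docs, key=scores.get, reverse=True)

# Usage: docs would be the top-50 output of the monoT5 pointwise pass.
```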

covidex.r4.duot5

  • Run ID: covidex.r4.duot5
  • Participant: covidex
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 7f2374987d484a074359fc96e6c4fa37
  • Run description: A pairwise reranker (duoT5) using top-50 documents from a pointwise reranker (monoT5). Both rerankers were trained on MedMARCO (MacAvaney et al., SLEDGE, 2020). Initial runs are baselines 7 and 8 from https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md#round-4.

covidex.r4.duot5.lr

  • Run ID: covidex.r4.duot5.lr
  • Participant: covidex
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: eb33630eb27f1118a61d105ee718879f
  • Run description: Interpolation (alpha=0.6) of covidex.r4.d2q.duot5 scores and scores from a logistic regression classifier trained on the qrels of rounds 1-3 with tf-idf features as input (sketched below).
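
A small, self-contained illustration of the interpolation described above; the tiny corpus and scores are placeholders, and training on rounds 1-3 qrels is reduced to three labeled examples.

```python
# Sketch: alpha-blend of reranker scores with a tf-idf logistic-regression classifier.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = ['coronavirus vaccine trial', 'unrelated weather report',
               'sars-cov-2 antibody response']
train_labels = [1, 0, 1]                    # stand-in for rounds 1-3 judgments
candidate_texts = ['mrna vaccine immunogenicity', 'stock market news']
duot5_scores = np.array([0.92, 0.40])       # stand-in reranker scores

vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(train_texts), train_labels)
clf_probs = clf.predict_proba(vec.transform(candidate_texts))[:, 1]

alpha = 0.6
final = alpha * duot5_scores + (1 - alpha) * clf_probs
print(final)   # interpolated scores used for the final ranking
```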

CSIROmedBM

  • Run ID: CSIROmedBM
  • Participant: CSIROmed
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: c2d7821dd3597b1903815b2a73219432
  • Run description: BM25 retrieval using the question, narrative, and query over the abstract, full-text, and title fields. Same as CSIROmedNIR, but BM25 only.

CSIROmedNIR

  • Run ID: CSIROmedNIR
  • Participant: CSIROmed
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: 7c88505c0323b7a2ee8a83b9f92e73ee
  • Run description: Neural index with cosine similarity for retrieval: the question, narrative, and query are matched against mean sentence embeddings over the abstract and title fields, plus a BM25 score over the full text (see the sketch below). Same as round 3, except using a clinical COVID BERT.
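
A minimal sketch of the scoring combination this description suggests; how the neural and BM25 components were weighted is not stated, so the unweighted sum below is an assumption.

```python
# Sketch: cosine of query embedding vs. mean sentence embedding, plus full-text BM25.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nir_score(query_emb, sentence_embs, bm25_fulltext):
    doc_emb = np.mean(sentence_embs, axis=0)   # mean of title+abstract sentence embeddings
    return cosine(query_emb, doc_emb) + bm25_fulltext  # assumed unweighted combination

print(nir_score(np.array([1.0, 0.0]), np.array([[0.9, 0.1], [0.8, 0.2]]), 4.2))
```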

CSIROmedNO

  • Run ID: CSIROmedNO
  • Participant: CSIROmed
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: 9212d4c9466b998f804a97fba823eec4
  • Run description: Neural index with cosine similarity for retrieval: the question, narrative, and query are matched against mean sentence embeddings over the abstract and title fields. Same as CSIROmedNIR, but without BM25.

Emory_rnd4_run1

  • Run ID: Emory_rnd4_run1
  • Participant: Emory_IRLab
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 4c5b50827f99bbb9462fe0caf199b355
  • Run description: This run is obtained by training a LambdaMART model (a training sketch follows below). The features include BM25 scores on different document fields, the RM3 score from the Anserini baseline, and a scaled date value. In addition, the number and frequency of covered query terms, and the covered number and frequency of query entities from scispacy, are added as features for the LambdaMART reranker.
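
A sketch of the training setup this description implies, using LightGBM's LambdaMART ('lambdarank') objective; the random features and labels are placeholders standing in for the listed per-field BM25, RM3, date, and coverage features.

```python
# Sketch: LambdaMART reranking over hand-built features with LightGBM.
import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(0)
# Columns: bm25_title, bm25_abstract, rm3_score, scaled_date, query_term_coverage
X = rng.random((100, 5))
y = rng.integers(0, 3, size=100)    # graded relevance labels from past qrels
group = [50, 50]                    # candidate counts per topic (two toy topics)

ranker = lgb.LGBMRanker(objective='lambdarank', n_estimators=50)
ranker.fit(X, y, group=group)
print(ranker.predict(X[:50])[:5])   # rescored candidates for the first topic
```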

Emory_rnd4_run2

  • Run ID: Emory_rnd4_run2
  • Participant: Emory_IRLab
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: cce688769079c9c2a1dd490c6a73ad63
  • Run description: This run is obtained by training a LambdaMART model. The features include BM25 scores on different document fields, the RM3 score from the Anserini baseline, and a scaled date value. In addition, the number and frequency of covered query terms, and the covered number and frequency of query entities from scispacy, are added as features for the LambdaMART reranker.

Emory_rnd4_run3

  • Run ID: Emory_rnd4_run3
  • Participant: Emory_IRLab
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 912a357a257cf258bd6cacfc2314ebfc
  • Run description: This run is obtained by training a LambdaMART model. The features include BM25 scores on different document fields, the RM3 score from the Anserini baseline, and a scaled date value. In addition, the number and frequency of covered query terms, and the covered number and frequency of query entities from scispacy, are added as features for the LambdaMART reranker.

HKPU-Gos1-pPRF

  • Run ID: HKPU-Gos1-pPRF
  • Participant: HKPU
  • Track: Round 4
  • Year: 2020
  • Submission: 7/3/2020
  • Type: automatic
  • MD5: 068cfba5187b9df9ac36a9255d2c04a2
  • Run description: The index is built from the combined title and abstract fields of the metadata file. Retrieval is performed with the scoring function #1 presented in the Conclusion section (p.469) of the paper by Goswami et al. (Goswami et al, Exploring the space of information retrieval term scoring functions, Information Processing and Management, 53 (2017), p.454-472), on long queries consisting of the combined Query, Question and Narrative. Passage-based retrieval with pseudo-relevance feedback is employed.

HKPU-MATF-pPRF

  • Run ID: HKPU-MATF-pPRF
  • Participant: HKPU
  • Track: Round 4
  • Year: 2020
  • Submission: 7/3/2020
  • Type: automatic
  • MD5: 05b107ebdb1d0a48305f6c92bba1ae10
  • Run description: The index is built from the combined title and abstract fields of the metadata file. Retrieval is performed by the MATF model (Paik, J.H., A novel tf-idf weighting scheme for effective ranking. In Proceedings of the ACM SIGIR (2013), pp. 343-352), on long queries consisting of the combined Query, Question and Narrative. Passage-based retrieval with pseudo-relevance feedback is employed.

HKPU-SPUD-pPRF

  • Run ID: HKPU-SPUD-pPRF
  • Participant: HKPU
  • Track: Round 4
  • Year: 2020
  • Submission: 7/3/2020
  • Type: automatic
  • MD5: 52c5fcb5445c904ae41dba7d710c8844
  • Run description: The index is built from the combined title and abstract fields of the metadata file. Retrieval is performed by the SPUD model (Cummins et al., A polya urn document language model for improved information retrieval. ACM TOIS 33, 4, Article 21 (2015), p.1-34.), on long queries consisting of the combined Query, Question and Narrative. Passage-based retrieval with pseudo-relevance feedback is employed.

ILPS_UvA_allrounds_c

  • Run ID: ILPS_UvA_allrounds_c
  • Participant: ILPS_UvA
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 7544c95d685e802baaa3e8bde5d83049
  • Run description: We used a weak supervision technique to rerank with covidBERT (abstracts only). This run was selected due to promising results under our soft evaluation protocol, and because the reranker's performance did not seem to drop after fine-tuning for more than 3 epochs.

ILPS_UvA_big_diverse

  • Run ID: ILPS_UvA_big_diverse
  • Participant: ILPS_UvA
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 14f0907896cd59025f088e6dea887f4f
  • Run description: We used a weak supervision technique to rerank with BERT (abstracts only). This run was selected due to the large number of training examples used and a large gap between our soft and hard evaluation protocols, which could suggest that the model makes diverse predictions.

ILPS_UvA_zeroshot_BE

  • Run ID: ILPS_UvA_zeroshot_BE
  • Participant: ILPS_UvA
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 09f151002069f7d255d276a2f58f86e0
  • Run description: Labels: zeroshot weak (with sample size = 5, negcutoff=1000, GM) + BERT re-ranker

jlbasernd4

  • Run ID: jlbasernd4
  • Participant: julielab
  • Track: Round 4
  • Year: 2020
  • Submission: 6/30/2020
  • Type: manual
  • MD5: be2d7a0145d3a2b078fbb4086875828f
  • Run description: ElasticSearch with BM25 default settings. Index documents are the document paragraphs. Stop word filtered query as mandatory clause. Stop word filtered question and narrative as optional clause.

jlbasernd4-jlQErnd4

  • Run ID: jlbasernd4-jlQErnd4
  • Participant: julielab
  • Track: Round 4
  • Year: 2020
  • Submission: 6/30/2020
  • Type: manual
  • MD5: 46c7a07c73a56768e6bec38b6821f786
  • Run description: Reciprocal Rank Fusion between jlbasernd4 and jlQErnd4 (see the RRF sketch below).
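
Reciprocal Rank Fusion itself is compact enough to show directly; a minimal sketch with toy runs, where each run maps a docid to its 1-based rank and k = 60 is the usual constant from Cormack et al. (2009).

```python
# Sketch: Reciprocal Rank Fusion of two runs.
def rrf(runs, k=60):
    fused = {}
    for run in runs:
        for docid, rank in run.items():
            fused[docid] = fused.get(docid, 0.0) + 1.0 / (k + rank)
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

run_a = {'d1': 1, 'd2': 2, 'd3': 3}   # e.g. jlbasernd4
run_b = {'d2': 1, 'd3': 2, 'd4': 3}   # e.g. jlQErnd4
print(rrf([run_a, run_b]))            # d2/d3 gain from appearing in both runs
```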

jlQErnd4

  • Run ID: jlQErnd4
  • Participant: julielab
  • Track: Round 4
  • Year: 2020
  • Submission: 6/30/2020
  • Type: manual
  • MD5: e20d5712efca1c5c17606436f89782ce
  • Run description: The token "coronavirus" is present in most queries. However, Coronaviridae are a family of viruses not limited to SARS-CoV-2, so many false positives are likely. The same holds for other terms, such as "animal model": this term does not occur often, since most researchers name the specific animal used as a model organism (such as mice or rats). We therefore created a list of synonyms to make these general terms more specific. We found that nouns carry most of the information, so we used a part-of-speech tagger to isolate nouns and filtered them with a manually curated blacklist and a general stop-word list (a noun-extraction sketch follows below). Our final query consists of four parts: several synonyms and spellings for the illness and the virus respectively, the query, the question, and finally a bag of words containing the filtered nouns from both the question and the query. We had planned to include the filtered nouns of the narrative as an optional part of the query, but evaluation on the previous round showed no benefit.
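
A sketch of the noun-isolation step, using spaCy as the part-of-speech tagger; the team's actual tagger, blacklist, and stop-word list are not specified, so the lists here are illustrative.

```python
# Sketch: isolate nouns, then filter with a blacklist and stop words.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load('en_core_web_sm')
STOPWORDS = nlp.Defaults.stop_words
BLACKLIST = {'coronavirus', 'model'}   # illustrative overly-general terms

def filtered_nouns(text):
    return [t.text.lower() for t in nlp(text)
            if t.pos_ == 'NOUN' and t.text.lower() not in STOPWORDS | BLACKLIST]

print(filtered_nouns('What animal models are used to study coronavirus vaccines?'))
# output depends on the tagger's choices, e.g. ['animal', 'models', 'vaccines']
```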

l2r

  • Run ID: l2r
  • Participant: risklick
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: ae9720a1ae8ed348cada5e2807507789
  • Run description: Basic learning to rank applied to an IR run.

Marouane_eQQ_EnM-BTN

  • Run ID: Marouane_eQQ_EnM-BTN
  • Participant: Marouane
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 11cda52bc09fdf70fa470459d65370cd
  • Run description: In this run we rank the documents using an ensemble model (EnM): EnM = 0.38*BM25 + 0.34*TFIDF + 0.28*NMF (a scoring sketch follows below). We train/rank on a small subset (18,674 papers; 24,909 tokens) of papers published after 2019-12 that have a PMC-parsed file. See [1] (Section 7.2, Coherence) for why we use NMF (Non-negative Matrix Factorization). We expand the Query+Question using "Topic models RM"; see [1] (Section 3.3). [1] https://www.kaggle.com/atmarouane/covid-19-search-engine-indexing-by-lda-enm [2] Indexing by Latent Dirichlet Allocation and Ensemble Model
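
The stated combination is concrete enough to sketch. The component scores must be made commensurable before mixing; the per-topic min-max normalization below is an assumption the description does not spell out.

```python
# Sketch: EnM = 0.38*BM25 + 0.34*TFIDF + 0.28*NMF over per-topic normalized scores.
import numpy as np

def minmax(x):
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span else np.zeros_like(x)

def enm(bm25, tfidf, nmf, w=(0.38, 0.34, 0.28)):
    return w[0] * minmax(bm25) + w[1] * minmax(tfidf) + w[2] * minmax(nmf)

print(enm([3.1, 1.2, 0.4], [0.9, 0.5, 0.1], [0.2, 0.8, 0.3]))
```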

Marouane_QQ_EnM-BTN

  • Run ID: Marouane_QQ_EnM-BTN
  • Participant: Marouane
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 7c8d7d839b36295db360b93a6ff1dde3
  • Run description: In this run we rank the documents using an ensemble model (EnM): EnM = 0.38*BM25 + 0.34*TFIDF + 0.28*NMF (see the sketch under Marouane_eQQ_EnM-BTN above). We train/rank on a small subset (18,674 papers; 24,909 tokens) of papers published after 2019-12 that have a PMC-parsed file. See [1] (Section 7.2, Coherence) for why we use NMF (Non-negative Matrix Factorization). We use Query+Question. [1] https://www.kaggle.com/atmarouane/covid-19-search-engine-indexing-by-lda-enm [2] Indexing by Latent Dirichlet Allocation and Ensemble Model

mpiid5_run1

  • Run ID: mpiid5_run1
  • Participant: mpiid5
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 99f6ec044eb17d0aa3cf809bd4707f62
  • Run description: We re-rank the top 100 and then the top 101-2000 documents from the Anserini fusion2 baseline (8th row). For re-ranking, we use an ELECTRA-Base model fine-tuned on the MS MARCO passage dataset and further fine-tuned on the TREC-COVID rounds 1-3 full-text collection. We use the question queries for re-ranking.

OHSU_R4_totalfusion

  • Run ID: OHSU_R4_totalfusion
  • Participant: OHSU
  • Track: Round 4
  • Year: 2020
  • Submission: 7/4/2020
  • Type: manual
  • MD5: 253aefc5d70c3d11d46734b4186ad191
  • Run description: Anserini with BM25 and RM3 reranking was used. A query consisting of the processed question, query, and narrative was used to create a fusion run over the abstract, full-text, and paragraph indexes.

OHSU_TF_UDGEN_AVG

  • Run ID: OHSU_TF_UDGEN_AVG
  • Participant: OHSU
  • Track: Round 4
  • Year: 2020
  • Submission: 7/4/2020
  • Type: manual
  • MD5: 70cc402f50b2d09ee7ab2587092f1d4a
  • Run description: The UDel query generator was used to create a fusion run consisting of searches on the abstract, full-text, and paragraph indices using BM25 and RM3 (Anserini). The top 250 documents were reranked using SciBERT tuned on PubMed and CORD-19.

OHSU_totalfusion_avg

  • Run ID: OHSU_totalfusion_avg
  • Participant: OHSU
  • Track: Round 4
  • Year: 2020
  • Submission: 7/4/2020
  • Type: manual
  • MD5: 8f4e1bef7b387b4ec65de1271c13c049
  • Run description: A combination of query, question, and narrative was searched against the abstract, full-text, and paragraph indices to create a fusion baseline. The top 250 documents per topic were reranked using SciBERT tuned on PubMed and CORD-19.

poznan_p4_run1

  • Run ID: poznan_p4_run1
  • Participant: POZNAN
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 8067581eb31632e5c3b83f63c94339dd
  • Run description: BM25_DFR baseline with query expansion. Parameters: d = 30; t = 100. Indexed: abstract, title. Performed with the Terrier engine.

poznan_p4_run2

  • Run ID: poznan_p4_run2
  • Participant: POZNAN
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 9f81b8b11f84cf36d3dbd822a359fab3
  • Run description: InL2 baseline with query expansion. Parameters: d = 30; t = 100. Indexed: abstract, title. Performed with the Terrier engine.

poznan_p4_run3

  • Run ID: poznan_p4_run3
  • Participant: POZNAN
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: 8abf2204ad02783b7d4d91bbd1dc04cf
  • Run description: Aggressive reranking of the BM25 baseline (sketched below). The abstract is tokenized into sentences, and each word in a short query is compared with each word in the document abstract. The five sentences with the highest sum of the top five most similar words to the query are picked, and the sum of the top five word similarities in those top five sentences is added to the score. Word similarity is defined as the cosine similarity between word vectors (embeddings computed with word2vec).
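
The reranking rule is spelled out precisely enough to transcribe; in this sketch, `wv` is assumed to be a gensim word2vec KeyedVectors instance, and tokenization is elided.

```python
# Sketch: per-sentence score = sum of its top-5 query-word similarities;
# the bonus added to the BM25 score = sum over the top-5 sentences.
def sentence_bonus(query_words, sentences, wv, top_words=5, top_sents=5):
    per_sentence = []
    for sent in sentences:                      # each sentence is a list of words
        sims = sorted((wv.similarity(q, w)
                       for q in query_words for w in sent
                       if q in wv and w in wv), reverse=True)
        per_sentence.append(sum(sims[:top_words]))
    return sum(sorted(per_sentence, reverse=True)[:top_sents])

# final_score = bm25_score + sentence_bonus(query_terms, abstract_sentences, wv)
```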

r4.fusion1

  • Run ID: r4.fusion1
  • Participant: anserini
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: a8ab52e12c151012adbfc8e37d666760
  • Run description: Anserini fusion run corresponding to row 7 in table for Round 4 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

r4.fusion2

  • Run ID: r4.fusion2
  • Participant: anserini
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 1500104c928f463f38e76b58b91d4c07
  • Run description: Anserini fusion run corresponding to row 8 in table for Round 4 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

r4.rf

  • Run ID: r4.rf
  • Participant: anserini
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 41d746eb86a99d2f33068ebc195072cd
  • Run description: Anserini relevance feedback run corresponding to row 9 in table for Round 4 at https://github.com/castorini/anserini/blob/master/docs/experiments-covid.md

run1_C_A_SciB

  • Run ID: run1_C_A_SciB
  • Participant: CIR
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 652d955748555a1e5c074602b6532834
  • Run description: Fusion of: (1) the Anserini BM25 baseline (r4.rf), (2) Med-MARCO ANN dense-embedding paragraph retrieval, (3) a Med-MARCO SciBERT paragraph reranker (top 1000). Fusion weights tuned. The language model was fine-tuned on round 4 documents using the MLM task.

run2_Crf_A_SciB_MAP

  • Run ID: run2_Crf_A_SciB_MAP
  • Participant: CIR
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 5480199e26e876334ba8eb7bd22ae0a4
  • Run description: Fusion of: (1) the Anserini BM25 baseline (r4.rf), (2) Med-MARCO ANN dense-embedding paragraph retrieval with PRF based on embedding similarity, (3) a Med-MARCO SciBERT paragraph reranker (top 1000). Fusion weights tuned on final MAP. The language model was fine-tuned on round 4 documents using the MLM task.

run3_Crf_A_SciB_N

  • Run ID: run3_Crf_A_SciB_N
  • Participant: CIR
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: 7435c926407469d023e04d598a889acd
  • Run description: Fusion of: (1) the Anserini BM25 baseline (r4.rf), (2) Med-MARCO ANN dense-embedding paragraph retrieval with PRF based on embedding similarity, (3) a Med-MARCO SciBERT paragraph reranker (top 1000). Fusion weights tuned on NDCG@10. The language model was fine-tuned on round 4 documents using the MLM task.

sab20.4.dfo

  • Run ID: sab20.4.dfo
  • Participant: sabir
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: bdb4f90c71cbe3e6c10aff995dbba7d8
  • Run description: SMART vector DFO run. Base Lnu.ltu weights, with doc indexing = 0.4 * metadoc_Lnu_weighting + 0.6 * JSON_Lnu_weighting if a JSON doc exists (straight Lnu weighting if only metadata info). Runs the DFO algorithm (described in my TREC 2005 Routing track papers and later, e.g., the 2017 Core track). Uses relevance info on the rounds 1+2+3 collections to expand and optimize weights on that collection. Expands to the top 50 terms. Optimization ignores the top 30 nonrelevant docs (to accommodate incomplete judgments).

sab20.4.metadocs_m

  • Run ID: sab20.4.metadocs_m
  • Participant: sabir
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 74aa141f3511787b7dcb5e161785d62d
  • Run description: Standard SMART vector run based on Lnu document weighting (pivot 130, slope 0.12 for JSON; pivot 110, slope 0.24 for metadata) and ltu query weighting. Doc indexing: if only metadata info exists for a docid, it is used with Lnu weights. Each JSON doc is assigned final indexing as 0.4 * Metadata_Lnu_vector + 0.6 * JSON_Lnu_vector. After inverted retrieval, the highest similarity for each cord_uid is used.

sab20.4.rocchio

  • Run ID: sab20.4.rocchio
  • Participant: sabir
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 434e13f847f966feeabdc055916eb788
  • Run description: SMART vector Rocchio feedback run. Base Lnu.ltu weights, with doc indexing = 0.4 * metadoc_Lnu_weighting + 0.6 * JSON_Lnu_weighting if a JSON doc exists (straight Lnu weighting if only metadata info). Runs the Rocchio algorithm, using relevance info on the rounds 1+2+3 collections to expand and optimize query weights on that collection. Query term weights = 4 * original query weight + 12 * average weight in relevant docs - 12 * average weight in nonrelevant docs (see the sketch below). Expansion to the top 100 terms.
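
The weight update is given explicitly, so a direct transcription is possible. In this sketch, query and document vectors are dicts of term to SMART weight; only the stated coefficients (4, 12, 12) and the top-100 cutoff come from the description.

```python
# Sketch: Rocchio update w_new = 4*w_orig + 12*mean(rel) - 12*mean(nonrel), top 100 terms.
from collections import defaultdict

def rocchio(query, rel_docs, nonrel_docs, a=4.0, b=12.0, c=12.0, expand_to=100):
    new = defaultdict(float)
    for term, w in query.items():
        new[term] += a * w
    for docs, coef in ((rel_docs, b), (nonrel_docs, -c)):
        n = max(len(docs), 1)
        for doc in docs:
            for term, w in doc.items():
                new[term] += coef * w / n   # coef/n gives the average over the set
    top = sorted(new.items(), key=lambda kv: kv[1], reverse=True)[:expand_to]
    return {term: w for term, w in top if w > 0}

print(rocchio({'virus': 1.0}, [{'virus': 0.5, 'vaccine': 0.4}], [{'weather': 0.6}]))
```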

SFDC-fus12-enc34

  • Run ID: SFDC-fus12-enc34
  • Participant: SFDC
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 4250ec725b9e25dbd42fe0be3a78ed6e
  • Run description: Implements RRF(EncoderRerank(RRF(fusion1, fusion2)), encoder3, encoder4). fusion1 is Anserini baseline 135 and fusion2 is Anserini baseline 246. The encoders are hybrid semantic (bipartite-graph-trained SBERT) + keyword-based (TF-IDF) encoders: encoder3 is trained on TREC round 3 data and encoder4 on TREC round 4 data, both indexing round 4 data.

SFDC-re-fus12-enc24

  • Run ID: SFDC-re-fus12-enc24
  • Participant: SFDC
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 83e7effc87837a639059cc60510fa057
  • Run description: Implements RRF(EncoderRerank(RRF(fusion1, fusion2)), encoder2, encoder4). fusion1 is Anserini baseline 135 and fusion2 is Anserini baseline 246. The encoders are hybrid semantic (bipartite-graph-trained SBERT) + keyword-based (TF-IDF) encoders: encoder2 is trained on TREC round 2 data and encoder4 on TREC round 4 data, both indexing round 4 data.

SFDC-re-fus12-enc34

  • Run ID: SFDC-re-fus12-enc34
  • Participant: SFDC
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: 02c9d23e6d9cfe7d7663101890fe8e8d
  • Run description: Implements RRF(EncoderRerank(RRF(fusion1, fusion2)), encoder3, encoder4). fusion1 is Anserini baseline 135 and fusion2 is Anserini baseline 246. The encoders are hybrid semantic (bipartite-graph-trained SBERT) + keyword-based (TF-IDF) encoders: encoder3 is trained on TREC round 3 data and encoder4 on TREC round 4 data, both indexing round 4 data.

uab.run1

  • Run ID: uab.run1
  • Participant: UAlbertaSearch
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: eab2949e2b6313eb4232bf4b8e3fabb6
  • Run description: BM25+ ranking on documents using the "query" part of the topics as query terms (stop words removed). IDF is computed as the geometric mean of the IDFs from the CORD-19 corpus and the IDFs from another corpus (see the sketch below). Only documents that contain all of the query terms are returned.
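
A sketch of the scoring idea: BM25+ (Lv and Zhai's lower-bounded BM25) with each term's IDF replaced by the geometric mean of two corpora's IDFs, plus the conjunctive filter. The parameter values are conventional defaults, not the team's.

```python
# Sketch: BM25+ with geometric-mean IDF and an all-terms-required filter.
import math

def geo_idf(term, idf_cord19, idf_other):
    return math.sqrt(idf_cord19[term] * idf_other[term])

def bm25plus(doc_tf, doc_len, avg_len, query_terms, idf_cord19, idf_other,
             k1=1.2, b=0.75, delta=1.0):
    if any(t not in doc_tf for t in query_terms):   # conjunctive: every term must occur
        return None
    score = 0.0
    for t in query_terms:
        tf = doc_tf[t]
        norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
        score += geo_idf(t, idf_cord19, idf_other) * (norm + delta)
    return score

idf_a, idf_b = {'virus': 2.0, 'origin': 3.0}, {'virus': 1.5, 'origin': 4.0}
print(bm25plus({'virus': 3, 'origin': 1}, 120, 150, ['virus', 'origin'], idf_a, idf_b))
```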

ucd_cs_r4_r1

  • Run ID: ucd_cs_r4_r1
  • Participant: UCD_CS
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: ff603e977cda5f792463750557dbbc30
  • Run description: Fusion of BM25 runs using abstract and paragraph indexing, where queries are carefully expanded based on the past relevant full-texts.

ucd_cs_r4_r2

  • Run ID: ucd_cs_r4_r2
  • Participant: UCD_CS
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 5ee1c51e17ee4ecdf5714f50de77f869
  • Run description: Dense retrieval using the abstracts of the collection and the question field of the topics to train a BERT-based embedding model.

ucd_cs_r4_r3

  • Run ID: ucd_cs_r4_r3
  • Participant: UCD_CS
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: cb863353cfcc96bf043328d7c71eac8c
  • Run description: Fusion of ucd_cs_r4_r1 and ucd_cs_r4_r2 using RRF.

udel_fang_lambdarank

  • Run ID: udel_fang_lambdarank
  • Participant: udel_fang
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: b6318fe6d720ddf6583e96b5daec30cb
  • Run description: We build an index with the title and abstract from the metadata file. Non-stopwords in the query and entities tagged by SciSpacy in the question and narrative fields are assigned a weight ratio of 2:3:1 to form the query. We generate a run using relevance feedback on the first 40 queries and pseudo-relevance feedback on the last 5 queries. LambdaMART is used to re-rank the first 200 results. The features we use include BM25, SciBERT, recency, and others.

udel_fang_ltr_nobert

  • Run ID: udel_fang_ltr_nobert
  • Participant: udel_fang
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: feedback
  • MD5: 283695960de61bb5783d6720c420b423
  • Run description: We build an index with the title and abstract from the metadata file. Non-stopwords in the query and entities tagged by SciSpacy in the question and narrative fields are assigned a weight ratio of 2:3:1 to form the query. We generate a run using relevance feedback on the first 40 queries and pseudo-relevance feedback on the last 5 queries. LambdaMART is used to re-rank the first 200 results. The features we use include BM25, recency, and others.

udel_fang_nir

  • Run ID: udel_fang_nir
  • Participant: udel_fang
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: automatic
  • MD5: a28da843cf9cf2a13427da06091bf7f7
  • Run description: We build an index with the title and abstract from the metadata file. Non-stopwords in the query and entities tagged by SciSpacy in the question and narrative fields are assigned a weight ratio of 2:3:1 to form the query. We generate a run using relevance feedback on the first 40 queries and pseudo-relevance feedback on the last 5 queries. SciBERT is used to re-rank the first 500 results; it is fine-tuned on the full MS MARCO dataset.

uogTrDPH_QE

  • Run ID: uogTrDPH_QE
  • Participant: uogTr
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: fbdaf62e6d3d9ede890b64e0c8aea241
  • Run description: An automatic query expansion run using DFR query expansion built on pyTerrier.

uogTrDPH_QE_SCB1

  • Run ID: uogTrDPH_QE_SCB1
  • Participant: uogTr
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: bbfb1bb1a2aa3def50f926faab3a1663
  • Run description: An automatic query expansion run using DFR query expansion built on pyTerrier, linearly combined with a SciBERT model trained on MS MARCO medical queries.

uogTrDPH_QE_SCB_PM1

  • Run ID: uogTrDPH_QE_SCB_PM1
  • Participant: uogTr
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: 1e8082024d1b8ddf6a98fbd492a85f01
  • Run description: An automatic query expansion run using DFR query expansion built on pyTerrier, linearly combined with a SciBERT model trained on data from previous TREC Precision Medicine tracks.

UPrrf38rrf2-r4

  • Run ID: UPrrf38rrf2-r4
  • Participant: unique_ptr
  • Track: Round 4
  • Year: 2020
  • Submission: 7/3/2020
  • Type: automatic
  • MD5: 2acc532f7e6bf08c4a3cf6928b255406
  • Run description: This run is produced by Reciprocal Rank Fusion (Cormack et al., 2009) of (a) Terrier and Anserini retrieval runs and neural retrieval runs based on synthetic query generation (https://arxiv.org/abs/2004.14503) with various combinations of query, question, and narrative, and (b) TF-Ranking + ELECTRA with softmax loss (https://arxiv.org/abs/2004.08476) trained on the MS MARCO dataset. This is a fully automatic run, with no extra tuning based on the existing judgments from TREC-COVID.

UPrrf38rrf3-r4

  • Run ID: UPrrf38rrf3-r4
  • Participant: unique_ptr
  • Track: Round 4
  • Year: 2020
  • Submission: 7/3/2020
  • Type: feedback
  • MD5: d0b13fd313851758653d4597f035afe0
  • Run description: This run is produced by Reciprocal Rank Fusion (Cormack et al., 2009) of (a) Terrier and Anserini retrieval runs with various combinations of query, question, and narrative, including query expansion via pseudo and true relevance feedback, (b) neural retrieval runs based on synthetic query generation (https://arxiv.org/abs/2004.14503), and (c) TF-Ranking + BERT with softmax loss (https://arxiv.org/abs/2004.08476) fine-tuned on the relevance judgments from rounds 1-3.

UPrrf38rrf3v2-r4

  • Run ID: UPrrf38rrf3v2-r4
  • Participant: unique_ptr
  • Track: Round 4
  • Year: 2020
  • Submission: 7/3/2020
  • Type: feedback
  • MD5: 598941712d8e3c03cfad86a6eca95b93
  • Run description: This run is produced by Reciprocal Rank Fusion (Cormack et al., 2009) of (a) Terrier and Anserini retrieval runs and neural retrieval runs based on synthetic query generation (https://arxiv.org/abs/2004.14503) with various combinations of query, question, and narrative, (b) TF-Ranking + ELECTRA with softmax loss (https://arxiv.org/abs/2004.08476) trained on the MS MARCO dataset, and (c) TF-Ranking + BERT with softmax loss fine-tuned on the relevance judgments from rounds 1-3.

uw_base

  • Run ID: uw_base
  • Participant: UWMadison_iSchool
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: automatic
  • MD5: f053cfdacd13a777d7a8d45d4b9f4e7f
  • Run description: Simple QL (query likelihood): Lucene standard analyzer, Krovetz stemming, INQUERY stop words. Field weights: title:abstract:body = 5:5:1; query:description:narrative = 0.2:0.4:0.4.

uw_crowd

  • Run ID: uw_crowd
  • Participant: UWMadison_iSchool
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: manual
  • MD5: 4d0b06fedc4085206bcb4958ad1610aa
  • Run description: KLD; the query model (QM) is an interpolation between the original query and alternative queries from the crowd.

uw_fb

  • Run ID: uw_fb
  • Participant: UWMadison_iSchool
  • Track: Round 4
  • Year: 2020
  • Submission: 7/6/2020
  • Type: feedback
  • MD5: dfd80d74ec328e51e6f486d4676aa220
  • Run description: KLD; the query model (QM) is an interpolation between the original query and a parsimonious LM of relevant docs that factors out nonrelevant docs (see the sketch below). Lucene standard analyzer, Krovetz stemming, INQUERY stop words. Field weights: title:abstract:body = 5:5:1; query:description:narrative = 0.2:0.4:0.4; query:FBQM = 0.5:0.5.
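
A sketch of the query-model interpolation and KLD scoring named above; the parsimonious-LM estimation itself (an EM procedure that factors background and nonrelevant mass out of the feedback model) is abstracted into the `feedback_lm` argument.

```python
# Sketch: QM = 0.5*query LM + 0.5*feedback LM; rank by cross entropy with the doc LM
# (rank-equivalent to negative KL divergence).
import math

def interpolate_qm(query_lm, feedback_lm, lam=0.5):
    terms = set(query_lm) | set(feedback_lm)
    return {t: lam * query_lm.get(t, 0.0) + (1 - lam) * feedback_lm.get(t, 0.0)
            for t in terms}

def kld_score(qm, doc_lm):
    # doc_lm should already be smoothed so judged terms have nonzero probability
    return sum(p * math.log(doc_lm[t]) for t, p in qm.items()
               if doc_lm.get(t, 0.0) > 0)

qm = interpolate_qm({'virus': 0.6, 'origin': 0.4}, {'bat': 0.5, 'virus': 0.5})
print(kld_score(qm, {'virus': 0.3, 'origin': 0.2, 'bat': 0.1}))
```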

xj4wang_run1

  • Run ID: xj4wang_run1
  • Participant: xj4wang
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: manual
  • MD5: 0615cff086d1a2639349ca01a182d278
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with one synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1]. SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2]. Stopping conditions for manual labeling were chosen largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

xj4wang_run2

  • Run ID: xj4wang_run2
  • Participant: xj4wang
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: manual
  • MD5: d0d07a5a10a9c6fc7f32b6c595f29fe1
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with one synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1]. SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2]. Stopping conditions for manual labeling were chosen largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf

xj4wang_run3

  • Run ID: xj4wang_run3
  • Participant: xj4wang
  • Track: Round 4
  • Year: 2020
  • Submission: 7/5/2020
  • Type: manual
  • MD5: 351135d0982f7b3689eb4bb2b5e6e08a
  • Run description: The retrieval model used is BMI (Baseline Model Implementation), provided as a starter by Gordon Cormack for the TREC 2015/2016 Total Recall track, with human assessors in place of the server (manual processing) [1]. In more detail: it uses the CAL (Continuous Active Learning) method, starting with one synthetic file created from the given topics, word for word; this method is described by Grossman and Cormack in [4]. Feature vectors are created using the BMI tools [1]. SofiaML is used as the learner. The weighting scheme was chosen largely based on the work of Cormack and Grossman in [2]. Stopping conditions for manual labeling were chosen largely based on the work of Grossman et al. in [3]. References: [1] https://cormack.uwaterloo.ca/trecvm/ [2] https://doi.org/10.1145/2600428.2609601 [3] https://trec.nist.gov/pubs/trec25/papers/Overview-TR.pdf [4] https://cormack.uwaterloo.ca/caldemo/AprMay16_EdiscoveryBulletin.pdf