
Runs - Deep Learning 2021

bcai_bertm1_ens

Results | Participants | Input | Summary | Appendix

  • Run ID: bcai_bertm1_ens
  • Participant: bcai
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: docs
  • MD5: d39faeb40b6f2b37a13306e5ded2e34b
  • Run description: Candidate generation: see the description of run bl_bcai_nn_rtr. Ranking: the top-200 entries are re-ranked using five BERT-Model 1 models previously used on the MS MARCO v1 leaderboard; these models were fine-tuned on current MS MARCO data.

bcai_p_mbert

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: bcai_p_mbert
  • Participant: bcai
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: a13b820d166178f30625dbf5a942aab4
  • Run description: 1. First stage: see run bl_bcai_p_nn_rt. 2. Re-ranking using an ensemble of four BERT-Model 1 models.

bcai_p_vbert

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: bcai_p_vbert
  • Participant: bcai
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: fa9350484fd7d78eee81096dc0892cb8
  • Run description: 1. First stage: see run bl_bcai_p_nn_rt. 2. Re-ranking using an ensemble of four vanilla BERT-large models.

bigram_qe_cedr

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: bigram_qe_cedr
  • Participant: CERTH_ITI_M4D
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: adaf17feaccdfe9026bd2389e2b4a6e4
  • Run description: Step 1: BM25 initial retrieval. Step 2: query expansion with contextualized embeddings (from untrained BERT). Step 3: BM25 with expanded queries. Step 4: re-ranking with CEDR. Query expansion hyperparameters were tuned with previous years' qrels on the new dataset; CEDR was trained with this year's train queries.

bigrams_cont_qe

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: bigrams_cont_qe
  • Participant: CERTH_ITI_M4D
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: c02497567f2d0ec068deeb545b591013
  • Run description: A simple query expansion pipeline without reranking. Retrieval with BM25 and query expansion based on contextualized embeddings produced by the default BERT (not fine-tuned). The query expansion technique is based on the work "CEQE: Contextualized Embeddings for Query Expansion" (Naseri et al.), with some variations; a sketch follows below. Previous years' qrels were used for hyperparameter tuning on the new dataset (v2).
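A minimal sketch of the kind of embedding-based expansion described above, assuming a generic contextualized encoder and candidate terms drawn from BM25 feedback documents; the actual CEQE term scoring differs in detail.

    # Hedged sketch of embedding-based query expansion (CEQE-style).
    # `embed` stands in for any contextualized encoder (e.g. default BERT);
    # it and the feedback-term source are assumptions, not the exact pipeline.
    import numpy as np

    def expand_query(query, feedback_terms, embed, k=5):
        """Append the k feedback terms most similar to the query embedding."""
        q = embed(query)                              # shape: (dim,)
        scored = []
        for term in set(feedback_terms):              # terms from BM25 top docs
            t = embed(term)
            sim = float(np.dot(q, t) / (np.linalg.norm(q) * np.linalg.norm(t)))
            scored.append((sim, term))
        top = [term for _, term in sorted(scored, reverse=True)[:k]]
        return query + " " + " ".join(top)            # re-run BM25 on this string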

bl_bcai_nn_rtr

Results | Participants | Input | Summary | Appendix

  • Run ID: bl_bcai_nn_rtr
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: docs
  • MD5: 0e98d39fedc57b99ed753122688cacd0
  • Run description: Direct retrieval using a fusion of ANCE (FirstP) and BM25 on doc2query-expanded text.

bl_bcai_p_nn_rt

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: bl_bcai_p_nn_rt
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 241539b1e600312b1769668497b19664
  • Run description: 1. First stage: see run bl_bcai_nn_rtr. 2. Re-ranking using a mix of passage BM25 and Model 1 (both neural and traditional) scores.

bl_bcai_p_trad

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: bl_bcai_p_trad
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: b849866801e62e02938c3c79c277155c
  • Run description: First, we retrieve passages using a non-neural approach (see run bl_bcai_bm25_mdl1). Then, we re-rank passages using a learned combination of document and passage scores, where passage scores include BM25 and Model 1 scores.

bl_bcai_trad

Results | Participants | Input | Summary | Appendix

  • Run ID: bl_bcai_trad
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: docs
  • MD5: b437127f4705e0664e2fe6861befe0da
  • Run description: Re-ranking of BM25 candidates using multi-field BM25 and IBM Model 1 scores.

bl_bcai_wloo_d

Results | Participants | Input | Summary | Appendix

  • Run ID: bl_bcai_wloo_d
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 9a3ac101acc72848446918b4b43a82e4
  • Run description: 1. First stage: Jimmy Lin's document retrieval using dense vectors. 2. Re-ranking using an ensemble of five Model 1 BERT models.

bl_bcai_wloo_p

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: bl_bcai_wloo_p
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 7f84298304a8fb077a042d779c8e5d9d
  • Run description: 1. First stage: Jimmy Lin's passage retrieval using dense vectors. 2. Re-ranking using an ensemble of four vanilla BERT-large models.

CIP_run1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CIP_run1
  • Participant: CIP
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: 2dca8d728e5b4783a790e878896d7d87
  • Run description: In this run, we use the BERT model to re-rank the official candidate documents. Specifically, we utilize BERT-large, first trained on the MS MARCO v1 passage small train triples and then fine-tuned on MS MARCO v2 document training data. This BERT re-ranker predicts the relevance of each passage to the query independently, and the document score is the average of the scores of the top-4 passages. All candidate documents are re-ranked by these document scores; a sketch of the aggregation follows below.
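A minimal sketch of the passage-score aggregation the CIP runs describe: the top-4 average used here and in CIP_run2, and the MaxP variant used in CIP_run3.

    # Sketch: aggregate per-passage relevance scores into a document score.
    def doc_score(passage_scores, mode="avg_top4"):
        ranked = sorted(passage_scores, reverse=True)
        if mode == "maxp":                    # CIP_run3: best passage only
            return ranked[0]
        top4 = ranked[:4]                     # CIP_run1/2: top-4 average
        return sum(top4) / len(top4)

    # Example: passages scored [0.9, 0.7, 0.4, 0.2, 0.1] give 0.55 under
    # avg_top4 and 0.9 under maxp.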

CIP_run2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CIP_run2
  • Participant: CIP
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: 3bb9ba53c3b3f7168f3b4811db73d08f
  • Run description: In this run, we use the BERT model to re-rank the official candidate documents. Specifically, we utilize BERT-large, first trained on the MS MARCO v1 passage small train triples, then fine-tuned on MS MARCO v2 passage data, and lastly fine-tuned on MS MARCO v2 document data. This BERT re-ranker predicts the relevance of each passage to the query independently, and the document score is the average of the scores of the top-4 passages. All candidate documents are re-ranked by these document scores.

CIP_run3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CIP_run3
  • Participant: CIP
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: 5f1276704e10d7143caa8f2e48b6cbc7
  • Run description: In this run, we use the BERT model to re-rank the official candidate documents. Specifically, we utilize BERT-large, first trained on the MS MARCO v1 passage small train triples and then fine-tuned on MS MARCO v2 document data. This BERT re-ranker predicts the relevance of each passage to the query independently, and the document score is the score of the best passage (MaxP). All candidate documents are re-ranked by these document scores.

d_bm25

Results | Participants | Input | Summary | Appendix

  • Run ID: d_bm25
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: ca88e3eaf20c01fe76787035168d9eac
  • Run description: Anserini BM25, default parameters; a usage sketch follows below.
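For reference, a hedged sketch of reproducing such a baseline with Pyserini (Anserini's Python front end); the prebuilt index name is an assumption and may not match the track's exact index.

    # Pyserini sketch of an Anserini BM25 baseline with default parameters.
    from pyserini.search.lucene import LuceneSearcher

    searcher = LuceneSearcher.from_prebuilt_index('msmarco-v2-doc')  # assumed name
    searcher.set_bm25(k1=0.9, b=0.4)          # Anserini's default parameters
    hits = searcher.search('how do magnetars form', k=100)
    for rank, hit in enumerate(hits, start=1):
        print(rank, hit.docid, round(hit.score, 4))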

d_bm25rm3

Results | Participants | Input | Summary | Appendix

  • Run ID: d_bm25rm3
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 402de11d1257fe37dba3c769fba855c1
  • Run description: Anserini BM25 + RM3, default parameters

d_f10_mdt53b

Results | Participants | Input | Summary | Appendix

  • Run ID: d_f10_mdt53b
  • Participant: h2oloo
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: c9f9b07f73973adc7c806b586f9729a8
  • Run description: Uses d_fusion10 as base run. Reranking using Mono-Duo-T5 3B (both trained on TCT-ColBERT HN mined from V2 Passage Collection).

d_f10_mdt5base

Results | Participants | Input | Summary | Appendix

  • Run ID: d_f10_mdt5base
  • Participant: h2oloo
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 088424e31a59c77bd01290869c7d4976
  • Run description: Uses d_fusion10 as base run. Reranking using Mono-Duo-T5 base (both trained on TCT-ColBERT HN mined from V2 Passage Collection).

d_f10_mt53b

Results | Participants | Input | Summary | Appendix

  • Run ID: d_f10_mt53b
  • Participant: h2oloo
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: a42e2f31a356ef65fe2ee7ddddfdd13d
  • Run description: Uses d_fusion10 as base run. Reranking using Mono-T5 3B (trained on TCT-ColBERT HN mined from V2 Passage Collection).

d_fusion00

Results | Participants | Input | Summary | Appendix

  • Run ID: d_fusion00
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 9624771f293886e4bac0e2b39fd57592
  • Run description: Hybrid of TCT-ColBERT HN+ dense retrieval (d_tct0) and uniCOIL (d_unicoil0).

d_fusion10

Results | Participants | Input | Summary | Appendix

  • Run ID: d_fusion10
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 10ec4560350c2ed87260ae6e5f379188
  • Run description: Hybrid of TCT-ColBERT HN+ dense retrieval (d_tct1) and uniCOIL (d_unicoil0).

d_tct0

Results | Participants | Input | Summary | Appendix

  • Run ID: d_tct0
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 675b17cd121ac1b033b9e70bb246eec6
  • Run description: TCT-ColBERT HN+ dense retrieval (trained on MS MARCO v1, zero shot)

d_tct1

Results | Participants | Input | Summary | Appendix

  • Run ID: d_tct1
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 8018f64051e8aa6254d8bb5164de7261
  • Run description: TCT-ColBERT HN+ dense retrieval (trained on MS MARCO v2)

d_unicoil0

Results | Participants | Input | Summary | Appendix

  • Run ID: d_unicoil0
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 5592899e3b7a1e14c3b83ed16d3d1550
  • Run description: uniCOIL sparse retrieval (no expansion, trained on MS MARCO v1, zero shot)

doc_full_100

Results | Participants | Input | Summary | Appendix

  • Run ID: doc_full_100
  • Participant: ALIBABA
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: f90df93af9798ea0fddbc1c1628789a5
  • Run description: ANCE + doc2query + PROP, top-100.

doc_full_100e

Results | Participants | Input | Summary | Appendix

  • Run ID: doc_full_100e
  • Participant: ALIBABA
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 087856a0251de74899c636ab21124d4e
  • Run description: ANCE + doc2query recall; prop_deepimpact.

doc_rank_100

Results | Participants | Input | Summary | Appendix

  • Run ID: doc_rank_100
  • Participant: ALIBABA
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 3ba5d8606b3d40bd19455a937d409718
  • Run description: prop

dseg_bm25

Results | Participants | Input | Summary | Appendix

  • Run ID: dseg_bm25
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 9b0c6169928932f05d9dc320e8da4902
  • Run description: Anserini BM25, default parameters, on segmented document corpus

dseg_bm25rm3

Results | Participants | Input | Summary | Appendix

  • Run ID: dseg_bm25rm3
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 5d1f5675ec90d68a83dcddedf44ff37b
  • Run description: Anserini BM25 + RM3, default parameters, on segmented document corpus

Fast_Forward_2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: Fast_Forward_2
  • Participant: L3S
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: docs
  • MD5: 129d34be0221b525dcbc81267d2ca88d
  • Run description: We retrieve the top 5000 documents from the sparse index for each query using BM25. After that, we compute the dense matching score of these 5000 query-document pairs using a pre-trained TCT-ColBERT model (castorini/tct_colbert-v2-msmarco). Finally, we interpolate the scores and keep the top 100 documents per query; a sketch of the interpolation follows below.
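A minimal sketch of the interpolation step these Fast-Forward runs share, assuming both score maps cover the BM25 candidates; the weight alpha is a placeholder, not the authors' tuned value.

    # Sketch: convex combination of sparse and dense scores, then a top-k cut.
    def interpolate(sparse_scores, dense_scores, alpha=0.5, k=100):
        combined = {
            doc: alpha * sparse_scores[doc] + (1 - alpha) * dense_scores[doc]
            for doc in sparse_scores          # BM25 top-5000 candidates
        }
        return sorted(combined.items(), key=lambda x: x[1], reverse=True)[:k]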

Fast_Forward_3

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: Fast_Forward_3
  • Participant: L3S
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: passages
  • MD5: 106894f297c6ceed0996240daac181f8
  • Run description: We retrieve the top 5000 passages from the sparse index for each query using BM25. After that, we compute the dense matching score of these 5000 query-passage pairs using a pre-trained TCT-ColBERT model (castorini/tct_colbert-v2-msmarco). Finally, we interpolate the scores and keep the top 100 passages per query.

Fast_Forward_5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: Fast_Forward_5
  • Participant: L3S
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: docs
  • MD5: d4f800b073d15fbff93992682277295a
  • Run description: We retrieve the top 5000 documents from the sparse index for each query using BM25. After that, we compute the dense matching score of these 5000 query-document pairs using a pre-trained TCT-ColBERT model (castorini/tct_colbert-v2-msmarco). Finally, we interpolate the scores and keep the top 100 documents per query.

Fast_Forward_7

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: Fast_Forward_7
  • Participant: L3S
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: docs
  • MD5: 9fd6cddc99a4efa31e8e762700c18cb8
  • Run description: We retrieve the top 5000 documents from the sparse index for each query using BM25. After that, we compute the dense matching score of these 5000 query-document pairs using a pre-trained TCT-ColBERT model (castorini/tct_colbert-v2-msmarco). Finally, we interpolate the scores and keep the top 100 documents per query.

Fast_ForwardP_2

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: Fast_ForwardP_2
  • Participant: L3S
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: passages
  • MD5: bf2eb22d96b104d3d3834b1adec61e0b
  • Run description: We retrieve the top 5000 passages from the sparse index for each query using BM25. After that, we compute the dense matching score of these 5000 query-passage pairs using a pre-trained TCT-ColBERT model (castorini/tct_colbert-v2-msmarco). Finally, we interpolate the scores and keep the top 100 passages per query.

Fast_ForwardP_5

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: Fast_ForwardP_5
  • Participant: L3S
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: passages
  • MD5: 7655e3e1c0ac9a01aa2c8b73bdbf046b
  • Run description: We retrieve the top 5000 passages from the sparse index for each query using BM25. After that, we compute the dense matching score of these 5000 query-passage pairs using a pre-trained TCT-ColBERT model (castorini/tct_colbert-v2-msmarco). Finally, we interpolate the scores and keep the top 100 passages per query.

ielab-AD-uni

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ielab-AD-uni
  • Participant: ielab
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: ff811e2c413cd3dc42139019216f99a5
  • Run description: This is a single-stage retrieval run in which we interpolate ADORE top-1000 passage scores with uniCOIL top-1000 scores; scores are normalised before interpolation. uniCOIL is a BERT-based retrieval method: it precomputes token scores for each passage at indexing time and requires a single BERT inference to get query token scores at query time. It was trained on the MS MARCO v1 training dataset, using relevance judgments as positive samples and negatives picked randomly from the BM25 top-1000. ADORE is a BERT-based dense retriever, also trained on the MS MARCO v1 training dataset.

ielab-AD-uni-d

Results | Participants | Input | Summary | Appendix

  • Run ID: ielab-AD-uni-d
  • Participant: ielab
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 2f72a4a54bce100e3c3c696311fe4c5f
  • Run description: This is a single-stage retrieval run in which we interpolate ADORE top-1000 passage scores with uniCOIL top-1000 scores; scores are normalised before interpolation. uniCOIL is a BERT-based retrieval method: it precomputes token scores for each passage at indexing time and requires a single BERT inference to get query token scores at query time. It was trained on the MS MARCO v1 training dataset, using relevance judgments as positive samples and negatives picked randomly from the BM25 top-1000. ADORE is a BERT-based dense retriever, also trained on the MS MARCO v1 training dataset. All document runs are generated from passage ranking runs: we use the passage-to-document id map to obtain document ids and use the passage score as the document score. If multiple passages from a document have been retrieved, we use the max score as the document score; a sketch of this conversion follows below.
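A minimal sketch of the passage-to-document conversion described above; `doc_id_of` stands in for the passage-id-to-document-id map and is an assumption about its form.

    # Sketch: map passage ids to document ids, keeping the max passage score
    # per document.
    from collections import defaultdict

    def passages_to_docs(passage_run, doc_id_of):
        doc_scores = defaultdict(lambda: float("-inf"))
        for pid, score in passage_run:        # (passage_id, score) pairs
            did = doc_id_of[pid]
            doc_scores[did] = max(doc_scores[did], score)
        return sorted(doc_scores.items(), key=lambda x: x[1], reverse=True)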

ielab-roberta1d

Results | Participants | Input | Summary | Appendix

  • Run ID: ielab-roberta1d
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 1b63f66eeaab43ad01325f32533ff3ff
  • Run description: roberta v1 is trained with the v2 collection training data; we use an NCE loss with 10 hard negatives sampled from the top-1000 results of a BM25-uniCOIL interpolation run. The model was trained on a single Tesla V100 16GB GPU with a batch size of 2 and a maximum length of 128; training took around 15 hours. We use the trained RoBERTa model to re-rank the top-100 passages retrieved by BM25 at query time. All document runs are generated from passage ranking runs: we use the passage-to-document id map to obtain document ids and use the passage score as the document score. If multiple passages from a document have been retrieved, we use the max score as the document score.

ielab-roberta2d

Results | Participants | Input | Summary | Appendix

  • Run ID: ielab-roberta2d
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 1ec01715e81b72d3c089fcd1ab41b264
  • Run description: roberta v2 is trained with the v2 collection training data; we use an NCE loss with 10 hard negatives sampled from the top-1000 results of a BM25-uniCOIL interpolation run. The model was trained on a single Tesla V100 16GB GPU with a batch size of 2 and a maximum length of 128; training took around 15 hours. We use the trained RoBERTa model to re-rank the top-100 passages retrieved by BM25 at query time. All document runs are generated from passage ranking runs: we use the passage-to-document id map to obtain document ids and use the passage score as the document score. If multiple passages from a document have been retrieved, we use the max score as the document score.

ielab-robertav1

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ielab-robertav1
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: c630f0d99df04e6e511c067a76bc792d
  • Run description: roberta v1 is trained with the v1 collection training data; we use an NCE loss with 10 hard negatives sampled from the top-1000 results of a BM25-uniCOIL interpolation run. The model was trained on a single Tesla V100 16GB GPU with a batch size of 2 and a maximum length of 128; training took around 15 hours. We use the trained RoBERTa model to re-rank the top-100 passages retrieved by BM25 at query time.

ielab-robertav2

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ielab-robertav2
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: fa9479541195579d548909a6f558b860
  • Run description: roberta v2 is trained with the v2 collection training data; we use an NCE loss with 10 hard negatives sampled from the top-1000 results of a BM25-uniCOIL interpolation run. The model was trained on a single Tesla V100 16GB GPU with a batch size of 2 and a maximum length of 128; training took around 15 hours. We use the trained RoBERTa model to re-rank the top-100 passages retrieved by BM25 at query time.

ielab-TILDEv2

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ielab-TILDEv2
  • Participant: ielab
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 337781dca7b8568c366d12e9bfef1080
  • Run description: This is a two-stage run: BM25 retrieves the top 1000 passages, and re-ranking is done with the TILDEv2 model. TILDEv2 is a BERT-based re-ranker: it uses BERT to precompute document representations at indexing time and only a tokeniser to process the query at query time. It was trained on the MS MARCO v1 training dataset, using relevance judgments as positive samples and negatives picked randomly from the BM25 top-1000. A sketch of the scoring pattern follows below.
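A hedged sketch of the scoring pattern this implies: per-passage token weights precomputed at indexing time and a tokenizer-only query side at ranking time; the real TILDEv2 scoring function may differ in detail.

    # Sketch: score a passage as the sum of its precomputed weights for the
    # query's tokens; no BERT inference is needed at query time.
    def tilde_score(query_tokens, passage_weights):
        """passage_weights: dict token -> weight, built at indexing time."""
        return sum(passage_weights.get(tok, 0.0) for tok in query_tokens)

    # Re-ranking BM25 candidates, where index[pid] holds a passage's weights:
    # reranked = sorted(bm25_top1000,
    #                   key=lambda pid: tilde_score(q_toks, index[pid]),
    #                   reverse=True)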

ielab-TILDEv2d

Results | Participants | Input | Summary | Appendix

  • Run ID: ielab-TILDEv2d
  • Participant: ielab
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 2bef8db01aa665c04ef2edfa97488d0b
  • Run description: This is a two-stage run: BM25 retrieves the top 1000 passages, and re-ranking is done with the TILDEv2 model. TILDEv2 is a BERT-based re-ranker: it uses BERT to precompute document representations at indexing time and only a tokeniser to process the query at query time. It was trained on the MS MARCO v1 training dataset, using relevance judgments as positive samples and negatives picked randomly from the BM25 top-1000. All document runs are generated from passage ranking runs: we use the passage-to-document id map to obtain document ids and use the passage score as the document score. If multiple passages from a document have been retrieved, we use the max score as the document score.

ielab-uniCOIL

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ielab-uniCOIL
  • Participant: ielab
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 71ae5e66995647b1e57a3f98f07c56ff
  • Run description: This is a single-stage retrieval run in which we interpolate BM25 top-1000 passage scores with uniCOIL top-1000 scores; scores are normalised before interpolation. uniCOIL is a BERT-based retrieval method: it precomputes token scores for each passage at indexing time and requires a single BERT inference to get query token scores at query time. It was trained on the MS MARCO v1 training dataset, using relevance judgments as positive samples and negatives picked randomly from the BM25 top-1000.

ielab-uniCOIL-d

Results | Participants | Input | Summary | Appendix

  • Run ID: ielab-uniCOIL-d
  • Participant: ielab
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 9cb982d12d737250d210af3da3d731bb
  • Run description: This is a single-stage retrieval run in which we interpolate BM25 top-1000 passage scores with uniCOIL top-1000 scores; scores are normalised before interpolation. uniCOIL is a BERT-based retrieval method: it precomputes token scores for each passage at indexing time and requires a single BERT inference to get query token scores at query time. It was trained on the MS MARCO v1 training dataset, using relevance judgments as positive samples and negatives picked randomly from the BM25 top-1000. All document runs are generated from passage ranking runs: we use the passage-to-document id map to obtain document ids and use the passage score as the document score. If multiple passages from a document have been retrieved, we use the max score as the document score.

ihsm_bicolbert

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ihsm_bicolbert
  • Participant: IHSM
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 14a030e67189667cad76ff02247cd204
  • Run description: ColBERT model from https://arxiv.org/pdf/2004.12832 with an additional hashing layer (as described in https://arxiv.org/pdf/2106.00882) after document vectorization to produce binary document vectors; the MaxSim ranker includes a de-binarization step. The model uses cosine as the similarity metric, with the output vector dimension set to 256. It was trained with MarginMSE loss on the MS MARCO dataset, with logits from the ensembled cross-encoder models introduced in https://arxiv.org/pdf/2010.02666. A sketch of binarized MaxSim follows below.
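A minimal sketch of MaxSim over binarized document token vectors, assuming L2-normalized query vectors and sign-based binarization; the run's actual hashing layer is learned, so this is illustrative only.

    # Sketch: de-binarize stored sign bits to +/-1 vectors, then ColBERT-style
    # MaxSim with cosine similarity.
    import numpy as np

    def maxsim_binary(q_vecs, d_bits):
        """q_vecs: (Lq, dim) L2-normalized query token vectors;
           d_bits: (Ld, dim) boolean array stored in the index."""
        d_vecs = np.where(d_bits, 1.0, -1.0)  # de-binarization step
        d_vecs /= np.linalg.norm(d_vecs, axis=1, keepdims=True)
        sims = q_vecs @ d_vecs.T              # cosine similarity matrix
        return float(sims.max(axis=1).sum())  # max over doc tokens, sum over query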

ihsm_colbert64

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ihsm_colbert64
  • Participant: IHSM
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 12b2ca6f7d272a029fde8d5366d42540
  • Run description: ColBERT model, introduced in https://arxiv.org/pdf/2004.12832, with additional layer normalization after the final dimensionality-reduction linear layer. The output vector dimension is set to 64, and L2 distance is used as the similarity metric. The model was trained with MarginMSE loss on the MS MARCO dataset, with logits from the ensembled cross-encoder models introduced in https://arxiv.org/pdf/2010.02666.

ihsm_poly8q

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: ihsm_poly8q
  • Participant: IHSM
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: aa9b66f992281640fcd803d649539506
  • Run description: The Poly-encoder architecture was first described in https://arxiv.org/abs/1905.01969. It is a split encoder with three main parts: a candidate encoder, a context encoder, and a ranker. The candidate encoder allows all document vectors to be precomputed and stored in a search index. This Poly-encoder uses https://huggingface.co/castorini/tct_colbert-v2-hn-msmarco as the encoder, with 8 query codes and dot product as the score. The model was trained with MarginMSE loss on the MS MARCO dataset, with logits from the ensembled cross-encoder models introduced in https://arxiv.org/pdf/2010.02666.

max-firstp-pass

Results | Participants | Input | Summary | Appendix

  • Run ID: max-firstp-pass
  • Participant: CFDA_CLIP
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 60663082002bf456bc607205ac861cca
  • Run description: We use TCT-ColBERT trained on MS MARCO v1 documents with FirstP and directly zero-shot transfer it to MS MARCO v2 documents. We first fuse the retrieval results of the MaxP and FirstP approaches (using the same model checkpoint), then further fuse with TCT-ColBERT trained on MS MARCO v2 passages. In the second fusion, we retrieve passages from the passage corpus and map them to document IDs using metadata.

maxp

Results | Participants | Input | Summary | Appendix

  • Run ID: maxp
  • Participant: CFDA_CLIP
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 40abde1f0dcb369a73ef13fe26013f6d
  • Run description: We use TCT-ColBERT trained on MS MARCO v1 documents with FirstP, directly zero-shot transfer it to MS MARCO v2 documents, and retrieve with MaxP.

maxp-firstp

Results | Participants | Input | Summary | Appendix

  • Run ID: maxp-firstp
  • Participant: CFDA_CLIP
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 0c80ac140f4db78da5d796020f928e35
  • Run description: We use TCT-ColBERT trained on MS MARCO v1 documents with FirstP and directly zero-shot transfer it to MS MARCO v2 documents. We fuse the retrieval results of the MaxP and FirstP approaches (using the same model checkpoint).

maxp_h3

Results | Participants | Input | Summary | Appendix

  • Run ID: maxp_h3
  • Participant: mpii
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: b85d742c47bb2b623492e565a195eb12
  • Run description: BERT-base MaxP reranking d_fusion10 from h2oloo (fine-tuned on MS MARCO v2)

mono_d3

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: mono_d3
  • Participant: mpii
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 066429503cee1e0e8b906717ac2d4283
  • Run description: BERT-base monoBERT reranking p_tct1 from h2oloo (fine-tuned on MS MARCO v2)

mono_electra_h3

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: mono_electra_h3
  • Participant: mpii
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 82df6aeb833161e486dd083f9a3e6687
  • Run description: ELECTRA-base monoBERT reranking p_fusion10 from h2oloo (fine-tuned on MS MARCO v2)

mono_h3

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: mono_h3
  • Participant: mpii
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: a23ffe48ff79f21bec036f1f0a728d89
  • Run description: BERT-base monoBERT reranking p_fusion10 from h2oloo (fine-tuned on MS MARCO v2)

NLE_D_quick

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NLE_D_quick
  • Participant: NLE
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 8edba032c31ef2f18fe57297d12ac3bb
  • Run description: This run performs only one pass to rank passages. We use a SPLADE model (https://arxiv.org/abs/2107.05720) trained on MS MARCO v1 without any query encoder (the query is encoded using just the BERT tokenizer) in order to make retrieval faster. Everything is performed on passages (same result as NLE_P_quick), and passage ids are then converted to document ids.

NLE_D_v1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NLE_D_v1
  • Participant: NLE
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 52e02d76bd8e9f889d4f4c1aa73b6c65
  • Run description: This run is divided into three steps: first stage on passages, re-ranking on passages, and passage-to-document conversion. Steps 1 and 2 are the same (even the same indexes and networks) as our passage run with almost the same name (P instead of D). First stage: we use a SPLADE model (https://arxiv.org/abs/2107.05720) trained on MS MARCO v1 using distillation following https://arxiv.org/abs/2010.02666; triplets for distillation come from the aforementioned paper. We retrieve the top-1k passages. Second stage: we use a mean-score ensemble of 7 re-rankers: 1 used off-the-shelf (cross-encoder/ms-marco-MiniLM-L-12-v2 from https://www.sbert.net/docs/pretrained-models/ce-msmarco.html), 2 trained on triplets extracted from the top-100 of SPLADE on the train queries, and 4 trained on triplets extracted from the top-1k of SPLADE on the train queries. Third stage: we convert passage ids to document ids. Note that mean-score ensembling is done after this third stage, so the mean scores are over documents rather than passages; a sketch of this step follows below.
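A minimal sketch of the mean-score ensembling step, applied after the passage-to-document conversion as noted above; it assumes the re-rankers' scores are directly comparable.

    # Sketch: average the scores several re-rankers assign to each document.
    from collections import defaultdict

    def mean_ensemble(runs):
        """runs: list of dicts mapping doc_id -> re-ranker score."""
        totals, counts = defaultdict(float), defaultdict(int)
        for run in runs:
            for did, score in run.items():
                totals[did] += score
                counts[did] += 1
        return {did: totals[did] / counts[did] for did in totals}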

NLE_D_V1andV2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NLE_D_V1andV2
  • Participant: NLE
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 4cc39d7183861576f4238125e441ea8a
  • Run description: This run is divided into three steps: first stage on passages, re-ranking on passages, and passage-to-document conversion. Steps 1 and 2 are the same (even the same indexes and networks) as our passage run with almost the same name (P instead of D). First stage: we use an ensemble of SPLADE models (https://arxiv.org/abs/2107.05720) trained under different settings (4 on MS MARCO v1 and 1 on MS MARCO v2). We retrieve the top-1k passages. Second stage: we use a mean-score ensemble of 10 re-rankers: 1 used off-the-shelf (cross-encoder/ms-marco-MiniLM-L-12-v2 from https://www.sbert.net/docs/pretrained-models/ce-msmarco.html), 2 trained on triplets extracted from the top-100 of SPLADE on the MS MARCO v1 train queries, 4 trained on triplets extracted from the top-1k of SPLADE on the MS MARCO v1 train queries, and 3 trained on triplets extracted from the top-1k of BM25 on the MS MARCO v2 train queries. Third stage: we convert passage ids to document ids. Note that mean-score ensembling is done after this third stage, so the mean scores are over documents rather than passages.

NLE_P_quick

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: NLE_P_quick
  • Participant: NLE
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 53024b59932a02d222b8143978950c98
  • Run description: This run performs only one pass to rank passages. We use a SPLADE model (https://arxiv.org/abs/2107.05720) trained on MS MARCO v1 without any query encoder (the query is encoded using just the BERT tokenizer) in order to make retrieval faster.

NLE_P_v1

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: NLE_P_v1
  • Participant: NLE
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/6/2021
  • Type: auto
  • Task: passages
  • MD5: aafc82164696e3eaa6e76f338dd9562b
  • Run description: This run is divided into two steps: first stage and re-ranking. First stage: we use a SPLADE model (https://arxiv.org/abs/2107.05720) trained on MS MARCO v1 using distillation following https://arxiv.org/abs/2010.02666; triplets for distillation come from the aforementioned paper. We retrieve the top-1k passages. Second stage: we use a mean-score ensemble of 7 re-rankers: 1 used off-the-shelf (cross-encoder/ms-marco-MiniLM-L-12-v2 from https://www.sbert.net/docs/pretrained-models/ce-msmarco.html), 2 trained on triplets extracted from the top-100 of SPLADE on the train queries, and 4 trained on triplets extracted from the top-1k of SPLADE on the train queries.

NLE_P_V1andV2

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: NLE_P_V1andV2
  • Participant: NLE
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: e9bbe02cc8b5340a5e10c0b6b8baf42c
  • Run description: This run is divided into two steps: first stage and re-ranking. First stage: we use an ensemble of SPLADE models (https://arxiv.org/abs/2107.05720) trained under different settings (4 on MS MARCO v1 and 1 on MS MARCO v2). We retrieve the top-1k passages. Second stage: we use a mean-score ensemble of 10 re-rankers: 1 used off-the-shelf (cross-encoder/ms-marco-MiniLM-L-12-v2 from https://www.sbert.net/docs/pretrained-models/ce-msmarco.html), 2 trained on triplets extracted from the top-100 of SPLADE on the MS MARCO v1 train queries, 4 trained on triplets extracted from the top-1k of SPLADE on the MS MARCO v1 train queries, and 3 trained on triplets extracted from the top-1k of BM25 on the MS MARCO v2 train queries.

p_bm25

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_bm25
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 5ef5b44895dc5e1eddd9660d9e361421
  • Run description: Anserini BM25, default parameters

p_bm25rm3

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_bm25rm3
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 2744e65480c2ec468c78c96e90055138
  • Run description: Anserini BM25 + RM3, default parameters

p_f10_mdt53b

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_f10_mdt53b
  • Participant: h2oloo
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 15942385d81560a69849f25c4d935bf3
  • Run description: Uses p_fusion10 as base run. Reranking using Mono-Duo-T5 3B (trained on TCT-ColBERT HN mined from V2 Passage Collection).

p_f10_mdt5base

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_f10_mdt5base
  • Participant: h2oloo
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: f0d88d8346c7c9f9bf085f39059c8e80
  • Run description: Uses p_fusion10 as base run. Reranking using Mono-Duo-T5 base (trained on TCT-ColBERT HN mined from V2 Passage Collection).

p_f10_mt53b

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_f10_mt53b
  • Participant: h2oloo
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: b008142132fbde3854671b7ffcd7bc4e
  • Run description: Uses p_fusion10 as base run. Reranking using Mono-T5 3B (trained on TCT-ColBERT HN mined from V2 Passage Collection).

p_fusion00

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_fusion00
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: e9fef47413d59a468cc9fc4c57a6c760
  • Run description: Hybrid of TCT-ColBERT HN+ dense retrieval (p_tct0) and uniCOIL (p_unicoil0).

p_fusion10

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_fusion10
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: c087263907409c8541bd2ee3946da13b
  • Run description: Hybrid of TCT-ColBERT HN+ dense retrieval (p_tct1) and uniCOIL (p_unicoil0).

p_tct0

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_tct0
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: b73d2317b0e51b2d29112acddb5de0ec
  • Run description: TCT-ColBERT HN+ dense retrieval (trained on MS MARCO v1, zero shot)

p_tct1

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_tct1
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 1bb5692f50e1cd1e285fc97e3c8896ee
  • Run description: TCT-ColBERT HN+ dense retrieval (trained on MS MARCO v2)

p_unicoil0

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: p_unicoil0
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: dc2d5404e86654f436c7260fd0dccb83
  • Run description: uniCOIL sparse retrieval (no expansion, trained on MS MARCO v1, zero shot)

parade_bm25

Results | Participants | Input | Summary | Appendix

  • Run ID: parade_bm25
  • Participant: mpii
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 4ef93e9136dccfa8a64985e40ad8684c
  • Run description: BERT-base PARADE reranking BM25 (fine-tuned on MS MARCO v2)

parade_h3

Results | Participants | Input | Summary | Appendix

  • Run ID: parade_h3
  • Participant: mpii
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 9d9ee392000d2125335c9e2150dd847f
  • Run description: BERT-base PARADE reranking d_fusion10 from h2oloo (fine-tuned on MS MARCO v2)

pash_doc_f1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pash_doc_f1
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 0ec318a9b78db35c5cb4fbeb3eaefae1
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_doc_f4

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pash_doc_f4
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: cdc3ef92ceb6d706ba8a8c49eb591c53
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_doc_f5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pash_doc_f5
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: c86b1faf7ec7f76e3aec43c49cb502e3
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_doc_r1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pash_doc_r1
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: f6483ca40214b750335dfca7a0502b4d
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_doc_r2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pash_doc_r2
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 9fcbdaec0e2a6c83ffa3afe8caedba4f
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_doc_r3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pash_doc_r3
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 8b5786369ddacc7ad1ce8c2c70258b43
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_f1

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pash_f1
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 001169a4810355fdcf653c451b388f1c
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_f2

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pash_f2
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 985701c361aeb3ea1b8a3fce4204f15e
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_f3

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pash_f3
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 86dd3f83ecfa70d573ab9e3e4cdc4578
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_r1

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pash_r1
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 4cb96d5e0f64928398bf19c7c437c571
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B.

pash_r2

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pash_r2
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 03a0cabdf85452c54c87a435fd6efd8a
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pash_r3

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pash_r3
  • Participant: PASH
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: e616cb840af05d48154660298b44d702
  • Run description: We adopt a multi-stage ranking framework that combines DeBERTa-2.6B and T5-3B. We use multi-way matching composed of n-grams and BM25 + docT5query (neural document expansion).

pass_full_1000

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pass_full_1000
  • Participant: ALIBABA
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: 724b8f93add87cd5b0b65bfe2af6acd3
  • Run description: passv2 full rank: ANCE + doc2query + prop_deepimpact.

pass_full_1000e

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pass_full_1000e
  • Participant: ALIBABA
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 2a8bd727ebe7b545a45500b41a7be335
  • Run description: ANCE + doc2query recall; prop_deepimpact.

pass_rank_100

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: pass_rank_100
  • Participant: ALIBABA
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 8a47463958dae7cc98af224f4741977a
  • Run description: prop_deepimpact

paug_bm25

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: paug_bm25
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 545cd65a2766d2c65a5e6e3fd5e7728b
  • Run description: Anserini BM25, default parameters, on augmented passage corpus

paug_bm25rm3

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: paug_bm25rm3
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 7d625a3680f82cc58d2188c4933f3c72
  • Run description: Anserini BM25 + RM3, default parameters, on augmented passage corpus

top1000

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: top1000
  • Participant: UAmsterdam
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: e56e4d13a14ccbf549d166df55797b03
  • Run description: BM25 top-1000, re-ranked using an interaction-based BERT; the top 100 are then kept.

TUW_DR_Base

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: TUW_DR_Base
  • Participant: TU_Vienna
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: f99c01f2cf43a827b341b41c30f5ce3e
  • Run description: This is a baseline dense retrieval model (based on DistilBERT) trained on the MS MARCO v1 training triples (using BM25 negative samples) with a simple RankNet loss, a batch size of 32, and binary relevance labels, without any knowledge distillation. For inference we use ONNX Runtime and BERT optimizations with fp16 (the resulting vectors are also fp16). A sketch of the RankNet loss follows below.
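A sketch of the RankNet pairwise loss named above, written in PyTorch; the model and batching around it are assumptions, only the loss itself follows the description.

    # RankNet loss for binary relevance: -log sigmoid(s_pos - s_neg),
    # computed stably via softplus(s_neg - s_pos).
    import torch.nn.functional as F

    def ranknet_loss(pos_scores, neg_scores):
        return F.softplus(neg_scores - pos_scores).mean()

    # usage: loss = ranknet_loss(model(q, pos_passage), model(q, neg_passage))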

TUW_IDCM_ALL

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TUW_IDCM_ALL
  • Participant: TU_Vienna
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: d6e070bff56d5e6bf007c185f1987cb6
  • Run description: This is our IDCM (intra-document cascade model) with all-passage selection (meaning DistilBERT scores all passages of the document) and a maximum document length of 2,000. The re-ranking is done on the given top-100 set (title and body of documents are concatenated and fed into the model).

TUW_IDCM_S4

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TUW_IDCM_S4
  • Participant: TU_Vienna
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 2d6e087979661b60ee501cd90d077443
  • Run description: This is our IDCM (intra-document cascade model) with 4-passage selection (meaning DistilBERT scores the top 4 passages chosen by our CK selection module) and a maximum document length of 2,000. The re-ranking is done on the given top-100 set (title and body of documents are concatenated and fed into the model).

TUW_TAS-B_768

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: TUW_TAS-B_768
  • Participant: TU_Vienna
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: e562417be5074cc1b7fdfa79451f4787
  • Run description: We use our publicly available checkpoint (https://huggingface.co/sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco) of our TAS-Balanced trained DistilBERT dense retrieval model in a brute-force search configuration. For inference we use ONNX Runtime and BERT optimizations with fp16 (the resulting vectors are also fp16). A sketch of the brute-force search follows below.
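A minimal sketch of the brute-force configuration: one dot-product matrix multiply over the full fp16 vector matrix, then a top-k cut. The shapes are illustrative, not the actual collection size.

    # Sketch: exhaustive dot-product search over fp16 vectors.
    import numpy as np

    corpus = np.random.rand(100_000, 768).astype(np.float16)  # passage vectors
    query = np.random.rand(768).astype(np.float16)

    scores = corpus @ query                       # dot-product similarities
    top = np.argpartition(-scores, 1000)[:1000]   # unordered top-1000
    ranked = top[np.argsort(-scores[top])]        # sorted passage indices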

TUW_TAS-B_ANN

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: TUW_TAS-B_ANN
  • Participant: TU_Vienna
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: 1d04515caec6c97da5e3048119fc425f
  • Run description: This TAS-Balanced trained model (based on DistilBERT) uses a compression layer at the end to produce 192-dimensional embeddings in fp16 (an 8x reduction relative to the default 768-dimensional fp32 output); we then indexed the vectors with HNSW (using 96 neighbors per vector). For inference we use ONNX Runtime and BERT optimizations with fp16 (the resulting vectors are also fp16). A sketch of the HNSW setup follows below.
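A hedged sketch of the HNSW setup described (192-dimensional vectors, 96 neighbors per node), using faiss as a stand-in for whatever ANN library was actually used; note that faiss's flat HNSW index stores vectors in fp32.

    # Sketch: HNSW index over 192-dim embeddings with inner-product scoring.
    import faiss
    import numpy as np

    dim, neighbors = 192, 96
    index = faiss.IndexHNSWFlat(dim, neighbors, faiss.METRIC_INNER_PRODUCT)
    vectors = np.random.rand(100_000, dim).astype(np.float32)  # faiss wants fp32
    index.add(vectors)
    scores, ids = index.search(np.random.rand(1, dim).astype(np.float32), 10)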

uogTrBaseDD

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrBaseDD
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: b0957438d05773971b2debe0c8579226
  • Run description: PyTerrier/Terrier DPH on the document corpus

uogTrBaseDDpmp

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrBaseDDpmp
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 6c4ae32d7d68b95df4599e6d65853cb7
  • Run description: PyTerrier/Terrier DPH on the passage corpus, followed by mapping to docnos and max passage.

uogTrBaseDDQ

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrBaseDDQ
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 6faca1f643556e7e4024e6bc971eb9cb
  • Run description: PyTerrier/Terrier DPH + Bo1 QE on the document corpus; a pipeline sketch follows below.
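A hedged sketch of such a DPH + Bo1 pipeline in PyTerrier; the index path is a placeholder.

    # Sketch: DPH retrieval, Bo1 query expansion, then DPH on the expanded query.
    import pyterrier as pt
    pt.init()

    index = pt.IndexFactory.of('./msmarco_v2_doc_index')  # placeholder path
    dph = pt.BatchRetrieve(index, wmodel='DPH')
    bo1 = pt.rewrite.Bo1QueryExpansion(index)
    pipeline = dph >> bo1 >> dph                          # DPH + Bo1 QE
    results = pipeline.search('how do magnetars form')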

uogTrBaseDDQC

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrBaseDDQC
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 00fff5b8297c31c0dfb6466966b45f79
  • Run description: PyTerrier/Terrier DPH + Bo1 QE, re-ranked with ColBERT, and max passage.

uogTrBaseDDQpmp

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrBaseDDQpmp
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 7e97ec9d58068e164508fc8eda2b9379
  • Run description: PyTerrier/Terrier DPH + Bo1 QE on the passage corpus, followed by mapping to docnos and max passage.

uogTrBasePD

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: uogTrBasePD
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 8406afcf86c1970df67cd8654b58fa6d
  • Run description: PyTerrier/Terrier DPH

uogTrBasePDQ

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: uogTrBasePDQ
  • Participant: BASELINES
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 984975934351867b58125309b8343cd2
  • Run description: PyTerrier/Terrier DPH + Bo1 QE

uogTrDCPpmp

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrDCPpmp
  • Participant: uogTr
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 1947a4f7ccdf0ab34d1d6b4c307d75a7
  • Run description: PyTerrier/ColBERT dense retrieval plus some ColBERT PRF on the passage corpus, then converted into a document ranking run using max passage.

uogTrDDQt5

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrDDQt5
  • Participant: uogTr
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: 72d3762b12cf5e456deb896aa2ab66e2
  • Run description: PyTerrier/Terrier DPH + Bo1 QE, re-ranked with monoT5.

uogTrDot5pmp

Results | Participants | Input | Summary | Appendix

  • Run ID: uogTrDot5pmp
  • Participant: uogTr
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: docs
  • MD5: d00a5728b521ff77afbadc8e220491f1
  • Run description: PyTerrier combination of sparse (Terrier DPH + Bo1 QE) and dense (ColBERT & ColBERT PRF) runs, re-ranked by monoT5 and converted to a document run with max passage.

uogTrPC

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: uogTrPC
  • Participant: uogTr
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 59f574f140161b1090579e594847b209
  • Run description: PyTerrier/ColBERT dense retrieval on the passage corpus

uogTrPCP

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: uogTrPCP
  • Participant: uogTr
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 9c4b68c6d5ac2fdd5f7bf88dc72ba691
  • Run description: PyTerrier/ColBERT dense retrieval with some ColBERT PRF re-ranking on the passage corpus.

uogTrPot5

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: uogTrPot5
  • Participant: uogTr
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/10/2021
  • Type: auto
  • Task: passages
  • MD5: 37162adf2ca4a28b0b5d54cf435c14cb
  • Run description: PyTerrier combination of sparse (Terrier DPH + Bo1 QE) and dense (ColBERT and ColBERT PRF) runs, re-ranked by monoT5

watdfd

Results | Participants | Input | Summary | Appendix

  • Run ID: watdfd
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: a7e95247ca5c89288d8f81121f681b9c
  • Run description: Google SERP as training for logistic regression, document-priority scoring (a sketch of the shared recipe follows below).
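
The twelve Waterloo_Cormack runs that follow vary only in whether document scores, passage scores, or a fusion of the two drive the ranking. A minimal sketch of the shared recipe, assuming texts surfaced by a Google SERP serve as positive examples and sampled corpus texts as negatives; the tf-idf features and the negative-sampling scheme are assumptions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def train_query_scorer(serp_texts, negative_texts):
    """Fit a per-query classifier: SERP results = positives, samples = negatives."""
    texts = serp_texts + negative_texts
    labels = [1] * len(serp_texts) + [0] * len(negative_texts)
    vectorizer = TfidfVectorizer(sublinear_tf=True)
    features = vectorizer.fit_transform(texts)
    classifier = LogisticRegression(max_iter=1000).fit(features, labels)
    return vectorizer, classifier

def score_candidates(vectorizer, classifier, candidate_texts):
    """Rank candidates by the probability of the positive (SERP-like) class."""
    return classifier.predict_proba(vectorizer.transform(candidate_texts))[:, 1]
```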

watdff

Results | Participants | Input | Summary | Appendix

  • Run ID: watdff
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 874740d4a33ba224a7a936d53c32b394
  • Run description: Google SERP as training for logistic regression, document/passage fusion scoring.

watdfp

Results | Participants | Input | Summary | Appendix

  • Run ID: watdfp
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 136d5b2d6209a2433a83089945a13417
  • Run description: Google SERP as training for logistic regression, document-priority scoring.

watdrd

Results | Participants | Input | Summary | Appendix

  • Run ID: watdrd
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: cfbaca66992049638b83cc41dec49574
  • Run description: Google SERP as training for logistic regression, document-only scoring.

watdrf

Results | Participants | Input | Summary | Appendix

  • Run ID: watdrf
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: 4ccfc61de8aadad22d70dd7c4ffcd227
  • Run description: Google SERP as training for logistic regression, document/passage fusion scoring.

watdrp

Results | Participants | Input | Summary | Appendix

  • Run ID: watdrp
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: docs
  • MD5: f1a7324717fc45c16d8f4f0d84cc4140
  • Run description: Google SERP as training for logistic regression, passage-priority scoring.

watpfd

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: watpfd
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: acf88f72aea2db19df469092699e1d1d
  • Run description: Google SERP as training for logistic regression, document-priority scoring.

watpff

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: watpff
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 2196763f588a5f8ebb38de8a37b4ffc3
  • Run description: Google SERP as training for logistic regression, document/passage fusion scoring.

watpfp

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: watpfp
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: def8908cececcc0632117575e303600b
  • Run description: Google SERP as training for logistic regression, passage-priority scoring.

watprd

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: watprd
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: ccb5fb4a10adf5959a5901a2ed9e6be8
  • Run description: Google SERP as training for logistic regression, document-priority scoring.

watprf

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: watprf
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: 424b0f3ede7165d42222a224369d9ded
  • Run description: Google SERP as training for logistic regression, document/passage fusion scoring.

watprp

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: watprp
  • Participant: Waterloo_Cormack
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/9/2021
  • Type: auto
  • Task: passages
  • MD5: f7e6f0c39e213ee1971accf65a289ac4
  • Run description: Google SERP as training for logistic regression, passage-only scoring.

webis-dl-1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: webis-dl-1
  • Participant: Webis
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: 75909131f332f7034e072ecf2e00575d
  • Run description: We calculate 50 traditional features and train a LambdaMART model on those 50 features using this year's MS MARCO training data. The features comprise 36 query-document features (9 similarities such as BM25 and TF-IDF on 4 types of text: title, URL, body, and anchor text extracted from a Common Crawl snapshot), 8 document features (PageRank, etc.), and 6 query features (number of entities in the query, etc.). Here we train a model with 5000 trees (a LambdaMART sketch follows below).
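
A minimal LambdaMART sketch using LightGBM's lambdarank objective; the feature files stand in for the 50 features described above, and every hyperparameter except the tree count is an illustrative assumption:

```python
import lightgbm as lgb
import numpy as np

# Hypothetical pre-computed feature matrix: one row per (query, document) pair.
X_train = np.load("features_train.npy")
y_train = np.load("labels_train.npy")      # relevance labels per pair
group = np.load("group_sizes_train.npy")   # number of documents per query

ranker = lgb.LGBMRanker(
    objective="lambdarank",
    n_estimators=5000,   # 5000 trees, as in webis-dl-1 and webis-dl-2
    learning_rate=0.05,
)
ranker.fit(X_train, y_train, group=group)

scores = ranker.predict(X_train)  # per-pair scores; sort within each query
```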

webis-dl-2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: webis-dl-2
  • Participant: Webis
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: b0b229b8dd92307009ca99a5597938f4
  • Run description: We calculate 41 traditional features and train a LambdaMART model on those 41 features using this year's MS MARCO training data. The features comprise 27 query-document features (9 similarities such as BM25 and TF-IDF on 3 types of text: title, URL, and body), 8 document features (PageRank, etc.), and 6 query features (number of entities in the query, etc.). Here we train a model with 5000 trees.

webis-dl-3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: webis-dl-3
  • Participant: Webis
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: docs
  • MD5: 8b47475818b4a49b664d5a1f44ce67bf
  • Run description: We calculate 50 traditional features and train a LambdaMART model on those 50 features using this year's MS MARCO training data. The features comprise 36 query-document features (9 similarities such as BM25 and TF-IDF on 4 types of text: title, URL, body, and anchor text extracted from a Common Crawl snapshot), 8 document features (PageRank, etc.), and 6 query features (number of entities in the query, etc.). Here we train a model with 1000 trees.

WLUPassage

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: WLUPassage
  • Participant: WLU
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/7/2021
  • Type: auto
  • Task: passages
  • MD5: cf3d9743dbe32cc720fda3e5a113644f
  • Run description: The query and passage were truncated to 10 and 50 words, respectively, and BERT (base model, uncased) embeddings were created for both. These were the inputs to a neural network with three sections. (1) Each input was passed through LSTM layers, convolutional and average-pooling layers, and densely connected layers; the two results were then multiplied together and passed through more of those layers. (2) The passage embedding was subtracted from the query embedding, and the resulting tensor was passed through LSTM layers, convolutional and average-pooling layers, and densely connected layers. (3) A new tensor was created by averaging all BERT embeddings in the passage; the query was split into individual words, the cosine similarity between the average tensor and each word tensor was computed and passed through densely connected layers, and the average and the maximum of the cosine similarities were added together. The output tensors of the three sections were summed into a single tensor and passed through further densely connected layers; the final output was a score between 0 and 1 (a condensed architecture sketch follows below).
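
A condensed Keras sketch of the branch structure both WLU runs describe, shown for the simpler two-section variant (WLUPassage1, next entry); layer sizes are illustrative, and pooling each sequence before the subtraction branch is a simplification to keep shapes compatible:

```python
from tensorflow.keras import Model, layers

def encode(x):
    # LSTM -> convolution + pooling -> dense, as in the run description.
    x = layers.LSTM(64, return_sequences=True)(x)
    x = layers.Conv1D(64, 3, activation="relu")(x)
    x = layers.GlobalMaxPooling1D()(x)
    return layers.Dense(64, activation="relu")(x)

q_in = layers.Input(shape=(10, 768))  # query: 10 tokens of BERT-base embeddings
p_in = layers.Input(shape=(50, 768))  # passage: 50 tokens

q_vec, p_vec = encode(q_in), encode(p_in)

# Section 1: elementwise product of the two encodings.
mult = layers.Dense(64, activation="relu")(layers.Multiply()([q_vec, p_vec]))
# Section 2: difference of the two encodings.
diff = layers.Dense(64, activation="relu")(layers.Subtract()([q_vec, p_vec]))

merged = layers.Add()([mult, diff])
score = layers.Dense(1, activation="sigmoid")(
    layers.Dense(32, activation="relu")(merged))

model = Model(inputs=[q_in, p_in], outputs=score)
model.compile(optimizer="adam", loss="binary_crossentropy")
```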

WLUPassage1

Results | Participants | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: WLUPassage1
  • Participant: WLU
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/7/2021
  • Type: auto
  • Task: passages
  • MD5: 370faf3233f3ac6e3f0d66ec37b65109
  • Run description: The query and passage were truncated to 10 and 50 words, respectively, and BERT (base model, uncased) embeddings were created for both. These were the inputs to a neural network with two sections. (1) Each input was passed through LSTM layers, convolutional and max-pooling layers, and densely connected layers; the two results were then multiplied together and passed through more of those layers. (2) The passage embedding was subtracted from the query embedding, and the resulting tensor was passed through LSTM layers, convolutional and max-pooling layers, and densely connected layers. The output tensors of the two sections were summed into a single tensor and passed through further densely connected layers; the final output was a score between 0 and 1.

yorku21_a

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: yorku21_a
  • Participant: yorku
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: 444c16ae80754346a1727e7370ff9243
  • Run description: First, we used the pre-trained model msmarco-MiniLM-L-6-v3 to compute sentence embeddings for each jsonl file of the passage ranking dataset. Second, we encoded each search query as a sentence embedding and used semantic search to compute its relevance to the sentence embeddings of the entire dataset; this retrieves the 100 most relevant passages from each jsonl file, rather than selecting 100 passages from the entire dataset. Finally, we used two pre-trained cross-encoder models, ms-marco-MiniLM-L-12-v2 and ms-marco-MiniLM-L-6-v2, to re-rank the passages retrieved in the second step. The three resulting rankings were combined by voting, and the top 100 most relevant passages for each query were selected as the final result (a retrieval and re-ranking sketch follows below).
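
A minimal sentence-transformers sketch of the described pipeline: bi-encoder semantic search for candidates, cross-encoder re-ranking with the two named MiniLM models, and a simple rank-based vote (the run description does not specify the voting scheme, so that part is an illustrative assumption):

```python
import numpy as np
from sentence_transformers import CrossEncoder, SentenceTransformer, util

bi_encoder = SentenceTransformer("msmarco-MiniLM-L-6-v3")
ce_large = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2")
ce_small = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

passages = ["..."]  # passages from one jsonl shard of the collection
corpus_emb = bi_encoder.encode(passages, convert_to_tensor=True)

query = "example query"
q_emb = bi_encoder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(q_emb, corpus_emb, top_k=100)[0]

pairs = [(query, passages[h["corpus_id"]]) for h in hits]
scores_12 = ce_large.predict(pairs)
scores_6 = ce_small.predict(pairs)

# Rank-based voting across the three rankings (illustrative assumption):
# lower summed rank = better.
def ranks(scores):
    return np.argsort(np.argsort(-np.asarray(scores)))

combined = ranks([h["score"] for h in hits]) + ranks(scores_12) + ranks(scores_6)
top_100 = np.argsort(combined)[:100]
```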

yorku21_b

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: yorku21_b
  • Participant: yorku
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: 44132faafb55dc01f9614155f1445647
  • Run description: First, we used the pre-trained model msmarco-MiniLM-L-6-v3 to compute sentence embeddings for each jsonl file of the passage ranking dataset. Second, we encoded each search query as a sentence embedding and used semantic search to compute its relevance to the sentence embeddings of the entire dataset; this retrieves the 100 most relevant passages from each jsonl file, rather than selecting 100 passages from the entire dataset.

yorku21_c

Results | Participants | Proceedings | Input | Summary (trec_eval) | Summary (passages-eval) | Appendix

  • Run ID: yorku21_c
  • Participant: yorku
  • Track: Deep Learning
  • Year: 2021
  • Submission: 8/8/2021
  • Type: auto
  • Task: passages
  • MD5: 4d459f1027098b55992e185aef81f311
  • Run description: First, we used the pre-trained model msmarco-MiniLM-L-6-v3 to compute sentence embeddings for each jsonl file of the passage ranking dataset. Second, we encoded each search query as a sentence embedding and used semantic search to compute its relevance to the sentence embeddings of the entire dataset; this retrieves the 100 most relevant passages from each jsonl file, rather than selecting 100 passages from the entire dataset. Finally, we used the pre-trained cross-encoder model ms-marco-MiniLM-L-6-v2 to re-rank the passages retrieved in the second step, and the top 100 most relevant passages for each query were selected as the final result.