Run description: This run uses a sentence encoder to encode the itemized inclusion and exclusion criteria. The scoring function weighs the inclusion and exclusion criteria equally.
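A minimal sketch of how such criterion-level scoring could look with a sentence-transformers encoder; the model name, the mean aggregation over criteria, and the subtraction of the exclusion term are illustrative assumptions, with the two weights set equal as in this run.

```python
# Illustrative sketch only: encoder choice, aggregation, and score form are assumptions,
# not the actual configuration of this run.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

def _mean_sim(topic_emb, criteria: list[str]) -> float:
    """Mean cosine similarity between the topic and a list of criterion sentences."""
    if not criteria:
        return 0.0
    crit_emb = encoder.encode(criteria, convert_to_tensor=True)
    return util.cos_sim(topic_emb, crit_emb).mean().item()

def score_trial(topic: str, inclusion: list[str], exclusion: list[str],
                w_incl: float = 0.5, w_excl: float = 0.5) -> float:
    """Reward similarity to inclusion criteria, penalize similarity to exclusion criteria."""
    topic_emb = encoder.encode(topic, convert_to_tensor=True)
    # Equal weights for inclusion and exclusion, as described for this run.
    return w_incl * _mean_sim(topic_emb, inclusion) - w_excl * _mean_sim(topic_emb, exclusion)
```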
Run description: A 2-stage cascade ranking pipeline. The first-stage retriever is a dense-sparse hybrid model; the second-stage reranker is a BERT cross-encoder that reranks the top 1000 docs from the first stage. The final document scores are a weighted combination of the retriever and reranker scores.
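The description does not give the fusion weights or normalization; the sketch below shows one plausible weighted combination of retriever and reranker scores, where alpha and the min-max normalization are assumptions.

```python
# Sketch of the final score fusion between the first-stage retriever and the
# cross-encoder reranker; alpha and min-max normalization are assumed, not the run's settings.
def fuse(retriever_scores: dict[str, float], reranker_scores: dict[str, float],
         alpha: float = 0.5) -> dict[str, float]:
    def minmax(scores):
        lo, hi = min(scores.values()), max(scores.values())
        return {d: (s - lo) / (hi - lo + 1e-9) for d, s in scores.items()}
    r, c = minmax(retriever_scores), minmax(reranker_scores)
    # Only the reranked (top-1000) docs have a cross-encoder score; others keep 0 for that term.
    return {d: alpha * r[d] + (1 - alpha) * c.get(d, 0.0) for d in r}
```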
Run description: Transformer models fine-tuned on clinical data were used to standardize the clinical trials and topics. Topics were enriched with synonyms generated using large language models (LLMs). To retrieve relevant results, they used ElasticSearch with a custom query, incorporating specialized analyzers in the ElasticSearch mappings to match the normalized data effectively. Finally, the lexically retrieved results were re-ranked by a neural model pretrained on TREC CT data from previous years.
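A rough sketch of issuing such a lexical ElasticSearch query from Python; the index name, field names, and boolean structure are hypothetical, and the specialized analyzers would be defined in the index mappings (not shown).

```python
# Hypothetical query shape only; the team's actual mappings, analyzers, and fields differ.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def search(topic_text: str, synonyms: list[str], k: int = 1000):
    query = {
        "bool": {
            "should": [
                {"match": {"inclusion_criteria": topic_text}},
                {"match": {"exclusion_criteria": topic_text}},
                {"match": {"condition": " ".join(synonyms)}},  # LLM-generated synonyms
            ]
        }
    }
    return es.search(index="clinical_trials", query=query, size=k)
```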
Run description: DFR retrieval model from PyTerrier, queries reformulated with GPT-3.5. Both documents and queries were expanded using the Kusa et al. 2023 approach for past, current and family medical history.
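A small PyTerrier sketch of DFR retrieval over the reformulated queries; the index path, the In_expB2 weighting model, and the example query are assumptions, not the run's actual configuration.

```python
# Illustrative PyTerrier DFR retrieval; index location and weighting model are assumed.
import pandas as pd
import pyterrier as pt

if not pt.started():
    pt.init()

index = pt.IndexFactory.of("./ct_index")          # hypothetical index location
dfr = pt.BatchRetrieve(index, wmodel="In_expB2")  # one of Terrier's DFR weighting models

# Queries would be the GPT-3.5-reformulated, expanded topic texts.
queries = pd.DataFrame([{"qid": "1", "query": "type 2 diabetes metformin history"}])
results = dfr.transform(queries)
```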
Run description: This run takes the DoSSIER_4 run and post-processes it with GPT-3.5 using question answering over the eligibility criteria section. Documents are filtered until the top 10 are included or the number of excluded documents reaches 50.
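A sketch of what such a QA-based filtering loop might look like; the prompt wording, the OpenAI client call, and the helper names are illustrative assumptions, while the stopping rule (10 included or 50 excluded) follows the description.

```python
# Illustrative eligibility-filtering loop; prompt and helpers are assumptions.
from openai import OpenAI

client = OpenAI()

def is_eligible(topic: str, criteria: str) -> bool:
    prompt = (f"Patient description:\n{topic}\n\nEligibility criteria:\n{criteria}\n\n"
              "Does the patient satisfy the criteria? Answer yes or no.")
    reply = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content.strip().lower().startswith("yes")

def filter_ranking(topic: str, ranked_trials: list[dict],
                   max_included: int = 10, max_excluded: int = 50) -> list[dict]:
    included, excluded = [], 0
    for trial in ranked_trials:
        if is_eligible(topic, trial["criteria"]):
            included.append(trial)
            if len(included) == max_included:
                break
        else:
            excluded += 1
            if excluded == max_excluded:
                break
    return included
```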
Run description: We developed a proprietary retrieval model implemented as a pipeline that starts with a cosine-similarity-based retrieval method and applies subsequent refinements using different types of language models.
Run description: A 3-stage cascade ranking pipeline. The first-stage retriever is a dense-sparse hybrid model; the second-stage reranker is a BERT cross-encoder that reranks the top 1000 docs from the first stage. The third stage prompts GPT-4 to rerank the top 20 docs from the second stage.
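The description does not show the GPT-4 prompt; below is one possible listwise reranking sketch, in which the prompt text, document fields, and output parsing are assumptions.

```python
# Illustrative third-stage listwise reranking with GPT-4; prompt and parsing are assumed.
from openai import OpenAI

client = OpenAI()

def gpt4_rerank(topic: str, top_docs: list[dict]) -> list[str]:
    listing = "\n".join(f"[{i}] {d['title']}: {d['criteria'][:500]}"
                        for i, d in enumerate(top_docs))
    prompt = (f"Patient description:\n{topic}\n\nCandidate trials:\n{listing}\n\n"
              "Rank the trials from most to least suitable for this patient. "
              "Return only the bracketed indices in order, comma-separated.")
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    order = [int(tok.strip(" []")) for tok in reply.choices[0].message.content.split(",")]
    return [top_docs[i]["docno"] for i in order]
```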
Run description: This run uses a sentence encoder to encode the itemized inclusion and exclusion criteria. The scoring function prefers high precision and performs strict checks for exclusion criteria.
Run description: Solr BM25 with GPT-3.5 query expansion, followed by prompt-based reranking with GPT-3.5 (fine-tuned on a subset of the 2021 TREC CT dataset). Rank fusion between the BM25 and reranker results.
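The description does not name the fusion method; the sketch below uses reciprocal rank fusion (RRF) purely as an example of combining the BM25 and reranker rankings.

```python
# Example fusion of two rankings via reciprocal rank fusion; RRF and k=60 are assumed choices.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, docno in enumerate(ranking, start=1):
            scores[docno] = scores.get(docno, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["NCT001", "NCT002", "NCT003"]      # docnos from the BM25 run, best first
reranked_ranking = ["NCT002", "NCT001", "NCT003"]  # docnos after GPT-3.5 reranking
fused = rrf([bm25_ranking, reranked_ranking])
```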
Run description: In this run, we used BM25 with RM3 and RRF (with synthetic queries generated by doc2query-t5-large-msmarco) to generate a list of the top 10,000 documents for each topic, and then used Llama-2-13b-chat-hf with 4-bit quantization to generate relevance judgements to re-rank those top 10,000 documents, adapting prior work on TrialGPT. The Llama-2 model was not fine-tuned due to hardware limitations, so we used the base settings and weights that meta-llama provides on Hugging Face.
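A sketch of loading Llama-2-13b-chat in 4-bit with Hugging Face transformers and asking for a yes/no relevance judgement; the prompt wording and generation settings are assumptions rather than the adapted TrialGPT prompts.

```python
# Illustrative 4-bit Llama-2 relevance judgement; prompt and decoding settings are assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-13b-chat-hf"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # base weights, no fine-tuning
    device_map="auto",
)

def judge(topic: str, trial_text: str) -> bool:
    prompt = (f"Patient: {topic}\nTrial: {trial_text}\n"
              "Is this trial relevant to the patient? Answer yes or no.\nAnswer:")
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    answer = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return answer.strip().lower().startswith("yes")
```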
Run description: This run uses a sentence encoder to encode the itemized inclusion and exclusion criteria. The scoring function weighs the exclusion criteria more than the inclusion criteria for computing relevance scores.