Text REtrieval Conference (TREC) 2020

News

Overview | Proceedings | Data | Runs | Participants

The News track focuses on information retrieval in the service of helping people read the news. In 2018, in cooperation with the Washington Post, we released a new collection of nearly 600,000 news articles, and crafted two tasks related to how news is presented on the web: background linking and entity ranking. For 2020, we added more documents to the collection and retired the entity ranking task in favor of a new wikification task.

Track coordinator(s):

  • Ian Soboroff, National Institute of Standards and Technology (NIST)
  • Shudong Huang, National Institute of Standards and Technology (NIST)
  • Donna Harman, National Institute of Standards and Technology (NIST)

Track Web Page: http://trec-news.org/


Deep Learning

Overview | Proceedings | Data | Results | Runs | Participants

This is the second year of the TREC Deep Learning Track, whose goal is to study ad hoc ranking in the large-training-data regime. We again have a document retrieval task and a passage retrieval task, each with hundreds of thousands of human-labeled training queries. We evaluate using single-shot TREC-style evaluation, with much more comprehensive relevance labeling on the small number of test queries, to get a picture of which ranking methods work best when large training data is available. This year we have further evidence that rankers with BERT-style pretraining outperform other rankers in the large-data regime.
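
For intuition, the sketch below scores a query against candidate passages with a BERT-style cross-encoder re-ranker, the family of models the track results refer to. It is not an official track baseline; it assumes the sentence-transformers library is installed and that a publicly released MS MARCO cross-encoder checkpoint is available.

    # Minimal sketch of BERT-style passage re-ranking (illustrative only,
    # not an official track baseline). Assumes: pip install sentence-transformers,
    # and the public "cross-encoder/ms-marco-MiniLM-L-6-v2" checkpoint.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

    query = "how do BERT-style rankers use large training data"
    passages = [
        "Pretrained transformers fine-tuned on MS MARCO labels rank passages well.",
        "Classical BM25 scores terms by frequency and document length.",
    ]

    # The cross-encoder reads each (query, passage) pair jointly and outputs a
    # relevance score; sorting by score gives the re-ranked list.
    scores = model.predict([(query, p) for p in passages])
    for score, passage in sorted(zip(scores, passages), reverse=True):
        print(f"{score:.3f}  {passage}")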

Track coordinator(s):

  • Nick Craswell, Microsoft AI & Research
  • Bhaskar Mitra, Microsoft AI & Research and University College London
  • Emine Yilmaz, University College London
  • Daniel Campos, University of Illinois Urbana-Champaign

Track Web Page: https://microsoft.github.io/msmarco/TREC-Deep-Learning


Incident Streams

Overview | Proceedings | Data | Runs | Participants

Between 2018 and 2019, the Incident Streams track (TREC-IS) developed standard approaches for classifying the types and criticality of information shared in online social spaces during crises, but the emergence of SARS-CoV-2 has shifted the landscape of online crises substantially. Prior editions of TREC-IS lacked data on large-scale public-health emergencies because such events are exceedingly rare; COVID-19 has instead produced an over-abundance of potential data, and significant open questions remain about how existing approaches to crisis informatics, and datasets built on other emergencies, adapt to this new context.

Track coordinator(s):

  • Cody Buntain, New Jersey Institute of Technology
  • Richard McCreadie, University of Glasgow
  • Ian Soboroff, National Institute of Standards and Technology (NIST)

Track Web Page: https://www.dcs.gla.ac.uk/~richardm/TREC_IS/


Health Misinformation

Overview | Proceedings | Data | Results | Runs | Participants

TREC 2020 was the second year for the Health Misinformation track, which was named the Decision Track in 2019. Information retrieval over document collections that contain misinformation is problematic: when a search engine returns documents that contain misinformation, users may have difficulty discerning correct from incorrect information, and that incorrect information can lead to poor decisions. Decisions regarding health-related topics can be consequential, so we want search engines that enable users to make correct decisions. The track is designed to address the problem of health misinformation in three areas: 1) ad hoc retrieval, 2) the total recall of misinformation in the collection, and 3) the evaluation of retrieval in the presence of misinformation. The 2020 Health Misinformation track offered participants both a recall task and an ad hoc task.

Track coordinator(s):

  • Charles L. A. Clarke, University of Waterloo
  • Saira Rizvi, University of Waterloo
  • Mark D. Smucker, University of Waterloo
  • Maria Maistro, University of Copenhagen
  • Guido Zuccon, University of Queensland

Track Web Page: https://trec-health-misinfo.github.io/


Conversational Assistance

Overview | Proceedings | Data | Results | Runs | Participants

CAsT 2020 is the second year of the Conversational Assistance Track and builds on the lessons of the first year. Teams tried a wide range of techniques to address conversational search challenges. Some relied on proven techniques such as query difficulty prediction and query expansion. Given the text-understanding challenges in the task, teams also used traditional NLP models that incorporate coreference resolution. One important development was the application of generative query models and ranking models built on pre-trained neural language models. The results showed that traditional and neural techniques are complementary in effectiveness.
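
As a rough illustration of what a generative query model does in this setting, the sketch below rewrites a context-dependent follow-up question into a self-contained query. It is not a track system; it assumes the Hugging Face transformers library and a publicly shared conversational rewriting checkpoint (castorini/t5-base-canard), and the turn-separator format is that checkpoint's convention.

    # Sketch of conversational query rewriting with a seq2seq model
    # (illustrative only; the checkpoint name and "|||" separator are
    # assumptions about a publicly shared rewriter, not a track system).
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = "castorini/t5-base-canard"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    # Earlier turns supply the context needed to resolve "it".
    history = "What is throat cancer? ||| Is it treatable?"
    inputs = tokenizer(history, return_tensors="pt")
    output = model.generate(**inputs, max_length=64)

    # The decoded output should be a self-contained rewrite,
    # e.g. something like "Is throat cancer treatable?"
    print(tokenizer.decode(output[0], skip_special_tokens=True))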

Track coordinator(s):

  • Jeffrey Dalton, University of Glasgow
  • Chenyan Xiong, Microsoft Research
  • Jamie Callan, Carnegie Mellon University

Track Web Page: https://www.treccast.ai/


Precision Medicine

Overview | Proceedings | Data | Results | Runs | Participants

The precision medicine paradigm focuses on identifying treatments that are best suited to an individual patient’s unique attributes. The reasoning behind this paradigm is that diseases do not uniformly manifest in people and thus “one size fits all” treatments are often not appropriate. For many diseases, such as cancer, proper selection of a treatment strategy can drastically improve results compared to the standard, frontline treatment. Generally speaking, the issues that are taken into consideration for precision medicine are the genomic, environmental, and lifestyle contexts of the patient.

Track coordinator(s):

  • Kirk Roberts, The University of Texas Health Science Center
  • Dina Demner-Fushman, U.S. National Library of Medicine
  • Ellen M. Voorhees, National Institute of Standards and Technology (NIST)
  • Steven Bedrick, Oregon Health & Science University
  • William R. Hersh, Oregon Health & Science University

Track Web Page: https://www.trec-cds.org/


Podcast

Overview | Proceedings | Data | Results | Runs | Participants

The Podcast Track is new at the Text Retrieval Conference (TREC) in 2020 and was designed to encourage research on podcasts in the information retrieval and NLP research communities. The track consisted of two shared tasks, segment retrieval and summarization, both based on a dataset of over 100,000 podcast episodes (metadata, audio, and automatic transcripts) released concurrently with the track. The track generated considerable interest and attracted hundreds of new registrations to TREC; fifteen teams, mostly disjoint between search and summarization, made final submissions for assessment. Deep learning was the dominant experimental approach for both search and summarization.

Track coordinator(s):

  • Rosie Jones, Spotify Research
  • Ben Carterette, Spotify Research
  • Ann Clifton, Spotify Research
  • Jussi Karlgren, Spotify Research
  • Aasish Pappu, Spotify Research
  • Sravana Reddy, Spotify Research
  • Yongze Yu, Spotify Research
  • Maria Eskevich, CLARIN ERIC
  • Gareth J. F. Jones, Dublin City University

Track Web Page: https://trecpodcasts.github.io/


Fair Ranking

Overview | Proceedings | Data | Runs | Participants

For 2020, we again adopted an academic search task, where we have a corpus of academic article abstracts and queries submitted to a production academic search engine. The central goal of the Fair Ranking track is to provide fair exposure to different groups of authors (a group fairness framing). We recognize that there may be multiple group definitions (e.g., based on demographics, stature, or topic) and hoped that systems would be robust to them. We expected participants to develop systems that optimize for fairness and relevance for arbitrary group definitions, and we did not reveal the exact group definitions until after the evaluation runs were submitted. The track contains two tasks, reranking and retrieval, with a shared evaluation.
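
For intuition only, the sketch below computes how much exposure each author group receives in a single ranking under a logarithmic position discount. This is a simplified illustration of the group-exposure idea, not the track's official evaluation measure, and the group labels are hypothetical.

    # Simplified illustration of group exposure in one ranked list
    # (not the track's official metric; groups and discount are assumptions).
    import math
    from collections import defaultdict

    # Hypothetical ranking: each position carries the group of its document's authors.
    ranking = ["g1", "g2", "g1", "g1", "g2"]

    exposure = defaultdict(float)
    total = 0.0
    for rank, group in enumerate(ranking, start=1):
        # Higher positions get more attention; discount by log2(rank + 1).
        weight = 1.0 / math.log2(rank + 1)
        exposure[group] += weight
        total += weight

    for group, value in sorted(exposure.items()):
        print(f"{group}: {value / total:.2f} share of exposure")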

Track coordinator(s):

  • Asia J. Biega, Microsoft Research Montreal
  • Fernando Diaz, Montreal Institute for Learning Algorithms
  • Michael D. Ekstrand, Boise State University
  • Sergey Feldman, Allen Institute for Artificial Intelligence
  • Sebastian Kohlmeier, Allen Institute for Artificial Intelligence

Track Web Page: https://fair-trec.github.io/