Proceedings - News 2021

IRCologne at TREC 2021 News Track Relation-based re-ranking for background linking

Björn Engelmann, Philipp Schaer

Abstract

This paper presents our approach to the background linking task of the TREC 2021 News Track. The task is to find a set of relevant articles in the Washington Post dataset that provide helpful background information for a given news article. Our approach uses a two-stage retrieval process. In the first stage, the 200 most relevant documents are retrieved from the entire corpus using BM25. In the second stage, these documents are re-ranked using similarity scores based on entities and relations extracted from the query document and the 200 retrieved documents. We submitted five runs for this task, each giving different weights to the entities and relations. Our best run achieved an nDCG@5 of 0.4423, showing that re-ranking with relations leads to a slight improvement over the baseline without re-ranking.
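The two-stage structure described above (BM25 retrieval, then a weighted blend with an entity/relation similarity) can be sketched as follows. This is an illustrative, self-contained sketch, not the authors' implementation: the scoring functions, weights, and the use of a simple overlap score for the entity/relation signal are all assumptions.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized doc against a tokenized query with standard BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def rerank(first_stage, entity_sim, w_bm25=0.5, w_ent=0.5):
    """Blend first-stage scores with an entity/relation similarity and
    return document indices from best to worst."""
    return sorted(range(len(first_stage)),
                  key=lambda i: w_bm25 * first_stage[i] + w_ent * entity_sim[i],
                  reverse=True)
```

Varying `w_bm25` and `w_ent` corresponds, loosely, to the paper's five runs with different entity/relation weights.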

Bibtex
@inproceedings{DBLP:conf/trec/0002S21,
    author = {Bj{\"{o}}rn Engelmann and Philipp Schaer},
    editor = {Ian Soboroff and Angela Ellis},
    title = {IRCologne at {TREC} 2021 News Track Relation-based re-ranking for background linking},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/IR-Cologne-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/0002S21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Elastic Embedded Background Linking for News Articles with Keywords, Entities and Events

Luis Adrián Cabrera-Diego, Emanuela Boros, Antoine Doucet

Abstract

In this paper, we present a collection of five flexible background linking models created for the News Track at TREC 2021 that generate ranked lists of articles to provide contextual information. The collection is based on sentence embedding indexes created with Sentence-BERT and Open Distro for ElasticSearch. For each model, we explore additional tools, from keyword extraction with YAKE to entity and event detection, as well as a linear combination of scores. The associated code is available online as open-source software.
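The core ranking step in an embedding-based pipeline like this is nearest-neighbor search by vector similarity. A minimal sketch of that step, assuming the sentence embeddings have already been produced (the paper computes them with Sentence-BERT and serves them through Open Distro for ElasticSearch; the plain cosine ranking below is only illustrative):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_by_embedding(query_vec, doc_vecs, top_k=5):
    """Return the indices of the top_k documents most similar to the query."""
    sims = [cosine(query_vec, d) for d in doc_vecs]
    order = sorted(range(len(doc_vecs)), key=sims.__getitem__, reverse=True)
    return order[:top_k]
```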

Bibtex
@inproceedings{DBLP:conf/trec/Cabrera-DiegoBD21,
    author = {Luis Adri{\'{a}}n Cabrera{-}Diego and Emanuela Boros and Antoine Doucet},
    editor = {Ian Soboroff and Angela Ellis},
    title = {Elastic Embedded Background Linking for News Articles with Keywords, Entities and Events},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/L3i\_Rochelle-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/Cabrera-DiegoBD21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

bigIR at TREC 2021: Adopting Transfer Learning for News Background Linking

Marwa Essam, Tamer Elsayed

Abstract

In this paper, we present the participation of the bigIR team at Qatar University in the TREC 2021 News Track. We participated in the background linking task, which aims to retrieve news articles that provide context and background knowledge to the reader of a given query article. We submitted five runs for this task. In the first two, we adopted an ad-hoc retrieval approach, where the query articles were analyzed to generate search queries that were issued against the news article collection to retrieve the required links. In the remaining runs, we adopted a transfer learning approach to rerank the retrieved articles according to their usefulness in addressing specific subtopics related to the query articles. These subtopics were introduced by the track organizers as a new challenge this year. The results show that one of our runs outperformed the TREC median submission, while the others achieved comparable results.
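The ad-hoc runs turn the query article itself into a search query. One common way to do this is to keep the article's most frequent content terms; the toy sketch below illustrates that idea, but the stopword list, term count, and tokenization are assumptions, not the authors' actual query-generation method.

```python
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "that", "for", "on"}

def build_query(article_text, num_terms=5):
    """Form a search query from the most frequent non-stopword terms."""
    tokens = [t.strip(".,;:!?").lower() for t in article_text.split()]
    counts = Counter(t for t in tokens if t and t not in STOPWORDS)
    return [term for term, _ in counts.most_common(num_terms)]
```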

Bibtex
@inproceedings{DBLP:conf/trec/EssamE21,
    author = {Marwa Essam and Tamer Elsayed},
    editor = {Ian Soboroff and Angela Ellis},
    title = {bigIR at {TREC} 2021: Adopting Transfer Learning for News Background Linking},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/QU-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/EssamE21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

SU-NLP at TREC NEWS 2021

Kenan Fayoumi, Reyyan Yeniterzi

Abstract

This paper presents our work and submissions for the TREC 2021 News Track Wikification task. We initially approach the problem as an entity linking task; after identifying the mentions and their corresponding Wikipedia entities, we rank the mentions within the news article by their usefulness. For the entity linking part, transformer-based architectures are explored for detecting the mentions, generating the possible candidates, and re-ranking them. Finally, for the mention ranking, we use the previous years' best-performing approach, which uses the position of the mention within the text.
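The final ranking step reuses a position heuristic: mentions appearing earlier in the article are ranked as more useful. A minimal sketch of that heuristic, assuming rank order is simply first-occurrence position (the exact scoring in the referenced approach may differ):

```python
def rank_mentions_by_position(text, mentions):
    """Rank linked mentions by their first occurrence in the article text;
    earlier mentions are assumed to be more useful to the reader."""
    positions = []
    for m in mentions:
        idx = text.find(m)
        positions.append((idx if idx >= 0 else len(text), m))  # unseen mentions last
    return [m for _, m in sorted(positions)]
```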

Bibtex
@inproceedings{DBLP:conf/trec/FayoumiY21,
    author = {Kenan Fayoumi and Reyyan Yeniterzi},
    editor = {Ian Soboroff and Angela Ellis},
    title = {{SU-NLP} at {TREC} {NEWS} 2021},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/SU-NLP-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/FayoumiY21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Middlebury at TREC News '21 Exploring Learning to Rank Model Variants

Culton Koster, John Foley

Abstract

Middlebury College participated in the TREC News Background Linking task in 2021. We constructed a linear learning to rank model trained on the 2018-2020 data and submitted runs that included variants on the standard low resource learning-to-rank models. In this notebook paper we detail the contents of our submissions and our lessons learned from this year's participation. We explored a few variant models including a random forest ranker, linear models trained on that random forest, and two-stage linear models, but found that traditional, direct ranking still appears to be optimal.
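A linear learning-to-rank model of the kind described scores each candidate document by a weighted sum of its features. The sketch below, including a perceptron-style pairwise weight update, is a generic illustration of that model family under assumed features and learning rate, not the authors' trained model.

```python
def linear_rank(feature_rows, weights):
    """Score each candidate by a weighted sum of its features and
    return candidate indices from best to worst."""
    scores = [sum(w * f for w, f in zip(weights, row)) for row in feature_rows]
    return sorted(range(len(feature_rows)), key=scores.__getitem__, reverse=True)

def pairwise_update(weights, better, worse, lr=0.1):
    """One perceptron-style update: if `better` does not outscore `worse`,
    nudge the weights in the direction of their feature difference."""
    sb = sum(w * f for w, f in zip(weights, better))
    sw = sum(w * f for w, f in zip(weights, worse))
    if sb <= sw:
        return [w + lr * (b - x) for w, b, x in zip(weights, better, worse)]
    return weights
```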

Bibtex
@inproceedings{DBLP:conf/trec/KosterF21,
    author = {Culton Koster and John Foley},
    editor = {Ian Soboroff and Angela Ellis},
    title = {Middlebury at {TREC} News '21 Exploring Learning to Rank Model Variants},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/middlebury-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/KosterF21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Semantic Search for Background Linking in News Articles

Udhav Sethi, Anup Anand Deshmukh

Abstract

The task of background linking aims at recommending news articles to a reader that are most relevant for providing context and background for the query article. For this task, we propose a two-stage approach, IR-BERT, which combines the retrieval power of BM25 with the contextual understanding gained through a BERT-based model. We further propose the use of a diversity measure to evaluate the effectiveness of background linking approaches in retrieving a diverse set of documents. We provide a comparison of IR-BERT with other participating approaches at TREC 2021. We have open-sourced our implementation on GitHub.

Bibtex
@inproceedings{DBLP:conf/trec/SethiD21,
    author = {Udhav Sethi and Anup Anand Deshmukh},
    editor = {Ian Soboroff and Angela Ellis},
    title = {Semantic Search for Background Linking in News Articles},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/Waterloo\_Cormack-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/SethiD21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

University of Hagen @ TREC2021 News Track

Stefan Wagenpfeil, Matthias L. Hemmje, Paul Mc Kevitt

Abstract

This paper discusses the University of Hagen's approach for the TREC2021 News Track. The News Track aims at providing relevant background links to documents in the Washington Post article archive. Our submitted run is based on research and development in the field of multimedia information retrieval and employs a modified TF-IDF (Term Frequency-Inverse Document Frequency) algorithm for topic modeling, together with matrix-based indexing operations founded on Graph Codes for the calculation of similarity, relevance, and recommendations. This run was submitted as FUH (Fernuniversität Hagen) and obtained an nDCG@5 of 0.2655.
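For reference, the standard TF-IDF weighting underlying the topic-modeling step can be sketched as below. This is the textbook formulation only; the paper's modified variant and its Graph Code indexing are not reproduced here.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute a TF-IDF weight map for each tokenized document:
    weight(t, d) = (count(t in d) / len(d)) * log(N / df(t))."""
    N = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({t: (c / len(d)) * math.log(N / df[t]) for t, c in tf.items()})
    return vecs
```

Terms that occur in every document get weight zero, so ranking by TF-IDF overlap emphasizes the distinctive vocabulary of each article.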

Bibtex
@inproceedings{DBLP:conf/trec/WagenpfeilHK21,
    author = {Stefan Wagenpfeil and Matthias L. Hemmje and Paul Mc Kevitt},
    editor = {Ian Soboroff and Angela Ellis},
    title = {University of Hagen @ {TREC2021} News Track},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/FUH-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/WagenpfeilHK21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

TKB48 at TREC 2021 News Track

Lirong Zhang, Hideo Joho, Sumio Fujita

Abstract

TKB48 incorporated document expansion methods such as docT5query and keyword extraction into indexing to address the background linking problem. Using a transformer-based model, we calculated the text similarity of queries and documents at a semantic level and combined the semantic similarity with the BM25 score to re-rank background articles. We examined different combinations of re-ranking factors, such as semantic similarities between expanded documents and attributes of topics. We found that adding index fields produced by the docT5query model and the keyword extraction model was beneficial. At the same time, re-ranking performance was influenced by the number of semantic similarity factors and their weight in the total relevance score. To assess the effectiveness of document expansion and of our method using temporal recency, we further generated several unofficial runs incorporating a temporal topic classifier and a learning-to-rank method. However, the lack of temporal topics limits the performance of the model. Our proposed algorithm outperformed the learning-to-rank method. Our future work will focus on fine-tuning the docT5query model.

Bibtex
@inproceedings{DBLP:conf/trec/ZhangJF21,
    author = {Lirong Zhang and Hideo Joho and Sumio Fujita},
    editor = {Ian Soboroff and Angela Ellis},
    title = {{TKB48} at {TREC} 2021 News Track},
    booktitle = {Proceedings of the Thirtieth Text REtrieval Conference, {TREC} 2021, online, November 15-19, 2021},
    series = {{NIST} Special Publication},
    volume = {500-335},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2021},
    url = {https://trec.nist.gov/pubs/trec30/papers/TKB48-N.pdf},
    timestamp = {Mon, 28 Aug 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/ZhangJF21.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}