Runs - News 2018

anserini_1000w

Run ID: anserini_1000w
Participant: Anserini
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: 355780282813c7c9d7c8e262d185993b
Run description: The query is constructed from all terms in the query document, weighted by their TF-IDF scores. Post-processing includes removal of duplicate documents.
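
The TF-IDF weighting described for this run can be illustrated with a minimal sketch. It is not the participant's code: the function names, the pre-tokenized input, the document-frequency table, and the smoothed IDF formula are all assumptions made for illustration. The output uses Lucene-style `term^boost` syntax.

```python
import math
from collections import Counter


def tfidf_query(doc_terms, doc_freq, num_docs):
    """Return {term: weight} for every term of the query document.

    doc_terms: tokens of the query document (hypothetical pre-tokenized input).
    doc_freq:  {term: number of collection documents containing the term}.
    num_docs:  number of documents in the collection.
    """
    tf = Counter(doc_terms)
    weights = {}
    for term, count in tf.items():
        # Smoothed IDF; the exact formula used by the run is not stated.
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        weights[term] = count * idf
    return weights


def to_boosted_query(weights):
    """Render the weights as `term^boost` pairs, highest weight first."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(f"{term}^{weight:.3f}" for term, weight in ranked)
```

Terms that occur in almost every document receive a weight near zero, so they contribute little to the query even though every term is kept.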

anserini_axp

Run ID: anserini_axp
Participant: Anserini
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: ceca9f5360f8e77047571cb411ee310b
Run description: The list of queries is constructed by using the first 1000 (at most) terms from the 5 longest paragraphs in the query document. BM25 is used for the first-round retrieval, and Axiomatic Reranking (at most 1000 expansion terms) is used for reranking. The final result is generated by picking results from the paragraph queries in a round-robin fashion. Post-processing includes removal of duplicate documents.
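
The round-robin merge with duplicate removal can be sketched as follows. This is an illustrative reading of the description, not the Anserini implementation: the function name, the `limit` cutoff of 100, and the choice to skip a duplicate without giving that list an extra pick are assumptions.

```python
def round_robin_merge(ranked_lists, limit=100):
    """Interleave ranked lists, taking one document from each list per round.

    ranked_lists: ranked document-ID lists, one per paragraph query.
    limit: maximum size of the merged list (an assumed cutoff).
    """
    merged, seen = [], set()
    depth = max((len(results) for results in ranked_lists), default=0)
    for rank in range(depth):
        for results in ranked_lists:
            # Skip lists that are exhausted and documents already emitted.
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
                if len(merged) == limit:
                    return merged
    return merged
```

For example, merging `[["d1", "d2", "d3"], ["d2", "d4"], ["d5"]]` yields `["d1", "d2", "d5", "d4", "d3"]`.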

anserini_nax

Run ID: anserini_nax
Participant: Anserini
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: 87b938649552aa86d2bb4391dff897fc
Run description: The query is constructed by using the first 1000 (at most) terms of the query document. BM25 is used for the first-round retrieval, and Axiomatic Reranking (at most 1000 expansion terms) is used for reranking. Post-processing includes removal of duplicate documents.

anserini_nsdm

Run ID: anserini_nsdm
Participant: Anserini
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: 2fd7ca575b0081404306adb3c1ae61b4
Run description: The query is constructed by using the first 1000 (at most) terms of the query document, and the query model is the Sequential Dependency Model. Post-processing includes removal of duplicate documents.

anserini_sdmp

Run ID: anserini_sdmp
Participant: Anserini
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: 5cc28cafd3078f3eb57597e27ffcf2cd
Run description: The list of queries is constructed by using the first 1000 (at most) terms from the 5 longest paragraphs in the query document. The query model used is the Sequential Dependency Model. The final result is generated by picking results from the paragraph queries in a round-robin fashion. Post-processing includes removal of duplicate documents.

htwsaar1

Run ID: htwsaar1
Participant: htwsaar
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: 07204fcdd3cf49ab26d4457339df6dae
Run description: The approach tested here relies on simple keyword and keyphrase extraction strategies to derive a query from the input document. The document collection was indexed using ElasticSearch, relying on its default retrieval model.

htwsaar2

Run ID: htwsaar2
Participant: htwsaar
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: bbf3bd683e42f61e7ec4cc61ae7d475a
Run description: The approach tested here relies on the entity recognition module of Stanford CoreNLP. Annotated entities as well as identified keyphrases are used to build a query. The document collection was indexed using ElasticSearch, relying on its default retrieval model.

htwsaar3

Run ID: htwsaar3
Participant: htwsaar
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: a94c75897fc81521bc5332303fb022ce
Run description: The approach tested here relies on keyphrase extraction informed by sentiment tagging and named entity recognition from Stanford CoreNLP. The document collection was indexed using Lucene, relying on its default retrieval model.

htwsaar4

Run ID: htwsaar4
Participant: htwsaar
Track: News
Year: 2018
Submission: 8/16/2018
Type: auto
Task: background
MD5: f8091dafaf3b4e586f6642a2922583a1
Run description: The approach tested here uses the entire text of the input document as a query and filters result documents based on publication dates. This served as our simple baseline. Documents are indexed in Lucene, relying on its default retrieval model.

signal-ucl-eff

Run ID: signal-ucl-eff
Participant: signal
Track: News
Year: 2018
Submission: 8/21/2018
Type: auto
Task: entity
MD5: 5a47bfdfc188400646b4aaa1ddb9761a
Run description: We have developed a supervised model to rank entities in a document using a random forest regressor trained with saliency features and a dataset proposed in [1]. In particular, these features include attributes of the entity and its mention in the document, in addition to features modelling the entity's relationship with other entities in the document. Some of the entity attributes and entity relation features are derived from the Wikipedia knowledge graph. On top of this, we perform feature ranking to significantly speed up feature computation at prediction time, which would make our approach operational in a production system. [1] Trani, Salvatore, et al. "SEL: A unified algorithm for salient entity linking." Computational Intelligence 34.1 (2018): 2-29.

signal-ucl-sel

Run ID: signal-ucl-sel
Participant: signal
Track: News
Year: 2018
Submission: 8/21/2018
Type: auto
Task: entity
MD5: 60fb2cf14916b2e3b7afe92e7e59d9ef
Run description: We have developed a supervised model to rank entities in a document using a random forest regressor trained with saliency features and a dataset proposed in [1]. In particular, these features include attributes of the entity and its mention in the document, in addition to features modelling the entity's relationship with other entities in the document. Some of the entity attributes and entity relation features are derived from the Wikipedia knowledge graph. [1] Trani, Salvatore, et al. "SEL: A unified algorithm for salient entity linking." Computational Intelligence 34.1 (2018): 2-29.

signal-ucl-slst

Run ID: signal-ucl-slst
Participant: signal
Track: News
Year: 2018
Submission: 8/21/2018
Type: auto
Task: entity
MD5: 93cd6e0ac0691eea07c43c9e3001e3fd
Run description: We have developed a supervised model to rank entities in a document using a random forest regressor trained with saliency features and a dataset proposed in [1]. In particular, these features include attributes of the entity and its mention in the document, in addition to features modelling the entity's relationship with other entities in the document. Some of the entity attributes and entity relation features are derived from the Wikipedia knowledge graph. In addition to those features, we have devised a new set of features that model the sentiment around the mentions of an entity, as they can provide an additional signal on the saliency of the entity. [1] Trani, Salvatore, et al. "SEL: A unified algorithm for salient entity linking." Computational Intelligence 34.1 (2018): 2-29.

SINAI_base_A

Run ID: SINAI_base_A
Participant: SINAI
Track: News
Year: 2018
Submission: 8/28/2018
Type: auto
Task: background
MD5: 08a1dc6b33ae6f5464fe2a88afc16812
Run description: Base case with abstract. We create a query with the abstract of the article. Next we launch the query in a Lemur IR system.

SINAI_base_T

Run ID: SINAI_base_T
Participant: SINAI
Track: News
Year: 2018
Submission: 8/28/2018
Type: auto
Task: background
MD5: ef43ee5c36b89f2a7ae223b9af5b2040
Run description: Base case with title. We create a query with the title of the article. Next we launch the query in a Lemur IR system.

SINAI_base_TA

Run ID: SINAI_base_TA
Participant: SINAI
Track: News
Year: 2018
Submission: 8/28/2018
Type: auto
Task: background
MD5: ae8ddaee34a16fcb9f431bbf73e61a8b
Run description: Base case with title and abstract. We create a query with the title and abstract of the article. Next we launch the query in a Lemur IR system.

SINAI_cluster_A

Run ID: SINAI_cluster_A
Participant: SINAI
Track: News
Year: 2018
Submission: 8/28/2018
Type: auto
Task: background
MD5: 8d257e1139cab817af3db70461192285
Run description: Title query and clustering. We create a query with the title of the article. Next we launch the query in a Lemur IR system. The obtained results are classified into 10 clusters. Finally, we select the 100 best results of each cluster (or fewer, if the cluster has fewer than 100 results) to generate the final results file.
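
The per-cluster selection step can be sketched as below. The clustering itself is outside the sketch, and the description does not say how the selected results are ordered in the final file, so sorting them by retrieval score is an assumption, as are the function name and the `(doc_id, score, cluster_id)` input shape.

```python
from collections import defaultdict


def top_per_cluster(hits, per_cluster=100):
    """Keep the best `per_cluster` results of every cluster.

    hits: iterable of (doc_id, score, cluster_id) tuples (hypothetical shape).
    Returns (doc_id, score) pairs ordered by score, highest first.
    """
    by_cluster = defaultdict(list)
    for doc_id, score, cluster_id in hits:
        by_cluster[cluster_id].append((doc_id, score))

    selected = []
    for members in by_cluster.values():
        members.sort(key=lambda hit: -hit[1])
        # Slicing keeps fewer results when the cluster is small.
        selected.extend(members[:per_cluster])

    # Assumption: the final list is ordered by retrieval score.
    return sorted(selected, key=lambda hit: (-hit[1], hit[0]))
```

With 10 clusters and 100 results per cluster, the final list holds at most 1000 documents.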

SINAI_cluster_T

Run ID: SINAI_cluster_T
Participant: SINAI
Track: News
Year: 2018
Submission: 8/28/2018
Type: auto
Task: background
MD5: 2b3c8b8390b3da8972392d1cc78a56a5
Run description: Title query and clustering. We create a query with the title of the article. Next we launch the query in a Lemur IR system. The obtained results are classified into 10 clusters. Finally, we select the 100 best results of each cluster (or fewer, if the cluster has fewer than 100 results) to generate the final results file.

SINAI_clusterTA

Run ID: SINAI_clusterTA
Participant: SINAI
Track: News
Year: 2018
Submission: 8/28/2018
Type: auto
Task: background
MD5: 196ace81fa3b0ae50fdda45c5b210202
Run description: Title and abstract query, and clustering. We create a query with the title and abstract of the article. Next we launch the query in a Lemur IR system. The obtained results are classified into 10 clusters. Finally, we select the 100 best results of each cluster (or fewer, if the cluster has fewer than 100 results) to generate the final results file.

UDInfolab_kwef

Run ID: UDInfolab_kwef
Participant: udel_fang
Track: News
Year: 2018
Submission: 8/20/2018
Type: auto
Task: background
MD5: 7f9fec56691c48ed31d6e9c5f2a7a80f
Run description: Use DBpedia to annotate entities for all documents. Build the index such that entities are treated as words. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Rank the first results of all paragraphs by their scores and put them at the top of the result list, followed by the second results, the third results, and so on.
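
The rank-by-rank merge described here can be sketched as follows. This is one reading of the description rather than the participant's code: the function name and input shape are hypothetical, and dropping a document that already appeared in an earlier tier is an assumption.

```python
def tiered_merge(paragraph_results):
    """Merge per-paragraph rankings tier by tier.

    paragraph_results: one ranked [(doc_id, score), ...] list per paragraph.
    Tier 0 holds every paragraph's first result, tier 1 every second
    result, and so on; within a tier, documents are ordered by score.
    """
    merged, seen = [], set()
    depth = max((len(results) for results in paragraph_results), default=0)
    for rank in range(depth):
        tier = [results[rank] for results in paragraph_results
                if rank < len(results)]
        for doc_id, _score in sorted(tier, key=lambda hit: -hit[1]):
            # Assumption: a document is emitted only at its first tier.
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged
```

Unlike a plain round-robin, the order inside each tier depends on retrieval scores rather than on paragraph order.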

UDInfolab_kweh

Run ID: UDInfolab_kweh
Participant: udel_fang
Track: News
Year: 2018
Submission: 8/20/2018
Type: auto
Task: background
MD5: 36b6325656c50c19f2cfb0d6e9636e96
Run description: Use DBpedia to annotate entities for all documents. Build index such that the entities are treated as words. Use paragraphs of the query article as queries. Merge results of the queries using the highest scores of the documents among the paragraphs.
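
Merging by the highest score a document receives across paragraph queries (often called CombMAX in the fusion literature) can be sketched as below. The function name and input shape are hypothetical, and breaking score ties by document ID is an assumption.

```python
def max_score_merge(paragraph_results):
    """Rank documents by their best score over all paragraph queries.

    paragraph_results: one [(doc_id, score), ...] list per paragraph.
    Returns (doc_id, best_score) pairs, highest score first.
    """
    best = {}
    for results in paragraph_results:
        for doc_id, score in results:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Assumption: ties on score are broken by document ID.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
```

Note that scores from different queries are compared directly, which presumes they are on a comparable scale.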

UDInfolab_kwev

Run ID: UDInfolab_kwev
Participant: udel_fang
Track: News
Year: 2018
Submission: 8/20/2018
Type: auto
Task: background
MD5: 64c93232f49d086603634df0a463b649
Run description: Use DBpedia to annotate entities for all documents. Build the index such that entities are treated as words. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Paragraphs "vote" for the best documents (i.e., documents are ranked by how many votes they receive). Ties are broken by using the highest scores of the documents.
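
The voting scheme can be sketched as follows. The description does not say how many documents each paragraph votes for, so the `top_k` parameter is an assumption, as are the function name and the input shape. Each paragraph's list is assumed to contain a document at most once.

```python
from collections import defaultdict


def vote_merge(paragraph_results, top_k=10):
    """Rank documents by the number of paragraph queries that retrieve them.

    paragraph_results: one ranked [(doc_id, score), ...] list per paragraph.
    top_k: how many results each paragraph votes for (an assumed value).
    Ties on votes are broken by the document's highest score.
    """
    votes = defaultdict(int)
    best = defaultdict(lambda: float("-inf"))
    for results in paragraph_results:
        for doc_id, score in results[:top_k]:
            votes[doc_id] += 1
            best[doc_id] = max(best[doc_id], score)
    return sorted(votes, key=lambda d: (-votes[d], -best[d], d))
```

A document retrieved by many paragraphs outranks one retrieved by a single paragraph, even if the latter has a much higher score.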

UDInfolab_kwf

Run ID: UDInfolab_kwf
Participant: udel_fang
Track: News
Year: 2018
Submission: 8/20/2018
Type: auto
Task: background
MD5: 375899a9cc1338717edf88d02b8e5402
Run description: Build the index normally. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Merge results of the queries using the highest scores of the documents among the paragraphs. Rank the first results of all paragraphs by their scores and put them at the top of the result list, followed by the second results, the third results, and so on.

UDInfolab_kwh

Run ID: UDInfolab_kwh
Participant: udel_fang
Track: News
Year: 2018
Submission: 8/20/2018
Type: auto
Task: background
MD5: 5eee03acca877f7a980123ddcb537995
Run description: Build the index normally. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Merge results of the queries using the highest scores of the documents among the paragraphs.

umass_cbrdm

Run ID: umass_cbrdm
Participant: UMass
Track: News
Year: 2018
Submission: 8/21/2018
Type: auto
Task: background
MD5: 2fc9f06b2b2d5c36908a046209262cf7
Run description: umass_cbrdm uses a relevance-dependency model with cheap-dependency features and BM25 scoring functions.

umass_rdm

Run ID: umass_rdm
Participant: UMass
Track: News
Year: 2018
Submission: 8/21/2018
Type: auto
Task: background
MD5: ce0f56e981dc736a5cb8bdd81e63497c
Run description: umass_rdm uses a relevance-dependency model inferred from the query document.

umass_rm

Run ID: umass_rm
Participant: UMass
Track: News
Year: 2018
Submission: 8/21/2018
Type: auto
Task: background
MD5: e71bae558c05191b54db4aba10b95642
Run description: umass_rm uses a classical relevance model inferred from the query document.

UNH-ParaBm25

Run ID: UNH-ParaBm25
Participant: trema-unh
Track: News
Year: 2018
Submission: 8/22/2018
Type: auto
Task: entity
MD5: ea7891d0f6212a2b754952cf03b1d22c
Run description: UNH-ParaBM25.run.bz2. The first paragraph of the content (or at least 200 characters) is used as a search query (the title is omitted). Retrieval is from a Wikipedia page index (based on TREC CAR's dump from Dec 2016) with the BM25 retrieval model, using the rank scores of Wikipedia pages to rank the given entities.

UNH-ParaBm25Ecm

Run ID: UNH-ParaBm25Ecm
Participant: trema-unh
Track: News
Year: 2018
Submission: 8/22/2018
Type: auto
Task: entity
MD5: 75dd0676c71792f250d467f29fa2bc1e
Run description: UNH-ParaBm25Ecm.run.bz2. The first paragraph of the content (or at least 200 characters) is used as a search query (the title is omitted). Retrieval is from a Wikipedia page index with the BM25 retrieval model; a query-expansion model is then computed on the entities linked from the retrieved Wikipedia pages (akin to Lavrenko & Croft's Relevance Model, but over an entity ID vocabulary). Instead of using the expansion distribution for a second retrieval pass, the probabilities are used as retrieval scores to rank the given entities.
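
A relevance-model-style distribution over entity IDs can be sketched as below, computing P(e) as the sum over retrieved pages of P(page | query) * P(e | page). This is an illustration, not the participant's code: normalizing BM25 scores into P(page | query) and using relative link frequency for P(e | page) are assumptions, and the function name and input shape are hypothetical.

```python
def entity_relevance_scores(retrieved_pages, candidates):
    """Score candidate entities from the links of retrieved Wikipedia pages.

    retrieved_pages: [(retrieval_score, [linked entity IDs]), ...].
    candidates: entity IDs that must be ranked.
    Returns (entity_id, probability) pairs, highest probability first.
    """
    probs = {entity: 0.0 for entity in candidates}
    total = sum(score for score, _links in retrieved_pages)
    if total > 0:
        for score, links in retrieved_pages:
            if not links:
                continue
            # Assumption: P(page | query) is the normalized retrieval score.
            p_page = score / total
            for entity in links:
                if entity in probs:
                    # P(e | page) as relative link frequency on the page.
                    probs[entity] += p_page / len(links)
    return sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
```

Candidates that never appear among the linked entities keep a score of zero and end up at the bottom of the ranking.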

UNH-TitleBm25

Run ID: UNH-TitleBm25
Participant: trema-unh
Track: News
Year: 2018
Submission: 8/22/2018
Type: auto
Task: entity
MD5: 91c0318e1665eb1e09ea44cad051d875
Run description: UNH-TitleBm25.run.bz2. The title of the article is used as a search query (the content is omitted). Retrieval is from a Wikipedia page index (based on TREC CAR's dump from Dec 2016) with the BM25 retrieval model, using the rank scores of Wikipedia pages to rank the given entities.