Runs - News 2018
anserini_1000w
Results | Participants | Proceedings | Input | Summary
- Run ID: anserini_1000w
- Participant: Anserini
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: 355780282813c7c9d7c8e262d185993b
- Run description: The query is constructed from all terms in the query document, weighted by their TF-IDF scores. Post-processing includes removal of duplicate documents.
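As an illustration of the query construction described above, here is a minimal sketch of TF-IDF term weighting (function and variable names are hypothetical, not taken from the Anserini run):

```python
import math
from collections import Counter

def tfidf_query(doc_terms, doc_freq, num_docs):
    """Weight every term of the query document by TF-IDF.

    doc_terms: list of tokens from the query article.
    doc_freq:  term -> number of collection documents containing it.
    num_docs:  total number of documents in the collection.
    Returns a dict term -> weight, usable as a weighted bag-of-words query.
    """
    tf = Counter(doc_terms)
    return {
        t: tf[t] * math.log(num_docs / (1 + doc_freq.get(t, 0)))
        for t in tf
    }
```

Terms that are frequent in the article but rare in the collection receive the highest weights.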
anserini_axp
- Run ID: anserini_axp
- Participant: Anserini
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: ceca9f5360f8e77047571cb411ee310b
- Run description: The list of queries is constructed from the first 1000 (at most) terms of the five longest paragraphs in the query document. BM25 is used for first-round retrieval, and Axiomatic Reranking (with at most 1000 expansion terms) is used for reranking. The final result is generated by picking results from the paragraph queries in round-robin fashion. Post-processing includes removal of duplicate documents.
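The round-robin merge with duplicate removal described above can be sketched as follows (a simplified illustration, not the actual Anserini code):

```python
def round_robin_merge(ranked_lists, k=100):
    """Interleave per-paragraph result lists: take the first hit of each
    list, then the second hits, and so on, skipping duplicates."""
    merged, seen = [], set()
    for depth in range(max((len(l) for l in ranked_lists), default=0)):
        for lst in ranked_lists:
            if depth < len(lst) and lst[depth] not in seen:
                seen.add(lst[depth])
                merged.append(lst[depth])
                if len(merged) == k:
                    return merged
    return merged
```

This gives each paragraph query equal influence on the top of the merged ranking.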
anserini_nax
- Run ID: anserini_nax
- Participant: Anserini
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: 87b938649552aa86d2bb4391dff897fc
- Run description: The query is constructed from the first 1000 (at most) terms of the query document. BM25 is used for first-round retrieval, and Axiomatic Reranking (with at most 1000 expansion terms) is used for reranking. Post-processing includes removal of duplicate documents.
anserini_nsdm
- Run ID: anserini_nsdm
- Participant: Anserini
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: 2fd7ca575b0081404306adb3c1ae61b4
- Run description: The query is constructed from the first 1000 (at most) terms of the query document, and the query model is the Sequential Dependence Model. Post-processing includes removal of duplicate documents.
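Sequential Dependence Model queries are commonly written in Indri-style syntax over unigrams, exact ordered bigrams, and unordered windows. The sketch below assumes the conventional 0.85/0.1/0.05 weights, which the run description does not confirm:

```python
def sdm_query(terms, w_t=0.85, w_o=0.1, w_u=0.05):
    """Build an Indri-style Sequential Dependence Model query string:
    unigrams, exact ordered bigrams (#1), and unordered windows (#uw8)."""
    unigrams = " ".join(terms)
    bigrams = list(zip(terms, terms[1:]))
    ordered = " ".join(f"#1({a} {b})" for a, b in bigrams)
    unordered = " ".join(f"#uw8({a} {b})" for a, b in bigrams)
    return (f"#weight({w_t} #combine({unigrams}) "
            f"{w_o} #combine({ordered}) "
            f"{w_u} #combine({unordered}))")
```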
anserini_sdmp
- Run ID: anserini_sdmp
- Participant: Anserini
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: 5cc28cafd3078f3eb57597e27ffcf2cd
- Run description: The list of queries is constructed from the first 1000 (at most) terms of the five longest paragraphs in the query document. The query model is the Sequential Dependence Model. The final result is generated by picking results from the paragraph queries in round-robin fashion. Post-processing includes removal of duplicate documents.
htwsaar1
- Run ID: htwsaar1
- Participant: htwsaar
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: 07204fcdd3cf49ab26d4457339df6dae
- Run description: The approach tested here relies on simple keyword and keyphrase extraction strategies to derive a query from the input document. The document collection was indexed using Elasticsearch, relying on its default retrieval model.
htwsaar2
- Run ID: htwsaar2
- Participant: htwsaar
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: bbf3bd683e42f61e7ec4cc61ae7d475a
- Run description: The approach tested here relies on the entity recognition module of Stanford CoreNLP. Annotated entities as well as identified keyphrases are used to build a query. The document collection was indexed using Elasticsearch, relying on its default retrieval model.
htwsaar3
- Run ID: htwsaar3
- Participant: htwsaar
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: a94c75897fc81521bc5332303fb022ce
- Run description: The approach tested here relies on keyphrase extraction informed by sentiment tagging and named entity recognition from Stanford CoreNLP. The document collection was indexed using Lucene, relying on its default retrieval model.
htwsaar4
- Run ID: htwsaar4
- Participant: htwsaar
- Track: News
- Year: 2018
- Submission: 8/16/2018
- Type: auto
- Task: background
- MD5: f8091dafaf3b4e586f6642a2922583a1
- Run description: The approach tested here uses the entire text of the input document as a query and filters result documents based on publication dates. This served as our simple baseline. Documents are indexed in Lucene, relying on its default retrieval model.
signal-ucl-eff
- Run ID: signal-ucl-eff
- Participant: signal
- Track: News
- Year: 2018
- Submission: 8/21/2018
- Type: auto
- Task: entity
- MD5: 5a47bfdfc188400646b4aaa1ddb9761a
- Run description: We have developed a supervised model to rank entities in a document using a random forest regressor trained with saliency features and a dataset proposed in [1]. In particular, these features include attributes of the entity and its mention in the document, in addition to features modelling the entity's relationship with other entities in the document. Some of the entity attributes and entity relation features are derived from the Wikipedia knowledge graph. On top of this, we perform feature ranking to significantly speed up feature computation at prediction time, which would make our approach operational in a production system. [1] Trani, Salvatore, et al. "SEL: A unified algorithm for salient entity linking." Computational Intelligence 34.1 (2018): 2-29.
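A sketch of the kind of mention-based saliency features described above, which would feed a regressor such as a random forest (the feature names and definitions are illustrative, not the actual feature set of [1]):

```python
def entity_features(doc_tokens, entity_mentions):
    """Compute simple per-entity saliency features.

    doc_tokens:      tokenised document text.
    entity_mentions: entity -> list of token positions where it is mentioned.
    Returns entity -> feature dict (mention frequency, normalised position
    of the earliest mention, and spread of mentions across the document).
    """
    n = max(len(doc_tokens), 1)
    feats = {}
    for ent, positions in entity_mentions.items():
        feats[ent] = {
            "mention_count": len(positions),
            "first_position": min(positions) / n,   # 0.0 = headline/lead
            "spread": (max(positions) - min(positions)) / n,
        }
    return feats
```

Entities mentioned often, early, and throughout the document tend to be more salient, which is the intuition such features encode.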
signal-ucl-sel
- Run ID: signal-ucl-sel
- Participant: signal
- Track: News
- Year: 2018
- Submission: 8/21/2018
- Type: auto
- Task: entity
- MD5: 60fb2cf14916b2e3b7afe92e7e59d9ef
- Run description: We have developed a supervised model to rank entities in a document using a random forest regressor trained with saliency features and a dataset proposed in [1]. In particular, these features include attributes of the entity and its mention in the document, in addition to features modelling the entity's relationship with other entities in the document. Some of the entity attributes and entity relation features are derived from the Wikipedia knowledge graph. [1] Trani, Salvatore, et al. "SEL: A unified algorithm for salient entity linking." Computational Intelligence 34.1 (2018): 2-29.
signal-ucl-slst
- Run ID: signal-ucl-slst
- Participant: signal
- Track: News
- Year: 2018
- Submission: 8/21/2018
- Type: auto
- Task: entity
- MD5: 93cd6e0ac0691eea07c43c9e3001e3fd
- Run description: We have developed a supervised model to rank entities in a document using a random forest regressor trained with saliency features and a dataset proposed in [1]. In particular, these features include attributes of the entity and its mention in the document, in addition to features modelling the entity's relationship with other entities in the document. Some of the entity attributes and entity relation features are derived from the Wikipedia knowledge graph. In addition to those features, we have devised a new set of features that model the sentiment around the mentions of an entity, as they can provide an additional signal on the saliency of the entity. [1] Trani, Salvatore, et al. "SEL: A unified algorithm for salient entity linking." Computational Intelligence 34.1 (2018): 2-29.
SINAI_base_A
- Run ID: SINAI_base_A
- Participant: SINAI
- Track: News
- Year: 2018
- Submission: 8/28/2018
- Type: auto
- Task: background
- MD5: 08a1dc6b33ae6f5464fe2a88afc16812
- Run description: Base case with abstract. We create a query from the abstract of the article, then run it against a Lemur IR system.
SINAI_base_T
- Run ID: SINAI_base_T
- Participant: SINAI
- Track: News
- Year: 2018
- Submission: 8/28/2018
- Type: auto
- Task: background
- MD5: ef43ee5c36b89f2a7ae223b9af5b2040
- Run description: Base case with title. We create a query from the title of the article, then run it against a Lemur IR system.
SINAI_base_TA
- Run ID: SINAI_base_TA
- Participant: SINAI
- Track: News
- Year: 2018
- Submission: 8/28/2018
- Type: auto
- Task: background
- MD5: ae8ddaee34a16fcb9f431bbf73e61a8b
- Run description: Base case with title and abstract. We create a query from the title and abstract of the article, then run it against a Lemur IR system.
SINAI_cluster_A
- Run ID: SINAI_cluster_A
- Participant: SINAI
- Track: News
- Year: 2018
- Submission: 8/28/2018
- Type: auto
- Task: background
- MD5: 8d257e1139cab817af3db70461192285
- Run description: Title query and clustering. We create a query from the title of the article, then run it against a Lemur IR system. The results obtained are grouped into 10 clusters. Finally, we select the 100 best results of each cluster (or fewer, if the cluster contains fewer than 100 results) to generate the final results file.
SINAI_cluster_T
- Run ID: SINAI_cluster_T
- Participant: SINAI
- Track: News
- Year: 2018
- Submission: 8/28/2018
- Type: auto
- Task: background
- MD5: 2b3c8b8390b3da8972392d1cc78a56a5
- Run description: Title query and clustering. We create a query from the title of the article, then run it against a Lemur IR system. The results obtained are grouped into 10 clusters. Finally, we select the 100 best results of each cluster (or fewer, if the cluster contains fewer than 100 results) to generate the final results file.
SINAI_clusterTA
- Run ID: SINAI_clusterTA
- Participant: SINAI
- Track: News
- Year: 2018
- Submission: 8/28/2018
- Type: auto
- Task: background
- MD5: 196ace81fa3b0ae50fdda45c5b210202
- Run description: Title and abstract query, and clustering. We create a query from the title and abstract of the article, then run it against a Lemur IR system. The results obtained are grouped into 10 clusters. Finally, we select the 100 best results of each cluster (or fewer, if the cluster contains fewer than 100 results) to generate the final results file.
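The per-cluster selection step shared by the SINAI clustering runs can be sketched as follows (assuming cluster labels have already been assigned, e.g. by a 10-way clustering of the retrieved results; names are hypothetical):

```python
from collections import defaultdict

def select_per_cluster(scored_docs, cluster_of, per_cluster=100):
    """Keep the best `per_cluster` results from each cluster.

    scored_docs: list of (doc_id, score) pairs, higher score = better.
    cluster_of:  doc_id -> cluster label.
    Returns the retained (doc_id, score) pairs, sorted by score.
    """
    by_cluster = defaultdict(list)
    for doc_id, score in sorted(scored_docs, key=lambda x: -x[1]):
        if len(by_cluster[cluster_of[doc_id]]) < per_cluster:
            by_cluster[cluster_of[doc_id]].append((doc_id, score))
    kept = [pair for docs in by_cluster.values() for pair in docs]
    return sorted(kept, key=lambda x: -x[1])
```

Capping each cluster promotes diversity: no single topic cluster can dominate the final results file.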
UDInfolab_kwef
- Run ID: UDInfolab_kwef
- Participant: udel_fang
- Track: News
- Year: 2018
- Submission: 8/20/2018
- Type: auto
- Task: background
- MD5: 7f9fec56691c48ed31d6e9c5f2a7a80f
- Run description: Use DBpedia to annotate entities in all documents. Build the index so that each entity is treated as a single word. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Rank the first result of each paragraph query by score and place them at the top of the result list, followed by the second result documents, the third result documents, and so on.
UDInfolab_kweh
- Run ID: UDInfolab_kweh
- Participant: udel_fang
- Track: News
- Year: 2018
- Submission: 8/20/2018
- Type: auto
- Task: background
- MD5: 36b6325656c50c19f2cfb0d6e9636e96
- Run description: Use DBpedia to annotate entities in all documents. Build the index so that entities are treated as words. Use the paragraphs of the query article as queries. Merge the results of the queries by taking, for each document, its highest score among the paragraph queries.
UDInfolab_kwev
- Run ID: UDInfolab_kwev
- Participant: udel_fang
- Track: News
- Year: 2018
- Submission: 8/20/2018
- Type: auto
- Task: background
- MD5: 64c93232f49d086603634df0a463b649
- Run description: Use DBpedia to annotate entities in all documents. Build the index so that each entity is treated as a single word. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Paragraphs "vote" for the best documents (i.e., documents are ranked by how many votes they receive). Ties are broken using the documents' highest scores.
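The voting scheme described above can be sketched as follows (a simplified illustration):

```python
from collections import defaultdict

def vote_merge(ranked_lists):
    """Rank documents by how many paragraph queries retrieved them;
    ties are broken by the highest score any query gave the document.

    Each element of ranked_lists holds (doc_id, score) pairs for one
    paragraph query. Returns doc_ids in final rank order.
    """
    votes, best = defaultdict(int), defaultdict(float)
    for lst in ranked_lists:
        for doc_id, score in lst:
            votes[doc_id] += 1
            best[doc_id] = max(best[doc_id], score)
    return sorted(votes, key=lambda d: (-votes[d], -best[d]))
```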
UDInfolab_kwf
- Run ID: UDInfolab_kwf
- Participant: udel_fang
- Track: News
- Year: 2018
- Submission: 8/20/2018
- Type: auto
- Task: background
- MD5: 375899a9cc1338717edf88d02b8e5402
- Run description: Build the index normally. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Merge the results of the queries using the documents' highest scores among the paragraphs. Rank the first result of each paragraph query by score and place them at the top of the result list, followed by the second result documents, the third result documents, and so on.
UDInfolab_kwh
- Run ID: UDInfolab_kwh
- Participant: udel_fang
- Track: News
- Year: 2018
- Submission: 8/20/2018
- Type: auto
- Task: background
- MD5: 5eee03acca877f7a980123ddcb537995
- Run description: Build the index normally. Extract keywords from paragraphs and use the keywords of each paragraph as a separate query. Merge the results of the queries by taking, for each document, its highest score among the paragraph queries.
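The highest-score merge used by this run (and by UDInfolab_kweh) can be sketched as follows:

```python
def max_score_merge(ranked_lists):
    """Merge per-paragraph result lists by scoring each document with
    the highest score it received from any paragraph query.

    Each element of ranked_lists holds (doc_id, score) pairs.
    Returns (doc_id, score) pairs sorted by merged score.
    """
    best = {}
    for lst in ranked_lists:
        for doc_id, score in lst:
            best[doc_id] = max(best.get(doc_id, float("-inf")), score)
    return sorted(best.items(), key=lambda x: -x[1])
```

Taking the maximum rewards a document that matches any one paragraph strongly, rather than requiring moderate matches against all of them.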
umass_cbrdm
- Run ID: umass_cbrdm
- Participant: UMass
- Track: News
- Year: 2018
- Submission: 8/21/2018
- Type: auto
- Task: background
- MD5: 2fc9f06b2b2d5c36908a046209262cf7
- Run description: umass_cbrdm uses a relevance-dependency model with cheap dependency features and BM25 scoring functions.
umass_rdm
- Run ID: umass_rdm
- Participant: UMass
- Track: News
- Year: 2018
- Submission: 8/21/2018
- Type: auto
- Task: background
- MD5: ce0f56e981dc736a5cb8bdd81e63497c
- Run description: umass_rdm uses a relevance-dependency model inferred from the query document.
umass_rm
- Run ID: umass_rm
- Participant: UMass
- Track: News
- Year: 2018
- Submission: 8/21/2018
- Type: auto
- Task: background
- MD5: e71bae558c05191b54db4aba10b95642
- Run description: umass_rm uses a classical relevance model inferred from the query document.
UNH-ParaBm25
- Run ID: UNH-ParaBm25
- Participant: trema-unh
- Track: News
- Year: 2018
- Submission: 8/22/2018
- Type: auto
- Task: entity
- MD5: ea7891d0f6212a2b754952cf03b1d22c
- Run description: UNH-ParaBM25.run.bz2. The first paragraph of the content (at least 200 characters) is used as the search query; the title is omitted. Retrieval is from a Wikipedia page index (based on the TREC CAR dump from Dec 2016) with the BM25 retrieval model, using the rank scores of Wikipedia pages to rank the given entities.
UNH-ParaBm25Ecm
- Run ID: UNH-ParaBm25Ecm
- Participant: trema-unh
- Track: News
- Year: 2018
- Submission: 8/22/2018
- Type: auto
- Task: entity
- MD5: 75dd0676c71792f250d467f29fa2bc1e
- Run description: UNH-ParaBm25Ecm.run.bz2. The first paragraph of the content (at least 200 characters) is used as the search query; the title is omitted. Retrieval is from a Wikipedia page index with the BM25 retrieval model; then a query-expansion model is computed over the entities linked on the retrieved Wikipedia pages (akin to Lavrenko & Croft's relevance model, but over an entity-id vocabulary). Instead of using the expansion distribution for a second retrieval pass, its probabilities are used as retrieval scores to rank the given entities.
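The entity-vocabulary relevance model described above can be sketched as follows. This assumes a uniform P(entity | page) and page weights proportional to retrieval scores, neither of which the description specifies:

```python
def entity_relevance_model(retrieved_pages, page_scores):
    """Relevance-model-style expansion over an entity-id vocabulary:
    P(e) is proportional to the sum over retrieved Wikipedia pages of
    P(page) * P(e | page).

    retrieved_pages: page_id -> list of entity ids linked on that page.
    page_scores:     page_id -> retrieval score (assumed non-negative).
    Returns entity -> probability, used directly as the entity score.
    """
    total = sum(page_scores.values()) or 1.0
    dist = {}
    for page, entities in retrieved_pages.items():
        if not entities:
            continue
        p_page = page_scores[page] / total
        p_ent = 1.0 / len(entities)          # uniform P(e | page)
        for e in entities:
            dist[e] = dist.get(e, 0.0) + p_page * p_ent
    return dist
```

Entities linked from many highly scored pages accumulate the most probability mass, so the distribution itself serves as the final entity ranking.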
UNH-TitleBm25
- Run ID: UNH-TitleBm25
- Participant: trema-unh
- Track: News
- Year: 2018
- Submission: 8/22/2018
- Type: auto
- Task: entity
- MD5: 91c0318e1665eb1e09ea44cad051d875
- Run description: UNH-TitleBm25.run.bz2. The title of the article is used as the search query; the content is omitted. Retrieval is from a Wikipedia page index (based on the TREC CAR dump from Dec 2016) with the BM25 retrieval model, using the rank scores of Wikipedia pages to rank the given entities.