Runs - Microblog 2015

BJUTllyQE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: BJUTllyQE
  • Participant: BJUT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 87d49d07761191f6f7d60cc815a4cf58
  • Run description: The external collection we use to enrich the query is Wikipedia.

BjutNMF1

  • Run ID: BjutNMF1
  • Participant: BJUT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 5cfa080f74434f096967ccd032fd2693
  • Run description: Google and Bing search results.

BjutNMF2

  • Run ID: BjutNMF2
  • Participant: BJUT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 3c45784738613cc940e6658c12dc03c7
  • Run description: Google and Bing search results.

CLIP-A-5.0-0.5

  • Run ID: CLIP-A-5.0-0.5
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 4c9b40687dd956f649988e86bbdbe138
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-A-5.0-0.6

  • Run ID: CLIP-A-5.0-0.6
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 96806b7d2ac81e619b15d2ac96f142f0
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-A-DYN-0.5

  • Run ID: CLIP-A-DYN-0.5
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 394b7bde9ea0da25eb49f0164da8c2ef
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-B-0.4

  • Run ID: CLIP-B-0.4
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 1ebb13b4e030bd1c194e63c41278f855
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-B-0.5

  • Run ID: CLIP-B-0.5
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e0d0e5fc3bc2aa253e26e496473010a5
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-B-0.6

  • Run ID: CLIP-B-0.6
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: f1e5bd7fa3b53c2009b732a05fc24880
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

DALTREC_B_PREP

  • Run ID: DALTREC_B_PREP
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: 32a58faa6e7b980f60862a38e7aa7ebb
  • Run description: This code reads the crawled tweets from JSON files in ./data/tweets and indexes them separately for each day. It then loads the manually assigned weights from ./data/ManuallyLabeledKeyterms2.txt. These weights indicate the importance of manually extracted keyterms in each topic/profile. Using these weights, we create boosted queries and set the Lucene similarity to "LMDirichletSimilarity". The ranked tweets returned by Lucene are tested to see whether the sum of the weights of the keyterms occurring in each tweet passes a predefined threshold. If so, the tweet is returned as relevant; otherwise, it is ignored.
  • Code: https://github.com/RaraMakki/DalTrec-Microblog
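
The keyterm-weight filter described above can be sketched as follows. This is a minimal stand-in for the post-retrieval step only: the weights dict and threshold are hypothetical, and the actual run scores candidates with Lucene's LMDirichletSimilarity first.

```python
# Sketch of the keyterm-weight relevance filter described in the run above.
# The weights and threshold are hypothetical stand-ins for the values loaded
# from ./data/ManuallyLabeledKeyterms2.txt.

def is_relevant(tweet_text, keyterm_weights, threshold):
    """Sum the weights of keyterms occurring in the tweet and check
    whether the sum passes a predefined threshold."""
    tokens = set(tweet_text.lower().split())
    score = sum(w for term, w in keyterm_weights.items() if term in tokens)
    return score >= threshold

weights = {"earthquake": 0.9, "nepal": 0.7, "relief": 0.4}  # hypothetical
print(is_relevant("Nepal earthquake relief efforts continue", weights, 1.5))
```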

DALTRECAA1

  • Run ID: DALTRECAA1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 695f697d9cf6eae9e670ffa68f6fd17b
  • Run description: 1. Compares Wikipedia concepts using GTM. Wikipedia concept linking: http://dexter.isti.cnr.it/ ; GTM: http://cgm6.research.cs.dal.ca:8080/DalTextWebApp/
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECAB1

  • Run ID: DALTRECAB1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 0f888a2adcb813f1d21a4d42fbe69442
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECMA1

  • Run ID: DALTRECMA1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: d6f542d8edf0e79bec00b9a6da0831e7
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECMA2

  • Run ID: DALTRECMA2
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: 9b330da0c6d3833155477ffc09b71502
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECMB1

  • Run ID: DALTRECMB1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 058752c22563a7452d5e8bd3269b5a17
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

ECNURUNA1

  • Run ID: ECNURUNA1
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 06a9b424ae0663c8acc5fa5d287ef934
  • Run description: When we get a new tweet, we use a classifier trained on data from the 2013 TREC Microblog track to decide whether it is relevant to a specific query. If it is relevant, we then compute its similarity with the previously delivered tweets. We use Google search results to expand our queries.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNA2

  • Run ID: ECNURUNA2
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: da1b5b4f77074489148367353bf14a80
  • Run description: When we get a new tweet, we use a classifier trained on data from the 2013 TREC Microblog track to decide whether it is relevant to a specific query. If it is relevant, we then compute its similarity with the previously delivered tweets. We use Google search results to expand our queries.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNA3

  • Run ID: ECNURUNA3
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 549872e41bd0585993734427e142a976
  • Run description: When we get a new tweet, we use a classifier trained on data from the 2013 TREC Microblog track to decide whether it is relevant to a specific query. If it is relevant, we then compute its similarity with the previously delivered tweets. We use Google search results to expand our queries.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNB1

  • Run ID: ECNURUNB1
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: fccad6db2b606c741173a1fcee73b43b
  • Run description: Combines DFRee_QTGP with LM_Q; uses three query expansion methods: Google-search-based, TF-IDF-based, and Bo1bfree-based.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNB2

  • Run ID: ECNURUNB2
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 6f8077717920b9b3f136ab30e752af73
  • Run description: Combines DFRee_QTGP, LM_Q, and BM25_QT; uses three query expansion methods: Google-search-based, TF-IDF-based, and Bo1bfree-based.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNB3

  • Run ID: ECNURUNB3
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: bd7f88c884870918ab96efc2ce6464cc
  • Run description: Combines DFRee_QTGP, LM_Q, BM25_QGTP, and BM25_QG; uses three query expansion methods: Google-search-based, TF-IDF-based, and Bo1bfree-based.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

hpclab_pi_algA

  • Run ID: hpclab_pi_algA
  • Participant: HPCLAB_PI
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Task: a
  • MD5: 5f923e2a1c0b37043f1b2ded87f21424
  • Run description: The algorithm takes into account retweet count and favorite count, combining those values with the tweet's BM25 score.

hpclabpibm25mod

  • Run ID: hpclabpibm25mod
  • Participant: HPCLAB_PI
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: 972bcf02a64199df509fa145cd88972d
  • Run description: The run combines evidence from the retweet and favorite counts of tweets with their BM25 scores.
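
One way to blend a BM25 score with the social signals named above is a weighted linear combination. The log scaling and the weights below are illustrative assumptions, not the run's actual formula.

```python
from math import log

def combined_score(bm25, retweet_count, favorite_count, w_rt=0.1, w_fav=0.05):
    """Boost a BM25 score by log-scaled retweet and favorite counts.
    Weights and log scaling are illustrative, not the run's formula."""
    return bm25 + w_rt * log(1 + retweet_count) + w_fav * log(1 + favorite_count)

# A tweet with social engagement scores higher than one without.
print(combined_score(10.0, 100, 10) > combined_score(10.0, 0, 0))
```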

IRIT-KLTFIDF

  • Run ID: IRIT-KLTFIDF
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: 1fc4981858bdb8ae14b19cb4ef566dbc
  • Run description: No external resources were used for this run.

IRIT-RTDig.33

  • Run ID: IRIT-RTDig.33
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: b
  • MD5: f5ae850daf1a75d0530f25aff03b03ff
  • Run description: Our tweet digest run filters and clusters tweets in real time. Tweets are assigned to the recently created cluster that maximizes average similarity.

IRIT-RTNotif.33

  • Run ID: IRIT-RTNotif.33
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: a
  • MD5: e51c810ff5f76f993f01e8bea47e8ad7
  • Run description: Our tweet push notification run uses the same clustering model as our digest run, except that the time window for notification is set to 1 hour. It also uses a pseudo-relevance model to push only highly relevant tweets.

IRIT100KLTFIDF

  • Run ID: IRIT100KLTFIDF
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: fe2f13255cb4462ad73f21c756b6a4bb
  • Run description: No external resources were used for this run.

IritSigSDA

  • Run ID: IritSigSDA
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 2408934d70274257fb48c1c05fb4c44d
  • Run description: The main aim of this run was the speed of decision making: all decisions are made within a few seconds. Many features, in addition to the content of the tweet, are taken into account to decide whether to retain a given tweet as soon as possible. Python was preferred over Java to speed up decision making even further.

IritSigSDB

  • Run ID: IritSigSDB
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 509ee95449cd419d15854625b915b23b
  • Run description: The main aim of this run was the speed of decision making: all decisions are made within a few minutes. Many features, in addition to the content of the tweet, are taken into account to compute a first score on the text content of the tweet and a second one based on all the additional features. Python was preferred over Java to speed up decision making even further, except for the final ranking of the results (back to Java, as for the sampling).

MPII_COM_MAXREP

  • Run ID: MPII_COM_MAXREP
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e52f0a4f9db69d432bef2590272d1109
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_COMB_MART

  • Run ID: MPII_COMB_MART
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: d8359599b196fb5022e213ff47d4059a
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We also used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_COMB_SORT

  • Run ID: MPII_COMB_SORT
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 0260cbb3437d02d33610ac8b9f924f53
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_HYBRID_PW

  • Run ID: MPII_HYBRID_PW
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: a39cc63d29e498a59c33f8bc6bb7de3d
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_LUC_MART

  • Run ID: MPII_LUC_MART
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 54a9d054c826795beb8912eb5549d26d
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_LUC_SORT

  • Run ID: MPII_LUC_SORT
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 8f30c4a39af081d84e1a62d4ae27f388
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used.
  • Code: https://github.com/lhyan792/microblogtrack.git

PKUICSTRunA1

  • Run ID: PKUICSTRunA1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: f4023e79eacaa8a17a48fe57890a7892
  • Run description: We used Google web search for query expansion before the evaluation period, and we adopted an adaptive relevance threshold based on the top-K relevance threshold in Scenario B of the previous day. In addition, we used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunA2

  • Run ID: PKUICSTRunA2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: 0bd96aaefa08b03f5fceb7b239230f78
  • Run description: We used Google web search for query expansion before the evaluation period, and we adopted an adaptive relevance threshold based on the manual relevance threshold in Scenario B of the previous day. In addition, we used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunA3

  • Run ID: PKUICSTRunA3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 6094629c4596031dc5c7128b5eff00f3
  • Run description: We used Google web search for query expansion before the evaluation period, and we adopted a manual relevance threshold based on the top-K relevance threshold in Scenario B of the previous day. In addition, we used a uniform novelty threshold N = 0.72.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunB1

  • Run ID: PKUICSTRunB1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 0ebd3b390d77a3b522caa224bf5774ef
  • Run description: We used Google web search for query expansion before the evaluation period, and we implemented a language model with pseudo-relevance feedback to obtain relevant tweets. In addition, we adopted an adaptive relevance threshold based on the top-K relevance threshold in Scenario B of the previous day, and used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunB2

  • Run ID: PKUICSTRunB2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: d908773fbd1d07e20942e2810451fd8b
  • Run description: We used Google web search for query expansion before the evaluation period, and we implemented a language model with pseudo-relevance feedback to obtain relevant tweets. In addition, we adopted an adaptive relevance threshold based on the manual relevance threshold in Scenario B of the previous day, and used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunB3

  • Run ID: PKUICSTRunB3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 808e2ab0eec249dcf7a9b30f266ec8f4
  • Run description: We used Google web search for query expansion before the evaluation period, and we implemented a language model with pseudo-relevance feedback to obtain relevant tweets. In addition, we adopted a manual relevance threshold based on the top-K relevance threshold in Scenario B of the previous day, and used a uniform novelty threshold N = 0.72.
  • Code: https://github.com/ifff/microblogfilter

prnaTaskA1

  • Run ID: prnaTaskA1
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 14453f25b3bc592d12f3e936ea5237cd
  • Run description: We use various NLP/IR techniques to extract the important user profile keywords, expand those based on WordNet synsets, and index them. Incoming tweets are processed and the relevant tweets are then mapped to corresponding user profiles using a combination of semantic similarity and frequency-based relevance scores.

prnaTaskA2

  • Run ID: prnaTaskA2
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 04708f457c91a4457626069d4a40034c
  • Run description: We use various NLP/IR techniques to extract the important user profile keywords, expand those based on neural word/phrase embeddings, and index them. Incoming tweets are processed and the relevant tweets are then mapped to corresponding user profiles using a combination of semantic similarity and frequency-based relevance scores.

prnaTaskA3

  • Run ID: prnaTaskA3
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: d7d8aad8afed9ef93ce426224463f6ac
  • Run description: We use various NLP/IR techniques to extract the important user profile keywords, expand those based on WordNet synsets and neural word/phrase embeddings, and index them. Incoming tweets are processed and the relevant tweets are then mapped to corresponding user profiles using a combination of semantic similarity and frequency-based relevance scores.

prnaTaskB1

  • Run ID: prnaTaskB1
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: c4126ba2a7c28858ccca834ead7da15f
  • Run description: Various NLP/IR techniques are used to process and index the candidate tweets for a day. The important user profile keywords are extracted, expanded based on WordNet synsets, and used to search for relevant tweets and create a digest of up to one hundred tweets per day per user profile.

prnaTaskB2

  • Run ID: prnaTaskB2
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 326a295c53cd96e5bd0500146b8f18ff
  • Run description: Various NLP/IR techniques are used to process and index the candidate tweets for a day. The important user profile keywords are extracted, expanded based on neural word/phrase embeddings, and used to search for relevant tweets and create a digest of up to one hundred tweets per day per user profile.

prnaTaskB3

  • Run ID: prnaTaskB3
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: cb281531cedabec7a28f692c34684cc2
  • Run description: Various NLP/IR techniques are used to process and index the candidate tweets for a day. The important user profile keywords are extracted, expanded based on WordNet synsets and neural word/phrase embeddings, and used to search for relevant tweets and create a digest of up to one hundred tweets per day per user profile.

QUBaseline

  • Run ID: QUBaseline
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: bf5cabaaced8c2683f897e7f2cedfb8d
  • Run description: We use online filtering by measuring the similarity between the current tweet in the stream and the topical representation of each of the 225 topics. We represent a topic by terms from its title, description, and narrative, in addition to expansion terms extracted from pseudo-relevant tweets for the topic. We use a static similarity threshold and, based on that, decide which topics the tweet matches. Additionally, we reduce redundancy in pushed tweets by only pushing a relevant tweet to a topic if it isn't similar to already pushed tweets.
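
The push decision described above can be sketched as a cosine-similarity check against the topic representation plus a redundancy check against already-pushed tweets. The bag-of-words vectors and both thresholds below are illustrative assumptions, not the run's actual parameters.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two term-count dictionaries."""
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def push_decision(tweet, topic_terms, pushed, sim_thresh=0.3, red_thresh=0.7):
    """Push a tweet if it is similar enough to the topic representation
    and not redundant with already-pushed tweets (illustrative thresholds)."""
    tv = Counter(tweet.lower().split())
    if cosine(tv, Counter(topic_terms)) < sim_thresh:
        return False  # not relevant enough to this topic
    if any(cosine(tv, Counter(p.lower().split())) >= red_thresh for p in pushed):
        return False  # redundant with an already-pushed tweet
    pushed.append(tweet)
    return True
```

A second, near-identical tweet for the same topic is suppressed by the redundancy check even though it passes the relevance check.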

QUBaselineB

  • Run ID: QUBaselineB
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 67310fde622fb773c15a6dcc92b4f4a8
  • Run description: This baseline run simply uses the title as a query and runs it at the end of each day against the index of tweets. We indexed tweets from the 3 days before the evaluation period to get initial statistics.

QUDyn

  • Run ID: QUDyn
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 967648832c91c0098e92dce00224920e
  • Run description: We use online filtering by measuring the similarity between the current tweet in the stream and the topical representation of each of the 225 topics. We represent a topic by terms from its title, description, and narrative, in addition to expansion terms extracted from pseudo-relevant tweets for the topic. We use a dynamically set similarity threshold and, based on that, decide which topics the tweet matches. Additionally, we reduce redundancy in pushed tweets by only pushing a relevant tweet to a topic if it isn't similar to already pushed tweets. We use Twitter data collected during the 3 days before the evaluation period started.

QUDynExp

  • Run ID: QUDynExp
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: bc3a216a1e36def0bdef5e1af6f7c930
  • Run description: We use online filtering by measuring the similarity between the current tweet in the stream and the topical representation of each of the 225 topics. We represent a topic by terms from its title, description, and narrative, in addition to expansion terms extracted from pseudo-relevant tweets for the topic. We use a dynamically set similarity threshold and, based on that, decide which topics the tweet matches. Additionally, we reduce redundancy in pushed tweets by only pushing a relevant tweet to a topic if it isn't similar to already pushed tweets. For external resources, we use Twitter data collected during the 3 days before the evaluation period started.

QUExpB

  • Run ID: QUExpB
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: b777d14ecc0ac85c656402fc762232ba
  • Run description: This run does query expansion every day using tweets of the same day. Only 3 expansion terms (from the top 5 pseudo-relevant tweets) are added besides the title. We indexed tweets from the 3 days before the evaluation period to get initial statistics.

QUFullExpB

  • Run ID: QUFullExpB
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: a5fa33f2cdff2fc2dbc89bb609ba409a
  • Run description: This run does query expansion every day using tweets of the same day. 10 expansion terms (from the top 10 pseudo-relevant tweets) are added besides the title, description, and narrative. The final query will not exceed 20 terms. We indexed tweets from the 3 days before the evaluation period to get initial statistics.
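
The daily pseudo-relevance-feedback expansion described in the QU runs can be sketched as follows. Term-frequency scoring of candidate terms is an assumption; the caps mirror the description (10 expansion terms, final query capped at 20 terms).

```python
from collections import Counter

def expand_query(query_terms, pseudo_relevant_tweets, n_expansion=10, max_terms=20):
    """Add the most frequent non-query terms from pseudo-relevant tweets
    to the query, capping the final query length."""
    counts = Counter()
    for tweet in pseudo_relevant_tweets:
        counts.update(t for t in tweet.lower().split() if t not in query_terms)
    expansion = [term for term, _ in counts.most_common(n_expansion)]
    return (list(query_terms) + expansion)[:max_terms]
```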

SNACS

  • Run ID: SNACS
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 7494df7a98cfdabf768a5f9d789b23a5
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

SNACS_LA

  • Run ID: SNACS_LA
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: a
  • MD5: 244c1802c38969207cf1f70495e83b69
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

SNACS_LB

  • Run ID: SNACS_LB
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: b
  • MD5: 0ff568df2d851df3b3862bfdc33c51d1
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

SNACSA

  • Run ID: SNACSA
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 367d138763cf3ee3da9263514c6ec7c9
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

udelRun1A

  • Run ID: udelRun1A
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: a
  • MD5: 3648596c170574a2d517c2a42d3892a6
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile.

udelRun1B

  • Run ID: udelRun1B
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 1d8a0708bf70887936caea047adeb50e
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile. Tweets are ranked by arrival time to give a true chronological summary.

udelRun2A

  • Run ID: udelRun2A
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: a
  • MD5: c6c17f0eeb725dc3cc966807a2f6855a
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile. A modified TF-IDF score is used.

udelRun2B

  • Run ID: udelRun2B
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: afa08f6f19dd913627882385bb87064a
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile. Tweets are ranked by relevance score.

udelRun3A

  • Run ID: udelRun3A
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: a
  • MD5: 4f12836696f4682b4e736db23cdb65db
  • Run description: Uses the top 2 ClueWeb documents and the top 450 tweets returned by Twitter to train the classifier for every profile.

udelRun3B

  • Run ID: udelRun3B
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 7a1c448085fc32d8a690e768fb85f95b
  • Run description: Uses the top 2 ClueWeb documents and the top 450 tweets returned by Twitter to train the classifier for every profile. Tweets are ranked by relevance score.

umd_hcil_run01

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run01
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 077b41cbefc093a3d89f9cf3c3d0d253
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least TWO (2) such keywords.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0
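The burst-detection idea above (exponential curve fitting over keyword counts) can be sketched with a log-linear least-squares fit. This is an illustrative reconstruction, not the team's actual code (their implementation is linked from the run entry); the function names and the 0.5 growth-rate threshold are assumptions.

```python
import math

def burst_growth_rate(counts):
    """Fit counts ~ a * exp(b * t) by least squares on log counts.

    counts: per-window keyword frequencies, in time order.
    Returns the estimated growth rate b; a large positive b
    suggests a burst in keyword activity.
    """
    # Log-linear fit: ln(count) = ln(a) + b * t (zero counts are skipped).
    points = [(t, math.log(c)) for t, c in enumerate(counts) if c > 0]
    n = len(points)
    if n < 2:
        return 0.0
    sum_t = sum(t for t, _ in points)
    sum_y = sum(y for _, y in points)
    sum_tt = sum(t * t for t, _ in points)
    sum_ty = sum(t * y for t, y in points)
    denom = n * sum_tt - sum_t * sum_t
    if denom == 0:
        return 0.0
    return (n * sum_ty - sum_t * sum_y) / denom

def is_burst(counts, threshold=0.5):
    # Threshold is an assumed tuning parameter, not the run's actual value.
    return burst_growth_rate(counts) > threshold
```

Doubling counts per window give a growth rate of ln 2 ≈ 0.69, which this sketch flags as a burst, while a flat series does not.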

umd_hcil_run02

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run02
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 8f6f18436c2a81e12b01163b9cc81994
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least ONE (1) such keyword.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0

umd_hcil_run03

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run03
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e9081903e8e9e824332b692323011b20
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least TWO (2) such keywords.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0

umd_hcil_run04

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run04
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 2bc3e9ef684fbd8d37dacfc1631805e4
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least ONE (1) such keyword.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0

UNCSILS_HRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UNCSILS_HRM
  • Participant: UNCSILS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 34670caa6ab03b1f8298b2be83ac4310
  • Run description: In this approach, we aim to expand the query using relevant hashtags. To this end, we first build a "relevance model" of hashtags. The probability assigned to each hashtag was proportional to the query-generation probability given the hashtag's language model. Hashtag language models were generated from all tweets containing the hashtag during a period of about 20 days prior to the evaluation period. We expanded the original query using the 10 highest-scoring hashtags, combining them with the original query model by linear interpolation with parameter lambda=0.50. Finally, we scored tweets collected throughout the day using the KL-divergence between the expanded query model and the document model (using Dirichlet smoothing with parameter mu=1000).
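The interpolation and scoring steps described above can be sketched as follows, assuming simple unigram models over tokens. The helper names and the tiny floor probability for unseen collection terms are illustrative assumptions, not the team's implementation.

```python
import math
from collections import Counter

def interpolate(orig_model, expansion_model, lam=0.5):
    """Linearly interpolate two unigram models: lam * orig + (1 - lam) * expansion."""
    words = set(orig_model) | set(expansion_model)
    return {w: lam * orig_model.get(w, 0.0) + (1 - lam) * expansion_model.get(w, 0.0)
            for w in words}

def dirichlet_prob(word, doc_counts, doc_len, coll_prob, mu=1000):
    """P(w|d) with Dirichlet smoothing against collection probabilities.

    The 1e-9 floor for words unseen in the collection is an assumption
    to keep the log defined."""
    return (doc_counts.get(word, 0) + mu * coll_prob.get(word, 1e-9)) / (doc_len + mu)

def kl_score(query_model, tweet_tokens, coll_prob, mu=1000):
    """Rank-equivalent negative KL(query || doc): sum_w P(w|Q) log P(w|d).

    Higher is better; constants that do not affect ranking are dropped."""
    counts = Counter(tweet_tokens)
    dlen = len(tweet_tokens)
    return sum(p_q * math.log(dirichlet_prob(w, counts, dlen, coll_prob, mu))
               for w, p_q in query_model.items())
```

With lambda=0.50 the expanded model splits mass evenly between the original query terms and the hashtag-derived terms, and tweets containing query terms score above those that do not.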

UNCSILS_TRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UNCSILS_TRM
  • Participant: UNCSILS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: cc994a2336555b70a6e0c37f9fefcfc7
  • Run description: In this approach, we first expand the query using Lavrenko's relevance model (RM3) and an external collection of tweets gathered for about 20 days prior to the evaluation period. For the baseline retrieval (from our static tweet collection), we used the query-likelihood model with Dirichlet smoothing and removed duplicate tweets from the ranking. Tweets were considered duplicates if they had a Jaccard Coefficient >= 0.70. We set parameters topDocs = 10, topTerms = 10, Dirichlet smoothing parameter mu=1000, and lambda=0.50. Parameter lambda was used to linearly interpolate the relevance model with the original query model. Finally, we scored tweets collected throughout the day using the KL-divergence between the RM3 relevance model and the document model (again, using Dirichlet smoothing).
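The duplicate-removal step above (Jaccard Coefficient >= 0.70 over tweet terms) can be sketched as a greedy filter over a ranked list. This is a minimal illustration under the stated threshold; the function names are assumptions.

```python
def jaccard(a_tokens, b_tokens):
    """Jaccard coefficient over the two tweets' term sets."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dedup(ranked_tweets, threshold=0.70):
    """Walk the ranking top-down, keeping a tweet only if it is not a
    near-duplicate (Jaccard >= threshold) of any already-kept tweet."""
    kept = []
    for tokens in ranked_tweets:
        if all(jaccard(tokens, k) < threshold for k in kept):
            kept.append(tokens)
    return kept
```

For example, a tweet sharing 3 of 4 distinct terms with a higher-ranked tweet (Jaccard 0.75) is dropped.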

UNCSILS_WRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UNCSILS_WRM
  • Participant: UNCSILS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 2f9c706ae647cd574e674bb44ada79d8
  • Run description: In this approach, we first expand the query using Lavrenko's relevance model (RM3) and an external Wikipedia collection. For the baseline retrieval (from Wikipedia), we used the query-likelihood model with Dirichlet smoothing. We set parameters topDocs = 10, topTerms = 10, Dirichlet smoothing parameter mu=1000, and lambda=0.50. Parameter lambda was used to linearly interpolate the relevance model with the original query (as is done in RM3). Finally, we scored tweets collected throughout the day using the KL-divergence between the RM3 relevance model and the document model (again, using Dirichlet smoothing).

UWaterlooATDK

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooATDK
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: c948bbb341a051d66124566df4f9725e
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses only the title terms and expansion terms as a standing query. Two thresholds k0 and k1 were set using the results from the past 24 hours. For each incoming tweet, if its score is higher than k1, push it immediately; if its score is higher than k0 but smaller than k1, wait for at most 80 minutes. During this waiting period, if any tweet is scored higher than the current waiting one, reset the waiting time and replace the waiting tweet with the higher scoring one.
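The two-threshold push policy described above can be sketched as a small event loop. This is one interpretation under stated assumptions: ties and end-of-stream handling are not specified in the description, so this sketch flushes any still-waiting tweet when the stream ends, and lets an immediate (k1) push supersede the waiting tweet.

```python
def simulate_push(events, k0, k1, wait_limit=80):
    """Simulate the two-threshold push policy.

    events: list of (minute, tweet_id, score) in time order.
    Scores >= k1 push immediately; scores in [k0, k1) wait up to
    wait_limit minutes and can be displaced by a higher-scoring tweet
    (which restarts the wait). Returns pushed tweet ids in order.
    """
    pushed = []
    waiting = None  # (deadline_minute, tweet_id, score)
    for minute, tid, score in events:
        # Flush a waiting tweet whose 80-minute window has expired.
        if waiting and minute >= waiting[0]:
            pushed.append(waiting[1])
            waiting = None
        if score >= k1:
            pushed.append(tid)
            waiting = None  # assumption: the immediate push supersedes the wait
        elif score >= k0:
            if waiting is None or score > waiting[2]:
                # Start (or restart) the wait with this higher-scoring tweet.
                waiting = (minute + wait_limit, tid, score)
    if waiting:  # assumption: flush at end of stream
        pushed.append(waiting[1])
    return pushed
```

With k0=0.4 and k1=0.9, a tweet scoring 0.95 is pushed at once, while a 0.5-scoring tweet sits in the waiting slot until its window lapses.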

UWaterlooATEK

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooATEK
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 0c2514a8c615b777f61760e15feec002
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses only the title terms and expansion terms. Every 90 minutes, it selects the highest scored tweet and emits it. This run uses a general score threshold to determine whether any tweet(s) should be emitted. If no tweet is scored higher than this threshold during a 90 minute period, then nothing is emitted.

UWaterlooATNDEK

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooATNDEK
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: a3d2a811bdc0b58e5fe8b422e62f9e5b
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses all the terms from title, description and narrative of all 225 topics as a background model. Terms from a single topic are used as a foreground model. KL-divergence is used to select important terms from narrative and description (i.e., from the foreground model). This run adds these terms to the title terms and expansion terms as our standing query. Every 90 minutes, it selects the highest scored tweet and emits it. This run uses a general score threshold to determine whether any tweet(s) should be emitted. If no tweet is scored higher than this threshold during a 90 minute period, then nothing is emitted.

UWaterlooBT

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooBT
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e4d21ae1c2a9ca250d4f0a82aa96dd38
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses only the title terms and expansion terms. It selects the top 100 ranked tweets for each day.

UWaterlooBTND

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooBTND
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: a4f1821e1898f28b5d746f7406f32815
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses all the terms from title, description and narrative of all 225 topics as a background model. Terms from a single topic are used as a foreground model. KL-divergence is used to select important terms from narrative and description (i.e., from the foreground model). This run adds these terms to the title terms and expansion terms as our standing query. It selects the top 100 ranked tweets for each day.

UWCMBE1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBE1
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 0275a5ac58640a3fe6b173130f23b3d4
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.

UWCMBE2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBE2
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 199b69a53b167bb6473e584ad1fd3030
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. A weighted scoring function is used which favors tweets with more title words. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.

UWCMBP1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBP1
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 8b7813c8d174adb947037ae89a607712
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. We make use of a push notification strategy adapted from the secretary problem. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.
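The description says only that the push strategy was "adapted from the secretary problem"; as a hedged illustration of that family of policies, the classic 1/e observation rule looks like this. The function name, parameters, and the exact stopping rule are assumptions, not the team's actual adaptation.

```python
import math

def secretary_push(scores, horizon):
    """Classic 1/e stopping rule: observe the first horizon/e scores
    without pushing, then push the first later tweet whose score beats
    the best seen during the observation phase.

    scores: tweet scores in arrival order; horizon: expected number of
    candidate tweets. Returns the index pushed, or None if nothing
    beat the observation-phase best.
    """
    cutoff = max(1, int(horizon / math.e))
    best_seen = max(scores[:cutoff], default=float('-inf'))
    for i in range(cutoff, min(horizon, len(scores))):
        if scores[i] > best_seen:
            return i
    return None
```

The appeal for push notification is that the rule commits online, without seeing future tweets, yet picks the single best candidate with probability about 1/e.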

UWCMBP2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBP2
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: cf4d7b1eb13547541ccb4114fce3274e
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. A weighted scoring function is used which favors tweets with more title words. We make use of a push notification strategy adapted from the secretary problem. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.