Runs - Microblog 2015

BJUTllyQE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: BJUTllyQE
  • Participant: BJUT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 87d49d07761191f6f7d60cc815a4cf58
  • Run description: The external collection we use to enrich the query is Wikipedia.

BjutNMF1

  • Run ID: BjutNMF1
  • Participant: BJUT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 5cfa080f74434f096967ccd032fd2693
  • Run description: Google and Bing search results.

BjutNMF2

  • Run ID: BjutNMF2
  • Participant: BJUT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 3c45784738613cc940e6658c12dc03c7
  • Run description: Google and Bing search results.

CLIP-A-5.0-0.5

  • Run ID: CLIP-A-5.0-0.5
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 4c9b40687dd956f649988e86bbdbe138
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-A-5.0-0.6

  • Run ID: CLIP-A-5.0-0.6
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 96806b7d2ac81e619b15d2ac96f142f0
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-A-DYN-0.5

  • Run ID: CLIP-A-DYN-0.5
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 394b7bde9ea0da25eb49f0164da8c2ef
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-B-0.4

  • Run ID: CLIP-B-0.4
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 1ebb13b4e030bd1c194e63c41278f855
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-B-0.5

  • Run ID: CLIP-B-0.5
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e0d0e5fc3bc2aa253e26e496473010a5
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

CLIP-B-0.6

  • Run ID: CLIP-B-0.6
  • Participant: CLIP
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: f1e5bd7fa3b53c2009b732a05fc24880
  • Run description: It uses a word embedding model trained on 1.5B+ tweets between Sep 2011 and Apr 2015.

DALTREC_B_PREP

  • Run ID: DALTREC_B_PREP
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: 32a58faa6e7b980f60862a38e7aa7ebb
  • Run description: This code reads the crawled tweets from JSON files in ./data/tweets and indexes them separately for each day. It then loads the manually assigned weights from ./data/ManuallyLabeledKeyterms2.txt. These weights indicate the importance of manually extracted keyterms in each topic/profile. Using these weights, we create boosted queries and set the Lucene similarity to "LMDirichletSimilarity". The ranked tweets returned by Lucene are tested to see whether the sum of the weights of the keyterms occurring in each tweet passes a predefined threshold. If so, the tweet is returned as relevant; otherwise, it is ignored.
  • Code: https://github.com/RaraMakki/DalTrec-Microblog
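
The keyterm-weight filter described above can be sketched as follows. This is a minimal stand-in for the post-retrieval step only: the weights dict and threshold are hypothetical, and the actual run scores candidates with Lucene's LMDirichletSimilarity first.

```python
# Sketch of the keyterm-weight relevance filter described in the run above.
# The weights and threshold are hypothetical stand-ins for the values loaded
# from ./data/ManuallyLabeledKeyterms2.txt.

def is_relevant(tweet_text, keyterm_weights, threshold):
    """Sum the weights of keyterms occurring in the tweet and check
    whether the sum passes a predefined threshold."""
    tokens = set(tweet_text.lower().split())
    score = sum(w for term, w in keyterm_weights.items() if term in tokens)
    return score >= threshold

weights = {"earthquake": 0.9, "nepal": 0.7, "relief": 0.4}  # hypothetical
print(is_relevant("Nepal earthquake relief efforts continue", weights, 1.5))
```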

DALTRECAA1

  • Run ID: DALTRECAA1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 695f697d9cf6eae9e670ffa68f6fd17b
  • Run description: 1. Compares Wikipedia concepts using GTM. Wikipedia concept linking: http://dexter.isti.cnr.it/ ; GTM: http://cgm6.research.cs.dal.ca:8080/DalTextWebApp/
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECAB1

  • Run ID: DALTRECAB1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 0f888a2adcb813f1d21a4d42fbe69442
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECMA1

  • Run ID: DALTRECMA1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: d6f542d8edf0e79bec00b9a6da0831e7
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECMA2

  • Run ID: DALTRECMA2
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: 9b330da0c6d3833155477ffc09b71502
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

DALTRECMB1

  • Run ID: DALTRECMB1
  • Participant: DalTREC
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 058752c22563a7452d5e8bd3269b5a17
  • Run description: 1. Dexter Wikipedia concept linking. 2. GTM semantic similarity scores. 3. Manually labelled key terms.
  • Code: https://github.com/hellolvn1/DalTrec

ECNURUNA1

  • Run ID: ECNURUNA1
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 06a9b424ae0663c8acc5fa5d287ef934
  • Run description: When we get a new tweet, we use a classifier trained on data from the 2013 TREC Microblog track to decide whether it is relevant to a specific query. If it is relevant, we then compute its similarity with the previously delivered tweets. We use Google search results to expand our queries.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNA2

  • Run ID: ECNURUNA2
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: da1b5b4f77074489148367353bf14a80
  • Run description: When we get a new tweet, we use a classifier trained on data from the 2013 TREC Microblog track to decide whether it is relevant to a specific query. If it is relevant, we then compute its similarity with the previously delivered tweets. We use Google search results to expand our queries.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNA3

  • Run ID: ECNURUNA3
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 549872e41bd0585993734427e142a976
  • Run description: When we get a new tweet, we use a classifier trained on data from the 2013 TREC Microblog track to decide whether it is relevant to a specific query. If it is relevant, we then compute its similarity with the previously delivered tweets. We use Google search results to expand our queries.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNB1

  • Run ID: ECNURUNB1
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: fccad6db2b606c741173a1fcee73b43b
  • Run description: Combines DFRee_QTGP with LM_Q; uses three query expansion methods: Google-search-based, TF-IDF-based, and Bo1bfree-based.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNB2

  • Run ID: ECNURUNB2
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 6f8077717920b9b3f136ab30e752af73
  • Run description: Combines DFRee_QTGP, LM_Q, and BM25_QT; uses three query expansion methods: Google-search-based, TF-IDF-based, and Bo1bfree-based.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

ECNURUNB3

  • Run ID: ECNURUNB3
  • Participant: ECNU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: bd7f88c884870918ab96efc2ce6464cc
  • Run description: Combines DFRee_QTGP, LM_Q, BM25_QGTP, and BM25_QG; uses three query expansion methods: Google-search-based, TF-IDF-based, and Bo1bfree-based.
  • Code: https://github.com/AliceQin900/MicroblogTrack2015.git

hpclab_pi_algA

  • Run ID: hpclab_pi_algA
  • Participant: HPCLAB_PI
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Task: a
  • MD5: 5f923e2a1c0b37043f1b2ded87f21424
  • Run description: The algorithm takes into account retweet count and favorite count, combining those values with the tweet's BM25 score.

hpclabpibm25mod

  • Run ID: hpclabpibm25mod
  • Participant: HPCLAB_PI
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: 972bcf02a64199df509fa145cd88972d
  • Run description: The run combines evidence from the retweet and favorite counts of tweets with their BM25 scores.
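
One way to blend a BM25 score with the social signals named above is a weighted linear combination. The log scaling and the weights below are illustrative assumptions, not the run's actual formula.

```python
from math import log

def combined_score(bm25, retweet_count, favorite_count, w_rt=0.1, w_fav=0.05):
    """Boost a BM25 score by log-scaled retweet and favorite counts.
    Weights and log scaling are illustrative, not the run's formula."""
    return bm25 + w_rt * log(1 + retweet_count) + w_fav * log(1 + favorite_count)

# A tweet with social engagement scores higher than one without.
print(combined_score(10.0, 100, 10) > combined_score(10.0, 0, 0))
```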

IRIT-KLTFIDF

  • Run ID: IRIT-KLTFIDF
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: 1fc4981858bdb8ae14b19cb4ef566dbc
  • Run description: No external resources were used for this run.

IRIT-RTDig.33

  • Run ID: IRIT-RTDig.33
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: b
  • MD5: f5ae850daf1a75d0530f25aff03b03ff
  • Run description: Our tweet digest run filters and clusters tweets in real time. Tweets are assigned to the recently created cluster that maximizes average similarity.

IRIT-RTNotif.33

  • Run ID: IRIT-RTNotif.33
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: a
  • MD5: e51c810ff5f76f993f01e8bea47e8ad7
  • Run description: Our tweet push notification run uses the same clustering model as our digest run, except that the time window for notification is set to 1 hour. It also uses a pseudo-relevance model to push only highly relevant tweets.

IRIT100KLTFIDF

  • Run ID: IRIT100KLTFIDF
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: fe2f13255cb4462ad73f21c756b6a4bb
  • Run description: No external resources were used for this run.

IritSigSDA

  • Run ID: IritSigSDA
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 2408934d70274257fb48c1c05fb4c44d
  • Run description: The main aim of this run was the speed of decision making: all decisions are made within a few seconds. Many features, in addition to the content of the tweet, are taken into account to decide whether to retain a given tweet as soon as possible. Python was preferred over Java to speed up decision making even further.

IritSigSDB

  • Run ID: IritSigSDB
  • Participant: IRIT
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 509ee95449cd419d15854625b915b23b
  • Run description: The main aim of this run was the speed of decision making: all decisions are made within a few minutes. Many features, in addition to the content of the tweet, are taken into account to compute a first score on the text content of the tweet and a second one based on all the additional features. Python was preferred over Java to speed up decision making even further, except for the final ranking of the results (back to Java, as for the sampling).

MPII_COM_MAXREP

  • Run ID: MPII_COM_MAXREP
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e52f0a4f9db69d432bef2590272d1109
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_COMB_MART

  • Run ID: MPII_COMB_MART
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: d8359599b196fb5022e213ff47d4059a
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We also used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_COMB_SORT

  • Run ID: MPII_COMB_SORT
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 0260cbb3437d02d33610ac8b9f924f53
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_HYBRID_PW

  • Run ID: MPII_HYBRID_PW
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: a39cc63d29e498a59c33f8bc6bb7de3d
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used. We used Twitter metadata as features.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_LUC_MART

  • Run ID: MPII_LUC_MART
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 54a9d054c826795beb8912eb5549d26d
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used.
  • Code: https://github.com/lhyan792/microblogtrack.git

MPII_LUC_SORT

  • Run ID: MPII_LUC_SORT
  • Participant: MPII
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 8f30c4a39af081d84e1a62d4ae27f388
  • Run description: Wikipedia is used to expand the query; the title of the page behind the embedded URL is also used.
  • Code: https://github.com/lhyan792/microblogtrack.git

PKUICSTRunA1

  • Run ID: PKUICSTRunA1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: f4023e79eacaa8a17a48fe57890a7892
  • Run description: We used Google web search for query expansion before the evaluation period, and we adopted an adaptive relevance threshold based on the top-K relevance threshold in Scenario B of the previous day. In addition, we used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunA2

  • Run ID: PKUICSTRunA2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: a
  • MD5: 0bd96aaefa08b03f5fceb7b239230f78
  • Run description: We used Google web search for query expansion before the evaluation period, and we adopted an adaptive relevance threshold based on the manual relevance threshold in Scenario B of the previous day. In addition, we used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunA3

  • Run ID: PKUICSTRunA3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 6094629c4596031dc5c7128b5eff00f3
  • Run description: We used Google web search for query expansion before the evaluation period, and we adopted a manual relevance threshold based on the top-K relevance threshold in Scenario B of the previous day. In addition, we used a uniform novelty threshold N = 0.72.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunB1

  • Run ID: PKUICSTRunB1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 0ebd3b390d77a3b522caa224bf5774ef
  • Run description: We used Google web search for query expansion before the evaluation period, and we implemented a language model with pseudo-relevance feedback to obtain relevant tweets. In addition, we adopted an adaptive relevance threshold based on the top-K relevance threshold in Scenario B of the previous day, and used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunB2

  • Run ID: PKUICSTRunB2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Task: b
  • MD5: d908773fbd1d07e20942e2810451fd8b
  • Run description: We used Google web search for query expansion before the evaluation period, and we implemented a language model with pseudo-relevance feedback to obtain relevant tweets. In addition, we adopted an adaptive relevance threshold based on the manual relevance threshold in Scenario B of the previous day, and used a uniform novelty threshold N = 0.67.
  • Code: https://github.com/ifff/microblogfilter

PKUICSTRunB3

  • Run ID: PKUICSTRunB3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 808e2ab0eec249dcf7a9b30f266ec8f4
  • Run description: We used Google web search for query expansion before the evaluation period, and we implemented a language model with pseudo-relevance feedback to obtain relevant tweets. In addition, we adopted a manual relevance threshold based on the top-K relevance threshold in Scenario B of the previous day, and used a uniform novelty threshold N = 0.72.
  • Code: https://github.com/ifff/microblogfilter

prnaTaskA1

  • Run ID: prnaTaskA1
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 14453f25b3bc592d12f3e936ea5237cd
  • Run description: We use various NLP/IR techniques to extract the important user profile keywords, expand those based on WordNet synsets, and index them. Incoming tweets are processed and the relevant tweets are then mapped to corresponding user profiles using a combination of semantic similarity and frequency-based relevance scores.

prnaTaskA2

  • Run ID: prnaTaskA2
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 04708f457c91a4457626069d4a40034c
  • Run description: We use various NLP/IR techniques to extract the important user profile keywords, expand those based on neural word/phrase embeddings, and index them. Incoming tweets are processed and the relevant tweets are then mapped to corresponding user profiles using a combination of semantic similarity and frequency-based relevance scores.

prnaTaskA3

  • Run ID: prnaTaskA3
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: d7d8aad8afed9ef93ce426224463f6ac
  • Run description: We use various NLP/IR techniques to extract the important user profile keywords, expand those based on WordNet synsets and neural word/phrase embeddings, and index them. Incoming tweets are processed and the relevant tweets are then mapped to corresponding user profiles using a combination of semantic similarity and frequency-based relevance scores.

prnaTaskB1

  • Run ID: prnaTaskB1
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: c4126ba2a7c28858ccca834ead7da15f
  • Run description: Various NLP/IR techniques are used to process and index the candidate tweets for a day. The important user profile keywords are extracted, expanded based on WordNet synsets, and used to search for relevant tweets and create a digest of up to one hundred tweets per day per user profile.

prnaTaskB2

  • Run ID: prnaTaskB2
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 326a295c53cd96e5bd0500146b8f18ff
  • Run description: Various NLP/IR techniques are used to process and index the candidate tweets for a day. The important user profile keywords are extracted, expanded based on neural word/phrase embeddings, and used to search for relevant tweets and create a digest of up to one hundred tweets per day per user profile.

prnaTaskB3

  • Run ID: prnaTaskB3
  • Participant: prna
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: cb281531cedabec7a28f692c34684cc2
  • Run description: Various NLP/IR techniques are used to process and index the candidate tweets for a day. The important user profile keywords are extracted, expanded based on WordNet synsets and neural word/phrase embeddings, and used to search for relevant tweets and create a digest of up to one hundred tweets per day per user profile.

QUBaseline

  • Run ID: QUBaseline
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: bf5cabaaced8c2683f897e7f2cedfb8d
  • Run description: We use online filtering by measuring the similarity between the current tweet in the stream and the topical representation of each of the 225 topics. We represent a topic by terms from its title, description, and narrative, in addition to expansion terms extracted from pseudo-relevant tweets for the topic. We use a static similarity threshold and, based on that, decide which topics the tweet matches. Additionally, we reduce redundancy in pushed tweets by only pushing a relevant tweet to a topic if it isn't similar to already pushed tweets.
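
The push decision described above can be sketched as a cosine-similarity check against the topic representation plus a redundancy check against already-pushed tweets. The bag-of-words vectors and both thresholds below are illustrative assumptions, not the run's actual parameters.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two term-count dictionaries."""
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def push_decision(tweet, topic_terms, pushed, sim_thresh=0.3, red_thresh=0.7):
    """Push a tweet if it is similar enough to the topic representation
    and not redundant with already-pushed tweets (illustrative thresholds)."""
    tv = Counter(tweet.lower().split())
    if cosine(tv, Counter(topic_terms)) < sim_thresh:
        return False  # not relevant enough to this topic
    if any(cosine(tv, Counter(p.lower().split())) >= red_thresh for p in pushed):
        return False  # redundant with an already-pushed tweet
    pushed.append(tweet)
    return True
```

A second, near-identical tweet for the same topic is suppressed by the redundancy check even though it passes the relevance check.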

QUBaselineB

  • Run ID: QUBaselineB
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 67310fde622fb773c15a6dcc92b4f4a8
  • Run description: This baseline run simply uses the title as a query and runs it at the end of each day against the index of tweets. We indexed tweets from the 3 days before the evaluation period to get initial statistics.

QUDyn

  • Run ID: QUDyn
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 967648832c91c0098e92dce00224920e
  • Run description: We use online filtering by measuring the similarity between the current tweet in the stream and the topical representation of each of the 225 topics. We represent a topic by terms from its title, description, and narrative, in addition to expansion terms extracted from pseudo-relevant tweets for the topic. We use a dynamically set similarity threshold and, based on that, decide which topics the tweet matches. Additionally, we reduce redundancy in pushed tweets by only pushing a relevant tweet to a topic if it isn't similar to already pushed tweets. We use Twitter data collected during the 3 days before the evaluation period started.

QUDynExp

  • Run ID: QUDynExp
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: bc3a216a1e36def0bdef5e1af6f7c930
  • Run description: We use online filtering by measuring the similarity between the current tweet in the stream and the topical representation of each of the 225 topics. We represent a topic by terms from its title, description, and narrative, in addition to expansion terms extracted from pseudo-relevant tweets for the topic. We use a dynamically set similarity threshold and, based on that, decide which topics the tweet matches. Additionally, we reduce redundancy in pushed tweets by only pushing a relevant tweet to a topic if it isn't similar to already pushed tweets. For external resources, we use Twitter data collected during the 3 days before the evaluation period started.

QUExpB

  • Run ID: QUExpB
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: b777d14ecc0ac85c656402fc762232ba
  • Run description: This run does query expansion every day using tweets of the same day. Only 3 expansion terms (from the top 5 pseudo-relevant tweets) are added besides the title. We indexed tweets from the 3 days before the evaluation period to get initial statistics.

QUFullExpB

  • Run ID: QUFullExpB
  • Participant: QU
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: a5fa33f2cdff2fc2dbc89bb609ba409a
  • Run description: This run does query expansion every day using tweets of the same day. 10 expansion terms (from the top 10 pseudo-relevant tweets) are added besides the title, description, and narrative. The final query will not exceed 20 terms. We indexed tweets from the 3 days before the evaluation period to get initial statistics.
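
The daily pseudo-relevance-feedback expansion described in the QU runs can be sketched as follows. Term-frequency scoring of candidate terms is an assumption; the caps mirror the description (10 expansion terms, final query capped at 20 terms).

```python
from collections import Counter

def expand_query(query_terms, pseudo_relevant_tweets, n_expansion=10, max_terms=20):
    """Add the most frequent non-query terms from pseudo-relevant tweets
    to the query, capping the final query length."""
    counts = Counter()
    for tweet in pseudo_relevant_tweets:
        counts.update(t for t in tweet.lower().split() if t not in query_terms)
    expansion = [term for term, _ in counts.most_common(n_expansion)]
    return (list(query_terms) + expansion)[:max_terms]
```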

SNACS

  • Run ID: SNACS
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 7494df7a98cfdabf768a5f9d789b23a5
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

SNACS_LA

  • Run ID: SNACS_LA
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: a
  • MD5: 244c1802c38969207cf1f70495e83b69
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

SNACS_LB

  • Run ID: SNACS_LB
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/31/2015
  • Type: automatic
  • Task: b
  • MD5: 0ff568df2d851df3b3862bfdc33c51d1
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

SNACSA

  • Run ID: SNACSA
  • Participant: NUDTSNA
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 367d138763cf3ee3da9263514c6ec7c9
  • Run description: Uses the English Wikipedia corpus to train a word2vec model, and a labeled tweet collection to train an importance model with Weka.
  • Code: https://github.com/zhuxiang/MB_TREC2015

udelRun1A

  • Run ID: udelRun1A
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: a
  • MD5: 3648596c170574a2d517c2a42d3892a6
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile.

udelRun1B

  • Run ID: udelRun1B
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 1d8a0708bf70887936caea047adeb50e
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile. Tweets are ranked by arrival time to give a true chronological summary.

udelRun2A

  • Run ID: udelRun2A
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: a
  • MD5: c6c17f0eeb725dc3cc966807a2f6855a
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile. A modified TF-IDF score is used.

udelRun2B

  • Run ID: udelRun2B
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: afa08f6f19dd913627882385bb87064a
  • Run description: Uses the top 2 documents returned by Google and the top 450 tweets returned by Twitter to train the classifier for every profile. Tweets are ranked by relevance score.

udelRun3A

  • Run ID: udelRun3A
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: a
  • MD5: 4f12836696f4682b4e736db23cdb65db
  • Run description: Uses the top 2 ClueWeb documents and the top 450 tweets returned by Twitter to train the classifier for every profile.

udelRun3B

  • Run ID: udelRun3B
  • Participant: udel
  • Track: Microblog
  • Year: 2015
  • Submission: 7/29/2015
  • Type: automatic
  • Task: b
  • MD5: 7a1c448085fc32d8a690e768fb85f95b
  • Run description: Uses the top 2 ClueWeb documents and the top 450 tweets returned by Twitter to train the classifier for every profile. Tweets are ranked by relevance score.

umd_hcil_run01

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run01
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 077b41cbefc093a3d89f9cf3c3d0d253
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least TWO (2) such keywords.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0
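The burst-detection idea above (exponential curve fitting over keyword counts) can be sketched with a log-linear least-squares fit. This is an illustrative reconstruction, not the team's actual code (their implementation is linked from the run entry); the function names and the 0.5 growth-rate threshold are assumptions.

```python
import math

def burst_growth_rate(counts):
    """Fit counts ~ a * exp(b * t) by least squares on log counts.

    counts: per-window keyword frequencies, in time order.
    Returns the estimated growth rate b; a large positive b
    suggests a burst in keyword activity.
    """
    # Log-linear fit: ln(count) = ln(a) + b * t (zero counts are skipped).
    points = [(t, math.log(c)) for t, c in enumerate(counts) if c > 0]
    n = len(points)
    if n < 2:
        return 0.0
    sum_t = sum(t for t, _ in points)
    sum_y = sum(y for _, y in points)
    sum_tt = sum(t * t for t, _ in points)
    sum_ty = sum(t * y for t, y in points)
    denom = n * sum_tt - sum_t * sum_t
    if denom == 0:
        return 0.0
    return (n * sum_ty - sum_t * sum_y) / denom

def is_burst(counts, threshold=0.5):
    # Threshold is an assumed tuning parameter, not the run's actual value.
    return burst_growth_rate(counts) > threshold
```

Doubling counts per window give a growth rate of ln 2 ≈ 0.69, which this sketch flags as a burst, while a flat series does not.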

umd_hcil_run02

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run02
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 8f6f18436c2a81e12b01163b9cc81994
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least ONE (1) such keyword.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0

umd_hcil_run03

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run03
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e9081903e8e9e824332b692323011b20
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least TWO (2) such keywords.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0

umd_hcil_run04

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: umd_hcil_run04
  • Participant: umd_hcil
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 2bc3e9ef684fbd8d37dacfc1631805e4
  • Run description: This run makes use of exponential curve fitting to identify bursts in keyword activity on Twitter. We filter the public Twitter sample stream down to tweets including lemmatized keywords from the TREC topics and keep all tweets that contain at least ONE (1) such keyword.
  • Code: https://github.com/cbuntain/UMD_HCIL_TREC2015/tree/v1.0

UNCSILS_HRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UNCSILS_HRM
  • Participant: UNCSILS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 34670caa6ab03b1f8298b2be83ac4310
  • Run description: In this approach, we aim to expand the query using relevant hashtags. To this end, we first build a "relevance model" of hashtags. The probability assigned to each hashtag was proportional to the query-generation probability given the hashtag's language model. Hashtag language models were generated from all tweets containing the hashtag during a period of about 20 days prior to the evaluation period. We expanded the original query using the 10 highest-scoring hashtags, combining them with the original query model by linear interpolation with parameter lambda=0.50. Finally, we scored tweets collected throughout the day using the KL-divergence between the expanded query model and the document model (using Dirichlet smoothing with parameter mu=1000).
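The interpolation and scoring steps described above can be sketched as follows, assuming simple unigram models over tokens. The helper names and the tiny floor probability for unseen collection terms are illustrative assumptions, not the team's implementation.

```python
import math
from collections import Counter

def interpolate(orig_model, expansion_model, lam=0.5):
    """Linearly interpolate two unigram models: lam * orig + (1 - lam) * expansion."""
    words = set(orig_model) | set(expansion_model)
    return {w: lam * orig_model.get(w, 0.0) + (1 - lam) * expansion_model.get(w, 0.0)
            for w in words}

def dirichlet_prob(word, doc_counts, doc_len, coll_prob, mu=1000):
    """P(w|d) with Dirichlet smoothing against collection probabilities.

    The 1e-9 floor for words unseen in the collection is an assumption
    to keep the log defined."""
    return (doc_counts.get(word, 0) + mu * coll_prob.get(word, 1e-9)) / (doc_len + mu)

def kl_score(query_model, tweet_tokens, coll_prob, mu=1000):
    """Rank-equivalent negative KL(query || doc): sum_w P(w|Q) log P(w|d).

    Higher is better; constants that do not affect ranking are dropped."""
    counts = Counter(tweet_tokens)
    dlen = len(tweet_tokens)
    return sum(p_q * math.log(dirichlet_prob(w, counts, dlen, coll_prob, mu))
               for w, p_q in query_model.items())
```

With lambda=0.50 the expanded model splits mass evenly between the original query terms and the hashtag-derived terms, and tweets containing query terms score above those that do not.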

UNCSILS_TRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UNCSILS_TRM
  • Participant: UNCSILS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: cc994a2336555b70a6e0c37f9fefcfc7
  • Run description: In this approach, we first expand the query using Lavrenko's relevance model (RM3) and an external collection of tweets gathered for about 20 days prior to the evaluation period. For the baseline retrieval (from our static tweet collection), we used the query-likelihood model with Dirichlet smoothing and removed duplicate tweets from the ranking. Tweets were considered duplicates if they had a Jaccard Coefficient >= 0.70. We set parameters topDocs = 10, topTerms = 10, Dirichlet smoothing parameter mu=1000, and lambda=0.50. Parameter lambda was used to linearly interpolate the relevance model with the original query model. Finally, we scored tweets collected throughout the day using the KL-divergence between the RM3 relevance model and the document model (again, using Dirichlet smoothing).
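The duplicate-removal step above (Jaccard Coefficient >= 0.70 over tweet terms) can be sketched as a greedy filter over a ranked list. This is a minimal illustration under the stated threshold; the function names are assumptions.

```python
def jaccard(a_tokens, b_tokens):
    """Jaccard coefficient over the two tweets' term sets."""
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def dedup(ranked_tweets, threshold=0.70):
    """Walk the ranking top-down, keeping a tweet only if it is not a
    near-duplicate (Jaccard >= threshold) of any already-kept tweet."""
    kept = []
    for tokens in ranked_tweets:
        if all(jaccard(tokens, k) < threshold for k in kept):
            kept.append(tokens)
    return kept
```

For example, a tweet sharing 3 of 4 distinct terms with a higher-ranked tweet (Jaccard 0.75) is dropped.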

UNCSILS_WRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UNCSILS_WRM
  • Participant: UNCSILS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 2f9c706ae647cd574e674bb44ada79d8
  • Run description: In this approach, we first expand the query using Lavrenko's relevance model (RM3) and an external Wikipedia collection. For the baseline retrieval (from Wikipedia), we used the query-likelihood model with Dirichlet smoothing. We set parameters topDocs = 10, topTerms = 10, Dirichlet smoothing parameter mu=1000, and lambda=0.50. Parameter lambda was used to linearly interpolate the relevance model with the original query (as is done in RM3). Finally, we scored tweets collected throughout the day using the KL-divergence between the RM3 relevance model and the document model (again, using Dirichlet smoothing).

UWaterlooATDK

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooATDK
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: c948bbb341a051d66124566df4f9725e
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses only the title terms and expansion terms as a standing query. Two thresholds k0 and k1 were set using the results from the past 24 hours. For each incoming tweet, if its score is higher than k1, push it immediately; if its score is higher than k0 but smaller than k1, wait for at most 80 minutes. During this waiting period, if any tweet is scored higher than the current waiting one, reset the waiting time and replace the waiting tweet with the higher scoring one.
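The two-threshold push policy described above can be sketched as a small event loop. This is one interpretation under stated assumptions: ties and end-of-stream handling are not specified in the description, so this sketch flushes any still-waiting tweet when the stream ends, and lets an immediate (k1) push supersede the waiting tweet.

```python
def simulate_push(events, k0, k1, wait_limit=80):
    """Simulate the two-threshold push policy.

    events: list of (minute, tweet_id, score) in time order.
    Scores >= k1 push immediately; scores in [k0, k1) wait up to
    wait_limit minutes and can be displaced by a higher-scoring tweet
    (which restarts the wait). Returns pushed tweet ids in order.
    """
    pushed = []
    waiting = None  # (deadline_minute, tweet_id, score)
    for minute, tid, score in events:
        # Flush a waiting tweet whose 80-minute window has expired.
        if waiting and minute >= waiting[0]:
            pushed.append(waiting[1])
            waiting = None
        if score >= k1:
            pushed.append(tid)
            waiting = None  # assumption: the immediate push supersedes the wait
        elif score >= k0:
            if waiting is None or score > waiting[2]:
                # Start (or restart) the wait with this higher-scoring tweet.
                waiting = (minute + wait_limit, tid, score)
    if waiting:  # assumption: flush at end of stream
        pushed.append(waiting[1])
    return pushed
```

With k0=0.4 and k1=0.9, a tweet scoring 0.95 is pushed at once, while a 0.5-scoring tweet sits in the waiting slot until its window lapses.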

UWaterlooATEK

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooATEK
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 0c2514a8c615b777f61760e15feec002
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses only the title terms and expansion terms. Every 90 minutes, it selects the highest scored tweet and emits it. This run uses a general score threshold to determine whether any tweet(s) should be emitted. If no tweet is scored higher than this threshold during a 90 minute period, then nothing is emitted.

UWaterlooATNDEK

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooATNDEK
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: a3d2a811bdc0b58e5fe8b422e62f9e5b
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses all the terms from title, description and narrative of all 225 topics as a background model. Terms from a single topic are used as a foreground model. KL-divergence is used to select important terms from narrative and description (i.e., from the foreground model). This run adds these terms to the title terms and expansion terms as our standing query. Every 90 minutes, it selects the highest scored tweet and emits it. This run uses a general score threshold to determine whether any tweet(s) should be emitted. If no tweet is scored higher than this threshold during a 90 minute period, then nothing is emitted.

UWaterlooBT

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooBT
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: e4d21ae1c2a9ca250d4f0a82aa96dd38
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses only the title terms and expansion terms. It selects the top 100 ranked tweets for each day.

UWaterlooBTND

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWaterlooBTND
  • Participant: UWaterlooMDS
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: a4f1821e1898f28b5d746f7406f32815
  • Run description: This run expands the title statement. To do so, we compare a foreground model for each topic with a general background model for all the topics, using KL-divergence to generate expansion terms from the foreground model. Each topic-specific foreground model was composed of the top 1000 tweets retrieved by using Twitter's web-based search (search.twitter.com) with stemmed title terms forming the query. URLs in each tweet were replaced by the title tag from the corresponding webpage. The general background model was comprised of 6 months of English tweets from Twitter's streaming API. Specifics: This run uses all the terms from title, description and narrative of all 225 topics as a background model. Terms from a single topic are used as a foreground model. KL-divergence is used to select important terms from narrative and description (i.e., from the foreground model). This run adds these terms to the title terms and expansion terms as our standing query. It selects the top 100 ranked tweets for each day.

UWCMBE1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBE1
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 0275a5ac58640a3fe6b173130f23b3d4
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.

UWCMBE2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBE2
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: b
  • MD5: 199b69a53b167bb6473e584ad1fd3030
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. A weighted scoring function is used which favors tweets with more title words. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.

UWCMBP1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBP1
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: 8b7813c8d174adb947037ae89a607712
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. We make use of a push notification strategy adapted from the secretary problem. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.
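The description says only that the push strategy was "adapted from the secretary problem"; as a hedged illustration of that family of policies, the classic 1/e observation rule looks like this. The function name, parameters, and the exact stopping rule are assumptions, not the team's actual adaptation.

```python
import math

def secretary_push(scores, horizon):
    """Classic 1/e stopping rule: observe the first horizon/e scores
    without pushing, then push the first later tweet whose score beats
    the best seen during the observation phase.

    scores: tweet scores in arrival order; horizon: expected number of
    candidate tweets. Returns the index pushed, or None if nothing
    beat the observation-phase best.
    """
    cutoff = max(1, int(horizon / math.e))
    best_seen = max(scores[:cutoff], default=float('-inf'))
    for i in range(cutoff, min(horizon, len(scores))):
        if scores[i] > best_seen:
            return i
    return None
```

The appeal for push notification is that the rule commits online, without seeing future tweets, yet picks the single best candidate with probability about 1/e.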

UWCMBP2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWCMBP2
  • Participant: WaterlooClarke
  • Track: Microblog
  • Year: 2015
  • Submission: 7/30/2015
  • Type: automatic
  • Task: a
  • MD5: cf4d7b1eb13547541ccb4114fce3274e
  • Run description: The system first does query expansion by pseudo relevance feedback for each interest profile by making queries using the Twitter and Google search APIs. The expanded terms, together with the profile titles, are then used to score the tweets. A weighted scoring function is used which favors tweets with more title words. We make use of a push notification strategy adapted from the secretary problem. We avoid recommending redundant tweets by making use of a simple tweet similarity measure which counts the number of similar terms between tweets. External Resources (for query expansion): 1. Data from Twitter and Google search APIs during evaluation period. 2. A collected corpus of tweets from the Twitter sample stream gathered prior to the start of the evaluation period.