Runs - Microblog 2011

1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 1
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 7/29/2011
  • Type: automatic
  • Task: main
  • MD5: 8a57acc414ad236d0371f73bb4e5f0d8
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher. Our final score combines the Lucene score and the burst detection score. If no burst was detected, then only the Lucene score was considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters.
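
  A minimal sketch of how such a burst-detection score could be combined with the Lucene score
  (the burst threshold, the additive combination, and the burst weight are assumptions; the run
  description only fixes the 3-hour interval):

    from collections import Counter
    from datetime import timedelta
    from statistics import mean, stdev

    WINDOW = timedelta(hours=3)  # interval stated in the run description

    def burst_windows(tweet_times, epoch, sigmas=2.0):
        """Return the 3-hour windows whose tweet volume is unusually high."""
        counts = Counter((t - epoch) // WINDOW for t in tweet_times)
        if len(counts) < 2:
            return set()
        mu, sd = mean(counts.values()), stdev(counts.values())
        return {w for w, c in counts.items() if c > mu + sigmas * sd}

    def combined_score(lucene_score, tweet_time, epoch, bursts, burst_weight=0.5):
        """Boost the Lucene score when the tweet falls into a burst window;
        if no burst was detected, only the Lucene score remains."""
        in_burst = (tweet_time - epoch) // WINDOW in bursts
        return lucene_score + burst_weight if in_burst else lucene_score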

2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 2
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3f9f238b0ab1fc668091a1d4254f078c
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher by a burst detection score. As a second ranking factor we counted the number of retweets (retweet score) of all Twitter users over the whole corpus. Note that we did not restrict this computation to retweets up to the given query time. As a third ranking factor, we computed the most often used hashtag per topic and ranked tweets higher that contained it (hashtag score). Note that this computation was done only on tweets posted before the query timestamp (no future evidence). Our final score therefore combines a Lucene score, the burst detection score, the retweet score, and the hashtag score. If no burst was detected, only the other three scores were considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters. In addition, we computed the Levenshtein distance in order to filter out overly similar tweets.

3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 3
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 610180ff8735df2efc62dd4bc64c4a93
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher by a burst detection score. As a second ranking factor, we computed the most often used hashtag per topic and ranked tweets higher that contained it (hashtag score). Note that this computation was done only on tweets posted before the query timestamp (no future evidence). Our final score therefore combines a Lucene score, the burst detection score, and the hashtag score. If no burst was detected, only the other two scores were considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters. In addition, we computed the Levenshtein distance in order to filter out overly similar tweets.

4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 4
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 902bfa494abed656cd1180796efb1612
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher by a burst detection score. Our final score therefore combines the Lucene score and the burst detection score. If no burst was detected, only the Lucene score was considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters. In addition, we computed the Levenshtein distance in order to filter out overly similar tweets. The differences from run "1" are the Levenshtein distance and more sophisticated stop word lists (slang, emoticons, hashtags...).
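
  A sketch of the Levenshtein-based near-duplicate filtering mentioned in runs 2-4 (the distance
  threshold is an assumption; the run descriptions do not give one):

    def levenshtein(a, b):
        """Classic dynamic-programming edit distance between two strings."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    def drop_near_duplicates(ranked_tweets, max_distance=10):
        """Keep a tweet only if it is not too similar to any tweet already kept."""
        kept = []
        for text in ranked_tweets:
            if all(levenshtein(text, seen) > max_distance for seen in kept):
                kept.append(text)
        return kept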

balanceRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: balanceRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: f1100c7b9efaab524d27aaa184420646
  • Run description: We balance relevance and recency based on the basic run results.

baseline

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baseline
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 7/29/2011
  • Type: manual
  • Task: main
  • MD5: 4b6365c72557ec9e85c8ebabbc7b37df
  • Run description: baseline

baseline1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baseline1
  • Participant: UPorto
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 43872afc732b87d4e5f43e91dfba1573
  • Run description: This run is mostly a baseline of our Terrier set-up. Uses only the text of the tweets.

baseline2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baseline2
  • Participant: UPorto
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6582dbdc6b1fc0cfa9cea5088ef87e52
  • Run description: Same as baseline1, but with better indexing.

baselineBM25

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baselineBM25
  • Participant: ULugano
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 4164b51855a8cdff45709bdc6b9081d8
  • Run description: This is a very basic run, to be used as baseline for (future) comparison with the improved ones. We used BM25 with standard settings to match the relevant tweets. The retrieved tweets were then filtered by score and time, so that the most relevant tweets for each day (from the query date) are preserved in a time-ordered way.

Basic

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Basic
  • Participant: Elly
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 0083b6367ba46d6504d3ab146f047257
  • Run description: In response to a query, the first stage is to automatically retrieve a list of tweets and rank them based on their similarity to the query. The top 1000 tweets were submitted as the basic run results.

basicWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: basicWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 8/3/2011
  • Type: automatic
  • Task: main
  • MD5: 3c4afb51ef4cb20c4c4fc6d3a7934f91
  • Run description: Baseline run: standard retrieval with some filters (English language tweets only, no retweets, no directed tweets, no tweets with less than 100 characters, no tweets that mainly consist of a URL).
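
  A rough sketch of filters like the ones listed above (the "mainly a URL" cut-off is an
  assumption, and the English check is a stand-in for whatever language identifier the run
  actually used):

    import re

    URL_RE = re.compile(r"https?://\S+")

    def keep_tweet(text, is_english):
        """Return True if the tweet passes the baseline filters."""
        if not is_english:                           # English-language tweets only
            return False
        if text.startswith("RT ") or text.startswith("@"):
            return False                             # no retweets, no directed tweets
        if len(text) < 100:                          # no tweets with fewer than 100 characters
            return False
        without_urls = URL_RE.sub("", text).strip()
        if len(without_urls) < 0.5 * len(text):      # assumed: tweet consists mainly of a URL
            return False
        return True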

ciirRun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun1
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: afd692120b840b9c6fb1f6b10badb6bc
  • Run description: Temporal query expansion model and no external evidence used.

ciirRun2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun2
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 06ec1e105218251e1049d5da5f85b002
  • Run description: Sequential dependence model, relevance feedback model adapted and quality-biased model used.

ciirRun3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun3
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 99f10871e3b089fc4ec4f7bd31769098
  • Run description: Based on previous best model, temporal query expansion model added

ciirRun4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun4
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b51e30495b7e491f1703c8c4f97f9481
  • Run description: Based on previous best model, query expansion model using hashtags added

clarity1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity1
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 9ebf1c09bbe4f9039248d07cd49614d1
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0).
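
  With K1=0 the BM25 term-frequency and document-length components cancel out, so a matching
  tweet simply accumulates the IDF of each matched query term; a small sketch of that reduced
  scorer (the IDF formula is the usual BM25 one, assumed here):

    import math

    def bm25_k1_zero(query_terms, doc_terms, doc_freq, num_docs):
        """BM25 with k1 = 0: term frequency and document length no longer matter,
        so each query term found in the document contributes only its IDF."""
        present = set(doc_terms)
        score = 0.0
        for term in query_terms:
            if term in present:
                df = doc_freq.get(term, 0)
                score += math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
        return score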

clarity2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity2
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: f7fe9a61ccce4e4fc4209b7c1f17a08c
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0). Query Expansion with Pseudo Relevance Feedback using the Top N results, with N determined on a per-query basis.

clarity3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity3
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: fc88733de82b8512807ae8f5d3248a30
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0). Query Expansion with Pseudo Relevance Feedback using the Top N results, with N determined on a per-query basis. EXTERNAL RESOURCE: Language Classifier (http://code.google.com/p/language-detection/) used to detect and remove non-English tweets. This resource is not timely with respect to the queries.

clarity4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity4
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 34320b496e0a61ea6e15b7faa729bd1b
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0). Query Expansion with Pseudo Relevance Feedback using the Top N results, with N determined on a per-query basis. The Top N results are used to estimate the temporal centre of the relevant tweets, and the scores of tweets far from this temporal centre are downweighted. EXTERNAL RESOURCE: Language Classifier (http://code.google.com/p/language-detection/) used to detect and remove non-English tweets. This resource is not timely with respect to the queries.
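
  A sketch of the temporal-centre downweighting step (the mean as the centre estimate and the
  Gaussian decay with a one-day scale are assumptions; the run description does not specify
  either):

    import math

    def temporal_center(top_n_timestamps):
        """Estimate the temporal centre of the relevant tweets from the top-N results."""
        return sum(top_n_timestamps) / len(top_n_timestamps)

    def downweighted(score, timestamp, center, scale_seconds=86400.0):
        """Reduce the score of tweets that are far from the temporal centre."""
        distance = abs(timestamp - center)
        return score * math.exp(-(distance / scale_seconds) ** 2)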

COMMITbase

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITbase
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 7/28/2011
  • Type: automatic
  • Task: main
  • MD5: dfac0a054ffa59ef1ab961c8886e16cd
  • Run description: Removed duplicate tweets, retweets, and tweets without links. Queries were expanded using time-sensitive query expansion, which considers terms from documents prior to the query time. Relevant tweets were retrieved using a language modeling retrieval model. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores.
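
  A sketch of a time-sensitive query expansion step in the spirit described above: only documents
  posted before the query time contribute expansion terms (the feedback depth, the tf-based term
  weighting, and the number of expansion terms are all assumptions):

    from collections import Counter

    def time_sensitive_expansion(query_terms, ranked_docs, query_time,
                                 feedback_depth=10, num_terms=5):
        """ranked_docs: list of (timestamp, tokens) in retrieval order.
        Pick expansion terms only from documents posted before query_time."""
        counts = Counter()
        used = 0
        for timestamp, tokens in ranked_docs:
            if timestamp >= query_time:
                continue                      # no future evidence
            counts.update(t for t in tokens if t not in query_terms)
            used += 1
            if used == feedback_depth:
                break
        expansion = [t for t, _ in counts.most_common(num_terms)]
        return list(query_terms) + expansion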

COMMITexp

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITexp
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/5/2011
  • Type: automatic
  • Task: main
  • MD5: 04944c9b60c108902ccb1d79226bd457
  • Run description: Removed duplicate tweets and retweets. We used learning to rank to learn the weights for a linear combination of retrieval scores of different models. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores. The four models were: (1) queries expanded using time-sensitive query expansion, which considers terms from documents prior to the query time, with relevant tweets retrieved using a language modeling retrieval model (the retrieval model of our baseline); (2) a language modeling retrieval model; (3) Boolean matching; (4) a Wikipedia-based Semantic Query Expansion (SQM), with tweets then retrieved using a language modeling retrieval model. This last model uses external data.

COMMITfilter

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITfilter
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/5/2011
  • Type: automatic
  • Task: main
  • MD5: 41ffecf8e6d801b7342dc1edde0f028e
  • Run description: This approach pre-filters potentially relevant tweets using a manually annotated training set. We trained a random forest on query-dependent (e.g. HITS authority and hub scores) and query-independent (e.g. friends, capitalisation, number of links) features. The prediction value of the random forest served as the filtering value. We then used learning to rank to learn the weights for a linear combination of retrieval scores of different models. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores. The six models were: (1) queries expanded using time-sensitive query expansion, which considers terms from documents prior to the query time, with relevant tweets retrieved using a language modeling retrieval model (the retrieval model of our baseline); (2) a language modeling retrieval model; (3) Boolean matching; (4) a Wikipedia-based Semantic Query Expansion (SQM), with tweets then retrieved using a language modeling retrieval model (uses external data); (5) link retrieval: we built a corpus of web documents linked in tweets, used a language modeling retrieval model, and mapped the links back to the tweets (uses external data); (6) SQM applied to the link retrieval setting (uses external data).

COMMITlinks

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITlinks
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/5/2011
  • Type: automatic
  • Task: main
  • MD5: c36d91135d325bb2f39f7dfd7bb2eadf
  • Run description: We used learning to rank to learn the weights for a linear combination of retrieval scores of different models. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores. The six models were: (1) queries expanded using time-sensitive query expansion, which considers terms from documents prior to the query time, with relevant tweets retrieved using a language modeling retrieval model (the retrieval model of our baseline); (2) a language modeling retrieval model; (3) Boolean matching; (4) a Wikipedia-based Semantic Query Expansion (SQM), with tweets then retrieved using a language modeling retrieval model (uses external data); (5) link retrieval: we built a corpus of web documents linked in tweets, used a language modeling retrieval model, and mapped the links back to the tweets (uses external data); (6) SQM applied to the link retrieval setting (uses external data).
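
  A sketch of the final combination used by the COMMIT runs as described: a linear mix of the
  per-model retrieval scores with learned weights, followed by a cut-off relative to the best
  retrieval score (the cut-off fraction and the weights here are placeholders, not the learned
  values):

    def combine_and_cut(model_scores, weights, cutoff_fraction=0.3):
        """model_scores: dict tweet_id -> list of scores, one per retrieval model.
        Returns (tweet_id, combined_score) pairs above a score-relative threshold."""
        combined = {tid: sum(w * s for w, s in zip(weights, scores))
                    for tid, scores in model_scores.items()}
        if not combined:
            return []
        threshold = max(combined.values()) * (1.0 - cutoff_fraction)
        ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
        return [(tid, s) for tid, s in ranked if s >= threshold]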

cyfrun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: cyfrun1
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b0f776e40c437fff031b9c4b6fff363a
  • Run description: Sum of query-word IDF and the tf score of the tweet.

cyfrun2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: cyfrun2
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 73e8485d949cfd5eaed54e6c796575c1
  • Run description: Tf score of the tweet.

dbpWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dbpWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 09f2aced5cd4599f891a29604bf04c39
  • Run description: DBpedia 3.6 dump, version 2011-01-17.

DFReeKLIM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIM
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 5745b2ccc4db35d97c58511b842165b9
  • Run description: Baseline. Built using a new retrieval model and pseudo relevance feedback QE.

DFReeKLIM30

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIM30
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 94e778d61c8a29b00d5ae65aae570ee5
  • Run description: Baseline. Built using a new retrieval model and pseudo relevance feedback QE. We fix the result list size to 30.

DFReeKLIMDC

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIMDC
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c53f041a475ebdb1c9537b5d1c1ce03e
  • Run description: Baseline. Built using a new retrieval model and pseudo relevance feedback QE. We use a heuristic approach to determine the result list size, for each query separately.

DFReeKLIMRA

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIMRA
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 73d9f244335e2054cab1117cb637f15d
  • Run description: Built using a new retrieval model and pseudo relevance feedback QE. We also apply a re-ranking technique to deal with both recency and relevance. We further use a heuristic approach to determine the result list size for each query separately.

dutirLmFb

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirLmFb
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 87b2af7d0d3d0c19a9da1c9d5abcb282
  • Run description: Language model, feedback, entropy, whether a link exists.

dutirMixFb

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirMixFb
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 43fcddf53dd458e68296bf09eb28b16d
  • Run description: Max of language model and tf*idf, feedback, entropy, whether a link exists.

dutirMixSp

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirMixSp
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: edcbbeb332a4b74fc357e924d32d379a
  • Run description: Mixture of language model and tf*idf, entropy, whether a link exists.

dutirTfidfFb

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirTfidfFb
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 141b50f5f88f32f5c55d4d0cd98f6f5b
  • Run description: tf*idf, feedback, entropy, whether a link exists.

EMAX

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: EMAX
  • Participant: TUD_DMIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 37f657bb58239dbc35f7e8696305db38
  • Run description: The index for query MB007 is polluted, while the other queries are run under the strict real-time condition. The scoring method is based only on information (entropy) provided by the tweets.
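
  The description only says the score is based on the entropy of the tweet; a minimal sketch of a
  term-distribution entropy, with how it feeds into the final ranking left open:

    import math
    from collections import Counter

    def term_entropy(tokens):
        """Shannon entropy (in bits) of the tweet's term distribution."""
        if not tokens:
            return 0.0
        counts = Counter(tokens)
        total = len(tokens)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    # term_entropy("bbc world service staff cuts cuts".split()) -> about 2.25 bits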

FASILKOM01

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM01
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 71cc79ab9ebfc080410ea7b090a56c16
  • Run description: This run does not use any future evidence or external resources.

FASILKOM02

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM02
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 8f47b67721351ebfef78d4ec54fcd133
  • Run description: This run uses phrase query identification (using a POS tagger), query expansion (from Google and the Twitter dataset), and a customized scoring function.

FASILKOM03

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM03
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2e0a7c807a0cdf99493dad950fbf8326
  • Run description: This run uses phrase query identification (using a POS tagger), query expansion generated from the dataset, and a customized scoring function.

FASILKOM04

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM04
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e60ea9f35a435c7aa66810c78889ff3a
  • Run description: This run uses phrase query identification (using a POS tagger), query expansion generated from Google search results, and a customized scoring function.

FDUNLP

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FDUNLP
  • Participant: FDUMED
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 84f7f40f41e38d46e8f6ae7d1d836735
  • Run description: Strictly real time, with a heuristic approach to identify the language of the tweet. Each tweet is given 2 features, and finally all the tweets are clustered.

FDUNLP2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FDUNLP2
  • Participant: FDUMED
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 90c4ca007596e45b949e254f6e691302
  • Run description: Strictly real time, with a fine tuned tool to identify the language of the tweet. Each tweet is given 2 features, and finally all the tweets are clustered.

Google1GNO

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Google1GNO
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: cc42b20e843c53fb75f10c690c8366ba
  • Run description: The given queries were searched using the Google Search API. Word-wise 1-grams of the titles of all the pages returned by Google were sorted in descending order of frequency. The top 5 1-grams were used as the new topic (the original topics were not added) and retrieval was done using Terrier-3.5 with these new topics.
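
  A sketch of the title 1-gram selection step (the actual Google Search API call is omitted;
  `titles` stands for the result-page titles it returned):

    from collections import Counter

    def top_title_unigrams(titles, k=5):
        """Count word-wise 1-grams over all result titles, most frequent first."""
        counts = Counter(word for title in titles for word in title.lower().split())
        return [word for word, _ in counts.most_common(k)]

    # Google1GNO uses just these 5 terms as the new topic; IRSIGoogle1G adds them
    # to the original topic, and IRSIGoogle2G does the same with 2-grams.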

gus

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gus
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b322f122a6ef9a4bc12088e08e224019
  • Run description: This run uses no external or future evidence. It is a variant of the temporal smoothing method described in Efron and Golovchinsky (2011)--a language modeling variant (with no forward-looking corpus stats). No document priors are used.

gust

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gust
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 44fed9a322a7525f7c0207a17c1080f5
  • Run description: This run uses no external or future evidence. It is a variant of the temporal smoothing method described in Efron and Golovchinsky (2011)--a language modeling variant (with no forward-looking corpus stats). The run uses a "temporal" document prior. That is, the prior is estimated by judging the fit of the document's pseudo-query against an exponential distribution.
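
  One plausible reading of the temporal prior, as a heavily hedged sketch: take the ages (time
  before the query) of the documents matching the tweet's pseudo-query and measure how well they
  fit a maximum-likelihood exponential distribution; how that fit is turned into a document prior
  is not given in the description and is left out here:

    import math

    def exponential_fit_log_likelihood(ages):
        """Log-likelihood of the pseudo-query matches' ages under an
        exponential distribution with the MLE rate parameter."""
        if not ages:
            return float("-inf")
        ages = [max(a, 0.0) for a in ages]
        lam = len(ages) / max(sum(ages), 1e-9)    # MLE rate
        return sum(math.log(lam) - lam * a for a in ages)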

gustc

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gustc
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: a5bdf6b28e82395058dd5c2de6f677a3
  • Run description: This run uses no external or future evidence. It is a variant of the temporal smoothing method described in Efron and Golovchinsky (2011)--a language modeling variant (with no forward-looking corpus stats). The run uses a "temporal" document prior. That is, the prior is estimated by judging the fit of the document's pseudo-query against an exponential distribution. Also uses a second prior based on the clustering coefficient of the graph of words in a relevance model induced from the document's pseudo query.

gut

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gut
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c811a00053f4cc72dad873637782bb65
  • Run description: This run uses no external or future evidence. It uses a query likelihood model (with no forward-looking corpus smoothing) supplemented with an independent evidence source based on the likelihood that the temporal profile of the document's pseudo-query also generated the temporal profile of the query.

hitWId

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: hitWId
  • Participant: HIT_LTRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: a7e2b76aab0aec88da2fcf99400e104b
  • Run description: This run aims to test the effectiveness of score decay.

hitWIt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: hitWIt
  • Participant: HIT_LTRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 70f705b120725eb40d4f13f75bcdba57
  • Run description: This run focused on automatic threshold selection.

ICTNET11MBR1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR1
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 5c0837b7adb9dfaeac20d2b7d1c5b7e6
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we developed a semi-supervised algorithm to expand the query using the content of tweets posted before the query time. No external or future information is used.

ICTNET11MBR2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR2
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: ab11a48d0a3a9fcf3cf0b0f61df16524
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we used the same query expansion as ICTNET11MBR1, but a different enhanced BM25.

ICTNET11MBR3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 98142713c2f4ca34cfe77cad7d008ff9
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we expand the query with external information obtained via a Google meta-search technique; some Wikipedia articles are also included as a complement. No future information is used.

ICTNET11MBR4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 8cd46d79e2d80acbeab42a36518a0ac7
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we expand the query over the whole available tweet collection and combine it with the external expansion from ICTNET11MBR3. We therefore use both external and future information in this run.

IDEAACTQE

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEAACTQE
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: f39fd9a63bbb96a4f6101d2f460f51c0
  • Run description: A query expansion run using content, authority, and time, based on a field-based retrieval model.

IDEABASIC

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEABASIC
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 72b2a47cec7d7bf3d87afa63d103c0d7
  • Run description: A baseline run using a field-based retrieval model.

IDEABASICACT

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEABASICACT
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b22c690fb69ceaddf7996093a476af0e
  • Run description: A basic run using content, authority, and time, based on a field-based retrieval model.

IDEABASICQE

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEABASICQE
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 108280222fc6c04707b72857694aaeac
  • Run description: A query expansion run using a field-based retrieval model.

ikmRun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ikmRun1
  • Participant: ikm101
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 6f5bf6f36b5f3dc578bd67565e90f8f8
  • Run description: We use information from links.

InL2c1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: InL2c1
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: e4d60cb5d2bac0c2d970b3f27ad424ce
  • Run description: Retrieval was done with the original queries using Terrier-3.5.

iritfd1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: iritfd1
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 586e236e16a4c2a275017f6f2564cc21
  • Run description: This run combines the score of the Lucene search engine with a set of feature scores: popularity of the tweet, length of the tweet, exact term matching, presence of a URL, frequency of the URL, hashtag score, number of tweets for a twitterer, and number of mentions for a twitterer. Queries were expanded with keywords from news articles published before the query timestamp.

iritfd2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: iritfd2
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 713df52d4cce2be1d2ee167538b5d819
  • Run description: This run combines the score of the Lucene search engine with a set of feature scores: popularity of the tweet, length of the tweet, exact term matching, presence of a URL, frequency of the URL, hashtag score, number of tweets for a twitterer, and number of mentions for a twitterer.

IRSIGoogle1G

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IRSIGoogle1G
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 6a3914edd79c1533f3422c8e7b554ad7
  • Run description: The given queries were searched using the Google Search API. Word-wise 1-grams of the titles of all the pages returned by Google were sorted in descending order of frequency. The top 5 1-grams were added to the original topics and retrieval was done using Terrier-3.5 with these new topics.

IRSIGoogle2G

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IRSIGoogle2G
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 5fcdaa84a0b293215226f3f67707141c
  • Run description: The given queries were searched using the Google Search API. Word-wise 2-grams of the titles of all the pages returned by Google were sorted in descending order of frequency. The top 5 2-grams were added to the original topics and retrieval was done using Terrier-3.5 with these new topics.

isiFD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFD
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 14d026d65861123bcc77ce3598bf1e11
  • Run description: Basic keyword search using "full dependence" variant of MRF retrieval model.

isiFDL

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFDL
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 01752e408ca034471f1ff97606380e83
  • Run description: Learning to rank model [base ranking function = "full dependence" variant of the MRF model].

isiFDRM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFDRM
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2d923f243a6f7b7aeb620d55a3e6afe1
  • Run description: "Full dependence" variant of the MRF model + pseudo relevance feedback.

isiFDRML

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFDRML
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3f6d5e58dbbe9eb02fc92c0ac41af198
  • Run description: Learning to rank model [base ranking function = "full dependence" variant of the MRF model + pseudo relevance feedback].

kanopeRun

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: kanopeRun
  • Participant: KanopeReunion
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 7183078cb4d6eaaa3e0f09b8f3627e12
  • Run description: Our approach merged (i) an indicator of semantic similarity, approximated using the Reflective Random Indexing (RRI) semantic space model (Cohen, Schvaneveldt & Widdows, 2010), with (ii) the chronological distance separating tweets from a given query. RRI is a semantic space model that has demonstrated performance as good as LSA or LDA, but it implements the distributional hypothesis via random projection. This makes RRI very efficient in terms of computational resources, which is particularly attractive considering the large amount of data from social media.

KAUSTBase

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTBase
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e71412ef748c91258605008067cacda6
  • Run description: Baseline run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Tweets are ranked by content similarity (using IDF) and recency.

KAUSTExp

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTExp
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 5c9d969495878a9335ce81c06cef91f3
  • Run description: Expansion without rerank run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Tweet expansion: expanded URLs and hashtags with most-frequent co-occurring terms. Expansion terms are added to the tweets at indexing time. - Tweets are ranked by content similarity (using IDF) and recency.

KAUSTExpRrnk

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTExpRrnk
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: de700fe8b67550de607605463067f7fa
  • Run description: Expansion with rerank run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Tweet expansion: expanded URLs and hashtags with most-frequent co-occurring terms. Expansion terms are added to the tweets at indexing time. - Computed an estimation of user topic authority by building the user's term-profile. - Computed an estimation of user popularity based on frequency of being replied-to, mentioned, and retweeted. - Tweets are ranked first by content similarity (using IDF) and recency. Then 4 other features are used to rerank: retweet frequency of a tweet, frequency of the URL (if one exists), estimated user popularity, and estimated user topic-authority.

KAUSTRerank

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTRerank
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e4e36171200802841c4ab4152de9b6e9
  • Run description: Rerank without expansion run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Computed an estimation of user topic authority by building the user's term-profile. - Computed an estimation of user popularity based on frequency of being replied-to, mentioned, and retweeted. - Tweets are ranked first by content similarity (using IDF) and recency. Then 4 other features are used to rerank: retweet frequency of a tweet, frequency of the URL (if one exists), estimated user popularity, and estimated user topic-authority.

LJQO10

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LJQO10
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 99b14a6ad47ea00b77ee2d8d6c9bba26
  • Run description: Uses only the query, without any external or future information.

LJQO5

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LJQO5
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 5d9510a4dfa98530c0f92d01c5e55f94
  • Run description: Uses only the query, without any external or future information.

LMOP10

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LMOP10
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: b6ddc9331327f12c2dbb438e1c4cd0b5
  • Run description: Uses only the query, without any external or future information.

LMOP5

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LMOP5
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 7dddb79e0139c717497bbb58107d98c4
  • Run description: Uses only the query, without any external or future information.

LThresh

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LThresh
  • Participant: syles
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 9135eceadc69c22f8987b19c1534cbbe
  • Run description: - Language identifier trained on Wikipedia

manualWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: manualWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 7/12/2011
  • Type: manual
  • Task: main
  • MD5: 4f2323f91c7ed055d74a589800b39413
  • Run description: An assessor manually searched through the corpus (filtered by language automatically; English only) and located interesting tweets. Allowed time per topic: 5 minutes. To increase the number of tweets retrieved per topic, a single query was submitted at the end of the 5 minute interval and all tweets returned with a tweetid lower than the lowest manually retrieved tweet were appended to the list. No other external sources were used: the context/circumstances of the topics were learnt by the assessor while assessing the tweets.

melblt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: melblt
  • Participant: UniMelbLT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1228faffe792fc1c63f1587d0f14d090
  • Run description: Baseline system, using language identification and lexical normalisation and off-the-shelf IR

MONASH1NEW

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: MONASH1NEW
  • Participant: monash
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: e468cbdd5a3b63d7a0d5dce5c1a871da
  • Run description: The MONASH1NEW run works in a similar way to MONASH2NEW, but to enhance performance we modified the values returned for some of the tweet characteristics described in the MONASH2NEW run.

MONASH2NEW

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: MONASH2NEW
  • Participant: monash
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 26d651fd17c4ad24af190668727f19cb
  • Run description: In the MONASH2NEW run, we take some of the tweet features into consideration: if a tweet has a hashtag, an @ tag, or a URL, specific values are added to the scoring function so that the tweet weighs more than tweets that do not have them. Our system also calculates the IDF (Inverse Document Frequency) value of the tweet content and takes this value into account. In addition, the number of tweets a user writes and the length of each tweet give a good indication of the importance and relevance of that tweet, so both are taken into account. Finally, this run uses a method to detect the language of tweets: if a tweet's language is English, it is given more weight, which places English tweets above non-English tweets.
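
  A sketch of a scoring function along the lines described (every weight below is illustrative;
  the run does not publish its actual values):

    def monash_style_score(text, idf_sum, author_tweet_count, is_english):
        """Add feature bonuses on top of the IDF score of the tweet content."""
        score = idf_sum
        if "#" in text:
            score += 0.5                            # has a hashtag
        if "@" in text:
            score += 0.3                            # has an @ tag
        if "http://" in text or "https://" in text:
            score += 0.5                            # has a URL
        score += 0.1 * min(author_tweet_count, 10)  # author activity, capped
        score += 0.01 * len(text)                   # tweet length
        if is_english:
            score += 1.0                            # English tweets ranked above non-English
        return score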

MorpheusRun1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: MorpheusRun1
  • Participant: Morpheus
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2ac13ae226dd2afad39aaa4586e78038
  • Run description: Our system uses a non-traditional approach to real-time search. First we combine tweets into tweet bundles (super tweets). These super tweets give us a larger document size for our topic modeling runs; document size is a main drawback of using topic models with microblogs. For searches more than an hour in the past, we run batch Latent Dirichlet Allocation on hour-long intervals of tweets. To find the best super tweets we use Kullback-Leibler divergence on the topic distributions. To find the best tweets we use recency and word occurrence.
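
  A sketch of the Kullback-Leibler comparison between topic distributions used to pick the best
  super tweets (the smoothing epsilon is an assumption to keep the logarithm defined):

    import math

    def kl_divergence(p, q, eps=1e-12):
        """KL(p || q) for two topic distributions over the same K topics."""
        return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

    # A super tweet whose LDA topic distribution has low divergence from the
    # query's distribution is considered a good topical match.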

mulnewWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: mulnewWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 67646c18aa6376cdd7607e88750c372f
  • Run description: 1. DBpedia 3.6, published 2011-01-17. 2. Short descriptions of news articles crawled from 62 news RSS feeds (from Jan. 21st to Feb. 10th).

myRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: myRun
  • Participant: Purdue_IR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 4157b6e8049492def50d3c134f51c562
  • Run description: Use belief propagation to select the exemplar of each tweet. Boost score of each tweet according to its exemplar.

myrun2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: myrun2
  • Participant: Purdue_IR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: db42d924a19d287374fdcc7e8be88a35
  • Run description: Use query expansion to reformulate query. Use belief propagation to select the exemplar of each tweet. Boost score of each tweet according to its exemplar. Use the same similarity metrics as in run1

myrun3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: myrun3
  • Participant: Purdue_IR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 690b48419ae8ce109c26a8d951bd06ee
  • Run description: Use belief propagation to select the exemplar of each tweet. Boost score of each tweet according to its exemplar. Use a different similarity metric than run1.

Nestor

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Nestor
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 5ad7dcf7dab4591ce895b43649de19f4
  • Run description: We use a Bayesian network retrieval model for tweet search that considers, in addition to textual similarity measures, the social influence of microbloggers, the time magnitude, the tweet length, and hashtag occurrence. Results are filtered by the number of query terms present in the tweet. No future or external features are used in this system.

NestorS

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: NestorS
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 50e9d8286f7dbcfbd0f7c31de812c4cf
  • Run description: We use a Bayesian network retrieval model for tweet search that considers, in addition to textual similarity measures, the time magnitude, the tweet length, and hashtag occurrence. This system ignores the social influence of microbloggers. Results are filtered by the number of query terms present in the tweet. No future or external features are used in this system.

normal

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: normal
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6673f7d5991196223871f24eacf5a775
  • Run description: We retrieve tweets from indexes corresponding to each query time.

nQCRIwoTag

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: nQCRIwoTag
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c0ca23ea4095a12ba8a1c594d930e19b
  • Run description: Without automatically induced tags; ordered.

nQCRIwTag

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: nQCRIwTag
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: bf2a0cd0029e032dcb9ccb2b362e8da7
  • Run description: With automatically induced tags; ordered.

omarRun

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: omarRun
  • Participant: DLDE
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 6fba88786362d1e431415e1c1ab07be4
  • Run description: These results use only the tweets themselves; we apply a simple model and use some heuristic rules to prune the results.

PKUICST

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 4af1811b38070cdb3f2d38bb403ac363
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is selected according to a score threshold.

PKUICST2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST2
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 4ed23ac4a71a5267d7a4250b3c27f87d
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is 30/31 per query.

PKUICST3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST3
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1887446e5dca8892c15e51cae519697a
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is 100/101 per query.

PKUICST4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST4
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: cc098105eb33e72bdc219763652bc7c8
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is 300/301 per query.

PL2Bo1SDExt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2Bo1SDExt
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: c057d38c54d1a1555b2775a08e50e711
  • Run description: PL2 DFR algorithm with the Sequential Divergence from Randomness based dependence model and Bo1 query expansion, using linked HTML pages as part of the tweet.

PL2NoQENoDM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2NoQENoDM
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: bc5fe4d773b1d6664a8cdd4012d99ea2
  • Run description: PL2 DFR algorithm baseline.

PL2NoQeSd

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2NoQeSd
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: f81c980eab64f23123c111101f8437a9
  • Run description: PL2 DFR algorithm with Sequential Divergence from Randomness based dependence model.

PL2NoQeSdExt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2NoQeSdExt
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: 94001c71957e741b5ed0ad6dc8ef88d5
  • Run description: PL2 DFR algorithm with Sequential Divergence from Randomness based dependence model where the linked HTML pages are also part of the document.

PRISrun1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun1
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 48cd32d42dbb1c898f8fcfb39feb7a39
  • Run description: 1. There are no future or external resources used in this run. 2. Filter out irrelevant tweets using the WAF (Word Activation Force) model. 3. Some parameters and thresholds in the model are selected manually. 4. Operates in a strict real-time fashion.

PRISrun2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun2
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 72297b2efe89dea63764c41ce4cafe2d
  • Run description: 1. Web pages linked from tweets are used in this run. 2. Some parameters and thresholds in our model are selected manually. 3. Operates in a strict real-time fashion. 4. Filter out irrelevant tweets using the WAF (Word Activation Force) model.

PRISrun3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun3
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 877300bfbfa0413876c46be42fb904e3
  • Run description: 1. Corpus after the query time is used in this run. 2. The query system filters the desirable tweets automatically. 3. Filter out irrelevant tweets using the WAF (Word Activation Force) model.

PRISrun4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun4
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c7b591bc0313002a37e8816c50e2bce7
  • Run description: 1. The corpus after the query time is used in this run. 2. Web pages linked from tweets are used in this run. 3. The query system filters the desirable tweets automatically. 4. Irrelevant tweets are filtered out using the WAF (Word Activation Force) model.

QCRIwoTagOrg

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: QCRIwoTagOrg
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e9f9bb1ea861e3eaf86c814ca7d7b816
  • Run description: No induced hashtags; original ranking.

QCRIwTagOrg

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: QCRIwTagOrg
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b1784eb259eda2dd2b99af342d1f5f4c
  • Run description: Uses automatically induced hashtags; original ranking.

qHtagBaseRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: qHtagBaseRun
  • Participant: L3S
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: ddb5dcb3ce242408005b51b477cfc79c
  • Run description: This run represents a simple baseline that takes the words in the topic title and uses them as hashtags to select and rank the tweets (see the sketch below).
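A minimal sketch of this kind of title-word hashtag baseline, assuming tweets are plain strings; the tokenization, scoring, and tie-breaking here are illustrative assumptions, not the participants' actual code.

```python
# Hypothetical sketch: turn topic-title words into hashtags and rank tweets
# by how many of those hashtags they contain (illustrative, not the L3S code).

def rank_by_title_hashtags(topic_title, tweets):
    hashtags = {"#" + w.lower() for w in topic_title.split()}
    scored = []
    for tweet in tweets:
        tokens = {t.lower().strip(".,!?") for t in tweet.split()}
        overlap = len(hashtags & tokens)
        if overlap:
            scored.append((overlap, tweet))
    # Tweets sharing more title hashtags rank higher.
    return [t for _, t in sorted(scored, key=lambda x: -x[0])]

if __name__ == "__main__":
    tweets = ["Cuts debated at #BBC #World Service", "unrelated chatter"]
    print(rank_by_title_hashtags("BBC World Service cuts", tweets))
```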

qRefLThresh

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: qRefLThresh
  • Participant: syles
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 7d7552846d2cfa4ba3bf1682f4832939
  • Run description: MSR Web N-grams for query segmentation / reformulation / weighting; TF-IDF on the whole corpus (Lucene); language detector trained on Wikipedia.

refBalRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: refBalRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: b818d50b444a834b8c92925b417da58b
  • Run description: This is the result of query reformulation; we balance relevance and recency.

refRelRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: refRelRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 081f58f3cf19402ff3c1a94af2d07684
  • Run description: This is the result of query reformulation.

relevanceRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: relevanceRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 2ffd2ddd5e2fe295e994bcdf57b60750
  • Run description: This is the basic run.

RFD

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RFD
  • Participant: Elly
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: a2045a1e9054bd4efed68c04340b0f8c
  • Run description: The RFD model is a pattern-based model. Unlike term-based models, the weight of terms in the RFD model is based on the weight of the extracted patterns. The top 10 tweets were selected as positive feedback. This pseudo-relevance feedback was used in the RFD model to generate the feature set. The feature set was then used to rank all the tweets, and the top 1000 ranked tweets were submitted as the final results.

ri

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ri
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 54eb009021921ab317f3a716ac45db4e
  • Run description: We use the JSON-format tweets to obtain user information and tweet descriptions. Our run re-ranks tweets by learning to rank, with careful attention to their topic and interestingness.

rit

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: rit
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 3d1576d11fb36ba2fd04ecfd69405eb9
  • Run description: We use the JSON-format tweets to obtain user information and tweet descriptions. Our run re-ranks tweets by learning to rank, with careful attention to their topic, interestingness, and time.

rit3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: rit3
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 0f501622e54ae9bf98e4d001095babb3
  • Run description: We use the JSON-format tweets to obtain user information and tweet descriptions. Our run re-ranks tweets by learning to rank, with careful attention to their topic, interestingness, and time. We also use WordNet for query expansion.

RMITAR

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITAR
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 43595f1783eb6716124606040a0f224e
  • Run description: English dictionary to filter out non-English tweets.

RMITM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITM
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 25743aa8a5020827d1e6a454d7f95ffe
  • Run description: External sources: an English dictionary to filter out non-English tweets.

RMITMR

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITMR
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 029cb02fccff09deb252bc7595bcb68c
  • Run description: English dictionary to filter out non-English tweets.

RMITMRR

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITMRR
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: bd5ce14d6d4b820f984266646a647726
  • Run description: English dictionary to filter out non-English tweets.

Rocchio

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Rocchio
  • Participant: Elly
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 8a50379eec1de900e8679b1136c971da
  • Run description: The Rocchio algorithm was used to build user profiles from pseudo-relevance feedback. The feedback was then used to re-rank all tweets, and the top 1000 ranked documents were submitted as the final results (a sketch follows below).
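A minimal sketch of Rocchio-style pseudo-relevance feedback over bag-of-words counts; the alpha/beta weights and the vector representation are illustrative assumptions, not the values used in this run.

```python
from collections import Counter

# Illustrative Rocchio sketch: build a profile from the query plus
# pseudo-relevant tweets, then re-rank all tweets by dot product with it.
def rocchio_profile(query_terms, feedback_tweets, alpha=1.0, beta=0.75):
    profile = Counter({t: alpha for t in query_terms})
    for tweet in feedback_tweets:
        tf = Counter(tweet.lower().split())
        for term, freq in tf.items():
            profile[term] += beta * freq / len(feedback_tweets)
    return profile

def rerank(profile, tweets, top_k=1000):
    def score(tweet):
        tf = Counter(tweet.lower().split())
        return sum(profile.get(term, 0.0) * freq for term, freq in tf.items())
    return sorted(tweets, key=score, reverse=True)[:top_k]
```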

RTB

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RTB
  • Participant: TUD_DMIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6c6ce650e0618c06c02e9a39d9518bf9
  • Run description: The index for query MB007 is polluted, while the other queries are run under the strict real-time condition. The scoring method balances the time dimension and the information dimension.

run1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run1
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/6/2011
  • Type: manual
  • Task: main
  • MD5: d77899e073f8b0925519c3225486a591
  • Run description: p@30

run1a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run1a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c09215a22f2c284fe27ba712ca440548
  • Run description: Pseudo relevance feedback, no weighting.

run1fix

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run1fix
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: manual
  • Task: main
  • MD5: e15b8b1136c619b750234725ef978eee
  • Run description: run1fix for p@30

run2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run2
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: a121efc2d844de57d23aba709de3d2f2
  • Run description: run2: combines author and cluster information.

run2a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run2a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 92f3b72c372adfc1324106d459a4c52e
  • Run description: Pseudo relevance feedback with weighting.

run3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run3
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3cb8117ee7a56c197df59fd6eb0eed36
  • Run description: Features: sum of query word IDF; has URL; URL MIME/text; URL MIME/audio; URL MIME/video; URL MIME/image; URL MIME/application; has hashtag (#); average word length of tweet; word length variance of tweet; has @; stopword percentage; BM25 score of tweet; TF score of tweet; TF-IDF score of tweet; BM25 score of URL title; TF score of URL title; TF-IDF score of URL title; BM25 score of URL page; TF score of URL page; TF-IDF score of URL page; language model score of tweet; document length; retweet count; tweet time.

run3a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run3a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2ad11f9c9b9ca5a79b752bf36c798532
  • Run description: This run uses patterns generated from the query and reweights them based on term frequency.

run4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run4
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: caf81b5b47b05c3e93d06f1efb4d6557
  • Run description: Features: sum of query word IDF; has URL; has hashtag (#); average word length of tweet; word length variance of tweet; has @; stopword percentage; BM25 score of tweet; TF score of tweet; TF-IDF score of tweet; language model score of tweet; document length; retweet count; tweet time.

run4a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run4a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 540aec82c862294382744629e1947265
  • Run description: This run uses patterns generated from the query, reweights them, and combines them with term weights based on term frequency.

RunAll

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RunAll
  • Participant: xmuPRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: c3996aef27f879bc7c9b8965ff0dafab
  • Run description: The index is built over all tweets, expanding tweet contents by extracting content from their linked web pages. Named entity extraction is used. Query expansion is done by extracting representative words from the most relevant web page, pseudo-relevance feedback, and representative hashtag keywords. Non-informative tweets are filtered by an ensemble ranking of author PageRank, author HITS authority, and pure retweets. Non-English tweets are filtered out.

RunFut

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RunFut
  • Participant: xmuPRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: adf99031ef313198daf392a9d5d936ba
  • Run description: The index is built over all tweets, including only tweet contents. Named entity extraction is used. Query expansion is done by pseudo-relevance feedback and representative hashtag keywords. Non-informative tweets are filtered by an ensemble ranking of author PageRank, author HITS authority, and pure retweets. Non-English tweets are filtered out.

runNeMIS

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: runNeMIS
  • Participant: NEMIS_ISTI_CNR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2042cf44a94b170cb51dd55ba5f8d825
  • Run description: Python retrieval system based on the Whoosh library. The system uses three separate indexes: one with the text of the tweet; one with the distinct words composing any hashtag in the tweets (multi-word hashtags are automatically split with a Viterbi-based algorithm); and one with the titles of pages linked from tweets. The retrieval scores from the three indexes are linearly combined, and a filtering threshold is used to filter out low-score tweets (see the sketch below).
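A minimal sketch of the described score combination, assuming the per-index retrieval scores have already been computed; the mixing weights and the filtering threshold are placeholders, not the values actually used by the participants.

```python
# Illustrative combination of the three index scores with a filter threshold.
def combine_index_scores(text_score, hashtag_score, link_title_score,
                         weights=(0.6, 0.2, 0.2), threshold=0.1):
    combined = (weights[0] * text_score
                + weights[1] * hashtag_score
                + weights[2] * link_title_score)
    # Tweets whose combined score falls below the threshold are dropped.
    return combined if combined >= threshold else None
```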

runNeMISext

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: runNeMISext
  • Participant: NEMIS_ISTI_CNR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 7ed10b983f96eb27b0f4845300e61109
  • Run description: Python retrieval system based on the Whoosh library. The system uses three separate indexes: one with the text of the tweet; one with the distinct words composing any hashtag in the tweets (multi-word hashtags are automatically split with a Viterbi-based algorithm); and one with the titles of pages linked from tweets. This run uses a stylometric score function that compares each tweet with a word-distribution model extracted from a collection of Reuters news (external resource); this function makes it possible to filter out poorly written tweets. The retrieval scores from the three indexes and the stylometric score are linearly combined, and a filtering threshold is used to filter out low-score tweets.

RunPure

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RunPure
  • Participant: xmuPRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 329028ff3a7e6fbe5842ac702f527494
  • Run description: The index is built over tweets posted before the query time, including only tweet contents. Named entity extraction is used. Query expansion is done by pseudo-relevance feedback on Lucene search results. Non-informative tweets are filtered by an ensemble ranking of author PageRank and pure retweets. Non-English tweets are filtered out.

scurtuRun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: scurtuRun1
  • Participant: Vitalie_Scurtu
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1c561553beedbd06c2fc7a296ad5954d
  • Run description: The system does some basic parsing to identify tweets/retweets, decomposes each tweet into a list of features extracted purely from text (such as mentions, links, hashtags, etc.), and performs language recognition. For querying, it uses a simple keyword-reduction strategy, and as a scoring formula it uses Lucene's DisMax query-document similarity.

sielrun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun1
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3e71246ee2177f69105e05224e65a645
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 2:3 (see the sketch below). Does not use any external sources.
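A minimal sketch of the stated 2:3 combination, assuming both scores are available and comparable in scale; normalizing by the weight sum is an assumption, since the description does not specify it.

```python
# Illustrative 2:3 weighting of the Lucene score and the k-means cluster score.
def sielrun1_score(lucene_score, cluster_score, lucene_w=2.0, cluster_w=3.0):
    # Normalizing by the weight sum is an assumption; the other sielrun
    # variants only change the ratio (5:2, 3:2, 7:2).
    return (lucene_w * lucene_score + cluster_w * cluster_score) / (lucene_w + cluster_w)
```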

sielrun2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun2
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: f63debe27b4cfcc68801f380e6e85e48
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 5:2. Does not use any external sources.

sielrun3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun3
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 703ed32b19a47492e04d9a2792e1b65e
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 3:2. Does not use any external sources.

sielrun4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun4
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 31722c75077e7c0e59a18999b78a7eb2
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 7:2. Does not use any external sources.

SienaCL1B

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCL1B
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 0ec65fa9348180621d758d32480e18ef
  • Run description: Utilized content of links in tweets

SienaCL31

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCL31
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 578a65ea6a00fbbe1c82b0d356c90947
  • Run description: Utilized Google for the query expansion module

SienaCL342

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCL342
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 563e765898ce0514ccbdffeaf86ab549
  • Run description: Content of URLs within tweets utilized; WEKA machine learning used (e.g. information retrieved via the Twitter API); query expansion module utilizing Google.

SienaCLbase

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCLbase
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/8/2011
  • Type: automatic
  • Task: main
  • MD5: a500d58e3ef37e960ff7bb2029cceedc
  • Run description: Simple baseline run using Lucene. Non-English tweets removed; textese expanded; strict RTs removed.

simfoll

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: simfoll
  • Participant: UGLA_D
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 8a92b27554c36a11348f481ad6a9e8fb
  • Run description: 1) Full-text search through MySQL (this should be a flavour of BM25); results contain only tweets earlier than the tweet time given in the topics. 2) Removed straight retweets. 3) Boosted scores depending on the number of followers: score := score + log(#followers^(score/C) + 1), with C = 2; given that all users in our tweet set have followers, the use of log might not be ideal (a sketch of this boost follows below). 4) Removed tweets that contain substrings identical to higher-scored ones, where the length of the substring is > 60% of the length of the tweet (approximately).
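A direct transcription of the follower boost in step 3, assuming a natural logarithm (the log base is not specified in the description).

```python
import math

# Follower boost as stated: score := score + log(#followers^(score/C) + 1), C = 2.
def boost_by_followers(score, followers, C=2.0):
    return score + math.log(followers ** (score / C) + 1)

# Example: a tweet with base score 3.0 from a user with 500 followers.
print(boost_by_followers(3.0, 500))
```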

simfollTP01

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: simfollTP01
  • Participant: UGLA_D
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b169bb09325e8b82cc28f7ad40d2cbef
  • Run description: Uses a novel temporal pseudo-relevance feedback technique (based on the retrieval of our other run, simfoll) to attempt to expand the query with terms that occur strongly in the same time periods. Term temporality was extracted in 2-hour intervals for the duration of the collection, with the algorithm attempting to identify and expand the query with other terms that had similar temporal characteristics. Lucene with a vector space retrieval model was used for the retrieval in this run.

sylesNoRes

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sylesNoRes
  • Participant: syles
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: cba58274e229a8ca581af0bb396aa0ff
  • Run description: Cosine similarity with a modified tf-idf formula for term weights: weight(term) = 1 * log(1 / (df + 1)) / ave(tf) / (var(timef) + 1), where timef is the time frequency (see the sketch below). Documents not containing query terms that start with a capital letter (e.g. New York) are filtered out. Stopwords (the most frequent words, computed from tweets up to the query time) are filtered out. Identical tweets are filtered using MinHash.
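The term-weighting formula transcribed as stated, assuming a natural logarithm and a positive ave(tf); the statistics themselves (document frequency, average term frequency, variance of the time frequency) are assumed to be precomputed elsewhere.

```python
import math

# weight(term) = 1 * log(1 / (df + 1)) / ave(tf) / (var(timef) + 1), as stated.
def term_weight(df, avg_tf, timef_var):
    # df: document frequency; avg_tf: average term frequency (> 0 assumed);
    # timef_var: variance of the term's time frequency.
    return 1 * math.log(1.0 / (df + 1)) / avg_tf / (timef_var + 1)
```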

tfTP01

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: tfTP01
  • Participant: UGLA_D
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c30413773b6275230fb3a32fce409926
  • Run description: Uses a novel temporal pseudo-relevance feedback technique (based on a TF-only retrieval) to attempt to expand the query with terms that occur strongly in the same time periods (see the sketch below). Term temporality was extracted in 2-hour intervals for the duration of the collection, with the algorithm attempting to identify and expand the query with other terms that had similar temporal characteristics. Only temporal information prior to the query was used by the temporal PRF algorithm (so this run conforms to the real-time requirements). Lucene with a TF-only vector space retrieval model was used for the initial and expanded retrieval in this run.
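A minimal sketch of temporal pseudo-relevance feedback with 2-hour bins; the use of cosine similarity to compare temporal profiles is an assumption, since the description only says the expansion terms have "similar temporal characteristics".

```python
import math
from collections import defaultdict

def temporal_profile(timestamps, bin_seconds=2 * 3600):
    """Histogram of a term's occurrences in 2-hour bins (Unix timestamps)."""
    profile = defaultdict(int)
    for ts in timestamps:
        profile[int(ts // bin_seconds)] += 1
    return profile

def cosine(p, q):
    dot = sum(v * q.get(b, 0) for b, v in p.items())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def expand_query(query_terms, term_timestamps, k=5):
    """Add the k terms whose temporal profiles best match the query's profile."""
    query_profile = defaultdict(int)
    for term in query_terms:
        for b, c in temporal_profile(term_timestamps.get(term, [])).items():
            query_profile[b] += c
    candidates = [(cosine(query_profile, temporal_profile(times)), term)
                  for term, times in term_timestamps.items()
                  if term not in query_terms]
    return list(query_terms) + [t for _, t in sorted(candidates, reverse=True)[:k]]
```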

udelIndri

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: udelIndri
  • Participant: udel
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 5a8ba18aefa7aeb66fe1579acbfc319a
  • Run description: Basic Indri run.

udelLucene

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: udelLucene
  • Participant: udel
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: edc77285fea3fb5fa418b571833774b9
  • Run description: Basic Lucene run.

UDMicroComb1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroComb1
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 879f50d5dc8bdca64b38bf750c16944d
  • Run description: Use time-sensitive weighting to favor tweets in "popular discussed" period. Use document-length-weighting to favor long and high-term-IDF tweets in order to improve "interestingness" (The only future information we used is term IDF). Use pseudo feedback.

UDMicroComb2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroComb2
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: a54a8349d750deff767d4d31331210d7
  • Run description: Use time-sensitive weighting to favor tweets in "popularly discussed" periods. Use document-length weighting to favor long, high-term-IDF tweets in order to improve "interestingness" (the only future information we used is term IDF). Use Yahoo's search results for query expansion.

UDMicroIDF

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroIDF
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: d3b7095102af99d3ea98abc1fcbe2f3f
  • Run description: Use time-sensitive weighting to favor tweets in "popularly discussed" periods.

UDMicroIDFD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroIDFD
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 82ca89d80100ab264b00a5bf6c6a03d3
  • Run description: Use time-sensitive weighting to favor tweets in "popularly discussed" periods. Use document-length weighting to favor long, high-term-IDF tweets in order to improve "interestingness" (the only future information we used is term IDF).

uicir1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uicir1
  • Participant: UICIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: e0dffd23595f54c73c9fefb6abe5ed47
  • Run description: We use Wikipedia and Google to conduct query expansion. Moreover, Wikipedia is used to extract related concepts.

uicir2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uicir2
  • Participant: UICIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: e01d15a7c22b2c9300bddfd2be2bac89
  • Run description: We use Wikipedia and Google to conduct query expansion. Moreover, Wikipedia is used to extract related concepts.

UIowaS1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS1
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 222ccd3b627801613484df2a449ec050
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs, and was indexed using Indri. No external resources were used in this run, though it is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). The results were constrained temporally according to the query date, and duplicate tweets were removed. Finally, the top 30 results were ordered temporally.

UIowaS2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS2
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: bb2fbbad3a72d3a30877ce43dc43fa0d
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs. External resources included expanded URLs (plus the title, description, and keywords from the pages they refer to), as well as definitions of tags using tagdef.com. It was indexed using Indri. It is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). The results were constrained temporally according to the query date, and duplicate tweets were removed. Finally, the top 30 results were ordered temporally.

UIowaS3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS3
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 1f092d376d8630bc90d40b19b939f8fc
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs. External resources included expanded URLs (plus the title, description, and keywords from the pages they refer to), as well as definitions of tags using tagdef.com. It was indexed using Indri. It is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). For queries that had results from the conservative strategies, we performed query expansion by appending the most frequent capitalized word found in the returned tweets that is not in the original query and is at least as frequent as one of the original query terms (a sketch of this expansion step follows below). The results were constrained temporally according to the query date, and duplicate tweets were removed. Finally, the top 30 results were ordered temporally.
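A minimal sketch of the capitalized-word expansion step, assuming whitespace tokenization; it is not the participants' actual implementation.

```python
from collections import Counter

# Append the most frequent capitalized word from the returned tweets that is
# not already in the query and is at least as frequent as one original query term.
def expand_with_capitalized_word(query_terms, returned_tweets):
    counts = Counter(w for tweet in returned_tweets for w in tweet.split())
    query_counts = [counts[t] for t in query_terms]
    if not query_counts:
        return list(query_terms)
    candidates = [(c, w) for w, c in counts.items()
                  if w[:1].isupper() and w not in query_terms
                  and c >= min(query_counts)]
    if not candidates:
        return list(query_terms)
    return list(query_terms) + [max(candidates)[1]]
```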

UIowaS4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS4
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 4c60733fb4e91670de30a6430f51ba8c
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs. External resources included expanded URLs (plus the title, description, and keywords from the pages they refer to), as well as definitions of tags using tagdef.com. It was indexed using Indri. It is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). For queries that had results from the conservative strategies, we performed query expansion by appending the most frequent capitalized word found in the returned tweets that is not in the original query and is at least as frequent as one of the original query terms. The results were constrained temporally according to the query date, and duplicate tweets were removed. The results are NOT sorted temporally, but by relevance instead (which is the only difference from the UIowaS3 run).

uiucsf

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uiucsf
  • Participant: uiuc
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 9756e1111fdd36a7ac71f9d55fedca78
  • Run description: Mixture model of causal potential and standard tf-idf scoring.

uogTrLqea

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrLqea
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 49bf34e568dc2f3cd4abbbb7a3e51465
  • Run description: Learned run using 66 real-time non-external features

uogTrLqeabd

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrLqeabd
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: a78d5aee1c52e88d3d22481fb856c980
  • Run description: Learned run using 76 features, including content linked from tweets.

uogTrLqeabdd

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrLqeabdd
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: fa0fccbf0c1901c6ce8bc438cd08555d
  • Run description: Learned run using 76 real-time non-external features where the objective function directly tries to trade off relevance and recency

uogTrUB2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrUB2
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 9af36188e5379e3d0f21dfbdf978cad7
  • Run description: Filtering run

UTBase

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTBase
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: f68b57b10f11184032f6d3512ce16b11
  • Run description: Baseline run that performs a standard Lucene free text search over the content field of all tweets that exist up to the timestamp of each query. Performs only basic query pre-processing and matching: stopword removal+lowercasing and uses Lucene's StandardTokenizer. Uses a strict incremental index for each query. Applies (repeated) query expansion based on the tweet content if an original TREC query does not yield enough results.

UTBaseRTF

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTBaseRTF
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 801243b48a7bf536f318b4f829e77474
  • Run description: Performs a standard Lucene free text search over the content field of all tweets that exist up to the timestamp of each query, but uses a full index created for all tweets. Prefers tweets that have been retweeted one or more times, but falls back to unretweeted tweets if this yields an insufficient number of results. If this still does not yield enough results, (repeated) query expansion is applied based on the content of already obtained tweets. Basic query processing involves stopword removal+lowercasing and Lucene's StandardTokenizer.

UTWngFuture

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTWngFuture
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: b986f08903d6d090899be016f5f2ba3d
  • Run description: Performs a word n-gram search over the content field of all tweets that exist up to the timestamp of each query using Lucene, but uses a full index created for all tweets. The value of n depends on the length in words of the input query and is almost always ceil(word_count(query)), except when word_count(query) is 2, in which case n is fixed to two. The generated word n-grams are posed as an AND query to the system and then interleaved to yield a final result list (see the sketch below). If there are too few results, each query word is submitted as a query, and if that still does not yield enough results, (repeated) query expansion is applied based on the content of already obtained tweets. Basic query processing involves stopword removal, lowercasing, and Lucene's StandardTokenizer.
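A minimal sketch of the n-gram query generation and the interleaving of per-n-gram result lists; the round-robin interleaving and deduplication are assumptions, not necessarily how the run merged its results.

```python
def word_ngrams(query, n=None):
    """Word n-grams of the query; n defaults to the query length (as described),
    except that length-2 queries keep n = 2."""
    words = query.split()
    if n is None:
        n = 2 if len(words) == 2 else len(words)
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def interleave(result_lists):
    """Round-robin interleave the per-n-gram result lists into one ranking."""
    merged, seen = [], set()
    for rank in range(max((len(r) for r in result_lists), default=0)):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged
```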

UTWngFutureQ

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTWngFutureQ
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 19e175fcd6ecacbf8d63c7778c495b43
  • Run description: Performs a word n-gram search over the content field of all tweets that exist up to the timestamp of each query, using Lucene with a full index created for all tweets. Prefers tweets that meet four quality criteria: contains either a hashtag or a URL; is not directed at more than three people (with the @ symbol); does not consist of more than 50 percent ALL-CAPS words; and does not contain repeated exclamation marks (a sketch of this filter follows below). The value of n used for the word n-grams depends on the length in words of the input query and is almost always ceil(word_count(query)), except when word_count(query) is 2, in which case n is fixed to two. The generated word n-grams are posed as an AND query to the system and then interleaved to yield a final result list. If there are too few results, each query word is submitted as a query, and if that still does not yield enough results, (repeated) query expansion is applied based on the content of already obtained tweets. Basic query processing involves stopword removal, lowercasing, and Lucene's StandardTokenizer.
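A minimal sketch of the four quality criteria as a filter predicate; the tokenization and the repeated-exclamation check are illustrative assumptions.

```python
import re

def passes_quality_criteria(tweet):
    """True if the tweet contains a hashtag or URL, mentions at most three
    people, is at most 50 percent ALL-CAPS words, and has no repeated '!'."""
    words = tweet.split()
    has_tag_or_url = "#" in tweet or "http" in tweet
    mentions = tweet.count("@")
    caps = sum(1 for w in words if len(w) > 1 and w.isupper())
    caps_ratio = caps / len(words) if words else 0.0
    repeated_bang = bool(re.search(r"!{2,}", tweet))
    return has_tag_or_url and mentions <= 3 and caps_ratio <= 0.5 and not repeated_bang
```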

waterlooa1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa1
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 25fba9b6fdfe5c58f4d467a1fb0e7a0f
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices respecting the time constraints. Combines results using reciprocal rank fusion (see the sketch below).
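A minimal sketch of reciprocal rank fusion over the per-query result lists; k = 60 is the commonly used constant, not necessarily the value used in this run.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Each document gets 1 / (k + rank) from every ranking it appears in;
    documents are returned sorted by the fused score, highest first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```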

waterlooa2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa2
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1d83ce46435fe7be6389a0469ed5145a
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices in which only HTTP status 200 tweets are used, respecting the time constraints. Combines results using reciprocal rank fusion. Due to an error in how such tweets were selected, it cannot be guaranteed that all tweets used chronologically preceded the query tweet time, but all tweets returned are earlier than the time constraint.

waterlooa3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa3
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 668e0b977accf3080fcb4110cdb5247c
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices respecting the time constraints. Queries were issued using pseudo-relevance feedback (Okapi- and KLD-type feedback), and the language model used was from a previous Terabyte track. Combines results using reciprocal rank fusion.

waterlooa4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa4
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: a7ce41e7fdfcf31ba44e5a4b22d438e8
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices respecting the time constraints. Combines results using reciprocal rank fusion. Following this, the results were re-ranked with respect to recency, i.e. the RRF score was multiplied by a recency factor.

WESTfilext

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTfilext
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 6698265a2c3ef5ff77ca38ff22fe4121
  • Run description: Used ANEW sentiment vocabulary to compute the high-level feature (sentiment) of the tweet.

WESTfilter

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTfilter
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 48fee9dfc8a867da641043a80c8252be
  • Run description: This run is created purely using internal knowledge that is available at the time of query. No use of web pages linked from tweets.

WESTrelint

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTrelint
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 7842ecdefa8df62eb24e80d4e7e18d1e
  • Run description: Real time, no use of external knowledge, no use of web pages linked from tweets.

WESTrlext

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTrlext
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 14a344ab4b4970e40567b7fd597d2028
  • Run description: Used ANEW sentiment vocabulary to compute the high-level feature (sentiment) of the tweet.

Wise2ndRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Wise2ndRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 5034c44f8df0dbcd0db56d38434e4c6d
  • Run description: 1. Language model based retrieval 2. Query expansion

WiseFifthRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WiseFifthRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6018662af8b9237cc35bcf1c8b98b7a8
  • Run description: 1. Language model based retrieval 2. Topic classification based on the returned results 3. Result re-ranking strategy for emerging topics (different from the fourth run)

WiseFouthRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WiseFouthRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: ecab4cad81c0fa0941772dc7642bd6b6
  • Run description: 1. Language model based retrieval 2. Topic classification based on the returned results 3. Result re-ranking for emerging topics

WiseThirdRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WiseThirdRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 4023e21d0665040e464e1fb572715e74
  • Run description: 1. Language model based retrieval 2. Normalize all the tweets in the corpus

ya3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ya3
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: c5cf5b7a565c55fe385ea3d59a730c84
  • Run description: Query extension; user social features, such as the number of followers, etc.; textual quality and diversity of the tweets (query-independent); emotion features of the tweets; text features of the headers of external links; ranking scores obtained from a boosted-trees regression algorithm.

ya4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ya4
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 4effd0c57d54f17af70338f2e6e0e132
  • Run description: Query extension; user social features, such as the number of followers, etc.; textual quality and diversity of the tweets (query-independent); emotion features of the tweets; text features of the headers of external links; ranking scores obtained from a boosted-trees classification algorithm.

YNDXTPC1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: YNDXTPC1
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 08246052e6bb8ebac13810d1c46eeaf6
  • Run description: Just query expansion using tweets posted before the query time.

YNDXTPC2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: YNDXTPC2
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: ad201ddcb1e660f8f3e537e21f6bdd74
  • Run description: Just query expansion using tweets posted before the query time.