Runs - Microblog 2011

1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 1
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 7/29/2011
  • Type: automatic
  • Task: main
  • MD5: 8a57acc414ad236d0371f73bb4e5f0d8
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher. Our final score combines the Lucene score and the burst detection score. If no burst was detected, then only the Lucene score was considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters.
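
  A minimal sketch of how such a burst-detection score could be combined with the Lucene score
  (the burst threshold, the additive combination, and the burst weight are assumptions; the run
  description only fixes the 3-hour interval):

    from collections import Counter
    from datetime import timedelta
    from statistics import mean, stdev

    WINDOW = timedelta(hours=3)  # interval stated in the run description

    def burst_windows(tweet_times, epoch, sigmas=2.0):
        """Return the 3-hour windows whose tweet volume is unusually high."""
        counts = Counter((t - epoch) // WINDOW for t in tweet_times)
        if len(counts) < 2:
            return set()
        mu, sd = mean(counts.values()), stdev(counts.values())
        return {w for w, c in counts.items() if c > mu + sigmas * sd}

    def combined_score(lucene_score, tweet_time, epoch, bursts, burst_weight=0.5):
        """Boost the Lucene score when the tweet falls into a burst window;
        if no burst was detected, only the Lucene score remains."""
        in_burst = (tweet_time - epoch) // WINDOW in bursts
        return lucene_score + burst_weight if in_burst else lucene_score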

2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 2
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3f9f238b0ab1fc668091a1d4254f078c
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher by a burst detection score. As a second ranking factor we counted the number of retweets (retweet score) of all Twitter users over the whole corpus. Note that we did not restrict this computation to retweets up to the given query time. As a third ranking factor, we computed the most often used hashtag per topic and ranked tweets higher that contained it (hashtag score). Note that this computation was done only on tweets posted before the query timestamp (no future evidence). Our final score therefore combines a Lucene score, the burst detection score, the retweet score, and the hashtag score. If no burst was detected, only the other three scores were considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters. In addition, we computed the Levenshtein distance in order to filter out overly similar tweets.

3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 3
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 610180ff8735df2efc62dd4bc64c4a93
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher by a burst detection score. As a second ranking factor, we computed the most often used hashtag per topic and ranked tweets higher that contained it (hashtag score). Note that this computation was done only on tweets posted before the query timestamp (no future evidence). Our final score therefore combines a Lucene score, the burst detection score, and the hashtag score. If no burst was detected, only the other two scores were considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters. In addition, we computed the Levenshtein distance in order to filter out overly similar tweets.

4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: 4
  • Participant: knowcenter
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 902bfa494abed656cd1180796efb1612
  • Run description: We used Lucene for querying and ranking the Tweets. After that, we performed a burst detection, identifying the time windows (interval: 3 hours) in which an unusual amount of Tweets occurred. Based on our assumption that something extraordinary happened within that time, those Tweets were ranked higher by a burst detection score. Our final score therefore combines the Lucene score and the burst detection score. If no burst was detected, only the Lucene score was considered. We applied simple rule-based filtering techniques; for example, Tweets which contain an @-character are not considered. We also implemented a simple language guesser which removes Tweets that are not written mainly in ASCII characters. In addition, we computed the Levenshtein distance in order to filter out overly similar tweets. The differences from run "1" are the Levenshtein distance and more sophisticated stop word lists (slang, emoticons, hashtags...).
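
  A sketch of the Levenshtein-based near-duplicate filtering mentioned in runs 2-4 (the distance
  threshold is an assumption; the run descriptions do not give one):

    def levenshtein(a, b):
        """Classic dynamic-programming edit distance between two strings."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

    def drop_near_duplicates(ranked_tweets, max_distance=10):
        """Keep a tweet only if it is not too similar to any tweet already kept."""
        kept = []
        for text in ranked_tweets:
            if all(levenshtein(text, seen) > max_distance for seen in kept):
                kept.append(text)
        return kept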

balanceRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: balanceRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: f1100c7b9efaab524d27aaa184420646
  • Run description: We balance relevance and recency based on the basic run results.

baseline

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baseline
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 7/29/2011
  • Type: manual
  • Task: main
  • MD5: 4b6365c72557ec9e85c8ebabbc7b37df
  • Run description: baseline

baseline1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baseline1
  • Participant: UPorto
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 43872afc732b87d4e5f43e91dfba1573
  • Run description: This run is mostly a baseline of our Terrier set-up. Uses only the text of the tweets.

baseline2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baseline2
  • Participant: UPorto
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6582dbdc6b1fc0cfa9cea5088ef87e52
  • Run description: Same as baseline1, but with better indexing.

baselineBM25

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: baselineBM25
  • Participant: ULugano
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 4164b51855a8cdff45709bdc6b9081d8
  • Run description: This is a very basic run, to be used as baseline for (future) comparison with the improved ones. We used BM25 with standard settings to match the relevant tweets. The retrieved tweets were then filtered by score and time, so that the most relevant tweets for each day (from the query date) are preserved in a time-ordered way.

Basic

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Basic
  • Participant: Elly
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 0083b6367ba46d6504d3ab146f047257
  • Run description: In response to a query, the first stage is to automatically retrieve a list of tweets and rank them based on their similarity to the query. The top 1000 tweets were submitted as the basic run results.

basicWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: basicWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 8/3/2011
  • Type: automatic
  • Task: main
  • MD5: 3c4afb51ef4cb20c4c4fc6d3a7934f91
  • Run description: Baseline run: standard retrieval with some filters (English language tweets only, no retweets, no directed tweets, no tweets with less than 100 characters, no tweets that mainly consist of a URL).
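
  A rough sketch of filters like the ones listed above (the "mainly a URL" cut-off is an
  assumption, and the English check is a stand-in for whatever language identifier the run
  actually used):

    import re

    URL_RE = re.compile(r"https?://\S+")

    def keep_tweet(text, is_english):
        """Return True if the tweet passes the baseline filters."""
        if not is_english:                           # English-language tweets only
            return False
        if text.startswith("RT ") or text.startswith("@"):
            return False                             # no retweets, no directed tweets
        if len(text) < 100:                          # no tweets with fewer than 100 characters
            return False
        without_urls = URL_RE.sub("", text).strip()
        if len(without_urls) < 0.5 * len(text):      # assumed: tweet consists mainly of a URL
            return False
        return True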

ciirRun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun1
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: afd692120b840b9c6fb1f6b10badb6bc
  • Run description: Temporal query expansion model and no external evidence used.

ciirRun2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun2
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 06ec1e105218251e1049d5da5f85b002
  • Run description: Sequential dependence model, relevance feedback model adapted and quality-biased model used.

ciirRun3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun3
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 99f10871e3b089fc4ec4f7bd31769098
  • Run description: Based on previous best model, temporal query expansion model added

ciirRun4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ciirRun4
  • Participant: CIIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b51e30495b7e491f1703c8c4f97f9481
  • Run description: Based on previous best model, query expansion model using hashtags added

clarity1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity1
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 9ebf1c09bbe4f9039248d07cd49614d1
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0).
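
  With K1=0 the BM25 term-frequency and document-length components cancel out, so a matching
  tweet simply accumulates the IDF of each matched query term; a small sketch of that reduced
  scorer (the IDF formula is the usual BM25 one, assumed here):

    import math

    def bm25_k1_zero(query_terms, doc_terms, doc_freq, num_docs):
        """BM25 with k1 = 0: term frequency and document length no longer matter,
        so each query term found in the document contributes only its IDF."""
        present = set(doc_terms)
        score = 0.0
        for term in query_terms:
            if term in present:
                df = doc_freq.get(term, 0)
                score += math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
        return score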

clarity2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity2
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: f7fe9a61ccce4e4fc4209b7c1f17a08c
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0). Query Expansion with Pseudo Relevance Feedback using the Top N results, with N determined on a per-query basis.

clarity3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity3
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: fc88733de82b8512807ae8f5d3248a30
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0). Query Expansion with Pseudo Relevance Feedback using the Top N results, with N determined on a per-query basis. EXTERNAL RESOURCE: Language Classifier (http://code.google.com/p/language-detection/) used to detect and remove non-English tweets. This resource is not timely with respect to the queries.

clarity4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: clarity4
  • Participant: CLARITY_DCU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 34320b496e0a61ea6e15b7faa729bd1b
  • Run description: BM25 ranking algorithm with parameter set to ignore Document Length and Term Frequency (i.e. K1=0). Query Expansion with Pseudo Relevance Feedback using the Top N results, with N determined on a per-query basis. The Top N results are used to estimate the temporal centre of the relevant tweets, and the scores of tweets far from this temporal centre are downweighted. EXTERNAL RESOURCE: Language Classifier (http://code.google.com/p/language-detection/) used to detect and remove non-English tweets. This resource is not timely with respect to the queries.
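
  A sketch of the temporal-centre downweighting step (the mean as the centre estimate and the
  Gaussian decay with a one-day scale are assumptions; the run description does not specify
  either):

    import math

    def temporal_center(top_n_timestamps):
        """Estimate the temporal centre of the relevant tweets from the top-N results."""
        return sum(top_n_timestamps) / len(top_n_timestamps)

    def downweighted(score, timestamp, center, scale_seconds=86400.0):
        """Reduce the score of tweets that are far from the temporal centre."""
        distance = abs(timestamp - center)
        return score * math.exp(-(distance / scale_seconds) ** 2)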

COMMITbase

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITbase
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 7/28/2011
  • Type: automatic
  • Task: main
  • MD5: dfac0a054ffa59ef1ab961c8886e16cd
  • Run description: Removed duplicate tweets, retweets, and tweets without links. Queries were expanded using time-sensitive query expansion, which considers terms from documents prior to the query time. Relevant tweets were retrieved using a language modeling retrieval model. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores.
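
  A sketch of a time-sensitive query expansion step in the spirit described above: only documents
  posted before the query time contribute expansion terms (the feedback depth, the tf-based term
  weighting, and the number of expansion terms are all assumptions):

    from collections import Counter

    def time_sensitive_expansion(query_terms, ranked_docs, query_time,
                                 feedback_depth=10, num_terms=5):
        """ranked_docs: list of (timestamp, tokens) in retrieval order.
        Pick expansion terms only from documents posted before query_time."""
        counts = Counter()
        used = 0
        for timestamp, tokens in ranked_docs:
            if timestamp >= query_time:
                continue                      # no future evidence
            counts.update(t for t in tokens if t not in query_terms)
            used += 1
            if used == feedback_depth:
                break
        expansion = [t for t, _ in counts.most_common(num_terms)]
        return list(query_terms) + expansion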

COMMITexp

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITexp
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/5/2011
  • Type: automatic
  • Task: main
  • MD5: 04944c9b60c108902ccb1d79226bd457
  • Run description: Removed duplicate tweets and retweets. We used learning to rank to learn the weights for a linear combination of retrieval scores of different models. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores. The four models were: (1) queries expanded using time-sensitive query expansion, which considers terms from documents prior to the query time, with relevant tweets retrieved using a language modeling retrieval model (the retrieval model of our baseline); (2) a language modeling retrieval model; (3) Boolean matching; (4) a Wikipedia-based Semantic Query Expansion (SQM), with tweets then retrieved using a language modeling retrieval model. This last model uses external data.

COMMITfilter

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITfilter
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/5/2011
  • Type: automatic
  • Task: main
  • MD5: 41ffecf8e6d801b7342dc1edde0f028e
  • Run description: This approach pre-filters potentially relevant tweets using a manually annotated training set. We trained a random forest on query-dependent (e.g. HITS authority and hub scores) and query-independent (e.g. friends, capitalisation, number of links) features. The prediction value of the random forest served as the filtering value. We then used learning to rank to learn the weights for a linear combination of retrieval scores of different models. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores. The six models were: (1) queries expanded using time-sensitive query expansion, which considers terms from documents prior to the query time, with relevant tweets retrieved using a language modeling retrieval model (the retrieval model of our baseline); (2) a language modeling retrieval model; (3) Boolean matching; (4) a Wikipedia-based Semantic Query Expansion (SQM), with tweets then retrieved using a language modeling retrieval model (uses external data); (5) link retrieval: we built a corpus of web documents linked in tweets, used a language modeling retrieval model, and mapped the links back to the tweets (uses external data); (6) SQM applied to the link retrieval setting (uses external data).

COMMITlinks

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: COMMITlinks
  • Participant: COMMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/5/2011
  • Type: automatic
  • Task: main
  • MD5: c36d91135d325bb2f39f7dfd7bb2eadf
  • Run description: We used learning to rank to learn the weights for a linear combination of retrieval scores of different models. The ranked list was further curated by cutting it off at a threshold relative to the retrieval scores. The six models were: (1) queries expanded using time-sensitive query expansion, which considers terms from documents prior to the query time, with relevant tweets retrieved using a language modeling retrieval model (the retrieval model of our baseline); (2) a language modeling retrieval model; (3) Boolean matching; (4) a Wikipedia-based Semantic Query Expansion (SQM), with tweets then retrieved using a language modeling retrieval model (uses external data); (5) link retrieval: we built a corpus of web documents linked in tweets, used a language modeling retrieval model, and mapped the links back to the tweets (uses external data); (6) SQM applied to the link retrieval setting (uses external data).
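
  A sketch of the final combination used by the COMMIT runs as described: a linear mix of the
  per-model retrieval scores with learned weights, followed by a cut-off relative to the best
  retrieval score (the cut-off fraction and the weights here are placeholders, not the learned
  values):

    def combine_and_cut(model_scores, weights, cutoff_fraction=0.3):
        """model_scores: dict tweet_id -> list of scores, one per retrieval model.
        Returns (tweet_id, combined_score) pairs above a score-relative threshold."""
        combined = {tid: sum(w * s for w, s in zip(weights, scores))
                    for tid, scores in model_scores.items()}
        if not combined:
            return []
        threshold = max(combined.values()) * (1.0 - cutoff_fraction)
        ranked = sorted(combined.items(), key=lambda kv: kv[1], reverse=True)
        return [(tid, s) for tid, s in ranked if s >= threshold]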

cyfrun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: cyfrun1
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b0f776e40c437fff031b9c4b6fff363a
  • Run description: Sum of query-word IDF and the tf score of the tweet.

cyfrun2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: cyfrun2
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 73e8485d949cfd5eaed54e6c796575c1
  • Run description: Tf score of the tweet.

dbpWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dbpWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 09f2aced5cd4599f891a29604bf04c39
  • Run description: DBpedia 3.6 dump, version 2011-01-17.

DFReeKLIM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIM
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 5745b2ccc4db35d97c58511b842165b9
  • Run description: Baseline. Built using a new retrieval model and pseudo relevance feedback QE.

DFReeKLIM30

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIM30
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 94e778d61c8a29b00d5ae65aae570ee5
  • Run description: Baseline. Built using a new retrieval model and pseudo relevance feedback QE. We fix the result list size to 30.

DFReeKLIMDC

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIMDC
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c53f041a475ebdb1c9537b5d1c1ce03e
  • Run description: Baseline. Built using a new retrieval model and pseudo relevance feedback QE. We use a heuristic approach to determine the result list size, for each query separately.

DFReeKLIMRA

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: DFReeKLIMRA
  • Participant: FUB
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 73d9f244335e2054cab1117cb637f15d
  • Run description: Built using a new retrieval model and pseudo relevance feedback QE. We also apply a re-ranking technique to deal with both recency and relevance. We further use a heuristic approach to determine the result list size for each query separately.

dutirLmFb

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirLmFb
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 87b2af7d0d3d0c19a9da1c9d5abcb282
  • Run description: Language model, feedback, entropy, whether a link exists.

dutirMixFb

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirMixFb
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 43fcddf53dd458e68296bf09eb28b16d
  • Run description: Max of language model and tf*idf, feedback, entropy, whether a link exists.

dutirMixSp

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirMixSp
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: edcbbeb332a4b74fc357e924d32d379a
  • Run description: Mixture of language model and tf*idf, entropy, whether a link exists.

dutirTfidfFb

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: dutirTfidfFb
  • Participant: DUTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 141b50f5f88f32f5c55d4d0cd98f6f5b
  • Run description: tf*idf, feedback, entropy, whether a link exists.

EMAX

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: EMAX
  • Participant: TUD_DMIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 37f657bb58239dbc35f7e8696305db38
  • Run description: The index for query MB007 is polluted, while the other queries are run under the strict real-time condition. The scoring method is based only on information (entropy) provided by the tweets.
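
  The description only says the score is based on the entropy of the tweet; a minimal sketch of a
  term-distribution entropy, with how it feeds into the final ranking left open:

    import math
    from collections import Counter

    def term_entropy(tokens):
        """Shannon entropy (in bits) of the tweet's term distribution."""
        if not tokens:
            return 0.0
        counts = Counter(tokens)
        total = len(tokens)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    # term_entropy("bbc world service staff cuts cuts".split()) -> about 2.25 bits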

FASILKOM01

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM01
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 71cc79ab9ebfc080410ea7b090a56c16
  • Run description: This run does not use any future evidence or external resources.

FASILKOM02

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM02
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 8f47b67721351ebfef78d4ec54fcd133
  • Run description: This run uses phrase query identification (using a POS tagger), query expansion (from Google and the Twitter dataset), and a customized scoring function.

FASILKOM03

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM03
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2e0a7c807a0cdf99493dad950fbf8326
  • Run description: This run uses phrase query identification (using a POS tagger), query expansion generated from the dataset, and a customized scoring function.

FASILKOM04

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FASILKOM04
  • Participant: FASILKOMUI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e60ea9f35a435c7aa66810c78889ff3a
  • Run description: This run uses phrase query identification (using a POS tagger), query expansion generated from Google search results, and a customized scoring function.

FDUNLP

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FDUNLP
  • Participant: FDUMED
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 84f7f40f41e38d46e8f6ae7d1d836735
  • Run description: Strictly real time, with a heuristic approach to identify the language of the tweet. Each tweet is given 2 features, and finally all the tweets are clustered.

FDUNLP2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: FDUNLP2
  • Participant: FDUMED
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 90c4ca007596e45b949e254f6e691302
  • Run description: Strictly real time, with a fine tuned tool to identify the language of the tweet. Each tweet is given 2 features, and finally all the tweets are clustered.

Google1GNO

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Google1GNO
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: cc42b20e843c53fb75f10c690c8366ba
  • Run description: The given queries were searched using the Google Search API. Word-wise 1-grams of the titles of all the pages returned by Google were sorted in descending order of frequency. The top 5 1-grams were used as the new topic (the original topics were not added) and retrieval was done using Terrier-3.5 with these new topics.
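
  A sketch of the title 1-gram selection step (the actual Google Search API call is omitted;
  `titles` stands for the result-page titles it returned):

    from collections import Counter

    def top_title_unigrams(titles, k=5):
        """Count word-wise 1-grams over all result titles, most frequent first."""
        counts = Counter(word for title in titles for word in title.lower().split())
        return [word for word, _ in counts.most_common(k)]

    # Google1GNO uses just these 5 terms as the new topic; IRSIGoogle1G adds them
    # to the original topic, and IRSIGoogle2G does the same with 2-grams.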

gus

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gus
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b322f122a6ef9a4bc12088e08e224019
  • Run description: This run uses no external or future evidence. It is a variant of the temporal smoothing method described in Efron and Golovchinsky (2011)--a language modeling variant (with no forward-looking corpus stats). No document priors are used.

gust

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gust
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 44fed9a322a7525f7c0207a17c1080f5
  • Run description: This run uses no external or future evidence. It is a variant of the temporal smoothing method described in Efron and Golovchinsky (2011)--a language modeling variant (with no forward-looking corpus stats). The run uses a "temporal" document prior. That is, the prior is estimated by judging the fit of the document's pseudo-query against an exponential distribution.
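
  One plausible reading of the temporal prior, as a heavily hedged sketch: take the ages (time
  before the query) of the documents matching the tweet's pseudo-query and measure how well they
  fit a maximum-likelihood exponential distribution; how that fit is turned into a document prior
  is not given in the description and is left out here:

    import math

    def exponential_fit_log_likelihood(ages):
        """Log-likelihood of the pseudo-query matches' ages under an
        exponential distribution with the MLE rate parameter."""
        if not ages:
            return float("-inf")
        ages = [max(a, 0.0) for a in ages]
        lam = len(ages) / max(sum(ages), 1e-9)    # MLE rate
        return sum(math.log(lam) - lam * a for a in ages)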

gustc

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gustc
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: a5bdf6b28e82395058dd5c2de6f677a3
  • Run description: This run uses no external or future evidence. It is a variant of the temporal smoothing method described in Efron and Golovchinsky (2011)--a language modeling variant (with no forward-looking corpus stats). The run uses a "temporal" document prior. That is, the prior is estimated by judging the fit of the document's pseudo-query against an exponential distribution. Also uses a second prior based on the clustering coefficient of the graph of words in a relevance model induced from the document's pseudo query.

gut

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: gut
  • Participant: gslisUIUC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c811a00053f4cc72dad873637782bb65
  • Run description: This run uses no external or future evidence. It uses a query likelihood model (with no forward-looking corpus smoothing) supplemented with an independent evidence source based on the likelihood that the temporal profile of the document's pseudo-query also generated the temporal profile of the query.

hitWId

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: hitWId
  • Participant: HIT_LTRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: a7e2b76aab0aec88da2fcf99400e104b
  • Run description: This run aims to test the effectiveness of score decay.

hitWIt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: hitWIt
  • Participant: HIT_LTRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 70f705b120725eb40d4f13f75bcdba57
  • Run description: This run focused on automatic threshold selection.

ICTNET11MBR1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR1
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 5c0837b7adb9dfaeac20d2b7d1c5b7e6
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we developed a semi-supervised algorithm to expand the query using the content of tweets posted before the query time. No external or future information is used.

ICTNET11MBR2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR2
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: ab11a48d0a3a9fcf3cf0b0f61df16524
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we used the same query expansion as ICTNET11MBR1, but a different enhanced BM25.

ICTNET11MBR3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 98142713c2f4ca34cfe77cad7d008ff9
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we expand the query with external information obtained via a Google meta-search technique; some Wikipedia articles are also included as a complement. No future information is used.

ICTNET11MBR4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ICTNET11MBR4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 8cd46d79e2d80acbeab42a36518a0ac7
  • Run description: We rank with 6 features, including an enhanced BM25 weight, length, freshness, hashtag hits, and user activeness. We also perform query expansion, except misspelling expansion. For this run, we expand the query over the whole available tweet collection and combine it with the external expansion from ICTNET11MBR3. We therefore use both external and future information in this run.

IDEAACTQE

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEAACTQE
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: f39fd9a63bbb96a4f6101d2f460f51c0
  • Run description: A query expansion run using content, authority, and time, based on a field-based retrieval model.

IDEABASIC

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEABASIC
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 72b2a47cec7d7bf3d87afa63d103c0d7
  • Run description: A baseline run using a field-based retrieval model.

IDEABASICACT

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEABASICACT
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b22c690fb69ceaddf7996093a476af0e
  • Run description: A basic run using content, authority, and time, based on a field-based retrieval model.

IDEABASICQE

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IDEABASICQE
  • Participant: GUCAS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 108280222fc6c04707b72857694aaeac
  • Run description: A query expansion run using a field-based retrieval model.

ikmRun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ikmRun1
  • Participant: ikm101
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 6f5bf6f36b5f3dc578bd67565e90f8f8
  • Run description: We use information from links.

InL2c1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: InL2c1
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: e4d60cb5d2bac0c2d970b3f27ad424ce
  • Run description: Retrieval was done with the original queries using Terrier-3.5.

iritfd1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: iritfd1
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 586e236e16a4c2a275017f6f2564cc21
  • Run description: This run combines the score of the Lucene search engine with a set of feature scores: popularity of the tweet, length of the tweet, exact term matching, presence of a URL, frequency of the URL, hashtag score, number of tweets for a twitterer, and number of mentions for a twitterer. Queries were expanded with keywords from news articles published before the query timestamp.

iritfd2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: iritfd2
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 713df52d4cce2be1d2ee167538b5d819
  • Run description: This run combines the score of the Lucene search engine with a set of feature scores: popularity of the tweet, length of the tweet, exact term matching, presence of a URL, frequency of the URL, hashtag score, number of tweets for a twitterer, and number of mentions for a twitterer.

IRSIGoogle1G

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IRSIGoogle1G
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 6a3914edd79c1533f3422c8e7b554ad7
  • Run description: The given queries were searched using the Google Search API. Word-wise 1-grams of the titles of all the pages returned by Google were sorted in descending order of frequency. The top 5 1-grams were added to the original topics and retrieval was done using Terrier-3.5 with these new topics.

IRSIGoogle2G

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: IRSIGoogle2G
  • Participant: IRSI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 5fcdaa84a0b293215226f3f67707141c
  • Run description: The given queries were searched using the Google Search API. Word-wise 2-grams of the titles of all the pages returned by Google were sorted in descending order of frequency. The top 5 2-grams were added to the original topics and retrieval was done using Terrier-3.5 with these new topics.

isiFD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFD
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 14d026d65861123bcc77ce3598bf1e11
  • Run description: Basic keyword search using "full dependence" variant of MRF retrieval model.

isiFDL

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFDL
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 01752e408ca034471f1ff97606380e83
  • Run description: Learning to rank model [base ranking function = "full dependence" variant of the MRF model].

isiFDRM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFDRM
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2d923f243a6f7b7aeb620d55a3e6afe1
  • Run description: "Full dependence" variant of the MRF model + pseudo relevance feedback.

isiFDRML

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: isiFDRML
  • Participant: isi
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3f6d5e58dbbe9eb02fc92c0ac41af198
  • Run description: Learning to rank model [base ranking function = "full dependence" variant of the MRF model + pseudo relevance feedback].

kanopeRun

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: kanopeRun
  • Participant: KanopeReunion
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 7183078cb4d6eaaa3e0f09b8f3627e12
  • Run description: Our approach merged (i) an indicator of semantic similarity, approximated using the Reflective Random Indexing (RRI) semantic space model (Cohen, Schvaneveldt & Widdows, 2010), with (ii) the chronological distance separating tweets from a given query. RRI is a semantic space model that has demonstrated performance as good as LSA or LDA, but it implements the distributional hypothesis via random projection. This makes RRI very efficient in terms of computational resources, which is particularly attractive considering the large amount of data from social media.

KAUSTBase

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTBase
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e71412ef748c91258605008067cacda6
  • Run description: Baseline run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Tweets are ranked by content similarity (using IDF) and recency.

KAUSTExp

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTExp
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 5c9d969495878a9335ce81c06cef91f3
  • Run description: Expansion without rerank run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Tweet expansion: expanded URLs and hashtags with most-frequent co-occurring terms. Expansion terms are added to the tweets at indexing time. - Tweets are ranked by content similarity (using IDF) and recency.

KAUSTExpRrnk

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTExpRrnk
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: de700fe8b67550de607605463067f7fa
  • Run description: Expansion with rerank run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Tweet expansion: expanded URLs and hashtags with most-frequent co-occurring terms. Expansion terms are added to the tweets at indexing time. - Computed an estimation of user topic authority by building the user's term-profile. - Computed an estimation of user popularity based on frequency of being replied-to, mentioned, and retweeted. - Tweets are ranked first by content similarity (using IDF) and recency. Then 4 other features are used to rerank: retweet frequency of a tweet, frequency of the URL (if one exists), estimated user popularity, and estimated user topic-authority.

KAUSTRerank

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: KAUSTRerank
  • Participant: KAUST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e4e36171200802841c4ab4152de9b6e9
  • Run description: Rerank without expansion run: - No external or future evidence is used. - Preprocessing: detecting spam users and spam-tweets and also non-English tweets. - Computed an estimation of user topic authority by building the user's term-profile. - Computed an estimation of user popularity based on frequency of being replied-to, mentioned, and retweeted. - Tweets are ranked first by content similarity (using IDF) and recency. Then 4 other features are used to rerank: retweet frequency of a tweet, frequency of the URL (if one exists), estimated user popularity, and estimated user topic-authority.

LJQO10

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LJQO10
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 99b14a6ad47ea00b77ee2d8d6c9bba26
  • Run description: Uses only the query, without any external or future information.

LJQO5

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LJQO5
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 5d9510a4dfa98530c0f92d01c5e55f94
  • Run description: Uses only the query, without any external or future information.

LMOP10

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LMOP10
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: b6ddc9331327f12c2dbb438e1c4cd0b5
  • Run description: Uses only the query, without any external or future information.

LMOP5

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LMOP5
  • Participant: PolyU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 7dddb79e0139c717497bbb58107d98c4
  • Run description: Uses only the query, without any external or future information.

LThresh

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: LThresh
  • Participant: syles
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 9135eceadc69c22f8987b19c1534cbbe
  • Run description: - Language identifier trained on Wikipedia

manualWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: manualWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 7/12/2011
  • Type: manual
  • Task: main
  • MD5: 4f2323f91c7ed055d74a589800b39413
  • Run description: An assessor manually searched through the corpus (filtered by language automatically; English only) and located interesting tweets. Allowed time per topic: 5 minutes. To increase the number of tweets retrieved per topic, a single query was submitted at the end of the 5 minute interval and all tweets returned with a tweetid lower than the lowest manually retrieved tweet were appended to the list. No other external sources were used: the context/circumstances of the topics were learnt by the assessor while assessing the tweets.

melblt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: melblt
  • Participant: UniMelbLT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1228faffe792fc1c63f1587d0f14d090
  • Run description: Baseline system, using language identification and lexical normalisation and off-the-shelf IR

MONASH1NEW

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: MONASH1NEW
  • Participant: monash
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: e468cbdd5a3b63d7a0d5dce5c1a871da
  • Run description: The MONASH1NEW run works in a similar way to MONASH2NEW, but to enhance performance we modified the values returned for some of the tweet characteristics described in the MONASH2NEW run.

MONASH2NEW

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: MONASH2NEW
  • Participant: monash
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 26d651fd17c4ad24af190668727f19cb
  • Run description: In the MONASH2NEW run, we take some of the tweet features into consideration: if a tweet has a hashtag, an @ tag, or a URL, specific values are added to the scoring function so that the tweet weighs more than tweets that do not have them. Our system also calculates the IDF (Inverse Document Frequency) value of the tweet content and takes this value into account. In addition, the number of tweets a user writes and the length of each tweet give a good indication of the importance and relevance of that tweet, so both are taken into account. Finally, this run uses a method to detect the language of tweets: if a tweet's language is English, it is given more weight, which places English tweets above non-English tweets.
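
  A sketch of a scoring function along the lines described (every weight below is illustrative;
  the run does not publish its actual values):

    def monash_style_score(text, idf_sum, author_tweet_count, is_english):
        """Add feature bonuses on top of the IDF score of the tweet content."""
        score = idf_sum
        if "#" in text:
            score += 0.5                            # has a hashtag
        if "@" in text:
            score += 0.3                            # has an @ tag
        if "http://" in text or "https://" in text:
            score += 0.5                            # has a URL
        score += 0.1 * min(author_tweet_count, 10)  # author activity, capped
        score += 0.01 * len(text)                   # tweet length
        if is_english:
            score += 1.0                            # English tweets ranked above non-English
        return score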

MorpheusRun1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: MorpheusRun1
  • Participant: Morpheus
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2ac13ae226dd2afad39aaa4586e78038
  • Run description: Our system uses a non-traditional approach to real-time search. First we combine tweets into tweet bundles (super tweets). These super tweets give us a larger document size for our topic modeling runs; document size is a main drawback of using topic models with microblogs. For searches more than an hour in the past, we run batch Latent Dirichlet Allocation on hour-long intervals of tweets. To find the best super tweets we use Kullback-Leibler divergence on the topic distributions. To find the best tweets we use recency and word occurrence.
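
  A sketch of the Kullback-Leibler comparison between topic distributions used to pick the best
  super tweets (the smoothing epsilon is an assumption to keep the logarithm defined):

    import math

    def kl_divergence(p, q, eps=1e-12):
        """KL(p || q) for two topic distributions over the same K topics."""
        return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

    # A super tweet whose LDA topic distribution has low divergence from the
    # query's distribution is considered a good topical match.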

mulnewWISTUD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: mulnewWISTUD
  • Participant: wis_tudelft
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 67646c18aa6376cdd7607e88750c372f
  • Run description: 1. DBpedia 3.6, published 2011-01-17. 2. Short descriptions of news articles crawled from 62 news RSS feeds (from Jan. 21st to Feb. 10th).

myRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: myRun
  • Participant: Purdue_IR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 4157b6e8049492def50d3c134f51c562
  • Run description: Use belief propagation to select the exemplar of each tweet. Boost score of each tweet according to its exemplar.

myrun2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: myrun2
  • Participant: Purdue_IR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: db42d924a19d287374fdcc7e8be88a35
  • Run description: Use query expansion to reformulate query. Use belief propagation to select the exemplar of each tweet. Boost score of each tweet according to its exemplar. Use the same similarity metrics as in run1

myrun3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: myrun3
  • Participant: Purdue_IR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 690b48419ae8ce109c26a8d951bd06ee
  • Run description: Use belief propagation to select the exemplar of each tweet. Boost score of each tweet according to its exemplar. Use a different similarity metric than run1.

Nestor

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Nestor
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 5ad7dcf7dab4591ce895b43649de19f4
  • Run description: We use a Bayesian network retrieval model for tweet search that considers, in addition to textual similarity measures, the social influence of microbloggers, the time magnitude, the tweet length, and hashtag occurrence. Results are filtered by the number of query terms present in the tweet. No future or external features are used in this system.

NestorS

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: NestorS
  • Participant: IRIT_SIG
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 50e9d8286f7dbcfbd0f7c31de812c4cf
  • Run description: We use a Bayesian network retrieval model for tweet search that considers, in addition to textual similarity measures, the time magnitude, the tweet length, and hashtag occurrence. This system ignores the social influence of microbloggers. Results are filtered by the number of query terms present in the tweet. No future or external features are used in this system.

normal

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: normal
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6673f7d5991196223871f24eacf5a775
  • Run description: We retrieve tweets from indexes corresponding to each query time.

nQCRIwoTag

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: nQCRIwoTag
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c0ca23ea4095a12ba8a1c594d930e19b
  • Run description: Without automatically induced tags; ordered.

nQCRIwTag

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: nQCRIwTag
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: bf2a0cd0029e032dcb9ccb2b362e8da7
  • Run description: With automatically induced tags; ordered.

omarRun

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: omarRun
  • Participant: DLDE
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 6fba88786362d1e431415e1c1ab07be4
  • Run description: These results use only the tweets themselves; we apply a simple model and use some heuristic rules to prune the results.

PKUICST

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 4af1811b38070cdb3f2d38bb403ac363
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is selected according to a score threshold.

PKUICST2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST2
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 4ed23ac4a71a5267d7a4250b3c27f87d
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is 30/31 per query.

PKUICST3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST3
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1887446e5dca8892c15e51cae519697a
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is 100/101 per query.

PKUICST4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PKUICST4
  • Participant: PKU_ICST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: cc098105eb33e72bdc219763652bc7c8
  • Run description: This submission runs against a dynamic index built with respect to each query, and the number of results is 300/301 per query.

PL2Bo1SDExt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2Bo1SDExt
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: c057d38c54d1a1555b2775a08e50e711
  • Run description: PL2 DFR algorithm with the Sequential Divergence from Randomness based dependence model and Bo1 query expansion, using linked HTML pages as part of the tweet.

PL2NoQENoDM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2NoQENoDM
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: bc5fe4d773b1d6664a8cdd4012d99ea2
  • Run description: PL2 DFR algorithm baseline.

PL2NoQeSd

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2NoQeSd
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: f81c980eab64f23123c111101f8437a9
  • Run description: PL2 DFR algorithm with Sequential Divergence from Randomness based dependence model.

PL2NoQeSdExt

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PL2NoQeSdExt
  • Participant: UoW
  • Track: Microblog
  • Year: 2011
  • Submission: 8/2/2011
  • Type: automatic
  • Task: main
  • MD5: 94001c71957e741b5ed0ad6dc8ef88d5
  • Run description: PL2 DFR algorithm with Sequential Divergence from Randomness based dependence model where the linked HTML pages are also part of the document.

PRISrun1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun1
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 48cd32d42dbb1c898f8fcfb39feb7a39
  • Run description: 1. There are no future or external resources used in this run. 2. Filter out irrelevant tweets using the WAF (Word Activation Force) model. 3. Some parameters and thresholds in the model are selected manually. 4. Operates in a strict real-time fashion.

PRISrun2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun2
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 72297b2efe89dea63764c41ce4cafe2d
  • Run description: 1. Web pages linked from tweets are used in this run. 2. Some parameters and thresholds in our model are selected manually. 3. Operates in a strict real-time fashion. 4. Filter out irrelevant tweets using the WAF (Word Activation Force) model.

PRISrun3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun3
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 877300bfbfa0413876c46be42fb904e3
  • Run description: 1. Corpus after the query time is used in this run. 2. The query system filters the desirable tweets automatically. 3. Filter out irrelevant tweets using the WAF (Word Activation Force) model.

PRISrun4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: PRISrun4
  • Participant: PRIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c7b591bc0313002a37e8816c50e2bce7
  • Run description: 1. The corpus after the query time is used in this run. 2. Web pages linked from tweets are used in this run. 3. The query system filters the desirable tweets automatically. 4. Irrelevant tweets are filtered out using the WAF (Word Activation Force) model.

QCRIwoTagOrg

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: QCRIwoTagOrg
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: e9f9bb1ea861e3eaf86c814ca7d7b816
  • Run description: No induced hashtags; original ranking.

QCRIwTagOrg

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: QCRIwTagOrg
  • Participant: QCRI
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b1784eb259eda2dd2b99af342d1f5f4c
  • Run description: Uses automatically induced hashtags; original ranking.

qHtagBaseRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: qHtagBaseRun
  • Participant: L3S
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: ddb5dcb3ce242408005b51b477cfc79c
  • Run description: This run represents a simple baseline that takes the words in the topic title and uses them as hashtags to select and rank the tweets (see the sketch below).
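A minimal sketch of this kind of title-word hashtag baseline, assuming tweets are plain strings; the tokenization, scoring, and tie-breaking here are illustrative assumptions, not the participants' actual code.

```python
# Hypothetical sketch: turn topic-title words into hashtags and rank tweets
# by how many of those hashtags they contain (illustrative, not the L3S code).

def rank_by_title_hashtags(topic_title, tweets):
    hashtags = {"#" + w.lower() for w in topic_title.split()}
    scored = []
    for tweet in tweets:
        tokens = {t.lower().strip(".,!?") for t in tweet.split()}
        overlap = len(hashtags & tokens)
        if overlap:
            scored.append((overlap, tweet))
    # Tweets sharing more title hashtags rank higher.
    return [t for _, t in sorted(scored, key=lambda x: -x[0])]

if __name__ == "__main__":
    tweets = ["Cuts debated at #BBC #World Service", "unrelated chatter"]
    print(rank_by_title_hashtags("BBC World Service cuts", tweets))
```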

qRefLThresh

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: qRefLThresh
  • Participant: syles
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 7d7552846d2cfa4ba3bf1682f4832939
  • Run description: MSR Web N-grams for query segmentation / reformulation / weighting; TF-IDF on the whole corpus (Lucene); language detector trained on Wikipedia.

refBalRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: refBalRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: b818d50b444a834b8c92925b417da58b
  • Run description: This is the result of query reformulation; we balance relevance and recency.

refRelRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: refRelRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 081f58f3cf19402ff3c1a94af2d07684
  • Run description: This is the result of query reformulation.

relevanceRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: relevanceRun
  • Participant: NUSIS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 2ffd2ddd5e2fe295e994bcdf57b60750
  • Run description: This is the basic run.

RFD

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RFD
  • Participant: Elly
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: a2045a1e9054bd4efed68c04340b0f8c
  • Run description: The RFD model is a pattern-based model. Unlike term-based models, the weight of terms in the RFD model is based on the weight of the extracted patterns. The top 10 tweets were selected as positive feedback. This pseudo-relevance feedback was used in the RFD model to generate the feature set. The feature set was then used to rank all the tweets, and the top 1000 ranked tweets were submitted as the final results.

ri

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ri
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 54eb009021921ab317f3a716ac45db4e
  • Run description: We use the JSON-format tweets to obtain user information and tweet descriptions. Our run re-ranks tweets by learning to rank, with careful attention to their topic and interestingness.

rit

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: rit
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 3d1576d11fb36ba2fd04ecfd69405eb9
  • Run description: We use the JSON-format tweets to obtain user information and tweet descriptions. Our run re-ranks tweets by learning to rank, with careful attention to their topic, interestingness, and time.

rit3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: rit3
  • Participant: KobeU
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 0f501622e54ae9bf98e4d001095babb3
  • Run description: We use the JSON-format tweets to obtain user information and tweet descriptions. Our run re-ranks tweets by learning to rank, with careful attention to their topic, interestingness, and time. We also use WordNet for query expansion.

RMITAR

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITAR
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 43595f1783eb6716124606040a0f224e
  • Run description: English dictionary to filter out non-English tweets.

RMITM

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITM
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 25743aa8a5020827d1e6a454d7f95ffe
  • Run description: External sources: an English dictionary to filter out non-English tweets.

RMITMR

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITMR
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: 029cb02fccff09deb252bc7595bcb68c
  • Run description: English dictionary to filter out non-English tweets.

RMITMRR

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RMITMRR
  • Participant: RMIT
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: bd5ce14d6d4b820f984266646a647726
  • Run description: English dictionary to filter out non-English tweets.

Rocchio

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Rocchio
  • Participant: Elly
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 8a50379eec1de900e8679b1136c971da
  • Run description: The Rocchio algorithm was used to build user profiles from pseudo-relevance feedback. The feedback was then used to re-rank all tweets, and the top 1000 ranked documents were submitted as the final results (a sketch follows below).
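A minimal sketch of Rocchio-style pseudo-relevance feedback over bag-of-words counts; the alpha/beta weights and the vector representation are illustrative assumptions, not the values used in this run.

```python
from collections import Counter

# Illustrative Rocchio sketch: build a profile from the query plus
# pseudo-relevant tweets, then re-rank all tweets by dot product with it.
def rocchio_profile(query_terms, feedback_tweets, alpha=1.0, beta=0.75):
    profile = Counter({t: alpha for t in query_terms})
    for tweet in feedback_tweets:
        tf = Counter(tweet.lower().split())
        for term, freq in tf.items():
            profile[term] += beta * freq / len(feedback_tweets)
    return profile

def rerank(profile, tweets, top_k=1000):
    def score(tweet):
        tf = Counter(tweet.lower().split())
        return sum(profile.get(term, 0.0) * freq for term, freq in tf.items())
    return sorted(tweets, key=score, reverse=True)[:top_k]
```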

RTB

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RTB
  • Participant: TUD_DMIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6c6ce650e0618c06c02e9a39d9518bf9
  • Run description: The index for query MB007 is polluted, while the other queries are run under the strict real-time condition. The scoring method balances the time dimension and the information dimension.

run1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run1
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/6/2011
  • Type: manual
  • Task: main
  • MD5: d77899e073f8b0925519c3225486a591
  • Run description: p@30

run1a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run1a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c09215a22f2c284fe27ba712ca440548
  • Run description: Pseudo relevance feedback, no weighting.

run1fix

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run1fix
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: manual
  • Task: main
  • MD5: e15b8b1136c619b750234725ef978eee
  • Run description: run1fix for p@30

run2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run2
  • Participant: ICTIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: manual
  • Task: main
  • MD5: a121efc2d844de57d23aba709de3d2f2
  • Run description: run2: combines author and cluster information.

run2a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run2a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 92f3b72c372adfc1324106d459a4c52e
  • Run description: Pseudo relevance feedback with weighting.

run3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run3
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3cb8117ee7a56c197df59fd6eb0eed36
  • Run description: Features: sum of query word IDF; has URL; URL MIME/text; URL MIME/audio; URL MIME/video; URL MIME/image; URL MIME/application; has hashtag (#); average word length of tweet; word length variance of tweet; has @; stopword percentage; BM25 score of tweet; TF score of tweet; TF-IDF score of tweet; BM25 score of URL title; TF score of URL title; TF-IDF score of URL title; BM25 score of URL page; TF score of URL page; TF-IDF score of URL page; language model score of tweet; document length; retweet count; tweet time.

run3a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run3a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2ad11f9c9b9ca5a79b752bf36c798532
  • Run description: This run uses patterns generated from the query and reweights them based on term frequency.

run4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run4
  • Participant: UCSC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: caf81b5b47b05c3e93d06f1efb4d6557
  • Run description: Features: sum of query word IDF; has URL; has hashtag (#); average word length of tweet; word length variance of tweet; has @; stopword percentage; BM25 score of tweet; TF score of tweet; TF-IDF score of tweet; language model score of tweet; document length; retweet count; tweet time.

run4a

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: run4a
  • Participant: QUT1
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 540aec82c862294382744629e1947265
  • Run description: This run uses patterns generated from the query, reweights them, and combines them with term weights based on term frequency.

RunAll

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RunAll
  • Participant: xmuPRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: c3996aef27f879bc7c9b8965ff0dafab
  • Run description: The index is built over all tweets, expanding tweet contents by extracting content from their linked web pages. Named entity extraction is used. Query expansion is done by extracting representative words from the most relevant web page, pseudo-relevance feedback, and representative hashtag keywords. Non-informative tweets are filtered by an ensemble ranking of author PageRank, author HITS authority, and pure retweets. Non-English tweets are filtered out.

RunFut

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RunFut
  • Participant: xmuPRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: adf99031ef313198daf392a9d5d936ba
  • Run description: The index is built over all tweets, including only tweet contents. Named entity extraction is used. Query expansion is done by pseudo-relevance feedback and representative hashtag keywords. Non-informative tweets are filtered by an ensemble ranking of author PageRank, author HITS authority, and pure retweets. Non-English tweets are filtered out.

runNeMIS

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: runNeMIS
  • Participant: NEMIS_ISTI_CNR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 2042cf44a94b170cb51dd55ba5f8d825
  • Run description: Python retrieval system based on the Whoosh library. The system uses three separate indexes: one with the text of the tweet; one with the distinct words composing any hashtag in the tweets (multi-word hashtags are automatically split with a Viterbi-based algorithm); and one with the titles of pages linked from tweets. The retrieval scores from the three indexes are linearly combined, and a filtering threshold is used to filter out low-score tweets (see the sketch below).
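A minimal sketch of the described score combination, assuming the per-index retrieval scores have already been computed; the mixing weights and the filtering threshold are placeholders, not the values actually used by the participants.

```python
# Illustrative combination of the three index scores with a filter threshold.
def combine_index_scores(text_score, hashtag_score, link_title_score,
                         weights=(0.6, 0.2, 0.2), threshold=0.1):
    combined = (weights[0] * text_score
                + weights[1] * hashtag_score
                + weights[2] * link_title_score)
    # Tweets whose combined score falls below the threshold are dropped.
    return combined if combined >= threshold else None
```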

runNeMISext

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: runNeMISext
  • Participant: NEMIS_ISTI_CNR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 7ed10b983f96eb27b0f4845300e61109
  • Run description: Python retrieval system based on the Whoosh library. The system uses three separate indexes: one with the text of the tweet; one with the distinct words composing any hashtag in the tweets (multi-word hashtags are automatically split with a Viterbi-based algorithm); and one with the titles of pages linked from tweets. This run uses a stylometric score function that compares each tweet with a word-distribution model extracted from a collection of Reuters news (external resource); this function makes it possible to filter out poorly written tweets. The retrieval scores from the three indexes and the stylometric score are linearly combined, and a filtering threshold is used to filter out low-score tweets.

RunPure

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: RunPure
  • Participant: xmuPRC
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 329028ff3a7e6fbe5842ac702f527494
  • Run description: The index is built over tweets posted before the query time, including only tweet contents. Named entity extraction is used. Query expansion is done by pseudo-relevance feedback on Lucene search results. Non-informative tweets are filtered by an ensemble ranking of author PageRank and pure retweets. Non-English tweets are filtered out.

scurtuRun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: scurtuRun1
  • Participant: Vitalie_Scurtu
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1c561553beedbd06c2fc7a296ad5954d
  • Run description: The system does some basic parsing to identify tweets/retweets, decomposes each tweet into a list of features extracted purely from text (such as mentions, links, hashtags, etc.), and performs language recognition. For querying, it uses a simple keyword-reduction strategy, and as a scoring formula it uses Lucene's DisMax query-document similarity.

sielrun1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun1
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 3e71246ee2177f69105e05224e65a645
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 2:3 (see the sketch below). Does not use any external sources.
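A minimal sketch of the stated 2:3 combination, assuming both scores are available and comparable in scale; normalizing by the weight sum is an assumption, since the description does not specify it.

```python
# Illustrative 2:3 weighting of the Lucene score and the k-means cluster score.
def sielrun1_score(lucene_score, cluster_score, lucene_w=2.0, cluster_w=3.0):
    # Normalizing by the weight sum is an assumption; the other sielrun
    # variants only change the ratio (5:2, 3:2, 7:2).
    return (lucene_w * lucene_score + cluster_w * cluster_score) / (lucene_w + cluster_w)
```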

sielrun2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun2
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: f63debe27b4cfcc68801f380e6e85e48
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 5:2. Does not use any external sources.

sielrun3

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun3
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 703ed32b19a47492e04d9a2792e1b65e
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 3:2. Does not use any external sources.

sielrun4

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sielrun4
  • Participant: SIEL_IIITH
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 31722c75077e7c0e59a18999b78a7eb2
  • Run description: Uses a combination of the Lucene tf-idf relevance score and a score generated by k-means clustering, weighting them in the ratio 7:2. Does not use any external sources.

SienaCL1B

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCL1B
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 0ec65fa9348180621d758d32480e18ef
  • Run description: Utilized content of links in tweets

SienaCL31

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCL31
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 578a65ea6a00fbbe1c82b0d356c90947
  • Run description: Utilized Google for the query expansion module

SienaCL342

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCL342
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: 563e765898ce0514ccbdffeaf86ab549
  • Run description: Content of URLs within tweets utilized; WEKA machine learning used (e.g. information retrieved via the Twitter API); query expansion module utilizing Google.

SienaCLbase

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: SienaCLbase
  • Participant: SienaCLTeam
  • Track: Microblog
  • Year: 2011
  • Submission: 8/8/2011
  • Type: automatic
  • Task: main
  • MD5: a500d58e3ef37e960ff7bb2029cceedc
  • Run description: Simple baseline run using Lucene. Non-English tweets removed; textese expanded; strict RTs removed.

simfoll

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: simfoll
  • Participant: UGLA_D
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 8a92b27554c36a11348f481ad6a9e8fb
  • Run description: 1) Full-text search through MySQL (this should be a flavour of BM25); results contain only tweets earlier than the tweet time given in the topics. 2) Removed straight retweets. 3) Boosted scores depending on the number of followers: score := score + log(#followers^(score/C) + 1), with C = 2; given that all users in our tweet set have followers, the use of log might not be ideal (a sketch of this boost follows below). 4) Removed tweets that contain substrings identical to higher-scored ones, where the length of the substring is > 60% of the length of the tweet (approximately).
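A direct transcription of the follower boost in step 3, assuming a natural logarithm (the log base is not specified in the description).

```python
import math

# Follower boost as stated: score := score + log(#followers^(score/C) + 1), C = 2.
def boost_by_followers(score, followers, C=2.0):
    return score + math.log(followers ** (score / C) + 1)

# Example: a tweet with base score 3.0 from a user with 500 followers.
print(boost_by_followers(3.0, 500))
```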

simfollTP01

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: simfollTP01
  • Participant: UGLA_D
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: b169bb09325e8b82cc28f7ad40d2cbef
  • Run description: Uses a novel temporal pseudo-relevance feedback technique (based on the retrieval of our other run, simfoll) to attempt to expand the query with terms that occur strongly in the same time periods. Term temporality was extracted in 2-hour intervals for the duration of the collection, with the algorithm attempting to identify and expand the query with other terms that had similar temporal characteristics. Lucene with a vector space retrieval model was used for the retrieval in this run.

sylesNoRes

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: sylesNoRes
  • Participant: syles
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: cba58274e229a8ca581af0bb396aa0ff
  • Run description: Cosine similarity with a modified tf-idf formula for term weights: weight(term) = 1 * log(1 / (df + 1)) / ave(tf) / (var(timef) + 1), where timef is the time frequency (see the sketch below). Documents not containing query terms that start with a capital letter (e.g. New York) are filtered out. Stopwords (the most frequent words, computed from tweets up to the query time) are filtered out. Identical tweets are filtered using MinHash.
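The term-weighting formula transcribed as stated, assuming a natural logarithm and a positive ave(tf); the statistics themselves (document frequency, average term frequency, variance of the time frequency) are assumed to be precomputed elsewhere.

```python
import math

# weight(term) = 1 * log(1 / (df + 1)) / ave(tf) / (var(timef) + 1), as stated.
def term_weight(df, avg_tf, timef_var):
    # df: document frequency; avg_tf: average term frequency (> 0 assumed);
    # timef_var: variance of the term's time frequency.
    return 1 * math.log(1.0 / (df + 1)) / avg_tf / (timef_var + 1)
```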

tfTP01

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: tfTP01
  • Participant: UGLA_D
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: c30413773b6275230fb3a32fce409926
  • Run description: Uses a novel temporal pseudo-relevance feedback technique (based on a TF-only retrieval) to attempt to expand the query with terms that occur strongly in the same time periods (see the sketch below). Term temporality was extracted in 2-hour intervals for the duration of the collection, with the algorithm attempting to identify and expand the query with other terms that had similar temporal characteristics. Only temporal information prior to the query was used by the temporal PRF algorithm (so this run conforms to the real-time requirements). Lucene with a TF-only vector space retrieval model was used for the initial and expanded retrieval in this run.
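A minimal sketch of temporal pseudo-relevance feedback with 2-hour bins; the use of cosine similarity to compare temporal profiles is an assumption, since the description only says the expansion terms have "similar temporal characteristics".

```python
import math
from collections import defaultdict

def temporal_profile(timestamps, bin_seconds=2 * 3600):
    """Histogram of a term's occurrences in 2-hour bins (Unix timestamps)."""
    profile = defaultdict(int)
    for ts in timestamps:
        profile[int(ts // bin_seconds)] += 1
    return profile

def cosine(p, q):
    dot = sum(v * q.get(b, 0) for b, v in p.items())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

def expand_query(query_terms, term_timestamps, k=5):
    """Add the k terms whose temporal profiles best match the query's profile."""
    query_profile = defaultdict(int)
    for term in query_terms:
        for b, c in temporal_profile(term_timestamps.get(term, [])).items():
            query_profile[b] += c
    candidates = [(cosine(query_profile, temporal_profile(times)), term)
                  for term, times in term_timestamps.items()
                  if term not in query_terms]
    return list(query_terms) + [t for _, t in sorted(candidates, reverse=True)[:k]]
```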

udelIndri

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: udelIndri
  • Participant: udel
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 5a8ba18aefa7aeb66fe1579acbfc319a
  • Run description: Basic Indri run.

udelLucene

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: udelLucene
  • Participant: udel
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: edc77285fea3fb5fa418b571833774b9
  • Run description: Basic Lucene run.

UDMicroComb1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroComb1
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 879f50d5dc8bdca64b38bf750c16944d
  • Run description: Use time-sensitive weighting to favor tweets in "popular discussed" period. Use document-length-weighting to favor long and high-term-IDF tweets in order to improve "interestingness" (The only future information we used is term IDF). Use pseudo feedback.

UDMicroComb2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroComb2
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: a54a8349d750deff767d4d31331210d7
  • Run description: Use time-sensitive weighting to favor tweets in "popularly discussed" periods. Use document-length weighting to favor long, high-term-IDF tweets in order to improve "interestingness" (the only future information we used is term IDF). Use Yahoo's search results for query expansion.

UDMicroIDF

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroIDF
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/9/2011
  • Type: automatic
  • Task: main
  • MD5: d3b7095102af99d3ea98abc1fcbe2f3f
  • Run description: Use time-sensitive weighting to favor tweets in "popularly discussed" periods.

UDMicroIDFD

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UDMicroIDFD
  • Participant: Udel_Fang
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 82ca89d80100ab264b00a5bf6c6a03d3
  • Run description: Use time-sensitive weighting to favor tweets in "popularly discussed" periods. Use document-length weighting to favor long, high-term-IDF tweets in order to improve "interestingness" (the only future information we used is term IDF).

uicir1

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uicir1
  • Participant: UICIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: e0dffd23595f54c73c9fefb6abe5ed47
  • Run description: We use Wikipedia and Google to conduct query expansion. Moreover, Wikipedia is used to extract related concepts.

uicir2

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uicir2
  • Participant: UICIR
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: e01d15a7c22b2c9300bddfd2be2bac89
  • Run description: We use Wikipedia and Google to conduct query expansion. Moreover, Wikipedia is used to extract related concepts.

UIowaS1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS1
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 222ccd3b627801613484df2a449ec050
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs, and was indexed using Indri. No external resources were used in this run, though it is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). The results were constrained temporally according to the query date, and duplicate tweets were removed. Finally, the top 30 results were ordered temporally.

UIowaS2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS2
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: bb2fbbad3a72d3a30877ce43dc43fa0d
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs. External resources included expanded URLs (plus the title, description, and keywords from the pages they refer to), as well as definitions of tags using tagdef.com. It was indexed using Indri. It is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). The results were constrained temporally according to the query date, and duplicate tweets were removed. Finally, the top 30 results were ordered temporally.

UIowaS3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS3
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 1f092d376d8630bc90d40b19b939f8fc
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs. External resources included expanded URLs (plus the title, description, and keywords from the pages they refer to), as well as definitions of tags using tagdef.com. It was indexed using Indri. It is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). For queries that had results from the conservative strategies, we performed query expansion by appending the most frequent capitalized word found in the returned tweets that is not in the original query and is at least as frequent as one of the original query terms (a sketch of this expansion step follows below). The results were constrained temporally according to the query date, and duplicate tweets were removed. Finally, the top 30 results were ordered temporally.
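A minimal sketch of the capitalized-word expansion step, assuming whitespace tokenization; it is not the participants' actual implementation.

```python
from collections import Counter

# Append the most frequent capitalized word from the returned tweets that is
# not already in the query and is at least as frequent as one original query term.
def expand_with_capitalized_word(query_terms, returned_tweets):
    counts = Counter(w for tweet in returned_tweets for w in tweet.split())
    query_counts = [counts[t] for t in query_terms]
    if not query_counts:
        return list(query_terms)
    candidates = [(c, w) for w, c in counts.items()
                  if w[:1].isupper() and w not in query_terms
                  and c >= min(query_counts)]
    if not candidates:
        return list(query_terms)
    return list(query_terms) + [max(candidates)[1]]
```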

UIowaS4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UIowaS4
  • Participant: UIowaS
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 4c60733fb4e91670de30a6430f51ba8c
  • Run description: The dataset was pre-processed by extracting hashtags, mentions, and URLs. External resources included expanded URLs (plus the title, description, and keywords from the pages they refer to), as well as definitions of tags using tagdef.com. It was indexed using Indri. It is not a strict real-time run, in that the queries were run against the index of the whole dataset. The run consists of a merged set of results, ranging from most conservative to least. The least conservative (OR) set of results was filtered using the presence of capitalized query words (as indicators of important entities in the query). For queries that had results from the conservative strategies, we performed query expansion by appending the most frequent capitalized word found in the returned tweets that is not in the original query and is at least as frequent as one of the original query terms. The results were constrained temporally according to the query date, and duplicate tweets were removed. The results are NOT sorted temporally, but by relevance instead (which is the only difference from the UIowaS3 run).

uiucsf

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uiucsf
  • Participant: uiuc
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 9756e1111fdd36a7ac71f9d55fedca78
  • Run description: Mixture model of causal potential and standard tf-idf scoring.

uogTrLqea

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrLqea
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 49bf34e568dc2f3cd4abbbb7a3e51465
  • Run description: Learned run using 66 real-time non-external features

uogTrLqeabd

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrLqeabd
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: a78d5aee1c52e88d3d22481fb856c980
  • Run description: Learned run using 76 features, including content linked from tweets.

uogTrLqeabdd

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrLqeabdd
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: fa0fccbf0c1901c6ce8bc438cd08555d
  • Run description: Learned run using 76 real-time non-external features where the objective function directly tries to trade off relevance and recency

uogTrUB2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: uogTrUB2
  • Participant: uogTr
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 9af36188e5379e3d0f21dfbdf978cad7
  • Run description: Filtering run

UTBase

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTBase
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: f68b57b10f11184032f6d3512ce16b11
  • Run description: Baseline run that performs a standard Lucene free text search over the content field of all tweets that exist up to the timestamp of each query. Performs only basic query pre-processing and matching: stopword removal+lowercasing and uses Lucene's StandardTokenizer. Uses a strict incremental index for each query. Applies (repeated) query expansion based on the tweet content if an original TREC query does not yield enough results.

UTBaseRTF

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTBaseRTF
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 801243b48a7bf536f318b4f829e77474
  • Run description: Performs a standard Lucene free text search over the content field of all tweets that exist up to the timestamp of each query, but uses a full index created for all tweets. Prefers tweets that have been retweeted one or more times, but falls back to unretweeted tweets if this yields an insufficient number of results. If this still does not yield enough results, (repeated) query expansion is applied based on the content of already obtained tweets. Basic query processing involves stopword removal+lowercasing and Lucene's StandardTokenizer.

UTWngFuture

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTWngFuture
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: b986f08903d6d090899be016f5f2ba3d
  • Run description: Performs a word n-gram search over the content field of all tweets that exist up to the timestamp of each query using Lucene, but uses a full index created for all tweets. The value of n depends on the length in words of the input query and is almost always ceil(word_count(query)), except when word_count(query) is 2, in which case n is fixed to two. The generated word n-grams are posed as an AND query to the system and then interleaved to yield a final result list (see the sketch below). If there are too few results, each query word is submitted as a query, and if that still does not yield enough results, (repeated) query expansion is applied based on the content of already obtained tweets. Basic query processing involves stopword removal, lowercasing, and Lucene's StandardTokenizer.
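A minimal sketch of the n-gram query generation and the interleaving of per-n-gram result lists; the round-robin interleaving and deduplication are assumptions, not necessarily how the run merged its results.

```python
def word_ngrams(query, n=None):
    """Word n-grams of the query; n defaults to the query length (as described),
    except that length-2 queries keep n = 2."""
    words = query.split()
    if n is None:
        n = 2 if len(words) == 2 else len(words)
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def interleave(result_lists):
    """Round-robin interleave the per-n-gram result lists into one ranking."""
    merged, seen = [], set()
    for rank in range(max((len(r) for r in result_lists), default=0)):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged
```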

UTWngFutureQ

Results | Participants | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: UTWngFutureQ
  • Participant: utwente
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 19e175fcd6ecacbf8d63c7778c495b43
  • Run description: Performs a word n-gram search over the content field of all tweets that exist up to the timestamp of each query, using Lucene with a full index created for all tweets. Prefers tweets that meet four quality criteria: contains either a hashtag or a URL; is not directed at more than three people (with the @ symbol); does not consist of more than 50 percent ALL-CAPS words; and does not contain repeated exclamation marks (a sketch of this filter follows below). The value of n used for the word n-grams depends on the length in words of the input query and is almost always ceil(word_count(query)), except when word_count(query) is 2, in which case n is fixed to two. The generated word n-grams are posed as an AND query to the system and then interleaved to yield a final result list. If there are too few results, each query word is submitted as a query, and if that still does not yield enough results, (repeated) query expansion is applied based on the content of already obtained tweets. Basic query processing involves stopword removal, lowercasing, and Lucene's StandardTokenizer.
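A minimal sketch of the four quality criteria as a filter predicate; the tokenization and the repeated-exclamation check are illustrative assumptions.

```python
import re

def passes_quality_criteria(tweet):
    """True if the tweet contains a hashtag or URL, mentions at most three
    people, is at most 50 percent ALL-CAPS words, and has no repeated '!'."""
    words = tweet.split()
    has_tag_or_url = "#" in tweet or "http" in tweet
    mentions = tweet.count("@")
    caps = sum(1 for w in words if len(w) > 1 and w.isupper())
    caps_ratio = caps / len(words) if words else 0.0
    repeated_bang = bool(re.search(r"!{2,}", tweet))
    return has_tag_or_url and mentions <= 3 and caps_ratio <= 0.5 and not repeated_bang
```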

waterlooa1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa1
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 25fba9b6fdfe5c58f4d467a1fb0e7a0f
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices respecting the time constraints. Combines results using reciprocal rank fusion (see the sketch below).
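A minimal sketch of reciprocal rank fusion over the per-query result lists; k = 60 is the commonly used constant, not necessarily the value used in this run.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Each document gets 1 / (k + rank) from every ranking it appears in;
    documents are returned sorted by the fused score, highest first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```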

waterlooa2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa2
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 1d83ce46435fe7be6389a0469ed5145a
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices in which only HTTP status 200 tweets are used, respecting the time constraints. Combines results using reciprocal rank fusion. Due to an error in how such tweets were selected, it cannot be guaranteed that all tweets used chronologically preceded the query tweet time, but all tweets returned are earlier than the time constraint.

waterlooa3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa3
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 668e0b977accf3080fcb4110cdb5247c
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices respecting the time constraints. Queries were issued using pseudo-relevance feedback (Okapi- and KLD-type feedback), and the language model used was from a previous Terabyte track. Combines results using reciprocal rank fusion.

waterlooa4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: waterlooa4
  • Participant: waterloo
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: a7ce41e7fdfcf31ba44e5a4b22d438e8
  • Run description: Uses the Wumpus search engine to issue multiple queries against indices respecting the time constraints. Combines results using reciprocal rank fusion. Following this, the results were re-ranked with respect to recency, i.e. the RRF score was multiplied by a recency factor.

WESTfilext

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTfilext
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 6698265a2c3ef5ff77ca38ff22fe4121
  • Run description: Used ANEW sentiment vocabulary to compute the high-level feature (sentiment) of the tweet.

WESTfilter

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTfilter
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 48fee9dfc8a867da641043a80c8252be
  • Run description: This run is created purely using internal knowledge that is available at the time of query. No use of web pages linked from tweets.

WESTrelint

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTrelint
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/10/2011
  • Type: automatic
  • Task: main
  • MD5: 7842ecdefa8df62eb24e80d4e7e18d1e
  • Run description: Real time, no use of external knowledge, no use of web pages linked from tweets.

WESTrlext

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WESTrlext
  • Participant: WeST
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 14a344ab4b4970e40567b7fd597d2028
  • Run description: Used ANEW sentiment vocabulary to compute the high-level feature (sentiment) of the tweet.

Wise2ndRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: Wise2ndRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 5034c44f8df0dbcd0db56d38434e4c6d
  • Run description: 1. Language model based retrieval 2. Query expansion

WiseFifthRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WiseFifthRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 6018662af8b9237cc35bcf1c8b98b7a8
  • Run description: 1. Language model based retrieval 2. Topic classification based on the returned results 3. Result re-ranking strategy for emerging topics (different from the fourth run)

WiseFouthRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WiseFouthRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: ecab4cad81c0fa0941772dc7642bd6b6
  • Run description: 1. Language model based retrieval 2. Topic classification based on the returned results 3. Result re-ranking for emerging topics

WiseThirdRun

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: WiseThirdRun
  • Participant: SEEM_CUHK
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 4023e21d0665040e464e1fb572715e74
  • Run description: 1. Language model based retrieval 2. Normalize all the tweets in the corpus

ya3

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ya3
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: c5cf5b7a565c55fe385ea3d59a730c84
  • Run description: Query extension; user social features, such as the number of followers, etc.; textual quality and diversity of the tweets (query-independent); emotion features of the tweets; text features of the headers of external links; ranking scores obtained from a boosted-trees regression algorithm.

ya4

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: ya4
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/12/2011
  • Type: automatic
  • Task: main
  • MD5: 4effd0c57d54f17af70338f2e6e0e132
  • Run description: Query extension; user social features, such as the number of followers, etc.; textual quality and diversity of the tweets (query-independent); emotion features of the tweets; text features of the headers of external links; ranking scores obtained from a boosted-trees classification algorithm.

YNDXTPC1

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: YNDXTPC1
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: 08246052e6bb8ebac13810d1c46eeaf6
  • Run description: Just query expansion using tweets posted before the query time.

YNDXTPC2

Results | Participants | Proceedings | Input | Summary (highrel) | Summary (allrel) | Appendix

  • Run ID: YNDXTPC2
  • Participant: yandex
  • Track: Microblog
  • Year: 2011
  • Submission: 8/11/2011
  • Type: automatic
  • Task: main
  • MD5: ad201ddcb1e660f8f3e537e21f6bdd74
  • Run description: Just query expansion using tweets posted before the query time.