Runs - Microblog 2014

1unique2

  • Run ID: 1unique2
  • Participant: uog_twteam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: a6b41e7e8707f84c7e36e3d99f336038
  • Run description: A cluster-based approach which makes use of nouns and verbs only to perform clustering.

3unique0

  • Run ID: 3unique0
  • Participant: uog_twteam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: f01272b8e7fe15a72a8205755c7f0f73
  • Run description: A more relaxed cluster-based approach using only nouns and verbs.

3unique2

  • Run ID: 3unique2
  • Participant: uog_twteam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 52e5ad955a3c4a22280f654340bcc782
  • Run description: A cluster-based approach which uses only nouns and verbs.

baselineRaw

  • Run ID: baselineRaw
  • Participant: uiucGSLIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ffc3239871bc206f618cd4b239446c60
  • Run description: Official baseline using raw API output.

baselineRM3

  • Run ID: baselineRM3
  • Participant: uiucGSLIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7e0bb31fe90f5be648ff84eced423ffb
  • Run description: Official baseline using RM3.

ECNURankLib

  • Run ID: ECNURankLib
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 73d24ba9ee887994e986111891a53789
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use RankLib. The training sets are the 2011 and 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document (illustrated below).
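
A minimal sketch of the tf-idf feedback-term selection described above, assuming scikit-learn (which this group's ECNUSVM run also mentions). The sample documents, the averaging of weights across documents, and the function name expansion_terms are illustrative assumptions, not details of the actual run.

    # Hypothetical sketch: choose expansion terms by tf-idf weight from
    # documents retrieved via a web search for the query.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer

    def expansion_terms(web_docs, top_k=20):
        # web_docs: texts of the documents returned for one query
        vec = TfidfVectorizer(stop_words="english")
        X = vec.fit_transform(web_docs)              # docs x vocabulary
        mean_w = np.asarray(X.mean(axis=0)).ravel()  # average tf-idf per term
        terms = vec.get_feature_names_out()
        return [terms[i] for i in np.argsort(mean_w)[::-1][:top_k]]

    print(expansion_terms(["oscar nominees announced today",
                           "full list of academy award nominees"], top_k=5))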

ECNURankLib2013

  • Run ID: ECNURankLib2013
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ea96edce6819cb82980b9808d99d774e
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use RankLib. The training set is the 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document.

ECNUSVM

  • Run ID: ECNUSVM
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: e1a248959bf6285e258bc1736a449425
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use the SVM from the sklearn toolkit. The training sets are the 2011 and 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document.

ECNUSVM2013

  • Run ID: ECNUSVM2013
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 633c8ef8202cb1c6b9abb70993930d4c
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use the SVM from the sklearn toolkit. The training set is the 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document.

EM100

  • Run ID: EM100
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 91729890380aad420ee3a5cbe458b949
  • Run description: Used cosine similarity to detect similar tweets in the top 100 results of each topic (illustrated below).
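
A minimal sketch of this kind of cosine-similarity duplicate detection, assuming tf-idf vectors and a 0.8 cutoff; the run does not state its vectorization or threshold, so both are illustrative.

    # Hypothetical sketch: keep a tweet only if it is not too similar to any
    # previously kept tweet among a topic's top results.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def filter_similar(tweets, threshold=0.8):
        sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
        kept = []
        for i in range(len(tweets)):
            if all(sims[i, j] < threshold for j in kept):
                kept.append(i)
        return [tweets[i] for i in kept]

    # the second tweet duplicates the first and is dropped
    print(filter_similar(["oscar nominees announced",
                          "Oscar nominees announced!",
                          "super bowl kickoff time"]))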

EM50

  • Run ID: EM50
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 9b3ec30244cb948dc079b3abad4bf802
  • Run description: Used cosine similarity to detect similar tweets in the top 50 results of each topic

ER

  • Run ID: ER
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 03e85ee0f026b1139ef82003e1ac507f
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, and a retweet filter that removes tweets starting with RT.

ERL

  • Run ID: ERL
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3aab5c7b0596eb92ea556d0f451659b7
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, a retweet filter that removes tweets starting with RT, and a language filter that removes non-English tweets using Cybozu's language detector.

ERLU

  • Run ID: ERLU
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7d669d888c4cb610533889490ab4583f
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, a retweet filter that removes tweets starting with RT, a language filter that removes non-English tweets using Cybozu's language detector, and URL boosting with a factor of 1.1.

ERU

  • Run ID: ERU
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 8d4848c602aa9bb31da1e601cd27dbe9
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, a retweet filter that removes tweets starting with RT, and URL boosting with a factor of 1.1.

hltcoe0

  • Run ID: hltcoe0
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ac3585bcc46bc1a97f59264df71bdd02
  • Run description: Baseline run using API.

hltcoe1

  • Run ID: hltcoe1
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 952aaf0766203ccd2bae782538d52dce
  • Run description: Query expansion from top Google search snippets, followed by corpus pseudo-relevance feedback. The Google search is customized to return only results available by the query time. The corpus PRF is multi-staged.

hltcoe2

  • Run ID: hltcoe2
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: f31dd8a0924361b9761130280f87b421
  • Run description: Starting from the hltcoe1 run (query expansion from Google search and multi-stage PRF), rerank the returned tweets with the Coordinate Ascent learning-to-rank algorithm using multiple query-dependent and query-independent features.

hltcoe3

  • Run ID: hltcoe3
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 19501623afb77c825cd091e575de93a5
  • Run description: Similar to hltcoe2, this is also a rerank of the hltcoe1 result tweets. The difference is that, if an embedded tweet link can be crawled and analyzed correctly within 3 seconds, features from this tweet expansion are used in the rerank. In short, this run uses more tweet-expansion-based features than hltcoe2.

hltcoeTTG0

  • Run ID: hltcoeTTG0
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 5469a6f63931b1e8ecac76aaa0647262
  • Run description: From the ad-hoc hltcoe3 run, this TTG run simply returns the top 90 relevant tweets as results. Since hltcoe3 uses external resources such as Google search and tweet-expansion features, so does this run.

hltcoeTTG1

  • Run ID: hltcoeTTG1
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 32aa65351199a27fc369dbce8c569d16
  • Run description: On the basis of hltcoeTTG0, deduplicate tweets against the previously selected tweets, which are selected greedily by thresholding the maximum cosine similarity of a tweet to each previously selected tweet. Tweets are processed in order of the relevance scores calculated from hltcoe4.

hltcoeTTG2

  • Run ID: hltcoeTTG2
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: dabb0fb45b8e7885de12fed85bc378d7
  • Run description: Similar to hltcoeTTG1, but the difference is that the deduplication is made by a binary novelty decision over multiple features through an SVM binary classifier. Since the input is again from hltcoe4, which used external Google search and tweet expansion, so is this run. In addition, some novelty features are calculated using the content linked from tweets.

hltcoeTTG3

  • Run ID: hltcoeTTG3
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: a42e98ec4b6a26429a3441af5302b47b
  • Run description: From the hltcoe4 results, perform near-deduplication with shingling/hashing.

HPRF1020

  • Run ID: HPRF1020
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 32cd484acf0b4b13813b54f812386168
  • Run description: Hyperlink-based PRF

HPRF1020RR

  • Run ID: HPRF1020RR
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1b90883d7a724a401b23632baf4f43df
  • Run description: Hyperlink-based PRF + reranking

ICARUN1

  • Run ID: ICARUN1
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 61fc6b00cc68bdce84b4264340cfe1ca
  • Run description: Weighted combination of 8 models in both Indri and Terrier; 3 query expansion methods: timely Google search with respect to each query, corpus-based tf-idf, and embedded PRF.

ICARUN2

  • Run ID: ICARUN2
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 091ba48357413403101250658e092da3
  • Run description: Combination of 8 models in both Indri and Terrier; 3 query expansion methods: timely Google search with respect to each query, corpus-based tf-idf, and embedded PRF. Similar to ICARUN1, but with different weights.

ICARUN3

  • Run ID: ICARUN3
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fe21f5805bd91d1074ef949699f002d8
  • Run description: Weighted combination of 3 models in Indri and Terrier; 3 query expansion methods: timely Google search with respect to each query, corpus-based tf-idf, and embedded PRF.

ICARUN4

  • Run ID: ICARUN4
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 724f787e2c0c661a47affb6b4b71b0aa
  • Run description: DFRee model with embedded PRF; URLs considered.

ICTNETAP3

  • Run ID: ICTNETAP3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 4a4688f2b541cb6512335cfeed8eb1a3
  • Run description: AP

ICTNETAP4

  • Run ID: ICTNETAP4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: abf5b8d18dd7742873e1d8480104545f
  • Run description: AP

ICTNETRUN1

  • Run ID: ICTNETRUN1
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 9156e58ccecb95a19eaf32a951c39182
  • Run description: This run uses learning to rank to train a model and sort the tweets. The features cover many aspects, e.g., VSM score. We do not use external resources.

ICTNETRUN2

  • Run ID: ICTNETRUN2
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 874b60d820a3820302a728757cddf8b1
  • Run description: This run uses learning to rank to train a model and sort the tweets. The features cover many aspects, e.g., VSM score. We do not use external resources.

ICTNETRUN3

  • Run ID: ICTNETRUN3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3f2143fca3f982473374cb5789995303
  • Run description: This run removes tweets with 'RT' at the beginning of the text.

ICTNETRUN4

  • Run ID: ICTNETRUN4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 5673fd12464b0b9d5e89dafe6481f848
  • Run description: This run removes tweets with 'RT' at the beginning of the text and uses learning to rank to re-sort the tweets.

ICTNETRunSP3

  • Run ID: ICTNETRunSP3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: c1db0b18f7eedf266873ca737517a31b
  • Run description: SP

ICTNETRUNSP4

  • Run ID: ICTNETRUNSP4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 52cab46b1c4e6d5dbb789b9a5e3d2acf
  • Run description: SP

JufeLdkeAdhoc1

  • Run ID: JufeLdkeAdhoc1
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 780a01c02132f11b947ff0cd3355bcbe
  • Run description: A baseline that uses the results returned by the official API.

JufeLdkeAdhoc2

  • Run ID: JufeLdkeAdhoc2
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 31c9e94c96d7a0d6bca9539bb7f1759b
  • Run description: The baseline results come from the official API and are reordered by timeline so that newer tweets receive higher scores. We use the baseline results simply to check the method for the Timeline Generation task.

JufeLdkeAdhoc3

  • Run ID: JufeLdkeAdhoc3
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1f3be6a4e0c7d9e6229e77a450c1c7c6
  • Run description: The baseline results come from the official API and are reordered by timeline so that older tweets receive higher scores.

JufeLdkeSum1

  • Run ID: JufeLdkeSum1
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: ttg
  • MD5: a6ffac89a045dcc6d2a4aebae12b904c
  • Run description: A baseline that uses the results returned by the official API.

JufeLdkeSum2

  • Run ID: JufeLdkeSum2
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: ttg
  • MD5: cf2db544c3fe726b61ddae2a9e337c5a
  • Run description: Uses MMR to delete the redundant tweets posted later.

JufeLdkeSum3

  • Run ID: JufeLdkeSum3
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: ttg
  • MD5: 9e47bb3e14711e322df1a111c6bca92b
  • Run description: Uses MMR on the 2k tweets retrieved by the official API and deletes the redundant tweets posted later.

NCOS

  • Run ID: NCOS
  • Participant: BJUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 82680ea4db20b9d8945cf2df0950dbde
  • Run description: null

NewBee

  • Run ID: NewBee
  • Participant: zhg15
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: f58a8b8b378b50d62a7d7fa2e502b6e9
  • Run description: We use the Google API to expand topic queries. We crawled the first 10 result pages from Google using the original queries, then used a tf-idf method to select the top ten words, ranked by tf-idf weight, as the query expansion. We checked the expansion words manually and found them suitable for the original queries.

NovaRun0

  • Run ID: NovaRun0
  • Participant: NovaSearch
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fadff4ebf81b0dc57ff1ea302604975c
  • Run description: Uses retweet filtering and language filtering based on ldig. Uses the RM3 method for pseudo-relevance feedback. Temporal reranking with KDE over the retrieved documents' timestamps (illustrated below).
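
A minimal sketch of temporal re-ranking with kernel density estimation, assuming a Gaussian kernel over tweet timestamps and a multiplicative combination with the retrieval score; NovaSearch's actual kernel and combination rule are not given here.

    # Hypothetical sketch: boost each tweet's score by the estimated temporal
    # density at its own timestamp.
    import numpy as np
    from scipy.stats import gaussian_kde

    def temporal_rerank(scores, timestamps):
        ts = np.asarray(timestamps, dtype=float)
        density = gaussian_kde(ts)(ts)   # density evaluated at each timestamp
        return [s * d for s, d in zip(scores, density)]

    # toy usage: three tweets cluster near t=100, one outlier at t=500
    print(temporal_rerank([1.0, 0.9, 0.8, 0.7], [100, 101, 103, 500]))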

NovaRun1

  • Run ID: NovaRun1
  • Participant: NovaSearch
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: manual
  • Task: adhoc
  • MD5: 430013645dd82f2d8722cf318fee93fc
  • Run description: Uses retweet filtering and language filtering based on ldig. Uses Wikipedia page-view counts (daily aggregates, only from days before the query time). Uses the RM3 method for pseudo-relevance feedback.

NovaRun2

  • Run ID: NovaRun2
  • Participant: NovaSearch
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: manual
  • Task: adhoc
  • MD5: 49226724d60cd7a053a92252834da891
  • Run description: Uses retweet filtering and language filtering based on ldig. Uses Wikipedia page-view counts (daily aggregates, only from days before the query time). Uses the RM3 method for pseudo-relevance feedback. Uses the timestamps of the retrieved documents and of the page views.

NSIM

  • Run ID: NSIM
  • Participant: BJUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7453be203e401a16c431e33b626978d4
  • Run description: null

OSIM

  • Run ID: OSIM
  • Participant: BJUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: adb5640d721b32d28e240cc3f96a3eef
  • Run description: null

PKUICST1

  • Run ID: PKUICST1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 0c3c62ebc58a6f6f85db850b6b539b27
  • Run description: Run based on a learning-to-rank framework. Features (up to 133) include different relevance scores (e.g., language model, TFIDF, BM25) between query and document, plus tweet-quality features. Note that we expand the query with web resources, pseudo-relevance feedback, etc. The external resource used here is the Google search result (with a time limitation) for each query. All relevance scores are computed from the official API results and the provided term statistics.

PKUICST2

  • Run ID: PKUICST2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 0cf49a3718b30253a8ad23e747665f0d
  • Run description: Run based on a learning-to-rank framework. Features (up to 130) include different relevance scores (e.g., language model, TFIDF, BM25) between query and document, plus tweet-quality features. Note that we expand the query with web resources, pseudo-relevance feedback, etc. The external resource used here is the Google search result (with a time limit) for each query. All relevance scores are computed from a local copy of the official corpus. We do more pre-processing, such as non-English tweet removal, stemming, etc.

PKUICST3

  • Run ID: PKUICST3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1604137dc999f222205415e4fd352297
  • Run description: Run based on a learning-to-rank framework. Features include all the features used in PKUICST1 and PKUICST2 (253 in total). The candidates are based on the API results; that is, the local features are added as a different view of each tweet (a different pre-processing of the corpus). The external resource used here is the Google search result (with a time limit) for each query.

PKUICST4

  • Run ID: PKUICST4
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 043749ffd1cdc77b367dde5cffef8d0e
  • Run description: Run based on a language-modeling framework. We expand each query with the Google web resource (with a time limitation) and two-stage pseudo-relevance-feedback query expansion. The relevance score is computed from the official API result and the provided term statistics.

PolyURun1

  • Run ID: PolyURun1
  • Participant: POLYUCOMP
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: dad75f2c41cfa5e19facca04ed17e294
  • Run description: This is the first run file of The Hong Kong Polytechnic University.

PolyURun2

  • Run ID: PolyURun2
  • Participant: POLYUCOMP
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 225c8e7d4675092eb7b5ad2ebd8c17a0
  • Run description: We use Google search results to help us conduct query expansion.

PolyURun3

  • Run ID: PolyURun3
  • Participant: POLYUCOMP
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: de76f44f331d720d6df451a64480a721
  • Run description: We use Google search results to help us conduct query expansion. Meanwhile, we utilize PRF to conduct query expansion. We combine these two methods in this run.

PRF1030

  • Run ID: PRF1030
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1c8b55353f5698de844ecf7ad7a08144
  • Run description: PRF

PRF1030RR

  • Run ID: PRF1030RR
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: b17408b94db0da23592c8908476c2393
  • Run description: PRF + reranking

Pris2014a

  • Run ID: Pris2014a
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: adhoc
  • MD5: c48054a82b9818f33bdbe218a49c5bb6
  • Run description: Based on the baseline dataset returned by the search API, we use tf for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

Pris2014b

  • Run ID: Pris2014b
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 6782f1d0372abcf42470da4284475864
  • Run description: Based on the baseline dataset returned by the search API, we use sequential patterns for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

Pris2014c

  • Run ID: Pris2014c
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3f5ebf44d1ef478ca235f071ea1ea470
  • Run description: Based on the baseline dataset returned by the search API, we use waf for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

Pris2014e

  • Run ID: Pris2014e
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/6/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 51ef13433c341e7cc80756010c83bbc7
  • Run description: Based on the raw dataset returned by the search API, we use tf for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

PrisTTG2014a

  • Run ID: PrisTTG2014a
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: 43c438906a758880b07eb50d4a321544
  • Run description: Based on the result of baseline+tf+url, we use SimHash to cluster tweets and select the higher-scoring tweets from the larger clusters (illustrated below).
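
A minimal sketch of SimHash fingerprinting for near-duplicate grouping; the 64-bit fingerprint, unit token weights, and the MD5 token hash are illustrative choices, not the run's actual configuration.

    # Hypothetical sketch: tweets whose fingerprints lie within a small
    # Hamming distance are treated as near-duplicates (one cluster).
    import hashlib

    def simhash(text, bits=64):
        v = [0] * bits
        for token in text.lower().split():
            h = int(hashlib.md5(token.encode()).hexdigest(), 16)
            for i in range(bits):
                v[i] += 1 if (h >> i) & 1 else -1
        return sum(1 << i for i in range(bits) if v[i] > 0)

    def hamming(a, b):
        return bin(a ^ b).count("1")

    print(hamming(simhash("oscar nominees announced"),
                  simhash("oscar nominees are announced")))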

PrisTTG2014b

  • Run ID: PrisTTG2014b
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: 5cf9195070c21e2258ca5ce9fc82f639
  • Run description: Based on the result of baseline+waf+url, we use SimHash to cluster tweets and select the higher-scoring tweets from the larger clusters.

PrisTTG2014c

  • Run ID: PrisTTG2014c
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: 38aadb36af0a1f981e09bffac2b67c89
  • Run description: Based on the result of baseline+sequential pattern+url, we use sequential patterns to cluster tweets and select the higher-scoring tweets from the larger clusters.

PrisTTG2014e

  • Run ID: PrisTTG2014e
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: c4c226e7923754e4ca6e9008ddd05281
  • Run description: Based on the result of baseline+tf+url, we use sequential patterns to cluster tweets and select the higher-scoring tweets from the larger clusters.

QUQEd10t15TTgCL

  • Run ID: QUQEd10t15TTgCL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 260e4337068c312b98a1163b1d211169
  • Run description: Selecting top tweets in clusters of results of a retrieval system using PRF-based query expansion. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUQEd5t25TTgBL

  • Run ID: QUQEd5t25TTgBL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: d6c87adf8f9e750b4a142e8d5d9297fc
  • Run description: Selecting top documents in results of a retrieval system using PRF-based query expansion. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUQueryExp10D15

  • Run ID: QUQueryExp10D15
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 5767da3e9a97cb8be7ab7eec790b0a0d
  • Run description: Query expansion using PRF. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUQueryExp5D25T

  • Run ID: QUQueryExp5D25T
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 289a54296cd3a1ad56231b5ed86de229
  • Run description: Query expansion using PRF. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUTmpDecay

  • Run ID: QUTmpDecay
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7a7e518f8fe32fb76086b7c14a93801f
  • Run description: Temporal re-ranking using temporal exponential decay (illustrated below). Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.
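
A minimal sketch of exponential temporal decay, assuming the decay is applied multiplicatively to the retrieval score and that age is measured back from the query time; the decay rate lam is an illustrative parameter.

    # Hypothetical sketch: discount each tweet's score by its age.
    import math

    def decay_rerank(scores, tweet_times, query_time, lam=0.05):
        # age in hours; each score is discounted by exp(-lam * age)
        return [s * math.exp(-lam * (query_time - t) / 3600.0)
                for s, t in zip(scores, tweet_times)]

    # toy usage: with equal retrieval scores, the fresher tweet wins
    print(decay_rerank([1.0, 1.0], [0, 86400], query_time=86400))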

QUTmpDecayTTgCL

  • Run ID: QUTmpDecayTTgCL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 87b0a3bbad3bc6e897984ba4119fbcc0
  • Run description: Selecting top tweets in clusters of results of a retrieval system using temporal decay re-ranking. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUTQRM

  • Run ID: QUTQRM
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 0e29ab4edac6f3888f55e9d17ee2acc7
  • Run description: Temporal query expansion using temporal relevance modeling. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUTqrmTTgBL

  • Run ID: QUTqrmTTgBL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 677bad0bdb8951a2f426221c2d8b7e8b
  • Run description: Selecting top documents in results of a retrieval system using query expansion based on temporal relevance judgments. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

SCIAI124a

  • Run ID: SCIAI124a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1a0e20e4b1d2ecdf30427baa178ab6e9
  • Run description: This run uses Link Crawling, Machine Learning, and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. Machine Learning uses WEKA, with a training set made up of several attributes and a classifier, to decide whether a tweet is relevant. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI124aTTG

  • Run ID: SCIAI124aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: ttg
  • MD5: 8d7a4aac03975ac70ab31b4d7e172834
  • Run description: This run uses Link Crawling, Machine Learning, and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. Machine Learning uses WEKA, with a training set made up of several attributes and a classifier, to decide whether a tweet is relevant. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic (illustrated below).
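
A minimal sketch of the single-pass, first-tweet clustering described above. The description measures "the percentage of the tweet that matched" the cluster's first tweet; the word-set Jaccard overlap and the 0.6 threshold used here are stand-in assumptions.

    # Hypothetical sketch: each cluster is represented by its first tweet; a
    # new tweet joins the first cluster it overlaps beyond the threshold,
    # otherwise it opens a new cluster. The returned representatives are what
    # a TTG run would print.
    def ttg_clusters(tweets, threshold=0.6):
        reps = []  # first tweet of each cluster, in arrival order
        for t in tweets:
            words = set(t.lower().split())
            for r in reps:
                rwords = set(r.lower().split())
                overlap = len(words & rwords) / max(len(words | rwords), 1)
                if overlap >= threshold:
                    break          # joins r's cluster, not emitted
            else:
                reps.append(t)     # starts a new cluster
        return reps

    print(ttg_clusters(["oscar nominees announced",
                        "oscar nominees announced today",
                        "super bowl kickoff time"]))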

SCIAI14a

  • Run ID: SCIAI14a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3ee1500548b1f0b554d1166d789c79db
  • Run description: This run uses Link Crawling and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI14aTTG

  • Run ID: SCIAI14aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: ttg
  • MD5: 2fe4c8ee44545c46b6c6cf5e9ee9a30a
  • Run description: This run uses Link Crawling and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic.

SCIAI3am14a

  • Run ID: SCIAI3am14a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: adhoc
  • MD5: fd9a277fb4f63985ade97ee0f37893f8
  • Run description: This run uses a manual Google Query Expansion Module with Link Crawling and rescoreTweets. The module googles each query, restricted to the window between the corpus start time and the query time (so no future evidence is used), and finds the top 4 common words to add to the query. Manual means a user was able to decide which words of the query expansion to keep or discard. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI3am14aTTG

  • Run ID: SCIAI3am14aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: ttg
  • MD5: ef625e7925c6265b38043757690c475b
  • Run description: This run uses a manual Google Query Expansion Module with Link Crawling and rescoreTweets. The module googles each query, restricted to the window between the corpus start time and the query time (so no future evidence is used), and finds the top 4 common words to add to the query. Manual means a user was able to decide which words of the query expansion to keep or discard. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic.

SCIAI3cm4a

  • Run ID: SCIAI3cm4a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: adhoc
  • MD5: 22ccb3bf56e5c9b9b087dc325075d809
  • Run description: This run uses a manual CommonWords Query Expansion Module with rescoreTweets. CommonWords grabs the top 10 words from each topic's tweets (the initial tweets grabbed by the API). Manual means a user was able to decide which words of the query expansion to keep or discard. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI3cm4aTTG

  • Run ID: SCIAI3cm4aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: ttg
  • MD5: 6d0b100e8d9b733e768a448e35e0414b
  • Run description: This run uses a manual CommonWords Query Expansion Module with rescoreTweets. CommonWords grabs the top 10 words from each topic's tweets (the initial tweets grabbed by the API). Manual means a user was able to decide which words of the query expansion to keep or discard. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic.

SM100

  • Run ID: SM100
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: fa080a601757e00cf2c63f4a8fff1cfa
  • Run description: Used modified cosine similarity to detect similar tweets in the top 100 results of each topic

SM50

  • Run ID: SM50
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: e5cd5bedd1751a381a1f7f1689681b21
  • Run description: Used modified cosine similarity to detect similar tweets in the top 50 results of each topic

SR

  • Run ID: SR
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 76b03d2022f5383351cf91fb68e5d3e7
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. All parameters have the same weight (illustrated below).
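
A minimal sketch of the equal-weight combination of word, hashtag, and time similarity, assuming "Affinity Clustering" refers to affinity propagation (here, scikit-learn's AffinityPropagation with a precomputed similarity matrix); the Jaccard similarities and the time normalization are illustrative.

    # Hypothetical sketch: average three similarities with equal weights and
    # cluster with affinity propagation.
    import numpy as np
    from sklearn.cluster import AffinityPropagation

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    def cluster_tweets(tweets, times):
        n = len(tweets)
        words = [{w for w in t.lower().split() if not w.startswith("#")} for t in tweets]
        tags = [{w for w in t.lower().split() if w.startswith("#")} for t in tweets]
        span = (max(times) - min(times)) or 1.0
        S = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                time_sim = 1.0 - abs(times[i] - times[j]) / span
                S[i, j] = (jaccard(words[i], words[j])
                           + jaccard(tags[i], tags[j]) + time_sim) / 3.0
        return AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S)

    print(cluster_tweets(["oscar nominees announced #oscars",
                          "nominees announced #oscars",
                          "super bowl tonight #sb48"], [0, 60, 3600]))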

SRAH

  • Run ID: SRAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7c044463726111081ae66f54437281e5
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. All parameters have the same weight. Identical to the TTG result.

SRTD

  • Run ID: SRTD
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: e62c6577ded3ea91369cd531ff1b9f49
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. The time measurement has less effect than the other parameters in this run.

SRTDAH

  • Run ID: SRTDAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: a0e3a7a9c19b639727943ed5a0033657
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. The time measurement has less effect than the other parameters in this run. Identical to the TTG result.

SRTL

  • Run ID: SRTL
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 8b1a52e2ec2fa34392c06bb09cfb8ec8
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity and hashtag similarity with equal weight.

SRTLAH

  • Run ID: SRTLAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3b264f1cea73ab3776c7c0168b4c9da9
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity and hashtag similarity with equal weight. Identical to the TTG result.

Standard

  • Run ID: Standard
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 51b6031bf194f4ffd2aad2a7109d8504
  • Run description: Queries containing three or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, time proximity and hashtag similarity with equal weight.

StandardAH

  • Run ID: StandardAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: baa40f21166f823a7fc5615b81ee3990
  • Run description: Queries containing three or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, time proximity and hashtag similarity with equal weight. Identical to the TTG result.

TTGPKUICST1

  • Run ID: TTGPKUICST1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 5a6e475312f3ddc209b2263c487e75de
  • Run description: Apply the star clustering method with parameter sigma = 0.7 (illustrated below). Treat the top 200 results from ad-hoc task run PKUICST3 as relevant tweets.
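
A minimal sketch of star clustering with sigma = 0.7, assuming tf-idf cosine similarity: tweets are linked when their similarity reaches sigma, and the highest-degree uncovered tweet repeatedly becomes a star center that covers its neighbors.

    # Hypothetical sketch of the star clustering step.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def star_clusters(tweets, sigma=0.7):
        sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
        n = len(tweets)
        neighbors = {i: {j for j in range(n) if j != i and sims[i, j] >= sigma}
                     for i in range(n)}
        uncovered, stars = set(range(n)), {}
        while uncovered:
            center = max(uncovered, key=lambda i: len(neighbors[i] & uncovered))
            members = (neighbors[center] & uncovered) | {center}
            stars[center] = sorted(members)
            uncovered -= members
        return stars  # star-center index -> member indices

    print(star_clusters(["oscar nominees announced",
                         "oscar nominees announced today",
                         "super bowl kickoff"]))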

TTGPKUICST2

  • Run ID: TTGPKUICST2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 918760869b976901d4b94209877925aa
  • Run description: Apply hierarchical clustering with the cluster merging threshold set to 0.7 (illustrated below). Treat results from ad-hoc task run PKUICST3 whose score is no less than 4.5 as relevant tweets.
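
A minimal sketch of threshold-based agglomerative clustering, assuming tf-idf vectors, average linkage, cosine distance, and that the 0.7 merging threshold is a similarity (so merging stops at distance 0.3); none of these specifics are stated in the run description.

    # Hypothetical sketch: merge clusters until inter-cluster distance
    # exceeds 1 - threshold, then read off the cluster labels.
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.feature_extraction.text import TfidfVectorizer

    def merge_clusters(tweets, threshold=0.7):
        X = TfidfVectorizer().fit_transform(tweets).toarray()
        Z = linkage(X, method="average", metric="cosine")
        return fcluster(Z, t=1 - threshold, criterion="distance")

    print(merge_clusters(["oscar nominees announced",
                          "oscar nominees announced today",
                          "super bowl kickoff"]))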

TTGPKUICST3

  • Run ID: TTGPKUICST3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: manual
  • Task: ttg
  • MD5: 3b1370aaa27ffdfec5a294c6f27c2ebb
  • Run description: Apply star clustering method with parameter sigma=0.7. Manually select top Ni results for each query Qi.

TTGPKUICST4

  • Run ID: TTGPKUICST4
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: manual
  • Task: ttg
  • MD5: 388441f503631d397c07463b2ae704d7
  • Run description: Apply hierarchical clustering. Set cluster merging threshold as 0.7. Manually select top Ni results for each query Qi.

UCASRun1

  • Run ID: UCASRun1
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/11/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 04b1481ac8f2b129da87005b94c900f9
  • Run description: This is the baseline run of UCAS. It is a straightforward application of Ranking SVM for the retrieval.

UCASRun2

  • Run ID: UCASRun2
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fb377e4decdb3de206c22ea01b4eaaba
  • Run description: UCASRun2 selects high-quality training data to learn to rank the results, without using the linked web pages.

UCASRun3

  • Run ID: UCASRun3
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/11/2014
  • Type: automatic
  • Task: adhoc
  • MD5: b2851821a0bd86fa45bcfb010df7310f
  • Run description: This is a straightforward application of Ranking SVM for the retrieval, with the use of an external resource, namely the linked web pages.

UCASRun4

  • Run ID: UCASRun4
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 785124cf203d19e178461ff3415a817f
  • Run description: UCASRun4 selects high-quality training data to learn to rank the results with the web pages.

udelRunAH

  • Run ID: udelRunAH
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: b47591d75a724e4c4c6dc6adf031b560
  • Run description: This run filters non-English tweets and retweets.

udelRunTTG1

  • Run ID: udelRunTTG1
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 12729657c243a437f334e451ed1f34b3
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters (illustrated below). Retweets and non-English tweets are filtered out before clustering. The most recent tweet within each cluster represents the cluster.
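
A minimal sketch of quality threshold (QT) clustering, assuming tf-idf cosine distances and an illustrative diameter cutoff: grow a candidate cluster around every remaining tweet, keep the largest one, remove its members, and repeat. The greedy candidate growth here is a simplification of the original QT algorithm.

    # Hypothetical sketch of QT clustering over tweet vectors; returns
    # clusters as lists of tweet indices.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_distances

    def qt_clusters(tweets, max_diameter=0.5):
        D = cosine_distances(TfidfVectorizer().fit_transform(tweets))
        remaining = set(range(len(tweets)))
        clusters = []
        while remaining:
            best = None
            for seed in remaining:
                members = [seed]
                for j in sorted(remaining - {seed}, key=lambda j: D[seed, j]):
                    if all(D[j, m] <= max_diameter for m in members):
                        members.append(j)
                if best is None or len(members) > len(best):
                    best = members
            clusters.append(best)
            remaining -= set(best)
        return clusters

    print(qt_clusters(["oscar nominees announced",
                       "oscar nominees announced today",
                       "super bowl kickoff"]))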

udelRunTTG2

  • Run ID: udelRunTTG2
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 9df2493e38d200c43188e9b801f75d40
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters. Retweets and non-English tweets are filtered out before clustering. The most relevant tweet within each cluster represents the cluster.

udelRunTTG3

  • Run ID: udelRunTTG3
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 835e9894e4aa0c23b7b789890a93d655
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters. Tweets from the TREC ad hoc baseline run are used to create the clusters. The most recent tweet within each cluster represents the cluster.

udelRunTTG4

  • Run ID: udelRunTTG4
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 3523513e69289acc0291c4680b08848b
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters. Tweets from the TREC ad hoc baseline run are used to create the clusters. The most relevant tweet within each cluster represents the cluster.

UDInfoLTR

  • Run ID: UDInfoLTR
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1c819016b9bc920ac45332e002ae07a3
  • Run description: External resources: the language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and Wikimantic, a tool used to detect the concepts in a query. Features: uses a learning-to-rank method in Terrier.

UDInfoMMR5

  • Run ID: UDInfoMMR5
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 44a784f19370ac8d325e018d67796243
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" is used for language detection. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty (illustrated below). The top 5 tweets are used as the result.
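
A minimal sketch of the maximal marginal relevance selection described above, assuming tf-idf cosine similarity for novelty, relevance scores normalized to [0, 1], and a 0.7 mixing weight; the run's actual similarity measure and weight are not given.

    # Hypothetical sketch: greedily pick tweets that balance relevance with
    # novelty (1 - max similarity to tweets already selected).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def mmr_select(tweets, rel_scores, k=5, lam=0.7):
        sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
        selected, candidates = [], list(range(len(tweets)))
        while candidates and len(selected) < k:
            def mmr(i):
                novelty = 1.0 - max((sims[i, j] for j in selected), default=0.0)
                return lam * rel_scores[i] + (1 - lam) * novelty
            best = max(candidates, key=mmr)
            selected.append(best)
            candidates.remove(best)
        return [tweets[i] for i in selected]

    print(mmr_select(["oscar nominees announced",
                      "oscar nominees announced today",
                      "super bowl kickoff"],
                     rel_scores=[1.0, 0.9, 0.5], k=2))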

UDInfoMMRA

  • Run ID: UDInfoMMRA
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 0eb622cb8372f3a9d7afcf555ab47b7c
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" is used for language detection. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty. The top 5 tweets are used as the result.

UDInfoMMRWC5

  • Run ID: UDInfoMMRWC5
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: e3a0223bde6f8cdb682e34ca76edb9bb
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and Wikimantic, a tool used to detect the concepts in a query. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty. When computing relevance and novelty, concepts detected by Wikimantic in both queries and tweets are also used. Tweets are iteratively selected out of the original 30; once 5 tweets have been selected, the selection stops.

UDInfoMMRWCA

  • Run ID: UDInfoMMRWCA
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 6501466630b3948a33736ba75d0220e8
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and Wikimantic, a tool used to detect the concepts in a query. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty. When computing relevance and novelty, concepts detected by Wikimantic in both queries and tweets are also used. Tweets are iteratively selected out of the original 30; once 7 tweets have been selected, or the score difference between the current tweet and the previous one exceeds a predefined threshold (0.01 of the MMR score of the previous tweet), the selection stops.

UDInfoQE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDInfoQE
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: bb1f91a9d644ef22748720d947ed9db5
  • Run description: External resources: the language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and wikimantic, a tool used to detect the concepts in a query. Features: Terrier was used to perform this run, with Bo1 as the query expansion model available in the tool. The weighting model was PL2, and we set its parameter to 19 based on earlier tests with the 2013 data; a rough sketch of the setup follows below.
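
For illustration only, a similar Bo1-on-PL2 configuration can be expressed with PyTerrier, the later Python API for Terrier (the 2014 run used Terrier itself, and the Vaswani test corpus below is just a runnable stand-in for the tweet collection):

    import pyterrier as pt
    pt.init()

    # Stand-in corpus for a self-contained example; the actual run searched
    # the track's tweet collection.
    dataset = pt.get_dataset("vaswani")
    index_ref = dataset.get_index()

    # PL2 weighting with its parameter set to 19, as in the run description.
    pl2 = pt.BatchRetrieve(index_ref, wmodel="PL2", controls={"c": 19})

    # Bo1 pseudo-relevance feedback, then a second PL2 pass over the
    # expanded queries.
    bo1 = pt.rewrite.Bo1QueryExpansion(index_ref)
    results = (pl2 >> bo1 >> pl2).transform(dataset.get_topics())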

UDInfoTB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDInfoTB
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 6cbf5dec3d59f7c3d693a145f15a7ea5
  • Run description: Using tie-breaking to apply IR signals (e.g., TF, IDF) one at a time, as sketched below. One external resource is used: the language detection tool described in the paper "Concept-based information retrieval using explicit semantic analysis", published in 2011.
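
Tie-breaking lends itself to a one-line sketch: rank on the primary signal, and let each further signal decide only where the previous ones tie. A minimal Python illustration with assumed per-tweet signal functions:

    # Sort descending on each signal in priority order; Python's tuple
    # comparison consults signal j+1 only where signal j ties.
    def tie_break_rank(tweets, signals):
        return sorted(tweets, key=lambda t: tuple(-s(t) for s in signals))

    # Hypothetical usage: TF first, IDF to break TF ties.
    # ranked = tie_break_rank(tweets, [term_frequency, inverse_doc_frequency])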

UDInfoTBRR

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDInfoTBRR
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: e52835ca6c581a652a3c26d68446b214
  • Run description: External resources: the language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and wikimantic, a tool used to detect the concepts in a query.

Upitt

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: Upitt
  • Participant: zhg15
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7e0bf3bad55271f9e5e4f6150009f4ed
  • Run description: We use the Google API to expand the topic queries. We crawled the first 10 result pages from Google using the original queries, then used tf-idf to select the ten highest-weighted words as expansion terms, as sketched below. We checked the expansion words manually and found them suitable for the original queries.
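
A minimal sketch of the term-selection step, assuming `pages` holds the text of the ten crawled Google result pages for one topic (the crawling and scoring details of the actual run may differ):

    from sklearn.feature_extraction.text import TfidfVectorizer
    import numpy as np

    def expansion_terms(pages, n_terms=10):
        # tf-idf over the crawled pages; sum the weights across pages to
        # get one score per term, then keep the n_terms highest.
        vec = TfidfVectorizer(stop_words="english")
        scores = np.asarray(vec.fit_transform(pages).sum(axis=0)).ravel()
        top = np.argsort(scores)[::-1][:n_terms]
        vocab = vec.get_feature_names_out()
        return [vocab[i] for i in top]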

UWMHBUT1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT1
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fdf3266875f0bb737f2963e18a5af4d3
  • Run description: BSR: base run using the TREC API (RunQueriesThrift.java). Some result records were duplicated; we removed the duplicates and added a 1001st record for each affected topic (MB178, MB179, MB189, MB194, and MB197).

UWMHBUT2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT2
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fa19500dbd0463670bd4bb8ccf38ec03
  • Run description: QWR: query expansion based on term frequency in the top 10 Google results, with a weight for each term. The title and abstract of the top 10 Google result items were used to calculate term frequency. After stop words were removed, the nine highest-frequency terms were added to the original query, giving terms T_j for j = 0, ..., C-1, with C = 10 and T_0 = the original query. Each term T_j was given the weight w_j = (C - j) / sum_{i=0}^{C-1} (i + 1), as computed in the sketch below. This approach follows Kwok et al.'s (2005) idea of introducing web assistance for improved performance (cf. Kwok, K. L., Grunfeld, L., & Deng, P. (2005). Improving weak ad-hoc retrieval by web assistance and data fusion. In Information Retrieval Technology (pp. 17-30). Springer Berlin Heidelberg).
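
The weights are easy to verify in a few lines: with C = 10 the denominator is 1 + 2 + ... + 10 = 55, so the weights decay linearly from 10/55 for the original query down to 1/55 for the last expansion term, and they sum to 1:

    # Weighting scheme from the description:
    # w_j = (C - j) / sum_{i=0}^{C-1} (i + 1), for j = 0, ..., C-1.
    def expansion_weights(C=10):
        denom = sum(i + 1 for i in range(C))      # C*(C+1)/2 = 55 for C = 10
        return [(C - j) / denom for j in range(C)]

    # expansion_weights()[0] == 10/55 (original query), ..., [-1] == 1/55.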

UWMHBUT3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT3
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 49197d4af51b7886dbe2e72896de7090
  • Run description: QER: same as QWR (UWMHBUT2), but with equal weight for each expanded term. The title and abstract of the top 10 Google result items were used to calculate term frequency. After stop words were removed, the nine highest-frequency terms were added to the original query, giving terms T_i for i = 0, ..., C-1, with C = 10 and T_0 = the original query; each term was given equal weight. This approach follows Kwok et al.'s (2005) idea of introducing web assistance for improved performance (cf. Kwok, K. L., Grunfeld, L., & Deng, P. (2005). Improving weak ad-hoc retrieval by web assistance and data fusion. In Information Retrieval Technology (pp. 17-30). Springer Berlin Heidelberg).

UWMHBUT4

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT4
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 26891c8a9cbff2482cb65ab0a2f0715b
  • Run description: EIA: adjusted by an Event Identification Algorithm (EIA). We assume that the distribution over time of tweets about a topic is Gaussian, and that the mean of that distribution is the point at which the largest number of tweets about the topic were posted. We used the posting times of the search results for a query to adjust the ranking: the top 30 search results are analyzed, and the mean of their posting times is taken as the point where the topic is hot. Accordingly, we give slightly more weight to tweets posted close to this hot spot. Specifically, R_n = R_o * (alpha + (1 - alpha) * E), where R_n is the new ranking score, R_o is the original ranking score from the TREC Microblog API, E is the event effect, and alpha is an adjusting parameter (we chose 0.8 for this run). The event weighting factor is E = f(t, mu, sigma) * R_o, where f(t, mu, sigma) is the Gaussian density, t is the time gap between the query time and the time the tweet was posted, mu is the event center (the hot point), and sigma is the standard deviation. We calculated sigma from the top 1500 search results; to smooth it, we use sigma = 3 * sigma_1, where sigma_1 is the standard deviation of the posting times of the top 1500 search results, in days. A sketch follows below.
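
Read literally, the formulas above translate into a few lines of Python. The sketch below uses scipy's normal density for f and spells out the symbols alpha, mu, and sigma, which were garbled in the original description:

    from scipy.stats import norm

    def event_adjusted_score(r_o, t, mu, sigma, alpha=0.8):
        # E = f(t, mu, sigma) * R_o, with f the Gaussian density and t the
        # gap (in days) between the query time and the posting time.
        e = norm.pdf(t, loc=mu, scale=sigma) * r_o
        # R_n = R_o * (alpha + (1 - alpha) * E)
        return r_o * (alpha + (1 - alpha) * e)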

wistudA1

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudA1
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 2ff658f381fbf4f273577d025086b3dd
  • Run description: 1. Using Jimmy's API output as the input; 2. clustering the tweets with the Lingo algorithm.

wistudt1bc

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt1bc
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 53d1a7884f0359c9ace9eaa8540e31cc
  • Run description: Using the baseline run as input and returning popular tweets.

wistudt1q

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt1q
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ad62478aa8a41597b399c78b3f7aa6b5
  • Run description: Simple query expansion (with knowledge from the Wikipedia dump of January 2013).

wistudt1qc

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt1qc
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: e52e649063e84e718086ffe8eb7b61f3
  • Run description: Using a Wikipedia dump (January 2013 version) to expand the query, and filtering out non-popular tweets.

wistudt2bcd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2bcd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: c50a75845e118d2dfb94c725f5403546
  • Run description: Cluster the tweets using cosine similarity, select one tweet from each cluster, and limit the number of tweets to 20, as sketched below. Then use the duplicate detection framework (with syntactical and contextual features) to remove the duplicates.
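
A minimal sketch of the cluster-and-select step, assuming the tweets arrive ranked and already vectorised; the similarity threshold of 0.6 is an illustrative value, not one reported by the participants:

    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Greedy single-pass clustering: a tweet is kept as a cluster
    # representative only if it is not too similar to any tweet already
    # kept; selection stops at `limit` tweets.
    def select_representatives(tweets, vectors, threshold=0.6, limit=20):
        reps, rep_vecs = [], []
        for tweet, vec in zip(tweets, vectors):
            if any(cosine(vec, rv) >= threshold for rv in rep_vecs):
                continue                   # near-duplicate of a kept tweet
            reps.append(tweet)
            rep_vecs.append(vec)
            if len(reps) >= limit:
                break
        return reps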

wistudt2bd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2bd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 595035005d55bdc7d4be3799675f69cd
  • Run description: Using the baseline run as input, detecting duplicates with syntactical and contextual features, and reserving up to 20 tweets (each from a cluster of detected duplicates) within a depth of 500.

wistudt2qcd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2qcd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 31932b7818987a3e33764bfeec2c8d7a
  • Run description: Using the query expansion run (with knowledge from Wikipedia as of January 2013) as input, cluster the tweets using cosine similarity, select one tweet from each cluster, and limit the number of tweets to 20. Then use the duplicate detection framework (with syntactical and contextual features) to remove the duplicates.

wistudt2qd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2qd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 7d0f059ce504434167a0d9c0edea0198
  • Run description: Using the query expansion run (with knowledge from the January 2013 version of Wikipedia) as input, detecting duplicates with syntactical and contextual features, and reserving up to 20 tweets (each from a cluster of detected duplicates) within a depth of 500.