Runs - Microblog 2014

1unique2

  • Run ID: 1unique2
  • Participant: uog_twteam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: a6b41e7e8707f84c7e36e3d99f336038
  • Run description: A cluster-based approach which makes use of nouns and verbs only to perform clustering.

3unique0

  • Run ID: 3unique0
  • Participant: uog_twteam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: f01272b8e7fe15a72a8205755c7f0f73
  • Run description: A more relaxed cluster-based approach using only nouns and verbs.

3unique2

  • Run ID: 3unique2
  • Participant: uog_twteam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 52e5ad955a3c4a22280f654340bcc782
  • Run description: A cluster-based approach which uses only nouns and verbs.

baselineRaw

  • Run ID: baselineRaw
  • Participant: uiucGSLIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ffc3239871bc206f618cd4b239446c60
  • Run description: Official baseline using raw API output.

baselineRM3

  • Run ID: baselineRM3
  • Participant: uiucGSLIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7e0bb31fe90f5be648ff84eced423ffb
  • Run description: Official baseline using RM3.

ECNURankLib

  • Run ID: ECNURankLib
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 73d24ba9ee887994e986111891a53789
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use RankLib. The training sets are the 2011 and 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document (illustrated below).
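
A minimal sketch of the tf-idf feedback-term selection described above, assuming scikit-learn (which this group's ECNUSVM run also mentions). The sample documents, the averaging of weights across documents, and the function name expansion_terms are illustrative assumptions, not details of the actual run.

    # Hypothetical sketch: choose expansion terms by tf-idf weight from
    # documents retrieved via a web search for the query.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer

    def expansion_terms(web_docs, top_k=20):
        # web_docs: texts of the documents returned for one query
        vec = TfidfVectorizer(stop_words="english")
        X = vec.fit_transform(web_docs)              # docs x vocabulary
        mean_w = np.asarray(X.mean(axis=0)).ravel()  # average tf-idf per term
        terms = vec.get_feature_names_out()
        return [terms[i] for i in np.argsort(mean_w)[::-1][:top_k]]

    print(expansion_terms(["oscar nominees announced today",
                           "full list of academy award nominees"], top_k=5))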

ECNURankLib2013

  • Run ID: ECNURankLib2013
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ea96edce6819cb82980b9808d99d774e
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use RankLib. The training set is the 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document.

ECNUSVM

  • Run ID: ECNUSVM
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: e1a248959bf6285e258bc1736a449425
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use the SVM from the sklearn toolkit. The training sets are the 2011 and 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document.

ECNUSVM2013

  • Run ID: ECNUSVM2013
  • Participant: ECNUCS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 633c8ef8202cb1c6b9abb70993930d4c
  • Run description: We submit each query to the Google search engine to retrieve useful documents and compute the tf-idf of every word in each document, then perform feedback with the top 20 most effective terms appearing in the top 20 tweets returned by the API. For tooling we use the SVM from the sklearn toolkit. The training set is the 2013 topics. The main features we adopt are the KL divergences between tweet, topic, and Google document.

EM100

  • Run ID: EM100
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 91729890380aad420ee3a5cbe458b949
  • Run description: Used cosine similarity to detect similar tweets in the top 100 results of each topic (illustrated below).
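
A minimal sketch of this kind of cosine-similarity duplicate detection, assuming tf-idf vectors and a 0.8 cutoff; the run does not state its vectorization or threshold, so both are illustrative.

    # Hypothetical sketch: keep a tweet only if it is not too similar to any
    # previously kept tweet among a topic's top results.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def filter_similar(tweets, threshold=0.8):
        sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
        kept = []
        for i in range(len(tweets)):
            if all(sims[i, j] < threshold for j in kept):
                kept.append(i)
        return [tweets[i] for i in kept]

    # the second tweet duplicates the first and is dropped
    print(filter_similar(["oscar nominees announced",
                          "Oscar nominees announced!",
                          "super bowl kickoff time"]))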

EM50

  • Run ID: EM50
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 9b3ec30244cb948dc079b3abad4bf802
  • Run description: Used cosine similarity to detect similar tweets in the top 50 results of each topic

ER

  • Run ID: ER
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 03e85ee0f026b1139ef82003e1ac507f
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, and a retweet filter that removes tweets starting with RT.

ERL

  • Run ID: ERL
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3aab5c7b0596eb92ea556d0f451659b7
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, a retweet filter that removes tweets starting with RT, and a language filter that removes non-English tweets using Cybozu's language detector.

ERLU

  • Run ID: ERLU
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7d669d888c4cb610533889490ab4583f
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, a retweet filter that removes tweets starting with RT, a language filter that removes non-English tweets using Cybozu's language detector, and URL boosting with a factor of 1.1.

ERU

  • Run ID: ERU
  • Participant: ir.cs.sfsu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 8d4848c602aa9bb31da1e601cd27dbe9
  • Run description: This run uses light query expansion with the ark-tweet-nlp-0.3.2 POS tagger, a retweet filter that removes tweets starting with RT, and URL boosting with a factor of 1.1.

hltcoe0

  • Run ID: hltcoe0
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ac3585bcc46bc1a97f59264df71bdd02
  • Run description: Baseline run using API.

hltcoe1

  • Run ID: hltcoe1
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 952aaf0766203ccd2bae782538d52dce
  • Run description: Query expansion from top Google search snippets, followed by corpus pseudo-relevance feedback. The Google search is customized to return only results available by the query time. The corpus PRF is multi-staged.

hltcoe2

  • Run ID: hltcoe2
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: f31dd8a0924361b9761130280f87b421
  • Run description: Starting from the hltcoe1 run (query expansion from Google search and multi-stage PRF), rerank the returned tweets with the Coordinate Ascent learning-to-rank algorithm using multiple query-dependent and query-independent features.

hltcoe3

  • Run ID: hltcoe3
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 19501623afb77c825cd091e575de93a5
  • Run description: Similar to hltcoe2, this is also a rerank of the hltcoe1 result tweets. The difference is that, if an embedded tweet link can be crawled and analyzed correctly within 3 seconds, features from this tweet expansion are used in the rerank. In short, this run uses more tweet-expansion-based features than hltcoe2.

hltcoeTTG0

  • Run ID: hltcoeTTG0
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 5469a6f63931b1e8ecac76aaa0647262
  • Run description: From the ad-hoc hltcoe3 run, this TTG run simply returns the top 90 relevant tweets as results. Since hltcoe3 uses external resources such as Google search and tweet-expansion features, so does this run.

hltcoeTTG1

  • Run ID: hltcoeTTG1
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 32aa65351199a27fc369dbce8c569d16
  • Run description: On the basis of hltcoeTTG0, deduplicate tweets against the previously selected tweets, which are selected greedily by thresholding the maximum cosine similarity of a tweet to each previously selected tweet. Tweets are processed in order of the relevance scores calculated from hltcoe4.

hltcoeTTG2

  • Run ID: hltcoeTTG2
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: dabb0fb45b8e7885de12fed85bc378d7
  • Run description: Similar to hltcoeTTG1, but the difference is that the deduplication is made by a binary novelty decision over multiple features through an SVM binary classifier. Since the input is again from hltcoe4, which used external Google search and tweet expansion, so is this run. In addition, some novelty features are calculated using the content linked from tweets.

hltcoeTTG3

  • Run ID: hltcoeTTG3
  • Participant: hltcoe
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: a42e98ec4b6a26429a3441af5302b47b
  • Run description: From the hltcoe4 results, perform near-deduplication with shingling/hashing.

HPRF1020

  • Run ID: HPRF1020
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 32cd484acf0b4b13813b54f812386168
  • Run description: Hyperlink-based PRF

HPRF1020RR

  • Run ID: HPRF1020RR
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1b90883d7a724a401b23632baf4f43df
  • Run description: Hyperlink-based PRF + reranking

ICARUN1

  • Run ID: ICARUN1
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 61fc6b00cc68bdce84b4264340cfe1ca
  • Run description: Weighted combination of 8 models in both Indri and Terrier; 3 query expansion methods: timely Google search with respect to each query, corpus-based tf-idf, and embedded PRF.

ICARUN2

  • Run ID: ICARUN2
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 091ba48357413403101250658e092da3
  • Run description: Combination of 8 models in both Indri and Terrier; 3 query expansion methods: timely Google search with respect to each query, corpus-based tf-idf, and embedded PRF. Similar to ICARUN1, but with different weights.

ICARUN3

  • Run ID: ICARUN3
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fe21f5805bd91d1074ef949699f002d8
  • Run description: Weighted combination of 3 models in Indri and Terrier; 3 query expansion methods: timely Google search with respect to each query, corpus-based tf-idf, and embedded PRF.

ICARUN4

  • Run ID: ICARUN4
  • Participant: ecnu
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 724f787e2c0c661a47affb6b4b71b0aa
  • Run description: DFRee model with embedded PRF; URLs considered.

ICTNETAP3

  • Run ID: ICTNETAP3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 4a4688f2b541cb6512335cfeed8eb1a3
  • Run description: AP

ICTNETAP4

  • Run ID: ICTNETAP4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: abf5b8d18dd7742873e1d8480104545f
  • Run description: AP

ICTNETRUN1

  • Run ID: ICTNETRUN1
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 9156e58ccecb95a19eaf32a951c39182
  • Run description: This run uses learning to rank to train a model and sort the tweets. The features cover many aspects, e.g., VSM score. We do not use external resources.

ICTNETRUN2

  • Run ID: ICTNETRUN2
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 874b60d820a3820302a728757cddf8b1
  • Run description: This run uses learning to rank to train a model and sort the tweets. The features cover many aspects, e.g., VSM score. We do not use external resources.

ICTNETRUN3

  • Run ID: ICTNETRUN3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3f2143fca3f982473374cb5789995303
  • Run description: This run removes tweets with 'RT' at the beginning of the text.

ICTNETRUN4

  • Run ID: ICTNETRUN4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 5673fd12464b0b9d5e89dafe6481f848
  • Run description: This run removes tweets with 'RT' at the beginning of the text and uses learning to rank to re-sort the tweets.

ICTNETRunSP3

  • Run ID: ICTNETRunSP3
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: c1db0b18f7eedf266873ca737517a31b
  • Run description: SP

ICTNETRUNSP4

  • Run ID: ICTNETRUNSP4
  • Participant: ICTNET
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 52cab46b1c4e6d5dbb789b9a5e3d2acf
  • Run description: SP

JufeLdkeAdhoc1

  • Run ID: JufeLdkeAdhoc1
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 780a01c02132f11b947ff0cd3355bcbe
  • Run description: A baseline that uses the results returned by the official API.

JufeLdkeAdhoc2

  • Run ID: JufeLdkeAdhoc2
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 31c9e94c96d7a0d6bca9539bb7f1759b
  • Run description: The baseline results come from the official API and are reordered by timeline so that newer tweets receive higher scores. We use the baseline results simply to check the method for the Timeline Generation task.

JufeLdkeAdhoc3

  • Run ID: JufeLdkeAdhoc3
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1f3be6a4e0c7d9e6229e77a450c1c7c6
  • Run description: The baseline results come from the official API and are reordered by timeline so that older tweets receive higher scores.

JufeLdkeSum1

  • Run ID: JufeLdkeSum1
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: ttg
  • MD5: a6ffac89a045dcc6d2a4aebae12b904c
  • Run description: A baseline that uses the results returned by the official API.

JufeLdkeSum2

  • Run ID: JufeLdkeSum2
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: ttg
  • MD5: cf2db544c3fe726b61ddae2a9e337c5a
  • Run description: Uses MMR to delete the redundant tweets posted later.

JufeLdkeSum3

  • Run ID: JufeLdkeSum3
  • Participant: LDKE
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: ttg
  • MD5: 9e47bb3e14711e322df1a111c6bca92b
  • Run description: Uses MMR on the 2k tweets retrieved by the official API and deletes the redundant tweets posted later.

NCOS

  • Run ID: NCOS
  • Participant: BJUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 82680ea4db20b9d8945cf2df0950dbde
  • Run description: null

NewBee

  • Run ID: NewBee
  • Participant: zhg15
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: f58a8b8b378b50d62a7d7fa2e502b6e9
  • Run description: We use the Google API to expand topic queries. We crawled the first 10 result pages from Google using the original queries, then used a tf-idf method to select the top ten words, ranked by tf-idf weight, as the query expansion. We checked the expansion words manually and found them suitable for the original queries.

NovaRun0

  • Run ID: NovaRun0
  • Participant: NovaSearch
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fadff4ebf81b0dc57ff1ea302604975c
  • Run description: Uses retweet filtering and language filtering based on ldig. Uses the RM3 method for pseudo-relevance feedback. Temporal reranking with KDE over the retrieved documents' timestamps (illustrated below).
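
A minimal sketch of temporal re-ranking with kernel density estimation, assuming a Gaussian kernel over tweet timestamps and a multiplicative combination with the retrieval score; NovaSearch's actual kernel and combination rule are not given here.

    # Hypothetical sketch: boost each tweet's score by the estimated temporal
    # density at its own timestamp.
    import numpy as np
    from scipy.stats import gaussian_kde

    def temporal_rerank(scores, timestamps):
        ts = np.asarray(timestamps, dtype=float)
        density = gaussian_kde(ts)(ts)   # density evaluated at each timestamp
        return [s * d for s, d in zip(scores, density)]

    # toy usage: three tweets cluster near t=100, one outlier at t=500
    print(temporal_rerank([1.0, 0.9, 0.8, 0.7], [100, 101, 103, 500]))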

NovaRun1

  • Run ID: NovaRun1
  • Participant: NovaSearch
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: manual
  • Task: adhoc
  • MD5: 430013645dd82f2d8722cf318fee93fc
  • Run description: Uses retweet filtering and language filtering based on ldig. Uses Wikipedia page-view counts (daily aggregates, only from days before the query time). Uses the RM3 method for pseudo-relevance feedback.

NovaRun2

  • Run ID: NovaRun2
  • Participant: NovaSearch
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: manual
  • Task: adhoc
  • MD5: 49226724d60cd7a053a92252834da891
  • Run description: Uses retweet filtering and language filtering based on ldig. Uses Wikipedia page-view counts (daily aggregates, only from days before the query time). Uses the RM3 method for pseudo-relevance feedback. Uses the timestamps of the retrieved documents and of the page views.

NSIM

  • Run ID: NSIM
  • Participant: BJUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7453be203e401a16c431e33b626978d4
  • Run description: null

OSIM

  • Run ID: OSIM
  • Participant: BJUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: adb5640d721b32d28e240cc3f96a3eef
  • Run description: null

PKUICST1

  • Run ID: PKUICST1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 0c3c62ebc58a6f6f85db850b6b539b27
  • Run description: Run based on a learning-to-rank framework. Features (up to 133) include different relevance scores (e.g., language model, TFIDF, BM25) between query and document, plus tweet-quality features. Note that we expand the query with web resources, pseudo-relevance feedback, etc. The external resource used here is the Google search result (with a time limitation) for each query. All relevance scores are computed from the official API results and the provided term statistics.

PKUICST2

  • Run ID: PKUICST2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 0cf49a3718b30253a8ad23e747665f0d
  • Run description: Run based on a learning-to-rank framework. Features (up to 130) include different relevance scores (e.g., language model, TFIDF, BM25) between query and document, plus tweet-quality features. Note that we expand the query with web resources, pseudo-relevance feedback, etc. The external resource used here is the Google search result (with a time limit) for each query. All relevance scores are computed from a local copy of the official corpus. We do more pre-processing, such as non-English tweet removal, stemming, etc.

PKUICST3

  • Run ID: PKUICST3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1604137dc999f222205415e4fd352297
  • Run description: Run based on a learning-to-rank framework. Features include all the features used in PKUICST1 and PKUICST2 (253 in total). The candidates are based on the API results; that is, the local features are added as a different view of each tweet (a different pre-processing of the corpus). The external resource used here is the Google search result (with a time limit) for each query.

PKUICST4

  • Run ID: PKUICST4
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 043749ffd1cdc77b367dde5cffef8d0e
  • Run description: Run based on a language-modeling framework. We expand each query with the Google web resource (with a time limitation) and two-stage pseudo-relevance-feedback query expansion. The relevance score is computed from the official API result and the provided term statistics.

PolyURun1

  • Run ID: PolyURun1
  • Participant: POLYUCOMP
  • Track: Microblog
  • Year: 2014
  • Submission: 8/16/2014
  • Type: automatic
  • Task: adhoc
  • MD5: dad75f2c41cfa5e19facca04ed17e294
  • Run description: This is the first run file of The Hong Kong Polytechnic University.

PolyURun2

  • Run ID: PolyURun2
  • Participant: POLYUCOMP
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 225c8e7d4675092eb7b5ad2ebd8c17a0
  • Run description: We use Google search results to help us conduct query expansion.

PolyURun3

  • Run ID: PolyURun3
  • Participant: POLYUCOMP
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: de76f44f331d720d6df451a64480a721
  • Run description: We use Google search results to help us conduct query expansion. Meanwhile, we utilize PRF to conduct query expansion. We combine these two methods in this run.

PRF1030

  • Run ID: PRF1030
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1c8b55353f5698de844ecf7ad7a08144
  • Run description: PRF

PRF1030RR

  • Run ID: PRF1030RR
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: b17408b94db0da23592c8908476c2393
  • Run description: PRF + reranking

Pris2014a

  • Run ID: Pris2014a
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: adhoc
  • MD5: c48054a82b9818f33bdbe218a49c5bb6
  • Run description: Based on the baseline dataset returned by the search API, we use tf for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

Pris2014b

  • Run ID: Pris2014b
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 6782f1d0372abcf42470da4284475864
  • Run description: Based on the baseline dataset returned by the search API, we use sequential patterns for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

Pris2014c

  • Run ID: Pris2014c
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3f5ebf44d1ef478ca235f071ea1ea470
  • Run description: Based on the baseline dataset returned by the search API, we use waf for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

Pris2014e

  • Run ID: Pris2014e
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/6/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 51ef13433c341e7cc80756010c83bbc7
  • Run description: Based on the raw dataset returned by the search API, we use tf for query expansion and extract the content of the URL in each tweet. We then combine the search API score, the query expansion ratio, and the Indri score of the URL content.

PrisTTG2014a

  • Run ID: PrisTTG2014a
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: 43c438906a758880b07eb50d4a321544
  • Run description: Based on the result of baseline+tf+url, we use SimHash to cluster tweets and select the higher-scoring tweets from the larger clusters (illustrated below).
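
A minimal sketch of SimHash fingerprinting for near-duplicate grouping; the 64-bit fingerprint, unit token weights, and the MD5 token hash are illustrative choices, not the run's actual configuration.

    # Hypothetical sketch: tweets whose fingerprints lie within a small
    # Hamming distance are treated as near-duplicates (one cluster).
    import hashlib

    def simhash(text, bits=64):
        v = [0] * bits
        for token in text.lower().split():
            h = int(hashlib.md5(token.encode()).hexdigest(), 16)
            for i in range(bits):
                v[i] += 1 if (h >> i) & 1 else -1
        return sum(1 << i for i in range(bits) if v[i] > 0)

    def hamming(a, b):
        return bin(a ^ b).count("1")

    print(hamming(simhash("oscar nominees announced"),
                  simhash("oscar nominees are announced")))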

PrisTTG2014b

  • Run ID: PrisTTG2014b
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: 5cf9195070c21e2258ca5ce9fc82f639
  • Run description: Based on the result of baseline+waf+url, we use SimHash to cluster tweets and select the higher-scoring tweets from the larger clusters.

PrisTTG2014c

  • Run ID: PrisTTG2014c
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: 38aadb36af0a1f981e09bffac2b67c89
  • Run description: Based on the result of baseline+sequential pattern+url, we use sequential patterns to cluster tweets and select the higher-scoring tweets from the larger clusters.

PrisTTG2014e

  • Run ID: PrisTTG2014e
  • Participant: BUPT_PRIS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/4/2014
  • Type: automatic
  • Task: ttg
  • MD5: c4c226e7923754e4ca6e9008ddd05281
  • Run description: Based on the result of baseline+tf+url, we use sequential patterns to cluster tweets and select the higher-scoring tweets from the larger clusters.

QUQEd10t15TTgCL

  • Run ID: QUQEd10t15TTgCL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 260e4337068c312b98a1163b1d211169
  • Run description: Selecting top tweets in clusters of results of a retrieval system using PRF-based query expansion. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUQEd5t25TTgBL

  • Run ID: QUQEd5t25TTgBL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: d6c87adf8f9e750b4a142e8d5d9297fc
  • Run description: Selecting top documents in results of a retrieval system using PRF-based query expansion. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUQueryExp10D15

  • Run ID: QUQueryExp10D15
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 5767da3e9a97cb8be7ab7eec790b0a0d
  • Run description: Query expansion using PRF. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUQueryExp5D25T

  • Run ID: QUQueryExp5D25T
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 289a54296cd3a1ad56231b5ed86de229
  • Run description: Query expansion using PRF. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUTmpDecay

  • Run ID: QUTmpDecay
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7a7e518f8fe32fb76086b7c14a93801f
  • Run description: Temporal re-ranking using temporal exponential decay (illustrated below). Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.
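
A minimal sketch of exponential temporal decay, assuming the decay is applied multiplicatively to the retrieval score and that age is measured back from the query time; the decay rate lam is an illustrative parameter.

    # Hypothetical sketch: discount each tweet's score by its age.
    import math

    def decay_rerank(scores, tweet_times, query_time, lam=0.05):
        # age in hours; each score is discounted by exp(-lam * age)
        return [s * math.exp(-lam * (query_time - t) / 3600.0)
                for s, t in zip(scores, tweet_times)]

    # toy usage: with equal retrieval scores, the fresher tweet wins
    print(decay_rerank([1.0, 1.0], [0, 86400], query_time=86400))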

QUTmpDecayTTgCL

  • Run ID: QUTmpDecayTTgCL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 87b0a3bbad3bc6e897984ba4119fbcc0
  • Run description: Selecting top tweets in clusters of results of a retrieval system using temporal decay re-ranking. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUTQRM

  • Run ID: QUTQRM
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 0e29ab4edac6f3888f55e9d17ee2acc7
  • Run description: Temporal query expansion using temporal relevance modeling. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

QUTqrmTTgBL

  • Run ID: QUTqrmTTgBL
  • Participant: QU
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 677bad0bdb8951a2f426221c2d8b7e8b
  • Run description: Selecting top documents in results of a retrieval system using query expansion based on temporal relevance judgments. Uses an open source language detection tool to filter non-English tweets. Timely resources with respect to the query.

SCIAI124a

  • Run ID: SCIAI124a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1a0e20e4b1d2ecdf30427baa178ab6e9
  • Run description: This run uses Link Crawling, Machine Learning, and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. Machine Learning uses WEKA, with a training set made up of several attributes and a classifier, to decide whether a tweet is relevant. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI124aTTG

  • Run ID: SCIAI124aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: ttg
  • MD5: 8d7a4aac03975ac70ab31b4d7e172834
  • Run description: This run uses Link Crawling, Machine Learning, and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. Machine Learning uses WEKA, with a training set made up of several attributes and a classifier, to decide whether a tweet is relevant. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic (illustrated below).
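
A minimal sketch of the single-pass, first-tweet clustering described above. The description measures "the percentage of the tweet that matched" the cluster's first tweet; the word-set Jaccard overlap and the 0.6 threshold used here are stand-in assumptions.

    # Hypothetical sketch: each cluster is represented by its first tweet; a
    # new tweet joins the first cluster it overlaps beyond the threshold,
    # otherwise it opens a new cluster. The returned representatives are what
    # a TTG run would print.
    def ttg_clusters(tweets, threshold=0.6):
        reps = []  # first tweet of each cluster, in arrival order
        for t in tweets:
            words = set(t.lower().split())
            for r in reps:
                rwords = set(r.lower().split())
                overlap = len(words & rwords) / max(len(words | rwords), 1)
                if overlap >= threshold:
                    break          # joins r's cluster, not emitted
            else:
                reps.append(t)     # starts a new cluster
        return reps

    print(ttg_clusters(["oscar nominees announced",
                        "oscar nominees announced today",
                        "super bowl kickoff time"]))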

SCIAI14a

  • Run ID: SCIAI14a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3ee1500548b1f0b554d1166d789c79db
  • Run description: This run uses Link Crawling and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI14aTTG

  • Run ID: SCIAI14aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: automatic
  • Task: ttg
  • MD5: 2fe4c8ee44545c46b6c6cf5e9ee9a30a
  • Run description: This run uses Link Crawling and rescoreTweets. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic.

SCIAI3am14a

  • Run ID: SCIAI3am14a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: adhoc
  • MD5: fd9a277fb4f63985ade97ee0f37893f8
  • Run description: This run uses a manual Google Query Expansion Module with Link Crawling and rescoreTweets. The module googles each query, restricted to the window between the corpus start time and the query time (so no future evidence is used), and finds the top 4 common words to add to the query. Manual means a user was able to decide which words of the query expansion to keep or discard. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI3am14aTTG

  • Run ID: SCIAI3am14aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: ttg
  • MD5: ef625e7925c6265b38043757690c475b
  • Run description: This run uses a manual Google Query Expansion Module with Link Crawling and rescoreTweets. The module googles each query, restricted to the window between the corpus start time and the query time (so no future evidence is used), and finds the top 4 common words to add to the query. Manual means a user was able to decide which words of the query expansion to keep or discard. Link Crawling uses all links found in the whole corpus to decide, using Lucene, which URLs were best for each topic, then adjusts each tweet's score based on whether it contained a URL and whether that URL made the top-URLs list. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic.

SCIAI3cm4a

  • Run ID: SCIAI3cm4a
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: adhoc
  • MD5: 22ccb3bf56e5c9b9b087dc325075d809
  • Run description: This run uses a manual CommonWords Query Expansion Module with rescoreTweets. CommonWords grabs the top 10 words from each topic's tweets (the initial tweets grabbed by the API). Manual means a user was able to decide which words of the query expansion to keep or discard. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet.

SCIAI3cm4aTTG

  • Run ID: SCIAI3cm4aTTG
  • Participant: SCIAITeam
  • Track: Microblog
  • Year: 2014
  • Submission: 8/5/2014
  • Type: manual
  • Task: ttg
  • MD5: 6d0b100e8d9b733e768a448e35e0414b
  • Run description: This run uses a manual CommonWords Query Expansion Module with rescoreTweets. CommonWords grabs the top 10 words from each topic's tweets (the initial tweets grabbed by the API). Manual means a user was able to decide which words of the query expansion to keep or discard. rescoreTweets goes through each tweet and adjusts its score based on 1) the number of retweets the tweet received and 2) the percentage of the query that appears in the tweet. This is for discovering relevance; for the actual TTG results, we formed clusters for each topic based on the percentage of a tweet that matched the first tweet of each cluster. The first tweet creates the first cluster, and each subsequent tweet either 1) is added to a cluster if its match percentage is higher than the threshold or 2) creates a new cluster. A tweet is matched only against the first tweet in each cluster, because the first tweet holds "the rules" for getting into that cluster. Afterwards, the TTG run prints the first tweet of each cluster for each topic.

SM100

  • Run ID: SM100
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: fa080a601757e00cf2c63f4a8fff1cfa
  • Run description: Used modified cosine similarity to detect similar tweets in the top 100 results of each topic

SM50

  • Run ID: SM50
  • Participant: QCRI
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: e5cd5bedd1751a381a1f7f1689681b21
  • Run description: Used modified cosine similarity to detect similar tweets in the top 50 results of each topic

SR

  • Run ID: SR
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 76b03d2022f5383351cf91fb68e5d3e7
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. All parameters have the same weight (illustrated below).
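
A minimal sketch of the equal-weight combination of word, hashtag, and time similarity, assuming "Affinity Clustering" refers to affinity propagation (here, scikit-learn's AffinityPropagation with a precomputed similarity matrix); the Jaccard similarities and the time normalization are illustrative.

    # Hypothetical sketch: average three similarities with equal weights and
    # cluster with affinity propagation.
    import numpy as np
    from sklearn.cluster import AffinityPropagation

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    def cluster_tweets(tweets, times):
        n = len(tweets)
        words = [{w for w in t.lower().split() if not w.startswith("#")} for t in tweets]
        tags = [{w for w in t.lower().split() if w.startswith("#")} for t in tweets]
        span = (max(times) - min(times)) or 1.0
        S = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                time_sim = 1.0 - abs(times[i] - times[j]) / span
                S[i, j] = (jaccard(words[i], words[j])
                           + jaccard(tags[i], tags[j]) + time_sim) / 3.0
        return AffinityPropagation(affinity="precomputed", random_state=0).fit_predict(S)

    print(cluster_tweets(["oscar nominees announced #oscars",
                          "nominees announced #oscars",
                          "super bowl tonight #sb48"], [0, 60, 3600]))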

SRAH

  • Run ID: SRAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7c044463726111081ae66f54437281e5
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. All parameters have the same weight. Identical to the TTG result.

SRTD

  • Run ID: SRTD
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: e62c6577ded3ea91369cd531ff1b9f49
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. The time measurement has less effect than the other parameters in this run.

SRTDAH

  • Run ID: SRTDAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: a0e3a7a9c19b639727943ed5a0033657
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, hashtag similarity, and time proximity between the tweets. The time measurement has less effect than the other parameters in this run. Identical to the TTG result.

SRTL

  • Run ID: SRTL
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 8b1a52e2ec2fa34392c06bb09cfb8ec8
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity and hashtag similarity with equal weight.

SRTLAH

  • Run ID: SRTLAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 3b264f1cea73ab3776c7c0168b4c9da9
  • Run description: Queries containing four or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity and hashtag similarity with equal weight. Identical to the TTG result.

Standard

  • Run ID: Standard
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: ttg
  • MD5: 51b6031bf194f4ffd2aad2a7109d8504
  • Run description: Queries containing three or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, time proximity and hashtag similarity with equal weight.

StandardAH

  • Run ID: StandardAH
  • Participant: HU_DB
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: baa40f21166f823a7fc5615b81ee3990
  • Run description: Queries containing three or more tokens are enhanced by additional queries that add important words to the query. We cluster the tweets with Affinity Clustering, based on word similarity, time proximity and hashtag similarity with equal weight. Identical to the TTG result.

TTGPKUICST1

  • Run ID: TTGPKUICST1
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 5a6e475312f3ddc209b2263c487e75de
  • Run description: Apply the star clustering method with parameter sigma = 0.7 (illustrated below). Treat the top 200 results from ad-hoc task run PKUICST3 as relevant tweets.
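
A minimal sketch of star clustering with sigma = 0.7, assuming tf-idf cosine similarity: tweets are linked when their similarity reaches sigma, and the highest-degree uncovered tweet repeatedly becomes a star center that covers its neighbors.

    # Hypothetical sketch of the star clustering step.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def star_clusters(tweets, sigma=0.7):
        sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
        n = len(tweets)
        neighbors = {i: {j for j in range(n) if j != i and sims[i, j] >= sigma}
                     for i in range(n)}
        uncovered, stars = set(range(n)), {}
        while uncovered:
            center = max(uncovered, key=lambda i: len(neighbors[i] & uncovered))
            members = (neighbors[center] & uncovered) | {center}
            stars[center] = sorted(members)
            uncovered -= members
        return stars  # star-center index -> member indices

    print(star_clusters(["oscar nominees announced",
                         "oscar nominees announced today",
                         "super bowl kickoff"]))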

TTGPKUICST2

  • Run ID: TTGPKUICST2
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 918760869b976901d4b94209877925aa
  • Run description: Apply hierarchical clustering with the cluster merging threshold set to 0.7 (illustrated below). Treat results from ad-hoc task run PKUICST3 whose score is no less than 4.5 as relevant tweets.
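
A minimal sketch of threshold-based agglomerative clustering, assuming tf-idf vectors, average linkage, cosine distance, and that the 0.7 merging threshold is a similarity (so merging stops at distance 0.3); none of these specifics are stated in the run description.

    # Hypothetical sketch: merge clusters until inter-cluster distance
    # exceeds 1 - threshold, then read off the cluster labels.
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.feature_extraction.text import TfidfVectorizer

    def merge_clusters(tweets, threshold=0.7):
        X = TfidfVectorizer().fit_transform(tweets).toarray()
        Z = linkage(X, method="average", metric="cosine")
        return fcluster(Z, t=1 - threshold, criterion="distance")

    print(merge_clusters(["oscar nominees announced",
                          "oscar nominees announced today",
                          "super bowl kickoff"]))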

TTGPKUICST3

  • Run ID: TTGPKUICST3
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: manual
  • Task: ttg
  • MD5: 3b1370aaa27ffdfec5a294c6f27c2ebb
  • Run description: Apply star clustering method with parameter sigma=0.7. Manually select top Ni results for each query Qi.

TTGPKUICST4

  • Run ID: TTGPKUICST4
  • Participant: PKUICST
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: manual
  • Task: ttg
  • MD5: 388441f503631d397c07463b2ae704d7
  • Run description: Apply hierarchical clustering. Set cluster merging threshold as 0.7. Manually select top Ni results for each query Qi.

UCASRun1

  • Run ID: UCASRun1
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/11/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 04b1481ac8f2b129da87005b94c900f9
  • Run description: This is the baseline run of UCAS. It is a straightforward application of Ranking SVM for the retrieval.

UCASRun2

  • Run ID: UCASRun2
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fb377e4decdb3de206c22ea01b4eaaba
  • Run description: UCASRun2 selects high-quality training data to learn to rank the results, without using the linked web pages.

UCASRun3

  • Run ID: UCASRun3
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/11/2014
  • Type: automatic
  • Task: adhoc
  • MD5: b2851821a0bd86fa45bcfb010df7310f
  • Run description: This is a straightforward application of Ranking SVM for the retrieval, with the use of an external resource, namely the linked web pages.

UCASRun4

  • Run ID: UCASRun4
  • Participant: UCAS
  • Track: Microblog
  • Year: 2014
  • Submission: 8/17/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 785124cf203d19e178461ff3415a817f
  • Run description: UCASRun4 selects high-quality training data to learn to rank the results with the web pages.

udelRunAH

  • Run ID: udelRunAH
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: b47591d75a724e4c4c6dc6adf031b560
  • Run description: This run filters non-English tweets and retweets.

udelRunTTG1

  • Run ID: udelRunTTG1
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 12729657c243a437f334e451ed1f34b3
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters (illustrated below). Retweets and non-English tweets are filtered out before clustering. The most recent tweet within each cluster represents the cluster.
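
A minimal sketch of quality threshold (QT) clustering, assuming tf-idf cosine distances and an illustrative diameter cutoff: grow a candidate cluster around every remaining tweet, keep the largest one, remove its members, and repeat. The greedy candidate growth here is a simplification of the original QT algorithm.

    # Hypothetical sketch of QT clustering over tweet vectors; returns
    # clusters as lists of tweet indices.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_distances

    def qt_clusters(tweets, max_diameter=0.5):
        D = cosine_distances(TfidfVectorizer().fit_transform(tweets))
        remaining = set(range(len(tweets)))
        clusters = []
        while remaining:
            best = None
            for seed in remaining:
                members = [seed]
                for j in sorted(remaining - {seed}, key=lambda j: D[seed, j]):
                    if all(D[j, m] <= max_diameter for m in members):
                        members.append(j)
                if best is None or len(members) > len(best):
                    best = members
            clusters.append(best)
            remaining -= set(best)
        return clusters

    print(qt_clusters(["oscar nominees announced",
                       "oscar nominees announced today",
                       "super bowl kickoff"]))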

udelRunTTG2

  • Run ID: udelRunTTG2
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 9df2493e38d200c43188e9b801f75d40
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters. Retweets and non-English tweets are filtered out before clustering. The most relevant tweet within each cluster represents the cluster.

udelRunTTG3

  • Run ID: udelRunTTG3
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 835e9894e4aa0c23b7b789890a93d655
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters. Tweets from the TREC ad hoc baseline run are used to create the clusters. The most recent tweet within each cluster represents the cluster.

udelRunTTG4

  • Run ID: udelRunTTG4
  • Participant: udel
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: ttg
  • MD5: 3523513e69289acc0291c4680b08848b
  • Run description: The quality threshold (QT) clustering algorithm is used to create semantic clusters. Tweets from the TREC ad hoc baseline run are used to create the clusters. The most relevant tweet within each cluster represents the cluster.

UDInfoLTR

  • Run ID: UDInfoLTR
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 1c819016b9bc920ac45332e002ae07a3
  • Run description: External resources: the language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and Wikimantic, a tool used to detect the concepts in a query. Features: uses a learning-to-rank method in Terrier.

UDInfoMMR5

  • Run ID: UDInfoMMR5
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 44a784f19370ac8d325e018d67796243
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" is used for language detection. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty (illustrated below). The top 5 tweets are used as the result.
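
A minimal sketch of the maximal marginal relevance selection described above, assuming tf-idf cosine similarity for novelty, relevance scores normalized to [0, 1], and a 0.7 mixing weight; the run's actual similarity measure and weight are not given.

    # Hypothetical sketch: greedily pick tweets that balance relevance with
    # novelty (1 - max similarity to tweets already selected).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def mmr_select(tweets, rel_scores, k=5, lam=0.7):
        sims = cosine_similarity(TfidfVectorizer().fit_transform(tweets))
        selected, candidates = [], list(range(len(tweets)))
        while candidates and len(selected) < k:
            def mmr(i):
                novelty = 1.0 - max((sims[i, j] for j in selected), default=0.0)
                return lam * rel_scores[i] + (1 - lam) * novelty
            best = max(candidates, key=mmr)
            selected.append(best)
            candidates.remove(best)
        return [tweets[i] for i in selected]

    print(mmr_select(["oscar nominees announced",
                      "oscar nominees announced today",
                      "super bowl kickoff"],
                     rel_scores=[1.0, 0.9, 0.5], k=2))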

UDInfoMMRA

  • Run ID: UDInfoMMRA
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 0eb622cb8372f3a9d7afcf555ab47b7c
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" is used for language detection. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty. The top 5 tweets are used as the result.

UDInfoMMRWC5

  • Run ID: UDInfoMMRWC5
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: e3a0223bde6f8cdb682e34ca76edb9bb
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and Wikimantic, a tool used to detect the concepts in a query. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty. When computing relevance and novelty, concepts detected by Wikimantic in both queries and tweets are also used. Tweets are iteratively selected out of the original 30; once 5 tweets have been selected, the selection stops.

UDInfoMMRWCA

  • Run ID: UDInfoMMRWCA
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 6501466630b3948a33736ba75d0220e8
  • Run description: External resources: a language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and Wikimantic, a tool used to detect the concepts in a query. Features: choose the top 30 tweets from our ad-hoc run UDInfoQE for each query and re-rank them with maximal marginal relevance, a linear combination of relevance and novelty. When computing relevance and novelty, concepts detected by Wikimantic in both queries and tweets are also used. Tweets are iteratively selected out of the original 30; once 7 tweets have been selected, or the score difference between the current tweet and the previous one exceeds a predefined threshold (0.01 of the MMR score of the previous tweet), the selection stops.

UDInfoQE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDInfoQE
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: bb1f91a9d644ef22748720d947ed9db5
  • Run description: External resources: the language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and wikimantic, a tool used to detect the concepts in a query. Features: Terrier was used to perform this run, with Bo1 as the query expansion model available in the tool. The weighting model was PL2, and we set its parameter to 19 based on earlier tests with the 2013 data; a rough sketch of the setup follows below.
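
For illustration only, a similar Bo1-on-PL2 configuration can be expressed with PyTerrier, the later Python API for Terrier (the 2014 run used Terrier itself, and the Vaswani test corpus below is just a runnable stand-in for the tweet collection):

    import pyterrier as pt
    pt.init()

    # Stand-in corpus for a self-contained example; the actual run searched
    # the track's tweet collection.
    dataset = pt.get_dataset("vaswani")
    index_ref = dataset.get_index()

    # PL2 weighting with its parameter set to 19, as in the run description.
    pl2 = pt.BatchRetrieve(index_ref, wmodel="PL2", controls={"c": 19})

    # Bo1 pseudo-relevance feedback, then a second PL2 pass over the
    # expanded queries.
    bo1 = pt.rewrite.Bo1QueryExpansion(index_ref)
    results = (pl2 >> bo1 >> pl2).transform(dataset.get_topics())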

UDInfoTB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDInfoTB
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 6cbf5dec3d59f7c3d693a145f15a7ea5
  • Run description: Using tie-breaking to apply IR signals (e.g., TF, IDF) one at a time, as sketched below. One external resource is used: the language detection tool described in the paper "Concept-based information retrieval using explicit semantic analysis", published in 2011.
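
Tie-breaking lends itself to a one-line sketch: rank on the primary signal, and let each further signal decide only where the previous ones tie. A minimal Python illustration with assumed per-tweet signal functions:

    # Sort descending on each signal in priority order; Python's tuple
    # comparison consults signal j+1 only where signal j ties.
    def tie_break_rank(tweets, signals):
        return sorted(tweets, key=lambda t: tuple(-s(t) for s in signals))

    # Hypothetical usage: TF first, IDF to break TF ties.
    # ranked = tie_break_rank(tweets, [term_frequency, inverse_doc_frequency])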

UDInfoTBRR

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDInfoTBRR
  • Participant: udel_fang
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: e52835ca6c581a652a3c26d68446b214
  • Run description: External resources: the language detection tool described in the paper "langid.py: an off-the-shelf language identification tool" and wikimantic, a tool used to detect the concepts in a query.

Upitt

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: Upitt
  • Participant: zhg15
  • Track: Microblog
  • Year: 2014
  • Submission: 8/18/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 7e0bf3bad55271f9e5e4f6150009f4ed
  • Run description: We use the Google API to expand the topic queries. We crawled the first 10 result pages from Google using the original queries, then used tf-idf to select the ten highest-weighted words as expansion terms, as sketched below. We checked the expansion words manually and found them suitable for the original queries.
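
A minimal sketch of the term-selection step, assuming `pages` holds the text of the ten crawled Google result pages for one topic (the crawling and scoring details of the actual run may differ):

    from sklearn.feature_extraction.text import TfidfVectorizer
    import numpy as np

    def expansion_terms(pages, n_terms=10):
        # tf-idf over the crawled pages; sum the weights across pages to
        # get one score per term, then keep the n_terms highest.
        vec = TfidfVectorizer(stop_words="english")
        scores = np.asarray(vec.fit_transform(pages).sum(axis=0)).ravel()
        top = np.argsort(scores)[::-1][:n_terms]
        vocab = vec.get_feature_names_out()
        return [vocab[i] for i in top]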

UWMHBUT1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT1
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fdf3266875f0bb737f2963e18a5af4d3
  • Run description: BSR: base run using the TREC API (RunQueriesThrift.java). Some result records were duplicated; we removed the duplicates and added a 1001st record for each affected topic (MB178, MB179, MB189, MB194, and MB197).

UWMHBUT2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT2
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: fa19500dbd0463670bd4bb8ccf38ec03
  • Run description: QWR: query expansion based on term frequency in the top 10 Google results, with a weight for each term. The title and abstract of the top 10 Google result items were used to calculate term frequency. After stop words were removed, the nine highest-frequency terms were added to the original query, giving terms T_j for j = 0, ..., C-1, with C = 10 and T_0 = the original query. Each term T_j was given the weight w_j = (C - j) / sum_{i=0}^{C-1} (i + 1), as computed in the sketch below. This approach follows Kwok et al.'s (2005) idea of introducing web assistance for improved performance (cf. Kwok, K. L., Grunfeld, L., & Deng, P. (2005). Improving weak ad-hoc retrieval by web assistance and data fusion. In Information Retrieval Technology (pp. 17-30). Springer Berlin Heidelberg).
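
The weights are easy to verify in a few lines: with C = 10 the denominator is 1 + 2 + ... + 10 = 55, so the weights decay linearly from 10/55 for the original query down to 1/55 for the last expansion term, and they sum to 1:

    # Weighting scheme from the description:
    # w_j = (C - j) / sum_{i=0}^{C-1} (i + 1), for j = 0, ..., C-1.
    def expansion_weights(C=10):
        denom = sum(i + 1 for i in range(C))      # C*(C+1)/2 = 55 for C = 10
        return [(C - j) / denom for j in range(C)]

    # expansion_weights()[0] == 10/55 (original query), ..., [-1] == 1/55.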

UWMHBUT3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT3
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 49197d4af51b7886dbe2e72896de7090
  • Run description: QER: same as QWR (UWMHBUT2), but with equal weight for each expanded term. The title and abstract of the top 10 Google result items were used to calculate term frequency. After stop words were removed, the nine highest-frequency terms were added to the original query, giving terms T_i for i = 0, ..., C-1, with C = 10 and T_0 = the original query; each term was given equal weight. This approach follows Kwok et al.'s (2005) idea of introducing web assistance for improved performance (cf. Kwok, K. L., Grunfeld, L., & Deng, P. (2005). Improving weak ad-hoc retrieval by web assistance and data fusion. In Information Retrieval Technology (pp. 17-30). Springer Berlin Heidelberg).

UWMHBUT4

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UWMHBUT4
  • Participant: UWM.HBUT
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 26891c8a9cbff2482cb65ab0a2f0715b
  • Run description: EIA: adjusted by an Event Identification Algorithm (EIA). We assume that the distribution over time of tweets about a topic is Gaussian, and that the mean of that distribution is the point at which the largest number of tweets about the topic were posted. We used the posting times of the search results for a query to adjust the ranking: the top 30 search results are analyzed, and the mean of their posting times is taken as the point where the topic is hot. Accordingly, we give slightly more weight to tweets posted close to this hot spot. Specifically, R_n = R_o * (alpha + (1 - alpha) * E), where R_n is the new ranking score, R_o is the original ranking score from the TREC Microblog API, E is the event effect, and alpha is an adjusting parameter (we chose 0.8 for this run). The event weighting factor is E = f(t, mu, sigma) * R_o, where f(t, mu, sigma) is the Gaussian density, t is the time gap between the query time and the time the tweet was posted, mu is the event center (the hot point), and sigma is the standard deviation. We calculated sigma from the top 1500 search results; to smooth it, we use sigma = 3 * sigma_1, where sigma_1 is the standard deviation of the posting times of the top 1500 search results, in days. A sketch follows below.
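
Read literally, the formulas above translate into a few lines of Python. The sketch below uses scipy's normal density for f and spells out the symbols alpha, mu, and sigma, which were garbled in the original description:

    from scipy.stats import norm

    def event_adjusted_score(r_o, t, mu, sigma, alpha=0.8):
        # E = f(t, mu, sigma) * R_o, with f the Gaussian density and t the
        # gap (in days) between the query time and the posting time.
        e = norm.pdf(t, loc=mu, scale=sigma) * r_o
        # R_n = R_o * (alpha + (1 - alpha) * E)
        return r_o * (alpha + (1 - alpha) * e)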

wistudA1

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudA1
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/15/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 2ff658f381fbf4f273577d025086b3dd
  • Run description: 1. Using Jimmy's API output as the input; 2. clustering the tweets with the Lingo algorithm.

wistudt1bc

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt1bc
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: 53d1a7884f0359c9ace9eaa8540e31cc
  • Run description: Using the baseline run as input and returning popular tweets.

wistudt1q

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt1q
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: ad62478aa8a41597b399c78b3f7aa6b5
  • Run description: Simple query expansion (with knowledge from the Wikipedia dump of January 2013).

wistudt1qc

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt1qc
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: adhoc
  • MD5: e52e649063e84e718086ffe8eb7b61f3
  • Run description: Using a Wikipedia dump (January 2013 version) to expand the query, and filtering out non-popular tweets.

wistudt2bcd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2bcd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: c50a75845e118d2dfb94c725f5403546
  • Run description: Cluster the tweets using cosine similarity, select one tweet from each cluster, and limit the number of tweets to 20, as sketched below. Then use the duplicate detection framework (with syntactical and contextual features) to remove the duplicates.
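
A minimal sketch of the cluster-and-select step, assuming the tweets arrive ranked and already vectorised; the similarity threshold of 0.6 is an illustrative value, not one reported by the participants:

    import numpy as np

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Greedy single-pass clustering: a tweet is kept as a cluster
    # representative only if it is not too similar to any tweet already
    # kept; selection stops at `limit` tweets.
    def select_representatives(tweets, vectors, threshold=0.6, limit=20):
        reps, rep_vecs = [], []
        for tweet, vec in zip(tweets, vectors):
            if any(cosine(vec, rv) >= threshold for rv in rep_vecs):
                continue                   # near-duplicate of a kept tweet
            reps.append(tweet)
            rep_vecs.append(vec)
            if len(reps) >= limit:
                break
        return reps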

wistudt2bd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2bd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 595035005d55bdc7d4be3799675f69cd
  • Run description: Using the baseline run as input, detecting duplicates with syntactical and contextual features, and reserving up to 20 tweets (each from a cluster of detected duplicates) within a depth of 500.

wistudt2qcd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2qcd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 31932b7818987a3e33764bfeec2c8d7a
  • Run description: Using the query expansion run (with knowledge from Wikipedia as of January 2013) as input, cluster the tweets using cosine similarity, select one tweet from each cluster, and limit the number of tweets to 20. Then use the duplicate detection framework (with syntactical and contextual features) to remove the duplicates.

wistudt2qd

Results | Participants | Input | Summary | Appendix

  • Run ID: wistudt2qd
  • Participant: wistud
  • Track: Microblog
  • Year: 2014
  • Submission: 8/19/2014
  • Type: automatic
  • Task: ttg
  • MD5: 7d0f059ce504434167a0d9c0edea0198
  • Run description: Using the query expansion run (with knowledge from the January 2013 version of Wikipedia) as input, detecting duplicates with syntactical and contextual features, and reserving up to 20 tweets (each from a cluster of detected duplicates) within a depth of 500.