Runs - Web 2009

arsc09web

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: arsc09web
  • Participant: ARSC09
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 05b2798c343aadd818c5d710b78a6dd9
  • Run description: The ARSC team used the Google N-Gram Corpus to create vocabularies of 1-gram, 2-gram, and 3-gram phrase tokens. For the official judged run, a fraction of the Category B collection was indexed using the fixed vocabulary of 1-gram terms (a "go list" rather than a stop list) and searched with a custom index/search application built with the Lucene toolkit. Multiple Sun Fire X4600 nodes, each with 64 GB of RAM, from the ARSC Midnight cluster were used for the indexing and search tasks.
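The "go list" idea can be sketched in a few lines: instead of discarding tokens on a stop list, keep only tokens that appear in a fixed vocabulary. The vocabulary and sample text below are toy stand-ins, not the actual Google N-Gram data.

```python
# "Go list" filtering sketch: keep only tokens found in a fixed
# vocabulary (the opposite of a stop list). GO_LIST is illustrative.
GO_LIST = {"web", "search", "index", "lucene", "cluster"}

def go_list_tokens(text):
    """Lowercase, split on whitespace, and keep only in-vocabulary tokens."""
    return [tok for tok in text.lower().split() if tok in GO_LIST]

tokens = go_list_tokens("Search the Web index with the Lucene toolkit")
```

Everything outside the vocabulary is simply never indexed, which bounds the index's term dictionary up front.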

ICTNETADRun3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: ICTNETADRun3
  • Participant: ICTNET
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: f503aee0854f321a102743de0f28858c
  • Run description: Uses an improved BM25 model with a search extension.

ICTNETADRun4

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: ICTNETADRun4
  • Participant: ICTNET
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 28ad749f5f1aa7d3d726094bdc28f253
  • Run description: Uses an improved BM25 model.
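Several runs on this page build on BM25. As a reference point, here is the textbook BM25 formula over a toy corpus; it is not ICTNET's improved variant, whose modifications are not described here.

```python
import math

# Minimal textbook BM25 scorer; k1 and b are the usual free parameters.
# Documents are token lists; the corpus is toy data.
def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc.count(term)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["web", "search"], ["web", "spam"], ["cat"]]
s = bm25_score(["web", "search"], corpus[0], corpus)
```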

ICTNETADRun5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: ICTNETADRun5
  • Participant: ICTNET
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 5808d4d7d5b31c415dae9580415a14f3
  • Run description: Uses an improved BM25 model with optimized parameters. Please use this run, because we cannot replace the earlier run whose PRECEDENCE is 2.

ICTNETDivR1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: ICTNETDivR1
  • Participant: ICTNET
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: d124d75a8bf95ea5605109f965a5c257
  • Run description: Performs query expansion using the top k results returned by the Google search engine; the weighting model is KL.

ICTNETDivR2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: ICTNETDivR2
  • Participant: ICTNET
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 614393fc25e5f5b3f5a4d0d2f47d4ba6
  • Run description: (1) Cluster the top N search results from the Google search engine; (2) extract the keywords for each subtopic from the clustered texts; (3) retrieve documents by searching the document collection with the extracted keywords; (4) rank those documents using a greedy algorithm.

ICTNETDivR3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: ICTNETDivR3
  • Participant: ICTNET
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 88d19930162ad3fd04e8db6af1fd5aed
  • Run description: Uses the k-means algorithm to form distinct clusters.
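The k-means loop used by this and the NeuRRWeb300 run can be sketched on one-dimensional points; real runs cluster high-dimensional document vectors, but the assign-then-recenter update is the same. All data here is illustrative.

```python
import random

# Tiny k-means on 1-D points: assign each point to its nearest center,
# then move each center to the mean of its assigned points.
def kmeans_1d(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # keep the old center if a cluster ends up empty
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

centers = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.0], k=2)
```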

IE09

Results | Participants | Input | Summary | Appendix

  • Run ID: IE09
  • Participant: York
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: cf45ab9065c2ff794fdf655baf10a088
  • Run description: This run uses the number of inlinks to a page in ranking, along with other factors.

irra1a

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: irra1a
  • Participant: IRRA
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 8f643e2ae813e546bdf2e341554921db
  • Run description: This is the base IRRA run for adhoc task.

irra1d

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: irra1d
  • Participant: IRRA
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: diversity
  • MD5: 6a6027791df65d4ce844a1766f1a4d34
  • Run description: This run is the base IRRA run for diversity task.

irra2a

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: irra2a
  • Participant: IRRA
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 25769e901f7e12e25d5370088ad59909
  • Run description: This run is a variation of the base IRRA run for adhoc task.

irra2d

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: irra2d
  • Participant: IRRA
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: diversity
  • MD5: f53964da7c9fdfb8b1d53de0ef400de7
  • Run description: This run is a variation of the base IRRA run for diversity task.

irra3a

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: irra3a
  • Participant: IRRA
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e158aec0de79b78eb706f6519bebe9fd
  • Run description: This run is another variation of the base IRRA run for adhoc task.

irra3d

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: irra3d
  • Participant: IRRA
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: diversity
  • MD5: 1ac189040b7f7f5f9391176a29b78d29
  • Run description: This run is another variation of the base IRRA run for diversity task.

MS1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MS1
  • Participant: msrc
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 980d4b0dec45220a6085c5e3957c7e46
  • Run description: We train our system on a set of training queries, using the top documents retrieved by Bing as relevance judgments.

MS2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MS2
  • Participant: msrc
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: eabef716a85fa12ba1641931a546068b
  • Run description: The same as the first run with some extra features: we train our system on a set of training queries, using the top documents retrieved by Bing as relevance judgments.

MSDiv1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSDiv1
  • Participant: msrc
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 1dc8301ce6d90939e03f7f5d1ee6e2ba
  • Run description: We train our system on a set of training queries, using the top documents retrieved by Bing as relevance judgments.

MSDiv2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSDiv2
  • Participant: msrc
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: f9bc6c4cbb46b0f889d7b1c585023c16
  • Run description: The same as the first run with some extra features: we train our system on a set of training queries, using the top documents retrieved by Bing as relevance judgments.

MSDiv3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSDiv3
  • Participant: msrc
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: 8f78b9cb8d2d0c641425dccd6ef3b7ab
  • Run description: Extremely simple ranker. Host-collapse limits the number of results from each host.

MSRAACSF

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSRAACSF
  • Participant: MSRAsia
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: f558e65107c5a75d746392e77e9e7cb1
  • Run description: Diversifies results using clustering (C), sites (S), and anchors (A); only results from the main query are retained (F).

MSRAAF

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSRAAF
  • Participant: MSRAsia
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 09772edf9b8cb524672413b47cc62740
  • Run description: baseline + anchor-based rerank

MSRABASE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSRABASE
  • Participant: MSRAsia
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: c7ecd0f5c5067e8271c5dfd5621df961
  • Run description: baseline: msra2000

MSRAC

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSRAC
  • Participant: MSRAsia
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 0049bfcabe7085b3677b7b5f07ef4a45
  • Run description: baseline + clustering-based diversity

MSRACS

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSRACS
  • Participant: MSRAsia
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: 7e26bce094542715b3d503cf42d8dc3e
  • Run description: diversify results by using clustering and sites

MSRANORM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MSRANORM
  • Participant: MSRAsia
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e09ed3c89c785f3c58c788c510671cdc
  • Run description: A baseline using msra2000 (without term proximity in anchors).

muadanchor

Results | Participants | Input | Summary | Appendix

  • Run ID: muadanchor
  • Participant: unimelb
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: a9cdac46a31e3c09d06c6eebb2f2a8ec
  • Run description: standard impact score on incoming anchor text

muadibm5

Results | Participants | Input | Summary | Appendix

  • Run ID: muadibm5
  • Participant: unimelb
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: bacb296229507ecf0dde1daccb53441c
  • Run description: BM25 on impact score, content only.

muadimp

Results | Participants | Input | Summary | Appendix

  • Run ID: muadimp
  • Participant: unimelb
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 366afc0d6451af761ba21df12230c469
  • Run description: standard impact score, content only.

mudvibm5

Results | Participants | Input | Summary | Appendix

  • Run ID: mudvibm5
  • Participant: unimelb
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: 3e0528836cf74bdca70b4694c07d68d1
  • Run description: bm25 on content impact

mudvimp

Results | Participants | Input | Summary | Appendix

  • Run ID: mudvimp
  • Participant: unimelb
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 3cdc05cb031401625b57b759ae9c2571
  • Run description: standard impact score on incoming anchor text

NeuDiv1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NeuDiv1
  • Participant: NEU
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 46c885114ff3faa108e2122b625e3040
  • Run description: (1) Run Indri to get the top 2000 docs; (2) run a simple spam filter based on term frequency and statistical match with the expected Zipf distribution for English text, eliminating spam; (3) tag the documents with dictionary-based tags (developed by NEU); (4) greedily pick documents (from those with roughly the same Indri score) that maximize a diversity utility against the already-retrieved set of documents.

NeuDivW75

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NeuDivW75
  • Participant: NEU
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: 95555522db738ceeb9d695c434fb1a01
  • Run description: (1) Run Indri to get 2000 docs per query; (2) run a simple statistical spam filter; (3) tag documents using dictionaries; (4) diversify based on the universal tags: repeatedly, out of 75 docs with consecutive Indri scores, select the one that maximizes the tag-diversity utility against the set already retrieved.
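The greedy tag-diversity step can be sketched as follows: from a window of candidates with similar retrieval scores, repeatedly pick the document whose tags add the most unseen tags to the result list. The tags and the small window size below are toy stand-ins for NEU's dictionary-based tags and window of 75.

```python
# Greedy diversity selection sketch over (doc_id, tag_set) pairs given
# in retrieval-score order. Window size and tags are illustrative.
def greedy_diverse(ranked, window=3):
    remaining = list(ranked)
    seen_tags, selected = set(), []
    while remaining:
        cands = remaining[:window]  # docs with roughly consecutive scores
        # novelty = number of tags not yet covered by the selection
        best = max(cands, key=lambda d: len(d[1] - seen_tags))
        selected.append(best[0])
        seen_tags |= best[1]
        remaining.remove(best)
    return selected

order = greedy_diverse([
    ("d1", {"sports"}),
    ("d2", {"sports"}),
    ("d3", {"politics", "sports"}),
])
```

Note how d3 is promoted ahead of d1 and d2 because it covers more tags.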

NeuLMWeb300

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NeuLMWeb300
  • Participant: NEU
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 93b3c3abf5a23e24c2a60a722a7e66a9
  • Run description: (1) Run Indri to get 2000 docs for each query; (2) run a spam filter based on the expected Zipfian-distribution entropy for English text, after some statistical analysis, using a threshold of 300 for the spam value; (3) output the top 1000 remaining docs.

NeuLMWeb600

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NeuLMWeb600
  • Participant: NEU
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e4ecbf1411c911cbe559303e2aa623c6
  • Run description: (1) Run Indri to get 2000 docs for each query; (2) run a spam filter based on the expected Zipfian-distribution entropy for English text, after some statistical analysis, using a threshold of 600 for the spam value; (3) output the top 1000 remaining docs.

NeuLMWebBase

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NeuLMWebBase
  • Participant: NEU
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 8243ee682215c50ded7f056c72c54b52
  • Run description: Run Indri to get 1000 docs for each query.

NeuRRWeb300

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: NeuRRWeb300
  • Participant: NEU
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 55397c2afc00eccf94a3851ea9f91e9b
  • Run description: (1) Run Indri to get 2000 docs for each query; (2) run a spam filter based on the expected Zipfian-distribution entropy for English text, using a threshold of 300 for the spam value; (3) identify entity tags; (4) run k-means using cosine similarity between document tags; (5) rank the documents within each cluster by Indri score; (6) fuse the clusters by round-robin into a single list.
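The final round-robin fusion step can be sketched as interleaving the per-cluster ranked lists, one document from each cluster in turn; cluster contents below are toy data.

```python
from itertools import zip_longest

# Round-robin fusion sketch: take the best remaining doc from each
# cluster in turn, skipping duplicates and exhausted clusters.
def round_robin(clusters):
    fused, seen = [], set()
    for tier in zip_longest(*clusters):  # rank-1 docs, then rank-2, ...
        for doc in tier:
            if doc is not None and doc not in seen:
                seen.add(doc)
                fused.append(doc)
    return fused

fused = round_robin([["a", "b"], ["c"], ["d", "e", "f"]])
```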

pkuLink

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pkuLink
  • Participant: pku2009
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: db86e0db3a391ad46a15f64d75abf797
  • Run description: In this run, we use the HTML pages' structure and link information. We remove the HTML tags and index the whole collection on 23 machines, with Porter stemming and stop-word removal. Anchor text and link information are computed with Hadoop. When searching, we use a multi-field BM25 formula, term-proximity information, and link information to rank results, then merge the results from the 23 machines to produce the final result.

pkuSewmTp

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pkuSewmTp
  • Participant: pku2009
  • Track: Web
  • Year: 2009
  • Submission: 8/13/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 6847607851242d9d448a35440a089a36
  • Run description: In this run, we use only the HTML pages' content information. We remove the HTML tags and index the whole collection on 23 machines, with Porter stemming and stop-word removal. When searching, we use the BM25 formula and term-proximity information to rank results, then merge the results from the 23 machines to produce the final result.

pkuStruct

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: pkuStruct
  • Participant: pku2009
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 0c303db646306a2c551b677c36fac763
  • Run description: In this run, we use the HTML pages' structural information, such as title, anchor, and h1. We remove the HTML tags and index the whole collection on 23 machines, with Porter stemming and stop-word removal. When searching, we use a multi-field BM25 formula and term-proximity information to rank results, then merge the results from the 23 machines to produce the final result.

RmitDiv

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: RmitDiv
  • Participant: RMIT
  • Track: Web
  • Year: 2009
  • Submission: 8/15/2009
  • Type: automatic
  • Task: diversity
  • MD5: fd2b4f80b42f291a353876af1bd98821
  • Run description: Language Model run allowing a single (top ranked) result from each unique domain.
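Keeping a single top-ranked result per domain is a simple pass over the ranked list; the URLs below are illustrative.

```python
from urllib.parse import urlparse

# Sketch: keep only the first (top-ranked) result from each unique
# domain, preserving the original ranking otherwise.
def one_per_domain(ranked_urls):
    seen, kept = set(), []
    for url in ranked_urls:
        domain = urlparse(url).netloc
        if domain not in seen:
            seen.add(domain)
            kept.append(url)
    return kept

kept = one_per_domain([
    "http://example.com/a",
    "http://example.com/b",
    "http://other.org/c",
])
```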

RmitLm

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: RmitLm
  • Participant: RMIT
  • Track: Web
  • Year: 2009
  • Submission: 8/15/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 8dabce8406ac9242d75a3a46806ffe3d
  • Run description: Baseline Zettair run using Language Modelling

RmitOkapi

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: RmitOkapi
  • Participant: RMIT
  • Track: Web
  • Year: 2009
  • Submission: 8/15/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 6d2965ba3bbf7d5e29839a7b1809d280
  • Run description: Baseline Zettair run using Okapi BM25

Sab9wtBase

Results | Participants | Input | Summary | Appendix

  • Run ID: Sab9wtBase
  • Participant: SABIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: f5975c0a7159ea662eb17cbe07f0231f
  • Run description: Base SMART ltu.Lnu run with no expansion.

Sab9wtBDiv1

Results | Participants | Input | Summary | Appendix

  • Run ID: Sab9wtBDiv1
  • Participant: SABIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 3425eda6cba3483e915c90ad4ba641ef
  • Run description: Initial base SMART run, ltu.Lnu. Rerank the top 10-30 docs: starting at rank 2, choose from the candidate docs the one with the lowest maximum inner product to the higher-ranked docs already added. Candidate docs are those with rank at most min(4*i, 30), where i is the number already added. Stop reranking after 10 docs have been added.

Sab9wtBDiv2

Results | Participants | Input | Summary | Appendix

  • Run ID: Sab9wtBDiv2
  • Participant: SABIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 4ec028e297e76603197ac70e4b11d5b9
  • Run description: Initial base SMART run, ltu.Lnu. Rerank the top 20-70 docs: starting at rank 2, choose from the candidate docs the one with the lowest maximum inner product to the higher-ranked docs already added. Candidate docs are those with rank at most min(5*i, 70), where i is the number already added. Stop reranking after 20 docs have been added. Docs weighted Ltu.
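The SABIR rerank can be sketched as follows: at each step, the candidate window grows with the number of docs already selected, and the winner is the candidate least similar (by maximum inner product) to everything selected so far. The term-weight vectors and small window parameters below are toy stand-ins for Ltu-weighted SMART vectors.

```python
# Max-inner-product diversity rerank sketch over (doc_id, weights) pairs
# given in score order; weights are sparse term-weight dicts (toy data).
def dot(u, v):
    return sum(u[t] * v.get(t, 0.0) for t in u)

def diversity_rerank(ranked, n_select=3, factor=5, cap=7):
    selected = [ranked[0]]          # rank-1 doc is always kept first
    pool = list(ranked[1:])
    while len(selected) < n_select and pool:
        # candidate window grows with the number already selected
        window = pool[:min(factor * len(selected), cap)]
        best = min(window,
                   key=lambda d: max(dot(d[1], s[1]) for s in selected))
        selected.append(best)
        pool.remove(best)
    return [d[0] for d in selected]

reranked = diversity_rerank([
    ("d1", {"a": 1.0}),
    ("d2", {"a": 1.0}),          # near-duplicate of d1, gets demoted
    ("d3", {"b": 1.0}),
    ("d4", {"a": 0.5, "b": 0.5}),
])
```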

Sab9wtBf1

Results | Participants | Input | Summary | Appendix

  • Run ID: Sab9wtBf1
  • Participant: SABIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: b55f7ba2ceffabf7cca81df1ebc04d1f
  • Run description: Blind feedback after the SMART ltu.Lnu base run: 25 docs, 20 added terms, weighted Rocchio (collection weights a,b,c = 32,64,128). Lnu docs; all docs in the collection considered non-relevant.
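The Rocchio update behind this blind-feedback run has the familiar form: new query = a*q + b*(centroid of pseudo-relevant docs) - c*(centroid of non-relevant docs). The sketch below uses the run's 32/64/128 weights but toy sparse vectors, not actual SMART weighting.

```python
# Rocchio blind-feedback sketch over sparse term-weight dicts (toy data).
# Negative-weight terms are dropped, as is conventional.
def rocchio(query, rel_docs, nonrel_docs, a=32.0, b=64.0, c=128.0):
    terms = set(query)
    for d in rel_docs + nonrel_docs:
        terms |= set(d)
    new_q = {}
    for t in terms:
        rel = sum(d.get(t, 0.0) for d in rel_docs) / max(len(rel_docs), 1)
        non = sum(d.get(t, 0.0) for d in nonrel_docs) / max(len(nonrel_docs), 1)
        w = a * query.get(t, 0.0) + b * rel - c * non
        if w > 0:
            new_q[t] = w
    return new_q

q2 = rocchio({"web": 1.0}, [{"web": 0.5, "spam": 0.2}], [{"spam": 0.9}])
```

With these weights, "spam" is pushed negative by the non-relevant centroid and drops out of the expanded query.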

Sab9wtBf2

Results | Participants | Input | Summary | Appendix

  • Run ID: Sab9wtBf2
  • Participant: SABIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 317d32011dc361723f8b3dbfe0f66397
  • Run description: Blind feedback after the SMART ltu.Lnu base run: 25 docs, 20 added terms, weighted Rocchio a,b,c = 32,8,0. Ltu docs; no non-relevant docs.

Sab9wtBfDiv

Results | Participants | Input | Summary | Appendix

  • Run ID: Sab9wtBfDiv
  • Participant: SABIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 369b9f2e377efe9d7e984dacfb1b19e4
  • Run description: Blind feedback after the SMART ltu.Lnu base run: 25 docs, 20 added terms, weighted Rocchio (collection weights a,b,c = 32,64,128); Lnu docs, all docs in the collection considered non-relevant. Then rerank the top 20-70 docs: starting at rank 2, choose from the candidate docs the one with the lowest maximum inner product to the higher-ranked docs already added. Candidate docs are those with rank at most min(5*i, 70), where i is the number already added. Stop reranking after 20 docs have been added. Docs weighted Ltu.

scutrun1

Results | Participants | Input | Summary | Appendix

  • Run ID: scutrun1
  • Participant: scut_kapok
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 6d93d34c6c0d12de205c1e1a3745096f
  • Run description: The first run from the South China University of Technology: scutrun1.

scutrun2

Results | Participants | Input | Summary | Appendix

  • Run ID: scutrun2
  • Participant: scut_kapok
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 6bfd7e5bc10d41b699e03a7fc3d09561
  • Run description: The second run from the South China University of Technology: scutrun2.

scutrun3

Results | Participants | Input | Summary | Appendix

  • Run ID: scutrun3
  • Participant: scut_kapok
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 1e5f12b26335f421fc138b2464848d04
  • Run description: The third run from the South China University of Technology: scutrun3.

SIEL09

Results | Participants | Input | Summary | Appendix

  • Run ID: SIEL09
  • Participant: SIEL
  • Track: Web
  • Year: 2009
  • Submission: 8/18/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 027ca07d066870f1f90e8edefa567ed9
  • Run description: just simple adhoc retrieval

spc

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: spc
  • Participant: UAms
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: c255887bd5b0f5923d38ebe63b37d899
  • Run description: This run uses single-pass clustering to rerank the documents from an initial run with MRF + Wikipedia non-article filtering.

THUIR09AbClu

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: THUIR09AbClu
  • Participant: THUIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: e6cbaad6398c8bf4ff1a3a8fc12ecece
  • Run description: Baseline + web page abstract extraction + result clustering + diversity document selection with site-based duplicate elimination. Baseline: improved probabilistic model with PageRank ranking and wordpair model. Retrieve on the combined full text and in-link anchor. Web page spam filtering. Multi-field search.

THUIR09An

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: THUIR09An
  • Participant: THUIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: acb7ea691c7436b82733e40704eca9d5
  • Run description: Improved probabilistic model with PageRank ranking and wordpair model. Retrieve only on the in-link anchor text. Web page spam filtering. Multi-field search.

THUIR09FuClu

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: THUIR09FuClu
  • Participant: THUIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 1cc0e5c28544b663847089a167a869d5
  • Run description: Baseline + result clustering + diversity document selection with site-based duplicate elimination. Baseline: improved probabilistic model with PageRank ranking and wordpair model. Retrieve on the combined full text and in-link anchor. Web page spam filtering. Multi-field search.

THUIR09LuTA

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: THUIR09LuTA
  • Participant: THUIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: bbe1e6c2c2a33e7c7d28da0bb341acdf
  • Run description: Lucene with BM25 ranking. Retrieve on both full text and in-link anchor. Result reranking with PageRank. Web page spam filtering. Multi-field search.

THUIR09QeDiv

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: THUIR09QeDiv
  • Participant: THUIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 8648f32fec1474f2ae38ed604a5acd58
  • Run description: Retrieval with Query expansion based on Google query recommendation + diversity document selection + site and content based duplicate elimination.

THUIR09TxAn

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: THUIR09TxAn
  • Participant: THUIR
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 2d1859b130b046829ca1d28b7e3981e6
  • Run description: Improved probabilistic model with PageRank ranking and wordpair model. Retrieve on the combined full text and in-link anchor. Web page spam filtering. Multi-field search.

tm

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: tm
  • Participant: UAms
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 2e5043e7409006dbc52835bade5432b7
  • Run description: topic model + re-ranking with top 2500 docs

twCSodpRBB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: twCSodpRBB
  • Participant: utwente
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 49545612bc61705e084985833298d0b1
  • Run description: This run also relies on ODP categories; no merging of results is performed, just a simple round-robin over the different corpora.

twCSodpRNB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: twCSodpRNB
  • Participant: utwente
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: a81f241d251afe6c48115774b72f58a7
  • Run description: 100 category terms from the ODP directory were used for diversification. Each query was run 100 times, once together with each category term; the retrieval score of the top document containing all terms (query terms + category term) was recorded, and the categories of the top 5 scoring test queries were used for retrieval of the final result list. Spam detection relies on the UK Webspam corpus.

twCSrs9N

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: twCSrs9N
  • Participant: utwente
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 1e2b8dba652d1f52463f8f3c1ebeaf9d
  • Run description: Around 30 retrieval methods were created and run on each subcorpus (ClueWeb_English_1 to ClueWeb_English_10). The performance of each method was estimated by an automatic method, and the best estimated results per subcorpus were used during merging (ZMUV merging). Basic spam detection was implemented, relying on content-only features; part of the training data from the UK Webspam collection (barcelona.research.yahoo.net/webspam/) was used to train the spam decision tree.
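ZMUV merging normalizes each subcorpus's scores to zero mean and unit variance (z-scores) so that results from differently calibrated retrieval methods become comparable before interleaving by score. The sketch below uses toy scores, not actual subcorpus output.

```python
import statistics

# ZMUV merging sketch: z-normalize each run's scores, then merge all
# results into one list ordered by normalized score (toy data).
def zmuv_merge(runs):
    """runs: list of lists of (doc_id, raw_score)."""
    merged = []
    for run in runs:
        scores = [s for _, s in run]
        mu = statistics.mean(scores)
        sigma = statistics.pstdev(scores) or 1.0  # guard constant scores
        merged.extend((doc, (s - mu) / sigma) for doc, s in run)
    return [doc for doc, _ in sorted(merged, key=lambda x: -x[1])]

merged = zmuv_merge([
    [("a", 10.0), ("b", 2.0)],   # high raw-score scale
    [("c", 0.9), ("d", 0.1)],    # low raw-score scale
])
```

Without normalization, every result from the first run would outrank the second run's best; after z-scoring, each run's winner ranks ahead of both losers.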

twCSrsR

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: twCSrsR
  • Participant: utwente
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: c2de41f182a88baa54abe0ebaf3d9805
  • Run description: Very similar to run twCSrs9N. Differences: a variation of the automatic estimation method, and no normalization in the merging process; instead, a simple round-robin procedure gathers the results from each subcorpus (ClueWeb_English_1 to ClueWeb_English_10).

twJ48rsU

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: twJ48rsU
  • Participant: utwente
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 9350da15080401b28e1994f5ca19f727
  • Run description: A less aggressive spam-detection method is used; merging without normalization.

UamsAw7an3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UamsAw7an3
  • Participant: Amsterdam
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: ba1e99a81be49f50c5089e143219c057
  • Run description: Merge of a text run with a document-length prior and an anchor-text run with an anchor-text-length prior; weights 0.7 text + 0.3 anchor.
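The weighted merge amounts to a linear combination of the two runs' scores per document. The sketch below uses the 0.7/0.3 weights from the description with toy scores, and assumes the two runs' scores are already on comparable scales.

```python
# Weighted score-fusion sketch: fused(d) = 0.7*text(d) + 0.3*anchor(d),
# treating a missing document as score 0. Scores are toy data.
def weighted_merge(text_scores, anchor_scores, w_text=0.7, w_anchor=0.3):
    docs = set(text_scores) | set(anchor_scores)
    fused = {d: w_text * text_scores.get(d, 0.0)
                + w_anchor * anchor_scores.get(d, 0.0)
             for d in docs}
    return sorted(fused, key=fused.get, reverse=True)

ranking = weighted_merge({"a": 1.0, "b": 0.2}, {"b": 1.0, "c": 0.5})
```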

UamsAwebQE10

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UamsAwebQE10
  • Participant: Amsterdam
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 6e2cea64c436c350dc15ac5e469e6166
  • Run description: Merge of 10 runs based on 10 distinct expanded queries, using the top 10 results of a baseline run.

UamsDancTFb1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UamsDancTFb1
  • Participant: Amsterdam
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 0d442129b07f69d0b4fd5216ca20b078
  • Run description: Run on anchor text index with length prior on anchor text length, top down filtered on the contribution of unseen words.

UamsDwebLFou

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UamsDwebLFou
  • Participant: Amsterdam
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 4a9054bc4893132a24c23e53bed639fd
  • Run description: Run on text index with document length prior, top down filtered on the contribution of unseen outgoing links.

UamsDwQE10TF

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UamsDwQE10TF
  • Participant: Amsterdam
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 542479a8088967e4efd87846a7cad003
  • Run description: Merge of 10 runs on text index based on 10 distinct expanded queries, with a document length prior, top down filtered on the contribution of unseen words.

UCDSIFTdiv

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UCDSIFTdiv
  • Participant: CSIUCD
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 51e7e297b7174ab6311b9f58ec72ac8f
  • Run description: Data fusion using the SlideFuse algorithm on input topfiles from the Terrier IR system.

UCDSIFTinter

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UCDSIFTinter
  • Participant: CSIUCD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 5c4f0a104562ab957dc34532085efccb
  • Run description: Fusion using the InterFuse algorithm and input topfiles from the Terrier IR system.

UCDSIFTprob

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UCDSIFTprob
  • Participant: CSIUCD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: c187b45d1affe2e2db55cead44f9be9e
  • Run description: Fusion using the ProbFuse algorithm and input topfiles from the Terrier IR system.

UCDSIFTslide

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UCDSIFTslide
  • Participant: CSIUCD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e797e5de2dc643c6b9122f948136965b
  • Run description: Fusion using the SlideFuse algorithm and input topfiles from the Terrier IR system.

udelFMRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: udelFMRM
  • Participant: UDel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 9a9eed61f000ad90450cbe3daf3db3b2
  • Run description: Carterette & Chandar facet models constructed from document relevance models.

udelFMWG

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: udelFMWG
  • Participant: UDel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: a76077d6381ccba18b92b1ea7700b99a
  • Run description: Carterette & Chandar facet models constructed from HITS-style web graph of links among top 200 docs retrieved by basic indri query.

udelIndDMRM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: udelIndDMRM
  • Participant: UDel
  • Track: Web
  • Year: 2009
  • Submission: 8/18/2009
  • Type: automatic
  • Task: adhoc
  • MD5: fca748cfdb71a6308739288f4a7baa34
  • Run description: indri run with Metzler & Croft dependence models and pseudo-relevance feedback with Lavrenko & Croft relevance models. parameters trained in a semi-supervised fashion using last year's Million Query data.

udelIndDRPR

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: udelIndDRPR
  • Participant: UDel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 1e1ec7882c6f2b921f858c1c0cb11e9e
  • Run description: indri run with dependence models, relevance models, and pagerank; parameters trained semi-supervisedly using last year's MQ data

udelIndDRSP

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: udelIndDRSP
  • Participant: UDel
  • Track: Web
  • Year: 2009
  • Submission: 8/18/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 3e813551652ac0272a080c79f62d7fca
  • Run description: indri run with Metzler & Croft dependence models and pseudo-relevance feedback with Lavrenko & Croft relevance models, with a "domain trust" document prior. parameters trained in a semi-supervised fashion using last year's Million Query data. "domain trust" calculated by looking at presence of domain on publicly-available URL and sendmail whitelists and blacklists.

udelSimPrune

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: udelSimPrune
  • Participant: UDel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: ea2ba4ff5c7f4b624b4d278eb049977d
  • Run description: greedy pruning based on document-document similarities among top 200 retrieved by standard indri query.

UDWAxBL

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDWAxBL
  • Participant: EceUdel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: a361ca7d721fb0010acac8dab1c46ba8
  • Run description: Axiomatic method

UDWAxQE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDWAxQE
  • Participant: EceUdel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: b96c66bd10bfe8081ccfe0a03a83624b
  • Run description: Axiomatic method and query expansion.

UDWAxQEWeb

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UDWAxQEWeb
  • Participant: EceUdel
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 8736e65ca662834a963569428b5d2b22
  • Run description: Axiomatic method and query expansion with Google snippet.

UMHOObm25B

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOObm25B
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: bc3de21d3c5c3bc85d6d60a8eb5b6a6d
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOObm25GS

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOObm25GS
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 4421f1a4b376a954f0606be44d2143f6
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOObm25IF

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOObm25IF
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: bd2bf8045d045c373c0431b8d5f52c88
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOOqlB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOOqlB
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: d56a2c95432239349502e2d0cb09c1f5
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOOqlGS

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOOqlGS
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 14b50c4de6adeca67e3da39e7630a2e1
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOOqlIF

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOOqlIF
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 15d460020f697486fac1a19c79de1f91
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOOsd

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOOsd
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 5530d1ca3c7a5364fcfc5c80737027e5
  • Run description: Ivory run (collaboration between UMD and Yahoo)

UMHOOsdp

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UMHOOsdp
  • Participant: UMD
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e3365f611c00b4e4c21ab1e609d58850
  • Run description: Ivory run (collaboration between UMD and Yahoo)

uogTrDPCQcdB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTrDPCQcdB
  • Participant: uogTr
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: 50c550cd2331b61d9a3caf2f3516d9dc
  • Run description: Community diversification framework combining subqueries from cluster-based diversification mechanism.

uogTrdphA

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTrdphA
  • Participant: uogTr
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e72af22ff40dc136ea873d311de68b4d
  • Run description: Parameter free DFR model with anchor text.

uogTrdphCEwP

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTrdphCEwP
  • Participant: uogTr
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: d404aae435c6b528815c03112093adcc
  • Run description: Parameter free DFR model with collection enrichment and some proximity.

uogTrdphP

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTrdphP
  • Participant: uogTr
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 7b79605029ea9d11bd7a1e6bc1e1c694
  • Run description: Parameter free DFR model with some proximity.

uogTrDYCcsB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTrDYCcsB
  • Participant: uogTr
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: b22e8ef0e3b13934d58721b920b26f44
  • Run description: Resource diversification framework combining subqueries from external SE.

uogTrDYScdA

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTrDYScdA
  • Participant: uogTr
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 463edba86f8911e96f049d4fc110b24e
  • Run description: Community diversification framework combining subqueries from external SE.

uvaaol

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uvaaol
  • Participant: UAms
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: d6dabe0fc42da8ade7d2d158db59bbbd
  • Run description: generated multiple query variants using a query log.

uvaee

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uvaee
  • Participant: UAms
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: e4512552c68547ce0c3cdc195a08e0e8
  • Run description: Pseudo-relevance feedback on the Wikipedia part of ClueWeb.

uvamrf

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uvamrf
  • Participant: UAms
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: a928865901cd2465d7b8fddf1390b55d
  • Run description: Markov random field, with Wikipedia non-article pages removed.

uvamrftop

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uvamrftop
  • Participant: UAms
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 350ae84eb062d6cf78ff90c05cab1d48
  • Run description: Markov random field, with Wikipedia non-article pages removed and Wikipedia pages moved to the top of the rankings.

uwgym

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwgym
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 6fa401d018480fd66eaa6b348dd31fb0
  • Run description: This is a run based on commercial search engines.

watd1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: watd1
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/16/2009
  • Type: automatic
  • Task: diversity
  • MD5: b3cf6bcbe1af003e5eb2ebb1883385f6
  • Run description: Discrimination based on the maximum likelihood term in each message.

watd3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: watd3
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/16/2009
  • Type: automatic
  • Task: diversity
  • MD5: 1f34b6739e41f969fad327bf9ae30906
  • Run description: Discrimination based on the 3 maximum likelihood terms in each message.

watd5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: watd5
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/16/2009
  • Type: automatic
  • Task: diversity
  • MD5: e1062061b7ff0d8722a33e32cc329436
  • Run description: Discrimination based on the 5 maximum likelihood terms in each message.

watprf

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: watprf
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 257c69a5993685502e78aaa3baaa36e5
  • Run description: Sequential logistic regression used to retrieve 10,000 docs/topic from Wikipedia and another 10,000 docs/topic from the English ClueWeb09. Pseudo-relevance feedback, based on a variable number of documents (by likelihood), used to select the best 1,000 from the 20,000 retrieved.

watrrfw

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: watrrfw
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/16/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 97ce48389aac8cede3e34022368f85e3
  • Run description: Used sequential logistic regression (no indexing). Reciprocal rank fusion of the Wikipedia-only run (watwp) and two relevance feedback runs (watrrf and WAT2.base [relevance feedback submission]).
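The reciprocal rank fusion step named in this description can be sketched in Python. This is a hedged illustration only: the fusion constant (k=60 below, from the commonly cited formulation) and the tie handling are assumptions, since the run description does not specify them.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: each document's fused score is the sum of
    1 / (k + rank) over every input ranking it appears in (ranks start at 1).
    Returns document IDs sorted by fused score, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: a document ranked near the top of several lists wins overall.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "a"], ["b", "a", "c"]])
```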

WatSdmrm3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: WatSdmrm3
  • Participant: UWaterlooMDS
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 1f42b8baa3f320f4841bb8c0fb96f47a
  • Run description: Dependence models with a relevance model (RM3) feedback component using only the collection.

WatSdmrm3we

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: WatSdmrm3we
  • Participant: UWaterlooMDS
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: f61e3bad2dca3c48cf6a065e182bd1c9
  • Run description: Dependence models with a relevance model (RM3) feedback component derived from the top results of Microsoft and Yahoo search.

WatSklfb

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: WatSklfb
  • Participant: UWaterlooMDS
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 58eb6ac6075d9e5499a514712a17ae7c
  • Run description: We first obtain the 25 highest-scoring pointwise KL-divergence stems from the 100 most frequent stems in the top 25 documents retrieved by a query-likelihood ranking of the documents containing all of the query terms. We exclude the query stems and 418 stopwords from this set. Using this set, we perform a blind-feedback ranking of the documents that contain all query stems.
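The pointwise KL-divergence term selection described above can be sketched as follows. The specific scoring form p(w|F) * log(p(w|F) / p(w|C)) and the smoothing floor for unseen collection probabilities are assumptions; the run description does not give the exact formula.

```python
import math
from collections import Counter

def kl_feedback_terms(feedback_docs, coll_prob, exclude, top_k=25, pool=100):
    """Select expansion stems from feedback documents by pointwise KL
    divergence: among the `pool` most frequent stems in the feedback set,
    score each stem w by p(w|F) * log(p(w|F) / p(w|C)) and keep the top_k,
    skipping stems in `exclude` (query stems, stopwords)."""
    counts = Counter(w for doc in feedback_docs for w in doc)
    total = sum(counts.values())
    candidates = [w for w, _ in counts.most_common(pool) if w not in exclude]

    def score(w):
        p_f = counts[w] / total
        # 1e-9 floor is an assumed guard against unseen collection terms.
        return p_f * math.log(p_f / coll_prob.get(w, 1e-9))

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

Common stems like "the" are frequent in the feedback documents but also frequent in the collection, so their KL score stays low; topical stems that are rare collection-wide rise to the top.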

WatSklfu

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: WatSklfu
  • Participant: UWaterlooMDS
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 161e8336f90579fbb513345403e571f7
  • Run description: We first obtain the 25 highest-scoring pointwise KL-divergence stems from the 100 most frequent stems in the top 25 documents retrieved by a query-likelihood ranking of the documents containing all of the query terms. We exclude the query stems and 418 stopwords from this set. Using this set, we generate 25 additional retrievals, then join these lists using a variant of reciprocal rank fusion.

WatSklq

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: WatSklq
  • Participant: UWaterlooMDS
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: 8655e0d3cdab3869a36bbf1f65145964
  • Run description: We first obtain the 25 highest-scoring pointwise KL-divergence stems from the 100 most frequent stems in the top 25 documents retrieved by a query-likelihood ranking of the documents containing all of the query terms. We exclude the query stems and 418 stopwords from this set. Using this set, we generate 25 additional retrievals. We then select among these lists using a priority-queue-like mechanism.

WatSql

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: WatSql
  • Participant: UWaterlooMDS
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: adhoc
  • MD5: cb1237e12828b95629249c334333d322
  • Run description: Simple query likelihood retrieval given the query with Dirichlet prior smoothing.
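Query likelihood with Dirichlet prior smoothing, as named in this description, scores a document by the smoothed log-probability of generating the query. A minimal sketch, assuming the standard formulation; the smoothing parameter mu=2500 is an assumed conventional default, not a value taken from the run:

```python
import math

def ql_dirichlet_score(query_terms, doc_tf, doc_len, coll_prob, mu=2500):
    """Log query likelihood under Dirichlet prior smoothing:
    p(w|d) = (tf(w,d) + mu * p(w|C)) / (|d| + mu),
    summed in log space over the query terms."""
    score = 0.0
    for w in query_terms:
        # 1e-9 floor is an assumed guard for terms unseen in the collection.
        p = (doc_tf.get(w, 0) + mu * coll_prob.get(w, 1e-9)) / (doc_len + mu)
        score += math.log(p)
    return score

# A document containing the query term outscores one that does not.
s_match = ql_dirichlet_score(["obama"], {"obama": 3}, 100, {"obama": 0.001})
s_miss = ql_dirichlet_score(["obama"], {}, 100, {"obama": 0.001})
```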

watwp

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: watwp
  • Participant: Waterloo
  • Track: Web
  • Year: 2009
  • Submission: 8/16/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 27c7002b6a120a2b34a7c48c803119f5
  • Run description: Wikipedia only run. Used sequential logistic regression (no indexing).

wume1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: wume1
  • Participant: LU_WUME
  • Track: Web
  • Year: 2009
  • Submission: 8/19/2009
  • Type: automatic
  • Task: diversity
  • MD5: d42d414bcf2f8b495f303f91099c9f37
  • Run description: We obtain query expansions from Google Insights and then combine the search results from the different query expansions.

wume2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: wume2
  • Participant: LU_WUME
  • Track: Web
  • Year: 2009
  • Submission: 8/20/2009
  • Type: automatic
  • Task: diversity
  • MD5: c0c906e32891d576646a5fd9ba655386
  • Run description: A variation of BM25
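The description names only "a variation of BM25" without details, so the following is merely the textbook Okapi BM25 baseline such a variation would start from; k1=1.2 and b=0.75 are assumed conventional defaults.

```python
import math

def bm25_score(query_terms, doc_tf, doc_len, avg_len, df, n_docs,
               k1=1.2, b=0.75):
    """Okapi BM25: for each query term, an IDF weight times a saturated,
    length-normalized term-frequency component."""
    score = 0.0
    for w in query_terms:
        tf = doc_tf.get(w, 0)
        if tf == 0 or w not in df:
            continue
        idf = math.log((n_docs - df[w] + 0.5) / (df[w] + 0.5) + 1.0)
        norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))
        score += idf * norm
    return score
```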

yhooumd09BFM

Results | Participants | Input | Summary | Appendix

  • Run ID: yhooumd09BFM
  • Participant: yahoo
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 9b65a1b586b8d6d340569debd8534dc6
  • Run description: BM25 run using Ivory (w/ independent fusion). Results are post-filtered using Yahoo!'s "industrial strength" adult and spam classifiers. Retrieval scores are also biased towards high-quality pages, using the output of a proprietary Y! classifier. A moderate amount of filtering and reranking is done.

yhooumd09BGC

Results | Participants | Input | Summary | Appendix

  • Run ID: yhooumd09BGC
  • Participant: yahoo
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: 3f53b4924905c5a7cc23163582638320
  • Run description: BM25 run using Ivory (w/ global statistics). Results are post-filtered using Yahoo!'s "industrial strength" adult and spam classifiers. Retrieval scores are also biased towards high-quality pages, using the output of a proprietary Y! classifier. A conservative amount of filtering and reranking is done.

yhooumd09BGM

Results | Participants | Input | Summary | Appendix

  • Run ID: yhooumd09BGM
  • Participant: yahoo
  • Track: Web
  • Year: 2009
  • Submission: 8/17/2009
  • Type: automatic
  • Task: adhoc
  • MD5: a24963a0d373d8e1f29ea3b5d458a65f
  • Run description: BM25 run using Ivory (w/ global statistics). Results are post-filtered using Yahoo!'s "industrial strength" adult and spam classifiers. Retrieval scores are also biased towards high-quality pages, using the output of a proprietary Y! classifier. A moderate amount of filtering and reranking is done.