Runs - Terabyte 2006
AMRIMtp20006
- Run ID: AMRIMtp20006
- Participant: ecole-des-mines.beigbeder
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run using the title field. We rank with our fuzzy term proximity method first, followed by Zettair BM25. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary. Beyond re-ranking, our method retrieves at least 80 results in addition to the Zettair BM25 list.
AMRIMtp5006
- Run ID: AMRIMtp5006
- Participant: ecole-des-mines.beigbeder
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run using the title field. We rank with our fuzzy term proximity method first, followed by Zettair BM25. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary.
AMRIMtpm5006
- Run ID: AMRIMtpm5006
- Participant: ecole-des-mines.beigbeder
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Manual run using words from all topic fields. This run is well suited to our method because we construct boolean queries that can be analysed by fuzzy proximity. Our boolean queries take words from all topic fields. We rank with our fuzzy term proximity method first, followed by Zettair BM25. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary.
arscDomAlog
- Run ID: arscDomAlog
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run; each web domain in GOV2 was indexed and searched individually, and the ranked results from each domain were fitted and normalized to a logistic curve, then merged.
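A minimal sketch of this kind of per-domain normalization and merging, assuming rank-based inputs and a fixed logistic shape (the actual curve fitting used by the participants is not specified here):

```python
# Hypothetical sketch: map each domain's ranked list onto a logistic
# curve in [0, 1], then merge all domains by the normalized score.
import math

def logistic(x, midpoint=0.5, slope=10.0):
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))

def normalize_domain(results):
    """results: list of (docid, raw_score), best first."""
    n = len(results)
    return [(doc, logistic(1.0 - rank / max(n - 1, 1)))
            for rank, (doc, _raw) in enumerate(results)]

def merge_domains(domains):
    merged = [hit for d in domains for hit in normalize_domain(d)]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged
```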
arscDomAsrt
- Run ID: arscDomAsrt
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run; each web domain in GOV2 was indexed and searched individually, and the ranked results from each domain were sorted via GNU sort on the relevance score.
arscDomManL
- Run ID: arscDomManL
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: manual
- Task: adhoc
- Run description: Manual processing included reading the topics and descriptions, thinking a bit, and constructing a query with boolean AND, OR, and NOT. As with the automatic run, each web domain is searched independently of the others and the results are merged after fitting to a normalized logistic curve.
arscDomManS
- Run ID: arscDomManS
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: manual
- Task: adhoc
- Run description: Manual processing included reading the topics and descriptions, thinking a bit, and constructing a query with boolean AND, OR, and NOT. As with the automatic run, each web domain is searched independently of the others and the results are merged using GNU sort on the relevance score.
CoveoNPRun1
- Run ID: CoveoNPRun1
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query.
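A hedged sketch of such a scheme; the feature names, weights, and search callables below are placeholders, not Coveo's actual criteria:

```python
# Placeholder sketch: weighted combination of criteria, with an AND query
# topped up by OR results when fewer than 1000 documents match.
def combined_score(features, weights):
    """features/weights: dicts keyed by criterion name (~20 in the real run)."""
    return sum(weights[name] * value for name, value in features.items())

def run_query(search_and, search_or, extract_features, weights, k=1000):
    """search_and/search_or: callables returning candidate doc ids."""
    docs = list(search_and())
    if len(docs) < k:                 # top up with OR results
        seen = set(docs)
        docs += [d for d in search_or() if d not in seen]
    docs.sort(key=lambda d: combined_score(extract_features(d), weights),
              reverse=True)
    return docs[:k]
```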
CoveoNPRun2
- Run ID: CoveoNPRun2
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the previous one, but the weights for each criterion have been slightly changed.
CoveoNPRun3
- Run ID: CoveoNPRun3
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the previous one, but acronyms built from the query have been added.
CoveoNPRun4
- Run ID: CoveoNPRun4
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the first one, but the weights of query words found in the title and in keyphrases extracted by a summarization tool are boosted.
CoveoRun1
- Run ID: CoveoRun1
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. We return at most 1000 results, since this is the maximum number our system can return in its current configuration.
CWI06COMP1
- Run ID: CWI06COMP1
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 9/5/2006
- Task: comp_eff
CWI06DISK1
- Run ID: CWI06DISK1
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Entire collection indexed on a single machine. A decent (10-disk) RAID is used to observe system performance with I/O-based processing (only 3GB of buffer memory). A single query stream is used to analyze sequential performance. The index is compressed. Scores are precomputed and stored in compressed, quantized form. The top-20 is retrieved using a two-pass strategy. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06DISK1ah
- Run ID: CWI06DISK1ah
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/26/2006
- Type: automatic
- Task: adhoc
- Run description: Entire collection indexed on a single machine. A decent (10-disk) RAID is used to observe system performance with I/O-based processing (only 3GB of buffer memory). A single query stream is used to analyze sequential performance. The index is compressed. Scores are precomputed and stored in compressed, quantized form. The top-10000 is retrieved using a two-pass strategy. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06DIST8
- Run ID: CWI06DIST8
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/20/2006
- Task: efficiency
- Run description: This is a distributed run using a centralized broker that runs 4 query streams in parallel against 8 workstations, each of which indexes one eighth of the full document collection. Each workstation has 2GB of RAM and an Athlon64 X2 3800+ dual-core CPU. Each workstation runs our MonetDB/X100 system, a research relational DBMS designed for high performance on data- and query-intensive workloads. This means that all the data structures we use are stored in relational tables. On each node, the index (an inverted file stored in a relational table) is cached in RAM (using compression). Term occurrences are ordered on docid to allow for merge-joins and to optimize for our PFOR-DELTA compression scheme. Queries are executed using a two-pass strategy: in the first pass, we try to retrieve the per-node top-20 using the boolean conjunction of the query terms in combination with Okapi BM25 ranking. If this fails to return 20 results over all 8 nodes, we execute a second pass, in which we use a boolean-disjunctive variant of the same query, again ranking with Okapi BM25. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
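The two-pass conjunctive/disjunctive strategy can be sketched as follows (the per-node search interface is an assumption, not the MonetDB/X100 API):

```python
# Sketch: conjunctive BM25 on every node first; if too few hits come back
# overall, rerun with the disjunctive variant of the same query.
def two_pass_search(node_searchers, terms, k=20):
    """node_searchers: callables f(terms, conjunctive) -> [(doc, score)]."""
    hits = [h for search in node_searchers for h in search(terms, True)]
    if len(hits) < k:
        hits = [h for search in node_searchers for h in search(terms, False)]
    return sorted(hits, key=lambda h: h[1], reverse=True)[:k]
```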
CWI06DIST8ah
- Run ID: CWI06DIST8ah
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/26/2006
- Type: automatic
- Task: adhoc
- Run description: This is a distributed run using a centralized broker that runs 4 query streams in parallel against 8 workstations, each of which indexes one eighth of the full document collection. Each workstation has 2GB of RAM and an Athlon64 X2 3800+ dual-core CPU. Each workstation runs our MonetDB/X100 system, a research relational DBMS designed for high performance on data- and query-intensive workloads. This means that all the data structures we use are stored in relational tables. On each node, the index (an inverted file stored in a relational table) is cached in RAM (using compression). Term occurrences are ordered on docid to allow for merge-joins and to optimize for our PFOR-DELTA compression scheme. Queries are executed using a two-pass strategy: in the first pass, we try to retrieve the per-node top-10000 using the boolean conjunction of the query terms in combination with Okapi BM25 ranking. If this fails to return 10000 results over all 8 nodes, we execute a second pass, in which we use a boolean-disjunctive variant of the same query, again ranking with Okapi BM25. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06MEM1
- Run ID: CWI06MEM1
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Single-node, single-query-stream, sequential run using a RAM-resident index. Using 16GB of RAM allows the entire index to be kept in memory, avoiding I/O completely. Per-document term scores are quantized and compressed. Although the machine has 4 CPUs, only one was used to process the single query stream sequentially. Note that a different machine was used for indexing (the one from the CWI06DISK1 run). Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06MEM4
- Run ID: CWI06MEM4
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/20/2006
- Task: efficiency
- Run description: Entire collection indexed in RAM on a single machine. Note that a different machine was used for indexing (the one from the CWI06DISK run). Using 16GB of RAM allows the entire index to be kept in memory, avoiding I/O completely. Per-document term scores are quantized and compressed. 4 CPUs are used to measure the benefit of 4 query streams. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
DCU05BASE
- Run ID: DCU05BASE
- Participant: dublincityu.gurrin
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run on a sorted index
hedge0
- Run ID: hedge0
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Hedge metasearch (without feedback) over eight standard lemur retrieval systems.
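For intuition, a loose sketch of Hedge-style metasearch as a multiplicative-weights scheme; the rank-discounted gain and loss functions below are simplifications, not the authors' exact formulation:

```python
# Loose sketch: weight each system by a multiplicative-update rule, then
# fuse ranked lists by weighted rank-discounted gains.
import math

def hedge_fuse(ranked_lists, judged=None, beta=0.9):
    weights = [1.0] * len(ranked_lists)
    if judged:  # feedback variants: penalize systems that rank judged
        for i, lst in enumerate(ranked_lists):  # non-relevant docs highly
            loss = sum(1.0 / math.log2(rank + 2)
                       for rank, doc in enumerate(lst) if judged.get(doc) == 0)
            weights[i] = beta ** loss
    scores = {}
    for w, lst in zip(weights, ranked_lists):
        for rank, doc in enumerate(lst):
            scores[doc] = scores.get(doc, 0.0) + w / math.log2(rank + 2)
    return sorted(scores, key=scores.get, reverse=True)
```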
hedge10
- Run ID: hedge10
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (10 documents feedback) over eight standard lemur retrieval systems.
hedge30
- Run ID: hedge30
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (30 documents feedback) over eight standard lemur retrieval systems.
hedge5
- Run ID: hedge5
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (5 documents feedback) over eight standard lemur retrieval systems.
hedge50
- Run ID: hedge50
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (50 documents feedback) over eight standard lemur retrieval systems.
humT06l
- Run ID: humT06l
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Plain content search, boolean-OR of query terms, English inflections, normal tf and idf dampening, document length normalization.
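Hummingbird's exact weighting formula is not given in these descriptions; as a reference point, the standard BM25 form combines the same ingredients (dampened tf, idf, and document-length normalization):

$$
\mathrm{score}(D, Q) \;=\; \sum_{t \in Q} \mathrm{idf}(t) \cdot \frac{tf_{t,D}\,(k_1 + 1)}{tf_{t,D} + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
$$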
humT06xl
- Run ID: humT06xl
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Same as humT06l except with an extra 20% weight on proximity matches of query terms.
humT06xlc
- Run ID: humT06xlc
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Same as humT06xl except that a duplicate filtering heuristic was applied and only 1000 rows per topic were returned.
humT06xle
- Run ID: humT06xle
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Blind feedback using top-2 rows of humT06xl
humT06xlz
- Run ID: humT06xlz
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: One percent subset of first 9000 rows of humT06xl (rows 1, 101, 201, 301, ..., 8901) plus last 1000 rows of humT06xl (rows 9001-10000).
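This row selection is fully specified and amounts to the following (1-based ranks):

```python
# Reproduces the stated selection: ranks 1, 101, ..., 8901 from the first
# 9000 rows, plus ranks 9001-10000.
def subset_rows(run):
    """run: list of 10000 result rows, best first (0-indexed)."""
    picked = [run[i] for i in range(0, 9000, 100)]  # ranks 1, 101, ..., 8901
    picked += run[9000:10000]                       # ranks 9001-10000
    return picked
```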
humTE06i3
- Run ID: humTE06i3
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/19/2006
- Task: efficiency
- Run description: Boolean-AND of query words, normal tf and idf dampening; no document length normalization, no stemming.
humTE06v2
- Run ID: humTE06v2
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/19/2006
- Task: efficiency
- Run description: Differs from humTE06i3 in that Boolean-OR is used, an extra 20% weight is given to Title matches, terms occurring in more than 10% of rows are discarded, and document length normalization is enabled.
humTN06dpl
- Run ID: humTN06dpl
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/9/2006
- Task: namedpage
- Run description: Content weight 10, Title weight 2, Phrase-in-Title weight 1, Url-depth weight 5, English inflections, normal tf and idf dampening, document length normalization.
humTN06dplc
- Run ID: humTN06dplc
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/9/2006
- Task: namedpage
- Run description: Same as humTN06dpl except that a duplicate filtering heuristic was applied.
humTN06l
- Run ID: humTN06l
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: Same as humTN06pl except that the special weights on the title are omitted.
humTN06pl
- Run ID: humTN06pl
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: Same as humTN06dpl except that url-depth weighting is omitted.
icttb0600
- Run ID: icttb0600
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0601
- Run ID: icttb0601
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0602
- Run ID: icttb0602
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0603
- Run ID: icttb0603
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0604
- Run ID: icttb0604
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered, and duplicated URLs were removed. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
indri06AdmD
- Run ID: indri06AdmD
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Dependence model run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking.
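Dependence-model runs of this kind typically follow the Metzler-Croft sequential dependence form; the mixture weights shown are the commonly cited defaults, not values confirmed for this run:

$$
\mathrm{score}(Q, D) = \lambda_T \sum_{q \in Q} f_T(q, D)
\;+\; \lambda_O \sum_{(q_i, q_{i+1})} f_O(q_i, q_{i+1}, D)
\;+\; \lambda_U \sum_{(q_i, q_{i+1})} f_U(q_i, q_{i+1}, D)
$$

where $f_T$, $f_O$, and $f_U$ are single-term, exact-phrase, and unordered-window features (here Dirichlet-smoothed), often with $(\lambda_T, \lambda_O, \lambda_U) = (0.85, 0.10, 0.05)$.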
indri06AlceB
- Run ID: indri06AlceB
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using BM25 features. Both term proximity and phrase matches are taken into account during ranking.
indri06AlceD
- Run ID: indri06AlceD
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking.
indri06Aql
- Run ID: indri06Aql
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Query likelihood (bag of words) baseline run.
indri06AtdnD
- Run ID: indri06AtdnD
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking. In addition, the title, description, and narrative portions of the topic are each weighted differently.
indri06Nfi
- Run ID: indri06Nfi
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach computes a unigram document model by mixing language models formed from the body, title, anchor text, and heading fields. No priors are used. Ranking is done using query likelihood.
indri06Nfip
- Run ID: indri06Nfip
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach computes a unigram document model by mixing language models formed from the body, title, anchor text, and heading fields. PageRank and inlink priors were used. Ranking is done using query likelihood.
indri06Nsd
- Run ID: indri06Nsd
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach uses a dependence model formulation that takes into account features computed over the body, title, anchor text, and heading fields. No priors are used.
indri06Nsdp
- Run ID: indri06Nsdp
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach uses a dependence model formulation that takes into account features computed over the body, title, anchor text, and heading fields. PageRank and inlink features are also used.
JuruMan
- Run ID: JuruMan
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Manual run - exploiting the full system's query syntax
JuruT
- Run ID: JuruT
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Basic run based on title only. For short queries (fewer than 4 terms) we expand the query with a phrase of the query text.
JuruTD
- Run ID: JuruTD
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Basic run based on title + description.
JuruTWE
- Run ID: JuruTWE
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run based on title + expansion from an external source. We run each topic's title through "answers.com", and the result page is used to expand the query. Expansion terms extracted from the Web result are lexical affinities of the original query terms (topic title).
mg4jAdhocBBV
- Run ID: mg4jAdhocBBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the former with doubled importance). Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAdhocBV
- Run ID: mg4jAdhocBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined. Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAdhocBVV
- Run ID: mg4jAdhocBVV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the latter with doubled importance). Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAdhocV
- Run ID: mg4jAdhocV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using minimal-interval scoring. Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAutoBBV
- Run ID: mg4jAutoBBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the former has doubled importance). Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
mg4jAutoBV
- Run ID: mg4jAutoBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined. Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
mg4jAutoBVV
- Run ID: mg4jAutoBVV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the latter has doubled importance). Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
mg4jAutoV
- Run ID: mg4jAutoV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using minimal-interval scoring. Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
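The "and then" composition is mechanical enough to sketch directly (the query syntax and search callable are assumptions):

```python
# Sketch: run each successively weaker boolean query, keeping only
# documents not returned by an earlier query in the cascade.
def and_then(search, queries):
    """search: callable(query) -> ranked doc ids; queries: strongest first,
    e.g. ['A & B & C', '(A & B) | (A & C) | (B & C)', 'A | B | C']."""
    seen, out = set(), []
    for q in queries:
        for doc in search(q):
            if doc not in seen:
                seen.add(doc)
                out.append(doc)
    return out
```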
mpiiotopk
- Run ID: mpiiotopk
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching.
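For intuition only, here is plain Fagin-style TA with eager random accesses; the IO-Top-K algorithm referenced above improves on this by postponing and scheduling those random accesses, which this sketch does not capture:

```python
# Baseline threshold algorithm: scan score-sorted lists round-robin and
# stop once no unseen document can beat the current k-th best score.
import heapq

def threshold_algorithm(sorted_lists, full_score, k):
    """sorted_lists: per-term postings [(doc, score), ...], score-descending.
    full_score(doc): random-access lookup of a document's complete score."""
    seen, topk = set(), []  # topk: min-heap of (score, doc)
    depth = 0
    while True:
        tau = 0.0  # best total score any still-unseen document could reach
        for lst in sorted_lists:
            if depth >= len(lst):
                continue
            doc, s = lst[depth]
            tau += s
            if doc not in seen:
                seen.add(doc)
                heapq.heappush(topk, (full_score(doc), doc))
                if len(topk) > k:
                    heapq.heappop(topk)
        depth += 1
        exhausted = all(depth >= len(lst) for lst in sorted_lists)
        if exhausted or (len(topk) == k and topk[0][0] >= tau):
            return sorted(topk, reverse=True)
```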
mpiiotopk2
- Run ID: mpiiotopk2
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching. Pruning: scans each list at most up to a depth of 1/5 of the list length.
mpiiotopk2p
- Run ID: mpiiotopk2p
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching. Pruning: scans each list at most up to a depth of 1/5 of the list length.
mpiiotopkpar
- Run ID: mpiiotopkpar
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching.
mpiircomb
- Run ID: mpiircomb
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Queries were automatically generated from the title and description fields (allowing duplicate words), with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.
mpiirdesc
- Run ID: mpiirdesc
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Queries were automatically generated from the description field, with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.
mpiirmanual
- Run ID: mpiirmanual
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Queries were generated manually from the title, description and narrative fields and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". After query generation, documents are retrieved automatically. Scoring function: BM25. Stemming: no. Pruning: no.
mpiirtitle
- Run ID: mpiirtitle
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Queries were automatically generated from the title field, with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.
MU06TBa1
- Run ID: MU06TBa1
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Impact-based retrieval. Proximity is used to break ties. Manual queries.
MU06TBa2
- Run ID: MU06TBa2
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Impact-based retrieval. Baseline.
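A minimal sketch of impact-based evaluation under assumed structures, where postings carry precomputed small-integer impacts rather than raw term frequencies:

```python
# Sketch: with impacts precomputed at indexing time, query evaluation
# reduces to summing impacts per document.
from collections import defaultdict

def impact_search(index, query_terms, k=1000):
    """index: dict term -> list of (docid, impact), impact a small integer."""
    acc = defaultdict(int)
    for term in query_terms:
        for docid, impact in index.get(term, []):
            acc[docid] += impact
    return sorted(acc.items(), key=lambda t: t[1], reverse=True)[:k]
```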
MU06TBa5
- Run ID: MU06TBa5
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Impact-based retrieval. Proximity is used to break ties.
MU06TBa6
- Run ID: MU06TBa6
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Impact-based retrieval. Proximity is used to break ties. Manual queries.
MU06TBn2
- Run ID: MU06TBn2
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Plain impact retrieval over content and incoming anchor text.
MU06TBn5
- Run ID: MU06TBn5
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Plain BM25 over content and incoming anchor text.
MU06TBn6
- Run ID: MU06TBn6
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Modified impact retrieval over content and incoming anchor text.
MU06TBn9
- Run ID: MU06TBn9
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Modified impact retrieval over content and incoming anchor text.
MU06TBy1
- Run ID: MU06TBy1
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/20/2006
- Task: efficiency
- Run description: Baseline, full-processing with impact-sorted index.
MU06TBy2
- Run ID: MU06TBy2
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Impact retrieval + smoothing
MU06TBy5
- Run ID: MU06TBy5
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Impact retrieval + dynamic pruning
MU06TBy6
- Run ID: MU06TBy6
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Impact retrieval + dynamic pruning + 4 streams
p6tbadt
- Run ID: p6tbadt
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Results based on BM25 are re-ranked using a decision tree trained on the previous two years' judgements.
p6tbaxl
- Run ID: p6tbaxl
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: PageRank, BM25, anchor text.
p6tbeb
- Run ID: p6tbeb
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: baseline run with 1GB cache.
p6tbedt
- Run ID: p6tbedt
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Variation of BM25 with Decision tree.
p6tbep8
- Run ID: p6tbep8
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Unreliable pruning used.
rmit06cmpind
- Run ID: rmit06cmpind
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 8/7/2006
- Task: comp_eff
rmit06cmpwum
- Run ID: rmit06cmpwum
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 8/7/2006
- Task: comp_eff
rmit06cmpzet
- Run ID: rmit06cmpzet
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 8/7/2006
- Task: comp_eff
rmit06effic
- Run ID: rmit06effic
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Standard run (single stream). Index built with no term offsets. Query evaluation with stopping, light stemming and Dirichlet language modelling for ranking.
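For reference, Dirichlet-smoothed language modelling ranks documents by query likelihood under the smoothed document model (the smoothing parameter $\mu$ used in this run is not stated):

$$
p(w \mid D) = \frac{tf_{w,D} + \mu\, p(w \mid C)}{|D| + \mu},
\qquad
\mathrm{score}(Q, D) = \sum_{w \in Q} \log p(w \mid D)
$$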
sabtb06aa1
- Run ID: sabtb06aa1
- Participant: sabir.buckley
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: very simple vector run, all fields of topic
sabtb06at1
- Run ID: sabtb06at1
- Participant: sabir.buckley
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: very simple vector title run
sabtb06man1
- Run ID: sabtb06man1
- Participant: sabir.buckley
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: 5 minutes of manual effort per topic, possibly editing the topic text, but mostly judgements. Typically 5 iterations of retrieval with Rocchio feedback (concurrent with more judgements). Queries were expanded by 30 terms for the judging runs and by 100 terms for the final 10000-document run.
THUADALL
- Run ID: THUADALL
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Result combination of BM25 (AND, OR) and language model (Dirichlet prior method) ranking. All ranking methods are performed on the whole collection together with anchor text.
THUADAO
- Run ID: THUADAO
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 over the whole collection with anchor text; result combination (AND, OR).
THUADLMAO
- Run ID: THUADLMAO
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Result combination of BM25 (AND, OR) and language model (Dirichlet prior method) ranking. All ranking methods are performed on the whole collection together with anchor text.
THUADLMO
- Run ID: THUADLMO
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Result combination of BM25 and language model (Dirichlet prior method) ranking. Both ranking methods are performed on the whole collection together with anchor text.
THUADOR
- Run ID: THUADOR
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 ranking over the whole collection with in-link anchor text.
THUNPABS
- Run ID: THUNPABS
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. Several fields are extracted from the whole collection according to HTML structure (such as title and bold text), and query terms appearing in these fields are given a higher weight during the ranking process.
THUNPCOMB
- Run ID: THUNPCOMB
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Result combination of 3 runs: [1] THUNPABS; [2] a language model instead of BM25 ranking over the same data collection as THUNPABS; [3] result filtering based on THUNPABS, in which only results containing all query terms are retained. Results are ranked based on their different RSVs in these 3 runs.
THUNPNOSTOP
- Run ID: THUNPNOSTOP
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking (using the Tminer 3.0 system) over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. No stopwords are filtered in the indexing process.
THUNPTA3
- Run ID: THUNPTA3
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. Repeated anchor text is removed from the corpus.
THUNPWP18
- Run ID: THUNPWP18
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight.
THUTeraEff01
- Run ID: THUTeraEff01
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 6/22/2006
- Task: efficiency
- Run description: BM25 ranking over both the anchor text and the content of the .GOV2 corpus.
THUTeraEff02
- Run ID: THUTeraEff02
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 6/22/2006
- Task: efficiency
- Run description: BM25 ranking over extracted abstracts of the .GOV2 corpus. Abstracts are extracted according to the document structure of web pages.
THUTeraEff03
- Run ID: THUTeraEff03
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 6/22/2006
- Task: efficiency
- Run description: Result combination according to the reciprocal rank of THUTeraEff01 and THUTeraEff02.
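One common way to combine two runs by reciprocal rank is sketched below; the exact formulation and tie-breaking used by the participants are not stated:

```python
# Sketch: each run contributes 1/rank per document; sum and re-sort.
def rr_combine(run_a, run_b, k=1000):
    """run_a, run_b: lists of docids, best first."""
    scores = {}
    for run in (run_a, run_b):
        for rank, doc in enumerate(run, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / rank
    return sorted(scores, key=scores.get, reverse=True)[:k]
```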
TWTB06AD01
- Run ID: TWTB06AD01
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: automatic
- Task: adhoc
- Run description: This is an automatic run which combined the dependence model with pseudo-relevance feedback.
TWTB06AD02
- Run ID: TWTB06AD02
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: manual
- Task: adhoc
- Run description: This is a manual run which made use of pseudo-relevance feedback.
TWTB06AD03
- Run ID: TWTB06AD03
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: manual
- Task: adhoc
- Run description: This is a manual run which is just a simple query likelihood run.
TWTB06AD04
- Run ID: TWTB06AD04
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: automatic
- Task: adhoc
- Run description: This is a title-only run which made use of dependence modeling.
TWTB06AD05
- Run ID: TWTB06AD05
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: automatic
- Task: adhoc
- Run description: This run is a simple title-only query likelihood run.
TWTB06NP01
- Run ID: TWTB06NP01
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 7/19/2006
- Task: namedpage
- Run description: This is a run which uses document structure techniques.
TWTB06NP02
- Run ID: TWTB06NP02
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 7/19/2006
- Task: namedpage
- Run description: This is a run which uses the title field of documents.
TWTB06NP03
- Run ID: TWTB06NP03
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 7/19/2006
- Task: namedpage
- Run description: This is a run which uses pagerank prior and the title field of documents.
UAmsT06a3SUM
- Run ID: UAmsT06a3SUM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Combination of (1) a full-text index run with weight 80%, (2) an extracted-titles index run with weight 10%, and (3) an extracted anchor-texts index run with weight 80%. All runs use a stemmed index and a language model with little smoothing (lambda = 0.9), no feedback.
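A sketch of this kind of weighted run combination, assuming each component run supplies comparable document scores (any score normalization is omitted here):

```python
# Sketch: linear combination of per-run document scores.
def weighted_sum(runs_with_weights, k=1000):
    """runs_with_weights: list of (weight, {docid: score})."""
    combined = {}
    for w, run in runs_with_weights:
        for doc, s in run.items():
            combined[doc] = combined.get(doc, 0.0) + w * s
    return sorted(combined.items(), key=lambda t: t[1], reverse=True)[:k]
```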
UAmsT06aAnLM
- Run ID: UAmsT06aAnLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Stemmed anchor-text index, using a language model with little smoothing (lambda = 0.9), no feedback.
UAmsT06aTDN
- Run ID: UAmsT06aTDN
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Stemmed full-text index, using a language model with little smoothing (lambda = 0.9), no feedback. Selected the 10 most characteristic terms from the TDN fields.
UAmsT06aTeLM
- Run ID: UAmsT06aTeLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Stemmed full-text index, using a language model with little smoothing (lambda = 0.9), no feedback.
UAmsT06aTTDN
- Run ID: UAmsT06aTTDN
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Combination of (1) a title-only run and (2) a TDN run using the 10 most characteristic terms from the TDN fields. Both use a stemmed full-text index and a language model with little smoothing, no feedback.
UAmsT06n3SUM
- Run ID: UAmsT06n3SUM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: CombSUM combination of full-text (0.8), titles (0.1) and anchors (0.1); all runs use a language model (lambda = 0.9) and a standard length prior.
UAmsT06nAnLM
- Run ID: UAmsT06nAnLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: Extracted anchor texts index (stemmed), using language model (lambda = .9, standard length prior).
UAmsT06nTeLM
- Run ID: UAmsT06nTeLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: Full text index (stemmed), using language model (lambda = .9, standard length prior).
UAmsT06nTurl
- Run ID: UAmsT06nTurl
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: Full text index (stemmed), using language model (lambda = .9, standard length prior) and a URL-length prior.
uogTB06M
- Run ID: uogTB06M
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Divergence From Randomness weighting model with document structure
uogTB06MP
- Run ID: uogTB06MP
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Divergence From Randomness weighting model with document structure and precision enhancement model
uogTB06MPIA
- Run ID: uogTB06MPIA
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Divergence From Randomness weighting model with document structure, precision enhancement model, and query-independent evidence
uogTB06QET1
- Run ID: uogTB06QET1
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06QET2
- Run ID: uogTB06QET2
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06S50L
- Run ID: uogTB06S50L
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework and query reformulation with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06SS10L
- Run ID: uogTB06SS10L
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework, query reformulation and a new indexing procedure with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06SSQL
- Run ID: uogTB06SSQL
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework, query reformulation and a new indexing procedure with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uwmtFadDS
- Run ID: uwmtFadDS
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/27/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 + additional weight for terms in title fields etc.
uwmtFadTPFB
- Run ID: uwmtFadTPFB
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/27/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 + term proximity + pseudo-relevance feedback
uwmtFadTPRR
- Run ID: uwmtFadTPRR
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/24/2006
- Type: automatic
- Task: adhoc
- Run description: This is BM25 + term proximity for an initial result set. The top 10 documents from the initial result set are used to build a language model which is then employed to rerank all documents in the result set according to their divergence from this language model.
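A minimal sketch of this reranking idea, assuming a unigram feedback model with additive smoothing and KL divergence as the distance (both assumptions; the Waterloo implementation details are not given here). Documents whose language diverges least from the feedback model rise to the top.

```python
import math
from collections import Counter

def unigram_lm(texts, mu=0.01):
    """Unigram model over the feedback documents with additive smoothing
    (mu is an assumed constant)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    total, vocab_size = sum(counts.values()), len(counts)
    prob = lambda term: (counts[term] + mu) / (total + mu * vocab_size)
    return prob, list(counts)

def kl_divergence(prob, doc_text, vocab, mu=0.01):
    """KL(feedback || document) over the feedback vocabulary."""
    doc_counts = Counter(doc_text.lower().split())
    doc_total = sum(doc_counts.values())
    kl = 0.0
    for term in vocab:
        p = prob(term)
        q = (doc_counts[term] + mu) / (doc_total + mu * len(vocab))
        kl += p * math.log(p / q)
    return kl

def rerank(initial_results, doc_texts, fb_docs=10):
    """Rerank the whole result set by divergence from the top-10 model."""
    prob, vocab = unigram_lm([doc_texts[d] for d in initial_results[:fb_docs]])
    return sorted(initial_results,
                  key=lambda d: kl_divergence(prob, doc_texts[d], vocab))
```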
uwmtFcompI0¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI0
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompI1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompI2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompI3¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI3
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW3¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW3
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompZ0¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ0
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFcompZ1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFcompZ2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFcompZ3¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ3
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFdcp03¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFdcp03
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/15/2006
- Task: efficiency
- Run description: This run uses a pruned in-memory index containing the top 3% of terms from every document; vbyte is used for index compression.
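vbyte (variable-byte) coding is a standard postings-compression scheme: each integer is split into 7-bit chunks, one per byte, with one bit reserved to mark integer boundaries. A minimal sketch of one common layout (the exact layout used by this run is not documented here):

```python
def vbyte_encode(numbers):
    """Low 7 bits of each byte carry payload; a set high bit marks the
    final byte of each integer (one common convention)."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)  # terminator byte
    return bytes(out)

def vbyte_decode(data):
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:  # final byte of this integer
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers

assert vbyte_decode(vbyte_encode([3, 130, 70000])) == [3, 130, 70000]
```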
uwmtFdcp06¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFdcp06
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/16/2006
- Task: efficiency
- Run description: This run uses a pruned in-memory index containing the top 6% of terms from every document; vbyte is used for index compression.
uwmtFdcp12¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFdcp12
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/15/2006
- Task: efficiency
- Run description: This run uses a pruned in-memory index containing the top 12% of terms from every document. Index compression is done using a length-limited Huffman code.
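Document-centric pruning of this kind keeps only a fixed fraction of each document's distinct terms and discards the remaining postings. A toy sketch, assuming raw term frequency as the ranking criterion (the criterion actually used by the uwmtFdcp runs may differ, e.g. an impact-based score):

```python
from collections import Counter

def prune_document(text, keep_fraction=0.12):
    """Keep only the top `keep_fraction` of a document's distinct terms,
    ranked here by raw term frequency; postings for the rest are dropped."""
    counts = Counter(text.lower().split())
    keep = max(1, int(len(counts) * keep_fraction))
    return dict(counts.most_common(keep))
```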
uwmtFmanual¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFmanual
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: manual
- Task: adhoc
- Run description: This run is a combination of a manual run, in which a bored graduate student tried to find relevant documents by hand, and a few automatic runs, merged together. It makes use of the full topic statements, TDN.
uwmtFnoprune¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnoprune
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/15/2006
- Task: efficiency
- Run description: Frequency index, compressed using vbyte. Frequencies of terms appearing in special parts of the document (title, headlines, etc.) are boosted.
uwmtFnpsRR1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnpsRR1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/26/2006
- Task: namedpage
- Run description: BM25 with weighted fields and local reranking based on links and anchor text as a second step.
uwmtFnpstr1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnpstr1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/11/2006
- Task: namedpage
- Run description: BM25 with extra weight for terms within special HTML tags.
uwmtFnpstr2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnpstr2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/11/2006
- Task: namedpage
- Run description: BM25 + extra weight according to document structure. Integrated duplicate elimination.
wumpus¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: wumpus
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 9/5/2006
- Task: comp_eff
zetabm¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetabm
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair probabilistic model (BM25) run
zetadir¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetadir
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair language model (Dirichlet smoothing) run
zetaman¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetaman
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Zettair manual run using title plus salient keywords from description and narrative
zetamerg¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetamerg
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair and Indri merged run (round-robin merge)
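Round-robin merging interleaves the two ranked lists, taking one document from each in turn. A minimal sketch (the duplicate handling and run depth are assumptions):

```python
def round_robin_merge(run_a, run_b, depth=1000):
    """Interleave two ranked lists of doc IDs, skipping duplicates."""
    merged, seen = [], set()
    for pair in zip(run_a, run_b):
        for doc_id in pair:
            if doc_id not in seen and len(merged) < depth:
                seen.add(doc_id)
                merged.append(doc_id)
    # one list may be longer than the other; drain the remainder
    for doc_id in run_a[len(run_b):] + run_b[len(run_a):]:
        if doc_id not in seen and len(merged) < depth:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged
```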
zetamerg2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetamerg2
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair and Indri merged run 2 (merged on score)
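Merging on score requires making scores from two different engines comparable. A sketch assuming min-max normalization within each run and taking the maximum when a document appears in both (neither choice is documented for this run):

```python
def score_merge(run_a, run_b, depth=1000):
    """Merge two ranked lists of (doc_id, score) pairs on score."""
    def normalize(run):
        scores = [s for _, s in run]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in run}
    combined = normalize(run_a)
    for doc_id, score in normalize(run_b).items():
        combined[doc_id] = max(combined.get(doc_id, 0.0), score)
    return sorted(combined, key=combined.get, reverse=True)[:depth]
```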
zetnpbm¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpbm
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Baseline Zettair run (using BM25)
zetnpfa¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpfa
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Mixed retrieval of full text and anchor text
zetnpft¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpft
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Mixed retrieval of full-text index and tags (TITLE and Hx)
zetnpfta¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpfta
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Mixed retrieval of full text, anchor text and tags (TITLE and Hx)