
Runs - Terabyte 2006

AMRIMtp20006

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: AMRIMtp20006
  • Participant: ecole-des-mines.beigbeder
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/1/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Automatic run using the title field. We rank with our fuzzy term proximity method first and with Zettair BM25 after it. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary. Beyond re-ranking, our method retrieves at least 80 documents in addition to the Zettair BM25 list.

AMRIMtp5006

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: AMRIMtp5006
  • Participant: ecole-des-mines.beigbeder
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Automatic run using the title field. We rank with our fuzzy term proximity method first and with Zettair BM25 after it. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary.

AMRIMtpm5006

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: AMRIMtpm5006
  • Participant: ecole-des-mines.beigbeder
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Manual run using words from all topic fields. This run is well suited to our method because we construct boolean queries that can be analysed by fuzzy proximity. Our boolean queries take words from all topic fields. We rank with our fuzzy term proximity method first and with Zettair BM25 after it. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary.

arscDomAlog

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: arscDomAlog
  • Participant: ualaska.fairbanks.newby
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Automatic run: each web domain in GOV2 was indexed and searched individually; the ranked results from each domain were fit and normalized to a logistic curve, then merged.
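
The description does not give the exact fitting procedure, so the following is only a minimal sketch: each domain's raw scores are mapped through a logistic function (the mean-centred midpoint and the slope parameter are assumptions) and the per-domain lists are then merged on the normalized score.

    import math

    def logistic_normalize(scores, slope=1.0):
        # Map raw per-domain scores onto a 0..1 logistic curve.
        # Centring on the mean and the slope value are assumptions; the run
        # description does not say how the curve was actually fit.
        midpoint = sum(scores) / len(scores)
        return [1.0 / (1.0 + math.exp(-slope * (s - midpoint))) for s in scores]

    def merge_domains(domain_results, depth=10000):
        # domain_results: {domain: [(docno, raw_score), ...]} per-domain ranked lists.
        merged = []
        for hits in domain_results.values():
            norm = logistic_normalize([s for _, s in hits])
            merged.extend((docno, ns) for (docno, _), ns in zip(hits, norm))
        merged.sort(key=lambda pair: pair[1], reverse=True)
        return merged[:depth]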

arscDomAsrt

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: arscDomAsrt
  • Participant: ualaska.fairbanks.newby
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Automatic run: each web domain in GOV2 was indexed and searched individually; the ranked results from each domain were sorted with GNU sort on the relevance score.

arscDomManL

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: arscDomManL
  • Participant: ualaska.fairbanks.newby
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: manual
  • Task: adhoc
  • Run description: Manual processing included reading the topics and descriptions, thinking a bit, and constructing a query with boolean AND, OR, and NOT. As with the automatic run, each web domain is searched independently of the others and the results are merged after fitting to a normalized logistic curve.

arscDomManS

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: arscDomManS
  • Participant: ualaska.fairbanks.newby
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: manual
  • Task: adhoc
  • Run description: Manual processing included reading the topics and descriptions, thinking a bit, and constructing a query with boolean AND, OR, and NOT. As with the automatic run, each web domain is searched independently of the others and the results are merged using GNU sort on the relevance score.

CoveoNPRun1

Results | Participants | Input | Summary | Appendix

  • Run ID: CoveoNPRun1
  • Participant: coveo.soucy
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained on data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query.
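
A minimal sketch of the AND-then-OR fallback described above, assuming a hypothetical search(query) call that returns a ranked list of document ids:

    def retrieve(terms, search, depth=1000):
        # First pass: conjunctive query over all terms.
        and_hits = search(" AND ".join(terms))
        if len(and_hits) >= depth:
            return and_hits[:depth]
        # Fewer than `depth` results: append OR-query results not already returned.
        seen = set(and_hits)
        or_hits = [d for d in search(" OR ".join(terms)) if d not in seen]
        return (and_hits + or_hits)[:depth]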

CoveoNPRun2

Results | Participants | Input | Summary | Appendix

  • Run ID: CoveoNPRun2
  • Participant: coveo.soucy
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained on data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the previous one, but the weights for each criterion have been slightly changed.

CoveoNPRun3

Results | Participants | Input | Summary | Appendix

  • Run ID: CoveoNPRun3
  • Participant: coveo.soucy
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained on data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the previous one, but acronyms built from the query have been added.

CoveoNPRun4

Results | Participants | Input | Summary | Appendix

  • Run ID: CoveoNPRun4
  • Participant: coveo.soucy
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained on data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the first one, but the weights for query words found in the title and in keyphrases extracted by a summarization tool are boosted.

CoveoRun1

Results | Participants | Input | Summary | Appendix

  • Run ID: CoveoRun1
  • Participant: coveo.soucy
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/29/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained on data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. We return a maximum of 1000 results, since this is the maximum number of results that our system can return with the current configuration.

CWI06COMP1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06COMP1
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 9/5/2006
  • Task: comp_eff

CWI06DISK1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06DISK1
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Entire collection indexed on a single machine. A decent (10-disk) RAID was used to see the system performance with I/O-based processing (only 3GB of buffer memory). Single query stream to analyze sequential performance. The index is compressed. Scores are precomputed and stored in compressed, quantized form. The top 20 is retrieved using a two-pass strategy. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it is supposed to represent, we copied the value from the 'Total wall-clock time' field.

CWI06DISK1ah

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06DISK1ah
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/26/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Entire collection indexed on a single machine. A decent (10-disk) RAID was used to see the system performance with I/O-based processing (only 3GB of buffer memory). Single query stream to analyze sequential performance. The index is compressed. Scores are precomputed and stored in compressed, quantized form. The top 10000 is retrieved using a two-pass strategy. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it is supposed to represent, we copied the value from the 'Total wall-clock time' field.

CWI06DIST8

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06DIST8
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/20/2006
  • Task: efficiency
  • Run description: This is a distributed run using a centralized broker that runs 4 query streams in parallel against 8 workstations, each of which indexes one eighth of the full document collection. Each workstation has 2GB of RAM and an Athlon64 X2 3800+ dual-core CPU. Each workstation runs our MonetDB/X100 system, a research relational DBMS designed for high performance on data- and query-intensive workloads. This means that all the data structures we use are stored in relational tables. On each node, the index (an inverted file stored in a relational table) is cached in RAM (using compression). Term occurrences are ordered on docid to allow merge-joins and to optimize for our PFOR-DELTA compression scheme. Queries are executed using a two-pass strategy: in the first pass, we try to retrieve the per-node top 20 using the boolean conjunction of the query terms in combination with Okapi BM25 ranking. If this fails to return 20 results over all 8 nodes, we execute a second pass, in which we use a boolean-disjunctive variant of the same query, again ranking with Okapi BM25. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it is supposed to represent, we copied the value from the 'Total wall-clock time' field.
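
A minimal sketch of the broker's two-pass strategy, assuming a hypothetical per-node callable node(terms, mode, k) that returns that shard's (docno, BM25 score) pairs:

    def broker_query(terms, nodes, k=20):
        # First pass: conjunctive (boolean AND) query with BM25 ranking on every node.
        hits = [h for node in nodes for h in node(terms, mode="AND", k=k)]
        if len(hits) < k:
            # Second pass: boolean-disjunctive variant of the same query.
            hits = [h for node in nodes for h in node(terms, mode="OR", k=k)]
        hits.sort(key=lambda pair: pair[1], reverse=True)
        return hits[:k]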

CWI06DIST8ah

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06DIST8ah
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/26/2006
  • Type: automatic
  • Task: adhoc
  • Run description: This is a distributed run using a centralized broker that runs 4 query streams in parallel against 8 workstations, each of which indexes one eighth of the full document collection. Each workstation has 2GB of RAM and an Athlon64 X2 3800+ dual-core CPU. Each workstation runs our MonetDB/X100 system, a research relational DBMS designed for high performance on data- and query-intensive workloads. This means that all the data structures we use are stored in relational tables. On each node, the index (an inverted file stored in a relational table) is cached in RAM (using compression). Term occurrences are ordered on docid to allow merge-joins and to optimize for our PFOR-DELTA compression scheme. Queries are executed using a two-pass strategy: in the first pass, we try to retrieve the per-node top 10000 using the boolean conjunction of the query terms in combination with Okapi BM25 ranking. If this fails to return 10000 results over all 8 nodes, we execute a second pass, in which we use a boolean-disjunctive variant of the same query, again ranking with Okapi BM25. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it is supposed to represent, we copied the value from the 'Total wall-clock time' field.

CWI06MEM1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06MEM1
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Single-node, single-query-stream, sequential run using a RAM-resident index. Using 16GB of RAM allows us to keep the entire index in memory and avoid I/O completely. Per-document term scores are quantized and compressed. Although the machine has 4 CPUs, only one of them was used to process the single query stream sequentially. Note that a different machine was used for indexing (the one from the CWI06DISK1 run). Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it is supposed to represent, we copied the value from the 'Total wall-clock time' field.

CWI06MEM4

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: CWI06MEM4
  • Participant: lowlands-team.deVries
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/20/2006
  • Task: efficiency
  • Run description: Entire collection indexed in RAM on a single machine. Note that a different machine was used for indexing (the one from the CWI06DISK run). Using 16GB of RAM allows us to keep the entire index in memory and avoid I/O completely. Per-document term scores are quantized and compressed. 4 CPUs are used to see the benefit of 4 query streams. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it is supposed to represent, we copied the value from the 'Total wall-clock time' field.

DCU05BASE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: DCU05BASE
  • Participant: dublincityu.gurrin
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Automatic run on a sorted index

hedge0

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: hedge0
  • Participant: northeasternu.aslam
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Hedge metasearch (without feedback) over eight standard Lemur retrieval systems.
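
The Hedge loss function used by these runs comes from the participants' own work and is not given here; the following is only an illustrative sketch of multiplicative-weight metasearch, with a rank-based fusion score and a made-up loss that penalises systems for ranking judged non-relevant documents highly.

    def hedge_fuse(system_rankings, weights):
        # Combine the ranked lists of the component systems using the current weights.
        scores = {}
        for ranking, w in zip(system_rankings, weights):
            for rank, docno in enumerate(ranking, start=1):
                scores[docno] = scores.get(docno, 0.0) + w / rank
        return sorted(scores, key=scores.get, reverse=True)

    def hedge_update(system_rankings, weights, judged, beta=0.9):
        # judged: {docno: True/False relevance}; beta is a hypothetical learning rate,
        # and this loss is an illustrative choice, not the run's actual loss.
        new_weights = []
        for ranking, w in zip(system_rankings, weights):
            loss = sum(1.0 / rank
                       for rank, docno in enumerate(ranking, start=1)
                       if docno in judged and not judged[docno])
            new_weights.append(w * beta ** loss)
        total = sum(new_weights)
        return [w / total for w in new_weights]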

hedge10

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: hedge10
  • Participant: northeasternu.aslam
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Hedge metasearch (feedback on 10 documents) over eight standard Lemur retrieval systems.

hedge30

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: hedge30
  • Participant: northeasternu.aslam
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Hedge metasearch (feedback on 30 documents) over eight standard Lemur retrieval systems.

hedge5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: hedge5
  • Participant: northeasternu.aslam
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Hedge metasearch (feedback on 5 documents) over eight standard Lemur retrieval systems.

hedge50

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: hedge50
  • Participant: northeasternu.aslam
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Hedge metasearch (feedback on 50 documents) over eight standard Lemur retrieval systems.

humT06l

Results | Participants | Input | Summary | Appendix

  • Run ID: humT06l
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/29/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Plain content search, boolean-OR of query terms, English inflections, normal tf and idf dampening, document length normalization.

humT06xl

Results | Participants | Input | Summary | Appendix

  • Run ID: humT06xl
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/29/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Same as humT06l except that an extra 20% weight is placed on proximity matches of query terms.

humT06xlc

Results | Participants | Input | Summary | Appendix

  • Run ID: humT06xlc
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Same as humT06xl except that a duplicate filtering heuristic was applied and only 1000 rows per topic were returned.

humT06xle

Results | Participants | Input | Summary | Appendix

  • Run ID: humT06xle
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/29/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Blind feedback using top-2 rows of humT06xl

humT06xlz

Results | Participants | Input | Summary | Appendix

  • Run ID: humT06xlz
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: automatic
  • Task: adhoc
  • Run description: One percent subset of first 9000 rows of humT06xl (rows 1, 101, 201, 301, ..., 8901) plus last 1000 rows of humT06xl (rows 9001-10000).
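
A minimal sketch of that row selection, assuming the full 10000-row humT06xl result list for a topic is held in a Python list:

    def xlz_subset(ranked_rows):
        # ranked_rows: 10000 humT06xl rows in rank order (index 0 = row 1).
        one_percent = ranked_rows[0:9000:100]   # rows 1, 101, 201, ..., 8901
        tail = ranked_rows[9000:10000]          # rows 9001-10000
        return one_percent + tail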

humTE06i3

Results | Participants | Input | Summary | Appendix

  • Run ID: humTE06i3
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/19/2006
  • Task: efficiency
  • Run description: Boolean-AND of query words, normal tf and idf dampening; no document length normalization, no stemming.

humTE06v2

Results | Participants | Input | Summary | Appendix

  • Run ID: humTE06v2
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/19/2006
  • Task: efficiency
  • Run description: Differs from humTE06i3 in that Boolean-OR is used, an extra 20% weight is placed on matching the Title, terms occurring in more than 10% of rows are discarded, and document length normalization is enabled.

humTN06dpl

Results | Participants | Input | Summary | Appendix

  • Run ID: humTN06dpl
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/9/2006
  • Task: namedpage
  • Run description: Content weight 10, Title weight 2, Phrase-in-Title weight 1, Url-depth weight 5, English inflections, normal tf and idf dampening, document length normalization.

humTN06dplc

Results | Participants | Input | Summary | Appendix

  • Run ID: humTN06dplc
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/9/2006
  • Task: namedpage
  • Run description: Same as humTN06dpl except that a duplicate filtering heuristic was applied.

humTN06l

Results | Participants | Input | Summary | Appendix

  • Run ID: humTN06l
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: Same as humTN06pl except special weights on title omitted.

humTN06pl

Results | Participants | Input | Summary | Appendix

  • Run ID: humTN06pl
  • Participant: hummingbird.tomlinson
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: Same as humTN06dpl except url-depth weighting omitted.

icttb0600

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: icttb0600
  • Participant: cas-ict.wang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This is an automatic run. We used a stop word list but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.

icttb0601

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: icttb0601
  • Participant: cas-ict.wang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This is an automatic run. We used a stop word list but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.

icttb0602

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: icttb0602
  • Participant: cas-ict.wang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This is an automatic run. We used a stop word list but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.

icttb0603

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: icttb0603
  • Participant: cas-ict.wang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This is an automatic run. We used a stop word list but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.

icttb0604

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: icttb0604
  • Participant: cas-ict.wang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This is an automatic run. We used a stop word list but no stemming. Title and URL information were considered, and duplicated URLs were removed. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.

indri06AdmD

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06AdmD
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/1/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Dependence model run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking.

indri06AlceB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06AlceB
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/1/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using BM25 features. Both term proximity and phrase matches are taken into account during ranking.

indri06AlceD

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06AlceD
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/1/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking.

indri06Aql

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06Aql
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/1/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Query likelihood (bag of words) baseline run.

indri06AtdnD

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06AtdnD
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/1/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking. In addition, the title, description, and narrative portions of the topic are each weighted differently.

indri06Nfi

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06Nfi
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This approach computes a unigram document model by mixing language models formed from the body, title, anchor text, and heading fields. No priors are used. Ranking is done using query likelihood.
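
A minimal sketch of query-likelihood scoring with a per-field mixture model; the field weights and the simple interpolation smoothing below are assumptions, not the run's actual (Indri) settings:

    import math

    def mixture_query_loglik(query_terms, field_models, field_weights,
                             collection_model, mix=0.8):
        # field_models: {field: {term: P(term | that field of this document)}}
        # field_weights: {field: weight}, assumed to sum to 1.
        loglik = 0.0
        for t in query_terms:
            p_doc = sum(w * field_models[f].get(t, 0.0)
                        for f, w in field_weights.items())
            p = mix * p_doc + (1.0 - mix) * collection_model.get(t, 1e-9)
            loglik += math.log(p)
        return loglik

    # e.g. field_weights = {"body": 0.6, "title": 0.2, "anchor": 0.1, "heading": 0.1}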

indri06Nfip

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06Nfip
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This approach computes a unigram document model by mixing language models formed from the body, title, anchor text, and heading fields. PageRank and inlink priors were used. Ranking is done using query likelihood.

indri06Nsd

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06Nsd
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This approach uses a dependence model formulation that takes into account features computed over the body, title, anchor text, and heading fields. No priors are used.

indri06Nsdp

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: indri06Nsdp
  • Participant: umass.allan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/28/2006
  • Task: namedpage
  • Run description: This approach uses a dependence model formulation that takes into account features computed over the body, title, anchor text, and heading fields. PageRank and inlink features are also used.

JuruMan

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: JuruMan
  • Participant: ibm.carmel
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Manual run - exploiting the full system's query syntax

JuruT

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: JuruT
  • Participant: ibm.carmel
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Basic run based on the title only. For short queries (fewer than 4 terms) we expand the query with a phrase formed from the query text.

JuruTD

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: JuruTD
  • Participant: ibm.carmel
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Basic run based on title + description.

JuruTWE

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: JuruTWE
  • Participant: ibm.carmel
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: A run based on the title plus expansion from an external source. We run each topic's title through "answers.com", and the result page is used to expand the query. Expansion terms extracted from the Web result are lexical affinities of the original query terms (topic title).

mg4jAdhocBBV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAdhocBBV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: A run using BM25 + minimal-interval scoring, linearly combined (the former with doubled importance). Queries were generated by some interaction, simulating a user playing with a search engine.

mg4jAdhocBV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAdhocBV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: A run using BM25 + minimal-interval scoring, linearly combined. Queries were generated by some interaction, simulating a user playing with a search engine.

mg4jAdhocBVV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAdhocBVV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: A run using BM25 + minimal-interval scoring, linearly combined (the latter with doubled importance). Queries were generated by some interaction, simulating a user playing with a search engine.

mg4jAdhocV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAdhocV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: A run using minimal-interval scoring. Queries were generated by some interaction, simulating a user playing with a search engine.

mg4jAutoBBV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAutoBBV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: A run using BM25 + minimal-interval scoring, linearly combined (the former has doubled importance). Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
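
A minimal sketch of that query generation and of the "and then" cascade, assuming a hypothetical search(query) call that returns ranked document ids:

    from itertools import combinations

    def title_to_queries(title):
        # "A B C" -> ['(A & B & C)', '(A & B) | (A & C) | (B & C)', 'A | B | C']
        terms = title.split()
        all_and = "(" + " & ".join(terms) + ")"
        pairs = " | ".join("(" + " & ".join(c) + ")"
                           for c in combinations(terms, len(terms) - 1))
        any_or = " | ".join(terms)
        return [all_and, pairs, any_or]

    def and_then(queries, search, depth=10000):
        # Each later query only contributes documents not retrieved by earlier ones.
        results, seen = [], set()
        for q in queries:
            for docno in search(q):
                if docno not in seen:
                    seen.add(docno)
                    results.append(docno)
        return results[:depth]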

mg4jAutoBV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAutoBV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: A run using BM25 + minimal-interval scoring, linearly combined. Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).

mg4jAutoBVV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAutoBVV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: A run using BM25 + minimal-interval scoring, linearly combined (the latter has doubled importance). Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).

mg4jAutoV

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mg4jAutoV
  • Participant: umilano.vigna
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: A run using minimal-interval scoring. Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).

mpiiotopk

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiiotopk
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching.

mpiiotopk2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiiotopk2
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching. Pruning: each list is scanned at most to a depth of 1/5th of the list length.

mpiiotopk2p

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiiotopk2p
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching. Pruning: each list is scanned at most to a depth of 1/5th of the list length.

mpiiotopkpar

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiiotopkpar
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching.

mpiircomb

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiircomb
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Queries were automatically generated from the title and description fields (allowing duplicate words), with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.

mpiirdesc

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiirdesc
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Queries were automatically generated from the description field, with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.

mpiirmanual

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiirmanual
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Queries were generated manually from the title, description, and narrative fields and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". After query generation, documents are retrieved automatically. Scoring function: BM25. Stemming: no. Pruning: no.

mpiirtitle

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: mpiirtitle
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Queries were automatically generated from the title field, with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.

MU06TBa1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBa1
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Impact-based retrieval, using proximity to break ties. Manual queries.

MU06TBa2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBa2
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Impact-based retrieval. Baseline.

MU06TBa5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBa5
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Impact-based retrieval, using proximity to break ties.

MU06TBa6

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBa6
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Impact-based retrieval, using proximity to break ties. Manual queries.

MU06TBn2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBn2
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/31/2006
  • Task: namedpage
  • Run description: Plain impact retrieval over content and incoming anchor text.

MU06TBn5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBn5
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/31/2006
  • Task: namedpage
  • Run description: Plain BM25 over content and incoming anchor text.
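
For reference, a minimal sketch of Okapi BM25 scoring of a single document; k1 and b are the textbook defaults, not necessarily this run's settings, and the content/anchor handling is left abstract:

    import math

    def bm25_score(query_terms, doc_tf, doc_len, avg_doc_len,
                   df, num_docs, k1=1.2, b=0.75):
        # doc_tf: {term: frequency in this document (content plus incoming anchor text)}
        # df: {term: number of documents containing the term}
        score = 0.0
        for t in query_terms:
            tf = doc_tf.get(t, 0)
            if tf == 0:
                continue
            idf = math.log((num_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
            score += idf * norm
        return score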

MU06TBn6

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBn6
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/31/2006
  • Task: namedpage
  • Run description: Modified impact retrieval over content and incoming anchor text.

MU06TBn9

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBn9
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/31/2006
  • Task: namedpage
  • Run description: Modified impact retrieval over content and incoming anchor text.

MU06TBy1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBy1
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/20/2006
  • Task: efficiency
  • Run description: Baseline, full-processing with impact-sorted index.

MU06TBy2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBy2
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Impact retrieval + smoothing

MU06TBy5

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBy5
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Impact retrieval + dynamic pruning

MU06TBy6

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: MU06TBy6
  • Participant: umelbourne.ngoc-anh
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Impact retrieval + dynamic pruning + 4 streams

p6tbadt

Results | Participants | Input | Summary | Appendix

  • Run ID: p6tbadt
  • Participant: polytechnicu.suel
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Results based on BM25 are reorganized using a decision tree trained on the previous two years' judgements.

p6tbaxl

Results | Participants | Input | Summary | Appendix

  • Run ID: p6tbaxl
  • Participant: polytechnicu.suel
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: pagerank, bm25, anchor text

p6tbeb

Results | Participants | Input | Summary | Appendix

  • Run ID: p6tbeb
  • Participant: polytechnicu.suel
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: baseline run with 1GB cache.

p6tbedt

Results | Participants | Input | Summary | Appendix

  • Run ID: p6tbedt
  • Participant: polytechnicu.suel
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Variation of BM25 with Decision tree.

p6tbep8

Results | Participants | Input | Summary | Appendix

  • Run ID: p6tbep8
  • Participant: polytechnicu.suel
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Unreliable pruning used.

rmit06cmpind

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: rmit06cmpind
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/7/2006
  • Task: comp_eff

rmit06cmpwum

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: rmit06cmpwum
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/7/2006
  • Task: comp_eff

rmit06cmpzet

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: rmit06cmpzet
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/7/2006
  • Task: comp_eff

rmit06effic

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: rmit06effic
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/21/2006
  • Task: efficiency
  • Run description: Standard run (single stream). Index built with no term offsets. Query evaluation with stopping, light stemming and Dirichlet language modelling for ranking.

sabtb06aa1

Results | Participants | Input | Summary | Appendix

  • Run ID: sabtb06aa1
  • Participant: sabir.buckley
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: very simple vector run, all fields of topic

sabtb06at1

Results | Participants | Input | Summary | Appendix

  • Run ID: sabtb06at1
  • Participant: sabir.buckley
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: very simple vector title run

sabtb06man1

Results | Participants | Input | Summary | Appendix

  • Run ID: sabtb06man1
  • Participant: sabir.buckley
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: 5 minutes of manual effort per topic, spent partly on editing the topic text but mostly on judgements. Typically 5 iterations of retrieval with Rocchio feedback (concurrent with more judgements). Expanded by 30 terms for the judging runs and by 100 terms for the final 10000-result run.
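
A minimal sketch of Rocchio expansion in a vector-space setting; the alpha/beta/gamma coefficients are textbook defaults, not this run's values:

    def rocchio_expand(query_vec, rel_docs, nonrel_docs,
                       alpha=1.0, beta=0.75, gamma=0.15, n_terms=30):
        # query_vec and each document vector: {term: weight}.
        new_vec = {t: alpha * w for t, w in query_vec.items()}
        for docs, coeff in ((rel_docs, beta), (nonrel_docs, -gamma)):
            for doc in docs:
                for t, w in doc.items():
                    new_vec[t] = new_vec.get(t, 0.0) + coeff * w / len(docs)
        # Keep the original terms plus the n_terms strongest new terms.
        expansion = sorted((t for t in new_vec if t not in query_vec),
                           key=lambda t: new_vec[t], reverse=True)[:n_terms]
        return {t: new_vec[t] for t in list(query_vec) + expansion}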

THUADALL

Results | Participants | Input | Summary | Appendix

  • Run ID: THUADALL
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Result combination of BM25 (and, or) and language model (Dirichlet prior method) ranking. All ranking methods are performed on the whole collection together with anchor text.

THUADAO

Results | Participants | Input | Summary | Appendix

  • Run ID: THUADAO
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: BM25, whole collection with anchor text. Result combination (and, or)

THUADLMAO

Results | Participants | Input | Summary | Appendix

  • Run ID: THUADLMAO
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Result combination of BM25 (and, or) and language model (Dirichlet prior method) ranking. All ranking methods are performed on the whole collection together with anchor text.

THUADLMO

Results | Participants | Input | Summary | Appendix

  • Run ID: THUADLMO
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/3/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Result combination of BM25 and Language model (Dirichlet Prior Method) ranking. Both ranking methods are performed on the whole collection together with anchor text.

THUADOR

Results | Participants | Input | Summary | Appendix

  • Run ID: THUADOR
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: BM25 ranking over whole collection with in-link anchor text.

THUNPABS

Results | Participants | Input | Summary | Appendix

  • Run ID: THUNPABS
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. Several fields are extracted from the whole collection according to HTML structure (such as title, bold text, etc.), and query terms appearing in these fields are given a higher weight during the ranking process.

THUNPCOMB

Results | Participants | Input | Summary | Appendix

  • Run ID: THUNPCOMB
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: Result combination of 3 runs: [1] THUNPABS; [2] language model ranking (instead of BM25) over the same data collection as THUNPABS; [3] result filtering based on THUNPABS, in which only the results containing all query terms are retained. Results are ranked based on their different RSVs in these 3 runs.

THUNPNOSTOP

Results | Participants | Input | Summary | Appendix

  • Run ID: THUNPNOSTOP
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: BM25 ranking (using Tminer 3.0 system) over whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. No stopwords are filtered in the indexing process.

THUNPTA3

Results | Participants | Input | Summary | Appendix

  • Run ID: THUNPTA3
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: BM25 ranking over whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. Repeated anchor text is reduced from the corpus.

THUNPWP18

Results | Participants | Input | Summary | Appendix

  • Run ID: THUNPWP18
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: BM25 ranking over whole collection together with in-link anchor text. Bi-gram matching is given a higher weight.

THUTeraEff01

Results | Participants | Input | Summary | Appendix

  • Run ID: THUTeraEff01
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/22/2006
  • Task: efficiency
  • Run description: BM25 ranking over both anchor text and content of the .GOV2 corpus.

THUTeraEff02

Results | Participants | Input | Summary | Appendix

  • Run ID: THUTeraEff02
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/22/2006
  • Task: efficiency
  • Run description: BM25 ranking over extracted abstract of the .GOV2 corpus. Abstracts are extracted according to document structure of web pages.

THUTeraEff03

Results | Participants | Input | Summary | Appendix

  • Run ID: THUTeraEff03
  • Participant: tsinghuau.zhang
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/22/2006
  • Task: efficiency
  • Run description: Result combination according to the reciprocal rank of THUTeraEff01 and THUTeraEff02.
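
A minimal sketch of combining two runs by reciprocal rank (an assumed interpretation of the description above):

    def reciprocal_rank_merge(run_a, run_b, depth=10000):
        # run_a, run_b: ranked lists of docnos; score each doc by the sum of 1/rank.
        scores = {}
        for run in (run_a, run_b):
            for rank, docno in enumerate(run, start=1):
                scores[docno] = scores.get(docno, 0.0) + 1.0 / rank
        return sorted(scores, key=scores.get, reverse=True)[:depth]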

TWTB06AD01

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06AD01
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/28/2006
  • Type: automatic
  • Task: adhoc
  • Run description: This is an automatic run which combined the dependence model with pseudo-relevance feedback.

TWTB06AD02

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06AD02
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/28/2006
  • Type: manual
  • Task: adhoc
  • Run description: This is a manual run which made use of pseudo-relevance feedback.

TWTB06AD03

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06AD03
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/28/2006
  • Type: manual
  • Task: adhoc
  • Run description: This is a manual run which is just a simple query likelihood run.

TWTB06AD04

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06AD04
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/28/2006
  • Type: automatic
  • Task: adhoc
  • Run description: This is a title-only run which made use of dependence modeling.

TWTB06AD05

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06AD05
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/28/2006
  • Type: automatic
  • Task: adhoc
  • Run description: This run is a simple title-only query likelihood run.

TWTB06NP01

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06NP01
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/19/2006
  • Task: namedpage
  • Run description: This is a run which uses document structure techniques.

TWTB06NP02

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06NP02
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/19/2006
  • Task: namedpage
  • Run description: This is a run which uses the title field of documents.

TWTB06NP03

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: TWTB06NP03
  • Participant: pekingu.yan
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/19/2006
  • Task: namedpage
  • Run description: This is a run which uses pagerank prior and the title field of documents.

UAmsT06a3SUM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06a3SUM
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Combination of (1) full-text index run with weight 80%, (2) extracted titles index run with weight 10%, and (3) extracted anchor-texts index run with weight 80%. All runs use a stemmed index, and a language model with little smoothing (lambda = .9), no feedback.

UAmsT06aAnLM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06aAnLM
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Stemmed anchor-text index, using a language model with little smoothing (lambda = 0.9), no feedback.

UAmsT06aTDN

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06aTDN
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Stemmed full-text index, using a language model with little smoothing (lambda = 0.9), no feedback. Selected the 10 most characteristic terms from the TDN fields.

UAmsT06aTeLM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06aTeLM
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Stemmed full-text index, using a language model with little smoothing (lambda = 0.9), no feedback.

UAmsT06aTTDN

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06aTTDN
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Combination of (1) a title-only run and (2) a TDN run using the 10 most characteristic terms from the TDN fields. Both use a stemmed full-text index and a language model with little smoothing, no feedback.

UAmsT06n3SUM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06n3SUM
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/29/2006
  • Task: namedpage
  • Run description: CombSUM combination of full-text (.8), titles (.1) and anchors (.1), all runs use language model (lambda = .9), and a standard length-prior.
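
A minimal sketch of weighted CombSUM over the three component runs; it assumes the component scores are already on comparable scales, which the description does not spell out:

    def comb_sum(runs, weights, depth=1000):
        # runs: list of {docno: score} dictionaries, one per component run.
        combined = {}
        for run, w in zip(runs, weights):
            for docno, score in run.items():
                combined[docno] = combined.get(docno, 0.0) + w * score
        return sorted(combined.items(), key=lambda item: item[1], reverse=True)[:depth]

    # e.g. comb_sum([fulltext_run, titles_run, anchors_run], [0.8, 0.1, 0.1])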

UAmsT06nAnLM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06nAnLM
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/29/2006
  • Task: namedpage
  • Run description: Extracted anchor texts index (stemmed), using language model (lambda = .9, standard length prior).

UAmsT06nTeLM

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06nTeLM
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/29/2006
  • Task: namedpage
  • Run description: Full text index (stemmed), using language model (lambda = .9, standard length prior).

UAmsT06nTurl

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: UAmsT06nTurl
  • Participant: uamsterdam.ilps
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/29/2006
  • Task: namedpage
  • Run description: Full text index (stemmed), using language model (lambda = .9, standard length prior) and a URL-length prior.

uogTB06M

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06M
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: Divergence From Randomness weighting model with document structure

uogTB06MP

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06MP
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: Divergence From Randomness weighting model with document structure and precision enhancement model

uogTB06MPIA

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06MPIA
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/30/2006
  • Task: namedpage
  • Run description: Divergence From Randomness weighting model with document structure, precision enhancement model, and query-independent evidence

uogTB06QET1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06QET1
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: DFR document weighting framework with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.

uogTB06QET2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06QET2
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: DFR document weighting framework with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.

uogTB06S50L

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06S50L
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: DFR document weighting framework and query reformulation with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.

uogTB06SS10L

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06SS10L
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: DFR document weighting framework, query reformulation and a new indexing procedure with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.

uogTB06SSQL

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uogTB06SSQL
  • Participant: uglasgow.ounis
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: DFR document weighting framework, query reformulation and a new indexing procedure with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.

uwmtFadDS

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFadDS
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/27/2006
  • Type: automatic
  • Task: adhoc
  • Run description: BM25 + additional weight for terms in title fields etc.
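
A hedged sketch of BM25 scoring in which term occurrences inside title-like fields receive extra weight, roughly the field weighting this description hints at. The title_boost value and the BM25 parameters are illustrative, not the run's actual settings.

```python
import math

def bm25_fielded(tf_body, tf_title, doc_len, avg_len, df, num_docs,
                 title_boost=3.0, k1=1.2, b=0.75):
    """BM25 term score where title-field occurrences count extra
    (boosted pseudo-frequency).  All parameter values are assumptions."""
    tf = tf_body + title_boost * tf_title
    idf = math.log((num_docs - df + 0.5) / (df + 0.5))
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1.0) / norm
```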

uwmtFadTPFB

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFadTPFB
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/27/2006
  • Type: automatic
  • Task: adhoc
  • Run description: BM25 + term proximity + pseudo-relevance feedback
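
A minimal sketch of the pseudo-relevance-feedback step: take the top documents of an initial ranking and add their most frequent non-query terms to the query. The feedback model actually used, and the term-proximity component, are not specified, so this is only illustrative.

```python
from collections import Counter

def prf_expand(query_terms, ranked_docs, doc_tokens, fb_docs=10, fb_terms=20):
    """Very simple pseudo-relevance feedback: count terms in the top
    fb_docs documents of the initial ranking and append the fb_terms
    most frequent non-query terms to the query."""
    counts = Counter()
    for docid in ranked_docs[:fb_docs]:
        counts.update(doc_tokens[docid])
    expansion = [t for t, _ in counts.most_common()
                 if t not in query_terms][:fb_terms]
    return list(query_terms) + expansion
```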

uwmtFadTPRR

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFadTPRR
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/24/2006
  • Type: automatic
  • Task: adhoc
  • Run description: This is BM25 + term proximity for an initial result set. The top 10 documents from the initial result set are used to build a language model which is then employed to rerank all documents in the result set according to their divergence from this language model.
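
A sketch of the reranking step as described: build a language model from the top 10 documents of the initial result set and reorder the whole set by divergence from that model (smaller divergence ranks higher). The smoothing used when computing the divergence is an assumption.

```python
import math
from collections import Counter

def kl_rerank(ranked_docs, doc_tokens, fb_docs=10, mu=0.5):
    """Rerank an initial result list by KL divergence between a
    feedback model built from the top fb_docs documents and each
    document's own model.  Smoothing details are assumptions."""
    fb = Counter()
    for docid in ranked_docs[:fb_docs]:
        fb.update(doc_tokens[docid])
    fb_total = sum(fb.values())

    def divergence(docid):
        toks = doc_tokens[docid]
        tf = Counter(toks)
        dlen = len(toks) or 1
        d = 0.0
        for t, f in fb.items():
            p_fb = f / fb_total
            p_doc = (1 - mu) * tf.get(t, 0) / dlen + mu * p_fb  # simple mixing
            d += p_fb * math.log(p_fb / p_doc)
        return d

    return sorted(ranked_docs, key=divergence)   # smaller divergence first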

uwmtFcompI0

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompI0
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompI1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompI1
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompI2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompI2
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompI3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompI3
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompW

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompW
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompW1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompW1
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompW2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompW2
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompW3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompW3
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/7/2006
  • Task: comp_eff

uwmtFcompZ0

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompZ0
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/13/2006
  • Task: comp_eff

uwmtFcompZ1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompZ1
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/13/2006
  • Task: comp_eff

uwmtFcompZ2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompZ2
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/13/2006
  • Task: comp_eff

uwmtFcompZ3

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFcompZ3
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 8/13/2006
  • Task: comp_eff

uwmtFdcp03

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFdcp03
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/15/2006
  • Task: efficiency
  • Run description: This run uses a pruned in-memory index containing the top 3% terms from every document. vbyte is used for index compression.
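
A sketch of the document-centric pruning described here: when building the in-memory index, keep only the top fraction (3% for this run, 6% and 12% for the companion runs) of each document's terms. The impact score used to rank a document's terms is not given, so term_scores is a placeholder.

```python
def prune_document(term_scores, fraction=0.03):
    """Document-centric pruning: keep only the highest-scoring
    fraction of a document's terms when indexing.

    term_scores: dict mapping term -> some per-document impact score
                 (the actual scoring function is not described)."""
    keep = max(1, int(len(term_scores) * fraction))
    return dict(sorted(term_scores.items(),
                       key=lambda x: x[1], reverse=True)[:keep])
```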

uwmtFdcp06

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFdcp06
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/16/2006
  • Task: efficiency
  • Run description: This run uses a pruned in-memory index containing the top 6% terms from every document. vbyte is used for index compression.

uwmtFdcp12

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFdcp12
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/15/2006
  • Task: efficiency
  • Run description: This run uses a pruned in-memory index containing the top 12% terms from every document. Index compression is done using a length-limited Huffman code.

uwmtFmanual

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFmanual
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/30/2006
  • Type: manual
  • Task: adhoc
  • Run description: This run is a combination of a manual run, in which a bored graduate student tried to find relevant documents by hand, and a few automatic runs, merged together. It makes use of the full topic statements, TDN.

uwmtFnoprune

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFnoprune
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 6/15/2006
  • Task: efficiency
  • Run description: Frequency index, compressed using vbyte. Frequencies of terms appearing in special parts of the document (title, headlines, etc.) are boosted.
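
For reference, a standard variable-byte (vbyte) postings-compression routine of the kind this and the pruned-index runs mention: 7 data bits per byte, with the high bit marking the final byte of each integer. How the boosted field frequencies are folded in is not shown here.

```python
def vbyte_encode(numbers):
    """Variable-byte encode a sequence of non-negative integers."""
    out = bytearray()
    for n in numbers:
        chunk = []
        while True:
            chunk.append(n & 0x7F)
            n >>= 7
            if n == 0:
                break
        chunk[0] |= 0x80                  # flag the last (low-order) byte
        out.extend(reversed(chunk))       # emit high-order bytes first
    return bytes(out)

def vbyte_decode(data):
    """Decode a vbyte-encoded byte string back into integers."""
    numbers, n = [], 0
    for b in data:
        if b & 0x80:
            numbers.append((n << 7) | (b & 0x7F))
            n = 0
        else:
            n = (n << 7) | b
    return numbers
```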

uwmtFnpsRR1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFnpsRR1
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/26/2006
  • Task: namedpage
  • Run description: BM25 with weighted fields and local reranking based on links and anchor text as a second step.

uwmtFnpstr1

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFnpstr1
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/11/2006
  • Task: namedpage
  • Run description: BM25 with extra weight for terms within special HTML tags.

uwmtFnpstr2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: uwmtFnpstr2
  • Participant: uwaterloo-clarke
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/11/2006
  • Task: namedpage
  • Run description: BM25 + extra weight according to document structure. Integrated duplicate elimination.

wumpus

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: wumpus
  • Participant: max-planck.theobald
  • Track: Terabyte
  • Year: 2006
  • Submission: 9/5/2006
  • Task: comp_eff

zetabm

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetabm
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Zettair probabilistic model (BM25) run

zetadir

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetadir
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Zettair language model (dirichlet) run

zetaman

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetaman
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: manual
  • Task: adhoc
  • Run description: Zettair manual run using title plus salient keywords from description and narrative

zetamerg

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetamerg
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Zettair and Indri merged run (round-robin merge)
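
A minimal sketch of a round-robin merge of two ranked lists (e.g. the Zettair and Indri result lists for one topic), skipping duplicates; the interleaving order and cut-off depth are arbitrary choices here.

```python
from itertools import zip_longest

def round_robin_merge(run_a, run_b, depth=1000):
    """Interleave two ranked docid lists, taking one document from
    each list in turn and skipping documents already taken."""
    merged, seen = [], set()
    for a, b in zip_longest(run_a, run_b):
        for docid in (a, b):
            if docid is not None and docid not in seen:
                seen.add(docid)
                merged.append(docid)
            if len(merged) == depth:
                return merged
    return merged
```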

zetamerg2

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetamerg2
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/2/2006
  • Type: automatic
  • Task: adhoc
  • Run description: Zettair and Indri merged run 2 (merged on score)
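
A sketch of merging two runs on their retrieval scores, as this description says. Scores from different systems are generally not directly comparable, so a real implementation would likely normalise them first; the description does not say whether or how that was done.

```python
def score_merge(run_a, run_b, depth=1000):
    """Merge two runs on raw score, keeping each document's best score.

    run_a, run_b: lists of (docid, score) pairs from the two systems."""
    best = {}
    for docid, score in run_a + run_b:
        if docid not in best or score > best[docid]:
            best[docid] = score
    ranked = sorted(best.items(), key=lambda x: x[1], reverse=True)
    return ranked[:depth]
```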

zetnpbm

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetnpbm
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Baseline zettair run (using BM25)

zetnpfa

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetnpfa
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Mixed retrieval of full text and anchor text

zetnpft

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetnpft
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Mixed retrieval of full text index and tags (TITLE and Hx)

zetnpfta

Results | Participants | Proceedings | Input | Summary | Appendix

  • Run ID: zetnpfta
  • Participant: rmit.scholer
  • Track: Terabyte
  • Year: 2006
  • Submission: 7/27/2006
  • Task: namedpage
  • Run description: Mixed retrieval of full text, anchor text and tags (TITLE and Hx)