Runs - Terabyte 2006
AMRIMtp20006
- Run ID: AMRIMtp20006
- Participant: ecole-des-mines.beigbeder
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run using the title field. We rank with our fuzzy term proximity method first, followed by Zettair BM25. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary. Beyond re-ranking, our method retrieves at least 80 results in addition to the Zettair BM25 list.
AMRIMtp5006
- Run ID: AMRIMtp5006
- Participant: ecole-des-mines.beigbeder
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run using the title field. We rank with our fuzzy term proximity method first, followed by Zettair BM25. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary.
AMRIMtpm5006
- Run ID: AMRIMtpm5006
- Participant: ecole-des-mines.beigbeder
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Manual run using words from all topic fields. This run is well suited to our method because we construct boolean queries that can be analysed by fuzzy proximity. Our boolean queries take words from all topic fields. We rank with our fuzzy term proximity method first, followed by Zettair BM25. Note that for our method we index all the documents, but only with the terms of the topic file, because our computer does not have enough RAM to load the full vocabulary.
arscDomAlog
- Run ID: arscDomAlog
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run; each web domain in GOV2 was indexed and searched individually, and the ranked results from each domain were fitted and normalized to a logistic curve, then merged.
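A minimal sketch of this kind of per-domain normalization and merging, assuming rank-based inputs and a fixed logistic shape (the actual curve fitting used by the participants is not specified here):

```python
# Hypothetical sketch: map each domain's ranked list onto a logistic
# curve in [0, 1], then merge all domains by the normalized score.
import math

def logistic(x, midpoint=0.5, slope=10.0):
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))

def normalize_domain(results):
    """results: list of (docid, raw_score), best first."""
    n = len(results)
    return [(doc, logistic(1.0 - rank / max(n - 1, 1)))
            for rank, (doc, _raw) in enumerate(results)]

def merge_domains(domains):
    merged = [hit for d in domains for hit in normalize_domain(d)]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged
```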
arscDomAsrt
- Run ID: arscDomAsrt
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run; each web domain in GOV2 was indexed and searched individually, and the ranked results from each domain were sorted via GNU sort on the relevance score.
arscDomManL
- Run ID: arscDomManL
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: manual
- Task: adhoc
- Run description: Manual processing included reading the topics and descriptions, thinking a bit, and constructing a query with boolean AND, OR, and NOT. As with the automatic run, each web domain is searched independently of the others and the results are merged after fitting to a normalized logistic curve.
arscDomManS
- Run ID: arscDomManS
- Participant: ualaska.fairbanks.newby
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: manual
- Task: adhoc
- Run description: Manual processing included reading the topics and descriptions, thinking a bit, and constructing a query with boolean AND, OR, and NOT. As with the automatic run, each web domain is searched independently of the others and the results are merged using GNU sort on the relevance score.
CoveoNPRun1
- Run ID: CoveoNPRun1
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query.
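A hedged sketch of such a scheme; the feature names, weights, and search callables below are placeholders, not Coveo's actual criteria:

```python
# Placeholder sketch: weighted combination of criteria, with an AND query
# topped up by OR results when fewer than 1000 documents match.
def combined_score(features, weights):
    """features/weights: dicts keyed by criterion name (~20 in the real run)."""
    return sum(weights[name] * value for name, value in features.items())

def run_query(search_and, search_or, extract_features, weights, k=1000):
    """search_and/search_or: callables returning candidate doc ids."""
    docs = list(search_and())
    if len(docs) < k:                 # top up with OR results
        seen = set(docs)
        docs += [d for d in search_or() if d not in seen]
    docs.sort(key=lambda d: combined_score(extract_features(d), weights),
              reverse=True)
    return docs[:k]
```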
CoveoNPRun2
- Run ID: CoveoNPRun2
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the previous one, but the weights for each criterion have been slightly changed.
CoveoNPRun3
- Run ID: CoveoNPRun3
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the previous one, but acronyms built from the query have been added.
CoveoNPRun4
- Run ID: CoveoNPRun4
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. This run is similar to the first one, but the weights of query words found in the title and in keyphrases extracted by a summarization tool are boosted.
CoveoRun1
- Run ID: CoveoRun1
- Participant: coveo.soucy
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Our ranking algorithm combines about 20 criteria. Each criterion has a weight, and each weight was trained using data from previous years. Queries are executed with the boolean AND operator between terms. If there are fewer than 1000 results, we add results after the AND query using an OR query. We return at most 1000 results, since this is the maximum number our system can return in its current configuration.
CWI06COMP1
- Run ID: CWI06COMP1
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 9/5/2006
- Task: comp_eff
CWI06DISK1
- Run ID: CWI06DISK1
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Entire collection indexed on a single machine. A decent (10-disk) RAID is used to observe system performance with I/O-based processing (only 3GB of buffer memory). A single query stream is used to analyze sequential performance. The index is compressed. Scores are precomputed and stored in compressed, quantized form. The top-20 is retrieved using a two-pass strategy. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06DISK1ah
- Run ID: CWI06DISK1ah
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/26/2006
- Type: automatic
- Task: adhoc
- Run description: Entire collection indexed on a single machine. A decent (10-disk) RAID is used to observe system performance with I/O-based processing (only 3GB of buffer memory). A single query stream is used to analyze sequential performance. The index is compressed. Scores are precomputed and stored in compressed, quantized form. The top-10000 is retrieved using a two-pass strategy. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06DIST8
- Run ID: CWI06DIST8
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/20/2006
- Task: efficiency
- Run description: This is a distributed run using a centralized broker that runs 4 query streams in parallel against 8 workstations, each of which indexes one eighth of the full document collection. Each workstation has 2GB of RAM and an Athlon64 X2 3800+ dual-core CPU. Each workstation runs our MonetDB/X100 system, a research relational DBMS designed for high performance on data- and query-intensive workloads. This means that all the data structures we use are stored in relational tables. On each node, the index (an inverted file stored in a relational table) is cached in RAM (using compression). Term occurrences are ordered on docid to allow for merge-joins and to optimize for our PFOR-DELTA compression scheme. Queries are executed using a two-pass strategy: in the first pass, we try to retrieve the per-node top-20 using the boolean conjunction of the query terms in combination with Okapi BM25 ranking. If this fails to return 20 results over all 8 nodes, we execute a second pass, in which we use a boolean-disjunctive variant of the same query, again ranking with Okapi BM25. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
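The two-pass conjunctive/disjunctive strategy can be sketched as follows (the per-node search interface is an assumption, not the MonetDB/X100 API):

```python
# Sketch: conjunctive BM25 on every node first; if too few hits come back
# overall, rerun with the disjunctive variant of the same query.
def two_pass_search(node_searchers, terms, k=20):
    """node_searchers: callables f(terms, conjunctive) -> [(doc, score)]."""
    hits = [h for search in node_searchers for h in search(terms, True)]
    if len(hits) < k:
        hits = [h for search in node_searchers for h in search(terms, False)]
    return sorted(hits, key=lambda h: h[1], reverse=True)[:k]
```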
CWI06DIST8ah
- Run ID: CWI06DIST8ah
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/26/2006
- Type: automatic
- Task: adhoc
- Run description: This is a distributed run using a centralized broker that runs 4 query streams in parallel against 8 workstations, each of which indexes one eighth of the full document collection. Each workstation has 2GB of RAM and an Athlon64 X2 3800+ dual-core CPU. Each workstation runs our MonetDB/X100 system, a research relational DBMS designed for high performance on data- and query-intensive workloads. This means that all the data structures we use are stored in relational tables. On each node, the index (an inverted file stored in a relational table) is cached in RAM (using compression). Term occurrences are ordered on docid to allow for merge-joins and to optimize for our PFOR-DELTA compression scheme. Queries are executed using a two-pass strategy: in the first pass, we try to retrieve the per-node top-10000 using the boolean conjunction of the query terms in combination with Okapi BM25 ranking. If this fails to return 10000 results over all 8 nodes, we execute a second pass, in which we use a boolean-disjunctive variant of the same query, again ranking with Okapi BM25. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06MEM1
- Run ID: CWI06MEM1
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Single-node, single-query-stream, sequential run using a RAM-resident index. Using 16GB of RAM allows the entire index to be kept in memory, avoiding I/O completely. Per-document term scores are quantized and compressed. Although the machine has 4 CPUs, only one was used to process the single query stream sequentially. Note that a different machine was used for indexing (the one from the CWI06DISK1 run). Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
CWI06MEM4
- Run ID: CWI06MEM4
- Participant: lowlands-team.deVries
- Track: Terabyte
- Year: 2006
- Submission: 6/20/2006
- Task: efficiency
- Run description: Entire collection indexed in RAM on a single machine. Note that a different machine was used for indexing (the one from the CWI06DISK run). Using 16GB of RAM allows the entire index to be kept in memory, avoiding I/O completely. Per-document term scores are quantized and compressed. 4 CPUs are used to measure the benefit of 4 query streams. Remark: the 'TREC 2006 Terabyte Track Guidelines' on the web did not mention anything about submitting 'Total CPU time'. Especially in a distributed setting, this number requires an elaborate and clear definition. As we did not know up front that this number was needed for submissions, nor what it was supposed to represent, we copied the result from the 'Total wall-clock time' field.
DCU05BASE
- Run ID: DCU05BASE
- Participant: dublincityu.gurrin
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Automatic run on a sorted index
hedge0
- Run ID: hedge0
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Hedge metasearch (without feedback) over eight standard lemur retrieval systems.
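For intuition, a loose sketch of Hedge-style metasearch as a multiplicative-weights scheme; the rank-discounted gain and loss functions below are simplifications, not the authors' exact formulation:

```python
# Loose sketch: weight each system by a multiplicative-update rule, then
# fuse ranked lists by weighted rank-discounted gains.
import math

def hedge_fuse(ranked_lists, judged=None, beta=0.9):
    weights = [1.0] * len(ranked_lists)
    if judged:  # feedback variants: penalize systems that rank judged
        for i, lst in enumerate(ranked_lists):  # non-relevant docs highly
            loss = sum(1.0 / math.log2(rank + 2)
                       for rank, doc in enumerate(lst) if judged.get(doc) == 0)
            weights[i] = beta ** loss
    scores = {}
    for w, lst in zip(weights, ranked_lists):
        for rank, doc in enumerate(lst):
            scores[doc] = scores.get(doc, 0.0) + w / math.log2(rank + 2)
    return sorted(scores, key=scores.get, reverse=True)
```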
hedge10
- Run ID: hedge10
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (10 documents feedback) over eight standard lemur retrieval systems.
hedge30
- Run ID: hedge30
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (30 documents feedback) over eight standard lemur retrieval systems.
hedge5
- Run ID: hedge5
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (5 documents feedback) over eight standard lemur retrieval systems.
hedge50
- Run ID: hedge50
- Participant: northeasternu.aslam
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Hedge metasearch (50 documents feedback) over eight standard lemur retrieval systems.
humT06l
- Run ID: humT06l
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Plain content search, boolean-OR of query terms, English inflections, normal tf and idf dampening, document length normalization.
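Hummingbird's exact weighting formula is not given in these descriptions; as a reference point, the standard BM25 form combines the same ingredients (dampened tf, idf, and document-length normalization):

$$
\mathrm{score}(D, Q) \;=\; \sum_{t \in Q} \mathrm{idf}(t) \cdot \frac{tf_{t,D}\,(k_1 + 1)}{tf_{t,D} + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
$$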
humT06xl
- Run ID: humT06xl
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Same as humT06l except with an extra 20% weight on proximity matches of query terms.
humT06xlc
- Run ID: humT06xlc
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: Same as humT06xl except that a duplicate filtering heuristic was applied and only 1000 rows per topic were returned.
humT06xle
- Run ID: humT06xle
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/29/2006
- Type: automatic
- Task: adhoc
- Run description: Blind feedback using top-2 rows of humT06xl
humT06xlz
- Run ID: humT06xlz
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: automatic
- Task: adhoc
- Run description: One percent subset of first 9000 rows of humT06xl (rows 1, 101, 201, 301, ..., 8901) plus last 1000 rows of humT06xl (rows 9001-10000).
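This row selection is fully specified and amounts to the following (1-based ranks):

```python
# Reproduces the stated selection: ranks 1, 101, ..., 8901 from the first
# 9000 rows, plus ranks 9001-10000.
def subset_rows(run):
    """run: list of 10000 result rows, best first (0-indexed)."""
    picked = [run[i] for i in range(0, 9000, 100)]  # ranks 1, 101, ..., 8901
    picked += run[9000:10000]                       # ranks 9001-10000
    return picked
```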
humTE06i3
- Run ID: humTE06i3
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/19/2006
- Task: efficiency
- Run description: Boolean-AND of query words, normal tf and idf dampening; no document length normalization, no stemming.
humTE06v2
- Run ID: humTE06v2
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 6/19/2006
- Task: efficiency
- Run description: Differs from humTE06i3 in that Boolean-OR is used, an extra 20% weight is given to Title matches, terms occurring in more than 10% of rows are discarded, and document length normalization is enabled.
humTN06dpl
- Run ID: humTN06dpl
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/9/2006
- Task: namedpage
- Run description: Content weight 10, Title weight 2, Phrase-in-Title weight 1, Url-depth weight 5, English inflections, normal tf and idf dampening, document length normalization.
humTN06dplc
- Run ID: humTN06dplc
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/9/2006
- Task: namedpage
- Run description: Same as humTN06dpl except that a duplicate filtering heuristic was applied.
humTN06l
- Run ID: humTN06l
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: Same as humTN06pl except that the special weights on the title are omitted.
humTN06pl
- Run ID: humTN06pl
- Participant: hummingbird.tomlinson
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: Same as humTN06dpl except that url-depth weighting is omitted.
icttb0600
- Run ID: icttb0600
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0601
- Run ID: icttb0601
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0602
- Run ID: icttb0602
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0603
- Run ID: icttb0603
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
icttb0604
- Run ID: icttb0604
- Participant: cas-ict.wang
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This is an automatic run. We used a stop word list, but no stemming. Title and URL information were considered, and duplicated URLs were removed. We used a modified BM25 formula to rank documents, and before ranking we applied a position check procedure. The indexing time can be divided into 3 parts: data processing and pure-text extraction took 454 minutes, indexing took 136 minutes, and index optimization took 230 minutes.
indri06AdmD
- Run ID: indri06AdmD
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Dependence model run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking.
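Dependence-model runs of this kind typically follow the Metzler-Croft sequential dependence form; the mixture weights shown are the commonly cited defaults, not values confirmed for this run:

$$
\mathrm{score}(Q, D) = \lambda_T \sum_{q \in Q} f_T(q, D)
\;+\; \lambda_O \sum_{(q_i, q_{i+1})} f_O(q_i, q_{i+1}, D)
\;+\; \lambda_U \sum_{(q_i, q_{i+1})} f_U(q_i, q_{i+1}, D)
$$

where $f_T$, $f_O$, and $f_U$ are single-term, exact-phrase, and unordered-window features (here Dirichlet-smoothed), often with $(\lambda_T, \lambda_O, \lambda_U) = (0.85, 0.10, 0.05)$.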
indri06AlceB
- Run ID: indri06AlceB
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using BM25 features. Both term proximity and phrase matches are taken into account during ranking.
indri06AlceD
- Run ID: indri06AlceD
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking.
indri06Aql
- Run ID: indri06Aql
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Query likelihood (bag of words) baseline run.
indri06AtdnD
- Run ID: indri06AtdnD
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/1/2006
- Type: automatic
- Task: adhoc
- Run description: Latent concept expansion (pseudo-relevance feedback that takes term dependence into account) run using Dirichlet language modeling features. Both term proximity and phrase matches are taken into account during ranking. In addition, the title, description, and narrative portions of the topic are each weighted differently.
indri06Nfi
- Run ID: indri06Nfi
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach computes a unigram document model by mixing language models formed from the body, title, anchor text, and heading fields. No priors are used. Ranking is done using query likelihood.
indri06Nfip
- Run ID: indri06Nfip
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach computes a unigram document model by mixing language models formed from the body, title, anchor text, and heading fields. PageRank and inlink priors were used. Ranking is done using query likelihood.
indri06Nsd
- Run ID: indri06Nsd
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach uses a dependence model formulation that takes into account features computed over the body, title, anchor text, and heading fields. No priors are used.
indri06Nsdp
- Run ID: indri06Nsdp
- Participant: umass.allan
- Track: Terabyte
- Year: 2006
- Submission: 7/28/2006
- Task: namedpage
- Run description: This approach uses a dependence model formulation that takes into account features computed over the body, title, anchor text, and heading fields. PageRank and inlink features are also used.
JuruMan
- Run ID: JuruMan
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Manual run - exploiting the full system's query syntax
JuruT
- Run ID: JuruT
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Basic run based on title only. For short queries (fewer than 4 terms) we expand the query with a phrase of the query text.
JuruTD
- Run ID: JuruTD
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Basic run based on title + description.
JuruTWE
- Run ID: JuruTWE
- Participant: ibm.carmel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run based on title + expansion from an external source. We run each topic's title through "answers.com", and the result page is used to expand the query. Expansion terms extracted from the Web result are lexical affinities of the original query terms (topic title).
mg4jAdhocBBV
- Run ID: mg4jAdhocBBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the former with doubled importance). Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAdhocBV
- Run ID: mg4jAdhocBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined. Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAdhocBVV
- Run ID: mg4jAdhocBVV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the latter with doubled importance). Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAdhocV
- Run ID: mg4jAdhocV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: A run using minimal-interval scoring. Queries were generated through some interaction, simulating a user playing with a search engine.
mg4jAutoBBV
- Run ID: mg4jAutoBBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the former has doubled importance). Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
mg4jAutoBV
- Run ID: mg4jAutoBV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined. Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
mg4jAutoBVV
- Run ID: mg4jAutoBVV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using BM25 + minimal-interval scoring, linearly combined (the latter has doubled importance). Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
mg4jAutoV
- Run ID: mg4jAutoV
- Participant: umilano.vigna
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: A run using minimal-interval scoring. Queries were generated from the title as in "A B C" -> (A & B & C), (A & B) | (A & C) | (B & C), A | B | C, where the comma denotes "and then" (i.e., give me the results of this query that did not appear before).
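The "and then" composition is mechanical enough to sketch directly (the query syntax and search callable are assumptions):

```python
# Sketch: run each successively weaker boolean query, keeping only
# documents not returned by an earlier query in the cascade.
def and_then(search, queries):
    """search: callable(query) -> ranked doc ids; queries: strongest first,
    e.g. ['A & B & C', '(A & B) | (A & C) | (B & C)', 'A | B | C']."""
    seen, out = set(), []
    for q in queries:
        for doc in search(q):
            if doc not in seen:
                seen.add(doc)
                out.append(doc)
    return out
```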
mpiiotopk
- Run ID: mpiiotopk
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching.
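For intuition only, here is plain Fagin-style TA with eager random accesses; the IO-Top-K algorithm referenced above improves on this by postponing and scheduling those random accesses, which this sketch does not capture:

```python
# Baseline threshold algorithm: scan score-sorted lists round-robin and
# stop once no unseen document can beat the current k-th best score.
import heapq

def threshold_algorithm(sorted_lists, full_score, k):
    """sorted_lists: per-term postings [(doc, score), ...], score-descending.
    full_score(doc): random-access lookup of a document's complete score."""
    seen, topk = set(), []  # topk: min-heap of (score, doc)
    depth = 0
    while True:
        tau = 0.0  # best total score any still-unseen document could reach
        for lst in sorted_lists:
            if depth >= len(lst):
                continue
            doc, s = lst[depth]
            tau += s
            if doc not in seen:
                seen.add(doc)
                heapq.heappush(topk, (full_score(doc), doc))
                if len(topk) > k:
                    heapq.heappop(topk)
        depth += 1
        exhausted = all(depth >= len(lst) for lst in sorted_lists)
        if exhausted or (len(topk) == k and topk[0][0] >= tau):
            return sorted(topk, reverse=True)
```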
mpiiotopk2
- Run ID: mpiiotopk2
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching. Pruning: scans each list at most up to a depth of 1/5 of the list length.
mpiiotopk2p
- Run ID: mpiiotopk2p
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching. Pruning: scans each list at most up to a depth of 1/5 of the list length.
mpiiotopkpar
- Run ID: mpiiotopkpar
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Algorithm: top-k query processing according to our VLDB'06 paper "IO-Top-K: Index-Access Optimized Top-k Query Processing" (computes the exact top-k hits, similar to Fagin's CA algorithm, but with random accesses postponed to the end, and with a much better stopping criterion). Scores: BM25 (with standard parameter settings). Stemming: no. Compression: no. Caching: only the operating system's ordinary disk caching.
mpiircomb
- Run ID: mpiircomb
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Queries were automatically generated from the title and description fields (allowing duplicate words), with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.
mpiirdesc
- Run ID: mpiirdesc
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Queries were automatically generated from the description field, with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.
mpiirmanual
- Run ID: mpiirmanual
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Queries were generated manually from the title, description and narrative fields and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". After query generation, documents are retrieved automatically. Scoring function: BM25. Stemming: no. Pruning: no.
mpiirtitle
- Run ID: mpiirtitle
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Queries were automatically generated from the title field, with stopwords removed, and processed using the top-k algorithm described in our VLDB'06 paper "IO-Top-k". Scoring function: BM25. Stemming: no. Pruning: no.
MU06TBa1
- Run ID: MU06TBa1
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Impact-based retrieval. Proximity is used to break ties. Manual queries.
MU06TBa2
- Run ID: MU06TBa2
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Impact-based retrieval. Baseline.
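A minimal sketch of impact-based evaluation under assumed structures, where postings carry precomputed small-integer impacts rather than raw term frequencies:

```python
# Sketch: with impacts precomputed at indexing time, query evaluation
# reduces to summing impacts per document.
from collections import defaultdict

def impact_search(index, query_terms, k=1000):
    """index: dict term -> list of (docid, impact), impact a small integer."""
    acc = defaultdict(int)
    for term in query_terms:
        for docid, impact in index.get(term, []):
            acc[docid] += impact
    return sorted(acc.items(), key=lambda t: t[1], reverse=True)[:k]
```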
MU06TBa5
- Run ID: MU06TBa5
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Impact-based retrieval. Proximity is used to break ties.
MU06TBa6
- Run ID: MU06TBa6
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Impact-based retrieval. Proximity is used to break ties. Manual queries.
MU06TBn2
- Run ID: MU06TBn2
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Plain impact retrieval over content and incoming anchor text.
MU06TBn5
- Run ID: MU06TBn5
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Plain BM25 over content and incoming anchor text.
MU06TBn6
- Run ID: MU06TBn6
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Modified impact retrieval over content and incoming anchor text.
MU06TBn9
- Run ID: MU06TBn9
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 7/31/2006
- Task: namedpage
- Run description: Modified impact retrieval over content and incoming anchor text.
MU06TBy1
- Run ID: MU06TBy1
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/20/2006
- Task: efficiency
- Run description: Baseline, full-processing with impact-sorted index.
MU06TBy2
- Run ID: MU06TBy2
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Impact retrieval + smoothing
MU06TBy5
- Run ID: MU06TBy5
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Impact retrieval + dynamic pruning
MU06TBy6
- Run ID: MU06TBy6
- Participant: umelbourne.ngoc-anh
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Impact retrieval + dynamic pruning + 4 streams
p6tbadt
- Run ID: p6tbadt
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Results based on BM25 are re-ranked using a decision tree trained on the previous two years' judgements.
p6tbaxl
- Run ID: p6tbaxl
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: PageRank, BM25, anchor text.
p6tbeb
- Run ID: p6tbeb
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: baseline run with 1GB cache.
p6tbedt
- Run ID: p6tbedt
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Variation of BM25 with Decision tree.
p6tbep8
- Run ID: p6tbep8
- Participant: polytechnicu.suel
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Unreliable pruning used.
rmit06cmpind
- Run ID: rmit06cmpind
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 8/7/2006
- Task: comp_eff
rmit06cmpwum
- Run ID: rmit06cmpwum
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 8/7/2006
- Task: comp_eff
rmit06cmpzet
- Run ID: rmit06cmpzet
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 8/7/2006
- Task: comp_eff
rmit06effic
- Run ID: rmit06effic
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 6/21/2006
- Task: efficiency
- Run description: Standard run (single stream). Index built with no term offsets. Query evaluation with stopping, light stemming and Dirichlet language modelling for ranking.
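For reference, Dirichlet-smoothed language modelling ranks documents by query likelihood under the smoothed document model (the smoothing parameter $\mu$ used in this run is not stated):

$$
p(w \mid D) = \frac{tf_{w,D} + \mu\, p(w \mid C)}{|D| + \mu},
\qquad
\mathrm{score}(Q, D) = \sum_{w \in Q} \log p(w \mid D)
$$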
sabtb06aa1
- Run ID: sabtb06aa1
- Participant: sabir.buckley
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: very simple vector run, all fields of topic
sabtb06at1
- Run ID: sabtb06at1
- Participant: sabir.buckley
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: very simple vector title run
sabtb06man1
- Run ID: sabtb06man1
- Participant: sabir.buckley
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: 5 minutes of manual effort per topic, possibly editing the topic text, but mostly judgements. Typically 5 iterations of retrieval with Rocchio feedback (concurrent with more judgements). Queries were expanded by 30 terms for the judging runs and by 100 terms for the final 10000-document run.
THUADALL
- Run ID: THUADALL
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Result combination of BM25 (AND, OR) and language model (Dirichlet prior method) ranking. All ranking methods are performed on the whole collection together with anchor text.
THUADAO
- Run ID: THUADAO
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 over the whole collection with anchor text; result combination (AND, OR).
THUADLMAO
- Run ID: THUADLMAO
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Result combination of BM25 (AND, OR) and language model (Dirichlet prior method) ranking. All ranking methods are performed on the whole collection together with anchor text.
THUADLMO
- Run ID: THUADLMO
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/3/2006
- Type: automatic
- Task: adhoc
- Run description: Result combination of BM25 and language model (Dirichlet prior method) ranking. Both ranking methods are performed on the whole collection together with anchor text.
THUADOR
- Run ID: THUADOR
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 ranking over the whole collection with in-link anchor text.
THUNPABS
- Run ID: THUNPABS
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. Several fields are extracted from the whole collection according to HTML structure (such as title and bold text), and query terms appearing in these fields are given a higher weight during the ranking process.
THUNPCOMB
- Run ID: THUNPCOMB
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Result combination of 3 runs: [1] THUNPABS; [2] a language model instead of BM25 ranking over the same data collection as THUNPABS; [3] result filtering based on THUNPABS, in which only results containing all query terms are retained. Results are ranked based on their different RSVs in these 3 runs.
THUNPNOSTOP
- Run ID: THUNPNOSTOP
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking (using the Tminer 3.0 system) over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. No stopwords are filtered in the indexing process.
THUNPTA3
- Run ID: THUNPTA3
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight. Repeated anchor text is removed from the corpus.
THUNPWP18
- Run ID: THUNPWP18
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: BM25 ranking over the whole collection together with in-link anchor text. Bi-gram matching is given a higher weight.
THUTeraEff01
- Run ID: THUTeraEff01
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 6/22/2006
- Task: efficiency
- Run description: BM25 ranking over both the anchor text and the content of the .GOV2 corpus.
THUTeraEff02
- Run ID: THUTeraEff02
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 6/22/2006
- Task: efficiency
- Run description: BM25 ranking over extracted abstracts of the .GOV2 corpus. Abstracts are extracted according to the document structure of web pages.
THUTeraEff03
- Run ID: THUTeraEff03
- Participant: tsinghuau.zhang
- Track: Terabyte
- Year: 2006
- Submission: 6/22/2006
- Task: efficiency
- Run description: Result combination according to the reciprocal rank of THUTeraEff01 and THUTeraEff02.
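One common way to combine two runs by reciprocal rank is sketched below; the exact formulation and tie-breaking used by the participants are not stated:

```python
# Sketch: each run contributes 1/rank per document; sum and re-sort.
def rr_combine(run_a, run_b, k=1000):
    """run_a, run_b: lists of docids, best first."""
    scores = {}
    for run in (run_a, run_b):
        for rank, doc in enumerate(run, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / rank
    return sorted(scores, key=scores.get, reverse=True)[:k]
```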
TWTB06AD01
- Run ID: TWTB06AD01
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: automatic
- Task: adhoc
- Run description: This is an automatic run which combined the dependence model with pseudo-relevance feedback.
TWTB06AD02
- Run ID: TWTB06AD02
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: manual
- Task: adhoc
- Run description: This is a manual run which made use of pseudo-relevance feedback.
TWTB06AD03
- Run ID: TWTB06AD03
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: manual
- Task: adhoc
- Run description: This is a manual run which is just a simple query likelihood run.
TWTB06AD04
- Run ID: TWTB06AD04
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: automatic
- Task: adhoc
- Run description: This is a title-only run which made use of dependence modeling.
TWTB06AD05
- Run ID: TWTB06AD05
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 6/28/2006
- Type: automatic
- Task: adhoc
- Run description: This run is a simple title-only query likelihood run.
TWTB06NP01
- Run ID: TWTB06NP01
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 7/19/2006
- Task: namedpage
- Run description: This is a run which uses document structure techniques.
TWTB06NP02
- Run ID: TWTB06NP02
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 7/19/2006
- Task: namedpage
- Run description: This is a run which uses the title field of documents.
TWTB06NP03
- Run ID: TWTB06NP03
- Participant: pekingu.yan
- Track: Terabyte
- Year: 2006
- Submission: 7/19/2006
- Task: namedpage
- Run description: This is a run which uses pagerank prior and the title field of documents.
UAmsT06a3SUM
- Run ID: UAmsT06a3SUM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Combination of (1) a full-text index run with weight 80%, (2) an extracted-titles index run with weight 10%, and (3) an extracted anchor-texts index run with weight 80%. All runs use a stemmed index and a language model with little smoothing (lambda = 0.9), no feedback.
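A sketch of this kind of weighted run combination, assuming each component run supplies comparable document scores (any score normalization is omitted here):

```python
# Sketch: linear combination of per-run document scores.
def weighted_sum(runs_with_weights, k=1000):
    """runs_with_weights: list of (weight, {docid: score})."""
    combined = {}
    for w, run in runs_with_weights:
        for doc, s in run.items():
            combined[doc] = combined.get(doc, 0.0) + w * s
    return sorted(combined.items(), key=lambda t: t[1], reverse=True)[:k]
```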
UAmsT06aAnLM
- Run ID: UAmsT06aAnLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Stemmed anchor-text index, using a language model with little smoothing (lambda = 0.9), no feedback.
UAmsT06aTDN
- Run ID: UAmsT06aTDN
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Stemmed full-text index, using a language model with little smoothing (lambda = 0.9), no feedback. Selected the 10 most characteristic terms from the TDN fields.
UAmsT06aTeLM
- Run ID: UAmsT06aTeLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Stemmed full-text index, using a language model with little smoothing (lambda = 0.9), no feedback.
UAmsT06aTTDN
- Run ID: UAmsT06aTTDN
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Combination of (1) a title-only run and (2) a TDN run using the 10 most characteristic terms from the TDN fields. Both use a stemmed full-text index and a language model with little smoothing, no feedback.
UAmsT06n3SUM
- Run ID: UAmsT06n3SUM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: CombSUM combination of full-text (0.8), titles (0.1) and anchors (0.1); all runs use a language model (lambda = 0.9) and a standard length prior.
UAmsT06nAnLM
- Run ID: UAmsT06nAnLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: Extracted anchor texts index (stemmed), using language model (lambda = .9, standard length prior).
UAmsT06nTeLM
- Run ID: UAmsT06nTeLM
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: Full text index (stemmed), using language model (lambda = .9, standard length prior).
UAmsT06nTurl
- Run ID: UAmsT06nTurl
- Participant: uamsterdam.ilps
- Track: Terabyte
- Year: 2006
- Submission: 7/29/2006
- Task: namedpage
- Run description: Full text index (stemmed), using language model (lambda = .9, standard length prior) and a URL-length prior.
uogTB06M
- Run ID: uogTB06M
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Divergence From Randomness weighting model with document structure
uogTB06MP
- Run ID: uogTB06MP
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Divergence From Randomness weighting model with document structure and precision enhancement model
uogTB06MPIA
- Run ID: uogTB06MPIA
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/30/2006
- Task: namedpage
- Run description: Divergence From Randomness weighting model with document structure, precision enhancement model, and query-independent evidence
uogTB06QET1
- Run ID: uogTB06QET1
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06QET2
- Run ID: uogTB06QET2
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06S50L
- Run ID: uogTB06S50L
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework and query reformulation with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06SS10L
- Run ID: uogTB06SS10L
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework, query reformulation and a new indexing procedure with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uogTB06SSQL
- Run ID: uogTB06SSQL
- Participant: uglasgow.ounis
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: DFR document weighting framework, query reformulation and a new indexing procedure with training on the previous Terabyte track adhoc queries. Experiments were done on the Glagrid. The hardware information and the running time are not available to us.
uwmtFadDS
- Run ID: uwmtFadDS
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/27/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 + additional weight for terms in title fields etc.
uwmtFadTPFB
- Run ID: uwmtFadTPFB
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/27/2006
- Type: automatic
- Task: adhoc
- Run description: BM25 + term proximity + pseudo-relevance feedback
uwmtFadTPRR
- Run ID: uwmtFadTPRR
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/24/2006
- Type: automatic
- Task: adhoc
- Run description: This is BM25 + term proximity for an initial result set. The top 10 documents from the initial result set are used to build a language model which is then employed to rerank all documents in the result set according to their divergence from this language model.
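A minimal sketch of this reranking idea, assuming a unigram feedback model with additive smoothing and KL divergence as the distance (both assumptions; the Waterloo implementation details are not given here). Documents whose language diverges least from the feedback model rise to the top.

```python
import math
from collections import Counter

def unigram_lm(texts, mu=0.01):
    """Unigram model over the feedback documents with additive smoothing
    (mu is an assumed constant)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    total, vocab_size = sum(counts.values()), len(counts)
    prob = lambda term: (counts[term] + mu) / (total + mu * vocab_size)
    return prob, list(counts)

def kl_divergence(prob, doc_text, vocab, mu=0.01):
    """KL(feedback || document) over the feedback vocabulary."""
    doc_counts = Counter(doc_text.lower().split())
    doc_total = sum(doc_counts.values())
    kl = 0.0
    for term in vocab:
        p = prob(term)
        q = (doc_counts[term] + mu) / (doc_total + mu * len(vocab))
        kl += p * math.log(p / q)
    return kl

def rerank(initial_results, doc_texts, fb_docs=10):
    """Rerank the whole result set by divergence from the top-10 model."""
    prob, vocab = unigram_lm([doc_texts[d] for d in initial_results[:fb_docs]])
    return sorted(initial_results,
                  key=lambda d: kl_divergence(prob, doc_texts[d], vocab))
```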
uwmtFcompI0¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI0
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompI1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompI2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompI3¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompI3
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompW3¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompW3
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/7/2006
- Task: comp_eff
uwmtFcompZ0¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ0
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFcompZ1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFcompZ2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFcompZ3¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFcompZ3
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 8/13/2006
- Task: comp_eff
uwmtFdcp03¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFdcp03
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/15/2006
- Task: efficiency
- Run description: This run uses a pruned in-memory index containing the top 3% of terms from every document; vbyte is used for index compression.
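vbyte (variable-byte) coding is a standard postings-compression scheme: each integer is split into 7-bit chunks, one per byte, with one bit reserved to mark integer boundaries. A minimal sketch of one common layout (the exact layout used by this run is not documented here):

```python
def vbyte_encode(numbers):
    """Low 7 bits of each byte carry payload; a set high bit marks the
    final byte of each integer (one common convention)."""
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)  # terminator byte
    return bytes(out)

def vbyte_decode(data):
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:  # final byte of this integer
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers

assert vbyte_decode(vbyte_encode([3, 130, 70000])) == [3, 130, 70000]
```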
uwmtFdcp06¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFdcp06
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/16/2006
- Task: efficiency
- Run description: This run uses a pruned in-memory index containing the top 6% of terms from every document; vbyte is used for index compression.
uwmtFdcp12¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFdcp12
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/15/2006
- Task: efficiency
- Run description: This run uses a pruned in-memory index containing the top 12% of terms from every document. Index compression is done using a length-limited Huffman code.
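Document-centric pruning of this kind keeps only a fixed fraction of each document's distinct terms and discards the remaining postings. A toy sketch, assuming raw term frequency as the ranking criterion (the criterion actually used by the uwmtFdcp runs may differ, e.g. an impact-based score):

```python
from collections import Counter

def prune_document(text, keep_fraction=0.12):
    """Keep only the top `keep_fraction` of a document's distinct terms,
    ranked here by raw term frequency; postings for the rest are dropped."""
    counts = Counter(text.lower().split())
    keep = max(1, int(len(counts) * keep_fraction))
    return dict(counts.most_common(keep))
```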
uwmtFmanual¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFmanual
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/30/2006
- Type: manual
- Task: adhoc
- Run description: This run is a combination of a manual run, in which a bored graduate student tried to find relevant documents by hand, and a few automatic runs, merged together. It makes use of the full topic statements, TDN.
uwmtFnoprune¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnoprune
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 6/15/2006
- Task: efficiency
- Run description: Frequency index, compressed using vbyte. Frequencies of terms appearing in special parts of the document (title, headlines, etc.) are boosted.
uwmtFnpsRR1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnpsRR1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/26/2006
- Task: namedpage
- Run description: BM25 with weighted fields and local reranking based on links and anchor text as a second step.
uwmtFnpstr1¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnpstr1
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/11/2006
- Task: namedpage
- Run description: BM25 with extra weight for terms within special HTML tags.
uwmtFnpstr2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: uwmtFnpstr2
- Participant: uwaterloo-clarke
- Track: Terabyte
- Year: 2006
- Submission: 7/11/2006
- Task: namedpage
- Run description: BM25 + extra weight according to document structure. Integrated duplicate elimination.
wumpus¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: wumpus
- Participant: max-planck.theobald
- Track: Terabyte
- Year: 2006
- Submission: 9/5/2006
- Task: comp_eff
zetabm¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetabm
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair probabilistic model (BM25) run
zetadir¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetadir
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair language model (Dirichlet smoothing) run
zetaman¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetaman
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: manual
- Task: adhoc
- Run description: Zettair manual run using title plus salient keywords from description and narrative
zetamerg¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetamerg
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair and Indri merged run (round-robin merge)
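Round-robin merging interleaves the two ranked lists, taking one document from each in turn. A minimal sketch (the duplicate handling and run depth are assumptions):

```python
def round_robin_merge(run_a, run_b, depth=1000):
    """Interleave two ranked lists of doc IDs, skipping duplicates."""
    merged, seen = [], set()
    for pair in zip(run_a, run_b):
        for doc_id in pair:
            if doc_id not in seen and len(merged) < depth:
                seen.add(doc_id)
                merged.append(doc_id)
    # one list may be longer than the other; drain the remainder
    for doc_id in run_a[len(run_b):] + run_b[len(run_a):]:
        if doc_id not in seen and len(merged) < depth:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged
```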
zetamerg2¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetamerg2
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/2/2006
- Type: automatic
- Task: adhoc
- Run description: Zettair and Indri merged run 2 (merged on score)
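Merging on score requires making scores from two different engines comparable. A sketch assuming min-max normalization within each run and taking the maximum when a document appears in both (neither choice is documented for this run):

```python
def score_merge(run_a, run_b, depth=1000):
    """Merge two ranked lists of (doc_id, score) pairs on score."""
    def normalize(run):
        scores = [s for _, s in run]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in run}
    combined = normalize(run_a)
    for doc_id, score in normalize(run_b).items():
        combined[doc_id] = max(combined.get(doc_id, 0.0), score)
    return sorted(combined, key=combined.get, reverse=True)[:depth]
```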
zetnpbm¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpbm
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Baseline Zettair run (using BM25)
zetnpfa¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpfa
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Mixed retrieval of full text and anchor text
zetnpft¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpft
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Mixed retrieval of full-text index and tags (TITLE and Hx)
zetnpfta¶
Results
| Participants
| Proceedings
| Input
| Summary
| Appendix
- Run ID: zetnpfta
- Participant: rmit.scholer
- Track: Terabyte
- Year: 2006
- Submission: 7/27/2006
- Task: namedpage
- Run description: Mixed retrieval of full text, anchor text and tags (TITLE and Hx)