Proceedings - Query 1999

The TREC-8 Query Track

Chris Buckley, Janet A. Walz

Abstract

The Query Track in TREC-8 is a bit different from all the other tracks. It is a cooperative effort among the participating groups to look at the issue of 'query variability'. The evaluation averages presented in a typical system evaluation task, such as the TREC Ad-Hoc Task, conceal a tremendous variability of system performance across topics/queries. No system can possibly perform equally well on all topics: some information needs (expressed by topics) are harder than others. But what is quite surprising, especially to people just starting to look at IR, is how much a system's performance varies across topics relative to other systems. In a typical TREC task, no system is the best on all of the topics, and it is extremely rare for any system to be above average on all of them. Instead, the best system is normally above average for most of the topics and best for perhaps 5%-10% of them. It very often happens that distinctly below-average systems are also best for 5%-10% of the topics but do poorly on the rest. The Average Precision Histograms presented on the TREC evaluation result pages are an attempt to show what is happening at the individual topic level. This large topic/query variability presents a great opportunity for improving system performance: if we can understand why some systems do well on some queries but poorly on others, then we can start introducing query-dependent processing to improve results on the poorly performing queries. Unfortunately, the results of a typical TREC task do not give us enough information to really understand what is happening; results on 50 to 150 queries are simply not enough to draw conclusions from. The Query Track at TREC is an attempt to gather enough information, from a large number of systems on a large number of queries, to begin understanding query variability.
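
To make the per-topic analysis concrete, here is a minimal Python sketch, under a hypothetical data layout, of the two quantities the abstract discusses: average precision computed topic by topic, and each system's per-topic difference from the median system score, which is what the Average Precision Histograms display.

    from statistics import median

    def average_precision(ranking, relevant):
        """AP for one topic: sum of the precision values at each rank
        where a relevant document appears, divided by the total number
        of relevant documents."""
        hits, precision_sum = 0, 0.0
        for rank, doc in enumerate(ranking, start=1):
            if doc in relevant:
                hits += 1
                precision_sum += hits / rank
        return precision_sum / len(relevant) if relevant else 0.0

    def diff_from_median(runs, qrels):
        """runs[system][topic] is a ranked list of document ids; qrels[topic]
        is the set of relevant ids. Returns, per system and topic, AP minus
        the median AP of all systems on that topic."""
        ap = {s: {t: average_precision(r[t], qrels[t]) for t in qrels}
              for s, r in runs.items()}
        medians = {t: median(ap[s][t] for s in ap) for t in qrels}
        return {s: {t: ap[s][t] - medians[t] for t in qrels} for s in ap}

Even without plotting, scanning these differences system by system makes the variability plain: a strong system still falls below the median on a sizable fraction of the topics.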

BibTeX
@inproceedings{DBLP:conf/trec/BuckleyW99,
    author = {Chris Buckley and Janet A. Walz},
    editor = {Ellen M. Voorhees and Donna K. Harman},
    title = {The {TREC-8} Query Track},
    booktitle = {Proceedings of The Eighth Text REtrieval Conference, {TREC} 1999, Gaithersburg, Maryland, USA, November 17-19, 1999},
    series = {{NIST} Special Publication},
    volume = {500-246},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {1999},
    url = {http://trec.nist.gov/pubs/trec8/papers/qtrack.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BuckleyW99.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

TREC-8 Ad-Hoc, Query and Filtering Track Experiments using PIRCS

K. L. Kwok, Laszlo Grunfeld, M. Chan

Abstract

In TREC-8, we participated in automatic ad-hoc retrieval as well as the Query and Filtering Tracks. The theme of our participation is 'retrieval lists combination', and the technique is applied throughout our experiments to various degrees. We point out that our PIRCS system may be considered a combination of the probabilistic retrieval model and a language-model approach. For ad-hoc, three types of experiments were done with short, medium and long queries, as before. The general approach is similar to TREC-7, but combinations of retrieval lists from different query types were used to boost effectiveness. For the Query Track, we submitted one short-query set and performed retrieval for twenty-one natural-language query variants. For the Filtering Track, experiments for adaptive filtering, batch filtering, and routing were performed. For adaptive filtering, the list of historically selected documents was used to train profile term weights and to dynamically vary the retrieval status value (RSV) threshold for deciding document selection during the course of filtering. For batch filtering, Financial Times FT92 data was used to define six retrieval profiles whose results were combined with coefficients trained via a genetic algorithm; logistic regression transforms RSVs to probabilities. Routing was done similarly, with additional training data obtained from non-FT collections, and two additional profiles were defined and combined.
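
The two combination ingredients named above, a logistic transform from RSVs to probabilities and a weighted fusion of several retrieval lists, fit in a few lines. The Python sketch below uses illustrative names and fixed weights; in the paper the combination coefficients are trained with a genetic algorithm, which is omitted here.

    import math

    def rsv_to_prob(rsv, a, b):
        # Logistic calibration: P(relevant) = 1 / (1 + exp(-(a + b * rsv)))
        return 1.0 / (1.0 + math.exp(-(a + b * rsv)))

    def combine(lists, weights, a=0.0, b=1.0):
        """lists: sequence of {doc_id: rsv} dicts, one per retrieval list.
        Returns documents ranked by the weighted sum of calibrated scores."""
        fused = {}
        for scores, w in zip(lists, weights):
            for doc, rsv in scores.items():
                fused[doc] = fused.get(doc, 0.0) + w * rsv_to_prob(rsv, a, b)
        return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

Calibrating each list onto a common probability scale before summing is what makes lists built from different query types or profiles commensurable.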

BibTeX
@inproceedings{DBLP:conf/trec/KwokGC99,
    author = {K. L. Kwok and Laszlo Grunfeld and M. Chan},
    editor = {Ellen M. Voorhees and Donna K. Harman},
    title = {{TREC-8} Ad-Hoc, Query and Filtering Track Experiments using {PIRCS}},
    booktitle = {Proceedings of The Eighth Text REtrieval Conference, {TREC} 1999, Gaithersburg, Maryland, USA, November 17-19, 1999},
    series = {{NIST} Special Publication},
    volume = {500-246},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {1999},
    url = {http://trec.nist.gov/pubs/trec8/papers/queenst8.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/KwokGC99.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

ACSys TREC-8 Experiments

David Hawking, Peter Bailey, Nick Craswell

Abstract

Experiments relating to TREC-8 Ad Hoc, Web Track (Large and Small) and Query Track tasks are described and results reported. Due to time constraints, only minimal effort was put into Ad Hoc and Query Track participation. In the Web Track, Google-style PageRanks were calculated for all 18.5 million pages in the VLC2 collection and for the 0.25 million pages in the WT2g collection. Various combinations of content score and PageRank produced no benefit for TREC style ad hoc retrieval. A major goal in the Web Track was to make engineering improvements to permit indexing of the 100 gigabyte collection and subsequent query processing using a single PC. A secondary goal was to achieve last year's performance (obtained with eight DEC Alphas) with less recourse to effectiveness-harming optimisations. The main goal was achieved and indexing times are comparable to last year's. However, effectiveness results were worse relative to last year and query processing times were approximately double.
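
For reference, the 'Google-style PageRank' mentioned above can be computed by simple power iteration. The Python sketch below works on an in-memory link dictionary with illustrative damping and iteration settings; it is not the out-of-core computation that 18.5 million VLC2 pages would actually require.

    def pagerank(links, damping=0.85, iterations=50):
        """links[p]: list of pages p links to. Returns page -> score."""
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new = {p: (1.0 - damping) / n for p in pages}
            for p, outs in links.items():
                if outs:
                    share = damping * rank[p] / len(outs)
                    for q in outs:
                        if q in new:  # ignore links leaving the collection
                            new[q] += share
                else:  # dangling page: spread its mass uniformly
                    for q in pages:
                        new[q] += damping * rank[p] / n
            rank = new
        return rank

A combined run is then just a mixture such as alpha * content_score + (1 - alpha) * pagerank_score; as noted above, no such mixture helped for TREC-style ad hoc retrieval.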

BibTeX
@inproceedings{DBLP:conf/trec/HawkingBC99,
    author = {David Hawking and Peter Bailey and Nick Craswell},
    editor = {Ellen M. Voorhees and Donna K. Harman},
    title = {ACSys {TREC-8} Experiments},
    booktitle = {Proceedings of The Eighth Text REtrieval Conference, {TREC} 1999, Gaithersburg, Maryland, USA, November 17-19, 1999},
    series = {{NIST} Special Publication},
    volume = {500-246},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {1999},
    url = {http://trec.nist.gov/pubs/trec8/papers/acsys.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/HawkingBC99.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

SMART in TREC 8

Chris Buckley, Janet A. Walz

Abstract

This year was a light year for the Smart Information Retrieval Project at SabIR Research and Cornell. We officially participated in only the Ad-hoc Task and the Query Track. In the Ad-hoc Task, we made minor modifications to our document weighting schemes to emphasize high-precision searches on shorter queries. This proved only mildly successful: the top relevant document was ranked higher, but performance over the rest of the ranking tended to suffer. Our Query Track runs are described here, but the much more interesting analysis of those runs appears in the Query Track Overview.
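
The abstract does not spell out which weighting modifications were tried, but SMART document weighting is commonly described by the Lnu pivoted unique-term normalization scheme, so the sketch below shows that baseline only; the slope value and all names are illustrative assumptions, not the settings of the TREC-8 runs.

    import math

    def lnu_weight(tf, avg_tf_in_doc, n_unique, pivot, slope=0.2):
        """Lnu: log-averaged term frequency, no idf, pivoted unique-term
        normalization. pivot is the collection-wide average number of
        unique terms per document. Illustrative only."""
        if tf <= 0:
            return 0.0
        l = (1.0 + math.log(tf)) / (1.0 + math.log(avg_tf_in_doc))
        u = (1.0 - slope) * pivot + slope * n_unique
        return l / u

The slope is one common knob here: larger values penalize documents with many unique terms more heavily, which tends to favor precision at the top of the ranking.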

BibTeX
@inproceedings{DBLP:conf/trec/BuckleyW99a,
    author = {Chris Buckley and Janet A. Walz},
    editor = {Ellen M. Voorhees and Donna K. Harman},
    title = {{SMART} in {TREC} 8},
    booktitle = {Proceedings of The Eighth Text REtrieval Conference, {TREC} 1999, Gaithersburg, Maryland, USA, November 17-19, 1999},
    series = {{NIST} Special Publication},
    volume = {500-246},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {1999},
    url = {http://trec.nist.gov/pubs/trec8/papers/sabir8.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BuckleyW99a.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

INQUERY and TREC-8

James Allan, James P. Callan, Fangfang Feng, Daniella Malin

Abstract

This year the Center for Intelligent Information Retrieval (CIIR) at the University of Massachusetts participated in seven of the tracks: ad-hoc, filtering, spoken document retrieval, small web, large web, question answering, and the query track. We spent significant time working on the filtering track, resulting in substantial performance improvement over TREC-7. For all of the other tracks, we used essentially the same system as in previous years. In the next section, we describe some of the basic processing that was applied across most of the tracks. We then describe the details for each track and, in some cases, present a modest analysis of the effectiveness of our results.

BibTeX
@inproceedings{DBLP:conf/trec/AllanCFM99,
    author = {James Allan and James P. Callan and Fangfang Feng and Daniella Malin},
    editor = {Ellen M. Voorhees and Donna K. Harman},
    title = {{INQUERY} and {TREC-8}},
    booktitle = {Proceedings of The Eighth Text REtrieval Conference, {TREC} 1999, Gaithersburg, Maryland, USA, November 17-19, 1999},
    series = {{NIST} Special Publication},
    volume = {500-246},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {1999},
    url = {http://trec.nist.gov/pubs/trec8/papers/trec8-umass.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/AllanCFM99.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}