Text REtrieval Conference (TREC) 2006

Terabyte

Overview | Proceedings | Data | Results | Runs | Participants

The primary goal of the Terabyte Track is to develop an evaluation methodology for terabyte-scale document collections. In addition, we are interested in efficiency and scalability issues, which can be studied more easily in the context of a larger collection.

Track coordinator(s):

  • S. Büttcher, University of Waterloo
  • C.L.A. Clarke, University of Waterloo
  • I. Soboroff, National Institute of Standards and Technology (NIST)

Track Web Page: https://plg.uwaterloo.ca/~claclark/TB06.html


Spam

Overview | Proceedings | Data | Results | Runs | Participants

The 2006 track will reprise the 2005 experiments with new filters and data, and will also investigate delayed feedback and active learning.

Track coordinator(s):

  • G. Cormack, University of Waterloo

Track Web Page: https://plg.uwaterloo.ca/~gvcormac/spam/


Genomics

Overview | Proceedings | Data | Results | Runs | Participants

The TREC Genomics Track implemented a new task in 2006 that focused on passage retrieval for question answering using full-text documents from the biomedical literature. A test collection of 162,259 full-text documents and 28 topics expressed as questions was assembled. Systems were required to return passages that contained answers to the questions. Expert judges determined the relevance of passages and grouped them into aspects identified by one or more Medical Subject Headings (MeSH) terms. Document relevance was defined by the presence of one or more relevant aspects. The performance of submitted runs was scored using mean average precision (MAP) at the passage, aspect, and document level. In general, passage MAP was low, while aspect and document MAP were somewhat higher.
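Mean average precision, as used above for document-level scoring, can be sketched as follows. This is a minimal illustration, not the track's official evaluation code (which also handles passage- and aspect-level scoring); the function and variable names are illustrative.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one topic: the mean of precision@k at
    each rank k where a relevant document appears, normalized by the
    total number of relevant documents."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs, qrels):
    """MAP over all topics. `runs` maps each topic to a ranked list of
    document ids; `qrels` maps each topic to its set of relevant ids."""
    return sum(average_precision(runs.get(t, []), qrels[t])
               for t in qrels) / len(qrels)
```

For example, a run that ranks one relevant document first and a second relevant document third scores (1/1 + 2/3) / 2 ≈ 0.83 for that topic.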

Track coordinator(s):

  • W. Hersh, Oregon Health & Science University
  • A.M. Cohen, Oregon Health & Science University
  • P. Roberts, Oregon Health & Science University
  • H.K. Rekapalli, Oregon Health & Science University

Track Web Page: https://dmice.ohsu.edu/trec-gen/


Enterprise

Overview | Proceedings | Data | Results | Runs | Participants

The Enterprise Track began in TREC 2005 as the successor to the Web Track, and this is reflected in its tasks and measures. While the track takes much of its inspiration from the Web Track, its focus is on search at the enterprise scale: incorporating non-web data and discovering relationships between entities in the organization. As a result, we have created the first test collections for multi-user email search and expert finding.

Track coordinator(s):

  • I. Soboroff, National Institute of Standards and Technology (NIST)
  • A.P. de Vries, CWI
  • N. Craswell, Microsoft Cambridge

Track Web Page: https://trec.nist.gov/data/enterprise.html


Blog

Overview | Proceedings | Data | Results | Runs | Participants

The Blog Track began this year with the aim of exploring information-seeking behaviour in the blogosphere. For this purpose, a new large-scale test collection, the TREC Blog06 collection, was created. In this first pilot run of the track in 2006, there were two tasks: a main task (opinion retrieval) and an open task. The opinion retrieval task focuses on a specific aspect of blogs: the opinionated nature of many of them. The open task was introduced to give participants the opportunity to shape a suitable second task for 2007 around other aspects of blogs, such as their temporal/event-related nature or the severity of spam in the blogosphere.

Track coordinator(s):

  • I. Ounis, University of Glasgow
  • C. Macdonald, University of Glasgow
  • M. de Rijke, University of Amsterdam
  • G. Mishne, University of Amsterdam
  • I. Soboroff, National Institute of Standards and Technology (NIST)

Track Web Page: https://www.dcs.gla.ac.uk/wiki/TREC-BLOG


Question Answering

Overview | Proceedings | Data | Runs | Participants

The goal of the TREC QA track is to foster research on systems that retrieve answers rather than documents in response to a question. The focus is on systems that can function in unrestricted domains.

Track coordinator(s):

  • H.T. Dang, National Institute of Standards and Technology (NIST)
  • J. Lin, University of Maryland, College Park
  • D. Kelly, University of North Carolina, Chapel Hill

Track Web Page: https://trec.nist.gov/data/qa/2006_qadata/qa.06.guidelines.html


Legal

Overview | Proceedings | Data | Results | Runs | Participants

The goal of the Legal Track at the Text Retrieval Conference (TREC) is to assess the ability of information retrieval techniques to meet the needs of the legal profession for tools and methods capable of helping with the retrieval of electronic business records, principally for use as evidence in civil litigation. In the USA, this problem is referred to as e-discovery. Like all TREC tracks, the Legal Track seeks to foster the development of a research community by providing a venue for shared development of evaluation resources (test collections) and baseline results to which future results can be compared.

Track coordinator(s):

  • J.R. Baron, National Archives and Records Administration
  • D.D. Lewis, David D. Lewis Consulting
  • D.W. Oard, University of Maryland

Track Web Page: http://trec-legal.umiacs.umd.edu/