Proceedings - Question Answering 2002

Overview of the TREC 2002 Question Answering Track

Ellen M. Voorhees

Abstract

The TREC question answering track is an effort to bring the benefits of large-scale evaluation to bear on the question answering problem. The track contained two tasks in TREC 2002, the main task and the list task. Both tasks required that the answer strings returned by the systems consist of nothing more or less than an answer in contrast to the text snippets containing an answer allowed in previous years. A new evaluation measure in the main task, the confidence-weighted score, tested a system's ability to recognize when it has found a correct answer.
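
Concretely, the confidence-weighted score orders a system's Q answers from most to least confident and averages, over every rank i, the fraction of the first i answers that are correct. A minimal Python sketch of this computation (function and variable names are ours, not the track's):

    def confidence_weighted_score(judgments):
        """judgments: one boolean per question, sorted from the system's
        most-confident answer to its least-confident answer."""
        total, correct = 0.0, 0
        for i, is_correct in enumerate(judgments, start=1):
            if is_correct:
                correct += 1
            total += correct / i          # precision of the top-i prefix
        return total / len(judgments)    # average over all Q questions

    # Putting correct answers where confidence is high pays off:
    print(confidence_weighted_score([True, True, False]))   # ~0.889
    print(confidence_weighted_score([False, True, True]))   # ~0.389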

Bibtex
@inproceedings{DBLP:conf/trec/Voorhees02a,
    author = {Ellen M. Voorhees},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Overview of the {TREC} 2002 Question Answering Track},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/QA11.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Voorhees02a.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Web Based Pattern Mining and Matching Approach to Question Answering

Dell Zhang, Wee Sun Lee

Abstract

We describe a Web-based pattern mining and matching approach to question answering. For each type of question, many textual patterns can be learned automatically from the Web, using the TREC QA track data as training examples. These textual patterns are assessed by the notions of support and confidence, which are borrowed from the data mining community. Given a new, unseen question, the patterns can be used to extract and rank plausible answers from the Web. The performance of this approach has also been evaluated on the TREC QA track data.
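
In the data-mining sense used here, a pattern's support measures how often it fires on the training question-answer pairs, and its confidence how often a firing yields the correct answer. A sketch of that assessment step in Python; the '<A>' answer-slot convention and the regex that fills it are our illustrative assumptions, not the paper's notation:

    import re

    def assess_pattern(pattern, training_pairs):
        """Return (support, confidence) for one textual pattern.
        training_pairs: (snippet, known_answer) strings mined from the Web;
        the pattern must contain '<A>' marking the answer slot."""
        regex = re.compile(pattern.replace("<A>", r"(\w[\w\s]*)"))
        matches = correct = 0
        for snippet, answer in training_pairs:
            m = regex.search(snippet)
            if m:
                matches += 1
                if m.group(1).strip() == answer:
                    correct += 1
        support = matches / len(training_pairs)
        confidence = correct / matches if matches else 0.0
        return support, confidence

    pairs = [("Mozart was born in 1756", "1756"),
             ("Mozart died in 1791", "1756")]
    print(assess_pattern("born in <A>", pairs))   # -> (0.5, 1.0)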

Bibtex
@inproceedings{DBLP:conf/trec/ZhangL02,
    author = {Dell Zhang and Wee Sun Lee},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Web Based Pattern Mining and Matching Approach to Question Answering},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/nus.web.zhang.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ZhangL02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Natural Language Based Reformulation Resource and Wide Exploitation for Question Answering

Ulf Hermjakob, Abdessamad Echihabi, Daniel Marcu

Abstract

We describe and evaluate how a generalized natural language based reformulation resource in our TextMap question answering system improves web exploitation and answer pinpointing. The reformulation resource, which can be viewed as a clausal extension of WordNet, supports high-precision syntactic and semantic reformulations of questions and other sentences, as well as inferencing and answer generation. The paper shows in some detail how these reformulations can be used to overcome challenges and benefit from the advantages of using the Web.

Bibtex
@inproceedings{DBLP:conf/trec/HermjakobEM02,
    author = {Ulf Hermjakob and Abdessamad Echihabi and Daniel Marcu},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Natural Language Based Reformulation Resource and Wide Exploitation for Question Answering},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/usc.hermjakob.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/HermjakobEM02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

IBM's Statistical Question Answering System-TREC 11

Abraham Ittycheriah, Salim Roukos

Abstract

In this paper, we document our efforts to extend our statistical question answering system for TREC-11. We incorporated a web search feature, novel extensions of statistical machine translation, and the extraction of lexical patterns for exact answers from a supervised corpus. Without modifying our base set of thirty-one categories, we achieved a confidence-weighted score of 0.455 and an accuracy of 29%. We improved our model for selecting exact answers by insisting on exact answers in the training corpus; this resulted in a 7% gain on TREC-11 but a much larger gain of 46% on TREC-10.

Bibtex
@inproceedings{DBLP:conf/trec/IttycheriahR02,
    author = {Abraham Ittycheriah and Salim Roukos},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {IBM's Statistical Question Answering System-TREC 11},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/ibm.ittycheriah.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/IttycheriahR02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

A Machine Learning Approach for QA and Novelty Tracks: NTT System Description

Hideto Kazawa, Tsutomu Hirao, Hideki Isozaki, Eisaku Maeda

Abstract

In one sense, the goals of the QA and Novelty tasks are the same: extracting small document parts that are relevant to users' queries. Additionally, the unit of extraction is almost always fixed in both tasks: for QA, an answer is a noun phrase in most cases, and for Novelty, a sentence is recognized as the basic information unit. This observation leads us to the following unified approach to both tasks: first identify information units in documents, then judge whether each unit is relevant to the query. This two-step approach is amenable to machine learning methods because each step can be cast as a classification problem. For example, noun phrase identification can be achieved by classifying each word as the start/middle/end/exterior of a noun phrase, and sentence identification by classifying whether each period marks the end of a sentence. Additionally, relevance judgment can be regarded as classifying a query-unit pair as relevant or non-relevant. In the QA and Novelty tracks at TREC 2002, we studied the feasibility of this two-step approach, using Support Vector Machines as the learning algorithm for the classifiers. Since many studies on identifying information units have already been reported, this paper concentrates on the relevance judgment step in the QA and Novelty tasks.
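
The relevance judgment step described above reduces to binary classification of (query, unit) pairs. A minimal sketch using scikit-learn's SVM; the two features and the toy training data are our stand-ins for the paper's actual feature set:

    from sklearn.svm import SVC

    def pair_features(query, unit):
        # Illustrative features only: term overlap and relative unit length.
        q, u = set(query.lower().split()), set(unit.lower().split())
        return [len(q & u) / max(len(q), 1), len(u) / (len(q) + len(u))]

    train = [("who wrote hamlet", "shakespeare wrote hamlet", 1),
             ("who wrote hamlet", "the play opened in london", 0)]
    X = [pair_features(q, u) for q, u, _ in train]
    y = [label for _, _, label in train]
    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict([pair_features("who wrote hamlet",
                                     "hamlet is by shakespeare")]))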

Bibtex
@inproceedings{DBLP:conf/trec/KazawaHIM02,
    author = {Hideto Kazawa and Tsutomu Hirao and Hideki Isozaki and Eisaku Maeda},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {A Machine Learning Approach for {QA} and Novelty Tracks: {NTT} System Description},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/nttcom.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/KazawaHIM02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Extracting Answers from the Web Using Data Annotation and Knowledge Mining Techniques

Jimmy Lin, Aaron Fernandes, Boris Katz, Gregory Marton, Stefanie Tellex

Abstract

Aranea is a question answering system that extracts answers from the World Wide Web using knowledge annotation and knowledge mining techniques. Knowledge annotation, which utilizes semistructured database techniques, is effective for answering large classes of commonly occurring questions. Knowledge mining, which utilizes statistical techniques, can leverage the massive amounts of data available on the Web to overcome many natural language processing challenges. Aranea integrates these two different paradigms of question answering into a single framework. For the TREC evaluation, we also explored the problem of answer projection, or finding supporting documents for our Web-derived answers from the AQUAINT corpus.
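
Answer projection, as described, must locate a corpus document that supports a web-derived answer. A naive sketch of the idea; the scoring heuristic is our assumption, not Aranea's actual method:

    def project_answer(answer, question_terms, corpus_docs):
        """Pick the document that contains the web-derived answer string
        plus the largest number of question terms. corpus_docs maps a
        document id to its text."""
        best, best_score = None, -1
        for doc_id, text in corpus_docs.items():
            t = text.lower()
            if answer.lower() not in t:
                continue
            score = sum(term.lower() in t for term in question_terms)
            if score > best_score:
                best, best_score = doc_id, score
        return best

    docs = {"APW001": "Bell invented the telephone in 1876.",
            "APW002": "Telephones are common."}
    print(project_answer("Bell", ["invented", "telephone"], docs))  # APW001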

Bibtex
@inproceedings{DBLP:conf/trec/LinFKMT02,
    author = {Jimmy Lin and Aaron Fernandes and Boris Katz and Gregory Marton and Stefanie Tellex},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Extracting Answers from the Web Using Data Annotation and Knowledge Mining Techniques},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/mit.lin.pdf},
    timestamp = {Fri, 27 Aug 2021 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/LinFKMT02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering Using XML-Tagged Documents

Kenneth C. Litkowski

Abstract

The official submission for CL Research's question-answering system (DIMAP-QA) for TREC-11 only slightly extends its semantic relation triple (logical form) technology in which documents are fully parsed and databases built around discourse entities. We were unable to complete the planned revision of our system based on a fuller discourse analysis of the texts. We have since implemented many of these changes and can now report preliminary and encouraging results of basing our system on XML markup of texts with syntactic and semantic attributes and use of XML stylesheet functionality (specifically, XPath expressions) to answer questions. The official confidence-weighted score for the main TREC-11 QA task was 0.049, based on processing 20 of the top 50 documents provided by NIST. Our estimated mean reciprocal rank was 0.128 for the exact answers and 0.227 for sentence answers, comparable to our results from previous years. With our revised XML-based system, using a 20 percent sample of the TREC questions, we have an estimated confidence-weighted score of 0.869 and mean reciprocal rank of 0.828. We describe our system and examine the results from XML tagging in terms of question-answering and other applications such as information extraction, text summarization, novelty studies, and investigation of linguistic phenomena.
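
The XPath idea is straightforward: once texts carry syntactic and semantic markup, candidate answers of the required type can be pulled out with a path query. A tiny sketch with lxml; the tag and attribute names are invented for illustration and do not reflect DIMAP-QA's actual schema:

    from lxml import etree

    doc = etree.fromstring(
        "<text><sent><ne type='person'>Ellen Voorhees</ne> "
        "organized the track.</sent></text>")

    # An XPath expression selects candidate answers of one semantic type.
    for ne in doc.xpath("//ne[@type='person']"):
        print(ne.text)   # -> Ellen Voorhees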

Bibtex
@inproceedings{DBLP:conf/trec/Litkowski02,
    author = {Kenneth C. Litkowski},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering Using XML-Tagged Documents},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/clresearch.litkowski.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Litkowski02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The University of Sheffield TREC 2002 Q&A System

Mark A. Greenwood, Ian Roberts, Robert J. Gaizauskas

Abstract

The system entered by the University of Sheffield in the question answering track of TREC 2002 represents a significant development over the Sheffield systems entered in TREC-8 [9] and TREC-9 [15], although the underlying architecture remains the same. The essence of the approach is to pass the question to an information retrieval (IR) system, which uses it as a query to perform passage retrieval against the text collection. The top-ranked passages output by the IR system are then passed to a modified information extraction (IE) system. Syntactic and semantic analysis of these passages, along with the question, is carried out to identify the “sought entity” of the question and to score potential matches for this sought entity in each of the retrieved passages. The potential matches are then combined or discarded based on a number of criteria, and the highest scoring match is proposed as the answer to the question.

Bibtex
@inproceedings{DBLP:conf/trec/GreenwoodRG02,
    author = {Mark A. Greenwood and Ian Roberts and Robert J. Gaizauskas},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The University of Sheffield {TREC} 2002 Q{\&}A System},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/sheffield.greenwood.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/GreenwoodRG02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

University of Alicante Experiments at TREC 2002

José Luis Vicedo González, Fernando Llopis, Antonio Ferrández Rodríguez

Abstract

This paper describes the architecture, operation, and results of the Question Answering prototype developed in the Department of Language Processing and Information Systems at the University of Alicante. The system is based on our TREC-10 approach, into which several improvements have been introduced. The main modifications are the introduction of a filtering stage into the paragraph selection and answer extraction modules, which allows the treatment of questions with no answer in the document collection. Moreover, WordNet has been enhanced by adding a collection of gazetteers that includes several types of proper nouns (people, organisations, and places) and a large variety of acronyms and measure and money units.

Bibtex
@inproceedings{DBLP:conf/trec/GonzalezLR02,
    author = {Jos{\'{e}} Luis Vicedo Gonz{\'{a}}lez and Fernando Llopis and Antonio Ferr{\'{a}}ndez Rodr{\'{\i}}guez},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {University of Alicante Experiments at {TREC} 2002},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/ualicante.vicedo.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/GonzalezLR02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Novel Results and Some Answers - The University of Iowa TREC 11 Results

David Eichmann, Padmini Srinivasan

Abstract

The University of Iowa participated in the novelty, adaptive filtering and question answering tracks of TREC-11. The filtering system used was an extension of the one used in TREC-7 [1] and TREC-8 [2]. Question answering was derived from the TREC-10 system. The novelty system was new...

Bibtex
@inproceedings{DBLP:conf/trec/EichmannS02,
    author = {David Eichmann and Padmini Srinivasan},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Novel Results and Some Answers - The University of Iowa {TREC} 11 Results},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/uiowa.eichmann.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/EichmannS02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering: CNLP at the TREC 2002 Question Answering Track

Anne Diekema, Jiangping Chen, Nancy J. McCracken, Necati Ercan Ozgencil, Mary D. Taffet, Özgür Yilmazel, Elizabeth D. Liddy

Abstract

This paper describes the retrieval experiments for the main task and list task of the TREC-2002 question-answering track. The question answering system described automatically finds answers to questions in a large document collection. The system uses a two-stage retrieval approach to answer finding based on matching of named entities, linguistic patterns, keywords, and the use of a new inference module. In answering a question, the system carries out a detailed query analysis that produces a logical query representation, an indication of the question focus, and answer clue words.

Bibtex
@inproceedings{DBLP:conf/trec/DiekemaCMOTYL02,
    author = {Anne Diekema and Jiangping Chen and Nancy J. McCracken and Necati Ercan Ozgencil and Mary D. Taffet and {\"{O}}zg{\"{u}}r Yilmazel and Elizabeth D. Liddy},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering: {CNLP} at the {TREC} 2002 Question Answering Track},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/syracuse.diekema.pdf},
    timestamp = {Tue, 17 Nov 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/DiekemaCMOTYL02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Statistical Selection of Exact Answers (MultiText Experiments for TREC 2002)

Charles L. A. Clarke, Gordon V. Cormack, Graeme Kemkes, M. Laszlo, Thomas R. Lynam, Egidio L. Terra, Philip L. Tilker

Abstract

For TREC 2002, the MultiText Group concentrated on the QA track. We also submitted runs for the Web track. Building on the work of previous years, our TREC 2002 QA system takes a statistical approach to answer selection, supported by a lightweight parser that performs question categorization and query generation. Answer candidates are extracted from passages retrieved by an algorithm that identifies short text fragments containing weighted combinations of query terms. If the parser is able to assign one of a predetermined set of question categories to a question, the system employs a finite-state pattern recognizer to extract answer candidates. Otherwise, one- to five-word n-grams from the passages are used. Our system assumes that an answer to every question appears in the TREC corpus, and it produces a NIL result only in a few rare circumstances. Despite the simplicity of the approach, our best QA run returned correct answers to 37% of the questions. Our basic question answering strategy is an extension of the technique we used for both TREC 2000 and 2001 [5]. In past years, our system ranked individual terms appearing in retrieved passages and selected 50-byte responses from the passages that included one or more of the highest ranking terms. Since exact answers are required for TREC 2002, much of our effort this year was focused on the extension of this technique to multi-term exact-answer candidates. [...]
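
The n-gram fallback can be pictured simply: every one- to five-word n-gram in the retrieved passages is a candidate, scored here by the retrieval weight of the passages containing it. A simplified Python version; the scoring is our reduction of the statistical selection step, not the authors' exact formula:

    from collections import Counter

    def candidate_ngrams(passages, max_len=5):
        """passages: (text, retrieval_score) pairs. Score each 1..max_len-word
        n-gram by the summed scores of the passages containing it."""
        scores = Counter()
        for text, score in passages:
            words = text.lower().split()
            seen = set()
            for n in range(1, max_len + 1):
                for i in range(len(words) - n + 1):
                    seen.add(" ".join(words[i:i + n]))
            for gram in seen:           # count each gram once per passage
                scores[gram] += score
        return scores.most_common(10)

    passages = [("alexander graham bell invented the telephone", 2.0),
                ("bell patented the telephone in 1876", 1.5)]
    print(candidate_ngrams(passages)[:3])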

Bibtex
@inproceedings{DBLP:conf/trec/ClarkeCKLLTT02,
    author = {Charles L. A. Clarke and Gordon V. Cormack and Graeme Kemkes and M. Laszlo and Thomas R. Lynam and Egidio L. Terra and Philip L. Tilker},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Statistical Selection of Exact Answers (MultiText Experiments for {TREC} 2002)},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/uwaterloo.clarke.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ClarkeCKLLTT02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

A Multi-Strategy and Multi-Source Approach to Question Answering

Jennifer Chu-Carroll, John M. Prager, Christopher A. Welty, Krzysztof Czuba, David A. Ferrucci

Abstract

Traditional question answering systems typically employ a single pipeline architecture, consisting roughly of three components: question analysis, search, and answer selection (see e.g., (Clarke et al., 2001a; Hovy et al., 2000; Moldovan et al., 2000; Prager et al., 2000)). The knowledge sources utilized by these systems to date primarily focus on the corpus from which answers are to be retrieved, WordNet, and the Web (see e.g., (Clarke et al., 2001b; Pasca and Harabagiu, 2001; Prager et al., 2001)). More recent research has shown that introducing feedback loops into the traditional pipeline architecture results in a performance gain (Harabagiu et al., 2001). We are interested in improving the performance of QA systems by breaking away from the strict pipeline architecture. In addition, we require an architecture that allows for hybridization at low development cost and facilitates experimentation with different instantiations of system components. Our resulting architecture is one that is modular and easily extensible, and allows for multiple answering agents to address the same question in parallel and for their results to be combined. Our new question answering system, PIQUANT, adopts this flexible architecture. The answering agents currently implemented in PIQUANT vary both in terms of the strategies used and the knowledge sources consulted. For example, an answering agent may employ statistical methods for extracting answers to questions from a large corpus, while another answering agent may transform select natural language questions into logical forms and query structured knowledge sources for answers. In this paper, we first describe the architecture on which PIQUANT is based. We then describe the answering agents currently implemented within the PIQUANT system, and how they were configured for our TREC2002 runs. Finally, we show that significant performance improvement was achieved by our multi-agent architecture by comparing our TREC2002 results against individual answering agent performance.
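
The multi-agent idea can be illustrated with a small harness that runs independent answering agents in parallel and merges their results. The confidence-summing combination below is one simple scheme, not necessarily the one PIQUANT uses:

    from concurrent.futures import ThreadPoolExecutor
    from collections import defaultdict

    def combine_agents(question, agents):
        """agents: callables returning (answer, confidence) pairs.
        Run them in parallel and sum confidences per distinct answer."""
        votes = defaultdict(float)
        with ThreadPoolExecutor() as pool:
            for answer, conf in pool.map(lambda a: a(question), agents):
                votes[answer] += conf
        return max(votes, key=votes.get)

    # Toy agents standing in for statistical and knowledge-based answerers.
    agents = [lambda q: ("Paris", 0.9), lambda q: ("Paris", 0.6),
              lambda q: ("Lyon", 0.7)]
    print(combine_agents("What is the capital of France?", agents))  # Paris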

Bibtex
@inproceedings{DBLP:conf/trec/Chu-CarrollPWCF02,
    author = {Jennifer Chu{-}Carroll and John M. Prager and Christopher A. Welty and Krzysztof Czuba and David A. Ferrucci},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {A Multi-Strategy and Multi-Source Approach to Question Answering},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/ibm.prager.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Chu-CarrollPWCF02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

MITRE's Qanda at TREC-11

John D. Burger, Lisa Ferro, Warren R. Greiff, John C. Henderson, Scott A. Mardis, Alexander A. Morgan, Marc Light

Abstract

Qanda is MITRE's TREC-style question answering system. Since last year's evaluation, principal improvements to the system have been aimed at making it faster and more robust. We discuss the current architecture of the system in Section 1. Some work has gone into better answer formation and ranking, which we discuss in Section 2. After this year's evaluation, we have done a number of ROVER-style system combination experiments using the judged answer strings made available by NIST. We report on some success with this in Section 3. We have also performed a detailed categorization of previous TREC results according to answer type and grammatical category, as well as an analysis of Qanda's own question analysis component—see Section 4 for these analyses.

Bibtex
@inproceedings{DBLP:conf/trec/BurgerFGHMML02,
    author = {John D. Burger and Lisa Ferro and Warren R. Greiff and John C. Henderson and Scott A. Mardis and Alexander A. Morgan and Marc Light},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {MITRE's Qanda at {TREC-11}},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/mitre.burger.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BurgerFGHMML02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The YorkQA Prototype Question Answering System

Marco De Boni, José-Luis Jara-Valencia, Suresh Manandhar

Abstract

A preliminary analysis of our QA system implemented for TREC-11 is presented, with an initial evaluation.

Bibtex
@inproceedings{DBLP:conf/trec/BoniJM02,
    author = {Marco De Boni and Jos{\'{e}}{-}Luis Jara{-}Valencia and Suresh Manandhar},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The YorkQA Prototype Question Answering System},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/uyork.deboni.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BoniJM02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Coupling Named Entity Recognition, Vector-Space Model and Knowledge Bases for TREC 11 Question Answering Track

Patrice Bellot, Eric Crestan, Marc El-Bèze, Laurent Gillard, Claude de Loupy

Abstract

In this paper, we present a question-answering system that combines Named Entity Recognition, a Vector-Space Model, and Knowledge Bases to validate candidate answers. We applied this hybrid approach in our first participation in the TREC Q&A track.

Bibtex
@inproceedings{DBLP:conf/trec/BellotCEGL02,
    author = {Patrice Bellot and Eric Crestan and Marc El{-}B{\`{e}}ze and Laurent Gillard and Claude de Loupy},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Coupling Named Entity Recognition, Vector-Space Model and Knowledge Bases for {TREC} 11 Question Answering Track},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/lia.sinequa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/BellotCEGL02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

PiQASso 2002

Giuseppe Attardi, Antonio Cisternino, Francesco Formica, Maria Simi, Alessandro Tommasi

Abstract

The University of Pisa participated in TREC 2002's QA track with PiQASso, a vertical QA system developed (except for some of the linguistic tools) entirely within our research group at the Computer Science department. The system features a filter-and-loop architecture in which non-promising paragraphs are ruled out based on features ranging from keyword matching to complex semantic relation matching. The system also exploits the Web in order to get 'hints' at what to look for in the internal collection. This article describes the system's entire architecture, concentrating on the Web exploitation, and provides figures on its efficacy.

Bibtex
@inproceedings{DBLP:conf/trec/AttardiCFST02,
    author = {Giuseppe Attardi and Antonio Cisternino and Francesco Formica and Maria Simi and Alessandro Tommasi},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {PiQASso 2002},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/upisa.tommasi.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/AttardiCFST02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Mining Knowledge from Repeated Co-Occurrences: DIOGENE at TREC 2002

Bernardo Magnini, Matteo Negri, Roberto Prevete, Hristo Tanev

Abstract

This paper presents a new version of the DIOGENE Question Answering (QA) system developed at ITC-Irst. With respect to our first participation in the TREC QA main task (TREC-2001), the system presents both improvements and extensions. On one hand, significant improvements come from the substitution of basic components (e.g., the search engine and the named entity recognition tool) with new modules that enhance the overall system's performance. On the other hand, an effective extension of DIOGENE is the introduction of a module for the automatic assessment of candidate answer quality. All the variations with respect to the first version of the system, as well as the results obtained in the TREC-2002 QA main task, are presented and discussed in the paper.

Bibtex
@inproceedings{DBLP:conf/trec/MagniniNPT02,
    author = {Bernardo Magnini and Matteo Negri and Roberto Prevete and Hristo Tanev},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Mining Knowledge from Repeated Co-Occurrences: {DIOGENE} at {TREC} 2002},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/itcirst.magnini.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/MagniniNPT02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The Question Answering System QALC at LIMSI, Experiments in Using Web and WordNet

Gaël de Chalendar, Tiphaine Dalmas, Faiïza Elkateb-Gara, Olivier Ferret, Brigitte Grau, Martine Hurault-Plantet, Gabriel Illouz, Laura Monceaux, Isabelle Robba, Anne Vilnat

Abstract

The QALC question answering system at LIMSI (Ferret et al., 2001) has been largely modified for the TREC 11 evaluation campaign. The architecture now includes the processing of answers retrieved from Web searching, and a number of existing modules have been reworked. Indeed, introducing the Web as an additional resource alongside the TREC corpus led us to experiment with strategies for comparing answers extracted from different corpora; these strategies now make up the final answer selection module. The answer extraction module now takes advantage of the WordNet semantic database whenever the expected answer type is not a named entity. As a result, we drew up a new analysis for these question categories, as well as a new formulation of the associated answer extraction patterns. We also changed the weighting scheme for candidate answer sentences in order to increase answer reliability. Furthermore, the number of selected sentences is no longer fixed before the extraction module but decided inside it, according to whether the expected answer type is a named entity or not. In the latter case, the number of selected sentences is greater than for a named entity answer type, so as to take better advantage of the selection made by means of extraction patterns and WordNet. Other modules have also been modified: the QALC system now uses a search engine, and document selection has been improved by cutting documents into paragraphs and making the selection more robust. Furthermore, the named entity recognition module has been significantly modified in order to recognize more entities precisely and to reduce ambiguous cases. In this paper, we first present the architecture of the system. Then, we describe the modified modules, i.e., question analysis, document selection, named entity recognition, sentence weighting, and answer extraction. Afterwards, the final answer selection strategies of the new module are described. Finally, we present our results, ending with some concluding remarks.

Bibtex
@inproceedings{DBLP:conf/trec/ChalendarDEFGHIMRV02,
    author = {Ga{\"{e}}l de Chalendar and Tiphaine Dalmas and Fai{\"{\i}}za Elkateb{-}Gara and Olivier Ferret and Brigitte Grau and Martine Hurault{-}Plantet and Gabriel Illouz and Laura Monceaux and Isabelle Robba and Anne Vilnat},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The Question Answering System {QALC} at LIMSI, Experiments in Using Web and WordNet},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/limsi.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ChalendarDEFGHIMRV02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

TREC 2002 QA at BBN: Answer Selection and Confidence Estimation

Jinxi Xu, Ana Licuanan, Jonathan May, Scott Miller, Ralph M. Weischedel

Abstract

We focused on two issues: answer selection and confidence estimation. We found that some simple constraints on the candidate answers can improve a pure IR-based technique for answer selection. We also found that a few simple features derived from the question-answer pairs can be used for effective confidence estimation. Our results also confirmed the finding of Dumais et al. (2002) that the World Wide Web is a very useful resource for answering TREC-style factoid questions.
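
Confidence estimation from a few question-answer features amounts to fitting a probabilistic classifier on past judged answers. A minimal sketch with logistic regression; the features and toy training data are our illustrative assumptions, not BBN's actual feature set:

    from sklearn.linear_model import LogisticRegression

    def qa_features(question, answer, retrieval_score):
        # Illustrative features: retrieval score, answer length, term overlap.
        overlap = len(set(question.lower().split()) &
                      set(answer.lower().split()))
        return [retrieval_score, len(answer.split()), overlap]

    # Toy training data: feature vectors for past answers, 1 = judged correct.
    X = [[0.9, 2, 0], [0.2, 7, 3]]
    y = [1, 0]
    model = LogisticRegression().fit(X, y)
    confidence = model.predict_proba(
        [qa_features("Who won the race?", "Jane Doe", 0.8)])[0, 1]
    print(confidence)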

Bibtex
@inproceedings{DBLP:conf/trec/XuLMMW02,
    author = {Jinxi Xu and Ana Licuanan and Jonathan May and Scott Miller and Ralph M. Weischedel},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{TREC} 2002 {QA} at {BBN:} Answer Selection and Confidence Estimation},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/bbn.xu.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/XuLMMW02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The Integration of Lexical Knowledge and External Resources for Question Answering

Hui Yang, Tat-Seng Chua

Abstract

For the short, factoid questions in TREC, the query terms obtained from the original questions are either too brief or often do not contain the most relevant information in the corpus. This makes it very difficult to find the answer (especially an exact answer) in a large text document collection, because of the gap between the query space and the document space. In order to bridge this gap, the original queries need to be expanded to include terms from the document space. In this research, we investigate the integration of both the Web and WordNet, performing local context analysis and exploiting lexical correlations to bridge the gap. In order to minimize the noise introduced by the external resources, we explore detailed question classes, fine-grained named entities, and successive constraint relaxation.
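
Lexical expansion of this kind can be sketched with WordNet synonyms alone. A minimal example with NLTK (requires the WordNet corpus, via nltk.download('wordnet')); the noise control the paper emphasizes is omitted:

    from nltk.corpus import wordnet as wn

    def expand_query(terms):
        """Add WordNet synonyms of each query term to the query."""
        expanded = set(terms)
        for term in terms:
            for synset in wn.synsets(term):
                expanded.update(l.name().replace("_", " ")
                                for l in synset.lemmas())
        return expanded

    print(expand_query(["car"]))   # includes 'auto', 'automobile', ...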

Bibtex
@inproceedings{DBLP:conf/trec/YangC02,
    author = {Hui Yang and Tat{-}Seng Chua},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The Integration of Lexical Knowledge and External Resources for Question Answering},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/nus.pris.yang.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/YangC02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

ICT Experiments in TREC 11 QA Main Task

Hongbo Xu, Hao Zhang, Shuo Bai

Abstract

This is the second time we have participated in the TREC QA track. We put emphasis on candidate passage ranking and answer matching. For named entity tagging, we applied the latest version of GATE and performed some follow-up processing directed at our goal. This paper presents our methods in detail.

Bibtex
@inproceedings{DBLP:conf/trec/XuZB02,
    author = {Hongbo Xu and Hao Zhang and Shuo Bai},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{ICT} Experiments in {TREC} 11 {QA} Main Task},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/cas\_qa.hongbo.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/XuZB02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

FDU at TREC 2002: Filtering, Q&A, Web and Video Tasks

Lide Wu, Xuanjing Huang, Junyu Niu, Yingju Xia, Zhe Feng, Yaqian Zhou

Abstract

This year Fudan University took part in the TREC conference for the third time. We participated in four tracks: Filtering, Q&A, Web, and Video. For filtering, we only participated in the adaptive filtering sub-task. A novel method is presented in which a Winnow classifier is constructed from the description and narrative fields and then utilized to assist our previous adaptive filtering system. A novel approach to confidence sorting, based on Maximum Entropy, is proposed in our Question Answering system. The rank of an individual answer is determined by several weighted factors, and the confidence score is the product of the exponentials of the weights of the factors. The weight of each factor is assigned during training on previous questions. To return highly relevant key resources for web retrieval, we modified our original search system to return higher-precision results than before. First, we proposed a novel search algorithm to obtain a base set of highly relevant documents. Then special post-processing modules are used to expand and re-sort the base set. This year we also tried a fast manifold-based approach to face recognition in the Video Search task. It can be used when only a few different images of a specific person are available, and it runs fast. Experiments show that applying this step makes face recognition five times faster with almost no decrease in performance.
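
Read literally, the Maximum Entropy confidence described above (a product of per-factor exponentials) equals exp of the weighted sum of factors. A one-function sketch of that reading, with illustrative weights and factor values:

    import math

    def confidence(factor_values, weights):
        # Product of exponentials == exp(sum of weighted factors).
        return math.prod(math.exp(w * f)
                         for w, f in zip(weights, factor_values))

    print(confidence([1.0, 0.5], [0.7, -0.2]))   # == math.exp(0.7 - 0.1)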

Bibtex
@inproceedings{DBLP:conf/trec/WuHNXFZ02,
    author = {Lide Wu and Xuanjing Huang and Junyu Niu and Yingju Xia and Zhe Feng and Yaqian Zhou},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{FDU} at {TREC} 2002: Filtering, Q{\&}A, Web and Video Tasks},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/fudan.lide.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/WuHNXFZ02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering Using the DLT System at TREC 2002

Richard F. E. Sutcliffe

Abstract

This article outlines our participation in the Question Answering Track of the Text REtrieval Conference organised by the National Institute of Standards and Technology. Having not taken part before, our objective was to study the task and build a simple working system capable of answering at least some questions correctly. Only three person-weeks were available for the work, but this proved sufficient to achieve our goal. The article is structured as follows. Firstly, some preliminaries such as our starting point, tools, and strategy are described. After this, the architecture of the Documents and Linguistic Technology Group's DLT system is outlined. Thirdly, the question types analysed by the system are described, along with the named entities with which they work. Fourthly, the runs performed are presented together with the results we obtained. Finally, conclusions are drawn based on our findings.

Bibtex
@inproceedings{DBLP:conf/trec/Sutcliffe02,
    author = {Richard F. E. Sutcliffe},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering Using the {DLT} System at {TREC} 2002},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/ulimerick.sutcliffe.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Sutcliffe02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Use of Patterns for Detection of Likely Answer Strings: A Systematic Approach

Martin M. Soubbotin, Sergei M. Soubbotin

Abstract

The paper describes the Question Answering approach first applied in the TREC-10 QA track and developed systematically in the TREC 2002 experiments. The approach is based on the assumption that answers can be identified by their correspondence to formulas describing the structure of strings carrying the certain (generalized) semantics implied by the question type. These formulas, or patterns, are like regular expressions but include elements corresponding to predefined lists of terms. Complex patterns can be constructed from blocks corresponding to such semantic entities as persons' or organizations' names, posts, dates, locations, etc. Using various combinations of blocks and intermediate syntactic elements makes it possible to build a great variety of patterns. The exact position of the elements corresponding to the 'exact answer' is localized within the structure of each pattern. Each pattern carries a generalized semantics, so the pattern-matching string must also be checked for correlation with the question terms and/or their synonyms/substitutes.
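
The block-composition idea maps naturally onto named regex groups. A toy sketch; the blocks and the pattern are our illustrations of the approach, not the authors' actual formulas:

    import re

    # Illustrative building blocks for semantic entities.
    BLOCKS = {
        "PERSON": r"(?:[A-Z][a-z]+ ){1,2}[A-Z][a-z]+",
        "POST":   r"(?:president|chairman|director)",
    }

    # A complex pattern built from blocks; the group 'answer' marks the
    # exact position of the answer within the matching string.
    pattern = re.compile(
        rf"(?P<answer>{BLOCKS['PERSON']}), (?:the )?{BLOCKS['POST']} of")

    m = pattern.search("Jacques Chirac, the president of France, said...")
    if m:
        print(m.group("answer"))   # -> Jacques Chirac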

Bibtex
@inproceedings{DBLP:conf/trec/SoubbotinS02,
    author = {Martin M. Soubbotin and Sergei M. Soubbotin},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Use of Patterns for Detection of Likely Answer Strings: {A} Systematic Approach},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/insightsoftm.sergei.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/SoubbotinS02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question-Answering via Enhanced Understanding of Questions

Dan Roth, Chad M. Cumby, Xin Li, Paul Morie, Ramya Nagarajan, Nick Rizzolo, Kevin Small, Wen-tau Yih

Abstract

We describe a machine learning centered approach to developing an open domain question answering system. The system was developed in the summer of 2002, building upon several existing machine learning based NLP modules developed within a unified framework. Both queries and data were pre-processed and augmented with POS tagging, shallow parsing information, and some level of semantic categorization (beyond named entities) using a SNoW based machine learning approach. Given these as input, the system proceeds as an incremental constraint satisfaction process. A machine learning based question analysis module extracts structural and semantic constraints on the answer, including a fine classification of the desired answer type. The system continues in several steps to identify candidate passages and then extracts an answer that best satisfies the constraints. With the available machine learning technologies, the system was developed in six weeks with the goal of identifying some of the key research issues of QA and challenges to it.
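
The incremental constraint satisfaction process can be pictured as successive filtering of answer candidates. A skeletal sketch; the relaxation policy (skip a constraint rather than empty the pool) is our assumption:

    def satisfy(candidates, constraints):
        """Apply each constraint in turn, keeping only the candidates
        that pass; skip any constraint that would eliminate everything."""
        for constraint in constraints:
            remaining = [c for c in candidates if constraint(c)]
            if remaining:
                candidates = remaining
        return candidates

    # Toy usage: keep candidates that look like years, then recent ones.
    cands = ["1969", "Apollo", "1972"]
    print(satisfy(cands, [str.isdigit, lambda c: int(c) > 1970]))  # ['1972']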

Bibtex
@inproceedings{DBLP:conf/trec/RothCLMNRSY02,
    author = {Dan Roth and Chad M. Cumby and Xin Li and Paul Morie and Ramya Nagarajan and Nick Rizzolo and Kevin Small and Wen{-}tau Yih},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question-Answering via Enhanced Understanding of Questions},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/uiuc.roth.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/RothCLMNRSY02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The University of Michigan at TREC 2002: Question Answering and Novelty Tracks

Hong Qi, Jahna Otterbacher, Adam Winkel, Dragomir R. Radev

Abstract

The University of Michigan participated in two evaluations this year. In the Question Answering Track, we entered three different versions of our system, NSIR, previously described in [1]. For the Novelty Track, we modified our multi-document summarizer, MEAD (www.summarization.com/mead), and submitted five runs with different input parameters.

Bibtex
@inproceedings{DBLP:conf/trec/QiOWR02,
    author = {Hong Qi and Jahna Otterbacher and Adam Winkel and Dragomir R. Radev},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The University of Michigan at {TREC} 2002: Question Answering and Novelty Tracks},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/umichigan.radev.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/QiOWR02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Building a Foundation System for Producing Short Answers to Factual Questions

Sameer S. Pradhan, Valerie Krugler, Steven Bethard, Wayne H. Ward, Daniel Jurafsky, James H. Martin, Sasha Blair-Goldensohn, Andrew Hazen Schlaikjer, Elena Filatova, Pablo Ariel Duboue, Hong Yu, Rebecca J. Passonneau, Vasileios Hatzivassiloglou, Kathleen R. McKeown, Gabriel Illouz

Abstract

In this paper we describe question answering research being pursued as a joint project between Columbia University and the University of Colorado at Boulder as part of ARDA's AQUAINT program. As a foundation for targeting complex questions involving opinions, events, and paragraph-length answers, we recently built two systems for answering short factual questions. We submitted results from the two systems to TREC's Q&A track, and the bulk of this paper describes the methods used in building each system and the results obtained. We conclude by discussing current work aiming at combining modules from the two systems in a unified, more accurate system and adding capabilities for producing complex answers in addition to short ones.

Bibtex
@inproceedings{DBLP:conf/trec/PradhanKBWJMBSFDYPHMI02,
    author = {Sameer S. Pradhan and Valerie Krugler and Steven Bethard and Wayne H. Ward and Daniel Jurafsky and James H. Martin and Sasha Blair{-}Goldensohn and Andrew Hazen Schlaikjer and Elena Filatova and Pablo Ariel Duboue and Hong Yu and Rebecca J. Passonneau and Vasileios Hatzivassiloglou and Kathleen R. McKeown and Gabriel Illouz},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Building a Foundation System for Producing Short Answers to Factual Questions},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/ucolorado.pradhan.pdf},
    timestamp = {Sat, 21 Jan 2023 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/PradhanKBWJMBSFDYPHMI02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The QUANTUM Question Answering System at TREC 11

Luc Plamondon, Guy Lapalme, Leila Kosseim

Abstract

This year, we participated in the Question Answering task for the second time with the QUANTUM system. We entered 2 runs for the main task (one using the web, the other without) and 1 run for the list task (without the web). We essentially built on last year's experience to enhance the system. The architecture of QUANTUM is mainly the same as last year: it uses patterns that rely on shallow parsing techniques and regular expressions to analyze the question and then select the most appropriate extraction function. This extraction function is then applied to one-paragraph-long passages retrieved by Okapi to extract and score candidate answers. Among the novelties we added to QUANTUM this year is a web module that finds exact answers using high-precision reformulation of the question to anticipate the expected context of the answer.
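
Question reformulation of this kind can be sketched as pattern-to-template rewriting: the question is turned into declarative strings expected to surround the answer in web text. The rules below are invented examples, not QUANTUM's actual pattern set:

    import re

    RULES = [
        (r"^Who invented (.+)\?$", [r"\1 was invented by"]),
        (r"^When was (.+) born\?$", [r"\1 was born on", r"\1 was born in"]),
    ]

    def reformulate(question):
        for pattern, templates in RULES:
            m = re.match(pattern, question, re.IGNORECASE)
            if m:
                return [m.expand(t) for t in templates]
        return []

    print(reformulate("Who invented the telephone?"))
    # -> ['the telephone was invented by']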

Bibtex
@inproceedings{DBLP:conf/trec/PlamondonLK02,
    author = {Luc Plamondon and Guy Lapalme and Leila Kosseim},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The {QUANTUM} Question Answering System at {TREC} 11},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/umontreal.quantum.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/PlamondonLK02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The JAVELIN Question-Answering System at TREC 2002

Eric Nyberg, Teruko Mitamura, Jaime G. Carbonell, James P. Callan, Kevyn Collins-Thompson, Krzysztof Czuba, Michael Duggan, Laurie Hiyakumoto, N. Hu, Yifen Huang, Jeongwoo Ko, Lucian Vlad Lita, S. Murtagh, Vasco Pedro, David Svoboda

Abstract

This paper describes the JAVELIN approach for open-domain question answering (Justification-based Answer Valuation through Language Interpretation), and our participation in the TREC 2002 question-answering track. [...]

Bibtex
@inproceedings{DBLP:conf/trec/NybergMCCCCDHHHKLMPS02,
    author = {Eric Nyberg and Teruko Mitamura and Jaime G. Carbonell and James P. Callan and Kevyn Collins{-}Thompson and Krzysztof Czuba and Michael Duggan and Laurie Hiyakumoto and N. Hu and Yifen Huang and Jeongwoo Ko and Lucian Vlad Lita and S. Murtagh and Vasco Pedro and David Svoboda},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The {JAVELIN} Question-Answering System at {TREC} 2002},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/cmu.javelin.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/NybergMCCCCDHHHKLMPS02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering Approach Using a WordNet-based Answer Type Taxonomy

Seung-Hoon Na, In-Su Kang, Sang-Yool Lee, Jong-Hyeok Lee

Abstract

In question answering (QA), answer types are the semantic categories that questions require. An answer type taxonomy (ATT) is a collection of these answer types. The ATT may heavily affect the performance of a QA system, because its broadness and granularity provide the coverage and specificity of answer types. Cardie [1] used 13 categories for entity classification and obtained a large performance improvement compared with a method using no categories. Also, according to Pasca et al. [3], the more categories a system uses, the better performance it shows. For example, consider two answer type taxonomies, A={PERSON} and B={PRESIDENT, ENGINEER, SINGER, PERSON}. Given a question “Who was the president of Vichy France?”, we know that the more specific answer type of this question is not PERSON but PRESIDENT. Thus, if we use ATT B, the set of candidate answers from documents can be reduced to the set of PRESIDENT entities, excluding other PERSON entities such as ENGINEERs and SINGERs. This is not the case with ATT A: since ATT A cannot distinguish among the hyponyms of PERSON, the QA system must consider many more candidate answers. Thus far, most QA systems rely on small-scale ATTs, with the number of semantic categories ranging from 20 to 100. Normally, these ATTs are created from an initial set of frequently asked answer types like person, organization, location, number, etc., and are then incrementally extended to include unexpected answer types from new questions. However, such ad-hoc ATTs raise the following problems for QA. First, it is nontrivial to manually enlarge a small ATT as new answer types appear. Second, ad-hoc ATTs do not allow easy adaptation for processing questions asking for new answer types; for such questions, the system needs to modify an existing IE module to classify entities into the new answer types. Third, previous ATTs do not have sufficient broadness and granularity, which are desirable characteristics of an ATT for open-domain QA. Therefore, at this year's TREC, we have taken a question answering approach that uses WordNet itself as the ATT. In other words, our QA system maps an answer type onto a concept node, called a synset, in WordNet. WordNet provides sufficient diversity and density to distinguish the specific answer types of most questions. By using such an ontological taxonomy, we avoid the above problems of small ad-hoc ATTs. This paper is organized as follows. Section 2 describes each module of our QA system, section 3 shows the TREC-11 evaluation results, and concluding remarks are given in section 4.
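
Using WordNet itself as the ATT means an answer type is a synset, and a candidate matches if one of its senses lies below that synset in the hypernym hierarchy. A minimal check with NLTK (requires the WordNet corpus, via nltk.download('wordnet')):

    from nltk.corpus import wordnet as wn

    def matches_answer_type(candidate, answer_type_synset):
        """True if any sense of the candidate is a (transitive) hyponym
        of the answer-type synset."""
        for sense in wn.synsets(candidate):
            if answer_type_synset in sense.closure(lambda s: s.hypernyms()):
                return True
        return False

    person = wn.synset("person.n.01")
    print(matches_answer_type("engineer", person))  # True: engineer IS-A person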

Bibtex
@inproceedings{DBLP:conf/trec/NaKLL02,
    author = {Seung{-}Hoon Na and In{-}Su Kang and Sang{-}Yool Lee and Jong{-}Hyeok Lee},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering Approach Using a WordNet-based Answer Type Taxonomy},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/pohang.seunghoon.pdf},
    timestamp = {Sat, 30 Sep 2023 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/NaKLL02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The University of Amsterdam at TREC 2002

Christof Monz, Jaap Kamps, Maarten de Rijke

Abstract

We describe our participation in the TREC 2002 Novelty, Question answering, and Web tracks. We provide a detailed account of the ideas underlying our approaches to these tasks. All our runs used the FlexIR information retrieval system.

Bibtex
@inproceedings{DBLP:conf/trec/MonzKR02,
    author = {Christof Monz and Jaap Kamps and Maarten de Rijke},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The University of Amsterdam at {TREC} 2002},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/uamsterdam.derijke.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/MonzKR02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

LCC Tools for Question Answering

Dan I. Moldovan, Sanda M. Harabagiu, Roxana Girju, Paul Morarescu, V. Finley Lacatusu, Adrian Novischi, Adriana Badulescu, Orest Bolohan

Abstract

The increased complexity of the TREC QA questions requires advanced text processing tools that rely on natural language processing and knowledge reasoning. This paper presents the suite of tools that accounts for the performance of the Power Answer question answering system. It is shown how questions, answers, and world knowledge are first transformed into logic representations, followed by a systematic and rigorous logic proof that validly answers the questions posed to the QA system. At TREC QA 2002, Power Answer obtained a confidence-weighted score of 0.856, answering 415 out of 500 questions correctly.

Bibtex
@inproceedings{DBLP:conf/trec/MoldovanHGMLNBB02,
    author = {Dan I. Moldovan and Sanda M. Harabagiu and Roxana Girju and Paul Morarescu and V. Finley Lacatusu and Adrian Novischi and Adriana Badulescu and Orest Bolohan},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{LCC} Tools for Question Answering},
    booktitle = {Proceedings of The Eleventh Text REtrieval Conference, {TREC} 2002, Gaithersburg, Maryland, USA, November 19-22, 2002},
    series = {{NIST} Special Publication},
    volume = {500-251},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2002},
    url = {http://trec.nist.gov/pubs/trec11/papers/lcc.moldovan.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/MoldovanHGMLNBB02.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}