Proceedings - Question Answering 2004

Overview of the TREC 2004 Question Answering Track

Ellen M. Voorhees

Abstract

The TREC 2004 Question Answering track contained a single task in which question series were used to define a set of targets. Each series contained factoid and list questions and related to a single target. The final question in the series was an “Other” question that asked for additional information about the target that was not covered by previous questions in the series. Each question type was evaluated separately, with the final score being a weighted average of the different component scores. Applying the combined measure on a per-series basis produces a QA task evaluation that more closely mimics classic document retrieval evaluation.
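
As an illustration of the combined measure, here is a minimal Python sketch of the per-series score; the 1/2 (factoid), 1/4 (list), 1/4 (other) weights match our recollection of the track's weighting but should be treated as illustrative, and the function names are our own.

    # Hedged sketch of the per-series combined measure. The component
    # scores are the factoid accuracy, the average list F-score, and
    # the "other" F-score for one series; the weights are illustrative.
    def series_score(factoid_accuracy: float, list_f: float,
                     other_f: float) -> float:
        return 0.5 * factoid_accuracy + 0.25 * list_f + 0.25 * other_f

    # Averaging the per-series scores yields the task-level score,
    # mirroring per-topic averaging in document retrieval evaluation.
    def task_score(series_scores: list[float]) -> float:
        return sum(series_scores) / len(series_scores)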

Bibtex
@inproceedings{DBLP:conf/trec/Voorhees04a,
    author = {Ellen M. Voorhees},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Overview of the {TREC} 2004 Question Answering Track},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/QA.OVERVIEW.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Voorhees04a.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering with QED and Wee at TREC 2004

Kisuh Ahn, Johan Bos, Stephen Clark, Tiphaine Dalmas, Jochen L. Leidner, Matthew Smillie, Bonnie L. Webber, James R. Curran

Abstract

This report describes the experiments of the University of Edinburgh and the University of Sydney at the TREC-2004 question answering evaluation exercise. Our system combines two approaches: one with deep linguistic analysis using IR on the AQUAINT corpus applied to answer extraction from text passages, and one with a shallow linguistic analysis and shallow inference applied to a large set of snippets retrieved from the web. The results of our experiments support the following claims: (1) Web-based IR is a good alternative to “traditional” IR; and (2) deep linguistic analysis improves quality of exact answers.

Bibtex
@inproceedings{DBLP:conf/trec/AhnBCDLSWC04,
    author = {Kisuh Ahn and Johan Bos and Stephen Clark and Tiphaine Dalmas and Jochen L. Leidner and Matthew Smillie and Bonnie L. Webber and James R. Curran},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering with {QED} and Wee at {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/uedinburgh-syd.bos.qa.pdf},
    timestamp = {Wed, 07 Jul 2021 16:44:22 +0200},
    biburl = {https://dblp.org/rec/conf/trec/AhnBCDLSWC04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Using Wikipedia at the TREC QA Track

David Ahn, Valentin Jijkoun, Gilad Mishne, Karin Müller, Maarten de Rijke, Stefan Schlobach

Abstract

We describe our participation in the TREC 2004 Question Answering track. We provide a detailed account of the ideas underlying our approach to the QA task, especially to the so-called “other” questions. This year we made essential use of Wikipedia, the free online encyclopedia, both as a source of answers to factoid questions and as an importance model to help us identify material to be returned in response to “other” questions.
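
A minimal sketch of the “importance model” idea as we read it: term frequencies from the target's Wikipedia article weight candidate material for “other” questions. The tokenization and scoring below are simplified stand-ins, not the authors' implementation.

    from collections import Counter

    def importance_model(wikipedia_article: str) -> Counter:
        # Term frequencies in the target's Wikipedia article serve as
        # a crude model of which terms matter for the target.
        return Counter(wikipedia_article.lower().split())

    def snippet_importance(snippet: str, model: Counter) -> float:
        # Score a candidate snippet for an "other" question by how
        # well its terms are covered by the importance model.
        terms = snippet.lower().split()
        return sum(model[t] for t in terms) / max(len(terms), 1)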

Bibtex
@inproceedings{DBLP:conf/trec/AhnJMMRS04,
    author = {David Ahn and Valentin Jijkoun and Gilad Mishne and Karin M{\"{u}}ller and Maarten de Rijke and Stefan Schlobach},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Using Wikipedia at the {TREC} {QA} Track},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/uamsterdam.qa.pdf},
    timestamp = {Tue, 14 Jul 2020 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/AhnJMMRS04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

University of Lethbridge's Participation in TREC 2004 QA Track

Yllias Chali, Stephen Dubien

Abstract

The Text REtrieval Conference (TREC), organised by the National Institute of Standards and Technology (NIST), is a set of tracks that represent different areas of text retrieval. These tracks provide a way to measure systems' progress in fields such as cross-language retrieval, filtering, and genomics. We participated in the question answering track. The questions in the TREC 2004 QA track are clustered by target, which is the topic of the questions. The 2004 QA track has three types of questions: factoid questions, which require only one correct response; list questions, which require a non-redundant list of correct responses; and “other” questions, which require a non-redundant list of facts about the target that have not already been given by a previous answer. These questions are answered using the AQUAINT collection, a collection of over a million newswire documents.

Bibtex
@inproceedings{DBLP:conf/trec/ChaliD04,
    author = {Yllias Chali and Stephen Dubien},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {University of Lethbridge's Participation in {TREC} 2004 {QA} Track},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/ulethbridge.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ChaliD04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

UNT at TREC 2004: Question Answering Combining Multiple Evidences

Jiangping Chen, He Ge, Yan Wu, Shikun Jiang

Abstract

Question Answering (QA) aims at identifying answers to users' natural language questions. A QA system can relieve users from digesting large amounts of text in order to locate particular facts or numbers. The research has drawn great attention from several disciplines, such as information retrieval, information extraction, natural language processing, and artificial intelligence. The TREC QA track has provided comparable QA system evaluation on a set of test questions since 1999. The difficulty of the test questions has increased substantially in the past two years, pushing the research toward more sophisticated strategies and a better understanding of English texts. Question answering is very challenging due to the ambiguity of questions, the complexity of the linguistic phenomena involved in the documents, and the difficulty of understanding natural languages. More challenging still is locating short snippets or answers in a document collection whose texts are written in different languages, which falls within our research interest in cross-lingual and multilingual information access and retrieval. We decided to participate in the TREC 2004 Question Answering track as our first step toward exploring advanced multilingual information retrieval. Our goal this year was to develop a prototype automatic question answering system that can be continually expanded and improved. Our prototype QA system, named EagleQA, makes use of available NLP (Natural Language Processing) tools and knowledge resources for question understanding and answer finding. This paper describes the overall structure of the system, the NLP tools and lexical resources employed, our QA methodology for TREC 2004, the QA test results and their analysis, and our plan for future research.

Bibtex
@inproceedings{DBLP:conf/trec/ChenGWJ04,
    author = {Jiangping Chen and He Ge and Yan Wu and Shikun Jiang},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{UNT} at {TREC} 2004: Question Answering Combining Multiple Evidences},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/unorthtexas.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/ChenGWJ04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

IBM's PIQUANT II in TREC 2004

Jennifer Chu-Carroll, Krzysztof Czuba, John M. Prager, Abraham Ittycheriah, Sasha Blair-Goldensohn

Abstract

PIQUANT II, the system we used for TREC 2004, is a completely reengineered system whose core functionalities for answering factoid and list questions remain largely unchanged from previous years [Chu-Carroll et al., 2003; Prager et al., 2004]. We continue to address these questions using our multi-strategy and multi-source approach. For “other” questions, we experimented with two alternative approaches: one that uses statistical collocation information to extract prominent passages related to the target, and another that is a slight variation of the QA-by-Dossier approach we employed last year [Prager et al., 2004], asking a set of sub-questions of interest about the target and returning a set of relevant passages that answer these sub-questions. In addition, to address this year's new question format, we developed a question pre-processing component that interprets each question against the given target to generate the self-contained natural language question expected by subsequent components of our QA system. NIST-assessed scores showed substantial improvement of our new PIQUANT II system over earlier versions of our QA system, both in absolute scores and in relative improvement compared to the best and median scores in each of the three component subtasks.
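
The abstract does not specify how collocation information is used; one conventional choice is pointwise mutual information (PMI) between the target and passage terms, sketched below as our illustration rather than IBM's actual method.

    import math

    def pmi(pair_count: int, count_a: int, count_b: int, n: int) -> float:
        # Pointwise mutual information of two terms, estimated from
        # co-occurrence counts over n text windows:
        # log( P(a,b) / (P(a) * P(b)) ).
        return math.log((pair_count * n) / (count_a * count_b))

    def passage_prominence(target: str, passage_terms, counts, n: int) -> float:
        # Sum the collocation strength between the target and each
        # passage term; higher totals mark more prominent passages.
        # `counts` maps terms and (term, term) pairs to counts.
        return sum(pmi(counts[(target, t)], counts[target], counts[t], n)
                   for t in passage_terms
                   if counts.get((target, t), 0) > 0)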

Bibtex
@inproceedings{DBLP:conf/trec/Chu-CarrollCPIB04,
    author = {Jennifer Chu{-}Carroll and Krzysztof Czuba and John M. Prager and Abraham Ittycheriah and Sasha Blair{-}Goldensohn},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {IBM's {PIQUANT} {II} in {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/ibm-prager.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Chu-CarrollCPIB04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Bangor at TREC 2004: Question Answering Track

Terence Clifton, William John Teahan

Abstract

This paper describes the participation of the School of Informatics, University of Wales, Bangor in the 2004 Text Retrieval Conference. We present additions and modifications to the QITEKAT system, initially developed as an entry for the 2003 QA evaluation, including automated regular expression induction, improved question matching, and application of our knowledge framework to the modified question types presented in the 2004 track. Results are presented which show improvements on last year's performance, and we discuss future directions for the system.

Bibtex
@inproceedings{DBLP:conf/trec/CliftonT04,
    author = {Terence Clifton and William John Teahan},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Bangor at {TREC} 2004: Question Answering Track},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/uwales-bangor.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/CliftonT04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

National University of Singapore at the TREC 13 Question Answering Main Task

Hang Cui, Keya Li, Renxu Sun, Tat-Seng Chua, Min-Yen Kan

Abstract

Our participation at TREC in the past two years (Yang et al., 2002, 2003) has focused on incorporating external knowledge to boost document and passage retrieval performance in event-based open domain question answering (QA). Despite our previous successes, we have identified three weaknesses of our system with respect to this year's task guidelines. First, our system works at the surface level to extract answers, by picking the first occurrence of a string that matches the question target type from the highest ranked passage. As such, our answer extraction relies heavily on the results of passage retrieval and named entity tagging. However, a passage that contains the correct answer may contain other strings of the same target type (Light et al., 2001), which means an incorrect string may be extracted. A technique to select the answer string that has the correct relationships with respect to the other words in the question is needed. Second, our definitional QA system utilizes manually constructed definition patterns. While these patterns are precise in selecting definition sentences, they are strict in matching (i.e., slot-by-slot matching using regular expressions), failing to match correct sentences with minor variations. Third, this year's guidelines state that factoid and list questions are not independent; instead, they are all related to given topics. Under such a contextual QA scenario, we need to revise our framework to exploit the existing topic-relevant knowledge in answering the questions. [...]

Bibtex
@inproceedings{DBLP:conf/trec/CuiLSCK04,
    author = {Hang Cui and Keya Li and Renxu Sun and Tat{-}Seng Chua and Min{-}Yen Kan},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {National University of Singapore at the {TREC} 13 Question Answering Main Task},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/nus.chua.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/CuiLSCK04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Novelty, Question Answering and Genomics: The University of Iowa Response

David Eichmann, Yi Zhang, Shannon Bradshaw, Xin Ying Qiu, Li Zhou, Padmini Srinivasan, Aditya Kumar Sehgal, Hudon Wong

Bibtex
@inproceedings{DBLP:conf/trec/EichmannZBQZSSW04,
    author = {David Eichmann and Yi Zhang and Shannon Bradshaw and Xin Ying Qiu and Li Zhou and Padmini Srinivasan and Aditya Kumar Sehgal and Hudon Wong},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Novelty, Question Answering and Genomics: The University of Iowa Response},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/uiowa.novelty.qa.geo.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/EichmannZBQZSSW04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

TALP-QA System at TREC 2004: Structural and Hierarchical Relaxation Over Semantic Constraints

Daniel Ferrés, Samir Kanaan, Edgar González, Alicia Ageno, Horacio Rodríguez, Mihai Surdeanu, Jordi Turmo

Abstract

This paper describes TALP-QA, a multilingual open-domain Question Answering (QA) system under development at UPC for the past two years. The system is described and evaluated in the context of our participation in the TREC 2004 Main QA task. The TALP-QA system treats both factoid and definitional (“other”) questions. Factoid questions are resolved with a process consisting of three phases: question processing, passage retrieval, and answer extraction. Our approach to solving this kind of question is to build a semantic representation of the questions and of the sentences in the retrieved passages. A set of semantic constraints is extracted for each question. The answer extraction algorithm extracts and ranks sentences that satisfy the semantic constraints of the question. If matches are not possible, the algorithm relaxes the semantic constraints structurally (removing constraints) and/or hierarchically (abstracting the constraints using a taxonomy). Definitional questions are treated in a three-stage process: passage retrieval, pattern scanning over the previous set of passages, and finally a filtering phase where only the most relevant and informative fragments are given as the final output.
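
A schematic of the relaxation loop, assuming hypothetical constraint objects with a satisfied_by test and a taxonomy helper that abstracts a constraint to its parent concept; this is a sketch of the general idea, not the TALP-QA code.

    def extract_answers(sentences, constraints, taxonomy, min_keep=1):
        # Rank-and-relax loop: try the full semantic constraint set,
        # then relax hierarchically (abstract via the taxonomy) or
        # structurally (drop a constraint) until something matches.
        while len(constraints) >= min_keep:
            matches = [s for s in sentences
                       if all(c.satisfied_by(s) for c in constraints)]
            if matches:
                return matches
            abstracted = [taxonomy.abstract(c) for c in constraints]
            if abstracted != constraints:
                constraints = abstracted        # hierarchical relaxation
            else:
                constraints = constraints[:-1]  # structural relaxation
        return []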

Bibtex
@inproceedings{DBLP:conf/trec/FerresKGARST04,
    author = {Daniel Ferr{\'{e}}s and Samir Kanaan and Edgar Gonz{\'{a}}lez and Alicia Ageno and Horacio Rodr{\'{\i}}guez and Mihai Surdeanu and Jordi Turmo},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{TALP-QA} System at {TREC} 2004: Structural and Hierarchical Relaxation Over Semantic Constraints},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/upc.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/FerresKGARST04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

The University of Sheffield's TREC 2004 QA Experiments

Robert J. Gaizauskas, Mark A. Greenwood, Mark Hepple, Ian Roberts, Horacio Saggion

Abstract

The experiments detailed in this paper are a continuation of the experiments started as part of the work undertaken in preparation for participation in the TREC 2003 QA evaluations, as documented in Gaizauskas et al. [2003]. Our main experiments for TREC 2004 were concerned with investigating: (a) alternative approaches to information retrieval (IR) for question answering, (b) alternative approaches to answer extraction for list and factoid questions, and (c) alternative approaches to answering definitional or ‘other’ questions. In each of these three areas we have developed two principal alternatives, each of which has variants. Given the TREC limit of three test runs per site, we have not been able to evaluate all combinations of these approaches properly. Consequently, the systems we submitted give only a partial picture of the work carried out, and further evaluation is underway. In the following we describe the major alternatives we have been exploring in these three areas and present the formal test results for the system combinations we submitted.

Bibtex
@inproceedings{DBLP:conf/trec/GaizauskasGHRS04,
    author = {Robert J. Gaizauskas and Mark A. Greenwood and Mark Hepple and Ian Roberts and Horacio Saggion},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {The University of Sheffield's {TREC} 2004 {QA} Experiments},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/usheffield.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/GaizauskasGHRS04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

LexiClone Inc. and NIST TREC

Ilya S. Geller

Abstract

The UniSearch-4.6 program, created by the company LexiClone Inc. for seeking out textual information, is intended to search for Reality as well as Truth. In the NIST TREC QA 2003 and 2004 runs, virtually all answers obtained by LexiClone were Realities.

Bibtex
@inproceedings{DBLP:conf/trec/Geller04,
    author = {Ilya S. Geller},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {LexiClone Inc. and {NIST} {TREC}},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/lexiclone.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Geller04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Korea University Question Answering System at TREC 2004

Kyoung-Soo Han, Hoo-Jung Chung, Sang-Bum Kim, Young-In Song, Joo-Young Lee, Hae-Chang Rim

Abstract

Our QA system consists of two different components: one for the factoid and list questions, and another for the “other” questions. The components are run individually, and their results are combined into our submitted run. For the factoid questions, we find answers by proximity-based named entity search. Given a question, fine-grained named entity types for candidate answers are selected, and all the extracted passages containing those named entities and the question keywords are scored by a proximity-based measure. List questions are processed in a similar way to the factoid questions, but we apply an empirically chosen threshold to retain only the top n candidate answers. For “other” questions, relevant phrases consisting of noun phrases and verb phrases are extracted from the initially retrieved sentences using a dependency relationship to the question target. After redundant phrases are eliminated from the answer candidates, final answers are selected using several selection criteria, including term statistics from an encyclopedia. Section 2 summarizes our system for factoid and list questions, and Section 3 for “other” questions. In Section 4, the TREC evaluation results are analyzed, and Section 5 concludes our work.
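
A minimal sketch of proximity-based scoring under our reading of the abstract: a candidate named entity in a passage scores higher the closer it sits to the question keywords. All names and the exact scoring function are illustrative.

    def proximity_score(candidate_idx: int,
                        keyword_positions: list[int]) -> float:
        # Score one candidate-answer occurrence by its average token
        # distance to the question keywords in the same passage;
        # smaller distances give scores closer to 1.
        if not keyword_positions:
            return 0.0
        avg = (sum(abs(candidate_idx - p) for p in keyword_positions)
               / len(keyword_positions))
        return 1.0 / (1.0 + avg)

    # For list questions, keep only candidates whose score clears an
    # empirically chosen threshold, as the abstract describes.
    def top_candidates(scored, threshold):
        return [(c, s) for c, s in scored if s >= threshold]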

Bibtex
@inproceedings{DBLP:conf/trec/HanCKSLR04,
    author = {Kyoung{-}Soo Han and Hoo{-}Jung Chung and Sang{-}Bum Kim and Young{-}In Song and Joo{-}Young Lee and Hae{-}Chang Rim},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Korea University Question Answering System at {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/korea.u.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/HanCKSLR04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering by Searching Large Corpora With Linguistic Methods

Michael Kaißer, Tilman Becker

Abstract

In this paper we describe the QuALiM Question Answering system, which uses linguistic analysis of questions as well as of candidate sentences in its answer-finding process. To this end we have developed a rephrasing algorithm based on linguistic patterns that describe the structure of questions and candidate sentences, and where precisely to find the answer in the candidate sentences. With this method and a fall-back strategy, both using the web as their primary data source, we participated in TREC 2004. We present our official results and a follow-up evaluation to elucidate the contribution of the methods used.

Bibtex
@inproceedings{DBLP:conf/trec/KaisserB04,
    author = {Michael Kai{\ss}er and Tilman Becker},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering by Searching Large Corpora With Linguistic Methods},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/saarlandu.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/KaisserB04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Answering Multiple Questions on a Topic From Heterogeneous Resources

Boris Katz, Matthew W. Bilotti, Sue Felshin, Aaron Fernandes, Wesley Hildebrandt, Roni Katzir, Jimmy Lin, Daniel Loreto, Gregory Marton, Federico Mora, Özlem Uzuner

Abstract

MIT CSAIL's entry into this year's TREC Question Answering track focused on the conversational aspect of this year's task, on improving the coverage of our list and definition systems, and on an infrastructure to generalize our TREC-specific tools for other question answering tasks. While our overall architecture remained largely unchanged from last year, we have built on our strengths for each component: our web-based factoid engine was adapted for input from a new web search engine; our list engine's knowledge base expanded from 150 to over 3000 lists; our definitional nugget extractor now has expanded and improved patterns with improved component precision and recall. Beyond their internal improvements, these components were adapted to a larger conversational framework that passed information about the topic to factoids and lists. Answer selection for definitional questions newly took into account the prior questions and answers for duplicate removal. [...]

Bibtex
@inproceedings{DBLP:conf/trec/KatzBFFHKLLMMU04,
    author = {Boris Katz and Matthew W. Bilotti and Sue Felshin and Aaron Fernandes and Wesley Hildebrandt and Roni Katzir and Jimmy Lin and Daniel Loreto and Gregory Marton and Federico Mora and {\"{O}}zlem Uzuner},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Answering Multiple Questions on a Topic From Heterogeneous Resources},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/mit.qa.pdf},
    timestamp = {Fri, 27 Aug 2021 01:00:00 +0200},
    biburl = {https://dblp.org/rec/conf/trec/KatzBFFHKLLMMU04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

DalTREC 2004: Question Answering Using Regular Expression Rewriting

Vlado Keselj, Anthony Cox

Abstract

This is the first year that Dalhousie University participated in TREC. We submitted three runs for the QA track. Our evaluation results are generally below the median (with one exception) but appear to be significantly higher than the worst scores, which is within our expectations considering the limited time spent on developing the system. Our approach was based on regular expression rewriting and the use of external search engines (MultiText and PRISE). One run used Web-reinforced search.
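
An illustrative example of regular-expression rewriting in this spirit; the question patterns and answer templates below are our own inventions, not DalTREC's rule set.

    import re

    # Question-side pattern paired with an answer-side template; the
    # matched question focus is substituted into the answer pattern.
    REWRITES = [
        (re.compile(r"^When was (.+) born\?$", re.I),
         r"{focus} was born (?:on|in) ([^.,;]+)"),
        (re.compile(r"^What is the capital of (.+)\?$", re.I),
         r"([A-Z][\w ]*) is the capital of {focus}"),
    ]

    def answer_pattern(question: str):
        # Rewrite a matching question into a compiled answer-extraction
        # pattern; group 1 of the result captures the answer string.
        for q_pat, template in REWRITES:
            m = q_pat.match(question)
            if m:
                return re.compile(template.format(focus=re.escape(m.group(1))))
        return None

    # Example: answer_pattern("When was Mozart born?") yields a regex
    # that scans text for "Mozart was born on/in ..." and captures the date.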

Bibtex
@inproceedings{DBLP:conf/trec/KeseljC04,
    author = {Vlado Keselj and Anthony Cox},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {DalTREC 2004: Question Answering Using Regular Expression Rewriting},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/dalhousieu.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/KeseljC04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Evolving XML and Dictionary Strategies for Question Answering and Novelty Tasks

Kenneth C. Litkowski

Abstract

CL Research participated in the question answering and novelty tracks in TREC 2004. The Knowledge Management System (KMS), which provides a single interface for question answering, text summarization, information extraction, and document exploration, was used for these tasks. Question answering is performed directly within KMS, which answers questions either from a repository or the Internet. The novelty task was performed with the XML Analyzer, which includes many of the functions used in the KMS summarization routines. These tasks are based on creating and exploiting an XML representation of the texts used for these two tracks. For the QA track, we submitted one run and our overall score was 0.156, with scores of 0.161 for factoid questions, 0.064 for list questions, and 0.239 for “other” questions; these scores are significantly improved from TREC 2003. For the novelty track, we submitted two runs for task 1, one run for task 2, four runs for task 3, and one run for task 4. For most tasks, our scores were above the median. We describe our system in some detail, particularly emphasizing strategies that are emerging in the use of XML and lexical resources for the question answering and novelty tasks.

Bibtex
@inproceedings{DBLP:conf/trec/Litkowski04,
    author = {Kenneth C. Litkowski},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Evolving {XML} and Dictionary Strategies for Question Answering and Novelty Tasks},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/clresearch.qa.novelty.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/Litkowski04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

AnswerFinder at TREC 2004

Diego Mollá, Mary Gardiner

Abstract

AnswerFinder combines lexical, syntactic, and semantic information in various stages of the question answering process. The candidate sentences are preselected on the basis of (i) the presence of named entity types compatible with the expected answer type, and (ii) a score combination of the overlap of words, grammatical relations, and flat logical forms. The candidate answers, in turn, are extracted from (i) the set of compatible named entities and (ii) the output of a logical-form pattern matching algorithm.
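
A sketch of the preselection score as we read it: a weighted combination of word, grammatical-relation, and flat-logical-form overlaps between question and candidate sentence. The weights and set representations are placeholders.

    def overlap(a: set, b: set) -> float:
        # Generic overlap: fraction of the question-side items that
        # also occur on the candidate-sentence side.
        return len(a & b) / len(a) if a else 0.0

    def preselection_score(q, s, w_words=1.0, w_grs=2.0, w_lfs=3.0):
        # q and s are (words, grammatical_relations, flat_logical_forms)
        # triples of sets; the weights are illustrative placeholders.
        q_w, q_gr, q_lf = q
        s_w, s_gr, s_lf = s
        return (w_words * overlap(q_w, s_w)
                + w_grs * overlap(q_gr, s_gr)
                + w_lfs * overlap(q_lf, s_lf))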

Bibtex
@inproceedings{DBLP:conf/trec/MollaG04,
    author = {Diego Moll{\'{a}} and Mary Gardiner},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {AnswerFinder at {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/macquarieu.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/MollaG04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Experiments with Web QA System and TREC 2004 Questions

Dmitri Roussinov, Yin Ding, Jose Antonio Robles-Flores

Abstract

We describe our first participation in TREC. We competed only in the Question Answering (QA) category and limited our runs to factoid questions. Our approach was to use our open-domain QA system, which finds the answer among Web pages indexed by a commercial search engine, and then to project the answer onto the TREC test collection. Our Web QA takes advantage of the redundancy of the Web: it obtains candidate answers by pattern matching and then performs probabilistic triangulation of them to assign a final score. Our novel contributions are the following: (1) the probabilistic triangulation algorithm, (2) a more powerful pattern language than used in prior research, and (3) the use of semantic features of expected answers instead of reliance on an elaborate hierarchy of question types. Although we were able to run only the first 91 of the 230 factoid questions before the submission deadline, we find our result encouraging; if interpolated to the entire question set, it would have placed us above the median performance on factoid questions.
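
The abstract does not give the triangulation formula; a noisy-OR combination of independent pieces of Web evidence is one plausible reading, sketched below with names of our own choosing.

    def triangulate(evidence_probs: list[float]) -> float:
        # Noisy-OR combination: the candidate answer is wrong only if
        # every independent piece of Web evidence is wrong.
        p_all_wrong = 1.0
        for p in evidence_probs:
            p_all_wrong *= (1.0 - p)
        return 1.0 - p_all_wrong

    # Example: three snippets supporting the same candidate with
    # confidences 0.4, 0.5, 0.3 combine to 1 - 0.6*0.5*0.7 = 0.79.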

Bibtex
@inproceedings{DBLP:conf/trec/RoussinovDR04,
    author = {Dmitri Roussinov and Yin Ding and Jose Antonio Robles{-}Flores},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Experiments with Web {QA} System and {TREC} 2004 Questions},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/arizonau.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/RoussinovDR04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering with QACTIS at TREC 2004

Patrick Schone, T. Bassi, A. Kulman, Gary M. Ciany, Paul McNamee, James Mayfield

Abstract

We provide a description of the QACTIS question-answering system and its application to and performance in the 2004 TREC question-answering evaluation. Since this was QACTIS's first year competing at TREC, we also provide a complete overview of its purpose, development, structure, and future directions.

Bibtex
@inproceedings{DBLP:conf/trec/SchoneBKCMM04,
    author = {Patrick Schone and T. Bassi and A. Kulman and Gary M. Ciany and Paul McNamee and James Mayfield},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering with {QACTIS} at {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/nsa-schone.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/SchoneBKCMM04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Question Answering Using the DLT System at TREC 2004

Richard F. E. Sutcliffe, Igal Gabbay, Kieran White, Aoife O'Gorman, Michael Mulcahy

Abstract

This article outlines our participation in the Question Answering Track of the Text REtrieval Conference organised by the National Institute of Standards and Technology. We first provide an outline of the system. We then describe the changes made relative to last year. After this we summarise our results before drawing conclusions and identifying some next steps.

Bibtex
@inproceedings{DBLP:conf/trec/SutcliffeGWOM04,
    author = {Richard F. E. Sutcliffe and Igal Gabbay and Kieran White and Aoife O'Gorman and Michael Mulcahy},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Question Answering Using the {DLT} System at {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/u.limerick.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/SutcliffeGWOM04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

THUIR at TREC 2004: QA

Wei Tan, Qunxiu Chen, Shaoping Ma

Abstract

In this paper, we describe the ideas and related experiments of the Tsinghua University IR group in the TREC 2004 QA track. Our system consists of three components: question analysis, information retrieval, and answer extraction. The question analysis component extracts query terms and the answer type. The information retrieval component retrieves candidate documents from the index at the paragraph level and re-ranks them to find more relevant documents. The answer extraction component then matches empirical phrases according to the answer type to extract the final answer.

Bibtex
@inproceedings{DBLP:conf/trec/TanCM04,
    author = {Wei Tan and Qunxiu Chen and Shaoping Ma},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{THUIR} at {TREC} 2004: {QA}},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/tsinghua-ma.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/TanCM04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

Combining Linguistic Processing and Web Mining for Question Answering: ITC-irst at TREC 2004

Hristo Tanev, Milen Kouylekov, Bernardo Magnini

Abstract

This paper describes the work we have done in the last year on the DIOGENE Question Answering system developed at ITC-irst. We present two preliminary experiments showing the possibility of integrating into DIOGENE a textual entailment engine based on entailment rules. We addressed the problem by proposing both a methodology for acquiring rules from the Web and a matching algorithm for comparing dependency trees derived from the question and from documents. Although the overall results are not high, we consider this year's participation at TREC an intermediate step toward a more complete and in-depth integration of textual entailment rules into the system. We also report on the problems we encountered in maintaining the Web-based answer validation module.
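
A toy sketch of dependency-tree matching in the spirit described: a question tree is matched when its words and dependency edges can be mapped onto a document tree. The Node type and matching policy are our simplifications, not the DIOGENE algorithm.

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        word: str
        children: list = field(default_factory=list)  # (relation, Node) pairs

    def subsumed(q: "Node", d: "Node") -> bool:
        # The question node matches when the words agree and each
        # question dependency edge maps onto a document edge with the
        # same relation, recursively down both trees.
        if q.word != d.word:
            return False
        return all(any(q_rel == d_rel and subsumed(q_kid, d_kid)
                       for d_rel, d_kid in d.children)
                   for q_rel, q_kid in q.children)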

Bibtex
@inproceedings{DBLP:conf/trec/TanevKM04,
    author = {Hristo Tanev and Milen Kouylekov and Bernardo Magnini},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {Combining Linguistic Processing and Web Mining for Question Answering: ITC-irst at {TREC} 2004},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/itc-irst-tanev.web.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/TanevKM04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}

FDUQA on TREC 2004 QA Track

Lide Wu, Xuanjing Huang, Lan You, Zhushuo Zhang, Xin Li, Yaqian Zhou

Abstract

In this year's QA track, we process factoid questions in a way that is slightly different from our previous system [1]. The most significant difference is that we developed a new answer type category and trained a classifier for answer type classification. To answer list questions, we use a pattern-based method to find more answers beyond those found while processing the factoid questions. Definition questions are answered by an algorithm that uses several knowledge bases, and this algorithm achieves a promising result. In our system, external knowledge is widely used, including WordNet and the Internet. The ontology in WordNet is used in answer type classification, and its synsets are used for query expansion. The Internet is used not only to find answers to factoid questions, but also as a knowledge base for definition questions. In the following, Sections 2, 3, and 4 introduce our algorithms for factoid, list, and definition questions, respectively. Section 5 presents our results in TREC 2004.
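
As an illustration of synset-based query expansion, here is a small sketch using NLTK's WordNet interface (it assumes the WordNet corpus has been downloaded); the authors' actual expansion procedure may differ.

    from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

    def expand_query(terms):
        # Add WordNet synonyms of each query term; a simplified
        # stand-in for the synset-based query expansion described.
        expanded = set(terms)
        for term in terms:
            for synset in wn.synsets(term):
                expanded.update(lemma.name().replace("_", " ")
                                for lemma in synset.lemmas())
        return expanded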

Bibtex
@inproceedings{DBLP:conf/trec/WuHYZLZ04,
    author = {Lide Wu and Xuanjing Huang and Lan You and Zhushuo Zhang and Xin Li and Yaqian Zhou},
    editor = {Ellen M. Voorhees and Lori P. Buckland},
    title = {{FDUQA} on {TREC} 2004 {QA} Track},
    booktitle = {Proceedings of the Thirteenth Text REtrieval Conference, {TREC} 2004, Gaithersburg, Maryland, USA, November 16-19, 2004},
    series = {{NIST} Special Publication},
    volume = {500-261},
    publisher = {National Institute of Standards and Technology {(NIST)}},
    year = {2004},
    url = {http://trec.nist.gov/pubs/trec13/papers/fudan.wu.qa.pdf},
    timestamp = {Thu, 12 Mar 2020 00:00:00 +0100},
    biburl = {https://dblp.org/rec/conf/trec/WuHYZLZ04.bib},
    bibsource = {dblp computer science bibliography, https://dblp.org}
}