Overview - Confusion 1996

For TREC-5, retrieval from corrupted data was studied through the retrieval of single target documents from a corpus that had been corrupted by producing page images, corrupting the bit maps, and applying OCR techniques to the results. In general, methods that attempted a probabilistic estimation of the original clean text fared better than methods that simply accepted corrupted versions of the query text.

Track coordinator(s):

  • P. Kantor, Rutgers University
  • E. Voorhees, National Institute of Standards and Technology (NIST)