Building an information retrieval test collection for spontaneous conversational speech
Title | Building an information retrieval test collection for spontaneous conversational speech |
Publication Type | Conference Papers |
Year of Publication | 2004 |
Authors | Oard D, Soergel D, Doermann D, Huang X, Murray CG, Wang J, Ramabhadran B, Franz M, Gustman S, Mayfield J, Kharevych L, Strassel S |
Conference Name | Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval |
Date Published | 2004 |
Publisher | ACM |
Conference Location | New York, NY, USA |
ISBN Number | 1-58113-881-4 |
Keywords | Automatic speech recognition, oral history, search-guided relevance assessment |
Abstract | Test collections model use cases in ways that facilitate evaluation of information retrieval systems. This paper describes the use of search-guided relevance assessment to create a test collection for retrieval of spontaneous conversational speech. Approximately 10,000 thematically coherent segments were manually identified in 625 hours of oral history interviews with 246 individuals. Automatic speech recognition results, manually prepared summaries, controlled vocabulary indexing, and name authority control are available for every segment. Those features were leveraged by a team of four relevance assessors to identify topically relevant segments for 28 topics developed from actual user requests. Search-guided assessment yielded sufficient inter-annotator agreement to support formative evaluation during system development. Baseline results for ranked retrieval are presented to illustrate use of the collection. |
URL | http://doi.acm.org/10.1145/1008992.1009002 |
DOI | 10.1145/1008992.1009002 |