- When: 27th November 2012 15:00 - 16:00
- Where: Phys Theatre C
- Format: Seminar
Professor Mari Ostendorf of the University of Washington is visiting
Edinburgh, Glasgow and St Andrews as part of a SICSA Distinguishing
Title: Rich Speech Transcription for Spoken Document Processing
As storage costs drop and bandwidth increases, there has been rapid growth of spoken information available via the web or in online archives (including radio and TV broadcasts, oral histories, legislative proceedings and call center recordings), raising problems of document retrieval, information extraction, summarization and translation for spoken language. While there is a long tradition of research in these technologies for text, new challenges arise when moving from written to spoken language. In this talk, we look at differences between speech and text, and at how we can leverage the information in the speech signal beyond the words to provide a rich, automatically generated transcript that better serves language processing applications. In particular, we look at how prosodic cues can be used to recognize segmentation, emphasis and intent in spoken language, and how this information can impact tasks such as topic detection, information extraction, translation, and social group analysis.