2005 Symposium on Next Generation Automatic Speech Recognition

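Nearly a year has passed since the 2004 Symposium on Next Generation Automatic Speech Recognition was successfully held at the end of last November. Since then, Prof. Chin-Hui Lee (李錦輝) has obtained considerable results from his research in this area, and the group research project "Research on Next-Generation Automatic Speech Recognition Technology" (新世代自動語音辨識技術之研究), jointly proposed to the National Science Council (NSC) by several colleagues in Taiwan, has been approved by the NSC and is now under way.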

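To exchange ideas with more scholars and experts in Taiwan and to broaden collaboration, Prof. Lee has decided to set aside one day during his visit to Taiwan in September to give a talk and hold a discussion with everyone. To encourage wider participation, the event has been scheduled on a weekend so that everyone can attend.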



10:00-12:00 Prof. Chin-Hui Lee's talk on "Speech Recognition Based on Attribute Detection and Knowledge Integration"
Recently we have been promoting a new speech research paradigm under an NSF-funded project called automatic speech attribute transcription, or ASAT. It has long been postulated that a human determines the linguistic identity of a sound based on detected evidence that exists at various levels of the speech knowledge hierarchy, from acoustics to pragmatics. The ASAT approach to ASR is formulated as follows. First, a bank of event detectors converts an input speech signal into a collection of time series, each describing the level of presence (or level of activity) of a particular property (or attribute) in speech over time. Then a collection of event mergers integrates the detected events based on available knowledge sources and attempts to infer the presence of higher-level evidence (e.g., a phone or even a word). Finally, these events are validated by evidence verifiers to produce a partially integrated lattice of hypotheses. Such refined lattices are then fed back for further knowledge integration. This iterative information fusion process always uses the original event activity functions as the raw inputs. To perform ASR, a terminating strategy can be instituted by exhausting all the supported attributes, and a final decision can be produced with all the detected and validated evidence, along with the recognized sentence itself, to support any desired application. In this talk, we will describe in detail the above three major functional modules of the proposed ASAT framework and give some preliminary results.
12:00-13:00 Lunch
13:00-15:00 Discussion
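
For readers unfamiliar with the detector, merger, and verifier structure outlined in the talk abstract, the following toy sketch in Python may help fix the idea. It is only an illustration under strong assumptions, not the ASAT implementation: the attribute names ("loud", "positive"), the unit labels, the fuzzy-AND merging rule, and the fixed threshold are all hypothetical and were chosen only to keep the example small. Each rule is merged from the original activity functions, and the loop ends once every rule has been tried, loosely mirroring the terminating strategy mentioned in the abstract.

```python
from typing import Callable, Dict, List, Tuple

Activity = List[float]                         # one activity value per frame, in [0, 1]
Detector = Callable[[List[float]], Activity]
Hypothesis = Tuple[str, int, float]            # (label, frame index, score)


def run_detectors(signal: List[float],
                  detectors: Dict[str, Detector]) -> Dict[str, Activity]:
    """Bank of event detectors: one activity time series per attribute."""
    return {name: detect(signal) for name, detect in detectors.items()}


def merge_events(activities: Dict[str, Activity], label: str,
                 required: List[str]) -> List[Hypothesis]:
    """Event merger (hypothetical rule): score a higher-level unit per frame
    as the minimum activity of its required attributes (a fuzzy AND)."""
    n_frames = len(activities[required[0]])
    return [(label, t, min(activities[a][t] for a in required))
            for t in range(n_frames)]


def verify(hypotheses: List[Hypothesis], threshold: float) -> List[Hypothesis]:
    """Evidence verifier (hypothetical rule): keep hypotheses scoring at or
    above a fixed threshold."""
    return [h for h in hypotheses if h[2] >= threshold]


def asat_sketch(signal: List[float], detectors: Dict[str, Detector],
                rules: Dict[str, List[str]],
                threshold: float = 0.5) -> List[Hypothesis]:
    """Merge and verify one rule at a time, always starting from the original
    activity functions; stop when all rules have been tried."""
    activities = run_detectors(signal, detectors)
    lattice: List[Hypothesis] = []
    for label, required in rules.items():
        candidates = merge_events(activities, label, required)
        lattice.extend(verify(candidates, threshold))
    return lattice


# Toy detectors and rules, purely illustrative.
TOY_DETECTORS: Dict[str, Detector] = {
    "loud": lambda s: [min(1.0, abs(x)) for x in s],
    "positive": lambda s: [1.0 if x > 0 else 0.0 for x in s],
}
TOY_RULES = {"unit_A": ["loud", "positive"], "unit_B": ["loud"]}

lattice = asat_sketch([0.9, -0.8, 0.1, 0.7], TOY_DETECTORS, TOY_RULES)
for label, frame, score in lattice:
    print(label, frame, score)
```

A real system would replace the toy detectors with trained attribute detectors and the threshold test with statistical verification; the sketch only shows how the three modules pass data to one another.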


