Congratulations to Kuan-Yu Chen and Hung-Shin Lee for receiving the IEEE Spoken Language Processing Student Grant for presenting their paper at ICASSP 2014.
Congratulations to Ju-Chiang Wang for receiving the 2013 Silver Medal of the Merry Electroacoustic Thesis Award.
Congratulations to Ju-Chiang Wang for receiving the 2013 TAAI Best PhD Dissertation Award.
Congratulations to Hung-Yi Lo for receiving the 2013 IEEE Tainan Section Best PhD Thesis Award.
Congratulations to Hung-Yi Lo for receiving the 2013 IICM Outstanding PhD Dissertation Award.

Our research interests include spoken language processing, natural language processing, multimedia information retrieval, machine learning, and pattern recognition. Our goal is to develop methods for analyzing, extracting, recognizing, indexing, and retrieving information from audio data, with special emphasis on speech and music.

In the speech area, our research has focused mainly on speech recognition, speaker recognition, speaker segmentation/clustering/diarization, spoken document retrieval/summarization, etc. Recent achievements include a minimum-boundary-error-based discriminative acoustic model training and decoding framework for automatic phone segmentation, a novel characterization of the alternative hypothesis using kernel discriminant analysis for likelihood ratio-based speaker verification, a new divide-and-conquer framework for fast speaker segmentation and diarization, and a probabilistic generative framework for extractive spoken document summarization. Our ongoing research includes attribute-detection-based speech/language recognition, language modeling for speech recognition/document classification/information retrieval, voice conversion, hidden Markov model-based speech synthesis, etc.
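To make the likelihood ratio-based speaker verification setting concrete, the sketch below shows a generic GMM-UBM baseline: a claimed identity is accepted if the average log-likelihood ratio between the target-speaker model and a background model exceeds a decision threshold. This is only an illustrative baseline with assumed feature matrices and scikit-learn models, not the kernel discriminant analysis characterization developed in our work.

```python
# Minimal, generic sketch of likelihood ratio-based speaker verification
# (a GMM-UBM baseline), not the kernel discriminant analysis method above.
# Feature matrices (e.g., MFCC frames of shape [n_frames, n_dims]) are assumed given.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_models(target_feats, background_feats, n_components=64):
    """Fit a target-speaker GMM and a universal background model (UBM)."""
    target_gmm = GaussianMixture(n_components, covariance_type="diag").fit(target_feats)
    ubm = GaussianMixture(n_components, covariance_type="diag").fit(background_feats)
    return target_gmm, ubm

def verify(test_feats, target_gmm, ubm, threshold=0.0):
    """Accept the claimed identity if the average log-likelihood ratio
    log p(X | target) - log p(X | alternative) exceeds the threshold."""
    llr = target_gmm.score(test_feats) - ubm.score(test_feats)  # per-frame averages
    return llr > threshold, llr
```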

In the music area, our research has focused mainly on vocal melody extraction, query by singing/humming, solo vocal modeling, music tag annotation, tag-based music information retrieval (MIR), etc. Recent achievements include a novel cost-sensitive multi-label (CSML) learning framework for automatic music tagging, as well as a novel query scenario based on multiple tags with multiple levels of preference (denoted an MTML query) together with a corresponding tag cloud-based query interface for MIR. We have participated in the MIREX audio tag classification task since 2009 and achieved top performance. Our ongoing research includes continual improvement of our own technologies and systems, audio feature analysis, semantic visualization of music tags, and vocal separation, so as to facilitate the management and retrieval of large music databases. Our future research directions also include real-time music tagging, singing voice synthesis, and automatic music structure analysis/summarization.
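As one way an MTML query could be scored, the sketch below ranks songs by a preference-weighted combination of per-song tag relevance scores (such as those produced by an automatic tagger). The data layout, the 1-to-5 preference scale, and the function names are assumptions made for illustration, not the published system or interface.

```python
# Illustrative sketch of ranking songs for a multi-tag query with per-tag
# preference levels (an MTML-style query); the score matrix layout and the
# preference scale are assumptions, not the published system.
import numpy as np

def rank_songs(tag_scores, query_prefs, song_ids):
    """tag_scores: (n_songs, n_tags) array of per-song tag relevance scores.
    query_prefs: dict mapping tag index -> preference level (e.g., 1..5).
    Returns (song_id, score) pairs sorted by the preference-weighted tag score."""
    weights = np.zeros(tag_scores.shape[1])
    for tag_idx, level in query_prefs.items():
        weights[tag_idx] = level
    combined = tag_scores @ weights            # weighted sum over the queried tags
    order = np.argsort(-combined)              # highest combined score first
    return [(song_ids[i], float(combined[i])) for i in order]

# Example: three songs, four tags; query asks for tag 0 (strongly) and tag 2 (mildly).
scores = np.array([[0.9, 0.1, 0.2, 0.0],
                   [0.3, 0.8, 0.7, 0.1],
                   [0.6, 0.2, 0.9, 0.4]])
print(rank_songs(scores, {0: 5, 2: 2}, ["song_a", "song_b", "song_c"]))
```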