Research Topics

Our major research topics include:

  1. Speech processing, including speech recognition/understanding/information retrieval, speech synthesis, voice conversion, and multi-modal dialogue interfaces.
  2. Acoustic signal processing, including microphone arrays, sound field control/reproduction, and sound field coding.
  3. New speech media for speech-based universal communication, such as quiet speech media (NAM: Non-Audible Murmur), and new speech separation algorithms such as Blind Source Separation (BSS).

We are currently researching effective uses of speech for universal communication, multi-modal interfaces, sound/speech media utilization in networks and communications, sound effects in multimedia, and speech/sound applications in real acoustic environments.



Research Areas
  1. Speech-based natural interfaces and information retrieval, and statistical language modeling and acoustic modeling for high-quality speech recognition/understanding.
  2. Robust speech recognition and dialogue systems in real acoustic environments.
  3. Multi-modal speech interfaces, lip reading, visual agents for speech interfaces, speech dialogue with robots, and Web retrieval by speech recognition.
  4. Hands-free speech recognition with a microphone array, distant-talking speech recognition in reverberant environments, speech enhancement by nonlinear array signal processing, and speech dialogue robots.
  5. Blind source separation, fast-learning algorithms for independent component analysis, and online sound-source separation, including for moving sound sources.
  6. Speech synthesis by rule, speech analysis-by-synthesis, voice conversion, and speech morphing.
  7. Sound field reproduction systems, robust sound field reproduction in real acoustic environments, multi-loudspeaker systems, 3-D sound field coding, and virtual sound realization.
  8. Non-Audible Murmur (NAM) applied to non-voice speech recognition, very quiet telephony, and aids for people with speech impairments.


References
  1. Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano, "Lip Movement Synthesis from Speech Based on Hidden Markov Models," Speech Communication, Vol.26, Nos.1-2, pp.105-115, 1998.
  2. Tetsuya Takiguchi, Satoshi Nakamura, Kiyohiro Shikano, "HMM-Separation-Based Speech Recognition for a Distant Moving Speaker," IEEE Transactions on Speech and Audio Processing, Vol.9, No.2, pp.127-140, 2001.
  3. Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano, "Distant-talking speech recognition based on a 3-D Viterbi search using a microphone array," IEEE Transactions on Speech and Audio Processing, Vol.10, No.2, pp.48-56, 2002.
  4. Akinobu Lee, Tatsuya Kawahara, Kazuya Takeda, and Kiyohiro Shikano, "A new phonetic tied-mixture model for efficient decoding," Proc. ICASSP2000, pp.1269-1272, 2000.
  5. Yosuke Tatekura, Hiroshi Saruwatari, and Kiyohiro Shikano, "An iterative inverse filter design method for the multichannel sound field reproduction system," IEICE Trans. Fundamentals, Vol.E84-A, No.4, pp.991-998, 2001.
  6. Hiroshi Saruwatari, Toshiya Kawamura, Tsuyoki Nishikawa, Kiyohiro Shikano, "Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing," IEICE Trans. Fundamentals, Vol.E86-A, No.3, pp.634-639, March 2003.
  7. Tomoya Takatani, Tsuyoki Nishikawa, Hiroshi Saruwatari, Kiyohiro Shikano, "High-Fidelity Blind Separation of Acoustic Signals Using SIMO-Model-Based Independent Component Analysis," IEICE Trans. Fundamentals, Vol.E87-A, No.8, pp.2063-2072, August 2004.
  8. Hiroshi Saruwatari, Hiroaki Yamajo, Tomoya Takatani, Tsuyoki Nishikawa, and Kiyohiro Shikano, "Blind separation and deconvolution for convolutive mixture of speech combining SIMO-model-based ICA and multichannel inverse filtering," IEICE Trans. Fundamentals, Vol.E88-A, No.9, pp.2387-2400, 2005.
  9. Yoshitaka Nakajima, Hideki Kashioka, Nick Campbell, Kiyohiro Shikano, "Non-Audible Murmur (NAM) Recognition," IEICE Trans. Information and Systems, Vol.E89-D, No.1, pp.1-8, 2006.
  10. Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano, Yosuke Tatekura, "Interface for Barge-in Free Spoken Dialogue System Using Nullspace Based Sound Field Control and Beamforming," IEICE Trans. Fundamentals, Vol.E89-A, No.3, pp.716-726, 2006.
  11. Tobias Cincarek, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano, "Utterance-based Selective Training for the Automatic Creation of Task-Dependent Acoustic Models," IEICE Trans. Information and Systems, Vol.E89-D, No.3, pp.962-969, 2006.
  12. Randy Gomez, Akinobu Lee, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano, "Improving Rapid Unsupervised Speaker Adaptation based on HMM Sufficient Statistics in Noisy Environments using Multi-template Models," IEICE Trans. Information and Systems, Vol.E89-D, No.3, pp.998-1005, 2006.
  13. Yoshimitsu Mori, Hiroshi Saruwatari, Tomoya Takatani, Satoshi Ukai, Kiyohiro Shikano, Takashi Hiekata, Youhei Ikeda, Hiroshi Hashimoto, and Takashi Morita, "Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking," EURASIP Journal on Applied Signal Processing, Vol.2006, Article ID 34970, 17 pages, 2006.
  14. Panikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano, "Unvoiced Speech Recognition Using Tissue-conductive Acoustic Sensor," EURASIP Journal on Advances in Signal Processing, Vol.2007, Article ID 94068, 11 pages, 2007.
  15. Tomoki Toda, Keiichi Tokuda, "A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis," IEICE Trans. Information and Systems, Vol.E90-D, No.5, pp.816-824, May 2007.
  16. Tomoki Toda, Alan W Black, Keiichi Tokuda, "Voice Conversion Based on Maximum Likelihood Estimation of Spectral Parameter Trajectory," IEEE Transactions on Audio, Speech and Language Processing, Vol.15, No.8, pp.2222-2235, Nov. 2007.
  17. Tobias Cincarek, Hiromichi Kawanami, Ryuichi Nishimura, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano, "Development, Long-Term Operation and Portability of a Real-Environment Speech-oriented Guidance System," IEICE Trans. Information and Systems, Vol.E91-D, No.3, pp.576-587, March 2008.


Equipment

The Speech and Acoustic Laboratory is equipped with the following:


Grants and Funding (in 2007-2008)