Hello, I’m Ryota Komatsu. Currently, I work on spoken language processing and its applications to spoken dialogue systems.

Education

2025.04 - 2028.03, doctoral student in the Department of Information and Communications Engineering, Institute of Science Tokyo, supervised by Prof. Takahiro Shinozaki.
2021.04 - 2023.03, master's student in the Department of Information and Communications Engineering, Tokyo Institute of Technology, supervised by Prof. Takahiro Shinozaki.
2017.04 - 2021.03, undergraduate student in the Department of Information and Communications Engineering, Tokyo Institute of Technology, supervised by Prof. Isao Yamada.

Work experience

2023.04 - 2025.01, Research & Development Group, Hitachi, Ltd.

Publications

Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT

R. Komatsu and T. Shinozaki, "Self-supervised syllable discovery based on speaker-disentangled HuBERT," in Proc. IEEE Spoken Language Technology Workshop (SLT), Dec. 2024, pp. 1131–1136.

Continuous Action Space-Based Spoken Language Acquisition Agent Using Residual Sentence Embedding and Transformer Decoder

R. Komatsu, Y. Kimura, T. Okamoto, and T. Shinozaki, "Continuous action space-based spoken language acquisition agent using residual sentence embedding and transformer decoder," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun. 2023.

Automatic Spoken Language Acquisition Based on Observation and Dialogue

R. Komatsu, S. Gao, W. Hou, M. Zhang, T. Tanaka, K. Toyoda, Y. Kimura, K. Hino, Y. Iwamoto, K. Mori, T. Okamoto, and T. Shinozaki, "Automatic spoken language acquisition based on observation and dialogue," IEEE Journal of Selected Topics in Signal Processing (JSTSP), vol. 16, no. 6, pp. 1480–1492, 2022.

Pronunciation Adaptive Self Speaking Agent Using WaveGrad

T. Tanaka, R. Komatsu, T. Okamoto, and T. Shinozaki, "Pronunciation adaptive self speaking agent using WaveGrad," in Proc. The 2nd Workshop on Self-supervised Learning for Audio and Speech Processing, Feb. 2022.

A Graph Regularized RPCA by Generalized Moreau Enhanced Model

R. Komatsu, M. Yamagishi, and I. Yamada, "A graph regularized RPCA by generalized Moreau enhanced model," in Proc. European Signal Processing Conference (EUSIPCO), Aug. 2021, pp. 2129–2133.

Awards

Best Student Presentation Award, Acoustical Society of Japan (ASJ), 2023.

Invited Talks

A Comprehensive Overview of Audio Language Models

December 03, 2025

Talk at the Sixth Joint Meeting of the Acoustical Society of America and the Acoustical Society of Japan, Honolulu, Hawaii

Introduction to Multimodal Large Language Models

June 13, 2025

Tutorial at Waseda University, Tokyo, Japan

Scholarship

Science Tokyo Tsubame Scholarship for Doctoral Students