About
Hi, I am a final-year Ph.D. student at the Graduate School of Information Science and Technology, the University of Tokyo, Japan. My CV is available [here].
My research focuses on spoken language processing, including speech synthesis, speech representation learning, and speech quality assessment. I am especially interested in developing speech processing models that leverage a broad range of pre-existing data and foundation models. By doing so, these models can acquire a comprehensive understanding of diverse expressions, languages, prosody, and speaker individualities. This facilitates the creation of more natural and familiar artificial voice user interfaces.
Email (university): takaaki_saeki [at] ipc.i.u-tokyo.ac.jp
Email (personal): saefrospace [at] gmail.com
Address: Room #140, Engineering bldg. #6, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
Education
- Apr. 2021 - Mar. 2024
Ph.D. Degree in Information Science and Technology, the University of Tokyo, Japan
Department of Creative Informatics, Graduate School of Information Science and Technology
Supervisor: Prof. Hiroshi Saruwatari
- Apr. 2019 - Mar. 2021
Master’s Degree in Information Science and Technology, the University of Tokyo, Japan
Department of Creative Informatics, Graduate School of Information Science and Technology
Supervisor: Prof. Hiroshi Saruwatari
- Apr. 2015 - Mar. 2019
Bachelor’s Degree in Engineering, the University of Tokyo, Japan
Department of Aeronautics and Astronautics, Faculty of Engineering
Awards & Honours
- Mar. 2024
Dean’s Award [link]
Graduate School of Information Science and Technology, The University of Tokyo
- July 2022
Yamashita SIG Research Award [link]
Information Processing Society of Japan (IPSJ)
- June 2022
Best Paper Award from IEICE [link]
The Institute of Electronics, Information and Communication Engineers (IEICE)
- Mar. 2022
Ranked 1st Place in 10/16 Metrics [link]
VoiceMOS Challenge 2022 at INTERSPEECH 2022
- Mar. 2022
Telecom System Technology Award for Students [link]
The Telecommunication Advancement Foundation
- Oct. 2021
Best Student Presentation Award [link]
Acoustical Society of Japan (ASJ)
- Mar. 2021
Best Student Poster Award [link]
IEICE Speech Committee
- Feb. 2019
Award for Excellence (2nd Place)
Recruit Holdings NLP Hackathon
Grants & Scholarships
- Aug. 2022
Google East Asia Student Travel Grants
Google
- Mar. 2022
UTokyo-TOYOTA Study Abroad Scholarship in AI Field
The University of Tokyo
- Apr. 2022 - Mar. 2024
Research Fellowship for Young Scientists (DC2)
Japan Society for the Promotion of Science (JSPS)
- July 2021
Tobitate Study Abroad Initiative (Declined due to COVID-19)
Ministry of Education, Culture, Sports, Science and Technology
- June 2021 - Mar. 2022
TOYOTA/Dwango AI Scholarship
The University of Tokyo
Experience
- May 2023 - Aug. 2023
Google New York, Research Intern
Conducted research on speech processing.
- Oct. 2022 - Jan. 2023
Carnegie Mellon University, Visiting Scholar
Conducted research on low-resource multilingual speech synthesis.
- Apr. 2022 - Sep. 2022
Google Tokyo, Student Researcher
Conducted research on massive multilingual semi-supervised learning for speech synthesis.
- Mar. 2021 - Mar. 2022
LINE Corporation, Part-Time Researcher
Conducted research on noise-robust text-to-speech synthesis.
- Aug. 2021 - Sep. 2021
Preferred Networks, Research Intern
Worked on singing voice conversion.
- Aug. 2019 - Sep. 2019
NEC Datascience Research Laboratories, Research Intern
Worked on speech enhancement.
- Feb. 2019 - June 2019
Recruit Holdings Co., Ltd., Engineering Intern & Part-Time Engineer
Worked on data analysis and developed a recommendation engine.
Reviewing
- IEEE ICASSP: 2023, 2024
- INTERSPEECH: 2023
- IEEE Signal Processing Letters: 2023
- IEEE/ACM Transactions on Audio, Speech and Language Processing: 2023
Miscellaneous Work
- June 2022
Research Talk at Google Tokyo
Self-Supervised Speech Restoration for Historical Audio
- Jan. 2021
Exhibition at Sainokuni Business Arena 2021
Research on Stress-Free, Real-Time, and Full-Band Voice Conversion Based on Perceptual Model
- Oct. 2020
Exhibition at CEATEC 2020
Research on Stress-Free, Real-Time, and Full-Band Voice Conversion Based on Perceptual Model