I am a Research Fellow at the National University of Singapore (NUS), supervised by Prof. Li Haizhou. Before that, I received my Ph.D. and M.Sc. degrees from NUS in 2023 and 2019, respectively, and my B.Eng. degree from Soochow University in 2018.

My research interest is audio-visual speech processing, including (audio-only or audio-visual) speaker recognition, speaker diarization, speech extraction, active speaker detection, and self-supervised learning. I have published more than 10 papers in top international AI conferences and journals such as TASLP, ACM MM, ICASSP, and INTERSPEECH.

📜 Research Area

Speech Processing: Speaker recognition, speaker diarization, target speaker extraction, anti-spoofing, speech separation, voice conversion, text-to-speech

Computer Vision: Face recognition, face detection, lip reading

Multi-modal Processing: Audio-visual active speaker detection, AV speaker recognition, AV target speaker extraction, talking face generation

Algorithm: Self-supervised speech processing

🏫 Education

  • 2019.08 - 2023.08, Ph.D. in Speech Processing and Computer Vision, National University of Singapore (NUS), Singapore.
  • 2018.08 - 2019.06, M.Sc. in Electronic and Computer Engineering, National University of Singapore (NUS), Singapore.
  • 2014.09 - 2018.06, B.Eng. in Electronic Engineering, Soochow University, Suzhou, China.

Work Experience

  • 2023.08 - Present, Research Fellow, National University of Singapore (NUS), Singapore.

📝 Publication

2024

2023

2022

2021

2020

💻 Open Source Code

  • Speaker Recognition Framework
  • Active Speaker Detection Framework
  • Self-supervised Speaker Recognition Framework
  • Audio-visual Speaker Recognition Framework
  • Ego4d Benchmark

👔 Internship and Visiting Experience

  • 2022.02 - 2022.08, Visiting Student, The Chinese University of Hong Kong, Shenzhen (CUHKSZ), Shenzhen, China.
  • 2015.07 - 2015.08, Visiting Student, University of Cambridge, Cambridge, UK.

🎖 Others

Award

  • Nanyang Speech Technology Forum, Best Student Paper Award, 2023
  • PREMIA, Best Student Paper Award, 2022
  • NIST Speaker Recognition Evaluation (SRE), 2nd Place, 2021
  • ActivityNet Challenge (Speaker), CVPR Workshop, 3rd Place, 2021
  • NUS Research Scholarship, 2019

Reviewer

  • Computer Vision and Pattern Recognition Conference (CVPR)
  • Transactions on Audio, Speech, and Language Processing (TASLP)
  • The International Conference on Acoustics, Speech, & Signal Processing (ICASSP)
  • Signal Processing Letters (SPL)
  • Digital Signal Processing (DSP)
  • Computer Speech & Language (CSL)

Teaching

  • EE3801 Data Engineering Principles, NUS undergraduate course
  • EE5132 Wireless and Sensor Networks, NUS graduate course