I am a Ph.D. student in Electrical Engineering and Computer Science (EECS) at National Taiwan University (NTU). I am a member of the Speech Processing and Machine Learning (SPML) Lab and the Multimedia Information Retrieval Lab (MIR Lab), where I work with Prof. Hung-yi Lee and Prof. Jyh-Shing Roger Jang. I received my M.S. degree from NTU in 2023 and my B.S. degree from National Taiwan University of Science and Technology (NTUST) in 2020. I was honored to receive a Google Student Travel Grant in 2024.
My Curriculum Vitae is here! My research interests span deep learning, audio-visual learning, and speech processing. We focus on deepfake detection from a defender’s perspective, developing methods to counter specific threats such as speech synthesis [1], singing voice synthesis [9], and adversarial attacks [2, 8]. Additionally, we are developing a lightweight audio-visual synchronization model to detect out-of-sync audio and video [10], and we have built a dataset [4] that incorporates recent advances in speech synthesis to strengthen defenses against synthetic voice threats.
[13] Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Xuanjun Chen, et al., "Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks," in ICLR 2025.
arXiv / Code
[12] Wenze Ren, Haibin Wu, Yi-Cheng Lin, Xuanjun Chen, Rong Chao, Kuo-Hsuan Hung, You-Jin Li, Wen-Yuan Ting, Hsin-Min Wang, Yu Tsao, "Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement," in ICASSP 2025.
arXiv
[11] Chih-Kai Yang, Yu-Kuan Fu, Chen-An Li, Yi-Cheng Lin, Yu-Xiang Lin, Wei-Chih Chen, Ho Lam Chung, Chun-Yi Kuan, Wei-Ping Huang, Ke-Han Lu, Tzu-Quan Lin, Hsiu-Hsuan Wang, En-Pei Hu, Chan-Jan Hsu, Liang-Hsuan Tseng, I-Hsiang Chiu, Ulin Sanga, Xuanjun Chen, Po-chun Hsu, Shu-wen Yang, Hung-yi Lee, "Building a Taiwanese Mandarin Spoken Language Model: A First Attempt," Tech Report, Nov. 2024.
arXiv
[10] Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-yi Lee, and Jyh-Shing Roger Jang, "Multimodal Transformer Distillation for Audio-Visual Synchronization," in ICASSP 2024.
IEEE / arXiv / Code / Poster
[9] Xuanjun Chen, Haibin Wu, Jyh-Shing Roger Jang, and Hung-yi Lee, "Singing Voice Graph Modeling for SingFake Detection," in Interspeech 2024 (Oral).
ISCA / arXiv / Code
[8] Xuanjun Chen*, Jiawei Du*, Haibin Wu, Jyh-Shing Roger Jang, and Hung-yi Lee, "Neural Codec-based Adversarial Sample Detection for Speaker Verification," in Interspeech 2024.
ISCA / arXiv / Code / Poster
[7] Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-wei Chang, Ho-Lam Chung, Alexander H. Liu, and Hung-yi Lee, "Towards audio language modeling - an overview," Overview Report, Feb. 2024.
arXiv / Awesome List
[6] Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, Yuan-Kuei Wu, Xuanjun Chen, Yu-Chi Pai, Hsiu-Hsuan Wang, Kai-Wei Chang, Alexander H. Liu, and Hung-yi Lee, "Codec-SUPERB: An In-Depth Analysis of Sound Codec Models," in Findings of ACL 2024.
ACL / arXiv / Leaderboard / Code / Huggingface
[5] Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Jiawei Du, Kai-Wei Chang, Ke-Han Lu, Alexander Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James Glass, Shinji Watanabe, and Hung-yi Lee, "Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural codec models," in IEEE SLT 2024.
arXiv
[4] Jiawei Du, I-Ming Lin, I-Hsiang Chiu, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-yi Lee, and Jyh-Shing Roger Jang, "DFADD: The Diffusion and Flow-Matching based Audio Deepfake Dataset," in IEEE SLT 2024.
arXiv / Code / Huggingface
[3] Hsuan-Yu Lin, Xuanjun Chen, Jyh-Shing Roger Jang, "Singer separation for karaoke content generation," in O-COCOSDA 2024.
arXiv / Project
[2] Xuanjun Chen*, Haibin Wu*, Helen Meng, Hung-yi Lee, and Jyh-Shing Roger Jang, "Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection," in IEEE SLT 2022, Jan 2023.
IEEE / arXiv / Demos / Poster / Video
[1] Xuanjun Chen*, Yen-Lun Liao*, Chung-Che Wang, and Jyh-Shing Roger Jang, "Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification," in ISCA SPSC 2022, Sept 2022.
ISCA / arXiv
2024: Google Student Travel Grant, Google LLC
2024: Bursary Award for Overseas Students, CTCI Foundation
2020-2025: Distinguished Academic Record Award (5 years), Taipei Kwong Tong Community Associations
2021: Ranked 3rd of 42 teams in the logical access track of the ASVspoof 2021 challenge, Interspeech 2021
2018-2020: Certificate of Achievement (3 semesters), NTUST (Top 5%)
2017: National Bronze and Guangdong Provincial Gold Awards, 3rd China College Internet Entrep. Comp.
2016-2017: National Encouragement Scholarship (Top 3%) and Third Prize of Academic Award (Top 20%), SZIIT
2023-Present: Reviewer/Program Committee, ACL ('24), EMNLP ('25), ICASSP ('23-'25), LREC-COLING/COLING ('24-'25), MLSP ('24), IALP ('24), ECCV AVGenL ('24)
2024: Invited Speaker, Topic: "Singing Voice Graph Modeling for SingFake Detection," Special Session: SVDD @ IEEE SLT 2024
2024: Technical Committee, Codec-SUPERB Challenge at IEEE SLT 2024
2024-2025: Teaching Assistant, EE5200: Introduction to Generative AI (2024 Spr.) and EE5184: Machine Learning (2025 Spr.), NTU