Publications (* equal contribution)

Audio Large Language Models

[7]
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment
Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, ..., Xuanjun Chen, ..., Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee
TASLP 2026 bib · arXiv · IEEE · Code
[6]
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, ..., Xuanjun Chen, ..., Shinji Watanabe, Hung-yi Lee
ICLR 2025 bib · arXiv · OpenReview · Code
[5]
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models
Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, ..., Xuanjun Chen, ..., Kai-Wei Chang, Alexander H. Liu, Hung-yi Lee
Findings of ACL 2024 bib · arXiv · Anthology · Leaderboard · Code · HF
[4]
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural codec models
Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Jiawei Du, Kai-Wei Chang, ..., Shinji Watanabe, Hung-yi Lee
IEEE SLT 2024 bib · arXiv · IEEE Xplore
[3]
A Preliminary Exploration with GPT-4o Voice Mode
Yu-Xiang Lin, Chih-Kai Yang, Wei-Chih Chen, Chen-An Li, Chien-yu Huang, Xuanjun Chen, Hung-yi Lee
Technical Report, Feb. 2025 bib · arXiv
[2]
Building a Taiwanese Mandarin Spoken Language Model: A First Attempt
Chih-Kai Yang, Yu-Kuan Fu, Chen-An Li, ..., Xuanjun Chen, ..., Shu-wen Yang, Hung-yi Lee
Technical Report, Nov. 2024 bib · arXiv
[1]
Towards audio language modeling - an overview
Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Ho-Lam Chung, Alexander Liu, Hung-yi Lee
Technical Report, Feb. 2024 bib · arXiv · Awesome

Retrieval Augmented Generation

[3]
CodaRAG: Connecting the Dots with Associativity Inspired by Complementary Learning
Cheng-Yen Li*, Xuanjun Chen*, Claire Lin, Wei-Yu Chen, Wenhua Nie, Hung-yi Lee, Jyh-Shing Roger Jang
ACM Trans. Intell. Syst. Technol. (Submitted) bib · arXiv
[2]
Only Ask What You Don't Know: Grounded Delta Planning for Efficient Multi-step RAG
Wei-Chieh Chou*, Xuanjun Chen*, Jian-Ren Lin, Claire Lin, Hung-yi Lee, Jyh-Shing Roger Jang
COLM 2026 (Submitted) bib · arXiv
[1]
A Preliminary Study of RAG for Taiwanese Historical Archives
Claire Lin*, Bo-Han Feng*, Xuanjun Chen*, Te-Lun Yang, Hung-yi Lee, Jyh-Shing Roger Jang
ROCLING 2025 (Best Paper Award) bib · arXiv · Anthology

Audio Deepfake Detection, Localization, Attribution, and Reliability

[13]
Mitigating Proxy-to-Wild Domain Gap in Deepfake Speech
Xuanjun Chen, Yun-Shing Wu, Wei-Chung Lu, Claire Lin, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang
Technical Report, 2026 bib · arXiv
[12]
Joint Fullband-Subband Modeling for High-Resolution SingFake Detection
Xuanjun Chen*, Chia-Yu Hu*, Sung-Feng Huang, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang
INTERSPEECH 2026 (Long Paper) bib · arXiv
[11]
CodecFake+: Codec-Based Resynthesized Data as a Proxy for Detecting CodecFake Speech
Xuanjun Chen*, Jiawei Du*, Haibin Wu, Lin Zhang, I-Ming Lin, ..., Jyh-Shing Roger Jang, Hung-yi Lee
TASLP 2026 bib · arXiv · IEEE · Project · HF · Code
[10]
Localizing Audio-Visual Deepfakes via Hierarchical Boundary Modeling
Xuanjun Chen*, Shih-Peng Cheng*, Jiawei Du, Lin Zhang, Xiaoxiao Miao, ..., Hung-yi Lee, Jyh-Shing Roger Jang
Technical Report, 2025 bib · arXiv
[9]
How Does Instrumental Music Help SingFake Detection?
Xuanjun Chen, Chia-Yu Hu, I-Ming Lin, Yi-Cheng Lin, I-Hsiang Chiu, ..., Hung-yi Lee, Jyh-Shing Roger Jang
ICASSP 2026 bib · arXiv · IEEE
[8]
Towards Generalized Source Tracing for Codec-Based Deepfake Speech
Xuanjun Chen*, I-Ming Lin*, Lin Zhang, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang
IEEE ASRU 2025 (Best Student Paper Nominee) bib · arXiv · Code
[7]
Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy
Xuanjun Chen*, I-Ming Lin*, Lin Zhang, Jiawei Du, Haibin Wu, Hung-yi Lee, Jyh-Shing Roger Jang
INTERSPEECH 2025 bib · arXiv · ISCA · Code
[6]
Singing Voice Graph Modeling for SingFake Detection
Xuanjun Chen, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee
INTERSPEECH 2024 (Oral) bib · arXiv · ISCA · Code · Lightning Talk
[5]
Neural Codec-based Adversarial Sample Detection for Speaker Verification
Xuanjun Chen*, Jiawei Du*, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee
INTERSPEECH 2024 bib · arXiv · ISCA · Code
[4]
DFADD: The Diffusion and Flow-Matching based Audio Deepfake Dataset
Jiawei Du, I-Ming Lin, I-Hsiang Chiu, Xuanjun Chen, ..., Yu Tsao, Hung-yi Lee, Jyh-Shing Roger Jang
IEEE SLT 2024 bib · arXiv · IEEE Xplore · Code · HF
[3]
Multimodal Transformer Distillation for Audio-Visual Synchronization
Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-yi Lee, Jyh-Shing Roger Jang
ICASSP 2024 bib · arXiv · IEEE Xplore · Code · Poster
[2]
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
Xuanjun Chen*, Haibin Wu*, Helen Meng, Hung-yi Lee, Jyh-Shing Roger Jang
IEEE SLT 2022, Jan 2023 bib · arXiv · IEEE Xplore · Demos · Poster · Video
[1]
Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification
Xuanjun Chen*, Yen-Lun Liao*, Chung-Che Wang, Jyh-Shing Roger Jang
SPSC Symposium at INTERSPEECH 2022 bib · arXiv · ISCA Archive

Audio Generation

[3]
Training-Efficient Text-to-Music Generation with State-Space Modeling
Wei-Jaw Lee, Fang-Chih Hsieh, Xuanjun Chen, Fang-Duo Tsai, Yi-Hsuan Yang
TASLP 2026 (Submitted) bib · arXiv · ISMIR LBD · Project · Code
[2]
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Wenze Ren, Haibin Wu, Yi-Cheng Lin, Xuanjun Chen, ..., Hsin-Min Wang, Yu Tsao
ICASSP 2025 bib · arXiv · IEEE
[1]
Singer Separation for Karaoke Content Generation
Hsuan-Yu Lin, Xuanjun Chen, Jyh-Shing Roger Jang
O-COCOSDA 2024 bib · arXiv · IEEE