My research interests mainly revolve around RLHF, RLVR, AIGC, and 3D Human Modeling. I am actively seeking (1) Ph.D. positions for Fall 2027 and (2) other research collaborations related to RL or AIGC. Feel free to reach out via 📧 xiaofengtan@seu.edu.cn or 💬 WeChat: txf_06_20. I'd be happy to connect 😊. And if you'd like to talk about Doraemon 😺, that works too. For more details, please check my CV / 中文简历.
👀 News
Mar 31, 2026
🎉 I have joined Tencent Hunyuan (混元) as a Research Intern! Looking forward to exploring RLHF and AIGC.
Text-to-motion generation is essential for advancing the creative industry, yet it often struggles to produce consistent, realistic motions. To address this, we focus on fine-tuning text-to-motion models to consistently favor high-quality, human-preferred motions, a critical yet largely unexplored problem. In this work, we theoretically investigate DPO under both online and offline settings and reveal their respective limitations: overfitting in offline DPO and biased sampling in online DPO. Building on these theoretical insights, we introduce Semi-online Preference Optimization (SoPo), a DPO-based method for training text-to-motion models on "semi-online" data pairs, each consisting of an unpreferred motion sampled from the online distribution and a preferred motion drawn from offline datasets. This method leverages both online and offline DPO, allowing each to compensate for the other's limitations. Extensive experiments demonstrate that SoPo outperforms other preference alignment methods, achieving MM-Dist improvements of 3.25% (vs. 0.76% for MoDiPO) on the MLD model and 2.91% (vs. 0.66% for MoDiPO) on the MDM model. Moreover, the MLD model fine-tuned with our SoPo surpasses the SoTA model in terms of R-precision and MM-Dist. Visualization results further confirm the efficacy of SoPo in preference alignment.
EasyTune: Efficient Step-Aware Fine-Tuning for Diffusion-Based Motion Generation
Xiaofeng Tan*, Wanjiang Weng*, Haodong Lei, and Hongsong Wang
🏆 ICLR 2026 | International Conference on Learning Representations (CORE-A*)
In recent years, motion generative models have advanced significantly, yet they still struggle to align with downstream objectives. Recent studies have shown that using differentiable rewards to directly align the preferences of diffusion models yields promising results. However, these methods suffer from inefficient, coarse-grained optimization with high memory consumption. In this work, we first theoretically identify the fundamental cause of these limitations: the recursive dependence between different steps along the denoising trajectory. Inspired by this insight, we propose EasyTune, which fine-tunes the diffusion model at each denoising step rather than over the entire trajectory. This decouples the recursive dependence, allowing us to perform (1) dense and effective, (2) memory-efficient, and (3) fine-grained optimization. Furthermore, the scarcity of preference motion pairs limits the training of motion reward models. To this end, we further introduce a Self-refinement Preference Learning (SPL) mechanism that dynamically identifies preference pairs and conducts preference learning on them. Extensive experiments demonstrate that EasyTune outperforms ReFL by 62.1% in MM-Dist improvement while requiring only 34.5% of its additional memory overhead.
ReAlign: Text-to-Motion Generation via Step-Aware Reward-Guided Alignment
Wanjiang Weng*, Xiaofeng Tan*, Junbo Wang, Guo-Sen Xie, Pan Zhou, and Hongsong Wang
🏆 AAAI 2026 | AAAI Conference on Artificial Intelligence (CCF-A, CORE-A*)
Text-to-motion generation, which synthesizes 3D human motions from text inputs, holds immense potential for applications in gaming, film, and robotics. Recently, diffusion-based methods have been shown to generate more diverse and realistic motions. However, a misalignment exists between the text and motion distributions in diffusion models, which leads to semantically inconsistent or low-quality motions. To address this limitation, we propose Reward-guided sampling Alignment (ReAlign), comprising a step-aware reward model that assesses alignment quality during denoising sampling and a reward-guided strategy that directs the diffusion process toward an optimally aligned distribution. The reward model integrates step-aware tokens and combines a text-aligned module for semantic consistency with a motion-aligned module for realism, refining noisy motions at each timestep to balance probability density and alignment. Extensive experiments on both motion generation and retrieval tasks demonstrate that our approach significantly improves text-motion alignment and motion quality compared with existing state-of-the-art methods.
Fuzzy Granule Density-Based Outlier Detection with Multi-Scale Granular Balls
Can Gao (Supervisor), Xiaofeng Tan*, Jie Zhou, Weiping Ding, and Witold Pedrycz
🏆 TKDE 2025 | IEEE Transactions on Knowledge and Data Engineering (CCF-A, SCI-Q1)
Outlier detection refers to the identification of anomalous samples that deviate significantly from the distribution of normal data and has been extensively studied and applied in a variety of practical tasks. However, most unsupervised outlier detection methods are carefully designed to detect specific types of outliers, while real-world data may be entangled with outliers of different types. In this study, we propose a fuzzy rough sets-based multi-scale outlier detection method to identify various types of outliers. Specifically, a novel fuzzy rough sets-based method that integrates relative fuzzy granule density is first introduced to improve the capability of detecting local outliers. Then, a multi-scale view generation method based on granular-ball computing is proposed to collaboratively identify group outliers at different levels of granularity. Moreover, reliable outliers and inliers determined by the three-way decision are used to train a weighted support vector machine to further improve detection performance. The proposed method innovatively transforms unsupervised outlier detection into a semi-supervised classification problem and, for the first time, explores fuzzy rough sets-based outlier detection from the perspective of multi-scale granular balls, allowing for high adaptability to different types of outliers. Extensive experiments on both artificial and UCI datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods, improving results by at least 8.48% in terms of the Area Under the ROC Curve (AUROC).
Preprints
MotionRFT: Unified Reinforcement Fine-Tuning for Text-to-Motion Generation
Xiaofeng Tan, Wanjiang Weng, Hongsong Wang, Fang Zhao, Xin Geng, and Liang Wang
Text-to-motion generation has advanced rapidly with diffusion- and flow-based generative models, yet supervised pre-training remains insufficient to align models with high-level objectives such as semantic consistency, realism, and human preference. We present a reinforcement fine-tuning framework comprising a heterogeneous-representation, multi-dimensional reward model, MotionReward, and an efficient, fine-grained fine-tuning strategy, EasyTune. Extensive experiments demonstrate strong cross-model and cross-representation generalization, achieving an FID of 0.132 with 22.10 GB peak memory and saving up to 15.22 GB compared with DRaFT.
Diversity Collapse despite Constant Entropy: Understanding Entropy in Flow-based RL
Xiaofeng Tan, Jun Liu, Bin-Bin Gao, Yuanting Fan, Xi Jiang, Chengjie Wang, Hongsong Wang, and Feng Zheng
Video anomaly detection is an essential yet challenging open-set task in computer vision, often addressed by leveraging reconstruction as a proxy task. However, existing reconstruction-based methods face challenges in two main aspects: (1) limited model robustness in open-set scenarios, and (2) an overemphasis on, but restricted capacity for, detailed motion reconstruction. To this end, we propose a novel frequency-guided diffusion model with perturbation training, which enhances model robustness through perturbation training and emphasizes the principal motion components guided by motion frequencies. Specifically, we first use a trainable generator to produce perturbative samples for perturbation training of the diffusion model. During the perturbation training phase, model robustness is enhanced and the domain of the reconstruction model is broadened by training against this generator. Subsequently, perturbative samples are introduced at inference, affecting the reconstruction of normal and abnormal motions differently and thereby enhancing their separability. Considering that motion details originate from high-frequency information, we propose a masking method based on the 2D discrete cosine transform to separate high-frequency from low-frequency information. Guided by the high-frequency information of the observed motion, the diffusion model can focus on generating low-frequency information and thus reconstruct the motion accurately. Experimental results on five video anomaly detection datasets, including human-related and open-set benchmarks, demonstrate the effectiveness of the proposed method. The code will be released to the public.
🌟 Selected Honors
Tencent Scholarship (2025)
Top 2.5% at Southeast University (¥9,000)
First-Class Academic Scholarship (2025)
Top 10% at Southeast University (¥12,000)
Outstanding Graduate Representative (2024)
Speaker at the graduation ceremony
Honors Bachelor's Degree (2024)
Top 3% at Shenzhen University
Outstanding Graduate (2024)
Top 5% at Shenzhen University
Star of Liyuan (2022, 2023)
Top 0.73% at Shenzhen University ("Liyuan" refers to Shenzhen University, ¥20,000)
Scholarship for Outstanding Innovative Talent (2020)
Top 2% in the Guangdong Province science-track college entrance exam (¥20,000)
💡 Teaching Assistant
Fundamentals of Programming (2022 Fall)
2022 Mathematics and Computer Science Special Class, WeBank Fintech Class
Data Structures (2022 Fall, 2023 Fall)
2021 and 2022 Computer Science Classes
Object-Oriented Programming (2023 Spring, 2024 Spring)
2022 Mathematics and Computer Science Special Class, WeBank Fintech Class, 2023 Computer Science Class
Computer Systems (2023 Spring)
2022 Mathematics and Computer Science Special Class
Introduction to Computer Science (2023 Fall)
General elective course
Operating Systems (2024 Spring)
2021 Computer Science Class
📋 Service
Reviewer: ICML 2026, NeurIPS 2026
🏠 Life
I have a passion for exploring the underlying principles of the world, not only in my research but also across natural sciences, engineering, and social humanities. Beyond academics, I find quiet joy and inspiration in the arts. I practice hard-pen calligraphy (especially admiring the works of Zhong'an Gu), visit exhibitions, and listen to Cantonese pop (e.g., Peach Blossom Island and The One For U by MC) as well as Chinese pop music (e.g., Cang Jie and Step by Step by Mayday). I also stay active with table tennis (big fan of Zhendong Fan) and badminton. Oh, and one more thing: I have been a huge fan of Doraemon since childhood!
📝 Blog (coming soon)
Share thoughts on research, technology, and life insights.
If you have questions, feel free to reach out through email (xiaofengtan@seu.edu.cn)!