Publications
publications by categories in reversed chronological order. generated by jekyll-scholar.
2025
- Bilingual Text-to-Motion Generation via Step-Aware Reward-Guided Alignment. Wanjiang Weng#, Xiaofeng Tan#, Hongsong Wang, and 1 more author. Under Review, 2025
Bilingual text-to-motion generation, which synthesizes 3D human motions from bilingual text inputs, holds immense potential for cross-linguistic applications in gaming, film, and robotics. However, this task faces critical challenges: the absence of bilingual motion-language datasets and the misalignment between text and motion distributions in diffusion models, leading to semantically inconsistent or low-quality motions. To address these challenges, we propose BiHumanML3D, a novel bilingual human motion dataset, which establishes a crucial benchmark for bilingual text-to-motion generation models. Furthermore, we propose a Bilingual Motion Diffusion model (BiMD), which leverages cross-lingual aligned representations to capture semantics, thereby achieving a unified bilingual model. Building upon this, we propose a Reward-guided sampling Alignment (ReAlign) method, comprising a step-aware reward model to assess alignment quality during sampling and a reward-guided strategy that directs the diffusion process toward an optimally aligned distribution. This reward model integrates step-aware tokens and combines a text-aligned module for semantic consistency and a motion-aligned module for realism, refining noisy motions at each timestep to balance probability density and alignment. Experiments demonstrate that our approach significantly improves text-motion alignment and motion quality compared to existing state-of-the-art methods.
- Fuzzy Granule Density-Based Outlier Detection with Multi-Scale Granular Balls. Can Gao (Supervisor), Xiaofeng Tan*, Jie Zhou, and 2 more authors. IEEE Transactions on Knowledge and Data Engineering, 2025
Outlier detection refers to the identification of anomalous samples that deviate significantly from the distribution of normal data and has been extensively studied and used in a variety of practical tasks. However, most unsupervised outlier detection methods are carefully designed to detect specified outliers, while real-world data may be entangled with different types of outliers. In this study, we propose a fuzzy rough sets-based multi-scale outlier detection method to identify various types of outliers. Specifically, a novel fuzzy rough sets-based method that integrates relative fuzzy granule density is first introduced to improve the capability of detecting local outliers. Then, a multi-scale view generation method based on granular-ball computing is proposed to collaboratively identify group outliers at different levels of granularity. Moreover, reliable outliers and inliers determined by the three-way decision are used to train a weighted support vector machine to further improve the performance of outlier detection. The proposed method innovatively transforms unsupervised outlier detection into a semi-supervised classification problem and for the first time explores the fuzzy rough sets-based outlier detection from the perspective of multi-scale granular balls, allowing for high adaptability to different types of outliers. Extensive experiments carried out on both artificial and UCI datasets demonstrate that the proposed outlier detection method significantly outperforms the state-of-the-art methods, improving the results by at least 8.48% in terms of the Area Under the ROC Curve (AUROC) index.
2024
- SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization. Xiaofeng Tan, Hongsong Wang, Xin Geng, and 1 more author. arXiv preprint arXiv:2412.05095 (Under Review), 2024
Text-to-motion generation is essential for advancing the creative industry but often presents challenges in producing consistent, realistic motions. To address this, we focus on fine-tuning text-to-motion models to consistently favor high-quality, human-preferred motions, a critical yet largely unexplored problem. In this work, we theoretically investigate Direct Preference Optimization (DPO) under both online and offline settings and reveal their respective limitations: overfitting in offline DPO and biased sampling in online DPO. Building on these theoretical insights, we introduce Semi-online Preference Optimization (SoPo), a DPO-based method for training text-to-motion models using “semi-online” data pairs, each consisting of an unpreferred motion from the online distribution and a preferred motion from offline datasets. This method leverages both online and offline DPO, allowing each to compensate for the other’s limitations. Extensive experiments demonstrate that SoPo outperforms other preference alignment methods, with an MM-Dist of 3.25% (vs e.g. 0.76% of MoDiPO) on the MLD model and 2.91% (vs e.g. 0.66% of MoDiPO) on the MDM model. Additionally, the MLD model fine-tuned by SoPo surpasses the SoTA model in terms of R-precision and MM-Dist. Visualization results also show the efficacy of SoPo in preference alignment.
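
The pairing idea in the SoPo abstract can be illustrated with the generic DPO objective. This is a minimal sketch, not the paper's implementation: it uses the standard log-likelihood form of the DPO loss rather than any diffusion-specific variant, and the helper `semi_online_pair`, the `reward` callable, and the rule of taking the lowest-reward online sample as the unpreferred motion are all assumptions made for illustration.

```python
import math


def dpo_loss(logp_pref, logp_unpref, ref_logp_pref, ref_logp_unpref, beta=0.1):
    """Generic DPO loss for one preference pair.

    Computes -log(sigmoid(beta * margin)), where the margin is how much more
    the policy favors the preferred sample than the reference model does.
    """
    margin = beta * ((logp_pref - ref_logp_pref) - (logp_unpref - ref_logp_unpref))
    # Numerically stable softplus(-margin) == -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def semi_online_pair(offline_preferred, online_samples, reward):
    """Hypothetical pairing rule: keep the offline preferred motion and pick
    the lowest-reward online sample as the unpreferred motion."""
    unpreferred = min(online_samples, key=reward)
    return offline_preferred, unpreferred


# Toy usage: motions are stand-in strings, rewards come from a lookup table.
toy_rewards = {"gen_a": 0.8, "gen_b": 0.2, "gen_c": 0.5}
pair = semi_online_pair("dataset_motion", list(toy_rewards), toy_rewards.get)
loss_at_init = dpo_loss(-1.0, -2.0, -1.0, -2.0)  # policy equals reference
```

When the policy matches the reference model the margin is zero, so the loss starts at log 2 and decreases as the policy shifts probability toward the preferred motion.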
- Frequency-Guided Diffusion Model with Perturbation Training for Skeleton-Based Video Anomaly Detection. Xiaofeng Tan, Hongsong Wang, and Xin Geng. arXiv preprint arXiv:2412.03044 (Under Review), 2024
Video anomaly detection is an essential yet challenging open-set task in computer vision, often addressed by leveraging reconstruction as a proxy task. However, existing reconstruction-based methods encounter challenges in two main aspects: (1) limited model robustness for open-set scenarios, and (2) an overemphasis on, but restricted capacity for, detailed motion reconstruction. To this end, we propose a novel frequency-guided diffusion model with perturbation training, which enhances model robustness through perturbation training and emphasizes the principal motion components guided by motion frequencies. Specifically, we first use a trainable generator to produce perturbative samples for perturbation training of the diffusion model. During the perturbation training phase, the model robustness is enhanced and the domain of the reconstructed model is broadened by training against this generator. Subsequently, perturbative samples are introduced for inference, which impacts the reconstruction of normal and abnormal motions differentially, thereby enhancing their separability. Considering that motion details originate from high-frequency information, we propose a masking method based on the 2D discrete cosine transform to separate high-frequency and low-frequency information. Guided by the high-frequency information from the observed motion, the diffusion model can focus on generating low-frequency information and thus reconstruct the motion accurately. Experimental results on five video anomaly detection datasets, including human-related and open-set benchmarks, demonstrate the effectiveness of the proposed method. The code will be released to the public.
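
The frequency separation mentioned in the abstract above can be sketched with a plain 2D DCT mask. This is an illustrative sketch only: the abstract does not specify the mask shape, so the rule used here (coefficients whose index sum is below a hypothetical `keep` threshold count as low frequency) and the function name `split_frequencies` are assumptions, and the motion is treated as a simple frames-by-coordinates grid.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(x):
    """2D DCT: C_rows @ X @ C_cols^T."""
    return matmul(matmul(dct_matrix(len(x)), x), transpose(dct_matrix(len(x[0]))))


def idct2(coeffs):
    """Inverse 2D DCT; the basis is orthonormal, so the inverse is the transpose."""
    cr, cc = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(cr), coeffs), cc)


def split_frequencies(x, keep):
    """Split x into low- and high-frequency parts with a DCT-domain mask.

    Assumed mask: coefficient (u, v) is low frequency when u + v < keep.
    """
    coeffs = dct2(x)
    low = [[c if u + v < keep else 0.0 for v, c in enumerate(row)]
           for u, row in enumerate(coeffs)]
    high = [[c - l for c, l in zip(rc, rl)] for rc, rl in zip(coeffs, low)]
    return idct2(low), idct2(high)


# Toy motion: 4 frames x 3 coordinates.
motion = [[0.0, 1.0, 2.0], [1.0, 3.0, 2.0], [4.0, 0.0, 1.0], [2.0, 2.0, 5.0]]
low_part, high_part = split_frequencies(motion, keep=2)
```

Because the transform is linear, the two parts always sum back to the original motion, so a model could be conditioned on one part while generating the other.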
- Multi-Scale Fuzzy Rough Sets based Anomaly Detection with Multiple Autoencoders. Xiaofeng Tan, Can Gao, Jie Zhou, and 1 more author. Under Review, 2024
Anomaly detection is a practical and essential research topic with a wide range of applications. However, existing anomaly detection methods may face challenges when handling high-dimensional data with complex distributions. In this study, we propose a multiple autoencoder-based anomaly detection method with the aid of fuzzy rough sets. Specifically, the autoencoder is first improved by introducing the kernel fuzzy relation to enhance its representation capability in low-dimensional space. Then, the theory of fuzzy rough sets is employed to perform anomaly detection in the learned low-dimensional representation by fusing multi-view proximity-based information. Finally, to handle complex data, multiple autoencoders are utilized to collaboratively detect anomalies by integrating local anomaly information from different perspectives. Comparative experiments conducted on the selected datasets reveal that the proposed method is superior to state-of-the-art methods, improving over the classical autoencoder by 5.58% in terms of the AUC-ROC index.
2023
- Three-way decision-based co-detection for outliers. Xiaofeng Tan, Can Gao, Jie Zhou, and 1 more author. International Journal of Approximate Reasoning, 2023
Outlier detection is an important research topic in data mining and machine learning. However, existing unsupervised outlier detection methods suffer from irrelevant and redundant attributes in high-dimensional data, and their performance is also limited by outlier detection models that rely on only one view. In this study, we propose a three-way decision-based co-detection model for unsupervised outlier detection. Specifically, we first improve the local outlier factor (LOF) method by introducing the Gaussian kernel function to make the measure of local reachability density more accurate. Then, we introduce fuzzy rough sets to perform attribute reduction, which further reduces the negative effect of irrelevant and redundant attributes on the measure of sample similarity. Finally, we develop a co-detection model that is trained on the original view and the transformed view generated by principal component analysis and uses the strategy of the three-way decision to collaboratively detect outliers. The results of comparative experiments on the selected UCI datasets show that the proposed model outperforms state-of-the-art methods in terms of the AUC-ROC index.
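
The three-way decision step in the abstract above can be pictured as a partition of outlier scores into three regions. The following is a minimal sketch under stated assumptions, not the paper's model: the thresholds `alpha` and `beta`, the function name `three_way_split`, and the choice to average the scores of the two views are all hypothetical.

```python
def three_way_split(scores, alpha, beta):
    """Partition sample indices by outlier score.

    score >= alpha        -> accepted as outlier
    score <= beta         -> accepted as inlier
    beta < score < alpha  -> deferred (boundary region, decided later)
    """
    if not beta < alpha:
        raise ValueError("beta must be strictly smaller than alpha")
    outliers = [i for i, s in enumerate(scores) if s >= alpha]
    inliers = [i for i, s in enumerate(scores) if s <= beta]
    deferred = [i for i, s in enumerate(scores) if beta < s < alpha]
    return outliers, inliers, deferred


# Toy scores in [0, 1] from two views (e.g. original and PCA-transformed);
# averaging them is an assumed way to combine the views.
original_view = [0.95, 0.10, 0.55, 0.20]
transformed_view = [0.85, 0.20, 0.45, 0.90]
combined = [(a + b) / 2 for a, b in zip(original_view, transformed_view)]
outliers, inliers, deferred = three_way_split(combined, alpha=0.7, beta=0.3)
```

Samples in the deferred region are the ones where the views disagree or the evidence is weak, which is where a second detection round or a collaborating view would be consulted.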