{"slug": "low-resource-language-discrimination-towards-chinese-dialects-with-transfer-and", "title": "Low-resource Language Discrimination Towards Chinese Dialects with Transfer learning and Data Augmentation", "summary": "Researchers developed a Chinese dialects discrimination framework using transfer learning and data augmentation to address low-resource challenges. The model outperformed state-of-the-art methods on two benchmark corpora by training a source ASR model and fine-tuning a target model with augmented data.", "body_md": "arXiv:2606.18597v1 Announce Type: new\nAbstract: Chinese dialects discrimination is a challenging natural language processing task due to scarce annotated resources. In this article, we develop a novel Chinese dialects discrimination framework with transfer learning and data augmentation (CDDTLDA) to overcome the shortage of resources. Specifically, we first use a relatively large Chinese dialects corpus to train a source-side automatic speech recognition (ASR) model. Then, we adopt a simple but effective data augmentation method (i.e., speed, pitch, and noise disturbance) to augment the target-side low-resource Chinese dialects, and fine-tune another target ASR model based on the previous source-side ASR model. Meanwhile, the potential common semantic features between source-side and target-side ASR models can be captured by using a self-attention mechanism. Finally, we extract the hidden semantic representation from the target ASR model to conduct Chinese dialects discrimination. 
Our extensive experimental results demonstrate that our model significantly outperforms state-of-the-art methods on two benchmark Chinese dialects corpora.", "url": "https://wpnews.pro/news/low-resource-language-discrimination-towards-chinese-dialects-with-transfer-and", "canonical_source": "https://arxiv.org/abs/2606.18597", "published_at": "2026-06-18 04:00:00+00:00", "updated_at": "2026-06-18 04:25:29.429304+00:00", "lang": "en", "topics": ["natural-language-processing", "machine-learning", "artificial-intelligence"], "entities": ["CDDTLDA"], "alternates": {"html": "https://wpnews.pro/news/low-resource-language-discrimination-towards-chinese-dialects-with-transfer-and", "markdown": "https://wpnews.pro/news/low-resource-language-discrimination-towards-chinese-dialects-with-transfer-and.md", "text": "https://wpnews.pro/news/low-resource-language-discrimination-towards-chinese-dialects-with-transfer-and.txt", "jsonld": "https://wpnews.pro/news/low-resource-language-discrimination-towards-chinese-dialects-with-transfer-and.jsonld"}}