International Journal of Computational Linguistics & Chinese Language
Processing
Vol. 26, No. 1, June 2021
Title:
The NTNU Taiwanese ASR System for Formosa Speech Recognition Challenge 2020
Author:
Fu-An Chao, Tien-Hong Lo, Shi-Yan Weng, Shih-Hsuan Chiu,
Yao-Ting Sung, and Berlin Chen
Abstract:
This paper describes the NTNU ASR system participating in the Formosa Speech
Recognition Challenge 2020 (FSR-2020) supported by the Formosa Speech in the
Wild project (FSW). FSR-2020 aims at fostering the development of Taiwanese
speech recognition. Apart from the issues of tonal and dialectal variation in
the Taiwanese language, speech artificially contaminated with different types of
real-world noise also has to be dealt with in the final test stage; all of these
make FSR-2020 much more challenging than before. To work around the
under-resourced issue, the main technical aspects of our ASR system include
various deep learning techniques, such as transfer learning, semi-supervised
learning, front-end speech enhancement and model ensemble, as well as data
cleansing and data augmentation conducted on the training data. With the best
configuration, our system obtains a 13.1% syllable error rate (SER) on the
final-test set, achieving first place among all participating systems on
Track 3.
Keywords:
Formosa Speech Recognition Challenge, Deep Learning, Transfer Learning,
Semi-supervised Training
Title:
NSYSU-MITLab Speech Recognition System for Formosa Speech
Recognition Challenge 2020
Author:
Hung-Pang Lin and Chia-Ping Chen
Abstract:
In this paper, we describe the system that team NSYSU-MITLab implemented for the
Formosa Speech Recognition Challenge 2020.
We use the Transformer architecture composed of Multi-head Attention to
construct an end-to-end speech recognition system and combine it with
Connectionist Temporal Classification (CTC) for end-to-end training and
decoding.
We have also built a deep neural network combined with a hidden Markov model (DNN-HMM).
We use Time-Restricted Self-Attention and Factorized Time Delay Neural Network (TDNN-F)
for the deep neural network in DNN-HMM.
The best performance we have achieved with the proposed methods is a character
error rate of 45.5% on the Taiwan Southern Min Recommended Characters (台文漢字)
task and a syllable error rate of 25.4% on the Taiwan Minnanyu Luomazi Pinyin
(台羅拼音) task.
Keywords:
Automatic Speech Recognition, Transformer, Conformer, Connectionist Temporal
Classification, Acoustic Model
Title:
A Preliminary Study of Formosa Speech Recognition Challenge 2020 - Taiwanese ASR
Author:
Fu-Hao Yu, Ke-Han Lu, Yi-Wei Wang, Wei-Zhe Chang, Wei-Kai Huang, and
Kuan-Yu Chen
Abstract:
In order to study the effectiveness of current deep learning-based speech
recognition models on the recognition tasks for Taiwanese Southern Min
Recommended Characters and Taiwan Minnanyu Luomazi Pinyin, this study uses the
corpora provided by the Formosa Speech Recognition Challenge 2020 (FSR-2020)
to evaluate several neural ASR systems built with the ESPnet and Kaldi
toolkits. In the end, our system achieved a 66.1% error rate on Taiwanese
Southern Min Recommended Characters recognition (Track 2) and a 28.6% error
rate on Taiwan Minnanyu Luomazi Pinyin recognition (Track 3).
Keywords:
Taiwanese Southern Min Recommended Characters, Taiwan Minnanyu Luomazi Pinyin,
Taiwanese ASR
Title:
Textual Relations with Conjunctive Adverbials in English Writing by Chinese
Speakers: A Corpus-based Approach
Author:
Tung-Yu Kao and Li-mei Chen
Abstract:
The study aims to investigate the use of conjunctive adverbials (CA, hereafter)
performing various textual relations in English writing by Chinese speakers
across genres and over time. To begin with, a corpus of one million words was
compiled and the corpus interface was constructed. Later, 45 pieces of writing
by 5 college students during 4 semesters were selected for data annotation and
analysis, with each student contributing 9 pieces for 9 text genres. The results
show that there exists a distribution norm of CA-performed textual relations
based on CA occurrence frequency and that the distribution is independent of
genre and time effects. Compared with the literature, the found distribution is
also considered free from first-language influence. This suggests that the found
distribution is a mental representation of mature human cognition, underlying
English writing on global and coherent levels. Therefore, the found distribution
is of great potential for developing automatic tools of discourse diagnosis.
Keywords:
Conjunctive Adverbial, Textual Relation, Text Genre, English Writing, Corpus
Compilation, Automatic Discourse Diagnosis
Title:
The Analysis and Annotation of Propaganda Techniques in Chinese News Texts
Author:
Meng-Hsien Shih, Ren-feng Duann, and Siaw-Fong Chung
Abstract:
In political news media, propaganda techniques are often employed to express
one's political view, or to influence the audience's stance. Chinese corpora with the
annotation of propaganda techniques are yet to be developed. In this paper, with
an explainable approach, we annotated the use of propaganda techniques in
Chinese political news texts, and enlarged the dataset by bootstrapping using a
small set of manually annotated data. To ensure the validity, we manually
corrected the bootstrapped dataset and ran a pilot machine-learning experiment
using a naïve Bayes classifier trained with the bag-of-words feature. A
precision of 74.26% was reached for the binary classification (with or without
propaganda technique). The manually annotated data with propaganda techniques is
available online for training machine-learning models to predict the stance of
new texts.
Keywords:
Sentiment (Stance) Analysis, Language Resource, Propaganda Techniques, Taiwan
News Media