International Journal of Computational Linguistics & Chinese Language Processing
                                                                                          Vol. 25, No. 2, December 2020



Title:
Analyzing the Morphological Structures in Seediq Words

Author:
Chuan-Jie Lin, Li-May Sung, Jing-Sheng You, Wei Wang, Cheng-Hsun Lee, and Zih-Cyuan Liao

Abstract:
NLP techniques make it possible to build large datasets for low-resource languages efficiently, which helps preserve and revitalize indigenous languages. This paper proposes approaches to automatically analyze the morphological structures of Seediq words as a first step toward developing NLP applications such as machine translation. Word inflection in Seediq is plentiful. Sets of morphological rules have been created according to the linguistic features described in the Seediq syntax book (Sung, 2018), and based on regular morpho-phonological processes in Seediq, a new notion of "deep root" is also suggested. The rule-based system proposed in this paper can successfully detect the existence of infixes and suffixes in Seediq with a precision of 98.88% and a recall of 89.59%. The structure of a prefix string is predicted by probabilistic models. We conclude that the best system is a bigram model with a back-off approach and Lidstone smoothing, achieving an accuracy of 82.86%.

Keywords: Seediq, Automatic Analysis of Morphological Structures, Deep Root, Natural Language Processing for Indigenous Languages in Taiwan, Formosan Languages
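For readers unfamiliar with the probabilistic model named in the abstract above, a bigram model with back-off and Lidstone smoothing can be sketched as follows. This is a minimal illustration using invented toy prefix tags, not the authors' actual implementation or data; the back-off weight and Lidstone parameter are arbitrary choices for the example.

```python
from collections import Counter

def lidstone_bigram_prob(bigram_counts, unigram_counts, vocab_size,
                         prev, cur, lam=0.5, backoff_weight=0.4):
    """Lidstone-smoothed bigram probability with back-off.

    If the bigram (prev, cur) was seen, return its smoothed conditional
    probability; otherwise back off to the smoothed unigram probability
    of `cur`, scaled by a fixed back-off weight.
    """
    if bigram_counts[(prev, cur)] > 0:
        return ((bigram_counts[(prev, cur)] + lam) /
                (unigram_counts[prev] + lam * vocab_size))
    total = sum(unigram_counts.values())
    return backoff_weight * (unigram_counts[cur] + lam) / (total + lam * vocab_size)

# Toy prefix-tag sequences (hypothetical, not real Seediq data)
sequences = [["me", "ke"], ["me", "se"], ["pe", "ke"]]
unigrams, bigrams = Counter(), Counter()
for seq in sequences:
    unigrams.update(seq)
    bigrams.update(zip(seq, seq[1:]))
V = len(unigrams)

p_seen = lidstone_bigram_prob(bigrams, unigrams, V, "me", "ke")    # observed bigram
p_unseen = lidstone_bigram_prob(bigrams, unigrams, V, "pe", "se")  # backs off
```

Because of smoothing, even the unseen bigram receives non-zero probability, which is what lets the model score any candidate prefix-string structure.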


Title:
Chinese Healthcare Named Entity Recognition Based on Graph Neural Networks

Author:
Yi Lu and Lung-Hao Lee

Abstract:
Named Entity Recognition (NER) focuses on locating mentions of named entities and classifying their types, usually proper nouns such as persons, places, organizations, dates, and times. NER results can serve as the basis for relation extraction, event detection and tracking, knowledge graph building, and question answering systems. NER studies usually regard this task as a sequence labeling problem and learn the labeling model from a large-scale corpus. We propose a GGSNN (Gated Graph Sequence Neural Networks) model for Chinese healthcare NER. We derive a character representation based on multiple embeddings at different granularities, from the radical and character levels to the word level. An adapted gated graph sequence neural network is used to incorporate named entity information from dictionaries. A standard BiLSTM-CRF then identifies named entities and classifies their types in the healthcare domain. We first crawled articles from websites that provide healthcare information, online health-related news, and medical question/answer forums. We then randomly selected a subset of sentences to retain content diversity. The resulting corpus includes 30,692 sentences with a total of around 1.5 million characters, or 91.7 thousand words. After manual annotation, we have 68,460 named entities across 10 entity types: body, symptom, instrument, examination, chemical, disease, drug, supplement, treatment, and time. Based on further experiments and error analysis, our proposed method achieved the best F1-score of 75.69%, outperforming previous models including BiLSTM-CRF, Lattice, Gazetteers, and ME-CNER. In summary, our GGSNN model is an effective and efficient solution for the Chinese healthcare NER task.

Keywords:
Named Entity Recognition, Graph Neural Networks, Information Extraction, Health Informatics
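The multi-granularity character representation described in the abstract above can be sketched as a simple concatenation of radical-, character-, and word-level embeddings. The embedding tables, dimensions, and lookup keys below are hypothetical, chosen only to illustrate the idea, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding tables; dimensions are illustrative only.
radical_emb = {"疒": rng.normal(size=8)}     # radical of 病 (the "sickness" radical)
char_emb    = {"病": rng.normal(size=16)}    # character-level embedding
word_emb    = {"病人": rng.normal(size=32)}  # embedding of a word containing the character

def char_representation(char, radical, word):
    """Concatenate radical-, character-, and word-level embeddings
    into one multi-granularity representation for a character."""
    return np.concatenate([radical_emb[radical], char_emb[char], word_emb[word]])

vec = char_representation("病", "疒", "病人")  # 8 + 16 + 32 = 56 dimensions
```

In a full model, such vectors would feed the graph network and the BiLSTM-CRF sequence labeler; here only the concatenation step is shown.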


Title:
Improving Word Alignment for Extracting Phrasal Translations

Author:
Yi-Jyun Chen, Ching-Yu Helen Yang and Jason S. Chang

Abstract:
This paper presents a method for extracting translations of noun-preposition collocations from bilingual parallel corpora. The results provide researchers with a reference tool for generating grammar rules. We use statistical methods to extract translations of nouns and prepositions from sentence-aligned bilingual parallel corpora, and then adjust the translations according to Chinese collocations extracted from a Chinese corpus. Finally, we generate example sentences for the translations. The evaluation was done on 30 randomly selected phrases, with human judges assessing the translations.

Keywords:
Word Alignment, Grammar Patterns, Collocations, Phrase Translation
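The abstract above does not specify which statistical association measure is used to pair source and target phrases; one common choice in translation extraction from sentence-aligned corpora is the Dice coefficient, sketched below with invented toy counts.

```python
def dice(cooccur, count_src, count_tgt):
    """Dice coefficient between a source phrase and a target phrase,
    computed from counts over aligned sentence pairs: how often the two
    co-occur, relative to how often each occurs on its own side."""
    return 2 * cooccur / (count_src + count_tgt)

# Toy counts (hypothetical): a likely translation pair co-occurs in most
# of the sentence pairs where either phrase appears; a spurious pair does not.
score_good = dice(cooccur=40, count_src=50, count_tgt=60)
score_bad  = dice(cooccur=5,  count_src=50, count_tgt=200)
```

Ranking candidate target phrases by such a score, then filtering with monolingual collocation evidence, is the general shape of the pipeline the abstract describes.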


Title:
NSYSU+CHT Speaker Verification System for Far-Field Speaker Verification Challenge 2020

Author:
Yu-Jia Zhang, Chia-Ping Chen, Shan-Wen Hsiao, Bo-Cheng Chan, and Chung-li Lu

Abstract:
In this paper, we describe the system Team NSYSU+CHT implemented for the 2020 Far-Field Speaker Verification Challenge (FFSVC 2020). The single systems are embedding-based neural speaker recognition systems. The front-end feature extractor is a neural network architecture based on TDNN and CNN modules, called TDResNet, which combines the advantages of both. In the pooling layer, we experimented with different methods such as statistics pooling and GhostVLAD. The back-end is a PLDA scorer; we evaluate PLDA training/adaptation and use it for system fusion. We participated in the text-dependent (Task 1) and text-independent (Task 2) speaker verification tasks on the single-microphone-array data of FFSVC 2020. The best performance achieved with the proposed methods is a minDCF of 0.7703 and an EER of 9.94% on Task 1, and a minDCF of 0.8762 and an EER of 10.31% on Task 2.

Keywords:
Speaker Verification, TDNN, CNN, TDResNet, GhostVLAD
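Of the pooling methods named in the abstract above, statistics pooling is the simpler one: it collapses a variable-length sequence of frame-level embeddings into a fixed-length utterance vector by concatenating the per-dimension mean and standard deviation. The sketch below uses random data and arbitrary dimensions for illustration only.

```python
import numpy as np

def statistics_pooling(frames):
    """Collapse a (time, feature) matrix of frame-level embeddings into a
    single utterance-level vector: per-dimension mean concatenated with
    per-dimension standard deviation."""
    mean = frames.mean(axis=0)
    std = frames.std(axis=0)
    return np.concatenate([mean, std])

rng = np.random.default_rng(1)
frames = rng.normal(size=(200, 64))   # e.g. 200 frames of 64-dim embeddings
utt_vec = statistics_pooling(frames)  # fixed 128-dim utterance embedding
```

The resulting fixed-size vector is what a back-end scorer such as PLDA operates on, regardless of the utterance's original duration.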


Title:
A Preliminary Study on Deep Learning-based Chinese Text to Taiwanese Speech Synthesis System

Author:
Wen-Han Hsu, Cheng-Jung Tseng, Yuan-Fu Liao, Wern-Jun Wang and Chen-Ming Pan

Abstract:
This paper focuses on the development and implementation of a Chinese text-to-Taiwanese speech synthesis system. The proposed system combines three deep neural network-based modules: (1) a sequence-to-sequence-based machine translation module from Chinese characters to Taiwan Minnanyu Luomazi Pinyin (abbreviated as Tâi-lô), henceforth called C2T; (2) a Tacotron2-based Tâi-lô pinyin-to-spectrogram module; and (3) a WaveGlow-based spectrogram-to-waveform synthesis module.

Among them, the C2T module was trained on a Chinese-Taiwanese parallel corpus (iCorpus) released by Academia Sinica and nine dictionaries collected from the Internet. Tacotron2 and WaveGlow were tuned on a Taiwanese speech synthesis corpus (a female speaker, about 10 hours of speech) recorded by Chunghwa Telecom Laboratories. In addition, a demonstration web page for Chinese text-to-Taiwanese speech synthesis has been implemented.

From the experimental results, it was found that (1) the C2T module achieved a best syllable error rate (SER) of 6.53%, and (2) the whole speech synthesis system obtained an average MOS score of 4.30 as evaluated by 20 listeners. These results confirm the effectiveness of integrating the C2T, Tacotron2, and WaveGlow models. In addition, the whole system runs at a real-time factor of about 1/3.5.

Keywords:
Machine Translation, Taiwanese Speech Synthesis, Tacotron2, WaveGlow


Title:
The Preliminary Study of Robust Speech Feature Extraction based on Maximizing the Accuracy of States in Deep Acoustic Models

Author:
Li-Chia Chang and Jeih-weih Hung

Abstract:
In this study, we focus on developing a novel speech feature extraction technique for noise-robust speech recognition that employs information from the back-end acoustic models. Without retraining or adapting the back-end acoustic models, we use deep neural networks to learn a front-end acoustic feature representation that maximizes the state accuracy obtained from the original acoustic models. Compared with robustness methods that retrain or adapt the acoustic models, the presented method has lower computational complexity and faster training.

In preliminary evaluation experiments conducted on the medium-vocabulary TIMIT database and task, we show that the newly presented method achieves lower word error rates under various noise types and levels compared with the baseline. Therefore, this method is quite promising and worth developing further.

Keywords:
Noise-robust Speech Feature, Speech Recognition, Deep Learning