Author:
Jia-lin Shen, Ying-chieh Tu, Po-yu Liang, Lin-shan Lee
Abstract:
This paper presents a study of speaker-independent continuous Mandarin syllable recognition in telephone environments. It compares several cepstral bias removal techniques for compensating telephone channel effects, including cepstral mean subtraction (CMS), signal bias removal (SBR) and stochastic matching (SM). Modifications and combinations of these techniques are then investigated to further improve environmental robustness over the telephone. To better model contextual acoustics and co-articulation in spontaneous Mandarin telephone speech, between-syllable context-dependent phone-like units (such as triphones, biphones and demiphones) are used to train the speech models. In addition, the discriminative capabilities of the speech models are further enhanced using the minimum classification error (MCE) algorithm. Experimental results show recognition rates for Mandarin base syllables as high as 59.53%, corresponding to a 27.81% reduction in the error rate.
Keyword:
Telephone speech recognition, Cepstral bias removal, Between-syllable context-dependent phone models
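Of the bias removal techniques named in the abstract above, cepstral mean subtraction (CMS) is the simplest to illustrate. The following is a minimal sketch in its textbook form, assuming an utterance is given as a list of cepstral frames; the function name and the toy data are hypothetical and are not taken from the paper, which may use a modified variant.

```python
from typing import List


def cepstral_mean_subtraction(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean of each cepstral coefficient.

    A fixed channel adds a constant bias to every cepstral frame, so
    removing the utterance mean removes that bias (along with the mean
    of the speech itself).
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


# Toy utterance (hypothetical values): 3 frames, 2 coefficients, zero mean.
clean = [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]]
# Hypothetical constant channel bias added to every frame.
bias = [0.5, 2.0]
observed = [[c + b for c, b in zip(f, bias)] for f in clean]

compensated = cepstral_mean_subtraction(observed)
```

Because the toy clean frames already have zero mean, `compensated` recovers them; with real speech, CMS also removes the mean of the speech itself.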
Author:
Yi-Chung Lin, Keh-Yih Su
Abstract:
In this paper, a level-synchronous parsing mechanism, named Phrase-Level Building (PLB), is proposed to incorporate wide-scope contextual information for parsing ill-formed sentences. This mechanism regards the task of parsing a sentence as the task of building the phrase-levels for the sentence. The wide-scope contextual information in the phrase-levels can therefore be used to narrow down the search space and to select the most likely partial parses. Compared with a system that uses both a stochastic context-free grammar and the heuristic of preferring the longest phrase, the proposed PLB approach improves the precision rate for brackets in the partial parse forests from 69.37% to 79.49%. The recall rate for brackets is also improved from 78.73% to 81.39%. The proposed PLB parsing method can also be used to recover from errors in ill-formed sentences, so that more complete syntactic information can be provided to later stages. Experimental results show that 35% of the ill-formed sentences can be recovered to well-formed parses. The recall rate for brackets is also significantly improved from 68.49% to 76.60%, while the precision rate for brackets is improved slightly from 79.49% to 80.69%.
Keyword:
natural language parsing, ill-formed sentence parsing, partial parsing, robust parsing, error recovery
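The bracket precision and recall rates reported in the abstract above can be illustrated with a small sketch. It assumes that a bracket is a labeled span `(start, end, label)` and that a predicted bracket counts as correct only when it exactly matches a reference bracket; the abstract does not state the matching criterion, so this is an assumption, and the example spans are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation: (start index, end index, phrase label).
Bracket = Tuple[int, int, str]


def bracket_scores(predicted: Set[Bracket], gold: Set[Bracket]) -> Tuple[float, float]:
    """Return (precision, recall) for predicted brackets against gold brackets."""
    matched = len(predicted & gold)
    # Precision: fraction of predicted brackets that are correct.
    precision = matched / len(predicted) if predicted else 0.0
    # Recall: fraction of gold brackets that were found.
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical five-word sentence with four reference brackets.
gold = {(0, 2, "NP"), (2, 5, "VP"), (3, 5, "NP"), (0, 5, "S")}
# A partial parse: two correct brackets and one with a wrong boundary.
predicted = {(0, 2, "NP"), (2, 5, "VP"), (3, 4, "NP")}

precision, recall = bracket_scores(predicted, gold)
```

A partial parse that omits the top-level bracket, as here, loses recall without necessarily losing precision, which is why the two rates are reported separately.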
Author:
Chuan-Jie Lin, Hsin-Hsi Chen
Abstract:
This paper presents the design of a Mandarin to Taiwanese Min Nan (abbreviated as Taiwanese hereafter) machine translation system. It is the first machine translation system that focuses on these two languages. An input Mandarin sentence is segmented, tagged and translated word by word according to the part of speech of each word. Translation candidates come from a Mandarin-Taiwanese dictionary. If more than one candidate exists, an example base is consulted. When a Mandarin word is not found in the Mandarin-Taiwanese dictionary, it is translated according to a Single-Character dictionary. The output can be either speech or text. For speech output, we also deal with the tone sandhi problem, which requires changing the tone of each Taiwanese syllable. Because the mapping between Taiwanese syllables and Chinese characters is still a subject of disagreement, and the phonetic spelling coding systems are not familiar to everybody, speech output is useful but is also a challenge.
Keyword:
Machine Translation, Mandarin, Speech Synthesis, Taiwanese, Min Nan, Tone Sandhi
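The word-by-word lookup described in the abstract above (dictionary first, example base when there are several candidates, Single-Character dictionary as a fallback) can be sketched roughly as follows. All dictionary entries are placeholder strings, and keying the example base on the previous word is an assumption made only for illustration; the abstract does not say how the example base is consulted.

```python
from typing import Dict, List, Tuple

# Hypothetical Mandarin-Taiwanese dictionary: (word, part of speech) -> candidates.
WORD_DICT: Dict[Tuple[str, str], List[str]] = {
    ("m_word_a", "N"): ["t_word_a"],
    ("m_word_b", "V"): ["t_word_b1", "t_word_b2"],
}
# Hypothetical example base: (word, previous word) -> preferred candidate.
EXAMPLE_BASE: Dict[Tuple[str, str], str] = {
    ("m_word_b", "m_word_a"): "t_word_b2",
}
# Hypothetical Single-Character dictionary: character -> translation.
CHAR_DICT: Dict[str, str] = {"x": "tx", "y": "ty"}


def translate(tagged: List[Tuple[str, str]]) -> List[str]:
    """Translate a segmented, POS-tagged sentence word by word."""
    output: List[str] = []
    prev = ""
    for word, pos in tagged:
        candidates = WORD_DICT.get((word, pos))
        if candidates is None:
            # Word not in the dictionary: fall back to character-by-character lookup,
            # leaving unknown characters unchanged.
            output.append("".join(CHAR_DICT.get(ch, ch) for ch in word))
        elif len(candidates) == 1:
            output.append(candidates[0])
        else:
            # Several candidates: consult the example base, else take the first.
            output.append(EXAMPLE_BASE.get((word, prev), candidates[0]))
        prev = word
    return output


result = translate([("m_word_a", "N"), ("m_word_b", "V"), ("xy", "N")])
```

Segmentation, tagging and tone sandhi handling are outside the scope of this sketch; it only shows the order in which the three resources are tried.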
Author:
Wu Yan, James N.H. Liu, Wang Kaizhu
Abstract:
This paper presents a hybrid approach to automatic abstraction of English text. The approach combines statistical analysis with understanding of the text. An abstraction algorithm is introduced and its application is discussed. Experimental results demonstrate that this hybrid approach combines the advantages of both kinds of automatic abstraction and achieves better results.
Keyword:
Automatic abstraction, statistical abstraction, understanding abstraction