Author:
Jui-Feng Peng, Li-mei Chen, and Chia-Cheng Lee
Abstract:
This study examines the influence of lexical tone on voice onset time (VOT) in Mandarin and Hakka spoken in Taiwan. The examination of VOT values for Mandarin and Hakka word-initial stops /p, t, k, ph, th, kh/ followed by three vowels /i, u, a/ in different lexical tones revealed that lexical tone has a significant influence on the VOT values of stops. The results are important because they suggest that future studies should take the influence of lexical tone into account when studying VOT values and when designing word lists for stops in tonal languages. In Mandarin, the VOT values of stops, from longest to shortest, occur in the MR, FR, HL, and HF tones. This sequence is the same as that reported in Liu, Ng, Wan, Wang, and Zhang (2008). Later analysis, however, suggested that the sequence very likely results from the presence of non-words. In order to produce non-words correctly, participants tended to pronounce them more slowly, especially those in the MR tone. Therefore, we further examined the data without non-words, in which no clear sequence was found. For Hakka, post-hoc tests (Scheffé) show that aspirated stops in entering tones, which are syllables ending with a stop, have significantly shorter VOT values than they have in other tones. Although tonal effects on VOT values were not found consistently across the different sets of data, probably due to a methodological problem, the possibility of a tonal effect on VOT values cannot be excluded. Tonal effects, therefore, should be taken into consideration when designing word lists for VOT studies. Moreover, further studies should include both real words and non-words in separate sets of word lists to verify the results of the current study.
Keywords:
Voice Onset Time, Hakka Stops, Mandarin Stops, Tonal Effect
Author:
Hung-Yan Gu and Sung-Feng Tsai
Abstract:
Approximating a spectral envelope via regularized discrete cepstrum coefficients has been proposed by previous researchers. In this paper, we study two problems encountered in practice when adopting this approach to estimate the spectral envelope. The first is which spectral peaks should be selected, and the second is which frequency-axis scaling function should be adopted. After a series of trials and experiments, we propose a feasible solution method for each of these two problems. We then combine these solution methods with the methods for regularizing and computing discrete cepstrum coefficients to form a spectral-envelope estimation scheme. Measured by spectral-envelope approximation error, this scheme has been verified to be much better than the original scheme. Furthermore, we have applied this scheme to build a system for voice timbre transformation. The performance of this system demonstrates the effectiveness of the proposed spectral-envelope estimation scheme.
Keywords:
Spectral Envelope, Discrete Cepstrum, Harmonic-plus-noise Model, Voice Timbre Transformation.
Author:
Lun-Wei Ku, Chia-Ying Lee, and Hsin-Hsi Chen
Abstract:
Opinion holder identification aims to extract entities that express opinions in sentences. In this paper, opinion holder identification is divided into two subtasks: author's opinion recognition and opinion holder labeling. A support vector machine (SVM) is adopted to recognize authors' opinions, and a conditional random field (CRF) algorithm is utilized to label opinion holders. New features are proposed for both methods. Our method achieves an F-score of 0.734 in the NTCIR7 MOAT task on the Traditional Chinese side, which is the best performance among the machine learning methods proposed by participants and is close to the best performance in this task. In addition, inconsistent annotations of opinion holders are analyzed, and the best way to utilize training instances with inconsistent annotations is proposed.
Keywords:
Opinion Holder Identification, Opinion Mining, Conditional Random Field, Support Vector Machine.
Author:
Hui-Chuan Lu and Lo Hsueh Lu
Abstract:
This paper aims to introduce the process used to create a "Parallel Learners' Corpus of Spanish and English" (CPATEI) by integrating the theories and techniques of Corpus Linguistics, and to investigate the conjunctions used by learners of Spanish in Taiwan.
The data collected in CPATEI were contributed by learners of Spanish from two universities in Taiwan, and include Spanish compositions and their corresponding English translations written by learners at two different levels of Spanish. In addition, we have collected the texts as revised by native speakers, and the texts have been POS-tagged.
Based on the collected data, we examine learners' tendencies to overuse, underuse, and misuse conjunctions. At the same time, we discuss the possible factors leading to incorrect usage of conjunctions. The results demonstrate that the frequency of conjunctions used by the learners in their Spanish compositions and English translations is lower than that of the Spanish and English native speakers in their corrections. Among the types of conjunction errors, underuse of copulative and additive conjunctions and overuse of adversative and cause-consecutive conjunctions are the two types noted in the writing of the Taiwanese learners. Furthermore, no relationship was found between the incorrect use of Spanish conjunctions and the learners' language level, nor with their having English as a second language and Spanish as a third language. Finally, we hope the results of this research will benefit curriculum design for the teaching of Spanish conjunctions.
Keywords:
Corpus-based, Learners' Corpus, Parallel Corpus, Conjunctions, Second Language, Third Language