International Journal of Computational Linguistics & Chinese Language Processing
Vol. 16, No. 1-2, March/June 2011

Performance Evaluation of Speaker-Identification Systems for Singing Voice Data
Wei-Ho Tsai and Hsin-Chieh Lee
Resolving Abstract Definite Anaphora in Chinese Texts
Tyne Liang and Jyun-Hua Cheng
Some Chances and Challenges in Applying Language Technologies to Historical Studies in Chinese
Chao-Lin Liu, Guantao Jin, Qingfeng Liu, Wei-Yun Chiu, and Yih-Soong Yu
Evaluating the Information Retrieval Performance of Query Expansion Method and On-line Search Engine on General Query
Chih-Chuan Hsu and Shih-Hung Wu

Title:
Performance Evaluation of Speaker-Identification Systems for Singing Voice Data

Author:
Wei-Ho Tsai and Hsin-Chieh Lee

Abstract:
Automatic speaker identification (SID) has long been an important research topic. It aims to identify who among a set of enrolled persons spoke a given utterance. This study extends the conventional SID problem by examining whether an SID system trained on speech data can identify the singing voices of the enrolled persons. Our experiments found that a standard SID system fails to identify most singing data, because singing and speaking differ significantly for a majority of people. To enable an SID system to handle both speech and singing data, we examine the feasibility of using a model-adaptation strategy to improve the generalization of a standard SID system. Our experiments show that a majority of the singing clips can be correctly identified after adapting the speech-derived voice models with some singing data.

Keywords:
Model Adaptation, Singing, Speaker Identification

Title:
Resolving Abstract Definite Anaphora in Chinese Texts

Author:
Tyne Liang and Jyun-Hua Cheng

Abstract:
Anaphora is a rhetorical device commonly used in written texts. It denotes the use of terms referring to previously mentioned entities, concepts, or events. This paper addresses definite anaphora in Chinese texts and presents empirical approaches for resolving abstract anaphors. The resolution is built on the association between target anaphors and their corresponding referents, using multiple types of features extracted from different levels of discourse units. Experimental results show that features extracted from clauses are more useful than those extracted from sentences in referent identification. In addition, the presented salience-based model outperforms the SVM-based model regardless of whether the best set of extracted features is employed.

Keywords:
Anaphora Resolution, Chinese Text, Definite Anaphora, Feature Extraction

Title:
Some Chances and Challenges in Applying Language Technologies to Historical Studies in Chinese

Author:
Chao-Lin Liu, Guantao Jin, Qingfeng Liu, Wei-Yun Chiu, and Yih-Soong Yu

Abstract:
We report applications of language technology to analyzing historical documents in the Database for the Study of Modern Chinese Thoughts and Literature (DSMCTL). We studied two historical issues with the reported techniques: the conceptualization of "huaren" (華人, Chinese people) and the attempt to institute constitutional monarchy in the late Qing dynasty. We also discuss research challenges for supporting sophisticated issues using our experience with DSMCTL, the Database of Government Officials of the Republic of China, and the Dream of the Red Chamber. Advanced techniques and tools for lexical, syntactic, semantic, and pragmatic processing of language information, along with more thorough data collection, are needed to strengthen the collaboration between historians and computer scientists.

Keywords:
Temporal Analysis, Keyword Trends, Collocation, Chinese Historical Documents, Digital Humanities, Natural Language Processing, Chinese Text Analysis

Title:
Evaluating the Information Retrieval Performance of Query Expansion Method and On-line Search Engine on General Query

Author:
Chih-Chuan Hsu and Shih-Hung Wu

Abstract:
Users may query with general terms when they do not know the exact keyword for the information they need. We treat these inexact query terms as general queries. In this paper, we construct a test data set to evaluate the performance of an online search engine in searching Wikipedia with both general queries and exact queries.

We also propose a new query expansion method that performs better on general queries. The Wikipedia query expansion method treats Wikipedia as a thesaurus for finding candidate expansion terms. The expanded queries are then combined with pseudo relevance feedback. This method outperforms the online search engine on general queries.

Keywords:
Information Retrieval, General Query, Test Collection, Wikipedia, Query Expansion
