Author:
Wei-Ho Tsai and Hsin-Chieh Lee
Abstract:
Automatic speaker identification (SID) has long been an important research topic. It is aimed at identifying who among a set of enrolled persons spoke a given utterance. This study extends the conventional SID problem to examining whether an SID system trained using speech data can identify the singing voices of the enrolled persons. Our experiments found that a standard SID system fails to identify most singing data, due to the significant differences between singing and speaking for a majority of people. In order for an SID system to handle both speech and singing data, we examine the feasibility of using a model-adaptation strategy to enhance the generalization of a standard SID system. Our experiments show that a majority of the singing clips can be correctly identified after adapting speech-derived voice models with some singing data.
Keywords:
Model Adaptation, Singing, Speaker Identification
Author:
Tyne Liang and Jyun-Hua Cheng
Abstract:
Anaphora is a rhetorical device commonly used in written texts. It denotes the use of terms referring to previously mentioned entities, concepts, or events. In this paper, definite anaphora in Chinese texts is addressed and empirical approaches to tackle abstract anaphors are presented. The resolution is built on the association between target anaphors and the corresponding referents in their multiple-type features extracted from different levels of discourse units. Experimental results show that features extracted from clauses are more useful than those extracted from sentences in referent identification. In addition, the presented salience-based model outperforms the SVM-based model regardless of whether the best set of extracted features is employed.
Keywords:
Anaphora Resolution, Chinese Text, Definite Anaphora, Feature Extraction
Author:
Chao-Lin Liu, Guantao Jin, Qingfeng Liu, Wei-Yun Chiu, and Yih-Soong Yu
Abstract:
We report applications of language technology to analyzing historical documents in the Database for the Study of Modern Chinese Thoughts and Literature (DSMCTL). We studied two historical issues with the reported techniques: the conceptualization of "Huaren" (華人, Chinese people) and the attempt to institute constitutional monarchy in the late Qing dynasty. We also discuss research challenges for supporting sophisticated issues using our experience with DSMCTL, the Database of Government Officials of the Republic of China, and the Dream of the Red Chamber. Advanced techniques and tools for lexical, syntactic, semantic, and pragmatic processing of language information, along with more thorough data collection, are needed to strengthen the collaboration between historians and computer scientists.
Keywords:
Temporal Analysis, Keyword Trends, Collocation, Chinese Historical Documents, Digital Humanities, Natural Language Processing, Chinese Text Analysis
Author:
Chih-Chuan Hsu and Shih-Hung Wu
Abstract:
Users might use general terms to query for the information they need when the exact keyword is unknown. We treat these inexact query terms as general queries. In this paper, we construct a test data set to evaluate the performance of an online search engine on searching Wikipedia with general queries and exact queries. We also propose a new query expansion method that performs better on general queries. The method treats Wikipedia as a thesaurus to find query expansion candidates. The expanded queries are then combined with pseudo relevance feedback. This method outperforms the online search engine on general queries.
Keywords:
Information Retrieval, General Query, Test Collection, Wikipedia, Query Expansion