Author:
Masayuki ASAHARA, Sachi KATO, Hikari KONISHI, Mizuho IMADA, and Kikuo MAEKAWA
Abstract:
Temporal information extraction can be divided into the following tasks: temporal expression extraction, time normalisation, and temporal ordering relation resolution. The first task is a subtask of named entity and numeral expression extraction. The second task is often performed by rewriting systems. The third task consists of event anchoring. This paper proposes a Japanese temporal ordering annotation scheme that is used to annotate expressions by referring to the "Balanced Corpus of Contemporary Written Japanese" (BCCWJ). We extracted verbal and adjective event expressions as
Keywords:
Temporal Information Processing, Event Semantics, Corpus Annotation
Author:
Yu-Chun Wang, Karol Chia-Tien Chang, Richard Tzong-Han Tsai, and Jieh Hsiang
Abstract:
Extracting plausible transliterations from historical literature is a key issue in historical linguistics and other research fields. In Chinese historical literature, the characters used to transliterate the same loanword may vary because of different translation eras or different Chinese language preferences among translators. To assist historical linguists and digital humanities researchers, this paper proposes a transliteration extraction method based on conditional random fields, with features drawn from language models and from the characteristics of the Chinese characters used in transliterations, which are well suited to identifying transliteration characters. To evaluate our method, we compiled an evaluation set from two Buddhist texts, the Samyuktagama and the Lotus Sutra. We also constructed a baseline approach that combines suffix-array-based extraction with a phonetic similarity measurement. Our method significantly outperforms the baseline approach, achieving a recall of 0.9561 and a precision of 0.9444. The results show that our method is very effective at extracting transliterations from classical Chinese texts.
Keywords:
Transliteration Extraction, Classical Chinese, Buddhist Literature, Language Model, Conditional Random Fields, CRF
Author:
Hen-Hsen Huang, Kai-Chun Chang, and Hsin-Hsi Chen
Abstract:
To prepare an evaluation dataset for textual entailment (TE) recognition, human annotators label many rich linguistic phenomena on text and hypothesis expressions. These phenomena illustrate the implicit human inference process used to determine the relations of given text-hypothesis pairs. This paper aims at understanding how humans think during TE recognition and at modeling their thinking process to deal with this problem. First, we analyze a labelled RTE-5 test set, which has been annotated with 39 linguistic phenomena of 5 aspects by Mark Sammons et al., and find that the negative entailment phenomena are very effective features for TE recognition. Then, a rule-based method and a machine learning method are proposed to extract this kind of phenomenon from text-hypothesis pairs automatically. Though the systems with machine-extracted knowledge cannot match the systems with human-labelled knowledge, they provide a new direction for thinking about TE problems. We further annotate the negative entailment phenomena on Chinese text-hypothesis pairs in the NTCIR-9 RITE-1 task, and reach the same findings as on the English RTE-5 datasets.
Keywords:
Textual Entailment Recognition, Chinese Processing, Semantic