International Journal of Computational Linguistics & Chinese Language
Processing
Vol. 27, No. 1, June 2022
Title:
Preface: Corpus Linguistics and Discourse Annotation
Author:
Siaw-Fong Chung, Rafal Rzepka, and Shih-ping Wang
Title:
The Uniqueness in Speech: Prosodic Highlights-prompted Information Content
Projection in Continuous Speech
Author:
Helen Kai-yun Chen and Chiu-yu Tseng
Abstract:
Recently, it has been identified that perceived prosodic highlights in
continuous speech can function alternatively as the projector of key/focal
information allocation. This view provides a novel interpretation to the
long-held claim that prominence is used predominantly to mark key information
and alludes to the significance of information content planning prompted by
perceived prominence. Exploring further information content planning and
allocation prompted by prosodic highlights, this study focused on the
information content planning unit, the "projector" (PJR)
and its respective "projection" (PJN) (henceforth PJR-PJN units), across four
diverse Mandarin speech genres. Using the corpus linguistic approach and
quantitative analyses, the current study conducted acoustic correlates analyses
of F0 realization and pause duration, as well as the calculation of
emphasis-attributed weighting scores based on emphasis levels consistently
annotated in the speech data. While the main goal of the study was to profile
consistent acoustic realizations across the PJR-PJN units, the results also
confirmed the patterned deployment of information content in continuous speech.
Ultimately, the current results foregrounded the underlying mechanism
for information prosody and features unique to speech.
Keywords: Continuous Speech and
Discourse, Spoken Corpora and Annotations, Information Content Planning and
Allocation, Prosodic Highlights-prompted Projection, Emphasis-attributed
Weighting Scores, Information-attributed Weighting Scores.
Title:
Topic Development and Boundary Cues in Hakka Conversational Discourse
Author:
Shu-Chuan Tseng and Hsiao-chien Liu
Abstract:
The structure of conversational discourse is context-dependent, and the
organization of discourse segments and preferences for signaling discourse
boundaries are language-specific characteristics. Participating speakers,
speaking scenarios, and communication purposes instantaneously affect the
conduct of social interaction and verbal exchanges during a conversation. For
example, topic maintenance is sustained by the overt exchange of coherent
information, and lexical preferences at the boundaries of related discourse
segmentation can help construct the course of topic development. Moreover,
form-based discourse units are used to represent the content of spoken
utterances and to describe the interaction of speakers in conversations. This
study investigated topic-specific Hakka conversations using a top-down
two-level discourse segmentation approach to examine the development and
production of topics. Typical cues and expressions used to initiate topics and
subtopics and their respective discourse functions in the Hakka conversations
were analyzed. In the Hakka conversational data, noun phrases were preferred at
the topic and subtopic transition boundaries, and complete forms such as
clausal constructions were also favored, although spontaneous speech was expected
to be fragmentary in terms of syntactic structure.
Keywords:
Conversation, Discourse Units, Topic Development, Boundary Cues, Hakka
Title:
A Move Analysis of Communicative Acts in Petition Text on the Public Policy
Participation Network Platform
Author:
Wei-Ting Yang, Chen-Yu Chester Hsieh, and Siaw-Fong Chung
Abstract:
With the rapid development of information technology, the Taiwanese government
has launched the Public Policy Participation Network Platform (Join Platform),
which allows citizens to start and support a petition online and voice their
opinions regarding public issues. The aim of this study was to apply the method
of move analysis to investigate the text structure and linguistic features of
the online petition genre. In total, 40 online petition texts were collected
from the website and compiled into a corpus using the AntConc
application. The collected texts were then annotated with reference to the four
moves of the Situation, Problem, Solution, and Evaluation textual pattern and
the communicative acts in each move. The results showed that the distribution
of the moves varied across the articles and that the communicative acts in each
move were represented by high-frequency words. The findings of this research
will thus serve as a basis for future applications, such as computerized data
collection, automatic annotation of rhetorical moves, and judgment of
communicative acts in texts.
Keywords:
Move Analysis, SPSE, Policy Argumentation, Communicative Act, Join Platform
Title:
An N-gram Approach to Identifying the Chinese Linguistic Signals for the
Problem-Solution Pattern in Annotated Online Health News
Author:
Chen-Yu Chester Hsieh and Yu-Yun Chang
Abstract:
This article will report the results of an exploratory project that combined
the annotation of the Problem-Solution (PS) textual pattern in online health
news and the quantitative and qualitative methods of corpus linguistics to
investigate the linguistic features of particular rhetorical
moves. A total of 120 journalistic texts written in Chinese were collected from
a Taiwan-based journalistic website that focused on providing news related to
health and medicine and were annotated with the four components of the PS
pattern. To identify signals in the genre for the elements of the PS move
structure, an n-gram approach was then implemented to extract frequent lexicogrammatical sequences from the corpus in general and
from the Problem and Response moves in particular. The
results showed that the linguistic features found in the retrieved sequences
tended to fall within a range of categories, such as abstract nouns, medical
terms, and modal verbs, which not only served functions relevant to the
rhetorical move in which they were used but also reflected characteristics
specific to the health news genre and the Chinese language. The findings and
annotated data generated from the current project will thus provide a solid foundation
for future research and applications.
Keywords:
N-gram, Problem-Solution Pattern, Health News, Annotation, Journalistic
Discourse
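
As an illustration only, the n-gram step described in the abstract above can be sketched as follows. This is a minimal sketch, not the authors' pipeline: it assumes the texts are already word-segmented and labeled by move, and the toy sentences, the `frequent_ngrams` helper, and the `min_freq` threshold are all hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-token sequences in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def frequent_ngrams(corpus, move, n, min_freq=2):
    """Return (n-gram, count) pairs for one move, most frequent first.

    `corpus` is a list of (move_label, tokens) pairs; `min_freq` is an
    assumed cut-off, not a value reported in the abstract.
    """
    counts = Counter()
    for label, tokens in corpus:
        if label == move:
            counts.update(ngram_counts(tokens, n))
    return [(gram, c) for gram, c in counts.most_common() if c >= min_freq]


# Hypothetical, pre-segmented toy sentences labeled with a PS move.
corpus = [
    ("Problem", ["可能", "會", "導致", "感染"]),   # "may cause infection"
    ("Problem", ["可能", "會", "引起", "發炎"]),   # "may trigger inflammation"
    ("Response", ["建議", "民眾", "多", "喝水"]),  # "people are advised to drink more water"
]

problem_bigrams = frequent_ngrams(corpus, "Problem", 2)
print(problem_bigrams)
```

In this toy data, the only bigram that recurs in the Problem move is the modal sequence 可能 會 ("may"), which is the kind of recurring signal the abstract describes.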
Title:
Let Me Finish! - Speech Patterns of Interruptions in
Chinese: A Corpus-based Study on Parliamentary Interpellations on Taiwan
Author:
Christian Schmidt and Chia-Rung Lu
Abstract:
This corpus-based study investigated verbal interruptions during parliamentary
interpellations based on official and publicly accessible transcriptions
provided by the Legislative Yuan of the Republic of China (Taiwan). While
interruptions have previously been understood as organizing turn-taking, as
well as cues and speech markers, the results of this study suggest that
interruptions have a dual nature. Interruption is incentivised
by confrontational discourse strategies and realized by linguistic expressions,
some of which are statistically significant and can be called keywords. Using
open-source data to explore the linguistic features in the speech patterns of
interruptions in institutional discourse, we first identified the word classes
and keywords with significant frequency shifts between interrupted, interrupting,
and regular sentences. Then, we associated the meanings of the keywords with
offensive and defensive discourse strategies. The findings of this study
indicate that interrupted sentences were more reflective of defensive discourse
strategies, while interrupting sentences were associated with offensive ones.
Moreover, conjunctions, adverbs, and pronouns played a more important role in
the speech patterns of interruptions compared with their respective footprint
in the lexicon. Conversely, nouns and verbs, with some exceptions, as well as
adjectives, played a lesser role. We argue that the confrontational incentive
structure in institutional debates creates certain linguistic patterns, mostly
statistically significant frequency shifts of keywords in interrupted and
interrupting sentences, and that these patterns might be useful in explaining
interruption.
Keywords:
Speech Patterns, Spoken Chinese, Interruptions, Institutional Discourse
Title:
Constructing a Deep Learning Model Using Language in Social
Media: The Case Study of "Retrospective Adjustment"
Author:
Ren-feng Duann, Shu-I Chiu, and Hui-Wen Liu
Abstract:
This research, which used Facebook posts related to the term "retrospective
adjustment" in Taiwan as the corpus, manually coded the sentiments of 6,917
posts. Randomly dividing the dataset into two subsets for training (70%) and
testing (30%) and using the Chinese pre-trained BERT model as the foundation,
we trained and fine-tuned the model with the training dataset and ran the
fine-tuned model to predict the sentiments in the test dataset. We then
compared the results of the manual coding and model prediction to explain the
differences from the perspective of linguistic features. The results indicated
that the model performed better for the posts manually coded as "neutral," with
an accuracy of 0.81, while the accuracies of model prediction were only 0.64
and 0.63 for the posts manually coded as "positive" and "negative,"
respectively. Regarding inaccuracy, the posts manually coded as "negative" but
predicted by the model as "positive" and those manually coded as "positive" but
predicted by the model as "neutral" ranked the highest (0.23) and the second
highest (0.22), respectively. Examining the linguistic features of the two
groups of posts, we identified seven categories of linguistic features that, we
claim, led to "negative" coding and four categories that led to "positive"
coding. Moreover, both groups contained posts that could not be coded
accurately without knowledge of the news and the Facebook account owners' political/social
inclinations, which was attributed to the posts' high relatedness to the general public and the politics of Taiwan. Considering that
the language used in social media is different from the language employed to
train current models, and that Facebook users frequently use punctuation marks
and emoticons to express their moods, we argue that there is a need to develop
a model for social media.
Keywords:
Social Media, Deep Learning, Retrospective Adjustment, Sentiment Analysis,
Natural Language Processing
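
For illustration, the evaluation set-up described in the abstract above (a random 70/30 split, then a comparison of manual codes with model predictions for each sentiment) might be sketched as below. This is a hedged sketch: it does not reproduce the BERT fine-tuning, the function names and toy labels are hypothetical, and reading the reported figures as per-label shares of predictions is an assumption.

```python
import random
from collections import Counter, defaultdict


def split_dataset(items, train_ratio=0.7, seed=0):
    """Shuffle items reproducibly and split them into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def per_class_rates(gold, pred):
    """For each manual (gold) label, the share of posts given each predicted label."""
    counts = defaultdict(Counter)
    for g, p in zip(gold, pred):
        counts[g][p] += 1
    return {
        g: {p: c / sum(row.values()) for p, c in row.items()}
        for g, row in counts.items()
    }


# Hypothetical toy labels, not data from the study.
gold = ["neutral", "neutral", "neutral", "neutral",
        "positive", "positive", "negative", "negative"]
pred = ["neutral", "neutral", "neutral", "positive",
        "positive", "neutral", "negative", "positive"]

train, test = split_dataset(range(10))
rates = per_class_rates(gold, pred)
print(rates["neutral"]["neutral"])    # share of "neutral" posts predicted correctly
print(rates["negative"]["positive"])  # share of "negative" posts predicted "positive"
```

The diagonal entries (for example `rates["neutral"]["neutral"]`) correspond to per-label accuracy, and the off-diagonal entries correspond to the kinds of misclassification the abstract ranks.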