In Search of Basic Units of Spoken Language: A Corpus-Driven Approach
What is the best way to analyze spontaneous spoken language? In their search for the basic units of spoken language, the authors of this volume opt for a corpus-driven approach. They share a strong conviction that prosodic structure is essential for the study of spoken discourse, and each brings their own theoretical and practical experience to the table. In the first part of the book, they segment spoken material from a range of languages: Russian, Hebrew, Central Pomo (an indigenous language of California), French, Japanese, Italian, and Brazilian Portuguese. In the second part, each author analyzes the same two spoken English samples from a different perspective, using the methods of analysis reflected in their respective analyses in Part I. This approach allows common tendencies of segmentation to emerge, both prosodic and segmental.
This chapter deals with the segmentation, definition of reference units, and annotation of the first corpus of Russian narratives by individuals with brain damage (people with aphasia and right hemisphere damage) and by neurologically healthy speakers. We show that parameters such as pause length and intonation contours cannot be used for segmentation of impaired speech. Instead, we use syntactic criteria to identify the reference units, which are called elementary discourse units (EDUs) in this chapter. The Russian CliPS (Clinical Pear Stories) corpus contains multi-layer annotation of audio and video recordings, performed at the micro- and macro-linguistic levels, and can be used as a source for qualitative and quantitative research on various aspects of speech in aphasia and right hemisphere damage.