A database of North American double modals and self-repairs from YouTube
English, University of Oulu, Finland
|Online Access:||PDF Full Text (PDF, 1.2 MB)|
|Persistent link:|| http://urn.fi/urn:nbn:fi-fe2023062157421
|Publish Date:|| 2023-06-21
Sequences of two modal verbs in spoken English can represent either the use of a nonstandard syntactic feature (double modal) or a corrected utterance in which a speaker begins with one modal auxiliary but switches to another (self-repair). This article presents the Double Modals and Self-Repairs (DMSR) database, a table of naturalistic double modals and self-repairs in videos from local government entities in North America, created from the Corpus of North American Spoken English (CoNASE). The paper describes the procedures used for the database’s creation, discusses potential uses, and presents an exploratory analysis in which a logistic regression classifier is trained with CoNASE data to distinguish authentic double modals from self-repair sequences on the basis of local discourse context. The analysis demonstrates how large corpora of speech can be used to investigate the links between syntactic and pragmatic phenomena and shows specifically that double modals are an interactive device, while two-modal sequences as self-repairs may be the result of high cognitive load. The paper concludes with a discussion of multimodal corpus creation from YouTube for the study of lexical, syntactic, and interactional phenomena in speech as well as for the analysis of complex, multilevel computer-mediated communication (CMC) phenomena.
Psychology of Language and Communication
|Pages:||273 - 296|
|Type of Publication:|| A1 Journal article – refereed
|Field of Science:|| 113 Computer and information sciences
© 2022 Steven Coats, published by Sciendo. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.