The Corpus of Australian and New Zealand Spoken English : a new resource of naturalistic speech transcripts |
|
Author: | Coats, Steven¹ |
Organizations: |
¹English, Faculty of Humanities, University of Oulu, Finland |
Format: | article |
Version: | published version |
Access: | open |
Online Access: | PDF Full Text (PDF, 3.6 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe2023031431499 |
Language: | English |
Published: |
Australasian Language Technology Association, 2022 |
Publish Date: | 2023-03-14 |
Description: |
Abstract: The Corpus of Australian and New Zealand Spoken English (CoANZSE) is a 190-million-word corpus of Automatic Speech Recognition (ASR) transcripts from YouTube channels of local councils and other governmental bodies in 472 locations in Australia and New Zealand. CoANZSE can be used to examine grammar and syntax in Australian and New Zealand spoken English, and because tokens are word-timed and transcripts are linked to videos, it can serve as the starting point for phonetic or multi-modal studies. Two exploratory analyses demonstrate differences between Australia and New Zealand in the relative frequencies of double modals, a rare non-standard syntactic feature, and show that transcripts from Australia and New Zealand can be distinguished on the basis of common lexical items.
|
Series: |
Proceedings of the Australasian Language Technology Workshop |
ISSN: | 1834-7037 |
ISSN-L: | 1834-7037 |
Volume: | 20 |
Pages: | 1 - 5 |
Article number: | 1 |
Host publication: |
Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association |
Host publication editor: |
Parameswaran, Pradeesh; Biggs, Jennifer; Powers, David |
Conference: |
Annual Workshop of the Australasian Language Technology Association |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
113 Computer and information sciences |
Copyright information: |
© 1963–2023 ACL; other materials are copyrighted by their respective copyright holders. Materials prior to 2016 here are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International License. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License. |
https://creativecommons.org/licenses/by/4.0/ |