ZipfExplorer: a tool for the comparison of shared lexis |
|
Author: | Coats, Steven1 |
Organizations: |
1English, University of Oulu, 90014 Oulu, Finland |
Format: | article |
Version: | published version |
Access: | open |
Online Access: | PDF Full Text (PDF, 1.8 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe2021052131113 |
Language: | English |
Published: | CEUR Workshop Proceedings, 2021 |
Publish Date: | 2021-05-21 |
Description: |
Abstract: Word frequency statistics and lexical diversity measures can provide insights into discourse differences between texts. The ZipfExplorer, a tool and online app for the interactive visualization and comparison of word frequencies in two texts, shows side-by-side rank-frequency profiles and interactive tables of shared lexis, enabling keyword analysis and shedding light on discourse differences. Four lexical diversity measures (type-token ratio, Gini coefficient, power-law alpha parameter, and Shannon entropy) are calculated for the shared word types. Word frequency information is provided for a selection of mainly literary texts, and users can upload their own files. This paper provides an overview of the visualization of word frequency distributions, describes the functionality of the ZipfExplorer tool and demonstrates some of its features, and briefly discusses the lexical diversity measures calculated by the tool.
|
Volume: | 2865 |
Pages: | 145 - 155 |
Article number: | 6 |
Host publication: |
Post-Proceedings of the 5th Conference Digital Humanities in the Nordic Countries (DHN 2020), Riga, Latvia, October 21-23, 2020 |
Host publication editor: |
Reinsone, Sanita; Skadiņa, Inguna; Daugavietis, Jānis; Baklāne, Anda |
Conference: |
Conference Digital Humanities in the Nordic Countries |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
6121 Languages; 113 Computer and information sciences |
Copyright information: |
© 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
https://creativecommons.org/licenses/by/4.0/ |
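
The abstract names four lexical diversity measures calculated for the word types shared by two texts. The following is a minimal Python sketch of how such measures might be computed; it is not the ZipfExplorer's own code. The function names, the letters-only lowercase tokenizer, the choice of log base 2 for entropy, and the approximate maximum-likelihood estimator for the power-law alpha (with a hypothetical `x_min` parameter) are all assumptions made for illustration, since this record does not state how the tool tokenizes text or fits the power law.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (an assumed tokenizer)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def shared_frequencies(text_a, text_b):
    """Return each text's counts, restricted to types found in both texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    return ({w: counts_a[w] for w in shared},
            {w: counts_b[w] for w in shared})


def rank_frequency(freqs):
    """Frequencies sorted from most to least frequent (rank 1 first)."""
    return sorted(freqs.values(), reverse=True)


def type_token_ratio(freqs):
    """Number of distinct types divided by number of tokens."""
    return len(freqs) / sum(freqs.values())


def shannon_entropy(freqs):
    """Entropy in bits of the relative-frequency distribution."""
    total = sum(freqs.values())
    return -sum((c / total) * math.log2(c / total) for c in freqs.values())


def gini(freqs):
    """Gini coefficient of the frequencies (0 means all types equally frequent)."""
    xs = sorted(freqs.values())
    n, total = len(xs), sum(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def power_law_alpha(freqs, x_min=1):
    """Approximate MLE of the power-law exponent for discrete data.

    Assumption: alpha ~ 1 + n / sum(ln(x / (x_min - 0.5))), applied to
    frequencies >= x_min. The tool itself may fit alpha differently.
    """
    xs = [c for c in freqs.values() if c >= x_min]
    return 1 + len(xs) / sum(math.log(x / (x_min - 0.5)) for x in xs)


if __name__ == "__main__":
    a = "the cat sat on the mat"
    b = "the dog sat on the log"
    freqs_a, freqs_b = shared_frequencies(a, b)
    for name, f in (("A", freqs_a), ("B", freqs_b)):
        print(name, rank_frequency(f), type_token_ratio(f),
              shannon_entropy(f), gini(f), power_law_alpha(f))
```

Restricting both frequency tables to the shared types before computing the measures mirrors the abstract's statement that the measures are calculated "for the shared word types". The sketch assumes the two texts share at least one word type; empty input would need separate handling.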