Arabic dialects identification : North African dialects case study |
|
Author: | Berrimi, Mohamed (1); Moussaoui, Abdelouahab (1); Oussalah, Mourad (2) |
Organizations: |
(1) Department of computer sciences, University of Ferhat Abbas 1, Algeria; (2) Department of Computer Science and Engineering, University of Oulu, Finland |
Format: | article |
Version: | published version |
Access: | open |
Online Access: | PDF Full Text (PDF, 0.8 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe202102154776 |
Language: | English |
Published: |
RWTH Aachen University, 2020 |
Publish Date: | 2021-02-15 |
Description: |
Abstract: Arabic is the fourth most used language on the Internet and the official language of more than 20 countries around the world. It has three main varieties: Modern Standard Arabic, which is used in books, news, and education; local dialects, which vary from one region to another; and Classical Arabic, the written language of the Quran. The Maghrebi dialect is the Arabic dialect used in North African countries, where internet users feel more comfortable using local slang than native Arabic. In this study, we present a large dataset of regional dialects from three countries, namely Algeria, Tunisia, and Morocco, and then investigate the identification of each dialect using machine learning classifiers with TF-IDF features. The approach shows promising results, achieving an accuracy of up to 96%.
|
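The abstract names TF-IDF features and machine learning classifiers but this record does not say which tokenization or which classifiers were used. The following is only a minimal sketch of the general idea, under stated assumptions: character trigrams as tokens, a smoothed IDF weighting, and a nearest-centroid classifier as a simple stand-in. All function names, the labels `dialect_a`, `dialect_b`, `dialect_c`, and the training strings are hypothetical placeholders, not samples from the paper's dataset.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Split a text into overlapping character n-grams (assumed tokenization)."""
    padded = f" {text.strip().lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every token."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """Build an L2-normalized sparse TF-IDF vector; unseen tokens are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(a, b):
    """Dot product of two sparse vectors that are already unit length."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def train_centroids(texts, labels):
    """Fit IDF weights and one normalized centroid vector per label."""
    docs = [char_ngrams(t) for t in texts]
    idf = fit_idf(docs)
    sums = {}
    for tokens, label in zip(docs, labels):
        acc = sums.setdefault(label, Counter())
        for t, v in tfidf_vector(tokens, idf).items():
            acc[t] += v
    centroids = {}
    for label, acc in sums.items():
        norm = math.sqrt(sum(v * v for v in acc.values()))
        centroids[label] = {t: v / norm for t, v in acc.items()} if norm else {}
    return idf, centroids

def predict(text, idf, centroids):
    """Return the label whose centroid is most similar to the text."""
    vec = tfidf_vector(char_ngrams(text), idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Placeholder training data: invented strings, NOT real dialect samples.
TRAIN_TEXTS = [
    "alpha alpha beta", "alpha beta alpha",
    "gamma delta gamma", "delta gamma gamma",
    "omega sigma omega", "sigma omega sigma",
]
TRAIN_LABELS = [
    "dialect_a", "dialect_a",
    "dialect_b", "dialect_b",
    "dialect_c", "dialect_c",
]

idf, centroids = train_centroids(TRAIN_TEXTS, TRAIN_LABELS)
print(predict("beta alpha", idf, centroids))
```

Character n-grams are one common choice for short, noisy social media text because they tolerate spelling variation; whether the authors used word-level or character-level features is not stated here.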
Series: |
CEUR workshop proceedings |
ISSN: | 1613-0073 |
ISSN-E: | 1613-0073 |
ISSN-L: | 1613-0073 |
Pages: | 64 - 72 |
Host publication: |
3rd Conference on Informatics and Applied Mathematics, IAM 2020 |
Host publication editor: |
Seridi, Hamid; Kurulay, Muhammet; Kouahla, Mohamed Nadjib; Kouahla, Zineddine; Farou, Kouahla; Ferrag, Mohamed Amine; Halimi, Khaled |
Conference: |
Conference on Informatics and Applied Mathematics |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
113 Computer and information sciences; 6121 Languages |
Subjects: | |
Copyright information: |
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073 |
https://creativecommons.org/licenses/by/4.0/ |