University of Oulu

Tran, MT., Vo, QN. & Lee, GS. Binarization of music score with complex background by deep convolutional neural networks. Multimed Tools Appl 80, 11031–11047 (2021). https://doi.org/10.1007/s11042-020-10272-2

Binarization of music score with complex background by deep convolutional neural networks

Author: Tran, Minh-Trieu (1); Vo, Quang-Nhat (2); Lee, Guee-Sang (1)
Organizations: (1) Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju, South Korea
(2) Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 3.8 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2021041910914
Language: English
Published: Springer Nature, 2021
Publish Date: 2021-04-18
Description:

Abstract

Binarization is an important step in most document analysis systems. In music score images with a complex background, background clutter of varied shapes and colors makes binarization challenging. This paper presents a model for binarizing music score images with complex backgrounds by fusing deep convolutional neural networks. Our model is trained directly on image regions, using pixel values as inputs and the binary ground truth as labels. By combining the generalization capability of the residual network backbone with the useful feature-learning ability of the dense layer, the proposed network structures can differentiate foreground pixels from background clutter, reduce the risk of overfitting, and thus handle the complex background noise that appears in music score images. Compared with traditional algorithms, the binary images generated by our method have a cleaner background and better-preserved strokes. Experiments on captured and synthetic music score images show promising results compared to existing methods.
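
The abstract states that the model is trained on image regions, with pixel values as inputs and the binary ground truth as labels. As a rough illustration of that data preparation step only, the sketch below cuts aligned region/label pairs from an image and its ground truth. The function name `extract_training_pairs`, the `size` and `stride` parameters, and the toy data are assumptions made for this example; the record does not describe the authors' actual region size, sampling scheme, or network code.

```python
from typing import List, Tuple

Image = List[List[float]]  # row-major grid of pixel values


def extract_training_pairs(
    image: Image, truth: Image, size: int, stride: int
) -> List[Tuple[Image, Image]]:
    """Cut aligned (input region, label region) pairs from an image
    and its binary ground truth. Hypothetical helper, not the paper's code."""
    same_rows = len(image) == len(truth)
    if not same_rows or any(len(a) != len(b) for a, b in zip(image, truth)):
        raise ValueError("image and ground truth must have the same shape")
    height = len(image)
    width = len(image[0]) if height else 0
    pairs = []
    # Slide a size x size window over the image; the same window position
    # is used for the ground truth so inputs and labels stay aligned.
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            region = [row[left:left + size] for row in image[top:top + size]]
            label = [row[left:left + size] for row in truth[top:top + size]]
            pairs.append((region, label))
    return pairs


# Toy 4x4 grayscale image (values in [0, 1]) and its binary ground truth
# (1 = foreground stroke, 0 = background). Both are made up for illustration.
image = [
    [0.9, 0.8, 0.2, 0.9],
    [0.7, 0.1, 0.1, 0.8],
    [0.9, 0.2, 0.9, 0.9],
    [0.8, 0.9, 0.9, 0.7],
]
truth = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
pairs = extract_training_pairs(image, truth, size=2, stride=2)
```

With `size=2` and `stride=2` the toy image yields four non-overlapping pairs. A smaller stride would produce overlapping regions and more training samples from the same image.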


Series: Multimedia tools and applications
ISSN: 1380-7501
ISSN-E: 1573-7721
ISSN-L: 1380-7501
Volume: 80
Issue: 7
Pages: 11031–11047
DOI: 10.1007/s11042-020-10272-2
OADOI: https://oadoi.org/10.1007/s11042-020-10272-2
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
213 Electronic, automation and communications engineering, electronics
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A3B05049058 & NRF-2020R1A4A1019191).
Copyright information: © The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.