ACapMed: Automatic Captioning for Medical Imaging

Beddiar, Djamila Romaissa; Oussalah, Mourad; Seppänen, Tapio; Jennane, Rachid (2022-11-01)

URL: https://doi.org/10.3390/app122111092

Multidisciplinary Digital Publishing Institute

2022-11-01

Beddiar, D. R., Oussalah, M., Seppänen, T., & Jennane, R. (2022). ACapMed: Automatic Captioning for Medical Imaging. Applied Sciences, 12(21), 11092. https://doi.org/10.3390/app122111092

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2022112266501

Abstract

Medical image captioning is a very challenging task that has rarely been addressed in the literature on natural image captioning. Some existing image captioning techniques exploit objects present in the image alongside the visual features while generating descriptions. However, this is not possible for medical image captioning, where descriptions of image content are expected to follow clinician-like explanations. Motivated by this, the paper proposes using medical concepts associated with images, together with their visual features, to generate new captions. Our end-to-end trainable network is composed of a semantic feature encoder based on a multi-label classifier to identify medical concepts related to images, a visual feature encoder, and an LSTM model for text generation. Beam search is employed to ensure the best selection of the next word for a given sequence of words based on the merged features of the medical image. We evaluated our proposal on the ImageCLEF medical captioning dataset, and the results demonstrate the effectiveness and efficiency of the developed approach.
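
The abstract mentions beam search for choosing the next word of a caption. The following is a minimal, generic sketch of that decoding strategy and not the authors' implementation. The function `beam_search`, the table `TOY_MODEL`, and the helper `toy_next_log_probs` are hypothetical names introduced for illustration; in the paper's setting, the next-word scores would presumably come from the LSTM conditioned on the merged semantic and visual features, which is replaced here by a fixed bigram table.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (token sequence, cumulative log-probability)


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    beam_width: int = 3,
    max_len: int = 10,
    start: str = "<start>",
    end: str = "<end>",
) -> List[str]:
    """Return the highest-scoring caption found with a beam of the given width."""
    beams: List[Hypothesis] = [([start], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next word.
        candidates: List[Hypothesis] = []
        for seq, score in beams:
            for word, log_p in next_log_probs(seq).items():
                candidates.append((seq + [word], score + log_p))
        # Keep only the beam_width best expansions (no length normalization here).
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    pool = finished or beams
    best_seq, _ = max(pool, key=lambda c: c[1])
    return [w for w in best_seq if w not in (start, end)]


# Hypothetical stand-in for a trained decoder: a fixed bigram probability table.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<start>": {"chest": 0.6, "ct": 0.4},
    "chest": {"xray": 0.5, "scan": 0.5},
    "ct": {"scan": 1.0},
    "xray": {"<end>": 1.0},
    "scan": {"<end>": 1.0},
}


def toy_next_log_probs(seq: List[str]) -> Dict[str, float]:
    """Score next words from the last token only (assumption made for the toy)."""
    return {w: math.log(p) for w, p in TOY_MODEL[seq[-1]].items()}


greedy_caption = beam_search(toy_next_log_probs, beam_width=1)
beam_caption = beam_search(toy_next_log_probs, beam_width=2)
```

With a beam width of 1 the search reduces to greedy decoding and commits early to the locally most probable first word, whereas a wider beam keeps the alternative alive long enough to find the caption with the higher overall probability in this toy table.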

Unless otherwise noted, the license for this material is https://creativecommons.org/licenses/by/4.0/