University of Oulu

X. Zhao, Y. Lin, L. Liu, J. Heikkilä and W. Zheng, "Dynamic Texture Classification Using Unsupervised 3D Filter Learning and Local Binary Encoding," in IEEE Transactions on Multimedia, vol. 21, no. 7, pp. 1694-1708, July 2019. doi: 10.1109/TMM.2018.2890362

Dynamic texture classification using unsupervised 3D filter learning and local binary encoding

Author: Zhao, Xiaochao1,2; Lin, Yaping1; Liu, Li2; Heikkilä, Janne2; Zheng, Wenming3
Organizations: 1College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, P. R. China
2Center for Machine Vision and Signal Analysis, University of Oulu, P.O. Box 4500, FI-90014 Oulu, Finland
3Key Laboratory of Child Development and Learning Science of Ministry of Education, School of Biological Sciences and Medical Engineering, Southeast University, Nanjing 210096, P. R. China
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 2.2 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2019121046450
Language: English
Published: Institute of Electrical and Electronics Engineers, 2019
Publish Date: 2019-12-10
Abstract

Local binary descriptors, such as local binary pattern (LBP) and its various variants, have been studied extensively in texture and dynamic texture analysis due to their outstanding characteristics, such as grayscale invariance, low computational complexity and good discriminability. Most existing local binary feature extraction methods extract spatio-temporal features from three orthogonal planes of a spatio-temporal volume by viewing a dynamic texture in 3D space. For a given pixel in a video, only a proportion of its surrounding pixels is incorporated in the local binary feature extraction process. We argue that the ignored pixels contain discriminative information that should be explored. To fully utilize the information conveyed by all the pixels in a local neighborhood, we propose extracting local binary features from the spatio-temporal domain with 3D filters that are learned in an unsupervised manner so that the discriminative features along both the spatial and temporal dimensions are captured simultaneously. The proposed approach consists of three components: 1) 3D filtering; 2) binary hashing; and 3) joint histogramming. Densely sampled 3D blocks of a dynamic texture are first normalized to have zero mean and are then filtered by 3D filters that are learned in advance. To preserve more of the structure information, the filter response vectors are decomposed into two complementary components, namely, the signs and the magnitudes, which are further encoded separately into binary codes. The local mean pixels of the 3D blocks are also converted into binary codes. Finally, three types of binary codes are combined via joint or hybrid histograms for the final feature representation. Extensive experiments are conducted on three commonly used dynamic texture databases: 1) UCLA; 2) DynTex; and 3) YUVL. The proposed method provides comparable results to, and even outperforms, many state-of-the-art methods.
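The three-component pipeline described above (3D filtering, binary hashing, joint histogramming) can be sketched in a few lines of NumPy. This is a toy illustration only: the paper learns its 3D filters in an unsupervised manner, whereas this sketch substitutes random filters, and the hashing thresholds (zero for signs, column means for magnitudes) are simplifying assumptions, not the authors' exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_blocks(video, b=3):
    # Densely sample overlapping b x b x b blocks (stride 1) from a T x H x W volume.
    T, H, W = video.shape
    blocks = [video[t:t + b, y:y + b, x:x + b].ravel()
              for t in range(T - b + 1)
              for y in range(H - b + 1)
              for x in range(W - b + 1)]
    return np.array(blocks)

def encode(video, filters, b=3):
    blocks = extract_blocks(video, b)
    means = blocks.mean(axis=1, keepdims=True)
    centered = blocks - means                  # zero-mean normalization of each block
    responses = centered @ filters.T           # 3D filtering as a matrix product
    powers = 1 << np.arange(filters.shape[0])  # bit weights for binary hashing
    # Sign component: threshold filter responses at zero.
    sign_codes = (responses > 0) @ powers
    # Magnitude component: threshold |responses| at their per-filter mean (an assumption).
    mags = np.abs(responses)
    mag_codes = (mags > mags.mean(axis=0)) @ powers
    # Local mean pixels, binarized against the global mean (an assumption).
    mean_codes = (means.ravel() > video.mean()).astype(np.int64)
    return sign_codes, mag_codes, mean_codes

def joint_histogram(sign_codes, mag_codes, n_bits):
    # Joint histogram over (sign, magnitude) code pairs, L1-normalized.
    joint = sign_codes * (1 << n_bits) + mag_codes
    hist = np.bincount(joint, minlength=1 << (2 * n_bits)).astype(float)
    return hist / hist.sum()

# Toy usage: 8 random 3x3x3 filters stand in for the learned ones.
n_filters, b = 8, 3
filters = rng.standard_normal((n_filters, b ** 3))
video = rng.random((10, 20, 20))
s_codes, m_codes, mn_codes = encode(video, filters, b)
hist = joint_histogram(s_codes, m_codes, n_filters)
```

With 8 filters each component yields an 8-bit code per block, so the joint sign/magnitude histogram has 2^16 bins; the paper combines the three code types via joint or hybrid histograms to form the final representation.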


Series: IEEE transactions on multimedia
ISSN: 1520-9210
ISSN-E: 1941-0077
ISSN-L: 1520-9210
Volume: 21
Issue: 7
Pages: 1694 - 1708
DOI: 10.1109/TMM.2018.2890362
OADOI: https://oadoi.org/10.1109/TMM.2018.2890362
Type of Publication: A1 Journal article – refereed
Field of Science: 213 Electronic, automation and communications engineering, electronics
Copyright information: © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.