University of Oulu

W. Peng, X. Hong and G. Zhao, "Video Action Recognition Via Neural Architecture Searching," 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 11-15. doi: 10.1109/ICIP.2019.8802919

Video action recognition via neural architecture searching

Author: Peng, Wei (1); Hong, Xiaopeng (2, 1); Zhao, Guoying (1)
Organizations: (1) Center for Machine Vision and Signal Analysis, University of Oulu, Finland
(2) Xi’an Jiaotong University, Xi’an, P. R. China
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 0.4 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2019
Publish Date: 2019-12-02


Deep neural networks have achieved great success in video analysis and understanding. However, designing a high-performance neural architecture requires substantial effort and expertise. In this paper, we make the first attempt to let an algorithm automatically design neural networks for video action recognition. Specifically, a spatio-temporal network is developed in a differentiable space modeled by a directed acyclic graph, so that a gradient-based strategy can be used to search for an optimal architecture. The search is nonetheless computationally expensive, because evaluating each architecture candidate remains costly. To alleviate this issue, we introduce a temporal segment approach for the video input, which reduces the computational cost without losing global video information. For the architecture, we search an efficient space built from pseudo 3D operators. Experiments show that, under the training-from-scratch protocol on the challenging UCF101 dataset, our architecture outperforms popular neural architectures while, surprisingly, using only around one percent of the parameters of its manually designed counterparts.
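
The three ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name and constant in it is hypothetical. It assumes that the temporal segment approach draws one frame from each of several equal-length segments, that a pseudo 3D operator factorizes a k x k x k convolution into a 1 x k x k spatial convolution followed by a k x 1 x 1 temporal one (biases ignored), and that the gradient-based search relaxes the choice among candidate operations into a softmax-weighted sum. The record itself confirms none of these details.

```python
import math
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Split a video into equal-length segments and draw one frame index from each.

    Assumption: one randomly chosen frame per segment stands in for the whole clip.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = (k * num_frames) // num_segments
        end = ((k + 1) * num_frames) // num_segments  # exclusive bound
        indices.append(rng.randrange(start, end))
    return indices


def conv3d_params(c_in, c_out, k):
    """Weight count of a full k x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * k * k * k


def pseudo3d_params(c_in, c_out, k):
    """Weight count of an assumed pseudo 3D factorization (bias ignored)."""
    spatial = c_in * c_out * k * k   # 1 x k x k convolution over height and width
    temporal = c_out * c_out * k     # k x 1 x 1 convolution over time
    return spatial + temporal


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, ops, alphas):
    """Continuous relaxation of an edge: softmax-weighted sum of candidate ops.

    Toy scalar version; in a real search, x would be a feature tensor and the
    alphas would be learned by gradient descent alongside the network weights.
    """
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))


if __name__ == "__main__":
    print(sample_segment_indices(num_frames=90, num_segments=3))
    print(conv3d_params(64, 64, 3), pseudo3d_params(64, 64, 3))
    print(mixed_op(2.0, [lambda v: v, lambda v: 0.0], [0.0, 0.0]))
```

Under these assumptions, a 64-channel layer with k = 3 needs 110592 weights as a full 3D convolution and 49152 in the factorized form, which suggests why such operators make the search space cheaper to explore.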


Series: IEEE International Conference on Image Processing
ISSN: 1522-4880
ISSN-E: 2381-8549
ISSN-L: 1522-4880
ISBN: 978-1-5386-6249-6
ISBN Print: 978-1-5386-6250-2
Pages: 11 - 15
DOI: 10.1109/ICIP.2019.8802919
Host publication: 26th IEEE International Conference on Image Processing (ICIP), 22-25 Sept. 2019, Taipei, Taiwan
Conference: IEEE International Conference on Image Processing
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Funding: This work was supported by the Academy of Finland ICT 2023 project (Grant No. 313600), Tekes Fidipro program (Grant No. 1849/31/2015) and Business Finland project (Grant No. 3116/31/2017), Infotech Oulu, and the National Natural Science Foundation of China (Grant No. 61772419). The authors also wish to acknowledge CSC-IT Center for Science, Finland, for computational resources.
Academy of Finland Grant Number: 313600
Detailed Information: 313600 (Academy of Finland Funding decision)
Copyright information: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.