The Menpo benchmark for multi-pose 2D and 3D facial landmark localisation and tracking
|Author:||Deng, Jiankang1; Roussos, Anastasios2; Chrysos, Grigorios1;|
1Department of Computing, Imperial College London, London, UK
2Department of Computer Science, University of Exeter, Exeter, UK
3Department of Computer Science, Middlesex University London, London, UK
4Centre for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
|Persistent link:|| http://urn.fi/urn:nbn:fi-fe2019052216539
|Publish Date:|| 2019-05-22
In this article, we present the Menpo 2D and Menpo 3D benchmarks, two new datasets for multi-pose 2D and 3D facial landmark localisation and tracking. In contrast to previous benchmarks such as 300W and 300VW, the proposed benchmarks contain facial images in both semi-frontal and profile poses. We introduce an elaborate semi-automatic methodology for providing high-quality annotations for both the Menpo 2D and Menpo 3D benchmarks. In the Menpo 2D benchmark, different visible landmark configurations are designed for semi-frontal and profile faces, thus making 2D face alignment full-pose. In the Menpo 3D benchmark, a unified landmark configuration is designed for both semi-frontal and profile faces based on the correspondence with a 3D face model, thus making face alignment not only full-pose but also consistent with real-world 3D space. Based on the considerable number of annotated images, we organised the Menpo 2D Challenge and the Menpo 3D Challenge for face alignment under large pose variations, in conjunction with CVPR 2017 and ICCV 2017, respectively. The results of these challenges demonstrate that recent deep learning architectures, when trained with abundant data, lead to excellent results. We also provide a very simple yet effective solution to 2D and 3D face alignment, named the Cascade Multi-view Hourglass Model. In our method, we take advantage of all 2D and 3D facial landmark annotations jointly. We not only capitalise on the correspondences between the semi-frontal and profile 2D facial landmarks but also employ joint supervision from both 2D and 3D facial landmarks. Finally, we discuss future directions on the topic of face alignment.
International Journal of Computer Vision
|Pages:||599 - 624|
|Type of Publication:|| A1 Journal article – refereed
|Field of Science:|| 113 Computer and information sciences
Jiankang Deng was supported by the President’s Scholarship of Imperial College London. This work was partially funded by the EPSRC Project EP/N007743/1 (FACER2VM) and a Google Faculty Fellowship to Dr. Zafeiriou. We thank the NVIDIA Corporation for donating several Titan Xp GPUs used in this work.
© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.