University of Oulu

Z. Su et al., "Pixel Difference Networks for Efficient Edge Detection," 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 2021, pp. 5097-5107, doi: 10.1109/ICCV48922.2021.00507.

Pixel difference networks for efficient edge detection

Author: Su, Zhuo1; Liu, Wenzhe2; Yu, Zitong1; et al.
Organizations: 1Center for Machine Vision and Signal Analysis, University of Oulu, Finland
2National University of Defense Technology, China
3Harbin Institute of Technology (Shenzhen), China
4Xidian University, China
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 2.4 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2023032333036
Language: English
Published: IEEE Computer Society, 2021
Publish Date: 2023-03-23

Abstract

Deep Convolutional Neural Networks (CNNs) can achieve human-level performance in edge detection thanks to their rich and abstract edge representation capacities. However, the high performance of CNN-based edge detection is achieved with a large pretrained CNN backbone, which is memory- and energy-consuming. In addition, it is surprising that the previous wisdom of traditional edge detectors, such as Canny, Sobel, and LBP, is rarely investigated in the rapidly developing deep learning era. To address these issues, we propose a simple, lightweight yet effective architecture named Pixel Difference Network (PiDiNet) for efficient edge detection. PiDiNet adopts novel pixel difference convolutions that integrate the traditional edge detection operators into the popular convolutional operations of modern CNNs for enhanced performance on the task, enjoying the best of both worlds. Extensive experiments on BSDS500, NYUD, and Multicue demonstrate its effectiveness and its high training and inference efficiency. Surprisingly, when trained from scratch with only the BSDS500 and VOC datasets, PiDiNet surpasses the recorded result of human perception (0.807 vs. 0.803 in ODS F-measure) on the BSDS500 dataset at 100 FPS with fewer than 1M parameters. A faster version of PiDiNet with fewer than 0.1M parameters still achieves performance comparable to the state of the art at 200 FPS. Results on the NYUD and Multicue datasets show similar observations. The code is available at https://github.com/zhuoinoulu/pidinet.
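The key operation, pixel difference convolution, replaces the raw intensities fed to an ordinary convolution with differences against a reference pixel, so that gradient cues of the kind hard-coded in Sobel or LBP are computed by learnable kernels. As a reading aid only (not the authors' code, which lives in the linked repository and also covers angular and radial variants), a minimal PyTorch sketch of the central variant might look as follows; the class name CentralPDC and all hyperparameters are illustrative assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

class CentralPDC(nn.Module):
    """Central pixel difference convolution (illustrative sketch).

    y(p0) = sum_i w_i * (x(p_i) - x(p0))
          = Conv(x, w)(p0) - x(p0) * sum_i w_i,
    i.e. one ordinary convolution minus a 1x1 convolution whose
    kernel is the spatial sum of the learned weights.
    """

    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.weight = nn.Parameter(
            0.1 * torch.randn(out_ch, in_ch, kernel_size, kernel_size))
        self.padding = padding

    def forward(self, x):
        # Vanilla convolution term: sum_i w_i * x(p_i)
        y = F.conv2d(x, self.weight, padding=self.padding)
        # Center term x(p0) * sum_i w_i, realized as a 1x1 convolution
        w_sum = self.weight.sum(dim=(2, 3), keepdim=True)
        return y - F.conv2d(x, w_sum)

if __name__ == "__main__":
    x = torch.randn(1, 3, 64, 64)
    edges = CentralPDC(3, 8)(x)
    print(edges.shape)  # torch.Size([1, 8, 64, 64])

Expanding the definition shows why this stays cheap: since sum_i w_i (x(p_i) - x(p0)) = Conv(x, w)(p0) - x(p0) * sum_i w_i, the pairwise differences never need to be materialized, and the cost is one standard convolution plus a 1x1 convolution.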


Series: IEEE International Conference on Computer Vision
ISSN: 1550-5499
ISSN-E: 2380-7504
ISSN-L: 1550-5499
ISBN: 978-1-6654-2812-5
ISBN Print: 978-1-6654-2813-2
Pages: 5097 - 5107
DOI: 10.1109/ICCV48922.2021.00507
OADOI: https://oadoi.org/10.1109/iccv48922.2021.00507
Host publication: 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Conference: IEEE International Conference on Computer Vision
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Subjects:
Funding: This work was partially supported by the Academy of Finland under Grant 331883 and the National Natural Science Foundation of China under Grants 61872379, 62022091, and 71701205.
Academy of Finland Grant Number: 331883
Detailed Information: 331883 (Academy of Finland Funding decision)
Copyright information: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.