University of Oulu

W. Chen et al., "Lifelong Fine-grained Image Retrieval," in IEEE Transactions on Multimedia, 2022, doi: 10.1109/TMM.2022.3222934

Lifelong fine-grained image retrieval

Author: Chen, Wei1; Xu, Haoyang2; Pu, Nan3;
Organizations: 1Academy of Advanced Technology Research of Hunan, Changsha, China
2College of Communication Engineering, Xidian University, China
3Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands
4DUT-RU International School of Information Science and Engineering, Dalian University of Technology, China
5Center for Machine Vision and Signal Analysis, University of Oulu, Finland
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 10.6 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2022
Publish Date: 2023-10-04


Fine-grained image retrieval has been extensively explored in a zero-shot manner: a deep model is trained on the seen categories and then evaluated for its generalization performance on the unseen ones. However, this setting is infeasible for many real-world applications, since (1) the retrieval dataset is rarely fixed, with new data added constantly, and (2) samples from the seen categories are also common in practice and are important for evaluation. In this paper, we explore lifelong fine-grained image retrieval (LFGIR), which learns continuously on a sequence of new tasks with data from different datasets. We first use knowledge distillation to minimize catastrophic forgetting on old tasks. Training continuously on different datasets causes large domain shifts between the old and new tasks, while image retrieval is sensitive to even small shifts in the features; this tends to weaken the effectiveness of knowledge distillation from the frozen teacher. To mitigate the impact of domain shifts, we use a network inversion method to generate images of the old tasks. In addition, we design an on-the-fly teacher that transfers knowledge captured on a new task to the student, improving generalization and thereby achieving a better balance between old and new tasks. We name the whole framework Dual Knowledge Distillation (DKD), whose efficacy is demonstrated by extensive experimental results on sequential tasks spanning 7 datasets.
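The record does not give the paper's exact loss formulation, but the training objective described above can be sketched as a new-task retrieval loss plus two embedding-level distillation terms: one against the frozen old-task teacher (on images recovered by network inversion) and one against the on-the-fly new-task teacher. The L2 distance on normalised embeddings and the weights `lam_old`/`lam_new` below are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """L2-normalise embeddings along the last axis (standard for retrieval)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def distill_loss(teacher_emb, student_emb):
    """Mean squared distance between normalised teacher and student embeddings."""
    t, s = l2_normalize(teacher_emb), l2_normalize(student_emb)
    return float(np.mean(np.sum((t - s) ** 2, axis=-1)))

def dual_kd_loss(retrieval_loss,
                 frozen_teacher_emb, student_old_emb,    # on inverted old-task images
                 onfly_teacher_emb, student_new_emb,     # on new-task images
                 lam_old=1.0, lam_new=1.0):
    """Hypothetical DKD objective: new-task loss + two distillation terms."""
    return (retrieval_loss
            + lam_old * distill_loss(frozen_teacher_emb, student_old_emb)
            + lam_new * distill_loss(onfly_teacher_emb, student_new_emb))
```

When the student matches both teachers exactly, the distillation terms vanish and only the retrieval loss remains; the two weights then trade off stability on old tasks against plasticity on the new one.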


Series: IEEE transactions on multimedia
ISSN: 1520-9210
ISSN-E: 1941-0077
ISSN-L: 1520-9210
Issue: Online first
DOI: 10.1109/TMM.2022.3222934
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work was supported by LIACS MediaLab at Leiden University, China Scholarship Council (CSC No. 201703170183), National Key Research and Development Program of China No. 2021YFB3100800, the Academy of Finland under grant 331883, Infotech Project FRAGES, National Natural Science Foundation of China under Grant 62102061, and the Fundamental Research Funds for the Central Universities under Grant DUT21RC(3)024. We would like to thank NVIDIA for the donation of GPU cards.
Academy of Finland Grant Number: 331883
Detailed Information: 331883 (Academy of Finland Funding decision)
Copyright information: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.