IEEE International Conference on Image Processing (ICIP)
17-20 September 2017
CNCC, Beijing, China
Recently, image representations based on convolutional neural networks (CNNs) have become more popular than SIFT-based features such as the Fisher vector (FV). However, it is not yet clear which of the two works better for image retrieval. In this paper, we propose fusing CNN and FV features to combine the advantages of both for image retrieval. We extract CNN features and FVs from multi-scale regions, which makes the representation more robust to image noise. We then propose a query-adaptive feature fusion method, which is used jointly with a 2-D inverted index under the bag-of-words framework. Moreover, we evaluate different CNN feature extraction methods for the region-based approach. Extensive experiments on four benchmark datasets demonstrate the effectiveness of our method, which is also efficient in both time cost and memory usage.
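To illustrate the general idea of a 2-D inverted index in a bag-of-words setting, the following is a minimal sketch and not the implementation used in the paper. It assumes that each image region is quantized to one visual word per feature type (one from a CNN codebook, one from an FV codebook) and that the pair of word IDs addresses a cell of the index. The class name `TwoDInvertedIndex`, the simple vote counting, and the toy word IDs are all hypothetical; the abstract does not specify the quantization or scoring details, and the query-adaptive fusion weights are omitted here.

```python
from collections import defaultdict


class TwoDInvertedIndex:
    """Toy 2-D inverted index keyed by a pair of visual-word IDs.

    Assumption: the first ID comes from a CNN codebook and the second
    from an FV codebook. This pairing is illustrative only.
    """

    def __init__(self):
        # Maps (cnn_word, fv_word) -> list of image IDs with a region in that cell.
        self.cells = defaultdict(list)

    def add(self, image_id, cnn_word, fv_word):
        """Index one region of an image under its pair of visual words."""
        self.cells[(cnn_word, fv_word)].append(image_id)

    def query(self, region_words):
        """Rank images by the number of query regions falling in the same cell.

        region_words is an iterable of (cnn_word, fv_word) pairs.
        Returns (image_id, votes) tuples, highest vote count first,
        with ties broken by image ID.
        """
        votes = defaultdict(int)
        for cnn_word, fv_word in region_words:
            # .get avoids creating empty cells for unseen word pairs.
            for image_id in self.cells.get((cnn_word, fv_word), []):
                votes[image_id] += 1
        return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy database: three images with hypothetical word assignments.
index = TwoDInvertedIndex()
index.add("img_a", 3, 7)
index.add("img_a", 5, 2)
index.add("img_b", 3, 7)
index.add("img_c", 3, 1)

# A query with two regions; only images sharing both words of a pair match.
ranking = index.query([(3, 7), (5, 2)])
```

In this sketch, `img_c` shares the first word (3) with a query region but not the second, so it receives no vote. Requiring agreement on both words is what makes a 2-D cell more selective than a single-codebook inverted list.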