Neural Attention Mechanism and Linear Squeezing of Descriptors in Image Classification for Visual Recommender Systems
In this paper, we analyze effective methods for multi-label classification of image sets in the development of visual recommender systems. We propose a two-step algorithm. In the first step, a convolutional neural network is fine-tuned to extract visual features. In the second step, the feature vectors obtained for each image in the input set are combined into a single descriptor using modifications of a neural aggregation module based on linear squeezing of the feature space and an attention mechanism. We perform an experimental study on the Amazon Product Data dataset, solving the problem of classifying customer interests from photos of the products they have purchased. We show that one of the highest F1-scores is achieved by a one-level attention block with squeezing of the feature vectors.
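As a rough illustration of the second step, the following minimal sketch shows one way a one-level attention block with linear squeezing could aggregate a set of image feature vectors into a single descriptor. It is not the authors' implementation: the function name `aggregate_with_squeezing`, the squeezing matrix `W`, the attention query `q`, and all dimensions are assumptions made for this example, and random values stand in for CNN features and learned parameters.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def aggregate_with_squeezing(features, W, q):
    """Aggregate a set of feature vectors into one descriptor.

    features: (n_images, d) array of per-image feature vectors
    W:        (d, k) hypothetical squeezing matrix, k < d
    q:        (k,) hypothetical attention query vector
    Returns the (k,) descriptor and the (n_images,) attention weights.
    """
    squeezed = features @ W           # linear squeezing: (n_images, k)
    scores = squeezed @ q             # one relevance score per image
    weights = softmax(scores)         # attention weights, sum to 1
    descriptor = weights @ squeezed   # weighted sum of squeezed features
    return descriptor, weights


# Random stand-ins for CNN features and learned parameters (assumed sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 16))   # 5 images, 16-dimensional features
W = rng.normal(size=(16, 4)) * 0.1    # squeeze 16 -> 4 dimensions
q = rng.normal(size=4)

descriptor, weights = aggregate_with_squeezing(features, W, q)
```

In this sketch the descriptor has the squeezed dimensionality regardless of how many images are in the set, so a downstream multi-label classifier can take input sets of varying size. With a zero query vector the attention weights become uniform and the descriptor reduces to the mean of the squeezed features.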