Research on image interpretation based on deep learning
Abstract
Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have developed rapidly in image classification, computer vision, natural language processing, speech recognition, machine translation, and semantic analysis, drawing researchers' attention to the automatic generation of image descriptions by computers. The main problems in current image description work are the sparseness of the input text data, over-fitting of the model, and the difficulty of making the model's loss function converge. In this paper, NIC is used as the baseline model. To address data sparseness, the one-hot text representation of the baseline model is replaced and word2vec is used to map the text. To prevent over-fitting, regularization terms are added to the model and Dropout is applied. To improve word-order memory, the gated recurrent unit (GRU) is adopted for text generation. In the experiments, the AdamOptimizer is used to update the parameters iteratively. The experimental results show that the improved model has fewer parameters and converges significantly faster; the loss curves are smoother, the maximum loss is reduced to 2.91, and the accuracy increases by nearly 15% compared with NIC. The experiments validate that mapping text with word2vec in the model clearly alleviates the data-sparseness problem, that adding regularization terms and using Dropout effectively prevent over-fitting, and that introducing the GRU greatly reduces the number of trained parameters, speeds up convergence, and improves the accuracy of the entire model.
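The sketch below is a minimal illustration (not the authors' code) of the kind of model the abstract describes: CNN image features condition a GRU decoder, caption words enter through dense word2vec-style embeddings rather than one-hot vectors, Dropout and L2 regularization limit over-fitting, and Adam performs the parameter updates. All layer sizes, names, and hyperparameters here are assumptions for illustration.

```python
# Minimal sketch of a NIC-style captioner with the modifications described above:
# word2vec-style embeddings, GRU decoder, Dropout, L2 regularization, Adam.
import tensorflow as tf
from tensorflow.keras import layers, regularizers, Model

VOCAB_SIZE = 10000   # assumed vocabulary size
EMBED_DIM = 300      # typical word2vec dimensionality
UNITS = 512          # assumed GRU hidden size
FEAT_DIM = 2048      # assumed CNN feature size (e.g. Inception/ResNet pooling layer)

# Inputs: a precomputed CNN image feature vector and a partial caption (word indices).
img_feat = layers.Input(shape=(FEAT_DIM,), name="image_feature")
caption = layers.Input(shape=(None,), dtype="int32", name="caption_tokens")

# Dense word embeddings; in practice the weights would be initialized from a
# pretrained word2vec matrix instead of being learned from scratch.
emb = layers.Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True)(caption)
emb = layers.Dropout(0.5)(emb)

# Project the image feature to the GRU state size and use it as the initial state,
# so the decoder is conditioned on the image (NIC-style conditioning).
init_state = layers.Dense(UNITS, activation="tanh",
                          kernel_regularizer=regularizers.l2(1e-4))(img_feat)

# GRU decoder in place of the LSTM used by the NIC baseline.
seq = layers.GRU(UNITS, return_sequences=True,
                 kernel_regularizer=regularizers.l2(1e-4))(emb, initial_state=init_state)
seq = layers.Dropout(0.5)(seq)

# Predict the next word at every time step.
logits = layers.Dense(VOCAB_SIZE)(seq)

model = Model([img_feat, caption], logits)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
model.summary()
```

Training would pair each image feature with a caption shifted by one token as the target, which is the standard teacher-forcing setup for this kind of decoder.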