Abstract:
Video anomaly detection is one of the most challenging problems in intelligent video surveillance, owing to the vague definition of an anomaly and the complexity of real-world data. A popular approach is autoencoder (AE) based frame reconstruction, of either the current or a future frame: with a model trained on normal data, the reconstruction error for abnormal scenes is usually much larger than for normal scenes. However, these methods ignore the internal structure of the normal data and are memory-consuming. To address these limitations, a deep auto-encoding Gaussian mixture model (DAGMM) is proposed. First, a deep autoencoder produces a low-dimensional representation of the input video segment together with its reconstruction error, and both are fed into a Gaussian mixture model (GMM). The GMM then estimates the energy of each sample from its probability density, and anomalies are identified from this energy. DAGMM jointly optimizes the parameters of the deep autoencoder and the GMM in an end-to-end manner, balancing auto-encoding reconstruction, density estimation, and regularization of the low-dimensional representation, and it exhibits strong generalization ability. Experimental results on two public benchmark datasets show that DAGMM achieves state-of-the-art performance, with frame-level AUCs of 95.7% and 72.9% on the UCSD Ped2 and ShanghaiTech datasets, respectively.
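To illustrate the density-estimation step described above, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the commonly used GMM energy E(z) = -log Σ_k φ_k N(z; μ_k, Σ_k), where z stands for the low-dimensional representation concatenated with reconstruction-error features. The function names `gmm_parameters` and `sample_energy`, the soft membership matrix `gamma`, and the regularizer `eps` are illustrative assumptions and do not appear in the abstract.

```python
import numpy as np


def gmm_parameters(z, gamma, eps=1e-6):
    """Estimate mixture weights, means and covariances from soft memberships.

    z:     (N, D) array of per-sample features (assumed: latent code plus
           reconstruction-error features).
    gamma: (N, K) array of soft component memberships, each row summing to 1.
    eps:   assumed small diagonal term that keeps covariances invertible.
    """
    n_k = gamma.sum(axis=0)                       # (K,) effective counts
    phi = n_k / z.shape[0]                        # (K,) mixture weights
    mu = (gamma.T @ z) / n_k[:, None]             # (K, D) component means
    diff = z[:, None, :] - mu[None, :, :]         # (N, K, D)
    # Membership-weighted outer products give one covariance per component.
    cov = np.einsum("nk,nkd,nke->kde", gamma, diff, diff) / n_k[:, None, None]
    cov += eps * np.eye(z.shape[1])               # regularize the diagonal
    return phi, mu, cov


def sample_energy(z, phi, mu, cov):
    """Return E(z) = -log sum_k phi_k N(z; mu_k, cov_k) for each row of z."""
    diff = z[:, None, :] - mu[None, :, :]         # (N, K, D)
    cov_inv = np.linalg.inv(cov)                  # (K, D, D)
    # Squared Mahalanobis distance of every sample to every component.
    maha = np.einsum("nkd,kde,nke->nk", diff, cov_inv, diff)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)   # (K,) log|2*pi*cov_k|
    log_prob = np.log(phi)[None, :] - 0.5 * maha - 0.5 * logdet[None, :]
    # Log-sum-exp over components for numerical stability.
    m = log_prob.max(axis=1, keepdims=True)
    return -(m[:, 0] + np.log(np.exp(log_prob - m).sum(axis=1)))
```

Under these assumptions, a sample far from the density learned on normal data receives a higher energy than one near it, so a frame or segment can be flagged as anomalous when its energy exceeds a chosen threshold. The threshold itself is a design choice that the abstract does not specify.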