[1] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
[2] Yu X, Hong S, Yu J, et al. Research on a ship target data augmentation method of visible remote sensing image [J]. Chinese Journal of Scientific Instrument, 2020, 41(11): 261-269. (in Chinese)
[3] Ma Y, Tang P, Zhao L, et al. Review of data augmentation for image in deep learning [J]. Journal of Image and Graphics, 2021, 26(3): 487-502. (in Chinese) doi: 10.11834/jig.200089
[4] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
[5] Taylor L, Nitschke G. Improving deep learning with generic data augmentation[C]//2018 IEEE Symposium Series on Computational Intelligence (SSCI), 2018: 1542-1547.
[6] Zhong Z, Zheng L, Kang G, et al. Random erasing data augmentation[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 13001-13008.
[7] Ma D, Tang P, Zhao L. SiftingGAN: Generating and sifting labeled samples to improve the remote sensing image scene classification baseline in vitro [J]. IEEE Geoscience and Remote Sensing Letters, 2019, 16(7): 1046-1050. doi: 10.1109/LGRS.2018.2890413
[8] Gulrajani I, Ahmed F, Arjovsky M, et al. Improved training of Wasserstein GANs[EB/OL]. (2017-12-25) [2022-12-06]. https://arxiv.org/abs/1704.00028.
[9] Zheng Z, Zheng L, Yang Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 3754-3762.
[10] Zhong Z, Zheng L, Zheng Z, et al. Camera style adaptation for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 5157-5166.
[11] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2016: 21-37.
[12] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
[13] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580-587.
[14] Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440-1448.
[15] Ju M, Luo J, Liu G, et al. ISTDet: An efficient end-to-end neural network for infrared small target detection [J]. Infrared Physics & Technology, 2021, 114: 103659. doi: 10.1016/J.INFRARED.2021.103659
[16] Yao S, Zhu Q, Zhang T, et al. Infrared image small-target detection based on improved FCOS and spatio-temporal features [J]. Electronics, 2022, 11(6): 933. doi: 10.3390/electronics11060933
[17] Lu X F, Bai X F, Li S X, et al. Infrared small target detection method based on the improved weighted enhanced local contrast measurement [J]. Infrared and Laser Engineering, 2022, 51(8): 20210914. (in Chinese) doi: 10.3788/IRLA20210914
[18] Jiang R Q, Peng Y P, Xie W X, et al. Improved YOLOv4 small target detection algorithm with embedded scSE module [J]. Journal of Graphics, 2021, 42(4): 546-555. (in Chinese) doi: 10.11996/JG.j.2095-302X.2021040546
[19] Owens A, Wu J, McDermott J H, et al. Ambient sound provides supervision for visual learning[C]//European Conference on Computer Vision. Springer, Cham, 2016: 801-816.
[20] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks [J]. Communications of the ACM, 2020, 63(11): 139-144. doi: 10.1145/3422622
[21] Hou Q, Zhou D, Feng J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 13713-13722.
[22] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 7132-7141.
[23] Woo S, Park J, Lee J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2018: 3-19.
[24] Li K, Wan G, Cheng G, et al. Object detection in optical remote sensing images: A survey and a new benchmark [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307. doi: 10.1016/j.isprsjprs.2019.11.023
[25] Xia G S, Bai X, Ding J, et al. DOTA: A large-scale dataset for object detection in aerial images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 3974-3983.
[26] Chen H, Qi Z, Shi Z. Remote sensing image change detection with transformers [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-14. doi: 10.1109/TGRS.2021.3095166