BYOL-based self-supervised learning for hyperspectral image classification

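  • Summary: Hyperspectral imaging captures three-dimensional data that combine spatial imagery with contiguous spectral bands. This rich spatial-spectral information makes it possible to distinguish different materials, so hyperspectral images are widely used in remote sensing surveys. In practice, however, annotating hyperspectral images costs considerable labor, money, and time, so few labeled samples are available and it is hard to train a model that classifies accurately. Classifying hyperspectral images from only a small number of labeled samples therefore remains a challenge. In recent years, self-supervised learning (SSL) has become an effective way to reduce the dependence of hyperspectral image classification on expensive annotation. By learning latent features shared across different views of the same image, SSL methods have achieved high accuracy in natural image classification. To explore the potential of SSL for hyperspectral image classification, a self-supervised classification method under the Bootstrap Your Own Latent (BYOL) framework, called BSSL, is proposed. Because it builds on BYOL, the method needs no negative sample pairs: it trains the network and fine-tunes its parameters on same-class sample pairs that are similar in both the spatial and spectral domains, which yields more discriminative features. The method has four parts: BYOL pre-training, superpixel clustering, BYOL retraining on "similar pairs", and final classification. To verify its effectiveness, the method was tested on three public datasets and compared with five advanced unsupervised and self-supervised classification methods: SuperPCA, S3PCA, ContrastNet, SSCL, and N2SSL. On the Indian Pines and Salinas datasets, BSSL achieved the best overall accuracy (OA), average accuracy (AA), Kappa coefficient, recall, and F1-score. On Indian Pines, its OA exceeded that of SuperPCA, S3PCA, ContrastNet, SSCL, and N2SSL by 1.32%, 1.05%, 5.68%, 3.12%, and 1.27%, respectively. On the University of Pavia dataset BSSL was less outstanding, but it still gave the best overall classification performance. These results suggest that BSSL is better suited to scenes in which ground-object regions are large and spatially concentrated, because such scenes are more favorable to superpixel clustering.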

     

    Abstract:
    Objective Hyperspectral imaging acquires contiguous spectral bands together with spatial imagery as a three-dimensional data cube. The cube is rich in spatial and spectral information and can distinguish different types of materials, so hyperspectral images are widely used in remote sensing surveys. Although the rapid development of deep learning has brought great progress to hyperspectral image classification, difficulties remain. Annotating hyperspectral images requires significant manpower, money, and time, and the number of available labeled samples is limited, which makes it difficult to train an accurate classifier. Classifying hyperspectral images with only a small number of labeled samples is therefore a challenge, and studying classification in few-sample scenarios is of great practical significance for promoting the application of hyperspectral technology.
    Methods In recent years, self-supervised learning (SSL) has emerged as an effective way to reduce the reliance of hyperspectral image classification on costly data annotation. SSL methods have achieved high accuracy in natural image classification by learning latent features shared across different views of the same image. To explore the potential of SSL in hyperspectral image classification, a self-supervised classification method under the Bootstrap Your Own Latent (BYOL) framework, referred to as BSSL, is proposed. Because it builds on BYOL, the method needs no negative sample pairs: it trains the network and fine-tunes its parameters on same-class sample pairs that are spatially and spectrally similar, which yields more discriminative features. The method has four parts: BYOL pre-training, superpixel clustering, BYOL retraining based on similar pairs, and final classification. In the BYOL model, the encoder is a spectral-spatial transformer network that extracts joint spatial and spectral features. Superpixel clustering uses a global measurement method based on binary edge maps, which clusters more accurately in edge areas. On the basis of this spatial clustering, spectral similarity is computed with the spectral angle distance, yielding a set of similar pairs that is used to retrain BYOL and fine-tune the network parameters. Finally, classification is performed with a classical support vector machine (SVM) classifier.
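The similar-pair selection described above can be illustrated with a minimal sketch. It assumes that each pixel has already been assigned a superpixel label and that two pixels count as a similar pair when they share a superpixel and their spectral angle falls below a threshold. The function names, the `max_angle` value, and the pairing rule are illustrative assumptions, not the authors' implementation.

```python
import math
from itertools import combinations


def spectral_angle(a, b):
    """Spectral angle (in radians) between two spectra of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] so floating-point overshoot cannot break acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_theta)


def similar_pairs(spectra, superpixel_ids, max_angle=0.05):
    """Index pairs (i, j), i < j, that share a superpixel and are spectrally close.

    max_angle is a hypothetical threshold in radians, chosen for illustration.
    """
    groups = {}
    for idx, sp in enumerate(superpixel_ids):
        groups.setdefault(sp, []).append(idx)
    pairs = []
    for members in groups.values():
        for i, j in combinations(members, 2):
            if spectral_angle(spectra[i], spectra[j]) <= max_angle:
                pairs.append((i, j))
    return pairs


# Toy example: pixels 0 and 1 have proportional spectra and share superpixel 0.
spectra = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 1.0, 0.5], [1.0, 2.0, 3.1]]
superpixel_ids = [0, 0, 0, 1]
print(similar_pairs(spectra, superpixel_ids))
```

Because the spectral angle ignores vector magnitude, two pixels whose spectra differ only by a brightness factor are treated as identical. Pixel 3 is spectrally close to pixel 0 but lies in a different superpixel, so this sketch does not pair them.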
    Results and Discussions To verify the effectiveness of the proposed method, it was tested on three public datasets and compared with five advanced unsupervised and self-supervised classification methods: SuperPCA, S3PCA, ContrastNet, SSCL, and N2SSL. On the Indian Pines and Salinas datasets, BSSL achieved the best overall accuracy (OA), average accuracy (AA), Kappa coefficient, recall, and F1-score (Tab.1, Tab.3). On the Indian Pines dataset, its OA exceeded that of SuperPCA, S3PCA, ContrastNet, SSCL, and N2SSL by 1.32%, 1.05%, 5.68%, 3.12%, and 1.27%, respectively. On the University of Pavia dataset, BSSL was less outstanding but still showed the best overall classification performance (Tab.2). The weaker showing arises because, although the University of Pavia dataset has a considerable number of samples in each category, the samples are quite scattered and some ground-object regions are elongated, which is unfavorable to superpixel segmentation.
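For readers unfamiliar with the reported metrics, the following sketch shows how OA, AA, and the Kappa coefficient are conventionally derived from a confusion matrix. The function name and the toy matrix are illustrative assumptions and are unrelated to the paper's data. The sketch assumes that every class has at least one true sample.

```python
def classification_scores(confusion):
    """Return (OA, AA, Kappa) from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    diag = [confusion[k][k] for k in range(n_classes)]
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]

    # Overall accuracy: fraction of all samples classified correctly.
    oa = sum(diag) / total
    # Average accuracy: mean of the per-class recalls.
    aa = sum(diag[k] / row_sums[k] for k in range(n_classes)) / n_classes
    # Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(row_sums[k] * col_sums[k] for k in range(n_classes)) / (total * total)
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa


# Toy two-class example (made-up counts).
oa, aa, kappa = classification_scores([[8, 2], [1, 9]])
print(round(oa, 4), round(aa, 4), round(kappa, 4))
```

OA weights every sample equally, so it is dominated by large classes, whereas AA weights every class equally. This difference matters for datasets with strongly unbalanced class sizes.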
    Conclusions A BYOL-based self-supervised learning method for hyperspectral image classification (BSSL) is proposed. Building on the self-supervised feature learning framework BYOL, the method trains and fine-tunes the network on intra-class sample pairs that are spatially and spectrally similar, thereby extracting more discriminative features. The experimental results show that BSSL achieves superior classification performance on all three datasets. They also indicate that the method is better suited to scenes in which ground-object regions are relatively large and spatially concentrated, because such scenes are more favorable to superpixel clustering.
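As background on why BYOL needs no negative pairs, the sketch below shows the two ingredients of the standard BYOL formulation: a regression loss between the normalized online prediction and the normalized target projection, and an exponential-moving-average update of the target network. The function names, the plain-list representation of vectors and parameters, and the `tau` value are illustrative assumptions. How BSSL feeds its similar pairs into this objective is not specified in the abstract and is not modeled here.

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in v]


def byol_loss(online_prediction, target_projection):
    """BYOL regression loss: squared distance between unit vectors.

    Equals 2 - 2 * cosine similarity, so it ranges from 0 to 4.
    """
    q = _normalize(online_prediction)
    z = _normalize(target_projection)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(q, z))


def ema_update(target_params, online_params, tau=0.99):
    """Move target parameters toward online parameters (tau is illustrative)."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]


# Aligned embeddings give zero loss; orthogonal embeddings give a loss of 2.
print(byol_loss([1.0, 0.0], [2.0, 0.0]))
print(byol_loss([1.0, 0.0], [0.0, 3.0]))
print(ema_update([1.0, 1.0], [0.0, 2.0], tau=0.9))
```

In the original BYOL formulation the loss is symmetrized by swapping the two views, and gradients flow only through the online network. The target network changes solely through the moving-average update.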

     
