-
Through resolution decomposition, transform-domain image fusion methods obtain a series of sub-images at different levels, preserving more image detail. The multi-scale transform is the most widely used transform-domain fusion approach; it proceeds in three steps: multi-scale forward transform (decomposition), design and application of the fusion rules, and multi-scale inverse transform (reconstruction). The overall fusion framework is shown in Fig. 1.
-
Pyramid transforms decompose an image into pyramid-shaped sub-band images at different scales for fusion. The Laplacian pyramid (LP) transform [11] was the earliest pyramid-based fusion method; building on its success, fusion methods based on the ratio-of-low-pass pyramid [12], contrast pyramid (CP) [13], morphological pyramid [14], and steerable pyramid [15] were proposed in turn. Although the series of difference images the LP transform derives from the Gaussian pyramid highlights the detail features of the high-frequency sub-bands, it suffers from low image contrast and information redundancy. False-color fusion based on a color reference image [16] and fuzzy logic with its superior edge representation [9] can remedy these shortcomings and enhance the fusion result. In addition, the CP transform, which models the perceptual characteristics of the human visual system, compensates for the LP transform's low contrast; however, it lacks directional selectivity, which can be addressed by combining it with directional filter banks [17]. The CP transform can also be improved through the fusion rules, for example a contrast-pyramid algorithm based on improved region energy [18].
Compared with spatial-domain fusion methods, pyramid transforms, the first multi-scale transforms to be developed, markedly improve the preservation of image detail. However, pyramid transforms are redundant: the strong correlation between layers means that regions where the source images differ greatly tend to produce blocking artifacts after fusion, reducing robustness. Pyramid transforms also suffer from severe loss of source-image structural information and a low signal-to-noise ratio.
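The LP decomposition and a basic pyramid fusion rule can be sketched as follows. The 5-tap binomial kernel, three decomposition levels, and the abs-max/average rules are illustrative choices for the sketch, not the parameters of any cited method.

```python
import numpy as np

def blur(img):
    # separable 5-tap binomial filter approximating a Gaussian kernel
    w = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    out = np.apply_along_axis(lambda v: np.convolve(v, w, mode="same"), 0, img)
    return np.apply_along_axis(lambda v: np.convolve(v, w, mode="same"), 1, out)

def upsample(img, shape):
    # nearest-neighbour expansion back to the finer grid
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    pyr, cur = [], img.astype(float)
    for _ in range(levels):
        down = blur(cur)[::2, ::2]                   # next Gaussian level
        pyr.append(cur - upsample(down, cur.shape))  # difference (detail) image
        cur = down
    pyr.append(cur)                                  # coarsest approximation
    return pyr

def fuse_lp(a, b, levels=3):
    pa, pb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    fused = [np.where(np.abs(x) >= np.abs(y), x, y)  # abs-max on detail levels
             for x, y in zip(pa[:-1], pb[:-1])]
    fused.append(0.5 * (pa[-1] + pb[-1]))            # average the approximations
    rec = fused[-1]
    for lap in reversed(fused[:-1]):                 # coarse-to-fine reconstruction
        rec = upsample(rec, lap.shape) + lap
    return rec
```

Because each detail level stores the residual against the upsampled coarser level, reconstruction is exact when the two inputs coincide, which is a convenient sanity check.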
-
The concept of the wavelet transform was first proposed by Grossmann and Morlet [19] in 1984; Mallat [20] then established the multi-resolution decomposition theory of the wavelet transform from the pyramidal algorithm for signal decomposition and reconstruction. Wavelet transforms include the discrete wavelet transform (DWT) [21], dual-tree discrete wavelet transform (DT-DWT) [22], lifting wavelet transform [23], quaternion wavelet transform [24], and spectral graph wavelet transform [25]. The DWT decomposes the source image into multiple scales through filter banks; the scales are highly independent of one another and texture and edge information are well preserved [26]. However, the DWT suffers from oscillation, shift variance, aliasing, and a lack of directional selectivity [27]. The DT-DWT decomposes the image with separable filter banks, solving the DWT's lack of directionality while offering little redundancy and high computational efficiency. But as a first-generation wavelet transform, the DT-DWT is not applicable to non-Euclidean spaces. The lifting scheme is the standard construction for second-generation wavelets; it can be carried out entirely in the spatial domain, offers strongly adaptive design and irregular sampling, and yields visually pleasing fusion results [23].
Compared with pyramid transforms, wavelet transforms introduce no blocking artifacts and achieve a high signal-to-noise ratio. Wavelet transforms also allow perfect reconstruction, reducing information redundancy during decomposition. However, they capture only a limited set of directions in the source image, so some detail information is still lost.
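A minimal single-level example using the Haar wavelet mentioned later in this survey, assuming even image dimensions: decompose both sources, average the approximation sub-band, take the absolute maximum in the three detail sub-bands, and invert. The rules and the single level are illustrative assumptions.

```python
import numpy as np

def haar2d(img):
    # one-level 2-D Haar DWT: approximation LL plus detail sub-bands LH, HL, HH
    a = img.astype(float)
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0     # row-wise average
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0     # row-wise difference
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    # exact inverse of haar2d
    lo = np.zeros((LL.shape[0] * 2, LL.shape[1]))
    hi = np.zeros_like(lo)
    lo[0::2, :], lo[1::2, :] = LL + LH, LL - LH
    hi[0::2, :], hi[1::2, :] = HL + HH, HL - HH
    out = np.zeros((lo.shape[0], lo.shape[1] * 2))
    out[:, 0::2], out[:, 1::2] = lo + hi, lo - hi
    return out

def fuse_dwt(a, b):
    ca, cb = haar2d(a), haar2d(b)
    LL = 0.5 * (ca[0] + cb[0])               # low frequency: weighted average
    details = [np.where(np.abs(x) >= np.abs(y), x, y)  # high frequency: abs-max
               for x, y in zip(ca[1:], cb[1:])]
    return ihaar2d(LL, *details)
```

The perfect-reconstruction property noted above means `fuse_dwt(a, a)` returns `a` exactly.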
-
To capture directional information, suppress the Gibbs phenomenon, and achieve shift invariance, Da Cunha et al. [28] proposed the non-subsampled contourlet transform (NSCT); because it involves no sampling, it eliminates the spectral aliasing of the contourlet transform. Combined with fuzzy logic, the NSCT can effectively enhance infrared targets while preserving the detail of the visible image [29]; combined with region-of-interest extraction, it can successfully highlight infrared targets [30]. However, the NSCT is computationally inefficient and cannot meet highly real-time application requirements.
To meet real-time requirements, Guo et al. [31] proposed the multi-scale, multi-directional shearlet transform, which nevertheless remains shift variant. The non-subsampled shearlet transform (NSST) [32] satisfies both requirements and is computationally more efficient than the NSCT. Its superior information capture and representation ability have made the NSST a popular choice for infrared and visible image fusion. Kong et al. [33] extended the NSST method with fusion rules based on region average energy and local contrast, combining the advantages of spatial-domain analysis and multi-scale geometric analysis to the greatest extent. The NSST is, however, inherently a redundant transform; to overcome this, Kong et al. [34] further introduced fast non-negative matrix factorization into the NSST fusion method, so that the fused image preserves all features of the source images while reducing redundant information.
The sub-band images produced by non-subsampled multi-scale, multi-directional geometric transforms have the same size as the source images, which aids the extraction of detail and texture features and simplifies the design of the subsequent fusion rules. But NSCT decomposition is complex and hard to apply in scenarios with strict real-time requirements, while NSST decomposition, because it adopts a non-subsampled pyramid transform, tends to lose detail in the high-frequency sub-bands and to reduce image brightness.
-
Other common multi-scale fusion methods include edge-preserving filters, the Tetrolet transform, the top-hat transform, and the Haar wavelet transform; among these, edge-preserving filters are the most widely used. They decompose the source image into a base layer and one or more detail layers, preserving spatial structure while reducing artifacts in the fused image. In particular, the guided filter preserves source-image edge detail while eliminating blocking artifacts [35], and the local edge-preserving filter effectively highlights image detail while maintaining global image features [36].
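The base/detail decomposition described above can be sketched with a self-guided guided filter (in the spirit of He et al.'s guided filter); the window radius and regularization `eps` below are assumed illustrative values.

```python
import numpy as np

def box(img, r):
    # mean filter over a (2r+1)x(2r+1) window via 2-D cumulative sums, edge-padded
    pad = np.pad(img, r, mode="edge")
    c = np.cumsum(np.cumsum(pad, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))          # leading zero row/column
    n = 2 * r + 1
    return (c[n:, n:] - c[:-n, n:] - c[n:, :-n] + c[:-n, :-n]) / (n * n)

def guided_filter(I, p, r=4, eps=1e-3):
    # guided filter: locally fit q = a*I + b, an edge-preserving smoothing of p
    mI, mp = box(I, r), box(p, r)
    a = (box(I * p, r) - mI * mp) / (box(I * I, r) - mI * mI + eps)
    b = mp - a * mI
    return box(a, r) * I + box(b, r)

def base_detail(img, r=4, eps=1e-3):
    base = guided_filter(img, img, r, eps)   # self-guided smoothing -> base layer
    return base, img - base                  # detail layer is the residual
```

By construction the two layers sum exactly back to the input, so separate fusion rules can be applied to each layer before recombining.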
-
In 2010, Yang and Li [37] proposed a sparse representation (SR)-based image fusion method (framework shown in Fig. 2); its core lies in the construction of an over-complete dictionary and the sparse coefficient decomposition algorithm [38].
Figure 2. Sparse representation-based infrared and visible image fusion framework
Over-complete dictionaries are constructed in two main ways: from analytic data models or from learning algorithms. Model-based dictionaries are built from specific mathematical models; they are efficient but struggle with complex data, a problem that can be mitigated with a stationary-wavelet multi-scale dictionary based on a joint learning strategy [39] or a hybrid dictionary combining an over-complete discrete cosine dictionary with basis functions [40]. Learning-based dictionaries [38, 40-47] are instead trained from sample sets; the most common are the method of optimal directions (MOD) dictionary [48] and the K-SVD dictionary [49]. Such dictionaries are highly adaptive but computationally expensive. Current research therefore tends toward algorithms that combine the advantages of both constructions. Rubinstein et al. [50] first designed the sparse K-SVD method for learning sparse dictionaries, which combines fixed and learned dictionaries. Later, patch clustering with PCA [51-53], optimal directions [46], adaptive sparse representation [41], online learning [44, 54], and K-SVD dictionaries in multi-scale geometric analysis domains [38, 55] were also successfully applied to image fusion.
Among sparse coefficient decomposition algorithms, matching pursuit (MP) selects, through linear operations on atom vectors, the best linear combination of atoms from the trained over-complete dictionary to represent the image, but its iterative result is only suboptimal. Orthogonal matching pursuit (OMP) [39, 56-57] improves on MP by orthogonalizing the selected atom set; at the same accuracy, OMP is computationally more efficient than MP. Furthermore, to ease the difficulty of designing fusion rules with MP and OMP, simultaneous orthogonal matching pursuit (SOMP) [39, 51, 58] modifies the atom selection so that the same dictionary subset is decomposed from the different source images, which simplifies fusion-rule design; it is widely used in image fusion.
Sparse representation differs from traditional multi-scale transform fusion in two main respects [59]. First, multi-scale methods fuse images with pre-set basis functions, which can overlook important features of the source images, whereas SR-based fusion learns an over-complete dictionary rich in basis atoms, favoring better representation and extraction of the image. Second, multi-scale methods decompose the image into multiple layers and fuse layer by layer, so the choice of decomposition level is critical: to capture rich spatial information, designers usually choose a relatively large number of levels, but as the levels increase, the fusion becomes more sensitive to noise and registration errors. SR instead uses a sliding window to divide the image into overlapping, vectorized patches, which reduces artifacts and improves robustness to misregistration.
Although SR-based fusion alleviates the insufficient feature representation and strict registration requirements of multi-scale transforms, it has shortcomings of its own: (1) the representational power of the over-complete dictionary is limited, so texture detail is easily lost; (2) the "max-L1" fusion rule is sensitive to random noise, lowering the signal-to-noise ratio of the fused image; and (3) the overlapping patches produced by the sliding window reduce runtime efficiency.
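Loosely following the SR pipeline above, the sketch below fuses two images patch by patch: overlapping sliding-window patches are sparse-coded with OMP over a fixed over-complete DCT dictionary, and for each patch the code with the larger L1 activity level is kept (the "max-L1" rule). The dictionary construction, patch size, step, and sparsity level are illustrative assumptions, not the parameters of any cited method.

```python
import numpy as np

def dct_dictionary(patch=4, atoms=8):
    # over-complete 2-D DCT dictionary: atoms^2 atoms of dimension patch^2
    k = np.arange(patch)
    bases = [np.cos(np.pi * f * (2 * k + 1) / (2 * atoms)) for f in range(atoms)]
    D = np.array([np.outer(u, v).ravel() for u in bases for v in bases]).T
    return D / np.linalg.norm(D, axis=0)

def omp(D, x, n_nonzero=3):
    # orthogonal matching pursuit: greedily pick atoms, re-fit by least squares
    r, idx = x.astype(float).copy(), []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ r)))
        if j in idx:                         # no new atom improves the fit
            break
        idx.append(j)
        sub = D[:, idx]
        coef, *_ = np.linalg.lstsq(sub, x, rcond=None)
        r = x - sub @ coef
    full = np.zeros(D.shape[1])
    full[idx] = coef
    return full

def fuse_sr(a, b, patch=4, step=2, n_nonzero=3):
    D = dct_dictionary(patch)
    out = np.zeros(a.shape, float)
    weight = np.zeros(a.shape, float)        # counts overlapping contributions
    for i in range(0, a.shape[0] - patch + 1, step):
        for j in range(0, a.shape[1] - patch + 1, step):
            pa = a[i:i + patch, j:j + patch].ravel()
            pb = b[i:i + patch, j:j + patch].ravel()
            ca, cb = omp(D, pa, n_nonzero), omp(D, pb, n_nonzero)
            # "max-L1" rule: keep the code with the larger activity level
            c = ca if np.abs(ca).sum() >= np.abs(cb).sum() else cb
            out[i:i + patch, j:j + patch] += (D @ c).reshape(patch, patch)
            weight[i:i + patch, j:j + patch] += 1.0
    return out / np.maximum(weight, 1.0)
```

Averaging the overlapping reconstructions is what gives the method its robustness to misregistration, at the runtime cost noted in point (3) above.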
-
Neural networks entered image fusion with the pulse coupled neural network (PCNN) model. Unlike other neural network models, a PCNN can extract image information effectively without any training or learning [60]. PCNN-based infrared and visible image fusion is usually combined with multi-scale transforms such as the NSCT [61-66], NSST [60], curvelet transform [67], and contourlet transform [68]. The PCNN serves mainly in the fusion strategy, most commonly in one of two ways: applied only to the high-frequency sub-bands, or applied to both the high- and low-frequency sub-bands. In the NSCT framework, for example, an adaptive dual-channel PCNN can fuse both the high- and low-frequency sub-band coefficients, or the PCNN can act as the high-frequency rule while weighted averaging [62] or regional variance measures [63] serve as the low-frequency strategy.
Deep learning-based image fusion is gradually becoming the main research direction in the field, although its application to multi-modal fusion is still at an early stage of development. Deep methods use the deep features of the source images as salient features for reconstructing the fused image; the convolutional neural network is currently the deep learning model most often used for fusion. Li et al. [69] and Ren et al. [70] both proposed extracting deep features of the source images with a pre-trained VGG-19 (Visual Geometry Group-19) network, achieving good fusion results. In 2019, Ma et al. [71] first applied an end-to-end generative adversarial network to image fusion, avoiding the manual design of complicated decomposition levels and fusion rules while effectively preserving source-image information.
In the PCNN the neurons correspond one-to-one with the image pixels, which overcomes the detail loss of traditional methods, but the network structure is complex and parameter setting is cumbersome; combined with multi-scale transforms, it achieves only local adaptivity, and its speed and generalization ability still need improvement. Deep learning-based fusion extracts deep features directly from the image data, achieves adaptive fusion, and offers strong fault tolerance and robustness, yet it is not widely deployed, mainly for two reasons: (1) labeled datasets for convolutional networks are hard to build — infrared and visible fusion generally has no standard reference image, so the ground truth of the model cannot be accurately defined; and (2) end-to-end models lack targeted loss functions — although they resolve the limited clarity and the need for reference images of convolutional models, purpose-built loss functions have yet to be designed.
-
Infrared and visible images can also be fused with subspace-based methods such as principal component analysis [72], robust principal component analysis [73], independent component analysis [74], and NMF [75]. In general, most source images contain redundant information; subspace methods project the high-dimensional source data into a low-dimensional space or subspace, capturing the intrinsic structure of the image at low computational cost.
Hybrid models combine the strengths of individual methods to improve fusion performance; common combinations are the multi-scale transform with saliency detection, with SR, and with the PCNN. Methods combining multi-scale transforms with saliency detection generally embed the saliency detection in the multi-scale fusion framework to enhance the information of regions of interest. Saliency detection is applied in two main ways: weight calculation [76-80] and salient object extraction [81-82]. Weight calculation derives saliency maps from the high- and low-frequency sub-bands, computes the corresponding weight maps, and applies them during reconstruction. Salient object extraction is common in surveillance applications such as target detection and recognition; a representative example is Zhang et al. [81], who extract the target information of the infrared image by saliency analysis within an NSST framework.
Multi-scale transforms suffer from low image contrast and a decomposition level that is hard to determine; sparse representation tends to smooth the texture and edge information of the source images and is computationally inefficient. Hybrid models combining the two can usually strike the best balance between them, with the SR model typically applied to the low-frequency sub-band after multi-scale decomposition [83]. Exploiting the PCNN's ability to extract fine detail, the multi-scale transform is also often combined with both SR and the PCNN, with an SR-based rule for the low frequencies and a PCNN-based rule for the high frequencies [84]. Such hybrids effectively improve the clarity and texture features of the fused image, but the strengths and weaknesses of SR and the PCNN must be weighed at design time, lest the model grow overly complex and computationally costly.
-
The quality of the fusion rules directly affects the fusion result. Traditional methods invariably adopt "absolute-maximum selection for the high frequencies, weighted averaging for the low frequencies." Both are pixel-level rules: absolute-maximum selection is computationally cheap, but it easily loses image information and sharpens edges artificially, while weighted averaging can also discard important features, and overly simple weight settings make the gray levels of the fused image deviate too far from the source images. Many researchers have therefore improved on these pixel-level rules, for example absolute-maximum selection based on contrast and energy [24], weighted averaging with weights determined by spatial frequency [82], and weighted averaging based on the bilateral filter [25].
Region-based and matrix-analysis fusion rules are also common. Because the human eye attends to regions rather than individual pixels, region-based rules better match the human visual system and, by fully considering the relations between pixels, better express the local features of the image; common examples are region average energy [85] and region cross-gradient [86]. Matrix-analysis rules divide mainly into fuzzy logic [56, 87] and non-negative matrix factorization [34]. Fuzzy logic is often embedded in weighted-average rules to fuse the high-frequency sub-bands, while NMF exploits the non-negativity of the decomposed pixel values to split the image into non-negative components, achieving low computational complexity and high clarity; it is usually applied as a low-frequency rule.
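As an illustration of a region-level rule, the sketch below fuses two high-frequency sub-bands by comparing neighbourhood energy rather than single-pixel magnitudes; the 3×3 window is an assumed choice.

```python
import numpy as np

def region_energy(img, r=1):
    # local energy: sum of squared coefficients over a (2r+1)x(2r+1) window
    pad = np.pad(img.astype(float) ** 2, r, mode="edge")
    n = 2 * r + 1
    energy = np.zeros(img.shape, float)
    for di in range(n):
        for dj in range(n):
            energy += pad[di:di + img.shape[0], dj:dj + img.shape[1]]
    return energy

def fuse_by_region_energy(ha, hb):
    # region-level rule: keep the coefficient whose neighbourhood carries more
    # energy (contrast this with the pixel-wise absolute-maximum rule)
    ea, eb = region_energy(ha), region_energy(hb)
    return np.where(ea >= eb, ha, hb)
```

Because the decision is made per neighbourhood, an isolated noisy pixel is less likely to win the selection than under a pixel-wise abs-max rule.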
Table 1 lists the common infrared and visible image fusion methods together with their fusion strategies, advantages, limitations, and applicable scenes. The six categories — pyramid transforms, wavelet transforms, non-subsampled multi-scale multi-directional geometric transforms, sparse representation, neural networks, and hybrid methods — are introduced in the order in which they were proposed.
Table 1. Comparison of infrared and visible image fusion methods
Fusion methods | Specific methods | Fusion strategies | Advantages | Limitations | Applicable scenes
Pyramid transforms | Laplacian pyramid | Fuzzy logic[9] | Smoothing image edges; less time consumption; fewer artifacts | Losing image details; block phenomenon; redundancy of data | Short-distance scenes with sufficient light, such as equipment detection
 | Contrast pyramid | Clonal selection algorithm[17]; teaching-learning-based optimization[88]; multi-objective evolutionary algorithm[89] | High image contrast; abundant characteristic information | Low computing efficiency; losing image details |
 | Steerable pyramid | Absolute value maximum selection (AVMS)[90]; expectation maximization (EM) algorithm[91]; PCNN and weighting[92] | Abundant edge detail; inhibiting the Gibbs effect effectively; fusing geometrical and thematic features | Increasing algorithm complexity; losing image details |
Wavelet transforms | Discrete wavelet transform | Regional energy[93]; target region segmentation[21] | Significant texture information; highly independent scale information; fewer blocking artifacts; higher signal-to-noise ratio | Image aliasing; ringing artifacts; strict registration requirements | Short-distance scenes, such as face recognition
 | Dual-tree discrete wavelet transform | Particle swarm optimization[22]; fuzzy logic and population-based optimization[94] | Less redundant information; less time consumption | Limited directional information |
 | Lifting wavelet transform | Local regional energy[23]; PCNN[85] | High computing speed; low space complexity | Losing image details; distorting the image |
Non-subsampled multi-scale and multi-direction geometrical transforms | NSCT | Fuzzy logic[29]; region of interest[30] | Distinct edge features; eliminating the Gibbs effect; better visual perception | Losing image details; low computing efficiency; poor real-time performance | Scenes with a complex background, such as rescue scenes
 | NSST | Region average energy and local directional contrast[33]; FNMF[34] | Superior sparse ability; high real-time performance | Losing luminance information; strict registration requirement; losing high-frequency image details | Cases needing real-time treatment, such as intelligent traffic monitoring
Sparse representation | | Saliency detection[44, 86-87]; PCNN[56, 95] | Better robustness; fewer artifacts; reducing misregistration; abundant brightness information | Smoothing edge texture information; complex calculation; losing edge features of high-frequency images | Scenes with few feature points, such as the surface of the sea
Neural networks | PCNN | Multi-scale transform and sparse representation; multi-scale transform | Superior adaptability; higher signal-to-noise ratio; high fault tolerance | Model parameters not easy to set; complex and time-consuming algorithms | Automatic target detection and localization
 | Deep learning | VGG-19 and multi-layer fusion[69]; VGG-19 and saliency detection[70] | Less artificial noise; abundant characteristic information; fewer artifacts | Requiring the ground truth in advance |
 | | GAN[71] | Avoiding manually designed activity-level measurements and fusion rules | Visual information fidelity and correlation coefficient not optimal |
Hybrid methods | Multi-scale transform and saliency | Weight calculation[76-80]; salient object extraction[81-82] | Maintaining the integrity of the salient object region; improving the visual quality of the fused image; reducing noise | Highlighting the saliency area inconsistently; losing background information | Surveillance applications, such as object detection and tracking
 | Multi-scale transform and SR | Absolute values of coefficients and SR[38]; fourth-order correlation coefficient match and SR[83] | Retaining luminance information; excellent stability and robustness | Poor real-time performance; losing image details |
-
Fusion quality is assessed subjectively or objectively. The subjective method grades images on five levels: "excellent", "good", "fair", "poor", and "very poor". Being qualitative, it is strongly observer-dependent: it cannot judge objectively between two fusion results of similar quality, and adjacent grades have no clear boundary. Objective evaluation instead quantifies the fused image by computing specific index formulas, and its indices divide into two classes: those without and those with a reference image [123].
The definitions of the commonly used fusion evaluation indices are given in Tables 2 and 3. Let the images be of size M×N, with A and B the source images, F the fused image, and S and R reference images; µ is the mean gray level; pk is the probability of gray level k (k = 0, 1, 2, ···, 255). Let Z = A, B, S, F, R; i = 1, 2, ···, M; j = 1, 2, ···, N, where Z(i, j) is the gray value of image Z and ΔZ its difference; Q denotes the amount of edge information preserved, PZ and PZZ denote the probability density function of an image and the joint probability density between images, and w denotes the edge strength (weight) function.
Table 2. Evaluation index without reference image
Evaluation indicators | Definition | Explanation
IE[124] | ${{IE} } = - \displaystyle\sum\limits_{i = 0}^{L - 1} { {p_i} } {\log _2}{p_i}$ | The amount of information contained in the image increases as IE increases
SD[125] | ${{SD} } = \sqrt {\frac{1}{ {MN} }\displaystyle\mathop \sum \limits_{i = 1}^M \displaystyle\mathop \sum \limits_{j = 1}^N { {\left( {F\left( {i,j} \right) - \mu } \right)}^2} }$ | SD measures the deviation of pixel values from the gray mean; a larger SD implies higher image contrast
AG[126] | ${{AG} } = \frac{1}{ {\left( {M - 1} \right)\left( {N - 1} \right)} }\displaystyle\sum\limits_{i = 1}^{M - 1} {\displaystyle\sum\limits_{j = 1}^{N - 1} {\sqrt {\frac{ {\left( {\vartriangle Z_i^2 + \vartriangle Z_j^2} \right)} }{2} } } }$ | AG reflects the gray variation of the image; a high AG indicates rich detail information
QAB/F[127] | ${ {{Q} }^{{ {AB/F} } } } = \frac{ {\displaystyle\sum\limits_{i = 0}^{M - 1} {\displaystyle\sum\limits_{j = 0}^{N - 1} {\left( {Q_{\left( {i,j} \right)}^{AF}w_{\left( {i,j} \right)}^A + Q_{\left( {i,j} \right)}^{BF}w_{\left( {i,j} \right)}^B} \right)} } } }{ {\displaystyle\sum\limits_{i = 0}^{M - 1} {\displaystyle\sum\limits_{j = 0}^{N - 1} {\left( {w_{\left( {i,j} \right)}^A + w_{\left( {i,j} \right)}^B} \right)} } } }$ | QAB/F evaluates the transfer of edge information; the closer its value is to 1, the better the fusion
MI[2] | $\begin{array}{l}{I_{ { {FA} } } } = \displaystyle\sum\limits_{i = 1}^{M - 1} {\displaystyle\sum\limits_{j = 1}^{N - 1} { {P_{ { {FA} } } }\left( {i,j} \right)} } {\log _2}\dfrac{ { {P_{FA} }\left( {i,j} \right)} }{ { {P_F}\left( i \right){P_A}\left( j \right)} }\\MI_{AB}^F = {I_{ { {FA} } } } + {I_{ { {FB} } } }\end{array}$ | MI characterizes how much source-image information is inherited; more information is preserved as MI increases
CC[128] | ${{CC} } = \frac{ {\displaystyle\sum\limits_{i = 1}^M {\displaystyle\sum\limits_{j = 1}^N {\left[ {\left( {F\left( {i,j} \right) - {\mu _F} } \right) \times \left( {S\left( {i,j} \right) - {\mu _S} } \right)} \right]} } } }{ {\sqrt {\displaystyle\sum\limits_{i = 1}^M {\displaystyle\sum\limits_{j = 1}^N {\left[ { { {\left( {F\left( {i,j} \right) - {\mu _F} } \right)}^2} } \right]\displaystyle\sum\limits_{i = 1}^M {\displaystyle\sum\limits_{j = 1}^N {\left[ { { {\left( {S\left( {i,j} \right) - {\mu _S} } \right)}^2} } \right]} } } } } } }$ | A larger CC indicates greater similarity between the images and thus more preserved information
Table 3. Evaluation index based on reference image
Evaluation indicators | Definition | Explanation
SSIM[129] | $SSI{M_{RF}} = \displaystyle\prod\limits_{i = 1}^3 {\dfrac{{2{\mu _R}{\mu _F} + {c_i}}}{{\mu _R^2 + \mu _F^2 + {c_i}}}} $ | SSIM measures luminance, contrast, and structural distortion; the similarity between the reference and fused images increases with SSIM
RMSE[2] | $RMSE = \sqrt {\dfrac{1}{{M \times N}}\displaystyle\sum\limits_{i = 1}^M {\displaystyle\sum\limits_{j = 1}^N {{{\left[ {R\left( {i,j} \right) - F\left( {i,j} \right)} \right]}^2}} } } $ | Image quality improves as RMSE decreases
PSNR[2] | $PSNR = 10 \cdot \lg \dfrac{{{{\left( {255^2 \times M \times N} \right)}}}}{{\displaystyle\sum\limits_{i = 1}^M {\displaystyle\sum\limits_{j = 1}^N {{{\left[ {R\left( {i,j} \right) - F\left( {i,j} \right)} \right]}^2}} } }}$ | PSNR evaluates whether the image noise is suppressed; distortion decreases as PSNR increases
No-reference indices divide further into single-image indices and source-image-based indices. Single-image indices evaluate the fused image itself and include information entropy (IE), standard deviation (SD), average gradient (AG), and spatial frequency, each measuring the information content or gray-level distribution of the fused image in a different way: IE reflects the information content through the gray-level statistics, SD through the deviation of pixel values from the gray mean, while AG and spatial frequency reflect the gray variation rate and sharpness of the image. Such indices, however, typically consider only a single statistical feature and can disagree with subjective evaluation. The other class of no-reference indices is measured against the source images and, from an information-theoretic standpoint, quantifies how much information the fused image extracts from the sources; common examples are mutual information (MI), the correlation coefficient (CC), and QAB/F for the transfer of edge information. Cross entropy and joint entropy, both derived from information entropy, complement IE, which reflects only the information content of the fused image and cannot by itself describe the overall fusion quality.
Reference-based indices evaluate performance by comparing gray values, noise, and other differences between the fused image and a standard reference image. They mainly include the structural similarity (SSIM), root-mean-square error (RMSE), and peak signal-to-noise ratio (PSNR). SSIM compares the luminance, contrast, and structural distortion between images; RMSE, the deviation index, and the degree of distortion compare the gray values of corresponding pixels; PSNR evaluates image quality by measuring whether the noise of the fused image has been suppressed. In practice there is usually no reference image to serve as a standard, so this class of indices has not yet seen large-scale application.
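The scalar indices in Tables 2 and 3 are straightforward to compute; a minimal NumPy sketch of IE, SD, AG, RMSE, and PSNR follows (assuming 8-bit gray levels for IE and a peak value of 255 for PSNR).

```python
import numpy as np

def information_entropy(img):
    # IE = -sum p_k log2 p_k over the 256 gray levels
    hist = np.bincount(img.astype(np.uint8).ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def standard_deviation(img):
    # SD: deviation of pixel values from the gray mean
    return float(img.astype(float).std())

def average_gradient(img):
    # AG: mean root-mean-square of horizontal and vertical differences
    f = img.astype(float)
    dx = f[1:, :-1] - f[:-1, :-1]
    dy = f[:-1, 1:] - f[:-1, :-1]
    return float(np.sqrt((dx ** 2 + dy ** 2) / 2.0).mean())

def rmse(ref, fused):
    # RMSE between reference image R and fused image F
    return float(np.sqrt(((ref.astype(float) - fused.astype(float)) ** 2).mean()))

def psnr(ref, fused, peak=255.0):
    # PSNR = 10 lg(peak^2 / MSE); infinite for identical images
    e = rmse(ref, fused)
    return float(10.0 * np.log10(peak ** 2 / e ** 2)) if e > 0 else float("inf")
```

For example, a constant image has zero entropy and zero average gradient, which matches the intuition that it carries no information or detail.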
Research progress of infrared and visible image fusion technology
-
Abstract: A fused infrared and visible image combines the thermal radiation information of the infrared image with the detail information of the visible image, and is widely used in production, daily life, military surveillance, and other scenarios; such fusion has become a key research direction in the image fusion field. According to their core ideas, fusion frameworks, and research progress, fusion methods based on multi-scale transforms, sparse representation, neural networks, and other models are elaborated and compared, and the application status of infrared and visible image fusion in various fields, as well as the commonly used evaluation indices, are surveyed. Representative fusion methods and evaluation indices are applied to six different scenes to verify the advantages and disadvantages of each method. Finally, the problems of existing infrared and visible image fusion methods are experimentally analyzed and summarized, and the development trends of infrared and visible image fusion technology are discussed.
-
Key words:
- infrared image /
- visible image /
- image fusion /
- multi-scale transform /
- sparse representation /
- neural network
-
-
[1] Li S T, Kang X D, Fang L Y, et al. Pixel-level image fusion: A survey of the state of the art [J]. Information Fusion, 2017, 33: 100-112. doi: 10.1016/j.inffus.2016.05.004
[2] Ma J Y, Ma Y, Li C. Infrared and visible image fusion methods and applications: A survey [J]. Information Fusion, 2019, 45: 153-178. doi: 10.1016/j.inffus.2018.02.004
[3] Short N J, Yuffa A J, Videen G, et al. Effects of surface materials on polarimetric-thermal measurements: Applications to face recognition [J]. Applied Optics, 2016, 55(19): 5226-5233. doi: 10.1364/AO.55.005226
[4] Heo J, Kong S G, Abidi B R. Fusion of visual and thermal signatures with eyeglass removal for robust face recognition[C]//Computer Vision and Pattern Recognition Workshop, 2004, 19: 122-127.
[5] Kumar K S, Kavitha G, Subramanian R, et al. MATLAB-A Ubiquitous Tool for the Practical Engineer[M]. Croatia: In Tech, 2011: 307-326.
[6] Castillo J C, Fernandez-Caballero A, Serrano-Cuerda J, et al. Smart environment architecture for robust people detection by infrared and visible video fusion [J]. Journal of Ambient Intelligence and Humanized Computing, 2017, 8(2): 223-237. doi: 10.1007/s12652-016-0429-5
[7] Fendri E, Boukhriss R R, Hammami M. Fusion of thermal infrared and visible spectra for robust moving object detection [J]. Pattern Analysis and Applications, 2017, 20(4): 907-926. doi: 10.1007/s10044-017-0621-z
[8] Apatean A, Rogozan A, Bensrhair A. Visible-infrared fusion schemes for road obstacle classification [J]. Transportation Research Part C-Emerging Technologies, 2013, 35: 180-192. doi: 10.1016/j.trc.2013.07.003
[9] Bulanon D M, Burks T F, Alchanatis V. Image fusion of visible and thermal images for fruit detection [J]. Biosystems Engineering, 2009, 103(1): 12-22. doi: 10.1016/j.biosystemseng.2009.02.009
[10] Raza S E A, Sanchez V, Prince G, et al. Registration of thermal and visible light images of diseased plants using silhouette extraction in the wavelet domain [J]. Pattern Recognition, 2015, 48(7): 2119-2128. doi: 10.1016/j.patcog.2015.01.027
[11] Burt P, Adelson E. The Laplacian pyramid as a compact image code [J]. IEEE Transactions on Communications, 1983, 31(4): 532-540. doi: 10.1109/TCOM.1983.1095851
[12] Toet A. Image fusion by a ratio of low-pass pyramid [J]. Pattern Recognition Letters, 1989, 9(4): 245-253. doi: 10.1016/0167-8655(89)90003-2
[13] Toet A, Vanruyven L J, Valeton J M. Merging thermal and visual images by a contrast pyramid [J]. Optical Engineering, 1989, 28(7): 789-792.
[14] Toet A. A morphological pyramidal image decomposition [J]. Pattern Recognition Letters, 1989, 9(4): 255-261. doi: 10.1016/0167-8655(89)90004-4
[15] Freeman W T, Adelson E H. The design and use of steerable filters [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1991, 13(9): 891-906. doi: 10.1109/34.93808
[16] Yu X L, Ren J L, Chen Q, et al. A false color image fusion method based on multi-resolution color transfer in normalization YCBCR space [J]. Optik, 2014, 125(20): 6010-6016. doi: 10.1016/j.ijleo.2014.07.059
[17] Jin H Y, Jiao L C, Liu F, et al. Fusion of infrared and visual images based on contrast pyramid directional filter banks using clonal selection optimizing [J]. Optical Engineering, 2008, 47(2): 27002-27008. doi: 10.1117/1.2857417
[18] He D X, Meng Y, Wang C Y. Contrast pyramid based image fusion scheme for infrared image and visible image[C]//2011 IEEE International Geoscience and Remote Sensing Symposium, 2011: 597-600.
[19] Grossmann A, Morlet J. Decomposition of hardy functions into square integrable wavelets of constant shape [J]. SIAM Journal on Mathematical Analysis, 1984, 15(4): 723-736. doi: 10.1137/0515056
[20] Mallat S G. A theory for multiresolution signal decomposition-the wavelet representation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, 11(7): 674-693. doi: 10.1109/34.192463
[21] Niu Y, Xu S, Wu L, et al. Airborne infrared and visible image fusion for target perception based on target region segmentation and discrete wavelet transform [J]. Mathematical Problems in Engineering, 2012, 2012: 732-748.
[22] Madheswari K, Venkateswaran N. Swarm intelligence based optimisation in thermal image fusion using dual tree discrete wavelet transform [J]. Quantitative Infrared Thermography Journal, 2017, 14(1): 24-43. doi: 10.1080/17686733.2016.1229328
[23] Zou Y, Liang X, Wang T. Visible and infrared image fusion using the lifting wavelet [J]. Telkomnika Indonesian Journal of Electrical Engineering, 2013, 11(11): 6290-6295.
[24] Chai P F, Luo X Q, Zhang Z C. Image fusion using quaternion wavelet transform and multiple features [J]. IEEE Access, 2017, 5: 6724-6734. doi: 10.1109/ACCESS.2017.2685178
[25] Yan X, Qin H L, Li J, et al. Infrared and visible image fusion with spectral graph wavelet transform [J]. Journal of the Optical Society of America A-Optics Image Science and Vision, 2015, 32(9): 1643-1652. doi: 10.1364/JOSAA.32.001643
[26] Tao G Q, Li D P, Lu G H. On image fusion based on different fusion rules of wavelet transform [J]. Acta Photonica Sinica, 2004, 33(2): 221-224.
[27] Selesnick I W, Baraniuk R G, Kingsbury N G. The dual-tree complex wavelet transform [J]. IEEE Signal Processing Magazine, 2005, 22(6): 123-151. doi: 10.1109/MSP.2005.1550194
[28] Da Cunha A L, Zhou J P, Do M N. The nonsubsampled contourlet transform: theory, design, and applications [J]. IEEE Transactions on Image Processing, 2006, 15(10): 3089-3101. doi: 10.1109/TIP.2006.877507
[29] Yin S, Cao L, Tan Q, et al. Infrared and visible image fusion based on NSCT and fuzzy logic[C]//Proceedings of the 2010 IEEE International Conference on Mechatronics and Automation, 2010, 5: 671-675.
[30] Liu H X, Zhu T H, Zhao J J. Infrared and visible image fusion based on region of interest detection and nonsubsampled contourlet transform [J]. Journal of Shanghai Jiaotong University (Science), 2013, 18(5): 526-534. doi: 10.1007/s12204-013-1437-7
[31] Guo K, Labate D. Optimally sparse multidimensional representation using shearlets [J]. SIAM Journal on Mathematical Analysis, 2007, 39(1): 298-318. doi: 10.1137/060649781
[32] Easley G, Labate D, Lim W Q. Sparse directional image representations using the discrete shearlet transform [J]. Applied and Computational Harmonic Analysis, 2008, 25(1): 25-46. doi: 10.1016/j.acha.2007.09.003
[33] Kong W W, Wang B H, Lei Y. Technique for infrared and visible image fusion based on non-subsampled shearlet transform and spiking cortical model [J]. Infrared Physics & Technology, 2015, 71: 87-98.
[34] Kong W, Lei Y, Zhao H. Adaptive fusion method of visible light and infrared images based on non-subsampled shearlet transform and fast non-negative matrix factorization [J]. Infrared Physics & Technology, 2014, 67: 161-172.
[35] Hu H M, Wu J W, Li B, et al. An adaptive fusion algorithm for visible and infrared videos based on entropy and the cumulative distribution of gray levels [J]. IEEE Transactions on Multimedia, 2017, 19(12): 2706-2719. doi: 10.1109/TMM.2017.2711422
[36] Zhang X Y, Ma Y, Fan F, et al. Infrared and visible image fusion via saliency analysis and local edge-preserving multi-scale decomposition [J]. Journal of the Optical Society of America A-Optics Image Science and Vision, 2017, 34(8): 1400-1410. doi: 10.1364/JOSAA.34.001400
[37] Yang B, Li S T. Multifocus image fusion and restoration with sparse representation [J]. IEEE Transactions on Instrumentation and Measurement, 2010, 59(4): 884-892. doi: 10.1109/TIM.2009.2026612
[38] Liu Y, Liu S P, Wang Z F. A general framework for image fusion based on multi-scale transform and sparse representation [J]. Information Fusion, 2015, 24: 147-164. doi: 10.1016/j.inffus.2014.09.004
[39] Yin H T. Sparse representation with learned multiscale dictionary for image fusion [J]. Neurocomputing, 2015, 148: 600-610. doi: 10.1016/j.neucom.2014.07.003
[40] Yang B, Li S T. Pixel-level image fusion with simultaneous orthogonal matching pursuit [J]. Information Fusion, 2012, 13(1): 10-19. doi: 10.1016/j.inffus.2010.04.001
[41] Liu Y, Wang Z F. Simultaneous image fusion and denoising with adaptive sparse representation [J]. IET Image Processing, 2015, 9(5): 347-357. doi: 10.1049/iet-ipr.2014.0311
[42] Yin H T, Li S T. Multimodal image fusion with joint sparsity model [J]. Optical Engineering, 2011, 50(6): 067007-067009. doi: 10.1117/1.3584840
[43] Nejati M, Samavi S, Shirani S. Multi-focus image fusion using dictionary-based sparse representation [J]. Information Fusion, 2015, 25: 72-84. doi: 10.1016/j.inffus.2014.10.004
[44] Wang J, Peng J Y, Feng X Y, et al. Fusion method for infrared and visible images by using non-negative sparse representation [J]. Infrared Physics & Technology, 2014, 67: 477-489.
[45] Zhang Q, Levine M D. Robust multi-focus image fusion using multi-task sparse representation and spatial context [J]. IEEE Transactions on Image Processing, 2016, 25(5): 2045-2058. doi: 10.1109/TIP.2016.2524212
[46] Zhang Q H, Fu Y L, Li H F, et al. Dictionary learning method for joint sparse representation-based image fusion [J]. Optical Engineering, 2013, 52(5): 1-11.
[47] Yu N N, Qiu T S, Bi F, et al. Image features extraction and fusion based on joint sparse representation [J]. IEEE Journal of Selected Topics in Signal Processing, 2011, 5(5): 1074-1082. doi: 10.1109/JSTSP.2011.2112332
[48] Engan K, Aase S O, Husoy J H. Method of optimal directions for frame design[C]//1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1999: 2443-2446.
[49] Aharon M, Elad M, Bruckstein A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation [J]. IEEE Transactions on Signal Processing, 2006, 54(11): 4311-4322. doi: 10.1109/TSP.2006.881199
[50] Rubinstein R, Zibulevsky M, Elad M. Double sparsity: Learning sparse dictionaries for sparse signal approximation [J]. IEEE Transactions on Signal Processing, 2010, 58(3): 1553-1564. doi: 10.1109/TSP.2009.2036477
[51] Kim M, Han D K, Ko H. Joint patch clustering-based dictionary learning for multimodal image fusion [J]. Information Fusion, 2016, 27: 198-214. doi: 10.1016/j.inffus.2015.03.003
[52] Dong W S, Li X, Zhang L, et al. Sparsity-based image denoising via dictionary learning and structural clustering[C]//2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011: 457-464.
[53] Chatterjee P, Milanfar P. Clustering-based denoising with locally learned dictionaries [J]. IEEE Transactions on Image Processing, 2009, 18(7): 1438-1451. doi: 10.1109/TIP.2009.2018575
[54] Yao Y, Guo P, Xin X, et al. Image fusion by hierarchical joint sparse representation [J]. Cognitive Computation, 2014, 6(3): 281-292. doi: 10.1007/s12559-013-9235-y
[55] Ophir B, Lustig M, Elad M. Multi-scale dictionary learning using wavelets [J]. IEEE Journal of Selected Topics in Signal Processing, 2011, 5(5): 1014-1024. doi: 10.1109/JSTSP.2011.2155032
[56] Lu X Q, Zhang B H, Zhao Y, et al. The infrared and visible image fusion algorithm based on target separation and sparse representation [J]. Infrared Physics & Technology, 2014, 67: 397-407.
[57] Zhu Z Q, Yin H P, Chai Y, et al. A novel multi-modality image fusion method based on image decomposition and sparse representation [J]. Information Sciences, 2018, 432: 516-529. doi: 10.1016/j.ins.2017.09.010
[58] Wang K P, Qi G Q, Zhu Z Q, et al. A novel geometric dictionary construction approach for sparse representation based image fusion [J]. Entropy, 2017, 19(7): 306. doi: 10.3390/e19070306
[59] Zhang Q, Liu Y, Blum R S, et al. Sparse representation based multi-sensor image fusion for multi-focus and multi-modality images: A review [J]. Information Fusion, 2018, 40: 57-75. doi: 10.1016/j.inffus.2017.05.006
[60] Kong W W, Zhang L J, Lei Y. Novel fusion method for visible light and infrared images based on NSST-SF-PCNN [J]. Infrared Physics & Technology, 2014, 65: 103-112.
[61] Xiang T Z, Yan L, Gao R R. A fusion algorithm for infrared and visible images based on adaptive dual-channel unit-linking PCNN in NSCT domain [J]. Infrared Physics & Technology, 2015, 69: 53-61.
[62] Ma L J, Zhao C H. An effective image fusion method based on nonsubsampled contourlet transform and pulse coupled neural network[C]//Proceedings of the 2nd International Conference on Computer and Information Applications (ICCIA 2012), 2012: 8-12.
[63] Li Y, Song G H, Yang S C. Multi-sensor image fusion by NSCT-PCNN transform[C]//2011 IEEE International Conference on Computer Science and Automation Engineering, 2011: 638-642.
[64] Kong W W, Lei Y J, Lei Y, et al. Image fusion technique based on non-subsampled contourlet transform and adaptive unit-fast-linking pulse-coupled neural network [J]. IET Image Processing, 2011, 5(2): 113-121. doi: 10.1049/iet-ipr.2009.0425
[65] Qu X B, Yan J W, Xiao H Z, et al. Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain [J]. Acta Automatica Sinica, 2009, 12(34): 1508-1514.
[66] El-taweel G S, Helmy A K. Image fusion scheme based on modified dual pulse coupled neural network [J]. IET Image Processing, 2013, 7(5): 407-414. doi: 10.1049/iet-ipr.2013.0045
[67] Yu Z, Yan L, Han N, et al. Image fusion algorithm based on contourlet transform and PCNN for detecting obstacles in forests [J]. Cybernetics and Information Technologies, 2015, 15(1): 116-125. doi: 10.1515/cait-2015-0010
[68] Liu S, Piao Y, Tahir M. Research on fusion technology based on low-light visible image and infrared image [J]. Optical Engineering, 2016, 55(12): 123104. doi: 10.1117/1.OE.55.12.123104
[69] Li H, Wu X J, Kittler J. Infrared and visible image fusion using a deep learning framework[C]//2018 24th International Conference on Pattern Recognition, 2018: 2705-2710.
[70] Ren X, Meng F, Hu T, et al.
Infrared-visible image fusion based on convolutional neural networks (CNN)[C]//International Conference on Intelligent Science and Big Data Engineering, 2018: 301-307. [71] Ma J Y, Yu W, Liang P W, et al. FusionGAN: A generative adversarial network for infrared and visible image fusion [J]. Information Fusion, 2019, 48: 11-26. doi: 10.1016/j.inffus.2018.09.004 [72] Li H, Liu L, Huang W, et al. An improved fusion algorithm for infrared and visible images based on multi-scale transform [J]. Infrared Physics & Technology, 2016, 74: 28-37. [73] Fu Z Z, Wang X, Xu J, et al. Infrared and visible images fusion based on RPCA and NSCT [J]. Infrared Physics & Technology, 2016, 77: 114-123. [74] Cvejic N, Bull D, Canagarajah N. Region-based multimodal image fusion using ICA bases [J]. IEEE Sensors Journal, 2007, 7(5): 743-751. doi: 10.1109/JSEN.2007.894926 [75] Mou J, Gao W, Song Z. Image fusion based on non-negative matrix factorization and infrared feature extraction[C]// 2013 6th International Congress on Image and Signal Processing (CISP), 2013: 1046-1050. [76] Liu Z W, Feng Y, Chen H, et al. A fusion algorithm for infrared and visible based on guided filtering and phase congruency in NSST domain [J]. Optics and Lasers in Engineering, 2017, 97: 71-77. doi: 10.1016/j.optlaseng.2017.05.007 [77] Bavirisetti D P, Dhuli R. Two-scale image fusion of visible and infrared images using saliency detection [J]. Infrared Physics & Technology, 2016, 76: 52-64. [78] Gan W, Wu X H, Wu W, et al. Infrared and visible image fusion with the use of multi-scale edge-preserving decomposition and guided image filter [J]. Infrared Physics & Technology, 2015, 72: 37-51. [79] Cui G M, Feng H J, Xu Z H, et al. Detail preserved fusion of visible and infrared images using regional saliency extraction and multi-scale image decomposition [J]. Optics Communications, 2015, 341: 199-209. doi: 10.1016/j.optcom.2014.12.032 [80] Zhao J F, Zhou Q, Chen Y T, et al. 
Fusion of visible and infrared images using saliency analysis and detail preserving based image decomposition [J]. Infrared Physics & Technology, 2013, 56: 93-99. [81] Zhang B H, Lu X Q, Pei H Q, et al. A fusion algorithm for infrared and visible images based on saliency analysis and non-subsampled shearlet transform [J]. Infrared Physics & Technology, 2015, 73: 286-297. [82] Meng F, Song M, Guo B, et al. Image fusion based on object region detection and non-subsampled contourlet transform [J]. Computers & Electrical Engineering, 2017, 62: 375-383. [83] Cai J J, Cheng Q M, Peng M J, et al. Fusion of infrared and visible images based on nonsubsampled contourlet transform and sparse K-SVD dictionary learning [J]. Infrared Physics & Technology, 2017, 82: 85-95. [84] Yin M, Duan P H, Liu W, et al. A novel infrared and visible image fusion algorithm based on shift-invariant dual-tree complex shearlet transform and sparse representation [J]. Neurocomputing, 2017, 226: 182-191. doi: 10.1016/j.neucom.2016.11.051 [85] Chai Y, Li H F, Qu J F. Image fusion scheme using a novel dual-channel PCNN in lifting stationary wavelet domain [J]. Optics Communications, 2010, 283(19): 3591-3602. doi: 10.1016/j.optcom.2010.04.100 [86] Yang B, Li S T. Visual attention guided image fusion with sparse representation [J]. Optik, 2014, 125(17): 4881-4888. doi: 10.1016/j.ijleo.2014.04.036 [87] Liu C H, Qi Y, Ding W R. Infrared and visible image fusion method based on saliency detection in sparse domain [J]. Infrared Physics & Technology, 2017, 83: 94-102. [88] Kong W W. Technique for gray-scale visual light and infrared image fusion based on non-subsampled shearlet transform [J]. Infrared Physics & Technology, 2014, 63: 110-118. [89] Adu J H, Gan J H, Wang Y, et al. Image fusion based on nonsubsampled contourlet transform for infrared and visible light image [J]. Infrared Physics & Technology, 2013, 61: 94-100. [90] Liu Z, Tsukada K, Hanasaki K, et al. 
Image fusion by using steerable pyramid [J]. Pattern Recognition Letters, 2001, 22(9): 929-939. doi: 10.1016/S0167-8655(01)00047-2 [91] G Liu, Z L Jing, S Y Sun, et al. Image fusion based on expectation maximization algorithm and steerable pyramid [J]. Chinese Optics Letters, 2004, 2(7): 18-21. [92] Deng H, Ma Y. Image Fusion based on steerable pyramid and PCNN[C]//2009 Second International Conference on the Applications of Digital Information and Web Technologies, 2009: 569-573. [93] Zhan L, Zhuang Y, Huang L. Infrared and visible images fusion method based on discrete wavelet transform [J]. Journal of Computers, 2017, 28(2): 057-071. [94] Saeedi J, Faez K. Infrared and visible image fusion using fuzzy logic and population-based optimization [J]. Applied Soft Computing, 2012, 12(3): 1041-1054. doi: 10.1016/j.asoc.2011.11.020 [95] Chang L H, Feng X C, Zhang R, et al. Image decomposition fusion method based on sparse representation and neural network [J]. Applied Optics, 2017, 56(28): 7969-7977. doi: 10.1364/AO.56.007969 [96] Omri F, Foufou S, Abidi M. NIR and visible image fusion for improving face recognition at long distance[C]//International Conference on Image and Signal Processing, 2014: 549-557. [97] Singh S, Gyaourova A, Bebis G, et al. Infrared and visible image fusion for face recognition[C]//Biometric Technology for Human Identification, International Society for Optics and Photonics, 2004: 585-596. [98] Heo J, Kong S G, Abidi B R, et al. Fusion of visual and thermal signatures with eyeglass removal for robust face recognition[C]//2004 Conference on Computer Vision and Pattern Recognition Workshop, 2004: 122-122. [99] Abaza A, Bourlai T. On ear-based human identification in the mid-wave infrared spectrum [J]. Image Vision Computing, 2013, 31(9): 640-648. doi: 10.1016/j.imavis.2013.06.001 [100] Uzair M, Mahmood A, Mian A, et al. Periocular region-based person identification in the visible, infrared and hyperspectral Imagery [J]. 
Neurocomputing, 2015, 149: 854-867. doi: 10.1016/j.neucom.2014.07.049 [101] Han J G, Pauwels E J, de Zeeuw P. Fast saliency-aware multi-modality image fusion [J]. Neurocomputing, 2013, 111: 70-80. doi: 10.1016/j.neucom.2012.12.015 [102] Schnelle S R, Chan A L. Enhanced target tracking through infrared-visible image fusion[C]//14th International Conference on Information Fusion, 2011: 1-8. [103] Jin X, Jiang Q, Yao S W, et al. A survey of infrared and visual image fusion methods [J]. Infrared Physics & Technology, 2017, 85: 478-501. [104] Toet A. Natural colour mapping for multiband nightvision imagery [J]. Information Fusion, 2003, 4(3): 155-166. doi: 10.1016/S1566-2535(03)00038-1 [105] Toet A, Hogervorst M A. Progress in color night vision [J]. Optical Engineering, 2012, 51(1): 010901. doi: 10.1117/1.OE.51.1.010901 [106] Davis J W, Sharma V. Background-subtraction using contour-based fusion of thermal and visible imagery [J]. Computer Vision and Image Understanding, 2007, 107(2-3): 162-182. [107] Niu Y F, Xu S T, Wu L Z, et al. Airborne infrared and visible image fusion for target perception based on target region segmentation and discrete wavelet transform [J]. Mathematical Problems in Engineering, 2012, 10: 732-748. [108] Bhatnagar G, Liu Z. A novel image fusion framework for night-vision navigation and surveillance [J]. Signal Image and Video Processing, 2015, 9: 165-175. doi: 10.1007/s11760-014-0740-6 [109] Paramanandham N, Rajendiran K. Multi sensor image fusion for surveillance applications using hybrid image fusion algorithm [J]. Multimedia Tools and Applications, 2018, 77(10): 12405-12436. doi: 10.1007/s11042-017-4895-3 [110] Tsagaris V, Anastassopoulos V. Fusion of visible and infrared imagery for night color vision [J]. Displays, 2005, 26(4): 191-196. [111] Hogervorst M A, Toet A. Fast natural color mapping for night-time imagery [J]. Information Fusion, 2010, 11(2): 69-77. doi: 10.1016/j.inffus.2009.06.005 [112] Mendoza F, Lu R F, Cen H Y. 
Comparison and fusion of four nondestructive sensors for predicting apple fruit firmness and soluble solids content [J]. Postharvest Biology and Technology, 2012, 73: 89-98. doi: 10.1016/j.postharvbio.2012.05.012 [113] Hanna B V, Gorbach A M, Gage F A, et al. Intraoperative assessment of critical biliary structures with visible range/infrared image fusion [J]. Journal of the American College of Surgeons, 2008, 206(6): 1227-1231. doi: 10.1016/j.jamcollsurg.2007.10.012 [114] Eslami M, Mohammadzadeh A. Developing a spectral-based strategy for urban object detection from airborne hyperspectral TIR and visible data [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016, 9(5): 1808-1816. doi: 10.1109/JSTARS.2015.2489838 [115] Han L, Wulie B, Yang Y L, et al. Direct fusion of geostationary meteorological satellite visible and infrared images based on thermal physical properties [J]. Sensors, 2015, 15(1): 703-714. doi: 10.3390/s150100703 [116] Li H G, Ding W R, Cao X B, et al. Image registration and fusion of visible and infrared integrated camera for medium-altitude unmanned aerial vehicle remote sensing [J]. Remote Sensing, 2017, 9(5): 441-469. doi: 10.3390/rs9050441 [117] Chang X, Jiao L C, Liu F, et al. Multicontourlet-based adaptive fusion of infrared and visible remote sensing images [J]. IEEE Geoscience and Remote Sensing Letters, 2010, 7(3): 549-553. doi: 10.1109/LGRS.2010.2041323 [118] Lu X C, Zhang J P, Li T, et al. Synergetic classification of long-wave infrared hyperspectral and visible images [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2015, 8(7): 3546-3557. doi: 10.1109/JSTARS.2015.2442594 [119] Gargano M, Bertani D, Greco M, et al. A perceptual approach to the fusion of visible and NIR images in the examination of ancient documents [J]. Journal of Cultural Heritage, 2015, 16(4): 518-525. doi: 10.1016/j.culher.2014.09.006 [120] Kim S J, Deng F, Brown M S. 
Visual enhancement of old documents with hyperspectral imaging [J]. Pattern Recognition, 2011, 44(7): 1461-1469. doi: 10.1016/j.patcog.2010.12.019 [121] Feng Z J, Zhang X L, Yuan L Y, et al. Infrared target detection and location for visual surveillance using fusion scheme of visible and infrared images [J]. Mathematical Problems in Engineering, 2013, 2013(3): 831-842. [122] Zhao C, Guo Y, Wang Y. A fast fusion scheme for infrared and visible light images in NSCT domain [J]. Infrared Physics & Technology, 2015, 72: 266-275. [123] Zhang X L, Li X F, Li J. Validation and correlation analysis of metrics for evaluation performance of image fusion [J]. Acta Automatica Sinica, 2014, 40(2): 306-315. (in Chinese) [124] Aardt V, J an. Assessment of image fusion procedures using entropy, image quality, and multispectral classification [J]. Journal of Applied Remote Sensing, 2008, 2(1): 1-28. [125] Yun J, R ao. In-fibre Bragg grating sensors [J]. Measurement Science Technology, 1997, 8(4): 355. doi: 10.1088/0957-0233/8/4/002 [126] Zhu X X, Bamler R. A sparse image fusion algorithm with application to pan-sharpening [J]. IEEE Transactions on Geoscience Remote Sensing, 2012, 51(5): 2827-2836. [127] Xydeas C S, V. P V. Objective image fusion performance measure [J]. Military Technical Courier, 2000, 56(4): 181-193. [128] Deshmukh M, Bhosale U. Image fusion and image quality assessment of fused images [J]. International Journal of Image Processing, 2010, 4(5): 484-508. [129] Wang Z, Bovik A C, Sheikh H R, et al. Image quality assessment: From error visibility to structural similarity [J]. IEEE Trans Image Process, 2004, 13(4): 600-612. doi: 10.1109/TIP.2003.819861 [130] Li S, Yang B, Hu J. Performance comparison of different multi-resolution transforms for image fusion [J]. Information Fusion, 2011, 12(2): 74-84. doi: 10.1016/j.inffus.2010.03.002 [131] Qu X B, Yan J W, Xiao H Z, et al. 
Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain [J]. Acta Automatica Sinica, 2008, 34(12): 1508-1514. doi: 10.1016/S1874-1029(08)60174-3