Optical fiber network anomaly analysis algorithm based on Bayesian partition data mining

  • Abstract: Fast and accurate identification of abnormal information in optical fiber network communication is key to keeping communication stable, and with the surge in optical fiber network communication data it has become a research hotspot in recent years. Considering the mutual constraint between the accuracy and the convergence speed of abnormal-information identification algorithms, this paper proposes an abnormal-information identification algorithm based on Bayesian partition data mining. First, Bayesian quantification is used to classify the features of the data samples, and the prior probability is corrected through maximization analysis. Second, mining feature parameters and probabilistic coefficients are set according to the different types of abnormal information. Finally, targeted data mining is performed on the sample data separately for each Bayesian partition. The experiment uses communication state data from an optical fiber local area network as samples, compares the identification results of the proposed algorithm with those of an artificial neural network algorithm and a genetic algorithm, and calculates the identification accuracy, convergence speed, and stability of the three algorithms. The proposed algorithm achieves a mean identification accuracy of 93.83%, with no significant decrease as the data volume grows. Its convergence speed is close to that of the genetic algorithm, with a mean of 3.25 s. The mean missed detection rate and false detection rate are 0.10% and 0.54%, respectively. The results show that the algorithm improves both identification accuracy and convergence speed, is stable, and allows fine-tuning between the missed detection rate and the false detection rate through parameter control, giving it good application value.
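The three steps summarized in the abstract (Bayesian classification with prior correction, per-type parameters, and partition-wise mining) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything concrete here is an assumption made for illustration: the single scalar feature, the two Gaussian classes `low` and `high`, the fixed class parameters in `params`, the EM-style prior update standing in for the "maximization analysis", and the per-partition threshold `k` standing in for the mining parameters. The sample values are made up and are not the paper's data.

```python
import math
from collections import defaultdict


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, priors, params):
    """Bayes' rule: P(class | x) is proportional to P(x | class) * P(class)."""
    joint = {c: priors[c] * gaussian_pdf(x, *params[c]) for c in priors}
    total = sum(joint.values())
    return {c: v / total for c, v in joint.items()}


def refine_priors(samples, priors, params, iterations=20):
    """Assumed prior correction: set each prior to the mean posterior
    over the data (EM-style update with class densities held fixed)."""
    for _ in range(iterations):
        sums = defaultdict(float)
        for x in samples:
            for c, p in posteriors(x, priors, params).items():
                sums[c] += p
        priors = {c: sums[c] / len(samples) for c in priors}
    return priors


def partition(samples, priors, params):
    """Assign each sample index to the class with the highest posterior."""
    parts = defaultdict(list)
    for i, x in enumerate(samples):
        post = posteriors(x, priors, params)
        parts[max(post, key=post.get)].append(i)
    return parts


def detect(samples, parts, params, k):
    """Partition-specific rule (hypothetical): flag a sample lying more
    than k[c] standard deviations from the mean of its partition c."""
    flagged = set()
    for c, indices in parts.items():
        mean, std = params[c]
        for i in indices:
            if abs(samples[i] - mean) > k[c] * std:
                flagged.add(i)
    return flagged


def rates(flagged, anomalies, n_samples):
    """Return (missed detection rate, false detection rate)."""
    missed = len(anomalies - flagged) / len(anomalies)
    false = len(flagged - anomalies) / (n_samples - len(anomalies))
    return missed, false


# Made-up feature values; indices 10 and 11 are the labeled anomalies.
samples = [9.0, 9.5, 10.0, 10.5, 11.0, 10.2, 9.8, 19.0, 20.0, 21.0, 14.0, 25.0]
anomalies = {10, 11}
params = {"low": (10.0, 1.0), "high": (20.0, 1.0)}  # (mean, std), assumed known

priors = refine_priors(samples, {"low": 0.5, "high": 0.5}, params)
parts = partition(samples, priors, params)

for k_value in (0.9, 3.0, 4.5):
    k = {"low": k_value, "high": k_value}
    flagged = detect(samples, parts, params, k)
    print(k_value, rates(flagged, anomalies, len(samples)))
```

In this toy setting, lowering `k` flags more normal samples and raising it lets anomalies through, which mirrors the abstract's remark that a parameter can trade the missed detection rate against the false detection rate.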

     
