Pure Mathematics 2024

Research on Class Imbalance in Encrypted Traffic Datasets

DOI: 10.12677/PM.2024.141004, PP. 23-33

Wang Xiao (王晓)

Keywords: Encrypted Traffic Classification, Balanced Dataset, Deep Learning, Generative Adversarial Networks


Abstract:

In recent years, with the rapid development of deep learning, researchers in network security have begun to apply it to encrypted traffic classification. However, the publicly available encrypted traffic datasets suffer from severe class imbalance, which degrades the performance of deep learning classification methods, and building a complete encrypted traffic dataset from scratch is both time-consuming and expensive. To address this problem, this paper proposes an encrypted traffic generation model based on an improved generative adversarial network (GAN). The model adds the statistical features of packets and the class labels of network flows to the GAN as conditional constraints, so that it generates realistic traffic data that can be used to expand the dataset. Experimental results show that, when using a dataset augmented with this method, a deep learning-based encrypted traffic classifier performs better than when the dataset is augmented with random oversampling (ROS), the synthetic minority oversampling technique (SMOTE), or a conventional GAN.
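The conditioning idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's model: it only shows one common way to condition a generator, namely concatenating a noise vector with a one-hot class label and a packet-statistics vector, plus a helper that counts how many synthetic flows each class would need to match the largest class. The function names, the choice of statistics (mean, minimum, and maximum packet length), the class names, and the concatenation scheme are all assumptions made for illustration; the abstract does not specify them.

```python
import random
from collections import Counter


def one_hot(label, classes):
    """Encode a class label as a one-hot list over the given class order."""
    return [1.0 if c == label else 0.0 for c in classes]


def packet_statistics(packet_lengths):
    """Hypothetical statistics vector: mean, min, and max packet length.

    The abstract only says "statistical features of packets"; these three
    values are an assumption made for this sketch.
    """
    mean = sum(packet_lengths) / len(packet_lengths)
    return [mean, float(min(packet_lengths)), float(max(packet_lengths))]


def generator_input(noise_dim, label, classes, stats, rng):
    """Build a conditioned generator input: noise + one-hot label + statistics.

    Concatenation is the conditioning scheme assumed here; the paper may
    inject the conditions differently.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + one_hot(label, classes) + stats


def samples_needed(labels):
    """Count how many synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Toy imbalanced label set; the class names are hypothetical.
classes = ["chat", "voip", "p2p"]
labels = ["chat"] * 5 + ["voip"] * 2 + ["p2p"] * 1

deficit = samples_needed(labels)  # {'chat': 0, 'voip': 3, 'p2p': 4}
stats = packet_statistics([60, 1500, 400])
example = generator_input(8, "voip", classes, stats, random.Random(0))
```

In a full pipeline, a trained generator would consume one such vector per synthetic flow, and the per-class deficit would decide how many flows to generate for each minority class. ROS and SMOTE, by contrast, balance the dataset by duplicating or interpolating existing samples rather than by learning a generator.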

