Parallel Inference for Real-Time Machine Learning Applications

DOI: 10.4236/jcc.2024.121010, PP. 139-146

Keywords: Machine Learning Models, Computational Efficiency, Parallel Computing Systems, Random Forest Inference, Hyperparameter Tuning, Python Frameworks (TensorFlow, PyTorch, Scikit-Learn), High-Performance Computing


Abstract:

Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation with standard sequential methods. This work analyzes the performance gains of parallel over sequential hyperparameter optimization. Using scikit-learn's RandomizedSearchCV, this project tuned a Random Forest classifier for fake news detection via randomized search over the hyperparameter space. Setting n_jobs to -1 enabled full parallelization across CPU cores. Results show the parallel implementation achieved over 5× faster CPU times and 3× faster total run times than sequential tuning. However, test accuracy dropped slightly, from 99.26% with sequential tuning to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the significant computational gains allow more extensive hyperparameter exploration within reasonable time frames, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
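
A minimal sketch of the comparison described above, assuming the fake news features and labels (X_train, y_train) have already been prepared (e.g. with a TF-IDF vectorizer); the parameter ranges and iteration count are illustrative assumptions, not the authors' exact configuration:

import time

from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Illustrative search space; the paper does not publish its exact ranges.
param_distributions = {
    "n_estimators": randint(100, 500),
    "max_depth": randint(5, 50),
    "min_samples_split": randint(2, 11),
}

def tune(n_jobs):
    """Run the randomized search with the given degree of parallelism."""
    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=42),
        param_distributions=param_distributions,
        n_iter=20,
        cv=5,
        n_jobs=n_jobs,  # 1 = sequential baseline, -1 = all CPU cores
        random_state=42,
    )
    start = time.perf_counter()
    search.fit(X_train, y_train)  # X_train, y_train assumed prepared
    return search.best_estimator_, time.perf_counter() - start

# Same search twice; only the degree of parallelism changes.
# (Wall-clock time shown here; the paper also reports CPU time.)
model_seq, t_seq = tune(n_jobs=1)
model_par, t_par = tune(n_jobs=-1)
print(f"sequential: {t_seq:.1f}s, parallel: {t_par:.1f}s")

The only difference between the two runs is n_jobs, which scikit-learn passes to joblib to fan the cross-validation fits out across the available CPU cores.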

