An Automatic Assessment Model Based on N-gram Pruning Technique for Hidden Danger Text

DOI: 10.12677/me.2024.123047, PP. 388-394

Keywords: Semantic Analysis, Drilling Platforms, N-gram, Bag-of-Words Vector, Hazard Quantification

Abstract:

To automatically analyze the hazard response level information contained in hidden danger texts from offshore drilling platforms and to quantify hazard severity, a quantitative evaluation model for hazard response levels based on N-gram bag-of-words vectors is proposed. First, 1565 on-site hidden danger records from drilling platforms are segmented into words and filtered. Second, N-grams are used as feature units to reshape the bag-of-words dimensions. Then, an inverse TF-IDF value is proposed to strengthen the feature values. Finally, naive Bayes is used to build the hazard quantification model. The results show that the hazard quantification evaluation model built with this method achieves high precision, recall, and F1 score.
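The pipeline outlined in the abstract (N-gram features, a pruned bag-of-words vocabulary, a naive Bayes classifier, and evaluation by precision, recall, and F1) can be illustrated with a minimal sketch. This is not the authors' implementation. The class name `NgramNaiveBayes`, the `min_count` frequency cutoff used as a stand-in for pruning, the Laplace smoothing parameter `alpha`, and the toy English records and labels are all assumptions made for illustration. The paper's inverse TF-IDF weighting is not reproduced because the abstract does not define it, so raw N-gram counts are used instead, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n_max=2):
    """Return all contiguous n-grams of length 1..n_max as space-joined strings."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over raw n-gram counts with Laplace smoothing."""

    def __init__(self, n_max=2, min_count=1, alpha=1.0):
        self.n_max = n_max
        self.min_count = min_count  # assumed pruning rule: drop n-grams seen fewer times
        self.alpha = alpha          # assumed Laplace smoothing strength

    def fit(self, docs, labels):
        totals = Counter()
        for doc in docs:
            totals.update(ngrams(doc, self.n_max))
        # Keep only n-grams that occur at least min_count times in the corpus.
        self.vocab = {g for g, c in totals.items() if c >= self.min_count}
        self.class_docs = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feature_counts[label].update(
                g for g in ngrams(doc, self.n_max) if g in self.vocab)
        self.class_totals = {c: sum(self.feature_counts[c].values())
                             for c in self.class_docs}
        return self

    def predict(self, doc):
        feats = [g for g in ngrams(doc, self.n_max) if g in self.vocab]
        n_docs = sum(self.class_docs.values())
        best, best_score = None, -math.inf
        for c in sorted(self.class_docs):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_docs[c] / n_docs)
            denom = self.class_totals[c] + self.alpha * len(self.vocab)
            for g in feats:
                score += math.log((self.feature_counts[c][g] + self.alpha) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy records (already tokenised) with invented response levels.
train_docs = [
    ["gas", "leak", "near", "wellhead"],
    ["gas", "leak", "alarm", "failed"],
    ["handrail", "paint", "peeling"],
    ["paint", "peeling", "on", "deck"],
]
train_labels = ["high", "high", "low", "low"]

model = NgramNaiveBayes(n_max=2, min_count=1).fit(train_docs, train_labels)
print(model.predict(["gas", "leak"]))
print(model.predict(["paint", "peeling"]))
```

Real hidden danger records are Chinese text and would need a word segmenter before this step. Raising `min_count` shrinks the vocabulary, which is one simple way to control the bag-of-words dimension.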
