
Lifelog Classification Method Based on Text Theme and Geographic Location

DOI: 10.12677/CSA.2024.142048, PP. 480-488

Keywords: Lifelog, Deep Learning, Text Classification


Abstract:

Since 2011, we have been systematically collecting personal lifelog data through a purpose-built app. To date, 22 volunteers have participated in the project, contributing more than 40,000 valid lifelog entries. Classifying this rich but unstructured data to give people clearer, more organized insights into their lives is a worthwhile task. This paper proposes a lifelog text classification model, DTC-TextCNN. It introduces the LDA topic model to extract topic features from the text logs, and applies the DBSCAN algorithm to cluster the geographic locations at which posts were made, yielding distinct location feature clusters. The extracted topic features and location features are concatenated with the post text and fed into a TextCNN model for classification. Experimental results show that incorporating geographic location as a feature helps the model better understand the setting and environment in which a text was produced, providing richer contextual information. The classification method that fuses geographic and topic features compensates for the semantic ambiguity and missing global semantics of lifelog texts and improves comprehension of text content. Tests on the Liu Lifelog dataset show that the model improves the accuracy of lifelog classification.
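The feature-fusion pipeline described above (LDA topic features plus DBSCAN location clusters, concatenated before classification) can be sketched as follows. This is a minimal illustration using scikit-learn; the toy posts, coordinates, component counts, and DBSCAN parameters are assumptions for demonstration, not the authors' actual configuration, and the final TextCNN stage is omitted.

```python
# Sketch of the DTC-TextCNN feature pipeline from the abstract:
# 1) LDA topic features from post text, 2) DBSCAN clusters over the
# (lat, lon) recorded when each post was made, 3) concatenation of both
# feature sets (which would then accompany the text into TextCNN).
# All data and hyperparameters below are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

posts = [
    "morning run in the park",
    "coffee and reading at the cafe",
    "evening run along the river",
    "studying at the library",
]
# Toy (lat, lon) pairs: the first three posts come from nearby places.
coords = np.array([
    [39.900, 116.400],
    [39.901, 116.401],
    [39.900, 116.399],
    [40.100, 116.600],
])

# 1) Topic features: LDA over a bag-of-words representation of the posts.
bow = CountVectorizer().fit_transform(posts)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_feats = lda.fit_transform(bow)            # shape: (n_posts, n_topics)

# 2) Location feature clusters: DBSCAN labels nearby points with the same
# cluster id and marks outliers as -1 (eps/min_samples are illustrative).
labels = DBSCAN(eps=0.01, min_samples=2).fit_predict(coords)

# 3) One-hot encode the cluster ids and concatenate with the topic features.
cluster_ids = sorted(set(labels))
loc_feats = np.array([[1.0 if lab == c else 0.0 for c in cluster_ids]
                      for lab in labels])
features = np.hstack([topic_feats, loc_feats])  # fused feature vector per post
print(features.shape)
```

In this sketch the fourth post falls outside the dense region and is labeled as DBSCAN noise (-1), which still becomes its own one-hot column, so every post gets a location feature alongside its topic distribution.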

