OALib Journal
ISSN: 2333-9721

Movie Box Office Prediction Based on Pre-Training and Fine-Tuning

DOI: 10.12677/MOS.2024.131034, PP. 358-364

Keywords: Movie Box Office Prediction, Pre-Training, Fine-Tuning, Ensemble Learning Model


Abstract:

Supervised learning models depend heavily on the amount of training data, but existing movie box office datasets are small, which lowers prediction accuracy. To address this problem, a movie box office prediction model based on a pre-training and fine-tuning strategy is proposed. Exploiting the correlation between movie ratings and movie box office, the model is pre-trained on a movie rating dataset so that it acquires prior knowledge about movies in advance; data augmentation is also performed using the attribute differences between movies; finally, the model is fine-tuned on the movie box office dataset to predict box office revenue. Experimental results show that the proposed method improves the R2 metric by 7% and reduces the MSE by 69%.
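The pre-train-then-fine-tune strategy described in the abstract can be sketched in a few lines. The following is a minimal illustration on synthetic data with a plain linear model, not the paper's implementation: the feature dimensions, learning rates, and warm-start mechanism are assumptions, and the attribute-difference data augmentation step is omitted. It only demonstrates the core idea that a model warm-started on a related, data-rich task (ratings) fits the scarce target task (box office) better than one trained from scratch with the same budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear(X, y, w0=None, lr=0.01, epochs=500):
    """Plain gradient-descent linear regression, optionally warm-started from w0."""
    n, d = X.shape
    w = np.zeros(d) if w0 is None else w0.copy()
    for _ in range(epochs):
        w -= lr * (X.T @ (X @ w - y) / n)
    return w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Synthetic stand-ins: both tasks depend on the same underlying movie features,
# the analogue of the rating/box-office correlation the abstract relies on.
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
X_ratings = rng.normal(size=(2000, 5))          # plentiful rating data
y_ratings = X_ratings @ true_w + rng.normal(scale=0.1, size=2000)
X_box = rng.normal(size=(40, 5))                # scarce box-office data
y_box = X_box @ true_w + rng.normal(scale=0.1, size=40)

# Step 1: pre-train on ratings to acquire prior knowledge about movies.
w_pre = fit_linear(X_ratings, y_ratings)

# Step 2: fine-tune on the small box-office set, warm-starting from w_pre.
w_fine = fit_linear(X_box, y_box, w0=w_pre, epochs=100)

# Baseline: train from scratch on the small set with the same budget.
w_scratch = fit_linear(X_box, y_box, epochs=100)

mse_fine = mse(w_fine, X_box, y_box)
mse_scratch = mse(w_scratch, X_box, y_box)
```

Under these assumptions the warm-started model reaches a much lower error on the small dataset than the from-scratch baseline, which is the effect the pre-training step is meant to provide.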

