Next Words Prediction and Sentence Completion in Bangla Language Using GRU-Based RNN on N-Gram Language Model

DOI: 10.4236/jdaip.2023.114020, PP. 388-399

Keywords: Bangla Language, Words Prediction, Sentence Completion, GRU, RNN, Corpus, N-Gram

Abstract:

We use many devices in daily life to communicate with others. People exchange information through email, Facebook, Twitter, and many other social networking sites. They lose valuable time to misspelling and retyping, and some are reluctant to type long sentences because they run into unnecessary words or grammatical issues. Word prediction systems therefore help people exchange textual information more quickly, easily, and comfortably. These systems predict the most probable next words and let the user choose the needed word from the suggestions, which helps the writer complete the sentence correctly. This research aims to predict the most suitable next word to complete a sentence for any given context, and it focuses on the Bangla language. We present a process that predicts the most probable and appropriate next words and suggests a complete sentence using the predicted words. The proposed model is a GRU-based RNN trained on an N-gram dataset. We collected a large dataset from multiple Bangla-language sources and compared the proposed model with other approaches, such as LSTM and Naive Bayes. The proposed approach achieves higher accuracy than the others: on average, the unigram model reaches 88.22%, the bi-gram model 99.24%, the tri-gram model 97.69%, and the 4-gram and 5-gram models 99.43% and 99.78%, respectively. We believe the proposed method could have a profound impact on Bangla search engines.
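
To make the idea of an N-gram dataset concrete, the following is a minimal sketch, not the authors' implementation. It turns tokenized sentences into (context, next word) pairs and ranks candidate next words by raw frequency. The frequency lookup is a simple stand-in for the GRU described in the abstract, and the function names, the toy English corpus, and the choice of `n` as the number of context words are all assumptions made for illustration; the abstract does not state how the paper defines the context length for each N-gram model.

```python
from collections import Counter, defaultdict


def ngram_pairs(tokens, n):
    """Yield (context, next_word) pairs; each context holds n words."""
    for i in range(len(tokens) - n):
        yield tuple(tokens[i:i + n]), tokens[i + n]


def build_counts(sentences, n):
    """Count how often each word follows each n-word context."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()  # whitespace tokenization (assumption)
        for context, next_word in ngram_pairs(tokens, n):
            counts[context][next_word] += 1
    return counts


def suggest(counts, context, k=3):
    """Return up to k next-word candidates, most frequent first."""
    followers = counts.get(tuple(context), Counter())
    return [word for word, _ in followers.most_common(k)]


# Hypothetical toy corpus; a real system would use Bangla text.
corpus = ["a b c d", "a b c e", "a b c d"]
counts = build_counts(corpus, n=2)
top = suggest(counts, ["b", "c"], k=2)
```

In a neural setup such as the one the abstract describes, the same (context, next word) pairs would typically serve as training examples, with the recurrent model replacing the frequency table.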
