International Journal of Intelligence Science 2019

Quantum Multiple Q-Learning

DOI: 10.4236/ijis.2019.91001, PP. 1-22

Michael Ganger, Wei Hu

Keywords: Quantum Computing, Reinforcement Learning, Q-Learning

Abstract:

In this paper, a collection of value-based quantum reinforcement learning algorithms is introduced, and their parameters are explored. These algorithms use Grover’s algorithm to update the policy, which is stored as a superposition of qubits associated with each possible action. The algorithms fall into two classes: one that uses value functions (V(s)) and a new class that uses action-value functions (Q(s,a)). The new Q(s,a)-based quantum algorithms are found to converge faster than the V(s)-based algorithms, and in general the quantum algorithms are found to converge in fewer iterations than their classical counterparts, netting larger returns during training. This is due to the fact that the Q(s,a) algorithms are more precise than those based on
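As an illustration of the policy update described in the abstract, the following is a minimal classical simulation of a Grover iteration acting on the amplitudes of a set of actions. It is a sketch under stated assumptions, not the authors' implementation: the function names `grover_iteration` and `action_probabilities` are hypothetical, the state is assumed to start in a uniform superposition with real amplitudes, and the number of iterations is left as a free parameter because the abstract does not say how it is tied to the reward or value estimate.

```python
import math


def grover_iteration(amplitudes, target):
    """Apply one Grover iteration to a list of real amplitudes.

    Step 1 (oracle): flip the sign of the amplitude of the target action.
    Step 2 (diffusion): reflect every amplitude about the mean, which is
    equivalent to reflecting about the uniform superposition.
    """
    flipped = [-a if i == target else a for i, a in enumerate(amplitudes)]
    mean = sum(flipped) / len(flipped)
    return [2.0 * mean - a for a in flipped]


def action_probabilities(n_actions, target, iterations):
    """Return measurement probabilities after amplifying one action.

    Assumption: the policy starts as a uniform superposition over actions.
    """
    amps = [1.0 / math.sqrt(n_actions)] * n_actions
    for _ in range(iterations):
        amps = grover_iteration(amps, target)
    return [a * a for a in amps]


if __name__ == "__main__":
    # Show how the probability of action 2 (out of 4) changes with the
    # number of Grover iterations.
    for k in range(3):
        probs = action_probabilities(4, 2, k)
        print(k, [round(p, 3) for p in probs])
```

With four actions, a single iteration drives the probability of the amplified action to 1, while a second iteration rotates the state past the target and returns every action to probability 0.25. The iteration count therefore has to be chosen carefully, since applying more iterations does not monotonically increase the probability of the preferred action.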
