Distributional Reinforcement Learning with Quantum Neural Networks

DOI: 10.4236/ica.2019.102004, PP. 63-78

Keywords: Continuous-Variable Quantum Computers, Quantum Reinforcement Learning, Distributional Reinforcement Learning, Quantile Regression, Distributional Q Learning, Grid World Environment, MDP Chain Environment

Abstract:

Traditional reinforcement learning (RL) trains an agent to learn an optimal policy using the expected return, that is, the expected value of the cumulative random rewards. However, recent research indicates that learning the distribution over returns has distinct advantages over learning only their expected value, as seen in several RL tasks. The shift from the expectation of returns in traditional RL to the distribution over returns in distributional RL has provided new insights into the dynamics of RL. This paper builds on our recent work on quantum approaches to RL. Our work implements quantile regression (QR) distributional Q learning with a quantum neural network. This quantum network is evaluated in a grid world environment with different numbers of quantiles, illustrating how the number of quantiles influences the learning of the algorithm. It is also compared to standard quantum Q learning in a Markov Decision Process (MDP) chain, which demonstrates that quantum QR distributional Q learning can explore the environment more efficiently than standard quantum Q learning. Efficient exploration and the balance between exploitation and exploration are major challenges in RL. Previous work has shown that more informative actions can be taken with a distributional perspective. Our findings suggest another cause for its success: the enhanced performance of distributional RL can be partially attributed to its superior ability to explore the environment efficiently.
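
To make the quantile regression idea concrete, the following is a minimal classical sketch in Python with NumPy. It is not the quantum neural network used in the paper: it only illustrates how a return distribution for one state-action pair can be represented by N quantile estimates and updated with the tabular quantile regression rule. The function names (`quantile_midpoints`, `pinball_loss`, `qr_update`) and the learning rate are assumptions chosen for illustration.

```python
import numpy as np

def quantile_midpoints(n):
    """Quantile levels tau_i = (2i + 1) / (2n) for i = 0..n-1 (illustrative choice)."""
    return (2 * np.arange(n) + 1) / (2.0 * n)

def pinball_loss(theta, targets, taus):
    """Mean quantile regression (pinball) loss of estimates theta against target samples."""
    u = targets[None, :] - theta[:, None]            # u[i, j] = target_j - theta_i
    weight = np.where(u < 0, 1.0 - taus[:, None], taus[:, None])
    return float(np.mean(weight * np.abs(u)))

def qr_update(theta, targets, taus, lr):
    """One tabular quantile regression step toward the target samples.

    Each estimate theta_i moves up by lr * tau_i for every target at or above it
    and down by lr * (1 - tau_i) for every target below it, averaged over targets.
    """
    below = (targets[None, :] < theta[:, None]).astype(float)
    step = taus[:, None] - below
    return theta + lr * step.mean(axis=1)

def q_value(theta):
    """The usual action value is recovered as the mean of the quantile estimates."""
    return float(np.mean(theta))
```

In a Q learning setting, `targets` would hold the samples r + γθ'_j built from the quantile estimates of the greedy next action, and the agent would act greedily with respect to `q_value`. Repeating `qr_update` drives each θ_i toward the τ_i quantile of the target distribution, which is what lowers `pinball_loss`.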
