全部 标题 作者
关键词 摘要

OALib Journal期刊
ISSN: 2333-9721
费用:99美元

查看量下载量

相关文章

更多...

Reinforcement Learning with Deep Quantum Neural Networks

DOI: 10.4236/jqis.2019.91001, PP. 1-14

Keywords: Continuous-Variable Quantum Computers, Quantum Machine Learning, Quantum Reinforcement Learning, Deep Learning, Q Learning, Actor-Critic, Grid World Environment

Full-Text   Cite this paper   Add to My Lib

Abstract:

The advantage of quantum computers over classical computers fuels the recent trend of developing machine learning algorithms on quantum computers, which can potentially lead to breakthroughs and new learning models in this area. The aim of our study is to explore deep quantum reinforcement learning (RL) on photonic quantum computers, which can process information stored in the quantum states of light. These quantum computers can naturally represent continuous variables, making them an ideal platform to create quantum versions of neural networks. Using quantum photonic circuits, we implement Q learning and actor-critic algorithms with multilayer quantum neural networks and test them in the grid world environment. Our experiments show that 1) these quantum algorithms can solve the RL problem and 2) compared to one layer, using three layer quantum networks improves the learning of both algorithms in terms of rewards collected. In summary, our findings suggest that having more layers in deep quantum RL can enhance the learning outcome.

References

[1]  Hu, W. (2018) Towards a Real Quantum Neuron. Natural Science, 10, 99-109.
https://doi.org/10.4236/ns.2018.103011
[2]  Sutton, R.S. and Barto, A.G. (2018) Reinforcement Learning. An Introduction. 2nd Edition, A Bradford Book, Cambridge.
[3]  Ganger, M., Duryea, E. and Hu, W. (2016) Double Sarsa and Double Expected Sarsa with Shallow and Deep Learning. Journal of Data Analysis and Information Processing, 4, 159.
https://doi.org/10.4236/jdaip.2016.44014
[4]  Duryea, E., Ganger, M. and Hu, W. (2016) Exploring Deep Reinforcement Learning with Multi Q-Learning. Intelligent Control and Automation, 7, 129.
[5]  Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S. and Hassabis, D. (2015) Human-Level Control through Deep Reinforcement Learning. Nature, 518, 529-533.
https://doi.org/10.1038/nature14236
[6]  Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T. and Hassabis, D. (2016) Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature, 529, 484-489.
https://doi.org/10.1038/nature16961
[7]  Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre, L., van den Driessche, G., Graepel, T. and Hassabis, D. (2017) Mastering the Game of Go without Human Knowledge. Nature, 550, 354-359.
https://doi.org/10.1038/nature24270
[8]  Peruzzo, A., McClean, J., Shadbolt, P., Yung, M.-H., Zhou, X.-Q., Love, P.J., Aspuru-Guzik, A. and O’Brien, J.L. (2014) A Variational Eigenvalue Solver on a Photonic Quantum Processor. Nature Communications, 5, Article No. 4213.
https://doi.org/10.1038/ncomms5213
[9]  Clausen, J. and Briegel, H.J. (2018) Quantum Machine Learning with Glow for Episodic Tasks and Decision Games. Physical Review A, 97, Article ID: 022303.
https://doi.org/10.1103/PhysRevA.97.022303
[10]  Crawford, D., Levit, A., Ghadermarzy, N., Oberoi, J.S. and Ronagh, P. (2016) Reinforcement Learning Using Quantum Boltzmann Machines.
[11]  Dunjko, V., Taylor, J.M. and Briegel, H.J. (2018) Advances in Quantum Reinforcement Learning.
[12]  Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N. and Lloyd, S. (2017) Quantum Machine Learning. Nature, 549, 195-202.
https://doi.org/10.1038/nature23474
[13]  Dunjko, V. and Briegel, H.J. (2018) Machine Learning and Artificial Intelligence in the Quantum Domain: A Review of Recent Progress. Reports on Progress in Physics, 81, Article ID: 074001.
https://doi.org/10.1088/1361-6633/aab406
[14]  Hu, W. (2018) Empirical Analysis of a Quantum Classifier Implemented on IBM’s 5Q Quantum Computer. Journal of Quantum Information Science, 8, 1-11.
https://doi.org/10.4236/jqis.2018.81001
[15]  Hu, W. (2018) Empirical Analysis of Decision Making of an AI Agent on IBM’s 5Q Quantum Computer. Natural Science, 10, 45-58.
https://doi.org/10.4236/ns.2018.101004
[16]  Hu, W. (2018) Comparison of Two Quantum Clustering Algorithms. Natural Science, 10, 87-98.
https://doi.org/10.4236/ns.2018.103010
[17]  Ganger, M. and Hu, W. (2019) Quantum Multiple Q-Learning. International Journal of Intelligence Science, 9, 1-22.
https://doi.org/10.4236/ijis.2019.91001
[18]  Naruse, M., Berthel, M., Drezet, A., Huant, S., Aono, M., Hori, H. and Kim, S.-J. (2015) Single-Photon Decision Maker. Scientific Reports, 5, Article No. 13253.
https://doi.org/10.1038/srep13253
[19]  Mitarai, K., Negoro, M., Kitagawa, M. and Fujii, K. (2018) Quantum Circuit Learning.
[20]  Schuld, M. and Killoran, N. (2018) Quantum Machine Learning in Feature Hilbert Spaces. arXiv:1803.07128
[21]  Hu, W. and Hu, J. (2019) Training a Quantum Neural Network to Solve the Contextual Multi-Armed Bandit Problem. Natural Science, 11, 17-27.
https://doi.org/10.4236/ns.2019.111003
[22]  Hu, W. and Hu, J. (2019) Q Learning with Quantum Neural Networks. Natural Science, 11, 31-39.
https://doi.org/10.4236/ns.2019.111005
[23]  Farhi, E. and Neven, H. (2018) Classification with Quantum Neural Networks on Near Term Processors. arXiv:1802.06002
[24]  Dong, D., Chen, C., Li, H. and Tarn, T.-J. (2008) Quantum Reinforcement Learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 38, 1207-1220.
[25]  Watkins, C.J.C.H. and Dayan (1992) Technical Note: Q-Learning. Machine Learning, 8, 279-292.
https://doi.org/10.1007/BF00992698
[26]  Watkins, C.J.C.H. (1989) Learning from Delayed Rewards. PhD Thesis, University of Cambridge, Cambridge.
[27]  Konda, V.R. and Tsitsiklis, J.N. (2003) On Actor-Critic Algorithms. Journal on Control and Optimization, 42, 1143-1166.
https://doi.org/10.1137/S0363012901385691
[28]  Grondman, I., Busoniu, L., Lopes, G.A.D. and Babuska, R. (2012) A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42, 1291-1307.
[29]  Degris, T., White, M. and Sutton, R.S. (2012) Off-Policy Actor-Critic. Proceedings of the 29th International Conference on Machine Learning, Edinburgh, 26 June-1 July 2012, 179-186.
[30]  Serafini, A. (2017) Quantum Continuous Variables: A Primer of Theoretical Methods. CRC Press, London.
[31]  Killoran, N., Bromley, T.R., Arrazola, J.M., Schuld, M., Quesada, N. and Lloyd, S. (2018) Continuous-Variable Quantum Neural Networks.
[32]  Bellman, R. (1957) A Markovian Decision Process. Journal of Mathematics and Mechanics, 6, 679-684.
[33]  Killoran, N., Izaac, J., Quesada, N., Bergholm, V., Amy, M. and Weedbrook, C. (2018) Strawberry Fields: A Software Platform for Photonic Quantum Computing.
https://doi.org/10.1512/iumj.1957.6.56038

Full-Text

comments powered by Disqus

Contact Us

service@oalib.com

QQ:3279437679

WhatsApp +8615387084133

WeChat 1538708413