2.5. References

To do

Correct citation format for the references

[1]Wu Maogui, Yu Mingmin, Yang Benfa, Li Tao, Zhang Yuelei. Python深度学习：基于PyTorch (Python deep learning: Based on PyTorch)[M]. Beijing: China Machine Press, 2019.

[2][Japan]Ogawa Yutaro. PyTorch 深度学习模型开发实战 (Practical PyTorch deep learning model development)[M]. Chen Huan, trans. Beijing: China Water & Power Press, 2022.

[3]Ruder S. An overview of gradient descent optimization algorithms[J]. arXiv preprint arXiv:1609.04747, 2016.

[4]Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.

[5]Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.

[6]Sugata T L I, Yang C K. Leaf App: Leaf recognition with deep convolutional neural networks[C]//IOP Conference Series: Materials Science and Engineering. IOP Publishing, 2017, 273(1): 012004.

[7]Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.

[8]He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.

[9]Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144.

[10]Mirza M, Osindero S. Conditional generative adversarial nets[J]. arXiv preprint arXiv:1411.1784, 2014.

[11]Sutton R S, Barto A G. Reinforcement learning: An introduction[M]. MIT Press, 2018.

[12]Zhou S, Liu X, Xu Y, et al. A deep Q-network (DQN) based path planning method for mobile robots[C]//2018 IEEE International Conference on Information and Automation (ICIA). IEEE, 2018.

[13]Konar A, Chakraborty I G, Singh S J, et al. A deterministic improved Q-learning for path planning of a mobile robot[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2013, 43(5): 1141-1153.

[14]Watkins C J C H, Dayan P. Q-learning[J]. Machine Learning, 1992, 8(3-4): 279-292.

[15]Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.

[16]Busoniu L, Babuska R, De Schutter B, et al. Reinforcement learning and dynamic programming using function approximators[M]. CRC Press, 2010.

[17]Chen S-L, Wei Y-M. Least-squares SARSA (Lambda) algorithms for reinforcement learning[C]//2008 Fourth International Conference on Natural Computation. IEEE, 2008.

[18]Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with deep reinforcement learning[J]. arXiv preprint arXiv:1312.5602, 2013.

[19]Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.

[20]Silver D, Schrittwieser J, Simonyan K, et al. Mastering the game of Go without human knowledge[J]. Nature, 2017, 550(7676): 354-359.

[21]Schaul T, Quan J, Antonoglou I, et al. Prioritized experience replay[J]. arXiv preprint arXiv:1511.05952, 2015.

[22]Bellman R. A Markovian decision process[J]. Journal of Mathematics and Mechanics, 1957.

[23]Howard R A. Dynamic programming and Markov processes[M]. MIT Press, 1960.

[24]Blackwell D. Discrete dynamic programming[J]. The Annals of Mathematical Statistics, 1962: 719-726.

[25]Watkins C J C H. Learning from delayed rewards[D]. University of Cambridge, 1989.

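[26]Yu Shan. 基于深度环境理解和行为模仿的强化学习智能体设计 (Design of reinforcement learning agents based on deep environment understanding and behavior imitation)[D]. Zhejiang University, 2019.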

[27]Yann LeCun. The MNIST database of handwritten digits[EB/OL]. http://yann.lecun.com/exdb/mnist, 2013-05-15/2023-02-09.

[28]Joseph Chet Redmon. MNIST in CSV[EB/OL]. http://pjreddie.com/projects/mnist-in-csv, 2023-02-06/2023-02-09.

[29]Joseph Chet Redmon. mnist_train[EB/OL]. http://www.pjreddie.com/media/files/mnist_train.csv, 2023-02-06/2023-02-09.

[30]Joseph Chet Redmon. mnist_test[EB/OL]. http://www.pjreddie.com/media/files/mnist_test.csv, 2023-02-06/2023-02-09.

[31]GitHub. mnist_train_100[EB/OL]. https://raw.githubusercontent.com/makeyourownneuralnetwork/makeyourownneuralnetwork/master/mnist_dataset/mnist_train_100.csv, 2023-02-06/2023-02-09.