2.5. References
[1]吴茂贵,郁明敏,杨本法,李涛,张粤磊.Python深度学习:基于PyTorch[M].北京:机械工业出版社, 2019.
[2][日]小川雄太郎.PyTorch 深度学习模型开发实战[M].陈欢,译.北京:中国水利水电出版社, 2022.
[3]Ruder S. An overview of gradient descent optimization algorithms[J]. arXiv preprint arXiv:1609.04747, 2016.
[4]Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90.
[5]Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
[6]Sugata T L I, Yang C K. Leaf App: Leaf recognition with deep convolutional neural networks[C]//IOP Conference Series: Materials Science and Engineering. IOP Publishing, 2017, 273(1): 012004.
[7]Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.
[8]He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.
[9]Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144.
[10]Mirza M, Osindero S. Conditional generative adversarial nets[J]. arXiv preprint arXiv:1411.1784, 2014.
[11]Sutton R S, Barto A G. Reinforcement learning: An introduction[M]. MIT Press, 2018.
[12]Zhou S, Liu X, Xu Y, et al. A deep Q-network (DQN) based path planning method for mobile robots[C]//2018 IEEE International Conference on Information and Automation (ICIA). IEEE, 2018.
[13]Konar A, Chakraborty I G, Singh S J, et al. A deterministic improved Q-learning for path planning of a mobile robot[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2013, 43(5): 1141-1153.
[14]Watkins C J, Dayan P. Q-learning[J]. Machine Learning, 1992, 8(3-4): 279-292.
[15]Mnih V, Kavukcuoglu K, Silver D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[16]Busoniu L, Babuska R, De Schutter B, et al. Reinforcement learning and dynamic programming using function approximators[M]. CRC Press, 2010.
[17]Chen S-L, Wei Y-M. Least-squares SARSA (Lambda) algorithms for reinforcement learning[C]//2008 Fourth International Conference on Natural Computation. IEEE, 2008.
[18]Mnih V, Kavukcuoglu K, Silver D, et al. Playing Atari with deep reinforcement learning[J]. arXiv preprint arXiv:1312.5602, 2013.
[19]Silver D, Huang A, Maddison C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[20]Silver D, Schrittwieser J, Simonyan K, et al. Mastering the game of Go without human knowledge[J]. Nature, 2017, 550(7676): 354-359.
[21]Schaul T, Quan J, Antonoglou I, et al. Prioritized experience replay[J]. arXiv preprint arXiv:1511.05952, 2015.
[22]Bellman R. A Markovian decision process[J]. Journal of Mathematics and Mechanics, 1957.
[23]Howard R A. Dynamic programming and Markov processes[M]. MIT Press, 1960.
[24]Blackwell D. Discrete dynamic programming[J]. The Annals of Mathematical Statistics, 1962: 719-726.
[25]Watkins C J C H. Learning from delayed rewards[D]. University of Cambridge, 1989.
[26]喻杉. 基于深度环境理解和行为模仿的强化学习智能体设计[D]. 浙江大学, 2019.
[27]LeCun Y. The MNIST database of handwritten digits[EB/OL]. http://yann.lecun.com/exdb/mnist, 2013-05-15/2023-02-09.
[28]Redmon J C. MNIST in CSV[EB/OL]. http://pjreddie.com/projects/mnist-in-csv, 2023-02-06/2023-02-09.
[29]Redmon J C. mnist_train[EB/OL]. http://www.pjreddie.com/media/files/mnist_train.csv, 2023-02-06/2023-02-09.
[30]Redmon J C. mnist_test[EB/OL]. http://www.pjreddie.com/media/files/mnist_test.csv, 2023-02-06/2023-02-09.
[31]GitHub. mnist_train_100[EB/OL]. https://raw.githubusercontent.com/makeyourownneuralnetwork/makeyourownneuralnetwork/master/mnist_dataset/mnist_train_100.csv, 2023-02-06/2023-02-09.