[1] 刘振亚. 智能电网技术 [M]. 北京: 中国电力出版社, 2010: 17-88. [2] 刘振亚. 智能电网知识读本 [M]. 北京: 中国电力出版社, 2010: 100-117. [3] 秦立军, 马其燕. 智能配电网及其关键技术 [M]. 北京: 中国电力出版社, 2010: 149-184. [4] E Hossain, H Zhu, H Vincent Poor.智能电网通信及组网技术 [M]. 北京: 电子工业出版社, 2013: 38-45. [5] T Huang, D R Liu.A Self-learning Scheme for Residential Energy System Control and Management[J]. Neural Comput &Applic (S2161-4393), 2013, 22(2) 259-269. [6] S Squartini, M Boaro, D Fuselli, et al.Adaptive Dynamic Programming Algorithm for Renewable Energy Scheduling and Battery Management[J]. Cogn Comput (S1866-9956), 2012, 5(2): 323-328. [7] A Saber, G Venayagamoorthy.Plug-in Vehicles and Renewable Energy Sources for Cost and Emission Reductions[J]. IEEE Trans. Industrial Electronics (S0278-0046), 2011, 58(4): 1229-1238. [8] 李春华, 朱新坚, 吉小鹏. 光伏系统中蓄电池管理策略研究[J]. 系统仿真学报, 2012, 24(11): 2378-2382. [9] S C Lee, S J Kim, S H Kim.Demand Side Management with Air Conditioner Loads Based on the Queuing System Model[J]. IEEE Trans. Power Systems (S0885-8950), 2011, 26(2): 661-668. [10] D O’Neill, M Levorato, A Goldsmith, et al.Residential Demand Response Using Reinforcement Learning[C]// First IEEE Internet Conference on Smart Grid Communications. USA: IEEE, 2011: 409-414. [11] J Medina, N Muller, I Roytelman.Demand Response and Distribution Grid Opportunities and Challenges[J]. IEEE Trans. Smart Grid (S1949-3053), 2010, 1(2): 193-197. [12] W B Shi, V W S Wong. Real-Time Vehicle-to-Grid Control Algorithm under Price Uncertainty[C]// IEEE Smart Grid Comm. USA: IEEE, 2011: 261-266. [13] Y Cao, T Jiang, Q Zhang.Reducing Electricity Cost of Smart Appliances via Energy Buffering Framework in Smart Grid[J]. IEEE Trans on Parallel and Distributed Systems (S1045-9219), 2012, 23(9): 1572-1582. [14] D Wang, S Y Ge, H J Jia, et al.A Demand Response and Battery Storage Coordination Algorithm for Providing Microgrid Tie-Line Smoothing Services[J]. IEEE Trans on Sustainable Energy (S1949-3029), 2014, 5(2): 476-486. [15] W B Shi, N Li, X R Xie.Optimal Residential Demand Response in Distribution Networks[J]. IEEE Journal on Selected Areas in Communications (S0733-8716), 2014, 32(7): 1441-1450. [16] B Li, S Gangadhar, S Cheng, et al.Maximize User Rewards in Distributed Generation Environments using Reinforcement Learning[C]// IEEE Energy Tech. USA: IEEE, 2011: 1-6. [17] P Poggi, G Notton, M Muselli, et al.Stochastic Study of Hourly Total Solar Radiation in Corsica Using a Markov Model[J]. Int’l J. Climatology (S1097-0088), 2000, 20(14): 1843-1860. [18] D Niyato, E Hossain, A Fallahi.Sleep and Wakeup Strategies in Solar-Powered Wireless Sensor/Mesh Networks: Performance Analysis and Optimization[J]. IEEE Transactions on Mobile Computing (S1536-1233), 2007, 6(2): 221-236. [19] 高阳, 陈世福, 陆鑫. 强化学习研究综述[J]. 自动化学报, 2004, 1(30): 86-97. [20] C J C H Watkins, P Dayan. Q-learning[J]. Machine Learning, 1992, 8(3-4): 279-292. [21] T Jaakkola, M Jordan, S Singh.On The Convergence of Stochastic Iterative Dynamic Programming Algorithms[J]. Natural Computations (S0899-7667), 1994, 6(6): 1185-1201. |