Research on Power Grid Dispatching Based on Reinforcement Learning


	Research on Power Grid Dispatching Based on Reinforcement Learning (基于强化学习的电网调度研究)
	Wang Wei (王威)
	2024-05-16
Pages	92
Degree type	Master's
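Abstract (translated from Chinese)	As electricity demand grows and power sources become more diverse, the limitations of traditional grid dispatching methods are increasingly evident, and the quality and efficiency of grid operation urgently need further improvement. Artificial Intelligence (AI) methods, with their strong learning capabilities, have performed well in many fields in recent years and offer new solutions for making power systems intelligent and digital. Against this backdrop, Reinforcement Learning (RL), a representative AI method, has attracted growing attention for its strong performance on complex decision-making problems. By interacting with its environment, RL can autonomously learn optimal policies, and it has achieved notable results in control and decision-making domains such as board and card games, video games, and robotics. Its strong adaptability and flexible policy optimization process also indicate great potential for application in grid dispatching. This thesis explores the application of RL to automated dispatching optimization in new-type power grids, focusing on two directions that have drawn much attention in recent years, steady-state active power dispatching optimization and topology optimization, in order to improve the grid's steady-state performance. Active power dispatching optimizes the grid from the source side and bears directly on steady-state operating performance, such as economic efficiency and renewable energy utilization. Topology optimization changes the grid topology from the network side, optimizing network losses and maintaining stable grid operation. The two can also work together to jointly optimize grid operating performance. To improve dispatching stability in the active power dispatching task while preserving dispatching quality, this thesis proposes a Hierarchical Safe Reinforcement Learning method that incorporates human prior knowledge and accounts for both stability and original task performance during dispatching. An active power dispatching environment with a high proportion of renewable generation units was also built to validate the method; the results show that Hierarchical Safe Reinforcement Learning achieves higher stability without a noticeable drop in task performance. To address the large-scale, imbalanced discrete action space in topology optimization, this thesis proposes a Reinforcement Learning based Searching and Ranking method, which obtains a smaller, high-quality action space through multi-stage, step-by-step dimensionality reduction and uses RL to rank the candidate action set from a long-term perspective. The algorithm was evaluated comprehensively in a standard topology optimization environment; the proposed method substantially reduces the number of simulations the agent performs per step while maintaining survival time and cumulative reward, thereby improving dispatching efficiency. Finally, to present the results intuitively, this thesis develops a grid operation visualization system for active power dispatching and topology dispatching. The system integrates the two key dispatching scenarios discussed here, shows through a graphical interface the workflow and effect of the algorithms during actual grid operation, and lets users interactively modify the dispatching process to compare directly how different strategies affect grid operation.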
English abstract	As electricity demand continues to increase and the diversity of power sources expands, the limitations of traditional grid dispatching methods have become increasingly apparent. There is an urgent need to enhance the quality and efficiency of grid operation. Leveraging its powerful learning capabilities, Artificial Intelligence (AI) has demonstrated outstanding performance in various fields in recent years, offering new solutions for the intelligent and digital transformation of power systems. Against this backdrop, Reinforcement Learning (RL), as a representative AI method, has gained attention for its remarkable performance in complex decision-making problems. Through interaction with the environment, RL can autonomously learn optimal strategies, and it has achieved significant success in fields such as Go, video games, and robotic control. Its strong adaptive capability and flexible strategy optimization process reveal substantial potential for application in grid dispatching. This thesis aims to explore the application of reinforcement learning in the optimization of automated dispatching in modern power grids, focusing primarily on two areas that have garnered significant attention in recent years: active power dispatching optimization under steady-state conditions and topology optimization. The objective is to enhance the operational quality of the grid under steady-state conditions. Active power dispatching optimization addresses grid optimization from the source side, directly impacting the steady-state quality of grid operation, including aspects such as economic efficiency and renewable energy utilization. Topology optimization, on the other hand, involves altering the grid’s topology from the network side to optimize network losses and maintain stable grid operation. Moreover, these two approaches can be coordinated to jointly optimize the performance of grid operation.
To improve the stability of active power dispatching tasks while maintaining dispatching quality, this thesis proposes a Hierarchical Safe Reinforcement Learning method that integrates human prior knowledge. This approach balances stability and original task performance during the dispatching process. Additionally, an active power dispatching environment with a high proportion of renewable energy units was constructed to validate the performance of the proposed method. The results demonstrate that the proposed method provides higher stability without a significant decline in task performance. To address the issue of large-scale imbalanced discrete action spaces in topology optimization, this thesis proposes a Reinforcement Learning based Searching and Ranking method. This method employs a multi-stage dimensionality reduction approach to obtain a smaller, high-quality action space and uses reinforcement learning to rank the candidate actions from a long-term perspective. Comprehensive evaluations of the proposed algorithm were conducted in a standard topology optimization environment, demonstrating that the method significantly reduces the number of single-step simulations required by the agent while maintaining survival time and cumulative rewards, thus improving dispatching efficiency. Finally, to visually present the research outcomes, this thesis developed a grid operation visualization system for active power dispatching and topology optimization. This system integrates the two key dispatching scenarios discussed and displays the algorithms’ workflows and effects in actual grid operation through a graphical interface. It also allows users to interactively modify the dispatching process, providing a clear comparison of the impact of different strategies on grid operation.
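Keywords	automated power grid dispatching; power grid active power dispatching; power grid topology optimization; reinforcement learning
Language	Chinese
Document type	Thesis
Identifier	http://ir.ia.ac.cn/handle/173211/56906
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	王威. 基于强化学习的电网调度研究[D],2024.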

Files in this item
File name/size	Document type	Version type	Access type	License
基于强化学习的电网调度研究.pdf (18647KB)	Thesis		Restricted access	CC BY-NC-SA