Research on Key Technologies of Prediction and Control in Parallel Transportation Systems

	Research on Key Technologies of Prediction and Control in Parallel Transportation Systems
	Dai Xingyuan (戴星原)
	2022-08-21
Pages	214
Degree type	Doctoral
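Chinese abstract (translated)	Urban transportation systems are typical cyber-physical-social systems, with many participants and complex interrelations. Solving traffic problems therefore requires considering both engineering complexity and social complexity. Parallel transportation systems emerged in this context. A parallel transportation system first builds a software-defined artificial transportation system corresponding to the actual one. It then generates large amounts of data through computational experiments in the artificial system, so that traffic phenomena emerge, the causes of traffic problems can be traced, and traffic management and control strategies can be comprehensively optimized. Finally, through parallel execution of the artificial and actual systems and a closed feedback loop between the virtual and real spaces, it effectively guides the actual transportation system. This virtual-real architecture provides a solution for safe and efficient management and control in complex traffic environments. To further address the complexity arising from the scale, uncertainty, and long-term change of transportation systems, this thesis systematically studies traffic flow time-series modeling, prediction, and traffic signal control in parallel transportation systems, in three parts:
1. Traffic flow time-series modeling and prediction in parallel transportation systems. Traffic flow exhibits complex nonlinear spatiotemporal dependencies, which makes time-series modeling and accurate prediction very difficult. This thesis therefore proposes trend-based modeling and prediction methods that separate the deterministic and uncertain components of a traffic flow series: a simple trend calculation models the steady-state characteristics of traffic flow, and prediction on the detrended series models its dynamic characteristics, which improves the generalization of the prediction model. The detrended-series prediction models comprise the single-point model DeepTrend and the multi-point model DeepTrend 2.0. DeepTrend uses two neural network modules for trend estimation and detrended series prediction, respectively, and trains the whole network end to end. Compared with prediction methods based on raw data, DeepTrend achieves higher accuracy. DeepTrend 2.0 further takes the spatial characteristics of traffic into account and balances the accuracy and complexity of large-scale deep learning traffic flow prediction models. Built on spatiotemporal images constructed from road-network detector locations, it uses the detrending mechanism to extract short-term temporal features and a lightweight deep convolutional module to learn the spatiotemporal dependencies of traffic flow. These methods deliver high accuracy with low complexity, and the detrending mechanism markedly reduces the influence of model parameters on prediction accuracy. Compared with prediction without detrending, it lowers the mean relative error on a real traffic flow dataset for 5-minute, 15-minute, and 45-minute prediction from 12.4%, 12.8%, and 15.2% to 10.5%, 11.3%, and 12.1%, respectively. The trend describes the temporal characteristics of traffic flow in the actual system and provides model support for optimization and control in parallel transportation systems.
2. Traffic signal control policy optimization based on artificial transportation systems. Policy optimization for traffic signal control in parallel transportation systems must guarantee both policy performance and optimization speed in order to meet the optimality and real-time requirements of coordinated multi-intersection control in large cities. Using computational experiments in artificial transportation systems, this thesis proposes predictive-learning-based signal policy optimization methods for single-intersection and multi-intersection scenarios. The prediction module assists the decision module during training and is separated from it during execution, which ensures the effectiveness and real-time performance of the control policy. In the single-intersection scenario, an image-based world model provides a fine-grained description of the intersection traffic state. The world model introduces a low-dimensional latent space corresponding to the high-dimensional traffic image space and, in place of the traffic environment, generates samples in the latent space to assist policy optimization. This improves policy exploration and data efficiency and makes signal decisions interpretable. In single-intersection control, the optimized policy reduces queue length by 21.2% on average relative to the baseline method, proximal policy optimization. For the multi-intersection scenario, the thesis proposes DAMA (Dynamics-Aware Multi-Agent Reinforcement Learning), which coordinates agent decisions through a global graph network and introduces a spatiotemporal data prediction task to help the model learn effective traffic state representations, achieving efficient policy learning in multi-intersection scenarios. In tests on a road network with 25 intersections, the DAMA policy reduces overall network delay by 15% compared with a graph network policy without predictive learning. These predictive-learning-based methods lay the foundation for applying parallel traffic signal control in large-scale scenarios.
3. Parallel traffic signal control via virtual-real interaction. Addressing long-term decision support for traffic signal control and coordinated control of large-scale road networks, this thesis proposes two parallel traffic signal control methods, from the perspectives of offline and online policy generation. The offline method, TOPADS (Trend- and Offline-Reinforcement-Learning-Based Parallel Decision Support), models intersection traffic patterns with traffic flow trends, optimizes control policies with actual traffic data and offline reinforcement learning algorithms, and uses the results to build a decision support library that provides long-term decision support for different intersection traffic patterns through recommendation and continual optimization. The online method, ATSPC (Artificial-Transportation-Systems-Based Predictive Control), targets real-time coordinated control of a large-scale road network with nearly two hundred intersections. It uses the artificial transportation system to predict and simulate online the operating state of the actual system at each stage, and uses DAMA-based cooperative policy optimization to learn policies efficiently in a high-dimensional policy space, generating near-optimal control policies in time to meet changing traffic demand. The two methods provide solutions for flexible, effective, and real-time traffic signal control in complex traffic environments.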
English abstract	Urban transportation systems are typical cyber-physical-social systems with many participating subjects and complex connections. Solving traffic problems therefore requires considering both engineering complexity and social complexity. In this context, parallel transportation systems were proposed. Parallel transportation systems first construct software-defined artificial transportation systems corresponding to the real ones. The artificial systems then generate a large amount of data through computational experiments to realize the emergence of traffic phenomena. Computational experiments can be used to trace the causes of traffic problems and to optimize traffic control and management policies. Finally, based on the parallel execution of the artificial and the actual transportation systems, the actual systems are effectively guided through a closed feedback loop between the virtual and real spaces. The virtual-real architecture of parallel transportation systems provides solutions for achieving safe and efficient management and control in complex traffic environments. To further address the complexity caused by the scale, uncertainty, and time-varying nature of transportation systems, this thesis conducts a systematic study of traffic flow sequence modeling and prediction as well as traffic signal control in parallel transportation systems. The study proceeds in three parts:
1. Traffic flow series modeling and prediction in parallel transportation systems. Traffic flow contains complex nonlinear spatiotemporal dependencies, making it difficult to model and accurately predict traffic flow time series. To this end, this thesis proposes trend-based traffic flow modeling and prediction methods, which separate the deterministic and uncertain components of traffic flow sequences. Specifically, we model the stable characteristics of the traffic flow by a simple trend calculation and model its dynamic characteristics by predicting the detrended sequence, which boosts the prediction model's generalization performance. The proposed detrending-based traffic flow prediction models include DeepTrend, a single-point prediction model, and DeepTrend 2.0, a multi-point prediction model. DeepTrend implements traffic flow trend estimation and detrended traffic flow sequence prediction through a two-layer neural network structure and trains the whole network in an end-to-end manner. The prediction model has higher accuracy than raw-data-based traffic prediction models. To achieve network-level traffic flow forecasting, we propose DeepTrend 2.0, which allows for a tradeoff between the accuracy and complexity of deep-learning-based large-scale traffic prediction models. The model is built on spatiotemporal images of traffic sensor networks and uses a detrending mechanism to extract short-term temporal traffic features. DeepTrend 2.0 then learns the spatiotemporal dependencies of the traffic flow with a lightweight deep convolutional module. These procedures ensure high prediction accuracy and low complexity. In particular, the detrending mechanism significantly reduces the influence of the model parameters on prediction accuracy. Compared with original-data-based traffic prediction, detrending-based prediction reduces the average relative errors of 5-minute, 15-minute, and 45-minute traffic flow prediction from 12.4%, 12.8%, and 15.2% to 10.5%, 11.3%, and 12.1%, respectively. The trend describes the traffic flow sequence characteristics and provides model support for optimization and control in parallel transportation systems.
2. Traffic signal control policy optimization based on artificial transportation systems. Traffic signal control policy optimization in parallel transportation systems needs to ensure both high policy performance and high optimization speed to meet the requirements of optimality and timeliness for cooperative control of multiple intersections in large-scale cities. To this end, through the computational experiments of the artificial transportation systems, we propose traffic signal optimization methods based on predictive learning for isolated and networked traffic signal control separately. The methods ensure the effectiveness and timeliness of the control policy as follows: in the training phase, the prediction module assists the decision-making module; in the execution phase, prediction is separated from decision-making. In the isolated intersection scenario, an image-based world model is constructed to achieve a fine-grained description of the intersection traffic state. The world model introduces a low-dimensional latent space corresponding to the high-dimensional traffic image space, and replaces the traffic environment in generating samples in the latent space to assist policy optimization. This method improves policy exploration capability and data efficiency, while making the signal decisions interpretable. At an isolated intersection, the optimized policy reduces the queue length by 21.2% compared with the best baseline, proximal policy optimization. For networked traffic control, a dynamics-aware multi-agent reinforcement learning method, DAMA, is proposed to achieve efficient policy learning and near-optimal real-time control in a multi-intersection traffic scenario. Specifically, DAMA introduces a global graph neural network to coordinate agents' decisions and a traffic prediction task to assist the model in learning effective traffic state representations. On a road network containing 25 intersections, DAMA reduces the overall delay by 15% compared with graph neural network policies without prediction modules. Predictive-learning-based traffic signal control optimization lays the foundation for applying parallel traffic signal control in large-scale scenarios.
3. Parallel traffic signal control via virtual-real interaction. Aiming at the problems of long-term decision support for traffic signal control and large-scale multi-intersection cooperative control, two parallel traffic signal control methods are proposed. TOPADS (Trend- and Offline-Reinforcement-Learning-Based Parallel Decision Support) uses traffic flow trends to model intersection traffic patterns and optimizes control policies based on actual traffic data through offline reinforcement learning. The traffic patterns and corresponding policies are used to build a decision support warehouse, which provides long-term decision support for different intersection traffic patterns through recommendation and continuous optimization. ATSPC (Artificial-Transportation-Systems-Based Predictive Control) is oriented to the real-time cooperative control of large-scale road networks that contain nearly 200 intersections. The method uses artificial systems to predict and derive the operational state of the actual system at each stage. Efficient policy learning is achieved in a high-dimensional policy space through DAMA-based cooperative policy optimization, generating near-optimal control policies in time to cope with traffic demands that change in real time. The proposed parallel traffic signal control methods provide solutions for achieving flexible, effective, and real-time traffic signal control in complex traffic environments.
Keywords	Parallel transportation systems; traffic prediction; traffic control; deep learning; reinforcement learning
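Language	Chinese
Seven major directions: sub-direction classification	Parallel management and control
State Key Laboratory planned direction classification	Physical AI systems: decision-making and control
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/49921
Collection	Graduates_Doctoral dissertations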
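Recommended citation (GB/T 7714)	Dai Xingyuan. Research on Key Technologies of Prediction and Control in Parallel Transportation Systems[D]. Institute of Automation, Chinese Academy of Sciences, 2022.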
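
The trend-based decomposition described in part 1 of the abstract (a simple trend for the steady-state component, a predictor on the detrended series for the dynamic component) can be illustrated with a minimal sketch. This is not the thesis's DeepTrend implementation. The time-of-day average used as the trend, the AR(1) residual model standing in for the neural network modules, the synthetic data, and all function names are assumptions made for illustration only.

```python
import numpy as np


def daily_trend(flow: np.ndarray, steps_per_day: int) -> np.ndarray:
    """Trend = mean flow at each time-of-day slot, averaged over whole days."""
    days = len(flow) // steps_per_day
    return flow[: days * steps_per_day].reshape(days, steps_per_day).mean(axis=0)


def detrend(flow: np.ndarray, trend: np.ndarray) -> np.ndarray:
    """Subtract the periodic trend, repeated to cover the whole series."""
    reps = -(-len(flow) // len(trend))  # ceiling division
    return flow - np.tile(trend, reps)[: len(flow)]


def fit_ar1(residual: np.ndarray) -> float:
    """Least-squares coefficient a in residual[t + 1] ~ a * residual[t]."""
    x, y = residual[:-1], residual[1:]
    return float(x @ y / (x @ x))


def predict_next(flow: np.ndarray, trend: np.ndarray, a: float) -> float:
    """One-step forecast: trend at the next slot plus the predicted residual."""
    last_residual = detrend(flow, trend)[-1]
    next_slot = len(flow) % len(trend)
    return float(trend[next_slot] + a * last_residual)


def mean_relative_error(y_true, y_pred) -> float:
    """Mean of |error| / true value (true values must be non-zero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred) / y_true))


# Synthetic 5-minute flow (assumption): 14 days, daily sine pattern plus
# autocorrelated noise. Real detector data would replace this.
rng = np.random.default_rng(0)
steps_per_day, days = 288, 14
t = np.arange(steps_per_day * days)
noise = np.zeros(len(t))
for i in range(1, len(t)):
    noise[i] = 0.8 * noise[i - 1] + rng.normal(scale=10.0)
flow = 300 + 150 * np.sin(2 * np.pi * t / steps_per_day) + noise

# Estimate the trend and the residual model on the first 10 days.
split = 10 * steps_per_day
trend = daily_trend(flow[:split], steps_per_day)
a = fit_ar1(detrend(flow[:split], trend))

# Rolling one-step forecasts over the held-out days.
preds = [predict_next(flow[:i], trend, a) for i in range(split, len(flow))]
print("held-out mean relative error:", mean_relative_error(flow[split:], preds))
```

The design point the sketch tries to convey is that the residual predictor only has to model deviations from a known periodic pattern, which is the intuition the abstract gives for why detrending helps generalization. Any learned model could take the place of `fit_ar1` and `predict_next`.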

Files in this item
File name/size	Document type	Version type	Access type	License
平行交通系统中的预测与控制关键技术研究.（14868KB）	Thesis		Restricted access	CC BY-NC-SA