Agricultural Crop Planning Based on Reinforcement Learning and Price Dynamics

	Agricultural Crop Planning Based on Reinforcement Learning and Price Dynamics
	樊梦涵
	2022-05-17
Pages	82
Degree type	Master's
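Chinese abstract	Research has shown that smallholder farming is more conducive to ecologically sustainable development. Because smallholders operate in a dispersed way, however, a stable supply is hard to achieve, and market supply and demand information is easily mismatched, which leads to sky-high vegetable prices or large volumes of unsold produce. Agricultural cooperatives can bring smallholders together and integrate market information more effectively. By giving each smallholder a cropping plan, a cooperative distributes market demand across its members in time and space, which helps to counter the risks of asymmetric production and sales information. This thesis addresses the agricultural crop planning problem and studies how to maximize the income of cooperative farmers while satisfying crop rotation constraints. The research consists of three parts.
1. Analysis and forecasting of agricultural product prices. Prices reflect well the environment in which the agricultural production system operates. The thesis first applies Seasonal and Trend decomposition using Loess (STL), a time series decomposition method that uses robust locally weighted regression as its smoother, to analyze the share of the periodic (seasonal) component in the weekly average prices of different crops. It finds that the seasonal share is low for storable crops, whose prices change steadily over the year, and high for solanaceous fruit vegetables and fast-growing leafy vegetables, whose prices show a clear pattern of change over the year. For price forecasting, the thesis uses the Gradient Boosting Decision Tree (GBDT) algorithm to predict future weekly average prices from historical data.
2. Computer simulation based on Agent-Based Modelling (ABM) for solving the month-by-month cropping plan for the coming year. The thesis builds a parallel agricultural crop planning system based on multi-scale price adjustment. Monthly average prices are used to draw up the cropping plan for the coming year and, after abnormal price fluctuations, to adjust the part of that plan that has not yet been executed. Weekly average prices are used to determine the harvest period of single-harvest crops. The parallel system incorporates multi-scale price dynamics from the markets faced by individual farmers into the adjustment of the cropping plan, in order to raise the cooperative's total income.
3. Reinforcement learning for farmer decision support in crop planning. The planning method above must be computed for a fixed planning period; near the end of that period the choices are limited, and it is difficult to plan for the remaining time. The reinforcement learning approach removes this artificial constraint and turns crop planning into the farmer's monthly planting decision. Through interaction with a simulated crop planning environment, the method learns to respect the rotation constraints on the order in which crops are planted and to plant crops at suitable times so as to maximize the cooperative's total income.
The thesis makes two contributions. First, for the traditional crop planning problem whose objective is to maximize total profit over the planning period, it proposes a parallel agricultural crop planning system that uses monthly and weekly average prices to adjust the cropping plan and the harvest time. Second, it applies the Deep Q-Networks (DQN) reinforcement learning algorithm to crop planning decisions and builds a simulated crop planning environment in which farmers choose crops month by month. Agronomic management knowledge is built into the environment so that crop yield and growing-period length vary with planting time, which brings the result closer to cropping plans drawn up in actual production. Because farmers choose crops every month, the limit on planning period length is removed and the planting strategy can be improved each month according to actual conditions, establishing a mechanism by which crop planning responds dynamically to natural, ecological and social factors in the environment.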
English abstract	Smallholder agricultural production has been shown to be more ecologically sustainable. However, the decentralized operation of small farmers makes a stable supply difficult to achieve, so market supply and demand information is easily mismatched, resulting in sky-high vegetable prices or large volumes of unsold agricultural products. Agricultural cooperatives can bring small farmers together and integrate market information more effectively. By providing each farmer with a cropping plan, a cooperative distributes market demand among small farmers in time and space and so counters the risks caused by information asymmetry. This thesis deals with the problem of agricultural crop planning and studies how to maximize the profit of the cooperative's farmers under crop rotation constraints. The research is divided into three parts.
1. Crop price analysis and forecasting. Prices reflect well the environment of the crop planning system. The thesis uses Seasonal and Trend decomposition using Loess (STL) to analyze the proportion of the seasonal component in the weekly average price data of different crops. For crops suited to long-term storage, the seasonal component accounts for a relatively low proportion of the price, and prices are stable over the year. For fruit vegetables and fast-growing leafy vegetables, the seasonal component accounts for a relatively large proportion, and prices show a clear pattern of change over the year. For price forecasting, the Gradient Boosting Decision Tree (GBDT) algorithm is used to predict future weekly average prices from historical data.
2. Solving the monthly cropping plan for the coming year using Agent-Based Modelling (ABM) simulation. The thesis creates a parallel agricultural crop planning system based on multi-scale price adjustment. The monthly average price is used to generate the cropping plan for one year and to adjust the unexecuted part of the plan after abnormal price fluctuations. The weekly average price is used to determine the harvesting time of single-harvest crops. The parallel system integrates multi-scale dynamic price information into cropping plan adjustment in order to increase the total profit of the cooperative.
3. Solving the farmer's decision support problem in crop planning using reinforcement learning. The previous method must be computed for a fixed crop planning period; choices are limited near the end of that period, which makes it difficult to plan for the remainder. The reinforcement learning approach removes this constraint and recasts crop planning as the farmer's monthly crop decision problem. Through interaction with the simulation environment, the method learns to follow the crop rotation constraint and to plant crops at the right time to maximize the total profit of the cooperative.
The thesis makes two contributions. First, for the traditional crop planning problem with the objective of maximizing total profit over the planning period, a parallel agricultural crop planning system is created that uses monthly and weekly average prices to adjust the cropping plan and the harvesting time. Second, the Deep Q-Networks (DQN) reinforcement learning algorithm is applied to crop planning. A crop planning simulation environment is built in which farmers select crops month by month. Agronomic management knowledge is introduced into the environment so that crop yields and growing-period lengths vary with planting time, which better fits actual production. Because farmers select crops every month, the constraint on the length of the crop planning period is removed, and the planting strategy can be improved each month according to actual conditions, which establishes a dynamic response mechanism for crop planning to natural, ecological and social factors in the environment.
Keywords	Crop planning; agent-based simulation; parallel management; planting decision support; reinforcement learning
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48481
Collection	Graduates
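Recommended citation (GB/T 7714)	樊梦涵. Agricultural Crop Planning Based on Reinforcement Learning and Price Dynamics [D]. Institute of Automation, Chinese Academy of Sciences, 2022.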
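
The abstract compares crops by how much of their weekly price variation is periodic. The sketch below illustrates that idea under stated assumptions and is not the thesis's procedure: it swaps STL for a plain centered moving-average decomposition so that it needs only the standard library, the function name `seasonal_share` and the variance-ratio definition of "share" are assumptions, and the two price series are synthetic rather than thesis data.

```python
import math
import random
from statistics import mean, pvariance


def seasonal_share(prices, period):
    """Fraction of detrended variance carried by a repeating cycle.

    Assumed definition: var(seasonal) / (var(seasonal) + var(remainder)),
    using a classical moving-average decomposition, not STL.
    """
    n = len(prices)
    half = period // 2
    # Centered moving average as a crude trend; edge points are skipped.
    detrended = {}
    for t in range(half, n - half):
        window = prices[t - half : t + half + 1]
        if period % 2 == 0:
            # Even period: the two end points get half weight each.
            trend = (sum(window) - 0.5 * (window[0] + window[-1])) / period
        else:
            trend = sum(window) / period
        detrended[t] = prices[t] - trend
    # Seasonal component: mean detrended value at each phase of the cycle.
    by_phase = {}
    for t, value in detrended.items():
        by_phase.setdefault(t % period, []).append(value)
    phase_mean = {phase: mean(values) for phase, values in by_phase.items()}
    seasonal = [phase_mean[t % period] for t in detrended]
    remainder = [d - s for d, s in zip(detrended.values(), seasonal)]
    var_s, var_r = pvariance(seasonal), pvariance(remainder)
    return var_s / (var_s + var_r) if var_s + var_r > 0 else 0.0


# Synthetic weekly prices over eight years (illustrative only).
rng = random.Random(0)
weeks = range(52 * 8)
leafy = [5 + 2.0 * math.sin(2 * math.pi * t / 52) + rng.gauss(0, 0.3) for t in weeks]
storable = [5 + 0.1 * math.sin(2 * math.pi * t / 52) + rng.gauss(0, 0.3) for t in weeks]

print("leafy:", round(seasonal_share(leafy, 52), 2))
print("storable:", round(seasonal_share(storable, 52), 2))
```

With few cycles of data, this estimate is biased upward because phase means absorb some noise, which is one reason a robust method such as STL is preferable on real price series.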

Files in this item
File name / size	Document type	Version type	Access type	License
Thesis-电子签名.pdf (2981KB)	Thesis		Restricted access	CC BY-NC-SA
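
The monthly crop-selection formulation described in the abstract can be illustrated with a toy example. This is a minimal sketch under loud assumptions, not the thesis's environment or algorithm: it uses tabular Q-learning in place of DQN, every crop occupies the field for exactly one month, the rotation rule is simply "do not repeat the previous crop" enforced through an assumed penalty, and the crop names and profit table are hypothetical.

```python
import random

CROPS = ["leafy", "tomato", "potato"]  # hypothetical crops
# Hypothetical profit by planting month (index 0 = January); not thesis data.
PROFIT = {
    "leafy":  [3, 3, 4, 5, 5, 4, 3, 3, 4, 5, 4, 3],
    "tomato": [1, 1, 2, 4, 6, 8, 8, 6, 4, 2, 1, 1],
    "potato": [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4],
}
ROTATION_PENALTY = -10  # assumed penalty for repeating the previous crop


def step(month, last_crop, crop):
    """One monthly decision: return (reward, next_state)."""
    reward = ROTATION_PENALTY if crop == last_crop else PROFIT[crop][month]
    return reward, ((month + 1) % 12, crop)


def train(episodes=3000, horizon=24, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning over states (month, last_crop)."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value
    for _ in range(episodes):
        state = (rng.randrange(12), rng.choice(CROPS))
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(CROPS)  # explore
            else:
                action = max(CROPS, key=lambda a: q.get((state, a), 0.0))
            reward, nxt = step(state[0], state[1], action)
            best_next = max(q.get((nxt, a), 0.0) for a in CROPS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_plan(q, month=0, last_crop="potato", months=12):
    """Follow the learned policy for a number of months."""
    plan, state = [], (month, last_crop)
    for _ in range(months):
        crop = max(CROPS, key=lambda a: q.get((state, a), 0.0))
        plan.append(crop)
        _, state = step(state[0], state[1], crop)
    return plan


q_table = train()
plan = greedy_plan(q_table)
print(plan)
```

Because the agent decides one month at a time, there is no fixed planning horizon in the policy itself; `greedy_plan` can be rolled forward from any month and any previous crop.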