基于FPGA的高速视觉目标检测方法研究 (Research on FPGA-Based High-Speed Vision Object Detection Methods)
Author: 李建权
Date: 2020-05-21
Pages: 146
Degree type: Doctoral
Chinese Abstract

Object detection is widely applied and of great significance in fields such as robot navigation, intelligent transportation, industrial inspection, and aerospace, and it is also an important research topic in computer vision. Limited by algorithm complexity and processor performance, object detection systems based on CPUs, GPUs, or embedded processors typically run at several tens of frames per second, sometimes only a dozen or so, which is close to the processing rate of the human eye. However, high-speed vision object detection systems are in strong demand in many scenarios, especially in advanced manufacturing and the defense industry. Traditional high-speed vision object detection algorithms usually rely on color and brightness information and are only applicable to detecting objects against simple backgrounds in specific scenes. This dissertation studies high-speed vision object detection algorithms on FPGA processors, aiming to improve their accuracy and broaden their range of applications. The main work and contributions of this dissertation are as follows:
(1) To address the limitation that existing high-speed vision algorithms only work against simple backgrounds, a high-speed vision object detection algorithm based on gradient features is proposed, which detects objects in relatively complex scenes at 10,000 frames per second. Drawing on the idea of the classical HOG image descriptor, the algorithm computes and normalizes the gradient histogram within each detection window as its feature vector, then takes the dot product with fixed-point-quantized SVM classifier parameters to decide whether the window contains a target. Through hardware-oriented algorithm optimization and fixed-point parameter quantization, the hardware system is deployed within the FPGA's limited on-chip resources. To verify the feasibility of the proposed algorithm, a verification platform was built with a high-frame-rate projector and a high-speed camera: the projector displays preset patterns and triggers the camera to capture synchronously, and the algorithm was validated through single-object and multi-object detection experiments.
(2) To address the large positioning error caused by the wide stride between detection windows in the preceding algorithm, a high-speed vision object detection method that fuses projection information is proposed, improving positioning accuracy. The method adds a pixel-value projection module to the preceding algorithm to obtain the horizontal and vertical projections within each detection window. After the projections are converted into binary vectors with preset thresholds, the distance between the object center and the window center can be estimated and used to compensate the detection result. Experiments on detecting specific projected patterns and measuring fan rotation speed show that, by fusing the gradient features and projection information of the detection windows, the positioning error of high-speed vision object detection can be reduced to about 30% of that of the preceding algorithm.
(3) To address the limited accuracy of high-speed vision object detection based on traditional feature descriptors, a high-speed vision object detection method incorporating a convolutional neural network is proposed, improving detection accuracy and scene applicability. To resolve the conflict between the FPGA's limited hardware resources and the large number of CNN parameters, a lightweight network structure is designed and a quantization strategy suited to FPGA implementation is proposed. Following the two-stage R-CNN detection paradigm, the system extracts candidate regions with traditional image features and classifies them with the convolutional neural network. Different quantization strategies are formulated according to the characteristics of the weights and intermediate results, and all data are stored on-chip in fixed-point form. Simulation results in Vivado show that the system achieves high-speed object detection at 2,000 frames per second, and accuracy evaluation shows that the method outperforms the preceding traditional algorithms.
(4) To meet the application requirements of vision-based cell screening, a high-speed vision microsphere detection system was built from a high-speed camera, a microfluidic chip, and a microscope, achieving real-time, high-speed, high-throughput microsphere detection and validating the proposed algorithms in a practical application. An electric syringe pump drives the injection needle at a constant speed so that polystyrene microspheres flow uniformly through the microfluidic channel; the high-speed camera captures images, extracts features, and transmits them to the host computer. By parsing the image features carried in an additional information line, the host quickly obtains the positions and number of the microspheres. Host software built on the Qt framework and the OpenCV library displays the relevant information. Experimental results show that the proposed high-speed vision object detection algorithms can be applied to cell-screening scenarios and meet their accuracy requirements.

English Abstract

Object detection is of great significance and is widely used in fields such as robot navigation, intelligent transportation, industrial inspection, and aerospace; it is also an important research topic in computer vision. Limited by algorithm complexity and processor computing power, object detection systems based on CPUs, GPUs, or embedded processors often run at only dozens of frames per second, or even a dozen or so, which is close to the processing rate of the human eye. However, high-speed-vision object detection systems are in strong demand in many situations, especially in advanced manufacturing and the defense industry. Traditional high-speed-vision object detection algorithms often rely on color and brightness information and are only suitable for detecting targets against simple backgrounds in specific scenes. This dissertation studies FPGA-based high-speed object detection algorithms with the goal of improving their accuracy and expanding their application scope. The main work and contributions of this dissertation are as follows:
(1) To address the problem that existing high-speed vision algorithms are only suitable for target detection against simple backgrounds, a high-speed vision object detection algorithm based on gradient features is proposed, which detects objects in relatively complex backgrounds at 10,000 frames per second. The algorithm draws on the idea of the classical HOG image descriptor: the gradient histogram within each detection window is computed and normalized to form the window's feature vector, which is then multiplied with fixed-point-quantized SVM parameters and summed to judge whether the window contains a target. After hardware-oriented algorithm optimization and fixed-point parameter quantization, the algorithm was implemented on the high-speed-vision platform within the FPGA's limited on-chip resources. To verify the feasibility of the proposed algorithm, a verification platform was built with a high-frame-rate projector and a high-speed camera: preset patterns are projected by the projector, which triggers the camera to shoot synchronously, and the effectiveness of the algorithm was verified through single-object and multi-object detection experiments.
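For illustration, the following minimal Python/NumPy sketch shows the kind of window scoring described in contribution (1): a gradient-orientation histogram is computed and normalized for one detection window and then matched against fixed-point SVM weights. The bin count, window size, and 8-bit quantization scale are assumptions made for the sketch, not the dissertation's actual hardware parameters.

```python
# Minimal sketch (not the dissertation's RTL): HOG-style window scoring with a
# fixed-point SVM. Bin count, window size and the quantization scale are
# illustrative assumptions.
import numpy as np

N_BINS = 9          # assumed number of orientation bins
Q_SCALE = 1 << 7    # assumed fixed-point scale for features and SVM weights

def window_gradient_histogram(window: np.ndarray) -> np.ndarray:
    """Gradient-orientation histogram of one detection window (grayscale)."""
    gy, gx = np.gradient(window.astype(np.float32))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)           # unsigned orientation [0, pi)
    bins = np.minimum((ang / np.pi * N_BINS).astype(int), N_BINS - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=N_BINS)
    return hist / (np.linalg.norm(hist) + 1e-6)       # L2 normalization

def classify_window(window: np.ndarray, w_q: np.ndarray, b_q: int) -> bool:
    """Dot product of the normalized histogram with fixed-point SVM weights."""
    feat_q = np.round(window_gradient_histogram(window) * Q_SCALE).astype(np.int32)
    score = int(np.dot(feat_q, w_q)) + b_q            # integer multiply-accumulate
    return score > 0                                  # positive margin => object

# Example: score one 32x32 window with random placeholder quantized weights.
rng = np.random.default_rng(0)
w_q = rng.integers(-Q_SCALE, Q_SCALE, size=N_BINS, dtype=np.int32)
print(classify_window(rng.random((32, 32)), w_q, b_q=0))
```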
(2) To address the problem that the large stride between detection windows in the preceding algorithm causes large positioning errors, a high-speed object detection method that fuses projection information is proposed to improve positioning accuracy. A pixel-projection module is added to the preceding algorithm to obtain the projection information of each detection window in the horizontal and vertical directions. The projection information is converted into binary vectors with preset thresholds, from which the distance between the object center and the window center is calculated and used to compensate the detection result. Experimental results on specific projected-pattern detection and fan-speed measurement show that, by combining the gradient features and projection information within the detection windows, the positioning error can be reduced to about 30% of that of the preceding algorithm.
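A minimal sketch of the projection-based compensation idea in contribution (2) follows, assuming a single fixed threshold and a rectangular window; the actual thresholds and window geometry used in the dissertation are not specified in the abstract.

```python
# Minimal sketch: row/column pixel projections are binarized with a threshold
# and used to estimate the object's offset from the window center. Threshold
# value and window contents are illustrative assumptions.
import numpy as np

def center_offset(window: np.ndarray, thresh: float):
    """Return (dy, dx) from the window center to the estimated object center."""
    col_proj = window.sum(axis=0)                 # vertical projection
    row_proj = window.sum(axis=1)                 # horizontal projection
    col_bin = col_proj > thresh                   # binary projection vectors
    row_bin = row_proj > thresh
    if not col_bin.any() or not row_bin.any():
        return 0.0, 0.0                           # nothing above threshold: no correction
    xs, ys = np.flatnonzero(col_bin), np.flatnonzero(row_bin)
    obj_cx = (xs[0] + xs[-1]) / 2.0               # midpoint of the object span
    obj_cy = (ys[0] + ys[-1]) / 2.0
    win_cy, win_cx = (window.shape[0] - 1) / 2.0, (window.shape[1] - 1) / 2.0
    return obj_cy - win_cy, obj_cx - win_cx

# Example: a bright 8x8 blob placed off-center in a 32x32 window.
img = np.zeros((32, 32)); img[4:12, 20:28] = 1.0
print(center_offset(img, thresh=2.0))             # roughly (-8.0, +8.0)
```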
(3) To address the limited accuracy of high-speed object detection based on handcrafted descriptors, a convolutional-neural-network-based high-speed object detection method is proposed to improve detection accuracy and scene applicability. A lightweight network structure and a hardware-friendly quantization scheme are proposed to resolve the conflict between the FPGA's limited hardware resources and the large number of network parameters. Following the two-stage R-CNN detection paradigm, candidate regions are proposed with traditional image features, and the convolutional neural network classifies the objects within the proposed regions. Different quantization strategies are designed according to the characteristics of the network's weights and intermediate results, and all parameters are stored in on-chip memory in fixed-point form. Simulation results in Vivado show that the proposed method achieves object detection at 2,000 frames per second, and accuracy evaluation shows that it outperforms the preceding algorithms.
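The fixed-point storage step of contribution (3) could look like the sketch below, which picks per-layer fraction bits for symmetric 8-bit weights and rounds them to integers; the bit widths and the per-layer strategy here are assumptions and may differ from the dissertation's actual scheme.

```python
# Minimal sketch of per-layer fixed-point weight quantization. The symmetric
# 8-bit format and the fraction-bit selection rule are assumptions.
import numpy as np

def quantize_layer(weights: np.ndarray, total_bits: int = 8):
    """Pick fraction bits that cover the layer's dynamic range, return int weights."""
    max_abs = np.max(np.abs(weights))
    int_bits = max(0, int(np.ceil(np.log2(max_abs + 1e-12))) + 1)  # sign + integer part
    frac_bits = total_bits - int_bits
    q = np.clip(np.round(weights * (1 << frac_bits)),
                -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1)
    return q.astype(np.int8), frac_bits

# Example: quantize random conv weights and measure the reconstruction error.
w = np.random.default_rng(1).normal(scale=0.1, size=(16, 3, 3, 3))
w_q, fb = quantize_layer(w)
print(fb, np.abs(w - w_q.astype(np.float32) / (1 << fb)).max())
```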
(4) To meet the application requirements of vision-based cell screening, a high-speed vision microsphere detection system was built from a high-speed camera, a microfluidic chip, and a microscope; it achieves real-time, high-speed, high-throughput microsphere detection and verifies the effectiveness of the proposed high-speed object detection algorithms in a practical application. An electric syringe pump drives the injection needle at a constant speed so that the polystyrene microspheres flow uniformly through the microfluidic channel. Images are captured and feature-extracted in the high-speed camera and then transmitted to the host computer. By parsing the image features carried in the additional information line, the host obtains the positions and number of the microspheres. Host software built on the Qt framework and the open-source computer vision library OpenCV displays the relevant information. Experimental results show that the proposed high-speed vision object detection methods can be applied to cell screening and meet its accuracy requirements.
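As an illustration of the host-side parsing in contribution (4), the sketch below decodes a hypothetical "additional information line" appended to each frame. The abstract does not specify the actual encoding, so the layout used here (a count value followed by 16-bit x-coordinates, two bytes each) is purely an assumption.

```python
# Minimal sketch: split off the appended information line of a frame and decode
# the microsphere count and x-positions. The encoding is an assumed example.
import numpy as np

def parse_info_line(frame: np.ndarray):
    """Split off the last image row and decode it into detected x-positions."""
    image, info = frame[:-1], frame[-1].astype(np.uint16)
    count = int(info[0])
    xs = [int(info[1 + 2 * i] << 8 | info[2 + 2 * i]) for i in range(count)]
    return image, count, xs

# Example: a synthetic 480x640 frame whose last row encodes two detections.
frame = np.zeros((481, 640), dtype=np.uint8)
frame[-1, 0] = 2
frame[-1, 1:5] = [0, 120, 1, 44]           # x = 120 and x = 300
print(parse_info_line(frame)[1:])          # (2, [120, 300])
```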

Keywords: high-speed vision; object detection; FPGA; HOG features; projection information; deep learning; parameter quantization
Language: Chinese
Sub-direction classification (seven major research directions): Intelligent Hardware
Document type: Degree thesis
Identifier: http://ir.ia.ac.cn/handle/173211/39063
Collection: Graduates - Doctoral Dissertations
Recommended citation (GB/T 7714):
李建权. 基于FPGA的高速视觉目标检测方法研究[D]. 中国科学院大学, 2020.
Files in this item:
李建权-博士学位论文-201718014 (6969 KB) | Document type: Degree thesis | Access: Restricted | License: CC BY-NC-SA