Research on Image Semantic Edge Detection Methods Based on Attention Mechanism
陈宇航
2022-05-25
Pages: 63
Subtype: Master's
Abstract

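Semantic edge detection is one of the classic research tasks in computer vision; it aims to identify and localize the edge pixels of target objects in an image. In early work, researchers detected edges mainly from the color, gradient, and texture information of an image, for example with the hand-designed Sobel operator. Although such methods have low computational complexity, they are easily affected by environmental factors and are not robust, so they struggle to meet the accuracy requirements of complex scenes. In addition, their heavy reliance on hand-designed feature extractors limits further performance gains.
In recent years, with the rapid development of deep learning, convolutional neural networks have been applied at scale in computer vision, and the performance of semantic edge detection algorithms has improved qualitatively. Traditional methods have weak noise resistance and cannot select specific edges, so they are suitable only for low-precision edge extraction. Deep-learning-based semantic edge detection algorithms, by contrast, are robust and can learn the edges of interest, and have therefore attracted considerable attention from researchers. Fully convolutional networks, represented by CASENet, are a common approach to semantic edge detection, and they usually adopt an encoder-decoder structure. The encoder downsamples repeatedly during convolution and can thus extract features at different scales. The decoder restores the fused feature maps to the original resolution. These methods have two main drawbacks: first, heavy downsampling loses a large amount of image edge detail; second, the CNN structure has difficulty modeling long-range contextual information, which leads to many edge detection errors. To address these problems, this thesis builds on existing networks, introduces the attention mechanism, and designs new semantic edge detection models. Experiments show that introducing the attention mechanism effectively strengthens the modeling of long-range dependencies between image pixels and thereby improves the accuracy of semantic edge detection. The main contributions of this thesis are as follows:
(1) A semantic edge detection network based on the attention mechanism, SEDTR, is proposed. On top of the traditional encoder-decoder, the network introduces two branches that use a Transformer structure to extract low-level edge information and high-level semantic information, respectively; the features are fused by a cross-branch attention mechanism before being fed to the decoder. Experiments show that the detection accuracy of SEDTR reaches the current state of the art (SOTA).
(2) A lightweight ship waterline detection network based on the attention mechanism, WLNet, is proposed. The network incorporates semantic segmentation into the waterline detection framework as an auxiliary task, and its feature extraction and feature fusion modules are designed around the attention mechanism to strengthen the model's perception of waterline edge pixels. In addition, the continuity of the waterline and the duality of the two tasks are used to construct a new loss function for waterline detection, which improves detection accuracy. Experiments show that, on a dataset of 1000 water scale images collected on site at Huanghua Port, the proposed method substantially improves waterline detection accuracy over existing methods.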

Other Abstract

Semantic edge detection is one of the classical research tasks in computer vision. In the early days, researchers detected edges mainly from the color, gradient, and texture information of images, for example with the hand-designed Sobel operator. Although such methods have low computational complexity, they are vulnerable to environmental factors and have poor robustness, which makes it difficult for them to meet the detection accuracy requirements of complex scenes. In addition, their heavy dependence on manually designed feature extractors limits their performance.
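As an illustration of the hand-designed approach mentioned above, the following is a minimal NumPy sketch of Sobel gradient-magnitude edge detection. It is not taken from the thesis: the function name `sobel_magnitude`, the toy image, and the threshold are assumptions chosen only to show how a fixed kernel responds to an intensity change.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2-D grayscale image (valid region only)."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Accumulate the 3x3 window one kernel tap at a time (cross-correlation;
    # the sign difference from true convolution does not affect the magnitude).
    for i in range(3):
        for j in range(3):
            patch = image[i:i + h - 2, j:j + w - 2]
            gx += SOBEL_X[i, j] * patch
            gy += SOBEL_Y[i, j] * patch
    return np.hypot(gx, gy)


# Toy image (assumption): dark left half, bright right half, one vertical edge.
toy = np.zeros((5, 6))
toy[:, 3:] = 1.0
edges = sobel_magnitude(toy)
edge_mask = edges > 0.5 * edges.max()  # hypothetical threshold
```

On this toy image every row of `edges` is `[0, 4, 4, 0]`, so the mask marks only the two columns that straddle the step. Because the kernel is fixed, any noise or lighting change that alters local gradients changes the response directly, which is the robustness limitation the abstract describes.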

In recent years, with the rapid development of deep learning, convolutional neural networks have been widely applied in computer vision, and the performance of semantic edge detection has improved qualitatively. Traditional methods have weak noise resistance and cannot select specific edges, so they are suitable only for low-precision edge extraction. Semantic edge detection based on deep learning, by contrast, is robust and can learn the edges of interest, and has therefore attracted researchers' attention. Fully convolutional networks, represented by CASENet, are a common approach to semantic edge detection. Such methods adopt an encoder-decoder structure that extracts features at different scales while continuously down-sampling and restores the original resolution after feature fusion. These methods have two main defects: first, many image edge details are lost because of heavy down-sampling; second, the CNN structure has difficulty modeling long-range context information, resulting in many detection errors. To solve these problems, this thesis builds on existing network research, introduces the attention mechanism, and designs new semantic edge detection models. Experiments show that introducing the attention mechanism effectively enhances the modeling of long-range dependencies between image pixels and thus improves the accuracy of semantic edge detection. The main contributions of this thesis are as follows:

(1) A semantic edge detection network based on the attention mechanism, SEDTR, is proposed. Building on the traditional encoder-decoder, the network introduces two branches, which use a Transformer structure to extract low-level edge information and high-level semantic information, respectively. The features are fused by a cross-branch attention mechanism before being input to the decoder. Experimental results show that SEDTR achieves state-of-the-art (SOTA) accuracy.

(2) A lightweight waterline detection network based on the attention mechanism, WLNet, is proposed. In this network, semantic segmentation is incorporated into the waterline detection framework as an auxiliary task. Feature extraction and feature fusion modules are designed based on the attention mechanism to enhance the model's perception of waterline edge pixels. In addition, a new loss function for waterline detection is constructed from the continuity of the waterline and the duality of the two tasks, which improves the accuracy of waterline detection. Experiments show that, compared with existing methods, the proposed method greatly improves waterline detection accuracy on a dataset of 1000 water scale images collected from Huanghua Port.
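Contribution (1) describes fusing two branches with cross-branch attention. The abstract does not give the exact design, so the sketch below shows only generic single-head scaled dot-product cross-attention in NumPy, not the SEDTR module itself. The token counts, feature width, random projection matrices, and the residual addition used for fusion are all assumptions for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product attention from one token set to another.

    queries: (n_q, d) tokens that ask for information.
    context: (n_c, d) tokens that supply keys and values.
    Returns the attended features (n_q, d_out) and the weights (n_q, n_c).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each query attends over all context tokens
    return weights @ v, weights


rng = np.random.default_rng(0)
d = 8
edge_tokens = rng.normal(size=(16, d))     # hypothetical low-level branch features
semantic_tokens = rng.normal(size=(4, d))  # hypothetical high-level branch features
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

attended, weights = cross_attention(edge_tokens, semantic_tokens, w_q, w_k, w_v)
fused = edge_tokens + attended  # residual fusion (an assumption, not from the thesis)
```

Every edge token receives a weighted mix of all semantic tokens regardless of spatial distance, which is the long-range dependency modeling that the abstract credits to the attention mechanism.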

Keyword: semantic edge detection; attention mechanism; waterline detection; feature fusion
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/48746
Collection: Graduates_Master's Theses
Corresponding Author: 陈宇航
Recommended Citation
GB/T 7714
陈宇航. 基于注意力机制的图像语义边缘检测方法研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2022.
Files in This Item:
File Name/Size | DocType | Version | Access | License
毕业论文(上传系统)_陈宇航.pdf (11846 KB) | Thesis | | Restricted access | CC BY-NC-SA