Robust Audio Watermarking Algorithms with High Embedding Capacity (高嵌入容量的鲁棒音频水印算法)
Wu Shiqiang (吴世强)
2024-05-19
Pages: 130
Subtype: Doctoral
Abstract

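With the rapid development of information technology, digital audio content can be copied and distributed easily over the Internet. That same convenience has fueled online piracy, which harms the interests of copyright owners and hinders the development of the digital economy, so protecting the copyright of digital audio content is urgent. Audio watermarking offers a solution: it imperceptibly embeds a copyright identifier (the watermark) into the audio, and when a copyright dispute arises, the watermark is extracted from the audio to establish ownership.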

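A watermarking technique that protects audio copyright effectively must combine embedding capacity, imperceptibility, and robustness. These three properties constrain one another, however, and improving any one of them degrades the other two. A copyright identifier typically contains the copyright owner, the licensee, the copyright type, and the start and end dates. Most existing audio watermarking algorithms concentrate on imperceptibility and robustness and neglect embedding capacity; they usually embed only a few tens of bits of watermark information, which cannot meet application requirements. Forcing more watermark information into the host audio, in turn, damages imperceptibility and robustness. How to design audio watermarking algorithms with high embedding capacity while balancing imperceptibility and robustness, that is, improving the watermark's resistance to signal processing attacks, desynchronization attacks, and attacks whose details are unknown, is therefore a pressing problem for audio watermarking.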

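To address these problems, this thesis designs a series of robust audio watermarking algorithms with high embedding capacity. The main work and contributions are summarized as follows: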

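1. Extended-codebook spread spectrum audio watermarking algorithm based on diversity reception

To raise the embedding capacity and the robustness against signal processing attacks, this thesis proposes an extended-codebook spread spectrum audio watermarking algorithm based on diversity reception. It first builds an embedding capacity model for spread spectrum methods, which reveals that the embedding capacity depends on the codebook size and on the watermark's robustness. Based on this model, it then proposes the extended-codebook spread spectrum watermarking algorithm, in which each spreading sequence carries multiple watermark bits, greatly increasing capacity. Finally, a diversity reception mechanism is used to optimize the extended-codebook algorithm; it reduces the randomness of the interference introduced by the host signal and by signal processing attacks and thereby improves robustness. Both theory and experiments show that the proposed algorithm has a very high embedding capacity and can withstand a variety of signal processing attacks.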

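2. Adaptive spread spectrum audio watermarking algorithm based on local feature points

In the high-capacity setting, to improve the watermark's imperceptibility and its robustness against desynchronization attacks, this thesis proposes an adaptive spread spectrum audio watermarking algorithm based on local feature points. A local feature point detection module is proposed; the detected feature points serve as positional references for the watermark embedding regions, which strengthens robustness against desynchronization attacks. A spreading-sequence generator is then proposed whose sequences are mutually orthogonal and have balanced element amplitudes, improving robustness against signal processing attacks and host-signal interference. In addition, an adaptive embedding strength strategy is proposed that makes the watermark more imperceptible while preserving its robustness. Finally, these modules are combined with an existing high-capacity embedding method to balance imperceptibility, robustness, and embedding capacity. Experiments show that the proposed algorithm offers high embedding capacity and excellent imperceptibility and is robust against a variety of signal processing and desynchronization attacks.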

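3. Adversarial audio watermarking algorithm based on deep features

In the high-capacity setting, to improve the watermark's resistance to attacks whose details are unknown, this thesis proposes an adversarial audio watermarking algorithm based on deep features. It first proposes a new embedding method that uses a trained feature extractor to embed the watermark into the deep features of the audio. Data augmentation is then applied both when training the deep feature extractor and when embedding the watermark, which improves robustness against multiple attack types. The watermark information is also protected with error correction coding, which further strengthens robustness. Experiments show that the algorithm scales better than traditional algorithms, resists many attack types whether or not their details are known, and meets the requirements of high-capacity applications.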

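The three audio watermarking algorithms studied in this thesis combine high embedding capacity with resistance to many attack types. They can protect audio copyright in a range of scenarios, help maintain order in audio distribution, and support the development of the digital economy.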

Other Abstract

With the rapid development of information technology, digital audio content can be easily copied and distributed over the Internet. However, this convenience has also fueled online piracy, which harms the rights and interests of copyright owners and hinders the development of the digital economy. Protecting the copyright of digital audio content is therefore urgent. Audio watermarking provides a promising solution: it imperceptibly embeds a copyright identifier (the watermark) into the host audio and, when a copyright dispute arises, extracts the watermark from the audio to determine ownership.

Effective use of audio watermarking for copyright protection depends on three properties: embedding capacity, imperceptibility, and robustness. However, the three constrain one another, and enhancing any one of them reduces the other two. A copyright identifier generally contains information such as the copyright owner, licensee, copyright type, and start and end dates. Most existing audio watermarking algorithms focus on improving imperceptibility and robustness and ignore the importance of embedding capacity; they usually embed only a few tens of bits of watermark information, which cannot meet the requirements of practical applications. If more watermark information is forcibly embedded in the host audio, the imperceptibility and robustness of the watermark are damaged. Therefore, how to design audio watermarking algorithms with high embedding capacity while balancing imperceptibility and robustness, i.e., improving the watermark's resistance to signal processing attacks, desynchronization attacks, and attacks with unknown details, is an urgent problem in audio watermarking.

To address these problems, this thesis designs a series of robust audio watermarking algorithms with high embedding capacity. The main contributions of this thesis are summarized as follows:

1. Extended codebook spread spectrum audio watermarking algorithm based on diversity reception

To improve the embedding capacity and the robustness against signal processing attacks, an extended codebook spread spectrum audio watermarking algorithm based on diversity reception is proposed. This thesis first constructs an embedding capacity model for spread spectrum algorithms and shows that the embedding capacity depends on the codebook size and the watermark robustness. Then, based on this model, an extended codebook spread spectrum watermarking algorithm is proposed in which each spreading sequence carries multiple watermark bits, which significantly improves the watermarking capacity. Finally, a diversity reception mechanism is adopted to optimize the proposed algorithm; it reduces the randomness of the interference introduced by host signals and signal processing attacks and improves the robustness of the watermark. Both theory and experiments show that the proposed algorithm has a high embedding capacity and is robust against multiple signal processing attacks.
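
As a rough illustration of the idea in this contribution, the Python sketch below shows one common way a spreading-sequence codebook can carry several bits per sequence: a k-bit symbol selects one of 2^k pseudo-random sequences, and the detector picks the sequence with the largest correlation. Diversity reception is interpreted here, as an assumption, as embedding the same symbol in several host segments and summing the correlation scores. The function names, the +/-1 codebook, the additive embedding rule, and all parameter values are hypothetical and are not taken from the thesis.

```python
import random


def make_codebook(bits_per_symbol, length, seed=0):
    """Hypothetical codebook: 2**bits_per_symbol pseudo-random +/-1 sequences."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(length)]
            for _ in range(2 ** bits_per_symbol)]


def embed(host, codebook, symbol, alpha):
    """Additive embedding: add the selected sequence, scaled by alpha, to the host."""
    return [h + alpha * s for h, s in zip(host, codebook[symbol])]


def correlate(segment, sequence):
    """Normalized correlation between a received segment and one sequence."""
    return sum(x * s for x, s in zip(segment, sequence)) / len(sequence)


def detect(segments, codebook):
    """Sum correlation scores over all received copies, then pick the best symbol."""
    totals = [sum(correlate(seg, seq) for seg in segments) for seq in codebook]
    return max(range(len(codebook)), key=lambda i: totals[i])


# Toy run: Gaussian noise stands in for the host audio (an assumption).
rng = random.Random(1)
codebook = make_codebook(bits_per_symbol=4, length=512)
symbol = 11  # one 4-bit symbol, i.e. bits 1011
copies = [embed([rng.gauss(0.0, 1.0) for _ in range(512)], codebook, symbol, alpha=0.3)
          for _ in range(4)]
print(detect(copies, codebook))
```

With 16 sequences, each embedded sequence carries 4 bits instead of 1. Enlarging the codebook raises capacity but narrows the margin between the correct and the competing correlation scores, which is one way to read the dependence of capacity on codebook size and robustness described above.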

2. Adaptive spread spectrum audio watermarking algorithm based on local feature points

To further improve the watermark's imperceptibility and robustness against desynchronization attacks under high embedding capacity, an adaptive spread spectrum audio watermarking algorithm based on local feature points is proposed. First, a local feature point detection module is proposed; the detected feature points are treated as positional references for the watermark embedding regions, which enhances the robustness of the algorithm against desynchronization attacks. Then, a generator for spreading sequences is proposed; the generated sequences are mutually orthogonal and the amplitudes of their elements are balanced, which improves the robustness of the algorithm against signal processing attacks and host signal interference. In addition, an adaptive embedding strength strategy is proposed, which not only makes the watermark more imperceptible but also preserves its robustness. Finally, these modules are combined with existing high-capacity embedding methods to balance the imperceptibility, robustness, and embedding capacity of the watermarking algorithm. Experiments demonstrate that the proposed algorithm has a high embedding capacity, excellent imperceptibility, and strong robustness against multiple signal processing attacks and desynchronization attacks.
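
The abstract does not say how the spreading-sequence generator or the adaptive strength rule is built. As a hedged illustration only, the Python sketch below uses a Sylvester-type Hadamard matrix as a stand-in, since its rows are mutually orthogonal and every element has magnitude 1, and it scales the embedding strength to the RMS level of each host segment so that the watermark sits a fixed number of decibels below the host. The names hadamard, adaptive_strength, embed, and target_snr_db are hypothetical and do not come from the thesis.

```python
import math


def hadamard(order):
    """Sylvester construction of a +/-1 Hadamard matrix; order must be a power of two."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    rows = [[1]]
    while len(rows) < order:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows


def adaptive_strength(segment, target_snr_db):
    """Hypothetical rule: place the watermark target_snr_db below the segment's RMS level."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    return rms * 10 ** (-target_snr_db / 20)


def embed(segment, sequence, target_snr_db):
    """Add one +/-1 sequence to the segment with segment-dependent strength."""
    alpha = adaptive_strength(segment, target_snr_db)
    return [x + alpha * s for x, s in zip(segment, sequence)]


sequences = hadamard(8)
loud = [4.0, -4.0, 4.0, -4.0, 4.0, -4.0, 4.0, -4.0]
quiet = [0.1 * x for x in loud]
# Louder segments receive a proportionally stronger watermark.
print(adaptive_strength(loud, 20.0), adaptive_strength(quiet, 20.0))
```

Tying the strength to the local signal level is only one simple way to trade imperceptibility against robustness; a perceptual masking model would be a more elaborate alternative.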

3. Adversarial audio watermarking algorithm based on deep features

To improve the robustness of the watermark against attacks with unknown details under high embedding capacity, an adversarial audio watermarking algorithm based on deep features is proposed. This thesis first proposes a novel watermark embedding algorithm that uses a trained feature extractor to embed the watermark into the deep features of the audio. Then, data augmentation is adopted when training the deep feature extractor and when embedding the watermark, which improves the robustness of the watermark against multiple attacks. Additionally, error correction coding is applied to the watermark bits to further improve the robustness of the algorithm. Experiments show that the algorithm exhibits better scalability than traditional algorithms and is robust against attacks with known or unknown details while meeting the requirements of applications with high embedding capacity.
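
The abstract does not name the error correction code that is used. Purely as an illustration of why coding the watermark bits helps, the Python sketch below uses a Hamming(7,4) code as a stand-in: every 4 watermark bits become 7 embedded bits, and any single bit error that an attack introduces in a 7-bit block is corrected at extraction. The function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the error, 0 if none
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


payload = [1, 0, 1, 1]
received = hamming74_encode(payload)
received[5] ^= 1  # simulate one bit damaged by an attack
print(hamming74_decode(received) == payload)
```

The price is capacity: 7 bits are embedded for every 4 payload bits, so error correction of this kind is easier to afford when the underlying embedding method has capacity to spare.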

The three audio watermarking algorithms studied in this thesis all have high embedding capacity and resist various attacks. They can protect audio copyright in various scenarios, maintain the order of audio dissemination, and support the development of the digital economy.

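Keywords: audio watermarking; signal processing attack; desynchronization attack; spread spectrum; deep learning
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/57321
Collection: Graduates_Doctoral Dissertations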
Recommended Citation
GB/T 7714
Wu Shiqiang (吴世强). 高嵌入容量的鲁棒音频水印算法 [Robust Audio Watermarking Algorithms with High Embedding Capacity] [D], 2024.
Files in This Item:
File Name/Size: 高嵌入容量的鲁棒音频水印算法.pdf (3318KB)
DocType: Thesis
Access: Restricted access
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.