Research on Financial Fraud Detection Methods Based on Graph Neural Networks (基于图神经网络的金融欺诈检测方法研究)

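Title	Research on Financial Fraud Detection Methods Based on Graph Neural Networks
Author	Jing Rongrong (荆蓉蓉)
Date	2022-05
Pages	95
Degree type	Master's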
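Abstract (translated from Chinese)	With the rapid growth of e-commerce and online payment, the volume of digital financial transactions has risen sharply. Financial fraud cases have increased year by year alongside it, posing a serious threat to personal property and to national financial stability. Financial transaction behavior tends to cluster in time and space, and graph neural networks can model both the homophily and the structural properties of a network, which makes them very effective for financial fraud detection. However, because financial transactions often suffer from data quality problems and evolve dynamically, existing solutions based on graph neural networks may struggle to adapt to such complex fraud data. Combining graph neural network methods with data mining techniques, and drawing on the characteristics of several fraud scenarios including credit card fraud, health insurance fraud, and cryptocurrency money laundering, this thesis studies three problems in financial fraud detection: class imbalance and missing data, label scarcity, and the dynamic evolution of transaction networks. The main contributions are:
1. A credit card fraud detection method based on a hybrid data augmentation strategy. The credit card fraud task has two characteristics: the data distribution is extremely imbalanced, and feature values are frequently missing. These characteristics make it hard for deep learning algorithms to learn latent patterns in the data and lead to poor generalization. The proposed method uses a large-scale sparse matrix completion algorithm based on spectral regularization and an interpolation-based oversampling method to address the imbalance and missing-data problems. To exploit the rich relationships among structured data records, the thesis infers latent relationships between user nodes from sample similarity and applies a graph neural network for fraud detection. The method effectively improves the quality of the data used for fraud detection and thereby improves the identification of risky transactions.
2. A financial fraud detection method based on a semi-supervised adaptive graph convolutional neural network. Financial transactions are updated extremely quickly, and data labeling is time-consuming and expensive; severe label scarcity limits classifier performance. The thesis proposes an adaptive graph convolutional neural network model that automatically learns from the data a graph structure suited to the downstream task and performs semi-supervised learning with a graph convolutional network. To further improve performance, the thesis designs a semi-supervised learning framework based on noisy self-training, which generates pseudo-labels for unlabeled data and trains an efficient fraud detection model on them together with the labeled data. The injected random noise, especially adversarial perturbation noise, improves the robustness of the trained model and the generalization of its predictions. Experiments on a large industrial dataset show that the proposed model improves learning efficiency, strengthens robustness, and improves fraud detection performance.
3. A transaction network fraud detection method based on dynamic inductive graph representation learning. The thesis focuses on the dynamic-network setting of real-world financial transaction fraud and designs a graph representation learning method for dynamic transaction networks. The method uses two recurrent neural networks to hierarchically learn the evolution dynamics of the parameters of the neighborhood aggregation module and the node representation update module in a graph neural network, extending the classical inductive graph neural network approach to dynamic networks. Combining the two architectures lets the model generate embeddings inductively for previously unseen nodes, so that it copes better with unknown nodes appearing in the transaction network, and also capture the evolution patterns of dynamic transaction networks, so that it predicts better at future time steps. Extensive experiments on real data show that the proposed method achieves highly competitive results on risk detection in dynamic transaction networks and has potential practical value.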
Abstract (English)	With the development of e-commerce and online payment, the volume of digital financial transactions is growing rapidly. Meanwhile, the number of financial fraud cases is increasing year by year, threatening people's property security and the country's financial stability. Financial transaction behavior is characterized by aggregation in time and space. Graph neural networks can capture homophily and structural equivalence at the same time, which makes them effective for financial fraud detection. However, because of the data quality problems and dynamic evolution that are common in financial transactions, current solutions based on graph neural networks may have difficulty adapting to such complex financial fraud data. Building on graph neural network methods and other data mining techniques, this thesis draws on the characteristics of credit card fraud, health insurance fraud, cryptocurrency money laundering, and other financial fraud scenarios, and studies three problems in financial fraud detection: class imbalance and missing data, label scarcity, and the dynamic evolution of transaction networks. The main research contributions of this thesis are:
1. A credit card fraud detection method based on a hybrid data enhancement strategy. Credit card fraud tasks have two characteristics: imbalanced data and missing features. These characteristics make it difficult for deep learning algorithms to learn the latent patterns in the data, which leads to poor generalization. The proposed method uses a large-scale sparse matrix completion algorithm based on spectral regularization and an interpolation-based oversampling method to address imbalanced and missing data. To take advantage of the rich relationships within structured data, the thesis uses sample similarity to infer the potential relationships between user nodes. This method can effectively improve the data quality for fraud detection and the performance of risky transaction identification.
2. A financial fraud detection method based on a semi-supervised adaptive graph convolutional neural network. Financial transactions are updated very quickly, and data annotation is time-consuming and costly. Severe label scarcity can limit the performance of the classifier. The thesis proposes a semi-supervised adaptive graph convolutional neural network model. The model automatically learns graph structures suitable for downstream tasks from the data and carries out semi-supervised learning with a graph convolutional neural network. To improve the performance of the model further, a semi-supervised learning framework based on noisy self-training is designed. This framework uses self-training to generate pseudo-labels for unlabeled data and trains an efficient financial fraud detection model on them together with the labeled data. The random noise, especially the adversarial noise, can enhance the robustness of the trained model and the generalization of its predictions. Experiments on large industrial datasets show that the proposed model is more robust and improves fraud detection performance.
3. A transaction network fraud detection method based on dynamic inductive graph representation learning. The thesis focuses on the dynamic network setting of financial transaction fraud and designs a graph representation learning method for dynamic transaction networks. In this method, the classical inductive graph neural network approach is extended to dynamic networks by using two recurrent neural networks to learn how the parameters of the neighborhood aggregation module and the node update module evolve over time. The model can generate embeddings for unseen nodes, so that it copes better with the appearance of unknown nodes in the transaction network, and it can also capture the evolution patterns of dynamic transaction networks in order to predict better at future time steps. Experimental results on real data show that the proposed method achieves very competitive results on risk detection in dynamic transaction networks and has potential value for practical use.
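As a rough illustration of the first contribution only, the sketch below shows interpolation-based oversampling of a minority (fraud) class followed by the construction of a similarity graph over the samples. It is not the thesis's implementation: the function names (`interpolate_minority`, `knn_graph`), the use of cosine similarity, the k-nearest-neighbour linking rule, and the toy data are all assumptions made here for illustration, and the matrix completion and graph neural network stages are omitted.

```python
import math
import random


def interpolate_minority(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Assumes `minority` holds at least two samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [x for x in minority if x is not base]
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (m - b) for b, m in zip(base, mate)])
    return synthetic


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def knn_graph(samples, k=2):
    """Return an undirected edge set that links each sample to its k most
    cosine-similar samples (an assumed stand-in for the inferred relations)."""
    edges = set()
    for i, u in enumerate(samples):
        ranked = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: cosine(u, samples[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


# Toy minority-class feature vectors (hypothetical data).
fraud = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]]
new_fraud = interpolate_minority(fraud, n_new=6)
graph = knn_graph(fraud + new_fraud, k=2)
```

Each synthetic sample lies on the segment between two real minority samples, so it stays inside the region the minority class already occupies. The resulting edge set is the kind of input a graph neural network classifier would consume alongside the node features.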
Keywords	graph neural networks; financial fraud detection; data quality; label scarcity; dynamic transaction networks
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48848
Collection	Graduates_Master's theses
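Recommended citation (GB/T 7714)	Jing Rongrong. Research on Financial Fraud Detection Methods Based on Graph Neural Networks [D]. Institute of Automation, Chinese Academy of Sciences, 2022.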

Files in this item
File name/size	Document type	Version type	Access type	License
基于图神经网络的金融欺诈检测v20.pd (3339KB)	Thesis		Restricted access	CC BY-NC-SA