Training Large Language Models to Follow System Prompt with Self-Supervised Fine-Tuning
Junyan Qiu 1,2; Haitao Wang 3; Yiping Yang 2
2024-03
Conference Name: International Joint Conference on Neural Networks
Conference Date: 2024-07
Conference Place: Yokohama, Japan
Publisher: IEEE
Abstract

In the realm of artificial intelligence, system prompts stand as directives or requests aimed at guiding systems, such as programming environments or AI models, to execute specific tasks or operations. Typically positioned at the beginning of input sequences in large language models, these prompts play a pivotal role in shaping the model's response and guiding its interaction flow. However, a notable challenge emerges during multi-turn dialogues, where these models gradually diverge from adhering to the initial system prompt, leading to inconsistencies in the dialogue. In this paper, we present a scalable framework facilitating the adherence of language models to system prompts through automated data construction. Our approach, termed Self-Supervised System Prompt Fine-Tuning (S3FT), begins by prompting a language model to modify real dialogue responses to fit a specific system prompt via stylized translation. Subsequently, we select a small sample of these responses for human preference annotation. This annotated data is used to train the language model to act as a discriminator, identifying high-quality examples that are then employed in further supervised fine-tuning. Experimental results on several datasets demonstrate that applying our method to LLaMA2 and ChatGLM improves human preference rates by over 50% and outperforms ChatGPT and GPT-4 by a considerable margin. The source code of our paper is available in S3FT-repo.
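
The abstract describes a four-stage data-construction pipeline: rewriting real responses to fit a system prompt, annotating a small preference sample, filtering the rewrites with a discriminator trained on that sample, and running supervised fine-tuning on what survives. The minimal Python sketch below illustrates how the rewriting and filtering stages might be wired together; the generate and score callables, DialogueTurn type, and threshold are illustrative assumptions, not the interface of the released S3FT-repo code.

# Minimal sketch of an S3FT-style data-construction loop. All names below
# (generate, score, DialogueTurn, threshold) are hypothetical placeholders,
# not the paper's released implementation.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class DialogueTurn:
    user: str
    assistant: str

def rewrite_to_system_prompt(generate: Callable[[str], str],
                             system_prompt: str,
                             dialogue: List[DialogueTurn]) -> List[DialogueTurn]:
    """Stage 1: ask a language model to restyle real responses so they
    follow the given system prompt (the 'stylized translation' step)."""
    rewritten = []
    for turn in dialogue:
        request = (
            f"System prompt: {system_prompt}\n"
            f"User message: {turn.user}\n"
            f"Original answer: {turn.assistant}\n"
            "Rewrite the answer so that it strictly follows the system prompt, "
            "keeping the factual content unchanged."
        )
        rewritten.append(DialogueTurn(turn.user, generate(request)))
    return rewritten

def filter_with_discriminator(score: Callable[[DialogueTurn], float],
                              candidates: List[DialogueTurn],
                              threshold: float = 0.5) -> List[DialogueTurn]:
    """Stage 3: keep only rewrites that a discriminator (trained on the small
    human preference sample from Stage 2) judges as high quality. The kept
    pairs then feed standard supervised fine-tuning (Stage 4)."""
    return [c for c in candidates if score(c) >= threshold]

In this reading, the human labels from Stage 2 are used only to train the scoring model, so the bulk of the fine-tuning data is produced automatically, which is what makes the framework scalable.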

Keywords: large language models; supervised fine-tuning; instruction tuning; stylized generation
MOST Discipline Catalogue: Engineering :: Computer Science and Technology (degrees conferrable in Engineering and Science)
Indexed By: EI
IS Representative Paper
Sub-direction Classification: Natural Language Processing
Planning Direction of the State Key Laboratory: Speech and Language Processing
Paper Associated Data
Document Type: Conference Paper
Identifier: http://ir.ia.ac.cn/handle/173211/57413
Collection: Research Center for Integrated Information Systems_Visual Perception Fusion and Its Applications
Corresponding Author: Junyan Qiu
Affiliation: 1. University of Chinese Academy of Sciences
2. Institute of Automation, Chinese Academy of Sciences
3. Meituan
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Junyan Qiu, Haitao Wang, Yiping Yang. Training Large Language Models to Follow System Prompt with Self-Supervised Fine-Tuning[C]. IEEE, 2024.
Files in This Item:
a185-qiu final.pdf (1596 KB), Conference Paper, Open Access, CC BY-NC-SA
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.