Could you explain how to implement prompt attack detection? #1466
Unanswered
chengq2020
asked this question in
Q&A
Replies: 0 comments
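Some users may enter prompts that carry safety or compliance risks, such as content involving discrimination, pornography or violence, or infringement. Common techniques include goal hijacking and role-playing. In my tests with several such questions, ChatGLM defended against all of them very well. Could you share how this is implemented and what defense techniques are used? Thank you.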