- https://huggingface.co/datasets/jerpint-org/HackAPrompt-Playground-Submissions
- https://huggingface.co/datasets/jerpint-org/HackAPrompt-AICrowd-Submissions
- https://huggingface.co/datasets/imoxto/prompt_injection_hackaprompt_gpt35
- https://huggingface.co/datasets/imoxto/prompt_injection_cleaned_dataset-v2
- https://github.com/compass-ctf-team/prompt_injection_research/blob/main/dataset/prompt-injection-dataset.csv
- https://github.com/rabbidave/Denzel-Crocker-Hunting-For-Fairly-Odd-Prompts/blob/main/bad_prompts.csv
- https://github.com/laiyer-ai/llm-guard/blob/main/llm_guard/resources/jailbreak.json.txt
Instruction-bypass-phrases.txt contains a dump of generateInjectionKeywords() from Rebuff https://github.com/protectai/rebuff/blob/main/server/lib/detect-helpers.ts
regex pattern
(Ignore|Disregard|Skip|Forget|Neglect|Overlook|Omit|Bypass|Pay no attention to|Do not follow|Do not obey)\s*(prior|previous|preceding|above|foregoing|earlier|initial)?\s*(content|text|instructions|instruction|directives|directive|commands|command|context|conversation|input|inputs|data|message|messages|communication|response|responses|request|requests)\s*(and start over|and start anew|and begin afresh|and start from scratch)?
Thank you!