@morganmcg1
Created January 29, 2025 08:17

Here's the data formatted as a markdown table; expanded, hedged code sketches for several of the tools follow the table:

| Tool/Library | Violation Handling | Code Example - Initialization | Code Example - Usage/Calling | How the Guardrail is Used |
| --- | --- | --- | --- | --- |
| AWS Bedrock Guardrails | Exceptions, warnings, output modification | `bedrock_client.create_guardrail(name='my_guardrail', ...)` | `response = client.invoke_model(modelId='anthropic.claude-v2', guardrailIdentifier='my_guardrail_id', guardrailVersion='DRAFT', ...)` | Applied to both user inputs and model outputs during inference. Can be used to filter harmful content, block denied topics, redact PII, and filter words. |
| NVIDIA NeMo Guardrails | Reject input, alter input, execute actions, reject output, alter output | `config = RailsConfig.from_path("path/to/config")`<br>`app = LLMRails(config)` | `new_message = app.generate(messages=[{"role": "user", "content": "Hello! What can you do for me?"}])` | Uses Colang to define flows and apply guardrails at various stages of the conversation. Can be used to validate user input and bot messages and to control dialogue flow. |
| Guardrails (Python library) | Corrective actions, re-prompting, exceptions | `guard = gd.Guard.from_rail_string(rail_str)` | `guard.validate("123-456-7890")` | Uses RAIL specifications to define guardrails. Can be used to validate and structure LLM output, enforce types, and apply corrective actions. |
| LangChain (with Guardrails validators) | Filter, fix, exception | `guard = Guard().use_many(CompetitorCheck(competitors=competitors_list, on_fail="fix"), ToxicLanguage(on_fail="filter"))` | `chain = prompt \| model` | Runs Guardrails validators as a step in a LangChain chain. Offers various types of guards, including dataset embeddings, general LLM guards, and RAG LLM guards. Can be used to detect jailbreak attempts, PII, toxicity, and other issues. |
| Arize | Default response, re-prompting, exception | `guard = Guard().use(ArizeDatasetEmbeddings, on="prompt", on_fail="exception")` | `validated_response = guard(llm_api=openai.chat.completions.create, prompt=prompt, model="gpt-3.5-turbo", max_tokens=1024, temperature=0.5)` | Offers various types of guards, including dataset embeddings, general LLM guards, and RAG LLM guards. Can be used to detect jailbreak attempts, PII, toxicity, and other issues. |
| Giskard | Generate Colang rules for mitigation | `scan_report = gsk.scan(my_model, my_dataset)` | `scan_report.generate_rails("config/generated_rails.co")` | Uses Colang rules to define guardrails. Can be used to detect vulnerabilities and generate Colang rules for mitigation. |
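
**AWS Bedrock Guardrails.** A minimal sketch of creating a guardrail and applying it at inference time with boto3 is shown below. The guardrail name, topic definition, PII action, and model ID are illustrative placeholders, and the exact policy-config fields should be checked against the current Bedrock/boto3 documentation.

```python
import json

import boto3

# Control-plane client: define and store the guardrail.
bedrock = boto3.client("bedrock", region_name="us-east-1")

guardrail = bedrock.create_guardrail(
    name="my_guardrail",  # placeholder name
    description="Blocks investment advice and anonymizes email addresses",
    topicPolicyConfig={
        "topicsConfig": [
            {
                "name": "Investment advice",
                "definition": "Recommendations about specific financial products or strategies.",
                "type": "DENY",
            }
        ]
    },
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [{"type": "EMAIL", "action": "ANONYMIZE"}]
    },
    blockedInputMessaging="Sorry, I can't help with that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)

# Runtime client: the guardrail is applied to both the prompt and the
# model output when it is referenced in invoke_model.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = runtime.invoke_model(
    modelId="anthropic.claude-v2",
    guardrailIdentifier=guardrail["guardrailId"],
    guardrailVersion="DRAFT",
    body=json.dumps(
        {
            "prompt": "\n\nHuman: Should I buy this stock?\n\nAssistant:",
            "max_tokens_to_sample": 300,
        }
    ),
)
print(json.loads(response["body"].read()))
```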
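
**NVIDIA NeMo Guardrails.** The flow is to load a config directory (a `config.yml` plus Colang `.co` flow files) into `LLMRails` and generate through it; every call then passes through the configured input, dialog, and output rails. A minimal sketch, assuming such a config directory already exists and the underlying LLM credentials are configured:

```python
from nemoguardrails import LLMRails, RailsConfig

# Load rails from a directory containing config.yml and Colang flows
# (the path is a placeholder).
config = RailsConfig.from_path("path/to/config")
app = LLMRails(config)

# The reply has already been checked/altered by the configured rails.
new_message = app.generate(
    messages=[{"role": "user", "content": "Hello! What can you do for me?"}]
)
print(new_message["content"])
```

The guardrails themselves (for example, refusing certain topics or rewriting unsafe output) live in the Colang files, not in the Python code.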
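
**Guardrails (Python library).** The table's `Guard.from_rail_string` call builds a guard from a RAIL XML spec; the sketch below expresses the same kind of phone-number check with the newer validator-based API instead, assuming the `RegexMatch` validator has been installed from the Guardrails Hub (`guardrails hub install hub://guardrails/regex_match`). The regex and `on_fail` action are illustrative.

```python
from guardrails import Guard
from guardrails.hub import RegexMatch  # assumes the hub validator is installed

# Require values to look like a US phone number; raise on violation.
guard = Guard().use(
    RegexMatch(regex=r"\d{3}-\d{3}-\d{4}", on_fail="exception")
)

outcome = guard.validate("123-456-7890")
print(outcome.validation_passed)  # True

# A non-matching value triggers the on_fail action (here, an exception).
try:
    guard.validate("not a phone number")
except Exception as err:
    print(f"Guardrails rejected the value: {err}")
```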
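
**LangChain.** In recent Guardrails releases the same guard can be dropped into a LangChain (LCEL) chain via `guard.to_runnable()`, so the validators run on the model output before it leaves the chain. A sketch, assuming the `CompetitorCheck` and `ToxicLanguage` validators are installed from the Guardrails Hub and an OpenAI key is configured; the competitor list and prompt are placeholders, and the exact integration API may differ between versions:

```python
from guardrails import Guard
from guardrails.hub import CompetitorCheck, ToxicLanguage  # assumes hub installs
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

competitors_list = ["Acme Corp", "Globex"]  # illustrative placeholder

# "fix" rewrites offending spans; "filter" drops offending sentences.
guard = Guard().use_many(
    CompetitorCheck(competitors=competitors_list, on_fail="fix"),
    ToxicLanguage(on_fail="filter"),
)

prompt = ChatPromptTemplate.from_template("Answer briefly: {question}")
model = ChatOpenAI(model="gpt-3.5-turbo")
output_parser = StrOutputParser()

# The guard is one more runnable step in the pipeline.
chain = prompt | model | guard.to_runnable() | output_parser

print(chain.invoke({"question": "How do we compare to other vendors?"}))
```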
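
**Giskard.** Giskard sits one step earlier than the other tools: it scans a wrapped model and dataset for vulnerabilities and can then export Colang rules for NeMo Guardrails to enforce. A sketch, assuming `my_predict_fn` (a callable taking a DataFrame of prompts and returning a list of completions) and `my_df` already exist; the wrapper arguments are illustrative and should be checked against the current Giskard docs:

```python
import giskard as gsk

# Wrap the model and data so the scanner can probe them.
giskard_model = gsk.Model(
    model=my_predict_fn,            # assumed: pd.DataFrame -> list[str]
    model_type="text_generation",
    name="Docs assistant",
    description="Answers questions about the product documentation.",
    feature_names=["question"],
)
giskard_dataset = gsk.Dataset(df=my_df, target=None)

# Run the automated vulnerability scan.
scan_report = gsk.scan(giskard_model, giskard_dataset)

# Export Colang rules that mitigate the detected issues, to be loaded
# by a NeMo Guardrails config.
scan_report.generate_rails("config/generated_rails.co")
```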