Guardrails & Moderation

Total 22 articles sites

Detect safety risks and toxicity

Writing & Documents Images & Design Video & Avatars Audio & Voice Productivity & Office Coding & Dev Search & Research Agents & Automation Marketing & Growth Customer Support Open Source & Models Prompts & Templates Entertainment

Sorting

release update Views Like

Meta Prompt Guard 2

Lightweight detectors for prompt injection and jailbreak attempts.

0370

Guardrails & Moderation # detector # injection # jailbreak

Meta Llama Guard 3

LLM-based safety classifier for prompts and responses; multiple sizes.

0300

Guardrails & Moderation # classifier # Llama Guard # LLM

NVIDIA NeMo Guardrails

Programmable guardrails (topic control, PII, jailbreak prevention) for LLM apps.

0380

Guardrails & Moderation # agents # Colang # guardrails

Google ShieldGemma

Open-weight safety classifiers for inputs/outputs; tune to your policies.

0460

Guardrails & Moderation # classifier # Gemma # google

Amazon Bedrock Guardrails

Centralized, reusable guardrails for prompts and responses across models.

0310

Guardrails & Moderation # AWS # Bedrock # governance

Azure AI Content Safety

Configurable content filters for harmful text and images with studio & APIs.

0460

Guardrails & Moderation # API # Azure # content safety

OpenAI Moderation API (omni-moderation-latest)

Multimodal moderation for text & images; granular safety categories and flags.

0400

Guardrails & Moderation # API # image # moderation