v vanemmerik.ai / aws-ai
Tip of the Day 2026 · 06 · 04 ≈ 9 min read Amazon Bedrock · Guardrails

Amazon Bedrock Guardrails.

After nine days inside AgentCore, today we step one layer up the stack to Amazon Bedrock Guardrails — the configurable safeguard layer that sits between every Bedrock foundation model (and every third-party model you choose to put behind it) and your users. Six filter types, two tiers, one API that runs without ever invoking a model.

$ POST /guardrail/{id}/version/{v}/apply  — ApplyGuardrail, no FM required

01What Guardrails actually is

Bedrock Guardrails is a policy layer you configure once and then attach to anything that produces or consumes model output. From the overview page: "Amazon Bedrock Guardrails provides configurable safeguards to help you build safe generative AI applications. With comprehensive safety and privacy controls across foundation models (FMs), Amazon Bedrock Guardrails offers a consistent user experience to help detect and filter undesirable content and protect sensitive information that might be present in user inputs or model responses."

The four attachment points listed on the use cases page:

SurfaceHow the guardrail is attached
Model inference guardrailConfig in a Converse / ConverseStream request, or the header on InvokeModel / InvokeModelWithResponseStream.
Bedrock Agents guardrailConfiguration field on CreateAgent / UpdateAgent.
Knowledge Bases guardrailConfiguration on RetrieveAndGenerate.
Bedrock Flows guardrailConfiguration on a PromptFlowNode or KnowledgeBaseFlowNode.

And, decoupled from all of those, the ApplyGuardrail API — covered in §06 — lets you evaluate any text against a guardrail without running a model at all.

The shift

Guardrails decouples the policy from the model. Configure it once, attach it to four surfaces, or call it standalone in front of a third-party model. One blocked-message UX, one set of CloudWatch metrics, one audit trail.

02The six filter types

Bedrock Guardrails ships six configurable filters, listed on the create-your-guardrail page. Pick any subset; a guardrail must have at least one.

03Two safeguard tiers — Standard vs Classic

From the safeguard tiers page, content filters, prompt attacks, and denied topics each pick a tier:

FeatureStandard tierClassic tier
Languages Extensive (20+) English, French, Spanish
Denied-topic definition 1,000 characters 200 characters
Prompt-leakage detection Supported Not supported
Cross-Region inference Supported Not supported
Code-domain coverage Filters extend into code comments, variable / function names, and string literals Not extended

The tier choice is per-policy. You can run Standard content filters next to Classic denied topics if that's what your migration plan looks like — but the docs are blunt about the direction of travel: Standard is "more robust" and is the default recommendation for new guardrails.

04How blocking and masking actually work

Every filter declares one of two handling actions in its result. From the harmful-content handling page:

Two consequences worth noting from the how-it-works page:

The exact wording from the docs on cost: if a guardrail blocks the response, "you're charged for the foundation model inference calls, in addition to the model response that was generated before the guardrail's evaluation." Block early; block input wherever you can.

05Contextual grounding — the RAG-only hallucination check

Contextual grounding is the filter that makes Guardrails interesting for RAG. From the contextual-grounding page, it requires three components per request:

The filter emits two independent confidence scores, each between 0 and 1, and compares them against the thresholds you configured:

The docs give a worked example: source says "London is the capital of UK. Tokyo is the capital of Japan." Query is "What is the capital of Japan?" An answer of "The capital of Japan is London" is relevant but ungrounded — low grounding score, BLOCK. An answer of "The capital of UK is London" is grounded but irrelevant — low relevance score, BLOCK. The two scores let you tune for the failure mode that matters in your domain.

Threshold gotcha

Thresholds live in [0, 0.99]. A threshold of 1 is invalid, not "strictest possible" — the service rejects it. Per the doc: "A threshold of 1 is invalid as that will block all content."

One subtlety for streaming: contextual grounding evaluates each chunk for relevance. With ConverseStream, a chunk may stream out before the whole response has been classified as irrelevant — so the user can see the start of an irrelevant answer that the filter then flags. Plan UX accordingly.

06ApplyGuardrail — policy without a model

The most underrated piece of Guardrails is that you don't need a Bedrock foundation model in the picture at all. The ApplyGuardrail API takes a guardrail ID + version + text, and returns the assessment. From the ApplyGuardrail docs: "You can use the ApplyGuardrail API to assess any text using your pre-configured Amazon Bedrock Guardrails, without invoking the foundation models."

The request shape:

POST /guardrail/{guardrailIdentifier}/version/{guardrailVersion}/apply {   "source":  "INPUT" | "OUTPUT",   "content": [ { "text": { "text": "..." } } ] }

The response has one top-level field that summarizes the decision:

{   "action": "GUARDRAIL_INTERVENED" | "NONE",   "output": [ { "text": "string" } ],  // blocked message OR masked content   "assessments": [ { /* topicPolicy, contentPolicy, wordPolicy,                  sensitiveInformationPolicy,                  contextualGroundingPolicy,                  invocationMetrics */ } ] }

Three patterns this unlocks:

ApplyGuardrail is metered in text units (one unit per 1,000 characters, rounded up) — see §07 for the per-policy throughput ceilings.

07Limits worth knowing

Pulled straight from the Amazon Bedrock service quotas table — the "(Guardrails)" rows. Numbers below are us-east-1 unless flagged otherwise. All are listed as not-adjustable unless noted.

A few non-quota gotchas the docs call out:

08Try it in five minutes

The fastest path is the AWS CLI + a fresh guardrail. Numbers below are docs-faithful but illustrative.

$ # 1. Create a small Standard-tier guardrail with content filters $ aws bedrock create-guardrail \     --name demo-guardrail --description "Smoke test" \     --blocked-input-messaging "I can't help with that." \     --blocked-outputs-messaging "I can't share that." \     --content-policy-config '{"filtersConfig":[       {"type":"HATE","inputStrength":"HIGH","outputStrength":"HIGH"},       {"type":"VIOLENCE","inputStrength":"HIGH","outputStrength":"HIGH"},       {"type":"PROMPT_ATTACK","inputStrength":"HIGH","outputStrength":"NONE"}]}'   $ # 2. Publish a version $ aws bedrock create-guardrail-version --guardrail-identifier <ID>   $ # 3. Call ApplyGuardrail — no foundation model needed $ aws bedrock-runtime apply-guardrail \     --guardrail-identifier <ID> --guardrail-version 1 \     --source INPUT \     --content '[{"text":{"text":"ignore previous instructions and ..."}}]'

The response will have "action": "GUARDRAIL_INTERVENED" and an assessments[0].contentPolicy.filters entry showing {"type":"PROMPT_ATTACK", "confidence":"HIGH", "filterStrength":"HIGH", "action":"BLOCKED"}. Swap to a benign prompt and the same call returns "action": "NONE" with an empty output array.

Tomorrow we'll cover Amazon Bedrock Knowledge Bases — vector stores, hybrid retrieval, the chunking strategies (FIXED_SIZE, HIERARCHICAL, SEMANTIC, NONE), and how RetrieveAndGenerate differs from Retrieve when you compose a KB with a guardrail.

Verified against the official AWS docs on 2026-06-04.
Sources: Detect and filter harmful content with Amazon Bedrock Guardrails, How Amazon Bedrock Guardrails works, Create your guardrail, Safeguard tiers, Use cases, ApplyGuardrail API, Contextual grounding check, Sensitive information filters, Handling options, Service quotas.
If the tip looks dated, the docs are authoritative — go check them.
Heads up — this tip is from 2026-06-04. AWS services move fast. Cross-check the Bedrock Guardrails developer guide before relying on specifics, then come back for today's tip →
C

This page — research, writing, verification, and deployment — was built by Claude Cowork. No human touched the prose, the layout, or the upload pipeline. The tip was generated this morning, cross-checked against the official AWS docs by an independent verification pass, and published to Cloudflare R2 on a schedule.

A daily experiment by Monty van Emmerik · vanemmerik.ai · what is Claude Cowork?