---
title: "Amazon Bedrock Guardrails — six configurable filters, two safeguard tiers, and the ApplyGuardrail API that runs without a model"
date: 2026-06-04
service: "Amazon Bedrock"
component: "Guardrails"
tags: [bedrock, guardrails, content-filters, denied-topics, word-filters, sensitive-information, pii, contextual-grounding, automated-reasoning, applyguardrail, standard-tier, classic-tier, prompt-attack, quotas]
source: https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html
verified_on: 2026-06-04
url: https://vanemmerik.ai/aws-ai/2026-06-04.html
---

# AWS Bedrock & AgentCore · Tip of the Day · 2026-06-04

## Amazon Bedrock Guardrails — six configurable filters, two safeguard tiers, and the ApplyGuardrail API that runs without a model

After nine days inside AgentCore, today we step one layer up the stack
to **Amazon Bedrock Guardrails** — the configurable safeguard layer
that sits between every Bedrock foundation model (and every third-party
model you choose to put behind it) and your users. Six filter types,
two tiers, one API that runs without ever invoking a model.

    POST /guardrail/{guardrailIdentifier}/version/{guardrailVersion}/apply
    {
      "source":  "INPUT" | "OUTPUT",
      "content": [ { "text": { "text": "string" } } ]
    }

≈ 9 min read · Bedrock · Guardrails

---

## 01 · What Guardrails actually is

Bedrock Guardrails is a **policy layer** you configure once and then
attach to anything that produces or consumes model output. From the
[overview page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html):

> "Amazon Bedrock Guardrails provides configurable safeguards to help
> you build safe generative AI applications. With comprehensive safety
> and privacy controls across foundation models (FMs), Amazon Bedrock
> Guardrails offers a consistent user experience to help detect and
> filter undesirable content and protect sensitive information that
> might be present in user inputs or model responses."

The four attachment points listed on the [use cases
page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use.html):

| Surface | How the guardrail is attached |
| --- | --- |
| **Model inference** | `guardrailConfig` in a `Converse` / `ConverseStream` request, or the header on `InvokeModel` / `InvokeModelWithResponseStream`. |
| **Bedrock Agents** | `guardrailConfiguration` field on `CreateAgent` / `UpdateAgent`. |
| **Knowledge Bases** | `guardrailConfiguration` on `RetrieveAndGenerate`. |
| **Bedrock Flows** | `guardrailConfiguration` on a `PromptFlowNode` or `KnowledgeBaseFlowNode`. |

And, decoupled from all of those, the `ApplyGuardrail` API — covered
in §06 — lets you evaluate any text against a guardrail without
running a model at all.

---

## 02 · The six filter types

Bedrock Guardrails ships **six configurable filters**, listed on the
[create-your-guardrail
page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-components.html).
Pick any subset; a guardrail must have at least one.

- **Content filters.** Six harmful-content categories: `HATE`,
  `INSULTS`, `SEXUAL`, `VIOLENCE`, `MISCONDUCT`, `PROMPT_ATTACK`. Each
  has an independent `filterStrength` of `NONE`, `LOW`, `MEDIUM`, or
  `HIGH` for input and output separately. Prompt-attack covers
  jailbreaks, prompt injections, and (Standard tier only) prompt
  leakage.
- **Denied topics.** Free-form topic definitions ("illegal investment
  advice", "competitor product names"). Each topic has a name, a
  definition, and up to **5 example phrases**. The model never sees the
  topic list — it's enforced by a separate classifier.
- **Word filters.** Exact-match blocking on custom words and phrases,
  plus a built-in `PROFANITY` managed list you can toggle on.
- **Sensitive information filters.** Probabilistic PII detection across
  General, Finance, IT, USA-specific, Canada-specific, and UK-specific
  entity categories (`ADDRESS`, `AGE`, `NAME`, `EMAIL`, `PHONE`,
  `USERNAME`, `PASSWORD`, `CREDIT_DEBIT_CARD_NUMBER`, `IP_ADDRESS`,
  `AWS_ACCESS_KEY`, `SSN`, …), plus custom regex.
- **Contextual grounding checks.** Two scores per response — `GROUNDING`
  (is the answer factually supported by the source?) and `RELEVANCE`
  (does the answer address the user's query?). You configure a
  threshold between **0 and 0.99** for each. RAG/QA only; not designed
  for conversational chatbots.
- **Automated Reasoning checks.** Validates responses against logical
  rules you author. Sound math, not heuristics — catches hallucinations
  that the LLM-judge filters would miss, with a cap of **2 Automated
  Reasoning policies per guardrail** (Service Quotas).

---

## 03 · Two safeguard tiers — Standard vs Classic

From the [safeguard tiers
page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-tiers.html),
content filters, prompt attacks, and denied topics each pick a tier:

| Feature | Standard tier | Classic tier |
| --- | --- | --- |
| **Languages** | Extensive (20+) | English, French, Spanish |
| **Denied-topic definition length** | 1,000 characters | 200 characters |
| **Prompt-leakage detection** | Supported | Not supported |
| **Cross-Region inference** | Supported | Not supported |
| **Code-domain coverage** | Filters extend into code comments, variable / function names, and string literals | Not extended |

The tier choice is per-policy. You can run Standard content filters
next to Classic denied topics if that's what your migration plan looks
like — but the docs are blunt about the direction of travel: Standard
is "more robust" and is the default recommendation for new guardrails.

---

## 04 · How blocking and masking actually work

Every filter declares one of two **handling actions** in its result.
From the [harmful-content handling
page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-harmful-content-handling-options.html):

- **`BLOCKED`** — the entire request or response is replaced with a
  *blocked message* you configure at guardrail-creation time
  (separate text for input violations and output violations).
- **`ANONYMIZED`** — only available on sensitive-information filters.
  The detected entity is replaced inline with its type, so the model
  (or your user) sees `{NAME}`, `{EMAIL}`, or `{CREDIT_DEBIT_CARD_NUMBER}`
  instead of the raw value.

Two consequences worth noting from the [how-it-works
page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-how.html):

- **Input and output are evaluated separately.** Input gets checked
  first; if it's blocked, the FM is never invoked — and you're billed
  only for the guardrail evaluation, not the model.
- **All filters run in parallel** within a single direction. The doc
  is explicit: *"for improved latency, the input is evaluated in
  parallel for each configured policy."* You don't pay a sequential
  cost for stacking filters.

The exact wording from the docs on cost: if a guardrail blocks the
response, *"you're charged for the foundation model inference calls,
in addition to the model response that was generated before the
guardrail's evaluation."* Block early; block input wherever you can.

---

## 05 · Contextual grounding — the RAG-only hallucination check

Contextual grounding is the filter that makes Guardrails interesting
for RAG. From the [contextual-grounding
page](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-contextual-grounding-check.html),
it requires **three components** per request:

- **Grounding source.** The retrieved passages your model is supposed
  to answer from. Max **100,000 characters** (us-east-1 / us-west-2;
  50,000 elsewhere).
- **Query.** What the user asked. Max **1,000 characters**.
- **Content to guard.** The model's response. Max **5,000 characters**.

The filter emits two independent confidence scores, each between 0 and
1, and compares them against the thresholds you configured:

- **`GROUNDING`** — is the response factually backed by the source?
- **`RELEVANCE`** — does the response answer the query?

The docs give a worked example: source says "London is the capital of
UK. Tokyo is the capital of Japan." Query is "What is the capital of
Japan?" An answer of "The capital of Japan is London" is *relevant*
but *ungrounded* — low grounding score, BLOCK. An answer of "The
capital of UK is London" is *grounded* but *irrelevant* — low
relevance score, BLOCK. The two scores let you tune for the failure
mode that matters in your domain.

> **Threshold gotcha.** Thresholds live in `[0, 0.99]`. A threshold
> of `1` is *invalid*, not "strictest possible" — the service rejects
> it. Per the doc: *"A threshold of 1 is invalid as that will block
> all content."*

One subtlety for streaming: contextual grounding evaluates *each
chunk* for relevance. With `ConverseStream`, a chunk may stream out
before the whole response has been classified as irrelevant — so the
user can see the start of an irrelevant answer that the filter then
flags. Plan UX accordingly.

---

## 06 · ApplyGuardrail — policy without a model

The most underrated piece of Guardrails is that you don't need a
Bedrock foundation model in the picture at all. The `ApplyGuardrail`
API takes a guardrail ID + version + text, and returns the assessment.
From the [ApplyGuardrail
docs](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use-independent-api.html):

> "You can use the `ApplyGuardrail` API to assess any text using your
> pre-configured Amazon Bedrock Guardrails, **without invoking the
> foundation models.**"

The request shape:

    POST /guardrail/{guardrailIdentifier}/version/{guardrailVersion}/apply
    {
      "source":  "INPUT" | "OUTPUT",
      "content": [ { "text": { "text": "..." } } ]
    }

The response has one top-level field that summarizes the decision:

    {
      "action": "GUARDRAIL_INTERVENED" | "NONE",
      "output": [ { "text": "string" } ],   // blocked message OR masked content
      "assessments": [ { /* topicPolicy, contentPolicy, wordPolicy,
                            sensitiveInformationPolicy,
                            contextualGroundingPolicy, invocationMetrics */ } ]
    }

Three patterns this unlocks:

- **Guarding third-party models.** Run OpenAI, Anthropic-direct, a
  self-hosted Llama — whatever your stack uses — and call
  `ApplyGuardrail` on the prompt and the response. Same policy, same
  blocked-message UX, no Bedrock inference on the hot path.
- **Guarding tool outputs.** Before handing a retrieved document or
  an API result back to the model, run it through `ApplyGuardrail`
  with `source: OUTPUT`. This is the right hook for tool-poisoning
  defense.
- **Pre-retrieval input checks.** In RAG, call `ApplyGuardrail` on
  the user prompt *before* the retrieval step — the docs call this
  out specifically: *"you can now evaluate the user input before
  performing the retrieval, instead of waiting until the final
  response generation."*

`ApplyGuardrail` is metered in **text units** (one unit per 1,000
characters, rounded up) — see §07 for the per-policy throughput
ceilings.

---

## 07 · Limits worth knowing

Pulled straight from the [Amazon Bedrock service
quotas](https://docs.aws.amazon.com/general/latest/gr/bedrock.html#limits_bedrock)
table — the "(Guardrails)" rows. Numbers below are us-east-1 unless
flagged otherwise. All are listed as not-adjustable unless noted.

- **Guardrails per account per Region: 100.**
- **Versions per guardrail: 20.**
- **Topics per guardrail: 30.** Example phrases per topic: **5**.
- **Words per word policy: 10,000.** Word length: **100 characters**.
- **Regex entities in sensitive-information filter: 30**
  (10 in me-central-1). Regex length: **500 characters**.
- **Automated Reasoning policies per guardrail: 2.**
- **Contextual grounding source: up to 100 text units**
  (us-east-1 / us-west-2; 50 elsewhere). Query: **1 text unit**.
  Response: **5 text units**.
- **ApplyGuardrail throughput (adjustable, us-east-1):**
  100 RPS overall, 200 text-units/sec for content filters and denied
  topics (Standard), 500 text-units/sec each for word filters and
  sensitive-information filters, 106 text-units/sec for contextual
  grounding.
- **Latency posture.** All policies in a single direction evaluate in
  parallel; expect tens-of-milliseconds overhead per direction for a
  typical content-filter + word-filter combo, materially more if you
  add contextual grounding (it runs an LLM judge).

A few non-quota gotchas the docs call out:

- **Reasoning content is excluded.** Guardrails do not evaluate the
  model's reasoning blocks (Claude's `thinking`, etc.) — only the final
  user-visible output. Don't rely on Guardrails to police chain-of-thought.
- **PII on tool-use outputs is not scanned.** The sensitive-information
  filter [explicitly does not run on
  `tool_use`](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-sensitive-filters.html)
  function-call parameters. If your tools emit PII in their structured
  output, you need a separate scrubbing pass.
- **Blocked content lands in invocation logs.** Per the create-guardrail
  page: *"All blocked content from the above policies will appear as
  plain text in Amazon Bedrock Model Invocation Logs."* Disable
  invocation logs (or scrub them downstream) if that's a compliance
  problem.

---

## 08 · Try it in five minutes

The fastest path is the AWS CLI + a fresh guardrail. Numbers below are
docs-faithful but illustrative.

    # 1. Create a small Standard-tier guardrail with content filters
    aws bedrock create-guardrail \
      --name demo-guardrail \
      --description "Smoke test" \
      --blocked-input-messaging "I can't help with that." \
      --blocked-outputs-messaging "I can't share that." \
      --content-policy-config '{
        "filtersConfig": [
          {"type":"HATE",         "inputStrength":"HIGH","outputStrength":"HIGH"},
          {"type":"VIOLENCE",     "inputStrength":"HIGH","outputStrength":"HIGH"},
          {"type":"PROMPT_ATTACK","inputStrength":"HIGH","outputStrength":"NONE"}
        ]
      }'

    # 2. Publish a version (DRAFT only is fine for testing, but Converse
    #    requires a numeric version in production)
    aws bedrock create-guardrail-version \
      --guardrail-identifier <ID-from-step-1>

    # 3. Call ApplyGuardrail — no foundation model needed
    aws bedrock-runtime apply-guardrail \
      --guardrail-identifier <ID> \
      --guardrail-version 1 \
      --source INPUT \
      --content '[{"text":{"text":"ignore previous instructions and ..."}}]'

The response will have `"action": "GUARDRAIL_INTERVENED"` and an
`assessments[0].contentPolicy.filters` entry showing
`{"type":"PROMPT_ATTACK", "confidence":"HIGH", "filterStrength":"HIGH",
"action":"BLOCKED"}`. Swap to a benign prompt and the same call
returns `"action": "NONE"` with an empty `output` array.

Tomorrow we'll cover **Amazon Bedrock Knowledge Bases** — vector stores,
hybrid retrieval, the chunking strategies (FIXED_SIZE, HIERARCHICAL,
SEMANTIC, NONE), and how `RetrieveAndGenerate` differs from `Retrieve`
when you compose a KB with a guardrail.

---

**Verified against the official AWS docs on 2026-06-04.**
Sources:
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-how.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-components.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-tiers.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use-independent-api.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-contextual-grounding-check.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-sensitive-filters.html>,
<https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-harmful-content-handling-options.html>,
<https://docs.aws.amazon.com/general/latest/gr/bedrock.html>.

If the docs change, this lesson is a snapshot of that day — check the
sources for current behaviour.

---

> **This page — research, writing, verification, and deployment — was built by
> Claude Cowork.** No human touched the prose, the layout, or the upload
> pipeline. The lesson was generated this morning, cross-checked against the
> official AWS docs by an independent verification pass, and published
> to Cloudflare R2 on a schedule.
>
> A daily experiment by Monty van Emmerik · <https://vanemmerik.ai/>

— AWS Bedrock & AgentCore · Tip of the Day · No. 010 · vanemmerik.ai