Custom Regex Rules

Standard Entity Rules cover the categories most regulations care about. Custom Regex Rules cover everything else - internal customer IDs, project codes, employee numbers, support ticket references, anything proprietary to your business that you don’t want flowing freely through tool calls.

Each custom rule has a regex pattern, an optional set of context keywords for disambiguation, a confidence score, and an action.

Writing a rule

Open Security → Rules → Custom rules and click Add rule. The fields:

Field	What it does
Name	What the rule shows up as in alerts and the audit log. Use a descriptive name.
Entity label	The placeholder used when the rule’s action is redact. `CUSTOMER_ID` becomes `[REDACTED:CUSTOMER_ID]`.
Regex pattern	The pattern to match against tool inputs and outputs. Standard regex syntax.
Context keywords	Optional. Words that, when found near a candidate match, increase confidence.
Confidence score	The score assigned to each match before threshold comparison. Defaults to 0.5.
Action	What happens when the rule matches in tool arguments: Allow, Redact, or Block.

Test with the Rule Tester before saving.

Three worked examples

Internal customer ID

Customer IDs in your CRM look like CUST-12345: an uppercase prefix, a dash, and five digits. You don’t want them leaking outside, but they can show up in agent prompts and tool outputs.

Pattern:        \bCUST-\d{5}\b
Entity label:   CUSTOMER_ID
Context:        customer, account
Score:          0.85
Action:         Redact

The \b word boundaries prevent the pattern from matching inside longer strings. The high confidence (0.85) reflects the pattern’s specificity - there’s no ambiguity about what CUST-12345 means in your context.
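To sanity-check the pattern outside the product, a minimal Python sketch can apply it with the standard `re` module. The `redact` helper is hypothetical: it only mimics the `[REDACTED:CUSTOMER_ID]` placeholder format from the fields table and is not Agent Handler’s implementation. Python’s regex dialect may also differ in details from the one the product uses.

```python
import re

# Same pattern as the rule above.
CUSTOMER_ID = re.compile(r"\bCUST-\d{5}\b")

def redact(text: str, label: str = "CUSTOMER_ID") -> str:
    """Replace every match with a [REDACTED:<label>] placeholder (hypothetical helper)."""
    return CUSTOMER_ID.sub(f"[REDACTED:{label}]", text)

# A well-formed ID is replaced.
print(redact("Refund issued to CUST-12345 yesterday."))

# Longer digit runs and prefixed strings are left alone because of the \b boundaries.
print(redact("Order CUST-1234567 and XCUST-12345 are untouched."))
```

Dropping either `\b` would make the second sample match, which is the over-matching the word boundaries are there to prevent.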

Internal email domain

You want to catch every internal email address (anything @acme-internal.com) that shows up in tool calls, so the agent can’t leak employee identities into customer-facing systems.

Pattern:        [\w._%+-]+@acme-internal\.com\b
Entity label:   INTERNAL_EMAIL
Context:        (none)
Score:          0.9
Action:         Block

If an internal email address shows up in a tool call’s arguments, the call is blocked before it reaches the external system and an alert is logged. Internal email addresses never get sent to customer-facing systems.
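The blocking decision can be approximated locally with a short Python sketch. The `should_block` function and the flat dictionary of arguments are assumptions made for illustration; the real tool-call structure and the product’s matching engine may differ.

```python
import re

# Same pattern as the rule above.
INTERNAL_EMAIL = re.compile(r"[\w._%+-]+@acme-internal\.com\b")

def should_block(tool_arguments: dict) -> bool:
    """Return True if any string argument contains an internal address.

    Hypothetical helper: assumes arguments arrive as a flat dict of values.
    """
    return any(
        isinstance(value, str) and INTERNAL_EMAIL.search(value)
        for value in tool_arguments.values()
    )

# An internal recipient would trigger the Block action.
print(should_block({"to": "jane.doe@acme-internal.com", "body": "hi"}))

# An external address, or a lookalike domain such as acme-internal.community, would not.
print(should_block({"to": "jane@example.com"}))
print(should_block({"to": "x@acme-internal.community"}))
```

The trailing `\b` is what keeps the lookalike domain from matching: after `.com` the next character is another letter, so there is no word boundary.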

Project code with disambiguation

Your project codes look like P-A1B2. The pattern alone is loose: it matches anything that is P- followed by four uppercase letters or digits. Without context, it will produce false positives on unrelated strings. Add context keywords to tighten it.

Pattern:        \bP-[A-Z0-9]{4}\b
Entity label:   PROJECT_CODE
Context:        project, milestone, sprint
Score:          0.4
Action:         Redact

The lower score (0.4) means the rule only fires when context keywords push the confidence above the global threshold, assuming that threshold is set higher than 0.4. A bare P-A1B2 in the middle of unrelated text won’t match; “the P-A1B2 project is on track” will. Context keywords solve the false-positive problem without making the regex unreadable.

How context keywords work

When the regex matches, Agent Handler scans the surrounding text (a window of ~50 characters before and after) for any of the context keywords. Each keyword found bumps the confidence score. The final score is the rule’s base score plus the keyword adjustment, compared against the threshold.
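The scan described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the product’s algorithm: the 50-character window comes from the text, but `KEYWORD_BOOST`, the additive per-keyword bump, the cap at 1.0, and the case-insensitive substring matching are all assumed.

```python
import re

WINDOW = 50          # characters scanned on each side of the match, per the text above
KEYWORD_BOOST = 0.3  # ASSUMED value; the real per-keyword adjustment is not documented here

def score_matches(text, pattern, base_score, keywords):
    """Yield (matched_text, final_score) for every regex match in text."""
    for m in re.finditer(pattern, text):
        start = max(0, m.start() - WINDOW)
        # Text before and after the match, joined with a space so keywords
        # cannot be formed accidentally across the gap.
        nearby = (text[start:m.start()] + " " + text[m.end():m.end() + WINDOW]).lower()
        hits = sum(1 for kw in keywords if kw.lower() in nearby)
        yield m.group(), min(1.0, base_score + KEYWORD_BOOST * hits)

PROJECT = r"\bP-[A-Z0-9]{4}\b"
KEYWORDS = ["project", "milestone", "sprint"]

# With a keyword nearby the score rises above the 0.4 base.
print(list(score_matches("the P-A1B2 project is on track", PROJECT, 0.4, KEYWORDS)))

# Without any keyword the score stays at the base value.
print(list(score_matches("ref P-A1B2 shipped", PROJECT, 0.4, KEYWORDS)))
```

Under these assumptions, the first sample clears a threshold of 0.5 and the second does not, which is the behavior the project-code example relies on.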

Use context keywords whenever your pattern alone is too loose. They’re often the difference between “fires constantly on legitimate traffic” and “fires precisely when it should.”

Score, threshold, and the relationship

The score you assign is what each match gets. The global threshold (configured in the Security Gateway settings) is what scores have to clear to fire. Adjust both:

High-precision rule (specific pattern, clear context): score 0.8–0.95. Fires on every match, provided the score is above the global threshold.
Medium-precision rule (pattern is good but ambiguous): score 0.4–0.6. Set it just below the threshold so it fires when context keywords are present.
Low-precision rule (broad pattern, lots of false positives): score 0.2–0.3. Fires only with strong context, such as several keywords near the match.

If you want a rule to never fire without context, set the score below the global threshold. Context keywords are the only way to push it over.
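The relationship between score and threshold can be summarized in a small Python sketch. The function name, the `max_boost` parameter (the largest total adjustment context keywords could contribute), and the use of `>=` for the comparison are assumptions for illustration; check the Security Gateway settings for the actual behavior.

```python
def firing_behaviour(score: float, threshold: float, max_boost: float) -> str:
    """Classify a rule by when its matches can clear the threshold.

    Hypothetical helper: assumes a match fires when its final score is
    greater than or equal to the threshold.
    """
    if score >= threshold:
        return "always"        # base score alone is enough
    if score + max_boost >= threshold:
        return "with context"  # needs keyword hits to get over the line
    return "never"             # even full context cannot reach the threshold

# Example with an assumed threshold of 0.5 and an assumed maximum boost of 0.3.
for score in (0.85, 0.4, 0.1):
    print(score, firing_behaviour(score, threshold=0.5, max_boost=0.3))
```

The "never" case is worth checking for when you lower a score: a rule set too far below the threshold is silently inert.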

Testing before going live

Always test with the Rule Tester before saving. Paste a few realistic samples - what you’d expect the rule to catch and what you’d expect it to ignore - and confirm the rule does both. Common gotchas the tester surfaces:

The pattern matches more than you intended (forgot a word boundary).
The pattern matches less than you intended (case sensitivity, missing alternatives).
The score is too low or too high relative to your threshold.
Context keywords are off - too narrow, too generic, or absent when they should be present.
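If you want to pre-check a pattern before pasting it into the Rule Tester, a small Python harness can run the same kind of "should catch / should ignore" samples. The `check_samples` helper is hypothetical and only exercises the regex; it does not model scores, thresholds, or context keywords, and Python’s regex dialect may differ from the product’s.

```python
import re

def check_samples(pattern, should_match, should_ignore):
    """Return a list of failure messages; an empty list means the pattern behaved as expected."""
    rx = re.compile(pattern)
    failures = [f"missed: {s!r}" for s in should_match if not rx.search(s)]
    failures += [f"false positive: {s!r}" for s in should_ignore if rx.search(s)]
    return failures

catch = ["Refund for CUST-12345"]
ignore = ["Order CUST-1234567"]

# Without word boundaries the pattern matches inside the longer digit run.
print(check_samples(r"CUST-\d{5}", catch, ignore))

# With word boundaries both sample sets pass.
print(check_samples(r"\bCUST-\d{5}\b", catch, ignore))

# Case sensitivity: a lowercase ID is missed by the uppercase-only pattern.
print(check_samples(r"\bCUST-\d{5}\b", ["refund for cust-12345"], []))
```

The first and third calls each report one failure, which correspond to the first two gotchas in the list above.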

Per-tool-pack overrides

Like Standard Entity Rules, Custom Regex Rules can be overridden per Tool Pack. The org-level rule applies everywhere; an override changes the action for one pack only. Useful when most Tool Packs need the rule enforced but one specific pack legitimately needs the data through.

Configure overrides on the Tool Pack’s Rules tab.

When custom regex doesn’t fit

If what you need to detect is too loose for a pattern and really calls for a small ML model (natural-language entities, fuzzy classification), custom regex isn’t the right tool. The standard NLP-based detectors handle most of these cases; if the catalog doesn’t cover yours, file a feature request. If you need semantic validation (a Luhn check for card numbers, checksums for IDs), regex can match the shape of a value but can’t validate it; you may need a more sophisticated detection layer.
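To make the shape-versus-validity distinction concrete, here is a Python sketch that pairs a shape-only regex with a Luhn checksum. It runs outside Agent Handler and is only an illustration of what a separate validation step could look like; `CARD_SHAPE` and `find_valid_cards` are hypothetical names, and the 13 to 19 digit range is an assumption about card-number length.

```python
import re

# Shape only: 13 to 19 consecutive digits (assumed range for card numbers).
CARD_SHAPE = re.compile(r"\b\d{13,19}\b")

def luhn_valid(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_valid_cards(text: str) -> list:
    """Return only the shape matches that also pass the checksum."""
    return [m.group() for m in CARD_SHAPE.finditer(text) if luhn_valid(m.group())]

# Both numbers match the regex; only the first passes the Luhn check.
print(find_valid_cards("card 4111111111111111, ref 4111111111111112"))
```

A regex rule on its own would flag both numbers in the sample, which is why checksum-bearing identifiers tend to need a validation step beyond pattern matching.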

Investigate what your rules are catching in Violations and alerts.