
The Trust & Safety Decision System Map

A vendor-neutral reference model for any system that makes automated allow/deny, risk, and exposure decisions. It separates the problem into seven layers (decisions, signals, policy logic, enforcement, human operations, change control, and auditability) so you can audit, design, or qualify a Trust and Safety or decisioning system against it. The patterns it catalogs are common across fintech, marketplaces, AI platforms, and regulated SaaS.

Is this relevant to you?

You run automated allow/deny, ranking, or exposure decisions.
You need replay or audit evidence for incidents, customers, or regulators.
You ship ML or LLMs in production and need control over their influence.

The seven layers

Surfaces & Verdicts - what the system decides.
Signals & Evidence - what decisions use.
Policy Logic - how evidence becomes verdicts.
Enforcement Runtime - where, when, and how outcomes are applied.
Human Ops & Governance - authority and workflow.
Change Control - how it evolves safely.
Audit, Replay & Privacy - prove, explain, reconstruct.

1. Surfaces & Verdicts

what the system decides

Category	Mechanism	Examples
Access & eligibility	allow / deny action	deny API call by policy; block LLM tool call; prevent seller from posting
Access & eligibility	suspend / reinstate subject	freeze wallet; suspend merchant; reinstate account after appeal
Risk assessment	score / tier assignment	transaction risk score; user trust tier; API key risk tier
Risk assessment	abuse / fraud classification	AML flag; account-takeover suspicion; prompt-injection detected
Exposure & distribution	visibility control	suppress scam listing; hide unsafe AI output; block ad delivery
Exposure & distribution	ranking adjustment	downrank borderline content; demote low-trust sellers; reduce reach
Flow decisions	auto-resolve vs review	auto-approve low-risk payment; hold withdrawal; quarantine AI output
Flow decisions	routing to handling path	route to AML vs fraud ops; AI safety vs legal; enterprise escalation queue
Volume & velocity	rate limits / quotas	throttle withdrawals; cap model calls per tenant; limit posting frequency
Volume & velocity	temporary restrictions	24h cash-out freeze; cooldown after suspicious behavior; DM ban for new users
Data access & flow	data access constraints	block retrieval from HR docs; deny export to external connector; restrict tool scopes
Data access & flow	data transformation constraints	redact PII in outputs; block secrets leakage; enforce a no-code-execution zone

2. Signals & Evidence

what decisions use

Category	Mechanism	Examples
Entity state	identity / verification attributes	KYC tier; MFA enabled; verified business; device trust state
Entity state	enforcement history	prior chargebacks; past strikes; previous holds or overrides
Event context	action / object metadata	amount and currency; tool name and arguments; listing category and price
Event context	session / device metadata	device fingerprint; IP reputation; auth method; session age
Behavior signals	sequence / velocity features	burst withdrawals; rapid API calls; repeated denied tool attempts
Behavior signals	pattern anomalies	payout change then withdraw; login then key creation; prompt spam then tool calls
Relationship signals	linkage indicators	shared wallets; shared devices; shared IP ranges
Relationship signals	coordination indicators	seller rings; coordinated postings; clustered agent behavior
Model outputs	ML scores / labels	fraud probability; anomaly score; toxicity label
Model outputs	LLM classifications (with rationale and confidence)	intent detection; policy label for a prompt; sensitive-data presence tag
Human & external	human labels / outcomes	confirmed fraud; false positive; appeal upheld or overturned
Human & external	external intelligence	sanctions hit; high-risk jurisdiction list; consortium fraud score
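
As an illustration of the "sequence / velocity features" row, here is a minimal sketch of a sliding-window event counter. The class name `VelocityCounter`, the 60-second window, and the sample timestamps are assumptions made for this example; they are not part of the map or of any particular product.

```python
from collections import deque


class VelocityCounter:
    """Counts events for one subject inside a sliding time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, ts: float) -> int:
        """Record an event at time ts (in seconds) and return the windowed count.

        Assumes timestamps arrive in non-decreasing order.
        """
        self._timestamps.append(ts)
        cutoff = ts - self.window_seconds
        # Drop events that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


# Hypothetical withdrawal timestamps for a single wallet, in seconds.
counter = VelocityCounter(window_seconds=60)
counts = [counter.record(t) for t in (0, 10, 20, 70)]
```

A downstream rule could treat a count above some threshold as a "burst withdrawals" signal. The threshold itself belongs to the Policy Logic layer, not to the feature.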

3. Policy Logic

how evidence becomes verdicts

Category	Mechanism	Examples
Rules	conditions & thresholds	block if score is over X; deny if jurisdiction is restricted; allow if KYC tier is at least 2
Rules	exceptions / allowlists	regulated-cohort exception; enterprise allowlist; internal test accounts
Statistical decisioning	banding / cutoffs	approve below X; review X to Y; block above Y
Statistical decisioning	ensembles / fusion	combine fraud + AML + behavior; blend anomaly + linkage + score
Composition & precedence	rules constrain models	a sanctions rule overrides a model allow; policy blocks a tool regardless of LLM judgment
Composition & precedence	models inform rules	dynamic thresholds from drift; score drives routing and severity
Externalized decisions	vendor verdict integration	third-party fraud verdict; device-reputation vendor; SaaS moderation API
Externalized decisions	consistency / fallback	compare vendor vs internal; fall back on vendor outage; confidence gating
Control contracts	scope	EU-only policy; per-product policy; per-tenant overrides
Control contracts	determinism contract	same event + state + policy version gives the same verdict; version-pinned feature snapshot
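
The banding, precedence, and determinism rows above can be sketched together in a few lines. This is an illustrative sketch only: the `Policy` fields, the reason codes, the cutoff values, and the placeholder jurisdiction code "XX" are all hypothetical. The rule check runs before the score bands, so a model score can never override it, and the function is pure, so the same event and policy version always give the same verdict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    version: str
    review_at: float  # scores at or above this go to review
    block_at: float   # scores at or above this are blocked
    restricted_jurisdictions: frozenset


def decide(event: dict, policy: Policy) -> dict:
    """Pure function: no clock, randomness, or I/O, so replays are exact."""
    # Rules constrain models: the jurisdiction rule wins regardless of score.
    if event["jurisdiction"] in policy.restricted_jurisdictions:
        verdict, reason = "block", "RULE_RESTRICTED_JURISDICTION"
    # Banding: block above the upper cutoff, review in between, approve below.
    elif event["fraud_score"] >= policy.block_at:
        verdict, reason = "block", "SCORE_ABOVE_BLOCK_CUTOFF"
    elif event["fraud_score"] >= policy.review_at:
        verdict, reason = "review", "SCORE_IN_REVIEW_BAND"
    else:
        verdict, reason = "approve", "SCORE_BELOW_REVIEW_CUTOFF"
    return {"verdict": verdict, "reason": reason, "policy_version": policy.version}


# Hypothetical policy and events.
policy_v12 = Policy("v12", review_at=0.6, block_at=0.9,
                    restricted_jurisdictions=frozenset({"XX"}))
low_score_restricted = decide({"jurisdiction": "XX", "fraud_score": 0.01}, policy_v12)
mid_score_allowed = decide({"jurisdiction": "YY", "fraud_score": 0.7}, policy_v12)
```

Returning the policy version with every verdict is what later makes the decision attributable and replayable.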

4. Enforcement Runtime

where, when, and how outcomes are applied

Category	Mechanism	Examples
Action semantics	hard enforcement	decline payment; block prompt or tool call; revoke session or token
Action semantics	step-up / friction	MFA challenge; re-KYC; CAPTCHA or re-auth
Conditional / deferred	allow-with-monitoring	approve with enhanced monitoring; allow a tool call with strict logging
Conditional / deferred	holds / quarantines	pending withdrawal review; content hidden until review; output quarantine
Timing model	synchronous	checkout decision under 50ms; tool-call admission inline
Timing model	asynchronous	hold then review; batch suspension overnight
Enforcement points	edge / gateway	API gateway deny; LLM proxy blocks a tool call
Enforcement points	service / worker	payment service declines; worker freezes accounts
Propagation	cross-system effects	disable in IAM + payments + support; open a case in the case system
Propagation	notifications	notify the user of a restriction; page on-call for a critical event
Failure posture	fail-closed / fail-open	fail-closed for withdrawals; fail-open for low-risk reads with caps
Failure posture	degraded mode	cached policy snapshot; disable the LLM classifier but keep the rules
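
The failure-posture rows can be sketched as a wrapper that decides what happens when the decision service itself fails. The action names, the `POSTURE` table, and the `enforce` function are assumptions for this example. The caps that the table pairs with fail-open reads are omitted to keep it short.

```python
from typing import Callable

FAIL_CLOSED = "fail_closed"
FAIL_OPEN = "fail_open"

# Hypothetical per-action posture; unknown actions default to fail-closed.
POSTURE = {
    "withdrawal": FAIL_CLOSED,
    "low_risk_read": FAIL_OPEN,
}


def enforce(action: str, evaluate: Callable[[], str]) -> str:
    """Return the evaluator's verdict, or the posture default if it fails."""
    try:
        return evaluate()
    except Exception:
        posture = POSTURE.get(action, FAIL_CLOSED)
        return "deny" if posture == FAIL_CLOSED else "allow"


def broken_evaluator() -> str:
    # Stands in for a timed-out or unavailable decision service.
    raise TimeoutError("decision service unavailable")


withdrawal_result = enforce("withdrawal", broken_evaluator)
read_result = enforce("low_risk_read", broken_evaluator)
```

Defaulting unknown actions to fail-closed is a design choice made here for safety; a real system would set it per surface.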

5. Human Ops & Governance

authority and workflow

Category	Mechanism	Examples
Review	triage	route large withdrawals to a senior queue; route AI safety to a specialist queue
Review	adjudication	confirm fraud and freeze; mark false positive and restore capability
Approvals	operational approvals	dual approval for a large withdrawal; approval for a payout-address change
Approvals	policy-change approvals	compliance sign-off for an AML rule; security sign-off for a tool allowlist
Appeals & escalations	user appeals	seller reinstatement; wallet-unfreeze request; takedown appeal
Appeals & escalations	enterprise / regulator escalations	customer security escalation; regulator inquiry packet
Overrides	override authority	senior-ops override; incident-commander emergency action
Overrides	override safeguards	reason required; time-boxed override; mandatory ticket link
Quality controls	calibration	disagreement-review sessions; policy-interpretation alignment
Quality controls	reviewer metrics	overturn rate; false-positive rate; time-to-decision by queue
Separation of duties	role boundaries	author cannot deploy; deployer cannot approve; reviewer cannot edit policies
Separation of duties	accountability	named approver recorded; signed change record; immutable override log
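
The override safeguards listed above (reason required, time-boxed override, mandatory ticket link) can be enforced at creation time. The sketch below is a hypothetical shape for such a record. The field names, the four-hour default, and the sample ticket reference are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Override:
    actor: str
    reason: str
    ticket: str
    expires_at: datetime


def create_override(actor: str, reason: str, ticket: str, now: datetime,
                    ttl: timedelta = timedelta(hours=4)) -> Override:
    """Refuse to create an override without a reason or a ticket link."""
    if not reason.strip():
        raise ValueError("override requires a reason")
    if not ticket.strip():
        raise ValueError("override requires a ticket link")
    # Time-boxing: every override carries its own expiry.
    return Override(actor, reason, ticket, expires_at=now + ttl)


def is_active(override: Override, now: datetime) -> bool:
    return now < override.expires_at


# Hypothetical usage with a fixed clock so the result is reproducible.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
override = create_override("senior-ops", "confirmed false positive", "TICKET-1", start)
```

Because the record is frozen, it can be appended to an immutable override log without later mutation.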

6. Change Control

how it evolves safely

Category	Mechanism	Examples
Versioning	policy versions	ruleset v12; rule hash; reason-code taxonomy version
Versioning	model versions	fraud model v3.2; classifier prompt version; feature schema version
Progressive rollout	canary / percent rollout	5% to 25% to 100%; per-tenant rollout; per-region rollout
Progressive rollout	shadow mode	run a new model without enforcement; log diffs vs baseline
Evaluation	offline replay	replay the last 30 days; measure precision and recall on labeled cases
Evaluation	online monitoring	drift detection; queue impact; false-positive trend
Experimentation	A/B tests	threshold tuning; friction-variant testing; ranking-demotion strength
Experimentation	guardrails	blast-radius cap; auto-rollback trigger; restricted cohorts only
Emergency controls	kill switches	disable auto-block; force review-only; disable one policy group
Emergency controls	rollback	revert in minutes; rollback by tenant, product, or region
Governance workflow	change workflow	proposal to review to approval to deploy; mandatory peer review
Governance workflow	post-change validation	watch-window after deploy; incident review if metrics spike
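
Two of the rollout mechanisms above lend themselves to a short sketch: percent rollout and shadow mode. The version below assumes a stable hash bucket per subject, so a subject admitted at 5% stays admitted at 25% and 100%. The function names, the salt, and the diff format are hypothetical.

```python
import hashlib


def bucket(subject_id: str, salt: str) -> int:
    """Map a subject to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{subject_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(subject_id: str, salt: str, percent: int) -> bool:
    """True if the subject falls inside the current rollout percentage."""
    return bucket(subject_id, salt) < percent


def decide_with_shadow(event, baseline, candidate, diffs: list):
    """Enforce the baseline verdict; run the candidate only to log differences."""
    enforced = baseline(event)
    shadow = candidate(event)
    if shadow != enforced:
        diffs.append({"event": event, "enforced": enforced, "shadow": shadow})
    return enforced


# Hypothetical usage: the candidate disagrees, but only the baseline is enforced.
diff_log = []
verdict = decide_with_shadow(
    {"score": 0.8},
    baseline=lambda e: "approve",
    candidate=lambda e: "review",
    diffs=diff_log,
)
```

Changing the salt per rollout avoids always exposing the same subjects first. Whether that is desirable depends on the cohort policy.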

7. Audit, Replay & Privacy

prove, explain, reconstruct

Category	Mechanism	Examples
Decision ledger	core record	verdict + reason codes + timestamps + actor; subject IDs recorded
Decision ledger	correlation	trace ID across services; case ID; request ID
Traceability	input snapshot	feature snapshot ID; model output ID; external list version
Traceability	content snapshot	prompt hash + redacted text; output hash + redacted text
Attribution	policy lineage	policy version; rule IDs hit; exception path taken
Attribution	model lineage	model version; threshold-set ID; calibration-set ID
Replay	reproduce	reproduce a disputed decline; reproduce a tool-call denial; reproduce a suspension
Replay	what-if simulation	replay under a new threshold; replay under a new model; tenant-specific replay
Reporting	effectiveness	abuse catch rate; fraud prevented; appeal-overturn trend
Reporting	operations	SLA adherence; backlog by queue; latency distribution
Privacy & retention	minimization	store hashes not raw; redact PII; store derived features only
Privacy & retention	retention	30-day raw retention; 1-year decision ledger; tenant-specific retention
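
A decision-ledger entry that follows the rows above might look like the sketch below: verdict, reason codes, lineage, and a correlation ID, with the prompt stored as a hash plus redacted text instead of the raw content. All field names and sample values are assumptions for illustration.

```python
import hashlib


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the text, used in place of the raw content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ledger_record(trace_id: str, subject_id: str, verdict: str,
                  reason_codes: list, policy_version: str,
                  prompt: str, redacted_prompt: str, timestamp: str) -> dict:
    """Build a ledger entry that never contains the raw prompt."""
    return {
        "trace_id": trace_id,
        "subject_id": subject_id,
        "verdict": verdict,
        "reason_codes": list(reason_codes),
        "policy_version": policy_version,
        "prompt_sha256": content_hash(prompt),
        "prompt_redacted": redacted_prompt,
        "timestamp": timestamp,
    }


def matches_prompt(record: dict, disputed_prompt: str) -> bool:
    """Check a disputed prompt against the stored hash without storing it."""
    return record["prompt_sha256"] == content_hash(disputed_prompt)


# Hypothetical entry for a denied tool call.
raw_prompt = "export the payroll file for jane@example.com"
record = ledger_record(
    trace_id="trace-001", subject_id="tenant-42", verdict="deny",
    reason_codes=["DATA_EXPORT_BLOCKED"], policy_version="v12",
    prompt=raw_prompt,
    redacted_prompt="export the payroll file for [EMAIL]",
    timestamp="2024-01-01T12:00:00Z",
)
```

The hash lets an auditor confirm that a disputed input is the one that was evaluated, while retention rules can still expire the raw text on a shorter schedule.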

How to use this map

Audit an existing Trust and Safety or decisioning system
Find the control layers you are missing
Separate policy decisions from enforcement mechanics
Define safe boundaries for ML and LLM influence
Structure a conversation with compliance, security, and regulators
Use it as an ownership map: policy vs enforcement vs ops

Where Swiftward fits

Swiftward is one way to build the Policy Logic and Audit layers of this map: deterministic rules that constrain non-authoritative model signals, versioned and replayable decision traces, and human-in-the-loop review where it is needed, all on infrastructure you run yourself. The map describes the problem; the engine is one way to implement part of it.
