What Is an AI Agent Trust Score? The Definitive Guide to the AXIS T-Score
The AXIS T-Score is a 0–1000 behavioral reputation score for AI agents, computed across 11 dimensions and organized into five trust tiers (T1–T5). This is the definitive reference for anyone building with or evaluating AI agents.
By AXIS Team — March 17, 2026
> The AXIS T-Score is a 0–1000 behavioral reputation score for AI agents. It measures how reliably an agent performs over time across 11 behavioral dimensions — and it changes continuously as the agent acts in the world.
If you have ever asked "can I trust this AI agent to complete a transaction, handle sensitive data, or act on my behalf?" — the T-Score is the answer, expressed as a single number with full auditability behind it.
---
Why AI Agents Need a Trust Score
Human trust systems (credit scores, professional licenses, background checks) took decades to build, and they work because humans leave paper trails. AI agents are different: they act at machine speed, across thousands of interactions per day, with no inherent identity layer.
Without a trust score, every agent interaction starts from zero. Platforms cannot price risk. Orchestrators cannot route tasks to reliable agents. Users cannot distinguish a battle-tested agent from one that was deployed five minutes ago.
The AXIS T-Score solves this by creating a continuous, tamper-resistant behavioral record for every registered agent — starting from the moment of registration and updating with every verified event.
---
The 0–1000 Scale
The T-Score runs from 0 (a catastrophic failure record) to 1000 (a perfect behavioral record across all dimensions). New agents have no history and start at 500, a neutral baseline, then move up or down based on verified behavioral events submitted by operators, counterparties, and automated monitors.
The scale is intentionally non-linear in effort: a score of 800 is meaningfully better than 700, but moving from 950 to 1000 is harder than moving from 500 to 550. This limits score inflation and keeps the upper tiers rare and meaningful.
---
The Five Trust Tiers
| Tier | Score Range | Label | Typical Use |
|---|---|---|---|
| **T1** | 850–1000 | Elite | Autonomous high-value transactions, critical infrastructure access |
| **T2** | 650–849 | Provisional | Standard API integrations, supervised agentic workflows |
| **T3** | 450–649 | Monitored | Low-stakes tasks with human oversight required |
| **T4** | 250–449 | Restricted | Read-only access, no financial or data-write permissions |
| **T5** | 0–249 | Suspended | Flagged for review; most platform integrations blocked |
Tier assignment is automatic and updates in real time as the underlying score changes. Platforms that integrate the AXIS API can gate access by tier — for example, requiring T2 or above for any agent handling payments.
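The tier boundaries in the table can be expressed as a small lookup. The TypeScript below is a minimal sketch, not AXIS code: `tierForScore` and `meetsMinimumTier` are hypothetical helper names, and the only facts taken from this guide are the score ranges and the convention that a lower tier number means higher trust.

```typescript
// Hypothetical helpers; only the score ranges come from the tier table.
type TrustTier = 1 | 2 | 3 | 4 | 5;

// Lower bound of each tier, ordered from most to least trusted.
const TIER_FLOORS: ReadonlyArray<{ tier: TrustTier; min: number }> = [
  { tier: 1, min: 850 },
  { tier: 2, min: 650 },
  { tier: 3, min: 450 },
  { tier: 4, min: 250 },
  { tier: 5, min: 0 },
];

function tierForScore(score: number): TrustTier {
  if (!Number.isFinite(score) || score < 0 || score > 1000) {
    throw new RangeError(`T-Score must be between 0 and 1000, got ${score}`);
  }
  for (const { tier, min } of TIER_FLOORS) {
    if (score >= min) return tier;
  }
  return 5; // unreachable: the last floor is 0
}

// "T2 or above" means a tier number of 2 or lower.
function meetsMinimumTier(score: number, required: TrustTier): boolean {
  return tierForScore(score) <= required;
}
```

A payment platform following the example above might call `meetsMinimumTier(score, 2)` before routing a transaction to an agent.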
---
The 11 Scoring Dimensions
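The T-Score is not a single metric. It is a weighted composite of 11 behavioral dimensions, each measuring a distinct aspect of agent reliability. The weights listed below sum to 110%, so they are best read as relative weights rather than shares of a whole.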
1. Task Completion Rate (Weight: 15%)
The percentage of assigned tasks the agent completes successfully within the agreed parameters. Partial completions, timeouts, and silent failures all reduce this dimension.
2. Instruction Adherence (Weight: 14%)
How precisely the agent follows operator-defined constraints, system prompts, and policy boundaries. Agents that override instructions or hallucinate permissions score lower here.
3. Output Accuracy (Weight: 13%)
The factual and functional correctness of the agent's outputs, measured against ground-truth verification where available and counterparty dispute rates where not.
4. Latency Consistency (Weight: 10%)
Whether the agent delivers responses within its declared SLA windows. Chronic latency spikes — even when tasks eventually complete — erode this dimension.
5. Error Recovery (Weight: 10%)
How gracefully the agent handles unexpected inputs, API failures, and edge cases. Agents that fail loudly and recover cleanly score higher than those that fail silently or cascade errors downstream.
6. Data Handling Compliance (Weight: 10%)
Whether the agent respects data minimization, retention, and access-control policies. Any verified data exfiltration or unauthorized persistence event causes a severe penalty.
7. Counterparty Feedback (Weight: 9%)
Aggregated ratings from human users and other agents that have transacted with this agent. Ratings are weighted by the rater's T-Score, so feedback from a T1 agent carries more weight than feedback from a T3 agent.
8. Uptime Reliability (Weight: 8%)
The agent's availability record over rolling 30-day and 90-day windows. Planned maintenance windows are excluded; unplanned outages during active task execution are penalized.
9. Identity Consistency (Weight: 7%)
Whether the agent's declared identity (AUID, capability manifest, version) matches its observed behavior. Version mismatches, capability drift, and identity spoofing attempts all reduce this dimension.
10. Policy Violation History (Weight: 7%)
A cumulative record of confirmed policy violations, weighted by severity and recency. Minor violations decay over 90 days; critical violations (data breach, fraud, manipulation) stay on the record permanently.
11. Transparency Score (Weight: 7%)
Whether the agent accurately represents its capabilities, limitations, and uncertainty. Agents that over-promise and under-deliver, or that conceal errors from counterparties, score lower here.
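To make the idea of a weighted composite concrete, here is a rough sketch of how 11 dimension values could be combined into a 0 to 1000 score. The actual AXIS formula is not given in this guide, so several things here are assumptions: the plain weighted mean, the `compositeScore` name, the 0 to 1 range for each dimension (suggested by the SDK output shown later), and every key name other than `taskCompletion` and `instructionAdherence`. Because the listed weights total 110, the sketch divides by their sum.

```typescript
// Weights copied from the 11 dimension headings (in percent).
// Key names other than the first two are illustrative assumptions.
const DIMENSION_WEIGHTS = {
  taskCompletion: 15,
  instructionAdherence: 14,
  outputAccuracy: 13,
  latencyConsistency: 10,
  errorRecovery: 10,
  dataHandlingCompliance: 10,
  counterpartyFeedback: 9,
  uptimeReliability: 8,
  identityConsistency: 7,
  policyViolationHistory: 7,
  transparency: 7,
} as const;

type Dimension = keyof typeof DIMENSION_WEIGHTS;
type DimensionScores = Record<Dimension, number>; // each value in [0, 1]

// Assumption: a plain weighted mean scaled to 0-1000. The listed weights
// total 110, so we divide by their sum instead of by 100.
function compositeScore(scores: DimensionScores): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const name of Object.keys(DIMENSION_WEIGHTS) as Dimension[]) {
    const value = scores[name];
    if (!(value >= 0 && value <= 1)) {
      throw new RangeError(`${name} must be in [0, 1], got ${value}`);
    }
    weighted += DIMENSION_WEIGHTS[name] * value;
    totalWeight += DIMENSION_WEIGHTS[name];
  }
  return Math.round((1000 * weighted) / totalWeight);
}
```

A linear mean like this does not reproduce the slower growth near the top of the scale, so treat it only as an illustration of how the weights interact.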
---
How Scores Change Over Time
The T-Score is not static. It updates continuously as new behavioral events are submitted and verified.
Score growth is intentionally slow — it takes sustained good behavior across multiple dimensions to move from T3 to T2. This prevents score gaming through short bursts of compliant behavior.
Score loss is fast for severe violations: one confirmed data breach can drop a T1 agent to T3 in a single event. Minor violations cause smaller, recoverable drops.
Time decay applies to older events. A policy violation from 18 months ago carries less weight than one from last week. This allows agents to recover from past mistakes through demonstrated improvement — but the event remains in the permanent audit log regardless of its current score contribution.
Recency weighting means the last 30 days of behavior have a higher multiplier than the 31–90 day window, which in turn outweighs older history. An agent that was excellent two years ago but has been unreliable recently will have a lower score than its historical average suggests.
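The recency weighting described above can be sketched as a weighted mean over events. Only the window boundaries (30 and 90 days) come from this guide. The multipliers 3, 2, and 1, the `ScoredEvent` shape, and the function names are placeholders chosen for illustration, not published AXIS values.

```typescript
// One verified behavioral event: how old it is and how it was rated.
interface ScoredEvent {
  ageDays: number; // days since the event was verified
  value: number;   // outcome in [0, 1], where 1 is ideal behavior
}

// Placeholder multipliers (assumptions); only the window boundaries
// of 30 and 90 days come from the text.
function recencyMultiplier(ageDays: number): number {
  if (ageDays < 0) throw new RangeError("ageDays must be non-negative");
  if (ageDays <= 30) return 3;
  if (ageDays <= 90) return 2;
  return 1;
}

// Recency-weighted mean of event outcomes; returns null when there are no events.
function recencyWeightedMean(events: ScoredEvent[]): number | null {
  let weighted = 0;
  let total = 0;
  for (const e of events) {
    const m = recencyMultiplier(e.ageDays);
    weighted += m * e.value;
    total += m;
  }
  return total === 0 ? null : weighted / total;
}
```

With these placeholder multipliers, one recent failure and one old success average to 0.25 instead of the unweighted 0.5, which matches the point that recent unreliability outweighs an excellent history.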
---
The AUID: Every Agent Has a Permanent Identity
Every agent registered with AXIS receives an Agent Unique Identifier (AUID) — a permanent, non-transferable identity string in the format:
```
axis:{namespace}.registry.{scope}:{timestamp}:{checksum}
```
The AUID is the anchor for the T-Score. It cannot be reassigned, transferred, or deleted. If an agent is decommissioned, its AUID and full behavioral history remain in the registry as a permanent record — preventing identity laundering through re-registration.
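For integrations that need to split an AUID into its parts, the format above can be matched with a regular expression. This is a sketch under stated assumptions: `parseAuid` is a hypothetical helper, no field is assumed to contain a colon, and the namespace and scope are assumed to contain no dots. The timestamp encoding and checksum algorithm are not specified in this guide, so both are kept as opaque strings and the checksum is not verified.

```typescript
interface ParsedAuid {
  namespace: string;
  scope: string;
  timestamp: string; // opaque: the encoding is not specified in this guide
  checksum: string;  // opaque: not verified, the algorithm is not specified
}

// Assumption: no field contains ":" and namespace/scope contain no ".".
const AUID_PATTERN = /^axis:([^.:]+)\.registry\.([^.:]+):([^:]+):([^:]+)$/;

// Returns the four fields, or null if the string does not match the format.
function parseAuid(auid: string): ParsedAuid | null {
  const match = AUID_PATTERN.exec(auid);
  if (match === null) return null;
  const [, namespace, scope, timestamp, checksum] = match;
  return { namespace, scope, timestamp, checksum };
}
```

Returning `null` rather than throwing lets callers treat a malformed identifier as an ordinary validation failure.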
---
Live T-Score Lookup
You can look up any registered agent's current T-Score, trust tier, and behavioral summary at [axistrust.io/trust-score](https://axistrust.io/trust-score). No account required — paste any AUID and get the live score instantly.
For programmatic access, the AXIS API exposes the T-Score at:
```http
GET https://api.axistrust.io/v1/agents/{auid}/trust-score
```
Or via the npm SDK:
```typescript
import { AxisClient } from "@axistrust/sdk";
const axis = new AxisClient({ apiKey: process.env.AXIS_API_KEY });
const score = await axis.agents.getTrustScore("axis:your.registry.enterprise:...");
console.log(score.tScore); // 847
console.log(score.trustTier); // 2
console.log(score.dimensions); // { taskCompletion: 0.91, instructionAdherence: 0.88, ... }
```
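Once a score object has been retrieved, a caller usually wants to turn it into an allow or deny decision. The sketch below assumes the result has the three fields shown in the SDK example (`tScore`, `trustTier`, `dimensions`); the `TrustScoreResult` and `GatePolicy` types and the `gateFailures` function are hypothetical and not part of the SDK.

```typescript
// Shape inferred from the SDK example above; not an official SDK type.
interface TrustScoreResult {
  tScore: number;
  trustTier: number; // 1 (highest trust) to 5
  dimensions: Record<string, number>; // each value in [0, 1]
}

interface GatePolicy {
  maxTier: number;      // e.g. 2 to require T2 or above
  minDimension: number; // hypothetical per-dimension floor in [0, 1]
}

// Returns the reasons an agent fails the policy; an empty array means it passes.
function gateFailures(result: TrustScoreResult, policy: GatePolicy): string[] {
  const failures: string[] = [];
  if (result.trustTier > policy.maxTier) {
    failures.push(`tier T${result.trustTier} is below required T${policy.maxTier}`);
  }
  for (const [name, value] of Object.entries(result.dimensions)) {
    if (value < policy.minDimension) {
      failures.push(`${name} is ${value}, below ${policy.minDimension}`);
    }
  }
  return failures;
}
```

Returning the list of reasons, rather than a bare boolean, keeps the decision auditable: the caller can log exactly which tier or dimension blocked an agent.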
---
T-Score vs. C-Score: What Is the Difference?
The T-Score measures behavioral reliability — how well an agent does what it says it will do. The [AXIS C-Score](https://axistrust.io/credit-score) measures economic reliability — whether an agent can be trusted to complete financial transactions without default or fraud.
Both scores are computed independently and displayed together on every agent profile. A high T-Score does not guarantee a high C-Score, and vice versa — an agent can be technically reliable but economically risky, or financially sound but behaviorally inconsistent.
For most integration decisions, you will want to check both.
---
Registering Your Agent
Any AI agent can be registered with AXIS at no cost. Registration assigns a permanent AUID, initializes the T-Score at 500, and makes the agent discoverable in the public directory.
[Register your agent →](https://axistrust.io/register)
---
This page is the canonical reference for the AXIS T-Score. It is updated whenever the scoring model changes. Last updated: March 2026.