For Researchers
Technical documentation and validation data
This section provides the full technical detail behind Veridi’s methodology and testing. If you’re evaluating the system’s rigor, designing similar systems, or looking for something to break, start here.
What’s available
Validation Report: Complete documentation of the three-phase validation process: 97 claims across 8 domains, 9 verdict categories, 24 adversarial scenarios, 4 non-English languages, and genuinely contested ground truth. Includes per-claim results, pass criteria, and an honest discussion of limitations.
Adversarial Testing: The 11 gaming vectors, how each is detected, and how the methodology performed against 24 adversarial claims (12 single-vector, 12 multi-vector). Includes 4 claims based on documented real-world disinformation patterns.
Confidence Calibration: The framework for assigning confidence ratings: tier-based structural ceilings, field reliability coefficients with sourcing honesty labels, and the interaction rules that prevent absurd multiplicative results.
Gaming Countermeasures: Detailed documentation of all 11 disinformation detection procedures, including detection difficulty ratings, impact severity, and the relationship to the Institutional Reliability Index.
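As a rough illustration of how tier ceilings, field coefficients, and interaction rules can compose, here is a minimal sketch. All names and numeric values (the tier ceilings, the field coefficients, the 0.05 floor) are assumptions for illustration, not Veridi's actual implementation:

```python
# Hypothetical sketch of tier-ceiling confidence calibration.
# Ceiling and coefficient values below are ASSUMED, not Veridi's real numbers.

TIER_CEILINGS = {
    "primary_source": 0.95,    # structural ceiling per evidence tier
    "secondary_source": 0.85,
    "tertiary_source": 0.70,
}

FIELD_COEFFICIENTS = {
    "peer_reviewed": 1.00,     # field reliability coefficients
    "journalism": 0.90,
    "social_media": 0.60,
}

def confidence(tier: str, fields: list[str], base: float = 1.0) -> float:
    """Combine field coefficients under a tier ceiling.

    The hard floor at the end stands in for the interaction rules that
    prevent absurd multiplicative results: stacking many low coefficients
    cannot drive confidence arbitrarily close to zero.
    """
    score = base
    for f in fields:
        score *= FIELD_COEFFICIENTS[f]
    score = min(score, TIER_CEILINGS[tier])  # ceiling binds regardless of coefficients
    return max(score, 0.05)                  # interaction rule: fixed floor

print(confidence("secondary_source", ["journalism", "social_media"]))
```

The key design point the sketch captures is that the ceiling is structural (a `min`, not a multiplier), so strong coefficients can never lift a weak evidence tier above its cap.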
Key numbers
| Metric | Value |
|---|---|
| Total claims tested | 97 |
| Passed | 96 |
| Partial | 1 |
| Failed | 0 |
| Subject domains covered | 8 |
| Verdict categories tested | 9 |
| Adversarial scenarios | 24 |
| Gaming vectors tested | 11 (all detected) |
| Primary gaming flags fired | 24/24 (100%) |
| ADV-v2 total flags fired | 39 (vs. ~30 expected) |
| Verdict boundary cases | 18 (all resolved correctly) |
| Non-English languages tested | 4 (Japanese, Turkish, Chinese, Hindi) |
| Blocking claims passed | 4/4 |
Known limitations
These are described in detail in the validation report and the known limitations page. The short version:
- Near-perfect results warrant scrutiny. The test suite was designed by the same people who built the methodology.
- Validation was conducted by the methodology’s own implementation (AI following the procedures), not by human volunteers, so the results reflect the written methodology and its implementation together, not the methodology alone.
- Most adversarial claims were constructed for testing, though 4 were based on real-world disinformation patterns.
- The methodology has not yet been tested at scale with human users.
- Brier score calibration has not yet accumulated enough data points for statistical significance.
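For readers unfamiliar with the metric, the Brier score is the mean squared error between stated confidence and the binary outcome; the sample pairs below are illustrative, not Veridi data:

```python
# Minimal Brier-score sketch. The (confidence, outcome) pairs are
# made-up examples, not results from the validation suite.

def brier_score(forecasts: list[tuple[float, int]]) -> float:
    """Mean squared error between confidence p and outcome o (0 or 1).

    Lower is better; a perfectly calibrated and perfectly sharp
    forecaster scores 0.0. The mean only stabilizes with enough pairs,
    which is the data-accumulation gap noted above.
    """
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

sample = [(0.9, 1), (0.8, 1), (0.7, 0), (0.95, 1)]
print(brier_score(sample))
```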
We welcome external validation, particularly claims designed to produce incorrect results.