For Researchers

Technical documentation and validation data

This section provides the full technical detail behind Veridi’s methodology and testing. If you’re evaluating the system’s rigor, designing similar systems, or looking for something to break, start here.

What’s available

Validation Report: Complete documentation of the three-phase validation process, covering 97 claims across 8 domains, 9 verdict categories, 24 adversarial scenarios, 4 non-English languages, and genuinely contested ground truth. Includes per-claim results, pass criteria, and an honest discussion of limitations.
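
As a rough illustration of what a per-claim record in that report contains, here is a minimal sketch; the field names and the exact-match pass criterion are our assumptions, not Veridi's published schema:

    from dataclasses import dataclass

    # Illustrative only: field names are assumptions, not Veridi's schema.
    @dataclass
    class ClaimResult:
        claim_id: str          # e.g. an identifier like "ADV-07" (hypothetical)
        domain: str            # one of the 8 subject domains
        expected_verdict: str  # one of the 9 verdict categories
        actual_verdict: str
        language: str = "en"   # 4 non-English languages were also tested

        def passed(self) -> bool:
            # Sketched here as an exact verdict match; the real pass
            # criteria are documented per claim in the report.
            return self.actual_verdict == self.expected_verdict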

Adversarial Testing: The 11 gaming vectors, how each is detected, and how the methodology performed against 24 adversarial claims (12 single-vector, 12 multi-vector). Includes 4 claims based on documented real-world disinformation patterns.
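To make the single-vector/multi-vector distinction concrete, a hypothetical check of whether a claim's primary flags fired (names assumed, not taken from the methodology):

    # The 11 gaming vectors, named here only as placeholders.
    GAMING_VECTORS = {f"vector_{i:02d}" for i in range(1, 12)}

    def primary_flags_fired(claim_flags: set[str], expected_vectors: set[str]) -> bool:
        """True if every vector the claim was built around was detected.

        A single-vector claim expects one flag; a multi-vector claim
        expects several, and all must fire for the claim to count as caught.
        """
        return expected_vectors <= claim_flags

    # A multi-vector claim built around vectors 3 and 7 that also tripped
    # an extra flag (extra flags are fine; missing ones are not):
    assert primary_flags_fired({"vector_03", "vector_07", "vector_09"},
                               {"vector_03", "vector_07"})
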

Confidence Calibration: The framework for assigning confidence ratings: tier-based structural ceilings, field reliability coefficients with sourcing honesty labels, and the interaction rules that prevent absurd multiplicative results.
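As a sketch of how those pieces might combine, assuming the coefficients multiply a base score and the tier acts as a hard cap (our reading for illustration, not the published formula):

    def calibrated_confidence(base: float, tier_ceiling: float,
                              field_coefficients: list[float],
                              floor: float = 0.05) -> float:
        """Hypothetical sketch of the calibration shape described above.

        Coefficients multiply the base score, but two guards prevent
        absurd multiplicative results: the product is floored so that
        stacking many mildly unreliable fields cannot drive confidence
        to ~0, and the tier ceiling caps the result regardless of how
        strong the coefficients are.
        """
        score = base
        for c in field_coefficients:
            score *= c
        score = max(score, floor)        # interaction rule: no runaway decay
        return min(score, tier_ceiling)  # structural ceiling from the tier
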

Gaming Countermeasures: Detailed documentation of all 11 disinformation detection procedures, including detection difficulty ratings, impact severity, and the relationship to the Institutional Reliability Index.
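A minimal sketch of the record shape that documentation implies; the field names and the IRI linkage here are assumptions, not the published schema:

    from dataclasses import dataclass
    from enum import Enum

    class Rating(Enum):
        LOW = 1
        MODERATE = 2
        HIGH = 3

    # Illustrative record for one of the 11 documented countermeasures.
    @dataclass
    class Countermeasure:
        vector: str                    # gaming vector this procedure detects
        detection_difficulty: Rating   # how hard the vector is to detect
        impact_severity: Rating        # damage if the vector goes undetected
        feeds_iri: bool                # whether detections feed the source's
                                       # Institutional Reliability Index
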

Key numbers

Metric                          Value
Total claims tested             97
Passed                          96
Partial                         1
Failed                          0
Subject domains covered         8
Verdict categories tested       9
Adversarial scenarios           24
Gaming vectors tested           11 (all detected)
Primary gaming flags fired      24/24 (100%)
ADV-v2 total flags fired        39 (vs. ~30 expected)
Verdict boundary cases          18 (all resolved correctly)
Non-English languages tested    4 (Japanese, Turkish, Chinese, Hindi)
Blocking claims passed          4/4

Known limitations

These are described in detail in the validation report and the known limitations page. The short version:

  • Near-perfect results warrant scrutiny. The test suite was designed by the same people who built the methodology.
  • Validation was conducted by the methodology’s own implementation (AI following the procedures), not by human volunteers, so the results reflect the implementation as well as the methodology itself.
  • Most adversarial claims were constructed for testing, though 4 were based on real-world disinformation patterns.
  • The methodology has not yet been tested at scale with human users.
  • Brier score calibration has not yet accumulated enough data points for statistical significance.
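
For context on that last point, the Brier score itself is just the mean squared error between stated confidence and outcome; a minimal computation (our sketch, not Veridi's code):

    def brier_score(forecasts: list[tuple[float, bool]]) -> float:
        """Mean squared error between stated confidence and outcome.

        Each pair is (predicted probability that the verdict is correct,
        whether it actually was). Lower is better; 0.25 is the score of
        an uninformative constant 0.5 forecast.
        """
        return sum((p - float(o)) ** 2 for p, o in forecasts) / len(forecasts)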

We welcome external validation, particularly claims designed to produce incorrect results.