RealityScore™ AI Opportunity Index

SCORING

How RealityScore™ computes a 6-Axis™ score

Each claim first goes through 6 specialized agentic research passes, each actively trying to break the claim. Only then does RealityScore™ apply published weights, documented penalties, and a separate confidence model. The goal is to make every score explainable, auditable, and challengeable.

6 specialists pressure-test the claim. Fixed math computes the score. Confidence stays separate.

RESEARCHED

Specialists do the pressure-testing first

Research packet, skeptic pass, six specialist reviews, and simulation all happen before the final score is allowed to land.

DETERMINISTIC

Observed, inferred, and modeled are separated

RealityScore™ deliberately treats direct receipts, benchmark-based inferences, and simulation-backed notes as different classes of evidence.

SEPARATED

Recommendation and audit are not the same

A claim can be strong enough to publish as an audit without being strong enough to recommend as a play worth copying.

PROCESS FLOW

From source claim to final public verdict

The stack is designed to keep direct evidence, benchmark context, and simulation support separate instead of blending them into one confident-looking blob.

1
Source Intake

X posts or YouTube videos are normalized into the same claim packet with URL, source text, metadata, and traction signals.

2
Research Packet

Benchmark URLs, comparable claims, source-backed priors, and failure hypotheses are assembled before any scoring happens.

3
Skeptic Pass

A challenge layer looks for unsupported assumptions, alternative explanations, and the next hard evidence that would actually matter.

4
6 Specialist Passes

Numbers, timeline, costs, proof, execution detail, and repeatability are pressure-tested independently before aggregation.

5
Simulation Layer

Monte Carlo and stress-test notes support the review when the economics are concrete enough to model responsibly.

6
Grounding Check

A non-hallucination validator classifies each major note as observed, inferred, or modeled and flags unsupported wording.

7
Final Gate

The system decides whether a claim is fit for recommendation, fit only for audit, or too weak to publish publicly at all.
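The claim packet from step 1 and the grounding check from step 6 can be sketched in a few lines of Python. This is a minimal illustration, not the production validator: the names `ClaimPacket`, `Note`, `EvidenceClass`, and `grounding_flags` are hypothetical, and the rule that an observed note must carry a source reference is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class EvidenceClass(Enum):
    OBSERVED = "observed"  # direct receipts
    INFERRED = "inferred"  # benchmark-backed context
    MODELED = "modeled"    # simulation output


@dataclass
class Note:
    text: str
    evidence_class: Optional[EvidenceClass] = None  # None means not yet classified
    source_ref: Optional[str] = None  # hypothetical field: URL or timestamp of the receipt


@dataclass
class ClaimPacket:
    # Fields mirror the packet described in step 1.
    url: str
    source_text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    traction: Dict[str, int] = field(default_factory=dict)
    notes: List[Note] = field(default_factory=list)


def grounding_flags(packet: ClaimPacket) -> List[str]:
    """Return the text of every note that would be flagged as unsupported."""
    flagged = []
    for note in packet.notes:
        unclassified = note.evidence_class is None
        # Assumed rule: an observed note must point back to a receipt.
        missing_receipt = (
            note.evidence_class is EvidenceClass.OBSERVED and not note.source_ref
        )
        if unclassified or missing_receipt:
            flagged.append(note.text)
    return flagged
```

Under these assumptions, a note with no evidence class, or an observed note with no receipt attached, is returned for review instead of flowing silently into the score.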

TRUST MODEL

The site now distinguishes three evidence layers

That is the main anti-hallucination rule. Strong claims should lean hardest on observed receipts, use inferred context carefully, and label modeled outputs clearly.

OBSERVED

Direct receipts

Quoted claim text, transcript excerpts, timestamps, screenshots, metrics shown on screen, and links back to the source.

INFERRED

Reasoned context

Benchmark-backed notes about market norms, likely weak points, cost realism, and what the evidence still does not prove.

MODELED

Simulation support

Monte Carlo outputs, break-even stress tests, and scenario checks that help pressure-test the economics without pretending they are direct proof.
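A break-even stress test of the kind the modeled layer describes might look like the sketch below. It is illustrative only: the function name `break_even_probability`, the uniform distributions, and the monthly framing are assumptions, and the source does not say which distributions or inputs RealityScore™ actually uses.

```python
import random


def break_even_probability(
    revenue_low: float,
    revenue_high: float,
    cost_low: float,
    cost_high: float,
    trials: int = 10_000,
    seed: int = 0,
) -> float:
    """Estimate the share of simulated periods in which revenue covers cost."""
    rng = random.Random(seed)  # fixed seed so the modeled note is reproducible
    wins = 0
    for _ in range(trials):
        # Assumption: revenue and cost are independent and uniformly distributed.
        revenue = rng.uniform(revenue_low, revenue_high)
        cost = rng.uniform(cost_low, cost_high)
        if revenue >= cost:
            wins += 1
    return wins / trials
```

The result is a modeled number, so it belongs in the MODELED layer and should never be reported as if it were an observed receipt.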

HOW THE MODEL WORKS

A claim does not get a score by vibe.

Specialists collect evidence first. Then deterministic math computes the score, a grounding check penalizes unsupported assertions, and the final gate decides whether the claim is recommendation-ready or audit-only.

STEP 01
Build the research context before scoring

RealityScore first captures source text, benchmark links, skeptic notes, and six specialist reviews before the score is computed.

STEP 02
Run deterministic math and grounding checks

The aggregator applies published weights and penalties, then the non-hallucination layer checks whether the wording is actually grounded in receipts.

STEP 03
Publish as recommendation, audit, or hold

Confidence stays separate from score, and the release gate decides whether the claim is strong enough to recommend, useful only as an audit, or too weak to publish.
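The release gate in Step 03 can be pictured as a function of two separate inputs, score and confidence. The sketch below is hypothetical: the source does not publish gate thresholds, so `RECOMMEND_MIN_SCORE`, `AUDIT_MIN_SCORE`, and the way confidence combines with score are assumptions chosen only to show the shape of the decision.

```python
from enum import Enum


class Verdict(Enum):
    RECOMMEND = "recommendation"
    AUDIT = "audit"
    HOLD = "hold"


# Hypothetical thresholds for illustration; the source does not publish these.
RECOMMEND_MIN_SCORE = 70
AUDIT_MIN_SCORE = 40

CONFIDENCE_LEVELS = ("HIGH", "MEDIUM", "LOW")


def release_gate(score: float, confidence: str) -> Verdict:
    """Map a score and a separately computed confidence level to a verdict."""
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence level: {confidence!r}")
    # Assumed rule: a recommendation needs both a strong score and high confidence.
    if score >= RECOMMEND_MIN_SCORE and confidence == "HIGH":
        return Verdict.RECOMMEND
    # Assumed rule: an audit needs a moderate score and at least medium confidence.
    if score >= AUDIT_MIN_SCORE and confidence in ("HIGH", "MEDIUM"):
        return Verdict.AUDIT
    return Verdict.HOLD
```

Keeping confidence as its own argument means a high score backed by thin evidence can still land as an audit or a hold.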

WHY IT FEELS TRUSTWORTHY

The score is constrained on purpose

The trust layer comes from specialist research plus deterministic math: same 6 passes, same weights, same deductions, same confidence model, and now a direct grounding check on the language itself.

  • 6 specialized agentic research passes across every claim
  • Published weights, explicit penalty rules, and Monte Carlo support when inputs are concrete
  • Observed, inferred, and modeled evidence are treated as different classes of truth

WHAT CAN CHANGE

Scores can move when evidence improves

RealityScore™ is based on public evidence. If a creator adds better proof, discloses costs, corrects missing information, or supplies stronger receipts, the score can be updated against the same public model. A weak claim can become a stronger audit before it ever becomes a recommendation.

Specialist Roles

The public rubric stays the same. What changes internally is who is trying to break the claim. Each score is backed by 6 specialist passes, each pressure-testing a single job before the final math runs.

AX-01 / The Auditor / Specific Numbers

Checks quantified claims, arithmetic coherence, screenshots, and whether the numbers are concrete enough to audit.

AX-02 / The Chronologist / Time Window

Tests whether the timeline is bounded, sustained, and realistic enough to treat as more than a one-off spike.

AX-03 / The CFO / Cost Disclosure

Looks for spend, labor, tooling, CAC, and hidden economics that would change the real viability of the claim.

AX-04 / The Verifier / Customer Proof

Pushes on the difference between self-reported wins and actual third-party proof from buyers, users, or clients.

AX-05 / The Operator / Execution Detail

Examines whether the mechanics, steps, and constraints are documented clearly enough to study or challenge.

AX-06 / The Replicator / Replicable Steps

Tests whether a normal operator could actually reproduce the play without hidden leverage or privileged access.

The 6-Axis™ rubric

These six axes define the base score before penalties and confidence. Together they answer a simple question: how credible, transparent, and repeatable does this claim look from public evidence?

AX-01 20% WEIGHT

Specific Numbers

Claims score better when they are backed by raw revenue exports, dashboard screenshots, or other verifiable numbers. Vague claims without evidence score lower.

AX-02 10% WEIGHT

Time Window

Claims score better when the timeline shows sustained performance instead of a one-off spike or launch-day burst.

AX-03 15% WEIGHT

Cost Disclosure

Ad spend, tools, labor, and hidden operational costs should be visible enough for someone else to estimate real economics.

AX-04 20% WEIGHT

Customer Proof

Third-party confirmation from buyers, users, or clients carries more weight than self-reported wins with no outside signal.

AX-05 20% WEIGHT

Execution Detail

We look for the actual mechanics: steps taken, channels used, constraints faced, and what happened between start and result.

AX-06 15% WEIGHT

Replicable Steps

A strong claim gives another operator enough context to test the play without private access, celebrity distribution, or hidden leverage.
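The weighted base score can be sketched as a plain weighted sum. The weights below are the published ones from the rubric; the 0 to 100 scale for each axis and the name `base_score` are assumptions made for illustration.

```python
# Published axis weights, in percent. They sum to 100.
WEIGHTS = {
    "AX-01": 20,  # Specific Numbers
    "AX-02": 10,  # Time Window
    "AX-03": 15,  # Cost Disclosure
    "AX-04": 20,  # Customer Proof
    "AX-05": 20,  # Execution Detail
    "AX-06": 15,  # Replicable Steps
}


def base_score(axis_scores: dict) -> float:
    """Weighted average of per-axis scores, before penalties and confidence.

    Assumes each axis is scored on a 0 to 100 scale.
    """
    missing = set(WEIGHTS) - set(axis_scores)
    if missing:
        raise ValueError(f"missing axis scores: {sorted(missing)}")
    for axis in WEIGHTS:
        if not 0 <= axis_scores[axis] <= 100:
            raise ValueError(f"{axis} score must be between 0 and 100")
    return sum(WEIGHTS[axis] * axis_scores[axis] for axis in WEIGHTS) / 100
```

For example, axis scores of 80, 50, 40, 60, 70, and 30 for AX-01 through AX-06 give (1600 + 500 + 600 + 1200 + 1400 + 450) / 100 = 57.5 under these assumptions.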

Penalty Rules

Penalties keep the model from rewarding polished hype. They are applied after the weighted rubric and make the reasons for low scores easier to interpret.

Obvious Promotion -25 pts

Claim is primarily anchored to selling a course, tool, membership, or service rather than documenting the business itself.

Unverifiable Hype -20 pts

Large numbers appear without enough supporting proof, context, timeline, or a believable path to the stated result.

No Evidence -30 pts

The claim offers no meaningful data, screenshots, third-party confirmation, or other public evidence to score against.

RealityScore™ can also flag optional incentive-distortion penalties during secondary review when conflicts, upsell dependency, or hidden distribution leverage are explicit enough to document publicly.
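The penalty step can be sketched as a deduction applied to the weighted base score. The point values are the published ones; the flag names, the choice to let multiple penalties stack, and the floor at 0 are assumptions for this example, and the optional incentive-distortion penalties are left out because the source gives no values for them.

```python
# Published penalty sizes, in points.
PENALTIES = {
    "obvious_promotion": 25,
    "unverifiable_hype": 20,
    "no_evidence": 30,
}


def apply_penalties(base: float, flags) -> float:
    """Subtract each flagged penalty once from the weighted base score."""
    unique_flags = set(flags)
    unknown = unique_flags - set(PENALTIES)
    if unknown:
        raise ValueError(f"unknown penalty flags: {sorted(unknown)}")
    # Assumption: penalties stack, and each applies at most once per claim.
    deduction = sum(PENALTIES[flag] for flag in unique_flags)
    # Assumption: the final score never drops below 0.
    return max(0.0, base - deduction)
```

Because each deduction is named, a low final score can be traced back to the specific rule that caused it.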

Confidence Model

Confidence reflects evidence density and signal consistency. It is intentionally separate from the numeric score.

HIGH

Multiple axes are supported by dense evidence and the claim stays internally consistent under review.

MEDIUM

Some evidence is present, but one or more axes still rely on partial proof or unverified assumptions.

LOW

Evidence is sparse, contradictory, or too incomplete to treat the score as a strong decision signal.
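One way to picture the separation is a confidence function that never sees the numeric score. The sketch below is hypothetical: the source describes the tiers only qualitatively, so the inputs (`supported_axes`, `consistent`) and the cutoffs of 5 and 3 axes are assumptions chosen to illustrate the idea.

```python
def confidence_tier(supported_axes: int, consistent: bool, total_axes: int = 6) -> str:
    """Return HIGH, MEDIUM, or LOW from evidence density and consistency.

    The numeric score is deliberately not an input.
    """
    if not 0 <= supported_axes <= total_axes:
        raise ValueError("supported_axes must be between 0 and total_axes")
    # Assumption: contradictory evidence caps confidence at LOW.
    if not consistent:
        return "LOW"
    # Hypothetical cutoffs for illustration only.
    if supported_axes >= 5:
        return "HIGH"
    if supported_axes >= 3:
        return "MEDIUM"
    return "LOW"
```

Under this sketch, two claims with the same score can carry different confidence levels if one of them has evidence on fewer axes.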

Limitations

  • Scores are based on publicly available information only.
  • We do not audit bank accounts, private dashboards, or unpublished receipts.
  • Scores can change when new evidence surfaces.
  • The index is a research product, not financial or legal advice.
