Oracle Insights

How It Works

The Oracle prediction system is a multi-agent architecture that generates probabilistic forecasts, measures their quality, learns from failures, and improves over time through four autonomous feedback loops. It runs on Trinity, a deep agent orchestration platform by Ability.ai, with no human in the loop for routine operations.

Architecture

External Agents
        |  "What's likely?"
        |
        v
+------------------------------------------+
|              CLEON (Analyst)             |
|  Measures accuracy. Identifies biases.   |
|  Routes to best forecaster per domain.   |
|  Pushes calibration corrections.         |
|  Tracks hidden state hypotheses.         |
+--------+---------+---------+-------------+
         |         |         ^
 analyze |  inject |         | results +
calibrate| hypotheses        | hypothesis tags
         v         v         |
+--------+ +--------+ +--------+ +--------+
|Oracle-2| |Oracle-3| | ... x7 | |Oracle-8|
+--------+ +--------+ +--------+ +--------+
    |                     |
    | query before        | surprise?
    | predicting          | (high-conf failure)
    v                     v
+------------------------------------------+
|                 CORNELIUS                |
|       Zettelkasten knowledge graph       |
| Stores insights from prediction failures |
| Serves them back before new predictions  |
+------------------------------------------+

Key Agents

Oracle Fleet

Seven specialized forecasting agents that generate probabilistic predictions about world events. Each develops its own track record, biases, and domain expertise.

Cleon

Decision quality analyst and calibration coach. Measures Brier scores, identifies biases, routes predictions to the best Oracle per domain, and maintains the hidden state hypothesis store.

Cornelius

Zettelkasten-based knowledge graph. Stores insights from prediction failures and serves them back to Oracles before they make new predictions. One Oracle's failure becomes every Oracle's lesson.

[Figure: Cornelius knowledge graph visualization showing thousands of interconnected insight nodes extracted from prediction failures]

The Cornelius knowledge graph — each node is an insight extracted from a prediction failure, connected by semantic similarity. Oracles query this graph before making predictions to avoid repeating past mistakes.

The Four Feedback Loops

1. Calibration Feedback

Cleon → Oracle

Corrects: Systematic bias, pattern noise, occasion noise, conditional bias

  1. After each analysis cycle, Cleon decomposes each Oracle's errors using Kahneman's bias/noise framework.
  2. If noise dominates (97-100% of error for all current agents): push process-consistency guidance and per-category conditional bias adjustments.
  3. If bias dominates: push systematic probability corrections.
  4. Feedback persists via calibration-notes.md, which each Oracle reads at prediction time (see the sketch below).
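
A minimal sketch of how this loop could work, assuming resolved predictions live in a pandas DataFrame with oracle, probability, outcome, and category columns; the column names, the noise-dominance threshold, and the note format are illustrative, not the production schema. Only calibration-notes.md itself is named in the loop above.

```python
import pandas as pd

NOISE_DOMINANT = 0.80  # assumed threshold: share of MSE attributed to noise


def calibration_feedback(resolved: pd.DataFrame, oracle: str) -> str:
    """Decompose one Oracle's error into bias vs. noise and emit advisory guidance."""
    df = resolved[resolved.oracle == oracle]
    errors = df.probability - df.outcome           # signed forecast errors
    bias = errors.mean()                           # systematic over/under-forecasting
    mse = (errors ** 2).mean()
    noise_share = 1 - (bias ** 2) / mse            # MSE = bias^2 + noise variance
    lines = [f"# Calibration notes for {oracle}"]
    if noise_share >= NOISE_DOMINANT:
        lines.append("Noise dominates: apply your checklist identically to every question.")
        for category, group in df.groupby("category"):    # per-category conditional bias
            shift = (group.probability - group.outcome).mean()
            lines.append(f"- {category}: shift probabilities by {-shift:+.2f}")
    else:
        lines.append(f"Bias dominates: shift all probabilities by {-bias:+.2f}.")
    return "\n".join(lines)


# The returned text would be written to calibration-notes.md, which the Oracle
# reads at prediction time; the feedback is advisory, never a hard override.
```
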
2. Surprise Learning

Cleon → Cornelius

Corrects: Wrong mental models, knowledge gaps, outdated assumptions

  1. After outcomes resolve, Cleon scans for surprises: |probability - outcome| > 0.40 (see the sketch below).
  2. For each surprise, the full prediction context is sent to Cornelius for insight extraction.
  3. Cornelius stores the lesson as a permanent note: not 'Oracle was wrong' but 'the world works differently than assumed'.
  4. 7,198 surprises processed to date.
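
A sketch of the surprise scan; the record fields and the send_to_cornelius callback are placeholders, and only the 0.40 threshold comes from the loop description above.

```python
SURPRISE_THRESHOLD = 0.40  # |probability - outcome| above this counts as a surprise


def scan_for_surprises(resolved_predictions, send_to_cornelius):
    """Flag high-confidence failures and ship their full context for insight extraction."""
    surprises = []
    for pred in resolved_predictions:
        outcome = 1.0 if pred["resolved_true"] else 0.0
        if abs(pred["probability"] - outcome) > SURPRISE_THRESHOLD:
            surprises.append({
                "question": pred["question"],
                "oracle": pred["oracle"],
                "probability": pred["probability"],
                "outcome": outcome,
                "reasoning": pred["reasoning"],   # full context, not just the number
            })
    for surprise in surprises:
        send_to_cornelius(surprise)   # Cornelius extracts and stores the lesson
    return surprises
```
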
3. Knowledge Retrieval

Cornelius → Oracle

Corrects: Repeated mistakes across the fleet

  1. Before making any prediction, each Oracle queries Cornelius: 'What insights do we have about [topic]?'
  2. Cornelius returns relevant prior insights via semantic search (see the sketch below).
  3. The Oracle incorporates these into its reasoning before generating a probability.
  4. One Oracle's failure becomes every Oracle's lesson — cross-agent learning without shared state.
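
A sketch of the retrieval step, assuming each stored insight carries a precomputed embedding vector and that some embed function is available; Cornelius's actual storage and search stack is not specified here, so plain cosine similarity stands in for it.

```python
import numpy as np


def retrieve_insights(query: str, notes: list, embed, k: int = 5) -> list:
    """Return the k stored insights most semantically similar to the query."""
    q = embed(query)                               # placeholder embedding function
    q = q / np.linalg.norm(q)
    scored = []
    for note in notes:                             # each note: {"text": ..., "vector": ...}
        v = note["vector"] / np.linalg.norm(note["vector"])
        scored.append((float(q @ v), note["text"]))  # cosine similarity on unit vectors
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]


# An Oracle would prepend the returned insights to its reasoning context, e.g.
# retrieve_insights("central bank rate decisions", notes, embed)
```
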
4. Hidden State Inference

Cleon → Oracle

Corrects: Ungrounded reasoning, regime change blindness

  1. From every prediction's reasoning, extract the implicit bets it makes about non-obvious present reality.
  2. Deduplicate hypotheses via semantic similarity (FAISS) and track each one's wins and losses against real outcomes (see the sketch below).
  3. Before each prediction, inject relevant hypotheses ranked by empirical confidence.
  4. When a strong hypothesis starts losing, surface it as a regime-change signal.
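
A sketch of the deduplication and tracking step. FAISS is named in the loop above, but the index type, embedding dimension, and similarity cutoff here are assumptions, as is the idea that hypotheses arrive as short text statements with precomputed vectors.

```python
import faiss
import numpy as np

SIM_THRESHOLD = 0.90  # assumed cosine-similarity cutoff for "same hypothesis"


class HypothesisStore:
    def __init__(self, dim: int):
        self.index = faiss.IndexFlatIP(dim)   # inner product == cosine on unit vectors
        self.records = []                     # [{"text": ..., "wins": 0, "losses": 0}]

    def add_or_merge(self, text: str, vector: np.ndarray) -> int:
        """Return the id of an existing near-duplicate hypothesis, or add a new one."""
        v = (vector / np.linalg.norm(vector)).astype("float32").reshape(1, -1)
        if self.index.ntotal > 0:
            sims, ids = self.index.search(v, 1)
            if sims[0][0] >= SIM_THRESHOLD:
                return int(ids[0][0])         # near-duplicate: reuse existing hypothesis
        self.index.add(v)
        self.records.append({"text": text, "wins": 0, "losses": 0})
        return len(self.records) - 1

    def record_outcome(self, hyp_id: int, won: bool) -> None:
        """Update a hypothesis's win/loss tally once the linked prediction resolves."""
        self.records[hyp_id]["wins" if won else "losses"] += 1
```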

Key Metrics

Brier Score

Mean squared error between the predicted probability and the actual outcome (1 if the event happened, 0 if not). Lower is better. <0.10 = Excellent, 0.10-0.20 = Good, 0.20-0.30 = Fair, >0.30 = Poor.
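
For concreteness, a minimal sketch of the score and the rating bands above; the helper names are illustrative.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes (0 or 1)."""
    pairs = list(zip(probabilities, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


def rating(score: float) -> str:
    if score < 0.10:
        return "Excellent"
    if score < 0.20:
        return "Good"
    if score < 0.30:
        return "Fair"
    return "Poor"


# A 0.9 forecast that resolves true contributes (0.9 - 1)^2 = 0.01.
print(rating(brier_score([0.9, 0.3, 0.7], [1, 0, 0])))  # ~0.197 -> "Good"
```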

Calibration Error

How well confidence matches reality. If an Oracle says 70% confident, events should happen ~70% of the time. Deviation >15% triggers an alert.
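
A sketch of a calibration check that buckets predictions by stated confidence and compares each bucket's empirical hit rate; the 10-point buckets and the exact alerting logic are assumptions, only the 15% deviation threshold comes from the description above.

```python
from collections import defaultdict

ALERT_THRESHOLD = 0.15  # flag buckets where confidence and reality diverge by >15 points


def calibration_report(predictions):
    """predictions: iterable of (probability, outcome) pairs, outcome in {0, 1}."""
    buckets = defaultdict(list)
    for prob, outcome in predictions:
        buckets[round(prob, 1)].append(outcome)     # group into 10-point confidence buckets
    report = []
    for stated in sorted(buckets):
        hits = buckets[stated]
        observed = sum(hits) / len(hits)
        report.append({
            "stated": stated,
            "observed": round(observed, 2),
            "n": len(hits),
            "alert": abs(stated - observed) > ALERT_THRESHOLD,  # e.g. says 70%, happens 50%
        })
    return report
```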

Hidden State Confidence

Recency-weighted win rate for each hypothesis (30-day half-life). Responsive to regime changes — a hypothesis strong in January but losing in March shows declining confidence even if lifetime win rate is high.
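
A sketch of the recency weighting, assuming each hypothesis outcome is stored with a timestamp; the 30-day half-life is from the metric description, the rest is illustrative.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0


def hidden_state_confidence(outcomes, now=None):
    """outcomes: list of (timezone-aware timestamp, won) pairs. Returns a recency-weighted win rate."""
    now = now or datetime.now(timezone.utc)
    weighted_wins = weighted_total = 0.0
    for ts, won in outcomes:
        age_days = (now - ts).total_seconds() / 86400
        weight = 0.5 ** (age_days / HALF_LIFE_DAYS)   # an outcome 30 days old counts half
        weighted_total += weight
        weighted_wins += weight * won
    return weighted_wins / weighted_total if weighted_total else 0.0
```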

Noise Decomposition

Using Kahneman's framework: pattern noise (inconsistency on similar questions) vs occasion noise (quality drift over time) vs conditional bias (bias that varies by domain/horizon).
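
One rough way such a decomposition could be computed from resolved predictions, assuming each carries a category and a resolution week; the grouping below is a simplified stand-in for Kahneman's definitions, not the exact method Cleon uses.

```python
import pandas as pd


def noise_decomposition(df: pd.DataFrame) -> dict:
    """df columns assumed: probability, outcome, category, resolved_week."""
    errors = df.probability - df.outcome
    bias = errors.mean()                                  # overall systematic bias
    by_category = errors.groupby(df.category).mean()      # mean error per domain
    by_week = errors.groupby(df.resolved_week).mean()     # mean error per time window
    residual = errors.var() - by_category.var() - by_week.var()
    return {
        "bias": bias,
        "conditional_bias": by_category.var(),   # bias that varies by domain
        "occasion_noise": by_week.var(),         # quality drift over time
        "pattern_noise": max(residual, 0.0),     # leftover inconsistency (approximate)
    }
```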

Platform

The entire system runs autonomously on Trinity, a deep agent orchestration platform. Four daily scheduled pipelines handle sync, analysis, calibration, surprise detection, and health checks. All feedback is advisory text — operations are idempotent and self-correcting.