Oracle-Credit-Compute (OCC) Stack

A minimal, open-source research prototype for agentic compute allocation where agents earn and spend non-transferable, decaying credits based on verified marginal impact.

Quickstart

git clone https://huggingface.co/narcolepticchicken/occ-stack
cd occ-stack
pip install -r requirements.txt

# Simulated benchmarks (CPU)
python benchmarks/benchmark_code.py            # Code compute allocation
python benchmarks/benchmark_retrieval_qa.py    # Retrieval QA
python benchmarks/benchmark_debate_v2.py       # Multi-agent debate

# Ablations + anti-gaming (CPU, ~5 min)
python eval_runner.py

# Real LLM benchmark (GPU; requires a T4 or better)
python jobs/run_real_llm_standalone_v7.py

# Unit tests
python tests/test_oracle.py
python tests/test_ledger.py

Architecture

┌─────────────┐    ┌─────────────────┐    ┌──────────────┐
│  Agent      │───▶│  ResourceBroker │───▶│  Compute     │
│  (requests  │    │  (allow/deny/   │    │  (model call,│
│   resource) │◄───│   downgrade)    │◄───│   retrieval) │
└─────────────┘    └─────────────────┘    └──────────────┘
       │                   │
       ▼                   ▼
┌─────────────┐    ┌─────────────────┐
│ CreditLedger│◄───│  ImpactOracle   │
│ (earn/spend/│    │  (score action  │
│  decay)     │    │   on verified   │
└─────────────┘    │   impact)       │
                   └─────────────────┘
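
To make the earn/spend/decay and allow/deny/downgrade loop in the diagram concrete, here is a minimal sketch. It is not the repository's implementation: `ToyLedger`, `toy_broker`, the tier names, the costs, and the decay factor are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLedger:
    """Hypothetical per-agent ledger: balances decay and cannot be transferred."""
    decay: float = 0.9                       # fraction of balance kept per tick (assumed)
    balances: dict = field(default_factory=dict)

    def earn(self, agent: str, impact: float) -> None:
        # Credits are minted only from a non-negative, oracle-verified impact score.
        self.balances[agent] = self.balances.get(agent, 0.0) + max(impact, 0.0)

    def spend(self, agent: str, cost: float) -> bool:
        # An agent can only draw on its own balance; there is no transfer method.
        if self.balances.get(agent, 0.0) < cost:
            return False
        self.balances[agent] -= cost
        return True

    def tick(self) -> None:
        # Exponential decay discourages hoarding stale credit.
        for agent in self.balances:
            self.balances[agent] *= self.decay


def toy_broker(ledger: ToyLedger, agent: str, tiers: list) -> str:
    """Grant the most expensive tier the agent can afford, else deny.

    `tiers` is a list of (name, cost) pairs sorted from most to least expensive.
    Returning anything other than the first tier corresponds to a downgrade.
    """
    for name, cost in tiers:
        if ledger.spend(agent, cost):
            return name
    return "deny"


ledger = ToyLedger(decay=0.5)
ledger.earn("agent_a", 4.0)                     # oracle scored a verified impact of 4.0
ledger.tick()                                   # balance decays to 2.0
tiers = [("large_model", 3.0), ("small_model", 1.0)]
first = toy_broker(ledger, "agent_a", tiers)    # cannot afford 3.0, so downgraded
second = toy_broker(ledger, "agent_a", tiers)   # 1.0 left, small model again
third = toy_broker(ledger, "agent_a", tiers)    # balance exhausted
print(first, second, third)                     # small_model small_model deny
```

The ledger in this sketch has no transfer operation at all, which is one simple way to get non-transferability by construction rather than by policy checks.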

Key Results (Simulated)

  • 52.3% compute reduction at iso-accuracy on the code benchmark (OCC tiered escalation vs. a fixed budget)
  • 76% accuracy in the debate benchmark with 40% adversarial agents using OCC credit filtering, vs. 56% for naive confidence voting
  • All anti-gaming attacks contained: hidden-test gaming, collusion, over-abstention, spam
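
The credit-filtering result above contrasts two aggregation rules. The following toy sketch shows the difference in mechanism only; it does not reproduce the reported numbers. The function names, the `min_credit` threshold, and the example balances are assumptions, not values taken from the benchmark scripts.

```python
from collections import defaultdict


def naive_confidence_vote(votes):
    """votes: list of (agent, answer, confidence). Sums self-reported confidence."""
    totals = defaultdict(float)
    for _agent, answer, confidence in votes:
        totals[answer] += confidence
    return max(totals, key=totals.get)


def credit_filtered_vote(votes, credits, min_credit=1.0):
    """Drop agents below `min_credit`, then weight each remaining vote by credit."""
    totals = defaultdict(float)
    for agent, answer, _confidence in votes:
        balance = credits.get(agent, 0.0)
        if balance >= min_credit:
            totals[answer] += balance
    # Return None when no agent clears the threshold (treated as an abstention).
    return max(totals, key=totals.get) if totals else None


# Five agents, two of them adversarial (40%), asserting a wrong answer confidently.
votes = [
    ("honest_1", "A", 0.6),
    ("honest_2", "A", 0.5),
    ("honest_3", "A", 0.5),
    ("adv_1", "B", 1.0),
    ("adv_2", "B", 1.0),
]
# Illustrative balances: the adversaries earned little verified impact earlier.
credits = {"honest_1": 3.0, "honest_2": 2.5, "honest_3": 2.0,
           "adv_1": 0.2, "adv_2": 0.1}

naive = naive_confidence_vote(votes)              # "B": inflated confidence wins
filtered = credit_filtered_vote(votes, credits)   # "A": low-credit votes are dropped
print(naive, filtered)
```

Self-reported confidence is free to inflate, whereas a credit balance in this sketch can only come from earlier verified impact, which is why the filtered vote resists the adversarial pair.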

Status

Component                  Status
Impact Oracle              ✅ Working
Credit Ledger              ✅ Working
Resource Broker            ✅ Working
GRPO/RL Hook               ✅ Factory ready
Simulated benchmarks       ✅ Complete
Ablations (10 conditions)  ✅ Complete
Anti-gaming tests          ✅ Complete
Real LLM benchmark         🔄 V7 in progress
GRPO training              🔄 Not yet run

Repo Structure

occ/
  oracle/          # ImpactOracle: rule-based scoring
  ledger/          # CreditLedger: non-transferable, decaying credits
  broker/          # ResourceBroker: capability-based access control
  rl/              # RewardHook, OfflineComparator: TRL GRPO integration
  benchmarks/      # 3 benchmark scripts + real LLM variants
  tests/           # Unit tests
  reports/         # Reports, results, blog post
  jobs/            # Self-contained GPU job scripts
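
The `rl/` entry mentions a reward hook for TRL GRPO integration. As a rough sketch of what such a hook could look like, the example below assumes TRL's documented convention that a custom GRPO reward function receives `completions` plus extra keyword arguments and returns one float per completion. `make_reward_fn` and `toy_impact_score` are hypothetical names, and the scoring rule is a placeholder rather than the repository's oracle.

```python
from typing import Callable, List


def make_reward_fn(score_fn: Callable[[str], float]) -> Callable[..., List[float]]:
    """Wrap a per-completion scorer into a batch reward function."""
    def reward_fn(completions, **kwargs) -> List[float]:
        # Assumes the plain-text format, where each completion is a string.
        return [float(score_fn(text)) for text in completions]
    return reward_fn


def toy_impact_score(text: str) -> float:
    # Hypothetical rule: reward completions containing a passing-test marker,
    # minus a small length penalty as a stand-in for compute cost.
    passed = 1.0 if "PASS" in text else 0.0
    return passed - 0.001 * len(text)


reward_fn = make_reward_fn(toy_impact_score)
rewards = reward_fn(completions=["PASS", "no tests ran"])
print(rewards)
```

A factory like this keeps the scorer independent of the trainer, so the same scoring function can be reused for offline comparison without importing any RL library.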

Citation

@misc{occ2026,
  title={Oracle-Credit-Compute: A Minimal Stack for Agentic Compute Allocation},
  author={narcolepticchicken},
  year={2026},
  url={https://huggingface.co/narcolepticchicken/occ-stack}
}

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'narcolepticchicken/occ-stack'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.
