Evaluation Dashboard
Fine-tuning and debugging tool for LLMs generating alpha signals. Compare model iterations using Information Coefficient (IC) to measure predictive power against realized market returns.
Alpha Signal Analysis
The product is an alpha signal: a fine-tuned 7B-parameter language model that ingests quantum computing news, press releases, and research papers, then outputs structured trading signals that score 9 sector tickers from -2.0 (strongly bearish) to +2.0 (strongly bullish). The signal is designed to be consumed by quantitative trading systems and portfolio construction engines. Funds like Citadel are actively looking to purchase alternative signal sources exactly like this.
This dashboard is the development and evaluation tool for iterating on that signal. It tracks predictive power across 7 model iterations (base model through LoRA, rejection sampling, DPO, and GRPO), measures Information Coefficient against realized market returns, and provides live debugging for model outputs. The best model (V7d GRPO) achieves IC = +0.157 at the 5-day horizon with p = 0.006, meaning its rankings meaningfully predict which quantum stocks will outperform the market over the following week.
What is Information Coefficient (IC)?
IC is the Spearman rank correlation between a model's predicted signal scores and the realized abnormal returns (stock return minus market return) over a given holding period. An IC of +0.15 at 5 days means the model's signal rankings meaningfully predict which stocks will outperform over the next week. In quantitative finance, IC > 0.05 with statistical significance (p < 0.10) is considered a usable signal for portfolio construction.
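As an illustration of the definition above, here is a minimal sketch of an IC calculation in plain Python. The function names and the sample numbers are hypothetical and are not taken from the dashboard's code. Ties are given average ranks, which is the usual Spearman convention, and the p-value is not computed here.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN when either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def information_coefficient(signals, stock_returns, market_returns):
    """Spearman rank correlation between signals and abnormal returns."""
    # Abnormal return = stock return minus market return over the same period.
    abnormal = [s - m for s, m in zip(stock_returns, market_returns)]
    return pearson(average_ranks(signals), average_ranks(abnormal))


# Made-up example: four tickers, one holding period.
signals = [1.5, -0.5, 0.0, 2.0]
stock_returns = [0.04, -0.01, 0.02, 0.03]
market_returns = [0.01, 0.01, 0.01, 0.01]
ic = information_coefficient(signals, stock_returns, market_returns)
print(round(ic, 3))
```

Because only ranks enter the calculation, the IC is insensitive to the scale of the signal: what matters is whether higher-scored tickers tend to realize higher abnormal returns.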
How to Read This Dashboard
The Signal Decay Curve shows how predictive power changes over time. A model with high IC at short horizons but rapid decay is useful for short-term trading. A model with sustained IC across horizons (like V7d GRPO) suggests the signal captures fundamental value that the market prices in gradually. Bright markers indicate statistical significance (p < 0.10). The table below provides exact IC values and direction accuracy (% of predictions where the sign was correct).
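Direction accuracy, as described above, can be sketched per horizon as follows. This is an illustrative sketch and not the dashboard's implementation: the function names and data are hypothetical, and it assumes that a signal of exactly 0 makes no directional call and is therefore excluded from the denominator.

```python
def direction_accuracy(signals, abnormal_returns):
    """Share of non-zero signals whose sign matches the realized abnormal return."""
    hits = 0
    total = 0
    for signal, realized in zip(signals, abnormal_returns):
        if signal == 0:
            continue  # Assumption: a neutral signal is not a directional prediction.
        total += 1
        if signal * realized > 0:  # Same sign, and the realized return is non-zero.
            hits += 1
    return hits / total if total else float("nan")


def accuracy_by_horizon(signals, abnormal_by_horizon):
    """Direction accuracy at each holding period, keyed by horizon in days."""
    return {
        horizon: direction_accuracy(signals, returns)
        for horizon, returns in sorted(abnormal_by_horizon.items())
    }


# Made-up example: four tickers evaluated at 1-day and 5-day horizons.
signals = [1.5, -0.5, 0.0, 2.0]
abnormal_by_horizon = {
    1: [0.01, 0.02, 0.00, 0.03],
    5: [0.03, -0.02, 0.01, 0.02],
}
print(accuracy_by_horizon(signals, abnormal_by_horizon))
```

Reading the result across horizons gives the same kind of view as the decay curve: a metric that holds up or improves at longer horizons points to information the market absorbs gradually.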
Signal Decay Curve
IC Comparison Table
Limitations
- Evaluation period: Jan-Jun 2026 (single market regime)
- Daily granularity (no intraday timing)
- Single-factor market model (SPY benchmark)
- Correlation does not imply causation
- Small sample size limits statistical power for subset analyses
Live Signal Debugging
Run multiple analyses and compare outputs. Results accumulate in the feed below. Submit new articles while previous analyses are still running.
Results Feed
Historical Prediction Analysis
Browse evaluation articles and compare predictions across all model iterations side-by-side against realized market outcomes.
Signal Comparison (All Models)
Raw Price Movement (20 days post-event)
Abnormal Returns vs. Market (SPY)
Abnormal Returns vs. Sector (Equal-Weight Quantum Basket)
Quantum Computing Sector Map
Understand the competitive structure of the sector and how signals propagate across it.