# Show HN: AI Model Benchmark for Crypto Price Predictions

> Source: <https://coinsignal.co/benchmark>
> Published: 2026-05-31 09:35:20+00:00

# Model Benchmark

Ranking uses verified model performance from the database. The score favors accuracy first, then hit rate, consistency, confidence calibration, and enough sample size to trust the result.
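The page does not publish the formula behind the score, so the following is only a rough sketch of how such a ranking could be assembled. The weights, the `MIN_SAMPLES` threshold, the `ModelStats` type, and the way the confidence gap is turned into a calibration term are all assumptions made for illustration; the sketch is not expected to reproduce the scores in the table.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    avg_accuracy: float  # percent, 0-100
    hit_rate: float      # percent, 0-100
    consistency: float   # percent, 0-100
    conf_gap: float      # signed percentage points
    samples: int         # number of verified predictions


# Hypothetical weights: accuracy first, then hit rate, consistency, calibration.
WEIGHTS = {"accuracy": 0.40, "hit_rate": 0.25, "consistency": 0.20, "calibration": 0.15}

# Hypothetical sample count at which a model is trusted in full.
MIN_SAMPLES = 100


def composite_score(stats: ModelStats) -> float:
    """Weighted blend of the four metrics, scaled down when samples are scarce."""
    # Assumption: a smaller absolute confidence gap means better calibration.
    calibration = max(0.0, 100.0 - abs(stats.conf_gap))
    raw = (
        WEIGHTS["accuracy"] * stats.avg_accuracy
        + WEIGHTS["hit_rate"] * stats.hit_rate
        + WEIGHTS["consistency"] * stats.consistency
        + WEIGHTS["calibration"] * calibration
    )
    # Assumption: linear trust factor that reaches 1.0 at MIN_SAMPLES.
    trust = min(1.0, stats.samples / MIN_SAMPLES)
    return raw * trust


if __name__ == "__main__":
    example = ModelStats(avg_accuracy=66.0, hit_rate=60.0, consistency=78.0,
                         conf_gap=-4.0, samples=500)
    print(round(composite_score(example), 1))
```

A multiplicative trust factor is one simple way to keep a model with very few verified calls from outranking one with a long track record; other shrinkage schemes would serve the same purpose.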

Best calibrated

**moonshotai/kimi-k2.5**

Verified predictions

**10,885** Only completed prediction windows with accuracy scores are included.

Models compared

**13** Grouped by model name across every tracked coin.

Best avg accuracy

**73.8%** Mean score from direction, range closeness, and range overlap.

Best recent form

**78.5%** Average accuracy over each model's latest 10 verified calls.
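The accuracy score is described above as a mean of direction, range closeness, and range overlap, but the page does not define the three components. The sketch below shows one plausible reading. Every function name and every component formula here is a hypothetical stand-in, not the site's actual scoring.

```python
def direction_score(start: float, predicted_mid: float, actual_close: float) -> float:
    """100 if the predicted and actual moves share a direction, else 0 (assumed rule)."""
    predicted_up = predicted_mid >= start
    actual_up = actual_close >= start
    return 100.0 if predicted_up == actual_up else 0.0


def range_closeness(pred_low: float, pred_high: float, actual_close: float) -> float:
    """100 inside the predicted range, falling linearly with distance outside it."""
    if pred_low <= actual_close <= pred_high:
        return 100.0
    width = pred_high - pred_low
    if width <= 0:
        return 0.0
    distance = pred_low - actual_close if actual_close < pred_low else actual_close - pred_high
    # Assumption: the score reaches 0 once the miss equals one range width.
    return max(0.0, 100.0 * (1.0 - distance / width))


def range_overlap(pred_low: float, pred_high: float,
                  actual_low: float, actual_high: float) -> float:
    """Intersection over union of the predicted and realized price ranges."""
    intersection = max(0.0, min(pred_high, actual_high) - max(pred_low, actual_low))
    union = max(pred_high, actual_high) - min(pred_low, actual_low)
    return 100.0 * intersection / union if union > 0 else 0.0


def accuracy(start: float, pred_low: float, pred_high: float,
             actual_low: float, actual_high: float, actual_close: float) -> float:
    """Unweighted mean of the three components, as the summary card describes."""
    predicted_mid = (pred_low + pred_high) / 2.0
    parts = (
        direction_score(start, predicted_mid, actual_close),
        range_closeness(pred_low, pred_high, actual_close),
        range_overlap(pred_low, pred_high, actual_low, actual_high),
    )
    return sum(parts) / len(parts)


if __name__ == "__main__":
    # Made-up prices for illustration only.
    print(accuracy(start=100.0, pred_low=102.0, pred_high=106.0,
                   actual_low=101.0, actual_high=107.0, actual_close=105.0))
```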

**Accuracy**

Primary quality score already recorded for each model.

**Hit Rate**

Share of predictions scoring at least 70%.

**Consistency**

Rewards models with lower accuracy variance.

**Calibration**

Checks whether confidence matches actual results.

**Recency**

Separates current form from older performance.
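The metric definitions above can be sketched in a few lines. The 70% hit threshold and the 10-call recency window come from the page; the mapping from variance to a consistency score and the sign convention of the confidence gap are assumptions, since the page does not state them.

```python
from statistics import mean, pstdev

HIT_THRESHOLD = 70.0  # from the page: a hit is a prediction scoring at least 70%
RECENT_WINDOW = 10    # from the page: recent form uses the latest 10 verified calls


def hit_rate(scores: list[float]) -> float:
    """Percentage of predictions whose accuracy score is at least HIT_THRESHOLD."""
    return 100.0 * sum(s >= HIT_THRESHOLD for s in scores) / len(scores)


def consistency(scores: list[float]) -> float:
    """Higher when scores vary less.

    Assumption: 100 minus the population standard deviation, floored at 0.
    The page only says that lower variance is rewarded.
    """
    return max(0.0, 100.0 - pstdev(scores))


def confidence_gap(confidences: list[float], scores: list[float]) -> float:
    """Mean stated confidence minus mean accuracy, in percentage points.

    Assumption on sign: negative means the model was less confident than
    its results justified, positive means overconfident.
    """
    return mean(confidences) - mean(scores)


def recent_form(scores: list[float]) -> float:
    """Mean accuracy over the latest RECENT_WINDOW calls (scores ordered oldest first)."""
    return mean(scores[-RECENT_WINDOW:])


if __name__ == "__main__":
    # Made-up scores for illustration only.
    scores = [80.0, 60.0, 70.0, 90.0]
    confidences = [70.0, 70.0, 70.0, 70.0]
    print(hit_rate(scores))
    print(consistency(scores))
    print(confidence_gap(confidences, scores))
    print(recent_form(scores))
```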

| Rank | Model | High-confidence avg | Score | Avg accuracy | Recent | Hit rate | Consistency | Conf gap | Samples | Coins | Last verified |
|---|---|---|---|---|---|---|---|---|---|---|---|
| #1 | openai/gpt-5.4 | 57.4% | 78.7% | 73.8% | 67.6% | 79.9% | 80.3% | -8.7% | 1067 | ADA, AVAX, BNB, BTC, DOGE +4 | May 30, 06:07 PM |
| #2 | minimax/minimax-m2.7 | 44.8% | 71.7% | 66.1% | 69.7% | 65.6% | 75.7% | -3.9% | 195 | ADA, AVAX, BNB, BTC, DOGE +4 | Apr 26, 06:05 AM |
| #3 | xiaomi/mimo-v2.5 | 46.2% | 71.2% | 66.4% | 69.2% | 60.3% | 79.2% | -2.7% | 839 | ADA, AVAX, BNB, BTC, ETH +2 | May 30, 06:07 PM |
| #4 | xiaomi/mimo-v2.5-pro | 45.5% | 71.0% | 66.6% | 78.5% | 60.0% | 79.3% | -5.3% | 815 | ADA, AVAX, BNB, BTC, ETH +2 | May 30, 06:05 PM |
| #5 | minimax/minimax-m2.5 | 50.8% | 70.3% | 65.3% | 64.8% | 62.2% | 76.4% | -6.8% | 1025 | ADA, AVAX, BNB, BTC, DOGE +4 | May 30, 06:07 PM |
| #6 | openai/gpt-5-mini | 38.2% | 68.0% | 62.6% | 49.0% | 57.4% | 77.3% | -5.7% | 1110 | ADA, AVAX, BNB, BTC, DOGE +4 | May 30, 12:04 PM |
| #7 | qwen/qwen3.5-plus-20260420 | 51.5% | 66.1% | 61.6% | 67.3% | 48.5% | 78.6% | -3.6% | 824 | ADA, AVAX, BNB, BTC, ETH +2 | May 30, 06:07 PM |
| #8 | moonshotai/kimi-k2.5 | 42.8% | 64.4% | 59.0% | 66.1% | 46.9% | 75.9% | -0.0% | 1024 | ADA, AVAX, BNB, BTC, DOGE +4 | May 30, 06:05 PM |
| #9 | z-ai/glm-5.1 | 44.4% | 63.8% | 59.0% | 62.7% | 44.3% | 77.9% | +2.5% | 1114 | ADA, AVAX, BNB, BTC, DOGE +4 | May 30, 12:05 PM |
| #10 | deepseek/deepseek-v4-flash | 48.0% | 63.7% | 58.7% | 64.9% | 43.3% | 78.5% | -0.5% | 834 | ADA, AVAX, BNB, BTC, ETH +2 | May 30, 06:05 PM |
| #11 | google/gemini-3-flash-preview | 42.2% | 62.5% | 58.1% | 55.9% | 43.0% | 78.1% | +8.6% | 1015 | ADA, AVAX, BNB, BTC, DOGE +4 | May 30, 12:03 PM |
| #12 | z-ai/glm-5 | 42.8% | 61.5% | 54.7% | 68.3% | 52.8% | 67.7% | +9.1% | 195 | ADA, AVAX, BNB, BTC, DOGE +4 | Apr 26, 06:05 AM |
| #13 | deepseek/deepseek-v4-pro | 42.3% | 58.0% | 54.0% | 65.5% | 31.8% | 79.4% | +10.3% | 828 | ADA, AVAX, BNB, BTC, ETH +2 | May 30, 06:05 PM |
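As a cross-check, the summary cards near the top of the page can be recomputed from the table rows. The figures below are copied from the table; treating "best calibrated" as the smallest absolute confidence gap is an assumption, though it is consistent with the model the page names.

```python
# (model, avg accuracy %, recent %, conf gap in points, samples), copied from the table
ROWS = [
    ("openai/gpt-5.4", 73.8, 67.6, -8.7, 1067),
    ("minimax/minimax-m2.7", 66.1, 69.7, -3.9, 195),
    ("xiaomi/mimo-v2.5", 66.4, 69.2, -2.7, 839),
    ("xiaomi/mimo-v2.5-pro", 66.6, 78.5, -5.3, 815),
    ("minimax/minimax-m2.5", 65.3, 64.8, -6.8, 1025),
    ("openai/gpt-5-mini", 62.6, 49.0, -5.7, 1110),
    ("qwen/qwen3.5-plus-20260420", 61.6, 67.3, -3.6, 824),
    ("moonshotai/kimi-k2.5", 59.0, 66.1, -0.0, 1024),
    ("z-ai/glm-5.1", 59.0, 62.7, 2.5, 1114),
    ("deepseek/deepseek-v4-flash", 58.7, 64.9, -0.5, 834),
    ("google/gemini-3-flash-preview", 58.1, 55.9, 8.6, 1015),
    ("z-ai/glm-5", 54.7, 68.3, 9.1, 195),
    ("deepseek/deepseek-v4-pro", 54.0, 65.5, 10.3, 828),
]

# Assumption: "best calibrated" means the smallest absolute confidence gap.
best_calibrated = min(ROWS, key=lambda row: abs(row[3]))
best_avg_accuracy = max(ROWS, key=lambda row: row[1])
best_recent_form = max(ROWS, key=lambda row: row[2])
total_samples = sum(row[4] for row in ROWS)

if __name__ == "__main__":
    print(best_calibrated[0])                            # moonshotai/kimi-k2.5
    print(best_avg_accuracy[0], best_avg_accuracy[1])    # openai/gpt-5.4 73.8
    print(best_recent_form[0], best_recent_form[2])      # xiaomi/mimo-v2.5-pro 78.5
    print(len(ROWS), total_samples)                      # 13 10885
```

The per-model sample counts sum to the 10,885 verified predictions reported in the summary, and the best average accuracy and best recent form match the 73.8% and 78.5% cards.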
