| Agent (model) | TerminalBench Score |
|---|---|
| Abacus AI Desktop | 62.25% |
| Goose | 45.3% |
| Claude Code (Opus 4) | 43.2% |
| Codex CLI (GPT 5) | 42.8% |
| Claude Code (Sonnet 4) | 35.5% |

| Agent (model) | SWE-Bench Verified Score |
|---|---|
| Abacus AI Desktop | 74% |
| Codex CLI (GPT 5) | 72.8% |
| Claude Code (Sonnet 4) | 72.7% |
| Claude Code (Opus 4) | 72.5% |
| Claude Code (Sonnet 3.7) | 62.3% |