Coding Benchmarks
Execution-based coding evaluation · 20 prompts · 5 categories
Models: — · Prompts: 20 · Scoring: execution-based
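Execution-based scoring means a model's answer is judged by running it, not by comparing its text to a reference. The page does not show its harness, so the following is only a minimal sketch of the idea: `score_candidate`, its arguments, and the pass-fraction metric are hypothetical names and assumptions, not the benchmark's actual implementation.

```python
def score_candidate(source, func_name, cases):
    """Run generated code against test cases and return the fraction passed.

    Assumption: each prompt expects a function named `func_name`, and each
    case is an (args, expected) pair. A real harness would run untrusted
    code in a sandbox with timeouts; plain exec() is used here only to
    keep the sketch short.
    """
    namespace = {}
    try:
        exec(source, namespace)       # define the candidate's function
        func = namespace[func_name]
    except Exception:
        return 0.0                    # code that fails to load scores zero

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass                      # a runtime error counts as a failed case
    return passed / len(cases) if cases else 0.0


cases = [((1, 2), 3), ((0, 0), 0)]
good = "def add(a, b):\n    return a + b"
buggy = "def add(a, b):\n    return a - b"
print(score_candidate(good, "add", cases))
print(score_candidate(buggy, "add", cases))
```

Scoring each case independently lets a partially correct answer earn partial credit; a stricter harness could require every case to pass.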
Leaderboard
ranked by weighted composite score
P: Pure Functions (×1.0)
Q: Data Manipulation (×1.0)
R: Algorithms (×1.5)
S: Bug Fixing (×1.5)
T: One-Shot Tasks (×2.0)
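The category multipliers above feed the weighted composite score used for ranking. The page does not give the exact formula, so this sketch assumes a weighted average: each category score is multiplied by its weight, and the sum is divided by the total weight (7.0). The function names and the sample model names are hypothetical.

```python
# Category weights as listed in the legend.
WEIGHTS = {"P": 1.0, "Q": 1.0, "R": 1.5, "S": 1.5, "T": 2.0}


def composite_score(category_scores):
    """Weighted average of per-category scores (assumed formula)."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[cat] * category_scores[cat] for cat in WEIGHTS)
    return weighted_sum / total_weight


def rank_models(results):
    """Sort (model, composite) pairs from highest to lowest composite."""
    scored = [(name, composite_score(scores)) for name, scores in results.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical models and scores, for illustration only.
results = {
    "model-a": {"P": 90, "Q": 90, "R": 50, "S": 50, "T": 40},
    "model-b": {"P": 60, "Q": 60, "R": 70, "S": 70, "T": 80},
}
for name, score in rank_models(results):
    print(f"{name}: {score:.1f}")
```

Under this assumption a model that is strong on One-Shot Tasks can outrank one that is stronger on Pure Functions, since category T counts twice as much as P.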
# · Model · Grade · Score · P · Q · R · S · T · TG128 · Size