Tool-Calling Model Comparison

Comparison of tool-calling benchmark results (Tau2-Bench, Terminal Bench Hard, and BFCL v4) for GPT-OSS 20B, GPT-OSS 120B, HyperNova 60B 2512, and HyperNova 60B 2602.

Format

PNG

Source

Multiverse Computing
