Kitchen Rush Introduced — A Benchmark for Real-Time LLMs

The new Kitchen Rush benchmark evaluates the ability of LLMs to call tools under time constraints using game mechanics.

Compiled by Sergey Kostenchuk | Published 2026-06-16 | Updated 2026-06-16

2026-06-16 Coding

Source: Show HN: Kitchen Rush, Overcooked inspired LLM tool calling benchmark

🛠 Kitchen Rush — A Benchmark for Evaluating LLMs in Real-Time

Kitchen Rush is a new benchmark that evaluates how well LLMs call tools under time constraints. Unlike in static tests, latency here directly affects whether a task succeeds.
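To make the latency point concrete, here is a minimal sketch of how response time alone can turn a solvable task into a failed one. It is not taken from the Kitchen Rush code: the `Order` class, the `completed_orders` function, the step counts, and the deadlines are all hypothetical, and the clock is simulated rather than measured.

```python
from dataclasses import dataclass


@dataclass
class Order:
    name: str
    steps: int         # tool calls needed to finish the order (hypothetical)
    deadline_s: float  # simulated seconds before the order expires (hypothetical)


def completed_orders(orders, latency_s):
    """Return the names of orders finished before their deadline.

    Orders are handled one after another on a simulated clock, and each
    tool call is assumed to cost exactly `latency_s` seconds.
    """
    clock = 0.0
    done = []
    for order in orders:
        finish = clock + order.steps * latency_s
        if finish <= order.deadline_s:
            done.append(order.name)
            clock = finish
        # An order that cannot be finished in time is skipped
        # without spending any simulated time on it.
    return done


orders = [
    Order("salad", steps=2, deadline_s=5.0),
    Order("soup", steps=3, deadline_s=12.0),
    Order("burger", steps=4, deadline_s=20.0),
]

fast = completed_orders(orders, latency_s=1.0)  # all three orders
slow = completed_orders(orders, latency_s=3.0)  # only "soup"
print(fast, slow)
```

Both simulated models issue the same, correct sequence of tool calls. The only difference is the per-call latency, which is the kind of effect a static test would not capture.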

🌍 This makes it possible to evaluate models for real-time systems (assistants, agents) where speed is critical.

👤 It helps in selecting models suitable for live interaction.

Source 1: https://github.com/bassimeledath/kitchen-rush
