According to 1M AI News monitoring, the non-profit ARC Prize Foundation, established by Keras creator François Chollet and Zapier co-founder Mike Knoop, has released the ARC-AGI-3 benchmark. Unlike the static grid reasoning tasks of the previous two generations, ARC-AGI-3 is a set of interactive, turn-based environments in which agents operate in a 64×64, 16-color grid world without receiving any instructions or goal hints. They must autonomously explore the environment, infer its rules and victory conditions, build world models, and plan action sequences.
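To make the interaction pattern concrete, here is a minimal Python sketch of a turn-based grid environment of the kind described above. It is not the ARC-AGI-3 API: the class `ToyGridEnv`, its action names, the color values, and the hidden "reach the goal cell" rule are all hypothetical, chosen only to illustrate an agent that receives frames but no instructions.

```python
import random

GRID_SIZE = 64   # side length of the grid, as stated in the article
NUM_COLORS = 16  # cell values range over 0..15


class ToyGridEnv:
    """Hypothetical toy environment; an illustration, not the ARC-AGI-3 API."""

    ACTIONS = ("up", "down", "left", "right")
    _MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.agent = (rng.randrange(GRID_SIZE), rng.randrange(GRID_SIZE))
        # Hidden win condition (assumed): the agent is never told about it.
        self.goal = (rng.randrange(GRID_SIZE), rng.randrange(GRID_SIZE))
        self.steps = 0

    def frame(self):
        """Return the only observation the agent gets: a grid of color indices."""
        grid = [[0] * GRID_SIZE for _ in range(GRID_SIZE)]
        grid[self.goal[0]][self.goal[1]] = 3    # arbitrary color for the goal
        grid[self.agent[0]][self.agent[1]] = 1  # arbitrary color for the agent
        return grid

    def step(self, action):
        """Apply one action, clamp to the grid, and report whether the level is done."""
        dr, dc = self._MOVES[action]
        row = min(max(self.agent[0] + dr, 0), GRID_SIZE - 1)
        col = min(max(self.agent[1] + dc, 0), GRID_SIZE - 1)
        self.agent = (row, col)
        self.steps += 1
        return self.frame(), self.agent == self.goal


def run_random_agent(env, max_steps=10_000, seed=0):
    """Brute-force baseline: act at random and return steps used, or None on failure."""
    rng = random.Random(seed)
    for _ in range(max_steps):
        _, done = env.step(rng.choice(env.ACTIONS))
        if done:
            return env.steps
    return None
```

A random agent like `run_random_agent` can eventually stumble onto the goal, but typically only after many steps. An agent that infers the rule from the frames would need far fewer, which is the gap the benchmark aims to measure.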
Scores are based on an “action efficiency” mechanism: completing the same level in fewer steps yields a higher score, which distinguishes true reasoning ability from brute-force approaches. Each environment has been human-calibrated to ensure that people can pass it on the first attempt. As of release, the leading AI models’ scores are:
The new version was partly motivated by concerns that the previous benchmark had been “contaminated.” The paper notes that Gemini 3 spontaneously used the ARC-AGI integer-to-color mapping (e.g., “3 = green”) in its reasoning chain, even though this mapping was never mentioned in the prompts. This strongly suggests that its training data already covered ARC-AGI tasks extensively. ARC-AGI-3 counters such memorization shortcuts by using interactive environments and requiring agents to discover goals on their own. The ARC Prize 2026 competition offers more than $2 million in total prizes.
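The article does not give the exact formula behind the “action efficiency” score. As a rough illustration only, the sketch below assumes a score equal to the ratio of a human baseline step count to the agent's step count, capped at 1. The function name `action_efficiency` and both parameters are hypothetical.

```python
def action_efficiency(agent_steps, human_baseline_steps):
    """Assumed scoring rule (not the official formula): baseline / agent steps, capped at 1.

    agent_steps is None when the agent failed to complete the level.
    """
    if agent_steps is None:
        return 0.0  # an unfinished level earns nothing under this assumption
    if agent_steps <= 0 or human_baseline_steps <= 0:
        raise ValueError("step counts must be positive")
    return min(1.0, human_baseline_steps / agent_steps)


# An agent that needs four times as many steps as the human baseline
# scores a quarter of the maximum under this assumed rule.
print(action_efficiency(agent_steps=200, human_baseline_steps=50))  # 0.25
```

Under a rule of this shape, a brute-force agent that finishes a level only after exhaustive trial and error is penalized in proportion to the extra actions it spends.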