Humans complete all levels, top AI score is 0.37%: ARC-AGI-3 tests agent intelligence with "unknown games"

BlockBeatNews

According to 1M AI News monitoring, the ARC Prize Foundation, a non-profit established by Keras founder François Chollet and Zapier co-founder Mike Knoop, has released the ARC-AGI-3 benchmark. Unlike the previous two generations, which consisted of static grid reasoning tasks, ARC-AGI-3 is a set of interactive, turn-based environments. Agents operate in a 64×64, 16-color grid world without receiving any instructions or goal hints. They must autonomously explore the environment, infer its rules and victory conditions, build a world model, and plan action sequences.
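To make the setup concrete, here is a minimal Python sketch of that kind of interaction loop. It is a toy under stated assumptions, not the ARC-AGI-3 interface: the class `ToyGridEnv`, the four movement actions, the color codes, and the "reach the odd-colored cell" win condition are all hypothetical. Only the 64×64 size and the 16-color palette come from the article.

```python
import random

GRID_SIZE = 64    # from the article: 64x64 grid
NUM_COLORS = 16   # from the article: cell values would range over 0..15

# Hypothetical movement actions; the real action set is not described here.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class ToyGridEnv:
    """Hypothetical stand-in environment, NOT the ARC-AGI-3 API."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.agent = (0, 0)
        # Hidden win condition: the agent is never told this cell matters.
        self.goal = (rng.randrange(GRID_SIZE), rng.randrange(GRID_SIZE))

    def observe(self):
        """Return the grid as a list of rows of integer color codes."""
        grid = [[0] * GRID_SIZE for _ in range(GRID_SIZE)]
        grid[self.goal[0]][self.goal[1]] = 3    # arbitrary color for the goal
        grid[self.agent[0]][self.agent[1]] = 1  # arbitrary color for the agent
        return grid

    def step(self, action):
        """Apply one action; return (observation, level_completed)."""
        dr, dc = MOVES[action]
        r = min(max(self.agent[0] + dr, 0), GRID_SIZE - 1)
        c = min(max(self.agent[1] + dc, 0), GRID_SIZE - 1)
        self.agent = (r, c)
        return self.observe(), self.agent == self.goal


def greedy_policy(obs):
    """Act on one hypothesis: the odd-colored cell (3) is worth reaching."""
    agent = goal = None
    for r, row in enumerate(obs):
        for c, value in enumerate(row):
            if value == 1:
                agent = (r, c)
            elif value == 3:
                goal = (r, c)
    if goal is None:
        return "right"  # nothing interesting visible: keep exploring
    if goal[0] != agent[0]:
        return "down" if goal[0] > agent[0] else "up"
    return "right" if goal[1] > agent[1] else "left"


def run_episode(env, policy, max_steps=10_000):
    """Return the number of actions used, or None if the level was not solved."""
    obs = env.observe()
    for steps in range(1, max_steps + 1):
        obs, done = env.step(policy(obs))
        if done:
            return steps
    return None
```

Calling `run_episode(ToyGridEnv(), greedy_policy)` returns the number of actions the agent needed. In this toy the correct hypothesis is hard-coded into the policy, whereas a benchmark of the kind described expects the agent to form and test such hypotheses on its own.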

Scores are based on an “action efficiency” mechanism: the fewer steps an agent needs to complete a level, the higher its score, which separates genuine reasoning ability from brute-force trial and error. Each environment has been human-calibrated to ensure that humans can pass it on the first attempt. As of release, the leading AI models’ scores are:

  1. Google Gemini 3.1 Pro Preview: 0.37%
  2. OpenAI GPT 5.4 (High): 0.26%
  3. Anthropic Opus 4.6 (Max): 0.25%
  4. xAI Grok-4.20 (Beta): 0.00%
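The “action efficiency” idea behind these scores can be illustrated with a short sketch. The article does not give the official formula, so the function below is an assumption made purely for illustration: it compares an agent's action count with a hypothetical human baseline and caps the result at 1.

```python
def action_efficiency(agent_actions, human_baseline_actions):
    """Illustrative score in [0, 1]; NOT the official ARC-AGI-3 formula.

    agent_actions: actions the agent used, or None if it never finished.
    human_baseline_actions: assumed human action count for the same level.
    """
    if agent_actions is None:
        return 0.0  # an unsolved level earns nothing
    if agent_actions <= 0 or human_baseline_actions <= 0:
        raise ValueError("action counts must be positive")
    # Fewer actions relative to the baseline gives a higher score, capped at 1.
    return min(1.0, human_baseline_actions / agent_actions)
```

Under this assumed formula, an agent that needs 100 actions on a level a human solves in 50 scores 0.5, and an agent that wanders until its step budget runs out scores 0, which is how an efficiency-based metric penalizes brute-force exploration.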

The new version was partly motivated by concerns that the previous benchmarks had been “contaminated.” The paper notes that Gemini 3 spontaneously used the ARC-AGI integer-to-color mapping (e.g., “3 = green”) in its reasoning chain, even though this mapping was never mentioned in the prompts. This strongly suggests that its training data already covered ARC-AGI tasks extensively. ARC-AGI-3 counters such memorization shortcuts by relying on interactive environments in which agents must discover the goals on their own. The ARC Prize 2026 competition offers total prizes exceeding $2 million.
