Sunday, August 17, 2025

ChatGPT beats Grok in AI chess final, Gemini finishes third, Elon Musk says…

Published on: Aug 10, 2025 12:22 pm IST

Grok 4 led early in the competition but faltered in the final match against ChatGPT o3.

OpenAI’s ChatGPT o3 model defeated Grok 4, the model from Elon Musk’s xAI, in the final of a Kaggle-hosted tournament that set out to find the strongest chess-playing large language model (LLM). The event, held over three days, pitted general-purpose LLMs from several companies against each other rather than specialised chess engines.

Elon Musk downplayed the defeat, saying Grok’s earlier strong results were a “side effect”. (AP)

Tournament format and participants

Eight models took part, including entries from OpenAI, xAI, Google, Anthropic and Chinese developers DeepSeek and Moonshot AI. The competition used standard chess rules but tested multi-purpose LLMs, systems that are not specifically optimised for chess play. BBC coverage of the event noted that Google’s Gemini finished third after beating another OpenAI entry.

Final and key moments

Grok 4 led early in the competition but faltered in the final match against o3. Commentators and observers highlighted several tactical errors by Grok 4, including repeated queen losses, which swung the match in o3’s favour. Chess.com writer Pedro Pinhata said: “Up until the semi finals, it seemed like nothing would be able to stop Grok 4,” but added that Grok’s play “collapsed under pressure” on the last day. Grandmaster Hikaru Nakamura, who commentated live, noted: “Grok made so many errors in these games, but OpenAI didn’t.”

Responses and wider context

Elon Musk downplayed the defeat, saying Grok’s earlier strong results were a “side effect” and that xAI had “spent almost no effort on chess.” The result adds a public dimension to the rivalry between Musk’s xAI and OpenAI, both founded by people who once worked together at OpenAI.

Chess has long been used to measure AI progress. Past milestones include specialised systems such as DeepMind’s AlphaGo, which defeated top human players in the game of Go. This Kaggle tournament differs by testing general LLMs on strategic, sequential tasks rather than using dedicated chess engines.

What it means

The result shows variability in how LLMs handle structured, adversarial tasks like chess. While o3’s performance suggests some LLMs can sustain strategic play under tournament conditions, Grok 4’s collapse illustrates that results can still be inconsistent. Organisers and commentators are likely to continue using chess and similar tasks to probe reasoning, planning and robustness in large language models as the field evolves.
