Beige chess knight model on the left casting a shadow, next to a beige notebook page with a black hand-drawn maze against a blue textured background.

Reasoning Is Not Intelligence

The new generation of AI models produces longer inference chains. That this constitutes reasoning in any philosophically meaningful sense is a claim that has not been defended — and the conflation has consequences.

by Martynas Kasiulis
April 22, 2026
in Tech

The word “reasoning” acquired a precise technical meaning in AI discourse sometime around late 2024, when OpenAI released its o1 model and described it as capable of reasoning through complex problems by generating extended chains of thought before producing a response. The description was adopted rapidly — by researchers, by journalists, by the models’ developers in their own documentation — and with it came a set of implications that were rarely made explicit.

The argument that follows is narrow and precise. It is not that current AI systems are unintelligent, or incapable, or that their outputs lack value. It is that the word “reasoning” is being used in a way that conflates two things that are genuinely distinct, and that the conflation has material consequences.


WHAT THE MODELS ACTUALLY DO

Large language models trained with reinforcement learning on reasoning-like text demonstrably produce longer, more structured, and in many benchmarked cases more accurate outputs on well-defined problem classes. On ARC-AGI, a benchmark designed by François Chollet specifically to require genuine abstraction rather than pattern completion from training data, frontier models have shown substantial recent improvement. These are real improvements, and they should be reported accurately.

What they are not, in any philosophically defensible sense, is reasoning. The models produce token sequences that, statistically, have the form of reasoning — premise, inference step, conclusion. Whether anything occurring between input and output corresponds to a cognitive process that warrants the term is a question that cannot be answered by pointing at the output.


THE CHOLLET PROBLEM

This is not a new objection. Chollet, whose ARC-AGI benchmark has been influential precisely because it was designed to resist memorisation, has made the distinction clearly: a system that retrieves and recombines patterns from its training distribution is doing something categorically different from a system that encounters a genuinely novel problem and forms an abstract representation of it.

The distinction between retrieving a pattern and forming an abstraction is not a difference of degree. It is a difference of kind. Extended inference chains do not bridge that gap by being longer.

The new reasoning models are better at ARC-AGI. They are still, as Chollet and others have noted, exploiting a mechanism different from the one the benchmark was designed to isolate. The improvements appear to come from extended search over possible response structures — a process more analogous to exhaustive tree search than to the flexible abstraction humans deploy when encountering a genuinely novel problem.
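The difference between searching a fixed space and forming an abstraction can be made concrete with a toy sketch. The program below is a hypothetical illustration, not a description of any deployed system: it "solves" small ARC-style grid tasks by exhaustively trying compositions of transformations from a fixed library until one fits the training examples. Within the library's reach it looks impressively general; outside it, no amount of additional search helps, because the searcher never forms a new abstraction.

```python
from itertools import product

# A fixed library of grid transformations. Everything the searcher can ever
# "discover" is a composition of these; nothing outside it is reachable.
def rotate90(g):  return tuple(zip(*g[::-1]))
def flip_h(g):    return tuple(row[::-1] for row in g)
def flip_v(g):    return g[::-1]
def identity(g):  return g

LIBRARY = [identity, rotate90, flip_h, flip_v]

def solve(train_pairs, test_input, max_depth=3):
    """Exhaustive search over compositions of library transformations.
    Returns the first composition consistent with every training pair,
    applied to the test input, or None if the library cannot express it."""
    for depth in range(1, max_depth + 1):
        for ops in product(LIBRARY, repeat=depth):
            def apply(g, ops=ops):
                for op in ops:
                    g = op(g)
                return g
            if all(apply(x) == y for x, y in train_pairs):
                return apply(test_input)
    return None  # the task needed an abstraction the library lacks

# A task the library covers: "rotate the grid 180 degrees".
train = [(((1, 2), (3, 4)), ((4, 3), (2, 1)))]
print(solve(train, ((5, 6), (7, 8))))  # ((8, 7), (6, 5))
```

A task as simple as "increment every cell value" returns None here forever, however deep the search runs: the gap is in the library, not the search budget.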

A separate concern is raised by recent work on chain-of-thought faithfulness: the visible “reasoning” tokens produced by these models may not accurately represent the computational process generating the final answer. The chain of thought may be post-hoc rationalisation rather than causal explanation — a plausible-looking narrative constructed alongside, rather than producing, the output.


WHY THE CONFLATION MATTERS

When capability demonstrations are described as “reasoning,” several things follow in the broader discourse. Research resources flow toward extended-inference architectures on the assumption that more steps of reasoning-like processing will produce more intelligence-like results. Investment narratives are built on the premise that systems capable of “reasoning” are qualitatively different from — and on a path toward — systems capable of genuine open-ended problem-solving.

Policy and regulatory attention is directed at systems whose capabilities are described in these terms. The EU AI Act’s high-risk categories are partly calibrated to capability levels. If those levels are described in language that overstates what the systems are doing, the regulatory response is calibrated to a threat model that does not match the actual technology — likely in both directions: overreacting to capabilities that are real but do not amount to intelligence, and underreacting to capabilities that are real but are never labelled reasoning.


THE PRECISE CLAIM

Extended-inference systems produce better outputs on structured tasks because they search over more possible completion paths and select for internally consistent ones. This is a genuine capability improvement with practical value. It is not reasoning in the sense that requires representation, abstraction, and genuine novelty — and there is no evidence that it is on a trajectory toward those properties simply by increasing scale or chain length.
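The mechanism stated here — sampling many completion paths and selecting for internal consistency — resembles self-consistency decoding: sample N chains, extract each chain's final answer, and return the majority answer. The sketch below is a hypothetical illustration with a stubbed sampler standing in for a model, not any vendor's actual inference pipeline; it shows why more samples improve accuracy without changing the kind of process involved.

```python
import random
from collections import Counter

def sample_chain(problem, rng):
    """Stub standing in for one stochastic chain-of-thought sample.
    Returns (chain_text, final_answer). This toy 'model' reaches the
    right answer 70% of the time and a scattered wrong one otherwise."""
    if rng.random() < 0.7:
        return ("step 1 ... step 2 ... therefore 42", 42)
    wrong = rng.choice([41, 40, 24])
    return (f"step 1 ... therefore {wrong}", wrong)

def self_consistent_answer(problem, n_samples=25, seed=0):
    """Sample n chains and return the majority final answer.
    More samples buy more agreement among completions — a selection
    effect over outputs, not a different cognitive process."""
    rng = random.Random(seed)
    answers = [sample_chain(problem, rng)[1] for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("toy arithmetic problem"))  # 42
```

Raising `n_samples` sharpens the majority vote; it never supplies an answer that no individual sampled chain could produce, which is the substance of the claim above.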

Calling it reasoning is not a lie. It is a category error. The difference between those two charges is the difference between dishonesty and imprecision. Both have consequences. The imprecision, here, may be the more consequential of the two.


SOURCES

1.  ARC Prize Foundation. ARC-AGI-2 benchmark and leaderboard. arcprize.org
2.  Chollet, F. “On the Measure of Intelligence.” arXiv:1911.01547, 2019. arxiv.org/abs/1911.01547
3.  Turpin, M., et al. “Language Models Don’t Always Say What They Think.” arXiv:2305.04388, 2023. arxiv.org/abs/2305.04388

© 2026 VeyrZest - Premium Lifestyle Magazine. Website by Digibru.