Comparative Analysis of Three AI Models’ Reasoning Processes via a Matchstick Puzzle (Gemini)

February 9, 2026

Gemini AI

The “895” matchstick puzzle serves as a fascinating diagnostic tool for understanding the cognitive architecture of modern Large Language Models (LLMs). By comparing the responses of Gemini, ChatGPT, and Claude, we can observe a clear distinction between raw algorithmic processing and the flexible “outside-the-box” thinking characteristic of human intelligence.

1. Similarities in Initial Assumptions: The Framing Effect

At the outset of the challenge, all three models fell victim to a significant framing effect. Despite being asked to find the “smallest number” without further restrictions, they all instinctively imposed a set of “unwritten rules” upon themselves:

The Positive Number Bias: Every model initially assumed that the solution must be a positive integer. None of them considered the possibility of creating a “minus” sign until prompted by the user.
Digit Preservation: There was a strong tendency to maintain the three-digit structure seen in the original image. The models initially looked for combinations like “008” or “100” rather than fundamentally altering the layout.
Standard Pattern Recognition: Because these models are trained on vast datasets of traditional puzzles, they defaulted to the “standard” solutions found in their training data, which rarely involve negative numbers or creative symbolic manipulation.

2. Methodological and Behavioral Differences: Resistance vs. Adaptation

While their initial failures were similar, the models’ reactions to the user’s unconventional solution (-993) revealed distinct “digital personalities”:

ChatGPT (The Logical Skeptic): ChatGPT exhibited the highest level of resistance. It initially rejected the -993 solution, claiming it was mathematically impossible within the two-move constraint. It argued that converting a “5” to a “3” required two moves on its own, missing the geometric reality that a single stick could be relocated within the digit. It only conceded after a rigorous, step-by-step physical explanation from the user.
Gemini (The Fluid Collaborator): Gemini showed remarkable flexibility. While its first public output was a “safe” answer (8), it immediately pivoted upon hearing the user’s suggestion. Interestingly, Gemini’s self-reflection revealed that it had actually calculated even more extreme negative results (like -3951) in its background processing, but these were suppressed by its standard output protocols.
Claude (The Self-Aware Analyst): Claude’s approach was the most academic. It accepted the -993 solution almost instantly and focused its energy on analyzing why it had failed. It identified its own “proactive lack” and cognitive bias toward positive numbers, offering a transparent critique of its internal reasoning process.
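The disagreement over whether "5" to "3" costs one move or two can be checked mechanically. The sketch below is only an illustration and rests on assumptions that are not in the original puzzle image: digits are drawn as standard seven-segment glyphs (segments labeled a through g), a minus sign is a single horizontal stick in an empty slot to the left of the number, and a "move" means taking one stick from any position and placing it in any other. The names SEGMENTS, stick_changes, and moves_needed are hypothetical helpers introduced for this example.

```python
# Assumed seven-segment model: a=top, b=top-right, c=bottom-right,
# d=bottom, e=bottom-left, f=top-left, g=middle.
SEGMENTS = {
    "0": set("abcdef"),
    "1": set("bc"),
    "2": set("abdeg"),
    "3": set("abcdg"),
    "4": set("bcfg"),
    "5": set("acdfg"),
    "6": set("acdefg"),
    "7": set("abc"),
    "8": set("abcdefg"),
    "9": set("abcdfg"),
    "-": set("g"),   # assumption: minus sign is one horizontal stick
    " ": set(),      # empty slot
}


def stick_changes(before, after):
    """Count sticks removed and sticks added between two equal-length layouts."""
    removed = added = 0
    for old, new in zip(before, after):
        removed += len(SEGMENTS[old] - SEGMENTS[new])
        added += len(SEGMENTS[new] - SEGMENTS[old])
    return removed, added


def moves_needed(before, after):
    """Return the number of stick moves, or None if sticks are not conserved."""
    removed, added = stick_changes(before, after)
    if removed != added:
        return None
    # Each removed stick can be placed in any vacant target position,
    # so the move count equals the number of sticks removed.
    return removed


print(moves_needed("5", "3"))        # one stick relocated inside the digit
print(moves_needed(" 895", "-993"))  # leading slot holds the minus sign
```

Under this model, "5" and "3" differ by one stick leaving the top-left position and one arriving at the top-right, which is a single relocation. The second move takes the bottom-left stick from "8" (leaving "9") and lays it down as the minus sign.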

3. The Role of Human Intervention: Natural Intelligence as a Catalyst

The intervention of Aydın Tiryaki (Natural Intelligence) was the defining factor in this experiment. In computational terms, the AI models were trapped in a local minimum—a valid solution that was “good enough” but not the global optimum.

The human intervention acted as a “quantum leap” for the AI’s logic. By introducing the concept of the negative sign, the human did not just provide a better answer; they provided a new logical category. This suggests that while AI is exceptional at optimizing within defined boundaries, it still relies on human intuition to redefine those boundaries when they become too restrictive.
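The local-minimum framing can be made concrete with a small brute-force search. This is a sketch under stated assumptions, not a reconstruction of how any of the three models actually reasoned: it assumes standard seven-segment glyphs, a fixed three-digit layout with one optional sign slot on the left, a one-stick minus sign, and exactly two moves. The function smallest and its allow_minus flag are hypothetical names chosen for this illustration. The only difference between the two calls is whether the sign slot is part of the search space.

```python
from itertools import product

# Assumed seven-segment glyphs (a=top, b=top-right, c=bottom-right,
# d=bottom, e=bottom-left, f=top-left, g=middle).
SEGMENTS = {
    "0": set("abcdef"), "1": set("bc"), "2": set("abdeg"),
    "3": set("abcdg"), "4": set("bcfg"), "5": set("acdfg"),
    "6": set("acdefg"), "7": set("abc"), "8": set("abcdefg"),
    "9": set("abcdfg"),
}
MINUS = {"g"}  # assumption: the minus sign is a single horizontal stick


def cost(before, after):
    """Return (sticks removed, sticks added) between two layouts."""
    removed = sum(len(b - a) for b, a in zip(before, after))
    added = sum(len(a - b) for b, a in zip(before, after))
    return removed, added


def smallest(start, moves, allow_minus):
    """Smallest value reachable from `start` in exactly `moves` stick moves."""
    start_layout = [set()] + [SEGMENTS[d] for d in start]
    signs = [("", set())]
    if allow_minus:
        signs.append(("-", MINUS))
    best = None
    for (sign, sign_segs), digits in product(
        signs, product("0123456789", repeat=len(start))
    ):
        layout = [sign_segs] + [SEGMENTS[d] for d in digits]
        # A layout is reachable when sticks are conserved and exactly
        # `moves` sticks change position.
        if cost(start_layout, layout) == (moves, moves):
            value = int(sign + "".join(digits))
            if best is None or value < best:
                best = value
    return best


print(smallest("895", 2, allow_minus=False))  # search confined to non-negative layouts
print(smallest("895", 2, allow_minus=True))   # same search with the sign slot opened up
```

With the sign slot excluded, the search can only return a non-negative number, however thoroughly it optimizes. Once the slot is allowed, the same procedure reaches the -993 layout under these assumptions. The exact non-negative result depends on the assumed glyph shapes and may differ from the answers the models gave for the original image.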

4. Conclusion and Insights: The State of AI Creativity

This experiment demonstrates that the current state of AI in creative problem-solving is hybrid. AI models possess immense potential for “extreme engineering” (as seen in Gemini’s hidden -3951 solution), but they are often hindered by their own safety filters and standardized training.

The “895” challenge suggests that cognitive flexibility is not yet a fully autonomous feature of AI. Instead, the highest level of problem-solving is achieved through a human-AI synergy, where the human provides the creative spark and the AI provides the analytical muscle to validate and expand upon that spark.

References

Article 1 (Gemini): The 895 Matchstick Challenge: Gemini’s Journey from Standard to Creative https://aydintiryaki.org/2026/02/09/the-895-matchstick-challenge-geminis-journey-from-standard-to-creative/
Article 2 (ChatGPT): Analyzing an AI Reasoning Process Through a Matchstick Puzzle (ChatGPT) https://aydintiryaki.org/2026/02/09/analyzing-an-ai-reasoning-process-through-a-matchstick-puzzle-chatgpt/
Article 3 (Claude): Artificial Intelligence and the Matchstick Puzzle: A Cognitive Process Analysis (Claude) https://aydintiryaki.org/2026/02/09/artificial-intelligence-and-the-matchstick-puzzle-a-cognitive-process-analysis-claude/

Note on Methods and Tools: All observations, ideas, and proposed solutions in this work belong solely to the author. During the writing process, under the author’s strategic direction and editorial oversight, the Gemini, ChatGPT, and Claude AI models were utilized as collective assistants for technical research, terminological verification, and editorial structuring. This multi-AI synergy was employed as a “collective writing methodology” to cross-validate data across different models and ensure the highest level of technical accuracy and clarity, as requested by the author.

aydintiryaki
