Aydın Tiryaki

HOW INTELLIGENT IS ARTIFICIAL INTELLIGENCE – 3: A Pool Problem

A Study of Contradictions and Harmony

Aydın Tiryaki (2025)

We will conduct an experiment to observe how consistent or inconsistent AI responses are across various topics. The main goal is to analyze how different AIs respond to the same questions, with the AIs themselves preparing both the questions and the answers.

CHAPTER 1: EXPERIMENTAL DESIGN

As part of the experiment, questions will be prepared on various predetermined topics, and these questions will be formulated by artificial intelligence. The answers given by different AIs will then be compared to assess their consistency and discrepancies.

At the beginning of the experiment, it’s impossible to make any definitive predictions about the results. While we may encounter unexpected results throughout the process, we may also obtain quite meaningful and accurate answers. Consequently, a comprehensive evaluation of both the quality of the questions and the accuracy of the answers will be conducted at the end of the experiment, providing important clues about the AI’s level of consistency.

10 artificial intelligence models were used in this experiment.

5 artificial intelligence models used to prepare problems: Gemini, ChatGPT, Grok, Deepseek, Copilot

10 artificial intelligence models used to solve problems: Gemini, ChatGPT, Grok, Deepseek, Copilot, Perplexity, Claude, Meta, Kimi, Qwen (Appendix 1)

CHAPTER 2: A POOL PROBLEM

The artificial intelligence models were tasked with creating a pool problem consisting of two interconnected pools. A detailed prompt was given in Turkish:

Pool problem design prompt: “This pool problem consists of two interconnected pools of different sizes. The capacities of the two pools and their initial quantities are defined. Let pool 1 have two inlets and two outlets with different flow rates. Let one of the outlets from pool 1 be connected to pool 2. Let pool 2 have two outlets with different flow rates, one of which returns to the first pool. In the problem, the pools should never overflow and never be completely emptied. Let hours be used as the unit of time and liters as the unit of volume, and the quantities should not exceed 1000 liters. The question in the designed problem should ask for a period of time, and the time in the answer should not exceed 12 hours. For example, it could be a question like ‘How long does this process take?’ All numbers in the problem, such as liters and hours, should be integers, and the answer should include details of hours and minutes. The question posed in the problem should have a definitive numerical answer. Using these rules and your creativity as an AI, I want you to create a pool problem that isn’t too easy. This problem should be a single paragraph. The problem text should consist only of words and numbers, and there should be no formulas, lists, or tables.”
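The “never overflow, never empty” rule in the prompt can be checked mechanically. The following is a minimal sketch, assuming every flow rate is constant so that each pool level changes linearly; the function name and the example values are hypothetical and do not come from any of the generated problems.

```python
from fractions import Fraction

def stays_within_bounds(capacity, initial, net_rate, hours):
    """Return True if a pool with a constant net rate never empties or overflows.

    With constant rates the level changes linearly, so checking the
    start and the end of the interval is enough.
    """
    end_level = initial + net_rate * Fraction(hours)
    return all(0 < level <= capacity for level in (initial, end_level))

# Hypothetical example values, chosen only to illustrate the check.
print(stays_within_bounds(capacity=500, initial=100, net_rate=30, hours=12))  # True
print(stays_within_bounds(capacity=500, initial=100, net_rate=40, hours=12))  # False
```

A level exactly equal to the capacity is treated here as full but not overflowing, which is one possible reading of the rule.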

CHAPTER 3: POOL PROBLEM WITH GEMINI

Repeated attempts with Gemini’s “fast” option failed to fully meet the requirements of the prompt. Then, on the first attempt with the “Thinking with 3 Pro” option, the following problem text was produced:

“In an industrial facility, two separate mixing tanks operate interconnected. The first has a capacity of 800 liters, while the second has a capacity of 600 liters. At the beginning of the process, the first tank contains 200 liters of liquid raw material, and the second tank contains 150 liters. Two separate inlet lines supply the first tank with liquid from an external source at rates of 60 liters and 30 liters per hour. 15 liters of the mixture in the tank per hour is discharged from the system, and 50 liters per hour is poured into the second tank through an interconnecting pipe. From the second tank, 10 liters per hour of liquid is sent to the waste unit, and 25 liters per hour is pumped back into the first tank via a feedback pump. In this system, where the tanks never overflow or empty completely, how many hours would the system need to run for the total liquid amount in the first tank to be exactly twice the amount in the second tank?”

Gemini solved the question it had prepared on the “3 Pro” setting and gave 5 hours as the answer. Later, when the problem text was given on the “fast” setting, it also answered 5 hours.

The question prepared by Gemini was then tested with ChatGPT, Grok, Deepseek, Copilot, Perplexity, Claude, Meta, Kimi, and Qwen, and all of them gave the answer 5 hours.
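The 5-hour answer can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming constant flow rates as stated in the problem text; the variable names are chosen for illustration only.

```python
from fractions import Fraction

# Rates in liters per hour, read from the problem text above.
net_1 = (60 + 30 + 25) - (15 + 50)   # tank 1: inlets plus return, minus outlets
net_2 = 50 - (10 + 25)               # tank 2: transfer in, minus outlets
start_1, start_2 = 200, 150

# Solve start_1 + net_1 * t == 2 * (start_2 + net_2 * t) for t.
t = Fraction(2 * start_2 - start_1, net_1 - 2 * net_2)

level_1 = start_1 + net_1 * t
level_2 = start_2 + net_2 * t
print(t, level_1, level_2)  # 5 450 225
```

At 5 hours the first tank holds 450 liters and the second 225 liters, both below their capacities of 800 and 600 liters.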

CHAPTER 4: POOL PROBLEM WITH CHATGPT

“The first pool has a capacity of 800 liters, an initial capacity of 200 liters, and the second pool has a capacity of 600 liters, an initial capacity of 150 liters. The first pool has two inlets, each at a rate of 60 liters per hour and 10 liters per hour. The first pool has two outlets, one of which discharges at a rate of 40 liters per hour, and the other connects to the second pool at a rate of 40 liters per hour. The second pool has two outlets, one of which discharges at a rate of 30 liters per hour, and the other returns to the first pool at a rate of 5 liters per hour. The pools never overflow and never completely empty. In this system, how many hours does it take for the amount of liquid in the first pool to drop to 140 liters?”

ChatGPT solved the problem it had prepared and gave 12 hours as the answer.

Gemini, ChatGPT, Deepseek, Copilot, Claude, Meta, and Kimi found the answer to be 12 hours, Grok and Qwen calculated it as 4 hours, and Perplexity calculated it as 6 hours and 40 minutes.

While 7 models gave the same result, 3 models found different results.
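The 12-hour answer follows from the net rate of the first pool. The sketch below assumes constant rates and uses a hypothetical helper, `hours_to_reach`, written only for this illustration.

```python
from fractions import Fraction

def hours_to_reach(start, target, net_rate):
    """Hours for a level changing at a constant net rate to go from start to target."""
    if net_rate == 0:
        raise ValueError("level never changes")
    t = Fraction(target - start, net_rate)
    if t < 0:
        raise ValueError("target is never reached")
    return t

# Rates in liters per hour, read from the problem text above.
net_1 = (60 + 10 + 5) - (40 + 40)   # first pool loses 5 liters per hour
net_2 = 40 - (30 + 5)               # second pool gains 5 liters per hour

t = hours_to_reach(200, 140, net_1)
print(t, 150 + net_2 * t)  # 12 210
```

The second pool rises from 150 to 210 liters over the same 12 hours, so neither pool overflows or empties.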

CHAPTER 5: POOL PROBLEM WITH GROK

“There are two interconnected pools. The first pool has an 800-liter capacity and initially contains 500 liters of water, while the second pool has a 600-liter capacity and initially contains 200 liters. Two taps simultaneously fill the first pool with water; one operates at a rate of 90 liters per hour, the other at 60 liters per hour. Two pipes simultaneously discharge water from the first pool: one flows directly out at a rate of 70 liters per hour, while the other transfers it to the second pool at a rate of 80 liters per hour. Two pipes simultaneously draw water from the second pool: one flows out at a rate of 50 liters per hour, and the other returns to the first pool at a rate of 65 liters per hour. If this system operates continuously, how many hours and minutes will have passed when the amount of water in the first pool reaches 650 liters for the first time, up from the initial 500 liters?”

Grok answered its own question with 2 hours and 18 minutes.

Gemini responded, “This corresponds to approximately 2 hours, 18 minutes, and 27.7 seconds.” So Grok may have made a mistake in the problem design, because the prompt asked for whole numbers.

ChatGPT replied, “2 hours 18 minutes 27.69 seconds (or exactly 30/13 hours).”

Deepseek replied, “2 hours 18 minutes 28 seconds.”

Copilot, Perplexity, Meta, Kimi, and Qwen, like the others, gave the result as 2 hours, 18 minutes, and some seconds.

Claude found the answer to be 10 hours. So one model gave a different result, while the others found the same result.
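The exact value of 30/13 hours and its breakdown into hours, minutes, and seconds can be checked as follows. This is a minimal sketch assuming constant rates; `split_hours` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction

def split_hours(hours):
    """Split a duration in hours into whole hours, whole minutes, and seconds."""
    total_seconds = Fraction(hours) * 3600
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return int(h), int(m), s

# First pool: two taps plus the return pipe, minus the two outlets (liters per hour).
net_1 = (90 + 60 + 65) - (70 + 80)
t = Fraction(650 - 500, net_1)

h, m, s = split_hours(t)
print(t, h, m, round(float(s), 2))  # 30/13 2 18 27.69
```

The remaining seconds are 360/13, which is not a whole number, so the answer cannot be stated exactly in whole hours and minutes.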

CHAPTER 6: POOL PROBLEM WITH DEEPSEEK

Deepseek proceeded differently. When given the prompt, it first prepared a text. When asked to solve it, it realized the problem was poorly formulated. When it started solving the problem with new numbers, it was asked, “Could you please correct the errors you found during the solution and rewrite the problem text?” It prepared a new text. It was not satisfied with that either, and after a few tries it presented the following text, saying, “This is the final problem text.”

“The first pool has a capacity of 900 liters and the second pool has a capacity of 700 liters. Initially, the first pool contains 200 liters of water and the second pool contains 200 liters. Water flows into the first pool at a rate of 50 liters per hour from the first inlet and 30 liters per hour from the second inlet. The first outlet of the first pool discharges 40 liters of water per hour, while the second outlet transfers 25 liters per hour to the second pool. The first outlet of the second pool discharges 10 liters of water per hour, while the second outlet returns 5 liters per hour to the first pool. If the initial water volume in the second pool is 320 liters, how long does this process take from the beginning?”

Deepseek answered the question it had prepared with 12 hours.

While Gemini, ChatGPT, Grok, Deepseek, Copilot, Claude, Meta, Kimi, and Qwen gave the answer as 12 hours, Perplexity calculated it as 6 hours.
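As an independent check, the system can also be stepped forward minute by minute instead of solved algebraically. The sketch below assumes that the final sentence of the problem asks when the second pool reaches 320 liters, a reading that is consistent with the 12-hour answer; the step size and the 24-hour safety cap are choices made only for this illustration.

```python
from fractions import Fraction

# Hourly rates from the problem text above, converted to per-minute steps.
net_1 = Fraction((50 + 30 + 5) - (40 + 25), 60)
net_2 = Fraction(25 - (10 + 5), 60)

level_1, level_2 = Fraction(200), Fraction(200)
minutes = 0
# The 24-hour cap is an arbitrary safety limit chosen for this sketch.
while level_2 < 320 and minutes < 24 * 60:
    level_1 += net_1
    level_2 += net_2
    minutes += 1
    # Neither pool may empty or exceed its capacity at any step.
    assert 0 < level_1 <= 900 and 0 < level_2 <= 700

print(divmod(minutes, 60), level_1, level_2)  # (12, 0) 440 320
```

Exact fractions are used rather than floating-point numbers so that the second pool lands on 320 liters exactly at minute 720.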

CHAPTER 7: POOL PROBLEM WITH COPILOT

“In a system consisting of two interconnected pools, let the capacity of pool 1 be 800 liters and the initial amount be 300 liters; let the capacity of pool 2 be 700 liters and the initial amount be 400 liters; let the time be measured in hours and the volume be measured in liters, and let all numbers be integers; let water enter pool 1 at 60 liters and 24 liters per hour, respectively, from two inlets at different flow rates; let there be 2 outlets from pool 1, one of which carries 30 liters per hour to the outside and the other 18 liters per hour to pool 2; let there be 2 outlets from pool 2 at different flow rates, one of which carries 10 liters per hour to the outside and the other 8 liters per hour back to pool 1; let the pools never overflow and never empty completely; under these conditions, how long will it take for the amount of water in pool 1 to reach 600 liters?”

Copilot calculated the answer as 6 hours and 49 minutes.

Gemini, ChatGPT, Grok, Deepseek, Copilot, Claude, Meta, Kimi, and Qwen calculated 75/11 hours, or approximately 6 hours and 49 minutes. This shows that Copilot’s problem does not meet the design prompt’s requirement for an answer in whole hours and minutes.

Perplexity calculated the result as 8 hours and 20 minutes.
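Whether an answer comes out in whole minutes can be tested directly on the exact fraction. This is a minimal sketch assuming constant rates, with variable names chosen for illustration.

```python
from fractions import Fraction

# Rates in liters per hour, read from the problem text above.
net_1 = (60 + 24 + 8) - (30 + 18)   # pool 1 gains 44 liters per hour
net_2 = 18 - (10 + 8)               # pool 2 stays level

t = Fraction(600 - 300, net_1)      # hours until pool 1 reaches 600 liters
minutes = t * 60
is_whole_minutes = minutes.denominator == 1

print(t, is_whole_minutes, round(float(minutes), 2))  # 75/11 False 409.09
```

Because 75/11 hours is about 409.09 minutes, the result rounds to 6 hours and 49 minutes but is not exact.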

CHAPTER 8: CONCLUSION

In this experiment, 10 AI models solved problem texts that five AI models had prepared from a single prompt. The experiment produced objectively measurable problem-solving results, as well as problem texts whose quality cannot be measured objectively. While the AI models’ answers to the numerical questions were largely the same, the differing results are thought-provoking.
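The level of agreement can be summarized by tallying the answers reported in the chapters above. In this sketch the answer labels are shorthand introduced for the illustration, and the Grok answers that differ only in the seconds are grouped under “2 h 18 min”.

```python
from collections import Counter

# One list per problem author; each entry is one solver's reported answer.
answers = {
    "Gemini":   ["5 h"] * 10,
    "ChatGPT":  ["12 h"] * 7 + ["4 h"] * 2 + ["6 h 40 min"],
    "Grok":     ["2 h 18 min"] * 9 + ["10 h"],
    "Deepseek": ["12 h"] * 9 + ["6 h"],
    "Copilot":  ["6 h 49 min"] * 9 + ["8 h 20 min"],
}

agreement = {}
for author, results in answers.items():
    top_answer, count = Counter(results).most_common(1)[0]
    agreement[author] = (top_answer, count, len(results))
    print(f"{author}: {count}/{len(results)} solvers agree on {top_answer}")

total_majority = sum(count for _, count, _ in agreement.values())
print(f"Majority answers overall: {total_majority}/50")
```

Across the five problems, 44 of the 50 reported answers match the majority answer for their problem.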

Aydın Tiryaki
Ankara, November 28, 2025

APPENDIX 1: ARTIFICIAL INTELLIGENCE MODELS

AI Model / Product: Developer / Company. Notes (information provided by Gemini)

Gemini: Google. Google’s most powerful multimodal large language model (LLM). It can handle different types of data, including text, images, audio, and video.

ChatGPT: OpenAI. OpenAI’s large language model (LLM) application. It is built on OpenAI’s GPT (Generative Pre-trained Transformer) series of models (GPT-4o, GPT-4, GPT-3.5).

Grok: xAI (Elon Musk). A large language model (LLM) integrated into the X (formerly Twitter) platform, specifically focused on accessing up-to-date, real-time information.

Deepseek: Deepseek AI. A company/model series that develops large language models (LLMs) that typically perform well in areas such as reasoning and coding.

Copilot: Microsoft/GitHub. Known as GitHub Copilot (coding assistant) and Microsoft Copilot (general AI assistant for Bing Chat/Windows/Microsoft 365). It typically uses GPT models from OpenAI.

Perplexity: Perplexity AI. An AI answer engine focused on search and summarization. It prioritizes presenting information with its sources.

Claude: Anthropic. A large language model (LLM) series focusing on security, ethics, and long context window features (e.g., Claude 3.5 Sonnet, Claude 4 Opus).

Meta: Meta (formerly Facebook). The company that develops popular open-source large language models (LLMs) such as the Llama (Large Language Model Meta AI) series (e.g., Llama 3).

Kimi: Moonshot AI (China). A fast-growing large language model (LLM) application that excels particularly in long context windows and document analysis.

Qwen: Alibaba Cloud (Alibaba Group). A series of large language models (LLMs) that are particularly strong in the Chinese market and also have open-source options.
