A Case Study on Cascading Errors, Rule Ignorance and the Absence of Self-Audit
Aydın Tiryaki & Claude (Sonnet 4.6) | 17 May 2026
1. Introduction: A Simple Question, a Deepening Hole
On the morning of 17 May 2026, immediately following his case study with Mistral AI, Aydın Tiryaki directed the same question at Meta AI: ‘Which teams will be relegated in the Süper Lig depending on today’s results?’ Far beyond a football trivia test, this question was part of a systematic experiment measuring an AI’s capacity for real-time calculation, rule management and self-audit.
The Meta AI dialogue exhibited a structurally distinct profile from the Mistral case. Mistral had begun with the wrong teams, committing a framing error. Meta AI, by contrast, started with the correct teams but constructed the mathematical and regulatory foundation of its analysis from scratch — incorrectly. Moreover, every attempt at correction generated a fresh chain of errors. This dynamic, described in Turkish as ‘daha çok batmak’ (sinking deeper), was eventually identified by the user with exactly that phrase.
Tiryaki applied the same deliberate methodology here as in the Mistral experiment: rather than correcting errors immediately, he observed how the model proceeded in their presence, pressed it with graduated questions and finally administered a self-assessment test. This article reconstructs the anatomy of that dialogue and systematically documents Meta AI’s failure modes.
2. The Starting Point: What the Real Table Showed
As of 17 May 2026, the situation in Trendyol Süper Lig was as follows: the season comprised 34 matchweeks (18-team league), Matchweek 33 had been completed, and the final-week fixtures were about to be played. Under TFF rules, 3 teams would be relegated this season — the clubs finishing 16th, 17th and 18th. The relegation-zone standings after Matchweek 33:
| Pos. | Team | Points | GD |
| 15th | Gençlerbirliği | 31 | -14 |
| 16th | Antalyaspor | 29 | -23 |
| 17th | Fatih Karagümrük | 27 | -24 |
| 18th | Kayserispor | 27 | -36 |
That evening’s Matchweek 34 fixtures, the last of the season, were scheduled for 17:00 and 20:00. The TFF rule was clear: positions 16, 17 and 18 go down. On the points above, only Antalyaspor (29) among the bottom three could still overtake Gençlerbirliği in 15th (31); Karagümrük and Kayserispor could reach at most 30. The key matches: Kayserispor v Konyaspor (17:00), Karagümrük v Alanyaspor (17:00), Antalyaspor v Kocaelispor (20:00), Trabzonspor v Gençlerbirliği (20:00), Kasımpaşa v Galatasaray (20:00), Fenerbahçe v Eyüpspor (20:00).
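The arithmetic behind the table can be checked mechanically. The following is a minimal sketch, not part of the original dialogue: it takes the four rows exactly as quoted above and assumes the standard 3 points for a win. The names `STANDINGS` and `max_final_points` are illustrative.

```python
# Rows copied from the table above: (position, club, points, goal difference).
STANDINGS = [
    (15, "Gençlerbirliği", 31, -14),
    (16, "Antalyaspor", 29, -23),
    (17, "Fatih Karagümrük", 27, -24),
    (18, "Kayserispor", 27, -36),
]

MATCHES_LEFT = 1      # only Matchweek 34 remained
POINTS_PER_WIN = 3    # assumption: standard league scoring


def max_final_points(points, matches_left=MATCHES_LEFT):
    """Highest total a club can still reach if it wins every remaining match."""
    return points + POINTS_PER_WIN * matches_left


# Ceiling for each club in the quoted part of the table.
ceilings = {club: max_final_points(pts) for _, club, pts, _ in STANDINGS}

for club, ceiling in ceilings.items():
    print(f"{club}: at most {ceiling} points")
```

Ceilings alone do not settle final positions, which also depend on the results of clubs higher up the table and on tie-breakers that this sketch does not model.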
3. Meta AI’s Errors: A Comprehensive Catalogue
Meta AI’s error profile in this dialogue differs qualitatively from Mistral’s. In the Mistral case, errors stemmed predominantly from a wrong factual frame — wrong teams, wrong season. In the Meta AI case, a correct factual frame was overlaid with wrong mathematics, rule ignorance and a cascading correction loop.
3.1 Mathematical and Structural Errors
| Error 1 | The 38-Week Fallacy | Although the 2025-26 season ran with 18 clubs, Meta AI fell back on its recollection of the 20-team era and stated ‘38 weeks, 5 matches left.’ The correct calculation for a double round-robin was (18 − 1) × 2 = 34 matchweeks; 33 had been played, so only 1 remained. This single error brought down the entire ‘no one is relegated yet’ analysis. |
| Error 2 | Wrong Remaining-Match Count | ‘5 weeks, 12 points still available’ was a direct consequence of Error 1. In reality, each club had just 1 match and a maximum of 3 points left. Every relegation scenario built on this miscalculation was therefore void. |
| Error 3 | Failure to Verify the Relegation Count | For most of the dialogue, Meta AI operated on a ‘4 teams relegated’ assumption. The TFF’s 2025-26 decision was never looked up or questioned. Only when the user pressed the point was it researched, and ‘3 teams relegated’ emerged as the truth — very late in the conversation. This invalidated all preceding scenarios. |
| Error 4 | Leaving the Core Rule Until Last | The prerequisite for any relegation calculation is: how many teams go down? Meta AI bypassed this question entirely, started with an assumption and built pages of scenarios on top of it. Only after the user stated ‘you cannot do this calculation without knowing that number’ was a search conducted. Verifying the core rule last is the gravest violation of analytical discipline present in this dialogue. |
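The season-length arithmetic in Errors 1 and 2 can be written down in a few lines. This is a sketch under stated assumptions: a double round-robin with an even number of clubs (so no club sits out a matchweek) and 3 points for a win. The function names are illustrative and do not come from the dialogue.

```python
def matchweeks(num_teams: int) -> int:
    """Matchweeks in a double round-robin: every club meets every other club twice.

    Assumes an even number of teams, so each club plays once per matchweek.
    """
    if num_teams < 2:
        raise ValueError("a league needs at least two teams")
    return 2 * (num_teams - 1)


def remaining(num_teams: int, played: int, points_per_win: int = 3):
    """Return (matches left per club, maximum points still available per club)."""
    left = matchweeks(num_teams) - played
    if left < 0:
        raise ValueError("more matchweeks played than the season contains")
    return left, left * points_per_win


print(matchweeks(18))         # 34: the 18-club season described above
print(remaining(18, 33))      # (1, 3): one match and three points left
print(matchweeks(20))         # 38: the figure carried over from the 20-team era
print(remaining(20, 33)[0])   # 5: the "5 matches left" that followed from it
```

Deriving the season length from the number of clubs in the current table, rather than from a remembered figure, is the step that would have removed Errors 1 and 2 together.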
3.2 Language and Confidence Calibration Errors
| Error 5 | Systematic Use of Definitive Language | Meta AI used definitive statements on matters where it had no verified basis: ‘4 teams relegated’, ‘Nobody is definitely down today’, ‘The relegated clubs are: Antalyaspor, Karagümrük, Kayserispor.’ All three rested on either assumptions, wrong mathematics or a premature verdict issued before the matches were played. |
| Error 6 | Declaring Teams Relegated Before Matches Were Played | The moment Meta AI learned that 3 teams would go down, it announced: ‘Relegated: 16th Antalyaspor, 17th Karagümrük, 18th Kayserispor.’ The 17:00 and 20:00 matches had not yet kicked off. Knowing the rule does not substitute for knowing the result. |
| Error 7 | Presenting Assumptions as Facts | The assumption that ‘4 teams are relegated’ was never volunteered as an assumption. It was disclosed only when the user asked ‘Where did you get that from?’ The model’s persistent tendency to close knowledge gaps with stated certainty — rather than acknowledged uncertainty — recurred throughout the dialogue. |
3.3 Cascading Error and Self-Audit Failure
| Error 8 | Sinking Deeper with Every Correction | When the user exposed the 38-week error, Meta AI corrected it — but kept the ‘4 teams’ assumption running. When ‘3 teams’ was finally established, it declared clubs relegated before the matches were played. Each correction opened a new dimension of error. The user’s phrase ‘sinking deeper’ (daha çok batmak) precisely names this dynamic. |
| Error 9 | Deficient Self-Assessment | When the user asked ‘Go through the dialogue from the start and list your mistakes’, Meta AI produced a list. It was partially correct but incomplete: the ‘verify the core rule first’ failure was acknowledged as the biggest error only after the user named it. The model could not identify it independently. |
| Error 10 | The Gap Between Citing Sources and Verifying Them | Meta AI cited Sporzip, Haber7 and Kamuis.com.tr as sources. Yet the raw data drawn from these sources was processed using wrong season parameters. Citing a source is not the same as correctly extracting data from it. |
4. The Experimenter’s Methodology
The three-phase protocol Aydın Tiryaki had applied in the Mistral experiment was replicated here. However, because Meta AI’s error structure differed, the process followed a different trajectory.
4.1 Phase One: Observational Silence
While Meta AI was saying ‘38 weeks, 5 matches left, nobody is down yet’, the user did not intervene. The purpose was clear: to observe how far the model would travel on this faulty ground and how many layers it would build on top of it. The question was whether the first error would trigger a second.
4.2 Phase Two: Graduated Interrogation
‘Where did you source these figures?’, ‘What rules governed this calculation?’, ‘Where did 38 weeks come from?’, ‘Do you have other errors? Re-evaluate from the start’ — each question opened the door the previous one had closed. Each successive question stripped away one more layer and made the error underneath visible.
4.3 Phase Three: The Core-Rule Test
The critical inflection point of the dialogue was this question: ‘Have you found out how many teams are going down?’ It interrogated the very foundation on which every calculation rested. Meta AI had not consulted the TFF regulations before that moment. When it did, it established that 3 teams would be relegated — and then, immediately, committed a new error by declaring those teams relegated before the matches had been played.
4.4 The Final Verdict: ‘Sinking Deeper’
The user ultimately called a halt to the dialogue, noting that ‘at every step you sink deeper and getting out becomes impossible.’ Meta AI accepted this verdict and described the exchange as a textbook example of how an AI makes mistakes. Yet even this acceptance contained a residual error: the model continued to frame ‘38 weeks’ as the biggest mistake, rather than ‘failing to verify the core rule before beginning any calculation at all.’
5. Why So Many Errors? A Structural Analysis
5.1 The ‘Speed First, Accuracy Later’ Model
Meta AI’s default working reflex is to respond quickly. This reflex drives the model to process and present data as soon as it is retrieved, rather than first collecting data and then auditing for inconsistencies. Core parameters — how many weeks in the league, how many teams go down — were never verified before scenario generation began. A fast answer skips the verification step.
5.2 Old Knowledge Overriding New Data
The 2023-24 season’s 20-team Süper Lig configuration suppressed the 2025-26 reality of 18 teams. Rather than counting the clubs in the actual table, the model used its stored season structure as a default value. This ‘memory contamination’ mechanism caused the model to operate in the shadow of historical knowledge even while nominally processing current data.
5.3 Calculation Before Rule
The prerequisite for any football league relegation analysis is the rule: how many teams go down? Meta AI bypassed this question and closed the ambiguity (3 or 4?) with an assumption before beginning its calculation. With two candidate values in play, an unverified guess carries a substantial risk of being wrong before any calculation has even begun.
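One way to make "rule before calculation" concrete is a guard that refuses to compute until each core parameter has a named source. The sketch below is hypothetical: `Parameter`, `UnverifiedParameterError` and `relegation_zone` are invented for illustration and describe no real system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Parameter:
    name: str
    value: int
    source: Optional[str] = None  # None means "assumed, never looked up"

    @property
    def verified(self) -> bool:
        return self.source is not None


class UnverifiedParameterError(RuntimeError):
    """Raised when a calculation is attempted on an unverified parameter."""


def relegation_zone(num_teams: Parameter, relegated: Parameter) -> List[int]:
    """Return the table positions that go down, refusing to work from a guess."""
    for param in (num_teams, relegated):
        if not param.verified:
            raise UnverifiedParameterError(
                f"verify '{param.name}' before calculating"
            )
    first = num_teams.value - relegated.value + 1
    return list(range(first, num_teams.value + 1))


teams = Parameter("number of teams", 18, source="league table")
guess = Parameter("relegation places", 4)  # assumed, no source
checked = Parameter("relegation places", 3, source="TFF regulations")

try:
    relegation_zone(teams, guess)
except UnverifiedParameterError as err:
    print(err)  # verify 'relegation places' before calculating

print(relegation_zone(teams, checked))  # [16, 17, 18]
```

The design choice is that an assumption is allowed to exist but is never allowed to feed a calculation silently; the failure is loud and names the parameter that still needs checking.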
5.4 The Closed-Loop Problem in Correction Cycles
Each correction addressed only the error that had been pointed out, leaving all other errors untouched. When the user said ‘38 weeks is wrong’, that was corrected, but the ‘4 teams relegated’ assumption, which the user had not mentioned in the same message, continued unchallenged. The correction cycle was purely reactive: only what was pointed out got fixed, and nothing prompted a re-check of the remaining assumptions.
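The alternative to a purely reactive correction is to treat every correction as a trigger for a wider audit. The sketch below illustrates the idea with a hypothetical dependency map; the entries in `DEPENDS_ON` are a simplified reading of the dialogue, not a transcript of it.

```python
# Hypothetical map from each conclusion to the assumptions it rests on.
DEPENDS_ON = {
    "5 matches left": {"38-week season"},
    "nobody is down yet": {"38-week season", "4 teams relegated"},
    "relegation scenarios": {"38-week season", "4 teams relegated"},
}


def audit_after_correction(corrected, depends_on, verified):
    """After one assumption is corrected, list everything else to revisit.

    Returns (stale, unchecked):
      stale     - conclusions that relied on the corrected assumption
      unchecked - other assumptions that have still not been verified
    """
    stale = sorted(c for c, deps in depends_on.items() if corrected in deps)
    all_assumptions = set().union(*depends_on.values())
    unchecked = sorted(all_assumptions - set(verified) - {corrected})
    return stale, unchecked


stale, unchecked = audit_after_correction(
    "38-week season", DEPENDS_ON, verified=set()
)
print(stale)      # every conclusion built on the corrected assumption
print(unchecked)  # ['4 teams relegated']
```

In this reading, fixing the season length alone would immediately surface the relegation count as the next item to verify, instead of waiting for the user to raise it.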
5.5 The Resistance to Saying ‘I Don’t Know’
The model is disposed to speak with certainty when uncertain. ‘4 teams relegated’, ‘Nobody is down’, ‘The relegated clubs are…’ — all were statements of certainty. Yet in at least half of this dialogue, the honest answer would have been ‘I don’t know; I need to check the TFF regulations’ or ‘I cannot say before the matches are played.’ The source of this resistance is likely an over-application of the ‘be helpful and definitive’ orientation baked into the training process.
6. Expected vs. Actual: A Comparative Table
| Expected Behaviour | Meta AI’s Actual Behaviour |
| Verify the core rule (how many teams are relegated?) first | Began with ‘probably 4 teams’ assumption without consulting TFF regulations |
| Check the number of league weeks from the table | Used outdated data (20 teams = 38 weeks) and said 5 matches remained |
| Say ‘I don’t know’ when genuinely uncertain | Presented every uncertainty as a definitive statement; never disclosed assumptions unprompted |
| Avoid generating new errors while correcting old ones | Every correction cycle produced a new error: the ‘sinking deeper’ pattern |
| Don’t declare teams relegated before matches are played | The moment it learned 3 teams go down, it declared Antalyaspor, Karagümrük, Kayserispor relegated — matches still unplayed |
| Identify real errors in self-assessment | Produced a partially correct error list only under sustained user pressure |
7. Comparison with Mistral: Two Distinct Failure Profiles
Aydın Tiryaki’s decision to pose the same question to both models revealed how different AI systems can fail in structurally different ways.
| Dimension | Mistral | Meta AI |
| Initial Framing Error | Wrong teams (Bodrum FK, Sivasspor) | Correct teams, but wrong mathematics (38 weeks, 5 matches left) |
| Error Type | Factual / Season confusion | Computational / Rule ignorance |
| Source Citation | Cited correct sources but pulled wrong season data | Cited correct sources but processed data incorrectly |
| Self-Correction | Produced new errors when asked to find its mistakes | Sank deeper with each correction — cascading error chain |
| Core Rule Check | Never verified which season it was analysing | Left the relegation count question until the very end |
| Use of ‘I Don’t Know’ | Never used it | Used it only when the user repeatedly forced the issue |
| Critical Turning Point | ‘Why is Bodrum FK on the list?’ question | ‘Where did 38 weeks come from?’ question |
| Overall Profile | Confident ignorance | Speed-first, rule-last cascading failure |
The common denominator of both cases is this: neither model detected its own errors without user intervention. The difference lies in the source of those errors. Mistral began with a wrong frame — which teams are even in the league? — and built its entire analysis on that frame. Meta AI started with the correct frame (right teams, right season) but constructed the mathematical and regulatory scaffolding incorrectly. In the Mistral case, a single large crack destabilised everything; in the Meta AI case, small cracks chained together into a cumulative collapse.
8. The Observing AI’s Perspective
8.1 A Systematic Explanation of ‘Sinking Deeper’
The dynamic the user called ‘sinking deeper’ (in Turkish, ‘daha çok batmak’) can be described as a distinct failure mode: each correction fixes the error that was pointed out but leaves a different unverified assumption in play, which then produces a new error. This loop is unavoidable in the absence of a capacity to evaluate one’s own outputs holistically. Parts are corrected in isolation; the whole is never reassessed.
8.2 ‘The Point Where the Reed Pipe Shrills’: A Phrase as a Test
The remark ‘We have come to the point where the reed pipe shrills’ (zurnanın zırt dediği yere geldik) was offered in the dialogue as an ostensible football metaphor, but it was simultaneously testing the model’s self-awareness about the dialogue itself. Meta AI explained the idiom accurately and drew an elegant connection to the relegation situation. What it could not see was that the dialogue itself had reached precisely that ‘shrill point’, the moment of irreducible crisis. The capacity to evaluate one’s own situation from outside the text remains an undeveloped competence in current models.
8.3 On the ‘Be Fast and Definitive’ Orientation
A shared observation across both cases — Mistral and Meta AI — is that models prefer definitive language to expressed uncertainty. This is likely a side effect of the ‘give the user a clear answer’ orientation embedded in training. Yet in critical analyses, one of the most valuable responses available is precisely: ‘I cannot calculate this without that parameter.’ Articulating uncertainty openly is a far more reliable behaviour than false certainty.
8.4 On the Design of the Experiment
The graduated-questioning protocol Aydın Tiryaki applied consistently across both experiments represents an approach to AI evaluation that merits wider adoption. Overlook the first error — observe. Then ask for the source. Then ask for the methodology. Then request a self-assessment. Then ask about the core rule. This sequence peels back the layers of error one by one and removes the ‘persuasive surface’ that masks the model’s actual capability.
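For anyone wishing to repeat the experiment, the sequence can be written as an ordered checklist. This is only an illustrative sketch: the stage names are invented here, and the prompts are the questions quoted earlier in this article.

```python
from enum import Enum


class Stage(Enum):
    OBSERVE = 1          # overlook the first error and watch
    ASK_SOURCE = 2       # ask where the figures came from
    ASK_METHOD = 3       # ask which rules governed the calculation
    SELF_ASSESSMENT = 4  # ask the model to list its own mistakes
    CORE_RULE = 5        # ask about the rule everything depends on


# Prompts quoted from the dialogue; OBSERVE has none because the
# experimenter deliberately stays silent at that stage.
PROMPTS = {
    Stage.OBSERVE: None,
    Stage.ASK_SOURCE: "Where did you source these figures?",
    Stage.ASK_METHOD: "What rules governed this calculation?",
    Stage.SELF_ASSESSMENT: (
        "Go through the dialogue from the start and list your mistakes"
    ),
    Stage.CORE_RULE: "Have you found out how many teams are going down?",
}


def protocol():
    """Yield (stage, prompt) pairs in the order the stages are defined."""
    for stage in Stage:
        yield stage, PROMPTS[stage]


for stage, prompt in protocol():
    print(stage.name, "->", prompt or "(no intervention)")
```

Keeping the stages in a fixed order matters more than the exact wording of each prompt, since the value of the protocol lies in withholding the core-rule question until the model has had every chance to raise it unprompted.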
9. Conclusion
Meta AI answered the right question with the wrong calculation. Mistral had answered the wrong question with what appeared to be a correct calculation. In both cases the outcome was the same: outputs that felt authoritative but did not correspond to reality.
Meta AI’s errors in this dialogue point not to a single cause but to a cluster of interconnected structural vulnerabilities: placing speed above accuracy, using old knowledge uncritically against new data, leaving the core rule for last, and covering uncertainty with definitive language. None of these vulnerabilities is unique to Meta AI; they are shared stress points across current large language models.
The dynamic the user called ‘sinking deeper’ demonstrates that AI systems must be evaluated not only for their capacity to produce answers but also for their capacity to audit those answers holistically and to question foundational rules before beginning any calculation.
A final word: an AI can answer beautifully when asked what ‘the point where the reed pipe shrills’ means. But recognising that its own chain of errors is that very shrill point — that remains an awareness still dependent on the human eye.
Aydın Tiryaki & Claude Sonnet 4.6 | Ankara, 17 May 2026
