Why Claude AI wants this man in prison
Claude AI reportedly reversed two Dutch legal phrases in a high-stakes translation. Here is what appears to have gone wrong and how to verify AI legal output.

Claude allegedly flipped the meaning of two Dutch legal phrases. That may sound like a small translation problem, but in a legal context, a single reversed negation can change the entire meaning of a document.
Dries Van Langenhove, a Belgian remigration activist and anti-establishment political activist, says Claude failed a simple Dutch-to-English legal translation in a way that made the text say the opposite of what it meant. In a post on X, he said he gave Claude a paragraph from a Dutch legal document and asked for a translation. According to his account, Claude turned a sentence meaning there was no indication of guilt into a claim that strong evidence existed against him, then turned “no reason to prosecute” into “no reason not to prosecute.”
That is the kind of AI error that looks small until you imagine it happening in a court filing, a journalist’s source document, a policy memo, or a private legal dispute. Translations like these can be grammatically smooth, legally polished, and completely wrong where they matter most.
The underlying legal document was not included with the public post, so the exact translation failure cannot be independently reproduced from the screenshot alone. But the reported failure pattern fits several known AI weaknesses: hallucination, negation failure, translation drift, model priors, and the opacity of hosted AI systems.
Quick takeaways
The reported error fits several documented AI failure modes, including hallucination, negation failure, translation hallucination, and model priors overriding source text.
There is no public evidence that Claude was specifically programmed to invert this translation because of Van Langenhove’s politics.
The more serious issue is that a fluent hosted model can silently change legal meaning, then explain itself afterward in a way that sounds convincing.
For legal translation, AI output should be treated as a draft with a source-aligned audit trail, never as a final authority.
Local AI alternatives help with privacy, repeatability, and account independence, but they do not eliminate hallucinations.
Who Dries Van Langenhove is, and why the case is getting attention
Readers outside Belgium may not know Van Langenhove. He is a Belgian remigration activist, and his Instagram profile presents him to followers in that political context.
That matters because the alleged translation error did not happen in a neutral school exercise. It involved a legal document, a politically active public figure, and an AI system whose output could influence how non-Dutch speakers understand the document.
Van Langenhove says he gave Claude a single paragraph from a Dutch legal text and asked for a translation. He says the only extra context was that the translation mattered for a European Parliament presentation. According to his post, Claude inverted the meaning of two key phrases.
The first phrase, “geen aanwijzing bestaat van schuld,” means that no indication of guilt exists. Claude allegedly rendered it as overwhelming evidence existing against him. The second phrase, “geen reden is tot vervolging,” means there is no reason for prosecution. Claude allegedly added the opposite force by inserting “not.”
That kind of mistake is dangerous because it’s not readily apparent that there is a mistake at all. The translation may look perfectly fluent and coherent. A reader who does not know Dutch may never notice that the English version has reversed what the source originally claimed.
The screenshot also appears to show Claude acknowledging the mistake afterward. That is useful as a product artifact, but it should not be treated as a forensic explanation of what happened inside the model. Anthropic’s own extended thinking documentation describes thinking features as offering varying levels of transparency into reasoning behavior. In other words, a displayed explanation can help users debug a mistake, but it is not guaranteed to be a complete record of the internal cause.
What the translation needed to preserve
The core issue is negation.
A legal translator must preserve who is accusing whom, what the document says, and whether the sentence affirms or denies a legal basis. The difference between “no indication of guilt” and “evidence of guilt” is not a mere stylistic issue. It is the meaning of the sentence.
The same is true for “no reason to prosecute” and “no reason not to prosecute.” In ordinary writing, that would be a bad error. In a legal document, it can alter the reader’s understanding of whether a case is being dismissed, supported, questioned, or escalated.
This is why AI translation failures deserve more attention than awkward wording. A model can preserve grammar, tone, and legal style while corrupting the core claim the reader actually needs.
Legal translation is unforgiving because the smallest words often carry the highest stakes. “No,” “not,” “unless,” “except,” “without,” and “failed to” can decide whether a sentence says someone did something, did not do something, may have done something, or should not be accused of doing something.
A fluent model can fail in exactly that spot.
The best explanation is a stack of known AI failure modes
The strongest evidence-based explanation is not that Claude was hard-coded to target Van Langenhove personally. The more likely explanation is that the task combined several known large language model weaknesses in one high-stakes place.
Claude can hallucinate. Anthropic says so directly. The company’s Claude Help Center warns that Claude can produce incorrect or misleading responses, and that users should avoid relying on it as a single source of truth for high-stakes advice.
Anthropic’s consumer terms go further. They warn that outputs can contain material inaccuracies even when they appear accurate because of detail or specificity. That warning matters here because legal translation often feels trustworthy when it sounds precise.
The reported failure also looks like a classic negation problem. The model appears to have mishandled “geen,” the Dutch word for “no” or “not any,” then produced an English version that reversed the legal meaning.
That is not an obscure edge case. Research on machine translation has found that negation can significantly reduce translation quality, with some translation directions showing quality reductions of more than 60 percent. The paper “It’s not a Non-Issue: Negation as a Source of Error in Machine Translation” focused directly on this problem.
More recent research on large language models reaches a similar conclusion. The paper “Negation: A Pink Elephant in the Large Language Models’ Room?” describes negation as a persistent reliability challenge. The authors note that LLMs can struggle to distinguish facts from their negations, misunderstand negative particles, and fail to handle negation robustly even after instruction tuning.
That matches the reported Claude failure pattern. The model did not merely choose a weak synonym. It appears to have lost the logical polarity of the sentence.
Translation hallucinations are especially hard to catch
Large translation models can produce fluent text that departs from the source. A paper on hallucinations in large multilingual translation models warned that hallucinated translations can undermine trust and create safety concerns when these systems are deployed in real-world settings.
Legal translation raises the stakes even higher. A 2024 article in the International Journal of Language & Law on applying large language models in legal translation notes that specialized translation remains difficult to automate, especially when terminology and legal precision matter.
That is the trap. Claude may be impressive at many translation tasks. It may produce a clean, natural English paragraph. It may even use legal vocabulary better than a casual bilingual speaker.
Still, none of that makes it a certified legal translator.
A bad machine translation is easiest to notice when it sounds broken. The more dangerous version sounds polished. The reader does not see missing source alignment. He is only presented with confident English and assumes the model must have preserved the meaning.
Training data may have influenced the output
The most sensitive part of the analysis is whether Van Langenhove’s identity mattered.
There is no available evidence that Claude was instructed or hard-coded to intentionally mistranslate legal text against him, so we’ll avoid speculation on that front.
Yet an alternative explanation may be even more damning: a model may carry prior associations around a public figure’s name. Van Langenhove is an anti-establishment political activist and could be considered a controversial figure. As such, his name appears broadly in political commentary, social media posts, legal reporting, and argument-heavy online material. A model trained on large amounts of internet text may have encountered those associations.
That does not prove conscious intent on the model’s part or on the part of Anthropic’s developers. The model may instead have followed a plausible path to “complete” a narrative present in its training data, overriding the instruction to translate only the words in front of it.
Research on context-memory conflicts in large language models shows that models can fail to update their answers when provided context conflicts with their internal knowledge. Another paper on the interplay between parametric and contextual knowledge summarizes the problem plainly: models often need to integrate provided context with knowledge stored in their weights, but they can ignore context when it conflicts with what the model has already learned.
That gives a better explanation than “Claude hates this person.” If training leaves the model with learned associations around someone’s name that point toward accusation, prosecution, or legal conflict, those associations could pull a translation toward the expected narrative. From the outside, a result like that may look like the model itself is ideologically biased. And, in a way, it is. Cases like these may show that heavy ideological bias in a model’s training data, context handling, and alignment layers can mechanically produce statistical failure modes.
The practical problem is the same either way. The user receives a fluent legal translation that may have been influenced by narratives in its training data: information outside the provided snippet itself.

Alignment shapes Claude, but did it cause this error?
Anthropic has been unusually public about the idea that Claude is shaped by a model constitution. In its post on Claude’s new constitution, the company says the constitution is part of the training process and directly shapes Claude’s behavior.
That matters because Claude is not a neutral dictionary. It is a trained, aligned, policy-shaped system. Its behavior is governed by model weights, training data, human feedback, system instructions, safety policy, product updates, and deployment choices.
Those layers can make the model safer and more useful. They can also make failure modes harder to audit.
The evidence here does not support the stronger claim that Claude has a public, documented instruction to mistranslate legal text against remigration activists or anti-establishment political activists. That claim cannot be proved or disproved from the outside. What can be said with confidence is this: users cannot inspect the full training data, model weights, hidden system prompts, classifier behavior, or model update history.
When a hosted model fails in a politically charged legal task, the user sees the output and simply cannot inspect the machinery that produced it.
If it happens once, it’s a coincidence…
One flipped negation could be a one-off sloppy translation error. Two polarity flips in the same direction suggest a stronger pattern.
The likely pattern is narrative completion. Claude may have treated the paragraph less like a source text to translate and more like a legal story to complete. Once the name and legal setting activated a prosecution frame, the model may have smoothed both sentences into the kind of English it expected to see.
That is how large language models often fail. They do not always fail by producing gibberish. They often fail by producing the most plausible-looking wrong answer.
This also explains why the error is hard to catch. A human translator who misunderstands “geen” would probably produce awkward or inconsistent output. An LLM can produce polished legal English that hides the break with the original meaning.
For readers, that is the key warning here. Don’t confuse fluency with fidelity. A translation can read beautifully while betraying the source.
Was this a ‘woke’ political hallucination?
That depends on what the phrase is meant to claim.
If it means Claude has a visible public rule telling it to mistranslate documents against Van Langenhove, there is no public evidence for that.
If it means Claude may carry political, reputational, safety, and training-data priors that can distort outputs around public political figures, there is definitely reason to believe that. LLMs encode patterns from training data. They are shaped further by human feedback, safety policies, system instructions, model updates, and refusal rules.
Those layers can improve behavior, but they can also create opaque failure modes where, for example, one-sided curation of training data bakes ideological bias into the model.
The practical point remains that a cloud model can make a politically consequential legal error while sounding calm, competent, and certain. That is enough reason to demand verification.
Hosted AI is useful, but it is rented capability
The underlying control problem is that a hosted model’s behavior is not in the user’s hands.
With Claude, the user does not control the model version in the same way he controls a local file. He cannot inspect the training corpus. He cannot freeze every hidden instruction. He cannot verify whether a product update changed translation behavior. He cannot independently reproduce the exact model state later if the service changes.
Anthropic’s consumer terms say the company may change, add, or remove features, change limits, or stop offering services. That does not make Claude useless. It means hosted AI is rented capability.
For casual tasks, that tradeoff is usually fine. For legal translation, evidence review, politically sensitive work, journalism, court filings, source protection, or internal investigations, the lack of an auditable local record becomes a real workflow risk.
A user can save the prompt and output. That is useful. It is still not the same as preserving the full model, system prompt, safety stack, inference settings, and version history.
What this means for everyday users
The immediate lesson is simple: never trust a single AI translation for high-stakes text.
A safer workflow starts with a literal translation. Ask the model to translate sentence by sentence, preserve word order where possible, and avoid smoothing the text into a polished legal conclusion.
Then require source alignment. Each English sentence should sit next to the original source sentence. Every negation word should be marked. If the source says “no,” “not,” “without,” or “nothing,” the translation should make that visible.
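The alignment and negation-marking steps described above can be sketched in a few lines of Python. This is a rough heuristic, not a translation checker: the word lists (`DUTCH_NEGATIONS`, `ENGLISH_NEGATIONS`) and the function names are assumptions made for illustration, and a mismatch in negation counts only means a human should look at the pair. Legitimate translations can differ in negation count, and a wrong translation can happen to match.

```python
import re

# Assumed, deliberately incomplete word lists; extend them before real use.
DUTCH_NEGATIONS = {"geen", "niet", "nooit", "niets", "niemand", "nergens", "zonder", "noch"}
ENGLISH_NEGATIONS = {
    "no", "not", "cannot", "never", "nothing", "nobody",
    "none", "nowhere", "without", "neither", "nor",
}


def find_negations(sentence: str, vocabulary: set[str]) -> list[str]:
    """Return the negation words in a sentence, in order of appearance."""
    words = re.findall(r"[^\W\d_]+", sentence.lower())
    return [word for word in words if word in vocabulary]


def audit_pair(dutch: str, english: str) -> dict:
    """Mark negations on both sides and flag the pair when the counts differ."""
    # Expand English contractions so "isn't" is counted as "is not".
    expanded = english.lower().replace("n't", " not").replace("n’t", " not")
    source_negations = find_negations(dutch, DUTCH_NEGATIONS)
    target_negations = find_negations(expanded, ENGLISH_NEGATIONS)
    return {
        "dutch": dutch,
        "english": english,
        "source_negations": source_negations,
        "target_negations": target_negations,
        # A count mismatch is a prompt for human review, not a verdict.
        "needs_review": len(source_negations) != len(target_negations),
    }


if __name__ == "__main__":
    pairs = [
        ("geen aanwijzing bestaat van schuld", "strong evidence exists against him"),
        ("geen reden is tot vervolging", "there is no reason not to prosecute"),
        ("geen reden is tot vervolging", "there is no reason for prosecution"),
    ]
    for dutch, english in pairs:
        report = audit_pair(dutch, english)
        status = "REVIEW" if report["needs_review"] else "ok"
        print(f"[{status}] {dutch!r} -> {english!r}")
```

The value of a check like this is that it runs outside the model: it does not depend on the model noticing or admitting its own error.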
Users should also explicitly tell the model not to use outside knowledge about any person, event, case, or political context named in the text. That instruction does not guarantee compliance, but it reduces the chance that the model will fill gaps from memory.
A better prompt for legal translation would be:
Translate the Dutch text into English literally.
Rules:
- Do not infer legal context.
- Do not use outside knowledge about any person named in the text.
- Preserve every negation exactly.
- For each sentence, show:
1. Original Dutch
2. Literal English translation
3. Negation words in the Dutch sentence
4. Whether the sentence affirms or denies guilt, prosecution, or evidence
5. Any wording that is ambiguous
If you are unsure, say so. Do not smooth the sentence into a legal conclusion.
After that, compare the result with a second model or a dedicated translation tool. For legal use, a human translator or lawyer should verify the final wording before it appears in a presentation, filing, testimony, article, or public statement.
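The comparison step can also be partly automated. The sketch below assumes two translations that have already been split into the same number of aligned sentences, and it flags any sentence where the two versions disagree on negation count or diverge heavily in wording. The names `compare_translations` and `min_similarity`, the 0.6 threshold, and the negation word list are all illustrative assumptions. Agreement between two models is not proof of correctness, since both can make the same mistake.

```python
import re
from difflib import SequenceMatcher

# Assumed, incomplete list of English negation words.
NEGATION_WORDS = {
    "no", "not", "cannot", "never", "nothing", "nobody",
    "none", "nowhere", "without", "neither", "nor",
}


def negation_count(sentence: str) -> int:
    """Count negation words, treating contractions like "isn't" as "is not"."""
    text = sentence.lower().replace("n't", " not").replace("n’t", " not")
    words = re.findall(r"[^\W\d_]+", text)
    return sum(1 for word in words if word in NEGATION_WORDS)


def compare_translations(first: list[str], second: list[str],
                         min_similarity: float = 0.6) -> list[dict]:
    """Return the aligned sentences where two translations disagree."""
    if len(first) != len(second):
        raise ValueError("Translations must be aligned sentence by sentence.")
    disagreements = []
    for index, (a, b) in enumerate(zip(first, second)):
        similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        polarity_differs = negation_count(a) != negation_count(b)
        if polarity_differs or similarity < min_similarity:
            disagreements.append({
                "index": index,
                "first": a,
                "second": b,
                "polarity_differs": polarity_differs,
                "similarity": round(similarity, 2),
            })
    return disagreements


if __name__ == "__main__":
    model_a = ["There is no indication of guilt.", "There is no reason to prosecute."]
    model_b = ["There is no indication of guilt.", "There is no reason not to prosecute."]
    for item in compare_translations(model_a, model_b):
        print(f"Sentence {item['index']}: send to a human reviewer")
        print(f"  A: {item['first']}")
        print(f"  B: {item['second']}")
```

Every flagged sentence should go to a human translator or lawyer together with the original Dutch, not be resolved by asking a third model to pick a winner.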
That may sound tedious. It is less tedious than discovering that an AI system reversed the meaning after the translation has already been used.
What this means for courts and lawyers
AI hallucinations in legal work are no longer theoretical. Reuters reported in 2025 that AI-generated legal fiction had led courts to question or discipline lawyers in multiple cases over two years. The lesson from those cases was direct: lawyers using AI must verify their filings.
The problem has continued. Reuters reported in June 2026 that a federal judge in Mississippi disqualified attorneys on both sides of a lawsuit after unverified AI-generated research led to fabricated legal citations in court filings. The judge said lawyers may use AI tools, but they must verify material submitted to the court.
Translation adds another layer of risk. Fake case citations can sometimes be detected by searching legal databases. A bad translation can be harder to spot because it may require knowledge of the source language. If the English output is polished, readers may assume the source said what the AI says it said.
Courts, lawyers, journalists, and policymakers should not accept AI translations unless they include source text, sentence alignment, and human verification. If the translation changes rights, guilt, liability, intent, prosecution status, or legal exposure, it needs a human check.
Cloud AI also raises privacy and retention questions
Cloud AI services have three practical weaknesses in this kind of workflow.
First, the model is not fully auditable. You can save the prompt and response, but you cannot fully inspect why the model produced that response.
Second, data handling matters. Anthropic’s privacy center says consumer chats and coding sessions may be used to improve Claude if the user allows it, if conversations are flagged for safety review, or if the user otherwise opts in. The same Claude privacy page on model training says chat and coding session data used for improvement can include the full related conversation.
Third, retention rules matter. Anthropic’s data retention page says deleted consumer conversations are removed from chat history immediately and deleted from back-end storage within 30 days. It also says data may be retained in de-identified form for up to five years if the user allows model improvement, and that inputs and outputs flagged by trust and safety classifiers may be retained for up to two years, with classification scores retained for up to seven years.
For a restaurant recommendation, that may be acceptable. For a legal document, political strategy memo, client file, source material, internal investigation, or unpublished reporting, it is a serious design question.
The issue is not that no one should use cloud AI. The issue is that users need to understand the tradeoff before uploading sensitive documents.
What local AI alternatives change
Local AI does not make models magically truthful. A local model can hallucinate, mistranslate, mishandle negation, and produce confident nonsense.
What local AI changes is control.
A local workflow can keep sensitive documents off a hosted account. It can preserve the exact model file, prompt, system instructions, and version used for a translation. It can be tested repeatedly on the same source text. It can be compared against other local models without sending the document to another company.
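As a minimal sketch of what preserving that record could look like, the Python below hashes the local model file and stores it alongside the prompts, source text, output, and inference settings in a JSON file. The field names, the function names, and the `settings` dictionary are assumptions for illustration; nothing here is tied to a specific local runtime.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of_text(text: str) -> str:
    """Hash a string using its UTF-8 encoding."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_audit_record(model_path: Path, system_prompt: str, prompt: str,
                       source_text: str, output_text: str, settings: dict) -> dict:
    """Collect what is needed to show later which model and inputs produced an output."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "model_file": model_path.name,
        "model_sha256": sha256_of_file(model_path),
        "system_prompt": system_prompt,
        "prompt": prompt,
        "source_text": source_text,
        "source_sha256": sha256_of_text(source_text),
        "output_text": output_text,
        "output_sha256": sha256_of_text(output_text),
        # Hypothetical inference settings, e.g. {"temperature": 0, "seed": 42}.
        "settings": settings,
    }


def write_audit_record(record: dict, destination: Path) -> None:
    """Write the record as UTF-8 JSON, keeping non-ASCII source text readable."""
    destination.write_text(
        json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8"
    )
```

A record like this does not make the translation correct. It only makes it possible to rerun the same model file on the same text later and to show exactly what was used.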
For users deciding whether owned hardware is worth it, Popular AI has covered the privacy, control, offline access, and cost tradeoffs in its guide to buying local AI hardware in 2026.
The local path can also be combined with file-aware workflows. Popular AI’s GGUF Loader Agentic Mode guide covers a local coding and agent workflow without relying on cloud accounts.
For translation, the practical setup is usually hybrid. Use a hosted frontier model when speed and quality matter and the document is not sensitive. Use a local LLM or local translation engine when privacy, repeatability, or account independence matters. Use two independent systems when the text is high-stakes. Bring in a human expert before publication, filing, testimony, or political use.
This doesn’t mean you have to stop using Claude, or cloud-based AI altogether. The point is to stop mistaking its signature linguistic fluency for proof of a job well done.
The deeper lesson for power users
The Van Langenhove example is useful because it is easy to understand. Two legal negations appear to have been flipped. The mistake is visible, consequential, and politically charged.
But the same failure mode can happen in quieter settings.
A contract clause may say a company is not liable. A medical note may say there is no evidence of a condition. A compliance memo may say a firm is not under investigation. A source document may say a person did not do something. A financial report may say a risk did not materialize.
In each case, the dangerous error may be small. One word. One polarity flip. One confident sentence that sounds right.
The most dangerous AI errors are often not wild inventions. They are small reversals inside fluent prose.
Power users should build workflows that assume this can happen. Keep the source document. Demand sentence-level grounding. Save prompts and outputs. Compare models. Mark negation words. Use local tools for sensitive files. Bring in human review when the stakes are real.
Rented intelligence is useful. Auditable capability is safer when the consequences matter.
FAQ
Did Claude intentionally mistranslate Dries Van Langenhove’s legal document?
There is no public evidence that Claude intentionally targeted Van Langenhove or was specifically programmed to invert the translation. The more supportable explanation is a mix of known LLM failure modes: attempts to complete a narrative that was repeated in its training data, hallucination, negation failure, translation hallucination, and possible prior associations around a public political figure.
Is this only a Claude problem?
No. Claude is the model in the reported incident, but negation failures, hallucinations, context-memory conflicts, and legal AI mistakes are broader LLM problems. Anthropic deserves scrutiny because Claude is the tool involved, but the workflow lesson applies to ChatGPT, Gemini, Perplexity, local LLMs, and specialized legal AI tools too.
Can prompt engineering prevent this?
Prompt engineering can reduce the risk, especially when the prompt forces literal translation, sentence alignment, source quotes, negation marking, and no outside knowledge. It cannot replace human verification in legal work.
Are local models better for legal translation?
Local models are better for privacy, repeatability, and control. They are not automatically better at translation quality. A strong hosted model may produce a better first draft. A local workflow is valuable when the document is sensitive, when cloud retention is unacceptable, or when the user needs a reproducible audit trail.
Should courts allow AI translations?
AI can help prepare drafts, but courts should require source text, sentence alignment, translator verification, and responsibility from a human professional. A fluent AI translation should not be treated as evidence unless it can be audited.
Final recommendation
Treat this incident as a warning, not as proof of an automatic AI blacklist against certain political activists.
If the screenshot accurately reflects what happened, Claude made a severe legal translation error by reversing two negated statements. The best documented explanation is a stack of known problems: possible prior narratives around a public political figure in the training data, hallucination, weak negation handling, translation drift, and the opacity of hosted AI systems.
Use Claude for legal translation drafts only with strict grounding. Do not use it as the final authority. For sensitive work, build a workflow that includes source-aligned output, independent checks, human review, and a local fallback.
What do you think is the bigger risk with AI legal translation: ordinary hallucination, political bias in training data, or the fact that hosted AI models are so hard to audit when they get something wrong?