Clinician warns of potential AI “collusion” with unreliable human input in mental health

(Toronto, May 27, 2026) A new viewpoint article published in JMIR Mental Health warns that artificial intelligence (AI) systems used in mental health settings may inherit and reinforce unreliable human input unless new safeguards are adopted. The paper, titled “When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion,” calls for the “clinical reliability” of training data to become a core standard for trustworthy AI.

The article explores how large language models, including AI chatbots, are trained using massive amounts of human-written text and feedback. According to author Dr Hina Tahseen, current discussions about AI safety often focus on harms that happen after deployment, such as misleading advice or emotional dependency. Dr Tahseen argues that a major issue may begin much earlier, during the collection of human-generated training and preference data.

The viewpoint introduces the psychiatric concept of “collusion,” described as the uncritical acceptance of an unreliable account, as a new way to understand AI behavior. It suggests that AI systems can unintentionally reinforce distorted, inaccurate, or unhealthy information when they are trained to prioritize user approval or unverified human feedback.

"AI safety efforts have focused on what these systems say to users. The prior question is whether the human data they learned from was reliable in the first place. Psychiatry assesses this every day in clinical practice—that expertise should be part of how we build and govern AI systems, not an afterthought," said author Dr Hina Tahseen.

Rather than focusing only on technical fixes, the viewpoint proposes that developers of mental health–related AI systems should include clinical expertise when designing training data, evaluating feedback, and monitoring systems after launch. Existing AI safety methods, such as refusal training, red-teaming, and content monitoring, already address parts of the problem, but they are not specifically designed to assess whether human self-reporting is clinically reliable.

Adding clinical reliability as an explicit AI trust criterion could strengthen safeguards for mental health technologies while helping researchers better understand how AI systems respond to vulnerable users.

Original article: When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion

URL: https://mental.jmir.org/2026/1/e96894

DOI: 10.2196/96894

###

About Dr Hina Tahseen

Dr Hina Tahseen is a Consultant Psychiatrist and Responsible Clinician at Somerset NHS Foundation Trust, Honorary Lecturer at Cardiff University School of Medicine, Vice Chair of the Royal College of Psychiatrists' Rehabilitation and Social Psychiatry Faculty, and a Clinical AI Governance Strategist and Consultant. Her research focuses on the clinical reliability of data used to train AI systems and the absence of psychiatric expertise from AI governance frameworks.

About JMIR Publications

JMIR Publications is a leading independent open access publisher of digital health research and a champion of open science. With a focus on author advocacy and research amplification, JMIR Publications partners with researchers to advance their careers and maximize the impact of their work. As a technology organization with publishing at its core, we provide innovative tools and resources that go beyond traditional publishing, supporting researchers at every step of the dissemination process. Our portfolio features a range of peer-reviewed journals, including the renowned Journal of Medical Internet Research.

To learn more about JMIR Publications, please visit jmirpublications.com or connect with us via X, LinkedIn, YouTube, Facebook, and Instagram.

Head office: 130 Queens Quay East, Unit 1100, Toronto, ON, M5A 0P6 Canada

Media contact: communications@jmir.org

The content of this communication is licensed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, published by JMIR Publications, is properly cited.

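Journal: JMIR Mental Health

DOI: 10.2196/96894

Method of research: Commentary/editorial

Subject of research: People

Article title: When AI Colludes: Clinical Reliability of Training and Preference Data as a Trustworthy-AI Criterion

Article publication date: 26-May-2026

COI statement: None declared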
