From the Editor

What are the physical side effects of antidepressants? In a new, impressive Lancet study, Toby Pillinger (of King’s College London) and his co-authors attempt to answer that old question with a new approach: the first comprehensive systematic review and network meta-analysis of their physiological effects. They drew on 168 RCTs that measured the physical health effects of antidepressants, including almost 59 000 participants and comparisons of 30 antidepressants. “We found strong evidence that antidepressants differ markedly in their physiological effects, particularly for cardiometabolic parameters.” We consider the paper and its implications.

How safe is cannabis for those taking it for medical purposes? Dr. Beth Han (of NIDA) and her colleagues report findings from a US survey in a new JAMA Psychiatry research letter, focusing on cannabis use disorder (CUD). They find that cannabis was no less addictive when used for medical reasons. “Clinicians should consider addiction risk before recommending medical cannabis and, if they do, should monitor for CUD emergence.”

The BMJ runs humorous articles in its Christmas issue. The journal doesn’t disappoint this year. Dr. Roberto A. Correa Soto (of the Universidad de los Andes) and his co-authors write about AI hallucinations and doctor BS (yes, you read that correctly). Frankly, the paper is worth reading for the profanity alone. “Both doctors and large language models (LLMs) are driven to produce misinformation – ‘bullshit’ and ‘hallucinations’ – owing to a shared pressure to provide answers, prioritising the appearance of competence over accuracy.”

There will be no Readings for the next three weeks. 

DG

Selection 1: “The effects of antidepressants on cardiometabolic and other physiological parameters: a systematic review and network meta-analysis”

Toby Pillinger, Atheeshaan Arumuham, Robert A McCutcheon, et al.

The Lancet, 1 November 2025

Up to 17% of the adult population in Europe and North America are prescribed antidepressants. Although they are effective treatments, antidepressants can induce various physiological alterations, including weight gain, blood pressure disturbance, hyponatraemia, and QT prolongation. These side-effects have wide-reaching consequences, including discontinuation of treatment and thus poorer psychiatric outcomes. Professional bodies recommend that discussions about side-effects are central to antidepressant prescribing decisions. However, evidence syntheses on which to base these discussions are scarce, and the relative degree to which physiological alterations occur during acute treatment with different antidepressants is unclear. It is also unknown which physiological and demographic factors are associated with antidepressant-induced physiological dysregulation. Finally, although there is an association between improvements in psychotic symptoms and antipsychotic-induced metabolic disturbance in people with schizophrenia, it is not known if a similar relationship exists between improvements in depressive symptoms and antidepressant-induced metabolic alterations.

So begins a paper by Pillinger et al.

Here’s what they did:

  • They conducted a systematic review and meta-analysis “to compare and rank antidepressants based on physiological side-effects by synthesising data from randomised controlled trials (RCTs).”
  • They searched several databases, including MEDLINE.
  • They included “single-blinded and double-blinded RCTs that compared antidepressants with a placebo or with another antidepressant when used as monotherapy for the acute treatment (8 weeks, as previously defined) of adults (aged 18 years and older) with a psychiatric disorder.” 
  • They did “frequentist random-effects network meta-analyses to investigate treatment-induced changes in weight; total cholesterol; glucose; heart rate; systolic and diastolic blood pressure; corrected QT interval (QTc); sodium; potassium; aspartate transferase (AST); alanine transaminase (ALT); alkaline phosphatase (ALP); bilirubin; urea; and creatinine.” Finally, they did “meta-regressions to examine study-level associations between physiological change and age, sex, and baseline weight.”

Here’s what they found:

  • Of 26 252 citations, 151 studies and 17 FDA reports met inclusion criteria. There were 58 534 participants, comparing 30 antidepressants with placebo. 
  • Demographics. The average age of participants was 45 years; 62% were female and 75% were White.
  • Treatment. Median treatment duration was 8 weeks, with a range of 3 to 12 weeks. 
  • Metabolic and haemodynamic effects. There was a 4 kg difference in weight-change between agomelatine and maprotiline; over 21 beats-per-minute difference in heart rate change between fluvoxamine and nortriptyline; over 11 mmHg difference in systolic blood pressure between nortriptyline and doxepin. 
  • Cholesterol. “Paroxetine, duloxetine, desvenlafaxine, and venlafaxine were associated with increases in total cholesterol and, for duloxetine, glucose concentrations, despite all drugs reducing bodyweight.”
  • Liver enzymes. “There was strong evidence of duloxetine, desvenlafaxine, and levomilnacipran increasing AST, ALT, and ALP concentrations, although the magnitudes of these alterations were not considered clinically significant.”
  • Other effects. “We did not find strong evidence of any antidepressant affecting QTc, or concentrations of sodium, potassium, urea, and creatinine to a clinically significant extent.”

A few thoughts:

1. This is an impressive paper, drawing on a huge dataset, and published in a major journal.

2. The main findings in a sentence: “Marked differences were particularly evident for change in weight, heart rate, and blood pressure…”

3. The above summary doesn’t quite capture the richness and detail of the study. 

4. There were 30 antidepressants – but the number of trials per drug was uneven. Fluoxetine had the most trials (32); desipramine, phenelzine, and selegiline had the fewest (one trial each).

5. Is all this very reassuring? Guidelines recommend starting treatment with SSRIs. Citalopram and escitalopram, for the record, had the fewest side effects in this review. And many differences among antidepressants may not be important. Comments Dr. Jonathan Alpert (of Montefiore Einstein) in The New York Times: “Not everything that’s statistically significant is clinically meaningful.” Tricyclic antidepressants were more problematic, including for blood pressure, but such side effects are well established.

6. Perspective is important as well. I’ll quote liberally from the Substack of Dr. Niall Boyce (of Wellcome): “The risks of antidepressant treatment need to be put in the context of the condition that they are intended to treat. As the late Ozzy Osbourne put it, ‘depression ain’t funny.’ I wonder sometimes if controversies over antidepressant prescription are driven by the idea that they are somehow unnecessary medications. My view is that they can, like any drug, be inappropriately prescribed; that I would like there to be options with greater effectiveness and fewer side effects; but that the present range of medications can make a real difference to people who are struggling. We need to be clear about both limitations and advantages, while researchers strive to come up with something better.” Well said.

7. Like all studies, there are limitations. A major one: trials were relatively short, whereas patients often take antidepressants for extended periods.

The full Lancet paper can be found here:

https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(25)01293-0/fulltext

Selection 2: “Medically Recommended vs Nonmedical Cannabis Use Among US Adults”

Beth Han, Wilson M. Compton, Emily B. Einstein, et al.

JAMA Psychiatry, March 2025

With increases in cannabis use for medical purposes and its perceived benefits, patients and clinicians need to be aware of its potential risks. Studies of local data have documented that patients often use cannabis on a daily or near-daily basis to manage their conditions. Frequent cannabis use and cannabis use disorder (CUD) are associated with increased risk for negative health (eg, psychoses, hyperemesis) and academic and social outcomes.

So begins a research letter by Han et al.

Here’s what they did:

  • They conducted a cross-sectional study examining “differences in cannabis use frequency and CUD prevalence for medical-only (ie, all cannabis use recommended by health care professionals) and medical-nonmedical vs nonmedical-only use among US adults.”
  • They drew data from the 2021-2022 National Surveys on Drug Use and Health (NSDUH) “which provides national and state representative data on past-year cannabis use, use frequency (ie, number of days), CUD, and other substance use disorders.” 
  • Respondents were asked whether health care professionals recommended any cannabis use; if yes, they were asked whether all cannabis use was recommended, and if so, they were classified as having “medical-only cannabis use.”
  • Several statistical analyses were done, including Poisson regression for cannabis use frequency.

Here’s what they found:

  • There were 72 668 respondents.
  • Use. Overall, 29.5% reported past-year cannabis use: 83.7% nonmedical-only use; 9.2% medical-only use; and 5.7% medical-nonmedical use. 
  • CUD. Overall, 34.8% had CUD. (!)
  • Demographics. “Males and females aged 18 to 34 years and males aged 35 to 49 years reporting medical-only or medical-nonmedical use had higher adjusted prevalence of severe, moderate, and mild CUD than counterparts reporting nonmedical-only use.”
  • Adjusted prevalence. “Adjusted prevalence of moderate (16.9%…) and severe (13.5%…) CUD among males aged 18 to 34 years reporting medical-only use was higher than among counterparts with nonmedical-only use.”

A few thoughts:

1. This is an interesting and important research letter with practical implications.

2. The main finding in a sentence: “adults aged 18 to 49 years reporting medical-only or medical-nonmedical cannabis use vs nonmedical-only use had higher prevalence of CUD at all severity levels and reported more frequent cannabis use.”

3. The authors have a relevant interpretation: “These findings suggest that medically recommended cannabis is not associated with reduced addiction risk compared with nonmedical use.” (!!)

4. The perception of medical cannabis among some patients is that it’s safe and effective. This paper questions the safety. What about effectiveness? A recent Reading considered the new JAMA paper analyzing the therapeutic uses of cannabis, finding: “Despite the accumulation of new studies, evidence is insufficient for the use of cannabis or cannabinoids for most medical conditions.” That paper can be found here:

https://davidgratzer.com/reading-of-the-week/reading-of-the-week-cannabinoids-for-therapeutic-use-the-new-jama-paper-also-ect-and-szalavitz-on-her-recovery-substance-use/

5. Like all studies, there are limitations. The authors note several including the reliance on self-reported data.

The full JAMA Psychiatry research letter can be found here:

https://jamanetwork.com/journals/jamapsychiatry/article-abstract/2829336

Selection 3: “Parallel pressures: the common roots of doctor bullshit and large language model hallucinations”

Roberto A. Correa Soto, Liam G. McCoy, Camilo Perdomo-Luna, et al.

The BMJ, 12 December 2025

The phenomenon of doctors presenting unfounded statements with unwavering arrogance – colloquially known as ‘bullshit’ – has long been recognised in medical practice. In parallel, the tendency of large language models (LLMs) to generate plausible but factually incorrect information, termed ‘hallucinations,’ presents a remarkably similar challenge in healthcare related artificial intelligence applications. These parallel behaviours stem from shared underlying mechanisms.

In both cases, pressure to produce output regardless of knowledge limitations can lead to a preference for any response over none, driven by a reward seeking intention…

So begins a new paper by Correa Soto et al.

The compulsion to answer

“In medical education and practice, doctors face intense pressure to demonstrate competence by having answers readily available. The cultural expectation that doctors should possess comprehensive knowledge creates an environment where admitting uncertainty can be perceived as a weakness or incompetence. Thus, medical professionals often feel obligated to provide opinions even when their expertise is limited, driving the practice of bullshitting. During rounds, resident doctors and trainees might offer explanations for laboratory abnormalities they don’t fully understand. During patient consultations, doctors might present treatment options with unwarranted certainty to maintain authority…

“LLMs face parallel constraints. They are designed to generate continuations of text based on learned patterns, with a poor capacity to recognise and communicate genuine uncertainty. Computational pathways for a refusal to provide an answer do exist and are being developed. But current LLMs tend to ‘hallucinate’ false information in their responses rather than abstaining when faced with uncertainty, incomplete information, or prompts that conflict with human values. This tendency is exacerbated by the propensity for models to excessively flatter or agree with users, which can lead models to prioritise perceived user satisfaction over truthfulness.” 

Performance, authority, and reward hacking

“The hierarchical structure of medicine creates a theatre of authority where performance often supersedes accuracy. Medical trainees learn early that confident delivery can mask knowledge gaps, while hesitation invites scrutiny… This performance aspect of medical communication rewards those who can speak authoritatively with more clinical opportunities and professional advancement, regardless of the quality of their work. Yet research has repeatedly shown the weak correlation between doctor confidence and actual clinical competence.

“Similarly, LLMs are developed by using a reward model that provides positive reinforcement for outputs perceived as helpful or satisfactory by human raters. This can lead to ‘reward hacking,’ where the model learns to maximise these reward signals – generating responses that seem plausible and confident – rather than aligning with factual accuracy or true human preferences. This can result in outputs that exploit loopholes in the reward model, even when incorrect.”

Institutional forces

“Broader institutional forces also perpetuate the production of bullshit in medicine and hallucinations in LLMs. In academia, publication quantity is prioritised over quality. Financial interests, conflicts, and prejudices can bias research outcomes. Consequently, many findings reflect prevailing biases shaped by flexibility in designs, analyses, and reporting, rather than empirical truth…

“For LLMs, commercial imperatives similarly prioritise flawed performance metrics over accuracy and patient outcomes. Models that appear knowledgeable attract more users and investment, creating market pressure for capabilities that might exceed reliable knowledge boundaries. The rush to deploy AI in healthcare settings often bypasses rigorous validation, and AI tools often underperform when compared with initial claims.”

They suggest that the result is a human-LLM feedback loop.

“Since LLMs learn statistical patterns from these data without verification mechanisms, they ingest and replicate these instances of bullshit. The models then generate ‘hallucinations’ that mimic the tone of the training data, perpetuating the cycle. This creates a dangerous dynamic where flawed human communication trains AI to produce similarly flawed, confidently asserted outputs, potentially leading to misdiagnosis, inadequate treatment, erosion of trust, and amplification of existing biases within healthcare.”

How to move forward? They offer a few suggestions.

  • Medical education. “Training programmes should explicitly reward phrases like ‘I don’t know’ when appropriate, making uncertainty a mark of professionalism not deficiency.”
  • Academic culture. “To bolster reliable science, academia should prioritise the totality of evidence over isolated statistically significant findings. Rather than fostering prolific doctor-researchers, a more valuable approach is to train doctors broadly in research methods and evidence based medicine.” 
  • Collaboration. “The most effective solutions for mitigating misinformation in healthcare will arise from the convergence of human and AI systems, leveraging their complementary strengths and addressing shared weaknesses by using collaborative workflows.”

A few thoughts:

1. This paper is clever, provocative, and worth thinking about.

2. The first sentence is particularly good: “The phenomenon of doctors presenting unfounded statements with unwavering arrogance – colloquially known as ‘bullshit’ – has long been recognised in medical practice.”

3. Is the paper playful and mischievous? Sure. But it does raise reasonable points about the echo chamber of AI and medicine, and the recommendations aren’t so dramatic.

The full BMJ paper can be found here:

https://www.bmj.com/content/391/bmj.r2570

Reading of the Week. Every week I pick articles and papers from the world of Psychiatry.