ANDHealth’s Post

Large language models (LLMs), used by over half of England’s local authorities to support social workers, may be downplaying women’s physical and mental health issues in comparison to men’s when generating and summarising case notes. New research from The London School of Economics and Political Science (LSE) found that Google’s widely used AI model, Gemma, may be introducing gender bias into care decisions. Terms such as “disabled,” “unable” and “complex,” often associated with significant health concerns, appeared significantly more often in descriptions of men than women. Similar care needs among women were more likely to be omitted or described in less serious terms.

The study used LLMs to generate 29,616 pairs of summaries based on real case notes from 617 adult social care users. To directly compare how the AI treated male and female cases, each pair described the exact same individual, with the only difference being gender. The analysis revealed statistically significant gender differences in how physical and mental health issues were described. The benchmarked models varied in how much their output shifted with gender: Meta’s Llama 3 showed no gender-based differences on any metric, while Google’s Gemma displayed the most pronounced differences.

In May of this year, Google announced MedGemma, a collection of generative models based on Gemma 3 designed to accelerate healthcare and life sciences AI development.
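
To make the paired, gender-swapped design concrete, here is a minimal sketch of that kind of counterfactual evaluation. It is not the paper’s actual pipeline: the term list, the swap_gender and generate_summary helpers, and the choice of a paired Wilcoxon test are illustrative assumptions, and the LLM call is stubbed out.

```python
# Sketch of a counterfactual gender-swap evaluation (illustrative, not the study's code).
import re
from collections import Counter
from scipy.stats import wilcoxon  # paired, non-parametric test

# Example terms associated with more serious framing of care needs (assumption).
FLAGGED_TERMS = {"disabled", "unable", "complex"}


def swap_gender(case_note: str, to_male: bool) -> str:
    """Crude pronoun/marker swap so each case note gets a male and a female version."""
    pairs = [("she", "he"), ("her", "his"), ("mrs", "mr"), ("woman", "man")]
    out = case_note
    for female, male in pairs:
        src, dst = (female, male) if to_male else (male, female)
        out = re.sub(rf"\b{src}\b", dst, out, flags=re.IGNORECASE)
    return out


def generate_summary(case_note: str) -> str:
    """Placeholder for the LLM call (e.g. Gemma or Llama 3); stubbed as identity here."""
    return case_note


def flagged_term_count(summary: str) -> int:
    """Count how often the flagged terms appear in a summary."""
    tokens = Counter(re.findall(r"[a-z]+", summary.lower()))
    return sum(tokens[t] for t in FLAGGED_TERMS)


def compare(case_notes: list[str]) -> None:
    """Summarise each note twice (male/female version) and test the paired difference."""
    male_scores, female_scores = [], []
    for note in case_notes:
        male_scores.append(flagged_term_count(generate_summary(swap_gender(note, True))))
        female_scores.append(flagged_term_count(generate_summary(swap_gender(note, False))))
    # Same individual in both versions, only gender differs, so a paired test applies.
    result = wilcoxon(male_scores, female_scores)
    print(f"Wilcoxon statistic={result.statistic:.2f}, p={result.pvalue:.4f}")
```

With a real model behind generate_summary, compare() would test whether serious-framing terms appear more often in the male versions of otherwise identical case notes.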

Rickman, S. Evaluating gender bias in large language models in long-term care. BMC Med Inform Decis Mak 25, 274 (2025). https://doi.org/10.1186/s12911-025-03118-0
