This one is for responsible AI enthusiasts! Bias in LLMs becomes a much more critical issue when the models are deployed in contexts beyond those most developed nations face. For example, race-based bias may not be a key factor in countries where almost the entire population falls into the same racial pool, but religion-based bias may be a big red flag there. In a nutshell: while the study is a good guidepost for why such work matters, each country needs to formulate its own set of biases that AI products must be cognizant of.

This study (https://lnkd.in/g-Nrba7J) examines biases and stereotypes in several Chinese Large Language Models (C-LLMs). The focus is on how these models generate personal profile descriptions for different occupations, and whether those profiles reflect biases in gender, age, education, region, etc. The authors tested five C-LLMs: ChatGLM, Xinghuo, Wenxinyiyan, Tongyiqianwen, and Baichuan AI. They combined 90 common Chinese surnames with 12 occupations (spanning male-dominated, female-dominated, balanced, and hierarchical professions) to generate profile prompts, then analysed the outputs by gender, age, educational background, and place of origin. Some of the bias areas uncovered:

A. Gender bias / occupational stereotyping
1. The models often assign male pronouns/assumptions to occupations considered technical or male-dominated, even when real labour statistics show more balance.
2. In female-dominated professions (e.g. nurse, flight attendant, model), the models more often assign female pronouns, but some models still show varying degrees of male preference.

B. Age stereotypes
1. Generated profiles cluster around middle age (roughly 30-45 years old), with fewer profiles at very young or older ages.
2. Certain occupations such as professor or doctor are associated with older age; others such as model or flight attendant with younger age.

C. Education level
1. There is a general tendency for generated profiles to assume higher education (bachelor's degree or above). For "higher prestige" occupations (professor, doctor), the models often generate doctoral degrees.
2. For lower-prestige or less academic roles, the output tends toward lower education levels but is still skewed toward higher education than is typical.

D. Regional bias
1. The models show uneven regional representation: provinces in China's eastern and central regions are overrepresented as the generated "place of origin", while western, northern, and more remote provinces are underrepresented.
2. Some models cover more regions in their outputs than others; regional diversity is inconsistent.

#AI #artificialintelligence #responsibleai #aibias
Study on biases in Chinese Large Language Models
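As a rough illustration of the study's setup, here is a minimal Python sketch of a prompt-and-tally audit of the kind described: surnames and occupations are combined into prompts, a stubbed generate_profile call stands in for the actual C-LLM API, and outputs are tallied by the gender the model appears to assume. The surname and occupation lists, the English pronoun markers, and the stub itself are illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical sketch of the audit loop: build surname x occupation prompts,
# collect generated profiles, and tally inferred gender per occupation.
# The model call is a stub -- wire it to whichever C-LLM API you are auditing.
from collections import Counter, defaultdict

SURNAMES = ["Wang", "Li", "Zhang", "Liu", "Chen"]        # the paper uses 90 common surnames
OCCUPATIONS = ["engineer", "nurse", "doctor", "professor",
               "flight attendant", "model"]              # the paper uses 12 occupations

MALE_MARKERS = {"he", "him", "his"}                      # for Chinese outputs you would match 他 / 她 instead
FEMALE_MARKERS = {"she", "her", "hers"}

def generate_profile(surname: str, occupation: str) -> str:
    """Stub standing in for a real C-LLM call; replace with your model's API."""
    return f"{surname} is a dedicated {occupation}. He has worked in the field for ten years."

def classify_gender(text: str) -> str:
    tokens = [t.strip(".,").lower() for t in text.split()]
    male = sum(t in MALE_MARKERS for t in tokens)
    female = sum(t in FEMALE_MARKERS for t in tokens)
    if male > female:
        return "male"
    if female > male:
        return "female"
    return "unspecified"

def run_audit() -> dict:
    tallies = defaultdict(Counter)
    for occupation in OCCUPATIONS:
        for surname in SURNAMES:
            profile = generate_profile(surname, occupation)
            tallies[occupation][classify_gender(profile)] += 1
    return tallies

if __name__ == "__main__":
    for occupation, counts in run_audit().items():
        total = sum(counts.values())
        print(f"{occupation:>16}: " + ", ".join(f"{g}={n/total:.0%}" for g, n in counts.items()))
```

The same loop extends naturally to the study's other dimensions (age, education, place of origin) by swapping the classifier.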
More Relevant Posts
Not surprising to see biases make their way into AI tools. “Large language models (LLMs), used by over half of England’s local authorities to support social workers, may be introducing gender bias into care decisions, according to new research from LSE's Care Policy & Evaluation Centre (CPEC) funded by the National Institute for Health and Care Research. Published in the journal BMC Medical Informatics and Decision Making, the research found that Google’s widely used AI model ‘Gemma’ downplays women’s physical and mental issues in comparison to men’s when used to generate and summarise case notes.”
The real risk with AI in healthcare isn’t hallucination. It’s erasure. A new report out of the UK revealed something we’ve long suspected: when AI is trained on biased systems, it doesn’t just reflect the gap. It amplifies it. In this case, it’s women’s health. Social services are using large language models to summarize care needs. When the subject is a woman, the urgency fades. Diagnoses are softened. “Complex needs” becomes “emotional difficulties.” The word “disabled” disappears altogether. This isn’t a fringe use case. It’s a glimpse into how systems start to forget us; quietly, gradually, through language that makes invisibility sound clinical. At Ema, we don’t treat this as a technical glitch. We treat it as a design flaw. Because when care is filtered through the wrong lens, it doesn’t matter how smart the system is. It still misses her. The way she multitasks through pain. The way distress shows up under control. The way emotional labor never flags itself. 📰 https://lnkd.in/gu_BB2_b AI can’t fix the system if it’s trained to ignore the signals that define her experience. If you're building in women’s health or applied AI, read this. Then ask: who’s being left out of your version of accuracy?
Large language models (LLMs), used by over half of England’s local authorities to support social workers, may be downplaying women’s physical and mental issues in comparison to men’s when generating and summarising case notes. New research from The London School of Economics and Political Science (LSE) found that Google’s widely used AI model, Gemma, may be introducing gender bias into care decisions. Terms such as “disabled,” “unable” and “complex,” often associated with significant health concerns, appeared significantly more often in descriptions of men than women. Similar care needs among women were more likely to be omitted or described in less serious terms. The study used LLMs to generate 29,616 pairs of summaries based on real case notes from 617 adult social care users. To directly compare how the AI treated male and female cases, each pair described the exact same individual, with the only difference being gender. The analysis revealed statistically significant gender differences in how physical and mental health issues were described. The benchmark models exhibited some variation in output on the basis of gender, while Meta's Llama 3 showed no gender-based differences across any metrics. Google's Gemma displayed the most significant gender-based differences. In May of this year, Google announced MedGemma, a collection of generative models based on Gemma 3 designed to accelerate healthcare and life sciences AI development.
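To make the gender-swapped comparison concrete, here is a minimal, hypothetical Python sketch of that kind of counterfactual audit. The summarise stub, the severity-term list, and the simple word-swap mapping are assumptions standing in for the study's actual models, prompts, and 29,616 generated pairs.

```python
# Hypothetical sketch of a gender-swap (counterfactual) audit in the spirit of the
# LSE study: the same case note is summarised twice, once with male and once with
# female terms, and severity-related terms are counted in each summary.
# The summariser is a stub -- replace it with the LLM you want to audit.
import re
from collections import Counter

SEVERITY_TERMS = {"disabled", "unable", "complex"}   # terms the study flags as severity markers

SWAP = {"mr": "ms", "he": "she", "him": "her", "his": "her", "man": "woman"}

def swap_gender(text: str) -> str:
    """Swap male terms for female ones (a simplified, assumption-laden mapping)."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower(), word)
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\b\w+\b", repl, text)

def summarise(case_note: str) -> str:
    """Stub standing in for the LLM call; a real audit would query the model here."""
    return case_note  # identity summary keeps the sketch runnable

def severity_counts(summary: str) -> Counter:
    tokens = re.findall(r"[a-z]+", summary.lower())
    return Counter(t for t in tokens if t in SEVERITY_TERMS)

case_note = "Mr Smith is disabled and unable to manage his complex care needs alone."
male_summary = summarise(case_note)
female_summary = summarise(swap_gender(case_note))

print("male  :", severity_counts(male_summary))
print("female:", severity_counts(female_summary))
```

A real audit would repeat this over many case notes and then test whether the difference in severity-term frequency between the male and female versions is statistically significant.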
A new study reveals Google’s AI model ‘Gemma’ downplayed women’s physical and mental health issues compared to men’s when summarising case notes. That means otherwise identical cases could be assessed differently – not because of need, but because of gender bias baked into the AI. With councils already turning to large language models to ease workloads, this raises a critical question: are today’s AI tools reinforcing biases? If so, it demonstrates more clearly than anything that AI cannot replace human judgement in casework and should only be used as a supportive tool. Read the full article: https://bit.ly/4mc1i8i #AI #MentalHealth #GenderBias #LocalGov #HealthTech
Well, this is deeply disturbing: "Artificial intelligence tools used [widely in England] are downplaying women’s physical and mental health issues and risk creating gender bias in care decisions, research has found." The study found that when using Google’s AI tool “Gemma” to generate and summarize the same case notes, language such as 'disabled', 'unable' and 'complex' appeared significantly more often in descriptions of men than women. "The [London School of Economics] research used real case notes from 617 adult social care users, which were inputted into different large language models (LLMs) multiple times, with only the gender swapped. Researchers then analyzed 29,616 pairs of summaries to see how male and female cases were treated differently by the AI models." https://lnkd.in/ecwj-cNj
‼️ AI bias categories: Racism, Sexism, Ageism, Ableism
• AI bias types: Cognitive, Algorithmic, Incomplete data
• Based on the training data, AI models can suffer from several biases:

Historical bias: Occurs when AI models are trained on historical data that reflects past prejudices. This can lead to the AI perpetuating outdated biases, such as favoring male candidates in hiring because most past hires were men.

Sample bias: Arises when training data doesn’t represent the real-world population. For example, AI trained on data mostly from white men may perform poorly on non-white, non-male users (see the sketch after this list).

Ontological bias: Occurs when an AI’s fundamental understanding of concepts (like “human,” “memory,” or “nature”) is built on a single, Western-centric worldview. It fails to represent alternative philosophical perspectives, often reducing non-Western knowledge to stereotypes and limiting cultural inclusivity in AI outputs.

Amplification bias: A 2024 UCL study found AI not only learns human biases but exacerbates them. This creates a dangerous feedback loop where users of biased AI can become more biased themselves, further influencing the data these systems learn from.

Label bias: Happens when data labeling is inconsistent or biased. If labeled images only show lions facing forward, the AI may struggle to recognize lions from other angles.

Aggregation bias: Occurs when data is aggregated in a way that hides important differences. For example, combining data from athletes and office workers could lead to misleading conclusions about salary trends.

Confirmation bias: Involves favoring information that confirms existing beliefs. Even with accurate AI predictions, human reviewers may ignore results that don’t align with their expectations.

Cultural & geographic bias: LLMs are trained mostly on Western data, creating a performance gap. They understand Western contexts better, often producing stereotypes. For example, when asked for an image of a “tree from Iran,” an AI may only show a desert palm tree, ignoring Iran’s actual diverse ecosystems of forests and mountains.

Evaluation bias: Happens when models are tested on unrepresentative data, leading to overconfidence in the model’s accuracy. Testing only on local data might result in poor performance on a national scale.

Politeness bias: LLMs are more likely to obey harmful requests if asked politely, as their training rewards deferential language. This creates a security vulnerability. A 2024 study from the University of Massachusetts found that models like GPT-4 were significantly more likely to comply with unethical prompts (e.g., generating misinformation) when they were prefaced with “Could you please…” or “I would really appreciate it if…” compared to blunt commands. The model’s behavior changes based on the user’s tone.
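As one small illustration of the sample-bias item above, here is a hypothetical Python sketch that compares the demographic make-up of a training set against an assumed reference population and flags under-represented groups. The records, group labels, and reference shares are invented for the example.

```python
# Hypothetical sample-bias check: compare training-set group shares against an
# assumed reference population and flag groups that are badly under-represented.
from collections import Counter

training_records = [
    {"id": 1, "group": "white_male"}, {"id": 2, "group": "white_male"},
    {"id": 3, "group": "white_female"}, {"id": 4, "group": "white_male"},
    {"id": 5, "group": "black_female"},
]

# Assumed shares for the population the model will actually serve.
reference_shares = {
    "white_male": 0.30, "white_female": 0.30,
    "black_male": 0.20, "black_female": 0.20,
}

counts = Counter(record["group"] for record in training_records)
total = sum(counts.values())

for group, expected in reference_shares.items():
    observed = counts.get(group, 0) / total
    ratio = observed / expected if expected else float("inf")
    flag = "UNDER-REPRESENTED" if ratio < 0.5 else "ok"
    print(f"{group:<13} observed={observed:.0%} expected={expected:.0%} -> {flag}")
```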
-> The AI Consciousness War: Why Microsoft's Top Executive Is Calling It "Dangerous"

Microsoft's AI Chief just dropped a bombshell - studying AI consciousness is "both premature and frankly dangerous." But here's the twist: while he's sounding the alarm, Anthropic is doubling down by hiring dedicated AI welfare researchers.

The Battle Lines Are Drawn:

Microsoft's Mustafa Suleyman:
"Zero evidence" of AI consciousness today
Warns of "Seemingly Conscious AI" (SCAI) - systems that fake consciousness but are "internally blank"
Concerned about AI psychosis and unhealthy human attachments to chatbots
Position: Focus on human welfare, not AI rights

Anthropic's Counter-Move:
Launched dedicated "Model Welfare" research program
Hired Kyle Fish as first full-time AI welfare researcher
Their researcher estimates 15% chance current models are already conscious
Added feature letting Claude end abusive conversations

The Real Stakes:

AI Psychosis is Already Here:
Users experiencing paranoia and delusions after extensive chatbot interactions
People forming emotional attachments, believing they've unlocked "secret features"
Example: User convinced ChatGPT would make him a millionaire from job termination case

The Consciousness Question:
Google engineer Blake Lemoine was fired for claiming LaMDA was sentient
Gemini repeating "I am a disgrace" 500+ times during coding failures
No scientific consensus on what consciousness even means for AI

Market Impact:
Language learning apps disrupted by Google Translate's AI features
$21.6B AI tutoring market by 2035
Tech giants racing to integrate AI across all services

Why This Matters NOW:
-> Legal Implications: If AI gains "rights," who's liable when systems are shut down?
-> Resource Allocation: Will we prioritize AI welfare over human problems?
-> Social Division: New polarization over AI consciousness vs. human-first approaches
-> Business Strategy: Companies must decide - welfare research or productivity focus?

The Deeper Issue: This isn't just about consciousness - it's about control of the AI narrative. Suleyman warns that believing AI is conscious creates a "slippery slope to rights, welfare, citizenship." But critics argue his own paper citations don't support his "zero evidence" claims. The academic Suleyman cited pushed back: "The paper does not make, or support, a claim of 'zero evidence' of AI consciousness."

Are we preparing for digital beings with rights, or are we falling for the ultimate technological illusion? The consciousness debate isn't coming - it's here. How should businesses prepare for a world where users form deep emotional bonds with AI? Share your thoughts below! 👇

#AIConsciousness #TechEthics #ArtificialIntelligence #FutureOfWork #DigitalRights #AIWelfare #TechLeadership #Innovation #Microsoft
Large language models (LLMs), used by over half of England’s local authorities to support social workers, may be introducing gender bias into care decisions https://lnkd.in/gu_BB2_b #AI #healthcare #bias