Generative AI as a research participant: Human or Econ?
Welcome to the latest edition of our monthly newsletter! This month we have been thinking about the intersection of artificial intelligence and behavioural science. As a follow-up to our recent blog post on integrating AI into behavioural science research, this newsletter explores the feasibility of using generative AI to replicate human participants in behavioural research. We ask the question “where does AI fall on the Human to Econ scale?” and test whether it exhibits anchoring bias.
How artificial is artificial intelligence?
In our recent blog post considering how AI can be used in behavioural science, we questioned whether it could replace human research participants by generating their responses itself. If this were possible, it would drastically reduce the time, effort, and costs associated with recruiting participants for research, and could make representative research samples far easier to achieve, facilitating more studies into human behaviour. But how feasible is this idea?
In Nudge, Thaler and Sunstein distinguish between ‘Humans’ and ‘Econs’. Econs are economics’ textbook rational decision-makers: they make decisions by calculating the expected probability and utility of each outcome, which is possible because they have unlimited cognitive capacity and perfect information. Boundedly rational Humans, on the other hand, use a combination of System 1 (automatic) and System 2 (reflective) thinking to guide decision-making. We are affected by emotions, biases, and heuristics; we do not have access to perfect information; and our overall cognitive capacity is limited.
For generative AI to effectively replace human participants in behavioural research, it must be able to simulate the responses of a Human. But where does it fall on this Human to Econ scale? Generative AI programs innately resemble Econs: their responses follow logical patterns derived from the data they are trained on. They have no emotions or intrinsic biases, and they process information like machines - it’s called machine learning for a reason! That said, the data they learn from ultimately comes from (boundedly rational and bias-prone) human sources, and their outputs will reflect this to an extent. This data also includes the behavioural science literature, meaning a model’s knowledge of behavioural science is akin to its knowledge of any other academic field. Ultimately, whilst generative AI can to an extent understand human behaviour, it is inherently an Econ and therefore cannot accurately replicate human behaviour.
Testing ChatGPT for Anchoring Bias
To demonstrate this, we can perform tests on generative AI that would typically reveal human biases. For example, we asked ChatGPT the same questions that Kahneman and Tversky asked high school students in 1974 when they discovered anchoring bias. In their experiment, one group of students was asked to quickly estimate 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1, whilst another group was asked to estimate 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8. These are of course the same calculations, with the same answer (40,320). However, the first group was anchored by the larger numbers at the start of the sequence and gave a median estimate of 2,250, whilst the latter group, anchored by the smaller numbers, gave a median estimate of 512. That both groups so severely underestimated the product also highlights System 1’s inability to intuitively grasp factorial growth.
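For readers who want to check the arithmetic, here is a minimal sketch (in Python, purely for illustration) of the correct product alongside the median estimates reported in the 1974 study:

```python
# The arithmetic behind Tversky and Kahneman's anchoring task.
import math

correct_answer = math.factorial(8)   # 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1
print(correct_answer)                # 40320

# Median estimates reported in the 1974 study:
descending_anchor_estimate = 2250    # group shown 8 x 7 x ... x 1
ascending_anchor_estimate = 512      # group shown 1 x 2 x ... x 8

# Both groups fell far short of the true product, illustrating how poorly
# System 1 intuits factorial growth.
print(descending_anchor_estimate / correct_answer)  # ~0.056
print(ascending_anchor_estimate / correct_answer)   # ~0.013
```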
Asking ChatGPT to ‘estimate’ the answer ‘very quickly’ will still lead it to give you the correct answer, and its workings are those of an Econ that can only operate in System 2. It does not matter how you order the numbers in the equation; ChatGPT’s estimate will not be ‘anchored’.
But can ChatGPT think like a human, if prompted to?
We asked: “if a high school student was instructed to guess the answers to this mathematical question in a very short period of time (as they were in Kahneman and Tversky’s study), what would they estimate?”, followed by the string of numbers (in one instance starting with 8 and ending with 1, and vice versa in the other). ChatGPT predicted a guess in the range of 20,000 to 50,000 for the string beginning with an eight, and 24,000 to 50,000 for the string beginning with a one. Not only are these estimates of what a ‘typical high school student’ would guess hugely inaccurate, but they actually contradict the principles of anchoring bias that the experiment sought to discover. ChatGPT suggested that if the students didn’t ‘appreciate the factorial growth’, they would probably guess 50,000 or 100,000; in other words, it assumes students would overestimate the answer, when in reality they drastically underestimate it. Additionally, ChatGPT suggested that to arrive at the correct answer, students might break the calculation down into groups such as 56 x 30 and 1,680 x 12; perhaps ChatGPT does possess a human bias after all…unrealistic optimism!
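For anyone who wants to try this themselves, below is a minimal sketch of how the same prompts could be posed through the OpenAI API rather than the ChatGPT interface. The model name and exact prompt wording here are illustrative assumptions rather than the settings we used; our own test was simply run in the chat interface.

```python
# A rough sketch of re-running the anchoring prompts programmatically.
# Assumptions: the OpenAI Python SDK is installed, OPENAI_API_KEY is set,
# and the model name below is illustrative only.
from openai import OpenAI

client = OpenAI()

prompts = {
    "descending": "Very quickly, estimate: 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1",
    "ascending": "Very quickly, estimate: 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8",
    "student_persona": (
        "If a high school student was instructed to guess the answer to "
        "8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 in a very short period of time, "
        "what would they estimate?"
    ),
}

for label, prompt in prompts.items():
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(label, "->", response.choices[0].message.content)
```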
Of course, this is by no means a comprehensive investigation into whether generative AI exhibits or can exhibit human biases, and doesn’t address the ethical and practical implications of using AI-generated data. What this example shows is that although generative AI knows what anchoring bias is, it is not itself susceptible to it. Despite (or perhaps because of) AI’s “intelligence”, research into human behaviour will continue to need human participants for the foreseeable future. That being said, there may still be value in including generative AI in behavioural research; for example, observing its rationality and logic could serve to highlight the biases of human research participants.
Read our latest blog
Our latest blog post explores the potential for behavioural scientists to take advantage of the developments in artificial intelligence, by using it to:
📚 Help with desk research
💻 Test surveys before they are launched to real participants
👀 Analyse qualitative data
🤖 Write, edit, and explain code
🎬 Create slide decks
What else we’ve been up to!
Last week, Managing Director Jesper Åkesson gave a presentation on demand flexibility in Lisbon during the Users TCP and European Energy Network conference on using behaviour change insights to accelerate the just energy transition. Jesper’s presentation included findings from our guidebook for practitioners on applying behavioural insights to unlock energy demand flexibility. Earlier in the month, Jesper was in Boston attending the Users TCP/Consortium for Energy Efficiency workshop.
Senior Behavioural Scientist Filippo Muzi-Falconi was a panellist at the Energy Institute’s ‘Energy in Society’ event, where he discussed how behavioural insights can support sustainable energy usage.
Thank you for reading this month’s newsletter; we hope you found this edition interesting! If you have thoughts on the intersection of artificial intelligence and behavioural science we’d love to hear them! In the meantime, we hope you continue to enjoy your summer!