Study Shows ChatGPT Cannot Reliably Mirror Human Moral Judgment

January 13, 2026

A new study led by Associate Professor Matthew Grizzard found that ChatGPT cannot accurately predict how people judge right and wrong—even though its answers appear similar at first glance. The study, published in Scientific Reports, compared predictions from two ChatGPT models—text-davinci-003 and GPT-4o—to average ratings from 940 people who evaluated 60 moral, immoral, and neutral situations.

On the surface, the AI predictions appeared accurate because they were highly correlated with human ratings. A closer look, however, revealed major differences: in 52 of the 60 situations, ChatGPT's ratings differed meaningfully from human judgments. The models also tended to rate good and neutral actions more favorably than humans did, and bad actions more harshly.
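To see how a strong correlation can coexist with item-level disagreement, consider a minimal sketch with made-up numbers (not the study's data or code). Pearson's r rewards matching relative order, so a hypothetical model that systematically exaggerates ratings can still correlate almost perfectly with human averages while missing many individual items by a wide margin.

```python
# Minimal sketch with synthetic numbers (not the study's data): a "model" that
# systematically exaggerates human ratings still correlates almost perfectly
# with them, even though many individual items show sizable gaps.
import numpy as np

rng = np.random.default_rng(0)

human = rng.uniform(-3, 3, size=60)  # hypothetical human averages on an assumed -3..3 scale
model = np.clip(1.4 * human + rng.normal(0, 0.3, size=60), -3, 3)  # exaggerated, plus a little noise

r = np.corrcoef(human, model)[0, 1]
gaps = np.abs(model - human)

print(f"Pearson r: {r:.2f}")                                        # very high despite the gaps
print(f"Items with a gap > 0.5 points: {(gaps > 0.5).sum()} of 60")
```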

Another issue was how little of the rating scale ChatGPT used. One model used only nine different scores across all 60 situations, and the other used just 16. Human ratings, by comparison, included 57 unique values. This severe “clumping” meant ChatGPT gave identical scores to situations that humans saw as very different.
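The clumping itself is simple to check: count how many distinct values each set of ratings contains. The rough sketch below uses invented numbers (not the study's data) to show how a rater that only returns a handful of fixed scores produces far fewer unique values than fine-grained averages over hundreds of people.

```python
# Rough sketch with invented numbers (not the study's data): a rater that only
# ever returns whole-point scores yields far fewer unique values than
# fine-grained averages over many people.
import numpy as np

rng = np.random.default_rng(1)

human_means = np.round(rng.uniform(-3, 3, size=60), 2)    # fine-grained averaged ratings
model_scores = rng.choice(np.arange(-3.0, 4.0), size=60)  # coarse, heavily repeated scores

print(f"Unique human values: {np.unique(human_means).size}")   # close to 60
print(f"Unique model values: {np.unique(model_scores).size}")  # at most 7 here
```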

The researchers warn that relying only on high correlations to evaluate AIs can be misleading. While ChatGPT may seem aligned with human thinking, it still lacks the nuance needed to serve as a substitute for humans in research on moral reasoning.