| How should we talk about the mental states of AI models? |
| Talk (non-conference) (Invited) | FZJ-2026-02578 |
2026
Abstract: We often describe LLMs in the same psychological terms we use to describe people. We say a model understands a question, refuses a request, follows an instruction, or tries to deceive. Can such claims ever be literally true? Or are they always false, merely metaphorical, or otherwise defective? I argue that they can be literally true, but that many current uses are nonetheless defective. To accept that such claims can be literally true is to deny that linguistically articulate forms of cognition are uniquely human. That uniqueness claim has influential defenders. Some critics argue that psychological descriptions of LLMs are just anthropomorphic projections. Some go further and say that such claims are, for that very reason, irresponsible. I respond that mental properties need not come bundled in the package familiar from the human case. LLMs lack many important mental properties that humans have, but not all. This leaves us with a harder question. If some psychological descriptions of LLMs can be literally true, why are they so often misleading? My answer is that ordinary psychological terms carry background assumptions inherited from the human case: about agency, stability, reciprocity, and accountability. Current systems do not support these assumptions. The problem is therefore not solved by choosing between literal truth and metaphor. We need new construals: shared ways of saying which parts of the human psychological picture apply to AI systems, and which do not. Claims about AI deception, where the construal problem is especially sharp, will serve as a test case.