Journal Article FZJ-2025-04990

Anthropocentric bias in language model evaluation

2025
MIT Press Cambridge, MA

Computational Linguistics, 1-10 [10.1162/COLI.a.582]

Please use a persistent id in citations: doi:10.1162/COLI.a.582

Abstract: Evaluating the cognitive capacities of large language models (LLMs) requires overcoming not only anthropomorphic but also anthropocentric biases. This article identifies two types of anthropocentric bias that have been neglected: overlooking how auxiliary factors can impede LLM performance despite competence (auxiliary oversight), and dismissing LLM mechanistic strategies that differ from those of humans as not genuinely competent (mechanistic chauvinism). Mitigating these biases requires an empirical, iterative approach to mapping cognitive tasks to LLM-specific capacities and mechanisms, achieved by supplementing behavioral experiments with mechanistic studies.

Contributing Institute(s):
  1. Gehirn & Verhalten (INM-7)
Research Program(s):
  1. 5255 - Neuroethics and Ethics of Information (POF4-525)

Appears in the scientific report 2025
Database coverage:
Medline ; Creative Commons Attribution-NonCommercial-NoDerivs CC BY-NC-ND (No Version) ; DOAJ ; Arts and Humanities Citation Index ; Clarivate Analytics Master Journal List ; Current Contents - Arts and Humanities ; Current Contents - Engineering, Computing and Technology ; DOAJ Seal ; Ebsco Academic Search ; Essential Science Indicators ; IF >= 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Social Sciences Citation Index ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal articles
Institute collections > INM > INM-7
Workflow collections > Public entries
Online First

Record created 2025-12-04, last modified 2025-12-08


