From Human Experiments to AI Evaluation: A Behavioral Approach to LLMs
- Date: Jul 14, 2025
- Time: 04:00 PM (Local Time Germany)
- Speaker: Thilo Hagendorff (Universität Stuttgart)
- Room: Basement
Large language models (LLMs) are currently at the
forefront of intertwining AI systems with human communication and everyday
life. It is therefore important to assess and scrutinize their capabilities
thoroughly. Because today's LLMs exhibit increasingly complex and novel
behaviors, one way to do this is to treat them as participants in
psychological experiments originally designed to test humans. The talk will
outline how psychology can inform behavioral tests for LLMs, and how such tests
can discover emergent abilities in LLMs. As a deep dive, the talk will focus on
the phenomenon of deception in LLMs, an ability that carries significant
implications for alignment and safety.