
Conference Paper Published

Research

Which Model Mimics Human Mental Lexicon Better? A Comparative Study of Word Embedding and Generative Models

Song, H., Feng, Z., Chersoni, E., & Huang, C.-R. (2025). Which Model Mimics Human Mental Lexicon Better? A Comparative Study of Word Embedding and Generative Models. In Proceedings of the International Conference on Computational Semantics (IWCS 2025), 208-230.
 
URL:  https://aclanthology.org/2025.iwcs-main.19/

 

Abstract

Word associations are commonly used in psycholinguistics to investigate the nature and structure of the human mental lexicon, and they are also an important data source for measuring the alignment of language models with human semantic representations. Taking this view, we compare the capacities of different language models to model collective human association norms via five word association tasks (WATs), with association predictions driven either by word vector similarities for traditional embedding models or by prompting large language models (LLMs). Our results demonstrate that neither approach produces human-like performance on all five WATs; hence, none of these models can yet successfully model the human mental lexicon. Our detailed analysis shows that static word-type embeddings and prompted LLMs align better overall with human norms than word-token embeddings from pretrained models like BERT. Further analysis suggests that the performance discrepancies may be due to differences in model architecture, especially in how models approximate human-like associative reasoning through either semantic similarity or relatedness evaluation. Our code and data are publicly available at: https://github.com/florethsong/word_association.

 
   


