Journal Paper Published
Lee, S., & Wang, S. (2026). Probability versus Prompting: Language Model Performance on Dependencies beyond English. Studies in Linguistics, 79, 205-221.
DOI: https://doi.org/10.17002/sil..79.202604.207
Abstract
Recent work on Transformer-based language models (LMs) has raised a methodological debate over how best to evaluate LMs’ linguistic knowledge: through direct probability-based measures of token likelihood or through prompt-based interactions that elicit explicit responses. While prompting has become increasingly popular, its reliability as a diagnostic tool for linguistic competence remains unclear. Moreover, most prior evaluations have focused on English, leaving open questions about cross-linguistic generalizability. This paper compares probability-based and prompt-based evaluation methods in two typologically distinct languages, Hindi and Korean, targeting politeness dependencies that involve long-distance coherence. Using controlled minimal pairs, we assess models of different scales. Our results show a clear advantage for probability-based evaluation. Notably, larger prompt-based models do not outperform smaller probability-accessible models, suggesting that model scale or training data does not necessarily compensate for methodological limitations in evaluation. Our findings suggest that probability-based methods provide a more accurate and efficient window into LMs’ linguistic representations than prompt-based approaches. This study further underscores the importance of evaluation methodology in cross-linguistic LM research and calls for moving beyond English-centered linguistic assessments.