Conference Paper Published
Ma, J., Feng, Z., Song, H., Chersoni, E., & Chen, Z. (2025). Reasoning or Memorization? Investigating LLMs' Capability in Restoring Chinese Internet Homophones. In Proceedings of the 3rd Workshop on Towards Knowledgeable Foundation Models (KnowFM), 120-139.
DOI: https://doi.org/10.18653/v1/2025.knowllm-1.11
Abstract: Chinese homophones, prevalent in Internet culture, bring rich linguistic twists that are challenging for language models. While native speakers disambiguate them through phonological reasoning and contextual understanding, it remains untested how well LLMs perform on this task and whether LLMs also achieve this via similar reasoning processes or merely through memorization of homophone-original word pairs during training. In this paper, we present HomoP-CN, the first Chinese Internet homophones dataset with systematic perturbations for evaluating LLMs’ homophone restoration capabilities. Using this benchmark, we investigated the influence of semantic, phonological, and graphemic features on LLMs’ restoration accuracy, measured the reliance levels of each model on memorization during restoration through consistency ratios under controlled perturbations, and assessed the effectiveness of various prompting strategies, including contextual cues, pinyin augmentation, few-shot learning, and thought-chain approaches.