Large language models (LLMs) are trained to process and generate human-like text using vast amounts of data. Research suggests that they can form meaningful conceptual representations through language alone, contrasting with cognition theorists’ views that physical, sensory experiences are necessary for concept formation.

 

PolyU researchers, in collaboration with scholars from Ohio State University, Princeton University and the City University of New York, explored the similarities between LLMs and human representations. Their findings, published in Nature Human Behaviour, highlight how language shapes complex conceptual knowledge and how sensory input can enhance understanding.

 

Assessing LLM performance through grounding techniques

Led by PolyU Professor Li Ping, Sin Wai Kin Foundation Professor in Humanities and Technology, Dean of the Faculty of Humanities and Associate Director of the PolyU-Hangzhou Technology and Innovation Research Institute, the research team analysed conceptual word ratings from advanced LLMs, including ChatGPT (GPT-3.5, GPT-4) and Google LLMs (PaLM, Gemini). They compared these ratings with human-generated data for about 4,500 words across non-sensorimotor, sensory and motor domains, using the validated Glasgow and Lancaster Norms datasets.
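As an illustration of how this kind of rating comparison can be carried out, the short Python sketch below correlates LLM-elicited ratings with human norms dimension by dimension. The file names, column layout and choice of Spearman correlation are assumptions made for illustration only, not the team's actual pipeline.

```python
# Minimal sketch of a rating comparison, not the authors' actual analysis code.
# Assumes hypothetical CSV files with one row per word and one column per rated
# dimension (e.g., valence, olfactory strength, hand/arm action).
import pandas as pd
from scipy.stats import spearmanr

human = pd.read_csv("human_norms.csv", index_col="word")  # Glasgow/Lancaster-style ratings
llm = pd.read_csv("llm_ratings.csv", index_col="word")    # ratings elicited from an LLM

# Align on the shared vocabulary and shared rating dimensions before comparing.
shared_words = human.index.intersection(llm.index)
shared_dims = human.columns.intersection(llm.columns)

for dim in shared_dims:
    rho, p = spearmanr(human.loc[shared_words, dim], llm.loc[shared_words, dim])
    print(f"{dim}: Spearman rho = {rho:.2f} (p = {p:.3g})")
```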

 

Initially, the team compared individual human and LLM ratings to assess similarity, using human pairs as a benchmark. However, this method might overlook how multiple dimensions contribute to word representation. For instance, while “pasta” and “roses” may have similar olfactory ratings, “pasta” is more closely related to “noodles” in terms of appearance and taste. To gain a deeper understanding, the researchers conducted representational similarity analyses across various attributes.
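The sketch below illustrates the general idea of a representational similarity analysis: build a matrix of pairwise word distances from the human ratings, build another from the LLM ratings, and then correlate the two structures. The data layout and the correlation-distance metric are assumptions carried over from the previous sketch, not the published method.

```python
# Minimal sketch of a representational similarity analysis (RSA), assuming the
# same hypothetical human_norms.csv / llm_ratings.csv layout as above.
import pandas as pd
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

human = pd.read_csv("human_norms.csv", index_col="word")
llm = pd.read_csv("llm_ratings.csv", index_col="word")
words = human.index.intersection(llm.index)
dims = human.columns.intersection(llm.columns)

# Each word is a vector of ratings across dimensions; pdist returns the
# condensed matrix of pairwise distances between all word vectors.
human_rdm = pdist(human.loc[words, dims].to_numpy(), metric="correlation")
llm_rdm = pdist(llm.loc[words, dims].to_numpy(), metric="correlation")

# RSA score: rank correlation between the two pairwise-distance structures.
rho, _ = spearmanr(human_rdm, llm_rdm)
print(f"Human-LLM representational similarity: rho = {rho:.2f}")
```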

 

The results showed that LLM representations were most similar to human representations in the non-sensorimotor domain, less so in sensory domains and least similar in motor domains. This indicates LLM limitations in capturing human conceptual understanding, particularly in areas involving sensory information and embodied experiences.

 

To investigate whether grounding could enhance LLM performance, the researchers compared LLMs trained on both language and visual input (GPT-4, Gemini) with those trained on language alone (GPT-3.5, PaLM). The grounded models demonstrated significantly higher similarity to human representations.

 

Advancing LLMs with multimodal learning and sensory input

Professor Li noted, “The availability of both LLMs trained on language alone and those trained on language and visual input provides a unique setting for research into the effect of sensory input on human conceptualisation.” He emphasised the potential of multimodal learning to foster more human-like representations and performance in LLMs.

 

The researchers envision a future where LLMs equipped with grounded sensory input – such as through humanoid robotics – can actively interpret and interact with the physical world. Professor Li stated, “These advances may enable LLMs to fully capture embodied representations that mirror the complexity and richness of human cognition, and a rose in LLM representation will then be indistinguishable from that of a human.”

 

This finding aligns with previous research on representational transfer, which has shown how visual and tactile experiences influence object-shape knowledge.

 
