Chao Prize Lecture 2026: TalkBank and AI

Conference/Seminar

Date

08 May 2026
Organiser

Faculty of Humanities
Time

17:15 - 18:15
Venue

Function Room 5-6, Basement 1, Hotel ICON

Remarks

The lecture will be conducted in English and will be recorded for promotional and educational purposes.

Summary

Abstract

The training and fine-tuning of AI systems depend on large amounts of accurately recorded data. For spoken language data, the largest open-access source is the TalkBank system, which is becoming increasingly prominent for automatic analysis of language disorders, language acquisition, and neurolinguistic modeling. Language development data in TalkBank are being used to train better automatic speech recognition for children. Data from people who stutter are being used to train systems for recognition of stuttered speech. Systems are using TalkBank data to detect cognitive decline, understand psychosis, profile types of aphasia, track recovery from traumatic brain injury, and follow patterns of code-switching. TalkBank also provides methods for detailed analysis of conversational interactions in classrooms and air traffic control, and of pragmatic deficits. Although current data are heavily skewed toward English and other European languages, data from other languages are growing rapidly through increasingly extensive data-sharing.

About the speaker

Prof. Brian MacWhinney's research focuses on understanding language structure, processing, and learning as emerging from competitive processes that operate across a variety of time/process scales with unique constraints. He applies this perspective in studies of first language acquisition, second language acquisition, language typology, sociolinguistics, conversational interaction, language disorders, and neurolinguistics. He has created the TalkBank system -- the world's largest open-access database on spoken language -- which now provides an essential component in the development of AI models for understanding human language.
