
Spinning the discourse roulette: addressing poverty of data with Monte Carlo simulations

by Dr Dennis Tay, Department of English, PolyU

Date: 11 May 2020
Time: 5:00pm-6:00pm
Venue: Live webinar
The talk will be conducted in English.

About the Speaker
Dennis Tay is an Associate Professor in the Department of English, The Hong Kong Polytechnic University. His research covers four overlapping areas: cognitive linguistics, metaphor theory, mental healthcare communication, and discourse data analytics. He is academic editor of PLOS One and associate editor of Metaphor and the Social World and Metaphor and Symbol.

An intractable problem for analysts of professional communication and discourse is not having enough data due to confidentiality or other logistical hurdles. This poses additional challenges if findings are to be seen as useful and applicable. In this talk, I share my recent exploratory work with Monte Carlo simulations (MCS), a class of computational methods used to address similar problems in fields like finance and gaming (Kroese et al., 2014), but almost never on language data. The basic logic of MCS is to i) simulate repeated random samples of a phenomenon (language and discourse in this case) using limited information about its probability distribution, and ii) use the simulations to estimate how various discourse scenarios would have panned out by the laws of chance. I illustrate this with psychotherapy, a professional context where linguistic analysis is crucial but data is hard to obtain. I show how i) therapy talk is first quantified under socio-psychological and linguistic categories like analytic thinking, authenticity, clout, emotional tone, and pronouns (Pennebaker et al., 2015), ii) MCS models are then built and validated with 'training' and 'test' datasets, and iii) accurate models can inform follow-up qualitative/quantitative analysis, as well as therapists' understanding of the linguistic behavior of clients. The whole process is implemented with Python, an open-source programming language widely used in industry and research today.
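The basic MCS logic described above can be sketched in a few lines of Python. This is a minimal illustration and not the speaker's actual implementation: the scores are made-up 'emotional tone' values, the normal distribution is an assumed model, and the helper names `simulate_session_means` and `percentile_interval` are hypothetical.

```python
import random
import statistics


def simulate_session_means(train_scores, n_sessions, n_runs=5000, seed=0):
    """Simulate the mean score of n_sessions, repeated n_runs times.

    Assumption: scores follow a normal distribution whose mean and
    standard deviation are estimated from the small training set.
    """
    rng = random.Random(seed)
    mu = statistics.mean(train_scores)
    sigma = statistics.stdev(train_scores)
    means = []
    for _ in range(n_runs):
        sample = [rng.gauss(mu, sigma) for _ in range(n_sessions)]
        means.append(statistics.mean(sample))
    return means


def percentile_interval(values, lower=2.5, upper=97.5):
    """Return an approximate (lower, upper) percentile interval."""
    ordered = sorted(values)
    lo_idx = int(len(ordered) * lower / 100)
    hi_idx = min(int(len(ordered) * upper / 100), len(ordered) - 1)
    return ordered[lo_idx], ordered[hi_idx]


# Made-up scores for illustration only (not real therapy data).
train = [42.0, 55.5, 38.2, 61.0, 47.3, 52.8, 44.1, 58.6]  # 'training' sessions
test = [49.0, 53.2, 45.7, 50.9]                           # held-out 'test' sessions

# Step i): simulate repeated random samples from the estimated distribution.
sims = simulate_session_means(train, n_sessions=len(test))

# Step ii): see where the held-out sessions fall among the simulated outcomes.
low, high = percentile_interval(sims)
test_mean = statistics.mean(test)
print(f"Simulated 95% interval for the mean: {low:.1f} to {high:.1f}")
print(f"Observed test mean: {test_mean:.1f}")
print(f"Test mean inside interval: {low <= test_mean <= high}")
```

If the held-out mean falls inside the simulated interval, that is one simple indication that the assumed distribution is a plausible model of the scores; a real validation would likely use more sessions and a more careful choice of distribution.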
