The concept of categorical perception (0.5 hours)


In this kind of experiment, people's results often look more like this:

Graph showing how often you choose "pa", as a function of the VOT of the sound you heard. When VOT is low, the percentage of times you choose "pa" is low. As VOT increases, the percentage remains low at first, but between 20 and 30 milliseconds it jumps sharply to 100%, where it remains for higher VOTs.

This is quite different from the prediction I outlined. Recall that I predicted a straight line: as voice onset time gets longer, people's likelihood of choosing "pa" would get steadily higher. But that is clearly not what happened if the result looks like this. Instead, sometimes a longer voice onset time doesn't matter; for example, in the image above, a sound with 10 milliseconds of voice onset time is no more likely to be heard as "pa" than a sound with 0 milliseconds of voice onset time is (both are 0% likely to be rated "pa"). On the other hand, sometimes voice onset time has a huge impact; a sound with 20 milliseconds of voice onset time is only 25% likely to be heard as "pa", but when voice onset time goes up to 30 milliseconds the "pa" response rate jumps up to 100%!
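To make the unevenness concrete, here is a small Python sketch that computes how much the "pa" response rate changes at each 10-millisecond step. The numbers are the ones read off the example graph above; the 40-millisecond point is my own assumption, added only to stand in for "higher VOTs stay at 100%", and the function name `step_changes` is just an illustrative label.

```python
# Values read off the example graph described above (VOT in ms -> % "pa" responses).
# The 40 ms point is an assumption standing in for "higher VOTs stay at 100%".
vot_ms = [0, 10, 20, 30, 40]
percent_pa = [0, 0, 25, 100, 100]


def step_changes(vots, percents):
    """Return (from_vot, to_vot, change) for each adjacent pair of VOT steps."""
    return [
        (vots[i], vots[i + 1], percents[i + 1] - percents[i])
        for i in range(len(vots) - 1)
    ]


changes = step_changes(vot_ms, percent_pa)
for start, end, change in changes:
    print(f"{start:>2} -> {end:>2} ms: {change:+d} percentage points")
```

If perception followed the straight-line prediction, every step would add roughly the same amount. Here, almost all of the change is packed into the 20-to-30-millisecond step, and the steps on either side add little or nothing.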

We often refer to this pattern as "categorical perception". This means that when you hear sounds, your mind organizes them into categories. It seems that for some voice onset times, you almost always hear the sound as "ba", and for other voice onset times you almost always hear the sound as "pa". There is not much in the middle.

The idea here is that we rarely hear a sound and think "Hm, that was 72.3% like a 'ba' but 27.7% like a 'pa'." Rather, we hear a sound and think, "That was a 'ba'!". Even if it wasn't a perfect "ba", our mind still decides it was a "ba" and then places it into a "category" with all the other "ba" sounds we have ever heard. Therefore, our mind doesn't care very much about whether the voice onset time was 12 milliseconds, 17.35 milliseconds, or whatever. Our mind just cares about whether the sound was a "ba" or a "pa".

In theory, your mind has decided on a categorical boundary: some voice onset time that divides "ba" sounds and "pa" sounds. When you hear a sound, you automatically check what the voice onset time is, and compare it to your categorical boundary. Almost any sound with a voice onset time shorter than the categorical boundary will sound like "ba" to you. And almost any sound with a voice onset time longer than the categorical boundary will sound like "pa" to you. In my graph above, the categorical boundary seems to be somewhere between 20 milliseconds and 30 milliseconds: VOTs around 20 milliseconds or less are almost always treated as "ba", and around 30 milliseconds or more are almost always treated as "pa".
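One rough way to put a number on the boundary is to find the VOT at which the "pa" response rate crosses 50%. The Python sketch below does this by drawing a straight line between the two measured points on either side of the crossing. Treating 50% as the boundary and interpolating linearly are both simplifying assumptions of mine, not something this module's experiment prescribes, and the function name `estimate_boundary` is hypothetical. The data are the values from the example graph, with a 40-millisecond point assumed to stand in for the higher VOTs.

```python
# Values read off the example graph (VOT in ms -> % "pa" responses).
# The 40 ms point is an assumption standing in for "higher VOTs stay at 100%".
vot_ms = [0, 10, 20, 30, 40]
percent_pa = [0, 0, 25, 100, 100]


def estimate_boundary(vots, percents, threshold=50.0):
    """Estimate the VOT at which the "pa" rate first crosses `threshold`.

    Uses linear interpolation between the two neighbouring measurements.
    Returns None if the response rate never crosses the threshold.
    """
    for i in range(len(vots) - 1):
        low, high = percents[i], percents[i + 1]
        if low < threshold <= high:
            fraction = (threshold - low) / (high - low)
            return vots[i] + fraction * (vots[i + 1] - vots[i])
    return None


boundary = estimate_boundary(vot_ms, percent_pa)
print(f"Estimated categorical boundary: {boundary:.1f} ms")
```

For the example graph this lands between 20 and 30 milliseconds, as expected. You could replace the two lists with your own results to get a rough estimate of your own boundary, keeping in mind that with only a few VOT steps the estimate will be coarse.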

Based on your results, where do you think your categorical boundary between "ba" and "pa" is?

I claimed that these results prove that we aren't sensitive to the small differences between different voice onset times, but we instead lump sounds into categories.

Do you agree with this claim, or do you have any criticisms? Are there any reasons you think the results of the test might not prove that? Do you think there are any problems or limitations with the kind of experiment we did?

When you have finished these activities, continue to the next section of the module: "Sounds in a bigger context".


by Stephen Politzer-Ahles. Last modified on 2021-07-13. CC-BY-4.0.