Make your word list (6 hours)

To carry out a priming experiment, you will need a lot of words to use as targets and primes.

A person's response to any one word may be fast or slow for many reasons. Maybe the moment the target appears on the screen the person is blinking, or sneezing, or thinking about what they had for lunch, or distracted by a noise they heard outside. Therefore, if we just had a person see one target with a related prime and one target with an unrelated prime, our results would not be very reliable; there are many reasons that person might respond faster to one target than another.

To have reliable results, therefore, each person needs to see many targets. If they see many targets with related primes and many targets with unrelated primes, then even if they were distracted when seeing one or two of those targets, hopefully the overall result will still be fairly stable.

How many words will be needed in your experiment? The rule I am giving you for this project is: Each person must respond to at least 20 targets in each condition they see. In practice that means you will need to make a lot more than 20 targets. Figuring out how many words you need is more complicated than you might expect. Let's see why. Below we will consider several factors that influence how many words will be needed in the experiment.

Repeating or not repeating targets?

When we start to design a priming experiment, we immediately face a question: Should one person see the same target more than once?

Imagine we have a target BEE, which could be preceded by a related prime (BUTTERFLY) or by an unrelated prime (EDIT). How can we design this experiment?

Option A: Repetition

Here's one option. We can let each participant see every target in both conditions. That means that when a person participates in our experiment, at some point they will see BUTTERFLY...BEE, and later on they will also see EDIT...BEE. (Or it could be in the other order; maybe they will first see EDIT...BEE, and later they will see BUTTERFLY...BEE.) This would be useful, because then we can compare how fast they responded to BEE in the related condition, vs. how fast they responded to the exact same word in the unrelated condition.

There's a problem, though. The second time they see BEE, it might already be kind of activated, because they also saw BEE earlier. The first time they see BEE it will get activated, and some of that activation might still remain when they see BEE again later. (Think back to the task you did at the end of the Priming module... what I'm describing now is long-lag identity priming. Seeing the same word a second time allows people to respond to it faster [that's identity priming], even if the first time they saw it was several minutes ago [that's the long-lag aspect].)

Normally, we would expect that someone will respond fast to BUTTERFLY...BEE, and respond slowly to EDIT...BEE. But what if they see BUTTERFLY...BEE first, and EDIT...BEE second? Then, when they see EDIT...BEE they're already seeing the same word for the second time, which might make them respond fast!

This is a major problem. We might not be able to study the difference between the related and unrelated conditions, because it might be overshadowed by a difference between the first time seeing a word vs. the second time seeing the same word.

So what else could we do?

Option B: Using different targets

We have seen, therefore, that letting a person see the same target twice in the experiment is not a good idea. So if we want to compare how fast people respond to related vs. unrelated trials, we should make sure they are using different targets.

So how about if we design our experiment something like this:

Related: BUTTERFLY...BEE
Unrelated: EDIT...LIBERTARIAN

How does that look? Do you think there are any problems with this? Brainstorm a little before you scroll down to see what I say...

There is a huge problem here. It would be a terrible idea to compare how fast people respond to BEE and how fast they respond to LIBERTARIAN; there are so many differences between these words! BEE is a very short word and LIBERTARIAN is a very long one. BEE is a concrete word (a bee is a solid thing, which you can touch) and LIBERTARIAN is an abstract word (it refers to kind of an abstract idea). BEE is a fairly common word, and LIBERTARIAN is not a very common word (well, maybe it is common for some people who write about politics all the time). People might have different personal reactions to them: for example, maybe someone who got stung by a bee before is scared of bees now; or maybe someone who's a socialist has strong moral disagreements with libertarians and feels annoyed when seeing the word LIBERTARIAN. These are just the first few examples I could think of; if you brainstorm, you can probably think of many other things that are different about these words and that might cause people to react to them differently.

Imagine I do an experiment and I find that people respond faster to BUTTERFLY...BEE than they respond to EDIT...LIBERTARIAN. And imagine I then write my capstone project claiming "I have just proved that people respond to BUTTERFLY...BEE faster because BUTTERFLY and BEE are related!" Anyone reading my paper would think, "Maybe you just respond faster because BEE is such a short word! Or such a common word!" etc.

So, it is not a good idea to try to compare reaction times between different words. When the words are different, there are a million things that could cause people to react faster or slower. You could try to control all possible differences (e.g., make sure your 'related' and 'unrelated' targets are each exactly 5 letters long, each exactly the same number of syllables, each equally common, etc.). But if you actually try doing this, you will quickly find it's almost impossible to control everything.

What, then, should we do? We have found that letting a participant see the same word twice is not a good way to design the research. But we have also found that letting a participant see different words is also not a good way to design the research. We're pretty much fucked.

The Latin square design

Ultimately, there is no perfect solution to the problems we described above; they are unavoidable problems, so you should always be aware of them when you do research.

But there is a pretty-good solution. It's not completely perfect, but it gives us a pretty good balance between the two problems we discussed. That solution is the Latin square design.

In a Latin square research design, every participant in the experiment sees every target word, but different participants see the same target word in different conditions. Let me show you an example of what that means.

First, let's imagine I have this small set of words for the experiment:

Target: BEE
- Related prime: BUTTERFLY
- Unrelated prime: EDIT
Target: CAT
- Related prime: DOG
- Unrelated prime: TABLE
Target: HOSPITAL
- Related prime: DOCTOR
- Unrelated prime: STATUE
Target: MATH
- Related prime: SCIENCE
- Unrelated prime: JUMP

Now a person is going to participate in our experiment; let's call that person Participant A. As we discussed, we should not let Participant A see both EDIT...BEE and BUTTERFLY...BEE, because seeing the same word twice will influence their results. We could let Participant A see both BEE and CAT with related primes (BUTTERFLY and DOG), and see both HOSPITAL and MATH with unrelated primes (STATUE and JUMP), but that will also not be good, because there are lots of other reasons that people may respond faster to BEE/CAT than they respond to HOSPITAL/MATH.

In fact, when we're thinking only about Participant A, we cannot solve this problem. But remember, more than one participant will join your experiment. There won't only be a Participant A; there will also be a Participant B (and C, and D, etc...). So here's what we're going to do:

Participant A will see BUTTERFLY...BEE (related), TABLE...CAT (unrelated), DOCTOR...HOSPITAL (related), and JUMP...MATH (unrelated).
Participant B will see EDIT...BEE (unrelated), DOG...CAT (related), STATUE...HOSPITAL (unrelated), and SCIENCE...MATH (related).

Here is another way you can visualize this:

Target	Participant A	Participant B
BEE	BUTTERFLY	EDIT
CAT	TABLE	DOG
HOSPITAL	DOCTOR	STATUE
MATH	JUMP	SCIENCE

In this table, each row shows one target, and each column shows one participant. The rest of the table shows which prime will go with that target, for that participant. For example, it shows us that Participant A will see BEE with the prime BUTTERFLY, whereas Participant B will see BEE with the prime EDIT.

By doing this, we have made sure that Participant A never sees the same target twice, and neither does Participant B. That solves the first problem. We have also made sure that we can compare reaction times for the exact same word in different conditions; we can examine how fast people respond to BEE in the related condition (from Participant A) vs. how fast they respond to BEE in the unrelated condition (from Participant B). It's not perfect; maybe Participant A is overall a very fast person (maybe they play a lot of video games or something) and Participant B is overall very slow (maybe they like to think carefully before responding, to avoid getting any wrong answers). That's why an experiment needs a lot of participants, so we can hopefully make sure our results aren't being influenced too much by one or two extreme people.

What I have just described is a Latin square design.

We can generalize this a bit more. Re-making that same table, I can label the columns as "lists" (you can also think of them as "versions"), rather than participants:

Target	List 1	List 2
BEE	BUTTERFLY	EDIT
CAT	TABLE	DOG
HOSPITAL	DOCTOR	STATUE
MATH	JUMP	SCIENCE

This experiment, then, has two "versions" (two "lists"). Half of the volunteers (maybe Participant A, Participant C, Participant E, etc.) will do List 1 (Version 1). The other half of the volunteers (maybe Participant B, Participant D, Participant F, etc.) will do List 2 (Version 2).
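If you like to see things as code, here is a minimal sketch of how the two lists above could be generated by rotating conditions across targets. It is only an illustration, not how you must build your lists: the names STIMULI, CONDITIONS, and make_lists are hypothetical names I made up for this sketch, and it assumes the targets are kept in the same order as the table.

```python
# Example stimuli from the table above: each target has one prime per condition.
STIMULI = {
    "BEE": {"related": "BUTTERFLY", "unrelated": "EDIT"},
    "CAT": {"related": "DOG", "unrelated": "TABLE"},
    "HOSPITAL": {"related": "DOCTOR", "unrelated": "STATUE"},
    "MATH": {"related": "SCIENCE", "unrelated": "JUMP"},
}
CONDITIONS = ["related", "unrelated"]


def make_lists(stimuli, conditions):
    """Build one list per condition; each list shows every target exactly once.

    The condition of a target is shifted by one position from list to list,
    so across all lists each target appears once in every condition.
    """
    n_lists = len(conditions)
    lists = []
    for list_index in range(n_lists):
        trials = []
        for target_index, (target, primes) in enumerate(stimuli.items()):
            condition = conditions[(target_index + list_index) % n_lists]
            trials.append((primes[condition], target, condition))
        lists.append(trials)
    return lists


lists = make_lists(STIMULI, CONDITIONS)
for number, trials in enumerate(lists, start=1):
    print(f"List {number}:")
    for prime, target, condition in trials:
        print(f"  {prime}...{target} ({condition})")
```

With the four example targets, the first list comes out the same as List 1 in the table and the second the same as List 2. In a real experiment you would also shuffle the order of trials within each list; this sketch leaves that out.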

In general, if you use a Latin square design, then however many conditions your experiment has, you need that many lists. For example, imagine you have not just related and unrelated, but you have different kinds of related trials. Maybe your experiment includes highly related, somewhat related, and unrelated. In that case, you would need three lists (three versions); if you are curious why, you can try thinking of stimuli and making a table like the one above, to see why you need three lists to make sure that every target gets seen in every condition. Think about the Zhou & Marslen-Wilson paper we learned about before, which had four conditions for every target. How many lists would that experiment need in a Latin square design?
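Here is a rough sketch of the three-condition case, in case it helps you check your own table. It only tracks which condition each target gets in each list (it does not contain any real stimuli), and the function names condition_table and list_for_participant are hypothetical names invented for this illustration. It also assumes participants are simply assigned to lists in rotation, which is one possible way to do it.

```python
def condition_table(n_targets, conditions):
    """Return a table where row t, column l is the condition of target t in list l."""
    n_lists = len(conditions)  # a Latin square needs one list per condition
    return [
        [conditions[(target + lst) % n_lists] for lst in range(n_lists)]
        for target in range(n_targets)
    ]


def list_for_participant(participant_index, n_lists):
    """Assign participants to lists in rotation (participant 0 gets List 1, etc.)."""
    return participant_index % n_lists + 1


CONDITIONS = ["highly related", "somewhat related", "unrelated"]
table = condition_table(6, CONDITIONS)

# Print the table in the same layout as the one above: rows are targets, columns are lists.
print("\t" + "\t".join(f"List {i + 1}" for i in range(len(CONDITIONS))))
for number, row in enumerate(table, start=1):
    print(f"target {number}\t" + "\t".join(row))

# Check: every target is seen in every condition exactly once across the lists.
assert all(sorted(row) == sorted(CONDITIONS) for row in table)
```

If you tried this with only two lists and three conditions, the check at the end could not pass, because some target would never be seen in one of the conditions.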

Based on this information, and the decision you made before about what kind of priming experiment you want to do, you should be able to figure out how many lists your Latin square design will need.

Making the stimuli for your experiment

At the beginning of this section I gave you a rule: Each person must respond to at least 20 targets in each condition they see.

Look at the table we made above. In that table, there are 4 targets. Participant A responds to 2 targets that have related primes, and 2 targets that have unrelated primes. Likewise for Participant B.

This means that if you use the Latin square design, you will need more than 20 targets. If you have 2 conditions, and you want to make sure each participant sees 20 targets in each condition, that means you will need 40 targets, and each target will need two primes. If you have 4 conditions, you will need 80 targets, and each target will need 4 primes! Note that this is a lot more than the words in the example experiment we did in the "Priming" module; that was just a short demo. For this project, though, I want you to design a full experiment, which needs lots of words.
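The arithmetic above can be written as a small sketch. The function name stimuli_needed is a hypothetical name for this illustration, and the default of 20 is the minimum from the rule for this project. It counts only the targets and primes for the real conditions, not fillers or practice items.

```python
def stimuli_needed(n_conditions, targets_per_condition=20):
    """Count the targets and primes a Latin square design needs.

    Each participant sees every target once, split evenly across conditions,
    so the number of targets is conditions x targets per condition.
    Each target needs one prime for every condition.
    """
    n_targets = n_conditions * targets_per_condition
    return {
        "lists": n_conditions,
        "targets": n_targets,
        "primes_per_target": n_conditions,
        "total_primes": n_targets * n_conditions,
    }


for n_conditions in (2, 4):
    print(n_conditions, "conditions:", stimuli_needed(n_conditions))
```

For 2 conditions this gives 40 targets with 2 primes each, and for 4 conditions it gives 80 targets with 4 primes each, matching the numbers above.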

There are also a few more things you will need to think about, as discussed below.

Fillers

I assume most of you will do experiments using lexical decision (the task where the participant has to press "yes" or "no" to decide whether they think the stimulus they saw is a real word or not). What will the experiment be like if you make your 40 targets and primes and then stop?

If the experiment only includes those words, then every word in the experiment is a real word. The experiment would become very easy. A participant could just close their eyes and constantly press the "yes" button. They wouldn't need to do any thinking; they wouldn't even need to pay attention. This would not be a very good experiment.

Thus, if you want your participants to pay attention to the task, you probably want to include some extra word pairs where the target is not a real word. (I call these "fillers"; some people call these "foils".) You can review the experiment we did in the Priming module as an example.

Practice trials

It takes people a while to get used to a priming experiment. At the beginning, they might not even remember which button means "yes"/"word" and which button means "no"/"nonword".

This can be a big problem at the beginning of the experiment. Imagine that someone sees BEE, and after 350 milliseconds they recognize it and decide that it is a real word. But then they can't remember which button is which, and they spend another 2 seconds (2000 milliseconds) trying to remember the right button. The only data you will have recorded, eventually, is when they pressed the button: 2350 milliseconds after seeing the word. You will have no way of knowing if they pressed it that late because it took so long to recognize the word, or just because they were trying to remember which button.

That is one reason it's important to include practice words. I like to include a lot of practice trials, so people get time to get familiar with the buttons. Ideally, by the time they're done with the practice, they should be so familiar with the buttons that they don't even need to think consciously; they just see a word and automatically press the right button, or see a nonword and automatically press the left button. For these reasons, it's good to include a lot of practice word pairs (I often try to have at least 20).

Another benefit of having practice items is that it gives the participant a chance to make sure they understand the experiment and to ask you questions. Sometimes when they are reading the instructions they think they understand, but then when they actually see the practice they realize there is some confusion. Then they can ask you to clarify. It's better for that stuff to happen during the practice section, instead of during the real experiment (when it might be messing up your actual data).

Submitting your words

Based on the above information, you should be able to figure out how many stimuli you need. Make a complete list of words for your experiment. I find it easiest to organize these in a spreadsheet format, as shown in this file: stimuli_sample.xlsx

When you have a satisfactory experiment plan, continue to the next task: "Make a DMDX experiment".

by Stephen Politzer-Ahles. Last modified on 2021-07-12. CC-BY-4.0.