WordListsAnalytics: an R package for Property Listing Task data
A Shiny-based R package for analyzing data from Property Listing Tasks and Semantic Fluency Tasks. It estimates population parameters, computes the sample size needed for a target coverage, calculates agreement probability, and finds clusters and shifts.
The Property Listing Task is a method used in psychology to study how people understand concepts. The idea is simple: a person is given a concept and asked to list its properties, for example to write down the things that are true of a dog. One person might list that a dog has fur, that it barks, and that it is a pet. Collecting these lists from many people, for many concepts, lets researchers study the structure behind a concept: which properties define it, how often they come up, and how concepts relate to one another. This kind of data is widely used in cognitive psychology to test ideas about how concepts are represented in the mind.
The trouble is what happens when researchers analyze this data. The values they compute from a study, such as how many properties a concept has, are treated as if they described the whole population, when they are really just estimates from the particular sample of people who took part. This leads to a few problems: results are hard to generalize beyond the sample, the number of participants is often chosen without a clear justification, and a larger study is assumed to be more precise when that is not necessarily true. WordListsAnalytics was built to address these problems, by treating the values from a study as estimates and giving researchers the tools to work with them properly.
WordListsAnalytics is built as a Shiny application wrapped in an R package, so the whole thing runs as an interface in the browser rather than through code. After installing it, a single function call opens the app, with no arguments to set up.
install.packages("WordListsAnalytics")
library(WordListsAnalytics)
WordListsAnalytics()
There is also a hosted version of the app, which runs in the browser without installing anything. Either way, the interface is organized into seven tabs, each covering a different step: uploading and cleaning the data, estimating parameters, estimating the sample size needed, simulating data, preparing the inputs for the agreement probability, computing the agreement probability, and finding clusters and shifts. The rest of this entry goes through what each tab does.
Upload data. The first tab handles the data input. The input is a CSV file with three columns: the subject who listed the property, the concept they were given, and the property they listed. Each row is a single property, so a subject who lists five properties for a concept takes up five rows. The tab also includes an example dataset, CPN-27, that can be loaded with one click to try the package out. Once the data is loaded, four optional cleaning steps can be applied: converting everything to lowercase, removing repeated rows, removing punctuation, and removing spaces. The cleaned data can then be downloaded.
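For readers who want to see what the four cleaning steps amount to, here is a minimal base R sketch. It is not the package's code: the function clean_listings and the column names subject, concept, and property are hypothetical, and "removing spaces" is read here as removing all whitespace, which is an assumption.

```r
# Hypothetical helper for illustration; not part of the package's API.
# Assumes columns named subject, concept, and property.
clean_listings <- function(df, lowercase = TRUE, dedupe = TRUE,
                           strip_punct = TRUE, strip_spaces = TRUE) {
  stopifnot(all(c("subject", "concept", "property") %in% names(df)))
  for (col in c("concept", "property")) {
    x <- as.character(df[[col]])
    if (lowercase)    x <- tolower(x)
    if (strip_punct)  x <- gsub("[[:punct:]]", "", x)
    # Assumption: "removing spaces" means dropping all whitespace.
    if (strip_spaces) x <- gsub("[[:space:]]+", "", x)
    df[[col]] <- x
  }
  # Repeated rows are dropped after the text has been normalized.
  if (dedupe) df <- df[!duplicated(df), , drop = FALSE]
  rownames(df) <- NULL
  df
}

# One row per listed property, as described above.
raw <- data.frame(
  subject  = c(1, 1, 1, 2),
  concept  = c("Dog", "Dog", "dog", "dog"),
  property = c("Has fur!", "barks", "has fur", "is a pet"),
  stringsAsFactors = FALSE
)
cleaned <- clean_listings(raw)
```

In this toy input, "Has fur!" and "has fur" from the same subject collapse into a single row once the text is normalized, which is why the order of the steps matters.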
Estimated parameters. The second tab computes, for each concept, a set of statistics and parameter estimates. Among them are the number of unique properties actually observed in the data, and an estimate of the total number of unique properties the concept would have in the full population, along with its standard deviation and confidence interval. The key value here is coverage: the share of all the possible properties of a concept that the current data has managed to capture. A coverage of 0.70 means the study has collected about 70% of the properties that exist for that concept in the population. This single number tells a researcher how complete their data is, and it is what the next tab builds on.
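To make the idea of coverage concrete, the sketch below computes it for one concept as the observed number of unique properties divided by an estimate of the total. The estimator used here is Chao2, a standard incidence-based richness estimator; that choice is an assumption made for illustration, and the package's actual estimator may differ. The function name estimate_coverage and the column names are hypothetical.

```r
# Sketch only: Chao2 is an assumed estimator, chosen for illustration.
estimate_coverage <- function(df) {
  # Count each property at most once per subject (incidence data).
  inc <- unique(df[, c("subject", "property")])
  n_subjects <- length(unique(inc$subject))
  freq <- table(inc$property)
  s_obs <- length(freq)   # unique properties actually observed
  q1 <- sum(freq == 1)    # properties listed by exactly one subject
  q2 <- sum(freq == 2)    # properties listed by exactly two subjects
  k <- (n_subjects - 1) / n_subjects
  # The bias-corrected form is used when no property was listed exactly twice.
  unseen <- if (q2 > 0) k * q1^2 / (2 * q2) else k * q1 * (q1 - 1) / 2
  s_hat <- s_obs + unseen
  list(s_obs = s_obs, s_hat = s_hat, coverage = s_obs / s_hat)
}

# Toy data for a single concept: four subjects, five unique properties.
dog <- data.frame(
  subject  = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4),
  property = c("fur", "barks", "pet", "fur", "barks", "loyal",
               "fur", "pet", "fur", "tail"),
  stringsAsFactors = FALSE
)
est <- estimate_coverage(dog)
```

With these toy listings, two properties are mentioned by a single subject and two by exactly two subjects, so the estimated total is 5.75 and the coverage is 5 / 5.75, roughly 0.87.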
Sample size estimation. The third tab uses the coverage from the previous tab to estimate sample sizes. The researcher sets a target coverage with a slider, for example 0.80, and the tab estimates how many additional participants would be needed, for each concept, to reach it. It also estimates how many new unique properties those extra participants would be expected to contribute. This answers a question that was usually settled by convenience: how many people do I actually need? For some concepts the estimate can become unreliable, which happens when the number of additional participants needed comes out larger than twice the number who already listed properties. In that case the tab flags a warning and caps the estimate at twice the original sample instead of reporting the unreliable figure.
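The capping rule described above can be written in a few lines. This is a sketch of the rule as stated in the text, not the package's implementation; the function name cap_additional and its arguments are hypothetical.

```r
# Hypothetical helper: applies the "twice the original sample" cap.
# n_needed  = estimated additional participants for a concept
# n_current = participants who already listed properties for it
cap_additional <- function(n_needed, n_current) {
  limit <- 2 * n_current
  unreliable <- n_needed > limit
  if (unreliable) {
    warning("estimate exceeds twice the original sample; capped at ", limit)
  }
  list(additional = min(n_needed, limit), capped = unreliable)
}

ok     <- cap_additional(30, 40)                     # within the limit
capped <- suppressWarnings(cap_additional(120, 40))  # over the limit
```

With 40 original participants, an estimate of 30 additional participants is reported as is, while an estimate of 120 is flagged and replaced by 80.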
Data simulator. The fourth tab generates synthetic data for a concept. It estimates the frequency distribution of the properties listed for that concept and samples from it to produce new artificial listings. These distributions tend to have a long tail, with many properties mentioned by only one or two people, so the tab includes an option to add new unique properties that recreate that tail, labeled by numbers rather than words. This is useful for testing analyses or exploring how a concept’s data behaves under different assumptions.
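A rough sketch of this kind of simulation: estimate the frequency of each observed property, optionally append numbered tail properties, and sample from the result. The function simulate_listings is hypothetical, and giving each new tail property the same weight as a property seen once is an assumption of this sketch, not something the source specifies.

```r
# Hypothetical sketch of a frequency-based simulator.
simulate_listings <- function(properties, n_draws, n_new = 0) {
  freq <- table(properties)
  pool <- names(freq)
  weights <- as.numeric(freq)
  if (n_new > 0) {
    # New tail properties are labeled by numbers rather than words.
    # Assumption: each one is as likely as a property observed once.
    pool <- c(pool, as.character(seq_len(n_new)))
    weights <- c(weights, rep(1, n_new))
  }
  sample(pool, size = n_draws, replace = TRUE, prob = weights)
}

observed <- c("fur", "fur", "fur", "fur", "barks", "barks",
              "pet", "pet", "loyal", "tail")
set.seed(1)
sim <- simulate_listings(observed, n_draws = 50, n_new = 3)
```

The simulated vector contains the observed properties in roughly their original proportions, plus occasional entries "1", "2", or "3" standing in for properties that no participant happened to list.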
Inputs to calculate p(a), and p(a) calculation. The fifth and sixth tabs work together to compute the agreement probability, written p(a). This is the probability that a property taken at random from one average participant’s list also appears in the list of another average participant for the same concept. In other words, it measures how much people agree on the properties of a concept. When p(a) is computed between two different concepts instead, it reflects how semantically related those concepts are.
The fifth tab prepares the inputs for this calculation. For each concept it builds a table with the frequency of every listed property, together with a value the package calls s: the average number of properties that participants listed for that concept. This is a lowercase s, not to be confused with the uppercase S that denotes the number of unique properties of a concept. The sixth tab then computes p(a) from those inputs, through a simulation. By default it calculates the agreement of each concept with itself, but it can also compute it for a pair of different concepts. Three parameters, the number of repetitions, the number of iterations, and a moving average window, control how precise the result is, at the cost of longer computing time.
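The definition of p(a) lends itself to a simple Monte Carlo sketch for the case of a concept compared with itself. The version below is an illustration under stated assumptions, not the package's algorithm: each simulated participant's list is drawn without replacement from the property frequencies, s is rounded to a whole number, and the repetitions and moving average window are left out. The function name simulate_pa is hypothetical.

```r
# Hypothetical Monte Carlo sketch of the agreement probability p(a).
# freq: named vector of property frequencies for one concept
# s:    average number of properties listed per participant
simulate_pa <- function(freq, s, n_iter = 2000) {
  s <- max(1, round(s))   # assumption: list length is a whole number
  stopifnot(s <= length(freq))
  props <- names(freq)
  hits <- logical(n_iter)
  for (i in seq_len(n_iter)) {
    # Two simulated participants, each listing s distinct properties.
    list_a <- sample(props, size = s, prob = freq)
    list_b <- sample(props, size = s, prob = freq)
    # Take one property at random from the first list and check the second.
    picked <- list_a[sample.int(s, 1)]
    hits[i] <- picked %in% list_b
  }
  mean(hits)
}

set.seed(42)
# Degenerate check: with exactly s properties, everyone lists all of them.
pa_same <- simulate_pa(c(a = 5, b = 5, c = 5), s = 3, n_iter = 200)
# A long-tailed toy concept.
pa_dog <- simulate_pa(c(fur = 4, barks = 2, pet = 2, loyal = 1, tail = 1), s = 3)
```

The degenerate case returns exactly 1, since both lists always contain every property. More iterations narrow the sampling error of the estimate at the cost of longer run time, which is the trade-off the tab's parameters control.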
Clusters and shifts. The last tab works with a related method, the Semantic Fluency Task. Here participants are asked to name as many words as they can from a category. Two measures capture how they move through memory while doing this. A cluster is a run of semantically related words, such as dog, cat, and goldfish. A shift is a jump from one cluster to another, such as moving from pets to farm animals. Both are used to study semantic memory and executive functioning.
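Once each word has been assigned to a group, counting clusters and shifts is a matter of counting runs. The sketch below uses hand-assigned subcategories rather than the data-driven similarity the tab estimates, and it counts every run as a cluster, including runs of a single word. Both are simplifying assumptions, and the function name count_clusters_shifts is hypothetical.

```r
# Hypothetical sketch: count clusters and shifts in one participant's list.
# words:    words in the order they were produced
# category: named vector mapping each word to a subcategory (hand-assigned here)
count_clusters_shifts <- function(words, category) {
  labels <- unname(category[words])
  stopifnot(!anyNA(labels))   # every word needs a subcategory
  runs <- rle(labels)         # consecutive words from the same subcategory
  n_clusters <- length(runs$lengths)
  list(
    n_clusters = n_clusters,
    n_shifts = max(0, n_clusters - 1),   # one shift between adjacent clusters
    cluster_sizes = runs$lengths
  )
}

category <- c(dog = "pets", cat = "pets", goldfish = "pets",
              cow = "farm", pig = "farm", hamster = "pets")
words <- c("dog", "cat", "goldfish", "cow", "pig", "hamster")
res <- count_clusters_shifts(words, category)
```

For this list the participant produces three pets, moves to two farm animals, and then returns to pets, giving three clusters and two shifts.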
To find these clusters, the tab first estimates how similar the listed properties are, treating two properties as similar when participants tend to mention them close together. It then groups the properties with a partitional clustering algorithm, which forms clusters out of properties whose similarity is above a threshold. That threshold can be set by the researcher, or left to the tab, which picks the value that maximizes the number of clusters and shifts. The result is shown as a set of clusters for the chosen concept, along with the similarity scale used to build them.
Conclusion
WordListsAnalytics brings together a set of analyses for Property Listing Task and Semantic Fluency Task data that were previously hard to apply in practice. From a single interface, a researcher can estimate population parameters, compute the sample size needed to reach a target coverage, calculate agreement probability between concepts, and find clusters and shifts, without writing any code.
If you want to try it, the package is available on CRAN under the name WordListsAnalytics, and there is also a hosted version of the app that runs in the browser without installing anything. The full details are in the paper, and if you have any questions about the package you are welcome to reach out at cristobalheredia@usf.edu.