Obtaining labeled data is a significant obstacle for many NLP tasks. Recently, online games have been proposed as a new way of obtaining labeled data; games attract users by being fun to play. In this paper, we consider the application of this idea to collecting semantic relations between words, such as hypernym/hyponym relationships. We built three online games, inspired by the real-life games of Scattergories™ and Taboo™. As of June 2008, players have entered nearly 800,000 data instances, in two categories. The first type of data consists of category/answer pairs (“Types of vehicle”, “car”), while the second is essentially free association data (“submarine”, “underwater”). We analyze both types of data in detail and discuss potential uses of the data. We show that we can extract from our data set a significant number of new hypernym/hyponym pairs not already found in WordNet.
Introduction
One of the main difficulties in natural language processing is the lack of labeled data. Typically, obtaining labeled data requires hiring human annotators. Recently, building online games has been suggested as an alternative to hiring annotators. For example, von Ahn and Dabbish (2004) built the ESP Game, an online game in which players tag images with words that describe them. It is well known that large numbers of web users play online games; if a game is fun, there is a good chance that sufficiently many of them will play it. We have several objectives in this paper. The first is to discuss design decisions in building word games for collecting data, and the effects of these decisions. The second is to describe the word games that we implemented and the kinds of data they are designed to collect. As of June 2008, our games have been online for nearly a year, and have collected nearly 800,000 data instances. The third is to analyze the resulting data and demonstrate that the data collected from our games is potentially useful in linguistic applications. As an example application, we show that the data we have collected can be used to augment WordNet (Fellbaum, 1998) with a significant number of new hypernyms.
General Design Guidelines
Our primary goal is to produce a large amount of clean, useful data. Each of these three objectives (“large”, “clean”, and “useful”) has important implications for the design of our games. First, in order to collect large amounts of data, the game must be attractive to users. If the game is not fun, people will not play it. This requirement is perhaps the most significant factor to take into account when designing a game. For one thing, it tends to discourage extremely complicated labeling tasks, since these are more likely to be viewed as work. It would certainly be a challenge (although not necessarily impossible) to design a game that yields labeled parse data, for example. In this paper, we assume that if people play a game in real life, there is a good chance they ...