Economists, investors, and journalists avidly follow monthly government data releases on economic conditions. However, these reports are only available with a lag: the data for a given month is generally released about halfway through the next month and is typically revised several months later. Google Trends provides daily and weekly reports on the volume of queries related to various industries. We hypothesize that this query data may be correlated with the current level of economic activity in given industries and thus may be helpful in predicting the subsequent data releases. We are not claiming that Google Trends data helps predict the future. Rather, we are claiming that Google Trends may help in predicting the present. For example, the volume of queries on a particular brand of automobile during the second week in June may be helpful in predicting the June sales report for that brand, when it is released in July.
Our goals in this report are to familiarize readers with Google Trends data, illustrate some simple forecasting methods that use this data, and encourage readers to undertake their own analyses. It is certainly possible to build more sophisticated forecasting models than those we describe here. However, we believe that the models we describe can serve as baselines that help analysts get started with their own modeling efforts and that can subsequently be refined for specific applications (John, 2009).
Data
Google Trends provides an index of the volume of Google queries by geographic location and category. Google Trends data does not report the raw level of queries for a given search term. Rather, it reports a query index. The query index starts with the query share: the total query volume for a search term in a given geographic region divided by the total number of queries in that region at a point in time. The query share numbers are then normalized so that they start at 0 on January 1, 2004. Numbers at later dates indicate the percentage deviation from the query share on January 1, 2004. This query index data is available at the country and state level for the United States and several other countries. Figure 1.1 depicts an example from Google Trends for the query [coupon]. Note that the query share for [coupon] increases during the holiday shopping season and the summer vacation season. There has been a small increase in the query index for [coupon] over time and a significant increase in 2008, which is likely due to the economic downturn. Google classifies search queries into 27 categories at the top level and 241 categories at the second level using an automated classification engine. Queries are assigned to particular categories using natural language processing methods.
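To make the definition of the query index concrete, the following Python sketch computes it from raw counts under a literal reading of the description above: query share is term volume divided by total volume, and the index is the percentage deviation of that share from its value on the base date. The function name `query_index`, its arguments, and all of the counts are hypothetical and chosen only for illustration; the actual Google Trends pipeline may differ in details not described here.

```python
from datetime import date


def query_index(term_counts, total_counts, base_date):
    """Percentage deviation of query share from its value on base_date.

    term_counts and total_counts map a date to a query count for one
    region. Both dictionaries are hypothetical inputs for illustration.
    """
    # Query share: volume for the term divided by total query volume.
    shares = {d: term_counts[d] / total_counts[d] for d in term_counts}
    base_share = shares[base_date]
    # Normalize so the index is 0 on the base date; later values are
    # percentage deviations from the base-date share.
    return {d: 100.0 * (s - base_share) / base_share for d, s in shares.items()}


# Made-up counts for a single search term in a single region.
base = date(2004, 1, 1)
term_counts = {
    base: 50,
    date(2004, 1, 8): 66,
    date(2004, 1, 15): 45,
}
total_counts = {
    base: 10_000,
    date(2004, 1, 8): 12_000,
    date(2004, 1, 15): 10_000,
}

index = query_index(term_counts, total_counts, base)
for day, value in sorted(index.items()):
    print(day.isoformat(), round(value, 1))
```

With these made-up counts, raw volume for the term rises from 50 to 66 in the second week, but total volume rises as well, so the index moves up by only about 10 percent. This illustrates why the index tracks the share of queries rather than their raw level.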
Google Trends lets users enter search terms such as “Myspace”, “2008 election”, or “Linux” and see how the popularity of those terms changes over time. The resulting graphs can be quite interesting - spikes ...