500 billion words: visual stats give us cultural insights
By Murray Bourne, 22 Dec 2010
Google has scanned over 15 million books, and they've released a subset containing 500 billion words from 5.2 million books. The best thing is they have made the whole database freely available.
You can search the Google Books Ngram Viewer to see how words have changed in popularity through the years.
The graph shows how the word "men" dominated books up until the feminist 1970s, when the word "women" started to take over. By 1985, you can see "women" were on top. [Image source]
However, when you compare "man" and "woman", the fairer sex still has a long way to go before dominating mentions in books.
A tale of 3 empires
Here is a comparison of the words "America", "England" and "China" as used in English-language books throughout the 20th century. It shows how England's influence waned, how America's ascended (especially during the war years) and how China's has remained fairly constant since the 1940s.
Notice that searches on the Ngram Viewer are case sensitive. I originally (mistakenly) compared "china", "america" and "england" (all lower-case) and of course, "china" was at the top by a significant margin, since this refers to the pottery.
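To see why case matters so much, here is a minimal Python sketch of case-sensitive versus case-folded word counting. It is not how Google's Ngram Viewer works internally; the `count_words` function and the sample sentence are made up purely for illustration.

```python
import re
from collections import Counter


def count_words(text, case_sensitive=True):
    """Count each word in text (hypothetical helper, for illustration only)."""
    if not case_sensitive:
        # Fold everything to lower case, so "China" and "china" merge.
        text = text.lower()
    # Treat any run of ASCII letters as a word.
    return Counter(re.findall(r"[A-Za-z]+", text))


# Made-up sample: the country (capitalised) and the pottery (lower-case).
sample = "China exports fine china. The china cabinet was made in China."

exact = count_words(sample)                          # case-sensitive counts
folded = count_words(sample, case_sensitive=False)   # case-folded counts

print(exact["China"], exact["china"], folded["china"])
```

With case-sensitive counting, the country and the pottery are tallied separately. Once case is folded they become one term, which is the kind of mix-up that my lower-case search ran into.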
While these searches are really interesting, we need to ask:
- Which books are included, and which are not included yet?
- Which books are not included due to copyright issues?
- The data doesn't appear that useful for the noughties (2000 to 2010). Many terms seem to decrease in importance, or increase inexplicably.
Others to try
- global warming
- man,woman ("woman" has still got a way to go to catch up, but note how dramatically "man" has dropped)