7. Graphs on Logarithmic and Semi-Logarithmic Paper
by M. Bourne
In a semilogarithmic graph, one axis has a logarithmic scale and the other axis has a linear scale. You can see some examples of semi-logarithmic graphs in this YouTube Traffic Rank graph and in this article on loudspeakers (external site). See also air pressure and Zipf Distributions later on this page.
In log-log graphs, both axes have a logarithmic scale.
We use semilog or log-log graph paper so that we can see the details for small values of y as clearly as those for large values of y.
Example 1: Variable Exponent
Plot the graph of `y = 5^x` on normal and then semilogarithmic paper.
We can also graph `y = 5^x` on log-log paper (i.e. both axes use log scales)
NOTE: Both the domain (x-values) and the range (y-values) must be POSITIVE, because the logarithm of zero or of a negative number is undefined.
We can see even more detail for small values of x and y now.
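To see why the semilogarithmic graph of `y = 5^x` is a straight line, note that taking logs (base 10) gives `log y = x log 5`, which is linear in x. The following is a minimal Python sketch of that check. The function name `semilog_heights` is my own, chosen for illustration only.

```python
import math

def semilog_heights(base, xs):
    """Return log10(base**x) for each x.

    This is the vertical position of each point on semilog paper,
    measured in decades (powers of 10).
    """
    return [math.log10(base ** x) for x in xs]

xs = [0, 1, 2, 3, 4]
heights = semilog_heights(5, xs)

# Differences between consecutive heights. Equal steps for equal
# steps in x mean the points lie on a straight line.
steps = [b - a for a, b in zip(heights, heights[1:])]

for x, h in zip(xs, heights):
    print(f"x = {x}, y = {5 ** x}, log10(y) = {h:.4f}")
print("step per unit of x:", [round(s, 4) for s in steps])
```

Each step equals `log 5`, so on semilog paper the curve rises by the same distance for every unit increase in x.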
Example 2: Variable Raised to a Fractional Exponent
Graph `y = x^(1/2)` using all 3 axis types: rectangular, semi-log and log-log. This function is equivalent to `y=sqrt(x)`.
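On log-log axes we expect a power function to appear as a straight line whose slope is the exponent, since `log y = (1/2) log x`. Here is a small Python sketch that checks the slope between a few pairs of points. The helper `loglog_slope` and the sample points are my own choices for illustration.

```python
import math

def loglog_slope(f, x1, x2):
    """Slope of the segment joining (x1, f(x1)) and (x2, f(x2))
    when both axes use a log (base 10) scale."""
    rise = math.log10(f(x2)) - math.log10(f(x1))
    run = math.log10(x2) - math.log10(x1)
    return rise / run

# Sample intervals spread over several decades of x.
intervals = [(0.01, 1), (1, 100), (100, 10000)]
slopes = [loglog_slope(math.sqrt, a, b) for a, b in intervals]

for (a, b), s in zip(intervals, slopes):
    print(f"from x = {a} to x = {b}: slope = {s:.4f}")
```

The slope is the same over every interval, which is what a straight line on the log-log graph means.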
Application 1: Air pressure
1. By pumping, the air pressure in a tank is reduced by 18% each second. So the percentage of air pressure remaining is given by `p = 100(0.82)^t`.
Plot p against t for 0 < t < 30 s on
(a) a rectangular co-ordinate system
(b) a semilogarithmic system.
Try it on paper first, and then see what you get using the LiveMath example above.
The answer is given below.
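If you want numbers to plot, the following Python sketch tabulates p every 5 seconds, together with `log p`, which is the quantity that determines the height of each point on the semilogarithmic graph in part (b). The function name `pressure_remaining` and the 5-second spacing are my own choices.

```python
import math

def pressure_remaining(t):
    """Percentage of air pressure remaining after t seconds,
    using p = 100(0.82)^t."""
    return 100 * 0.82 ** t

times = list(range(0, 31, 5))
for t in times:
    p = pressure_remaining(t)
    print(f"t = {t:2d} s   p = {p:8.4f} %   log10(p) = {math.log10(p):7.4f}")

# Change in log10(p) over each 5-second interval.
log_steps = [
    math.log10(pressure_remaining(b)) - math.log10(pressure_remaining(a))
    for a, b in zip(times, times[1:])
]
```

The values of p fall very quickly on a rectangular grid, while `log p` drops by the same amount in every 5-second interval. That is why the semilogarithmic graph is a straight line sloping downward.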
Application 2: Zipf Distributions
Consider the most common words in English. It turns out that there is a relationship between the rank of a word's occurrence and the frequency of its use. That relationship was observed by George Kingsley Zipf in the first half of the 20th century.
The Zipf Distribution is an observation comparing rank and frequency of word occurrences. In general, the word with rank k has a frequency roughly proportional to `1/k`. In other words, the second most commonly used word occurs about `1/2` as often as the most common word. Likewise, the 3rd most common word occurs about `1/3` as often as the most common word.
Zipf Distributions occur naturally in many situations, for example in:
- Calls to computer operating systems
- Colors in images (the basis of most approaches to image compression)
- City populations (a small number of large cities, a larger number of smaller cities)
- Wealth distribution (a small number of people have large amounts of money, large numbers of people have small amounts of money)
- Company size distribution
- Artificial intelligence: "chat bots" that chat with humans rely on the limited number of questions and statements that people actually write in chats (see ALICE)
a. Common English Words
Zipf originally developed his law in response to the observation that the frequency of words was inversely proportional to the rank of each word.
For example, the most common 20 words in English are listed in the following table. The table is based on the Brown Corpus, a careful study of a million words from a wide variety of sources including newspapers, books, magazines, fiction, government documents, comedy and academic publications.
The most common word, "the", occurred around `70,000` times (or `7%` of the million words counted). The next ranked word, "of", occurred around `3.6%` of the time (or about `1/2` as often as the top-ranked word). The third most popular word was "and", with a frequency of `2.8%`, or roughly `1/3` of the frequency of the top-ranked word.
|Rank||Word||Frequency||% Frequency||Theoretical Zipf Distribution|
(The 20 most common words in the Brown Corpus, published in 1967. This Corpus is the count of how often one million words were used in a variety of books, newspapers and other publications. [Table source no longer available, but similar to Corpus of Contemporary American English.]
I have included the "Theoretical Zipf Distribution", based on the n-th ranked word occurring approximately `1/n` times as often as the highest ranked word. This gives us a hyperbola, a curve that we met before.)
Let's plot what we have observed:
The dark blue data points represent the top 20 occurring English words (with the first few labeled). The pink line is the theoretical Zipf distribution, which is found to be `f/n^0.94`, where f is the frequency of the top-ranked word and n is the rank of the word.
`f/1^0.94 = 69970`
`f/2^0.94 = 69970/2^0.94 = 36470`
`f/3^0.94 = 69970/3^0.94 = 24912`
`f/4^0.94 = 69970/4^0.94 = 19009`
`f/5^0.94 = 69970/5^0.94 = 15412`
The power `0.94` comes from observing the best line of fit for the word frequencies. (I just did trial and error in Excel until I found the closest fit.)
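The theoretical values are easy to generate for any rank. Here is a minimal Python sketch using the formula `f/n^0.94` with `f = 69970`. The function name `zipf_frequency` is hypothetical, chosen for illustration.

```python
def zipf_frequency(top_frequency, rank, exponent):
    """Theoretical frequency of the word with the given rank,
    using the model top_frequency / rank**exponent."""
    return top_frequency / rank ** exponent

TOP_FREQUENCY = 69970   # occurrences of the top-ranked word, "the"
EXPONENT = 0.94         # power found by trial and error

theoretical = [zipf_frequency(TOP_FREQUENCY, n, EXPONENT) for n in range(1, 6)]

for n, value in enumerate(theoretical, start=1):
    print(f"rank {n}: {value:.1f}")
```

The printed values agree with the list above to within rounding. Changing `EXPONENT` to `1` gives the simple `1/n` version of the law.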
There is a fairly large gap in the pattern for the words "to", "a" and "in", but it settles down and is quite consistent after that.
We now plot the top 2000 English words and use a log-log scale (log of the rank for the horizontal axis and log of the frequency for the vertical axis). If a rank-frequency distribution gives us a straight line sloping downward on a log-log scale, then we can say that it is a Zipf Distribution.
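The reason for the straight line can be sketched in one step. Assume the frequency of the word with rank n follows the model `f/n^s` used earlier (with `s = 0.94` for the Brown Corpus data), and write F(n) for that frequency (this notation is mine). Taking logs of both sides gives:

```latex
\log F(n) = \log\!\left(\frac{f}{n^{s}}\right) = \log f - s \log n
```

So plotting `log F(n)` against `log n` gives a line with slope `-s` and vertical intercept `log f`.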
We see that there is a remarkably consistent result for the top 2000 most-used English words. For your information, the last few in the list of 2000 words are:
b. Websites and the Zipf Distribution
We also observe a Zipf Distribution when it comes to popularity of pages in Websites.
For example, out of the most recent 500,000 page views in Interactive Mathematics, the most commonly visited page is the homepage, with 27,855 views. The next most common page is the Algebra Introduction, with around 1/2 of the views. The 3rd ranked page has about 1/3 of the views of the most popular page.
|2||Basic Algebra Introduction||`15334`|
|3||Addition & Subtraction in Algebra||`7605`|
|4||Math Of Beauty||`5965`|
|5||Graphs of Sine and Cosine||`5749`|
|6||Volume of Solid of Revolution||`5667`|
|7||Trigonometric Graphs Introduction||`5584`|
|9||Introduction to Trigonometric Functions||`4701`|
For the top 500 pages in the site, we have the following log-log graph of the page views:
The theoretical Zipf Distribution (the pink line) is obtained as follows. The power used, 0.67, once again comes from observing the best line of fit.
`27855/2^0.67 = 17507`
`27855/3^0.67 = 13342`
`27855/4^0.67 = 11003`
`27855/5^0.67 = 9475`
`27855/6^0.67 = 8385`
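As an alternative to trial and error, the power can be estimated by fitting a least-squares line to the log-log data. Note that this is a swapped-in technique, not the method used for the pink line. The Python sketch below uses only the few pages listed in the table above (plus the homepage figure from the text), so it should not be expected to reproduce the `0.67` obtained from the top 500 pages. The function name `fit_zipf_exponent` is my own.

```python
import math

# (rank, views) pairs from the table above; the rank 1 value is the
# homepage figure quoted in the text. Only ranks listed on this page
# are included.
page_views = [(1, 27855), (2, 15334), (3, 7605), (4, 5965),
              (5, 5749), (6, 5667), (7, 5584), (9, 4701)]

def fit_zipf_exponent(data):
    """Least-squares slope of log(views) against log(rank), negated
    so that a Zipf-like decline gives a positive exponent."""
    xs = [math.log10(rank) for rank, _ in data]
    ys = [math.log10(views) for _, views in data]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

exponent = fit_zipf_exponent(page_views)
print(f"fitted exponent from these {len(page_views)} pages: {exponent:.2f}")
```

With so few points the top-ranked pages dominate the fit, so the result is only a rough guide. Running the same function over all 500 pages would be the fair comparison with the value used for the pink line.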
After the page ranked 200th, the pattern breaks down, but interestingly, from the 300th to the 500th page, there is still a consistent relationship between rank and frequency.
See also Zipf Distributions, log-log graphs and Site Statistics over in the IntMath blog.