Counting spam

In late 1998, I received my first piece of "unsolicited commercial e-mail", or spam. (I guess I was lucky not to have been spammed earlier: my primary e-mail address had already been on a web page for several years.) Since then I've stored all the spam I have received.


This graph shows, for every date, the cumulative number of spam e-mails I had received up to then (red line). The horizontal axis (time) is linear, while the vertical axis (number of spam) is logarithmic. The resulting line is more or less straight, which means the amount of spam is growing exponentially with time.
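
A straight line on a logarithmic vertical axis means log(count) = a + b*t, so the growth factor per year is exp(b). Here is a minimal sketch of how such a factor could be estimated with a least-squares fit. The function name and the data are made up for illustration (they are not my real spam counts), and this is not necessarily how the lines in the graph were drawn.

```python
import math

def annual_growth_factor(times_in_years, cumulative_counts):
    """Fit log(count) = a + b*t by least squares and return exp(b)."""
    logs = [math.log(c) for c in cumulative_counts]
    n = len(times_in_years)
    mean_t = sum(times_in_years) / n
    mean_y = sum(logs) / n
    # Slope of the regression line through the (t, log count) points.
    num = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(times_in_years, logs))
    den = sum((t - mean_t) ** 2 for t in times_in_years)
    return math.exp(num / den)

# Synthetic data (an assumption): 10 spams at t = 0, growing 2.65x per year.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
counts = [10 * 2.65 ** t for t in times]

factor = annual_growth_factor(times, counts)
print(f"growth factor per year: {factor:.2f}")
```

On real data the points will scatter around the line, and the fitted factor depends on which period is included in the fit.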

Both green lines represent growth by a factor of 2.65 per year. Since much of the red line fits nicely between these green lines, the growth rate of the spam was apparently about a factor of 2.65 per year between 2000 and 2003. This is worrying: Moore's Law says that hardware performance grows only by a factor of 2 every one and a half years, which is roughly a factor of 1.59 per year!
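
To put the two rates on the same footing, both can be converted to a factor per year and to a doubling time. The sketch below does just that arithmetic; the helper names are mine, chosen for illustration.

```python
import math

def per_year_factor(factor, period_years):
    """Convert 'a factor of `factor` every `period_years` years' to a yearly factor."""
    return factor ** (1.0 / period_years)

def doubling_time_months(annual_factor):
    """Months needed to double at the given growth factor per year."""
    return 12.0 * math.log(2) / math.log(annual_factor)

moore_per_year = per_year_factor(2.0, 1.5)        # Moore's Law as quoted in the text
spam_doubling = doubling_time_months(2.65)        # spam growth found in the graph
moore_doubling = doubling_time_months(moore_per_year)

print(f"Moore's Law per year: {moore_per_year:.2f}")       # 1.59
print(f"spam doubling time:   {spam_doubling:.1f} months")  # 8.5 months
print(f"Moore doubling time:  {moore_doubling:.1f} months") # 18.0 months
```

In other words, at a factor of 2.65 per year the spam doubles about every eight and a half months, roughly twice as fast as the hardware.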

Since April 2003 or so, the red line has been clearly above the upper green line (which I put there in December 2002). This suggests that, for a while, the rate of growth has been even higher than the factor of 2.65 per year found earlier.


By now, spam is (unfortunately) arriving in such copious amounts that a graph of the rate of spam reception has become feasible; such a graph is shown above. In this graph, the light green dots indicate the number of spams per individual day; the purple, blue and red lines represent the number of spams per day averaged over periods of a month, 100 days, and a year, respectively. As before, the horizontal (time) axis is linear, while the vertical axis is logarithmic.
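
The averaged lines can be thought of as moving averages of the daily counts. A minimal sketch follows, assuming a trailing window (the mean over the last N days ending at each day). Whether the graph uses a trailing or a centered window is not specified here, and the daily counts below are made up.

```python
def trailing_average(daily_counts, window):
    """Mean of the last `window` days ending at each day.

    For the first few days, fewer than `window` values are available,
    so the average is taken over the days seen so far.
    """
    averages = []
    running = 0
    for i, count in enumerate(daily_counts):
        running += count
        if i >= window:
            # Drop the day that just fell out of the window.
            running -= daily_counts[i - window]
        averages.append(running / min(i + 1, window))
    return averages

# Made-up daily counts; real use would take windows of about 30, 100 and 365 days.
daily = [0, 2, 1, 3, 0, 4, 2, 5]
smooth = trailing_average(daily, 3)
print(smooth)
```

Note that days with zero spam cannot be drawn on a logarithmic axis, whereas an average over a long enough window is positive as soon as any spam falls inside it.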

This graph confirms the increase in the spam growth rate during 2003, to about a factor of 6 per year. However, in 2004 things seem to have changed: not only did spam grow less quickly than in 2003, but the spam rate itself has decreased; I'm now getting significantly fewer spams per day than two years ago :-)

On March 1, 2006, our department's system administrators installed greylisting for incoming e-mail. In fact, most of the spam I still get comes in not through my regular e-mail address, but through a few other addresses that are (legitimately) forwarded to mine from (apparently whitelisted) relays. Any decrease in the spam rate caused by this around March 1 is hard to distinguish from the general downward trend.

