The study (.PDF), titled "Do Zebras get more Spam than Aardvarks?", analyzed traffic logs from the U.K. ISP Demon Internet covering the period from Feb. 1 to March 27, 2008.
In the study, Clayton noted that people whose email address has a local part (the portion to the left of the "@") beginning with "A" receive about 50% spam and 50% non-spam. Clayton called this group aardvarks. For those whose local part begins with "Z" (call them zebras), about 75% of their email is spam.
You're probably saying, eh? This makes no sense based on what was said earlier. Ah, but it does.
More of the zebra email is spam because so few real email addresses start with "Z". Much of the mail sent to "Z" addresses is spam aimed at addresses that do not exist, so legitimate mail makes up a smaller share of the total. If you only look at legitimate email addresses, the picture changes: 20% of email addressed to zebras is spam, compared with 35% of email addressed to aardvarks.
Clayton's theory about the reason for this difference also makes sense:
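To see how both sets of percentages can hold at once, here is a minimal sketch of the arithmetic. The message counts are invented purely so that they reproduce the percentages quoted above; they are not taken from Clayton's data, and the `LetterTraffic` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LetterTraffic:
    """Made-up message counts for addresses starting with one letter."""
    ham_to_real: int           # legitimate mail (only real addresses get any)
    spam_to_real: int          # spam delivered to addresses that exist
    spam_to_nonexistent: int   # spam aimed at addresses that do not exist

    def spam_share_all(self) -> float:
        """Spam as a fraction of all mail sent to this letter."""
        spam = self.spam_to_real + self.spam_to_nonexistent
        return spam / (spam + self.ham_to_real)

    def spam_share_real(self) -> float:
        """Spam as a fraction of mail sent to real addresses only."""
        return self.spam_to_real / (self.spam_to_real + self.ham_to_real)


# Illustrative numbers only, chosen to match the article's percentages.
aardvarks = LetterTraffic(ham_to_real=650, spam_to_real=350, spam_to_nonexistent=300)
zebras = LetterTraffic(ham_to_real=80, spam_to_real=20, spam_to_nonexistent=220)

for name, traffic in [("aardvarks", aardvarks), ("zebras", zebras)]:
    print(
        f"{name}: {traffic.spam_share_all():.0%} spam overall, "
        f"{traffic.spam_share_real():.0%} spam at real addresses"
    )
```

In this toy setup the zebras look worse overall only because spam to non-existent "Z" addresses swamps the small amount of real zebra mail.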
At some point, it occurred to the spammers that if firstname.lastname@example.org was a valid email address then perhaps email@example.com was valid as well, so they started to combine local parts (to the left of the @) with other domain names. This method of creating email addresses to attempt delivery to is called a dictionary attack (or sometimes a Rumpelstiltskin attack).

In other words, with apologies to Zbigniew Brzezinski, there simply aren't that many Zbigniews around, so he is pretty safe.
It's not as simple as "A" vs. "Z," as shown in the graph above. Email addresses that start with a digit receive even fewer spam emails. Give you any ideas?
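The dictionary attack described above amounts to pairing every known local part with every target domain. The following is a rough sketch of why common names attract more of the resulting spam. All names, domains, and mailbox lists are hypothetical and exist only for illustration.

```python
from itertools import product

# Hypothetical local parts a spammer has seen working on some domain.
harvested_local_parts = ["alice", "andrew", "anna", "zbigniew"]

# Hypothetical mailboxes that really exist at two other domains.
real_mailboxes = {
    "example.net": {"alice", "andrew", "bob"},
    "example.org": {"anna", "alice", "carol"},
}


def dictionary_attack(local_parts, domains):
    """Yield every (local part, domain) pair as a candidate address."""
    yield from product(local_parts, domains)


# Count how many guesses land on a real mailbox, grouped by first letter.
hits_by_letter = {}
for local, domain in dictionary_attack(harvested_local_parts, real_mailboxes):
    if local in real_mailboxes[domain]:
        letter = local[0]
        hits_by_letter[letter] = hits_by_letter.get(letter, 0) + 1

print(hits_by_letter)
```

With this toy data every successful guess starts with "a", while the lone "zbigniew" guess never matches a real mailbox. That is the pattern Clayton's theory predicts.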
Perhaps aardvarks should consider changing species — or asking their favourite email filter designer to think about how this unexpected empirical result can be leveraged into blocking more of their unwanted email.