[Image: Scott Savage Akismet Statistics]

For some reason I seem to get a heap of spam on my blog. Ever since I first started blogging, spam has somehow been drawn to my site (and I don’t even mention V!agr4 that often!). At the time I took the screenshot to the right, 8,368 spam comments and trackbacks had been caught. That is a pretty ridiculous number. Akismet has managed to catch 99.764% of these, which is a testament to its effectiveness (and a major reason why I use WordPress). I sometimes wonder whether allowing a few of these spam comments (which usually link to link-heavy pages) would actually help my search engine ranking.

I found an SEOBook article that contained a lot of interesting findings that unintentionally supported my theory. Firstly, the biggest risk is that your blog will itself get tagged as spam. However, “A few bad inbound links are not going to put your site over the edge to where it is algorithmically tagged as spam”. In fact you can push this even further: “If you can get a few well known trusted links you can get away with having a large number of spammy links”.

The next step is to understand what kind of links spam comments and trackbacks provide. Again from the article: “Spammers either use a large number of low PageRank links, a few hard to get high PageRank links, or some combination of the two.” So how do you weed out the low PageRank links and seize the high PageRank ones? Well, if everyone is running the same Akismet filter (it takes resources to build a blacklist or heuristic filter, so how many can there be?), then perhaps the high PageRank comments are the ones that are missed by the most common filters?

Therefore, should I leave the Akismet filter on, but approve everything that gets through it even if it is spam? Or, if I wanted to be more scientific, should I analyse the PageRank of each link in a spam comment and accept the comments whose links have high PageRanks? Surely somewhere in these 8,000+ spam comments the spammers hit gold; the question is, how do I find it?
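For the "more scientific" option, here is a rough Python sketch of what I have in mind. It is only an illustration under a few assumptions: `score_for_url` is a hypothetical function you would have to supply yourself (I am not assuming any particular PageRank lookup service exists), the threshold of 5.0 is an arbitrary number, and I have chosen to judge a comment by its single best-scoring link, which is just one of several possible rules.

```python
import re
from typing import Callable, List

# Deliberately simple pattern for http/https URLs; it stops at whitespace,
# quotes and angle brackets, which is enough for typical comment markup.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_links(comment_text: str) -> List[str]:
    """Return every URL found in the comment body, in order of appearance."""
    return URL_RE.findall(comment_text)


def should_approve(
    comment_text: str,
    score_for_url: Callable[[str], float],  # hypothetical lookup, supplied by the caller
    threshold: float = 5.0,                 # arbitrary cut-off, purely an assumption
) -> bool:
    """Approve a comment if its best-scoring link meets the threshold."""
    links = extract_links(comment_text)
    if not links:
        # No links means nothing to gain from approving the comment.
        return False
    best_score = max(score_for_url(url) for url in links)
    return best_score >= threshold
```

To try it out you could pass in a stand-in lookup backed by a plain dictionary of made-up scores, for example `should_approve(text, lambda url: scores.get(url, 0.0))`, and swap in a real data source later if you find one.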