Caslon Analytics - Web Log profile
























splogs

This page considers 'spam blogs', aka splogs, which now account for much of the blogosphere.

It covers -

  • introduction - what are splogs
  • basis - what is their rationale and how are they created
  • figures - how many splogs and who is splogging?
  • a red tide? - are splogs poisoning the blogosphere?
  • responses - legal, management and technical responses
  • studies - literature about splogging

introduction

What is a 'spam blog', aka a splog?

Despite (or perhaps because of) some doctrinaire announcements there is disagreement about what constitutes a splog.

Most commentators characterise a splog as a blog that is generated by a machine (rather than by a human author) in order to -

  • increase the visibility (through a higher search engine ranking and through early recognition by search engines) of another site/page that a splogger is seeking to promote. Splogs typically feature multiple links to one or more such sites
  • garner revenue by featuring advertisements that attract a pay-per-click payment.

Some observers use a broader definition, including machine-generated 'spam' posts that appear on legitimate blogs which allow readers to comment on a post by the author. Those spam posts (aka comment spam) typically have nothing to do with the subject of the blog - ie they do not comment on or respond to the author's post - but instead feature links to another site. That linking is intended to raise the site's search ranking or to encourage an unwary third party to download malware from that site.

As the name suggests, a splog is the blog counterpart of spam. It often involves spammers. Like much spam it typically uses gibberish text or text scraped from another site to 'camouflage' the link/s.

Like spam it leverages some strengths and weaknesses of networked information technology -

  • software can be used to automatically generate text and create splogs (particularly in free services such as Blogger) or post spam comments on 'open' legitimate blogs
  • that text generation/publication can be on a very large scale and requires little investment by the splogger
  • search engines rely on cues rather than on an innate understanding of a blog's content and in practice thus encounter difficulty in discerning whether a blog/post is legitimate
  • search engines provide an incentive for splogging by giving higher rankings to sites/pages that feature in links from other sites or by graduating new sites/pages from the 'sandbox' on the basis that those sites are featured in such links
  • online advertising services (including services operated by major engines such as Google) provide an incentive through 'paid impressions'
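The link-based incentive noted above can be illustrated with a toy calculation. The sketch below is a simplified PageRank-style iteration, not the algorithm used by any particular engine; the function name, the damping value and the page names (blog_a, promoted_site, splog_0 and so on) are all hypothetical and chosen only for illustration. It shows how a batch of machine-generated blogs that link to one site can lift that site's score.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Toy link-based ranking: a simplified PageRank-style iteration.

    `links` maps each page name to the list of pages it links to.
    This is an illustrative assumption, not any real engine's method.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    score = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score, in equal shares, to the pages it links to.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * score[page] / n
                for other in pages:
                    new_score[other] += share
        score = new_score
    return score


# Hypothetical small web: two legitimate blogs and a site nobody links to.
honest_web = {
    "blog_a": ["blog_b"],
    "blog_b": ["blog_a"],
    "promoted_site": ["blog_a"],
}

# Hypothetical splogs: twenty generated blogs, each linking to the promoted site.
splogs = {f"splog_{i}": ["promoted_site"] for i in range(20)}

before = rank_pages(honest_web)
after = rank_pages({**honest_web, **splogs})

print(f"promoted_site score without splogs: {before['promoted_site']:.3f}")
print(f"promoted_site score with splogs:    {after['promoted_site']:.3f}")
```

In this toy model the promoted site more than doubles its share of the total score once the splogs are added, even though none of the splogs has any inbound links of its own. Real engines use many more signals, so the figures are indicative only.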

Jonathan Bailey thus tartly commented that

Google, for better or worse, has its hand in every aspect of spam blogging. It finances them through AdSense, hosts them through Blogger and directs traffic to them via the search engine.

Splogs subvert expectations about use of online media and rhetoric about the blogosphere as a community space. They also subvert expectations about the nature of online advertising, reflected in Andrew Keen's comment that "the message is dead: Web 2.0 is reducing all marketing to spam".

basis

Contrary to some of the more irate claims by blog zealots, splogging is traditional and was inevitable.

It is a descendant of the spamdexing that plagued early web search engines (and that led most engines to abandon reliance on metadata), as website operators sought to gain top rankings in lists of search results and to gain the benefits from increased traffic.

It is also an heir of 15 years of spam, with some marketers realising that sufficient consumers would respond to unsolicited offers of pharmaceuticals or other products to make bulk commercial email worthwhile and that flooding the net with such messages is a trivial task. Those spammers have also gained experience in appropriating names (forgery of email addresses is discussed elsewhere on this site) and in scraping text from legitimate web sites in an effort to evade filtering of their messages.

Splogging reflects perceptions that substantial numbers of people are viewing blogs and that search engines still respect links from blogs (a respect that is currently eroding, with the algorithms used by major engines likely to discount links from blogs).

It also reflects the 'blogging explosion' (large scale uptake of blogging inhibits manual supervision by the operators of blog services) and the emphasis that some blog services have placed on user friendliness. The features that make a free 'lowest common denominator' service such as Blogger attractive to neophytes are often features that can be exploited by sploggers.

Most splogs - like most spam - are thus generated automatically. Google for example reported in October 2005 that Blogger had been targeted in an apparently coordinated attempt to create some 13,000 splogs.

Finally, it reflects the ease with which a splogger can automatically copy legitimate content from someone's blog and reproduce that content (whether as a blog page or a comment posted on someone else's blog) with the addition of a link to the splogger's preferred site.

The nature of that link may be readily discernible, similar to pages created by many domainers as part of large-scale domain name monetisation. It may instead be disguised, because the splogger has replaced a legitimate link (one for example that points to this page) with an illegitimate one (a link that points to an adult content site or discount clothing site).

figures

How many splogs are in existence? Are they increasing?

As with many internet metrics there is disagreement about past figures, current activity and projections. It has been claimed that over 70% of blogs (and over 75% of overall posts) in Blogger are splogs. One US consultancy reported that an average of forty-four of the top hundred results on blog search engines as of early 2006 were spam blogs.

Technorati claimed that between 3,000 and 7,000 splogs were being created each day in mid 2007, after supposedly reaching a record level of around 11,000 per day in December 2006. Peers have suggested that between 5% and 20% of all blogs are splogs.

Splogfighter claims to have detected over a million splogs, including 265,000 attributed to one splogger.

Figures for the number of spam posts in legitimate blogs are uncertain.

a red tide?

One acquaintance likens splogs to the toxic red tide that periodically poisons some coral ecosystems.

Splogging pollutes the blogosphere because it adds substantial noise to search results in searches that are specific to blogs (and more broadly to whole-of-web searches that include blog services). If you conduct a blog search you are very likely to come across splogs. Some searches will result in lists where splogs are prominent, for example because individual pages have the requisite keyword density and links from other splogs or merely because sploggers have spawned a large number of fake blogs identifiable using the particular search term.
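Keyword density, mentioned above as one reason splogs surface in search results, can be made concrete with a short sketch. The function below is a minimal illustration that assumes density means the share of a page's words that match the search term; the function name and the sample text are hypothetical, and no engine's actual threshold or tokenisation is implied.

```python
import re


def keyword_density(text, keyword):
    """Return the fraction of words in `text` that equal `keyword`.

    Words are lower-cased runs of letters, digits and apostrophes.
    This simple definition is an assumption made for illustration.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# Hypothetical stuffed text of the kind a splog might carry.
stuffed = "cheap watches buy cheap watches online cheap watches"

# Hypothetical text from a legitimate post on a related subject.
genuine = "I repaired my grandfather's watch this weekend"

print(keyword_density(stuffed, "watches"))  # 3 of 8 words match
print(keyword_density(genuine, "watches"))  # no words match
```

A page padded with the search term scores far higher on this measure than ordinary prose, which is one way a machine-generated page can appear relevant to an engine that relies on such cues.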

Some opponents have argued that splogging is an attack on the information commons, with sploggers misusing free services such as Blogger (particularly services that allow AdSense) and avoiding subscription services. They have noted that concern about comment splogging discourages legitimate bloggers from allowing readers to post comments about the author's writing, a constraint that impoverishes the online community.

Critics have noted more subtle concerns, highlighting for example what some have called machine-based plagiarism or intellectual property infringement through sploggers scraping text from legitimate sites for use in their splogs and posts. The author of this page, for example, has not authorised sploggers to copy and republish (usually in a mangled version) text from this site.

studies

Legal issues are explored in the 2006 'Splog! Or How to Stop the Rise of a New Menace on the Internet' in 19(2) Harvard Journal of Law & Technology, 467-484 and 'Pollution in the Blogosphere: The Only Purpose of a New Form of Blog, Called a Splog, Is Fraud and Infringement' by Jeffrey Goldman & Eric German in 30 June 2007 Los Angeles Lawyer, 32-40.



version of July 2007
© Bruce Arnold | caslon analytics