This page considers 'spam blogs', aka splogs, which now
account for much of the blogosphere.
It covers -
- what are splogs
- what is their rationale and how are they created
- how many splogs and who is splogging?
- a red tide? - are splogs poisoning the blogosphere?
- legal, management and technical responses
- literature about splogging
what is a 'spam blog', aka a splog?
Despite (or perhaps because of) some doctrinaire announcements
there is disagreement about what constitutes a splog.
Most commentators characterise a splog as a blog that
is generated by a machine (rather than by a human author)
in order to -
- increase the visibility (through a higher search
  engine ranking and through early recognition by
  search engines) of another site/page that a splogger
  is seeking to promote. Splogs typically feature multiple
  links to one or more such sites
- garner revenue by featuring advertisements that attract
  a pay-per-click payment.
Some observers use a broader definition, including machine-generated
'spam' posts that appear on legitimate blogs which allow
readers to comment on a post by the author. Those spam
posts (aka comment spam) typically have nothing to do
with the subject of the blog - ie they do not comment
on or respond to the author's post - but instead feature
links to another site. That linking is intended to raise
the site's search ranking or to encourage an unwary third
party to download malware
from that site.
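The two traits described above (text unrelated to the author's post, plus links to another site) suggest a simple screening heuristic. The following is a minimal sketch in Python, not a description of any real filter: the function name `looks_like_comment_spam`, the short stop-word list and both thresholds are assumptions chosen for illustration.

```python
import re

# Hypothetical stop-word list; a real filter would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
              "is", "it", "for", "on", "this", "that"}

LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def content_words(text):
    """Return the set of lower-case words in text, minus links and stop words."""
    text = LINK_PATTERN.sub(" ", text)
    words = re.findall(r"[a-z']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def looks_like_comment_spam(post, comment, min_overlap=0.1, max_links=1):
    """Flag a comment that carries links and shares few words with the post.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    links = LINK_PATTERN.findall(comment)
    comment_words = content_words(comment)
    if not comment_words:
        # Nothing but links (or nothing at all): suspicious only if links exist.
        return len(links) > 0
    overlap = len(comment_words & content_words(post)) / len(comment_words)
    return len(links) > max_links or (len(links) > 0 and overlap < min_overlap)
```

A heuristic of this kind will misjudge some comments (a short legitimate comment with a link may share few words with the post), so it is better suited to queuing comments for review than to deleting them outright.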
As the name suggests, a splog is the blog counterpart
of spam. It often involves
spammers. Like much spam it typically uses gibberish text
or text scraped from another site to 'camouflage' the
Like spam it leverages some strengths and weaknesses of
networked information technology -
- software can be used to automatically generate text and create
  splogs (particularly in free services such as Blogger)
  or post spam comments on 'open' legitimate blogs
- text generation/publication can be on a very large scale
  and requires little investment by the splogger
- search engines rely on cues rather than on an innate understanding
  of a blog's content and in practice thus encounter difficulty
  in discerning whether a blog/post is legitimate
- search engines provide an incentive for splogging by giving
  higher rankings to sites/pages that feature in links
  from other sites or by graduating new sites/pages from
  the 'sandbox' on the basis that those sites are featured
  in such links
- advertising services (including services operated by
  major engines such as Google) provide an incentive through
  pay-per-click payments.
Jonathan Bailey thus tartly commented that
"Google, for better or worse, has its hand in every aspect of
spam blogging. It finances them through AdSense, hosts
them through Blogger and directs traffic to them via
the search engine."
Splogs subvert expectations about use of online media and rhetoric
about the blogosphere as a community space. They also
subvert expectations about the nature of online advertising,
reflected in Andrew Keen's comment that "the message
is dead: Web 2.0 is reducing
all marketing to spam".
Contrary to some of the more irate claims by blog zealots,
splogging is traditional and was inevitable.
It is a descendant of the spamdexing that plagued early
web search engines (and that led most engines to abandon
reliance on metadata),
as website operators sought to gain top rankings in lists
of search results and to gain the benefits from increased traffic.
It is also an heir of 15 years of spam, with some marketers
realising that sufficient consumers would respond to unsolicited
offers of pharmaceuticals or other products to make bulk
commercial email worthwhile and that flooding the net
with such messages is a trivial task. Those spammers have
also gained experience in appropriating names (forgery
of email addresses is discussed elsewhere
on this site) and in scraping text from legitimate web
sites in an effort to evade filtering of their messages.
Splogging reflects perceptions that substantial numbers
of people are viewing blogs and that search engines still
respect links from blogs (a respect that is currently
eroding, with the algorithms used by major engines likely
to discount links from blogs).
It also reflects the 'blogging explosion' (large scale
uptake of blogging inhibits manual supervision by the
operators of blog services) and the emphasis that some
blog services have placed on user friendliness. The features
that make a free 'lowest common denominator' service such
as Blogger attractive to neophytes are often features
that can be exploited by sploggers.
Most splogs - like most spam - are thus generated automatically.
Google for example reported in October 2005 that Blogger
had been targeted in an apparently coordinated attempt
to create some 13,000 splogs.
Finally, it reflects the ease with which a splogger can
automatically copy legitimate content from someone's blog
and reproduce that content (whether as a blog page or
a comment posted on someone else's blog) with the addition
of a link to the splogger's preferred site.
The nature of that link may be readily discernible, similar
to pages created by many domainers
as part of large-scale domain name monetisation. It may
instead be disguised, because the splogger has replaced
a legitimate link (one for example that points to this
page) with an illegitimate one (a link that points to an
adult content site or discount clothing site).
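The link substitution described above can be illustrated by comparing the link targets in an original passage with those in a scraped copy. This is a minimal sketch, assuming both versions are available as HTML strings and that the copy keeps the original anchor text. The names `LinkCollector`, `extract_links` and `replaced_links` are hypothetical, and it uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (anchor text, href) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a href=...> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def replaced_links(original_html, copy_html):
    """Return (anchor text, original href, copied href) where the target changed."""
    original = dict(extract_links(original_html))
    changed = []
    for text, href in extract_links(copy_html):
        if text in original and original[text] != href:
            changed.append((text, original[text], href))
    return changed
```

Matching on anchor text is a deliberate simplification: a splogger who also mangles the anchor text would escape this check, and pages that reuse the same anchor text for several links would need a position-based comparison instead.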
How many splogs are in existence? Are they increasing?
As with many internet metrics there is disagreement about
past figures, current activity and projections. It has
been claimed that over 70% of blogs (and over 75% of overall
posts) in Blogger are splogs. One US consultancy reported
that an average of forty-four of the top hundred results
on blog search engines as of early 2006 were spam blogs.
Technorati claimed that between 3,000 and 7,000 splogs
were being created each day in mid 2007, after supposedly
reaching a record level of around 11,000 per day in December
2006. Peers have suggested that between 5% and 20% of
all blogs are splogs.
claims to have detected over a million splogs, including
265,000 attributed to one splogger.
Figures for the number of spam posts in legitimate blogs
a red tide?
One acquaintance likens splogs to the toxic red tide
that periodically poisons some coral reefs.
Splogging pollutes the blogosphere because it adds substantial
noise to search results in searches that are specific
to blogs (and more broadly to whole-of-web searches that
include blog services). If you conduct a blog search you
are very likely to come across splogs. Some searches will
result in lists where splogs are prominent, for example
because individual pages have the requisite keyword density
and links from other splogs or merely because sploggers
have spawned a large number of fake blogs identifiable
using the particular search term.
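Keyword density, mentioned above as one reason that splog pages surface in results, is commonly understood as the share of a page's words that match a search term. The following is a minimal sketch under that assumption. The function name and the sample phrase are illustrative, and real search engines weigh many other signals.

```python
import re


def keyword_density(text, keyword):
    """Return the fraction of words in text equal to keyword (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)
```

For the stuffed phrase "cheap watches cheap deals on cheap watches" the function returns 3/7 (about 0.43) for the keyword "cheap", a far higher share than any single content word would reach in ordinary prose.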
Some opponents have argued that splogging is an attack
on the information commons, with sploggers misusing free
services such as Blogger (particularly services that allow
AdSense) and avoiding subscription services. They have
noted that concern about comment splogging discourages
legitimate bloggers from allowing readers to post comments
about the author's writing, a constraint that impoverishes
the online community.
Critics have noted more subtle concerns, highlighting
for example what some have called machine-based plagiarism
or intellectual property infringement through sploggers
scraping text from legitimate sites for use in their splogs
and posts. The author of this page, for example, has not
authorised sploggers to copy and republish (usually in
a mangled version) text from this site.
Legal issues are explored in the 2006 'Splog! Or How to
Stop the Rise of a New Menace on the Internet'
in 19(2) Harvard Journal of Law & Technology,
467-484 and 'Pollution in the Blogosphere: The Only Purpose
of a New Form of Blog, Called a Splog, Is Fraud and Infringement'
by Jeffrey Goldman & Eric German in 30 June 2007 Los
Angeles Lawyer, 32-40.