splogs
This page considers 'spam blogs', aka splogs, which now
account for much of the blogosphere.
It covers -
- introduction - what are splogs
- basis - what is their rationale and how are they created
- figures - how many splogs and who is splogging?
- a red tide? - are splogs poisoning the blogosphere?
- responses - legal, management and technical responses
- studies - literature about splogging
introduction
What is a 'spam blog', aka a splog?
Despite (or perhaps because of) some doctrinaire announcements
there is disagreement about what constitutes a splog.
Most commentators characterise a splog as a blog that
is generated by a machine (rather than by a human author)
in order to -
- increase the visibility (through a higher search
  engine ranking and through early recognition by
  search engines) of another site/page that a splogger
  is seeking to promote. Splogs typically feature multiple
  links to one or more such sites
- garner revenue by featuring advertisements that attract
  a pay-per-click payment.
Some observers use a broader definition, including machine-generated
'spam' posts that appear on legitimate blogs which allow
readers to comment on a post by the author. Those spam
posts (aka comment spam) typically have nothing to do
with the subject of the blog - ie they do not comment
on or respond to the author's post - but instead feature
links to another site. That linking is intended to raise
the site's search ranking or to encourage an unwary third
party to download malware
from that site.
As the name suggests, a splog is the blog counterpart
of spam. It often involves
spammers. Like much spam it typically uses gibberish text
or text scraped from another site to 'camouflage' the
link/s.
Like spam it leverages some strengths and weaknesses of
networked information technology -
- software can be used to automatically generate text and
  create splogs (particularly in free services such as Blogger)
  or post spam comments on 'open' legitimate blogs
- that text generation/publication can be on a very large scale
  and requires little investment by the splogger
- search engines rely on cues rather than on an innate
  understanding of a blog's content and in practice thus
  encounter difficulty in discerning whether a blog/post is
  legitimate
- search engines provide an incentive for splogging by giving
  higher rankings to sites/pages that feature in links
  from other sites or by graduating new sites/pages from
  the 'sandbox' on the basis that those sites are featured
  in such links
- online advertising services (including services operated by
  major engines such as Google) provide an incentive through
  'paid impressions'
Jonathan Bailey thus tartly commented that "Google,
for better or worse, has its hand in every aspect of
spam blogging. It finances them through AdSense, hosts
them through Blogger and directs traffic to them via
the search engine."
Splogs subvert expectations about use of online media and rhetoric
about the blogosphere as a community space. They also
subvert expectations about the nature of online advertising,
reflected in Andrew Keen's comment that "the message
is dead: Web 2.0 is reducing
all marketing to spam".
basis
Contrary to some of the more irate claims by blog zealots,
splogging is traditional and was inevitable.
It is a descendant of the spamdexing that plagued early
web search engines (and that led most engines to abandon
reliance on metadata),
as website operators sought to gain top rankings in lists
of search results and to gain the benefits from increased
traffic.
It is also an heir of 15 years of spam, with some marketers
realising that sufficient consumers would respond to unsolicited
offers of pharmaceuticals or other products to make bulk
commercial email worthwhile and that flooding the net
with such messages is a trivial task. Those spammers have
also gained experience in appropriating names (forgery
of email addresses is discussed elsewhere
on this site) and in scraping text from legitimate web
sites in an effort to evade filtering of their messages.
Splogging reflects perceptions that substantial numbers
of people are viewing blogs and that search engines still
respect links from blogs (a respect that is currently
eroding, with the algorithms used by major engines likely
to discount links from blogs).
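The incentive that link-based ranking gives sploggers can be illustrated with a toy calculation. The sketch below uses a simplified PageRank-style score, which is an assumption made for illustration and not the algorithm of any particular search engine; the `pagerank` function and the page names are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style score.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    count = len(pages)
    rank = {page: 1.0 / count for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / count for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / count
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A tiny web in which nobody links to the promoted page.
web = {"a": ["b"], "b": ["a"], "promoted": ["a"]}
before = pagerank(web)

# The same web after 20 hypothetical splogs each link to the promoted page.
spammed = dict(web)
for i in range(20):
    spammed["splog%d" % i] = ["promoted"]
after = pagerank(spammed)

print(before["promoted"], after["promoted"])
```

In this toy model the promoted page's score rises once the splogs link to it, even though each splog has almost no score of its own. Engines that discount links from blogs are in effect reducing the share that such pages can pass on.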
It also reflects the 'blogging explosion' (large scale
uptake of blogging inhibits manual supervision by the
operators of blog services) and the emphasis that some
blog services have placed on user friendliness. The features
that make a free 'lowest common denominator' service such
as Blogger attractive to neophytes are often features
that can be exploited by sploggers.
Most splogs - like most spam - are thus generated automatically.
Google for example reported in October 2005 that Blogger
had been targeted in an apparently coordinated attempt
to create some 13,000 splogs.
Finally, it reflects the ease with which a splogger can
automatically copy legitimate content from someone's blog
and reproduce that content (whether as a blog page or
a comment posted on someone else's blog) with the addition
of a link to the splogger's preferred site.
The nature of that link may be readily discernible, similar
to pages created by many domainers
as part of large-scale domain name monetisation. It may
instead be disguised, because the splogger has replaced
a legitimate link (one for example that points to this
page) with an illegitimate one (a link that points to an
adult content site or discount clothing site).
figures
How many splogs are in existence? Are they increasing?
As with many internet metrics there is disagreement about
past figures, current activity and projections. It has
been claimed that over 70% of blogs (and over 75% of overall
posts) in Blogger are splogs. One US consultancy reported
that an average of forty-four of the top hundred results
on blog search engines as of early 2006 were spam blogs.
Technorati claimed that between 3,000 and 7,000 splogs
were being created each day in mid 2007, after supposedly
reaching a record level of around 11,000 per day in December
2006. Peers have suggested that between 5% and 20% of
all blogs are splogs.
Splogfighter claims to have detected over a million splogs, including
265,000 attributed to one splogger.
Figures for the number of spam posts in legitimate blogs
are uncertain.
a red tide?
One acquaintance likens splogs to the toxic red tide
that periodically poisons some coral
ecosystems.
Splogging pollutes the blogosphere because it adds substantial
noise to search results in searches that are specific
to blogs (and more broadly to whole-of-web searches that
include blog services). If you conduct a blog search you
are very likely to come across splogs. Some searches will
result in lists where splogs are prominent, for example
because individual pages have the requisite keyword density
and links from other splogs or merely because sploggers
have spawned a large number of fake blogs identifiable
using the particular search term.
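The two cues mentioned above, keyword density and the number of links on a page, can be measured with a few lines of code. The following is a minimal sketch only: the function names and the thresholds (`max_density`, `max_links`) are hypothetical values chosen for illustration, and a real filter would need many more signals.

```python
import re


def keyword_density(text, keyword):
    """Fraction of the words in text that are exactly the keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def count_links(html):
    """Count anchor tags that carry an href attribute."""
    return len(re.findall(r"<a\s[^>]*href=", html, flags=re.IGNORECASE))


def looks_like_splog(html, keyword, max_density=0.15, max_links=5):
    """Flag a page only if both cues exceed the (assumed) thresholds."""
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
    dense = keyword_density(text, keyword) > max_density
    linky = count_links(html) > max_links
    return dense and linky


suspect = ("<p>cheap watches cheap watches buy cheap watches</p>"
           + '<a href="http://example.com/w">cheap watches</a>' * 6)
genuine = ("<p>I repaired my grandfather's watch today and wrote up "
           'the steps.</p><a href="http://example.org">photos</a>')

print(looks_like_splog(suspect, "watches"))
print(looks_like_splog(genuine, "watches"))
```

Requiring both cues together is a deliberate choice in this sketch, since a legitimate post may contain many links or repeat a term often without being spam.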
Some opponents have argued that splogging is an attack
on the information commons, with sploggers misusing free
services such as Blogger (particularly services that allow
AdSense) and avoiding subscription services. They have
noted that concern about comment splogging discourages
legitimate bloggers from allowing readers to post comments
about the author's writing, a constraint that impoverishes
the online community.
Critics have noted more subtle concerns, highlighting
for example what some have called machine-based plagiarism
or intellectual property infringement through sploggers
scraping text from legitimate sites for use in their splogs
and posts. The author of this page, for example, has not
authorised sploggers to copy and republish (usually in
a mangled version) text from this site.
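An author who suspects that text has been scraped, even in a mangled version, can compare the two passages mechanically. The sketch below uses word shingles and Jaccard similarity, a common near-duplicate technique that is an assumption here and not a method described by any of the parties quoted on this page; the function names and the shingle size are hypothetical.

```python
import re


def shingles(text, size=3):
    """Set of overlapping word sequences ('shingles') of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(text_a, text_b, size=3):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


original = "splogs typically feature multiple links to one or more such sites"
scraped = "splogs typically feature multiple links to one or more discount sites"
unrelated = "the weather in canberra was mild for most of the week"

print(jaccard(original, scraped))
print(jaccard(original, unrelated))
```

A score close to 1 suggests that one passage was copied from the other with minor changes, while unrelated text scores at or near 0. Any cut-off for treating a page as scraped would need to be tuned against real examples.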
responses
"It appears that Google's response boils down to
'trust us'. They, to date, have offered no real evidence
of their efforts and have kept all attempts to gather
information at arm's length," says Bailey.
"This is in stark contrast to other search engines,
like Microsoft, that have published papers on their spam-fighting
techniques and seem to be taking pride in their efforts.
Google could take a few simple, but harsh, steps and practically
stop spam blogs on its Blogspot [Blogger] service; similar
improvements could stop the abuse of AdSense."
Splogfighter continues to keep a close eye on the numbers,
noting a recent monthly fall in his splog detections from
575,000 to 275,000 with cautious optimism. While he believes
Google is deleting a lot of splogs, he's now going to
ramp up data collection and test some new identification
methods.
And what is Google doing? Over the past month, the Guardian
has contacted Google several times trying to find out
how many blogs there are on Blogger, and how many splogs
and AdSense accounts are terminated for abuse each month.
But our questions have gone unanswered.
"We've always had a policy that publishers are not
to create web pages specifically for ads, and we are actively
enforcing this," says Google. What about Splogfighter's
campaign against splogs on Blogger? "We can't comment
on the issues raised by this site [Splogfighter]."
Why then does WordPress.com, which also offers free blog
hosting, have so little trouble? In part, because Matt
Mullenweg, WordPress's founder, makes cracking down on
splogs one of his priorities.
"For WordPress.com, keeping splogs off the system
is as much economics as anything else," says Mullenweg.
"I like to think that the internet's perception of
WordPress is better because we're vigilant against spammy
content."
Users of WordPress.com are encouraged to report splogs.
Says Mullenweg: "We respond within hours to any splog
report, 24 hours a day."
There's another, perhaps essential, difference: WordPress.com
doesn't allow AdSense ads. Might that be the reason why
sploggers haven't prospered there?
Mullenweg plans to allow users to add Google's AdSense
to their blogs. But will this open the floodgates to sploggers?
"Part of the WordPress brand is high-quality blogs,
and we're not going to do anything to damage that. We
have an extraordinary number of really high-quality blogs,
and some of them could do quite well with AdSense,"
says Mullenweg. "We plan to make it a paid upgrade,
at least $15 (£7.45) a year per blog, and our policies
on splogs or spammy content aren't going to change."
Another perspective on splogs comes from Jonathan Bailey, who runs Plagiarism
Today, a site about online plagiarism, content theft and
copyright issues. Copying helps splogs multiply, thanks
to programs that create Blogger-hosted splogs automatically
and then fill them with relevant content stolen from the
internet. While Bailey can advise how to limit this, finding
the culprits is another matter.
"Chasing sploggers, generally, is not worth it. The
software that these guys use can generate thousands of
spam blogs an hour. Even the people who gave them the
blog to use don't know who they really are," says
Bailey.
Advertisers should also consider where their adverts are
being presented, says Bailey: "If they find out that
their adverts are being displayed on these junk sites
they're not going to want to spend as much."
studies
Legal issues are explored in the 2006 'Splog! Or How to
Stop the Rise of a New Menace on the Internet'
in 19(2) Harvard Journal of Law & Technology,
467-484 and 'Pollution in the Blogosphere: The Only Purpose
of a New Form of Blog, Called a Splog, Is Fraud and Infringement'
by Jeffrey Goldman & Eric German in 30 June 2007 Los
Angeles Lawyer, 32-40.