Caslon Analytics SEO note

overview

This page discusses search engine optimisation (SEO) and the SEO industry.

It covers -

It supplements discussion elsewhere on this site regarding search engines (and popular search terms), metadata, online resource identification and networks.

The following pages consider SEO principles, myths and responses. They also highlight independent studies about search.

introduction

Search engine optimisation aims to ensure that online content is found by users of web search engines, in particular through appearance of the 'optimised' site/page at the top of the list of search results.

It reflects recognition that many users rely on search engines rather than domain names, links or other pointers (eg a URL featured on a business card or the side of a bus) for identification of content. It also reflects awareness that some users rely on search engines as their primary measure of the quality of online content, for example assuming that the first item on a list of search results will be the most authoritative, accurate or merely up-to-date.

Vendors of search optimisation services offer various methodologies that are claimed to "guarantee" online content a favourable position in search results.

Some of those methodologies are so obvious that it is surprising consumers would pay for the advice or emphasise optimisation at the expense of more effective mechanisms to attract the attention of desired demographics.

Other claims by participants in the search optimisation industry are deceptive or merely nonsensical. That has led critics to compare SEO businesses with those that, during the dot-com bubble, purported to offer sure-fire solutions for choosing the perfect domain name and thereby being 'found'.

SEO optimisation

Search engine optimisation is predicated on recognition that search engines use algorithms for sorting web sites, pages and other content for presentation in a list of search results in what is sometimes referred to as a search engine results page (SERP). Those algorithms are proprietary and often highly sophisticated. They are not static: they change over time, reflecting factors such as changing user needs, efforts to foil abuse by site operators and work to produce better results.

Algorithms and thus SERP results vary from engine to engine. Variation is also attributable to different coverage of the net: some engines have simply encountered more sites/pages than others, some visit different parts of the net more frequently than other engines, and some process submissions about new sites more quickly than their competitors.

Most engines base their retrieval on identification of content within the body of an HTML page, PDF file or other online resource rather than on metadata in the header of files. Major engines indeed appear to penalise crude attempts by SEO vendors and site operators to gain a higher rank by using a particular word twenty or one hundred times in the header.

They instead rank sites/pages through weightings that appear to include -

  • whether the particular resource features the search term (or an equivalent term) - often referred to as a keyword, a usage that confuses many people from a library, print indexing or legal background
  • the significance of that term (eg does it occur once or several times; is it featured as a page title, subdirectory name or navigation point within a page; does it occur as part of what the engine perceives as natural syntax or as an attempt to artificially weight the page)
  • factors such as the site's age, liveness, 'reputation' (in particular whether other respected sites point to it) and avoidance of negatives such as participation in link exchange schemes. Some of those factors are highlighted below.
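
The three kinds of weighting above can be made concrete with a toy example. The sketch below is purely illustrative: real engine algorithms are proprietary, and the `Page` fields, the `toy_score` function and every numeric weight are hypothetical choices made for this example rather than values used by any engine.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    body: str
    inbound_links: int  # hypothetical count of links from respected sites
    age_years: float    # hypothetical age of the site in years


def toy_score(page: Page, term: str) -> float:
    """Combine presence, significance and reputation into one illustrative number."""
    term = term.lower()
    words = re.findall(r"[a-z0-9]+", page.body.lower())
    occurrences = words.count(term)
    in_title = term in page.title.lower()

    # Presence: a page that does not feature the term at all scores nothing.
    if occurrences == 0 and not in_title:
        return 0.0

    # Significance: repetition helps only up to a cap (assumed here to be 5),
    # and appearance in the page title earns an assumed bonus.
    significance = float(min(occurrences, 5))
    if in_title:
        significance += 3.0

    # Reputation and age: arbitrary illustrative weights.
    reputation = 0.5 * page.inbound_links + min(page.age_years, 5.0)
    return significance + reputation


natural = Page("Widgets for sale", "We make widgets and ship widgets weekly", 4, 3)
stuffed = Page("Home", "widgets " * 100, 0, 0)
print(toy_score(natural, "widgets"), toy_score(stuffed, "widgets"))
```

In this sketch the page that repeats the term one hundred times gains nothing beyond the cap, while the page with a relevant title and some inbound links scores higher.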

Many SEO services centre their marketing on a supposed understanding of the individual algorithms and on claimed expertise in retitling web pages or rewording text to increase the number of keywords per page.

Such promotion is problematical. In particular, mechanistic approaches may result in -

  • penalisation by search engine operators, which are known to have automatically given lower rankings or even excluded some sites after egregious abuses
  • pages that read strangely and thereby send the wrong message to users.
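
One way to see why mechanistic rewording is risky is to measure how much of a page a single keyword occupies. The sketch below is a minimal illustration only: the function names and the 15% threshold are assumptions made for the example, not a rule known to be applied by any search engine.

```python
import re
from collections import Counter
from typing import Dict


def keyword_density(text: str) -> Dict[str, float]:
    """Return each word's share of the total word count in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def looks_stuffed(text: str, term: str, threshold: float = 0.15) -> bool:
    """Flag text in which one term exceeds an assumed density threshold."""
    return keyword_density(text).get(term.lower(), 0.0) > threshold


natural = "Our shop sells handmade widgets and ships them across Australia every week"
stuffed = "widgets widgets cheap widgets best widgets buy widgets"
print(looks_stuffed(natural, "widgets"), looks_stuffed(stuffed, "widgets"))
```

Text that trips a check of this kind is also likely to read strangely to a human visitor, which is the second risk noted above.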

the SEO industry

Online resource identification is a billion dollar sector of the Australian and global economy, with search engine and directory operators gaining (or aspiring to gain) substantial revenue from paid placement and other services and site operators seeking ways of ensuring that their sites appear prominently in SERP results.

The SEO industry appeared in the mid 1990s and first gained public attention after 1997 as the dot-com bubble expanded. It is an echo of the domain naming and domain name valuation industries, which flourished with the bubble and attracted criticism over poor (even deceptive) performance.

There has been no comprehensive independent study of the industry. At a global and Australian level it appears to be volatile, with low barriers for entry by individuals and businesses, many of whom seem undeterred by unfamiliarity with search technologies and - alas - uninhibited about offering guarantees that demonstrably cannot be met.

SEO services for example cannot meaningfully guarantee that any new site will gain a top ranking immediately, particularly as some engines 'sandbox' new sites - ie do not include them in SERP or give them a low SERP ranking - for several weeks or months in order to minimise the impact of evanescent sites created as pointers to adult content sites.

There is no licensing by government or accreditation by the search engines, which are fiercely competitive.

Margins for successful SEO services appear to be high but many minor operators appear to leave the industry within two years, whether because they discovered revenue was hard to achieve, they received bad word of mouth (or even court action) from disappointed clients or they simply got bored.

Some have expanded into search engine submission services, charging what are often exorbitant fees to alert engines and directories to the existence of a client's site. Such services often claim that there are 2,500 or even 5,000 search engines, with success supposedly requiring submission to each and every engine. One enthusiastic spammer for example offers to submit "your web site to 700,000+ Internet search engines and directories".

Elsewhere on this site we have noted that those figures are gross exaggerations (eg they conflate directories with engines) and that most users rely on a handful of engines.

Given the comments above it is unsurprising that there are no generally accepted standards, particularly standards developed by entities outside the industry and articulated by independent organisations such as Standards Australia or ANSI.

Some industry participants proclaim that their staff are "certified", for example have certification in "search engine optimisation" or are graduates of an "advanced search engine marketing skills" program. Sceptics respond - fairly or otherwise - that it is difficult to embrace self-certification or diploma mill-style certification.

Wariness is appropriate where -

  • there is no independent testing or validation of the certification
  • the expertise of the graduate is not reflected in endorsement by computer scientists or academics engaged in the study of information retrieval.

Other SEO vendors emphasise a proprietary methodology, often with a black box approach that claims to identify desired keyword weightings and enable the service to effectively rewrite sites/pages for guaranteed higher rankings.

Some, with a hazy understanding of engines such as Google, claim to dramatically boost a site's ranking through inclusion of the most popular search terms or through rapid generation of other sites that exist merely to point to the client's site.

Promises from such services typically feature guarantees that the chosen site will go to a top page - or even the number one rank on that SERP - within weeks or even days. In discussing such myths later in this note we suggest that clients might better spend their money elsewhere, particularly if their site isn't unique and is located in a crowded part of cyberspace.

Jason Calacanis commented that

all the SEO folks I've ever talked to--and I've talked to many over the past decade or so--have pitched me on expensive contracts that you can't cancel for two years with them to do all kinds of shady things to move up in the rankings ... I know for a fact that a lot of folks in the industry look at the SEO companies as shady. They feel like it's filled with slick guys in really bad suits running around at cheap hotels pushing desperate clients to sign fat contracts. That's how folks look at SEO--like it or not.

factors

Information about how different search engines rank search results is uncertain. Understanding has been impeded by myths about SERP and the effectiveness of SEO services.

Factors that may be significant for major engines such as Google appear to include weightings or exclusions regarding -

  • age (including how long the site has been online, the age of individual pages and measures of how frequently content is updated). Overall there appears to be a bias in favour of sites that have existed for a while, are stable and are updated on an ongoing basis - thereby being different from sites that are 'dead' and from those that are manufactured merely to redirect traffic or increase another site's link count
  • quantity (including number of pages/files, number of words in aggregate and number of words per page)
  • availability (is the content available 24/7, especially in regions such as Australia where continuous availability is expected)
  • positive reputation (number of citations by other sites, particularly citations by sites that have a high score in terms of age, quantity and so forth)
  • technical negatives (non-compliant code, broken outgoing links, conflicts between page titles and page content, indications that metatags have been heavily 'optimised' through recurrent use of words such as sex or adult, use of 'invisible' text)
  • reputation negatives (outgoing links that point to sites with a low reputation, files that feature illegal content or malware, participation in commercial link exchange schemes, sudden spurts in inclusion of links to sites with a poor reputation)
  • quality indicators (uniqueness of content, inclusion of bibliographic material and of automatically verifiable contact details)
  • measures of user satisfaction (eg click through from initial entry to other pages on the site, time spent on the site, correlation between free and paid-placement search results)
  • user demographics (matching site content with information about users)
  • auspices (weighting for recognised publishers, government agencies, professional organisations)
  • IP addresses (weighting against ISPs/ICHs that are perceived as being permissive to spammers and against address blocks that have an unusually high number of low reputation sites)
  • keywords (in particular keywords that the engine perceives as presented in an appropriate syntax rather than at an unnatural frequency and at random to subvert the algorithm)
  • cloaking (whether the site serves different content to different categories of users)
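
The 'positive reputation' factor above is circular: a citation counts for more when the citing site is itself well regarded. A simplified PageRank-style iteration is one commonly described way of resolving that circularity. The sketch below is offered on that assumption only; the note does not confirm that any particular engine works this way, and the `link_reputation` function, the damping value and the sample link graph are all hypothetical.

```python
from typing import Dict, List


def link_reputation(links: Dict[str, List[str]],
                    damping: float = 0.85,
                    rounds: int = 50) -> Dict[str, float]:
    """Iteratively score sites; `links` maps each site to the sites it cites."""
    sites = set(links) | {t for targets in links.values() for t in targets}
    n = len(sites)
    score = {site: 1.0 / n for site in sites}
    for _ in range(rounds):
        # Every site starts each round with a small baseline score.
        new = {site: (1.0 - damping) / n for site in sites}
        for source in sites:
            targets = links.get(source, [])
            if targets:
                # A site passes its score on, split evenly among the sites it cites.
                share = damping * score[source] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A site that cites nobody spreads its score evenly over all sites.
                for target in sites:
                    new[target] += damping * score[source] / n
        score = new
    return score


# Hypothetical graph: three sites cite "hub", and "hub" cites only "a".
sample = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
for site, value in sorted(link_reputation(sample).items()):
    print(site, round(value, 3))
```

In the sample graph "hub" ends with the highest score because three sites cite it, and "a" outranks "b" and "c" because its single citation comes from the well-regarded "hub".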

Some observers have suggested that there is human validation of particular sites or of leading results on some of the most frequent search terms.

   next page  (SEO myths)


version of March 2007
© Bruce Arnold | caslon analytics