 it's all there?

 This page explores myths about online access to what some writers have characterised as the "information cornucopia" or global digital library: claims that everything you want to know is online, that you can easily find it and that you will be able to do so in future.
 
 It covers -

   introduction
   what's online?
   is it readily identified?
   is it accessible?
   is it accurate?

 introduction
 Brewster Kahle's 2001 Public Access to Digital Material article identified universal online access to content as the "epic opportunity of our digital age", claiming that

     technology has reached the point where scanning all books, digitizing all audio recordings, downloading all websites, and recording the output of all TV and radio stations is not only feasible but less costly than buying and storing the physical versions.

 A year later Business Week ran with the spin, claiming that Kahle's Internet Archive is -

     a collection of 10 billion pages, including Internet sites, movies, and Usenet postings five times larger than the amount of information at the Library of Congress

 and that

     Today, a single copy of everything that's on the Net -- equal to 15,000 copies of Encyclopedia Britannica -- is added to the archive every two months.

 Would that it were true. The Archive is in fact far more selective. It does not archive all sites. It does not archive all pages of all sites and it archives erratically.
 
 At a less rarified level there are four basic myths about information in cyberspace -

   everything is online
   all online content can be found
   all online content can be accessed (and will be accessible in future)
   access is resulting in information overload.

                         what's online? 
 The explosion of web sites, 
                        high numbers of results displayed by search 
                        engines, ready access to some contemporary music through 
                        filesharing and casual references to global digital libraries 
                        have encouraged a belief that "everything" is 
                        online ... or could be available via the internet through 
                        the efforts of volunteers or through the removal of impediments 
                        such as intellectual property.
 
 That belief is at best naive. Although there are several 
                        million sites on the web (with an unknown number of pages), 
                        much of the content is corporate or personal and ephemeral. 
                        As of mid-2003 a majority of pages are probably in English: 
                        some languages 
                        are barely represented in terms of readership or authorship 
                        (eg there is little Lao, Inuit, Bantu or Amharic content).
 
 What text is online is patchy in the extreme. Standard 
                        works from the Latin and Greek classics are online (albeit 
                        often in superseded editions and translations) but there 
                        is little Provencal, Persian, Chinese, Khmer or Aramaic. 
                        More recent literature is sparse: umpteen copies of pronouncements 
                        by Gilmore and Barlow but no Patrick White, Christina 
                        Stead, David Malouf, Heimito von Doderer or Robert Musil. 
                        There is little Proust, less Mann (Heinrich, Klaus or 
                        Thomas).
 
 Publishers such as Gale are undertaking large-scale digitisation programs (eg Gale's The 18th Century literature project, covering 150,000 titles and 20 million pages). That activity is, of course, on a commercial basis and - as in past microfilm or CD-ROM projects - access to the text is generally restricted to academic ghettoes.
 
 Digitisation of archival content from newspapers and journals is underway but, again, much of that content won't be freely available. As noted in discussing electronic publishing, many current serials are online ... but protected by firewalls, with access on a subscription, sessional or per-item payment basis. Few online newspapers contain all the content that is featured in their print versions.
 
 Biography and critical literature prior to the 1980s is 
                        equally sparse and, given the priorities of initiatives 
                        such as the Gutenberg and Bartleby projects, is likely 
                        to remain so. Do not expect to find a standard edition 
                        of Lukacs, Adorno, Kojeve or Bakhtin. Only scraps by historians 
                        such as Namier, Bloch, Kehr, Febvre, Michelet or Matthiesen 
                         are on the web. In summary, most of the holdings of the Library of Congress, the National Library of Australia or even a mid-range university or community library are not on the web.
 
 What of music? A rarely-remarked feature is the almost 
                        total absence of scores: the net provides access to recordings 
                        rather than notation. With some exceptions the classical 
                        repertoire is largely absent: little Machaut, Ives, Zemlinsky, 
                        Pergolesi, Gesualdo.
 
 As uptake of broadband increases, access to video content 
                        is growing. At the moment most video on the web (and downloaded 
                        through filesharing) emanates from the adult 
                        content sector. Neither Hollywood nor national film 
                        industries plan to release their libraries (including 
                         early b&w silent films) onto the web. The BBC's proposal to place much of its audiovisual library online is an exception, one that has met with little enthusiasm from commercial and public sector peers.
 
 
  is it readily identified? 
 Much content is freely available on the net. However, 
                        ready identification of that content often poses particular 
                        challenges. In discussing internet metrics we've noted 
                        reports that suggest most traffic goes to a small proportion 
                        of sites (the 'winner takes all' model): many potential 
                        users simply don't find the content that is available. 
                        For practical purposes that content does not exist.
 
 Developments in enhanced search engines, metadata and 
                        other identification mechanisms are arguably not keeping 
                        pace with the growth of the net, the volatility of much 
                        online content and the resistance of many users to unstructured 
                         or 'naive' retrieval. It is clear that many users are content to settle for second or third best and that many are overwhelmed by the task of sifting through exhaustive search results.
 
 Even the major engines don't cover all of the public web; 
                        few cover much of the 'deep' web, ie content that's behind 
                        firewalls or is generated dynamically from databases rather 
                        than static web pages that are readily spidered.
 
 
  is it accessible? 
 A pernicious myth is the notion that most content 
                        can be readily accessed. In considering digital divides, 
                        usability and other 
                        questions we have suggested that access to the net is 
                        quite uneven. Much of the web is 'dark', either because 
                        content is held behind firewalls (no password or no credit 
                        card number = no access) or because site operators have 
                        disregarded usability principles.
 
 Within advanced economies substantial parts of the population 
                        do not have ready access. That is because they face physical 
                        challenges (eg poor sight and motor difficulties), because 
                        the infrastructure is not available or because they simply 
                        can't afford the ongoing investment in a recent computer 
                        and broadband charges.
 
 Such impediments, evident in Australia and New Zealand, are more critical in other parts of the world where, as we have noted, over a billion people do not have ready access to electricity (and several million depend on dried cow dung and straw for warmth and cooking). Hype from Microsoft, Cisco and MIT about breaking down digital divides through wireless networks seems somewhat misplaced when the cost of a personal computer is several times the annual income of the average family in central Africa or Bangladesh.
 
 We have argued that notions of the digital divide encompass 
                        deficits in skills, expectations and the broader economic 
                        environment.
 
 Charles Kenny of the World Bank, for example, comments that

     Lack of education is a major barrier to productive Internet use .... In Ethiopia, 98 percent of Internet users in 1998 had a university degree, yet 64.5 percent of the overall population is illiterate. Worldwide, most people living on $1 a day are illiterate. Further, they usually speak a minority language in their own country - few speak a major global language. For example, about 17 million people in Nigeria speak Igbo. My search for Web pages in Igbo turned up only five sites: a translation of the Universal Declaration of Human Rights, a translation of a document called 'The Four Spiritual Laws' (theological provenance undetermined), a translation of the food pyramid, a two-page Igbo phrase book, and a prayer manual. There isn't an Igbo translation service on the Web, so an Igbo speaker would be limited to these five. None involved sound or video, so the illiterate Igbo speaker would gain nothing. Bridging the gaps in language and technical skills as well as basic literacy will be difficult, considering the small per-student spending available in the poorest countries' primary schools, where the discretionary budget per student is as little as $5 a year.

 Kenny rightly dismisses hype about pervasive benefits from e-commerce by noting that

     even if poor people are lucky enough to be literate and conversant in a major world language, their use of the Web for activities such as e-commerce is likely to be limited by their lack of credit cards, not to mention the challenge of persuading FedEx and UPS to start delivery services in their neighborhoods. Limitations in relevant content and ability to use that content perhaps best explain why only 2.2 percent of India's Internet users have ever engaged in buying or selling over the Web.

 That lack of the fantastic green plastic also precludes use of paid access sites outside libraries.
 
  is it accurate? 
 Notions of the web as a well-ordered and comprehensive free library (whose librarians provide quality control in the acquisition of content and the systematic weeding out of superseded content) are misplaced. Online publication is not a guarantee of accuracy.
 
 A more effective metaphor is the net as the 'marketplace
                        of ideas', in which everyone is free to offer content 
                        and in which truth eventually triumphs over ignorance 
                        or deception. Regrettably, in that marketplace lies are 
                        often more seductive or simply easier to find. Much of 
                        the factual information on the net is false or has become 
                        so through the passage of time.
 
 The self-referential nature of much online content creation 
                        - authors appropriating online content without referring 
                        to offline sources and the echo-chamber effect of much 
                        blogging and wiki, 
                        exacerbated by the 'winner takes all' phenomenon - means 
                        that inaccuracies can gain wide circulation. That is of 
                        particular concern for medical 
                        sites. It is also of concern regarding sites with a historical 
                        or technical reference function (one reason why this site 
                        features a range of online sources and references to offline 
                        writing). It is a basis for skepticism about arguments 
                        that defamation online 
                        is not a major problem, as the defamed can supposedly 
                        'out-publish' the falsehoods in a triumph of free 
                        speech.
 
 One response is the development of a digital information literacy, with readers having appropriate expectations about what is found online, skills in assessing accuracy and a capability for (and commitment to) checking information found on the net.
 
 
 
 
 
 
 
 
 