Web 2.0 and archival institutions

May 8, 2006

originally posted at archivemati.ca


I've been preparing my presentation for some upcoming conferences in Summer 2006 (IS&T Archiving, Association of Canadian Archivists, Society of American Archivists). I'm going to be talking about Web 2.0 as a set of enabling technologies and practices that can enhance the quality of archives access systems.

Of course, "Web 2.0" is a hodge-podge of intersecting technologies, ideas, practices and marketing pitches. It has gotten a lot of attention over the past year in the tech press and geek blogosphere, but you know it has hit the mainstream when the term starts showing up in airline in-flight magazines (see Air Canada's enRoute, May 2006).

Although the "Web 2.0" term probably has a limited shelf life, I expect it will at least get people's attention as they scan a conference program. It provides a relatively wide and hype-charged entry-point for a discussion on some of the more interesting of the new web technologies and practices.

For the purpose of my presentation, I have focussed on three core Web 2.0 themes:

  1. usability
  2. openness
  3. community

Usability

As part of the usability theme I intend to discuss AJAX features such as auto-complete, drag-n-drop, and dynamic update of page components without reload as well as other Web 2.0-like usability improvements such as permalinks, feed aggregators, personalization, and the use of simple, functional design (with generous application of whitespace and what appears to be a preference for neon-green logos). All of these elements can be incorporated into existing and new archives access systems to improve their usability.

Openness

When I say "openness" I am actually referring to four key characteristics of the Web 2.0 trend:

  1. open architectures
  2. open standards
  3. open content
  4. open source

Open architectures refers to the 'web as platform' concept that encourages the use of loosely-coupled components, web services and APIs to piece together application functionality or content 'mash-ups'. Open architectures are enabled by the use of open standards such as URI, HTTP, XML, XHTML, CSS and Atom.

Open architectures and standards have lots of implications for the archives community as common services (e.g. archival description, subject classification, search, reference and research) might be shared between institutions at both the technical and program/service delivery levels. Likewise, loosely-coupled components can be used to improve the ongoing management of enterprise information systems, freeing the institution from dependence on a single behemoth technology stack.

Up until now, we've only seen limited use of open architecture and technical standard concepts in archives management systems, namely as EAD finding aids and OAI-PMH harvesting.

However, there are plenty of Web 2.0-type technologies and standards that can enhance archives system architectures. For example, the use of simple syndication feeds and pings, particularly through the IETF's newly approved Atom 1.0 standard, can greatly improve upon the 'harvesting' concept, including the potential to distribute not just metadata but also digital objects, as Atom 1.0 supports base64-encoded binary content.
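To make the Atom idea a little more concrete, here is a minimal sketch in Python of an Atom 1.0 entry that carries a digital object inline as base64-encoded content. The function names, the identifier and the sample payload are all hypothetical and chosen only for illustration; a real feed would also need the surrounding feed element, author information and so on.

```python
import base64
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialize Atom elements without a prefix (i.e. as the default namespace).
ET.register_namespace("", ATOM_NS)


def build_atom_entry(entry_id, title, updated, payload, media_type):
    """Build an Atom <entry> whose <content> holds `payload` as base64 text.

    All argument values are supplied by the caller; nothing here is tied
    to a particular archives system (hypothetical helper).
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    # A non-text, non-XML media type signals that the content is base64.
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": media_type})
    content.text = base64.b64encode(payload).decode("ascii")
    return entry


def extract_payload(entry):
    """Decode the binary object carried in an entry's <content> element."""
    content = entry.find(f"{{{ATOM_NS}}}content")
    return base64.b64decode(content.text)


if __name__ == "__main__":
    # Placeholder bytes standing in for a scanned image (assumption).
    scan = b"\x89PNG placeholder bytes"
    entry = build_atom_entry(
        "tag:example.org,2006:item-1",  # hypothetical identifier
        "Sample digitized photograph",
        "2006-05-08T12:00:00Z",
        scan,
        "image/png",
    )
    print(ET.tostring(entry, encoding="unicode"))
```

A subscribing system could parse the entry and call extract_payload to recover the original bytes, so the metadata and the object travel together in one feed item.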

Also, geo-coding archival materials with latitude and longitude information (related to the place of creation, use, custody, or the location of related materials) allows for integration with map-based browsing and access tools, or for use as input to the growing variety of wifi location-based services such as walking-tours and GPS treasure hunts.
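As a rough illustration of what geo-coding makes possible, the following Python sketch finds the records that lie within a given distance of a visitor's position, which is the basic query behind a map browser or a walking tour. The GeoRecord structure, the function names and the sample records are assumptions made for this example, and the great-circle (haversine) formula is just one common way to measure the distance.

```python
import math
from dataclasses import dataclass


@dataclass
class GeoRecord:
    """A hypothetical archival description tagged with a single point."""
    title: str
    lat: float  # decimal degrees, north positive
    lon: float  # decimal degrees, east positive


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def records_near(records, lat, lon, radius_km):
    """Return records within radius_km of (lat, lon), nearest first."""
    scored = [(haversine_km(lat, lon, r.lat, r.lon), r) for r in records]
    scored.sort(key=lambda pair: pair[0])
    return [r for dist, r in scored if dist <= radius_km]


if __name__ == "__main__":
    # Illustrative sample data only; not drawn from any real collection.
    sample = [
        GeoRecord("Item tagged in Toronto", 43.6532, -79.3832),
        GeoRecord("Item tagged in Victoria", 48.4284, -123.3656),
        GeoRecord("Item tagged in Vancouver", 49.2827, -123.1207),
    ]
    for record in records_near(sample, 49.2827, -123.1207, 150):
        print(record.title)
```

A linear scan like this is fine for a small collection; a larger one would more likely rely on a spatial index in the database.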

Open content and open access refer to the elimination of restrictions on the re-use of digital information through more flexible licensing practices such as those provided through Creative Commons licenses as well as the increased sharing of content on sites such as Flickr.com and OurMedia.org.

Although there are still many tough legal, business, and professional obstacles to clear, the increased adoption of open content licensing can help archival institutions to enrich the content and contextual information of their own collections (e.g. by syndicating metadata and/or content from other collections that complement their own). More significantly, opening up collections (e.g. as proposed by the Open Content Alliance) will 'let a million flowers bloom' and enrich the 'long tail' of the Web with the wealth of unique information and cultural treasures that are preserved in archival collections.

Lastly, as institutions with limited funding that are managing public information as part of the public trust, archives can only benefit from using and supporting open source software to manage their functions, programs and websites. Fortunately, the leading Model-View-Controller (MVC) frameworks that are being used to build most of the new Web 2.0 applications are all open-source products (e.g. RubyOnRails, Django, TurboGears, Symfony) and they all run on open-source server architectures (e.g. Linux, Apache, MySQL).

Community

Aside from the technological innovations of Web 2.0, the most distinguishing characteristic of this trend has to be the importance of nurturing a community around a given online service, technology or content repository. That is to say, a community in the sense of people connecting to other people but also a community that takes responsibility and ownership of the services, technology and content. Some poster children of this trend include Wikipedia, LinkedIn, and MySpace. Some buzzwords associated with this trend include social software, radical trust, decentralization, and disintermediation.

'Disintermediation' is a mouthful of a buzzword that was actually introduced in the last wave of web hype (i.e. the dot-com boom). It refers to the concept of cutting out the middleman and, although ordering your next pet online never really took off, disintermediation does refer to a trend that is continuing today. The most recent example that has been getting a lot of attention is grassroots journalism, wherein everyday people are posting their own reports, analysis, pictures and videos of events to their own blogs or community-operated news portals (e.g. NowPublic.com). These grassroots journalists are giving new insight, context and empirical information that corrects, verifies or enhances the reports provided by the traditional news outlets or, in many cases, providing coverage of events that the traditional media has ignored.

Similarly, archival institutions are going to have to accept the rise of grassroots archivists. Not as barbarians at the city gates but as value-adding partners that share the goal of preserving historical memories and experiences. In his excellent webcast presentation, Are the Archives Doomed?, Rick Prelinger discusses the emergence of what he calls 'archives groupies' and the wonderful, often unexpected results that occur when users are invited to participate in the organization and use of archival collections.

Some interesting early examples of how these Web 2.0 concepts could be applied to archival collections include:

Archival Institutions and Web 2.0

I assume, of course, that professional archivists will have issues with blurring the lines between institutionally managed archival materials and descriptions and those contributed, enhanced or re-used by patrons. Copyright and restrictive access conditions placed on material by donors are a concern. Another legitimate concern would be to protect the authenticity of archival materials and the context of their original creation and use.

I therefore see the introduction of community-managed collections, descriptions, exhibits and discussions as something that happens in parallel to the authoritative archives access systems that are managed by archival institutions and their professional staff. I see these parallel systems as taking the form of virtual collections or virtual research rooms that are loosely-coupled to the institutional systems using open architectures and standards.

These could exist completely separate from the institution, on another organization's technology platform, but I also think that archival institutions stand to benefit from taking a leadership role in encouraging new and innovative use of their collections and being the benefactor and host of new, online communities. Web 2.0 is full of interesting stories and lessons of how that might be accomplished.