DIGITAL LIBRARY LANDSCAPE SURVEY
Prepared by Michael Manoff: Assistant Research Professor/University of Tennessee Libraries.
For the University of Tennessee Digital Library Steering Committee: March, 2004.
Advisor: William A. Britten (Head, Library Systems).
Introduction:
One of the challenges we face at this stage of the evolution of the UT-DLC is improving our connection with the users of our collections. One way of approaching this issue is to examine the access and Web organization strategies of our peers at other institutions. I have been surveying the digital library landscape with usability issues in mind, and this report summarizes what I have found. I hope it will inform your decision making as we seek to get the digital objects, mounted with such effort and expense, out to their intended audiences.
Scope of Survey:
Selected sources are used to illustrate and reveal trends, point out new developments, and orient the UT-DLC position relative to its peers, at this point in the evolution of the digital library community. The survey is restricted to digitally reproduced or created materials that comply with the protocols of the Open Archives Initiative.
I. Primary Digital Library Web Sites:
While single DL primary Web sites serve as platforms to present and describe their collections, the thrust of most of their other content, with some notable exceptions, is directed towards promotion and internal operations. Others offer a separate homepage targeted just at the end user. Another significant finding is that many of the sites that I sampled link directly to DL collections from their library homepages. The resulting ease of navigation for users is obvious. The specific examples below illustrate these issues, as well as other strategies relevant to usability.
- Multiple Homepages: The Library of Congress takes a two-tiered approach to its digital collections: one homepage for the end user, American Memory Homepage, which links directly from the library homepage, and a second for librarians and archivists, Digital Libraries and Collections. Cornell also uses this strategy: a promotional, R&D, and technical site, Cornell Institute for Digital Collections, and a second site, obviously intended for the end user, which also links directly from the library homepage, Cornell Digital Initiatives. The question naturally arises: should one DL homepage attempt to serve both end users and librarians?
- Single Homepages: Of the single sites surveyed, the University of Washington DL is the most successful in presenting a very user-friendly homepage that also contains technical and internal information, under the headings About this Site and Copyright and Reproduction. Note the easy and diverse ways of accessing UW's enormous number of digital collections: UW Digital Collections. Two other user-friendly single DL homepages are the Digital Library of Georgia and the Indiana Digital Library Program.
- Library Homepages Directly Linked to DL Web sites: The University of Pennsylvania, Penn Libraries; Cornell University Library, Cornell Libraries; Indiana, Indiana University Libraries; the University of California, UCLA Libraries; the LOC, Library of Congress; the University of Washington, University of Washington Information Gateway.
- FAQ Pages: Some DLs link to an FAQ page, for example the Digital Library of Georgia. The page contains content relevant to end users and useful for promotional purposes.
- Additional Resources/Related Links Pages: Many sites have a page of related links that points out to the wider digital library world. View the additional resources on the NYU Digital Library Team page, and the related links from the Indiana Digital Library Program. Pages such as these provide end users with options to explore further.
- Search Tools: Indiana provides a search tool that queries only its DL Web pages: Indiana Digital Library Program.
- Site-Maps: The California Digital Library employs a site-map for navigation. Note also the uncluttered homepage, which uses menus in the left margin to make everything available with white space to spare. The University of Georgia does likewise: UGA Libraries Site-Map.
- Standards and Procedures: The most striking and dominant feature of the NYU-DL homepage is a Standards and Procedures DL Handbook, viewed at the NYU Digital Library Team. It is intended for outreach to other institutions that want to start DLs.
- Collection Development Policies: UGA includes a collection development policy for its digitized collections that is linked directly from its homepage: UGA Collection Development Policy.
- Collection Grouping: Virginia Tech Digital Library and Archives primarily presents its collections by format type: ETDs, Ejournals, Images, Faculty Archives, News Reports, etc. The Penn Library Digital Library Project and Cornell Digital Initiatives offer similar but less extensive grouping schemes, using such terms as: primarily visual image, primarily text, and multimedia.
- Personnel and Operational Structure: Cornell is the site with the most extensive personnel and organizational information. From the Cornell Institute for Digital Collections, click on the organization link near the top of the left margin. Indiana also presents interesting information about its staff, including a list of its digital library steering committee members (note the large staff of sixteen people): About Us.
II. Distribution of Access:
Searching library Web sites, with a focus on finding access points to digital collections other than those on their primary DL homepages, produced minimal results. While some include collection level records in their catalogs, none have links in their research guides, or subject disciplines pages. The following examples point to various distribution of access issues:
- Catalog Records: The Library of Congress includes collection level records for nineteen titles of its prodigious American Memory Project. A title search from the LOC Catalog, using American Memory, brings up the records. UCLA has also cataloged selected collections from its DL. A search for Counting California in its ORION catalog brings up the record. The University of Washington DL catalogs all of its collections at the collection level. The records are searchable from many points across the UW Digital Collections site. Indiana's collections are also accessible through its IUCAT catalog (try a title search for the Hoagy Carmichael Collection). NYU catalogs some of its digital collections (try searching BobCat for the Database of Recorded American Music).
- Links from Associated Departments and Services: The LOC leads the way in linking to DL collections from other pages. For example: the Especially for Researchers page; the Research and Reference FAQ page; the A-Z index page (which lists all the individual collections); and the Site Map. Cornell has links to many of its digital collections from the Online Materials page of its Rare & Manuscript Collections.
- Inclusion in Database Lists: The University of Georgia was the only site surveyed that integrated its digital collections in its list of online databases. To demonstrate this capability, search for Georgia Aerial Photographs (one of UGA's digital collections), either by keyword or A-Z list in GALILEO. One can then go directly to the database, or to a database description record. Why don't the others include their digital collections in these heavily perused lists? Possibly because, like ourselves, they have not yet gone far enough in mainstreaming their digital library services.
- Inclusion in Subject Disciplines Pages: None of the sites surveyed included their digital collections in their reference guides or subject disciplines pages, again for no apparent good reason. One can easily envision DL collections from parent institutions, and selected sites from many OAI repositories, seeding these types of pages.
III. Cross Archive OAI Search Tools:
One of the most significant developments in the digital library world is the emergence of cross archive OAI search tools. With such tools the user can simultaneously search across many collections using one interface. Due to their recent development and, in some cases, experimental status, these tools are largely unknown outside the digital library world. While not yet perfected, some may work well enough to merit inclusion in our databases and elsewhere on our library Web site.
- The University of Michigan has developed the most comprehensive harvester with its OAIster project. This tool, which aims to provide one-stop shopping for users interested in useful digital resources, searches over 3,000,000 digital objects from the repositories of 267 institutions. OAIster offers its software free of charge to other institutions that want to develop subject harvesters of their own. General in scope, its main selection criterion is that it harvests only records that are non-restricted and freely available to the public. If one does a subject search for Cades Cove, for example, it pulls up all the associated photos from our Great Smoky Mountain Collection. A broader search pulls up records from many other institutions. Note the list of participating institutions with corresponding descriptions of their digital collections.
- ARC is an experimental cross archive search service sponsored by the Old Dominion Digital Library. It is currently used as a research tool to investigate issues in harvesting metadata from OAI compliant repositories.
- My OAI is a search tool that allows an individual to create an account and select which OAI databases he/she wishes to harvest.
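To make the idea of cross archive searching concrete, the following is a minimal sketch of what such a tool does: it requests Dublin Core records from several repositories with the OAI-PMH ListRecords verb, pools them into one index, and searches that index. It is not the code used by OAIster, ARC, or My OAI. The repository names, the base URL, and the record identifiers are hypothetical, and the sample responses are built locally so that no network access is needed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, set_spec=None):
    """Build a ListRecords request for unqualified Dublin Core (oai_dc)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text, source):
    """Pull the identifier and titles out of each record in a response."""
    records = []
    root = ET.fromstring(xml_text)
    for rec in root.iter(OAI + "record"):
        identifier = rec.findtext(OAI + "header/" + OAI + "identifier")
        titles = [t.text for t in rec.iter(DC + "title")]
        records.append({"source": source, "id": identifier, "titles": titles})
    return records


def search(index, term):
    """Case-insensitive title search across every harvested repository."""
    term = term.lower()
    return [r for r in index
            if any(term in (t or "").lower() for t in r["titles"])]


def sample_response(items):
    """Build a tiny ListRecords response (stand-in for a real harvest)."""
    records = "".join(
        "<record><header><identifier>%s</identifier></header><metadata>"
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>%s</dc:title></oai_dc:dc></metadata></record>" % item
        for item in items)
    return ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
            "<ListRecords>%s</ListRecords></OAI-PMH>" % records)


# Hypothetical repositories and records, for illustration only.
harvested = {
    "Repository A": sample_response([
        ("oai:a.example.edu:1", "Cades Cove, view of church"),
        ("oai:a.example.edu:2", "Knoxville street scene"),
    ]),
    "Repository B": sample_response([
        ("oai:b.example.edu:7", "Cades Cove oral history"),
    ]),
}

# One pooled index is what lets a single query span many collections.
index = []
for source, xml_text in harvested.items():
    index.extend(parse_records(xml_text, source))

hits = search(index, "cades cove")
for hit in hits:
    print(hit["source"], hit["id"], hit["titles"])
print(list_records_url("http://repository.example.edu/oai"))
```

A production harvester would also need to follow resumption tokens for large result sets and to re-harvest on a schedule, neither of which is shown here.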
IV. OAI Reimplementation:
Some well-established databases and portals that have long been important to the academic community are currently reimplementing to become OAI compliant. The examples below illustrate this trend:
- PubMed: PubMed Central is now providing access to the metadata and full text of all its items: PMC OAI Service.
- NASA Technical Reports Server: NASA is reimplementing its technical reports server for OAI compliance.
- The Internet Archive: This portal to a wide variety of historical archives (Gutenberg Project, Wayback Machine, Million Book Project, Live Music Archive) is reimplementing to the OAI protocol to enable libraries and researchers to search its many text, audio, and moving image collections: The Internet Archive.
V. Institutional Repositories:
This section will attempt to provide information to the DLSC that will stimulate discussion and help in making decisions as we move towards establishing an Institutional Repository at UT. Decisions on such important issues as software selection, content, and faculty participation may be made easier by the experiences that others have had while traveling down this road. The resources below should provide a good foundation for us to build upon:
- Software Platforms: The proliferation of IR software makes it obvious that an informed discussion is necessary before rushing into a selection (these products do differ and offer discrete capabilities). What I observed is that the selection process is very dependent on the particular goals and needs of a given institution. SPARC makes available a selected list of worldwide institutional repositories that includes the type of software each uses: SPARC Resources.
- Guides to IR Software: The most current guide I have found to date is the Open Society Institute's Guide to Institutional Repository Software (Second Edition, Jan, 2004). It describes these currently available solutions: ARNO, CDSware, DSpace, Eprints, Fedora, i-Tor, and MyCoRe. This is a short and straightforward description of the platforms (it does not review or recommend one over another). It provides full contact information.
- Software Homepages: The following links connect to the homepages of IR software systems: DSpace Federation, Fedora, i-Tor, MyCoRe, Eprints, CDSware.
- Overviews and Articles: D-Lib Magazine has published articles about some of these packages: DSpace: An Open Source; The Fedora Project; and Eprints.org.
- Project Descriptions: The Open Systems Fedora Repository Project provides an interesting account of the University of Virginia (UV) digital library project, which resulted in the development of the Fedora software. Currently UV uses Fedora as the platform for its IR (and the rest of its digital collection as well). The University of Michigan offers a good description of its initial moves in preparing for an IR. It also examines why Michigan chose DSpace as its preferred digital content management system: A Study and Prototype Proposal.
- Notable Trends: The current software breaks down along two main types: 1. Systems designed exclusively to support institutional repositories. 2. Systems designed to present IRs and other types of digital objects.
- Hybrid Institutional Repositories: A few projects have emerged that in most respects meet the designation of Institutional Repository, except that they invite submissions from authors outside their own universities. These tend to be theme- or format-based initiatives.
- The Digital Library of Information Science and Technology, a project sponsored by the University of Arizona, accepts LIS submissions from any researcher wanting to publish scholarly content in this field. Its records are OAI compatible and it uses DSpace for its self-archiving software: DLIST.
- In some respects, the Networked Digital Library of Theses and Dissertations (NDLTD) also falls into this category. Restricted to ETDs, it is a format-based repository that includes publications from many institutions: Electronic Thesis/Dissertation OAI Union Catalog.
VI. Conclusions:
- While the LOC is arguably the most evolved in integrating and mainstreaming its DL resources with more established library services, progress in this area is minimal or non-existent across most of the digital library landscape.
- From the point of view of usability, the University of Washington presents one of the most successful homepages. It is instructive to examine the strategies this DL uses in putting its collections forward.
- The UT-DLC compares favorably with the digital programs of other large academic research libraries, and is poised to take a leading role in the mainstreaming process, if so resolved.
- A trend toward linking directly to DLs from library homepages, and toward cataloging collection level records, is apparent from the survey.
- Judging from the samples used, integration of harvesting tools into appropriate places on library Web sites has not yet occurred. With one exception (UGA), inclusion of collection titles in database lists, or subject research guides, is similarly missing from the DL landscape. Action in these areas by the UT-DLC could be trend-setting.
- DLs separate into two groups around end user strategies: those which employ two distinct homepages (one for the end users, and one for peers), and those which use one homepage for both these audiences. While the two page strategy has obvious benefits, some DLs are successful in targeting both audiences with one page (most notably, UW, Indiana, and UGA).
- Reimplementation of scholarly Web sites and portals for OAI compatibility is a trend that is gaining momentum.
- The wide variety of institutional repository software packages now available requires that due consideration be given before making a selection that might be premature.
- Most of our peers are in the earliest stages of IR development. Libraries that have already chosen a software platform and mounted their collections are the exception rather than the rule.
VII. Selected Resources: