This policy statement outlines Library and Archives Canada's strategic direction in providing optimal intellectual access to its digital published materials.
Library and Archives Canada (LAC) was created in 2004 with new legislated powers to collect electronic publications on legal deposit, and to archive websites and web domains for preservation purposes. Legal deposit of electronic publications came into effect on January 1, 2007. Specifically, LAC is mandated through its new legislation to: 1) sample the Canadian Web by acquiring and archiving periodic harvests of Canadian Internet domains such as gc.ca or .ca, and by acquiring and archiving selected websites, and 2) acquire and archive electronic publications through legal deposit or other means. LAC also follows a program of digitization of items in its published collections.
While the former National Library of Canada and National Archives of Canada collected digital publications and some Websites as early as 1994, the numbers were restricted by limited technological support and staff resources. Even so, LAC's collection of digital publications, at 28,000 titles, is already amongst the largest in the world. With new powers and mandated responsibilities, supported by new Legal Deposit Regulations (http://laws.justice.gc.ca/en/showtdm/cr/SOR-2006-337?noCookie), we anticipate an abundance of digital publications and websites to be collected by LAC in the near future. The ability to provide access to this wealth of digital information depends upon a well-planned resource description policy.
While there are many unknowns in terms of the size of the Canadian digital publishing universe and the scope and timing of technological supports, LAC has a mandate to provide intellectual access to its digital collections. This policy document sets out a cost-effective and realistic strategy for achieving intellectual access to the anticipated large numbers of digital resources.
Implementation of these policies will provide the structure for intellectual access to those materials collected under the Digital Collection Development Policy, as well as LAC's digitized publications. Specifically, these policies will facilitate the following:
This policy statement covers resource description policy for intellectual access to the following types of materials:
A digitally encoded information resource made available to the public either through a communications network like the Internet, or on a physical carrier.
Networked Digital Publication:
A networked digital publication normally comprises the linked objects on one communications network domain which are judged to be intrinsic to the publication.
A set of linked web pages, usually including a home page, related by content or domain, and prepared and maintained as a collection of information by a person, group, or organization. A web page is any computer file, graphical material, or grouping of text accessible via the Internet which can be addressed by a hypertext link and rendered for a user by a browser for display or printing.
A Web domain here refers collectively to those websites using the same domain suffix according to the Domain Name System (DNS), which codifies the practice of using a name as a more human-legible abstraction of a machine's numerical IP address on the Internet.
The term "harvesting" here refers to the practice of archiving in bulk certain groups of digital resources. These resources may be identified by means of their domain name suffix, or by other means.
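As an illustration only, selecting resources for a bulk harvest by domain name suffix could be sketched as follows. The function name in_harvest_scope, the sample URLs and the suffix list are hypothetical and are not part of any LAC system; the sketch simply assumes that a host is in scope when it equals a suffix or ends with a dot followed by that suffix.

```python
from urllib.parse import urlsplit


def in_harvest_scope(url: str, suffixes: tuple[str, ...]) -> bool:
    """Return True if the URL's host falls under one of the domain suffixes."""
    host = (urlsplit(url).hostname or "").lower()
    for suffix in suffixes:
        bare = suffix.lower().lstrip(".")
        # Match the suffix itself or any host ending in ".<suffix>",
        # so that "mocha" is not mistaken for a ".ca" host.
        if host == bare or host.endswith("." + bare):
            return True
    return False


# Hypothetical candidate URLs, for illustration only.
candidates = [
    "https://www.canada.gc.ca/home.html",
    "http://example.ca/report.pdf",
    "https://example.com/",
    "https://notgc.ca.example.org/",
]
selected = [u for u in candidates if in_harvest_scope(u, ("gc.ca", ".ca"))]
```

Matching on a leading dot rather than on a bare string ending keeps hosts that merely contain the suffix elsewhere in their name out of the harvest.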
In this context, an LAC collection resource in a traditional analogue format such as print or analogue sound recording, which has been digitally encoded through scanning, optical character recognition or other technique to become a digital reproduction stored as a computer file, normally for purposes of preservation and/or access.
Intellectual access mechanisms are put in place for all digital collection materials regardless of their access status.
LAC's current policy is to seek permission from publishers and creators to make their digital publications and websites publicly available on the LAC site. This policy can result in three categories of digital collection materials:
This access policy applies to networked digital publications and individual websites selected for the collection. Work is underway to broaden the range of options and methods for the provision of access, in order to balance creators' and publishers' rights with the public's need for access.
End users may be blocked from remotely viewing information resources with access restrictions in place. Users will know of the existence of a given digital information resource in the LAC collection, even if they are not able to view it under current access policies.
6.2.1 Basic Access
Intellectual access to LAC digital published collection items is ensured at a base level through full-text indexing and searching, through a Google-like search box, for all digital publications, websites and harvested domains. A user-focused approach to the presentation and display of descriptions and search methods is recommended.
This form of access will be furnished by means of the LAC digital content management system, and made available to users via the Public Access Module of AMICAN. The quality and sophistication of the search engine, and the ability to rank results in a meaningful way for users, are key to optimizing the benefits of this approach.
6.2.2 Supplementary Access
In addition, the following types of supplementary access will be provided. They represent a continuum of choices.
These choices are presented in approximate order of preference. However, this order of preference is a guide, not a prescription. In general, the most cost-effective method that results in an appropriate level of access is desired. Intellectual access methods should not normally be duplicated.
It should be noted that supplementary access is not proposed for all digital resources under this policy; there will be residual resources which do not meet the criteria proposed for supplementary access.
6.2.2.1 Externally-Supplied Metadata
Metadata may be supplied by publishers, other libraries, or other institutions, or imported by LAC with little or no human intervention. Metadata elements should be searchable through the AMICAN Public Access Module and federated search.
The 2006 Legal Deposit Regulations state that a publisher shall provide any available descriptive data about the publication including its title, creator, language, date of publication, format, subject and copyright information. For example, ONIX records from publishers, or Dublin Core records from a variety of sources, could be acquired by LAC along with the networked digital publications or websites which they describe. One current example of externally-supplied metadata is the generation of AMICUS bibliographic records for electronic theses from student-provided information.
Externally-supplied metadata could be stored in AMICUS/AMICAN or in a separate repository. In view of the lack of controlled forms of headings and variable data quality, it should be regarded as non-standard data which may not be suitable for re-distribution to other libraries. However, it can provide useful access to LAC collections, and may in some cases provide the only metadata describing a digital information resource, serving acquisitions and intellectual access purposes.
6.2.2.2 Automatically Generated or Extracted Metadata
LAC may automatically generate or extract metadata from digital publications and websites themselves. This metadata should be searchable through the AMICAN Public Access Module and federated search.
The quality of the resulting records, and the potential need for human review, will require further study. This method of achieving access is likely to be cost-effective, and could again be regarded as non-standard data, as above.
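By way of illustration, one possible form of automatic extraction is to read the title and embedded meta elements of an HTML page. The sketch below uses only the Python standard library; the class and function names, the sample page and the "DC." naming of the meta elements are assumptions made for the example, and the policy does not prescribe any particular extraction technique.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata: dict[str, list[str]] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content:
                # Element names are lowercased; repeated elements accumulate.
                self.metadata.setdefault(name.lower(), []).append(content.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.metadata.setdefault("title", []).append(data.strip())


def extract_metadata(html: str) -> dict[str, list[str]]:
    """Return the metadata found in one HTML page (unreviewed, non-standard data)."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.metadata


# Hypothetical page, for illustration only.
sample = """<html><head>
<title>Annual Report 2006</title>
<meta name="DC.creator" content="Example Agency">
<meta name="DC.language" content="en">
<meta name="DC.subject" content="Archives">
<meta name="DC.subject" content="Libraries">
</head><body><p>Text</p></body></html>"""
record = extract_metadata(sample)
```

Output of this kind depends entirely on what the creator embedded in the page, which is one reason the quality of the resulting records and the need for human review require study.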
6.2.2.3 Acquisitions Records
Acquisitions records created in AMICUS/AMICAN by LAC staff serve to document the acquisition of digital resources. They should be searchable through the AMICAN Public Access Module and federated search.
Where feasible and when considered cost-effective, externally-supplied or automatically generated or extracted metadata may be used as the basis for creating the acquisitions record. These records may serve as the only metadata describing a digital information resource.
6.2.2.4 Standard Bibliographic Descriptions
Standard MARC bibliographic records in AMICUS/AMICAN, created according to internationally-standardized cataloguing rules (Anglo-American Cataloguing Rules, Resource Description and Access), may also be used to provide access to selected networked digital publications. Standard bibliographic records are suitable for re-distribution to other libraries. They should be searchable through the AMICAN Public Access Module and federated search.
This is the most expensive form of access. Where feasible and when considered cost-effective, externally-supplied or automatically generated metadata may be used as the basis for creating a standard bibliographic description in AMICUS/AMICAN.
Building on the strengths of the LAC collection, the following criteria will be used in selecting networked digital publications and websites to receive a standard bibliographic record:
6.2.2.4.2 Description Level
Standard descriptions for networked digital publications and websites in the LAC collection should include authoritative access points for names and subjects, and subject headings in English and French.
One such record is the Access level record. A primary goal in choosing to provide a standard bibliographic description for digital materials is to provide standardized access by means of authoritative names and subjects, to achieve collocation with other collection materials. Collocation brings works by the same authors, or on the same subjects, or with the same title, together under standardized headings. Collocation helps users to find all the materials of interest to them in a collection, rather than just a subset of materials which happen to be listed under one of several possible forms of name or subject term. Most other functions of a bibliographic record, such as the unequivocal identification of an item or assistance in choosing one item over another, can be more economically achieved through full-text searching, or through the digital item itself. Therefore, when creating a standard bibliographic description, LAC would provide the greatest benefit to users by focusing its description on those aspects that support the collocation function (name and subject authority control), and de-emphasizing other functions.
6.2.2.4.3 Granularity (Level of Aggregation)
Networked digital publications and websites are described at similar levels of granularity as other LAC collection resources. The same criteria as in 6.2.2.4.1 are used to select levels of granularity for description. If considered to be important, links may be provided between records at different levels of granularity (between parent and child records).
Granularity refers to the extent to which a system contains discrete components of ever-smaller size. In resource description, the level of granularity refers to the extent to which descriptions are provided at higher or lower levels in the hierarchy of discrete bibliographic components: a serial title, an individual serial issue, an article in a serial issue; in the case of websites, the whole website or discrete sub-sites.
LAC may create additional descriptions at higher or lower levels - a collection level record for groups or collections of digital publications and/or websites, or descriptions for components of publications or websites.
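The parent and child links between descriptions at different levels of granularity can be pictured with a small data structure. This is a minimal sketch under stated assumptions: the class name Description, the level labels and the sample titles are hypothetical, and the sketch does not represent how AMICUS/AMICAN stores or links records.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Description:
    """One description at a given level of granularity."""

    title: str
    level: str  # illustrative labels such as "serial", "issue", "article"
    parent: Optional["Description"] = field(default=None, repr=False, compare=False)
    children: list["Description"] = field(default_factory=list, repr=False)

    def add_child(self, title: str, level: str) -> "Description":
        """Create a lower-level description linked to this one."""
        child = Description(title, level, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list[str]:
        """Titles from the highest level down to this description."""
        titles = []
        node: Optional["Description"] = self
        while node is not None:
            titles.append(node.title)
            node = node.parent
        return titles[::-1]


# Hypothetical serial, issue and article, for illustration only.
serial = Description("Example Journal", "serial")
issue = serial.add_child("Vol. 1, no. 2", "issue")
article = issue.add_child("An Example Article", "article")
```

Here article.path() walks up the parent links and returns the titles from the serial down to the article, which is the kind of navigation that linked parent and child records make possible.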
6.2.2.4.4 Description Priorities
Networked digital publications and websites are described in order of priority as defined for other formats of publications.
6.2.3 Interim Provisions
In the interim before LAC is able to provide the base level of full-text indexing of digital resources, supplementary intellectual access will be provided to all digital publications and websites, rather than a selection. During this interim period, all levels of description may be used; levels of description for digital materials will be the same as those for other publications.
It is recommended that LAC formally assess the effectiveness and cost of using externally supplied and automatically generated metadata, as described above. This assessment should take place after LAC has had sufficient experience with these methods.
A bibliographic record for the resource in its original medium, if available and adequate for access, is the preferred record for a digitized resource. If an adequate record for the resource in its original medium is not available, intellectual access to LAC digitized resources follows the same policies and criteria as for networked digital publications, websites and harvested web domains (see section 6.2.2.4.1).
Because digitized resources are reproductions of items already in the LAC collection, there will exist in many cases a good quality bibliographic record for the resource in its original medium. This record, if available and adequate for access, is the preferred record for a digitized resource; details of the digitized resource (including the URL) are added to the record for the original medium.
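The single-record approach can be sketched in a few lines. The record layout below (a dictionary with a list of tagged fields) is a simplification invented for the example, and recording the URL in MARC field 856, subfield $u, is an assumption, since the policy says only that details of the digitized resource, including the URL, are added to the record for the original medium. Whether a record is "adequate for access" is treated here as a judgement supplied by staff.

```python
from typing import Optional


def record_for_digitized(
    original: Optional[dict], url: str, adequate: bool
) -> Optional[dict]:
    """Return the original-medium record with the digitized URL added,
    or None when no adequate original record exists."""
    if original is None or not adequate:
        # Caller falls back to the policy for networked digital publications.
        return None
    # Copy the field list so the input record is left unchanged.
    updated = {**original, "fields": list(original.get("fields", []))}
    link = ("856", {"u": url})  # assumed placement of the URL
    if link not in updated["fields"]:
        updated["fields"].append(link)
    return updated


# Hypothetical record and URL, for illustration only.
original = {"id": "hypothetical-0001", "fields": [("245", {"a": "An Example Title"})]}
digitized_url = "https://example.org/digitized/0001"
updated = record_for_digitized(original, digitized_url, adequate=True)
```

Returning None for the inadequate case keeps the fallback explicit, so the choice between reusing the original record and creating a new description stays visible to the caller.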
Following the international division of bibliographic labour represented by the concept of Universal Bibliographic Control (UBC), Library and Archives Canada as the national bibliographic agency for Canada has the sole responsibility to accurately and comprehensively list all publications emanating from Canada, including digital publications. However, this policy (as well as others in this report) will result in the reduced comprehensiveness of Canadiana, the national bibliography, which is the single-entry inventory of Canadian publications intended to be as comprehensive as possible. It is recognized as well that this policy represents variance from strict application of description standards (AACR) which require separate bibliographic records for each format of publication.
On balance, the costs and benefits favour a single record covering both formats, which makes cost-effective use of available, high-quality bibliographic records. In many cases, the end user prefers the uncluttered simplicity of a single record containing a description of the original item, as well as the URL for its digitized version. For the end user, immediacy of access to these publications is an important benefit. Many other Canadian libraries already follow the "single record" path, and would welcome this approach by LAC.
Linking to websites is done through a bibliographic record in AMICUS/AMICAN, which may be at an abbreviated level.
"Websites which are not Canadian, but which are of interest to Canadians (e.g. professional association sites, official sites of foreign governments) generally are linked to and are not captured for inclusion in the LAC Collection." (Digital Collection Development Selection Guidelines).
To ensure maximum usefulness of externally-supplied metadata, LAC has developed guidelines for digital publishers and creators on best practices for provision of metadata to LAC for their titles. These guidelines will be made widely available, and reflected in the digital "loading dock" web form which publishers are asked to use when transmitting their digital titles to LAC on legal deposit.
LAC's Metadata Framework for Resource Discovery is a tool for decision-making, communication and leadership. It establishes principles and guidelines for resource description within LAC. LAC's Web resource description policies conform to the principles articulated in the Metadata Framework.
The Web resource description policies in this report were developed with reference to the Digital Collection Development Policy. In particular, the selection guidelines for websites, developed in 2005/06, have been incorporated into the criteria for creating standard bibliographic descriptions. Selection guidelines for networked publications were developed in May 2006.