Library and Archives Canada

JPEG 2000 as a Preservation Format for Digital Raster Images at Library and Archives Canada

JPEG2000 Preservation File Format Working Group
Library and Archives Canada
June 2008

Executive Summary

This paper proposes that Library and Archives Canada (LAC) adopt baseline JPEG2000 as a preservation format for digital raster images. In support of this proposal, JPEG2000 is described in summary and assessed as a partial solution to LAC's digital storage challenges, while offering added access features and bandwidth savings. The format is also weighed against preservation format criteria established by LAC to ensure the long-term safeguarding of the organization's digital image collection. A bibliography and environmental scan of peer memory institutions are included as appendices.

Introduction and Purpose

The need to find a viable alternative to large, uncompressed TIFF files as a preservation format for digital raster images is a growing concern for Library and Archives Canada (LAC). The availability of adequate storage space is always a critical issue, and LAC's digital collection is poised to expand dramatically as the organization initiates large-scale digitization efforts to expand online access to its analog holdings. LAC is not alone in this regard, and other memory institutions internationally – among them the National Library of the Netherlands and the National Library of Norway – have investigated alternative, more storage-efficient file formats to use in place of the industry standard TIFF.

The comparatively recent JPEG2000 file standard, from the same joint subcommittee that published the original JPEG standard in 1992, offers a promising balance between preservation-quality lossless compression and substantially reduced storage space (typically a 2:1 ratio for lossless compression compared to uncompressed TIFF, with no loss of image quality1). In addition, the ability to 'extract on demand' lower resolution images from a single, lossless master offers further potential for storage efficiency by eliminating the need to store multiple copies of the same image at varying resolutions for both preservation and access purposes.

JPEG2000 offers other attractive features for collection dissemination as well. Progressive transmission, for example, allows a low resolution image to be displayed to a recipient once a fraction of the whole file has been transmitted, gradually increasing in refinement until the full resolution is displayed. As well, region of interest (ROI) decoding permits portions of an image to be viewed at higher resolutions than others, improving performance because the entire file need not be decoded at a sharper resolution with each progressive 'zoom.' Such decoding flexibilities promise substantial bandwidth savings during file transmission, as well as an improved overall user experience.
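The idea behind 'extract on demand' and progressive refinement can be illustrated with a minimal sketch. It assumes a much simpler transform than the one JPEG2000 actually specifies: the integer S-transform (a reversible Haar-like wavelet) applied to a single row of pixel values. The function names and sample values are hypothetical and chosen only for illustration.

```python
def forward_step(samples):
    """One level of the integer S-transform: pairwise averages and differences."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    low = [(a + b) >> 1 for a, b in pairs]   # half-resolution approximation
    high = [a - b for a, b in pairs]         # detail needed to undo the averaging
    return low, high


def inverse_step(low, high):
    """Exactly undo forward_step using integer arithmetic only."""
    out = []
    for s, d in zip(low, high):
        a = s + ((d + 1) >> 1)
        out.extend((a, a - d))
    return out


def decompose(samples, levels):
    """Return the coarsest band plus the detail bands, coarsest detail first."""
    low, details = list(samples), []
    for _ in range(levels):
        low, high = forward_step(low)
        details.insert(0, high)
    return low, details


def reconstruct(low, details):
    """Refine the coarse band with however many detail bands have arrived."""
    for high in details:
        low = inverse_step(low, high)
    return low


# Hypothetical row of 8-bit pixel values (length divisible by 2**levels).
row = [10, 12, 200, 50, 7, 7, 0, 255]
coarse, details = decompose(row, levels=2)

preview = reconstruct(coarse, details[:1])  # only the first detail band received
full = reconstruct(coarse, details)         # all detail bands received
```

Here `coarse` is a quarter-resolution version of the row, `preview` is a half-resolution version, and `full` matches the original exactly. A single stored decomposition therefore serves several resolutions, which is the property the paragraph above describes.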

Dissemination features and storage savings, however, must be balanced with the requirement to safeguard the integrity of digital information assets for future generations, and any consideration of alternate preservation file formats must be weighed against clear criteria sets to ensure the longevity of preserved images.

The purpose of this paper is to propose that JPEG2000 be adopted as a preservation format for digital raster images at LAC, through a balanced consideration of the format's advantages and the organization's needs.

The JPEG2000 File Format

JPEG 2000 is a comparatively new image compression standard based on advances in wavelet technology (see www.jpeg.org). The standard was developed by the Joint Photographic Experts Group (JPEG), a joint committee of the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), and the ITU Telecommunication Standardization Sector (ITU-T). The Joint Photographic Experts Group is the same committee that published the now ubiquitous JPEG standard in 1992, but with a different set of international commercial and academic participants.

The impetus for a new JPEG standard arose from a desire to resolve many of the limitations of the original standard while recognizing broadening areas of application for JPEG technology. With this in mind, the group set out to create a new standard in accordance with the following basic objectives:

  • Ensure openness through a well-documented standard with widely available technical specifications
  • Offer an improved lossy compression algorithm
  • Create an option for lossless compression
  • Offer comprehensive options for bundling metadata
  • Permit storage of several resolutions within a single file

The desire to create an option for lossless compression deserves particular attention from those interested in long-term preservation. The original baseline JPEG is "lossy," meaning that an image, once compressed, cannot be recovered exactly to its uncompressed state. Though the loss is irreversible, the resulting differences are for the most part minute and visually unnoticeable, or "visually lossless." In some cases, however (image preservation being one of them), a truly "lossless" compression is desired, such that a compressed image can be recovered – bit for bit – to its original pre-compressed state.
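The distinction between lossy and lossless compression can be made concrete with a short sketch. It does not use JPEG or JPEG2000 codecs: as stand-ins, it assumes zlib (DEFLATE) for the lossless round trip and a simple quantization step for the lossy one, and the synthetic pixel data is hypothetical.

```python
import zlib


def lossless_roundtrip(pixels: bytes) -> bytes:
    """Compress and decompress with zlib, a stand-in for a lossless codec."""
    return zlib.decompress(zlib.compress(pixels, 9))


def lossy_roundtrip(pixels: bytes, step: int = 8) -> bytes:
    """Quantize each sample to a multiple of `step`, discarding fine detail.

    This is a stand-in for a lossy codec; real codecs are far more elaborate.
    """
    return bytes((p // step) * step for p in pixels)


# Synthetic 64 x 64 greyscale image, one byte per pixel (hypothetical data).
pixels = bytes((x * 7 + y * 13) % 256 for y in range(64) for x in range(64))

restored = lossless_roundtrip(pixels)
approx = lossy_roundtrip(pixels)
max_error = max(abs(a - b) for a, b in zip(pixels, approx))

print("lossless is bit-for-bit identical:", restored == pixels)
print("lossy is bit-for-bit identical:", approx == pixels)
print("largest per-pixel error after lossy round trip:", max_error)
```

The lossy result stays close to the original, which is what "visually lossless" refers to, but only the lossless round trip returns the exact original bytes.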

JPEG published its first draft specification for JPEG2000 in 1999, which to many represented not simply an upgrade to the previous format, but a new standard altogether. The standard is divided into twelve parts, most of which have variously followed the formal process of standardization through ISO/IEC. A short description of each follows:

  • Part 1: Core Coding System
    Part 1 defines the core coding system, or baseline, for the compression of still images (defining the basic file format "JP2"), including the JPEG2000 codestream and the steps involved in coding and decoding images. Essentially, part 1 offers a stand-alone description of a basic JPEG 2000 system. None of the other parts are essential to a baseline JPEG2000 implementation. Part 1 became an ISO/IEC standard in December of 2000 (15444-1).
  • Part 2: Extensions
    Part 2 offers optional extensions ("JPX") to the baseline JPEG2000 format, which may or may not form part of a JPEG2000 implementation. Each of the identified extensions is independent of the rest, allowing for custom combinations. Part 2 became an ISO/IEC standard in November of 2001 (15444-2).
  • Part 3: Motion JPEG2000
    Part 3 defines the file format "MJ2" (or "MJP2") for motion sequences of JPEG2000 images. The format allows multiple JPEG2000 image frames to be combined into movie tracks, and is intended to support lossless or near-lossless quality while allowing files to be scaled down as required for delivery. Part 3 became an ISO/IEC standard in November of 2001 (15444-3).
  • Part 4: Conformance Testing
    Part 4 is concerned with the testing of conformance to JPEG2000 part 1 (baseline), specifying test procedures for encoding and decoding. Part 4 became an ISO/IEC standard in May of 2002 (15444-4).
  • Part 5: Reference Software
    Part 5 contains two source code packages for the implementation of part 1, distributed under open-source style arrangements. Part 5 became an ISO/IEC standard in November of 2001 (15444-5).
  • Part 6: Compound Image File Format
    Part 6 is designed to support document imaging, describing a hierarchy of page relationships. Part 6 became an ISO/IEC standard in April of 2005 (15444-6).
  • Part 7 was initially proposed but subsequently abandoned.
  • Parts 8 to 11 are designed to apply JPEG2000 to specific contexts:
    • Part 8: JPSEC covers additional support for security and encryption functionality, and became an ISO/IEC standard in July of 2006 (15444-8).
    • Part 9: JPIP defines a network transport (client-server) protocol, and became an ISO/IEC standard in October of 2004 (15444-9).
    • Part 10: JP3D is concerned with the coding of three dimensional data sets.
    • Part 11: JPWL covers JPEG2000 coding techniques for wireless applications, and became an ISO/IEC standard in June of 2007 (15444-11).
  • Part 12: ISO Base Media File Format
    Part 12 is a joint JPEG and MPEG (Moving Picture Experts Group) initiative to establish a base file format for future applications (specifically for timed sequences of media data), and became an ISO/IEC standard in July of 2003.

Preservation-Quality File Formats: LAC Criteria Definitions

While advances in technology promise vastly increased opportunities for access to and dissemination of information, the rate of hardware and software obsolescence is alarming from a file preservation point of view.2 Thus, clear criteria must be established to ensure that file formats selected for digital preservation will permit long-term, undiminished access to digital information assets. Upon a review of similar criteria sets published by the Library of Congress, the National Archives (UK), and the National Library of the Netherlands,3 Library and Archives Canada has established the following five criteria, which represent common threads in the sets produced by the aforementioned institutions.

  • 1. Openness/Transparency
    The relative ease with which knowledge of the file format and its technical information can be accumulated.
  • 2. Uptake among peer cultural institutions internationally
    The extent to which the format has been formally adopted by national libraries, archives, and other memory institutions internationally.
  • 3. Stability/Compatibility
    a) The degree to which the format is backward and forward compatible.
    b) The degree to which the format is protected against file corruption.
    c) The relative frequency of release of newer or replacement versions of the format over time.
  • 4. Dependencies/Interoperability
    The degree to which the format relies on particular hardware, software, readers, etc.
  • 5. Standardization
    The extent to which the format has gone through a rigorous formal standardization process.

In addition to these five criteria, further corporate considerations include:

  • Storage space and bandwidth requirements
  • Flexibility of access
  • Metadata
  • Future-looking

Criteria Applied to JPEG 2000

JPEG2000's relative compliance with LAC's criteria is ranked low, medium, or high against each factor:

  • 1. Openness/transparency (High)
    As per the Joint Photographic Experts Group's initial objectives, JPEG2000 is an open standard. As such it is intended to be "future proof," royalty free, and encourage competitive product creation while maintaining interoperability.4
  • 2. Uptake among peer cultural institutions internationally (Medium)
    An environmental scan of JPEG2000 usage in memory institutions internationally reveals that the format has not been widely adopted for preservation purposes (see Appendix B). At the time of publication of the National Library of the Netherlands' "Alternate File Formats for Storing Master Images of Digitization Projects," for example, only one cultural institution had been identified as having definitively chosen JPEG2000 as its sole archival format.5 That being said, the scan also reveals that implementations of the file format in general are quite common, though primarily limited to use as an access format. Thus, though JPEG2000 is not yet widely used for image preservation, it is certainly not an uncommon file format in major memory institutions internationally.
  • 3. Stability/Compatibility (Medium – see individual factors below)
    a) A clear weakness of JPEG2000 is that it is not backward compatible with other JPEGs inasmuch as programs able to read JPEG files cannot automatically read JPEG2000 files. (Low)

    b) JPEG2000 promises robust error resilience through the provision of error detection and concealment mechanisms.6 (High)

    c) The decade between the publication of JPEG and JPEG2000 suggests that baseline (part 1) JPEG2000 is not likely to be replaced or superseded in the immediate future. (High)
  • 4. Dependencies/Interoperability (Medium)
    As an open standard, JPEG2000 was designed to be platform independent, and both hardware and software implementations are currently available from several vendors internationally. However, it is a concern that the format is not natively supported by most browsers and requires an additional plug-in in order to operate. A notable exception is the Apple Safari browser, which, though natively supporting the format, does not support the full range of JPEG2000 functionality.
  • 5. Standardization (High)
    JPEG2000 has been standardized by both the ITU Telecommunication Standardization Sector (ITU-T) and the International Organization for Standardization (ISO) in conjunction with the International Electrotechnical Commission (IEC), as ITU-T T.800 and ISO/IEC 15444.
  • Corporate Considerations (High)
    Certainly, one of JPEG2000's major attractions is its reduced storage requirement for losslessly compressed images. Tests have consistently shown that storage savings of around 50% (2:1) compared to uncompressed TIFF files can be anticipated without loss of image quality.7 Additionally, region of interest decoding and progressive transmission promise further savings through reduced bandwidth requirements, and generally improve the user experience through more rapid and flexible transmission and manipulation options. JPEG2000 also offers substantial metadata support,8 delivered by bundling metadata elements, which are then permanently associated with an individual image file.
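The relationship between the 2:1 compression ratio and the roughly 50% storage savings cited above can be worked through as a short calculation. The collection size and per-image size below are hypothetical figures chosen for illustration, not LAC data.

```python
def compressed_size(uncompressed: float, ratio: float = 2.0) -> float:
    """Size after compression at `ratio`:1 (2.0 means half the original size)."""
    return uncompressed / ratio


def savings_fraction(ratio: float = 2.0) -> float:
    """Fraction of storage saved at `ratio`:1 compression."""
    return 1.0 - 1.0 / ratio


# Hypothetical collection: 100,000 uncompressed TIFF masters at 50 MB each.
image_count = 100_000
mb_per_tiff = 50

tiff_total_tb = image_count * mb_per_tiff / 1_000_000  # decimal terabytes
jp2_total_tb = compressed_size(tiff_total_tb)          # at the cited 2:1 ratio

print(f"TIFF masters: {tiff_total_tb:.1f} TB")
print(f"JPEG2000 masters at 2:1: {jp2_total_tb:.1f} TB")
print(f"Savings: {savings_fraction():.0%}")
```

The actual ratio achieved will vary with image content, so a planning estimate would normally be run over a range of ratios rather than a single value.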

Implementation Considerations

From an implementation perspective, JPEG2000 performance and the organizational preparedness of Library and Archives Canada deserve consideration:

  • LAC has some experience with encoding JPEG2000 lossy files for presentation, but a significant learning curve is expected with a fuller implementation of JPEG2000 as both a preservation and a presentation file format. LAC needs to build capacity and expertise in JPEG2000, and requires more exploration of the encoding and decoding process to generate lossless JPEG2000 files.
  • LAC must research and develop robust procedures to ensure that conversion from TIFF to JPEG2000 lossless format is reliable and quality controlled.
  • There remains the question of LAC's considerable legacy of TIFF image masters. Should JPEG2000 be adopted institutionally as a preservation format, to what extent should these legacy TIFFs be migrated?
  • Wavelet technology is complex and can place additional demands on hardware resources.
  • Being a preservation format, JPEG2000 must be integrated into LAC's developing Trusted Digital Repository.
  • LAC must integrate these recommendations with advice to federal government departments and our broader community of stakeholders and creators.
  • Further research and investigation should be carried out with regard to the suitability of motion JPEG2000 as a preservation file format for digital video at LAC.
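One possible shape for the quality control step mentioned above is a comparison of decoded pixel data before and after conversion. The sketch below is an assumption about how such a check might be structured, not an established LAC procedure. The two decoder callables are hypothetical placeholders for whatever TIFF and JPEG2000 decoding tools are chosen, and the in-memory dictionaries merely simulate their output.

```python
import hashlib
from typing import Callable, Iterable, List, Tuple


def pixel_digest(pixels: bytes) -> str:
    """SHA-256 of decoded pixel data, independent of the container format."""
    return hashlib.sha256(pixels).hexdigest()


def failed_conversions(
    pairs: Iterable[Tuple[str, str]],
    decode_source: Callable[[str], bytes],
    decode_target: Callable[[str], bytes],
) -> List[str]:
    """Return the source names whose converted copy does not decode identically."""
    failures = []
    for source_name, target_name in pairs:
        source_hash = pixel_digest(decode_source(source_name))
        target_hash = pixel_digest(decode_target(target_name))
        if source_hash != target_hash:
            failures.append(source_name)
    return failures


# Simulated decoder output (hypothetical file names and pixel bytes).
tiff_pixels = {"a.tif": b"\x00\x10\x20", "b.tif": b"\x05\x06\x07"}
jp2_pixels = {"a.jp2": b"\x00\x10\x20", "b.jp2": b"\x05\x06\x08"}

failures = failed_conversions(
    [("a.tif", "a.jp2"), ("b.tif", "b.jp2")],
    tiff_pixels.__getitem__,
    jp2_pixels.__getitem__,
)
print("conversions needing review:", failures)
```

The comparison is made on decoded pixels rather than on the files themselves, because a TIFF and a JPEG2000 file holding the same image will never be byte-identical as files.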

Environmental Scan: General Observations9

A scan of JPEG2000's adoption in peer cultural institutions reveals some noteworthy trends. For one, the format has evidently not been widely adopted by cultural memory institutions as a preservation format for still raster images. Of the institutions reviewed, only the British Library, the National Library of Norway, and Smithsonian Libraries appear to use JPEG2000 for preservation purposes. Others, however, including the National Diet Library of Japan, the National Library of the Netherlands, and the State Library of Queensland are in similar stages to Library and Archives Canada in examining or providing recommendations regarding JPEG2000 for image preservation. Conversely, the format is widely used as a web access format, though most institutions choose to retain TIFF masters in accordance with industry standard.

In the area of film preservation, however, motion JPEG2000 has had a more substantial impact, most notably through the introduction by Digital Cinema Initiatives (DCI) of motion JPEG2000 (MJ2 or MJP2) as an industry standard for digital cinema compression. Other industries that have adopted JPEG2000 to various extents include military imaging, criminal investigation, and geospatial imagery.

Conclusion and Recommendation

LAC's JPEG2000 Preservation File Format Working Group recommends that the organization adopt JPEG2000 part 1 (baseline) as a preservation file format for digital raster images.

From the perspective of the working group, JPEG2000 represents an appropriate balance between strategic corporate considerations and criteria established to ensure the long-term preservation of digital image assets (including integration into LAC's developing Trusted Digital Repository). As LAC's image collection expands – a trend that is expected to accelerate as plans to digitize large analog collections are realized – available storage space will diminish beyond already critical levels. Increased storage efficiency, then, makes JPEG2000 an attractive alternative to TIFF. Additionally, features such as progressive transmission and region of interest decoding promise to reduce bandwidth requirements while increasing access flexibility. The format scores highly against the LAC preservation format criteria of openness and standardization, and scores adequately with regard to uptake among peer cultural institutions, stability/compatibility, and dependencies/interoperability. The biggest risks to LAC's adoption of the format would appear to be a lack of industry uptake for preservation purposes, backward incompatibility, and a lack of native browser support. However, it is the working group's assumption that JPEG2000's ubiquity and interoperability will continue to expand over time, and that the risks currently posed are not critical enough to exclude the format as a preservation option.


1 See Gillesse et al (2008), Buckley (2008), Bernier (2006), Janosky & Witthus (2003).

2 The UNESCO Charter on the Preservation of Digital Heritage cites rapid hardware and software obsolescence as a key factor in putting the world's digital heritage at risk. (http://portal.unesco.org/ci/en/ev.php-URL_ID=13367&URL_DO=DO_TOPIC&URL_SECTION=201.html)

3 See Gillesse et al (2008); Rauch, Carl et al. "File-Formats for Preservation: Evaluating the Long-Term Stability of File-Formats." Proceedings ELPUB2007 Conference on Electronic Publishing: Vienna, Austria, 2007. http://elpub.scix.net/data/works/att/122_elpub2007.content.pdf; National Archives (UK). "Selecting File Formats for Long-Term Preservation." (2003). http://www.nationalarchives.gov.uk/documents/selecting_file_formats.rtf; Library of Congress. "Sustainability of Digital Formats: Planning for Library of Congress Collections." (2007). http://www.digitalpreservation.gov/formats/sustain/sustain.shtml.

4 See Murray (2004).

5 See Gillesse et al (2008).

6 See Chai & Bouzerdoum (2001), Gormish (1999).

7 See Gillesse et al (2008), Buckley (2008), Bernier (2006), Janosky & Witthus (2003).

8 See Yale (2008), Murray (2004).

9 See Appendix B.