Initiative for Equitable Library Access

Strategic Research

FRBR and RDA: Advances in Resource Description for Multiple Format Resources

6. Potential impact for resource discovery and data display: experiments with FRBR-ization

RDA encourages the recording of sufficient metadata and parses the data into data elements. RDA does not dictate how the metadata is displayed, nor how the search engine will use various elements to refine a search and drill down to the appropriate resource. But the use of RDA is intended to support and strengthen this new generation of navigation and of data display.

Many researchers and vendors have started to investigate and promote "FRBR-ized" displays of data. "FRBR-ization" means an application of the FRBR conceptual model in a real environment. Most current FRBR-izations use data in AACR/MARC records and apply some of the FRBR concepts to improve displays. A full FRBR-ization will require sufficient metadata recorded about the work, expression and manifestation level attributes, and a sufficient parsing of data into separate elements to permit manipulation of data for use in designing better navigation and better data displays. However, even with the available pre-RDA data, it is encouraging to see how an awareness of FRBR can already lead to better data displays.

Carlyle and Sumerlin summarize one of the major obstacles confronting users of current catalogues:

Many current catalog searches result in displays composed of lists of hundreds or even thousands of records. These lists do little to shed light on the nature and characteristics of the records retrieved. In addition, it is likely they inhibit a user's ability to identify relevant records. Displays that organize retrieved record sets into intelligible categories may communicate search results more quickly and effectively to users than current catalog displays that consist of long lists of brief record summaries.76

Carlyle and Sumerlin also point to a possible solution: better clustering of results to present a meaningful display to the user. The effective organization of information hinges on collocating resources that share an attribute while also making clear the differences between them.

The essential and defining objective of a system for organizing information, then, is to bring essentially like information together and to differentiate what is not exactly alike.77

Unfortunately, many current OPACs do not fulfill this objective well and return long lists of unintelligible results. Or in Patrick Le Boeuf's words:

The wonderful syndetic structure of printed catalogs has yielded to databases that are barely more than collections of unrelated monads.78

When the Format Variation Working Group was appointed, one of its first tasks was to explore the viability of expression-level cataloguing. As part of the background to inform this work, members of the Group analyzed sets of existing MARC records to see if the data recorded in each field and subfield was consistently describing an attribute at the level of work, expression, manifestation or item. They discovered many areas of ambiguity and overlap.

However, most participants expressed some surprise at the difficulty of the exercise, especially given that most examples were known expression sets (i.e. there was no question that all of the manifestations represented the same expression) ... As a result of this exercise, the Group affirmed what has been observed by many ... : while in many cases it is possible for a cataloger to identify easily when several manifestations represent the same intellectual content (i.e. the same expression), the bibliographic data does not always "behave" in a way that is conducive to constructing a bibliographic record for an expression that would include predictable data elements.79

The Functional Analysis of the MARC 21 Bibliographic and Holdings Formats 80 mapped the correlations between MARC and FRBR. The mappings also demonstrated that there were areas of ambiguity and overlap. Some MARC elements do not map to anything in the FRBR model, such as MARC elements for record processing. Some FRBR attributes do not map unambiguously into MARC, or may be recorded in non-specific textual fields, such as general notes. The MARBI discussion paper 2002-DP08, Dealing with FRBR Expressions in MARC 21, points out that half of the expression-level attributes in FRBR do not have a specific MARC 21 field to contain them.81 Attributes of different entities are sometimes mixed or concatenated in one MARC data field. Strings of data in the same field that carry information about more than one entity make it harder to manipulate the data for use in creating meaningful clusters.

Ed O'Neill, a research scientist at OCLC, conducted a study to evaluate whether the bibliographic information in WorldCat MARC records was sufficient to identify FRBR entities and to allow a FRBR-ized display of search results. He chose a single work, The Expedition of Humphry Clinker by Tobias Smollett.82 He concluded that works can be reliably identified based on current information in bibliographic records, but expressions cannot be reliably identified because information is often missing:

The FRBR model provides a powerful means to improve the organization of bibliographic items, particularly for large works such as Humphry Clinker where there is no way to navigate easily within the work. Works are a valuable concept and provide a means by which to aggregate bibliographic units and simplify database organization and retrieval. It appears that works can be reliably identified from existing bibliographic records. Identifying expressions, however, is far more problematic. In the example of Humphry Clinker, the set of expressions created from the existing bibliographic records is very different from the set based on the physical examination of the books themselves... Existing bibliographic records simply do not contain sufficient information to consistently associate the records with expressions. 83

The available data has limitations. Current FRBR-izations of MARC record catalogues can only achieve partial success. Yet, even with the limitation of imperfectly recorded data, the application of FRBR concepts immediately improves the results for users.
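As a minimal sketch of what such a partial FRBR-ization involves, the following Python groups simplified records first into works and then into expressions. The field names (`author`, `title`, `uniform_title`, `language`, `form`) and the normalization rules are assumptions made for illustration; they are not taken from OCLC's algorithm or from any MARC mapping. Records that lack language or form data land in an undetermined expression group, which mirrors the problem of insufficient expression-level data described above.

```python
import re
import unicodedata
from collections import defaultdict

LEADING_ARTICLES = ("the ", "a ", "an ")  # assumption: English articles only


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", unaccented.lower())
    return " ".join(cleaned.split())


def normalize_title(title):
    """Normalize a title and drop one leading article so variants collate."""
    text = normalize(title)
    for article in LEADING_ARTICLES:
        if text.startswith(article):
            return text[len(article):]
    return text


def work_key(record):
    """Hypothetical work key: author plus uniform title (or title proper)."""
    title = record.get("uniform_title") or record["title"]
    return (normalize(record["author"]), normalize_title(title))


def expression_key(record):
    """Language and form of expression, or None when either is unrecorded."""
    language = record.get("language")
    form = record.get("form")
    if language is None or form is None:
        return None
    return (language, form)


def frbrize(records):
    """Return a nested mapping: work key -> expression key -> records."""
    works = defaultdict(lambda: defaultdict(list))
    for record in records:
        works[work_key(record)][expression_key(record)].append(record)
    return works


# Invented sample records for illustration only.
records = [
    {"author": "Smollett, Tobias", "title": "The expedition of Humphry Clinker",
     "language": "eng", "form": "text"},
    {"author": "Smollett, Tobias", "title": "Humphry Clinker",
     "uniform_title": "Expedition of Humphry Clinker",
     "language": "eng", "form": "text"},
    {"author": "Smollett, Tobias", "title": "The expedition of Humphry Clinker"},
]

works = frbrize(records)
for work, expressions in works.items():
    print(work)
    for expression, manifestations in expressions.items():
        label = expression if expression is not None else "expression undetermined"
        print("  ", label, "-", len(manifestations), "record(s)")
```

In this sketch all three records collate under one work, but the third cannot be assigned to an expression because nothing in the record says which expression it embodies.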

OCLC has been a front-runner in experimenting with current applications of the FRBR model. It has launched a very successful service called xISBN. This service builds on the relationship between manifestations of the same expression. Each manifestation of a book has its own identifier, its ISBN. Users may need one particular manifestation, but often they are searching for a copy of a particular expression. In a pre-RDA application of the FRBR model, OCLC uses an algorithm to pull together related ISBNs.

The xISBN Web service supplies ISBNs and other information associated with an individual intellectual work that is represented in WorldCat. Submit an ISBN to this service, and it returns a list of related ISBNs and selected metadata ... , rather than requiring an end user to traverse multiple records that represent many different manifestations of a book—including printings, hardback or paperback editions or even filmed versions—"FRBRized" WorldCat information allows that user to review a core record that lists all manifestations.84

RDA will encourage the recording of sufficient metadata so that one can cluster manifestations of the same expression. At this point, MARC records contain varying amounts of data with which to work, and thus clustering by expression has uneven results. The xISBN service adds a further degree of clustering by pulling together manifestations of the same "intellectual work". It does not claim to sort out expressions of the same work. Since ISBNs are assigned to books, it does in effect cluster together all the expressions in the form of alpha-numeric notation.
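The idea of pulling together related ISBNs can be sketched in a few lines of Python. This is not the xISBN service or its API; it is a hypothetical in-memory index in which each record already carries an assumed `work_id`, and the sample ISBNs are those shown for The English patient in Figure 11. A lookup on any one ISBN returns every ISBN attached to the same work, without distinguishing expressions.

```python
from collections import defaultdict


def clean_isbn(isbn):
    """Strip hyphens and spaces and uppercase a trailing check character."""
    return isbn.replace("-", "").replace(" ", "").upper()


def build_isbn_index(records):
    """Build two lookups: ISBN -> work id, and work id -> sorted ISBNs."""
    isbn_to_work = {}
    work_to_isbns = defaultdict(set)
    for record in records:
        for isbn in record["isbns"]:
            cleaned = clean_isbn(isbn)
            isbn_to_work[cleaned] = record["work_id"]
            work_to_isbns[record["work_id"]].add(cleaned)
    return isbn_to_work, {work: sorted(isbns) for work, isbns in work_to_isbns.items()}


def related_isbns(isbn, isbn_to_work, work_to_isbns):
    """Return all ISBNs clustered with the given one, or [] if it is unknown."""
    work_id = isbn_to_work.get(clean_isbn(isbn))
    if work_id is None:
        return []
    return work_to_isbns[work_id]


# The work identifiers are invented labels; the ISBNs come from Figure 11.
records = [
    {"work_id": "ondaatje-english-patient",
     "isbns": ["0786211512", "0754010457", "075402024X"]},
    {"work_id": "ondaatje-english-patient", "isbns": ["0679745203"]},
    {"work_id": "ondaatje-english-patient", "isbns": ["0333675568"]},  # sound recording
    {"work_id": "minghella-english-patient", "isbns": ["078688245X"]},  # screenplay
]

isbn_to_work, work_to_isbns = build_isbn_index(records)
print(related_isbns("0-679-74520-3", isbn_to_work, work_to_isbns))
```

The text and the sound recording end up in the same cluster because the index works at the level of the work, which is the behaviour the paragraph above attributes to clustering by "intellectual work".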

OCLC has also applied some FRBR-ization to WorldCat, in its display of metadata for works with many manifestations. Again, with imperfect metadata, the clustering misses titles that should be in the set, but it demonstrates how the principle of understanding the relationships between the group 1 entities can improve the user experience. Thus, if I search "Robinson Crusoe", I retrieve results that are fairly well grouped:

Screen shot of a search results page on the OCLC WorldCat website.

Figure 7. Screenshot of the OCLC WorldCat search results page for the search term "Robinson Crusoe". The linked titles displayed and listed on the results page are as follows, from top to bottom:

Equivalent text for Figure 7

  1. Robinson Crusoe
    By Daniel Defoe; N C Wyeth
  2. Marooned : the strange but true adventures of Alexander Selkirk, the real Robinson Crusoe
    By Robert Kraske
  3. Robinson Crusoe
    by Deanna McFadden; Jamel Akib; Daniel Defoe
  4. Robinson Crusoe
    by Pat Rogers
  5. Robinson Crusoe on Mars
    By Aubrey Schenck; Edwin F Zabel; Ib Melchior; John C Higgins; Byron Haskin; Paul Mantee; Victor Lundin; Adam West; Daniel Defoe; Paramount Pictures Corporation; Schenck-Zabel Productions; Devonshire Pictures, Inc.; Criterion Collection (Firm)
  6. In Search of Robinson Crusoe
    By Timothy Severin

Under the first title, the user is given the option to "View all editions and formats". This will then lead to a hit list of over three thousand "editions". These are a mixture of different expressions and manifestations. There are different forms of expression: alpha-numeric notation, tactile notation and spoken word. There are different languages of expression: 62 languages. There are different manifestations of each expression, with different media and different carriers. The hit list itself is not clustered, but WorldCat offers facets in the left pane. The facets are based on the AACR2 classes of material, augmented by additional MARC coded information. Thus, one can pull out a subset of the 11 braille titles:

Screen shot of a search results page using the facet "braille", on the OCLC WorldCat website.

Figure 8. Screenshot from OCLC WorldCat illustrating search for "Robinson Crusoe"; results for "all editions and formats" further refined by using the facet "braille." The linked titles displayed and listed on the results page are as follows, from top to bottom:

Equivalent text for Figure 8

  1. Robinson Crusoe
    by Daniel Defoe, Braille book : Fiction English 1994, Washington, D.C. : National Braille Press Inc.
  2. Robinson Crusoe
    by Daniel Defoe, Braille book : Fiction English 1992, New York : Knopf
  3. Robinson Crusoe
    by Daniel Defoe, Braille book : Fiction English 1992, Newark, N.J. : New Jersey Commission for the Blind and Visually Impaired
  4. Robinson Crusoe
    by Daniel Defoe, Braille book : Fiction English 1967, Louisville, KY : American Printing House for the Blind
  5. Robinson Crusoe
    by Daniel Defoe, Braille book : Fiction English 1964, New York : Scholastic
  6. The life and strange surprising adventures of Robinson Crusoe,
    by Daniel Defoe; E Boyd Smith, Braille book : Fiction : Juvenile audience, English 1909, Boston, Houghton Mifflin
  7. Robinson Crusoe
    by Daniel Defoe, Braille book English, London : Royal National Institute for the Blind

OCLC has worked with the existing metadata. It is encouraging to see how the display of metadata can be improved by applying FRBR concepts even within the current AACR2 and MARC 21 environment. The displays are limited by the coding in the records. The coding for content and carrier is based on the AACR2 classes of material, so the categories or facets are uneven in how they differentiate and group resources. For example, braille and large print appear as equal subsets of "book", even though they are different expressions. The set under "book" is not necessarily content in alphanumeric notation, but can include tactile notation and spoken word expressions. One can limit by sound recording or by Internet resource, but not by both simultaneously, which makes it more difficult to zero in on a set of audiobooks in computer media. Also, insufficient data means that not all records cluster in the right place. Thus, there is one large retrieval set for the search "Robinson Crusoe", but further down the hit list is another set of 31 records that should have been part of the first set. However, even with these limitations, any attempt to highlight the relationships between the manifestations and to cluster the results using the FRBR conceptual model dramatically improves the user's search experience.

OCLC has also experimented with a subset of metadata for works of fiction in a prototype database called FictionFinder.85 FictionFinder uses FRBR concepts to shape the display of data and to cluster results for easier navigation. In the WorldCat cluster, WorldCat shows a manifestation-level record and points to the existence of other editions with the button labelled "View all editions and formats". FictionFinder uses a work-level display as the entry point into the cluster:

Screen shot of a web page illustrating work-level display on the OCLC FictionFinder website.

Figure 9. Screenshot from OCLC FictionFinder illustrating work-level display for Robinson Crusoe. The linked titles displayed and listed on the results page are as follows, from top to bottom:

Equivalent text for Figure 9

Robinson Crusoe.
Defoe, Daniel, 1661?-1731
2363 editions, in 62 languages, held by 23076 libraries
Summary: During one of his several adventurous voyages in the 1600s, an Englishman becomes the sole survivor of a shipwreck and lives for nearly thirty years on a deserted island.
Genres: Adventure fiction, Robinsonades, Romans a clef, Adventure stories, Sea stories, Historical Fiction
Characters: Crusoe, Robinson (Fictitious character)
Settings: England, Foreign countries, Pacific Ocean, Scotland, Atlantic Ocean, England London, Since 1950 South America
Subjects: Survival after airplane accidents, shipwrecks, etc, Castaways, Shipwrecks, Islands, Adventure and adventurers,
Chapbooks, English, Solitude, Naufragios Novela, Spanish language
Wrote As: Johnson, Charles

Audience: general, special
Title, Author:

  1. Robinson Crusoe, Daniel Defoe, an authoritative text, backgrounds and sources, criticism, edited by Michael Shinagel
  2. Robinson Crusoe / by Daniel Defoe; with illustrations by N.C. Wyeth
  3. The life and strange surprising adventures of Robinson Crusoe of York, mariner Daniel Defoe, edited with an introduction and notes by J. Donald Crowley
  4. Robinson Crusoe. With illustrations of the story by Thomas Stothard, together with a foreword by Arthur D. Howden Smith
  5. Robinson Crusoe / by Daniel Defoe; illustrated by Julek Heller
  6. The life and adventures of Robinson Crusoe. Illustrated by Roger Duvoisin; introduction by May Lamberton Becker
  7. The life and strange surprising adventures of Robinson Crusoe of York, mariner: edited with an introduction by J. Donald Crowley
  8. Robinson Crusoe / with illus. by N.C. Wyeth
  9. The life and strange surprising adventures of Robinson Crusoe, by Daniel Defoe, illustrated by Lynd Ward
  10. Robinson Crusoe: Daniel Defoe; edited by Michael Shinagel
  11. Robinson Crusoe by Daniel Defoe
  12. Robinson Crusoe / Daniel Defoe; illustrated by N.C. Wyeth
  13. Robinson Crusoe by Daniel Defoe

Again, FictionFinder allows one to limit the search by language or format, and so achieves a partial improvement in navigation and display. The clustering works well at the work level, but it remains difficult to show the user which expressions are available, and to demonstrate clearly which manifestations belong to the same expression.

Several integrated library system vendors have explored the applicability of FRBR. VTLS is a pioneer in this area. The VTLS library management system, Virtua, can return results in more rigorous clusters, using a database structure of separate, linked work-, expression- and manifestation-level records. The catalogue has a feature that allows the user to open up a "FRBR tree". This display groups together the manifestations of each expression of a work:

Expansion of the FRBR tree for the title Moriae encomium in the catalogue of L'Académie Louvain in Belgium 86:

Screen shot from the catalogue of L'Académie Louvain in Belgium

Figure 10. Screenshot from the catalogue of L'Académie Louvain in Belgium; expanded FRBR tree display for the search: Moriae encomium. Example suggested by VTLS Inc.

Equivalent text for Figure 10

Moriae encomium - Erasmus Roterodamus, Desiderius, 1469-1536
Books - Dutch -
De lof der zotheid / - - Wereldbibliotheek, 1973. - 182 p. ; 21 cm.
Moriae encomium, dat is De lof der zotheid / - - Manteau, 1971 - VII, 331 p. : ill. ; 19 cm.
De lof der zotheid / - - Wereldbibliotheek, 1969 - 184 p. : ill.
De lof der zotheid / - - De Nederlandsche boekhandel, 1947 - 176 p.
Books - English -
The praise of folly / - - 1913 - XXIII, 188 p. ; 19 cm.
Books - French -
Éloge de la folie / - - Castor astral, 1991 - XI, 204 p. ill.
Éloge de la folie / - - Tarbrag, 1958? - 208 p.
Éloge de la folie / - - Club français du livre, 1957 - 243, [1] p.: ill.
L'éloge de la folie / - - Garnier, 1953. - XII-189 p.
Éloge de la folie / - - Ed. de Cluny, 1947 - XXVIII, 169 p. : ill.
L'éloge de la folie / - - Terres latines, 1945 - 135 p. : ill.
Éloge de la folie / - - Office de publicité, 1943. - 84 p.
Éloge de la folie / - - Ed. du Rond-point, 1942 - 199 p.
L'éloge de la folie / - - Garnier, 1937. - XII, 327 p.
L'éloge de la folie / - - A l'enseigne du pot cassé, 1933 - 204 p. ; 18 cm.
L'éloge de la folie / - - A l'enseigne du pot cassé, 1930 - 226 p. : ill.
Jacques le fataliste et son maître / - - A l'enseigne du pot cassé, 1929 - 2 v.
Les affinités électives / - - A l'enseigne du pot cassé, 1929 - 2 v.
Mademoiselle de Scudéry et Salvator Rosa / - - A l'enseigne du pot cassé, 1929 - 216 p.
Aventures de Lazarille de Tormès / - - A l'enseigne du pot cassé, 1929 - 252 p.
Voyage sentimental / - - A l'enseigne du pot cassé, 1927 - 204 p.
L'ingénu / - - A l'enseigne du pot cassé, 1927 - 191 p.
Lysistrata / - - A l'enseigne du pot cassé, 1926 - 152 p.
L'éloge de la folie / - - A l'enseigne du pot cassé, 1926. - 210 p. : ill.
Éloge de la folie / - - Librairie de la bibliothèque nationale, 1884 - 148+8 p.
Éloge de la folie / - - Librairie des bibliophiles, 1876. - 239 p.
Éloge de la folie / - - Gosselin, 1843 - 305 p.
L'éloge de la folie / - - Diederichs, 1828 - 191 p.
L'éloge de la folie / - - Van Esse, 1827 - VIII, 190 p.
L'éloge de la folie / - - Roret, 1826 - 270 p.

Network Development and MARC Standards Office for the report Displays for Multiple Versions from MARC 21 and FRBR.

Based on the Functional Analysis of the MARC 21 Bibliographic and Holdings Formats, and extending this analysis, the Network Development and MARC Standards Office at the Library of Congress explored how a FRBR-ized display might affect multiple versions.87 The examples are mock-ups but they demonstrate a way to communicate information about the relationships between manifestations by using a hierarchical clustering to distinguish between works, and between expressions. Since the examples are mock-ups, they do not have to rely on existing data in bibliographic records. Instead, they can focus attention on the advantages of recording sufficient metadata to enable unambiguous and meaningful displays of bibliographic data. They point to the quality of clustering that RDA data aims to support.

Possible Hierarchical Display

Ondaatje, Michael, 1943-
  The English patient.
    Text - English
      The English patient / Michael Ondaatje.
        Imprint: Thorndike Press ; Chivers Press, 1997.
        Physical description: 455 p. (large print) : ill. ; 23 cm.
        ISBN: 0786211512 (U.S. hd. : alk. paper)
        ISBN: 0754010457 (U.K. hd.)
        ISBN: 075402024X (U.K. pbk.)
      The English patient / by Michael Ondaatje.
        Edition: 1st Vintage International ed.
        Imprint: Vintage Books, 1993.
        Physical description: 305 p. ; 21 cm.
        ISBN: 0679745203
    Sound recording - English
      The English patient / by Michael Ondaatje.
        Imprint: Macmillan Audio Books, p1997.
        Physical description: 2 sound cassettes (ca. 4 hrs.) : analog.
        ISBN: 0333675568
        Publisher's number: MAB 15 Macmillan Audio Books

Related Works

The English patient.
  Motion picture - English
    The English patient / Miramax Films presents a Saul Zaentz Production ; an Anthony Minghella Film.
      Imprint: Miramax Home Entertainment, [1998]
      Physical description: 1 videodisc (162 min.) : sd., col. ; 4 ¾ in.
      ISBN: 1558908307
      Publisher's number: 14175 Miramax
      Imprint: Miramax Home Entertainment, c1997.
      Physical description: 2 laserdiscs (162 min.) : sd., col. ; 12 in.
    The English patient / produced J&M Entertainment ; Miramax films ; directed by Anthony Minghella.
      Imprint: 1996.
      Physical description: 18 reels of 18 on 9 : sd., col. ; 35 mm. ref print.
      (two combined)]
Minghella, Anthony.
  The English patient.
    Text - English
      The English patient / Anthony Minghella ; based on the novel by Michael Ondaatje ; introduction by Michael Ondaatje.
        Edition: 1st ed.
        Imprint: Hyperion Miramax Books, c1996.
        Physical description: xviii, 189 p. : ill. ; 21 cm.
        ISBN: 078688245X


Figure 11. Illustration of a possible hierarchical display for a work, its expressions and manifestations, and related works, and the expressions and manifestations of related works. Illustration prepared by the Network Development and MARC Standards Office, Library of Congress.
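A hierarchical display of this kind presupposes data in which work, expression and manifestation are separate, linked records. The following is a minimal sketch under that assumption, in Python; the class names, the fields and the `render` function are hypothetical and do not represent the Library of Congress mock-up or any vendor's data model. The sample values are taken from Figure 11.

```python
from dataclasses import dataclass, field


@dataclass
class Manifestation:
    title: str
    imprint: str


@dataclass
class Expression:
    form: str
    language: str
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    creator: str
    title: str
    expressions: list = field(default_factory=list)


def render(work, indent="  "):
    """Return display lines, one indentation level per level of the hierarchy."""
    lines = [work.creator, indent + work.title]
    for expression in work.expressions:
        lines.append(indent * 2 + f"{expression.form} - {expression.language}")
        for manifestation in expression.manifestations:
            lines.append(indent * 3 + manifestation.title)
            lines.append(indent * 4 + f"Imprint: {manifestation.imprint}")
    return lines


novel = Work(
    creator="Ondaatje, Michael",
    title="The English patient.",
    expressions=[
        Expression("Text", "English", [
            Manifestation("The English patient / Michael Ondaatje.",
                          "Thorndike Press ; Chivers Press, 1997."),
            Manifestation("The English patient / by Michael Ondaatje.",
                          "Vintage Books, 1993."),
        ]),
        Expression("Sound recording", "English", [
            Manifestation("The English patient / by Michael Ondaatje.",
                          "Macmillan Audio Books, p1997."),
        ]),
    ],
)

for line in render(novel):
    print(line)
```

Because each manifestation is attached to exactly one expression, the display never has to infer the grouping from free-text fields; the structure of the data is the structure of the display.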

There are many other experiments with FRBR-ization.88 The ones described here were a few chosen to illustrate the advantages of applying FRBR concepts, even in a pre-RDA environment.

The aim of a FRBR-ized display of bibliographic information is to present the user with a meaningful display of results, where the user can quickly and easily decipher the relationships between the resources.

Ideally, a display clustering a large number of items would present clusters that clarify the nature of items retrieved and would be composed of manageable numbers of items.89

The user may approach the task of searching from many different angles. They may approach the catalogue knowing that they want the content of a work, in an expression which they can understand. Or they may approach the catalogue looking for a genre and a particular carrier type. When attributes are encoded in separate elements, each attribute can be used as part of the search, and the search is especially precise when the data recorded in an element must conform to a controlled vocabulary.
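The benefit of separate elements with controlled values can be illustrated with a short Python sketch. The element names (`content_type`, `carrier_type`, `language`) and the small vocabularies below are illustrative assumptions, not a quotation of any RDA vocabulary, and the records are invented. Because each attribute sits in its own element, criteria can be combined freely, for example spoken word content on an online carrier.

```python
from collections import Counter

# Illustrative controlled vocabularies (assumed values for this sketch).
VOCABULARIES = {
    "content_type": {"text", "tactile text", "spoken word"},
    "carrier_type": {"volume", "audio disc", "online resource"},
}


def validate(record):
    """Raise ValueError if a controlled element holds a term outside its vocabulary."""
    for element, allowed in VOCABULARIES.items():
        if record[element] not in allowed:
            raise ValueError(f"{element}: {record[element]!r} is not a controlled term")
    return record


def search(records, **criteria):
    """Return the records whose elements match every criterion exactly."""
    return [r for r in records
            if all(r.get(element) == value for element, value in criteria.items())]


def facet_counts(records, element):
    """Count how many records carry each value of one element."""
    return Counter(r[element] for r in records)


catalogue = [validate(r) for r in [
    {"title": "Robinson Crusoe", "language": "eng",
     "content_type": "text", "carrier_type": "volume"},
    {"title": "Robinson Crusoe", "language": "eng",
     "content_type": "tactile text", "carrier_type": "volume"},
    {"title": "Robinson Crusoe", "language": "eng",
     "content_type": "spoken word", "carrier_type": "audio disc"},
    {"title": "Robinson Crusoe", "language": "eng",
     "content_type": "spoken word", "carrier_type": "online resource"},
    {"title": "Robinson Crusoé", "language": "fre",
     "content_type": "text", "carrier_type": "volume"},
]]

online_audiobooks = search(catalogue, content_type="spoken word",
                           carrier_type="online resource")
print(len(online_audiobooks), "online audiobook(s)")
print(facet_counts(catalogue, "content_type"))
```

Exact matching is only dependable here because the values are validated against a closed list when the record is created; free-text notes would not support the same kind of filter.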

Current FRBR-izations demonstrate definite improvements in resource discovery and data display. These FRBR-izations are partially successful, but cannot achieve full success because the data on which they rely is imperfect.

Current bibliographic records... are neither complete nor consistent. In addition, much important information is recorded as unstructured text, mostly as notes, and is either not appropriate or very difficult for computer processing.90

The instructions in RDA ensure that well-formed metadata is recorded. This metadata supports meaningful displays, meaningful clustering of results and effective navigation through large sets of results. Considering the successful improvements using imperfect data, it is promising to think about the next level of improvements when FRBR-izations use data that is intended to support FRBR-ized search and display, such as data recorded according to RDA.

