DRAFT
Appendix E
Creating a Search from Scan Results
Version 4 September 8th 1999
The Requirement
Scan returns results that consist of terms with complementary data,
representing rows from an ordered list. The results can be presented to an end user,
enabling him or her to browse forward and optionally backwards, then select a line for
further information or processing. When a line is selected, the system may format a search
request that may return one or more records associated with the term. Typical examples are
scans (browses) on AUTHOR, SUBJECT and TITLE.
To construct a follow-on search, the origin takes the Use attributes
that were used for the scan and takes data from the term field of the scan
response as the search term or terms.
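The construction described above can be sketched as follows. This is a minimal illustration, not part of the proposal: the query is written in Prefix Query Format (PQF) as used by some Z39.50 toolkits, the function names are hypothetical, and the quoting rule (backslash escapes inside double quotes) is an assumption that should be checked against the toolkit in use. Bib-1 Use attribute 1003 (Author) serves as the example.

```python
from typing import Dict


def pqf_quote(term: str) -> str:
    """Wrap a term in double quotes, escaping backslashes and quotes (assumed PQF rule)."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def follow_on_search(scan_attributes: Dict[int, int], scan_term: str) -> str:
    """Reuse the attributes of the scan request with the term from the scan response."""
    attrs = " ".join(f"@attr {t}={v}" for t, v in sorted(scan_attributes.items()))
    return f"{attrs} {pqf_quote(scan_term)}"


# The scan was issued with Bib-1 Use attribute 1003 (Author);
# the user then selected the line whose term is DUPUIS FRANCOISE.
query = follow_on_search({1: 1003}, "DUPUIS FRANCOISE")
print(query)  # @attr 1=1003 "DUPUIS FRANCOISE"
```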
Database Models
The technique described above may be employed to construct a search
for various database models.
There are the following possible models:
- The scan index is derived from a database that is separate from the
bibliographic database but it contains pointers to the linked bibliographic records. The
same data occurs in both databases. Example: an Integrated Library Management System with
an authority database linking to a bibliographic database of full MARC records that
contains authorized data, including authors and subjects.
- The scan index is derived from a database, e.g. an authority database,
that is inter-linked with the bibliographic database: records in the bibliographic
database contain links to the associated database and vice versa, with no repetition of
data. Example: an Integrated Library Management System with an authority database linking
to a bibliographic database. To construct a full record, it is necessary to integrate data
from both databases.
- The scan index is derived from the bibliographic database. Example: a
title index.
- The scan index is derived from a database that is totally separate
from the bibliographic database. Example: an authority database at LC and a bibliographic
database somewhere else.
The technique of running a follow-on search with the same Use
attributes as the scan request and the term from the scan response may not be
the most efficient for the first two cases above, where there are database links that may
be employed. The problem with using the term for the follow-on search is that the
resulting search may not be precise enough. There are a number of reasons for this.
Firstly, the term may have been truncated and may lack significant words that are important
for precision. Secondly, the target may not support position attributes such as first
in field, or the structure attribute phrase, so the search
is constructed imprecisely and can retrieve unexpected records even when
a single, seemingly unique line has been selected from a scan. A third case
is where the term is not unique. This can happen where the term must be
repeated because of differences in the display term.
Example:
TERM
DUPUIS FRANCOISE
DISPLAY TERM Dupuis, Françoise
TERM
DUPUIS FRANCOISE
DISPLAY TERM Dupuis, Fran¸çoise
TERM
DUPUIS FRANCOISE
DISPLAY TERM Dupuis, Francoise
What is required is a means of using database links, where they
exist, to improve the precision of the follow-on search.
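The non-unique term case in the example above can be detected on the origin side by grouping scan lines on the term. The sketch below is illustrative only: the representation of a scan entry as a (term, display term) pair and the function name are assumptions, and the DUPONT JEAN entry is a hypothetical line added for contrast.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def ambiguous_terms(entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return every term that occurs on more than one scan line, with its display terms."""
    by_term: Dict[str, List[str]] = defaultdict(list)
    for term, display_term in entries:
        by_term[term].append(display_term)
    return {term: shown for term, shown in by_term.items() if len(shown) > 1}


entries = [
    ("DUPUIS FRANCOISE", "Dupuis, Françoise"),
    ("DUPUIS FRANCOISE", "Dupuis, Francoise"),
    ("DUPONT JEAN", "Dupont, Jean"),  # hypothetical entry, not from this appendix
]

for term, shown in ambiguous_terms(entries).items():
    print(term, "appears on", len(shown), "scan lines")
```

A search built from the term alone cannot tell such lines apart, which is the motivation for carrying a record-level link in the scan response.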
The Proposal
The proposal is to include this retrieval information in otherTermInfo
as an external carried in externallyDefinedInfo. The external would be
called DirectTermAccess and would comprise the following:
| Data element | Comment |
| z3950url (session url) | Optional. If omitted, the server address, server port and database are assumed to be the same as for the scan. If included, the database name is mandatory. |
| AttributesPlusTerm | Optional if the database name is present; if missing, the returned term can be used safely in the other database. Mandatory when the database name is not present. |
| OccurrenceCount | Optional. Indicates the occurrence of the term in the database to be searched. For example, it gives the bibliographic occurrence count of an authority term. |
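The optionality rules in the table can be expressed as a small data structure with a validation step. This is a sketch under stated assumptions, not a definition of the external: the class and field names are hypothetical, and z3950url is modelled as three separate fields rather than in its actual URL syntax.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SessionUrl:
    """Hypothetical decomposition of z3950url; not the on-the-wire syntax."""
    database: Optional[str] = None
    host: Optional[str] = None   # blank means "same server as the scan"
    port: Optional[int] = None   # blank means "same port as the scan"


@dataclass
class AttributesPlusTerm:
    attribute_set: str           # OID of the attribute set
    attributes: Dict[int, int]   # attribute type -> attribute value
    term: str


@dataclass
class DirectTermAccess:
    z3950url: Optional[SessionUrl] = None
    attributes_plus_term: Optional[AttributesPlusTerm] = None
    occurrence_count: Optional[int] = None

    def validate(self) -> None:
        """Raise ValueError if the optionality rules in the table are broken."""
        if self.z3950url is not None and not self.z3950url.database:
            raise ValueError("z3950url is present, so a database name is mandatory")
        if self.z3950url is None and self.attributes_plus_term is None:
            raise ValueError("AttributesPlusTerm is mandatory when no database name is given")
```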
Example:
An authority scan (e.g. author or subject) performed on an index of
an authority database (auth.file) produces a scan entry with a term occurrence of 1. There
are actually 3 bibliographic records associated with this authority record. In otherTermInfo
of the scan response, there is one entry containing the identifier of the authority record
(4544):
z3950url server address and port blank, database name = bib.file
AttributesPlusTerm
attributeSet 1.2.840.10003.3.1 (Bib1)
attributeType 1 (Use attribute)
attributeValue 12 (Local number)
term 4544 (local number of the term)
occurrenceCount 3
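The way an origin might turn the example entry above into a follow-on search can be sketched as follows. The dictionary layout, the function names, the host z3950.example.org, and the attributes and term of the original scan are all assumptions made for illustration; port 210 is the port conventionally used for Z39.50. The query is written in Prefix Query Format, and the fallback to the scan's own attributes when AttributesPlusTerm is absent is one reading of the table, not something the proposal states explicitly.

```python
def pqf(attributes, term):
    """Build a PQF query from attribute type/value pairs and a quoted term."""
    attrs = " ".join(f"@attr {t}={v}" for t, v in sorted(attributes.items()))
    quoted = '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return f"{attrs} {quoted}"


def resolve_follow_on(scan_target, scan_attributes, scan_term, direct_term_access):
    """Return (target, query) for the follow-on search of one scan entry."""
    # Blank parts of z3950url default to the values used for the scan.
    url = direct_term_access.get("z3950url") or {}
    target = {key: url.get(key) or scan_target[key] for key in ("host", "port", "database")}

    # Prefer the attributes and term supplied by the target; otherwise
    # fall back to the scan's own attributes and the returned term.
    apt = direct_term_access.get("attributesPlusTerm")
    if apt is None:
        query = pqf(scan_attributes, scan_term)
    else:
        query = pqf(apt["attributes"], apt["term"])
    return target, query


scan_target = {"host": "z3950.example.org", "port": 210, "database": "auth.file"}
entry = {
    "z3950url": {"host": None, "port": None, "database": "bib.file"},
    "attributesPlusTerm": {"attributes": {1: 12}, "term": "4544"},
    "occurrenceCount": 3,
}

target, query = resolve_follow_on(scan_target, {1: 1003}, "DUPUIS FRANCOISE", entry)
print(target["database"], query)  # bib.file @attr 1=12 "4544"
```

Because the origin simply echoes back the attributes it was given, it does not need to understand the attribute values the target chose.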
The target may define its own attribute values for internal numbers,
particularly if it needs to distinguish between local bibliographic and authority numbers.
As the target is able to supply these in the scan response, the origin does not need to
know them in advance. Therefore internally defined attributes do not pose a problem for
interoperability.
The authority file may be located in a separate database from the
bibliographic file or it may be in the same database. For the purposes of retrieval, the
authority file should be regarded as a separate database even where it is not. The origin
needs to know the names of both databases.
Where an authority file is linked to a bibliographic file as per
database models 2 and 3, it is possible to:
- Scan the authority file, then search the authority file, e.g. to
retrieve a MARC authority record
- Scan the authority file, then search the bibliographic file, e.g. to
retrieve the MARC bibliographic record or records associated with the authority record
| Version | Date | Author | Description |
| 1 | 9.04.99 | Janifer Gatenby | |
| 2 | 25.07.99 | Janifer Gatenby | Change other term info / url to alternative term / attributes plus term |
| 3 | 6.08.99 | Janifer Gatenby | Change from alternative term to other term info with Direct Term Access as an external |
| 4 | 8.09.99 | Janifer Gatenby | Replace server name, server address, port number and database with z3950url. |