IPIG Meeting 4-5 June 1998. Agenda Item 11.1
Ruth Moulton posted a message to the IPIG Mailing List on 21 May 1998, summarizing her findings on the use of DNS for an ILL Policy Directory.

She reported that DNS is well suited to a Policy Directory, and has the advantage that both the software and the infrastructure of an existing database are already in place. The data that can be held is totally flexible; the only restriction is that the name space is defined as alphanumeric labels joined by dots (the domain names we know and love).

Ruth commented that she was hopeful that if we went down this road, we could define our record structures with the IETF. Otherwise she believed we could set up records on DNS servers over which we have administrative control.


Domain Name Server as a possible technology for a Policy Database

Ruth Moulton
21 May 1998

1.0 Introduction

Following discussion at the February 1998 IPIG meeting I have taken a look at the feasibility of using the Domain Name Server technology as a database for the IPIG Policy Directory.

Section 2.0 is a summary of what the DNS is and does. The full scoop is mainly in RFCs 1034 (concepts) and 1035 (definitions), with supplementary RFCs giving more detailed information for particular uses, administration etc. RFC 2181 gives clarifications on RFC 1035.

2.0 Facilities provided by DNS.

The DNS provides a distributed data base for looking up data held about entities in a tree structured name space.

2.1 Name Space

The name space is tree structured, and the labels at each node and the leaves are joined together with dots to provide domain names. The domain name forms a path to the node or leaf where data is held.

The root, the top level domain, is called '.'.

All other labels may be up to 63 characters, from the set a-z, 0-9 and '-', and are case insensitive (and follow rules for Arpanet host names).

The total length of a domain name must be at most 255 characters, including separators.
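The naming rules above can be checked mechanically. The following is a minimal sketch, not taken from any DNS implementation: the function names are invented for illustration, and the rejection of a leading or trailing '-' is an assumption based on the usual host name rules.

```python
import re

MAX_LABEL_LEN = 63    # per-label limit stated above
MAX_NAME_LEN = 255    # whole-name limit, separators included

# Letters, digits and '-' only. As an assumption of this sketch, a label may
# not begin or end with '-', following the usual host name rules.
_LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")


def split_labels(name):
    """Return the lower-cased labels of a domain name, leaf first."""
    name = name.lower()
    if name.endswith("."):
        name = name[:-1]          # drop the trailing dot that denotes the root
    return name.split(".") if name else []


def is_valid_domain_name(name):
    """Check a name against the label and total-length limits described above."""
    if len(name) > MAX_NAME_LEN:
        return False
    labels = split_labels(name)
    if not labels:
        return name == "."        # the root on its own
    return all(
        len(label) <= MAX_LABEL_LEN and _LABEL_RE.match(label) is not None
        for label in labels
    )
```

Because names are case insensitive, the sketch lower-cases everything before comparing, so BL.UK. and bl.uk. are treated as the same name.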

The total name space is divided into administrative zones. A zone is administered by an organisation that has total control of the name space (domain names) within the zone and the data held for the name space.

Data is looked up by making a query on a particular domain name.

e.g.

 
                          .
                          |
       ---------------------------------------------
       |        |                |        |        |
      EDU       UK               IT       CA      COM
                |
            ---------
            |       |
            AC      CO
                    |
              -------------
              |     |     |
            DEMON  BBC
              |
          ---------
          |   |   |
       MUSWELL

To look up the IP address of my machine (muswell.demon.co.uk), a query would be made for the address information held at muswell.demon.co.uk. One could also put out a query for other information held at this node, e.g. Policy Information.

Aliases may also be used: an alias domain name holds the Canonical Name of the node from which the information must be extracted.
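To illustrate how a lookup follows an alias to the canonical name, here is a toy in-memory model. It is only a sketch: the "POLICY" record type, the alias ill.example.org and all the data values are hypothetical, and a real resolver does considerably more.

```python
# Toy name space: domain name -> {record type: value}.
# All names and values here are illustrative, not real DNS data.
NAMESPACE = {
    "muswell.demon.co.uk": {"A": "192.0.2.10", "POLICY": "lends-books=yes"},
    "ill.example.org": {"CNAME": "muswell.demon.co.uk"},
}


def lookup(name, rtype, max_hops=8):
    """Return the value of `rtype` held at `name`, following aliases."""
    name = name.lower().rstrip(".")
    for _ in range(max_hops):          # bound the chain so alias loops terminate
        node = NAMESPACE.get(name)
        if node is None:
            return None                # no such node
        if rtype in node:
            return node[rtype]
        if "CNAME" not in node:
            return None                # node exists but holds no such data
        name = node["CNAME"]           # restart the query at the canonical name
    return None
```

A query for policy information at the alias is answered from the canonical node, so the data only has to be maintained in one place.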

2.1.1 IPIG implications

We would need to develop a domain name space for the libraries having policy information stored in the database.

We could either use the existing one (e.g. the British Library already has the domain name bl.uk.), or we could invent a new name space, applying for a second level domain (one directly beneath the root) from the NIC, who administer the root. This would seem a duplication of names, and I have no idea how difficult it might be to obtain a second level domain (very, I imagine).

Since most libraries will be on the internet already, using their current domain names makes more sense. Systems not already in the current name space could be registered within existing zones.

As a hypothetical example, if RLG wanted to register an OCLC service, it could add oclc to its existing domain space: oclc.rlg....

2.2 Data

Data at each node is held in Resource Records (RRs).

An RR contains the following:
owner - domain name where the authoritative RR is found
type - 16 bit numeric value
class - 16 bit numeric value
TTL - time-to-live, for cached versions of the RR
Rdata - the actual data

There may be any number of RRs held at each node.

The format and semantics of Rdata are determined by the type and class.
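The RR fields listed above map naturally onto a small record type. The sketch below is an illustration only: the class and field names are my own, while the numeric codes shown (IN = 1, CH = 3, A = 1, MX = 15) are those given in RFC 1035.

```python
from collections import defaultdict
from dataclasses import dataclass

# Codes from RFC 1035: class IN = 1, CH = 3; type A = 1, MX = 15.
CLASS_IN, CLASS_CH = 1, 3
TYPE_A, TYPE_MX = 1, 15


@dataclass(frozen=True)
class ResourceRecord:
    owner: str      # domain name at which the RR is held
    rtype: int      # 16 bit type code
    rclass: int     # 16 bit class code
    ttl: int        # seconds a cached copy may be kept
    rdata: bytes    # opaque octets; meaning depends on (rclass, rtype)

    def __post_init__(self):
        for field_name in ("rtype", "rclass"):
            value = getattr(self, field_name)
            if not 0 <= value <= 0xFFFF:
                raise ValueError(f"{field_name} must fit in 16 bits")


class Node:
    """Holds any number of RRs, grouped by (class, type)."""

    def __init__(self):
        self._rrs = defaultdict(list)

    def add(self, rr):
        self._rrs[(rr.rclass, rr.rtype)].append(rr)

    def get(self, rclass, rtype):
        return list(self._rrs.get((rclass, rtype), []))
```

Keeping Rdata as uninterpreted bytes mirrors the point that only the type and class give it meaning.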

Class is the higher level entity; it is intended to identify a protocol family, or instance of a protocol. E.g. IN is internet, CH is the Chaos System.

Type distinguishes the kinds of RR within a class, e.g. IN has MX (mail exchange records), A (IP address records) etc.

Rdata is 'a variable length string of octets...'; the format varies according to the Type and Class.

In other words the Rdata may be in any format! - RFCs 2163 and 1464 are examples of definitions of Rdata.
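As one possible shape for policy Rdata, the sketch below packs "attribute=value" pairs in the style of RFC 1464, each stored as a length octet followed by that many octets (the character-string form of RFC 1035). The attribute names are hypothetical, and the sketch does not implement the escaping rules of RFC 1464.

```python
def encode_attributes(attrs):
    """Pack {name: value} into Rdata as length-prefixed "name=value" strings."""
    out = bytearray()
    for name, value in attrs.items():
        if "=" in name:
            raise ValueError("this sketch does not escape '=' in attribute names")
        item = f"{name}={value}".encode("ascii")
        if len(item) > 255:
            raise ValueError("a character-string holds at most 255 octets")
        out.append(len(item))      # one length octet ...
        out += item                # ... followed by that many octets
    return bytes(out)


def decode_attributes(rdata):
    """Inverse of encode_attributes."""
    attrs, pos = {}, 0
    while pos < len(rdata):
        length = rdata[pos]
        item = rdata[pos + 1 : pos + 1 + length]
        if len(item) != length:
            raise ValueError("truncated character-string")
        name, _, value = item.decode("ascii").partition("=")
        attrs[name] = value
        pos += 1 + length
    return attrs
```

For example, encode_attributes({"lends-books": "yes", "max-loan-days": "28"}) yields octets that decode_attributes turns back into the same dictionary.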

2.2.1 IPIG considerations

We should probably define a new Class of RRs for our use. IETF maintain Class numbers, and Type numbers for the Internet class (RFC 1700 Assigned Numbers).

I think we have a good case for a new Class (class is defined as being for a protocol or protocol set); otherwise we'd have to apply for some types under the Internet class. Is there a higher authority that could make the application, e.g. TC46?

Then within our class we may define a set of types of RRs, together with their syntax and semantics.

RFC 1123 (Requirements for Internet Hosts) specifies that DNS implementations must be robust to new RR types/classes.

As well as various pieces of policy information, there could be RRs holding such things as system IDs.

2.3 Queries

The format of query messages and responses is defined in the RFCs.

Basically one can query for RRs of a particular class and type at a domain name (a node in the name space). The response returns the RRs (if any) together with information about them, e.g. whether the data is a cached version or an authoritative version.

Since the whole RR is returned, the server does not need any specific knowledge of the RR syntax or semantics.

If the data is not held on the particular host that has received the query, the query may be further pursued in either a recursive (the query is automatically passed to another server) or iterative (the answer suggests the next server to try) fashion. A particular host is configured to act in one mode or another.
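The difference between the two modes can be shown with a toy simulation. This is a sketch under invented assumptions: the server names, the referral table and the data are all made up, and the referral chain is assumed to contain no loops.

```python
# Toy servers: each either holds the answer for a name or refers the query
# to another server. Server names and data are invented for illustration.
SERVERS = {
    "root": {"answers": {}, "refer": {"uk": "uk-server"}},
    "uk-server": {"answers": {}, "refer": {"bl.uk": "bl-server"}},
    "bl-server": {"answers": {"bl.uk": "policy data"}, "refer": {}},
}


def ask(server, name):
    """One query to one server: ("answer", data), ("referral", next) or ("fail", None)."""
    entry = SERVERS[server]
    if name in entry["answers"]:
        return ("answer", entry["answers"][name])
    for suffix, next_server in entry["refer"].items():
        if name == suffix or name.endswith("." + suffix):
            return ("referral", next_server)
    return ("fail", None)


def resolve_iteratively(name, start="root"):
    """Client-driven: the client follows each suggested server itself."""
    server, path = start, []
    while True:
        path.append(server)
        kind, value = ask(server, name)
        if kind != "referral":
            return value, path
        server = value


def resolve_recursively(name, server="root"):
    """Server-driven: each server passes the query on for the client."""
    kind, value = ask(server, name)
    if kind == "referral":
        return resolve_recursively(name, value)
    return value
```

In the iterative case the client sees every server on the path; in the recursive case it sees only the final answer.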

A query message may specify a number (a 16 bit count) of name, type, class triples, i.e. it may carry many queries at one time, either for more than one domain name or for many types of records at a particular domain. In practice, though, many implementations accept only one such triple per message.

The answer message may contain a similar number of RRs in response.
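The query format can be sketched in a few lines. The layout below follows RFC 1035 (a 12 octet header of six 16 bit fields, then one question entry per triple); the function names are my own, and only the recursion-desired flag is set.

```python
import struct


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero octet."""
    stripped = name.rstrip(".")
    out = bytearray()
    if stripped:
        for label in stripped.split("."):
            raw = label.encode("ascii")
            if not 0 < len(raw) <= 63:
                raise ValueError(f"bad label: {label!r}")
            out.append(len(raw))
            out += raw
    out.append(0)                  # zero-length label marks the root
    return bytes(out)


def build_query(query_id, questions, recursion_desired=True):
    """Build a query carrying the given (name, qtype, qclass) triples."""
    flags = 0x0100 if recursion_desired else 0   # RD bit; all other flags zero
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    message = struct.pack("!6H", query_id, flags, len(questions), 0, 0, 0)
    for name, qtype, qclass in questions:
        message += encode_name(name) + struct.pack("!2H", qtype, qclass)
    return message
```

For instance, build_query(0x1234, [("bl.uk", 1, 1)]) produces a query for the type 1 (A), class 1 (IN) records at bl.uk. A new policy class or type would simply use different numbers in the same two 16 bit fields.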

2.4 Other Information

The information in the data base is held by "name servers", which also answer queries.

Queries are sent and answers received by "resolvers".

Both UDP and TCP protocols are used in distributing data and servicing queries.

Data is defined in Master Files. These may be entered directly into a server, or prepared as text files and sent (by ftp, mail etc.) to the administrator of a server for entering on the system. Master files are prepared by the administrator of a Zone.
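A master file is a text file with one RR per line. The sketch below parses a deliberately simplified form, assuming every line spells out owner, TTL, class, type and Rdata in that order; the real format defined in RFC 1035 also allows omitted fields, directives, quoted strings and multi-line entries, none of which are handled here.

```python
def parse_master_line(line):
    """Parse one 'owner ttl class type rdata' line; None for blanks and comments."""
    text = line.split(";", 1)[0].strip()    # ';' starts a comment
    if not text:
        return None
    fields = text.split()
    if len(fields) < 5:
        raise ValueError(f"expected owner, ttl, class, type and rdata: {line!r}")
    owner, ttl, rclass, rtype = fields[:4]
    return {
        "owner": owner.lower(),
        "ttl": int(ttl),
        "class": rclass.upper(),
        "type": rtype.upper(),
        "rdata": " ".join(fields[4:]),
    }
```

A line such as "bl.uk.  3600  IN  A  192.0.2.1" (an illustrative address, not the real one) becomes a dictionary with those five fields.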

A primary, authoritative version of the master file is maintained on one server. At least one secondary version must exist on another server, but there may be as many secondaries as desired. This builds a certain amount of redundancy, and hence reliability, into the system.

A server may cache information that passes through it (in satisfying queries); the TTL value in the RR indicates how long this information may be kept.
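The effect of the TTL on a cache can be sketched as follows. This is an illustration rather than how any particular server does it; the class name and the injectable clock are assumptions made so the behaviour is easy to test.

```python
import time


class TTLCache:
    """Keeps values until their time-to-live runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock            # injectable so the example can be tested
        self._entries = {}             # key -> (expiry time, value)

    def put(self, key, value, ttl):
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]     # expired: the caller must query again
            return None
        return value
```

A short TTL keeps policy changes visible quickly at the cost of more queries; a long TTL does the reverse.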

Servers automatically distribute secondary versions of the Master information when the primary version changes.

Clients can identify trusted name servers to use before accepting referrals.

Inverse queries (i.e. give me the node that contains this RR) are optionally supported, as is negative caching (remembering that an RR could not be found).

Servers may hold copies of a master file or be caching only servers.

3.0 Software

I have not had time to do a survey of available software, so only know what is available on some UNIX systems. I'm sure that interfaces are equally available for MS and Mac operating systems.

Software falls into 3 categories:

APIs - for making queries from applications
Applications - for querying, administering and maintaining data bases at user level
Servers and Resolvers - the applications that implement the DNS protocol

On UNIX, the BIND (Berkeley Internet Name Domain server) is distributed with Solaris, SunOS, IRIX, HP-UX, DEC OSF/1 and the BSD variants (including FreeBSD), and probably others.

The book UNIX System Administration Handbook, second edition, published by Prentice Hall, has lots of information about using BIND to administer a Zone, and about DNS systems in general.

I would still like to investigate what's available within BIND for developing new RR types and supporting applications to use them, and what support the existing applications (such as DIG - a general DNS query application) give, i.e. try out some stuff on my local system. I'll post any more findings to the list ...


Ruth Moulton, Consultant
65 Tetherdown,
London N.10 1NH, UK
ruth@muswell.demon.co.uk
Tel:+44 181 883 5823

