0XX_fields


 


 

 

Number and Code Fields (0xx)


 

The fields in this range are either simple (a single string or unit) or complex (multiple units in the data element). Each of these units may have qualifying information -- meta information about the value itself. Most frequently this qualifying information provides the agency, controlled list, or rules that define the provenance of the value. There are a few binary values. These latter cannot be further qualified and are generally found in indicators.

 

There are many fields in the MARC 0XX area that make up a logical gathering of elements. One example is the field for various language codes (041). It is easy to assume that the data elements in this field need to be kept together through a data structure of some kind. While it may be the case that applications will display these together as a unit, they will be treated as independent simple data elements if there is no inherent dependency between them. In the case of the 041, each subfield is meaningful on its own without the others. A counterexample is that of classification codes that have two parts: the class code and the item number. While the class code has meaning on its own, the item number is only meaningful in the context of the "call number" which comprises both the class code and the item number.

 

This is primarily to point out that the analysis here follows the logic of the data elements and their relation to each other and to the focus bibliographic item, and not the structure of the MARC record. It is this generalization of the underlying data elements that should make it possible to create different record structures from the same data that is in MARC.

 

Note on Control subfields ($w, $3, $5, $6, $8)

 

I don't include the control subfields in this analysis. To the extent that they inform the list of MARC data elements they will need to be included at some point. In many cases, the control subfields have to do with the structure of the record, not the semantics of the data. The $3 links together statements about the same subunit in a record for a complex item. This may be handled differently in a different record structure. The $5 qualifies a field based on the institution to which it applies, often for item-level notes. These obviously could be placed in an item-level structure rather than the body of the bibliographic record for the manifestation. The $6 links together representations of the same field in different scripts or a transliteration to the vernacular. The $8 seems to be rarely used, and without some examples I find it hard to add to the analysis. I will keep looking for those. (The Moen statistical study lists some $8's in LC records, but only in the 0XX fields.)

 

Simple elements

These are elements that stand alone and can be used in any context. Examples are:

 

Note that there may be more than one simple element in a MARC field. Thus,

 

020 $a ISBN $c Terms of availability $z Canceled/invalid ISBN

 

is actually three separate single elements:

 

ISBN

Terms of availability

Canceled/invalid ISBN

 

Each of these is a statement about the focus of the bibliographic record, and they are not dependent on each other. (I think; if I'm wrong, let me know.)
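
A minimal sketch in Python of what "three separate single elements" could look like in practice. The function name, the element-name table, and the sample values for $c and $z are made up for illustration; only the subfield meanings come from the 020 definition above.

```python
# Hypothetical table: 020 subfield code -> name of an independent element.
ELEMENT_NAMES = {
    "a": "ISBN",
    "c": "Terms of availability",
    "z": "Canceled/invalid ISBN",
}

def simple_elements(subfields):
    """Turn (code, value) pairs from one 020 field into stand-alone statements.

    Each statement is about the focus of the record; none of them needs
    the others in order to be meaningful.
    """
    return [(ELEMENT_NAMES[code], value)
            for code, value in subfields
            if code in ELEMENT_NAMES]

# Sample field; the $c and $z values are invented placeholders.
field_020 = [("a", "0914378260"), ("c", "$12.95"), ("z", "0914378279")]
statements = simple_elements(field_020)
```

Each resulting pair can be stored or exchanged on its own, which is the sense in which these elements are "simple."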

 

 

Complex Elements

 

Many of the elements are made up of two or more dependent parts. The call numbers are a relatively simple example of this, illustrated with the LC Call Number:

 

050 $a Classification number $b Item number

 

The Classification number can be used without the Item number, but the Item number is not meaningful without the Classification number.

 

Note that the Classification number is repeatable, while the Item number is not. When there is more than one Classification number, any one of them can be combined with the single Item number to create a full call number.
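
The dependency can be sketched as follows. This is only an illustration of the rule just described, not a statement of how any system builds call numbers; the function name and the sample numbers are invented.

```python
def call_numbers(class_numbers, item_number=None):
    """Combine each repeatable classification number with the single item number.

    With no item number, the classification numbers stand on their own.
    An item number with no classification number yields nothing, because
    it is not meaningful by itself.
    """
    if item_number is None:
        return list(class_numbers)
    return [f"{cls} {item_number}" for cls in class_numbers]

# Invented sample values for an 050 with two $a and one $b.
full = call_numbers(["QA76", "Z699"], ".S65")
```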

 

Other call numbers that follow this general pattern are: UDC (MARC 080), DDC (MARC 082).

 

Another example of a complex field is the Fingerprint Identifier (026). This is used for antiquarian books, and has the following subelements:

 

$a First and second groups of characters

$b Third and fourth groups of characters

$c Date

$d Number of volume or part

 

It also has a subfield that can contain the entire fingerprint and is a simple element:

 

$e Unparsed fingerprint

 

Simple strings, Complex elements

 

There are string values in the 0XX area that are simple strings but that can contain more than one data element. The ISBN is an excellent example of this:

 

020 $a 0914378260 (pbk. : v. 1)

020 $a 0670033480 (hbk. : alk. paper)

 

This is clearly more than just an ISBN, but it is coded only as a single data element. The punctuation could be considered to delineate sub-elements, but is probably not regular enough or rigorously enough followed to be processed as such.
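
For the regular cases, a best-effort split might look like the sketch below. The pattern is my assumption about the common shape of these strings (an ISBN followed by an optional parenthesized qualifier with colon-separated parts); anything that does not fit is returned untouched, which reflects the caveat above that the punctuation cannot be relied on.

```python
import re

# Assumed shape: 10-character ISBN (last character may be X) or 13 digits,
# optionally followed by a qualifier in parentheses.
ISBN_RE = re.compile(r"^\s*([0-9]{9}[0-9Xx]|[0-9]{13})\s*(?:\((.*)\))?\s*$")

def split_isbn(value):
    """Split an 020 $a string into the ISBN and its qualifier parts."""
    m = ISBN_RE.match(value)
    if m is None:
        # Irregular string: keep it whole rather than guess.
        return {"isbn": None, "qualifiers": [], "raw": value}
    qualifier = m.group(2)
    parts = [p.strip() for p in qualifier.split(":")] if qualifier else []
    return {"isbn": m.group(1), "qualifiers": parts, "raw": value}

parsed = split_isbn("0914378260 (pbk. : v. 1)")
```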

 

Another field that is a single string but can contain more than one data point is the Number of Musical Instruments or Voices Code (048). This field has two (independent) subfields:

 

$a Performer or ensemble

$b Soloist

 

Each one carries a two-character code for the instrument or voice, and "may be followed by a two digit number (01-99) that indicates the number of parts or performers (e.g., va02, a two-part composition for Voice - Soprano)." This is a complex element -- instrument or voice plus number of parts -- that is coded as a single string.

 

At some future date it may be desirable to further analyze these into their logical parts. For now, they will be listed here as single strings, matching their use in MARC.
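
If that further analysis were ever wanted, the instrument and voice code strings quoted above (such as va02) could be taken apart along these lines. This is a sketch only; the function name is hypothetical, and the pattern assumes two lowercase letters followed by an optional two-digit count.

```python
import re

# Assumption: two lowercase letters, then optionally two digits.
CODE_RE = re.compile(r"^([a-z]{2})([0-9]{2})?$")

def split_instrument_code(value):
    """Return (code, number_of_parts); the number is None when absent."""
    m = CODE_RE.match(value)
    if m is None:
        raise ValueError(f"unexpected code string: {value!r}")
    code, count = m.groups()
    return code, (int(count) if count is not None else None)

code, parts = split_instrument_code("va02")
```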

 

Qualified elements

The qualified elements are ones that have additional information about the data element itself. These qualifying subfields are about the data element, not about the bibliographic item being described. The qualification changes the meaning or semantics of the data.

 

Because MARC has grown over time, it contains different treatments of data that relate to qualification. For example, there are fields for a number of different classification numbers:

Library of Congress Call Number

National Library of Medicine Call Number

National Agricultural Library Call Number

Dewey Decimal Classification Number

 

In a sense, each of these is a classification or call number with the type of number (or "source" of the number) included in the definition of the field.

 

At some point it became clear that if MARC were to create a separate field for every possible call/classification number, it would run out of 0XX fields. The next step was to create a field for "Other classification number." This field takes a different approach to defining a call/class number and its source. In effect, this is a general field where any call/class number can be input:

$a Classification number $b Item number $2 Source of number

 

Were MARC being developed anew today it might make sense to use this field for all class numbers. Instead of having a separate field for LCC, you would have

$a Classification number $b Item number $2 Source of number=LCC
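
As a sketch of that idea, the separate fields could be folded into one general structure in which the source is data rather than part of the field definition. The field tags are the MARC tags for the four call numbers listed above; the source labels ("LCC" and so on) and all names in the code are illustrative and are not taken from any MARC code list.

```python
from dataclasses import dataclass
from typing import Optional

# Tags whose definition implies the source. Labels are illustrative only.
SOURCE_BY_TAG = {"050": "LCC", "060": "NLM", "070": "NAL", "082": "DDC"}

@dataclass
class ClassNumber:
    classification: str
    item: Optional[str]
    source: Optional[str]

def to_class_number(tag, subfields):
    """Build a general class number from a field tag and a code->value dict.

    For tags not in the table, the source is read from $2 when it is present.
    """
    source = SOURCE_BY_TAG.get(tag) or subfields.get("2")
    return ClassNumber(subfields["a"], subfields.get("b"), source)

# Invented sample values.
lc = to_class_number("050", {"a": "QA76", "b": ".S65"})
other = to_class_number("084", {"a": "X 123", "2": "example-scheme"})
```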

 

Many elements are qualified by assigning agency or code source. Of these, some have a finite list of types, and others are open-ended.

 

Qualifiers with finite lists

Fully designated types can in most cases be turned into a single element for each type. For example, the 082 DDC, which is typed as either the full edition or the abridged edition, can be created as two separate elements: full edition DDC and abridged edition DDC. If there are URIs in the future that distinguish between them, then only one element will be needed, to be used with the appropriate URI. In other cases, such as that of 033 Date of Event, it is possible that the structure of the data will be sufficient to distinguish between date types. If not, then each date type should be treated as a separate element.

 

Examples:

 

 033 Date of event

 

 082 Dewey Decimal Classification
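
A small sketch of the 082 case: the edition type becomes part of the element name, so no separate qualifier is carried. The mapping of first indicator 0 to the full edition and 1 to the abridged edition follows the MARC 21 definition of the 082; the element names and the function are my own placeholders.

```python
# First indicator of the 082 -> hypothetical element name.
DDC_ELEMENT_BY_IND1 = {
    "0": "DDC number (full edition)",
    "1": "DDC number (abridged edition)",
}

def ddc_element(ind1, number):
    """Return (element name, value) for an 082 classification number."""
    name = DDC_ELEMENT_BY_IND1.get(ind1)
    if name is None:
        raise ValueError(f"edition type not covered by this sketch: {ind1!r}")
    return name, number

element = ddc_element("1", "025.3")  # sample value
```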

 

Open ended qualifiers

Many of the elements with open-ended typing have a qualifier that gives the source or issuing agency for the value. There is no set list of agencies that can be used with these fields, although codes should be selected from the MARC Institution Code list.

 

In an environment where lists of values and the values themselves have URIs, the URI itself will be sufficient for the value to be fully qualified as to its source. Qualifiers are needed only where URIs are not available (which is still the majority of cases for MARC values). Qualified elements could be given a structure that includes the value and the qualifier for the value. The structure can be combined with the value (e.g. URN-like, ISMN:2222) or stored as a multi-part element (type="ISMN", number="2222").

 

Examples:

 

 024 Other standard identifier

 

 015 National Bibliography Number
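
The two ways of storing a qualified value mentioned above (combined with the value, or as a multi-part element) carry the same information and can be converted into each other. A sketch, with hypothetical names throughout:

```python
from typing import NamedTuple

class Qualified(NamedTuple):
    """Multi-part form: the qualifier and the value kept separately."""
    type: str
    number: str

def to_combined(q):
    """URN-like form, e.g. ISMN:2222."""
    return f"{q.type}:{q.number}"

def from_combined(text):
    """Split on the first colon only, so colons inside the value survive."""
    type_, _, number = text.partition(":")
    return Qualified(type_, number)

combined = to_combined(Qualified(type="ISMN", number="2222"))
```

Where a URI exists for the value, neither form is needed, since the URI already identifies the source.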

Notes on Specific Fields

015 - National Bibliography Number (and others)

When there is an invalid number, will there be a $2 source?

020 - ISBN

There is a general question for most of these fields about dependencies between subfields. In some cases it is clear that you must have a $a in order to have, for example, a $z. In the description of the 020 there is nothing said about whether the $a must be there. The National Level Bibliographic Record shows all of the 020 subfields as being mandatory if applicable (A), while the $a of the 024 is mandatory (M). This is a clue that the $c and $z in the 020 can be present even if there isn't a $a, but that in the 024 the $a must be there if the field exists.

024 - Other standard identifier

This field has 5 indicator values that make it a specific standard number (ISRC, UPC, ISMN, EAN, SICI). For others, the source is specified in $2. There is another indicator value, though, that states that the type of identifier is unspecified (ind. 1, value=8). Presumably this means that this instance does not get a $2. In addition, for each code there is the possibility to state whether the scanned and eye-readable versions of a number or code differ. In MARC this is two codes: 0=does not differ, 1=does differ. This could be treated as a binary y/n data element.

The cancelled and invalid codes can only be input if the 024 $a is present, thus the field necessarily is a complex unit of $a (+$c) (+$d) (+$z). ($d is for additional codes that follow the $a data).
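
A sketch of how the 024 indicators might be read into a type and a yes/no element. The first-indicator values (0-4 for the five named standards, 7 for "source in $2", 8 for unspecified) are taken from the MARC 21 field definition rather than from this page; the function names and the sample source code are hypothetical.

```python
# First indicator -> identifier type named by the indicator itself.
STANDARD_TYPES = {"0": "ISRC", "1": "UPC", "2": "ISMN", "3": "EAN", "4": "SICI"}

def identifier_type(ind1, subfields):
    """Return the identifier type, or None when it is unspecified."""
    if ind1 in STANDARD_TYPES:
        return STANDARD_TYPES[ind1]
    if ind1 == "7":
        return subfields.get("2")   # open-ended: source given in $2
    if ind1 == "8":
        return None                 # unspecified, so no $2 is expected
    raise ValueError(f"unexpected first indicator: {ind1!r}")

def scanned_differs(ind2):
    """Second indicator as a binary element; None when nothing is stated."""
    return {"0": False, "1": True}.get(ind2)

# Invented sample: a 024 with the source carried in $2.
kind = identifier_type("7", {"a": "12345", "2": "example-source"})
```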

028 - Publisher number

This has a second indicator that controls whether or not notes or added entries are derived from the field. I have skipped that in my analysis, and would be interested to hear if it is considered an important bit of info. If so, I would probably create it as a single subelement representing the 2nd indicator and its four different values.

033 - Date/Time of Event, 045 - Time period of content

These two have an indicator to say whether the date is single, multiple, or a range. While this could be covered in the date/time format, I am making it a modifier on the dates, treating it like a subfield that modifies each date subfield.
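
Treating the indicator as a modifier on each date could look like this sketch. The indicator values 0, 1 and 2 for single, multiple and range follow the MARC 21 definitions of these fields; the names and the sample dates are invented.

```python
# First indicator -> modifier applied to every date in the field.
DATE_KIND = {"0": "single", "1": "multiple", "2": "range"}

def qualified_dates(ind1, dates):
    """Attach the field-level date type to each date value.

    An indicator outside the table (for example a blank) gives kind=None.
    """
    kind = DATE_KIND.get(ind1)
    return [{"date": d, "kind": kind} for d in dates]

event = qualified_dates("2", ["19780916", "19780918"])
```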

035 - System control number

This is a simple data element, but in fact it could be considered complex because the data source is included in the string: "(CaOTULAS)41063988". I am treating it as simple for the time being.
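
Should it later be treated as complex, the split is mechanical when the parenthesized prefix is present. A sketch, assuming that shape; strings without the prefix are returned whole, and the function name is hypothetical.

```python
import re

# Assumed shape: "(source)number".
CONTROL_RE = re.compile(r"^\(([^)]+)\)(.+)$")

def split_control_number(value):
    """Return (source, number); source is None when no prefix is found."""
    m = CONTROL_RE.match(value)
    if m is None:
        return None, value
    return m.group(1), m.group(2)

source, number = split_control_number("(CaOTULAS)41063988")
```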

040 - Cataloging source

I'm not sure which of these, if any, can be treated as simple and which as complex, so for the moment I am moving past it. Because the 040 is not repeatable and only the "Modifying agency" subfield is repeatable, it would seem that all of the subfields would be treatable as simple -- that there is no dependency between the subfields. But I need confirmation on that.

046 - Special Coded Dates

There are instructions here that I do not understand ("The field must also contain...."), and I do not know which combinations of subfields make sense. I am skipping this one for now, and would love to talk to someone who understands the input conventions.

055 - Classification numbers assigned in Canada

This field has a second indicator for type, completeness and source of the call number that spans 0-9. With some data elements I have used the indicator values to create separate fields when there were only a few different values. In this case, I'm structuring the field with the 2nd indicator as a structural part. Thus this one gets: {type} {classification number} {item number} {source}. In comparison, the 052 (Geographic classification) has two indicator values: 1) US Dept of Defense Classification and 2) Other (source specified in $2). In that case I created two fields, one for the US Dept of Defense and one for Other. The reasoning behind this is perhaps not obvious, but to copy the MARC field into a data element would require you to translate US Dept of Defense into a $2 source code that does not exist in the MARC record. Wherever possible I have tried to avoid this kind of difference between the values in the MARC record and the data element. Clearly some rationalization of the data set could be done to remove these supposed inconsistencies, but that's a much bigger task and would need to take into consideration a lot of issues having to do with ease of input, etc.

 

0XX fields as Lists and in RDF

 

010-048 in an HTML display

This shows the simple properties at the top, followed by the complex properties, each with a code (called "Compound ID") that "names" the compound data element. This will look like:

 

LGAC
  Geographic Area Code/Local GAC code            043b
  Geographic Area Code/Local GAC code - Source   0432

 

A better display is needed, but this is a preview of the work in progress.

 

There is now a similar display of the full 0XX range in PDF. The next step is to give each "field" a name and a URI.

 

 
