Number and Code Fields (0xx)
The fields in this range are either simple (a single string or unit) or complex (multiple units in the data element). Each of these units may have qualifying information -- meta information about the value itself. Most frequently this qualifying information provides the agency, controlled list, or rules that define the provenance of the value. There are a few binary values. These latter cannot be further qualified and are generally found in indicators.
There are many fields in the MARC 0XX area that make up a logical gathering of elements. One example is the field for various language codes (041). It is easy to assume that the data elements in this field need to be kept together through a data structure of some kind. While it may be the case that applications will display these together as a unit, they will be treated as independent simple data elements if there is no inherent dependency between them. In the case of the 041, each subfield is meaningful on its own without the others. A counterexample is that of classification codes that have two parts: the class code and the item number. While the class code has meaning on its own, the item number is only meaningful in the context of the "call number" which comprises both the class code and the item number.
This is primarily to point out that the analysis here follows the logic of the data elements and their relation to each other and to the focus bibliographic item, and not the structure of the MARC record. It is this generalization of the underlying data elements that should make it possible to create different record structures from the same data that is in MARC.
Note on Control subfields ($w, $3, $5, $6, $8)
I don't include the control subfields in this analysis. To the extent that they inform the list of MARC data elements they will need to be included at some point. In many cases, the control subfields have to do with the structure of the record not the semantics of the data. The $3 links together statements about the same subunit in a record for a complex item. This may be handled differently in a different record structure. The $5 qualifies a field based on the institution to which it applies, often for item-level notes. These obviously could be placed in an item-level structure rather than the body of the bibliographic record for the manifestation. The $6 links together representations of the same field in different scripts or a transliteration to the vernacular. The $8 seems to be rarely used, and without some examples I find it hard to add to the analysis. I will keep looking for those. (The Moen statistical study lists some $8's in LC records, but only in the 0XX fields.)
Simple elements
These are elements that stand alone and can be used in any context. Examples are:
- Constant ratio linear vertical scale
- Library of Congress Catalog Number
- ISBN
- Terms of availability
- Report number
Note that there may be more than one simple element in a MARC field. Thus,
020 $a ISBN $c Terms of availability $z Canceled/invalid ISBN
is actually three separate single elements:
ISBN
Terms of availability
Canceled/invalid ISBN
Each of these is a statement about the focus of the bibliographic record, and they are not dependent on each other. (I think; if I'm wrong, let me know.)
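As a minimal sketch of this idea, assuming a field is represented as a list of (subfield code, value) pairs, the 020 can be flattened into independent statements. The mapping name, the function name, and the sample $c and $z values are my own illustrations and do not come from any MARC library or record.

```python
# Hypothetical mapping from 020 subfield codes to independent element names.
ISBN_FIELD_ELEMENTS = {
    "a": "ISBN",
    "c": "Terms of availability",
    "z": "Canceled/invalid ISBN",
}


def simple_elements(subfields, mapping):
    """Turn one field's subfields into independent (element, value) statements.

    Subfields that are not in the mapping are skipped, not guessed at.
    """
    return [(mapping[code], value) for code, value in subfields if code in mapping]


# Illustrative values only; the $c and $z contents are invented for the example.
field_020 = [("a", "0914378260"), ("c", "$4.95"), ("z", "0914378261")]
statements = simple_elements(field_020, ISBN_FIELD_ELEMENTS)
```

Each resulting pair stands on its own as a statement about the bibliographic item, so nothing is lost if the pairs are stored or displayed separately.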
Complex Elements
Many of the elements are made up of two or more dependent parts. The call numbers are a relatively simple example of this, illustrated with the LC Call Number:
050 $a Classification number $b Item number
The Classification number can be used without the Item number, but the Item number is not meaningful without the Classification number.
Note that the Classification number is repeatable, while the Item number is not. When there is more than one Classification number, any one of them can be combined with the single Item number to create a full call number.
Other call numbers that follow this general pattern are: UDC (MARC 080), DDC (MARC 082).
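The dependency described above can be sketched as follows. This is only an illustration under my own assumptions: the field is a list of (subfield code, value) pairs, the function name is hypothetical, and the sample class and item numbers are invented.

```python
def call_numbers(subfields):
    """Pair every classification number ($a) with the single item number ($b).

    The item number is optional, but it may not repeat and may not appear
    without at least one classification number.
    """
    class_numbers = [value for code, value in subfields if code == "a"]
    item_numbers = [value for code, value in subfields if code == "b"]
    if len(item_numbers) > 1:
        raise ValueError("the item number ($b) is not repeatable")
    if item_numbers and not class_numbers:
        raise ValueError("an item number needs a classification number")
    item = item_numbers[0] if item_numbers else None
    return [(class_number, item) for class_number in class_numbers]


# Illustrative values only.
field_050 = [("a", "Z699.35.M28"), ("a", "Z693"), ("b", "C69 2010")]
full_call_numbers = call_numbers(field_050)
```

A classification number with no item number still yields a usable result, while an item number on its own is rejected, which mirrors the one-way dependency between the two parts.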
Another example of a complex field is the Fingerprint Identifier (026). This is used for antiquarian books, and has the following subelements:
$a First and second groups of characters
$b Third and fourth groups of characters
$c Date
$d Number of volume or part
It also has a subfield that can contain the entire fingerprint and is a simple element:
$e Unparsed fingerprint
Simple strings, Complex elements
There are string values in the 0XX area that are simple strings but that can contain more than one data element. The ISBN is an excellent example of this:
020 $a 0914378260 (pbk. : v. 1)
020 $a 0670033480 (hbk. : alk. paper)
This is clearly more than just an ISBN, but it is coded only as a single data element. The punctuation could be considered to delineate sub-elements, but is probably not regular enough or rigorously enough followed to be processed as such.
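To show what a best-effort split might look like, here is a rough sketch. It assumes the common pattern of a number followed by one parenthesized qualifier whose parts are separated by colons; as noted above, real data will not always follow that pattern, so the function (a hypothetical name) returns nothing parsed when the pattern fails.

```python
import re

# Number (digits, X, hyphens), then an optional parenthesized qualifier.
ISBN_WITH_QUALIFIER = re.compile(r"^\s*([0-9Xx-]+)\s*(?:\((.*)\))?\s*$")


def split_isbn_string(value):
    """Return (number, qualifier_parts); (None, []) if the pattern does not fit."""
    match = ISBN_WITH_QUALIFIER.match(value)
    if match is None:
        return None, []
    number, qualifier = match.groups()
    parts = [part.strip() for part in qualifier.split(":")] if qualifier else []
    return number, parts


example = split_isbn_string("0914378260 (pbk. : v. 1)")
# example == ("0914378260", ["pbk.", "v. 1"])
```

Anything that does not match would have to stay as an unparsed string, which is why I am keeping the field as a single element for now.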
Another field that is a single string but can contain more than one data point is the Number of Musical Instruments or Voices Code (048). This field has two (independent) subfields:
$a Performer or ensemble
$b Soloist
$b Soloist
Each one carries a two-character code for the instrument or voice, and "may be followed by a two digit number (01-99) that indicates the number of parts or performers (e.g., va02, a two-part composition for Voice - Soprano)." This is a complex element -- instrument or voice code plus number of parts -- that is coded as a single string.
At some future date it may be desirable to further analyze these into their logical parts. For now, they will be listed here as single strings, matching their use in MARC.
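If that further analysis were done, the code-plus-count strings quoted above could be separated along these lines. This is a sketch only: the function name is hypothetical, and it assumes exactly two lowercase letters followed by an optional two-digit count, as in the quoted definition.

```python
import re

# Two-letter code, optionally followed by a two-digit count (01-99).
CODE_WITH_COUNT = re.compile(r"([a-z]{2})(\d{2})?")


def parse_instrument_code(value):
    """Split a string such as 'va02' into (code, count); count may be None."""
    match = CODE_WITH_COUNT.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected code format: {value!r}")
    code, count = match.groups()
    if count == "00":
        raise ValueError("the count must be in the range 01-99")
    return code, (int(count) if count is not None else None)


parsed = parse_instrument_code("va02")
# parsed == ("va", 2)
```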
Qualified elements
The qualified elements are ones that have additional information about the data element itself. These qualifying subfields are about the data element, not about the bibliographic item being described. The qualification changes the meaning or semantics of the data.
Due to its growth over time, there are different treatments of data in MARC that relate to qualification. For example, there are fields for a number of different classification numbers:
Library of Congress Call Number
National Library of Medicine Call Number
National Agricultural Library Call Number
Dewey Decimal Classification Number
In a sense, each of these is a classification or call number with the type of number (or "source" of the number) included in the definition of the field.
At some point it became clear that if MARC were to create a separate field for every possible call/classification number, it would run out of 0XX fields. The next step was to create a field for "Other classification number." This field takes a different approach to defining a call/class number and its source. In effect, this is a general field where any call/class number can be input:
$a Classification number $b Item number $2 Source of number
Were MARC being developed anew today it might make sense to use this field for all class numbers. Instead of having a separate field for LCC, you would have
$a Classification number $b Item number $2 Source of number=LCC
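A sketch of what that single generic structure could look like follows. The class and function names are hypothetical, and the source labels ("LCC", "NLM", "NAL", "DDC") are illustrative labels of my own rather than values taken from a MARC source code list. The sketch also assumes at most one of each subfield per field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassificationNumber:
    class_number: str
    item_number: Optional[str]
    source: str


# Fields whose definition already implies the source (labels are illustrative).
IMPLIED_SOURCE = {"050": "LCC", "060": "NLM", "070": "NAL", "082": "DDC"}


def from_field(tag, subfields):
    """Build one generic classification element from a dedicated or generic field."""
    values = dict(subfields)  # assumes no repeated subfield codes
    source = IMPLIED_SOURCE.get(tag) or values.get("2")
    if source is None:
        raise ValueError(f"no source known for field {tag}")
    return ClassificationNumber(values["a"], values.get("b"), source)


# Illustrative values only; "example-scheme" is a placeholder source code.
lc = from_field("050", [("a", "Z699.35.M28"), ("b", "C69 2010")])
other = from_field("084", [("a", "123.4"), ("2", "example-scheme")])
```

Both results have the same shape, so the difference between a dedicated field and the "Other classification number" field disappears once the source is carried as data.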
Many elements are qualified by assigning agency or code source. Of these, some have a finite list of types, and others are open-ended.
Qualifiers with finite lists
Fully designated types can in most cases be turned into a single element for each type. For example, the 082 DDC, which is typed as either the full edition or the abridged edition, can be created as two separate elements: full edition DDC and abridged edition DDC. If there are URIs in the future that distinguish between them, then only one element will be needed, to be used with the appropriate URI. In other cases, such as that of 033 Date of Event, it is possible that the structure of the data will be sufficient to distinguish between date types. If not, then each date type should be treated as a separate element.
Examples:
033 Date of event
- Single date
- Multiple single dates
- Range of dates
082 Dewey Decimal Classification
- Full edition
- Abridged edition
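For the 082, turning the finite list of types into separate elements could be sketched like this. I am assuming that the first indicator carries the edition type, with 0 for the full edition and 1 for the abridged edition; the element names and the function name are hypothetical, and the class number is illustrative.

```python
# Assumed first-indicator values for the 082: 0 = full edition, 1 = abridged edition.
DDC_ELEMENT_BY_EDITION = {
    "0": "DDC (full edition)",
    "1": "DDC (abridged edition)",
}


def ddc_statement(first_indicator, class_number):
    """Return (element name, value), choosing the element from the edition type."""
    if first_indicator not in DDC_ELEMENT_BY_EDITION:
        raise ValueError(f"no element defined for indicator {first_indicator!r}")
    return DDC_ELEMENT_BY_EDITION[first_indicator], class_number


full = ddc_statement("0", "025.3")
abridged = ddc_statement("1", "025.3")
```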
Open ended qualifiers
Many of the elements with open-ended typing have a qualifier that gives the source or issuing agency for the value. There is no set list of agencies that can be used with these fields, although codes should be selected from the MARC Institution Code list.
In an environment where lists of values and the values themselves have URIs, the URI itself will be sufficient for the value to be fully qualified as to its source. Qualifiers are needed only where URIs are not available (which is still the majority of cases for MARC values). Qualified elements could be given a structure that includes the value and the qualifier for the value. The structure can be combined with the value (e.g. URN-like, ISMN:2222) or stored as a multi-part element (type="ISMN", number="2222").
Examples:
024 Other standard identifier
- 0 - International Standard Recording Code
- 1 - Universal Product Code
- 2 - International Standard Music Number
- 3 - International Article Number
- 4 - Serial Item and Contribution Identifier
- 7 - Source specified in subfield $2
- 8 - Unspecified type of standard number or code
015 National Bibliographic Number
- $a - National bibliography number (R)
- $z - Canceled/invalid national bibliography number (R)
- $2 - Source (NR) Code that identifies the source of the National Bibliography Number. Code from: National Bibliography Number Source Codes.
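The two ways of carrying a qualifier mentioned above, multi-part and URN-like, can be sketched using the first-indicator values listed for the 024. The function names are hypothetical and the abbreviations are just short labels for the identifier types; the value "example-source" stands in for whatever code a record carries in $2.

```python
# Short labels for the 024 first-indicator values 0-4.
STANDARD_IDENTIFIER_TYPES = {
    "0": "ISRC",
    "1": "UPC",
    "2": "ISMN",
    "3": "EAN",
    "4": "SICI",
}


def qualify_024(first_indicator, number, source=None):
    """Return a multi-part element: {'type': ..., 'number': ...}."""
    if first_indicator in STANDARD_IDENTIFIER_TYPES:
        id_type = STANDARD_IDENTIFIER_TYPES[first_indicator]
    elif first_indicator == "7":
        if not source:
            raise ValueError("indicator 7 requires a source in $2")
        id_type = source
    elif first_indicator == "8":
        id_type = None  # unspecified type: nothing to qualify with
    else:
        raise ValueError(f"unknown indicator value {first_indicator!r}")
    return {"type": id_type, "number": number}


def as_urn_like(qualified):
    """Combine type and number into a single URN-like string."""
    if qualified["type"] is None:
        return qualified["number"]
    return f"{qualified['type']}:{qualified['number']}"


ismn = qualify_024("2", "2222")
ismn_string = as_urn_like(ismn)  # "ISMN:2222"
```

Either form carries the same information; the multi-part form is easier to query by type, while the URN-like form is easier to pass around as a single value.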
Notes on Specific Fields
015 - National Bibliography Number (and others)
When there is an invalid number, will there be a $2 source?
020 - ISBN
There is a general question for most of these fields about dependencies between subfields. In some cases it is clear that you must have a $a in order to have, for example, a $z. In the description of the 020 there is nothing said about whether the $a must be there. The National Level Bibliographic Record shows all of the 020 subfields as being mandatory if applicable (A), while the $a of the 024 is mandatory (M). This is a clue that the $c and $z in the 020 can be present even if there isn't a $a, but that in the 024 the $a must be there if the field exists.
024 - Other standard identifier
This field has 5 indicator values that make it a specific standard number (ISRC, UPC, ISMN, EAN, SICI). For others, the source is specified in $2. There is another indicator value, though, that states that the type of identifier is unspecified (ind. 1, value=8). Presumably this means that this instance does not get a $2. In addition, for each code there is the possibility to state whether the scanned and eye-readable versions of a number or code differ. In MARC this is two codes: 0=does not differ, 1=does differ. This could be treated as a binary y/n data element.
The cancelled and invalid codes can only be input if the 024 $a is present, thus the field necessarily is a complex unit of $a (+$c) (+$d) (+$z). ($d is for additional codes that follow the $a data).
028 - Publisher number
This has a second indicator that controls whether or not notes or added entries are derived from the field. I have skipped that in my analysis, and would be interested to hear if it is considered an important bit of info. If so, I would probably create it as a single subelement representing the 2nd indicator and its four different values.
033 - Date/Time of Event, 045 - Time period of content
These two have an indicator to say whether the date is single, multiple, or a range. While this could be covered in the date/time format, I am making it a modifier on the dates, treating it like a subfield that modifies each date subfield.
035 - System control number
This is a simple data element, but in fact it could be considered complex because the data source is included in the string: "(CaOTULAS)41063988." I am treating it as simple for the time being.
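Should it later be treated as complex, the split is straightforward when the source appears in leading parentheses, as in the example above. A minimal sketch, with a hypothetical function name, assuming that convention holds:

```python
import re

# Source code in leading parentheses, followed by the number itself.
PREFIXED_NUMBER = re.compile(r"\(([^)]+)\)(.+)")


def split_control_number(value):
    """Return (source, number); source is None when there is no prefix."""
    match = PREFIXED_NUMBER.fullmatch(value)
    if match is None:
        return None, value
    return match.group(1), match.group(2)


source, number = split_control_number("(CaOTULAS)41063988")
# source == "CaOTULAS", number == "41063988"
```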
040 - Cataloging source
I'm not sure which of these, if any, can be treated as simple and which as complex, so for the moment I am moving past it. Because the 040 is not repeatable and only the "Modifying agency" subfield is repeatable, it would seem that all of the subfields would be treatable as simple -- that there is no dependency between the subfields. But I need confirmation on that.
046 - Special Coded Dates
There are instructions here that I do not understand ("The field must also contain...."), and I do not know which combinations of subfields make sense. I am skipping this one for now, and would love to talk to someone who understands the input conventions.
055 - Classification numbers assigned in Canada
This field has a second indicator for type, completeness and source of the call number that spans 0-9. With some data elements I have used the indicator values to create separate fields when there were only a few different values. In this case, I'm structuring the field with the 2nd indicator as a structural part. Thus this one gets: {type} {classification number} {item number} {source}. In comparison, the 052 (Geographic classification) has two indicator values: 1) US Dept of Defense Classification and 2) Other (source specified in $2). In that case I created two fields, one for the US Dept of Defense and one for Other. The reasoning behind this is perhaps not obvious, but to copy the MARC field into a data element would require you to translate US Dept of Defense into a $2 source code that does not exist in the MARC record. Wherever possible I have tried to avoid this kind of difference between the values in the MARC record and the data element. Clearly some rationalization of the data set could be done to remove these supposed inconsistencies, but that's a much bigger task and would need to take into consideration a lot of issues having to do with ease of input, etc.
0XX fields as Lists and in RDF
This shows the simple properties at the top, followed by the complex properties, each with a code (called "Compound ID") that "names" the compound data element. This will look like:
LGAC
    Geographic Area Code/Local GAC code | 043b
    Geographic Area Code/Local GAC code - Source | 0432
A better display is needed, but this is a preview of the work in progress.
There is now a similar display of the full 0XX range in PDF. The next step is to give each "field" a name and a URI.