Data and Studies


This area is for studies relating to bibliographic formats, or any data that is available for download.


MARC by the Numbers (With Apologies to Harper's)

 

 

The MARC21 Bibliographic Format, by the numbers

(with apologies to Harper's Magazine)

(Note: all figures exclude fields 863-887)

 

 

Number of total tags defined in the MARC21 bibliographic format: 182

Number of variable field tags: 175

Total number of subfields defined: 1711

Total number of unique subfields, based on their names: 551 (32%)

Number of the 352 indicator positions that are undefined: 211 (60%)

Number of subfields that are a type of title: 87 (5%)

Subfields in order of occurrence: $a: 174; $8: 172; $6: 161; $b: 108; $c: 83; $d: 70; $7: 67; $g: 56; $n: 52; $h: 50; $3: 50; $z: 46; $f: 45; $k: 45; $e: 44; $x: 41; $m: 39; $u: 38; $t: 37; $2: 37; $s: 35; $p: 33; $o: 32; $i: 29; $y: 29; $r: 28; $l: 27; $v: 20; $5: 20; $w: 17; $j: 17; $4: 14; $q: 14; $1: 0; $9: 0 (Note: numeric subfields generally have the same value in every field in which they appear. Non-numeric subfields can have a different value in every field in which they appear.)

Number of fixed field values (006-008): 2401

Number of 006 values: 688

Number of 006 values that are unique: 0

Number of 007 data elements defined: 118

Number of unique 007 data elements (based on their names): 55 (47%)

Total number of 007 data values in all 007 positions: 867

Number of 007 unique data values in all 007 positions: 434 (50%)

Number of 007 data values that are defined as "Unknown": 70

Number of 007 values that are "Other": 66

Number of 007 values that are "No attempt to code": 90

Total percentage of 007 values that are one of the three above: 26%

Number of 008 data elements defined for all formats: 82

Number of unique 008 data elements (based on their names): 58 (71%)

Number of 008 data elements that are obsolete: 17 (21%)

Number of 008 data values in all positions: 846

Number of 008 data values that are unique: 485 (57%)

Number of 008 data values that are "Unknown" or "Other": 51 (6%)

Approximate number of defined data value lists included in the MARC21 standard: 201
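
The percentages quoted in the list can be rechecked from the counts given alongside them. The following is a minimal Python sketch that uses only the figures stated above; the labels and the helper names percent and mismatches are introduced here for illustration and are not part of the original files.

```python
# Counts quoted in the list above: (label, part, whole, stated percent).
CLAIMS = [
    ("unique subfields", 551, 1711, 32),
    ("undefined indicator positions", 211, 352, 60),
    ("title subfields", 87, 1711, 5),
    ("unique 007 data elements", 55, 118, 47),
    ("unique 007 data values", 434, 867, 50),
    ("007 Unknown / Other / No attempt to code", 70 + 66 + 90, 867, 26),
    ("unique 008 data elements", 58, 82, 71),
    ("obsolete 008 data elements", 17, 82, 21),
    ("unique 008 data values", 485, 846, 57),
    ("008 Unknown or Other", 51, 846, 6),
]


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


def mismatches(claims):
    """List every claim whose recomputed percentage differs from the stated one."""
    return [
        (label, percent(part, whole), stated)
        for label, part, whole, stated in claims
        if percent(part, whole) != stated
    ]


print(mismatches(CLAIMS))  # prints []
# The 006, 007 and 008 value counts add up to the fixed field total.
print(688 + 867 + 846)  # prints 2401
```

An empty list means every stated percentage agrees with its counts after rounding.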

 

 

To check these values, or to try for others, I (kc) have two tab-delimited files that can be loaded into a database or spreadsheet. They are:

The data elements in the first file (fixed field values) are:

 

tag (3 digits, e.g. 006)

format to which it applies (3-5 chars, e.g. MUSIC)

position in the field (2 digits, e.g. 01)

vocabulary name (var char, e.g. Motion picture presentation format)

MARC code for the value (1 char, e.g. p)

textual name of value (var char, e.g. Standard sound aperture (reduced frame))
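
As a rough illustration of how a file with these six columns might be queried, here is a minimal Python sketch. It assumes the columns appear in the order listed above and that the file has no header row; neither is confirmed here. The sample rows are invented to show the shape of the data and are not taken from the actual file, and the function names are hypothetical.

```python
import csv
import io
from collections import Counter

# Assumed column order, following the list of data elements above.
FIXED_COLUMNS = ["tag", "format", "position", "vocabulary", "code", "value_name"]
CATCH_ALL = ("Unknown", "Other", "No attempt to code")


def read_fixed_values(lines):
    """Parse tab-delimited lines into dicts keyed by FIXED_COLUMNS."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(FIXED_COLUMNS, row)) for row in reader if row]


def summarize_tag(rows, tag):
    """Count total values, unique value names, and catch-all values for one tag."""
    names = [r["value_name"] for r in rows if r["tag"] == tag]
    catch_all = Counter(n for n in names if n in CATCH_ALL)
    return {"total": len(names), "unique": len(set(names)), "catch_all": dict(catch_all)}


# Invented sample rows (illustrative only, not real data from the file).
SAMPLE = (
    "007\tMAP\t01\tExample vocabulary\td\tExample value\n"
    "007\tMAP\t01\tExample vocabulary\tu\tUnknown\n"
    "007\tGLOBE\t01\tExample vocabulary\tu\tUnknown\n"
    "008\tBOOKS\t23\tAnother vocabulary\td\tAnother value\n"
)

rows = read_fixed_values(io.StringIO(SAMPLE))
summary = summarize_tag(rows, "007")
print(summary)
```

To run this against the real file, an open file handle can be passed to read_fixed_values in place of the io.StringIO sample.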

 

The data elements in the second file (tags, subfields, and indicators) are:

 

MARC Tag (3 digits, e.g. 010)

Subfield (1 to 3 chars; if Subfield="n/a" then this row is an indicator position, and Position, the next data element, gives the indicator position)

Position (2 digits, e.g. 02; if "n/a" then this is a subfield, not an indicator)

Data Element (var char, name of data element from MARC21 concise)
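
The "n/a" convention described above is enough to separate subfield rows from indicator rows and to rebuild counts such as the subfield frequencies. The following Python sketch shows one possible way to do so. It assumes the four columns appear in the order listed, with no header row, and that undefined indicators carry the data element name "Undefined"; all of these are assumptions. The sample rows and function names are invented for illustration.

```python
import csv
import io
from collections import Counter

# Assumed column order, following the list of data elements above.
COLUMNS = ["tag", "subfield", "position", "data_element"]


def read_rows(lines):
    """Parse tab-delimited lines into dicts keyed by COLUMNS."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def split_rows(rows):
    """Separate subfield rows (Position is n/a) from indicator rows (Subfield is n/a)."""
    subfields = [r for r in rows if r["position"] == "n/a"]
    indicators = [r for r in rows if r["subfield"] == "n/a"]
    return subfields, indicators


def subfield_frequencies(subfields):
    """Count how many fields define each subfield code, most frequent first."""
    return Counter(r["subfield"] for r in subfields).most_common()


# Invented sample rows (illustrative only, not real data from the file).
SAMPLE = (
    "010\tn/a\t01\tUndefined\n"
    "010\tn/a\t02\tUndefined\n"
    "010\ta\tn/a\tExample name one\n"
    "010\tz\tn/a\tExample name two\n"
    "020\ta\tn/a\tExample name three\n"
)

subfields, indicators = split_rows(read_rows(io.StringIO(SAMPLE)))
frequencies = subfield_frequencies(subfields)
unique_names = len({r["data_element"] for r in subfields})
# "Undefined" as the stored name for undefined indicators is an assumption.
undefined = sum(1 for r in indicators if r["data_element"] == "Undefined")
print(frequencies, unique_names, undefined)
```

The same grouping can be done in a spreadsheet or database by filtering on the Subfield and Position columns.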

 

More useful MARC files

 

MARC and RDA

 

MARC Content Designation Utilization (aka Bill Moen's statistical work)

 

MARC and FRBR