
Bib fields

Page history last edited by Karen Coyle 11 years, 5 months ago

 


 

Bibliographic Fields (100-899)


Fields by Frequency

As a way to prioritize the analysis, here are the top fields by frequency, according to the UT (Moen) statistical analysis:

 

In 25%-100% of records: 100, 245, 260, 300, 500, 504, 650, 651, 700, 710, 880

Next 15%: 110, 246, 250, 440, 490, 600, 740

 

These give us a place to start, but in many cases similar or same situations arise in both popular and less popular fields, so it will make sense to solve those situations for all related fields or subfields.

 

Issues

The bibliographic fields are particularly difficult to render as data. Most fields have numerous subfields as well as indicator values. The fields often correspond to display segments rather than data elements, although in some cases those two are the same. Most will need to be developed as complex text fields.

 

If we look at these fields purely for their semantics we will probably discover a set of elements similar to FRBR and the RDA representation of FRBR. 

 

The RDA Toolkit has mappings from MARC to RDA and RDA to MARC. These map logically but do not reconcile differences in granularity. This means that these mappings are not 1-to-1 but many-to-many. The RDA to MARC table lists multiple MARC subfields as related to a single RDA element; the MARC table lists each MARC subfield separately but more than one can map to a single RDA element. These both resolve at the level of instance data, not their respective structures, and they require human judgment to make the decisions between the "many" in the one/many-to-many situations.

 

 

 

Fields and subfields

We take for granted that MARC is made up of fields and subfields, but the exact relationship between fields and subfields isn't always clear.  In some cases the field provides a particular context, as in the 1XX, which designates the subfields that follow as part of the main entry and in addition codes the entry as either a personal name, corporate name, or conference. (The indicator can change a personal name to a family name.) Looking at it from a linked data point of view, the field provides a relationship between the content and the primary focus of the record. It may also identify the type of data in the field (personal name).

 

In other cases, the relationship between the field and the context of the subfields is less clear. The best example of this is the 880 field, which serves as a parallel field to another field in the same record and carries an "alternate graphic representation." The 880 field has no specifically defined subfields of its own, but can contain the subfields -- and thus the subfield meanings -- of the field it parallels. It's easiest to explain with an example:

 

100 2_ |6 880-01 |a Afwah al-Awdī, Ṣalāʼah ibn ʻAmr.
880 2_ |6 100-01/(3/r |a أفوه الأودي، صلاءة بن عمرو.

 

100 1_ |6 880-01 |a Nagradov, I. S. |q (Ilʹi︠a︡ Sergeevich)
880 1_ |6 100-01/(N |a Наградов, И. С. |q (Илья Сергеевич)

 

 

The 100 field and the 880 field are tied together by their sequence number (the "-01" in the $6 subfield). The tag "880" has no semantics of its own; the $6 subfield contains the tag that gives the general meaning of the field.
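
To make the $6 linkage concrete, here is a minimal Python sketch that splits a $6 value into its parts and works out which tag supplies the meaning of a field. It assumes the $6 layout of linking tag, hyphen, two-digit occurrence number, then an optional script code and an optional orientation code separated by slashes. The names Linkage, parse_linkage and effective_tag are made up for this illustration and do not come from any MARC library.

```python
import re
from typing import NamedTuple, Optional


class Linkage(NamedTuple):
    tag: str               # tag of the field this one is linked to
    occurrence: str        # two-digit number shared by both linked fields
    script: Optional[str]  # script identification code, e.g. "(N" or "(3"
    right_to_left: bool    # True when the orientation code "r" is present


# Assumed layout: <linking tag>-<occurrence>[/<script code>[/<orientation>]]
_LINKAGE = re.compile(r"^(\d{3})-(\d{2})(?:/([^/]+))?(?:/(r))?$")


def parse_linkage(value: str) -> Linkage:
    """Split the content of a $6 subfield into its components."""
    match = _LINKAGE.match(value.strip())
    if match is None:
        raise ValueError(f"not a $6 linkage value: {value!r}")
    tag, occurrence, script, orientation = match.groups()
    return Linkage(tag, occurrence, script, orientation == "r")


def effective_tag(field_tag: str, linkage: Linkage) -> str:
    """Return the tag that gives the field its meaning.

    An 880 borrows its meaning from the tag named in its $6;
    any other field keeps its own tag.
    """
    return linkage.tag if field_tag == "880" else field_tag


if __name__ == "__main__":
    link = parse_linkage("100-01/(3/r")
    print(link)
    print(effective_tag("880", link))
```

With the examples above, the 880 fields would both be read as carrying 100 semantics, and the Arabic one would additionally be flagged as right-to-left.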

 

Another problematic set of field/subfield combinations is those that can represent either a creator (person or corporate body) or a work. In the former case, the field has only name information; in the latter it includes both a name and a title:

 

700 1#$aBeethoven, Ludwig van,$d1770-1827.

700 1#$aBeethoven, Ludwig van,$d1770-1827.$tSonatas, $mpiano.$kSelections.

 

100 1_ |a Nabokov, Vladimir Vladimirovich, |d 1899-1977.

700 1_ |a Nabokov, Vladimir Vladimirovich, |d 1899-1977. |t Lolita.

 

There is nothing in the field or indicator level to tell you whether this represents a person, as the first of each set above, or a work, as in the second. In addition, the division between the creator/actor portion and the title portion is not as neat as one would like. There is at least one subfield, $g Miscellaneous information, that can be associated with either, and in a field with both creator and title its role is only discernible based on its position in the string of subfields.
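
As a rough illustration of how a program would have to make this distinction, the following Python sketch splits a field's subfields into a name portion and a title portion. It rests on one assumption that should be stated loudly: the title portion is taken to begin at the first $t, and everything before it is treated as name information. The function names and the list-of-pairs representation are hypothetical.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]  # (subfield code, value)


def split_name_title(subfields: List[Subfield]) -> Tuple[List[Subfield], List[Subfield]]:
    """Split subfields into (name portion, title portion).

    Assumption: the title portion starts at the first $t.
    A field without $t is treated as name-only.
    """
    for index, (code, _value) in enumerate(subfields):
        if code == "t":
            return subfields[:index], subfields[index:]
    return subfields, []


def portion_of_each_g(subfields: List[Subfield]) -> List[str]:
    """Say, for every $g in the field, which portion it falls in by position."""
    name, _title = split_name_title(subfields)
    boundary = len(name)
    return [
        "name" if index < boundary else "title"
        for index, (code, _value) in enumerate(subfields)
        if code == "g"
    ]


if __name__ == "__main__":
    beethoven = [
        ("a", "Beethoven, Ludwig van,"),
        ("d", "1770-1827."),
        ("t", "Sonatas,"),
        ("m", "piano."),
        ("k", "Selections."),
    ]
    name, title = split_name_title(beethoven)
    print("work" if title else "agent", name, title)
```

Note that the only signal available is subfield order, which is exactly the weakness described above: the same $g code lands in different portions depending on where it sits.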

 

Administrative data

Another analysis that needs to be done is to determine the focus of each data element in the record. There are at least two foci that I see right away:

  1. The primary resource (the book, the DVD, the map, etc.)
  2. The record (administrative data)

It isn't clear to me if the linking fields (77X) fit into category 1 or if they are a different category altogether.

 

Alternate Graphical Representation

These are the 880 fields that represent an alternate character representation from another field in the record. For example:

 

100 1_ |6 880-01 |a Nagradov, I. S. |q (Ilʹi︠a︡ Sergeevich)
880 1_ |6 100-01/(N |a Наградов, И. С. |q (Илья Сергеевич)

 

It isn't clear to me what to do with these, in particular how to link them to the particular field that they should be connected to.
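
One possible approach, sketched in Python, is to pair each 880 with its regular field by matching on the tag and occurrence number carried in $6. The tuple-based field representation and the function names are assumptions for this example, not an existing API. The sketch also assumes that an occurrence number of 00 marks an 880 with no associated regular field, so such fields are left unpaired.

```python
from typing import Dict, List, Optional, Tuple

Subfield = Tuple[str, str]                 # (code, value)
Field = Tuple[str, str, List[Subfield]]    # (tag, indicators, subfields)


def linkage_key(field: Field) -> Optional[Tuple[str, str]]:
    """Return (regular tag, occurrence number) taken from $6, or None."""
    tag, _indicators, subfields = field
    for code, value in subfields:
        if code == "6":
            linked_tag, _sep, rest = value.partition("-")
            occurrence = rest.split("/")[0]
            if occurrence == "00":
                return None  # assumed to mean "no associated field"
            # In an 880 the $6 names the regular tag; elsewhere the
            # field's own tag is the regular tag.
            regular_tag = linked_tag if tag == "880" else tag
            return (regular_tag, occurrence)
    return None


def pair_alternate_graphics(fields: List[Field]) -> List[Tuple[Field, Field]]:
    """Return (regular field, 880 field) pairs for one record."""
    regular: Dict[Tuple[str, str], Field] = {}
    for field in fields:
        key = linkage_key(field)
        if field[0] != "880" and key is not None:
            regular[key] = field

    pairs = []
    for field in fields:
        key = linkage_key(field)
        if field[0] == "880" and key in regular:
            pairs.append((regular[key], field))
    return pairs


if __name__ == "__main__":
    record = [
        ("100", "1_", [("6", "880-01"), ("a", "Nagradov, I. S."), ("q", "(Ilʹi︠a︡ Sergeevich)")]),
        ("880", "1_", [("6", "100-01/(N"), ("a", "Наградов, И. С."), ("q", "(Илья Сергеевич)")]),
    ]
    for original, alternate in pair_alternate_graphics(record):
        print(original[0], "<->", alternate[0])
```

Once paired, the 880 content could be attached to the same data element as the regular field, for instance as a second value tagged with its script, rather than being treated as a field of its own.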

 

Level of Detail

To what extent is it valuable to retain the exact level of detail of the MARC record? For each field it will be necessary to ask if information separated into subfields is useful as individual data elements. Some notes, for example, have subfielding that may not result in separately usable statements:

 

     506 1#$aRestricted: Material extremely fragile;$cAccess by appointment only.
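
The two options for a field like this can be compared with a small Python sketch: keep the subfields as separate values, or collapse them into a single note string. The parser below is a simplification written for this page. It assumes the display notation used in the example, with "$" followed by a one-character code, and it would break on a value that itself contains a "$".

```python
from typing import List, Tuple

Subfield = Tuple[str, str]  # (code, value)


def parse_field(line: str) -> Tuple[str, str, List[Subfield]]:
    """Parse a display line such as '506 1#$a...$c...' into its parts."""
    line = line.strip()
    tag = line[:3]
    indicators, _sep, body = line[3:].partition("$")
    subfields = [
        (chunk[0], chunk[1:].strip())
        for chunk in body.split("$")
        if chunk
    ]
    return tag, indicators.strip(), subfields


def as_single_note(subfields: List[Subfield]) -> str:
    """Collapse all subfields into one text value, keeping their order."""
    return " ".join(value for _code, value in subfields)


if __name__ == "__main__":
    tag, indicators, subfields = parse_field(
        "506 1#$aRestricted: Material extremely fragile;$cAccess by appointment only."
    )
    print(subfields)                  # detail retained
    print(as_single_note(subfields))  # detail dropped
```

For this example the collapsed string reads as a complete statement, while the $c value on its own says little without the $a that precedes it.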

 

Redundancy

The same data can appear in multiple places in the MARC record; there are numerous fields with title subfields.

 

Ambiguous Coding

Coding of the MARC fields and subfields is often not at a sufficient level of granularity to eliminate ambiguity. Although some ambiguity is to be expected, this has a particularly detrimental effect when there is not enough clarity to support desired functionality.

 

The uniform title is an interesting example of what programmers might call "overloading." Although identically coded, these 240 fields have totally different meanings:

  24010         $a Selections
      -- For an item ..."consisting of three or more works in various forms..." The title of the work is NOT "Selections"

  24010         $a Pendolo di Foucault. $l English
     -- For a translation of a Work. "Pendolo di Foucault" IS the title of the work (in the FRBR sense of Work) that was translated. The subfield $l tells you what it was translated to.

  24010         $a Concertos, $m harpsichord, string orchestra, $n BWV 1052, $r D minor
    -- I'm not sure how to describe this, except that it is a coded description of music that has little or nothing to do with the actual title of the thing being described. This is to music as the hierarchical place name (752) is to newspapers. It's great data, and undoubtedly very useful, but it really needs its own data element.
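
Because the coding is identical, any program that wants to tell these three cases apart has to guess from the content. The Python sketch below shows what such guessing might look like. Everything in it is a heuristic invented for illustration: the list of collective terms is a tiny made-up sample, and the rule that $m, $n or $r signals a constructed music title is an assumption, not something the format guarantees.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]  # (code, value)

# Assumption: an illustrative sample only, nowhere near complete.
COLLECTIVE_TERMS = {"selections", "works"}

# Assumption: $m (medium), $n (number) and $r (key) suggest a music title.
MUSIC_CODES = {"m", "n", "r"}


def classify_uniform_title(subfields: List[Subfield]) -> str:
    """Guess which of the overloaded meanings a 240 field carries."""
    codes = {code for code, _value in subfields}
    title = next((value for code, value in subfields if code == "a"), "")

    if codes & MUSIC_CODES:
        return "constructed music title"
    if title.strip(" .,").lower() in COLLECTIVE_TERMS:
        return "collective title"
    if "l" in codes:
        return "translation of a work"
    return "undetermined"


if __name__ == "__main__":
    examples = [
        [("a", "Selections")],
        [("a", "Pendolo di Foucault."), ("l", "English")],
        [("a", "Concertos,"), ("m", "harpsichord, string orchestra,"),
         ("n", "BWV 1052,"), ("r", "D minor")],
    ]
    for subfields in examples:
        print(classify_uniform_title(subfields))
```

That a word list and subfield sniffing are needed at all is the point: the distinction exists in the cataloger's head and in the content, but not in the coding.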

It gets even worse when you start looking at 700 $t's. The music people always want to have an index that includes the 100/240 and the 700 $a $t. Unfortunately, not all 700 $t's are equivalent to a 240, not even in music records. So you either throw every 700 with a $t into a uniform title field (and most of the time they won't look like uniform titles, so the value of that diminishes), or you do a title index that includes all of the titles, and the music folks are unhappy that they can't search only on THEIR uniform titles.

Clearly, if instead of throwing every kind of title into 700 $t we could have a data element for that valuable, constructed music title, then we could serve music library patrons much better.

 

Tags

100/110 - Primary agents

This is where one records the author or primary creator of a resource. The complexity here is that these are not simple descriptive fields but are headings that, in E-R parlance, represent entities with relationships to the primary focus. To put that in clearer terms, if the record describes a particular bibliographic thing, these fields represent other things that have a key creative relationship to the particular bibliographic thing.

 

The 100/110 can also be a work, based on the presence of the $t subfield, but this occurs only very rarely (about .01% or less). I haven't seen examples of this case, so I am not sure what they represent.

 

Parallel to the 100/110 is the 700/710, which are "added entries" that are either "secondary agents" or works described with an author/title entry.

 

300 Physical Description

The $a and $c in the 300 are both Mandatory in the NLBR.

These are mandatory if applicable:

$b (other physical details)

$e Extent of accompanying material

$f Type of unit

$g Size of unit

 

The 300 $e can contain information from any other subfield, but without subfielding to divide it:

 

300 ##$a271 p. :$bill. ;$c21 cm. +$e1 answer book.
300 ##$a271 p. :$bill. ;$c21 cm. +$e1 atlas (37 p., 19 leaves : col. maps ; 37 cm.)

 

This means that the 300 $e = 300 $a $b $c $f $g or any combination thereof. This makes it likely that there will be two elements: extent and extent of accompanying material. Note that the latter does not seem to be in RDA.

 

There are also examples in which the 300 $e mentions accompanying material but is not actually an instance of extent:

300 ##$a1 computer disk ;$c3 1/2 in. +$ereference manual.

This is unfortunate, but it will still need to be considered extent of accompanying material.
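
A minimal Python sketch of that two-element split follows. It assumes subfields arrive as (code, value) pairs, treats everything except $e as the main description, and strips the trailing "+" that ISBD punctuation places before accompanying material. The key names and the looks_like_extent check (does the value start with a digit?) are inventions for this example; the check is a crude heuristic, not a rule from the format.

```python
from typing import Dict, List, Optional, Tuple

Subfield = Tuple[str, str]  # (code, value)


def split_physical_description(subfields: List[Subfield]) -> Dict[str, Optional[str]]:
    """Separate a 300 field into its main description and its $e content."""
    main = " ".join(value for code, value in subfields if code != "e")
    accompanying = " ".join(value for code, value in subfields if code == "e")
    return {
        # Drop the trailing " +" that only introduces the $e.
        "description": main.rstrip(" +") or None,
        "accompanying_material": accompanying or None,
    }


def looks_like_extent(value: str) -> bool:
    """Crude guess: an extent statement usually starts with a number."""
    return value.strip()[:1].isdigit()


if __name__ == "__main__":
    with_count = [("a", "271 p. :"), ("b", "ill. ;"), ("c", "21 cm. +"),
                  ("e", "1 answer book.")]
    without_count = [("a", "1 computer disk ;"), ("c", "3 1/2 in. +"),
                     ("e", "reference manual.")]
    for subfields in (with_count, without_count):
        parts = split_physical_description(subfields)
        print(parts, looks_like_extent(parts["accompanying_material"] or ""))
```

The $e content stays an undivided string in both cases, since nothing in the coding separates its internal extent, details and dimensions.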

 

 

700/710, et al.

While the 1XX's are fairly complex, the 7XX's in this range are even more so. Where 1XX's represent creators in some sense of that meaning, the "added entries" in the 7xx range have a variety of roles. Unfortunately these roles are often not explicit in the MARC instance data. The two primary roles are: related agent, and related work. They are distinguished by the presence or absence of title subfields in the field.

 

100     1_ |a Bach, Johann Sebastian, |d 1685-1750.
245     10 |a James Galway plays Bach |h sound recording : |b two flute concertos ; Suite in B minor.

700     12 |a Bach, Johann Sebastian, |d 1685-1750. |t Suites, |m orchestra, |n BWV 1067, |r B minor.
700     1_ |a Galway, James. |4 prf

 

In this example, the first 700 field is a related work, the second 700 field is a related person. The related person has a "creator" or "agent" relationship to the bibliographic entity with the title in the 245. (In this case, "performer," but the actual relationship is often undeclared.) The related work is an analytic; that means that the related work is contained in, or a part of, the primary bibliographic entity. Related works are not always analytics:

 

245     00 |a Lolita [Motion picture]
700     1_ |a Nabokov, Vladimir Vladimirovich, |d 1899-1977. |t Lolita.

 

In this case, the relationship between the work in the 700 and the primary entity is not declared.
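
The distinctions drawn in these examples can be summed up in a short Python sketch. It assumes fields are given as an indicator string plus (code, value) pairs, that a $t marks a related work, that a second indicator of "2" marks an analytic, and that $4 or $e, when present, names the agent's role. The function name and the returned labels are made up for illustration.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]  # (code, value)


def added_entry_role(indicators: str, subfields: List[Subfield]) -> Tuple[str, str]:
    """Return (kind of related entity, relationship) for a 700/710 field."""
    codes = [code for code, _value in subfields]

    if "t" not in codes:
        # Name only: a related agent. The role is known only if a
        # relator code ($4) or relator term ($e) happens to be present.
        relators = [value for code, value in subfields if code in ("4", "e")]
        return ("related agent", relators[0] if relators else "undeclared")

    if indicators[1:2] == "2":
        return ("related work", "analytic")
    return ("related work", "undeclared")


if __name__ == "__main__":
    bach = [("a", "Bach, Johann Sebastian,"), ("d", "1685-1750."),
            ("t", "Suites,"), ("m", "orchestra,"), ("n", "BWV 1067,"),
            ("r", "B minor.")]
    galway = [("a", "Galway, James."), ("4", "prf")]
    nabokov = [("a", "Nabokov, Vladimir Vladimirovich,"), ("d", "1899-1977."),
               ("t", "Lolita.")]
    print(added_entry_role("12", bach))
    print(added_entry_role("1_", galway))
    print(added_entry_role("1_", nabokov))
```

The "undeclared" results are the cases where the data simply does not say how the related person or work connects to the thing described.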

 

740 Added entry title, uncontrolled

The 740 has some of the same characteristics as the 7XX's with $t's -- it can represent an analytic, in which case it is (theoretically) a title for an included Work. If it isn't coded as an analytic (2nd indicator is not "2"), then it's just a title, and not much more is known. In some cases the title may represent a main entry, a work that has no associated author. In other cases the named work has the 100 field for its author. But nothing in the data coding lets you know that. Here is an example from the MARC manual:

 

100 1#$aChekhov, Anton Pavlovich,$d1860-1904.
240 10$aVishnevyi sad.$lEnglish
245 14$aThe cherry orchard ;$bUncle Vanya /$cAnton Chekhov.
700 12$aChekhov, Anton Pavlovich,$d1860-1904.$tDi︠a︡di︠a︡ Vani︠a︡$lEnglish.$f1969.
740 02$aUncle Vanya.

 

The 700 field gives us the full heading, with Chekhov as author and the uniform title (Russian, transliterated) as an analytic entry. The 740 is an additional version of the title, but without the author.

 

Related resources - 76x-78x, 440, 800-83x

 

These fields all represent other resources that the focus resource relates to. There may or may not be an actual record for the related resource. In general, these are mini-citations, sometimes giving no more than a title.

 

Of the fields in this range, these are the ones that show up as appearing in more than 1% of bibliographic records, using the University of Texas statistics from 2006 on a combination of LC and OCLC records:

 

773 - Host Item Entry
776 - Additional Physical Form Entry
780 - Preceding Entry
785 - Succeeding Entry

810 - Series Added Entry - Corporate Name
830 - Series Added Entry (>7%)

440 - Series Statement/Added Entry-Title (~13%)
  made obsolete in 2008 --> should have been moved to 830
490 - Series Statement (>13%)

 

Note that although it is a series statement, the 490 is not generally considered appropriate for linking. Series titles that should link should all be in the 8xx range.

 

More info by field:

 

Related

 

773 - Host Item Entry

     This is used for articles in journals and chapters in books. The record is for the article or chapter, and the "host" item is listed in the 773. It is likely that there is a record for the host item, so linking could take place.


776 - Additional Physical Form Entry

     The examples in the MARC documentation show this as being used primarily for microforms. This is somewhat of a quirk of the "multiple versions" rules that state that different physical formats must be cataloged separately. I wonder if this is being used for ebooks vs. print books, but I suspect that most libraries code those on the same record.

 

780 - Preceding Entry
785 - Succeeding Entry

     These two are used in serials when the serial has changed names, which explains why they are used somewhat frequently.

 

810 - Series Added Entry - Corporate Name
830 - Series Added Entry (>7%)

440 - Series Statement/Added Entry-Title (~13%)
  made obsolete in 2008 --> should have been moved to 830
490 - Series Statement (>13%)

 

Indicators

In a separate page, Indicators

 

Possible solutions

 

ISBD+

One approach would be to leave the basic cataloging description, essentially defined in ISBD, intact, without attempting to change those textual fields to something linkable. Linkable data would then have a relationship to the description, and only the LD data would be used for linked data purposes. The ISBD portion would be considered a document.

 

Things and Strings

A first analysis could separate the variable fields into "things" and "strings." Things are those fields (or portions of fields) that can be represented by an authority-controlled entity. Conceptually, those things could be replaced by an identifier for the authority record. Strings are everything else. Strings themselves can be broken into categories, primarily transcribed text and supplied text.

 

Data v. Markup

Some of the variable fields could be treated as structured data, such as the structured contents notes field (505 using subfields). Another option is to treat textual fields as text with markup where needed, as in an unstructured contents note that uses ISBD punctuation to differentiate entries, authors and titles. Using markup could speed up the process of translating some of the textual fields that do not have a data equivalent, primarily the notes fields.

 

Connecting to MARC21 in ISO 2709

 

There is a need to make clear the connection between elements in MARC21 RDF and the MARC21 elements stored in ISO 2709 format. At the same time, it is not desirable to use the MARC21 2709 field/subfield designators as the identities of MARC21 in RDF because of the unfriendliness of the tag/subfield conventions to non-librarians. For this reason, it seems best to embed a link to MARC21 2709 in the description for each MARC21 in RDF element. One possibility is to use a MARC-centric URI for the MARC21 in 2709 elements:

 

http://marc21.info/element/506a

 

To encode this, it may be suitable to use OWL sameAs. However, there is the disadvantage that these URIs do not resolve (at least, not on their own). Ideas about this would be appreciated!
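
As a sketch of what that could look like, the Python below builds the MARC-centric URI for a tag and subfield code and emits one N-Triples statement linking a MARC21 in RDF element to it with owl:sameAs. The marc21.info base comes from the example above and, as noted, does not resolve. The example.org element URI is purely a placeholder, since no URIs for the RDF elements have been defined here, and the function names are hypothetical.

```python
import re

# Base taken from the example above; these URIs do not resolve.
MARC_ELEMENT_BASE = "http://marc21.info/element/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def marc_element_uri(tag: str, code: str) -> str:
    """Build the MARC-centric URI for one tag/subfield combination."""
    if not re.fullmatch(r"\d{3}", tag):
        raise ValueError(f"tag must be three digits: {tag!r}")
    if not re.fullmatch(r"[a-z0-9]", code):
        raise ValueError(f"subfield code must be one character: {code!r}")
    return f"{MARC_ELEMENT_BASE}{tag}{code}"


def same_as_statement(rdf_element_uri: str, tag: str, code: str) -> str:
    """Return an N-Triples line tying an RDF element to its 2709 origin."""
    return f"<{rdf_element_uri}> <{OWL_SAME_AS}> <{marc_element_uri(tag, code)}> ."


if __name__ == "__main__":
    # Placeholder URI for a hypothetical MARC21 in RDF element.
    element = "http://example.org/marc21rdf/termsGoverningAccess"
    print(same_as_statement(element, "506", "a"))
```

Whether owl:sameAs is the right property is still an open question; the sketch only shows that the link itself is mechanical to generate once both sets of URIs exist.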

 
