DataFormatIssues

Saved by PBworks
on November 4, 2006 at 10:10:16 pm
 

Data Format Issues and Ideas

Issues (cooked)


Issues (raw)

  • Relationship to FRBR and RDA. Is this a time of change that requires a new format, or should these be treated as separate needs?
  • Can we tackle just MARC bibliographic, or do we also need to include at least Authorities and Holdings in our analysis?
  • MARC causes library data to be marginalized (but MODS, a friendlier XML format, hasn't had much more success at crossing over to other fields)
  • How do you express the current table of contents RSS feed for a journal title from CiteULike?
  • There are a lot of advantages to MARC. We have a lot of data in MARC. It would be expensive to move whole hog off of it. Maybe we keep MARC for some bibliographic data.
  • Many MARC fields and fixed-length data elements have no room left for expansion, so new data elements cannot be added to them.
  • MARC Bibliographic is not just bibliographic -- there is info for ordering, URLs to related web items, holdings info...
  • What fixed fields are used by systems? What need is there to carry this information into a new data format?
  • Social tagging -- can it work? (kc: I'm copying this to the catalog discussion)
  • Browsing -- a requirement of the data format, or a system feature, or not needed (because it would be better to use topic maps)? (kc: Ditto, copying to catalog discussion)
  • Need to list problem data elements, e.g. dates (some ambiguity), genres (see MODS)
  • Examine the results of Moen's data analysis, for the 007 field in particular. We suspect that few vendors do anything with many of the elements defined in the 007. This could be an interesting place to experiment. First, check Moen's results to see whether there is very little legacy data there in the first place. Second, for the legacy data that IS there, check whether anyone is using the majority of the data elements (i.e., end users retrieving or browsing in a meaningful way, not simply encoding on the creator side). If both hold, this may be fertile ground for 'lifting' (liberating) this part of the construct and remodeling it elsewhere.
  • 007 needs attention, but include staff reporting (e.g. for preservation, whatever) as a meaningful use of the data.
  • Hierarchy... as noted, MARC supports hierarchical description very little and poorly, and that's been a constraint. Given the scale and complexity with which we communicate records among systems, better support for linked or hierarchical descriptions will require sophistication in handling inheritance, identifiers, and update dates. Maintaining metadata that flows between hierarchical and flat environments is ugly. (R. Wendler)
  • How well can the data format support resource discovery?
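
As an illustration of the hierarchy issue raised above, here is a minimal Python sketch of flattening a hierarchical description into flat records with inheritance. The `Node` class, the `flatten` function, and the element names are hypothetical and come from no existing format; the sketch assumes the simplest possible rule, that a value stated at a lower level overrides the value inherited from above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of a hierarchical description (hypothetical structure)."""
    identifier: str
    own: dict                      # elements stated at this level only
    children: list = field(default_factory=list)


def flatten(node, inherited=None):
    """Yield one flat record per node.

    Assumed rule: a child's own values override inherited values.
    """
    merged = {**(inherited or {}), **node.own}
    yield {"id": node.identifier, **merged}
    for child in node.children:
        yield from flatten(child, merged)


series = Node(
    "s1",
    {"publisher": "Example Press", "title": "A Series"},
    [Node("v1", {"title": "Volume 1"})],
)
flat_records = list(flatten(series))
```

Even this toy version shows why the round trip is ugly: once flattened, the volume's record no longer says which values were inherited, so a later change to the series-level publisher cannot be propagated without identifiers and update dates.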

 

 

Problems with MARC21 (K Coyle, lifted from a previous document with more explanation)

  1. Limitations on the size of records that can be created with Z39.2: a maximum of 9999 characters per field and 99999 characters per record. The latter effectively limits the number of fields that a record can contain.
  2. Inherent limitations in the MARC implementation of Z39.2: a maximum of 26 distinct content subfields and 10 control subfields can be defined per tag. (Note, numeric subfields have been designated as having a special function.)
  3. A large number of data elements with some degree of redundancy (X00 fields, X10 fields, title fields and subfields, etc.)
  4. Inconsistent treatment of the same or similar data elements across fields.
  5. Fixed fields have values that are actually embedded in the standard. To add a new value means you have to modify the standard itself. They should all be external authoritative lists (if they should exist at all).
  6. Fixed fields that should be parallel to textual fields a) are located separately from those fields and b) may not have the same values, either because of input problems or because of limitations in the value list.
  7. Variable fields that extend fixed fields (e.g. 041 extending the language code in 008) because of lack of flexibility in the fixed fields. These data elements should be brought together.
  8. The use of defaults in fixed fields. Because every fixed position must carry a value, even if that value is blank, a default conveys little information.
  9. Record linking in MARC21 is awkward to use and is not implemented by many systems.
  10. A mixture of logical levels (from the work to the item level) in a single record with no structural differentiation.
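
The size limits in items 1 and 2 follow from the record layout: a 24-character leader with a 5-digit record length, and 12-character directory entries with a 4-digit field length. The Python sketch below works through that arithmetic. The function names are hypothetical, and lengths are assumed to be counted in bytes with each field length including its one-character field terminator.

```python
import string

LEADER_LEN = 24          # fixed-length leader
DIR_ENTRY_LEN = 12       # tag (3) + field length (4) + starting position (5)
MAX_FIELD_LEN = 9999     # largest value a 4-digit field length can hold
MAX_RECORD_LEN = 99999   # largest value a 5-digit record length can hold

# Subfield codes available per tag: 26 letters plus 10 digits.
CONTENT_SUBFIELD_CODES = string.ascii_lowercase
CONTROL_SUBFIELD_CODES = string.digits


def record_length(field_lengths):
    """Total record length for fields of the given lengths."""
    # One terminator closes the directory, one closes the record.
    directory = DIR_ENTRY_LEN * len(field_lengths) + 1
    return LEADER_LEN + directory + sum(field_lengths) + 1


def fits_z39_2(field_lengths):
    """True if every field and the whole record stay within the limits."""
    return (all(n <= MAX_FIELD_LEN for n in field_lengths)
            and record_length(field_lengths) <= MAX_RECORD_LEN)


def max_fields_of_size(field_len):
    """How many fields of one uniform length fit in a single record."""
    overhead = LEADER_LEN + 1 + 1
    return (MAX_RECORD_LEN - overhead) // (DIR_ENTRY_LEN + field_len)
```

Under these assumptions, every field costs its own length plus 12 characters of directory, which is why the record-length ceiling also caps the number of fields a record can contain.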
