[Esip-documentation] cdm_data_type: summary and minimalist proposal

John Graybeal via Esip-documentation esip-documentation at lists.esipfed.org
Tue Dec 9 12:32:46 EST 2014


As previously promised, this email summarizes the status of the cdm_data_type attribute. 

To start a discussion, I propose the removal of specific references and code lists, to produce the following definition:

"The organization of the data, as derived from the Common Data Model's Scientific Data layer and understood by THREDDS. (This is a THREDDS "dataType", and is different from the CF NetCDF attribute 'featureType', which indicates a Discrete Sampling Geometry file in CF.)"

Below is background material, if you want to understand the problem and the details. 

John

The Problem

At a minimum, the current 1.3 definition is inconsistent with the list it references. Its list of valid values (point, profile, section, station, station_profile, trajectory, grid, image, or swath) doesn't agree with the referenced THREDDS list [2] (point, station, trajectory, grid, image, swath, radial). The NODC guidance [3] differs from both, saying "The current choices are: Grid, Image, Station, Swath, and Trajectory."
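
To make the disagreement concrete, here is a small sketch in Python that compares the three lists as set differences. The values are copied from the paragraph above; the variable names are only illustrative, and the NODC values are lower-cased so that the comparison ignores capitalization.

```python
# Value lists exactly as quoted in this email (NODC values lower-cased for comparison).
acdd_1_3 = {"point", "profile", "section", "station", "station_profile",
            "trajectory", "grid", "image", "swath"}
thredds_ref = {"point", "station", "trajectory", "grid", "image", "swath", "radial"}
nodc = {s.lower() for s in ("Grid", "Image", "Station", "Swath", "Trajectory")}

# Terms the ACDD 1.3 text allows that the referenced THREDDS list does not contain.
only_in_acdd = sorted(acdd_1_3 - thredds_ref)

# Terms in the referenced THREDDS list that the ACDD 1.3 text omits.
only_in_thredds = sorted(thredds_ref - acdd_1_3)

# Terms from either list that the NODC guidance does not mention.
missing_from_nodc = sorted((acdd_1_3 | thredds_ref) - nodc)

print("only in ACDD 1.3:", only_in_acdd)
print("only in THREDDS reference:", only_in_thredds)
print("not in NODC guidance:", missing_from_nodc)
```

So the ACDD text adds profile, section, and station_profile, drops radial, and the NODC guidance covers only a subset of either list.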

Additional concerns are expressed in Bob Simons' email of 10/16, as follows:
  1) For cdm_data_type, it is unfortunate that previous versions of ACDD included a link to a specific list. Surely the intention is to evolve as Unidata/THREDDS/the common data model evolves, even if that particular list doesn't evolve.  Can we please remove that link and add the values from the CF DSG chapter that aren't in the current list here: timeSeries, timeSeriesProfile, trajectoryProfile?
  2) And please remove the NODC guidance link. That is NODC guidance about the DSG variants that NODC prefers and is not strictly relevant to a list of cdm data types.

History/References

In version 1.1, the definition for this attribute was "The THREDDS data type appropriate for this data set.", with the bolded text referencing http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataType

In making version 1.3, we removed cdm_data_type, then re-added it for backward compatibility. The following definition was proposed and yet survives: "The organization of the data, as derived from the Common Data Model's Scientific Data layer and understood by THREDDS (this is a THREDDS "dataType"). One of point, profile, section, station, station_profile, trajectory, grid, image, or swath. Please note that this is different from the CF NetCDF attribute 'featureType' that indicates a Discrete Sampling Geometry file—for guidance on those terms, please see the NODC guidance." The first bold text points to the same address as in v1.1; the second points to http://www.nodc.noaa.gov/data/formats/netcdf/ [3].

A detailed review of the discussion through Oct 6 starts at line 15 of this Active Issues page [5]. The latest proposal there was to keep the v1.3 cdm_data_type as is, and not deprecate it, since there are some examples of its utility.

A nice review of how the current NetCDF-Java code uses cdm_data_type is here [1]. The upshot is that if featureType is present, it is sufficient for that library; but older applications may still require cdm_data_type. That thread also points out an issue in the NODC guidance page at http://www.nodc.noaa.gov/data/formats/netcdf/v1.1/ [4] (reference [3] forwards to this page). Another more recent, and arguably more relevant, citation is Unidata's THREDDS code: https://github.com/Unidata/thredds/blob/target-4.3.22/cdm/src/main/java/ucar/nc2/constants/FeatureType.java [6], which has a still longer list of feature types.
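
The precedence described above (use featureType when present, otherwise fall back to cdm_data_type) might look like the following sketch for a reader of global attributes. This is not the NetCDF-Java logic. The function name resolve_data_organization is hypothetical, and the attributes are assumed to be available as a plain mapping of names to strings.

```python
from typing import Mapping, Optional, Tuple


def resolve_data_organization(attrs: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (attribute_name, value) for the first usable attribute, or None.

    Hypothetical helper: prefers the CF attribute 'featureType' and falls
    back to the ACDD/THREDDS attribute 'cdm_data_type' for older files.
    """
    for name in ("featureType", "cdm_data_type"):
        value = attrs.get(name, "").strip()
        if value:  # skip missing or blank attributes
            return name, value
    return None


# A newer file carrying both attributes: featureType wins.
print(resolve_data_organization({"featureType": "timeSeries", "cdm_data_type": "station"}))

# An older file carrying only cdm_data_type: the fallback is used.
print(resolve_data_organization({"cdm_data_type": "grid"}))
```

Returning the attribute name along with the value lets the caller know which vocabulary the value came from, which matters because the two attributes use different term lists.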

Options

These options are mix-and-match.

The default option is doing nothing. Other conceivable options are:
- Revert to previous wording from 1.1.
- For the NODC issue: Just remove the NODC guidance link phrase ("—for guidance on those terms, please see the NODC guidance").
- For the inconsistency in the list of acceptable terms:
   A) Remove the link to THREDDS "dataType" (and let users figure things out for themselves, or add more terms such as those listed under item 1 of Bob Simons' email above).
   B) Change the list of terms to match the current link to THREDDS "dataType".
   C) Change the link to THREDDS "dataType" and update the list of terms to match whatever we point to. 
- For the general confusion about what this term is for, add something like this sentence for context: "This attribute is maintained for compliance with older files and applications, and is neither needed nor recommended for most purposes (use featureType instead)."


References

[1] THREDDS issue discussion of cdm_data_type: https://github.com/Unidata/thredds/issues/72
[2] THREDDS data type reference in 1.1 and 1.3: http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataType
[3] NODC guidance reference in 1.3: http://www.nodc.noaa.gov/data/formats/netcdf/   (forwards to [4])
[4] NODC guidance referenced by @shane-axiom: http://www.nodc.noaa.gov/data/formats/netcdf/v1.1/
[5] ACDD 1.3 Reconciliation Pages: Active Issues: https://docs.google.com/spreadsheets/d/19fl5AgGkckG03yTchUjYUp4YnR09Fn1Nqps2KHenkC4/edit#gid=0
[6] Unidata THREDDS code with a list of feature types: https://github.com/Unidata/thredds/blob/target-4.3.22/cdm/src/main/java/ucar/nc2/constants/FeatureType.java