[Esip-documentation] cdm_data_type: summary and minimalist proposal

Bob Simons - NOAA Federal via Esip-documentation esip-documentation at lists.esipfed.org
Tue Dec 9 14:04:35 EST 2014


On 2014-12-09 10:47 AM, Nan Galbraith via Esip-documentation wrote:
> Aha, thanks Bob (and John), that wasn't clear to me - the entities are
> within the list, and so are the double quotes. That sounds like a somewhat
> rare scenario; are we sure there's support for that in all the NetCDF
> utilities? I suspect Matlab might have a little problem with it, but
> haven't tested it - either it hasn't been discussed on this list or I
> completely missed it.
>
> If we're sure we're not introducing a feature that's not supported by the
> NetCDF libraries, maybe an example would be helpful - I was definitely
> confused by this text.
I think there is general support for this approach since it has a 
precedent in CSV files, where a field that contains a comma is enclosed 
in double quotes. And if a given piece of software doesn't support it, 
then hopefully someone will file a bug report. "Escaping" is widely 
used because this general problem pops up over and over again in 
different situations.

And I think people make an effort to avoid the problem, e.g., they don't 
make keywords that have internal commas.
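
To make the convention concrete, here is a minimal sketch in Python that relies on the standard csv module. The function names split_entities and join_entities are hypothetical (they are not part of ACDD or of any NetCDF library), and the sketch assumes that a double quote inside an entity is escaped by doubling it, as in CSV; the draft text itself does not say how to handle that case.

```python
import csv
import io


def split_entities(value):
    """Split a comma-separated attribute value into its entities.

    Entities enclosed in straight double quotes may contain commas;
    the enclosing quotes are not part of the entity.
    """
    # skipinitialspace lets a quoted entity follow ", " rather than ","
    reader = csv.reader(io.StringIO(value), skipinitialspace=True)
    row = next(reader, [])
    return [field.strip() for field in row]


def join_entities(entities):
    """Build a comma-separated attribute value, quoting only when needed."""
    parts = []
    for entity in entities:
        if "," in entity or '"' in entity:
            # Assumption: internal double quotes are doubled, CSV style.
            entity = '"' + entity.replace('"', '""') + '"'
        parts.append(entity)
    return ", ".join(parts)


if __name__ == "__main__":
    text = 'entity1, entity2, "entity3, with internal comma", entity4'
    print(split_entities(text))
    print(join_entities(split_entities(text)))
```

Software that splits the attribute value on every comma without honoring the quotes would return five pieces for this example instead of four, which is the kind of behavior worth reporting as a bug.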

>
>
> Thanks -
> Nan
>
>
> On 12/9/14 1:12 PM, Bob Simons - NOAA Federal via Esip-documentation 
> wrote:
>> It may be confusing, but I think it is correct.
>> Perhaps we need an example, e.g.,
>> entity1, entity2, "entity3, with internal comma", entity4
>>
>> On 2014-12-09 10:06 AM, Nan Galbraith via Esip-documentation wrote:
>>> Do you want input via email, on the 'talk' page, or on the Active 
>>> Issues page?
>>>
>>> I just noticed something else on the main (draft) page that I think
>>> should be changed:
>>>
>>>     Several attributes explicitly allow the entry of multiple entities
>>>     as comma-separated values. The entities in such lists which contain
>>>     a comma must be enclosed in straight double quotation marks ("),
>>>     which will not be considered part of the entity.
>>>
>>> I'm not sure if this is correct, but it's certainly confusing; NetCDF
>>> tools (e.g. ncdump) already insert double quotes around non-numeric
>>> values, and we seem to be advising people to add a pair on their own.
>>> I think we're just getting too far into the weeds with this level of
>>> detail.
>>>
>>> I recommend we drop the second sentence, or the whole paragraph on
>>> comma-separated lists.
>>>
>>> Nan
>>>
>>>
>>>
>>>
>>> On 12/9/14 12:32 PM, John Graybeal via Esip-documentation wrote:
>>>> As previously promised, this email summarizes the status of the 
>>>> cdm_data_type attribute.
>>>>
>>>> To start a discussion, I propose the removal of specific references 
>>>> and code lists, to produce the following definition:
>>>>
>>>> "The organization of the data, as derived from the Common Data 
>>>> Model's Scientific Data layer and understood by THREDDS. (This is a 
>>>> THREDDS "dataType", and is different from the CF NetCDF attribute 
>>>> 'featureType', which indicates a Discrete Sampling Geometry file in 
>>>> CF.)"
>>>>
>>>> Below is background material, if you want to understand the problem 
>>>> and the details.
>>>>
>>>> John
>>>>
>>>> *The Problem*
>>>>
>>>> At a minimum, the current 1.3 definition isn't perfect. Its list of 
>>>> valid values (point, /profile/, /section/, station, 
>>>> /station_profile/, trajectory, grid, image, or swath) doesn't agree 
>>>> with the referenced list (point, station, trajectory, grid, image, 
>>>> swath, /radial/). And the NODC guidance [3] is different as well, 
>>>> saying "The current choices are: Grid, Image, Station, Swath, and 
>>>> Trajectory."
>>>>
>>>> Additional concerns are expressed in Bob Simons' email of 10/16, as 
>>>> follows:
>>>> 1) For cdm_data_type, it is unfortunate that previous versions of 
>>>> ACDD included a link to a specific list. Surely the intention is to 
>>>> evolve as Unidata/THREDDS/the common data model evolves, even if 
>>>> that particular list doesn't evolve. Can we please remove that link 
>>>> and add the values from the CF DSG chapter that aren't in the 
>>>> current list here: timeSeries, timeSeriesProfile, trajectoryProfile?
>>>> 2) And please remove the NODC guidance link. That is NODC guidance 
>>>> about the DSG variants that NODC prefers and is not strictly 
>>>> relevant to a list of cdm data types.
>>>>
>>>> *History/References*
>>>>
>>>> In version 1.1, the definition for this attribute was "The *THREDDS 
>>>> data type* appropriate for this data set.", with the bolded text 
>>>> referencing 
>>>> http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataType
>>>>
>>>> In making version 1.3, we removed cdm_data_type, then re-added it 
>>>> for backward compatibility. The following definition was proposed 
>>>> and yet survives: "The organization of the data, as derived from 
>>>> the Common Data Model's Scientific Data layer and understood by 
>>>> THREDDS (this is a *THREDDS "dataType"*). One of point, profile, 
>>>> section, station, station_profile, trajectory, grid, image, or 
>>>> swath. Please note that this is different from the CF NetCDF 
>>>> attribute 'featureType' that indicates a Discrete Sampling Geometry 
>>>> file—for guidance on those terms, please see the *NODC guidance*." 
>>>> The first bold text points to the same address as in v1.1; the 
>>>> second points to http://www.nodc.noaa.gov/data/formats/netcdf/ [3].
>>>>
>>>> A detailed review of the discussion through Oct 6 starts at line 15 
>>>> of this Active Issues page 
>>>> <https://docs.google.com/spreadsheets/d/19fl5AgGkckG03yTchUjYUp4YnR09Fn1Nqps2KHenkC4/edit#gid=0> 
>>>> [5]. That latest proposal was to keep v1.3 cdm_data_type as is, and 
>>>> not deprecate it, as there are some examples of its utility.
>>>>
>>>> A nice review of the current NetCDF-Java code usage of 
>>>> cdm_data_type is here 
>>>> <https://github.com/Unidata/thredds/issues/72> [1]. The upshot is 
>>>> that if featureType is present, it is sufficient for that library; 
>>>> but older applications may still require cdm_data_type. This thread 
>>>> also points out an issue in the NODC guidance page at 
>>>> http://www.nodc.noaa.gov/data/formats/netcdf/v1.1/ [4]. (Reference
>>>> [3] resolves to this.) Another more recent, and arguably more
>>>> relevant, code citation is at Unidata's THREDDS code: 
>>>> https://github.com/Unidata/thredds/blob/target-4.3.22/cdm/src/main/java/ucar/nc2/constants/FeatureType.java 
>>>> [6]; it has a still longer list of items.
>>>>
>>>> *Options*
>>>>
>>>> These options are mix-and-match.
>>>>
>>>> The default option is doing nothing. Other conceivable options are:
>>>> - Revert to previous wording from 1.1.
>>>> - For the NODC issue: Just remove the NODC guidance link phrase 
>>>> ("—for guidance on those terms, please see the NODC guidance").
>>>> - For the inconsistency in the list of acceptable terms:
>>>> A) Remove the link to *THREDDS "dataType"* (and let users figure 
>>>> things out for themselves, or add more terms such as listed under 
>>>> (1) above).
>>>> B) Change the list of terms to match the current link to *THREDDS 
>>>> "dataType"*.
>>>> C) Change the link to *THREDDS "dataType"* and update the list of 
>>>> terms to match whatever we point to.
>>>> - For the general confusion about what this term is for, add 
>>>> something like this sentence for context: "This attribute is 
>>>> maintained for compliance with older files and applications, and is 
>>>> neither needed nor recommended for most purposes (use featureType 
>>>> instead)."
>>>>
>>>>
>>>> *References*
>>>> [1] THREDDS issue discussion of cdm_data_type: 
>>>> https://github.com/Unidata/thredds/issues/72
>>>> [2] THREDDS data type reference in 1.1 and 1.3: 
>>>> http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html#dataType
>>>> [3] NODC guidance reference in 1.3: 
>>>> http://www.nodc.noaa.gov/data/formats/netcdf/ (forwards to [4])
>>>> [4] NODC guidance referenced by @shane-axiom: 
>>>> http://www.nodc.noaa.gov/data/formats/netcdf/v1.1/
>>>> [5] ACDD 1.3 Reconciliation Pages: Active Issues: 
>>>> https://docs.google.com/spreadsheets/d/19fl5AgGkckG03yTchUjYUp4YnR09Fn1Nqps2KHenkC4/edit#gid=0
>>>> [6] Unidata THREDDS code with a list of feature types: 
>>>> https://github.com/Unidata/thredds/blob/target-4.3.22/cdm/src/main/java/ucar/nc2/constants/FeatureType.java
>>>>
>>>>
>>>> _______________________________________________
>>>> Esip-documentation mailing list
>>>> Esip-documentation at lists.esipfed.org 
>>>> <mailto:Esip-documentation at lists.esipfed.org>
>>>> http://www.lists.esipfed.org/mailman/listinfo/esip-documentation
>>>
>>>
>>> -- 
>>> *******************************************************
>>> * Nan Galbraith        Information Systems Specialist *
>>> * Upper Ocean Processes Group            Mail Stop 29 *
>>> * Woods Hole Oceanographic Institution                *
>>> * Woods Hole, MA 02543                 (508) 289-2444 *
>>> *******************************************************
>>>
>>>
>>>
>>>
>>
>> -- 
>> Sincerely,
>>
>> Bob Simons
>> IT Specialist
>> Environmental Research Division
>> NOAA Southwest Fisheries Science Center
>> 99 Pacific St, Suite 255A (New!)
>> Monterey, CA 93940 (New!)
>> Phone: (831)333-9878 (New!)
>> Fax: (831)648-8440
>> Email: bob.simons at noaa.gov <mailto:bob.simons at noaa.gov>
>>
>> The contents of this message are mine personally and
>> do not necessarily reflect any position of the
>> Government or the National Oceanic and Atmospheric
>> Administration.
>> <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
>>
>>
>
>

