[Esip-documentation] ACDD 2-3 question (geospatiotemporal extent)

Nan Galbraith via Esip-documentation esip-documentation at lists.esipfed.org
Wed May 28 14:52:12 EDT 2014


Hi all -

That looks fine to me, John, thanks. I'm not sure the heading
'Maintenance of Metadata in Derived Products' is especially clear,
though. Many people don't think of data that's been subset or
aggregated as a derived product; that term is reserved for something
like calculated surface fluxes or other concepts that are not measured
directly ... at least in my experience. Could we just use
'Maintenance of Metadata', or is that even less clear?

Also, other attributes, like keywords and history, may be compromised
by data aggregation and subsetting; not sure if we need to mention
those in the text. If the keywords describe the data variables in a
file, for example, they can become misleading just as easily as the
geo-temporal bounds.
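
To make the 'stale attribute' problem concrete, here is a minimal
sketch of recomputing the extent attributes after a subset instead of
passing the source values through. Everything here is an assumption
for illustration: a plain dict stands in for a file's global
attributes, refresh_extent_attrs is a hypothetical helper rather than
part of any NetCDF library, and the simple min/max on longitude
ignores dateline wrapping.

```python
from datetime import datetime, timezone


def refresh_extent_attrs(attrs, times, lats, lons):
    """Return a copy of attrs with the ACDD extent attributes recomputed.

    attrs is a plain dict standing in for global attributes (assumption);
    times is a non-empty list of timezone-aware datetimes; lats and lons
    are non-empty lists of degrees taken from the coordinate variables.
    """
    out = dict(attrs)  # copy, so the source attributes are not edited in place
    out["time_coverage_start"] = min(times).isoformat()
    out["time_coverage_end"] = max(times).isoformat()
    out["geospatial_lat_min"] = min(lats)
    out["geospatial_lat_max"] = max(lats)
    # Simple min/max: does not handle a subset that crosses the dateline.
    out["geospatial_lon_min"] = min(lons)
    out["geospatial_lon_max"] = max(lons)
    return out


# Hypothetical source attributes describing a full year of data.
source_attrs = {
    "title": "hypothetical mooring record",
    "time_coverage_start": "2014-01-01T00:00:00+00:00",
    "time_coverage_end": "2014-12-31T00:00:00+00:00",
}

# A three-day subset: the attributes should now describe only these days.
subset_times = [datetime(2014, 5, d, tzinfo=timezone.utc) for d in (1, 2, 3)]
subset_attrs = refresh_extent_attrs(
    source_attrs, subset_times, [10.0, 10.5], [-60.0, -59.5]
)
print(subset_attrs["time_coverage_start"], subset_attrs["time_coverage_end"])
```

Attributes like keywords and history have no coordinate variable to
recompute from, so a sketch like this only covers the geo-temporal
bounds.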

Last, I've re-read the file date discussions, but I'm not at all sure
we reached any sort of consensus.

Cheers -
Nan

On 5/28/14 1:25 AM, John Graybeal via Esip-documentation wrote:
> All,
>
> I did a pretty thorough rewrite of the Maintenance of Metadata in 
> Derived Products 
> <http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_1-2_Working#Maintenance_of_Metadata_in_Derived_Products> section, 
> to reflect Steve Hankin's input (thank you!) and my own further 
> analysis. Below is the new text of the section. While I edited a lot 
> of Steve's wording, I think the last paragraph advances his provided 
> text pretty well, subject to review of course!
>
> I did not change the actual attribute definitions to reference back to 
> this section, because adding it to some and not others feels 
> inconsistent, and adding it to more than a few feels noisy and detracts 
> from the most important message about attribute maintenance. I suspect 
> this detail may be overdesigning, so if it's contentious let's just go 
> with whatever the majority says we should add the back-reference to 
> (none, temporal, geospatiotemporal, all) at the next meeting.
>
> John
>
> P.S.  The lively discussion has been great on the date_x_modified 
> topic too; I'll try to sum up where we've gotten to in a day or two, 
> if there are no further comments.
>
>> ACDD attributes, like all NetCDF attributes, characterize their 
>> containing (parent) granules. As NetCDF data are processed (e.g., 
>> through subsetting or other algorithms), these characteristics can be 
>> altered. The software or user processor is responsible for updating 
>> these attributes as part of the processing, but some software 
>> processes and user practices leave them unchanged. This affects both 
>> consumers and producers of these files, who fall into three roles:
>>
>>   * developers of software tools that process NetCDF files;
>>   * users that create new NetCDF files from existing ones; and
>>   * end users of NetCDF files.
>>
>> NetCDF file /creators/ (the first two roles) should ensure that the 
>> attributes of output files accurately represent those files, and 
>> specifically should not "pass through" any source attribute in 
>> unaltered form, unless it is known to remain accurate. NetCDF file 
>> /users/ (all three roles) should verify critical attribute values, 
>> and understand how the source data and metadata were generated, to be 
>> confident the source metadata is current.
>>
>> The ACDD geospatiotemporal attributes present a special case, as this 
>> information is already fully defined by the CF coordinate variables 
>> (the redundant attributes are recommended to simplify access). Errors 
>> in these attributes will create an inconsistency between the metadata 
>> and data of the granule or file. The risk of these 'inconsistency 
>> errors' is highest for files that are aggregated into longer or 
>> larger products, or subset into shorter or smaller products, such as 
>> files from numerical forecast models and gridded satellite 
>> observations. For this reason, some providers of those data types may 
>> choose to omit the ACDD geospatiotemporal attributes from their 
>> files. If the ACDD geospatiotemporal attributes are present, checking 
>> them against the CF coordinate variables can serve as a partial test 
>> of the metadata's validity.
>>
>
>
> On May 21, 2014, at 17:56, Steve Hankin via Esip-documentation 
> <esip-documentation at lists.esipfed.org> wrote:
>
>>
>> On 5/21/2014 4:48 PM, John Graybeal wrote:
>>> Steve,
>>>
>>> Thanks for this.  Did you see my attempt to address the same point, 
>>> in the added section called "Maintenance of Metadata in Derived 
>>> Products"?  I am not arguing against your text, which is also good, 
>>> but asking that you first consider my previously offered text, 
>>> which is based on words you and others had provided. I would 
>>> appreciate your suggestion of an optimal set of changes (sorry for 
>>> asking for extra work).
>>
>> Hi John,
>>
>> You are right that I overlooked this section.  Just read it.  
>> Editorial feedback:
>>
>> There are three groups of "users" to consider:
>>
>>  1. programmers of CF-processing applications
>>  2. end users of CF-ACDD files
>>     and
>>  3. creators of CF-ACDD files
>>
>> The title and intent of the section seem to be to inform group #1.  
>> You offer valuable advice to them.  In the midst of the section are a 
>> couple of chatty sentences addressed ambiguously to "users", but 
>> presumably targeted at group #2,  admonishing them to an attitude of 
>> suspicion when using CF-ACDD datasets.  The text offers no advice to 
>> creators of files -- no suggestion that they should exercise 
>> judgment.
>>
>> My editorial input would be to shorten this section, focusing it more 
>> explicitly on group 1.  Then add the text I suggested (previous 
>> email) in order to reach group 3.
>>
>>>
>>>> ACDD attributes describe the granules that they are contained in. 
>>>> As data are processed (e.g., through subsetting or other 
>>>> processes), these characteristics can change. It is the 
>>>> responsibility of the processor to update these attributes as part 
>>>> of the processing. That said, some software processes and user 
>>>> practices modify the data without appropriately updating the 
>>>> metadata attributes. Given this reality, users are encouraged to 
>>>> verify critical attribute values, and understand how the data were 
>>>> processed, to be confident they are not using 'stale' metadata.
>>>
>>> I think the recommendation for the 'Please see' note could be 
>>> applied to a large number of attributes, which is why I didn't add 
>>> it to any. Any particular reason you chose those?
>>
>> The other potential candidates are the seven geospatial_* 
>> attributes.  Yes, they could all potentially have the same note 
>> attached.  However the odds of the problem showing up through use of 
>> the time extent attributes are many times larger than for the 
>> geospatial_* extents.  If it were up to me I would put the note on 
>> all of them, but I was striving for compromise, given the push back 
>> that this issue has generated in the past.
>>
>>     - Steve
>>
>>>
>>> And for everyone to note, as a reminder: As a working tool, the 
>>> page NetCDF Utilities Metadata Handling has been created to identify 
>>> the state of play for how tools handle metadata attributes when 
>>> processing files.
>>>
>>> John
>>>
>>> On May 21, 2014, at 16:18, Steve Hankin <steven.c.hankin at noaa.gov> 
>>> wrote:
>>>
>>>>
>>>> On 5/21/2014 12:20 PM, John Graybeal via Esip-documentation wrote:
>>>>> Hi Anna,
>>>>>
>>>>> As a significant driver I'll offer one opinion. Caveat emptor.
>>>>>> 1. Is this the best version to be using? (They will NOT be using groups)
>>>>> Arguably, yes it is the best version to be using, but it is not 
>>>>> approved at this point. I would say the status is 'stalled in a 
>>>>> mostly happy place' -- with one exception, I haven't heard any 
>>>>> complaints about this current 'Working' draft, which has been 
>>>>> around for many months now and has been carefully reviewed by at 
>>>>> least one person.  I *think* that all that is required for 
>>>>> approval is for Derrick Snowden (or someone he designates, ideally 
>>>>> not the principal updater, hint hint) to call a discussion/next 
>>>>> steps meeting, at which any remaining issues can be raised and 
>>>>> resolved.
>>>>>
>>>>> There is only one open issue under discussion, namely *whether the 
>>>>> adoption of summary metadata for geospatiotemporal ranges is good, 
>>>>> tolerable, or bad*. It is hard to know for sure whether that will 
>>>>> be changed (I suspect it will not, just from comments so far). It 
>>>>> is my hope that the fact all these attributes are **recommended**, 
>>>>> not *required*, means that it will be acceptable to leave this 
>>>>> material in, perhaps with precautionary language (a proposal for 
>>>>> which has already been added).  We haven't had a discussion in the 
>>>>> group yet about this topic.
>>>>>
>>>>
>>>> Hi John,
>>>>
>>>> The draft at 
>>>> http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Working 
>>>> needs only minor editorial additions to address the open issue 
>>>> discussed above.  In the ACDD document the word "Recommended" alone 
>>>> does not make users aware of the conditions under which the 
>>>> geospatiotemporal  extent attributes may lead to internally 
>>>> contradictory file content.   Here is a suggested addition:
>>>>
>>>>      under "*Alignment with NetCDF and CF Conventions*" add ...
>>>>
>>>>         Note that the geospatial and temporal extent of a CF
>>>>         dataset is self-documenting through its CF coordinate
>>>>         variables.  The intent of the ACDD geospatiotemporal extent
>>>>         attributes is to make it easier to infer this information
>>>>         from a file.  Since these attributes provide redundant
>>>>         information, they may create a risk of corrupted content. 
>>>>         The risk is highest for the time extents of files that are
>>>>         likely to be aggregated into longer time series, such as
>>>>         files output by numerical forecast models and in gridded
>>>>         satellite data products.
>>>>
>>>>
>>>>     under *Recommended Global Attributes: time_coverage_start,
>>>>     time_coverage_end, and time_coverage_duration* add ...
>>>>
>>>>         please see note in the "Alignment with NetCDF and CF
>>>>         Conventions" section of this users guide
>>>>
>>>> This does not change the content or spirit of the ACDD document.  
>>>> It merely informs users of trade-offs that they should be aware of.
>>>>
>>>>     - Steve
>>>>
>>>
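
For what it's worth, the 'partial test' mentioned in the quoted section
text (checking the ACDD geospatiotemporal attributes against the CF
coordinate variables) could look roughly like the sketch below. This is
only an illustration under stated assumptions: a plain dict stands in
for the global attributes, stale_extent_attrs and its tol parameter are
hypothetical names, only the four lat/lon bounds are checked, and
dateline wrapping is ignored.

```python
def stale_extent_attrs(attrs, lats, lons, tol=1e-6):
    """Return the names of ACDD extent attributes that disagree with the data.

    attrs is a plain dict standing in for global attributes (assumption);
    lats and lons are non-empty lists of coordinate values in degrees.
    """
    expected = {
        "geospatial_lat_min": min(lats),
        "geospatial_lat_max": max(lats),
        "geospatial_lon_min": min(lons),
        "geospatial_lon_max": max(lons),
    }
    stale = []
    for name, value in expected.items():
        # A missing attribute is not flagged: providers may choose to omit these.
        if name in attrs and abs(float(attrs[name]) - value) > tol:
            stale.append(name)
    return stale


# Hypothetical subset whose latitude bounds were passed through unchanged.
attrs = {
    "geospatial_lat_min": -90.0,
    "geospatial_lat_max": 90.0,
    "geospatial_lon_min": -60.0,
}
problems = stale_extent_attrs(attrs, [10.0, 10.5], [-60.0, -59.5])
print(problems)
```

An empty result only says that the attributes present agree with the
coordinates; it says nothing about keywords, history, or other
attributes.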


-- 
*******************************************************
* Nan Galbraith        Information Systems Specialist *
* Upper Ocean Processes Group            Mail Stop 29 *
* Woods Hole Oceanographic Institution                *
* Woods Hole, MA 02543                 (508) 289-2444 *
*******************************************************

