[Esip-documentation] ACDD 2-3 question (geospatiotemporal extent)

John Graybeal via Esip-documentation esip-documentation at lists.esipfed.org
Wed May 28 01:25:55 EDT 2014


All,

I did a pretty thorough rewrite of the Maintenance of Metadata in Derived Products section, to reflect Steve Hankin's input (thank you!) and my own further analysis. Below is the new text of the section. While I edited a lot of Steve's wording, I think the last paragraph advances his provided text pretty well, subject to review of course!

I did not change the actual attribute definitions to reference back to this section, because adding the reference to some and not others feels inconsistent, and adding it to more than a few feels noisy and detracts from the most important message about attribute maintenance. I suspect this detail may be overdesigning, so if it's contentious let's just go with whatever the majority says at the next meeting about which attributes should get the back-reference (none, temporal, geospatiotemporal, all).

John

P.S.  The lively discussion has been great on the date_x_modified topic too; I'll try to sum up where we've gotten to in a day or two, if there are no further comments.

> ACDD attributes, like all NetCDF attributes, characterize their containing (parent) granules. As NetCDF data are processed (e.g., through subsetting or other algorithms), these characteristics can be altered. The software or user doing the processing is responsible for updating these attributes as part of the processing, but some software processes and user practices leave them unchanged. This affects both consumers and producers of these files, who fall into three roles:
> 
> developers of software tools that process NetCDF files;
> users that create new NetCDF files from existing ones; and
> end users of NetCDF files.
> NetCDF file creators (the first two roles) should ensure that the attributes of output files accurately represent those files, and specifically should not "pass through" any source attribute in unaltered form, unless it is known to remain accurate. NetCDF file users (all three roles) should verify critical attribute values, and understand how the source data and metadata were generated, to be confident the source metadata is current.
> 
> The ACDD geospatiotemporal attributes present a special case, as this information is already fully defined by the CF coordinate variables (the redundant attributes are recommended to simplify access). Errors in these attributes will create an inconsistency between the metadata and data of the granule or file. The risk of these 'inconsistency errors' is highest for files that are aggregated into longer or larger products, or subset into shorter or smaller products, such as files from numerical forecast models and gridded satellite observations. For this reason, some providers of those data types may choose to omit the ACDD geospatiotemporal attributes from their files. If the ACDD geospatiotemporal attributes are present, checking them against the CF coordinate variables can serve as a partial test of the metadata's validity. 
> 
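To make the last sentence of the draft concrete, here is a minimal sketch of such a partial consistency check. It is only an illustration under stated assumptions: the function name check_extent_attributes is hypothetical, plain Python lists stand in for the CF coordinate variables (no NetCDF library is used), longitudes are assumed not to cross the antimeridian, and coordinate values are compared directly rather than cell bounds. The attribute names are the ACDD ones discussed in this thread.

```python
from datetime import datetime, timedelta, timezone


def check_extent_attributes(attrs, lat, lon, times, tol=1e-6):
    """Compare ACDD extent attributes with coordinate values.

    Hypothetical helper: attrs is a dict of global attributes; lat, lon
    and times are lists standing in for the CF coordinate variables.
    Returns a list of messages, one per inconsistent attribute.
    """
    problems = []

    # Numeric extents implied by the coordinates (assumes no antimeridian crossing).
    expected = {
        "geospatial_lat_min": min(lat),
        "geospatial_lat_max": max(lat),
        "geospatial_lon_min": min(lon),
        "geospatial_lon_max": max(lon),
    }
    for name, value in expected.items():
        if name not in attrs:
            continue  # the attributes are recommended, not required
        if abs(float(attrs[name]) - value) > tol:
            problems.append(f"{name}: attribute says {attrs[name]}, coordinates give {value}")

    # Temporal extents, stored as ISO 8601 strings with an explicit UTC offset.
    expected_time = {
        "time_coverage_start": min(times),
        "time_coverage_end": max(times),
    }
    for name, value in expected_time.items():
        if name not in attrs:
            continue
        if datetime.fromisoformat(attrs[name]) != value:
            problems.append(f"{name}: attribute says {attrs[name]}, coordinates give {value.isoformat()}")

    return problems


# A three-day regional subset whose attributes were passed through from the source file.
times = [datetime(2014, 5, 1, tzinfo=timezone.utc) + timedelta(days=d) for d in range(3)]
lat = [10.0, 20.0, 30.0]
lon = [-60.0, -50.0]
stale_attrs = {
    "geospatial_lat_min": -90.0,
    "geospatial_lat_max": 90.0,
    "geospatial_lon_min": -60.0,
    "geospatial_lon_max": -50.0,
    "time_coverage_start": "2014-05-01T00:00:00+00:00",
    "time_coverage_end": "2014-05-31T00:00:00+00:00",
}

problems = check_extent_attributes(stale_attrs, lat, lon, times)
for message in problems:
    print(message)
```

In this made-up subset the latitude limits and time_coverage_end still describe the source file, so the check flags those three attributes and accepts the rest. An empty result does not show that the metadata is right, only that these particular attributes agree with the coordinates, which is why the draft calls it a partial test.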


On May 21, 2014, at 17:56, Steve Hankin via Esip-documentation <esip-documentation at lists.esipfed.org> wrote:

> 
> On 5/21/2014 4:48 PM, John Graybeal wrote:
>> Steve,
>> 
>> Thanks for this.  Did you see my attempt to address the same point, in the added section called "Maintenance of Metadata in Derived Products"?  I am not arguing against your text, which is also good, but asking that you first consider my previously offered text, which is based on words you and others had provided. I would appreciate your suggestion of an optimal set of changes (sorry for asking for extra work).
> 
> Hi John,
> 
> You are right that I overlooked this section.  Just read it.  Editorial feedback: 
> 
> There are three groups of "users" to consider:
> 1. programmers of CF-processing applications
> 2. end users of CF-ACDD files, and
> 3. creators of CF-ACDD files
> The title and intent of the section seem to be to inform group #1.  You offer valuable advice to them.  In the midst of the section are a couple of chatty sentences addressed ambiguously to "users", but presumably targeted at group #2, admonishing them to an attitude of suspicion when using CF-ACDD datasets.  The text offers no advice to creators of files -- no suggestion that they should exercise judgment.
> 
> My editorial input would be to shorten this section, focusing it more explicitly on group 1.  Then add the text I suggested (previous email) in order to reach group 3.
> 
>> 
>>> ACDD attributes describe the granules that they are contained in. As data are processed (e.g., through subsetting or other processes), these characteristics can change. It is the responsibility of the processor to update these attributes as part of the processing. That said, some software processes and user practices modify the data without appropriately updating the metadata attributes. Given this reality, users are encouraged to verify critical attribute values, and understand how the data were processed, to be confident they are not using 'stale' metadata.
>> 
>> I think the recommendation for the 'Please see' note could be applied to a large number of attributes, which is why I didn't add it to any. Any particular reason you chose those?
> 
> The other potential candidates are the seven geospatial_* attributes.  Yes, they could all potentially have the same note attached.  However the odds of the problem showing up through use of the time extent attributes are many times larger than for the geospatial_* extents.  If it were up to me I would put the note on all of them, but I was striving for compromise, given the push back that this issue has generated in the past.
> 
>     - Steve
> 
>> 
>> And for everyone to note, as a reminder: As a working tool, the page NetCDF Utilities Metadata Handling has been created to identify the state of play for how tools handle metadata attributes when processing files.
>> 
>> John
>> 
>> On May 21, 2014, at 16:18, Steve Hankin <steven.c.hankin at noaa.gov> wrote:
>> 
>>> 
>>> On 5/21/2014 12:20 PM, John Graybeal via Esip-documentation wrote:
>>>> Hi Anna,
>>>> 
>>>> As a significant driver I'll offer one opinion. Caveat emptor.
>>>>> 1. Is this the best version to be using? (They will NOT be using groups)
>>>> Arguably, yes it is the best version to be using, but it is not approved at this point. I would say the status is 'stalled in a mostly happy place' -- with one exception, I haven't heard any complaints about this current 'Working' draft, which has been around for many months now and has been carefully reviewed by at least one person.  I *think* that all that is required for approval is for Derrick Snowden (or someone he designates, ideally not the principal updater, hint hint) to call a discussion/next steps meeting, at which any remaining issues can be raised and resolved.
>>>> 
>>>> There is only one open issue under discussion, namely whether the adoption of summary metadata for geospatiotemporal ranges is good, tolerable, or bad. It is hard to know for sure whether that will be changed (I suspect it will not, just from comments so far). It is my hope that the fact all these attributes are *recommended*, not *required*, means that it will be acceptable to leave this material in, perhaps with precautionary language (a proposal for which has already been added).  We haven't had a discussion in the group yet about this topic.
>>>> 
>>> 
>>> Hi John,
>>> 
>>> The draft at http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_%28ACDD%29_Working needs only minor editorial additions to address the open issue discussed above.  In the ACDD document the word "Recommended" alone does not make users aware of the conditions under which the geospatiotemporal  extent attributes may lead to internally contradictory file content.   Here is a suggested addition:
>>>  under "Alignment with NetCDF and CF Conventions" add ...
>>> Note that the geospatial and temporal extent of a CF dataset is self-documenting through its CF coordinate variables.  The intent of the ACDD geospatiotemporal extent attributes is to make it easier to infer this information from a file.  Since these attributes provide redundant information, they may create a risk of corrupted content.  The risk is highest for the time extents of files that are likely to be aggregated into longer time series, such as files output by numerical forecast models and in gridded satellite data products. 
>>> 
>>> under Recommended Global Attributes: time_coverage_start, time_coverage_end, and time_coverage_duration add ...
>>> please see note in the "Alignment with NetCDF and CF Conventions" section of this users guide
>>> This does not change the content or spirit of the ACDD document.  It merely informs users of trade-offs that they should be aware of.
>>> 
>>>     - Steve
>>> 
>> 
>> John Graybeal
>> jbgraybeal at mindspring.com
>> 
>> 
>> 
> 

John Graybeal
jbgraybeal at mindspring.com


