[Esip-documentation] topics and schedule for ACDD

John Graybeal via Esip-documentation esip-documentation at lists.esipfed.org
Mon Dec 8 15:26:37 EST 2014


I removed the markups, which have been reviewed by now. Thanks for the reminder.

> Alignment with NetCDF and CF Conventions
I like the new text but I don't like not being able to see the text you are replacing. Since all of this text has now been reviewed multiple times, I am trying to be conservative and very clear when making changes.

Can you please show in the document what text was replaced, so people can easily see the two versions while looking at the document? Thanks.

> Additional Metadata: metadata_link attribute

This text in the introduction was inserted in response to a previous concern (about why this attribute existed), which I don't have time to reference at the moment. I inserted text up front rather than in the description of the attribute, because the information seemed more background than definitional. You are welcome to propose an edit to the definition; you may find a better presentation than I could.

> Maintenance of Metadata

With the exception of 'fragile', I consider your text much better than what we have. I only note that part of the reason it was wordy was to reassure those with complaints/concerns in this area; but I think your excellent text, and the passage of time, probably meets the need. 
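
To make the "pass through" concern concrete, here is a minimal sketch in plain Python of a subsetting step that rebuilds the time coverage attributes instead of copying them from the source. The function name subset_by_time and the dict-and-list data layout are assumptions for illustration only; they are not part of ACDD or of any NetCDF library. Timestamps are assumed to be ISO 8601 strings in one consistent format, so that string comparison orders them correctly.

```python
def subset_by_time(times, values, attrs, start, end):
    """Keep samples with start <= t <= end and rebuild the coverage attributes.

    times  : list of ISO 8601 strings in one consistent format (assumption)
    values : list of data values, same length as times
    attrs  : dict standing in for the file's global attributes (assumption)
    """
    kept = [(t, v) for t, v in zip(times, values) if start <= t <= end]
    kept_times = [t for t, _ in kept]
    kept_values = [v for _, v in kept]

    # Copy the attributes so the source metadata is left untouched.
    new_attrs = dict(attrs)
    if kept_times:
        # Recompute from the data that survived the subset; do not pass through.
        new_attrs["time_coverage_start"] = min(kept_times)
        new_attrs["time_coverage_end"] = max(kept_times)
    else:
        # Nothing left: drop the attributes rather than keep stale values.
        new_attrs.pop("time_coverage_start", None)
        new_attrs.pop("time_coverage_end", None)
    return kept_times, kept_values, new_attrs
```

Attributes that the subset cannot invalidate (a title, for example) are copied as they are; only the ones derived from the data are recomputed.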

> Also in this section, I think the keywords are as 'fragile' as the geospatiotemporal 
> attributes; my keywords often represent the variables in the file, which can be removed 
> as easily as the data's time base can be shortened.  Haven't we discussed this?

Yes, there was extensive discussion of the fact that many attributes are fragile in some way, but (at the time at least) pointing that out only seemed to extend the argument and decrease the likelihood of consensus. Many people do not use keywords quite the way you do (choosing higher-level keywords, for example), so stale keywords will not introduce errors as consistently if they aren't updated. Singling them out also leads us down a slippery slope of evaluating each attribute for its fragility. I therefore propose to undo your adjustments in this area, resulting in the following:

> The ACDD geospatiotemporal attributes present a special case, as this information is already fully defined by the CF coordinate variables. These attributes are redundant, but they are recommended because they greatly simplify data discovery and access. The risk of inconsistency between these attributes and the actual data is highest after aggregation or subsetting.
> 
> For this reason, some data providers may choose to omit the ACDD geospatiotemporal attributes from their files. If these attributes are present, checking them against the data can serve as a useful test of the metadata's validity.
> 
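As an illustration of that last sentence, here is a minimal sketch of such a check in plain Python. The attribute names (geospatial_lat_min and so on) are the ACDD ones; the function name check_geospatial_attrs, the tolerance, and the assumption that the coordinate values have already been read into lists are mine, for illustration only. The sketch ignores longitude ranges that wrap across the dateline.

```python
def check_geospatial_attrs(attrs, lat, lon, tol=1e-6):
    """Compare ACDD geospatial bounds with the coordinate values.

    attrs : dict standing in for the file's global attributes (assumption)
    lat   : non-empty list of latitude coordinate values
    lon   : non-empty list of longitude coordinate values
    Returns a list of human-readable mismatch descriptions.
    """
    expected = {
        "geospatial_lat_min": min(lat),
        "geospatial_lat_max": max(lat),
        "geospatial_lon_min": min(lon),
        "geospatial_lon_max": max(lon),
    }
    problems = []
    for name, value in expected.items():
        if name not in attrs:
            continue  # providers may omit these attributes; that is not an error
        if abs(float(attrs[name]) - value) > tol:
            problems.append(f"{name}: attribute says {attrs[name]}, data says {value}")
    return problems
```

An empty result only means these four attributes agree with the coordinates; it says nothing about the rest of the metadata, which is why the text calls this a useful test rather than a complete one.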


On Dec 8, 2014, at 11:08, Nan Galbraith <ngalbraith at whoi.edu> wrote:

>> The current version is at http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_1-3; soon I will edit it to incorporate the markups already visible.
> 
> The strikeout formatting is ... difficult to read. 
> 
> I updated one section of the doc,  Alignment with NetCDF and CF Conventions - 
> not using strikeout, sorry if that was something others liked. There was at least 1 
> error, in that 'Conventions' is part of NUG, not originating in CF. The text about CF 
> changing their definition (in CF 1.7) seemed extraneous, to me.  I'm also looking 
> into the NUG link - not sure it's the most current version.
> 
> The section 'Additional Metadata: metadata_link attribute' seems unnecessary. Why 
> not just leave this out, and let people use the definition: " A URL that gives the location 
> of more complete metadata."  We can add to that if we really need to point out that
> this is where ISO 19115 (or other formal) metadata might go.
> 
> In the section Maintenance of Metadata, what is a user processor? I agree that we need 
> to make a point here, that the responsibility for maintenance lies with anyone making
> changes to data, but ... do we need so much text?
> 
> Also in this section, I think the keywords are as 'fragile' as the geospatiotemporal 
> attributes; my keywords often represent the variables in the file, which can be removed 
> as easily as the data's time base can be shortened.  Haven't we discussed this?
> 
>  Minus the strikeout text, it currently says:
> ACDD attributes, like all NetCDF attributes, characterize the data they are associated
> with. As NetCDF data are processed (e.g., through subsetting or other algorithms), these
> characteristics can be altered. The software or user processor is responsible for updating
> these attributes as part of the processing, but unfortunately some software processes
> and user practices leave them unchanged. This affects both consumers and producers of
> these files,  *including these* roles:
> * developers of software tools that process NetCDF files;
> * users that create new NetCDF files from existing ones; and
> * end users of NetCDF files.
> NetCDF file creators (the first two roles) should ensure that the attributes of output files accurately 
> represent those files, and specifically should not "pass through" any source attribute in unaltered 
> form, unless it is known to remain accurate. NetCDF file users (all three roles) should verify critical 
> attribute values, and understand how the source data and metadata were generated, to be confident 
> the source metadata is current.
> 
> The ACDD geospatiotemporal attributes present a special case, as this information is already fully 
> defined by the CF coordinate variables (the redundant attributes are recommended to simplify access). 
> Errors in these attributes will create an inconsistency between the metadata and associated data. The 
> risk of these 'inconsistency errors' is highest for files that are aggregated into longer or larger products, 
> or subset into shorter or smaller products, such as files from numerical forecast models and gridded 
> satellite observations. For this reason, some providers of those data types may choose to omit the
>  ACDD geospatiotemporal attributes from their files. If the ACDD geospatiotemporal attributes are 
> present, checking them against the CF coordinate variables can serve as a partial test of the metadata's 
> validity.
> 
> 
> I'd like to shorten this to (at a maximum):
> ACDD attributes characterize the data they are associated with. Any processing that 
> alters these characteristics should be sure to update the relevant attributes.  
> NetCDF file creators and software developers should ensure that the attributes of output 
> data accurately represent that data, and specifically should not "pass through" any source 
> attribute in unaltered form, unless it is known to remain accurate. NetCDF data users should 
> verify critical attribute values, to be confident the source metadata is appropriate.
> 
> The ACDD geospatiotemporal attributes and keywords present special cases, as the information 
> they contain is also present in the CF coordinate variables and the standard names, respectively.  
> These attributes are redundant, but they are recommended because they greatly simplify data 
> discovery and access. The risk of inconsistency between these attributes and the actual data
> is highest after aggregation or subsetting. 
> For this reason, some data providers may choose to omit the more fragile ACDD  attributes 
> from their files. If these attributes are present, checking them against the data can serve as a 
> useful test of the metadata's validity.
> 
> 
> Cheers - Nan
> 
>  
> 
> On 12/5/14 3:31 PM, John Graybeal via Esip-documentation wrote:
>> Hi everyone! Here's where I think we are with ACDD topics and schedule.
>> 
>> *** NEXT ACDD SPECIAL MEETING ***
>> 
>> We intended to hold the ACDD side meeting yesterday, but due to preparation oversight missed the chance (sorry!). I've proposed a Doodle poll for a make-up meeting next week; can you please fill it out if you'd like to attend this next meeting? The intent is to resolve everything if we can, so as to allow final adoption at the ESIP Winter meeting.
>> 
>> The Doodle poll is at http://doodle.com/t385cv5k7acm57ad. Given everyone can contribute off-line, or at the final approval meeting, I think we should hold the meeting at the best available time, unless we can't get more than 3 or so.
>> 
>> *** TOPICS ***
>> 
>> 1) Standard_name_vocabulary attribute
>> 2) Product_version and software_version
>> 3) Status of cdm_data_type
>> 4) Final comments and wrap-up
>> 
>> Details of these topics are below.
>> 
>> The current version is at http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_1-3; soon I will edit it to incorporate the markups already visible.
>> 
>> John
>> 
>> 
>> *** TOPIC 1: STANDARD_NAME_VOCABULARY ATTRIBUTE
>> 
>> Thank you for voting in the Doodle poll (see poll, or its summary at end of this email). We have no consensus or strong favorite; after eliminating the lowest vote-getter (Remove it), the remaining options are close (No change, Clarifying the definition, Generalizing the definition, or Adding new attributes).
>> 
>> No option got 70% YES votes; only one option had less than 30% NO votes (Generalize it (beyond CF) by adding CF compliance warning to the description), but it had only one YES vote.
>> 
>> SO: I will leave the poll open for additional votes (or for people to change their votes; just click on the pencil by your name). This poll is strictly informational of course. When we have the meeting, if someone wants a vote on one of these options, they can ask for it. Otherwise, we leave things the way they are (2nd favorite choice out of 5).
>> 
>> *** TOPIC 2: PRODUCT_VERSION and SOFTWARE_VERSION ATTRIBUTES
>> 
>> Proposals are:
>> * product_version: The version identifier of the product based on the algorithm or methodology applied.
>> * software_version: The version identifier of the software that generated the data. 
>> 
>> Discussion has taken place on the list (and now has a few more days to take place); let's try to finalize decisions on these attributes at the meeting, ideally resolving any issues beforehand.
>> 
>> 
>> *** TOPIC 3: CDM_DATA_TYPE ATTRIBUTE
>> 
>> I overlooked this topic in Bob Simons' email of 10/16/2014, and as a result we have not explicitly addressed his concern in the meeting after that (though there was considerable discussion on the list long ago). I will send a separate email summarizing the status of that attribute.
>> 
>> 
>> *** APPENDIX: SUMMARY OF RESULTS ON TOPIC 1 POLL 
>> 
>> 1) Remove it.
>>    Description: Remove attribute.
>>    Yes: 2   If Need Be: 0   No: 6
>> 
>> 2) Make it specific to versions
>>    Description: Make it more specific to versions, e.g., change its definition to "The version of the CF standard names from which variable standard names are taken. Example: v27"
>>    Yes: 4   If Need Be: 1   No: 3
>> 
>> 3) Leave it as is.
>>    Description: No change.
>>    Yes: 2   If Need Be: 3   No: 3
>> 
>> 4) Generalize it
>>    Description: Change its definition to add the text above: "Using standard_name values that are not from the CF Standard Name Table will make the file non-compliant with CF."
>>    Yes: 2   If Need Be: 4   No: 2
>> 
>> 5) Add unique_name and unique_name_vocabulary attributes
>>    Description: Add a variable attribute called unique_name, definition "A unique descriptive name for the variable taken from a controlled vocabulary of variable names."  And add a name for its vocabulary.
>>    Yes: 2   If Need Be: 2   No: 4
>> 
>> Comments:
>> 
>>> CF Standard Names are backwards compatible, so the version of the S.N. table isn't needed - nothing is removed or redefined (substantively). If the aim is to allow/encourage other sets of variable names, let's call it something other than standard_name_vocabulary.
>>  
>>> The standard_name_vocabulary attribute points up an oversight in CF. CF does not, within itself, provide a way to indicate which standard name vocabulary was used when selecting standard names for a file. This attribute is rectifying the oversight. And I agree that we shouldn't override CF's use of an attribute, as John G said.
>>  
>>> As much as I'd like to extend CF's focus on a single vocabulary, I don't think we should override CF's use of an attribute.
>> 
> 
> 
> -- 
> *******************************************************
> * Nan Galbraith        Information Systems Specialist *
> * Upper Ocean Processes Group            Mail Stop 29 *
> * Woods Hole Oceanographic Institution                *
> * Woods Hole, MA 02543                 (508) 289-2444 *
> *******************************************************
> 
> 
