[Esip-documentation] New ACDD home page

Nan Galbraith ngalbraith at whoi.edu
Tue May 28 07:59:35 EDT 2013


Hi John -

Nice work!  I appreciate that you've provided definitions that are not
circular - creator is defined as 'person principally responsible for
originating this data', for example - much clearer.

Some quibbles with file dates, which are now defined more clearly (so
that I may disagree with them):
date_created
    The first date on which this dataset was published (this value never
    changes after first set of data is released the first time).
date_modified
    The date on which this dataset (as seen by users or captured in a
    file) was last changed.

I think model runs and observational data files may need different file
time information, but I can only speak to the obs side. That said, is
there a use case for recording the date on which a data set was first
released?

IMHO, the dates of importance, at least for observational data, are
(1) when the last 'data value' was modified - the 'data version date' -
and (2) when the NetCDF file was written - the 'file creation date'.
I need the former because it determines whether a file contains the
latest version of the observations - post-cals applied, revised
algorithms used, more de-spiking done, whatever.  I need the latter
because it lets me know if the file contains the latest metadata, the
latest conventions (for those that are evolving), and if it was written
by a good netcdf conversion run.
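
To make the distinction between the two dates concrete, here is a
minimal sketch in Python. The attribute names data_version_date and
file_creation_date are hypothetical, chosen only for this illustration;
they are not ACDD terms, and the timestamps are made up. The sketch
assumes both dates are stored as ISO 8601 strings.

```python
from datetime import datetime

# Hypothetical attribute names, used for this sketch only (not ACDD terms).
DATA_VERSION = "data_version_date"
FILE_CREATION = "file_creation_date"


def parse(stamp):
    """Parse an ISO 8601 timestamp such as '2013-05-28T07:59:35'."""
    return datetime.fromisoformat(stamp)


def what_is_newer(mine, theirs):
    """List which aspects of `theirs` are newer than `mine`.

    'data' means the data values themselves were revised;
    'metadata' means only the file was rewritten more recently.
    """
    newer = []
    if parse(theirs[DATA_VERSION]) > parse(mine[DATA_VERSION]):
        newer.append("data")
    if parse(theirs[FILE_CREATION]) > parse(mine[FILE_CREATION]):
        newer.append("metadata")
    return newer


# Made-up global attributes for two copies of the same file.
mine = {DATA_VERSION: "2013-01-15T00:00:00",
        FILE_CREATION: "2013-02-01T12:00:00"}
theirs = {DATA_VERSION: "2013-01-15T00:00:00",
          FILE_CREATION: "2013-05-20T09:30:00"}

print(what_is_newer(mine, theirs))  # ['metadata']
```

With a single date attribute, the case above (same observations,
rewritten file) could not be told apart from a real data revision.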

'The first date on which this dataset was published' seems like it would
require some definitions; actually I think it might be impossible to pin
down for any project where real-time data - possibly even pre-deployment
data - is published, updated, and replaced with post-recovery data.
What does 'dataset' mean in this context?  What does 'published' mean?
If I put it on my group's web site, is it published? Is there some other
definition for this, or more to the point, an idea of why this date
might be required?

Re: time_coverage_resolution - should we change the term to
time_coverage_interval, or similar, since 'resolution' has so many
meanings?  And should we specifically recommend that it be expressed as
an ISO 8601 duration string, e.g. PT1H for one hour, or is it useful as
free text?
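
One practical argument for a structured value is that software can
check and use it. The following is a minimal sketch in Python, not a
full ISO 8601 implementation: parse_duration is a hypothetical helper
that handles only whole days, hours, minutes and seconds (no years,
months, weeks or fractions). Note that ISO 8601 requires the 'T'
designator before time components, so one hour is PT1H.

```python
import re
from datetime import timedelta

# Subset of ISO 8601 durations: PnDTnHnMnS with integer components only.
_DURATION = re.compile(
    r"P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)


def parse_duration(text):
    """Convert a duration such as 'PT1H' or 'P1DT30M' to a timedelta."""
    match = _DURATION.fullmatch(text)
    if match is None:
        raise ValueError("not a supported ISO 8601 duration: %r" % text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    # Reject 'P', 'PT' and a dangling 'T' with nothing after it.
    if not parts or text.endswith("T"):
        raise ValueError("duration has no components: %r" % text)
    return timedelta(**parts)


print(parse_duration("PT1H"))  # 1:00:00
```

A free-text value such as 'hourly' would need a lookup table or human
interpretation instead.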

Re: creator and publisher fields - I really like the way you've
developed these, especially the way you've expanded the publisher
fields - that gives a project like OceanSITES a place to be identified.

Quibble: I can't say I like the term creator_person, and wonder if we
could go with the (slightly less symmetrical) terms creator_name,
creator_info, creator_institution, creator_institution_info - which
assumes that an 'unmodified' creator is by default a person.

The definition of the _info fields, 'can include any information as ISO
19139 or free text' is a little too vague, IMHO, in terms of guidance.

How should the '_info' information be presented in an ISO 19139
compliant way? Can we just choose some fields within CI_ResponsibleParty
and list those, or are we thinking of an xml snippet for this attribute?
An example (from OGC) could be coded either as:

creator_info: 'organisationName:con terra GmbH, email:voges at conterra.de' ;

or as:

creator_info: '<contact>
                 <CI_ResponsibleParty>
                   <individualName>
                     <gco:CharacterString>Uwe Voges</gco:CharacterString>
                   </individualName>
                   <organisationName>
                     <gco:CharacterString>con terra GmbH</gco:CharacterString>
                   </organisationName>
                   <contactInfo>
                     <CI_Contact>
                       <address>
                         <CI_Address>
                           <electronicMailAddress>
                             <gco:CharacterString>voges at conterra.de</gco:CharacterString>
                           </electronicMailAddress>
                         </CI_Address>
                       </address>
                     </CI_Contact>
                   </contactInfo>
                 </CI_ResponsibleParty>
               </contact>' ;

Do we recommend one over the other?  Will a multi-line, verbose
attribute like the latter be hard for users to implement? Does it add
any functionality?
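
As a rough way to compare the two encodings from a reader's point of
view, here is a minimal sketch in Python using only the standard
library. The helpers parse_flat and parse_xml are hypothetical names
for this illustration. Two assumptions are worth noting: the flat form
is split naively on commas and colons, which breaks if a value contains
either character, and the XML form needs a declaration for the gco
prefix before a parser will accept it, so the sketch adds one to the
root element.

```python
import xml.etree.ElementTree as ET

# Namespace for the gco prefix, declared here so the snippet can be parsed.
GCO = "http://www.isotc211.org/2005/gco"


def parse_flat(value):
    """Parse 'key:value, key:value' text (naive: no commas in values)."""
    fields = {}
    for item in value.split(","):
        key, _, val = item.partition(":")
        fields[key.strip()] = val.strip()
    return fields


def parse_xml(value):
    """Collect every element that directly wraps a gco:CharacterString."""
    root = ET.fromstring(value)
    fields = {}
    for element in root.iter():
        text_node = element.find("gco:CharacterString", {"gco": GCO})
        if text_node is not None and text_node.text:
            fields[element.tag] = text_node.text.strip()
    return fields


flat_value = "organisationName:con terra GmbH, email:voges at conterra.de"

xml_value = f"""<contact xmlns:gco="{GCO}">
  <CI_ResponsibleParty>
    <individualName>
      <gco:CharacterString>Uwe Voges</gco:CharacterString>
    </individualName>
    <organisationName>
      <gco:CharacterString>con terra GmbH</gco:CharacterString>
    </organisationName>
  </CI_ResponsibleParty>
</contact>"""

print(parse_flat(flat_value)["organisationName"])  # con terra GmbH
print(parse_xml(xml_value)["organisationName"])    # con terra GmbH
```

Both forms yield the same organisation name here; the XML form carries
more structure but depends on the reader handling namespaces, while the
flat form depends on an agreed delimiter convention.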

Thanks again -

Nan

On 5/20/13 9:54 PM, John Graybeal wrote:
> Hi everyone,
>
> I talked with David N and Derrick S end of last week, and we agreed on 
> some basic strategies, and today I finally updated all the definitions 
> [1].
>
> Please consider these new definitions -- and deletions [3], additions, 
> and rearrangements into different categories -- food for discussion. 
>  You might want to first decide whether to broadly accept the approach 
> in each case, then nit-pick the definitions.
>
> The approaches are documented in some detail in the discussion page of 
> the Working document [2]. But extremely briefly:
> - Neither totally computable, nor totally incomputable (but just right 
> :->); encouraged use of structured text in many fields
> - Somewhat flat, but somewhat structured: structured text in fields 
> (optionally), but fewer fields with ancillary metadata
> - Support a range of keyword styles, but avoid recommending any in 
> particular; support multiple keyword and standard_name vocabularies
> - Generally did not include guidance, as much for lack of time as 
> anything. I think a third column with guidance/references would be 
> most valuable.
> - Reflected recommendations like 
> http://wiki.esipfed.org/index.php/NetCDF,_HDF,_and_ISO_Metadata in 
> terms of the skeleton, but considered them more guidance than 
> something we could require at this stage. (Small steps.)
>
> Other changes broadly described:
> - Geospatiotemporal: Now much more explicit about the 
> geospatiotemporal attributes
> - Lineage: Now much more explicit about possible forms for this 
> information.
>
> As someone who works a lot with rich structured metadata, I like the 
> flexibility this new approach gives to do that. Conversely, I don't 
> think it shuts down any of the less formal providers/documenters of 
> metadata. I'll be curious to see your inputs.
>
> John
>
>
> [1] http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_(ACDD)_Working
> [2] http://wiki.esipfed.org/index.php/Talk:Attribute_Convention_for_Data_Discovery_(ACDD)_Working
> [3] We might want to move deletions into a deprecated section, so that 
> they are allowed for backwards compatibility.
>
> ======================================  (Past email thread, for reference)
>
> In any case I will be tossing some additional specific text for each 
> term on the working page 
> (http://wiki.esipfed.org/index.php/Attribute_Convention_for_Data_Discovery_(ACDD)_Working).
> Please take that page into account as you go forward.
>
> There is a conflict in short-term and longer-term strategies, which 
> I'll summarize here.  One can either have a flat list of attributes, 
> or a list that supports rich relationships (groupings and descriptions 
> of them), or a hybrid that cobbles together a way to show rich 
> relations in a flat list. This especially affects contacts and their 
> roles.  I classify it as short-term vs long-term, because I'm sure 
> someday we'll want to migrate to the richer relations approach (or at 
> least a hybrid), but I understand that time may not be now.
>
> The other connected issue that keeps popping up is use of controlled 
> vocabularies, where you can either specify *one* vocabulary a priori 
> for each field, or include a vocabulary field for every attribute that 
> calls for CV terms, or allow the use of fully unique CV terms within 
> any of these attributes. This is somewhat affected by whether your 
> attribute list is flat or rich.
>
> I will add my comments on these two topics to the discussion page, but 
> they have a strong bearing on the best integration approach.
>
> John
>
>
>
> On May 9, 2013, at 09:07, David Neufeld - NOAA Affiliate 
> <david.neufeld at noaa.gov> wrote:
>
>> John,
>>
>> Thanks for getting this started, I've added some information on 
>> governance.
>>
>> Seems like a next step would be for Rich, Aleksandar, Ted and I to 
>> review the working draft and document some of the tweaks that crept 
>> into ncISO over the past year outside of ACDD.  Then we can discuss 
>> whether some (or all?) of those changes could be incorporated into 
>> the standard.
>>
>> http://wiki.esipfed.org/index.php/Category:Attribute_Conventions_Dataset_Discovery
>>
>> Dave
>>

-- 
*******************************************************
* Nan Galbraith        Information Systems Specialist *
* Upper Ocean Processes Group            Mail Stop 29 *
* Woods Hole Oceanographic Institution                *
* Woods Hole, MA 02543                 (508) 289-2444 *
*******************************************************

