[Esip-documentation] ACDD date (version and other dates)

Bob Simons - NOAA Federal via Esip-documentation esip-documentation at lists.esipfed.org
Mon Oct 6 15:30:43 EDT 2014


On 2014-10-03 1:00 PM, Nan Galbraith via Esip-documentation wrote:
> Hi Jim, and all -
>
> In the spirit of a comment from yesterday's meeting, people prefer short,
> simple specifications - let's not try to describe everything about versions
> of a data 'instance' in ACDD.  Since this is a discovery specification, we
> might narrow our discussion of version dates to what is needed for a user to
> find out whether a NetCDF file (instance) he encounters is something he needs
> to get his hands on.  At least, we may want to keep that in mind when we look
> at use cases for the various dates we're considering.
>
> Since I work with NetCDF files, I'm going to skip over the granule/collection
> part of Jim's email and get right to what we've been calling 'file times'.
>
> I have 2 useful time stamps, with use cases that are very common for
> in situ data - I know I've been harping on this for a long time, but I'll
> outline it again, please bear with me:
>
> - the time the last edit or processing of observed (e.g. temperature) or
> calculated (e.g. salinity) values occurred. This is the '*data version*'.
> Discovery use case: a colleague can't reproduce our bulk flux outputs, he
> needs to determine if his input data is the same 'data version' as what we
> used (otherwise, his algorithm may be different (therefore, wrong)).
Nan, this is already handled by date_modified, which ACDD 1.0 and 1.1
define as "The date on which this data was last modified." (Note that the
definition says "data", not "metadata".)

>
> - the time the file was written, which could simply reflect formatting or
> metadata changes. Use case: data user has many questions about e.g. sensor
> heights, which may have been added to the data set after he accessed it.
> Having this time stamp allows him to see if his metadata is out of date; it
> also allows me to check if a remote server has the most up to date metadata
> and format.
Isn't this met by my proposed date_metadata_modified?
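
To make the two use cases concrete, here is a minimal sketch (not part of
ACDD) of how a user might compare his copy of a file with the copy on a
remote server. It assumes the global attributes have already been read into
plain dicts, that both dates are ISO 8601 strings without a time zone, and
that date_metadata_modified exists as proposed; the function name
stale_parts is made up for illustration.

```python
from datetime import datetime


def stale_parts(mine, theirs):
    """Return which parts of my copy are older than the other copy.

    mine and theirs are dicts of global attributes (hypothetical inputs).
    date_modified covers the data; date_metadata_modified (proposed)
    covers the metadata. A pair is skipped if either copy lacks it.
    """
    checks = (
        ("date_modified", "data"),
        ("date_metadata_modified", "metadata"),
    )
    stale = []
    for attr, label in checks:
        if attr in mine and attr in theirs:
            # fromisoformat accepts "YYYY-MM-DD" and "YYYY-MM-DDTHH:MM:SS"
            my_date = datetime.fromisoformat(mine[attr])
            their_date = datetime.fromisoformat(theirs[attr])
            if my_date < their_date:
                stale.append(label)
    return stale


mine = {
    "date_modified": "2014-09-01",
    "date_metadata_modified": "2014-09-01",
}
theirs = {
    "date_modified": "2014-09-01",
    "date_metadata_modified": "2014-10-03T13:00:00",
}
print(stale_parts(mine, theirs))  # prints ['metadata']
```

In this example the data are the same version in both copies, so the bulk
flux comparison is valid, but the user's metadata (e.g. sensor heights) is
out of date.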


>
> With regards to 'original' time - which you call 'data was first
> produced/acquired', I have to get into the weeds to explain why this date
> should not be 'recommended', but might be in a category of 'suggested, if
> needed'.
>
> We put our real time data on the web, starting the moment the transmitters
> are turned on. There might be 1 record in each file at that time, and it's
> probably junk, since the instruments are in a parking lot - I may not even
> know this time, if the transmitters are turned on over the weekend. When we
> recover instruments, we discard the real time data and publish the 'first
> cut' of the internally recorded data, which is later overwritten by an edited
> version, re-processed with post-cals. What is the use case for providing the
> 'first produced' date for this kind of data?
>
> This was my earlier proposal; I'd be glad to change 'file date' to 'instance
> date' or something similar. I still like the idea of leaving it up to the
> user to decide what level of change precipitates a new version date.
>> Maybe we should use version_date for substantive changes, and
>> file_date for the actual time stamp of the file; it would then be up
>> to the provider to decide what constitutes a new version of a file;
>> slight formatting changes, additional non-critical metadata would
>> not, but new algorithms or added data might.
>
> Cheers -
> Nan
>
> On 10/3/14 2:06 PM, Jim Biard via Esip-documentation wrote:
>> Hi.
>>
>> I was wondering if it would be useful to back the whole date attribute
>> question up a bit.
>>
>> Without using any existing or proposed attribute name, can each stakeholder
>> describe what kinds of date stamps they need and want?
>>
>> When describing these date stamps, I see three different entities (sort of)
>> that they might relate to - and there are probably more. The ones that I
>> see are:
>>
>>   * Granule - An atom of data that is bounded in space and/or time.
>>     One granule can include multiple variables, and has variable- and
>>     granule-level metadata. A granule *is not* a netCDF file. It is
>>     data and metadata floating free in "the cloud".
>>   * Collection - A group of granules that are treated as a consistent
>>     whole. A collection may be static, or it may grow over time. As
>>     with a granule, a collection is a conceptual object in "the cloud".
>>   * Granule Instance - A granule expressed as one or more netCDF files.
>>
>>
>> Using these terms, here are date stamps that I find useful/needed. Most all
>> of these should have accompanying annotations in history metadata.
>>
>>   * Date a granule's data was first produced/acquired. This can get
>>     tricky for a granule consisting of a long time series.
>>   * Date a granule's metadata was first associated with the data.
>>   * Date a granule's data was last modified.
>>   * Date a granule's metadata was last modified.
>>   * Date a granule instance was created.
>>   * Date a granule instance was last modified.
>>   * Date a collection was established. (I say it this way on account
>>     of growing collections.) I guess this amounts to a
>>     version/edition time stamp.
>>
>>
>> There are other entities and date stamps that I have left out because I
>> didn't see them as being relevant to a particular granule instance in a
>> netCDF file.
>>
>> Do these make sense? Are there others that you can think of?
>>
>> Grace and peace,
>>
>> Jim
>>
>> CICS-NC <http://www.cicsnc.org/> Visit us on Facebook <http://www.facebook.com/cicsnc>
>> *Jim Biard*
>> *Research Scholar*
>> Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
>> North Carolina State University <http://ncsu.edu/>
>> NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
>> 151 Patton Ave, Asheville, NC 28801
>> e: jbiard at cicsnc.org
>> o: +1 828 271 4900
>>
>
>
> -- 
> *******************************************************
> * Nan Galbraith        Information Systems Specialist *
> * Upper Ocean Processes Group            Mail Stop 29 *
> * Woods Hole Oceanographic Institution                *
> * Woods Hole, MA 02543                 (508) 289-2444 *
> *******************************************************
>

-- 
Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St, Suite 255A
Monterey, CA 93940
Phone: (831)333-9878 (Changed 2014-08-20)
Fax: (831)648-8440
Email: bob.simons at noaa.gov

The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric
Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <>< <><
