[Esip-documentation] ACDD 1.3 issue: date & time stamps

Armstrong, Edward M (398M) via Esip-documentation esip-documentation at lists.esipfed.org
Wed Oct 8 20:52:46 EDT 2014


I personally find the proposed attributes:

date_content_modified, date_values_modified,

not too terribly useful, since for my data organization modified data values would imply a new version of the dataset in a completely independent granule.

I think most providers of satellite data package their data similarly, i.e., they never go back and change the values in an existing granule.

Perhaps others need these as part of web services that may modify a granule.

Besides the attribute “id:” there does not seem to be a way to set an explicit product version number in the ACDD. Would a “version:” or “processing_version:” attribute not be a good addition?


On Oct 7, 2014, at 2:41 PM, Bob Simons - NOAA Federal via Esip-documentation <esip-documentation at lists.esipfed.org> wrote:

While there are lots of types of artifacts that could be date/timestamped and thus lots of terms for them, isn't that irrelevant?
Doesn't a given set of ACDD metadata apply to the data it is attached to (e.g., a file, a file with a granule, a file with a collection, a virtual aggregated dataset in THREDDS, near real time, delayed)?
And if we use an attribute name with an artifact name, e.g., date_file_modified, doesn't that become an anachronism when the dataset is served in THREDDS?  And doesn't an attribute name that doesn't specify the artifact (e.g., date_modified) remain relevant and appropriate?

On 2014-10-07 1:59 PM, John Graybeal via Esip-documentation wrote:
For the issue of date and time stamps, I've tried to distill the key approaches and issues from emails into the Active Issues<https://docs.google.com/spreadsheets/d/19fl5AgGkckG03yTchUjYUp4YnR09Fn1Nqps2KHenkC4/edit#gid=0> table (item 9, starts row 71), mostly focusing on concrete proposals. These are my conclusions so far.

1) Beyond the key 3-4 terms, the need for any particular term is small; but many need or want *some* other term(s).
2) Thus, considering the 'useful' use cases will create a big set of functions or 'date types', somewhere between Jim's initial list and the ISO list.
3) The number of artifacts that we are considering timestamping (file, data, etc.) is shorter but more than 2; reference list below. This is a multiplier against the list of useful functions.
4) "I'd like to keep things lean and flexible, and not build out a complex taxonomy unless we really need one." (Jim B.)

This sends me back to the original 3 terms date_* (created, modified, issued) as a necessary starting point. My strong suspicion is that casual users assume date_* terms (with no artifact) refer to the whole file/product, not just the data. So I suggest the default definitions go in that direction, rather than following Bob's proposal. But I'll settle for anything explicit.

I note as a matter of process that if we do not achieve consensus on a useful collection of date/time stamps, we will be stuck with the original definitions + whatever modifications can get a 70% vote. This adds to my initial focus on whether we can improve the first 3 definitions, as that will affect how we consider other needed functions.

John



Terms mentioned for the artifact to be date/timestamped (with nominal synonyms by me; they aren't perfect but represent large conceptual overlaps):
- file = product ~ granule instance (Jim, I think your list of artifacts should include one corresponding directly to file)
- data = values;
- metadata = attributes;
- static dataset;
- near real-time dataset;
- collection;
- granule;
- resource ("The term resource is used in 19115-1 as a replacement for dataset in order to emphasize that metadata can be used at many levels and for many kinds of things.")


On Oct 6, 2014, at 05:31, Ge Peng - NOAA Affiliate via Esip-documentation <esip-documentation at lists.esipfed.org> wrote:

Trying to look at the date issue from a different perspective:

Static historical datasets and near real-time datasets have different requirements for creation date. So do collection-level metadata records and file-level metadata attributes. Search and discovery will eventually touch on both collection- and file-level metadata in some way.

Information about the “original” creation date and modification date can be useful, especially for data provenance. However, both “original” creation date and “modification” date imply a need for keeping track of dates.

For static datasets and collection-level metadata records, decisions to create and modify them tend to happen less frequently, with weak or no latency requirements, and are often made subjectively and documented, so tracking these dates may be more feasible to implement. The original creation date may be relevant and good to have in this case.

On the other hand, for near real-time datasets, which usually have strong data-latency requirements, and for file-level metadata such as the global attributes of a NetCDF file in a near real-time dataset, it may not be feasible to keep track of the original creation date, although a version number/date could be utilized. The file creation date/time stamp may be more appropriate in this case.

Following the discussion thread on the date and time stamps, and based on my experience working with both static and near real-time datasets, it has become apparent to me that a date type element may offer the flexibility to implement different types of dates for different types of data, as it may be nearly impossible for us to find a one-size-fits-all solution.
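
A date type element of the kind suggested above could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the TypedDate class, the dates_of_type helper, the type vocabulary (three names borrowed from ISO 19115 date types plus a made-up "file_generation"), and the example dates. None of it is proposed ACDD syntax.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical vocabulary of date types; not an agreed ACDD list.
DATE_TYPES = {"creation", "revision", "publication", "file_generation"}


@dataclass(frozen=True)
class TypedDate:
    """A date paired with a type saying what kind of event it records."""
    date_type: str
    value: date

    def __post_init__(self):
        # Reject types outside the (hypothetical) vocabulary.
        if self.date_type not in DATE_TYPES:
            raise ValueError(f"unknown date type: {self.date_type!r}")


def dates_of_type(dates, date_type):
    """Return the values of all dates carrying the given type."""
    return [d.value for d in dates if d.date_type == date_type]


# A static dataset can afford to track its original creation date.
static_dataset = [
    TypedDate("creation", date(2009, 3, 1)),
    TypedDate("revision", date(2014, 10, 6)),
]

# A near real-time file may only carry a file generation stamp.
nrt_file = [TypedDate("file_generation", date(2014, 10, 6))]
```

With this shape, each provider records only the date types it can actually track, and a consumer asks for the type it needs instead of guessing what a single untyped date means.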

Coming late to the discussions and not wanting to stir up any additional discussion, I withheld my comment until our last meeting, when the type element for creator was suggested. Now, with Jim’s suggestion to approach this issue in a more systematic way, I am putting my suggestion in – hopefully it will help us reach a more consistent decision, as it will have long-lasting implications for everyone – providers, stewards, developers, and users.

Best regards,

--- Peng

P.S.: I have been thinking about this over the weekend but decided to post my comment this morning – it may overlap with Ted’s and John’s comments.

On Fri, Oct 3, 2014 at 3:54 PM, John Graybeal via Esip-documentation <esip-documentation at lists.esipfed.org> wrote:
The issue of date/timestamps is by far the most challenging ACDD issue.

A very short version of the very long 1.3 history:
A) A year-plus ago, several members identified ambiguities in definition, understanding, and use of the originals (date_created, date_modified, date_issued)
B) An extensive analysis/discussion took place over a year's time; the opinion of that group was that new terms should be created in place of the old. These terms were date_content_modified, date_values_modified, date_product_generated. A fourth was proposed but not settled on, a la date_product_originally_created. Another request relative to this group was for a publication date (which could be date_issued or something else, depending on definitions.)
C) The broader group's recent (general) decision was that existing terms should be kept, but redefined if necessary.
D) Our 10-minute attempt yesterday helped bring out some of the original and ongoing issues.
E) Jim Biard's email this morning ("ACDD date attributes question") takes a back-to-fundamentals approach, laying out some suggested use cases and concepts.

For reference (only!), I've provided below definitions of these terms more or less as they originated or were improved.

From a process perspective I think we might want to have a separate sub-group hash out a recommendation, but I think Jim's idea of gathering requirements is a necessary first step anyway. So I propose
(a) you respond to Jim's email with your inputs to his summary, and
(b) reply to this email with any other comments or analysis, especially about process.

I do not have any clever ideas to make this topic more easily resolvable, given the constraints in (A) and (C) above. Let's start the discussion and see how it goes.

John

====================

date_created
The date on which the data was created.
date_modified
The date on which this data was last modified.
date_issued
The date on which this data was formally issued.
date_content_modified
The date on which any of the provided content, including data, metadata, and presented format, was last changed (including creation)
date_values_modified
The date on which the provided data values were last changed (including creation); excludes metadata and formatting changes
date_product_generated
The date on which this data file or product was produced/distributed. While this date is like a file timestamp, the date_content_modified and date_values_modified should be used to assess the age of the contents of the file or product.
date_product_originally_created
The date on which this data file or product first came into existence.
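
As a rough illustration of how the three proposed date_* terms above would differ in practice, here is a minimal Python sketch. The stamp function and its parameters are hypothetical, attributes are modelled as a plain dict rather than a real NetCDF file, ISO 8601 UTC strings are an assumed format, and treating every write as a new date_product_generated is one reading of the definition, not a settled rule.

```python
from datetime import datetime, timezone


def stamp(attrs, when, values_changed=False, metadata_changed=False):
    """Return a copy of attrs with the proposed date attributes updated.

    Hypothetical helper: `when` is a timezone-aware datetime for the write.
    """
    updated = dict(attrs)
    iso = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Content covers data values, metadata, and format, so either kind
    # of change advances date_content_modified.
    if values_changed or metadata_changed:
        updated["date_content_modified"] = iso
    # Only a change to the data values advances date_values_modified.
    if values_changed:
        updated["date_values_modified"] = iso
    # Assumption: every write of the file counts as generating the product.
    updated["date_product_generated"] = iso
    return updated


# First write creates the values (creation counts as a change).
t0 = datetime(2014, 10, 1, tzinfo=timezone.utc)
first = stamp({}, t0, values_changed=True)

# A later write corrects only a metadata attribute.
t1 = datetime(2014, 10, 8, tzinfo=timezone.utc)
second = stamp(first, t1, metadata_changed=True)
```

After the second write, date_values_modified still holds the first timestamp, while date_content_modified and date_product_generated both hold the later one.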

<END>

_______________________________________________
Esip-documentation mailing list
Esip-documentation at lists.esipfed.org<mailto:Esip-documentation at lists.esipfed.org>
http://www.lists.esipfed.org/mailman/listinfo/esip-documentation




--
Ge Peng, Ph.D
Research Scholar
Cooperative Institute for Climate and Satellites NC<http://cicsnc.org/>
North Carolina State University<http://ncsu.edu/>
NOAA's National Climatic Data Center<http://ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801
ge.peng at noaa.gov
o: +1 828 257 3009
f:  +1 828 257 3002

Following CICS-NC on Facebook<http://www.facebook.com/cicsnc>





--
Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St, Suite 255A
Monterey, CA 93940
Phone: (831)333-9878 (Changed 2014-08-20)
Fax: (831)648-8440
Email: bob.simons at noaa.gov

The contents of this message are mine personally and
do not necessarily reflect any position of the
Government or the National Oceanic and Atmospheric
Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <>< <><


-ed

Ed Armstrong
JPL Physical Oceanography DAAC
818 519-7607


