[Esip-preserve] Identifiers

Mark A. Parsons parsonsm at nsidc.org
Thu Feb 16 10:54:12 EST 2012


Personally, I think defining a data set too precisely is a fool's errand. It is the responsibility of the data authors and stewards to define something that makes sense for their designated community and slap a DOI and a name on it.

To me, a data set is simply a logical arrangement of data that has meaning to a designated community.

Your definition below, for example, does not work for many, perhaps the majority, of NSIDC data sets.

Cheers,

-m. 
On 16 Feb 2012, at 8:06 AM, Curt Tilmes wrote:

> On 02/15/2012 03:48 PM, Bruce Barkstrom wrote:
>> It would be useful to at least have some clear definitions of
>> things.
>> 
>> So - to go back to the "undefined term" "data set" does this term
>> refer to
> 
> Yes, we need to define it.  We keep putting it off.  Let's debate this
> one now.  We might not come to complete agreement, but perhaps we can
> refine this sufficiently to come up with something we can make use of.
> 
> 
> I use it for something comparable to the EOSDIS Data Model concept of
> Earth Science Data Type (ESDT) + Collection.
> 
> So, for example, the { "MODIS/Terra Snow Cover 5-Min L2 Swath 500m"
> (MOD10_L2), "Collection 5" } is one dataset.
> 
> { MOD10_L2, Collection 6 } would be a distinct dataset, and need a
> distinct identifier (eventually DOI).
> 
> 
> I'm also not wedded to the term "dataset" for this concept -- if
> someone can sell me on an alternative.  I just think we need some term
> for this concept we can all live with.  "dataset" is the most natural
> I can come up with.
> 
> 
> A couple notes for people who don't speak "NASA EOSDIS Data Model":
> 
> 1. A dataset is made up of granules.
> 
> 2. Each granule in the dataset was made in a "common" (I won't define
> that for now) way.
> 
> 3. Each granule in the dataset has a common format, metadata, filename
> convention, etc.  A reader for one granule will also be able to read
> another granule from the same dataset.
> 
> 
> I'll further add these definitions for discussion:
> 
> A "static dataset" doesn't change.  The set of granules and their
> particular contents is constant.
> 
> A "dynamic dataset" can change.  For example, the datasets above will
> grow every day since they are part of an ongoing NASA mission that
> keeps capturing and processing new data.  The granules that were part
> of the dataset yesterday and the granules that are part of the dataset
> today are different.  (I know this causes some folks heartburn, but it
> is a reality we need to accommodate.)
> 
> 
> Once we get dataset straight, we can talk about subsets/other
> aggregations.
> 
> 
> -- 
> Curt Tilmes
> U.S. Global Change Research Program
> 1717 Pennsylvania Avenue NW, Suite 250
> Washington, D.C. 20006, USA
> 
> +1 202-419-3479 (office)
> +1 443-987-6228 (cell)
> _______________________________________________
> Esip-preserve mailing list
> Esip-preserve at lists.esipfed.org
> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
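
For readers who find the quoted definitions easier to follow as a data structure, here is a minimal sketch of them in Python. It is only an illustration of the ESDT + Collection idea as described in the quoted message, not the EOSDIS Data Model itself: the class names, the `dynamic` flag, and the granule name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetKey:
    """Identity of a dataset: ESDT short name plus collection (assumed fields)."""
    esdt: str          # e.g. "MOD10_L2"
    collection: str    # e.g. "5"


@dataclass
class Dataset:
    key: DatasetKey
    # Hypothetical flag: True for a "dynamic" dataset that can still grow,
    # False for a "static" dataset whose granule set is fixed.
    dynamic: bool = True
    granules: set = field(default_factory=set)

    def add_granule(self, granule_id: str) -> None:
        """Add a granule; refuse if the dataset is static."""
        if not self.dynamic:
            raise ValueError("static dataset: granule set is fixed")
        self.granules.add(granule_id)


# Same ESDT, different collections: two distinct datasets,
# each of which would need its own identifier.
c5 = Dataset(DatasetKey("MOD10_L2", "5"))
c6 = Dataset(DatasetKey("MOD10_L2", "6"))
c5.add_granule("granule-0001")  # made-up granule name
```

In this sketch the identity lives entirely in the key, so a growing granule set does not change which dataset is being named. Whether that is the right behaviour for an identifier is exactly the question under debate.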


