[Esip-preserve] Identifiers

Curt Tilmes Curt.Tilmes at nasa.gov
Thu Feb 16 10:06:08 EST 2012


On 02/15/2012 03:48 PM, Bruce Barkstrom wrote:
> It would be useful to at least have some clear definitions of
> things.
>
> So - to go back to the "undefined term" "data set" does this term
> refer to

Yes, we need to define it.  We keep putting it off.  Let's debate this
one now.  We might not come to complete agreement, but perhaps we can
refine this sufficiently to come up with something we can make use of.


I use it for something comparable to the EOSDIS Data Model concept of
Earth Science Data Type (ESDT) + Collection.

So, for example, the { "MODIS/Terra Snow Cover 5-Min L2 Swath 500m"
(MOD10_L2), "Collection 5" } is one dataset.

{ MOD10_L2, Collection 6 } would be a distinct dataset, and need a
distinct identifier (eventually DOI).
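
To make that identity rule concrete, here is a minimal Python sketch. The
class name DatasetKey and its field names are made up for illustration and
are not part of the EOSDIS Data Model; the only point is that the pair
(ESDT, collection) taken together is what identifies a dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetKey:
    """Hypothetical identity of a dataset: ESDT short name plus collection."""
    esdt: str        # ESDT short name, e.g. "MOD10_L2"
    collection: int  # collection number, e.g. 5


c5 = DatasetKey("MOD10_L2", 5)
c6 = DatasetKey("MOD10_L2", 6)

# Same ESDT, different collection: two distinct datasets, so each one
# would need its own identifier.
distinct = c5 != c6
```

Because the key is frozen (immutable and hashable), it could also serve as
a lookup key when mapping datasets to whatever identifiers get assigned.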


I'm also not wedded to the term "dataset" for this concept -- if
someone can sell me on an alternative.  I just think we need some term
for this concept we can all live with.  "dataset" is the most natural
I can come up with.


A couple notes for people who don't speak "NASA EOSDIS Data Model":

1. A dataset is made up of granules.

2. Each granule in the dataset was made in a "common" (I won't define
that for now) way.

3. Each granule in the dataset has a common format, metadata, filename
convention, etc.  A reader for one granule will also be able to read
another granule from the same dataset.


I'll further add these definitions for discussion:

A "static dataset" doesn't change.  The set of granules and their
particular contents is constant.

A "dynamic dataset" can change.  For example, the datasets above will
grow every day since they are part of an ongoing NASA mission that
keeps capturing and processing new data.  The granules that were part
of the dataset yesterday and the granules that are part of the dataset
today are different.  (I know this causes some folks heartburn, but it
is a reality we need to accommodate.)
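
As a rough illustration of the static/dynamic distinction, the following
Python sketch models a dataset as a named set of granule ids.  Everything
in it (the Dataset class, the add and snapshot methods, the granule ids)
is hypothetical and only meant to show the behavior being discussed, not
any existing system.

```python
class Dataset:
    """Hypothetical model: a dataset is a named set of granule ids."""

    def __init__(self, name, granules=(), dynamic=False):
        self.name = name
        self.dynamic = dynamic
        self._granules = set(granules)

    def add(self, granule_id):
        # A static dataset never changes, so refuse new granules.
        if not self.dynamic:
            raise ValueError(f"{self.name} is static; its granules are fixed")
        self._granules.add(granule_id)

    def snapshot(self):
        # Immutable view of the membership at this moment.
        return frozenset(self._granules)


# Placeholder granule ids, not real MODIS file names.
ongoing = Dataset("MOD10_L2 Collection 5", ["granule-day-1"], dynamic=True)
yesterday = ongoing.snapshot()
ongoing.add("granule-day-2")
today = ongoing.snapshot()

# The dataset keeps its name, but its membership differs between days.
membership_changed = yesterday != today
```

One consequence the sketch makes visible: for a dynamic dataset, the
dataset's name alone does not pin down a particular set of granules,
whereas a snapshot does.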


Once we get dataset straight, we can talk about subsets/other
aggregations.


-- 
Curt Tilmes
U.S. Global Change Research Program
1717 Pennsylvania Avenue NW, Suite 250
Washington, D.C. 20006, USA

+1 202-419-3479 (office)
+1 443-987-6228 (cell)

