[Esip-preserve] Identifiers

Bruce Barkstrom brbarkstrom at gmail.com
Thu Feb 16 14:34:24 EST 2012


'Granule' is still undefined.  Very early in EOS and in the
development of EOSDIS, a 'granule' was the smallest item
identifiable in the inventory.  Apparently the concern was
that the storage media would divide files and place the first
part of the file on one tape and the second part on another.
I'm aware that the term 'granule' may not be identical with
the term 'file'.  So, to start another entry in the dictionary:

Granule:
1.  A data file.
2.  Several files identified with a single inventory identifier.
3.  One or more data files, together with metadata and other
explanatory documentation, available as a
single (OAIS RM) Dissemination Information Package.
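
As a rough sketch of how the three senses above might share one
representation (Python; the class and field names here are hypothetical
and are not drawn from the EOSDIS data model or the OAIS RM), a granule
can be treated as one inventory identifier covering one or more data
files, optionally bundled with metadata and documentation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Granule:
    """Hypothetical model: one inventory identifier, one or more files."""
    inventory_id: str
    data_files: tuple
    metadata_files: tuple = ()
    documentation_files: tuple = ()

    def __post_init__(self):
        # Every sense of 'granule' listed above has at least one data file.
        if not self.data_files:
            raise ValueError("a granule needs at least one data file")

    def sense(self) -> int:
        """Return which dictionary sense (1, 2, or 3) this granule matches."""
        if self.metadata_files or self.documentation_files:
            return 3  # files plus metadata/documentation, as in a DIP
        return 1 if len(self.data_files) == 1 else 2
```

Under this sketch the three senses differ only in what is attached to
the identifier, which may be one way to keep a single dictionary entry.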

Note also

Static Dataset:
1.  An unchanging dataset
2.  A closed dataset

Dynamic Dataset:
1.  A collection of files that may have additional files
added to it with further production
2.  A file to which further production will append data
[with the example being the GHCN monthly average
precipitation records - and probably other GHCN data
files for things like temperature and humidity]
3.  A relational database subject to being modified
by transactions
4.  An Open data collection.
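
One possible way to picture the static/dynamic distinction (a minimal
Python sketch; the class, method, and exception names are hypothetical,
and it covers only the "collection of files" sense, not the appended
file or the relational database): a dataset is open while granules can
still be added and becomes static once it is closed.

```python
class ClosedDatasetError(Exception):
    """Raised when something tries to add to a static (closed) dataset."""


class Dataset:
    """Hypothetical model: a named collection of granule identifiers."""

    def __init__(self, name, granule_ids=(), closed=False):
        self.name = name
        self._granule_ids = list(granule_ids)
        self.closed = closed  # True = static/closed, False = dynamic/open

    def add(self, granule_id):
        """Add a granule; only allowed while the dataset is open."""
        if self.closed:
            raise ClosedDatasetError(f"{self.name} is closed")
        self._granule_ids.append(granule_id)

    def close(self):
        """Turn a dynamic dataset into a static one."""
        self.closed = True

    def snapshot(self):
        """Return an immutable copy of the current membership."""
        return tuple(self._granule_ids)
```

A snapshot taken from an open dataset on one day can differ from one
taken the next day, which is the property that makes identifying a
dynamic dataset awkward.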

On Thu, Feb 16, 2012 at 10:06 AM, Curt Tilmes <Curt.Tilmes at nasa.gov> wrote:
> On 02/15/2012 03:48 PM, Bruce Barkstrom wrote:
>>
>> It would be useful to at least have some clear definitions of
>> things.
>>
>> So - to go back to the "undefined term" "data set" does this term
>> refer to
>
>
> Yes, we need to define it.  We keep putting it off.  Let's debate this
> one now.  We might not come to complete agreement, but perhaps we can
> refine this sufficiently to come up with something we can make use of.
>
>
> I use it for something comparable to the EOSDIS Data Model concept of
> Earth Science Data Type (ESDT) + Collection.
>
> So, for example, the { "MODIS/Terra Snow Cover 5-Min L2 Swath 500m"
> (MOD10_L2), "Collection 5" } is one dataset.
>
> { MOD10_L2, Collection 6 } would be a distinct dataset, and need a
> distinct identifier (eventually DOI).
>
>
> I'm also not wedded to the term "dataset" for this concept -- if
> someone can sell me on an alternative.  I just think we need some term
> for this concept we can all live with.  "dataset" is the most natural
> I can come up with.
>
>
> A couple notes for people who don't speak "NASA EOSDIS Data Model":
>
> 1. A dataset is made up of granules.
>
> 2. Each granule in the dataset was made in a "common" (I won't define
> that for now) way.
>
> 3. Each granule in the dataset has a common format, metadata, filename
> convention, etc.  A reader for one granule will also be able to read
> another granule from the same dataset.
>
>
> I'll further add these definitions for discussion:
>
> A "static dataset" doesn't change.  The set of granules and their
> particular contents is constant.
>
> A "dynamic dataset" can change.  For example, the datasets above will
> grow every day since they are part of an ongoing NASA mission that
> keeps capturing and processing new data.  The granules that were part
> of the dataset yesterday and the granules that are part of the dataset
> today are different.  (I know this causes some folks heartburn, but it
> is a reality we need to accommodate.)
>
>
> Once we get dataset straight, we can talk about subsets/other
> aggregations.
>
>
> --
> Curt Tilmes
> U.S. Global Change Research Program
> 1717 Pennsylvania Avenue NW, Suite 250
> Washington, D.C. 20006, USA
>
> +1 202-419-3479 (office)
> +1 443-987-6228 (cell)

