[Esip-preserve] [Infusion] Suggestion for tech infusion activity vis a vis MEaSUREs

Alice Barkstrom alicebarkstrom at verizon.net
Wed Apr 14 10:46:43 EDT 2010


The ESDT description still needs a distinction between data sources,
as well as collections based on homogeneity of the production configuration
(which machines and which software were used).  I'll note that OIDs fit
into the approach suggested here - see
  http://www.oceandis.com/metadata/Text_Documentation/Example/App_B.pdf
for one form of OID structure.

Bruce B.

At 10:04 AM 4/14/2010, Curt Tilmes wrote:
>On 03/23/2010 02:35 PM, Wilson, Brian D (335G) wrote:
> > We will need to formulate this consensus recommendation quickly.
> >
> > I suggest two features:
> >
> > 1) Publish the MEaSUREs datasets as a dataset paper in an appropriate
> > journal so the *dataset* has a reference-able DOI.
>
>In the ESIP Preservation cluster identifiers group, we've begun to
>distinguish the concept of "Data Type" (what EOS calls an ESDT) from
>"Dataset", which is a specific version (in EOS parlance, a
>'Collection') of that Data Type.
>
>I put some strawman terms and definitions here: (up for discussion!)
>http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Identifiers#Definitions
>
>I think each of those concepts needs a referenceable identifier from
>which we can construct data citations.
>
>For example, consider ESDT FOO.  It is archived in DAAC MyOrg
>(CrossRef DOI Org 10.12345), which has archived data from ESDT FOO for
>collection 1 (a "Closed Data Set") and is currently archiving
>collection 2 (an "Open Data Set" still being processed from current
>data).
>
>We need a citation for the general data type:
>
>Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO.
>
>and a citation for each data set (each version of the data type).
>Rather than registering a new DOI for each new version (collection),
>I'm inclined to advise reusing the data type DOI:
>
>Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>Collection 1.
>
>This "datatype DOI" could also be the 'published paper describing the
>dataset' DOI, but I guess I'd be inclined to have separate DOIs, one
>for the paper, and one for the datatype.  Then a paper could reference
>either or both as appropriate to the nature of the use.
>
>
>Alternatively, we could register distinct DOIs for each new version:
>
>Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO.1,
>Collection 1.
>
>For the "Open Data Set" case, I think we must precisely qualify the
>citation to reference the specific granule membership of the dataset.
>There are a few ways to do this, but I think the cleanest is a
>date/time stamp:
>
>Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>Collection 2, 2010-04-01T14:00:00.
>
> > 2) Serve the dataset granules from permanent (as far as possible)
> > URLs from the origin sites and the receiving DAACs.  The grabbed
> > real estate, the root of the URL, should reference MEaSUREs and the
> > institution, and not contain the name of a computer (or something
> > else that is dumb).
> >
> > 3) As for truly permanent URIs, I don't know what to say.  I
> > don't think the handle system, XRIs, or any other system has
> > gotten traction (a large market share).  This is mostly the fault of
> > the W3C, which thinks the entire problem has been solved by existing
> > URLs and URNs.  Hogwash.
>
>I like including both identifiers, datatype and dataset.  I'm leaning
>toward using DOIs for the datatype and PURLs for the precise data
>specification and locator:
>
>Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>Collection 2, http://purl.org/NET/MyOrg/data/FOO/2/2010-04-01T14:00:00.
>
>(Though, as Ruth points out, ARKs are nice too and have their own
>benefits.)
>
>Curt
>_______________________________________________
>Esip-preserve mailing list
>Esip-preserve at lists.esipfed.org
>http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
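
The citation forms Curt lists above can be generated mechanically from a
handful of fields.  What follows is only a minimal sketch of that idea, not
an agreed ESIP format: the class name DataType, the functions cite() and
purl(), and all of their parameters are hypothetical names invented for
illustration.  The values (FOO, MyOrg, 10.12345) are the placeholder values
from the message itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DataType:
    """Hypothetical record for a data type (what EOS calls an ESDT)."""
    author: str       # e.g. "Smith, John"
    title: str        # e.g. "Some Earth Science Data"
    short_name: str   # e.g. "FOO"
    doi_prefix: str   # e.g. "10.12345" (the placeholder prefix above)

    @property
    def doi(self) -> str:
        # Data type DOI: prefix, slash, short name.
        return f"{self.doi_prefix}/{self.short_name}"


def cite(dt: DataType,
         collection: Optional[int] = None,
         as_of: Optional[datetime] = None,
         per_version_doi: bool = False) -> str:
    """Build one of the citation strings discussed in the thread.

    collection       version of the data type, if citing a data set
    as_of            timestamp qualifying an open data set's membership
    per_version_doi  if True, append ".<collection>" to the DOI
                     (the alternative scheme); otherwise reuse the
                     data type DOI for every collection
    """
    doi = dt.doi
    if per_version_doi and collection is not None:
        doi = f"{doi}.{collection}"
    parts = [f'{dt.author}. "{dt.title}", {dt.short_name}, DOI: {doi}']
    if collection is not None:
        parts.append(f"Collection {collection}")
    if as_of is not None:
        parts.append(as_of.isoformat(timespec="seconds"))
    return ", ".join(parts) + "."


def purl(dt: DataType, org: str, collection: int, as_of: datetime) -> str:
    """Assumed PURL layout, copied from the single example in the message."""
    stamp = as_of.isoformat(timespec="seconds")
    return f"http://purl.org/NET/{org}/data/{dt.short_name}/{collection}/{stamp}"


foo = DataType("Smith, John", "Some Earth Science Data", "FOO", "10.12345")
stamp = datetime(2010, 4, 1, 14, 0, 0)

print(cite(foo))                                   # general data type
print(cite(foo, collection=1))                     # closed data set
print(cite(foo, collection=1, per_version_doi=True))
print(cite(foo, collection=2, as_of=stamp))        # open data set
print(purl(foo, "MyOrg", 2, stamp))
```

Keeping the collection number and timestamp outside the DOI, as in the
default branch here, means only one DOI is registered per data type; the
per_version_doi switch shows how little changes if a DOI per collection is
preferred instead.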