[Esip-preserve] [Infusion] Suggestion for tech infusion activity vis a vis MEaSUREs

Ruth Duerr rduerr at nsidc.org
Wed Apr 14 13:22:03 EDT 2010


Absolutely agreed!

- Ruth

On Apr 14, 2010, at 11:09 AM, J Glassy wrote:

> all,
> 
> for what it's worth, I couldn't agree more with Chris' last two
> emails. Resolving ambiguity up front,
> proactively, has got to be one of the biggest motivations for
> adopting unique identifiers.
> 
> joe
> 
> On Wed, Apr 14, 2010 at 11:03 AM, Christopher Lynnes
> <Chris.Lynnes at nasa.gov> wrote:
>> To expound on my reasoning below just a little bit, part of (or even most
>> of) the point of unique identifiers is to eliminate ambiguity.  Complicated
>> versioning schemes leave enough ambiguity in (e.g. MODIS versions 005 and
>> 051--I never quite grokked the difference) that they warrant DOIs at the
>> version level, just to emphasize that they are different DataType versions.
>> 
>> On Apr 14, 2010, at 12:57 PM, Christopher Lynnes wrote:
>> 
>>> With the complexity and diversity of some of the versioning schemes
>>> out there, I would advocate for using a DOI for each Dataset (i.e.,
>>> DataType + Version).  If a researcher used data from multiple versions
>>> of a dataset, then the citation of multiple DOIs will make that
>>> crystal clear.
>>> 
>>> On Apr 14, 2010, at 10:04 AM, Curt Tilmes wrote:
>>> 
>>>> On 03/23/2010 02:35 PM, Wilson, Brian D (335G) wrote:
>>>>> 
>>>>> We will need to formulate this consensus recommendation quickly.
>>>>> 
>>>>> I suggest two features:
>>>>> 
>>>>> 1) Publish the MEaSUREs datasets as a dataset paper in an appropriate
>>>>> journal so the *dataset* has a reference-able DOI.
>>>> 
>>>> In the ESIP Preservation cluster identifiers group, we've begun to
>>>> distinguish the concept of "Data Type" (what EOS calls ESDT) from
>>>> "Dataset", which is a specific version (in EOS parlance, a
>>>> 'Collection') of that Data Type.
>>>> 
>>>> I put some strawman terms and definitions here: (up for discussion!)
>>>> 
>>>> http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Identifiers#Definitions
>>>> 
>>>> I think each of those concepts needs a referenceable identifier from
>>>> which we can construct data citations.
>>>> 
>>>> For example, consider ESDT FOO.  It is archived in DAAC MyOrg
>>>> (CrossRef DOI Org 10.12345), which has archived data from ESDT FOO for
>>>> collection 1 (a "Closed Data Set") and is currently archiving
>>>> collection 2 (an "Open Data Set" still being processed from current
>>>> data).
>>>> 
>>>> We need a citation for the general data type:
>>>> 
>>>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO.
>>>> 
>>>> and a citation for each data set (each version of the data type).
>>>> Rather than registering a new DOI for each new version (collection),
>>>> I'm inclined to advise reusing the data type DOI:
>>>> 
>>>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>>>> Collection 1.
>>>> 
>>>> This "datatype DOI" could also be the 'published paper describing the
>>>> dataset' DOI, but I guess I'd be inclined to have separate DOIs, one
>>>> for the paper, and one for the datatype.  Then a paper could reference
>>>> either or both as appropriate to the nature of the use.
>>>> 
>>>> 
>>>> Alternatively, we could register distinct DOIs for each new version:
>>>> 
>>>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO.1,
>>>> Collection 1.
>>>> 
>>>> For the "Open Data Set" case, I think we must precisely qualify the
>>>> citation to reference the specific granule membership of the dataset.
>>>> There are a few ways to do this, but I think the cleanest is a
>>>> date/time stamp:
>>>> 
>>>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>>>> Collection 2, 2010-04-01T14:00:00.
>>>> 
>>>>> 2) Serve the dataset granules from URLs that are as permanent as
>>>>> possible, at the origin sites and the receiving DAACs.  The grabbed
>>>>> real estate, the root of the URL, should reference MEaSUREs and the
>>>>> institution, and not contain the name of a computer (or something
>>>>> else that is dumb).
>>>>> 
>>>>> 3) As far as truly permanent URIs, I don't know what to say.  I
>>>>> don't think the handle system, XRIs, or any other system has
>>>>> gotten traction (a large market share).  This is mostly the fault of
>>>>> the W3C, which thinks the entire problem has been solved by existing
>>>>> URLs and URNs.  Hogwash.
>>>> 
>>>> I like including both identifiers, datatype and dataset.  I'm leaning
>>>> toward using DOIs for the datatype and PURLs for the precise data
>>>> specification and locator:
>>>> 
>>>> Smith, John. "Some Earth Science Data", FOO, DOI: 10.12345/FOO,
>>>> Collection 2, http://purl.org/NET/MyOrg/data/FOO/
>>>> 2/2010-04-01T14:00:00.
>>>> 
>>>> (Though, as Ruth points out, ARKs are nice too and have their own
>>>> benefits.)
>>>> 
>>>> Curt
>>>> 
>>>> _______________________________________________
>>>> Infusion mailing list
>>>> Infusion at lists.sciencedatasystems.org
>>>> 
>>>> http://lists.sciencedatasystems.org/mailman/listinfo/infusion_lists.sciencedatasystems.org
>>> 
>>> --
>>> Christopher Lynnes             NASA/GSFC, Code 610.2
>>> 301-614-5185
>>> 
>>> 
>> 
>> --
>> Christopher Lynnes             NASA/GSFC, Code 610.2         301-614-5185
>> 
>> 
>> 
> 
> 
> 
> -- 
> ----------------------------------------------------------------
> Joseph Glassy
> Lead Software Engineer (contractor)
> NASA Measures (Freeze/Thaw),Rm CFC 424
> College of Forestry and Conservation
> Univ. Montana, Missoula, MT 59812
> Tel: 406-243-6318     Cellular: 406-544-3315
> and:
> Research Analyst/Programmer
> University of Montana NSF EPSCoR Program
> Davidson Honors College Room 013
> Missoula, MT 59812
> um.glassy at gmail.com
> Campus phone 243-6337   Cell(406) 544-3315
> _______________________________________________
> Esip-preserve mailing list
> Esip-preserve at lists.esipfed.org
> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
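
The citation forms Curt proposes in the quoted message can be sketched in
a few lines of Python. This is only an illustration of the thread's
examples, not an agreed format: the DataType class, the cite() function,
and its parameter names are hypothetical, and the PURL base
http://purl.org/NET/MyOrg/data is taken from Curt's made-up "MyOrg"
example. The per_version_doi flag switches between the two options under
discussion (reuse the data type DOI, or register one DOI per version).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DataType:
    """Hypothetical record for a data type (what EOS calls an ESDT)."""
    author: str      # e.g. "Smith, John"
    title: str       # e.g. "Some Earth Science Data"
    name: str        # short name, e.g. "FOO"
    doi_prefix: str  # archive's DOI prefix, e.g. "10.12345"


def cite(dt: DataType,
         collection: Optional[int] = None,
         as_of: Optional[datetime] = None,
         per_version_doi: bool = False,
         purl_base: Optional[str] = None) -> str:
    """Build a citation string in the style of the thread's examples.

    collection      -- dataset version; None cites the data type in general
    as_of           -- timestamp pinning granule membership of an open dataset
    per_version_doi -- True appends ".<collection>" to the DOI (one DOI per
                       version); False reuses the data type DOI
    purl_base       -- if given, the timestamp is embedded in a PURL instead
    """
    if as_of is not None and collection is None:
        raise ValueError("a timestamp only makes sense for a specific collection")

    doi = f"{dt.doi_prefix}/{dt.name}"
    if per_version_doi and collection is not None:
        doi = f"{doi}.{collection}"

    parts = [f'{dt.author}. "{dt.title}", {dt.name}, DOI: {doi}']
    if collection is not None:
        parts.append(f"Collection {collection}")
    if as_of is not None:
        stamp = as_of.strftime("%Y-%m-%dT%H:%M:%S")
        if purl_base is not None:
            parts.append(f"{purl_base}/{dt.name}/{collection}/{stamp}")
        else:
            parts.append(stamp)
    return ", ".join(parts) + "."


foo = DataType("Smith, John", "Some Earth Science Data", "FOO", "10.12345")
snapshot = datetime(2010, 4, 1, 14, 0, 0)

print(cite(foo))                                   # data type only
print(cite(foo, collection=1))                     # closed data set, shared DOI
print(cite(foo, collection=1, per_version_doi=True))
print(cite(foo, collection=2, as_of=snapshot))     # open data set
print(cite(foo, collection=2, as_of=snapshot,
           purl_base="http://purl.org/NET/MyOrg/data"))
```

With per_version_doi=True, two versions of the same data type yield two
different DOI strings, which is the disambiguation Chris argues for; with
the default, the version is carried only by the "Collection N" suffix.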


