[Esip-preserve] On Earth Science Data File Uniqueness

Ruth Duerr rduerr at nsidc.org
Wed Feb 9 12:30:09 EST 2011


On Feb 9, 2011, at 10:19 AM, Lynnes, Christopher S. (GSFC-6102) wrote:

> 
> On Feb 9, 2011, at 12:08 PM, Curt Tilmes wrote:
> 
>> On 02/09/11 11:50, Lynnes, Christopher S. (GSFC-6102) wrote:
>>> I thought UUID was designed to answer only the question: are data
>>> items A and B bitwise-identical?
>> 
>> Absolutely not.  You're thinking of digital signatures or hashes such
>> as MD5 or SHA-1 which can be used to verify file content
>> integrity/fixity.
> 
> I should have phrased that:  has someone asserted that data items A and B are bitwise identical, i.e., by assigning a UUID.  BTW, I thought our preferred method for assigning UUIDs was to derive them from the SHA-1, anyway?

Yes, some of the UUID generating algorithms are based on things like MD5 or SHA-1.  We haven't gotten to the point of developing a best practice for which algorithm to use, though clearly there are additional advantages to using one of the message-digest-based algorithms.
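
To make that concrete, here is a minimal sketch of one way a digest-based UUID could be derived from content, using Python's standard uuid and hashlib modules.  It is not an agreed practice: the namespace and the content_uuid helper are made up for illustration, and note that a standard version-5 UUID hashes a namespace plus a name, so this sketch uses the SHA-1 hex digest of the content as that name.

```python
import hashlib
import uuid

# Hypothetical namespace for this sketch; any fixed UUID would serve.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")


def content_uuid(data: bytes) -> uuid.UUID:
    """Derive a version-5 (SHA-1 based) UUID from a byte string.

    The SHA-1 hex digest of the content is used as the "name" that
    uuid5 hashes together with the namespace.
    """
    digest = hashlib.sha1(data).hexdigest()
    return uuid.uuid5(EXAMPLE_NAMESPACE, digest)


# Identical bytes always yield the same UUID; different bytes do not.
same = content_uuid(b"granule bytes") == content_uuid(b"granule bytes")
different = content_uuid(b"granule bytes") != content_uuid(b"granule bytez")
```

The advantage is that anyone holding the bytes and knowing the namespace can regenerate the identifier; the cost is that the identifier changes whenever a single bit of the content does.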
>> 
>> UUID is just a way to make an identifier that is globally unique
>> forever [1] and easily recognizable as a UUID.
> 
> OK, if it doesn't answer the question, are they identical, what question does the UUID answer???  Just having a unique identifier in and of itself is not intrinsically useful.

Well... if this file and that file both have the same UUID, then one should be a copy of the other.  If one of the message digest algorithms was used to create the UUID, then rerunning that algorithm should demonstrate that they are bit-identical and that they haven't changed since they were created.  But there are a few ifs in there...
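
A rough sketch of that check, assuming the UUID was derived from the file's SHA-1 digest in the first place (which is one of the ifs).  The function names, the choice of uuid.NAMESPACE_URL as the namespace, and the chunk size are all assumptions for illustration, not an established convention.

```python
import hashlib
import uuid


def file_uuid(path, namespace=uuid.NAMESPACE_URL, chunk_size=1 << 20):
    """Compute a version-5 UUID from the SHA-1 digest of a file's bytes."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large data files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return uuid.uuid5(namespace, sha1.hexdigest())


def still_matches(path, recorded_uuid):
    """Recompute the UUID and compare it with the one recorded earlier."""
    return file_uuid(path) == recorded_uuid
```

If still_matches returns False, either the file has changed since the UUID was recorded or the UUID was never derived this way; the check by itself cannot tell those two cases apart.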
> 
>> 
>>> If they have the same UUID, then the answer is yes.  If they have
>>> different UUIDs, then the answer is that there is no evidence to say
>>> that they are bitwise identical.
>> 
>> They can be assigned to the object arbitrarily without regard to
>> content.
> 
> in that case, we have a misuse of UUIDs by the UUID creator.
> 
>> 
>> For example, here is one: 0cdf7b24-f374-419e-8cce-9758432cfdfa
>> It's totally unique in the world (go ahead, google it)
>> 
>> You could assign it to some chunk of data (some object) as you will.
>> Of course, if two of us tried to assign it to different data, we'd
>> have a problem, but we wouldn't do that.
>> 
>> Curt
>> 
>> [1] Ok, ok.  Practically globally unique forever, like really really
>> likely to be globally unique forever.  Almost perfect.  Like it is
>> "Hitchhiker's Guide to the Galaxy" improbable that you would get a
>> conflict.
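
For what it's worth, the example UUID above parses as a version-4 (random) UUID, which carries 122 random bits.  The sketch below checks that with Python's uuid module and puts a rough number on the footnote using the usual birthday approximation.  It assumes the UUIDs come from a good random source, and the collision_probability helper is just an illustration.

```python
import uuid

# The identifier quoted above.
example = uuid.UUID("0cdf7b24-f374-419e-8cce-9758432cfdfa")
example_version = example.version  # the version field of this UUID

# A version-4 UUID has 128 bits, 6 of which are fixed (version and variant).
RANDOM_BITS = 122


def collision_probability(n):
    """Approximate chance of any collision among n random version-4 UUIDs.

    Uses the birthday approximation n^2 / (2 * 2^122), which is only
    meaningful while the result is much smaller than 1.
    """
    return n * n / (2 * 2 ** RANDOM_BITS)


# Even a billion randomly generated UUIDs leave the chance of a clash tiny.
p_billion = collision_probability(10 ** 9)
```

None of this says anything about content, of course: a random UUID is only as meaningful as the record of what it was assigned to.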
>> _______________________________________________
>> Esip-preserve mailing list
>> Esip-preserve at lists.esipfed.org
>> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
> 
> --
> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
> 
> 


