[Esip-preserve] [esip-semanticweb] Identifiers for people?

Sky Bristol sbristol at usgs.gov
Wed Mar 7 13:56:45 EST 2012


Hi all,

USGS has been working on this issue for some time as well. We've done it under the heading of "master data management" in a project called ScienceBase. I just cross-posted a blog entry I wrote a while back to a public location; it gives an overview of this work.

https://my.usgs.gov/confluence/x/dQCfCQ

One of the biggest challenges we face in the government world, and one we still haven't resolved to our satisfaction, is dealing with Privacy Act concerns. All kinds of things come into play when a government agency starts aggregating information on people, even when many of them work for us. In that respect, doing this under the auspices of DataONE would be a little smoother. :-)

The blog post doesn't talk about IDs specifically, but we've ended up storing any number of IDs with person records in ScienceBase: Library of Congress name authority IDs, local data system IDs, and a variety of others. We do as much automated disambiguation as we can every time we encounter a new contact, and we are still developing the data librarian practice to fill in the human bits.
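
To make the idea concrete, here is a rough Python sketch of a person record that carries several identifiers and is matched against known records on any shared one. This is not how ScienceBase is actually implemented; the Person class, the find_match helper, the scheme labels, the example.gov address, and the placeholder name authority ID are all made up for illustration. The mbox_sha1sum helper follows the FOAF definition Curt cites in the quoted message below (SHA-1 of the "mailto:" URI).

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def mbox_sha1sum(email: str) -> str:
    """SHA-1 hex digest of the 'mailto:' URI, per the FOAF definition."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


@dataclass
class Person:
    """Hypothetical person record holding any number of identifiers."""
    name: str
    # Maps a scheme label (made up here) to the identifier value.
    ids: Dict[str, str] = field(default_factory=dict)


def find_match(candidate: Person, known: List[Person]) -> Optional[Person]:
    """Return the first known record that shares any (scheme, value) pair."""
    for person in known:
        if any(person.ids.get(scheme) == value
               for scheme, value in candidate.ids.items()):
            return person
    return None  # no automated match; left for a human to review


# Placeholder values only; none of these identify a real person.
known = [
    Person("John Doe", {
        "lc_name_authority": "n00000000",
        "mbox_sha1sum": mbox_sha1sum("jdoe@example.gov"),
    }),
]
incoming = Person("J. Doe", {"mbox_sha1sum": mbox_sha1sum("jdoe@example.gov")})
match = find_match(incoming, known)
```

Matching on identifiers rather than on the name string is what lets "J. Doe" resolve to the existing "John Doe" record here. A record with no identifier in common falls through to None, which is where the human review comes in.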

Cheers.

<.(((<<<~~~~<.(((<<<~~~~<.(((<<<
     Sky Bristol
     USGS Core Science Systems
     sbristol at usgs.gov
     Cell: 303-241-4122
<.(((<<<~~~~<.(((<<<~~~~<.(((<<<

On Mar 7, 2012, at 12:37 PM, Lynnes, Christopher S. (GSFC-6102) wrote:

> This topic came up at the last ESIP Semantic Web telecon, where URIs are needed to identify people for some of our linked data efforts.  I thought either Erin, Tom Narock, or Eric Rozell had done some thinking on how to do this, at least for ESIP members...
> 
> On Mar 7, 2012, at 12:51 PM, Robert R. Downs wrote:
> 
>> Curt  -
>> 
>> You might already be aware of the activities of ORCID 
>> http://about.orcid.org/ and its collaborators to address these issues.
>> 
>> Thanks,
>> 
>> Bob Downs
>> 
>> On 3/7/2012 11:59 AM, Curt Tilmes wrote:
>>> 
>>> A bunch of other groups have assigned various sets of identifiers for
>>> most of the other things I'm looking at (Thank you GCMD keywords:
>>> http://gcmd.nasa.gov/Resources/valids/archives/keyword_list.html)
>>> 
>>> Most of the databases I see seem to ignore the need to unambiguously
>>> identify people.
>>> 
>>> Most databases simply fall back on a plain text literal and identify
>>> an author as "John Doe" (or even "J. Doe").
>>> 
>>> I want to indicate that "John Doe" the P.I. for an instrument is the
>>> same "John Doe" who authored some paper.  I need a clear, unambiguous
>>> identifier for that person.
>>> 
>>> I could simply assign an integer as I insert into my database (I know,
>>> I know -- I am not a number, I am a free man!).  Another thought is
>>> UUID, even though they are big and ugly and make even bigger and
>>> uglier URIs.
>>> 
>>> foaf:mbox_sha1sum [1] has a certain appeal since independent databases
>>> have a prayer of independently assigning the same identifier to the
>>> same person, but even that relies on jdoe at nasa.gov keeping the same
>>> mbox_sha1sum associated with himself when he becomes johnd at noaa.gov.
>>> 
>>> Keeping the name itself in the URI is nice since you can look at it
>>> and know who it is talking about (try that with an embedded UUID), but
>>> what do you do when the 2nd (and 3rd) John Doe shows up?  Or if he
>>> becomes Jane Doe?
>>> 
>>> Other thoughts?
>>> 
>>> Curt
>>> 
>>> [1] http://xmlns.com/foaf/spec/#term_mbox_sha1sum
>>> 
>> 
> 
> --
> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
> 
> 


