[Esip-preserve] [esip-semanticweb] Identifiers for people?

Nancy Hoebelheinrich njhoebel at gmail.com
Thu Mar 8 09:16:54 EST 2012

Adding to what Matt has suggested here, the Library of Congress does make
its Name Authority File (NAF) available for semantic web use at this
website: http://id.loc.gov/authorities/names.html . There are about 8
million names of people and organizations in that database, gathered over
many decades, mostly from published books, of course, but from other kinds
of published materials as well. Not many libraries have been able to afford
cataloging of journals, so there would certainly be fewer journal authors,
but it still might be a very rich resource to use, not only for materials
published in English but for other languages too. LC has made the NAF
URIs available in a number of forms that would work for the Semantic Web,
including RDF/XML, N-Triples, and JSON.
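
To make the point about serializations concrete, here is a minimal sketch of
how one might build format-specific URLs for a single NAF record. The URL
suffixes (.rdf, .nt, .json), the helper names naf_url and fetch_naf, and the
identifier "n00000000" are assumptions for illustration only; check them
against the id.loc.gov documentation before relying on them.

```python
"""Sketch: build format-specific URLs for an id.loc.gov name authority record."""
import urllib.request

BASE = "http://id.loc.gov/authorities/names"

# Assumed mapping from serialization name to URL suffix (not confirmed
# in the original message; verify against the id.loc.gov documentation).
SUFFIXES = {
    "rdfxml": ".rdf",
    "ntriples": ".nt",
    "json": ".json",
}


def naf_url(identifier: str, fmt: str = "json") -> str:
    """Return the URL for one name authority record in the given format."""
    # Identifiers are sometimes written with an internal space; drop it.
    ident = identifier.replace(" ", "")
    if not ident:
        raise ValueError("empty identifier")
    if fmt not in SUFFIXES:
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{BASE}/{ident}{SUFFIXES[fmt]}"


def fetch_naf(identifier: str, fmt: str = "json") -> bytes:
    """Download the record. Needs network access; not called in this sketch."""
    with urllib.request.urlopen(naf_url(identifier, fmt)) as response:
        return response.read()


if __name__ == "__main__":
    # "n00000000" is a hypothetical placeholder, not a real record.
    for fmt in SUFFIXES:
        print(naf_url("n00000000", fmt))
```

Keeping the URL construction separate from the download means the same
identifier can be stored once and dereferenced in whichever form a given
tool prefers.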

Nancy Hoebelheinrich
Information Analyst
njhoebel at gmail.com
nhoebel at kmotifs.com

-----Original Message-----
From: esip-preserve-bounces at lists.esipfed.org
[mailto:esip-preserve-bounces at lists.esipfed.org] On Behalf Of Matt Mayernik
Sent: Wednesday, March 07, 2012 2:10 PM
To: esip-preserve at lists.esipfed.org
Subject: Re: [Esip-preserve] [esip-semanticweb] Identifiers for people?

In the library world, this issue is called "authority control". Sky mentioned
the Library of Congress name authority IDs; that is the standard tool used
by library catalogers (at least in the US) to do name disambiguation. The
LoC keeps authority files for authors, which are exactly what Bruce
described: a list of possible names for the same person, with one preferred
name provided. There is also an international initiative along these lines
to create a Virtual International Authority File (VIAF), http://viaf.org/.
VIAF is available as RDF, and thus linked-data friendly, etc. The LoC
authority files might be as well, I'm not sure.
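
As a rough illustration of what such an authority record amounts to, here is
a minimal sketch: one identifier, one preferred name, and any number of
variant forms, with a lookup that resolves any form back to the record. The
class and function names, the normalization rule, and the sample record
("ex:1", "Doe, Jane") are hypothetical and not taken from the LoC or VIAF
data models.

```python
"""Sketch: an authority record with one preferred name and variant forms."""
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AuthorityRecord:
    identifier: str                      # e.g. an authority ID (hypothetical here)
    preferred_name: str                  # the single preferred form
    variant_names: List[str] = field(default_factory=list)

    def all_names(self) -> List[str]:
        return [self.preferred_name, *self.variant_names]


def normalize(name: str) -> str:
    """Lowercase, drop commas and periods, collapse whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def build_index(records: List[AuthorityRecord]) -> Dict[str, List[AuthorityRecord]]:
    """Map every normalized name form to the records that carry it."""
    index: Dict[str, List[AuthorityRecord]] = {}
    for record in records:
        for name in record.all_names():
            bucket = index.setdefault(normalize(name), [])
            if record not in bucket:     # avoid duplicates when forms collide
                bucket.append(record)
    return index


def resolve(name: str, index: Dict[str, List[AuthorityRecord]]) -> List[AuthorityRecord]:
    """Return candidate records for a name; more than one means ambiguity."""
    return index.get(normalize(name), [])


if __name__ == "__main__":
    records = [AuthorityRecord("ex:1", "Doe, Jane", ["Doe, J.", "Jane Doe"])]
    index = build_index(records)
    for match in resolve("jane doe", index):
        print(match.identifier, match.preferred_name)
```

The lookup returns a list rather than a single record because two people can
share a name form; deciding between the candidates is the part that still
needs a human or additional evidence.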

For this debate though, the big issue with both of these is that they are
based on book authors. So journal authors who haven't written books wouldn't
be in there, and of course the same for the K-12 educators, community
college teachers, etc. that Christopher mentioned.

I guess the takeaway is that these library-based "people" vocabs should
definitely be part of the discussion, but as another player in the game;
they aren't comprehensive in and of themselves.


Matthew Mayernik
Research Data Services Specialist
NCAR Library
National Center for Atmospheric Research (NCAR) Boulder, CO, 80307-3000

On 3/7/2012 2:25 PM, Bruce Barkstrom wrote:
> Find somebody who has a copy of AACR2 and see what librarians have 
> recommended before assuming that the whole world didn't exist before 
> the Web.
> Bruce B.
> On Wed, Mar 7, 2012 at 4:23 PM, Curt Tilmes <Curt.Tilmes at nasa.gov> wrote:
>> On 03/07/2012 01:56 PM, Sky Bristol wrote:
>>> USGS has been working on this issue as well for some time. We've 
>>> done it as part of what we've used "master data management" for in a 
>>> project called ScienceBase. I just cross-posted a blog I'd written a 
>>> while back to a public location that provides an overview on this 
>>> work.
>>> https://my.usgs.gov/confluence/x/dQCfCQ
>> Very interesting -- thanks so much for posting that.  A lot of 
>> interesting work going on there.  (BTW -- Tom Armstrong says Hi!)
>>> One of the biggest challenges we are having in a government world 
>>> that we still haven't resolved to satisfaction is dealing with 
>>> Privacy Act concerns. There's all kinds of things that come into 
>>> play when a government agency starts aggregating information on 
>>> people, even when many of them work for us. In that respect, doing 
>>> this under the auspices of DataONE would be a little smoother. :-)
>> Hmm.. I'll probably have to look into that some more..
>>> The blog post doesn't talk about IDs specifically, but we've ended 
>>> up storing any number of IDs with people records in ScienceBase. 
>>> We've got Library of Congress name authority IDs, local data system 
>>> IDs, and a variety of others. We do as much automated disambiguation 
>>> every time we encounter a new contact as we can and are still 
>>> working on developing the data librarian practice to fill in the 
>>> human bits.
>> So anyway, Peter Schoephoester seems like a really popular person at 
>> USGS, (associated with 346,675 items, wow! busy guy!)
>> His "Person Profile Page" in ScienceBase refers to him simply by 
>> number (his "Party Id"):
>>   http://my.usgs.gov/catalog/catalogParty/show/15322
>> so it looks like you simply mint a new Party Id for each 
>> disambiguated person/organization you add to the system, then just 
>> always use id 15322 whenever you want to refer to that person?
>> I'll definitely be taking a closer look at ScienceBase..
>> --
>> Curt Tilmes, Ph.D.
>> U.S. Global Change Research Program
>> 1717 Pennsylvania Avenue NW, Suite 250 Washington, D.C. 20006, USA
>> +1 202-419-3479 (office)
>> +1 443-987-6228 (cell)
>> http://globalchange.gov
>> _______________________________________________
>> Esip-preserve mailing list
>> Esip-preserve at lists.esipfed.org
>> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve