[Esip-preserve] [esip-semanticweb] Identifiers for people?

Bruce Barkstrom brbarkstrom at gmail.com
Wed Mar 7 22:18:50 EST 2012

I wonder if we aren't trying to do something that already existed a
decade ago.  When I was head of the LaRC DAAC, we talked with
the LaRC technical library staff, and one of them had put together
an interface that listed authors of journal articles and references to
articles and, if you were interested, would produce a link to our
ordering interface without human intervention.  The expensive part
of what we'd have had to do was to pay for an abstracting service
that would extract existing abstracts from articles ($25k at that point
in time).  As I recall, it took this library staff member less than three
days to create a serviceable web interface that landed directly on
the appropriate point of our ordering interface.

My impression is that this problem was solved by inexpensive
tools the librarians already knew about and could adapt immediately.
Why are we trying to reinvent square wheels when there's probably
an inexpensive circular wheel already available - and one that even
a decade ago was pretty serviceable?  [I don't know the tool or even
the name of the librarian - but I don't think all of the references to
researchers indicate a very serious attempt at due diligence on
what might already be available.]

Bruce B.

On Wed, Mar 7, 2012 at 5:09 PM, Matt Mayernik <mayernik at ucar.edu> wrote:
> In the library world, this issue is called "authority control". Sky mentioned
> the Library of Congress name authority IDs; that is the standard tool used
> by library catalogers (at least in the US) to do name disambiguation. The
> LoC keeps authority files for authors, which are exactly what Bruce
> described: a list of possible names for the same person, with one preferred
> name provided. There is also an international initiative along these lines
> to create a Virtual International Authority File (VIAF), http://viaf.org/.
> VIAF is available as RDF, and thus linked-data friendly, etc. The LoC
> authority files might be as well, I'm not sure.
> For this debate though, the big issue with both of these is that they are
> based on book authors. So journal authors who haven't written books wouldn't
> be in there, and of course the same for the K-12 educators, community
> college teachers, etc. that Christopher mentioned.
> I guess the takeaway is that these library-based "people" vocabs should
> definitely be part of the discussion, but as another player in the game,
> they aren't comprehensive in and of themselves.
> Matt
> ---
> Matthew Mayernik
> Research Data Services Specialist
> NCAR Library
> National Center for Atmospheric Research (NCAR)
> Boulder, CO, 80307-3000
> On 3/7/2012 2:25 PM, Bruce Barkstrom wrote:
>> Find somebody who has a copy of AACR2 and see
>> what librarians have recommended before assuming
>> that the whole world didn't exist before the Web.
>> Bruce B.
>> On Wed, Mar 7, 2012 at 4:23 PM, Curt Tilmes <Curt.Tilmes at nasa.gov> wrote:
>>> On 03/07/2012 01:56 PM, Sky Bristol wrote:
>>>> USGS has been working on this issue as well for some time. We've
>>>> done it as part of what we've called "master data management" in a
>>>> project called ScienceBase. I just cross-posted a blog I'd written a
>>>> while back to a public location that provides an overview on this
>>>> work.
>>>> https://my.usgs.gov/confluence/x/dQCfCQ
>>> Very interesting -- thanks so much for posting that.  A lot of
>>> interesting work going on there.  (BTW -- Tom Armstrong says Hi!)
>>>> One of the biggest challenges we are having in a government world
>>>> that we still haven't resolved to satisfaction is dealing with
>>>> Privacy Act concerns. There are all kinds of things that come into
>>>> play when a government agency starts aggregating information on
>>>> people, even when many of them work for us. In that respect, doing
>>>> this under the auspices of DataONE would be a little smoother. :-)
>>> Hmm.. I'll probably have to look into that some more..
>>>> The blog post doesn't talk about IDs specifically, but we've ended
>>>> up storing any number of IDs with people records in
>>>> ScienceBase. We've got Library of Congress name authority IDs, local
>>>> data system IDs, and a variety of others. We do as much automated
>>>> disambiguation every time we encounter a new contact as we can and
>>>> are still working on developing the data librarian practice to fill
>>>> in the human bits.
>>> So anyway, Peter Schoephoester seems like a really popular person at
>>> USGS (associated with 346,675 items, wow! busy guy!)
>>> His "Person Profile Page" in ScienceBase refers to him simply by
>>> number (his "Party Id"):
>>>  http://my.usgs.gov/catalog/catalogParty/show/15322
>>> so it looks like you simply mint a new Party Id for each disambiguated
>>> person/organization you add to the system, then just always use id
>>> 15322 whenever you want to refer to that person?
>>> I'll definitely be taking a closer look at ScienceBase..
>>> --
>>> Curt Tilmes, Ph.D.
>>> U.S. Global Change Research Program
>>> 1717 Pennsylvania Avenue NW, Suite 250
>>> Washington, D.C. 20006, USA
>>> +1 202-419-3479 (office)
>>> +1 443-987-6228 (cell)
>>> http://globalchange.gov
>>> _______________________________________________
>>> Esip-preserve mailing list
>>> Esip-preserve at lists.esipfed.org
>>> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
