[Esip-preserve] Another Slant on Identifiers

Nancy Hoebelheinrich njhoebel at gmail.com
Tue Apr 12 16:54:06 EDT 2011


[snip]
I'm working up an example of the double-entry approach as part of what I'm
writing in the use case documentation.  I'll also note that data producers
have to deal with unique identifiers before librarians - and they will have
identifiers that deal with versions.  I'm inclined pretty strongly in
thinking that librarians don't have a lot of experience with the versioning
problem in large-scale Earth science data production, although perhaps that
reflects my personal biases.  Let's see if we can lay out a scenario and
then talk about what it reveals.  [end snip]

I don't see a lot of utility in making a distinction between data producers
and data content managers (or librarians, in your terms) when discussing the
use of identifiers to assist in inventory control.  The problems are still
the same: what comes out (of production) or goes in (to storage for later
retrieval), what the data unit in question is related to, if anything, how
that relationship can be maintained, how the data unit can be made
available for retrieval without losing it, etc.  The problem of scale is an
interesting one, but reasonable solution(s) to the problem should probably
be applicable to most situations.  As for whether librarians have
experience with versioning on a large scale, that may or may not be true
(and may be, although have you ever dealt with keeping track of errata for
published and unpublished legal opinions?  Whew!).  Still, I daresay those
with open minds can learn something from anyone who's been involved in the
data ==> information management business.  So yes, laying out scenarios as
the group has been doing should help with problem identification, problem
rationalization, and, hopefully, reasonable problem solving.
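
As one possible starting scenario, here is a minimal Python sketch of the
ledger-and-journal idea from the quoted messages below.  Everything in it is
an assumption made for illustration: the names Move and Inventory, the
"RECEIVING" pseudo-location, and the "archive" prefix are hypothetical, and
pairing a source with a destination on each move is only a simplified
analogue of double-entry bookkeeping, not a full accounting system.  It
assigns a prefixed UUID as the accession number on receipt, journals every
move, and keeps one ledger (location history) per identifier.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """One journal entry: an item leaves one location and enters another."""
    item_id: str
    source: str       # location the item leaves
    destination: str  # location the item arrives at


class Inventory:
    """Hypothetical serialized inventory: a journal plus one ledger per item."""

    def __init__(self):
        self.journal = []   # chronological list of Move entries
        self.ledgers = {}   # item_id -> list of locations, oldest first

    def accession(self, location, prefix="archive"):
        """Assign a new identifier on receipt and record the first location."""
        item_id = f"{prefix}:{uuid.uuid4()}"
        self.post(Move(item_id, "RECEIVING", location))
        return item_id

    def post(self, move):
        """Check the move against the ledger, then journal it and update the ledger."""
        history = self.ledgers.get(move.item_id, [])
        current = history[-1] if history else "RECEIVING"
        if move.source != current:
            raise ValueError(f"{move.item_id} is at {current}, not {move.source}")
        self.journal.append(move)
        self.ledgers.setdefault(move.item_id, []).append(move.destination)

    def location(self, item_id):
        """Return the current location of an item."""
        return self.ledgers[item_id][-1]

    def audit(self):
        """Replay the journal and confirm it reproduces every ledger."""
        replayed = {}
        for move in self.journal:
            replayed.setdefault(move.item_id, []).append(move.destination)
        return replayed == self.ledgers


inv = Inventory()
negative = inv.accession("rack 15, shelf 25")
inv.post(Move(negative, "rack 15, shelf 25", "rack 25, shelf 34"))
print(inv.location(negative))   # current location
print(inv.ledgers[negative])    # previous and current locations, oldest first
print(inv.audit())              # True if journal and ledgers agree
```

A move that names the wrong source location is rejected before it reaches
the journal, which is the kind of consistency check the bookkeeping analogy
suggests.  Versions and provenance are deliberately left out of this sketch.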

Nancy

On Tue, Apr 12, 2011 at 11:52 AM, Bruce Barkstrom <brbarkstrom at gmail.com> wrote:

> Exactly.
>
> I was going back to the experience Alice and I had back in graduate school
> when we maintained a double entry bookkeeping system by hand, using an
> abacus for math.  After five or six years of daily use, you can get
> reasonably proficient with an abacus.  The core of the experience, though,
> is the use of data structures that ensure consistency and help track the
> flow of items through accounting transactions.
>
> I think the basic approach involves creating a separate account (or ledger)
> for each item with an identifier.  I did take a look on the Web to see
> whether this approach seemed sensible.  So, I checked on inventory control
> software, particularly to see about items that need serial numbers
> (accession numbers, if you will).  Based on the items that Google responds
> with, this problem is pretty well within standard accounting approaches and
> is built into most accounting software that deals with manufacturing or
> other large ticket items, like cars, airplanes, or RVs.  It looks like the
> accountants would call it serialized inventory control.
>
> I think using the double entry approach would also let us tie our use of
> the term "provenance" to the more familiar use of that term for art
> objects.  If you recall, I had taken a look at Web sources for this
> information and found that in the art world, provenance seemed to be
> related to who had owned items.  In other words, a good inventory control
> system that used serialized inventory items would allow an Archive to say
> which items had been loaned out, which had been copied, where in the
> Archive they were stored, and so on.  Such a system should also provide
> useful help in auditing the state of the inventory.  [Alice would probably
> note that one still might need to find out who had locked books in his or
> her carrel - or had slightly "rearranged" the order in which books were
> placed on shelves in order to keep "their" collection accessible just to
> them.]
>
> There is probably a more esoteric use for this in a production environment,
> which is calculating the increase in value of items created by running old
> data through software.  I don't know that this problem is particularly
> difficult from an algorithmic perspective, although getting reliable
> estimates of the proper value increase (or some numerical range) may be a
> rather interesting exercise in getting accountants to reach consensus.  I
> think this application might be an exercise in cost accounting (noting that
> the US Standard General Ledger lurks just off stage).
>
> I'm working up an example of the double-entry approach as part of what I'm
> writing in the use case documentation.  I'll also note that data producers
> have to deal with unique identifiers before librarians - and they will have
> identifiers that deal with versions.  I'm inclined pretty strongly in
> thinking that librarians don't have a lot of experience with the versioning
> problem in large-scale Earth science data production, although perhaps that
> reflects my personal biases.  Let's see if we can lay out a scenario and
> then talk about what it reveals.
>
> Thanks for noting the ties to library experience.
>
> Bruce B.
>
> On Tue, Apr 12, 2011 at 11:49 AM, Nancy Hoebelheinrich
> <njhoebel at gmail.com> wrote:
> > Hey, Bruce:
> > In my experience, content management systems of various sorts use such
> > identifiers upon receipt of whatever needs to be tracked, sometimes
> > called "accession" numbers.  The numbers could be used within a given
> > system and, depending upon how they were prefixed, in outside systems as
> > well.  In the digital environment, again in my experience, this is an
> > ideal use for UUIDs, with other identifiers serving other purposes that
> > we've talked about more, i.e., citation.  One of the things I hope we can
> > think about for the metadata testbed project is how well OIDs and other,
> > less commonly known identifiers could be used for this kind of purpose as
> > well as for citation purposes.  We'll have to set aside the question of
> > how a content / digital asset / inventory management system would use
> > such identifiers to track versions and important characteristics such as
> > provenance, as you mention, since there are different solutions out
> > there, but it would be useful to at least look at the identifier schemes
> > in terms of how they might facilitate such tracking.
> > Nancy
> >
> > On Tue, Apr 12, 2011 at 6:24 AM, Bruce Barkstrom <brbarkstrom at gmail.com>
> > wrote:
> >>
> >> As I have been going through my use case documentation for the glacier
> >> photo collection, it occurred to me that identifiers of unique objects
> >> are useful not only for finding objects "outside" a particular archive;
> >> they are also useful for inventory control within an archive.
> >>
> >> This is perhaps easier to see for physical objects, like photographic
> >> negatives or photographic prints.  If you identify a negative, say, with
> >> a particular ID, then you could create a "ledger" (or inventory record)
> >> that contains the current location of the object ("look at shelf 34 in
> >> rack 25") and the previous locations ("used to be on shelf 25 in rack
> >> 15").  In an old-fashioned accounting system, the transactions that
> >> change the state of an inventory would appear in an accounting "journal"
> >> that the accountants would periodically use to enter transactions to
> >> the ledger accounts.
> >>
> >> We haven't had any discussion I can recall about how the identifiers
> >> would be used in such a system.  I suspect identifiers would be
> >> even more important in this context.  Such an inventory system
> >> would be a natural place to deal with provenance tracing (noting
> >> that the ledger accounts could include IPR).  They would allow
> >> systematic auditing of an archive's inventory.  They might even solve
> >> (or partially solve) the "orphan file problem" that we talked about
> >> earlier.
> >>
> >> Any comments?
> >>
> >> Bruce B.
> >> _______________________________________________
> >> Esip-preserve mailing list
> >> Esip-preserve at lists.esipfed.org
> >> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
> >
> >
> >
> > --
> > Nancy Hoebelheinrich
> > Information Analyst
> > San Mateo, CA  94401
> > njhoebel at gmail.com
> > (m) 650-302-4493
> > (f) 650-745-3333
> >
>


