[Esip-preserve] A Question About Custodianship

Ruth Duerr rduerr at nsidc.org
Mon May 20 16:36:12 EDT 2013


Hi Bruce,

Well… first I should note, regarding your first and third items below, that the Data Conservancy assigns a unique URL to the event stream of every item in the archive.  That stream is updated whenever anything happens to the item (e.g., creation of new representations due to format preservation issues).  The whole point of that is to provide the sort of audit trail you are talking about.  They also have the ability to import the more usual kinds of production provenance information, such as that represented in PROV.  So some group is at least thinking about these issues….
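
To make the idea of a per-item event stream concrete, here is a minimal sketch in Python.  It is not the Data Conservancy's actual API or data model, which I am not reproducing here; the class names, the field names (including `authorized_by`, which is meant to capture the custodianship question of who approved an action), and the `archive.example.org` URL pattern are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One immutable entry in an item's audit trail (hypothetical schema)."""
    event_type: str               # e.g. "ingest", "backup", "format_migration"
    agent: str                    # who or what performed the action
    authorized_by: Optional[str]  # who approved it, if anyone was recorded
    timestamp: str                # UTC time in ISO 8601 form
    detail: str = ""


@dataclass
class EventStream:
    """Append-only event history for a single archived item."""
    item_id: str
    base_url: str = "https://archive.example.org"  # placeholder, not a real service
    _events: List[Event] = field(default_factory=list)

    @property
    def url(self) -> str:
        # One stable URL per item's event stream (assumed URL layout).
        return f"{self.base_url}/items/{self.item_id}/events"

    def record(self, event_type: str, agent: str,
               authorized_by: Optional[str] = None, detail: str = "") -> Event:
        # Events are only ever appended; existing entries are never edited.
        event = Event(event_type, agent, authorized_by,
                      datetime.now(timezone.utc).isoformat(), detail)
        self._events.append(event)
        return event

    def history(self) -> Tuple[Event, ...]:
        # Return a read-only snapshot so callers cannot alter the trail.
        return tuple(self._events)


if __name__ == "__main__":
    stream = EventStream("granule-0001")
    stream.record("ingest", agent="archivist-a", authorized_by="manager-b")
    stream.record("format_migration", agent="migration-service",
                  detail="new representation created")
    print(stream.url)
    for event in stream.history():
        print(event.timestamp, event.event_type, event.agent, event.authorized_by)
```

The append-only design is the point of the sketch: an audit trail is only useful if past entries cannot be quietly rewritten, so the stream exposes `record` and a read-only `history` and nothing else.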

As for the Data Management Training modules, the point to remember is that the first batch was aimed at scientists (i.e., data users), not data managers.  We only got about half of those published.  We haven't even started on the data manager training…  So, no, no modules on those topics yet.

Ruth


On May 19, 2013, at 10:10 AM, Bruce Barkstrom <brbarkstrom at gmail.com> wrote:

> As a result of some programming activities I've been doing, the following
> questions seem relevant to some of our thinking about preservation and
> stewardship:
> 
> 1.  While provenance has been discussed at some length in our
> conversations, that discussion (including the recent W3C recommendations)
> seems mainly focused on retaining the history of production.  I don't
> think we've had much discussion about record keeping for custodianship.
> By this I mean the records associated with who authorized activities in
> a data archive or repository, such as ingesting the data, or creating
> backups, or revising metadata.  This includes chains of authorization
> and what auditors and lawyers might call chains of evidence (there
> is some discussion of these issues that's readily available by typing
> these terms into any convenient search engine).
> 
> 2.  The custodianship issues also include items often discussed under
> security and privacy, including keeping records on user access and
> providing various degrees of user access.  The issue might be similar
> to the ones that go with "who gets a library card," although we often
> think of electronic access permissions as being different from getting
> a library card.  It's also roughly equivalent to libraries with open stacks
> as opposed to libraries with closed stacks.
> 
> 3.  So, the generic question might be "what records does an archive
> need to keep about events that occur after data has been ingested?"
> A variant on this is "what records should an archive keep in order to
> conduct audits of accesses, support forensic analysis after
> a security incident, or understand user patterns of navigation
> and data ordering?"
> 
> 4.  How do organizational cultures interact with policies on record keeping?
> One might think of the CIO culture and its rules regarding privacy
> and stringent access controls, versus the researcher culture with
> the emphasis on open access - often assuming no user registration
> whatever.  A third culture is the one associated with commercial
> entities that engage in marketing and restricted access.
> 
> My initial impression of the W3C recommendations on provenance 
> is that these custodianship issues aren't really included in those 
> documents.
> 
> Also, I am curious if the Data Management Training modules
> have included any discussions about these custodianship
> issues.
> 
> Bruce B.


