[Esip-preserve] a new relation type for subset citations

Ruth Duerr ruth.duerr3 at gmail.com
Fri Jul 17 01:08:37 EDT 2015


Hi Jeff,

I can tell you that very long discussions of the "what does the subset cover" issue were held in many places.  The general agreement was that the purpose of a subset specifier was not to replace the methods portion of a paper, and that the very best a repository could do was to allow explicit identification of what subset the user actually obtained.  What they did with that data afterward was theirs to describe in their paper, so this was definitely not meant to address workflow.

I am sure that he would be happy to talk to you about this (actually he has been trying pretty hard to get folks to comment and he has quite a few pilot implementations at this point).  I took the liberty of adding him to this conversation.

Ruth


On Jul 16, 2015, at 4:48 PM, Fontaine, Kathy via Esip-preserve <esip-preserve at lists.esipfed.org> wrote:

> Hi all - one more thing....
> 
> Please note that the RDA outputs are designed to solve one particular problem that was identified and scoped by the Working Group.  These should not ever be viewed as _the_ universal answer to all related problems.
> 
> With that in mind, if the issue you are describing, Jeff, is not addressed in the initial conditions, that's why you see what you see.  The comment, then, might be addressed in future or follow-on work.
> 
> I don't know, but just wanted to put that caveat out there.
> 
> Thanks
> 
> K
> 
> 
> 
> ________________________________
> Dr. Kathleen Fontaine
> Managing Director, Research Data Alliance/US (RDA/US)
> 
> Amos Eaton Building, Room 211
> Rensselaer Polytechnic Institute (RPI)
> 110 8th Street
> Troy, NY 12180-3590
> 
> Cell:  410-991-6728
> Office:  518-276-2829
> 
> Email:  fontak at rpi.edu
> Skype:  ksfontaine
> 
> ________________________________________
> From: Fox, Peter
> Sent: Thursday, July 16, 2015 7:30 PM
> To: Jeff de La Beaujardiere - NOAA
> Cc: Greg Janée; ESIP Preserve List; Fontaine, Kathy
> Subject: Re: [Esip-preserve] a new relation type for subset citations
> 
> Jeff - go to https://rd-alliance.org/groups/data-citation-wg.html (you need to register/log in) and you can add comments (cc: Kathy F, who is here at ESIP, for more info on commenting).
> ---Peter.
> 
>> On 16 Jul 2015, at 19:25 , Jeff de La Beaujardiere - NOAA via Esip-preserve <esip-preserve at lists.esipfed.org> wrote:
>> 
>> I strongly disagree with the RDA recommendation to include subset specifiers in citations and to require the provider to record them permanently. Besides the huge burden on the providers, subsetting is only one part of the workflow. Scientific papers need to describe the work they did including subsetting, mathematical operations, assumptions, et cetera, so merely capturing the subset information is nearly worthless. If the workflow is to be captured in a machine-readable fashion, then a service-neutral language such as (but not necessarily) the OGC Web Coverage Processing Service grammar should be referenced in the paper using a URL maintained by the author or publisher of the paper.
>> 
>> I would like to register this objection with RDA but am not sure where to do so. I have CCed Mark Parsons as a start.
>> 
>> Regards,
>> Jeff DLB
>> 
>> 
>> Jeff de La Beaujardiere, PhD
>> NOAA Data Management Architect
>> 1335 East-West Hwy, Silver Spring MD 20910 USA
>> +1 301 713 7175 (NESDIS/ACIO-S - SSMC1/5236)
>> ORCID: http://orcid.org/0000-0002-1001-9210
>> 
>> On Thu, Jul 16, 2015 at 9:54 AM, Greg Janée <esip-preserve at lists.esipfed.org> wrote:
>> The RDA data citation working group has recommended that subsets of datasets (more broadly, queries against datasets) be persistently identified upon request; cf. https://www.rd-alliance.org/group/data-citation-wg/wiki/wgdc-recommendations.html.
>> 
>> For this to work, queries have to be stored *somewhere*.  One approach (this appears to be the RDA group's working assumption) is for the provider to take on the burden of permanently storing queries, and from there it can issue PIDs for those queries by whatever means it has available.  Another approach is for the provider to support a query API of some kind, e.g., query URLs (think http://dataset?query=this+that+and+the+other).  This may result in lengthy URLs, but an external identifier system can be used to assign short, opaque PIDs that redirect to those query URLs.
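A minimal sketch of the second approach (query URLs behind short, opaque PIDs), under stated assumptions: the base URL, the parameter names, the `pid:` prefix, and the in-memory registry are all hypothetical stand-ins, and a real external identifier system would mint and resolve identifiers by its own means.

```python
import hashlib
from urllib.parse import urlencode

# Hypothetical base URL for a dataset's query endpoint (not from the thread).
DATASET_URL = "http://example.org/dataset"

# Stand-in for an external identifier system: maps short PIDs to query URLs.
pid_registry = {}


def build_query_url(params):
    """Serialize parameters in sorted order so equal queries yield equal URLs."""
    return DATASET_URL + "?" + urlencode(sorted(params.items()))


def mint_pid(query_url):
    """Assign a short, opaque PID that points at the (possibly long) query URL."""
    digest = hashlib.sha256(query_url.encode("utf-8")).hexdigest()[:10]
    pid = "pid:" + digest
    pid_registry[pid] = query_url
    return pid


def resolve(pid):
    """Return the query URL that a PID redirects to."""
    return pid_registry[pid]


# Usage: the same query, however its parameters are ordered, gets the same PID.
url = build_query_url({"variable": "sst", "start": "2015-01-01", "end": "2015-06-30"})
pid = mint_pid(url)
```

Sorting the parameters before encoding is one possible design choice: it keeps the PID stable when a user supplies the same query in a different order, but it does not address whether the data behind the query changes over time.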
>> 
>> Regardless of the approach taken, the net result is multiple, related PIDs: one PID for the dataset as a whole, and then multiple PIDs, one per stored query.  It would be beneficial to record the relationship between these identifiers, particularly in the case when an external identifier system is being used.  The DataCite metadata schema (http://schema.datacite.org) lists a number of possibilities, but none quite fit:
>> 
>> - IsPartOf/HasPart: "A IsPartOf B" implies that B can be broken down into some disjoint pieces, and A is one of those pieces.  But a cited subset is not a disjoint part of a whole.
>> 
>> - IsCitedBy/Cites: "A IsCitedBy B" implies that B mentions A in some way, and is possibly intellectually derived from A, but not necessarily.  This is intentionally a pretty vague relationship (as vague as publication citations, right?), whereas a cited subset has a very specific relationship to the whole.
>> 
>> - IsReferencedBy/References: same thing.
>> 
>> - IsMemberOf/HasMember (a newly proposed relation): "A IsMemberOf B" implies that B has some rules or standards for inclusion, and A satisfies those rules.  Doesn't seem applicable in this case.
>> 
>> So I'm wondering if we need a new relation, which I'll provisionally call "IsCitedSubsetOf".  "A IsCitedSubsetOf B" would mean that a reference to A is really a reference to B, but only a subset of B was actually used.  Note that I didn't call it IsSubsetOf, for that would bring up the same issues that IsPartOf has.  It seems important to record that the purpose of these query PIDs is for citation and nothing else.
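As a rough illustration of how such a relation might be recorded, here is a sketch using DataCite-style `relatedIdentifiers` entries. It assumes the field names `relatedIdentifier`, `relatedIdentifierType`, and `relationType` from the DataCite metadata schema; the relation value "IsCitedSubsetOf" is only the provisional proposal above and is not part of that schema, and the DOIs are made-up placeholders.

```python
# "IsCitedSubsetOf" is the provisional relation proposed in the message above;
# it is NOT an existing DataCite relationType.
PROPOSED_RELATION = "IsCitedSubsetOf"


def relate_subset(subset_pid, dataset_pid, id_type="DOI"):
    """Build a record linking a query (subset) PID to the PID of the whole dataset."""
    return {
        "identifier": subset_pid,
        "relatedIdentifiers": [
            {
                "relatedIdentifier": dataset_pid,
                "relatedIdentifierType": id_type,
                "relationType": PROPOSED_RELATION,
            }
        ],
    }


def cited_dataset(record):
    """Return the dataset a cited subset refers to, or None if no such link exists."""
    for rel in record["relatedIdentifiers"]:
        if rel["relationType"] == PROPOSED_RELATION:
            return rel["relatedIdentifier"]
    return None


# Hypothetical identifiers, for illustration only.
record = relate_subset("10.1234/subset-q1", "10.1234/dataset")
```

With a record like this, a citation of the subset PID can be traced back to the whole dataset, which is the "a reference to A is really a reference to B" reading described above.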
>> 
>> Thoughts?
>> -Greg
>> 
>> _______________________________________________
>> Esip-preserve mailing list
>> Esip-preserve at lists.esipfed.org
>> http://lists.deltaforce.net/mailman/listinfo/esip-preserve
>> 