[Esip-preserve] ESIP Citation Guidelines

alicebarkstrom at frontier.com
Mon Oct 11 19:09:15 EDT 2010


My understanding of citation has, perhaps, advanced beyond my
understanding of a year ago.  From my perspective, the issue
of citations is partly one of getting as close as possible
to citations for journal papers.  If data citations aren't
close to that form, they have to overcome a significant
barrier: users must relearn the formalism for citations.

On the other hand, I've had concerns over the precision of
the citations.  If a data user is giving an "advertising"
endorsement of some producer's data collection, that's one
kind of citation.  By contrast, if the intent is to
identify exactly which data were used in a particular
validation experiment or intercomparison, that's an
entirely different kind of citation (and one involving a much
larger, and perhaps uncitable, number of PGEs and data files
that would have to be published).  The precision problem for
replication is the far more difficult issue to deal with.

I will revisit my critique of the precision issue in publishable
form in the reasonably near future.  I'll also truncate my writing
to provide the relevant material about "unique identifiers" shortly
- probably this week, but not later than the end of next week.

Bruce b.
----- Original Message -----
From: "Mark A. Parsons" <parsonsm at nsidc.org>
To: alicebarkstrom at frontier.com
Cc: esip-preserve at lists.esipfed.org
Sent: Monday, October 11, 2010 5:20:28 PM
Subject: Re: [Esip-preserve] ESIP Citation Guidelines

Understood, Bruce. The IPY guidelines do not mandate any sort of identifier. They are very much geared toward developing a citation based on what exists now. They are more geared toward providing fair credit than providing a direct, unambiguous connection to the precise data used. That said, if data are cited according to these guidelines, you will have a much better chance of finding and retrieving that data than when the data are only informally acknowledged in passing. I suspect there will never be a perfect solution, but currently data are hardly cited at all. That is the crux of the issue, not the lack of a technical solution. I believe we need to encourage the practice of data citation NOW. We cannot wait for the perfect solution or data will never be cited and the situation will only get worse.

Indeed, Bruce, I think you suggested a year ago that the IPY guidelines were a practical approach that could engage the scientific community while we continue to hash out the technical details.

-m.
On 11 Oct 2010, at 3:03 PM, alicebarkstrom at frontier.com wrote:

> At least from my perspective (probably gloomy, owing to a
> Scandinavian genetic predisposition), until we've
> got a threat analysis that moves in the direction
> of quantifying the probability of identifiers 
> "coming loose" from the data itself, as well as
> the probability of detecting changes, and some
> approach to auditing for corruption, our job on
> this is far from done.
> 
> I'll also note that I don't think we've done an
> adequate job of taking into account the difficulties
> of dealing with format and data order rearrangements.
> I am quite certain that it is unfeasible to provide
> a draconian standardization of data formats and data
> file interpretations.  As a result, cryptographic
> digests only protect against tampering with the
> bits in a file - but they don't deal with the question
> of recognizing as equivalent two files with
> scientifically identical data that have different
> cryptographic digests (or that differ in bit-by-bit comparisons).
> This line of reasoning strongly suggests that the
> notion of a "unique authentic version" of a file is
> impossible.
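
A minimal sketch of the digest point above, under stated assumptions: the two tiny CSV "files", their column names, and the helper `canonical_digest` are hypothetical illustrations, not anything taken from the IPY or ESIP guidelines. The two files hold the same records in a different row order (and with a trailing zero in one value), so their SHA-256 digests differ even though the data are scientifically identical. Hashing a canonicalized form (values parsed as numbers, columns and rows sorted) gives matching digests.

```python
import csv
import hashlib
import io


def raw_digest(data: bytes) -> str:
    """SHA-256 over the bytes exactly as stored; sensitive to any bit change."""
    return hashlib.sha256(data).hexdigest()


def canonical_digest(data: bytes) -> str:
    """Hypothetical helper: digest of a canonical form of a numeric CSV.

    Assumes every field after the header is numeric. Values are parsed as
    floats, columns are ordered by header name, and rows are sorted, so
    row order, column order, and number formatting no longer matter.
    """
    reader = csv.reader(io.StringIO(data.decode("utf-8")))
    header = next(reader)
    rows = [row for row in reader if row]
    # Column positions ordered by header name.
    order = sorted(range(len(header)), key=lambda i: header[i])
    canon_rows = sorted(tuple(float(row[i]) for i in order) for row in rows)
    canon = repr(([header[i] for i in order], canon_rows)).encode("utf-8")
    return hashlib.sha256(canon).hexdigest()


# Same records, different row order and number formatting (made-up values).
file_a = b"lat,lon,temp\n10.0,20.0,271.5\n11.0,21.0,272.0\n"
file_b = b"lat,lon,temp\n11.0,21.0,272.00\n10.0,20.0,271.5\n"
# A genuinely different measurement.
file_c = b"lat,lon,temp\n10.0,20.0,271.5\n11.0,21.0,273.0\n"

same_bits = raw_digest(file_a) == raw_digest(file_b)
same_content = canonical_digest(file_a) == canonical_digest(file_b)
changed_content = canonical_digest(file_a) == canonical_digest(file_c)
```

This only moves the difficulty: the canonical form works here because one rule for "scientifically identical" was assumed in advance, and agreeing on such a rule across real formats is exactly the standardization that is doubted above.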
> 
> Bruce B.
> ----- Original Message -----
> From: "Mark A. Parsons" <parsonsm at nsidc.org>
> To: esip-preserve at lists.esipfed.org
> Sent: Monday, October 11, 2010 3:52:10 PM
> Subject: [Esip-preserve] ESIP Citation Guidelines
> 
> Hi all,
> 
> We have been talking a lot about data citation over the last year. The broader data community has too (GEO, DataCite, CODATA, ...). It seems to me that it is incumbent upon ESIP to make some sort of statement on the issue. Our statement of principles to be presented at the annual meeting mentions "appropriate citation," and says that "Data intermediaries will work with data creators to develop clear citations." We should give more guidance on how to do this. Probably not as part of the principles, but as a separate evolving document.
> 
> My understanding is that there is general consensus that, for the moment, the IPY Guidelines (http://ipydis.org/data/citations.html) are suitable for collections, especially extant collections, but more work needs to be done to more precisely identify specific granules, subsets, versions, etc. The Committee that developed and maintained the IPY Guidelines no longer exists, and the website where they reside will close soon. Would it be reasonable for ESIP to adopt, update, maintain, and promote these guidelines? Do you all think it's a good idea? Is there some official process for this?
> 
> Cheers,
> 
> -m. 


