[Esip-preserve] [ESIP-all] Please review Draft ESIP Data Citation Guidelines

Peter Cornillon pcornillon at me.com
Fri Aug 26 14:01:39 EDT 2011


Hi Mark,

Thanks for the response. More in-line below.

On Aug 26, 2011, at 12:01 PM, Mark A. Parsons wrote:

> Hi Peter,
> 
> Thanks for your interest. Ironically, I was unable to respond right away because I was at a Data Citation meeting hosted by the National Academy.
> 
> Anyway, let me see what I can do with your situation. I'm copying the rest of the cluster, in case someone else wants to chime in.
> 
> My first recommendation is that you actually cite the data in the references section, not just in an acknowledgement blurb.

Yes, if I understand you here, we are doing that. What I am looking for is the verbiage that someone who uses the data that we produce should use when acknowledging the data in a pub.

> In your case, the tricky part, then, is figuring out who the data "author", "publisher", etc. are for the citation.
> 
> My first question is: what are you doing with the derived data sets? Are you archiving or distributing them anywhere?

They will be made available on our web site as netCDF 4 files and via OPeNDAP. The netCDF files will conform to CF 1.5.
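
As a rough sketch of the kind of file-level metadata this implies, the Python below assembles a set of netCDF global attributes. "Conventions", "title", "institution", "source" and "references" are global attributes described by the CF conventions; the "uuid" attribute name, the helper function and all of the values are assumptions made up for illustration (the per-file UUID itself is mentioned further down in this message). Nothing is written to a file here; the dictionary would be handed to whatever netCDF library is in use.

```python
import uuid


def build_global_attributes(title, institution, source, references):
    """Return a dict of global attributes for one output file (hypothetical helper)."""
    return {
        "Conventions": "CF-1.5",
        "title": title,
        "institution": institution,
        "source": source,
        # Where the citation for the input SST data could be recorded.
        "references": references,
        # Assumption: "uuid" is an illustrative attribute name, not a CF-defined one.
        "uuid": str(uuid.uuid4()),
    }


attrs = build_global_attributes(
    title="Cool edge data set, version x.x",
    institution="University of Rhode Island",
    source="SST fields obtained from NODC",
    references="Citation for the original SST data set goes here",
)

for name, value in attrs.items():
    print(f"{name}: {value}")
```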

> If so, then I see those as the citable objects. The documentation for those data sets should in turn cite the SST data from NODC. If you are not distributing the derived products, then I think you should cite the original SST data and describe the edge detection process in the methods section of your paper.

Right, we will do that in our pubs that make use of the data, but, as I said above, we want to distribute the data sets that we produce.

> My second question is how significant do you think the edge detection process was scientifically and intellectually? Have you created a new data set, a new intellectual product, or is it more accurate to say it was a more minor manipulation or "edit" of the original data set?

In the case of the fronts data, they are a new product. The gradient data are a little less clear: we apply a Sobel operator to the data which, and I'm guessing here, is not reversible. But even if it were, we median filter the input data, which is irreversible, so I think that we are OK in saying that the edit produces a scientifically new product. I'm assuming that it's reversibility of the data that determines a scientifically new product. Is that right, or is it more subtle than that?
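
To make the irreversibility point concrete, here is a minimal 1-D sketch with made-up numbers. It is not the actual processing chain: the 3-point median and the central difference are simplified stand-ins (the latter for the 2-D Sobel operator), and all names are hypothetical. It shows two different inputs collapsing to the same median-filtered output, and a gradient that is unchanged when a constant offset is added, so in neither case can the input be recovered from the output alone.

```python
from statistics import median


def median_filter_1d(values, width=3):
    """Sliding median of odd `width`; edges are padded with the nearest sample."""
    half = width // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(values))]


def central_difference(values):
    """Simple 1-D gradient, a stand-in for the 2-D Sobel operator."""
    return [values[i + 1] - values[i - 1] for i in range(1, len(values) - 1)]


# Two different made-up SST transects: the second has a one-pixel spike.
sst_a = [15.0, 15.0, 15.0, 15.0, 18.0, 18.0, 18.0]
sst_b = [15.0, 25.0, 15.0, 15.0, 18.0, 18.0, 18.0]

filtered_a = median_filter_1d(sst_a)
filtered_b = median_filter_1d(sst_b)
print(filtered_a == filtered_b)  # the spike is gone, so the inputs are indistinguishable

# Adding a constant offset leaves the gradient unchanged.
sst_offset = [v + 2.0 for v in sst_a]
print(central_difference(sst_a) == central_difference(sst_offset))
```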

> In the first case, you or your team would be the "author" of the new data set. In the second case, you might be considered an "editor".
> 
> So given all that, here is how I suggest you cite these data to help ensure both credit and validation, using the following elements:
> 
> 	• Author(s)--the people or organizations responsible for the intellectual work to develop the data set. The data creators.
> 	• Release Date--when the particular version of the data set was first made available for use (and potential citation) by others.
> 	• Title--the formal title of the data set
> 	• Version--the precise version of the data used. Careful version tracking is critical to accurate citation.
> 	• Archive and/or Distributor--the organization distributing or caring for the data, ideally over the long term.
> 	• Locator/Identifier--this could be a URL but ideally it should be a persistent service, such as a DOI, Handle or ARK, that resolves to the current location of the data in question.

We are creating a UUID for each file that we produce. 

> 	• Access Date and Time--because data can be dynamic and changeable in ways that are not always reflected in release dates and versions, it is important to indicate when on-line data were accessed.
> 
> 
> 
> For the original SSTs:
> 
> Person or team at Miami. Initial release date of the version used. "Cool SST data set, version x.x". National Oceanographic Data Center. Access URL or any kind of persistent location or identifier provided by NODC. Accessed on date.
> 
> For the edge detection data
> 
> Cornillon, P. et al., Date made available. "Cool edge data set, version x.x" URhode Island or whoever is distributing the data. Access URL or get a DOI. Date accessed.
> 
> or
> 
> Miami team. release date. "Cool edge data set, version x.x" edited by Cornillon, et al. URhode Island or whoever is distributing the data. Access URL or get a DOI. Date accessed.
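
The pattern in the examples above can be sketched in a few lines of Python. This is only an illustration of how the listed elements might be strung together, not a format prescribed by the guidelines; the class name, field names, the example.org locator and the dates are all placeholders invented for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """One field per citation element listed above (names are hypothetical)."""
    authors: str
    release_date: str
    title: str
    version: str
    distributor: str
    locator: str
    access_date: str
    editors: Optional[str] = None  # used only for the "edited by" variant

    def format(self) -> str:
        title = f'"{self.title}, version {self.version}"'
        if self.editors:
            title += f" edited by {self.editors}"
        parts = [
            self.authors,
            self.release_date,
            title,
            self.distributor,
            self.locator,
            f"Accessed {self.access_date}",
        ]
        # Drop any trailing period so joining does not produce "..".
        return ". ".join(part.rstrip(".") for part in parts) + "."


citation = DataCitation(
    authors="Cornillon, P. et al.",
    release_date="2011",
    title="Cool edge data set",
    version="x.x",
    distributor="University of Rhode Island",
    locator="https://example.org/cool-edge",  # placeholder, not a real locator
    access_date="26 August 2011",
)
print(citation.format())
```

Setting `editors` and putting the original team in `authors` gives the second, "edited by" form.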

I guess that the real issue here is what 'editing' someone's data means. If I understand what you are saying, if one is not editing someone else's data set, but deriving new value out of it, then the original data are not acknowledged? This might make sense in that it could just get too complicated if one had to acknowledge the entire parentage of every data set used, but it still bugs me a bit. In this example, I could not have produced my fronts data set without the data from Miami.

Sorry for being so dense on this. 

Peter

> This may not be precisely correct, but I hope you get the idea. I encourage you to read the guidelines. Let me know if you have other questions.
> 
> Cheers,
> 
> -m. 
> 
> 
> On 23 Aug 2011, at 1:19 PM, Peter Cornillon wrote:
> 
>> Hi Mark,
>> 
>> I am generating a number of data sets and would like some pointers on how to acknowledge them. What I'm looking for is how to structure the short blurb that one usually puts in the acknowledgements of a journal article. Here's a typical scenario. I acquire some SST data from NODC. It may have been created by someone at the University of Miami. I apply our edge detection operator to the SST fields and generate files with front pixels in them. I would like to tell people how they should acknowledge the data. Is the correct protocol to acknowledge UMiami for producing the original data, NODC for serving those data, and my institution, the University of Rhode Island (URI), for producing the edge detection fields, or should I simply ask that URI be acknowledged?
>> 
>> Peter
>> 
>> On Aug 17, 2011, at 10:38 AM, Mark A. Parsons wrote:
>> 
>>> Hi all,
>>> 
>>> Data citation is getting increasing attention. The National Academy of Sciences has established a committee to explore the issue and develop recommendations on the topic. They are hosting a symposium next week and they have invited me to speak on the developing ESIP Guidelines. So I just wanted to send a little reminder that we are still open for comments on these guidelines. Thanks greatly to those who have already commented.
>>> 
>>> Cheers,
>>> 
>>> -m. 
>>> 
>>> 
>>> On 21 Jul 2011, at 5:40 PM, Mark A. Parsons wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> Creating a great data set can be a life’s work (consider Charles Keeling). Yet, scientists do not receive much recognition for creating rigorous, useful data. At the same time, in a post “climategate” world there is increased scrutiny on science and a greater need than ever to adhere to scientific principles of transparency and repeatability. The Council of the American Geophysical Union (AGU) asserts that the scientific community should recognize the value of data collection, preparation, and description and that data “publications” should “be credited and cited like the products of any other scientific activity.” Currently, however, authors rarely cite data formally in journal articles, and they often lack guidance on how data should be cited. The ESIP Federation Preservation and Stewardship Cluster has been working on this issue for some time now. We started with a town hall meeting at AGU in 2009 and have had subsequent sessions at ESIP meetings and the GeoData2011 Conference as well as extensive e-mail and telecon discussion.
>>>> 
>>>> We have written some draft citation guidelines that we believe address the vast majority of data citation scenarios. We have presented these guidelines in multiple fora, including two ESIP meetings, for feedback and believe they are pretty solid. Now we ask all interested ESIPers to please review these guidelines closely and send feedback directly to the wiki or to the Cluster (esip-preserve at lists.esipfed.org). We plan to finalize the guidelines this fall for submission to the ESIP Assembly for formal approval at the winter meeting, so please comment soon.
>>>> 
>>>> The guidelines are at: http://wiki.esipfed.org/index.php/Interagency_Data_Stewardship/Citations/provider_guidelines
>>>> 
>>>> There is also an overview presentation on data citation at: ftp://sidads.colorado.edu/pub/ppp/conf_ppp/Parsons/How_to_Cite_an_Earth_Science_Data_Set.pdf
>>>> 
>>>> Thanks,
>>>> 
>>>> -m.
>>>> 
>>>> 
>>>> 
>>>> --- 
>>>> Mark A. Parsons
>>>> Lead Program Manager
>>>> National Snow and Ice Data Center
>>>> University of Colorado, 449 UCB, Boulder, Colorado 80309-0449, USA
>>>> +1-303-492-2359, +1-303-492-2468 (fax)
>>>> skype: mark.a.parsons
>>>> http://nsidc.org, http://eloka-arctic.org
>>>> 
>>>> _______________________________________________
>>>> ESIP-all mailing list
>>>> ESIP-all at lists.esipfed.org
>>>> http://www.lists.esipfed.org/mailman/listinfo/esip-all
>>> 
>> 
>> --
>> Peter Cornillon
>>  215 South Ferry Road                                     Telephone: (401) 874-6283
>>   Graduate School of Oceanography                          Fax: (401) 874-6283
>>    University of Rhode Island                                 Internet: pcornillon at gso.uri.edu
>>     Narragansett, RI 02882   USA
>> 
>> 
> 

--
Peter Cornillon
  215 South Ferry Road                                     Telephone: (401) 874-6283
   Graduate School of Oceanography                          Fax: (401) 874-6283
    University of Rhode Island                                 Internet: pcornillon at gso.uri.edu
     Narragansett, RI 02882   USA

