[Esip-preserve] Next Telecon

Kenneth Casey Kenneth.Casey at noaa.gov
Tue Nov 10 06:50:55 EST 2009


Mark,


On Nov 9, 2009, at 10:49 PM, Mark A. Parsons wrote:

> Hi all,
>
> Some thoughts on the AGU Townhall for discussion tomorrow:
>
> We are scheduled for an hour on Thursday 1930-2030 (see description  
> below). We want to introduce the topic and get everyone thinking,  
> but we also want to allow for discussion. I think we should allow  
> at least 1/2 hour for discussion.
>
> Our current plan is to have Bernard introduce the AGU position  
> statement and then have Rob or Ruth speak on ESIP activities  
> (including the work on identifiers). We also talked about  
> introducing specific approaches to data citation. I mentioned the  
> IPY guidelines. Bob Cook pointed out that ORNL has a similar  
> approach. Indeed, about a decade ago NSIDC introduced the concept  
> to all the DAACs, which supposedly adopted it across the board. Other  
> organizations, including GBIF and PANGAEA, also have  
> approaches. All these approaches are similar but not identical. Do  
> we want to achieve some sort of commonality?

I would think the answer to that question is "yes".  It seems that  
broad adoption of data citations will be hard enough and perhaps  
nearly impossible if there is not a single, clear, and simple way to  
do it - one that is closely analogous to the way we cite manuscripts  
now.

> I think we want AGU journals to take a lead on the issue. How can  
> we do that? I'm happy to give a bit of an overview, but I'll need  
> help. There are several issues that none of the approaches have  
> fully addressed, including making citations machine understandable  
> and capturing specific versions or subsets of data in a citation.
>

I am not sure about the journals taking the lead... I don't  
necessarily disagree but it is also not entirely clear to me.  I  
think what you are talking about is not so much the journals  
themselves but rather their publishers. In other words, the question  
is, "When it comes to data set citations, who should be the  
authoritative entity?"  Is that the question?  If so, I would tend to  
think that the answer is somewhat similar to the answer you would get  
to the question, "When it comes to manuscript citations, who is the  
authoritative entity?" The short answer to that question is "the  
publishers" but that alone is not enough since it leads to "who can  
be a publisher?"  And I think the answer to that question is  
something like, "Well, any organization that can demonstrate  
sufficient reliability to be respected by the community and accepted  
as a trusted entity."  In short, a sort of survival of the fittest  
approach.  I haven't really thought about this very deeply, so I may  
be missing important points, but natural candidates for such  
community acceptance would be national data centers, perhaps some  
universities, etc.  Coming from a national data center, I am sure my  
perspective is influenced, but I do know that issues of trust,  
reliability, and openness are very important to us (and that we don't  
always have that trust and so must continually work to earn it and be  
worthy of it).

I think my gut-level reaction to the idea of journal publishers  
taking the lead on data set citations stems from what I perceive as a  
large difference in what it takes to steward a manuscript over time  
(from submission, through verification/peer review, publication, and  
long-term preservation) when compared to what it takes to steward a  
data set over time.  Think about the World Wide Web for a moment -  
the Hypertext Transfer Protocol (HTTP) took off with amazing speed and  
blazed across the world because rendering text and hyperlinks on a  
client is a relatively straightforward thing to do. Please don't  
think I am diminishing the achievement in any way, but in comparison  
finding, sending, and understanding data across the internet is an  
ongoing challenge that continues to be addressed by many, many  
people.  I think the same can be said when it comes to publishing  
text vs. publishing data.  While by no means easy or perfect, the  
process of peer-review, publication, and citation of text seems very  
straightforward when compared to the same steps for data.  For  
example, not too many journal articles I know are updated every five  
minutes (like a data set from a moored tropical buoy) or revised many  
times (like the way we reprocess many satellite data sets over and  
over again).  The granularity of a manuscript seems so simple when  
compared to selecting/defining the granularity of a "data set".   
Minor algorithm differences can have huge impacts on a data set.   
Imagine if the entire results and conclusions of a manuscript could  
change dramatically if you altered the location of a comma in the text.


> Then there is the issue of data peer-review. There are specific  
> peer-reviewed journals devoted to data publication, such as _Earth  
> Science Data_ and _Ecological Archives_.  Personally, I think this  
> approach is limited and even misguided, but I am probably unusual  
> in that regard (I don't like DOIs either).
>

Can you summarize why you don't like DOIs?

> Bottom line is that we have to determine what we want to accomplish  
> out of this town hall, and the best way to get there. That's the  
> topic for tomorrow.
>

Unfortunately, I cannot make the telecon, but I will look forward to  
the email discussions!

Ken

> Talk soon,
>
> -m.
>
> Peer-Reviewed Data Publication and Other Strategies to Sustain  
> Verifiable Science
> Moscone West, Room 2008
> Cosponsored by EP, IN
>
> Objective, verifiable science requires formal, reviewed publication  
> of both data and research results. Data publication facilitates  
> essential scientific processes including transparency,  
> reproducibility, documentation of uncertainty, and preservation.  
> The AGU Council reaffirmed this fundamental responsibility in a  
> revised position statement. Nonetheless, data publication lacks  
> established cultural practices and quality standards for modern,  
> complex, digital data sets. This town hall meeting will present the  
> AGU position statement and evolving international data publication  
> mechanisms. We seek input from all disciplines on state-of-the-art  
> approaches for data peer-review, peer-recognition, citation, and  
> other verification practices. The Federation of Earth Science  
> Information Partners will publish discussion results.
>
>
>
>
> On 9 Nov 2009, at 11:09 AM, Ruth Duerr wrote:
>
>> Well it is unanimous.  The next telecon is tomorrow at 11 am MST  
>> (1 pm EST, 10 am PST).  Agenda (with leads indicated) includes:
>>
>> - EP-TOMS plans (John Moses)
>> - Preparations for AGU town hall (Mark Parsons)
>> - Draft ESIP statement on data (Ruth)
>> - Planning for winter ESIP meeting (Ruth)
>> - Others?
>>
>>
>> _______________________________________________
>> Esip-preserve mailing list
>> Esip-preserve at lists.esipfed.org
>> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve


[NOTE: The opinions expressed in this email are those of the author  
alone and do not necessarily reflect official NOAA, Department of  
Commerce, or US government policy.]

Kenneth S. Casey, Ph.D.
Technical Director
NOAA National Oceanographic Data Center
1315 East-West Highway
Silver Spring MD 20910
301-713-3272 ext 133
http://www.nodc.noaa.gov/
