[Esip-preserve] Fwd: PLOS Computational Biology: Ten Simple Rules for Reproducible Computational Research

Bruce Barkstrom brbarkstrom at gmail.com
Mon Oct 28 13:30:35 EDT 2013


Well, if you consider EOSDIS as a partial implementation,
the original project cost about $500M, used about 5,000 person-years,
and has an ongoing maintenance cost of about 10% of the total
development cost per year.  The science teams for EOS are
probably roughly comparable to EOSDIS - and require fully
qualified scientists who can deal with the math and physics
of instruments and data reduction to geophysical quantities.
It isn't impossible - just expensive - and requires people to
commit to decades of work.
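
To put the figures above in rough perspective, here is a minimal sketch of the arithmetic. The $500M development cost and the 10% annual maintenance rate are the approximate numbers quoted in this message. The 20-year horizon is purely an assumption chosen to illustrate "decades", and the function names are hypothetical. Inflation and discounting are ignored.

```python
# Approximate figures quoted in the message above.
DEVELOPMENT_COST_USD = 500_000_000      # "about $500M"
MAINTENANCE_PERCENT_PER_YEAR = 10       # "about 10% ... per year"


def annual_maintenance_usd():
    """Yearly maintenance cost as a fixed share of the development cost."""
    return DEVELOPMENT_COST_USD * MAINTENANCE_PERCENT_PER_YEAR // 100


def cumulative_cost_usd(years):
    """Development cost plus flat maintenance over `years` (no inflation)."""
    return DEVELOPMENT_COST_USD + annual_maintenance_usd() * years


if __name__ == "__main__":
    # The 20-year horizon is an assumption, not a number from the message.
    horizon_years = 20
    print(f"annual maintenance: ${annual_maintenance_usd():,}")
    print(f"total over {horizon_years} years: ${cumulative_cost_usd(horizon_years):,}")
```

Under these assumptions, maintenance alone matches the original development cost after ten years.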

As an additional note, the "hard sciences" (meaning such
fields as atmospheric science, oceanography, and geophysics,
rather than biological sciences) dealing with the Earth are
observational.  This means that practitioners can't run experiments,
which complicates replication.  The math in these areas is not
necessarily statistical, although advanced statistics and pretty
complex Monte Carlo simulations may be necessary to answer
questions regarding statistical significance.
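
As a small illustration of the kind of Monte Carlo significance estimate mentioned above, and of Rule 6 in the quoted article (note the random seeds), here is a minimal sketch of a permutation test for a difference in means. It is not taken from any EOS processing system. The function name, the default seed, and the sample values are all hypothetical, and a real analysis would probably need a test suited to the autocorrelation of geophysical time series.

```python
import random
from statistics import mean


def permutation_p_value(sample_a, sample_b, n_resamples=10_000, seed=20131028):
    """Monte Carlo estimate of a two-sided p-value for a difference in means.

    The seed is an explicit argument so that it can be recorded alongside
    the result and the same estimate can be regenerated later.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    a = [10, 11, 12, 13, 14]
    b = [0, 1, 2, 3, 4]
    print(permutation_p_value(a, b, n_resamples=2000, seed=42))
```

Calling the function twice with the same data and the same seed returns the same estimate, which is the point of recording the seed.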

Bruce B.


On Mon, Oct 28, 2013 at 9:50 AM, Lynnes, Christopher S. (GSFC-6102) <
christopher.s.lynnes at nasa.gov> wrote:

>  Those rules may be simple, in that they are written pithily, but there is
> a world of complexity beneath nearly every single one of them,
> particularly for scientists who do not have robust data, workflow, and
> configuration management systems to rely on.
>
>  If only there were a cross-platform, easy-to-install, easy-to-manage
> "science management system" designed for individual scientists, with full
> support for managing:
> o archiving
> o provenance
> o workflows
> o analysis results
> o annotations
> o versions (of every artifact above)
> o secure public access
>
>  How hard could it be?  :-)
>
>  On Oct 28, 2013, at 8:37 AM, Curt Tilmes <Curt.Tilmes at nasa.gov>
>  wrote:
>
>  -------- Original Message --------
>  Subject: PLOS Computational Biology: Ten Simple Rules for Reproducible Computational Research
>  Date: Sun, 27 Oct 2013 07:54:56 -0600
>  From: Steve Aulenbach <saulenbach at usgcrp.gov>
>
> *PLOS Computational Biology: Ten Simple Rules for Reproducible
> Computational Research*
>
> http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285;jsessionid=4AFC0E022E4769E5856CC5BB897EF6F9
>
> "Replication is the cornerstone of a cumulative science [1]. However, new
> tools and technologies, massive amounts of data, interdisciplinary
> approaches, and the complexity of the questions being asked are
> complicating replication efforts, as are increased pressures on scientists
> to advance their research [2]. As full replication of studies on
> independently collected data is often not feasible, there has recently been
> a call for reproducible research as an attainable minimum standard for
> assessing the value of scientific claims [3]. This requires that papers in
> experimental science describe the results and provide a sufficiently clear
> protocol to allow successful repetition and extension of analyses based on
> original data [4].
>
> [...]
>
> We here present ten simple rules for reproducibility of computational
> research. These rules can be at your disposal for whenever you want to make
> your research more accessible—be it for peers or for your future self."
>
>
>
>    - Rule 1: For Every Result, Keep Track of How It Was Produced<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s2>
>    - Rule 2: Avoid Manual Data Manipulation Steps<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s3>
>    - Rule 3: Archive the Exact Versions of All External Programs Used<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s4>
>    - Rule 4: Version Control All Custom Scripts<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s5>
>    - Rule 5: Record All Intermediate Results, When Possible in
>    Standardized Formats<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s6>
>    - Rule 6: For Analyses That Include Randomness, Note Underlying Random
>    Seeds<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s7>
>    - Rule 7: Always Store Raw Data behind Plots<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s8>
>    - Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of
>    Increasing Detail to Be Inspected<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s9>
>    - Rule 9: Connect Textual Statements to Underlying Results<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s10>
>    - Rule 10: Provide Public Access to Scripts, Runs, and Results<http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003285#s11>
>
>
>  _______________________________________________
> Esip-preserve mailing list
> Esip-preserve at lists.esipfed.org
> http://www.lists.esipfed.org/mailman/listinfo/esip-preserve
>
>
>  --
> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
> "The future is already here--it's just not very evenly distributed." Wm.
> Gibson
>
>
>
>
>
>