[Esip-preserve] FYI: The Vast Majority of Raw Data From Old Scientific Studies May Now Be Missing

Ted Habermann thabermann at hdfgroup.org
Fri Dec 20 11:30:02 EST 2013


Jennifer et al.,

I don't know if you are familiar with the HDF4 Mapping Project (http://www.hdfgroup.org/projects/h4map/) that implemented something along these lines. The idea was to create XML "maps" of HDF4 files that identified the objects in a file along with their types, sizes, locations, and test values from each object. These maps could be used in the future to read the data from the files even if the HDF4 libraries were no longer available.
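
As a rough illustration of the idea, here is a minimal Python sketch that reads values out of raw file bytes using only the type, size, and location recorded in a map, with no HDF4 library involved. The XML element and attribute names below (object, dtype, byteorder, offset, count) are made up for this example and are not the actual map schema produced by the project.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical map layout, for illustration only; the real HDF4 map schema differs.
MAP_XML = """
<map>
  <object name="temperature" dtype="float32" byteorder="big" offset="16" count="4"/>
</map>
"""

# struct format codes for the few types this sketch understands
FORMATS = {"int16": "h", "int32": "i", "float32": "f", "float64": "d"}

def read_object(raw: bytes, obj: ET.Element) -> tuple:
    """Decode one object from raw file bytes using only its map entry."""
    order = ">" if obj.get("byteorder") == "big" else "<"
    fmt = f"{order}{int(obj.get('count'))}{FORMATS[obj.get('dtype')]}"
    return struct.unpack_from(fmt, raw, int(obj.get("offset")))

def read_all(raw: bytes, map_xml: str) -> dict:
    """Return {object name: decoded values} for every object listed in the map."""
    root = ET.fromstring(map_xml)
    return {o.get("name"): read_object(raw, o) for o in root.iter("object")}

# Stand-in for file contents: 16 header bytes followed by four big-endian floats
raw = b"\x00" * 16 + struct.pack(">4f", 1.0, 2.5, -3.0, 4.25)
values = read_all(raw, MAP_XML)
```

The test values stored in a real map could be compared against the decoded values in the same way, as a check that the reader interpreted the bytes correctly.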

I learned more about this project when I joined HDF, and we are now working to make these maps available through a web service (THREDDS) and useful to current users as well. Our directory of HDF4 examples is available at http://eosdap.hdfgroup.uiuc.edu:8887/thredds/catalog/mnt/ftp/pub/outgoing/NASAHDF/catalog.html. When you click on a file there is a list of services available for that file (a typical THREDDS data set page). The link next to H4MAP gives you the map XML for that file. Note that this is a URL, so maps for a series of files can be downloaded using a series of URLs. I am using that approach to help me understand differences in the metadata models used in HDF-EOS and non-HDF-EOS datasets.
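
A minimal Python sketch of the series-of-URLs approach might look like the following. The base URL and file names are placeholders, not the real service endpoint; the actual map URL is whatever the H4MAP link on each data set page points to. The fetch function is passed in as a parameter so the loop can be tried without network access.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Placeholder endpoint (an assumption for this sketch); copy the real one
# from the H4MAP link on a data set page.
BASE = "http://example.org/thredds/h4map/"

def map_urls(base, filenames):
    """Build one map URL per file name, percent-encoding unsafe characters."""
    return [base + quote(name) for name in filenames]

def fetch(url):
    """Download the bytes behind one URL."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()

def download_maps(urls, fetch=fetch):
    """Return {url: map XML bytes} for each URL in the series."""
    return {url: fetch(url) for url in urls}
```

For example, download_maps(map_urls(BASE, ["file1.hdf", "file2.hdf"])) would return the map XML for both (hypothetical) files, ready to be parsed and compared.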

BTW, this is a test environment created as a mechanism for getting feedback from the community. The goal of the environment is to determine how these services work and don't work for these data. If you run into problems or have ideas / suggestions please let us know.

Ted



On Dec 20, 2013, at 8:41 AM, Wei, Jennifer C. (GSFC-610.2)[ADNET SYSTEMS INC] <jennifer.c.wei at nasa.gov> wrote:

Hi Curt,

We (GES DISC) are currently working on satellite data preservation, especially for decommissioned satellites such as UARS, TOMS, HIRDLS, etc. I recently got involved with the task. What we have encountered is that not only were the data saved on old magnetic tapes or even floppy discs, but they were also written in old machine-specific binary formats, and we no longer have the machines to read them so that we can convert them into a modern format.

Maybe one approach to data preservation is to come up with a way to add metadata (XML or ancillary information) to the old observation data, so they can be machine-readable for future use. I have seen this need not only in the binary raw data, but also in current in-situ measurements saved in simple text files. Another question is what the “supporting documentation” should be for future users.

Earlier this year at one of the NSF EarthCube workshops, many earth scientists also raised this concern. I think it would be nice to see ESIP take the lead on this.

Thanks
Jennifer
--
Dr. Jennifer Wei
ADNET Systems, Inc.

GES DISC Code 610.2
NASA Goddard Space Flight Center
Greenbelt, MD. 20771

Phone: (301) 614-6558
Email: jennifer.c.wei at nasa.gov




On 12/20/13, 9:20 AM, "Tilmes, Curt (GSFC-6190)" <curt.tilmes at nasa.gov> wrote:

Shocking News!

The Vast Majority of Raw Data From Old Scientific Studies May Now Be Missing

"One of the foundations of the scientific method is the reproducibility of results. In a lab anywhere around the world, a researcher should be able to study the same subject as another scientist and reproduce the same data, or analyze the same data and notice the same patterns.

This is why the findings of a study published today in Current Biology are so concerning. When a group of researchers tried to email the authors of 516 biological studies published between 1991 and 2011 and ask for the raw data, they were dismayed to find that more than 90 percent of the oldest data (from papers written more than 20 years ago) were inaccessible. In total, even including papers published as recently as 2011, they were only able to track down the data for 23 percent."

http://blogs.smithsonianmag.com/science/2013/12/the-vast-majority-of-raw-data-from-old-scientific-studies-may-now-be-missing/

We've talked about doing such a study for the Earth Sciences. I think it would shine a light on our problems. Who's up for it?

Curt
_______________________________________________
Esip-preserve mailing list
Esip-preserve at lists.esipfed.org
http://www.lists.esipfed.org/mailman/listinfo/esip-preserve

==== Ted Habermann ===========================
   Director of Earth Science, The HDF Group
   Voice: (217) 531-4202
   Email: thabermann at hdfgroup.org
==== HDF: Software that Powers Science ============
