[Esip-cloud] Cloud formats study

Patrick Quinn patrick at element84.com
Wed Nov 6 12:22:54 EST 2019


Hi cluster!

My team is starting work on a joint NASA / ESA study to inform decisions
about which file formats to recommend for which uses for cloud-based data.
I'm trying to collect work that may have already been done, and I know a
few of you at least have a ton of knowledge here. If this is something you
have done work on, I'd appreciate it if you could take a look and respond
by 11/15, but the sooner the better.

Our questions:

Have you done any research on data file formats that you can share with
us?  We are particularly interested in comparisons that touch on the points
below but would welcome additional insights.  Please let us know how to
give you proper credit for anything you provide.

   - Data access performance to support common forms of analysis, including
   time series, shape-based averaging, regridding and data intercomparison.
   - Compatibility with existing off-the-shelf tools, including Panoply,
   GDAL, NCO, Jupyter/xarray, ArcGIS and GIS.
   - Ability to support fine-grained requests from S3 via range-get or
   other means.
   - Ability to comply with community metadata conventions (e.g., CF)
   - Availability of independent libraries to read the data in C/C++,
   Fortran, Python and R
   - Comparative cost of data preparation, storage and analysis, adjusted
   for lossless compressibility as appropriate.
   - Ability to represent several different data types / structures
   including imagery, swath, trajectory, point cloud, Plate Carrée and
   Sinusoidal grids, in situ and airborne
   - Ability to verify data integrity upon reformatting and on an ongoing basis
   - Self-describability, i.e., ability to include complete sets of both
   descriptive and structural metadata
   - Open specification
   - Number of independent implementations of read/write API
   - Standards-body approval (OGC, W3C, etc.)

In your experience, which formats have you seen users prefer for which
domains or use cases?

Do you have examples of test data sets for things like time series analysis
that you can share?