[Esip-documentation] Let's get rid of spatial and temporal bounds in ACDD

Ted Habermann thabermann at hdfgroup.org
Fri Mar 14 16:14:45 EDT 2014


All,

I agree with Ed and John, this is a software tool problem that should be fixed at the source. The description of the history attribute has always implied that it should be updated when a file is processed (even though, IMHO, it is almost entirely unsuited for doing that). The same is true for many others (listed by John G. earlier in this thread). The current practice is sloppy data management that, from the sound of this thread, is pervasive in the community. Of course, ncISO provides a very easy way to identify occurrences of this problem throughout a THREDDS catalog. The "Catalog Cleaner" is another venue for quantifying the damage. CF is a community standard. Maybe it is time for the community to recommend providing correct metadata with the files and to avoid developers and datasets that don't.

A related problem is that the bounds calculated from the data are only available if you read the data. Many users may not be equipped to easily read the data during a data discovery process. They may not want to go beyond ncdump -x -h (or something like that) before they fire up the whole netCDF machine...

BTW, this problem is trivial relative to that associated with virtual datasets created through aggregation. In those cases, there is no clear mechanism for providing meaningful metadata, although the rich inventory we created several years ago comes close... That situation is much more prone to mistakes as all semblance of the historic record is wiped out.

It's Friday, and spring... As Dave said last week, a good time for a rant!
Ted


On Mar 14, 2014, at 1:44 PM, Steve Hankin <steven.c.hankin at noaa.gov> wrote:

Hi All,

I'm joining this discussion from the wings. The topic here -- the common tendency for the ACDD geo-spatio-temporal bounds attributes to get corrupted -- has been kicked around a number of times among different groups. At this point it isn't clear that there is a "clean" resolution to the problem; there are already so many files out there that contain these attributes that there may be no easy way to unwind the problem. Might the best path forward be to add some words of caution to the documents that suggest the use of these attributes? Something along these lines:
Caution:   The encoding of geo-spatial bounds values as global attributes is a practice that should be used with caution or avoided.

The encoding of geo-spatial bounds values as global attributes introduces a high likelihood of corruption, because the attribute values duplicate information already contained in the self-describing coordinates of the dataset. A number of data management operations that are common with netCDF files will invalidate the values stored as global attributes. Such operations include extending the coordinate range of a netCDF file along its record axis; aggregating a collection of netCDF files into a larger dataset (for example, aggregating model outputs along their time axes); or appending files using file-based utilities (e.g. nco).

It is recommended that 1) the use of these global attributes be restricted to files whose contents are known to be completely stable -- i.e. files very unlikely to be aggregated into larger collections; and 2) as a matter of best practice, software reading CF files should ignore these global attributes and instead compute the geo-spatial bounds by scanning the coordinate ranges found within the CF dataset itself.
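As an illustration of recommendation 2), here is a minimal sketch of how a reader might compute the bounds from the coordinate values and flag stored attributes that disagree. The attribute names are the ACDD ones (geospatial_lat_min and so on); the function names, the plain-dict representation of the global attributes, and the tolerance are assumptions made for the example, not part of any library. It also assumes the coordinates contain no fill values and do not cross the antimeridian, where a plain min/max of longitude would be misleading.

```python
from typing import Dict, Iterable, List, Mapping

# ACDD global attribute names for the horizontal bounding box.
ACDD_BOUNDS = (
    "geospatial_lat_min",
    "geospatial_lat_max",
    "geospatial_lon_min",
    "geospatial_lon_max",
)


def bounds_from_coordinates(lat: Iterable[float], lon: Iterable[float]) -> Dict[str, float]:
    """Compute the bounding box by scanning the coordinate values themselves."""
    lat_values = list(lat)
    lon_values = list(lon)
    return {
        "geospatial_lat_min": min(lat_values),
        "geospatial_lat_max": max(lat_values),
        "geospatial_lon_min": min(lon_values),
        "geospatial_lon_max": max(lon_values),
    }


def stale_bounds(global_attrs: Mapping[str, float],
                 lat: Iterable[float],
                 lon: Iterable[float],
                 tol: float = 1e-6) -> List[str]:
    """Return the ACDD bounds attributes that are missing or disagree with the data."""
    computed = bounds_from_coordinates(lat, lon)
    stale = []
    for name in ACDD_BOUNDS:
        stored = global_attrs.get(name)
        if stored is None or abs(float(stored) - computed[name]) > tol:
            stale.append(name)
    return stale
```

For a subset that still carries the global bounds of its parent file, stale_bounds would report all four attributes; for a file whose attributes were rewritten after subsetting, it would return an empty list.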
Comments?

    - Steve

________________________________

From: Armstrong, Edward M (398M) <Edward.M.Armstrong at jpl.nasa.gov>
Date: Wed, Mar 12, 2014 at 6:03 PM
Subject: Re: [Esip-documentation] Let's get rid of spatial and
temporal bounds in ACDD
To: Nan Galbraith <ngalbraith at whoi.edu>
Cc: Cluster Documentation <esip-documentation at lists.esipfed.org>


I think that is an excellent idea.

The output of the PO.DAAC HiTIDE subsetter that I mentioned in a
previous email does exactly that and includes some useful information
about the granule and the wrapped subsetting request via OPeNDAP.
Below is a snapshot of some of the global attributes from  a subsetted
AVHRR SST granule (look at the naiad_ attributes):

  :southernmost_latitude = -89.72987f; // float
  :northernmost_latitude = 89.80405f; // float
  :westernmost_longitude = -179.9997f; // float
  :easternmost_longitude = 179.99994f; // float
  :file_quality_index = 1S; // short
  :comment = "none";
  :naiad_download_date = "2014-03-10 21:25:47";
  :naiad_granule_url =
"http://podaac-opendap.jpl.nasa.gov/opendap/allData/ghrsst/data/L2P/AVHRR19_G/NAVO/2013/358/20131224-AVHRR19_G-NAVO-L2P-SST_s0827_e1009-v01.nc.bz2";
  :naiad_constraint_expression =
"lat[8000:1:9200][104:1:408],lon[8000:1:9200][104:1:408],time[0:1:0],sst_dtime[0:1:0][8000:1:9200][104:1:408],rejection_flag[0:1:0][8000:1:9200][104:1:408],SSES_bias_error[0:1:0][8000:1:9200][104:1:408],aod_dtime_from_sst[0:1:0][8000:1:9200][104:1:408],DT_analysis[0:1:0][8000:1:9200][104:1:408],brightness_temperature_11um[0:1:0][8000:1:9200][104:1:408],aerosol_optical_depth[0:1:0][8000:1:9200][104:1:408],sources_of_aod[0:1:0][8000:1:9200][104:1:408],confidence_flag[0:1:0][8000:1:9200][104:1:408],brightness_temperature_4um[0:1:0][8000:1:9200][104:1:408],SSES_standard_deviation_error[0:1:0][8000:1:9200][104:1:408],sea_surface_temperature[0:1:0][8000:1:9200][104:1:408],brightness_temperature_12um[0:1:0][8000:1:9200][104:1:408],proximity_confidence[0:1:0][8000:1:9200][104:1:408],satellite_zenith_angle[0:1:0][8000:1:9200][104:1:408]";
}

I did misspeak earlier when I indicated that the spatial bounds
(southernmost_latitude etc. in this case) were also updated. They
are the original (global) bounds.  I have asked the developer to
update these bounds for every subset request. Hopefully that will
make it into the next version.
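
Until the bounds themselves are updated, the naiad_constraint_expression attribute at least records which index ranges were requested, so a downstream tool could use it to see that the file is a subset and which lat/lon indices it covers. A minimal sketch, assuming the DAP2 hyperslab form [start:stride:stop] (stop inclusive) seen in the attribute above; parse_constraint is a hypothetical helper written for this example, not part of HiTIDE or any OPeNDAP library.

```python
import re
from typing import Dict, List, Tuple

# One hyperslab term looks like [start:stride:stop]; stop is inclusive in DAP2.
_SLAB = re.compile(r"\[(\d+):(\d+):(\d+)\]")


def parse_constraint(expression: str) -> Dict[str, List[Tuple[int, ...]]]:
    """Map each variable name to its list of (start, stride, stop) index ranges."""
    result: Dict[str, List[Tuple[int, ...]]] = {}
    for term in expression.split(","):
        term = term.strip()
        if not term:
            continue
        # Everything before the first '[' is the variable name.
        name = term.split("[", 1)[0]
        result[name] = [
            tuple(int(group) for group in match.groups())
            for match in _SLAB.finditer(term)
        ]
    return result
```

Applied to the attribute above, parse_constraint(...)["lat"] would give [(8000, 1, 9200), (104, 1, 408)], i.e. the row and column ranges over which the original lat array would need to be read to recompute the bounds.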



On Mar 8, 2014, at 3:13 AM, Nan Galbraith <ngalbraith at whoi.edu> wrote:

> Hows about adding an attribute that contains the URL of the data
> file to which the bounds apply? If your aggregator/sub-setter
> has misled you by failing to update the bounds attribute, he's also
> provided you with the link to the data you actually wanted.
>
> For programs that collect lots of data and don't molest it, this would
> let them continue to use the bounds atts; for programs that slice
> and dice, they'd be motivated to either update the fields dynamically
> or remove them.
>
> Cheers - Nan
>
> Is it really Friday? I'm at sea (yes, people still do that, sometimes)
> and I've lost all sense of time and place (no geospatial and temporal
> bounds information).
>
>
>
>
> On 3/7/14 6:47 PM, David Neufeld - NOAA Affiliate wrote:
>>
>> Ok, since it's Friday and we are in rant mode, I'm going to have a little fun with this one...
>>
>> Recommended Attribute Disclaimer:
>>
>> Suggested text: "When using geospatial and temporal bounds information in your global attributes, please know that it introduces a likely source of error and that you are far better off reading these values from the data stored in the file.  If you do choose to use the attributes please also include a global checksum attribute that humans can look at to decide whether the file has changed since you originally recorded these values."
>>
>> On Fri, Mar 7, 2014 at 2:02 PM, Nan Galbraith <ngalbraith at whoi.edu> wrote:
>>
>>
>>
>>    Maybe we should add some text about which attributes should be
>>    considered fragile and under what conditions they need to be
>>    recalculated or removed, but I'm not in favor of removing the
>>    terms from ACDD.
>>
>>
>
>
> --
> *******************************************************
> * Nan Galbraith                        (508) 289-2444 *
> * Upper Ocean Processes Group            Mail Stop 29 *
> * Woods Hole Oceanographic Institution                *
> * Woods Hole, MA 02543                                *
> *******************************************************
>
>
> _______________________________________________
> Esip-documentation mailing list
> Esip-documentation at lists.esipfed.org
> http://www.lists.esipfed.org/mailman/listinfo/esip-documentation

-ed

Ed Armstrong
JPL Physical Oceanography DAAC
818 519-7607





--
Dr. Richard P. Signell   (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598
