[esip-semantictech] NASA GCMD Keywords Version 9.0 Released (2019-11-12)

Matt Jones jones at nceas.ucsb.edu
Mon Jan 13 14:39:46 EST 2020


SiriJodha et al.,

Members of the DataONE network use a shared format list to tag granules
with their serialization format. It might be useful to you in this
effort.

Each format includes a format identifier, a human-readable name, an
indication of whether it is primarily used to serialize metadata or data
granules, and its associated MIME media type and file extensions.  When it
makes sense, we use the MIME type as the formatId, but in many cases one
MIME type corresponds to many formats or to different versions of a format,
so in those cases we use some other reasonable identifier.  Here is HDF5 as
an example:

    <objectFormat>
        <formatId>application/x-hdf5</formatId>
        <formatName>Hierarchical Data Format version 5 (HDF5)</formatName>
        <formatType>DATA</formatType>
        <mediaType name="application/x-hdf5"/>
        <extension>h5</extension>
    </objectFormat>
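
As a minimal sketch of how a record like this could be read programmatically,
the Python below parses the HDF5 entry quoted above into a small data class.
The `ObjectFormat` class and the `parse_object_format` helper are hypothetical
names introduced only for this illustration; they are not part of any DataONE
client library, and the sketch assumes the elements carry no XML namespace,
as in the snippet above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

# The HDF5 record exactly as quoted in the message above.
HDF5_RECORD = """
<objectFormat>
    <formatId>application/x-hdf5</formatId>
    <formatName>Hierarchical Data Format version 5 (HDF5)</formatName>
    <formatType>DATA</formatType>
    <mediaType name="application/x-hdf5"/>
    <extension>h5</extension>
</objectFormat>
"""


@dataclass
class ObjectFormat:
    """Hypothetical container for one entry of the format list."""
    format_id: str
    format_name: str
    format_type: str            # "DATA" in the example above
    media_type: Optional[str]   # taken from the mediaType "name" attribute
    extension: Optional[str]


def parse_object_format(elem: ET.Element) -> ObjectFormat:
    """Convert one <objectFormat> element into an ObjectFormat."""
    media = elem.find("mediaType")
    return ObjectFormat(
        format_id=elem.findtext("formatId"),
        format_name=elem.findtext("formatName"),
        format_type=elem.findtext("formatType"),
        media_type=media.get("name") if media is not None else None,
        extension=elem.findtext("extension"),
    )


hdf5 = parse_object_format(ET.fromstring(HDF5_RECORD))
print(hdf5.format_id, hdf5.format_type, hdf5.extension)
```

Treating the media type and extension as optional is a design guess, on the
assumption that not every entry in the list supplies them.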

We extend this list as needed when new formats are encountered, and the
current list can be retrieved from the DataONE formats service, which is at
https://cn.dataone.org/cn/v2/formats . I've also attached a copy of the
current list in case it is useful.
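
To illustrate why the MIME type alone cannot always serve as the formatId,
here is a rough Python sketch that groups the entries of a format list by
media type. The `fetch_format_list` and `group_by_media_type` helpers are
hypothetical, the two netCDF entries in `SAMPLE` are illustrative and not
copied from the service, and the `objectFormatList` wrapper element is an
assumption about how the list is wrapped. The `{*}` wildcard (Python 3.8+)
is used so the lookup still works if the real response puts elements in an
XML namespace.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.request import urlopen

# Service URL as given in the message above.
FORMATS_URL = "https://cn.dataone.org/cn/v2/formats"


def fetch_format_list(url: str = FORMATS_URL) -> bytes:
    """Download the raw XML (needs network access; not called below)."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def group_by_media_type(xml_text):
    """Map each media type name to the list of formatIds that use it."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    # "{*}" matches the tag in any namespace, or in none.
    for fmt in root.findall(".//{*}objectFormat"):
        media = fmt.find("{*}mediaType")
        key = media.get("name") if media is not None else None
        groups[key].append(fmt.findtext("{*}formatId"))
    return dict(groups)


# Illustrative list: the HDF5 entry from above plus two made-up entries
# that share a single media type.
SAMPLE = """
<objectFormatList>
  <objectFormat>
    <formatId>application/x-hdf5</formatId>
    <formatType>DATA</formatType>
    <mediaType name="application/x-hdf5"/>
  </objectFormat>
  <objectFormat>
    <formatId>netCDF-3</formatId>
    <formatType>DATA</formatType>
    <mediaType name="application/netcdf"/>
  </objectFormat>
  <objectFormat>
    <formatId>netCDF-4</formatId>
    <formatType>DATA</formatType>
    <mediaType name="application/netcdf"/>
  </objectFormat>
</objectFormatList>
"""

groups = group_by_media_type(SAMPLE)
for media_type, format_ids in groups.items():
    print(media_type, format_ids)
```

To try this against the live list, one could pass the result of
`fetch_format_list()` to `group_by_media_type` instead of `SAMPLE`.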

There have been other format list standardization efforts, including most
recently the Unified Digital Format Registry (UDFR) <https://www.udfr.org/>,
run by the California Digital Library. That is an ontological format
registry that harmonizes prior vocabularies, including the earlier PRONOM
and GDFR (the Global Digital Format Registry) efforts. Although we haven't
adopted it yet, we do try to be sure we are consistent with UDFR at
DataONE.  We'd love to see a global service like this be adopted, rather
than doing it agency by agency.

Matt


*Matthew B. Jones*
ORCID: 0000-0003-0077-4738 <https://orcid.org/0000-0003-0077-4738>
Director of Informatics R&D, National Center for Ecological Analysis and
Synthesis <http://www.nceas.ucsb.edu/ecoinfo>
PI, NSF Arctic Data Center <https://arcticdata.io/>
Director, DataONE <https://dataone.org/> program
University of California Santa Barbara


On Mon, Jan 13, 2020 at 2:07 PM John Scialdone via esip-semanticweb <
esip-semanticweb at lists.esipfed.org> wrote:

> SiriJodha,
>
> Yes, we can contribute here. We request edit permission.
>
> Thanx..
> John
>
> On 1/13/2020 8:15 AM, Siri Jodha Khalsa wrote:
>
> Hi John,
>
>
> I agree, it would be good to have the DAACs compile their list of formats.
> Ideally, there would be a shared spreadsheet, where each DAAC would check
> against what was already there, to avoid having the same format called
> different things.
>
>
> I've taken the GCMD list and assigned encodings to all that I could
> identify.  The categories are ASCII, Binary, Image, Library (i.e.
> associated with a software library like HDF), and Proprietary (which is
> somewhat of a mixed bag, could be ASCII or Binary, open or closed).
>
>
> The spreadsheet is here:
> https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing
>
> Feedback, corrections, additions welcome. I'll give edit permission as
> requested.
>
>
> Cheers,
>
> SiriJodha
>
>
> On 1/11/20 12:04 AM, John Scialdone wrote:
>
> Siri Jodha,
>
> We've been kicking around the Data Format Controlled Vocabulary list
> as well. We had a call with Valerie, Tyler and Scott recently about
> inconsistencies in this list. One of the goals of this list (from ARC team
> review of our metadata) was to help users understand the software needed to
> read/use the data. We suggested adding a field to this structure whereby
> values such as "ESRI", "Microsoft", "QGIS", "Adobe", "Google" etc. could be
> associated with a format. I think it would be a good exercise for all the
> DAACs to generate a list of formats they use and associated s/w, then bring
> them together over some telecons, face to face meetings, and thru tracking
> this effort via the Earthdata wiki, to eventually help generate a more
> well-thought-out list.
>
> Thanx..
> John
>
>
> On 1/10/2020 4:08 PM, Siri Jodha Khalsa via esip-semanticweb wrote:
>
> I'm curious whether the ESIP semantic community has any opinions on these
> two controlled vocabularies.
>
> One question I have is why a measurement keyword list was necessary when
> GCMD already has the science keywords (a source for the original SWEET).
> i.e. why not integrate measurements (which are represented as variables in
> the science keywords) into the science keywords?
>
>
> The data format list is even more perplexing to me. "incidence angle file"
> is a format? Georeferenced TIFF in addition to GeoTIFF? DV (digital
> value?) is a format? KML as well as OGC KML? ASCII and text (what about
> Unicode)? DEM files, if this refers to digital elevation models with a
> .DEM extension from USGS, are ASCII files.
>
>
> Many formats are subtypes of other formats listed. Wouldn't a better
> approach be to list the *encodings* (a much smaller list, which would
> tell users how to read the data with software) and then add the conventions
> that have been applied (e.g. CF for netCDF or GRIB for binary)? For the
> encodings list, why not start with mime types?
>
>
> sjs
>
>
> On 11/12/19 5:00 PM, Stevens, Tyler B. (GSFC-423.0)[Stinger Ghaffarian
> Technologies] via esip-semanticweb wrote:
>
>
> * The NASA Global Change Master Directory (GCMD) staff is pleased to
> announce the release of the GCMD keywords version 9.0. Version 9.0 consists
> of two new keyword schemes: (1) Measurement Name and (2) Granule Data
> Format.  The Measurement Name
> <https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/MeasurementName/?format=csv>
> list is a preliminary set of (~100) keywords that represent an observable
> property, usually geophysical, geo-biophysical, physical, or chemical. The
> Granule Data Format
> <https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/GranuleDataFormat/?format=csv>
> list of keywords represents the format of the data that is distributed by
> the data center.  The keywords help facilitate the classification and
> discovery of Earth Science data by providing a rich vocabulary for
> characterizing the data. The GCMD keywords are used by hundreds of data
> providers worldwide for categorizing the ~33,000 records stored in the
> Common Metadata Repository <http://earthdata.nasa.gov/cmr>. For more
> information about the keywords and how to access them, please visit the
> Keyword Landing Page.
> <https://earthdata.nasa.gov/about/gcmd/global-change-master-directory-gcmd-keywords>
> Questions about the keywords can be submitted to
> support at earthdata.nasa.gov or directed to Valerie Dixon at
> valerie.dixon at nasa.gov.*
>
> _______________________________________________
> esip-semanticweb mailing list
> esip-semanticweb at lists.esipfed.org
> https://lists.esipfed.org/mailman/listinfo/esip-semanticweb
>
> --
> Siri-Jodha Singh KHALSA, Ph.D., SMIEEE
> National Snow and Ice Data Center
> University of Colorado
> Boulder, CO 80309-0449
> Phone: 1-303-492-1445 GV: 1-303-736-9976
> http://cires.colorado.edu/~khalsa
> http://orcid.org/0000-0001-9217-5550
>
>
>
>
>
> --
> John Scialdone
> Manager, Data Center Services: NASA Socioeconomic Data and Applications
> Center (SEDAC)
> Project Lead: Jamaica Bay Research and Management Information Network
> (JBRMIN)
> Project Lead: Jamaica Bay & Sandy Hook BioBlitz Events
>
> --------------------------------------------------------------------------------------
> Center for International Earth Science Information Network (CIESIN)
> Earth Institute @ Columbia University
> Lamont-Doherty Earth Observatory (LDEO)
> 61 Route 9W, PO Box 1000, Palisades, New York 10964 USA
> Phone: (845) 365-8978; FAX: (845) 365-8922
> Email: jscialdo at ciesin.columbia.edu; jns74 at columbia.edu
> CIESIN web site: www.ciesin.columbia.edu
> SEDAC web site: sedac.ciesin.columbia.edu
> JBRMIN web site: www.ciesin.columbia.edu/jamaicabay
> Sandy Hook web site: bioblitz17.ciesin.columbia.edu
>
> Follow us on:
> <https://twitter.com/ciesin/> Twitter |
> <https://www.facebook.com/socioeconomicdataandappsctr> Facebook |
> <https://www.youtube.com/channel/UCjUjAvV7M04SxxpM5wq4fMw?view_as=public>
> Youtube
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: formatlist.xml
Type: text/xml
Size: 39320 bytes
Desc: not available
URL: <http://lists.esipfed.org/pipermail/esip-semanticweb/attachments/20200113/4dc3d063/attachment-0001.xml>

