<div dir="ltr"><div>SiriJodha et al.,</div><div><br></div><div>Members of the DataONE network use a shared format list to tag granules with their serialization format. It might be useful to you in this effort. <br></div><div><br></div><div>Each format includes a format identifier, a human-readable name, an indication of whether it is primarily used to serialize metadata or data granules, and its associated MIME media type and file extensions. When it makes sense, we use the MIME type as the formatId, but in many cases one MIME type corresponds to many formats or to different versions of a format, so in those cases we use another reasonable identifier. Here is HDF5 as an example:<br></div><div><span style="font-family:monospace"><br> &lt;objectFormat&gt;<br> &lt;formatId&gt;application/x-hdf5&lt;/formatId&gt;<br> &lt;formatName&gt;Hierarchical Data Format version 5 (HDF5)&lt;/formatName&gt;<br> &lt;formatType&gt;DATA&lt;/formatType&gt;<br> &lt;mediaType name="application/x-hdf5"/&gt;<br> &lt;extension&gt;h5&lt;/extension&gt;<br> &lt;/objectFormat&gt;</span></div><div><br></div><div>We extend this list as needed when new formats are encountered, and the current list can be retrieved from the DataONE formats service, which is at <a href="https://cn.dataone.org/cn/v2/formats">https://cn.dataone.org/cn/v2/formats</a>. I've also attached a copy of the current list in case it is useful.</div><div><br></div><div>There have been other format list standardization efforts, including most recently the <a href="https://www.udfr.org/">Unified Digital Format Registry (UDFR)</a>, run by the California Digital Library. That is an ontological format registry that harmonizes prior vocabularies, including the earlier PRONOM and GDFR (the Global Digital Format Registry) efforts. Although we haven't adopted it yet, at DataONE we do try to stay consistent with UDFR. 
We'd love to see a global service like this be adopted, rather than doing it agency by agency.<br></div><div><br></div><div>Matt<br></div><div><br></div><div><br></div><div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><b>Matthew B. Jones</b></div><div>ORCID: <a href="https://orcid.org/0000-0003-0077-4738" target="_blank">0000-0003-0077-4738</a></div><div>
Director of Informatics R&D, <a href="http://www.nceas.ucsb.edu/ecoinfo" style="color:rgb(17,85,204)" target="_blank">National Center for Ecological Analysis and Synthesis</a></div><div>PI, NSF <a href="https://arcticdata.io/" style="color:rgb(17,85,204)" target="_blank">Arctic Data Center</a></div><div>Director, <a href="https://dataone.org/" style="color:rgb(17,85,204)" target="_blank">DataONE</a> program
</div><div>
University of California Santa Barbara</div></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jan 13, 2020 at 2:07 PM John Scialdone via esip-semanticweb <<a href="mailto:esip-semanticweb@lists.esipfed.org">esip-semanticweb@lists.esipfed.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
SiriJodha,<br>
<br>
Yes, we can contribute here; requesting edit permission.<br>
<br>
Thanx..<br>
John<br>
<br>
<div>On 1/13/2020 8:15 AM, Siri Jodha Khalsa
wrote:<br>
</div>
<blockquote type="cite">
<p>Hi John,</p>
<p><br>
</p>
<p>I agree, it would be good to have the DAACs compile their list
of formats. Ideally, there would be a shared spreadsheet, where
each DAAC would check against what was already there, to avoid
having the same format called different things.</p>
<p><br>
</p>
<p>I've taken the GCMD list and assigned encodings to all that I
could identify. The categories are ASCII, Binary, Image,
Library (i.e. associated with a software library like HDF), and
Proprietary (which is somewhat of a mixed bag, could be ASCII or
Binary, open or closed).</p>
<p><br>
</p>
<p>The spreadsheet is here:
<a href="https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing" target="_blank">https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing</a></p>
<p>Feedback, corrections, additions welcome. I'll give edit
permission as requested. </p>
<p><br>
</p>
<p>Cheers,</p>
<p>SiriJodha<br>
</p>
<p><br>
</p>
<div>On 1/11/20 12:04 AM, John Scialdone
wrote:<br>
</div>
<blockquote type="cite">
Siri Jodha,<br>
<br>
We've been kicking around the Data Format Controlled
Vocabulary list as well. We had a call with Valerie, Tyler and
Scott recently about inconsistencies in this list. One of the
goals of this list (from ARC team review of our metadata) was to
help users understand the software needed to read/use the data.
We suggested adding a field to this structure whereby values
such as "ESRI", "Microsoft", "QGIS", "Adobe", "Google", etc.
could be associated with a format. I think it would be a good
exercise for all the DAACs to generate a list of the formats they
use and the associated software, then bring the lists together
over some telecons and face-to-face meetings, tracking the effort
via the Earthdata wiki, to eventually help generate a more
well-thought-out list.<br>
<br>
Thanx..<br>
John<br>
<br>
<br>
<div>On 1/10/2020 4:08 PM, Siri Jodha
Khalsa via esip-semanticweb wrote:<br>
</div>
<blockquote type="cite">
<p>I'm curious whether the ESIP semantic community has any
opinions on these two controlled vocabularies. <br>
</p>
<p>One question I have is why a measurement keyword list was
necessary when GCMD already has the science keywords (a
source for the original SWEET). That is, why not integrate
measurements (which are represented as variables in the
science keywords) into the science keywords?</p>
<p><br>
</p>
<p>The data format list is even more perplexing to me.
"incidence angle file" is a format? Georeferenced TIFF in
addition to GeoTIFF? DV (digital value?) is a format? KML
as well as OGC KML? ASCII and text (what about Unicode)?
DEM files, if this refers to digital elevation models with a .DEM
extension from USGS, are ASCII files. </p>
<p><br>
</p>
<p>Many formats are subtypes of other formats listed. Wouldn't
a better approach be to list the <i>encodings</i> (a much
smaller list, which would tell users how to read the data
with software) and then add the conventions that have been
applied (e.g. CF for netCDF or GRIB for binary)? For the
encodings list, why not start with MIME types?<br>
</p>
<p><br>
</p>
<p>sjs<br>
</p>
<p><br>
</p>
<div>On 11/12/19 5:00 PM, Stevens,
Tyler B. (GSFC-423.0)[Stinger Ghaffarian Technologies] via
esip-semanticweb wrote:<br>
</div>
<blockquote type="cite">
<b style="font-weight:normal">
<p dir="ltr" style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:0pt;margin-bottom:0pt">
<span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The NASA Global Change
Master Directory (GCMD) staff is pleased to announce
the release of the GCMD keywords version 9.0. Version
9.0 consists of two new keyword schemes: (1)
Measurement Name and (2) Granule Data Format. </span></p>
<br>
<p dir="ltr" style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:0pt;margin-bottom:0pt">
<span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The </span><a href="https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/MeasurementName/?format=csv" target="_blank"><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Measurement Name</span></a><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400"> list is a preliminary
set of (~100) keywords that represent an observable
property, usually geophysical, geo-biophysical,
physical, or chemical. The </span><a href="https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/GranuleDataFormat/?format=csv" target="_blank"><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Granule Data Format</span></a><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400"> list of keywords
represents the format of the data that is distributed
by the data center. </span></p>
<p dir="ltr" style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:12pt;margin-bottom:12pt"> <span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
keywords help facilitate the classification and
discovery of Earth Science data by providing a rich
vocabulary for characterizing the data. The GCMD
keywords are used by hundreds of data providers
worldwide for categorizing the ~33,000 records stored
in the</span><a href="http://earthdata.nasa.gov/cmr" target="_blank"><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400;text-decoration:underline"> </span><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Common Metadata
Repository</span></a><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">.</span></p>
<span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">For more information
about the keywords and how to access them, please visit
the</span><a href="https://earthdata.nasa.gov/about/gcmd/global-change-master-directory-gcmd-keywords" target="_blank"><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400"> </span><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Keyword Landing Page.</span></a><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400"> Questions about the keywords can be
submitted to </span><a href="mailto:support@earthdata.nasa.gov" target="_blank"><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">support@earthdata.nasa.gov</span></a><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400"> </span><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">or
directed to Valerie Dixon at </span><span style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400"><a href="mailto:valerie.dixon@nasa.gov" target="_blank">valerie.dixon@nasa.gov</a></span><span style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">.</span></b> <br>
<fieldset></fieldset>
<pre>_______________________________________________
esip-semanticweb mailing list
<a href="mailto:esip-semanticweb@lists.esipfed.org" target="_blank">esip-semanticweb@lists.esipfed.org</a>
<a href="https://lists.esipfed.org/mailman/listinfo/esip-semanticweb" target="_blank">https://lists.esipfed.org/mailman/listinfo/esip-semanticweb</a>
</pre>
</blockquote>
<pre cols="72">--
Siri-Jodha Singh KHALSA, Ph.D., SMIEEE
National Snow and Ice Data Center
University of Colorado
Boulder, CO 80309-0449 Phone: 1-303-492-1445 GV: 1-303-736-9976
<a href="http://cires.colorado.edu/%7Ekhalsa" target="_blank">http://cires.colorado.edu/~khalsa</a>
<a href="http://orcid.org/0000-0001-9217-5550" target="_blank">http://orcid.org/0000-0001-9217-5550</a>
</pre>
<br>
</blockquote>
<br>
<div>-- <br>
John Scialdone<br>
Manager, Data Center Services: NASA Socioeconomic Data and
Applications Center (SEDAC)<br>
Project Lead: Jamaica Bay Research and Management Information
Network (JBRMIN)<br>
Project Lead: Jamaica Bay & Sandy Hook BioBlitz Events<br>
--------------------------------------------------------------------------------------<br>
Center for International Earth Science Information Network
(CIESIN)<br>
Earth Institute @ Columbia University<br>
Lamont-Doherty Earth Observatory (LDEO)<br>
61 Route 9W, PO Box 1000, Palisades, New York 10964 USA<br>
Phone: (845) 365-8978; FAX: (845) 365-8922<br>
Email: <a href="mailto:jscialdo@ciesin.columbia.edu" target="_blank">jscialdo@ciesin.columbia.edu</a>; <a href="mailto:jns74@columbia.edu" target="_blank">jns74@columbia.edu</a><br>
CIESIN web site: <a href="http://www.ciesin.columbia.edu" target="_blank">www.ciesin.columbia.edu</a><br>
SEDAC web site: <a href="http://sedac.ciesin.columbia.edu" target="_blank">sedac.ciesin.columbia.edu</a><br>
JBRMIN web site: <a href="http://www.ciesin.columbia.edu/jamaicabay" target="_blank">www.ciesin.columbia.edu/jamaicabay</a><br>
Sandy Hook web site: <a href="http://bioblitz17.ciesin.columbia.edu/" target="_blank">bioblitz17.ciesin.columbia.edu</a><br>
<p>Follow us on:<br>
<a href="https://twitter.com/ciesin/" target="_blank"><img src="http://ciesin.columbia.edu/images/twitter.png" border="0"></a> Twitter | <a href="https://www.facebook.com/socioeconomicdataandappsctr" target="_blank"><img src="http://ciesin.columbia.edu/images/fb.png" border="0"></a> Facebook | <a href="https://www.youtube.com/channel/UCjUjAvV7M04SxxpM5wq4fMw?view_as=public" target="_blank"><img src="http://ciesin.columbia.edu/images/youtube.png" border="0"></a> Youtube </p>
</div>
</blockquote>
</blockquote>
<br>
</div>
</blockquote></div>
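<div>For anyone who wants to work with the format list programmatically, here is a minimal Python sketch that reads objectFormat records like the HDF5 example at the top of this message. It assumes the list is XML whose objectFormat elements carry no namespace prefix, as in that example. The ObjectFormat class and the parse_formats function are hypothetical names used only for illustration; they are not part of any DataONE library.</div>

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# The HDF5 record quoted earlier in this message, used as sample input.
SAMPLE = """
<objectFormat>
  <formatId>application/x-hdf5</formatId>
  <formatName>Hierarchical Data Format version 5 (HDF5)</formatName>
  <formatType>DATA</formatType>
  <mediaType name="application/x-hdf5"/>
  <extension>h5</extension>
</objectFormat>
"""


@dataclass
class ObjectFormat:
    """Hypothetical container for one record of the format list."""
    format_id: str
    name: str
    format_type: str            # e.g. DATA, as in the sample record
    media_type: Optional[str]   # the name attribute of mediaType, if present
    extension: Optional[str]    # first extension element, if present


def parse_formats(xml_text: str) -> List[ObjectFormat]:
    """Collect every objectFormat element found in the given XML text."""
    root = ET.fromstring(xml_text.strip())
    formats = []
    # iter() also yields the root itself, so a single record parses too.
    for el in root.iter("objectFormat"):
        media = el.find("mediaType")
        formats.append(ObjectFormat(
            format_id=el.findtext("formatId", default=""),
            name=el.findtext("formatName", default=""),
            format_type=el.findtext("formatType", default=""),
            media_type=media.get("name") if media is not None else None,
            extension=el.findtext("extension"),
        ))
    return formats


formats = parse_formats(SAMPLE)
# Index by formatId so a granule's tag can be resolved to its description.
by_id = {f.format_id: f for f in formats}
print(by_id["application/x-hdf5"].name)
```

<div>The same function should also work on a document containing many objectFormat elements, provided the assumption about unprefixed element names holds for it.</div>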