<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Hi Matt,</p>
<p>Thanks so much for this info. Agree, it'd be great to have a
persistent global registry of formats used for data. The NASA
Common Metadata Repository needs to store this information, which
comes from the DAACs, so the list becomes the union of what the
DAACs submit. The lack of a referenceable list of data formats
does impede interoperability. The problem doesn't belong to any
one agency or scientific discipline, so the question is who would
support the effort to develop and maintain the registry.</p>
<p>SiriJodha<br>
</p>
<div class="moz-cite-prefix">On 1/13/20 8:39 PM, Matt Jones wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAFSW8xmQ+=MYFYNqX+HStaafY2w5_R4Gsror1RFWxtA=xR2iNg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>SiriJodha et al.,</div>
<div><br>
</div>
<div>Members of the DataONE network share a format list that is
used to tag granules with their serialization format. It
might be useful to you in this effort. <br>
</div>
<div><br>
</div>
<div>Each format entry includes a format identifier, a
human-readable name, an indication of whether it is primarily
used to serialize metadata or data granules, and its associated
MIME media type and file extensions. When it makes sense, we use
the MIME type as the formatId, but in many cases one MIME type
corresponds to many formats or to different versions of a format,
so in those cases we choose another reasonable identifier.
Here is HDF5 as an example:<br>
</div>
<div><span style="font-family:monospace"><br>
&lt;objectFormat&gt;<br>
&lt;formatId&gt;application/x-hdf5&lt;/formatId&gt;<br>
&lt;formatName&gt;Hierarchical Data Format version 5
(HDF5)&lt;/formatName&gt;<br>
&lt;formatType&gt;DATA&lt;/formatType&gt;<br>
&lt;mediaType name="application/x-hdf5"/&gt;<br>
&lt;extension&gt;h5&lt;/extension&gt;<br>
&lt;/objectFormat&gt;</span></div>
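<div><br>
</div>
<div>For illustration only, here is a minimal Python sketch of such a
format entry. The field names follow the elements in the HDF5
example. The class name ObjectFormat, the helper functions, the
"METADATA" type value, and the two example-metadata identifiers are
hypothetical and are not taken from the DataONE list. The sketch
serializes one entry with the same element names and shows how a
single media type can map to several format identifiers.</div>
<pre>
```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectFormat:
    """One format-list entry (hypothetical class, fields mirror the example)."""
    format_id: str
    format_name: str
    format_type: str      # "DATA", or "METADATA" (assumed value)
    media_type: str
    extensions: tuple = ()


def to_xml(fmt):
    """Serialize an entry using the element names from the HDF5 example."""
    root = ET.Element("objectFormat")
    ET.SubElement(root, "formatId").text = fmt.format_id
    ET.SubElement(root, "formatName").text = fmt.format_name
    ET.SubElement(root, "formatType").text = fmt.format_type
    ET.SubElement(root, "mediaType", name=fmt.media_type)
    for ext in fmt.extensions:
        ET.SubElement(root, "extension").text = ext
    return ET.tostring(root, encoding="unicode")


def by_media_type(formats):
    """Group format identifiers by media type, preserving input order."""
    index = {}
    for fmt in formats:
        index.setdefault(fmt.media_type, []).append(fmt.format_id)
    return index


# The HDF5 entry from the example above.
HDF5 = ObjectFormat(
    format_id="application/x-hdf5",
    format_name="Hierarchical Data Format version 5 (HDF5)",
    format_type="DATA",
    media_type="application/x-hdf5",
    extensions=("h5",),
)

# Hypothetical entries: two versions of one format sharing a media type,
# so the media type cannot serve as the identifier.
EXAMPLE_V1 = ObjectFormat("example-metadata-v1", "Example Metadata 1.0",
                          "METADATA", "text/xml", ("xml",))
EXAMPLE_V2 = ObjectFormat("example-metadata-v2", "Example Metadata 2.0",
                          "METADATA", "text/xml", ("xml",))

if __name__ == "__main__":
    print(to_xml(HDF5))
    print(by_media_type([HDF5, EXAMPLE_V1, EXAMPLE_V2]))
```
</pre>
<div>Keying the list on formatId rather than on media type is what
allows several formats, or several versions of one format, to share
a media type.</div>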
<div><br>
</div>
<div>We extend this list as needed when new formats are
encountered, and the current list can be retrieved from the
DataONE formats service, which is at <a
href="https://cn.dataone.org/cn/v2/formats"
moz-do-not-send="true">https://cn.dataone.org/cn/v2/formats</a>
. I've also attached a copy of the current list in case it is
useful.</div>
<div><br>
</div>
<div>There have been other format list standardization efforts,
including most recently the <a href="https://www.udfr.org/"
moz-do-not-send="true">Unified Digital Format Registry
(UDFR)</a>, run by the California Digital Library. That is
an ontological format registry that harmonizes prior
vocabularies, including the earlier PRONOM and GDFR (the
Global Digital Format Registry) efforts. Although we haven't
adopted it yet, we do try to be sure we are consistent with
UDFR at DataONE. We'd love to see a global service like this
be adopted, rather than doing it agency by agency.<br>
</div>
<div><br>
</div>
<div>Matt<br>
</div>
<div><br>
</div>
<div><br>
</div>
<div>
<div>
<div dir="ltr" class="gmail_signature"
data-smartmail="gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div><b>Matthew B. Jones</b></div>
<div>ORCID: <a
href="https://orcid.org/0000-0003-0077-4738"
target="_blank" moz-do-not-send="true">0000-0003-0077-4738</a></div>
<div>
Director of Informatics R&D, <a
href="http://www.nceas.ucsb.edu/ecoinfo"
style="color:rgb(17,85,204)" target="_blank"
moz-do-not-send="true">National Center for
Ecological Analysis and Synthesis</a></div>
<div>PI, NSF <a href="https://arcticdata.io/"
style="color:rgb(17,85,204)" target="_blank"
moz-do-not-send="true">Arctic Data Center</a></div>
<div>Director, <a href="https://dataone.org/"
style="color:rgb(17,85,204)" target="_blank"
moz-do-not-send="true">DataONE</a> program
</div>
<div>
University of California Santa Barbara</div>
</div>
</div>
</div>
</div>
</div>
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, Jan 13, 2020 at 2:07
PM John Scialdone via esip-semanticweb &lt;<a
href="mailto:esip-semanticweb@lists.esipfed.org"
moz-do-not-send="true">esip-semanticweb@lists.esipfed.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> SiriJodha,<br>
<br>
Yes, we can contribute here, request edit permission.<br>
<br>
Thanx..<br>
John<br>
<br>
<div>On 1/13/2020 8:15 AM, Siri Jodha Khalsa wrote:<br>
</div>
<blockquote type="cite">
<p>Hi John,</p>
<p><br>
</p>
<p>I agree, it would be good to have the DAACs compile
their list of formats. Ideally, there would be a shared
spreadsheet, where each DAAC would check against what
was already there, to avoid having the same format called
different things.</p>
<p><br>
</p>
<p>I've taken the GCMD list and assigned encodings to all
that I could identify. The categories are ASCII,
Binary, Image, Library (i.e. associated with a software
library like HDF), and Proprietary (somewhat of a mixed
bag: it could be ASCII or Binary, open or closed).</p>
<p><br>
</p>
<p>The spreadsheet is here: <a
href="https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing"
target="_blank" moz-do-not-send="true">https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing</a></p>
<p>Feedback, corrections, additions welcome. I'll give
edit permission as requested. </p>
<p><br>
</p>
<p>Cheers,</p>
<p>SiriJodha<br>
</p>
<p><br>
</p>
<div>On 1/11/20 12:04 AM, John Scialdone wrote:<br>
</div>
<blockquote type="cite"> Siri Jodha,<br>
<br>
We've been kicking around the Data Format Controlled
Vocabulary list as well. We had a call with Valerie,
Tyler and Scott recently about inconsistencies in this
list. One of the goals of this list (from the ARC team
review of our metadata) was to help users understand the
software needed to read/use the data. We suggested
adding a field to this structure whereby values such as
"ESRI", "Microsoft", "QGIS", "Adobe", "Google" etc.
could be associated with a format. I think it would be a
good exercise for all the DAACs to generate a list of
the formats they use and the associated software, then
bring them together over some telecons and face-to-face
meetings, tracking this effort via the Earthdata wiki, to
eventually help generate a more well-thought-out list.<br>
<br>
Thanx..<br>
John<br>
<br>
<br>
<div>On 1/10/2020 4:08 PM, Siri Jodha Khalsa via
esip-semanticweb wrote:<br>
</div>
<blockquote type="cite">
<p>I'm curious whether the ESIP semantic community has
any opinions on these two controlled vocabularies. <br>
</p>
<p>One question I have is why a measurement keyword
list was necessary when GCMD already has the science
keywords (a source for the original SWEET), i.e., why
not integrate measurements (which are represented as
variables in the science keywords) into the science
keywords?</p>
<p><br>
</p>
<p>The data format list is even more perplexing to me.
"Incidence angle file" is a format? Georeferenced
TIFF in addition to GeoTIFF? DV (digital value?)
is a format? KML as well as OGC KML? ASCII and text
(what about Unicode)? DEM files, if this refers to
digital elevation models with a .DEM extension from USGS,
are ASCII files. </p>
<p><br>
</p>
<p>Many formats are subtypes of other formats listed.
Wouldn't a better approach be to list the <i>encodings</i>
(a much smaller list, which would tell users how to
read the data with software) and then add the
conventions that have been applied (e.g. CF for
netCDF or GRIB for binary)? For the encodings list,
why not start with MIME types?<br>
</p>
<p><br>
</p>
<p>sjs<br>
</p>
<p><br>
</p>
<div>On 11/12/19 5:00 PM, Stevens, Tyler B.
(GSFC-423.0)[Stinger Ghaffarian Technologies] via
esip-semanticweb wrote:<br>
</div>
<blockquote type="cite"> <b
style="font-weight:normal">
<p dir="ltr"
style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:0pt;margin-bottom:0pt">
<span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
NASA Global Change Master Directory (GCMD)
staff is pleased to announce the release of
the GCMD keywords version 9.0. Version 9.0
consists of two new keyword schemes: (1)
Measurement Name and (2) Granule Data Format. </span></p>
<br>
<p dir="ltr"
style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:0pt;margin-bottom:0pt">
<span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
</span><a
href="https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/MeasurementName/?format=csv"
target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Measurement
Name</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
list is a preliminary set of (~100) keywords
that represent an observable property, usually
geophysical, geo-biophysical, physical, or
chemical. The </span><a
href="https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/GranuleDataFormat/?format=csv"
target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Granule
Data Format</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
list of keywords represents the format of the
data that is distributed by the data center. </span></p>
<p dir="ltr"
style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:12pt;margin-bottom:12pt">
<span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
keywords help facilitate the classification
and discovery of Earth Science data by
providing a rich vocabulary for characterizing
the data. The GCMD keywords are used by
hundreds of data providers worldwide for
categorizing the ~33,000 records stored in the</span><a
href="http://earthdata.nasa.gov/cmr"
target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400;text-decoration:underline">
</span><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Common
Metadata Repository</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">.</span></p>
<span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">For
more information about the keywords and how to
access them, please visit the</span><a
href="https://earthdata.nasa.gov/about/gcmd/global-change-master-directory-gcmd-keywords"
target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
</span><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Keyword
Landing Page.</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
Questions about the keywords can be submitted to
</span><a href="mailto:support@earthdata.nasa.gov"
target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">support@earthdata.nasa.gov</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400">
</span><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">or
directed to Valerie Dixon at </span><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400"><a
href="mailto:valerie.dixon@nasa.gov"
target="_blank" moz-do-not-send="true">valerie.dixon@nasa.gov</a></span><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">.</span></b>
<br>
<fieldset></fieldset>
<pre>_______________________________________________
esip-semanticweb mailing list
<a href="mailto:esip-semanticweb@lists.esipfed.org" target="_blank" moz-do-not-send="true">esip-semanticweb@lists.esipfed.org</a>
<a href="https://lists.esipfed.org/mailman/listinfo/esip-semanticweb" target="_blank" moz-do-not-send="true">https://lists.esipfed.org/mailman/listinfo/esip-semanticweb</a>
</pre>
</blockquote>
<pre cols="72">--
Siri-Jodha Singh KHALSA, Ph.D., SMIEEE
National Snow and Ice Data Center
University of Colorado
Boulder, CO 80309-0449 Phone: 1-303-492-1445 GV: 1-303-736-9976
<a href="http://cires.colorado.edu/%7Ekhalsa" target="_blank" moz-do-not-send="true">http://cires.colorado.edu/~khalsa</a>
<a href="http://orcid.org/0000-0001-9217-5550" target="_blank" moz-do-not-send="true">http://orcid.org/0000-0001-9217-5550</a>
</pre>
<br>
</blockquote>
<br>
<div>-- <br>
John Scialdone<br>
Manager, Data Center Services: NASA Socioeconomic Data
and Applications Center (SEDAC)<br>
Project Lead: Jamaica Bay Research and Management
Information Network (JBRMIN)<br>
Project Lead: Jamaica Bay & Sandy Hook BioBlitz
Events<br>
--------------------------------------------------------------------------------------<br>
Center for International Earth Science Information
Network (CIESIN)<br>
Earth Institute @ Columbia University<br>
Lamont-Doherty Earth Observatory (LDEO)<br>
61 Route 9W, PO Box 1000, Palisades, New York 10964
USA<br>
Phone: (845) 365-8978; FAX: (845) 365-8922<br>
Email: <a href="mailto:jscialdo@ciesin.columbia.edu"
target="_blank" moz-do-not-send="true">jscialdo@ciesin.columbia.edu</a>;
<a href="mailto:jns74@columbia.edu" target="_blank"
moz-do-not-send="true">jns74@columbia.edu</a><br>
CIESIN web site: <a
href="http://www.ciesin.columbia.edu"
target="_blank" moz-do-not-send="true">www.ciesin.columbia.edu</a><br>
SEDAC web site: <a
href="http://sedac.ciesin.columbia.edu"
target="_blank" moz-do-not-send="true">sedac.ciesin.columbia.edu</a><br>
JBRMIN web site: <a
href="http://www.ciesin.columbia.edu/jamaicabay"
target="_blank" moz-do-not-send="true">www.ciesin.columbia.edu/jamaicabay</a><br>
Sandy Hook web site: <a
href="http://bioblitz17.ciesin.columbia.edu/"
target="_blank" moz-do-not-send="true">bioblitz17.ciesin.columbia.edu</a><br>
<p>Follow us on:<br>
<a href="https://twitter.com/ciesin/"
target="_blank" moz-do-not-send="true"><img
src="http://ciesin.columbia.edu/images/twitter.png"
moz-do-not-send="true" border="0"></a> Twitter |
<a
href="https://www.facebook.com/socioeconomicdataandappsctr"
target="_blank" moz-do-not-send="true"><img
src="http://ciesin.columbia.edu/images/fb.png"
moz-do-not-send="true" border="0"></a> Facebook
| <a
href="https://www.youtube.com/channel/UCjUjAvV7M04SxxpM5wq4fMw?view_as=public"
target="_blank" moz-do-not-send="true"><img
src="http://ciesin.columbia.edu/images/youtube.png"
moz-do-not-send="true" border="0"></a> Youtube </p>
</div>
</blockquote>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<pre class="moz-signature" cols="72">--
Siri-Jodha Singh KHALSA, Ph.D., SMIEEE
National Snow and Ice Data Center
University of Colorado
Boulder, CO 80309-0449 Phone: 1-303-492-1445 GV: 1-303-736-9976
<a class="moz-txt-link-freetext" href="http://cires.colorado.edu/~khalsa">http://cires.colorado.edu/~khalsa</a>
<a class="moz-txt-link-freetext" href="http://orcid.org/0000-0001-9217-5550">http://orcid.org/0000-0001-9217-5550</a>
</pre>
</body>
</html>