<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Hi Matt,</p>
    <p>Thanks so much for this info. I agree, it'd be great to have a
      persistent global registry of data formats. The NASA
      Common Metadata Repository needs to store this information, which
      comes from the DAACs, so the list becomes the union of what the
      DAACs submit.  The lack of a referenceable list of data formats
      does impede interoperability.  The problem doesn't belong to any
      one agency or scientific discipline, so the question is who would
      support the effort to develop and maintain the registry.</p>
    <p>SiriJodha<br>
    </p>
    <div class="moz-cite-prefix">On 1/13/20 8:39 PM, Matt Jones wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAFSW8xmQ+=MYFYNqX+HStaafY2w5_R4Gsror1RFWxtA=xR2iNg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div>SiriJodha et al.,</div>
        <div><br>
        </div>
        <div>Members of the DataONE network share a format list
          that is used to tag granules with their serialization format. It
          might be useful to you in this effort. <br>
        </div>
        <div><br>
        </div>
        <div>Each format includes a format identifier, a human-readable
          name, an indication of whether it is primarily used to serialize
          metadata or data granules, and its associated MIME media type
          and file extensions.  When it makes sense, we use the
          MIME type as the formatId, but in many cases one MIME type
          corresponds to many formats or to different versions of a format,
          so in those cases we use some other reasonable identifier.
          Here is HDF5 as an example:<br>
        </div>
        <div><span style="font-family:monospace"><br>
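            &lt;objectFormat&gt;<br>
            &nbsp;&nbsp;&nbsp;&nbsp;&lt;formatId&gt;application/x-hdf5&lt;/formatId&gt;<br>
            &nbsp;&nbsp;&nbsp;&nbsp;&lt;formatName&gt;Hierarchical Data Format version 5
            (HDF5)&lt;/formatName&gt;<br>
            &nbsp;&nbsp;&nbsp;&nbsp;&lt;formatType&gt;DATA&lt;/formatType&gt;<br>
            &nbsp;&nbsp;&nbsp;&nbsp;&lt;mediaType name="application/x-hdf5"/&gt;<br>
            &nbsp;&nbsp;&nbsp;&nbsp;&lt;extension&gt;h5&lt;/extension&gt;<br>
            &lt;/objectFormat&gt;</span></div>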
        <div><br>
        </div>
        <div>We extend this list as needed when new formats are
          encountered, and the current list can be retrieved from the
          DataONE formats service at <a
            href="https://cn.dataone.org/cn/v2/formats"
            moz-do-not-send="true">https://cn.dataone.org/cn/v2/formats</a>.
          I've also attached a copy of the current list in case it is
          useful.</div>
        <div><br>
        </div>
        <div>There have been other efforts to standardize format lists,
          most recently the <a href="https://www.udfr.org/"
            moz-do-not-send="true">Unified Digital Format Registry
            (UDFR)</a>, run by the California Digital Library. It is
          an ontological format registry that harmonizes prior
          vocabularies, including the earlier PRONOM and GDFR (Global
          Digital Format Registry) efforts. Although we haven't
          adopted it yet, we do try to keep DataONE consistent with
          UDFR.  We'd love to see a global service like this
          adopted, rather than each agency building its own.<br>
        </div>
        <div><br>
        </div>
        <div>Matt<br>
        </div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>
          <div>
            <div dir="ltr" class="gmail_signature"
              data-smartmail="gmail_signature">
              <div dir="ltr">
                <div>
                  <div dir="ltr">
                    <div><b>Matthew B. Jones</b></div>
                    <div>ORCID: <a
                        href="https://orcid.org/0000-0003-0077-4738"
                        target="_blank" moz-do-not-send="true">0000-0003-0077-4738</a></div>
                    <div>
                      Director of Informatics R&amp;D, <a
                        href="http://www.nceas.ucsb.edu/ecoinfo"
                        style="color:rgb(17,85,204)" target="_blank"
                        moz-do-not-send="true">National Center for
                        Ecological Analysis and Synthesis</a></div>
                    <div>PI, NSF <a href="https://arcticdata.io/"
                        style="color:rgb(17,85,204)" target="_blank"
                        moz-do-not-send="true">Arctic Data Center</a></div>
                    <div>Director, <a href="https://dataone.org/"
                        style="color:rgb(17,85,204)" target="_blank"
                        moz-do-not-send="true">DataONE</a> program
                    </div>
                    <div>
                      University of California Santa Barbara</div>
                  </div>
                </div>
              </div>
            </div>
          </div>
          <br>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Mon, Jan 13, 2020 at 2:07
          PM John Scialdone via esip-semanticweb &lt;<a
            href="mailto:esip-semanticweb@lists.esipfed.org"
            moz-do-not-send="true">esip-semanticweb@lists.esipfed.org</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div bgcolor="#FFFFFF"> SiriJodha,<br>
            <br>
            Yes, we can contribute here, request edit permission.<br>
            <br>
            Thanx..<br>
            John<br>
            <br>
            <div>On 1/13/2020 8:15 AM, Siri Jodha Khalsa wrote:<br>
            </div>
            <blockquote type="cite">
              <p>Hi John,</p>
              <p><br>
              </p>
              <p>I agree, it would be good to have the DAACs compile
                their lists of formats. Ideally, there would be a shared
                spreadsheet, where each DAAC would check against what
                was already there, to avoid having the same format called
                different things.</p>
              <p><br>
              </p>
              <p>I've taken the GCMD list and assigned encodings to all
                that I could identify.  The categories are ASCII,
                Binary, Image, Library (i.e. associated with a software
                library like HDF), and Proprietary (which is somewhat of
                a mixed bag, could be ASCII or Binary, open or closed).</p>
              <p><br>
              </p>
              <p>The spreadsheet is here: <a
href="https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing"
                  target="_blank" moz-do-not-send="true">https://docs.google.com/spreadsheets/d/1Lt7hl-_NKbp37FZkQ870b9c9LhlS39ZMq1N-UZdPUp8/edit?usp=sharing</a></p>
              <p>Feedback, corrections, additions welcome. I'll give
                edit permission as requested.  </p>
              <p><br>
              </p>
              <p>Cheers,</p>
              <p>SiriJodha<br>
              </p>
              <p><br>
              </p>
              <div>On 1/11/20 12:04 AM, John Scialdone wrote:<br>
              </div>
              <blockquote type="cite"> Siri Jodha,<br>
                <br>
                We've been kicking around the Data Format Controlled
                Vocabulary list as well. We had a call with Valerie,
                Tyler and Scott recently about inconsistencies in this
                list. One of the goals of this list (from the ARC team
                review of our metadata) was to help users understand the
                software needed to read/use the data. We suggested
                adding a field to this structure whereby values such as
                "ESRI", "Microsoft", "QGIS", "Adobe", "Google" etc.
                could be associated with a format. I think it would be a
                good exercise for all the DAACs to generate a list of
                the formats they use and the associated s/w, then bring them
                together over some telecons and face-to-face meetings,
                tracking this effort via the Earthdata wiki, to
                eventually help generate a more well-thought-out list.<br>
                <br>
                Thanx..<br>
                John<br>
                <br>
                <br>
                <div>On 1/10/2020 4:08 PM, Siri Jodha Khalsa via
                  esip-semanticweb wrote:<br>
                </div>
                <blockquote type="cite">
                  <p>I'm curious whether the ESIP semantic community has
                    any opinions on these two controlled vocabularies. <br>
                  </p>
                  <p>One question I have is why a measurement keyword
                    list was necessary when GCMD already has the science
                    keywords (a source for the original SWEET), i.e. why
                    not integrate measurements (which are represented as
                    variables in the science keywords) into the science
                    keywords?</p>
                  <p><br>
                  </p>
                  <p>The data format list is even more perplexing to me.
                    "incidence angle file" is a format? Georeferenced
                    TIFF in addition to GeoTIFF?  DV (digital value?)
                    is a format?  KML as well as OGC KML? ASCII and text
                    (what about Unicode?)? DEM files, if this refers to digital
                    elevation models with a .DEM extension from USGS,
                    are ASCII files. </p>
                  <p><br>
                  </p>
                  <p>Many formats are subtypes of other formats listed.
                    Wouldn't a better approach be to list the <i>encodings</i>
                    (a much smaller list, which would tell users how to
                    read the data with software) and then add the
                    conventions that have been applied (e.g. CF for
                    netCDF or GRIB for binary)? For the encodings list,
                    why not start with MIME types?<br>
                  </p>
                  <p><br>
                  </p>
                  <p>sjs<br>
                  </p>
                  <p><br>
                  </p>
                  <div>On 11/12/19 5:00 PM, Stevens, Tyler B.
                    (GSFC-423.0)[Stinger Ghaffarian Technologies] via
                    esip-semanticweb wrote:<br>
                  </div>
                  <blockquote type="cite"> <b
                      style="font-weight:normal">
                      <p dir="ltr"
style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:0pt;margin-bottom:0pt">
                        <span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
                          NASA Global Change Master Directory (GCMD)
                          staff is pleased to announce the release of
                          the GCMD keywords version 9.0. Version 9.0
                          consists of two new keyword schemes: (1)
                          Measurement Name and (2) Granule Data Format. </span></p>
                      <br>
                      <p dir="ltr"
style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:0pt;margin-bottom:0pt">
                        <span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
                        </span><a
href="https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/MeasurementName/?format=csv"
                          target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Measurement
                            Name</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
                          list is a preliminary set of (~100) keywords
                          that represent an observable property, usually
                          geophysical, geo-biophysical, physical, or
                          chemical. The </span><a
href="https://gcmdservices.gsfc.nasa.gov/kms/concepts/concept_scheme/GranuleDataFormat/?format=csv"
                          target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Granule
                            Data Format</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
                          list of keywords represents the format of the
                          data that is distributed by the data center. </span></p>
                      <p dir="ltr"
style="line-height:1.38;margin-right:9pt;text-align:justify;margin-top:12pt;margin-bottom:12pt">
                        <span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">The
                          keywords help facilitate the classification
                          and discovery of Earth Science data by
                          providing a rich vocabulary for characterizing
                          the data. The GCMD keywords are used by
                          hundreds of data providers worldwide for
                          categorizing the ~33,000 records stored in the</span><a
                          href="http://earthdata.nasa.gov/cmr"
                          target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400;text-decoration:underline">
                          </span><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Common
                            Metadata Repository</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">.</span></p>
                      <span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">For
                        more information about the keywords and how to
                        access them, please visit the</span><a
href="https://earthdata.nasa.gov/about/gcmd/global-change-master-directory-gcmd-keywords"
                        target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
                        </span><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">Keyword
                          Landing Page.</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">
                        Questions about the keywords can be submitted to
                      </span><a href="mailto:support@earthdata.nasa.gov"
                        target="_blank" moz-do-not-send="true"><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400;text-decoration:underline">support@earthdata.nasa.gov</span></a><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400">
                      </span><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">or
                        directed to Valerie Dixon at </span><span
style="font-size:12pt;font-family:Arial;color:rgb(17,85,204);font-weight:400"><a
                          href="mailto:valerie.dixon@nasa.gov"
                          target="_blank" moz-do-not-send="true">valerie.dixon@nasa.gov</a></span><span
style="font-size:12pt;font-family:Arial;color:rgb(0,0,0);font-weight:400">.</span></b>
                    <br>
                    <fieldset></fieldset>
                    <pre>_______________________________________________
esip-semanticweb mailing list
<a href="mailto:esip-semanticweb@lists.esipfed.org" target="_blank" moz-do-not-send="true">esip-semanticweb@lists.esipfed.org</a>
<a href="https://lists.esipfed.org/mailman/listinfo/esip-semanticweb" target="_blank" moz-do-not-send="true">https://lists.esipfed.org/mailman/listinfo/esip-semanticweb</a>
</pre>
                  </blockquote>
                  <pre cols="72">-- 
Siri-Jodha Singh KHALSA, Ph.D., SMIEEE
National Snow and Ice Data Center
University of Colorado
Boulder, CO 80309-0449 Phone: 1-303-492-1445 GV: 1-303-736-9976
<a href="http://cires.colorado.edu/%7Ekhalsa" target="_blank" moz-do-not-send="true">http://cires.colorado.edu/~khalsa</a>
<a href="http://orcid.org/0000-0001-9217-5550" target="_blank" moz-do-not-send="true">http://orcid.org/0000-0001-9217-5550</a>
</pre>
                </blockquote>
                <br>
                <div>-- <br>
                  John Scialdone<br>
                  Manager, Data Center Services: NASA Socioeconomic Data
                  and Applications Center (SEDAC)<br>
                  Project Lead: Jamaica Bay Research and Management
                  Information Network (JBRMIN)<br>
                  Project Lead: Jamaica Bay &amp; Sandy Hook BioBlitz
                  Events<br>
--------------------------------------------------------------------------------------<br>
                  Center for International Earth Science Information
                  Network (CIESIN)<br>
                  Earth Institute @ Columbia University<br>
                  Lamont-Doherty Earth Observatory (LDEO)<br>
                  61 Route 9W, PO Box 1000, Palisades, New York 10964
                  USA<br>
                  Phone: (845) 365-8978; FAX: (845) 365-8922<br>
                  Email: <a href="mailto:jscialdo@ciesin.columbia.edu"
                    target="_blank" moz-do-not-send="true">jscialdo@ciesin.columbia.edu</a>;
                  <a href="mailto:jns74@columbia.edu" target="_blank"
                    moz-do-not-send="true">jns74@columbia.edu</a><br>
                  CIESIN web site: <a
                    href="http://www.ciesin.columbia.edu"
                    target="_blank" moz-do-not-send="true">www.ciesin.columbia.edu</a><br>
                  SEDAC web site: <a
                    href="http://sedac.ciesin.columbia.edu"
                    target="_blank" moz-do-not-send="true">sedac.ciesin.columbia.edu</a><br>
                  JBRMIN web site: <a
                    href="http://www.ciesin.columbia.edu/jamaicabay"
                    target="_blank" moz-do-not-send="true">www.ciesin.columbia.edu/jamaicabay</a><br>
                  Sandy Hook web site: <a
                    href="http://bioblitz17.ciesin.columbia.edu/"
                    target="_blank" moz-do-not-send="true">bioblitz17.ciesin.columbia.edu</a><br>
                  <p>Follow us on:<br>
                    <a href="https://twitter.com/ciesin/"
                      target="_blank" moz-do-not-send="true"><img
                        src="http://ciesin.columbia.edu/images/twitter.png"
                        moz-do-not-send="true" border="0"></a> Twitter |
                    <a
                      href="https://www.facebook.com/socioeconomicdataandappsctr"
                      target="_blank" moz-do-not-send="true"><img
                        src="http://ciesin.columbia.edu/images/fb.png"
                        moz-do-not-send="true" border="0"></a> Facebook
                    | <a
href="https://www.youtube.com/channel/UCjUjAvV7M04SxxpM5wq4fMw?view_as=public"
                      target="_blank" moz-do-not-send="true"><img
                        src="http://ciesin.columbia.edu/images/youtube.png"
                        moz-do-not-send="true" border="0"></a> Youtube </p>
                </div>
              </blockquote>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>