[Esip-documentation] Fwd: Re: ACDD: creator, project, institution - 2

Signell, Richard via Esip-documentation esip-documentation at lists.esipfed.org
Mon Sep 29 11:13:45 EDT 2014


Gang,

At first I was thinking that Bob's outrage was not productive (I could
practically see the bulging veins on the side of his neck), but then again,
it *did* cause me to read the entire e-mail.  ;-)

And on reflection, I think I agree with Bob on all these points.   It
doesn't matter so much what the actual names are -- we just need to make
sure that people understand how they are to be used.  So it seems sensible
to keep the existing names (which preserves backwards compatibility), add
more documentation/examples of how they are to be used, and add new unique
names to 1.3 for content not covered in 1.0.

-Rich

P.S. I think when client developers speak, we should listen to them very
carefully -- after all, they are the folks who we are really building
standards for -- and they will hopefully leverage those standards for the
benefit of the end-user and public.
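
To make the backwards-compatibility point concrete, here is a minimal
sketch of how a client might read either naming scheme. It assumes the
global attributes are already in a plain dict; the helper name get_acdd,
the alias table, and the project value "P" are made up for illustration
and are not part of any ACDD release or library. The three renames are
the ones Bob lists in his original message.

```python
# Hypothetical alias table: ACDD 1.0 name -> name used in the 1.3 draft,
# limited to the three renames discussed in this thread.
ACDD_1_0_TO_1_3_DRAFT = {
    "creator_name": "creator",
    "project": "creator_project",
    "institution": "creator_institution",
}


def get_acdd(attrs, name_1_0):
    """Look up a global attribute by its ACDD 1.0 name.

    Prefers the 1.0 name and falls back to the renamed 1.3-draft name, so
    files written under either scheme are read the same way.
    Returns None if neither name is present.
    """
    if name_1_0 in attrs:
        return attrs[name_1_0]
    alias = ACDD_1_0_TO_1_3_DRAFT.get(name_1_0)
    if alias is not None and alias in attrs:
        return attrs[alias]
    return None


# The same (made-up) dataset described with each naming scheme.
old_style = {"creator_name": "A", "institution": "Z", "project": "P"}
new_style = {"creator": "A", "creator_institution": "Z", "creator_project": "P"}

for name in ACDD_1_0_TO_1_3_DRAFT:
    print(name, get_acdd(old_style, name), get_acdd(new_style, name))
```

Every client would need something like this if the names change, which is
really the cost Bob is pointing at.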


On Mon, Sep 29, 2014 at 10:58 AM, Bob Simons - NOAA Federal via
Esip-documentation <esip-documentation at lists.esipfed.org> wrote:

>  My replies are interspersed.
>
>
>  On 2014-09-26 3:05 PM, John Graybeal wrote:
>
> Bob,
>
>  I'm just answering these two most recent questions for the moment, to
> introduce to everyone the types of use cases many of us have had to deal
> with.
>
>  This is not meant to argue with your position -- in a little bit I'll
> offer an overall option to address that, once I have it fully laid out. So
> please hold off on restating your position in response, until I've had a
> chance to address directly. Thanks!
>
>
>
>   Isn't it clear that creator_name is for the data creator's name?
>
>
>  The question that comes up for a user is "what does 'data creator' mean
> for our situation?" See my use case below.
>
>  If you look at the ISO Translation Notes in ACDD 1.1, you can see the
> same issues popping up, where creator_name is mapped to 3 different roles:
> citation/citedResponsibleParty role=originator, point of contact, and
> metadata contact. This (one name for multiple meanings) wasn't working for
> our use cases.
>
> creator_name should be mapped to originator. There is no need to change
> this attribute's name.
> If you want a separate point of contact, then add data_contact_name and
> data_contact_email to ACDD 1.3.
> If you want a separate metadata contact that isn't the publisher, then add
> metadata_contact_name and metadata_contact_email to ACDD 1.3.
> Changing creator_name to creator does not solve the
> one-name-for-multiple-meanings problem. Adding other x_name and x_email
> does.
>
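
Bob's suggestion above (one attribute prefix per role) could be sketched
roughly like this. Note that data_contact_* and metadata_contact_* are
only his proposed additions, not existing ACDD attributes, and the
function name, role labels as plain strings, and sample values are
assumptions made for the example.

```python
# Attribute prefix -> role, following the reply above. Only "creator" is in
# ACDD 1.0; the other two prefixes are proposed additions.
PREFIX_TO_ROLE = {
    "creator": "originator",
    "data_contact": "point of contact",
    "metadata_contact": "metadata contact",
}


def responsible_parties(attrs):
    """Return one {role, name, email} record per prefix that has a name."""
    parties = []
    for prefix, role in PREFIX_TO_ROLE.items():
        name = attrs.get(f"{prefix}_name")
        if name is None:
            continue  # this role is simply not documented in the file
        parties.append(
            {"role": role, "name": name, "email": attrs.get(f"{prefix}_email")}
        )
    return parties


# Made-up attribute values for illustration only.
example = {
    "creator_name": "A",
    "data_contact_name": "B help desk",
    "data_contact_email": "help at example.org",
}
for party in responsible_parties(example):
    print(party)
```

Each name then maps to exactly one role, and a file that only has
creator_name still produces a valid (shorter) list.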
>
>  Isn't it clear that institution and project are the creator's
> institution and project?
>
>
>  No, it isn't. Many data sets are created by folks who don't have an
> institution or have multiple institutions, or are created by an institution
> (not a user) that has nothing to do with the actual final presentation or
> intellectual ownership of the data set.
>
> Yes, it is. Seriously?!  Read and look at the definition in ACDD 1.0 at
>
> http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html
> It is very clear.
> I don't care if people were confused. People will always be confused.
> Explain it to them and move on.
>
> institution is clearly defined as the creator's institution.
> If you want a separate publisher_institution, fine, add it.
> Changing institution to creator_institution does not solve the problem of
> different institutions. Adding other x_institution does.
>
> [Ah. Given all the examples, below, let me add one guideline to clarify
> the meaning of creator: I mean the creator of *this* data (to which the
> metadata is attached). If some data is significantly changed (e.g., beyond
> QA/QC, e.g., combined with other data) then it becomes a new product and so
> has a new creator.  The original creators are relegated to "history"
> (literally and figuratively) and/or external documentation. This isn't my
> idea. This is a common practice and common sense. If some PI at JPL creates
> a model combining data from several sources, he is the creator of the model
> data. Perhaps ACDD 1.3 just needs to state this.]
>
>
>  Here's a use case I had to deal with:
>
> My answers are for my proposal:
> creator_name, project and institution all apply to the data creator and
> already have that definition in ACDD 1.0.
>
> Add: (see terms added below)
>
> Note: Okay. I'll play this game. But if you're just going to pick this
> apart because you envision the situation slightly differently than I do,
> don't bother.
> My point is: All these situations can be handled by attribute names
> already in ACDD 1.0 plus several new attribute names for new items.  There
> is no need to deprecate ACDD 1.0 names and replace them just for the sake
> of replacing them.
>
>
>  1) Someone makes an observation with their custom instrument. Call her
> A. Her institution is Z.
>
> creator_name=A
> institution=Z
>
>  2) The data from the instrument is collected by an ocean observing
> system called B (without which the data could not be collected), and is
> published by that system. Let's call that the raw data. B doesn't have an
> institution (really, it is built by dozens of institutions).
>
>
> creator_name=A or B:  (Personally, I vote for A, since B is the publisher,
> but that is certainly determined by mutual agreement between A and B ahead
> of time.)
> publisher_name/url/email=from B
> (If B is an Ocean Observing System built by a big consortium, the OOS
> certainly has a name. Show me one that doesn't. More likely is that there
> are two casually cooperating groups -- fine, list them both.)
> Maybe you need to pull in contributor_x from ACDD 1.0. It depends on the
> details.
>
>  3) A standard process in the ocean observing system (written by team C
> from institution X) takes the raw data and performs automated QC checks.
> Let's call that QC data.
>
> Use processing_level from ACDD 1.0 ("A textual description of the
> processing (or quality control) level of the data.")
> None of the proposals in ACDD 1.3 deal with this.
> There is no need to deprecate attribute names from ACDD 1.0 to deal with
> this.
>
>  4) A second process in the ocean observing system (written by
> contributor D from institution W) takes the QC data, interpolates it, and
> combines it with other data to create a grid. Let's call that gridded data.
>
> "combines with other data" means it is a new product. So,
> creator_name=D
> creator_url=(if there is an external web page about how this dataset was
> created)
> institution=W
> history=(information about how this dataset was created)
>
>
>  5) An individual E (from institution V) takes a subset of the gridded
> data and publishes that as part of their paper, also submitting it back to
> the observatory as a reprocessed data set.
>
> This is pretty vague.
> If he didn't reprocess it (and he's just saying he did), the observatory
> shouldn't accept it.
> If he did reprocess it:
> creator_name=E
> institution=V
> history= details the original source data and the processing steps he took.
> publisher_name/email/url=from B
>
>  6) The ocean observatory publishes the newly submitted, reprocessed data
> set. Let's call that the curated reprocessed data.
>
> This is pretty vague. It depends a lot on what "curated" means here. You
> say the published data is as E submitted it, so it sounds lightly curated.
> So:
> The same answers as 5.  B is getting the same credit that a book company
> gets for helping an author publish his/her book.
>
>  7) A republishing system (like ERDDAP), call it F, out of institution U,
> takes the curated reprocessed data, and re-offers it in multiple formats.
> Let's call that the reformatted data.
>
> Multiple formats are just different representations of the same data. All
> the data remains the same.
> The ERDDAP administrator's name/address/email get added to the ISO
> metadata as the contact for the service (which doesn't replace or change
> the creator or the publisher or other information which is also in the ISO
> metadata).
> ERDDAP doesn't do much (especially compared to the creator). It doesn't
> take much credit.
>
>
>  If I make a table for each of those products 1 through 7, what do you
> think the creator_name is for each? And is the institution always creator's
> institution?
>
> Yes. institution is defined as the creator's institution.
> I see that I should add one guideline to the definition of creator: I mean the
> creator of *this* data (to which the metadata is attached). If some data
> is significantly changed (e.g., beyond QA/QC, e.g., combined with other
> data) then it becomes a new product and so gets a new creator.  The
> original creators are relegated to "history" (literally and figuratively)
> and/or external documentation. This isn't my idea. This is a common
> practice and common sense. If some PI at JPL creates a model combining data
> from several sources, he is the creator of the model data. Perhaps ACDD 1.3
> just needs to state this.
>
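
Bob's guideline above (a significantly changed dataset is a new product
with a new creator, and the original creators move to history) can be
sketched as follows, walking through John's use cases with Bob's answers.
This is only an illustration: the derive() helper, its parameters, and the
wording of the history entries are assumptions, not anything defined by
ACDD.

```python
def derive(parent, step, new_product=False, creator_name=None, institution=None):
    """Return global attributes for a dataset derived from `parent`.

    new_product=False: QA/QC or reformatting only, so the creator is kept.
    new_product=True:  significantly changed data, so the new creator and
                       institution replace the old ones and the previous
                       creator is recorded in `history`.
    """
    attrs = dict(parent)  # copy, so the parent's attributes are untouched
    history = [attrs["history"]] if attrs.get("history") else []
    if new_product:
        if creator_name is None:
            raise ValueError("a new product needs a creator_name")
        history.append(
            f"{step}; derived from data created by {parent.get('creator_name')}"
        )
        attrs["creator_name"] = creator_name
        if institution is not None:
            attrs["institution"] = institution
        else:
            attrs.pop("institution", None)  # don't keep the old creator's
    else:
        history.append(step)
    attrs["history"] = "\n".join(history)
    return attrs


# Cases 1-2: A (institution Z) observes; B publishes the raw data.
raw = {"creator_name": "A", "institution": "Z", "publisher_name": "B"}
# Case 3: automated QC does not change the creator.
qc = derive(raw, "automated QC checks")
# Case 4: combined with other data, so it is a new product by D (W).
gridded = derive(qc, "interpolated and combined with other data",
                 new_product=True, creator_name="D", institution="W")
# Cases 5-6: E (V) reprocesses a subset; B still publishes it.
subset = derive(gridded, "subset and reprocessed",
                new_product=True, creator_name="E", institution="V")
# Case 7: reformatting is just another representation of the same data.
reformatted = derive(subset, "re-offered in multiple formats")

for label, a in [("raw", raw), ("qc", qc), ("gridded", gridded),
                 ("subset", subset), ("reformatted", reformatted)]:
    print(label, a["creator_name"], a["institution"], a["publisher_name"])
```

In this sketch institution always travels with creator_name, which is the
ACDD 1.0 reading Bob is arguing for.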
>
>  For us, these answers were not obvious from the existing names or
> definitions. That's what motivated us to spend this long time trying to
> make improvements.
>
> All of these situations seemed clear to me (although they are hypothetical
> so it is easy to pick nits based on different understanding of the
> situation). I deal with these using ACDD 1.0 all of the time.
>
>
>
>  John
>
>
>  On Sep 26, 2014, at 12:50, Bob Simons - NOAA Federal via
> Esip-documentation <esip-documentation at lists.esipfed.org> wrote:
>
>  I should have included this formatted snippet from the original ACDD 1.0
> at
>
> http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html
> :
>
>
>   creator_name
> <http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html#creator_name_Attribute>
>     The data creator's name, URL, and email. The "institution" attribute
>     will be used if the "creator_name" attribute does not exist.
>     metadata/creator/name
>
>   creator_url
> <http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html#creator_url_Attribute>
>     metadata/creator/contact@url
>
>   creator_email
> <http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html#creator_email_Attribute>
>     metadata/creator/contact@email
>
>   institution
> <http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html#institution_Attribute>
>     metadata/creator/name
>
>   project
> <http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/formats/DataDiscoveryAttConvention.html#project_Attribute>
>     The scientific project that produced the data.
>     metadata/project
>
> Isn't it clear that creator_name is for the data creator's name?
> Isn't it clear that institution and project are the creator's institution
> and project?
>
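
The fallback stated in the ACDD 1.0 text Bob quotes (use "institution"
when "creator_name" does not exist) is easy for a client to apply. A
minimal sketch, assuming the global attributes are in a plain dict; the
function name is made up, and treating an empty string like a missing
attribute is my own assumption, not something the 1.0 text says.

```python
def thredds_creator_name(attrs):
    """Value for metadata/creator/name: creator_name, else institution."""
    name = attrs.get("creator_name")
    if name:  # assumption: an empty value counts as "does not exist"
        return name
    return attrs.get("institution")


print(thredds_creator_name({"creator_name": "A", "institution": "Z"}))
print(thredds_creator_name({"institution": "Z"}))
```

Because both attributes feed the same metadata/creator/name slot, the
fallback only makes sense if institution really is the creator's
institution.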
> Nan, if someone misread this and used creator_name, institution, or
> project for some person/group other than the creator, then that is their
> mistake.  There is no need to deprecate these attribute names and create
> new attribute names just because someone misread/misused these attributes.
> That is their mistake. Tell them so they can fix it.
>
>
> On 2014-09-26 12:14 PM, Bob Simons - NOAA Federal wrote:
>
>
> On 2014-09-26 11:47 AM, Nan Galbraith wrote:
>
> Hi all -
>
> If you're using an ACDD version number in your metadata, these
> updates shouldn't cause any problems for you. If you eventually
> decide that there's some value in the revised spec, you can adopt
> the changes, otherwise your data & code should be just fine.
>
> What about those other standards (e.g., IOOS Gliders) that have adopted
> ACDD 1.0 attribute names but in the future would like to add some of the
> new ACDD 1.3 attribute names?
>
>
> IMHO, changing the definitions of the terms in a standard is far
> worse than changing the terms themselves. Creator_name was
> originally defined as "data creator's name" - really no definition
> at all. If we change that to add any meaning (one who originally
> collected the data, or who created the data file?) we could make
> metadata in existing data sets incorrect.
>
> Only if someone really misread the 1.0 definitions.
>
> 1.0 says "data creator's name" not "data file creator's name". It is the
> person/group that created the data.
> You're taking an odd reading of the definition or some group's misuse of
> the existing name and definition and saying that that justifies deprecating
> the attribute.
>
> And the "institution" definition is grouped with creator, so it is clearly
> the data creator's institution.
>
> And "project" is defined in ACDD 1.0 as "The scientific project that
> produced the data."   (Not, e.g., the group that published the data.) It
> is clearly not the publisher's project.
>
> So there is no need to change the existing attribute names.
>
>
> Other than that, though, I see your point, and sympathize with
> your reluctance to change these terms.  I also agree that we may have
> changed some terms without strictly needing to,
>
> Exactly.
>
> but in the case of
> creator_project and creator_institution (vs project and institution) I
> think that allows for documenting other projects and institutions -
> e.g. the project/institution that processes, aggregates, and/or
> distributes a dataset might want some visibility for their efforts. In
> the original version, they got that only at the expense of the
> originator's information.
>
> Fine. Then leave project and institution as is for the creator, but add
> publisher_project and publisher_institution.
>
>
> I've seen the effect of this many times, where data collected by a PI in
> my group appears on a portal with only the name and institution of
> the last person to handle it. Or, when I send my data to OceanSITES for
> distribution,  I'd like the OceanSITES project to be part of the metadata,
> but not to remove the original information about the person and project
> that collected and originally provided the data. Having creator_project
> and _institution as named fields makes this information more likely to
> be preserved as it should be.
>
> I understand. Leaving project and institution as is for the creator, and
> adding publisher_project and publisher_institution offers a viable solution.
>
>
> Regards -
> Nan
>
>
> On 9/26/14 9:46 AM, Nancy Ritchey - NOAA Federal via Esip-documentation
> wrote:
>
> Bob,
> Well said!  I agree with your assessment.  We've spent many years working
> with our providers to use these standards appropriately allowing the use of
> common tools across multiple platforms and communities.  Changing the
> standard as proposed will have many unintentional consequences that may
> negate its future use.  A thoughtful, practical solution is needed.
> Nancy Ritchey
>
> ---------- Forwarded message ----------
> From: Bob Simons - NOAA Federal via Esip-documentation
> <esip-documentation at lists.esipfed.org>
> Date: Thu, Sep 25, 2014 at 7:03 PM
> Subject: [Esip-documentation] ACDD: creator, project, institution
> To: John Graybeal <john.graybeal at marinexplore.com>,
> ESIP Documentation <esip-documentation at lists.esipfed.org>
>
>
> I'm sure I'm coming late to this discussion:
>
> Why does ACDD 1.3 have creator, not creator_name, like 1.0?
> Why does ACDD 1.3 have creator_project, not project, like 1.0?
> Why does ACDD 1.3 have creator_institution, not institution (which is in
> CF!), like 1.0?
> If you want to add creator_institution_info, why not just add
> institution_info?
>
> It seems like these changes are just to change to names that the new ACDD
> group prefers, but at a HUGE cost.
> I have 1000's of datasets that have creator_name, project, and institution
> attributes.
> I have written software, ERDDAP, that strongly recommends creator_name and
> requires institution.
> I have told numerous people and groups to follow the ACDD standard.
> Now you are breaking your own standard.
> The new ACDD group seems to think there are no consequences to changing
> attribute names and that it can be done just to suit the group's fancy.
> It doesn't matter if you or I think the new names are better. That is not
> the issue.  If you are unhappy with the old system, change the definitions
> to clarify the attribute's usage, don't change the attribute names. Changes
> that break the old standard are wrong, wrong, wrong.
> And no, saying that all attributes are optional doesn't make it okay to
> change the attribute's names. If ACDD says that the data creator's name is
> in an attribute called creator_name, then that is where it should be (last
> year, this year, next year, and in 50 years).
>
> ---
> Standards should be backwards compatible.
> Standards should be as stable as possible.
> ACDD should be cleaning up the definitions of existing attributes and
> sparingly adding new attributes that provide a place for new pieces of
> information, NOT changing existing attribute names.
>
>
> --
> Sincerely,
>
> Bob Simons
>
>
>
>
> --
> Sincerely,
>
> Bob Simons
> IT Specialist
> Environmental Research Division
> NOAA Southwest Fisheries Science Center
> 1352 Lighthouse Ave
> Pacific Grove, CA 93950-2079
> Phone: (831)333-9878 (Changed 2014-08-20)
> Fax: (831)648-8440
> Email: bob.simons at noaa.gov
>
> The contents of this message are mine personally and
> do not necessarily reflect any position of the
> Government or the National Oceanic and Atmospheric
> Administration.
> <>< <>< <>< <>< <>< <>< <>< <>< <>< <><
>
>
>  _______________________________________________
> Esip-documentation mailing list
> Esip-documentation at lists.esipfed.org
> http://www.lists.esipfed.org/mailman/listinfo/esip-documentation
>
>
>
>
>


-- 
Dr. Richard P. Signell   (508) 457-2229
USGS, 384 Woods Hole Rd.
Woods Hole, MA 02543-1598