[Esip-documentation] updates to ACDD
Nan Galbraith
ngalbraith at whoi.edu
Mon Nov 22 15:23:10 EST 2021
Hi Chris and all -
I'm very interested in these new identity terms, which we would like to
add to the OceanSITES netCDF specification; the OS spec is built on CF
and ACDD 1.2.
> the need to include identifiers (ORCID, ResearchID, AuthorID, etc) for
> the person listed in the creator attributes and the people listed in
> the contributor attributes.
We use the ACDD terms creator_*, publisher_*, contributor_* (where
the * is name, email, url, etc.), and our current manual adds the modifier
_orcid - at least to the publisher field (not sure why the others seem to
be missing, I'll look into that).
From the manual:
name             example              note
publisher_orcid  0000-0003-0228-9795  available from https://orcid.org/
In reviewing some of my files, though, I see I've also used
creator_id = "https://orcid.org/0000-0001-8001-6886" ;
I think that's actually better, because it allows different ID
types. There is a group working on a system similar to ORCID
that will identify groups - I'd like to see a persistent identifier for
the research group I work in, because I think our data needs to be
tracked back to the group, more than to any individual. At some point
we may supply the group name as the publisher, and it would be good to
have a PID for that.
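
To illustrate why a single *_id attribute can carry different ID types, here
is a minimal sketch of how a reader might interpret either form. It assumes
the global attributes have already been read into a plain dict; the function
names are hypothetical, and of the attribute names only creator_id and
publisher_orcid appear above (the other role/suffix combinations are
assumptions, not part of any specification).

```python
import re

# An ORCID iD is four hyphen-separated groups of four characters;
# the final character is a checksum that may be a digit or "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")
ORCID_PREFIX = "https://orcid.org/"


def normalize_person_id(value):
    """Return a (scheme, identifier) pair for an identifier attribute value."""
    text = value.strip()
    # URL form, as in creator_id = "https://orcid.org/0000-0001-8001-6886"
    if text.startswith(ORCID_PREFIX):
        bare = text[len(ORCID_PREFIX):]
        if ORCID_PATTERN.match(bare):
            return ("orcid", bare)
    # Bare form, as in the publisher_orcid example from the manual
    if ORCID_PATTERN.match(text):
        return ("orcid", text)
    # Any other URL is kept whole, so other ID systems still pass through
    if text.startswith(("http://", "https://")):
        return ("url", text)
    return ("unknown", text)


def find_person_ids(attrs):
    """Collect identifier attributes for each role from a dict of attributes."""
    found = {}
    for role in ("creator", "publisher", "contributor"):
        # Hypothetical: only creator_id and publisher_orcid are attested here.
        for suffix in ("_id", "_orcid"):
            name = role + suffix
            if name in attrs:
                found[name] = normalize_person_id(attrs[name])
    return found
```

With the URL form, an identifier from another system (for example a future
group PID) needs no new attribute name; it just resolves to a different scheme.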
Also, I've been looking for the definitions of the roles we used in
ACDD. I still believe these definitions can be improved upon, (eg:
'creator: the person who created the data') but since they're just referred
to as ISO roles, it's hard to find them these years later.
I disagree with Bob Simons on the problems that arise if we change
definitions.
If you trust a data writer when he says a file uses ACDD, why on earth
wouldn't you trust him to also document the version? A simple workaround,
of course, is to ditch the terms we use for roles, and make new, improved
ones. It can still be ACDD.
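
As a rough sketch of what trusting the documented version could look like
for a reader, the code below picks an attribute name based on the declared
ACDD version, using the acknowledgment/acknowledgement spelling change
described further down this thread. It assumes the version is declared in
the Conventions global attribute as a string such as "ACDD-1.3" and that the
attributes are already in a plain dict; the function names are hypothetical.

```python
import re


def acdd_version(attrs):
    """Return the declared ACDD version as (major, minor), or None."""
    conventions = attrs.get("Conventions", "")
    # Matches e.g. "ACDD-1.3" inside "CF-1.6, ACDD-1.3"
    match = re.search(r"ACDD-(\d+)\.(\d+)", conventions)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def read_acknowledgement(attrs):
    """Return (attribute_name, value), trying the version's spelling first."""
    version = acdd_version(attrs)
    if version is not None and version >= (1, 3):
        # ACDD 1.3 uses the spelling "acknowledgement"
        preferred, fallback = "acknowledgement", "acknowledgment"
    else:
        # Earlier versions use the spelling "acknowledgment"
        preferred, fallback = "acknowledgment", "acknowledgement"
    # Fall back to the other spelling so mislabelled files are still readable
    for name in (preferred, fallback):
        if name in attrs:
            return (name, attrs[name])
    return (None, None)
```

The fallback is where the cost Bob describes shows up: the reader honours the
stated version first but still has to tolerate files that got it wrong.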
Last item - does anyone think ACDD could be moved to the Discovery
cluster?
At least in theory, it was originally all about discovery.
Regards - Nan
On 10/14/21 5:17 PM, John Graybeal via Esip-documentation wrote:
> Megan, Ted,
>
> To clarify the history, the ESIP Documentation Cluster did accept
> responsibility for managing this specification, did they not? I think
> that is not a role that ESIP should (can afford to?) let fall on the
> floor, regardless of the current status of the Cluster.
>
> I don't think you were saying that would happen, I just wanted to be
> sure that's where things stand in terms of 'ownership' of the ACDD
> standardization process.
>
> John
>
>> On Oct 14, 2021, at 12:09 PM, Megan Carter <megancarter at esipfed.org>
>> wrote:
>>
>> Hi Bob and Others,
>>
>> I would like to clarify that the ESIP Documentation Cluster has been
>> on hiatus for several months now. They have not been having regular
>> meetings and, as far as I know, are not having discussions of ACDD
>> outside of this thread. If anyone is interested in this area and
>> would like to start organizing Documentation Cluster meetings again,
>> that would certainly be possible.
>>
>> Best,
>> Megan
>>
>> Megan Carter Orlando
>> ESIP Community Director
>>
>> On Thu, Oct 14, 2021 at 3:00 PM Bob Simons - NOAA Federal via
>> Esip-documentation <esip-documentation at lists.esipfed.org> wrote:
>>
>> John, you make it sound so easy, but there have been no
>> changes since 2014.
>>
>> Regarding "They* would not necessarily have to discuss any other
>> change,":
>> There are several pending requests for changes. It would be odd
>> to just consider one person's request and not consider the
>> requests of other people who have been waiting longer.
>> Related: do we want to start releasing a series of versions of
>> ACDD, e.g., one per change? Isn't it better to do the changes in
>> batches, like almost every other standard?
>>
>> When the Documentation Cluster decides to investigate
>> changing ACDD, I really hope they widely advertise the meetings
>> where changes to ACDD will be considered so relevant parties
>> don't miss out.
>>
>>
>>
>>
>>
>> On Wed, Oct 13, 2021 at 5:53 PM John Graybeal
>> <jbgraybeal at sonic.net> wrote:
>>
>> Matthew,
>>
>> Thanks for this, it simplifies the answers a lot!
>>
>> Short answer is that (as I understand it) the ACDD
>> conventions are 'managed' by the ESIP Documentation Cluster,
>> and to start the process of adding an attribute to those
>> conventions you would make the request to that cluster. It's
>> my hope that the cluster would relatively quickly form a team
>> and process to support the request. They* would not
>> necessarily have to discuss any other change, and other than
>> formalizing what process they want to follow, it could be
>> quite quick to decide. (Posting updated documents might take
>> longer!) It could be a valuable exercise for the
>> Documentation Cluster and community, in my humble opinion.
>>
>> When you say "an attribute explicitly for people
>> identifiers", you are right that ACDD allowed this use but
>> purposefully left it ambiguous in the creator_url. At some
>> point, if you want to make ACDD less ambiguous, the amount of
>> duplicate content starts going up because of
>> backward-compatibility requirements, and maybe you want a new
>> path that's incompatible with all the existing metadata.
>> (That's what we ran into last time.)
>>
>> You can certainly add a contributor_url, though nothing
>> precludes using a URL for contributor_name. But I agree a URL
>> would be good. (Or maybe these days, an IRI. And be sure you
>> allow multiples! and roles for each! oops, going beyond my
>> mandate :->)
>>
>> Finally, totally agree the metadata profile doesn't go
>> outside of the ACDD conventions, it's fine really. I just
>> want to encourage individual users to get that uniquely
>> identifiable recognition too!
>>
>> Very clarified, and appreciate closing the loop on this
>> question. Someone from the Documentation Cluster may wish to
>> comment from this point.
>>
>> John
>>
>> * I'm a member of the Documentation Cluster but have not
>> often had time to participate, alas.
>>
>>
>>> On Oct 13, 2021, at 6:06 AM, Mathew Biddle - NOAA Affiliate
>>> <mathew.biddle at noaa.gov> wrote:
>>>
>>> Bob, John, and Chris,
>>>
>>> I just joined this list yesterday, so I'm just seeing Bob's
>>> response now. I wholeheartedly agree with all of your
>>> comments, thank you for walking through your logic on
>>> updating existing attributes/definitions. I think there
>>> might have been some mischaracterization of what it is Chris
>>> was asking for so I'd like to take a step back.
>>>
>>> The problem: the Animal Telemetry Network (ATN) is looking
>>> to collect persistent identifiers for people
>>> (preferably using ORCiD, but other options are available).
>>> ATN would like to include those identifiers in the netCDF
>>> metadata at the appropriate location. So, we did a quick
>>> search through ACDD and didn't see an attribute
>>> explicitly for people identifiers. So, my first response is
>>> that this might be something worthwhile to add to the
>>> upstream conventions, how might we do that?
>>>
>>> Where we stand: After discussing with John on the
>>> #marinedata slack channel (thanks for monitoring that
>>> channel BTW!) I was reminded of the ACDD 1.3 creator_url
>>> <https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3#creator_url>
>>> attribute, which will suit the ATN's purpose. See that
>>> discussion in this ticket
>>> <https://github.com/ioos/ioos-atn-data/issues/24#issuecomment-937836068>.
>>> However, it would also be beneficial if we could supplement
>>> the ACDD 1.3 recommended attributes with a *contributor_url*
>>> attribute which doesn't exist now. This would be an addition
>>> to the existing attributes, not a change to definitions or
>>> attribute names. So, the question becomes, how do we
>>> contribute/start a conversation on adding a *new* attribute
>>> to the ACDD conventions? Is this even possible?
>>>
>>> As for the IOOS Metadata Profile v1.2 description for what
>>> goes in the creator_url, I'll discuss it with the team. From
>>> how I read it, it's not going outside of the ACDD 1.3
>>> conventions. It's providing more explicit guidance as to how
>>> the IOOS community should use it (maybe the creator_type
>>> should always be 'institution' for the IOOS profile to make
>>> that connection more clear).
>>>
>>> I hope this clarifies things.
>>>
>>> Thanks everyone for your valuable input.
>>>
>>> Matt
>>>
>>> On Tue, Oct 12, 2021 at 11:49 PM Work Sonic via
>>> Esip-documentation <esip-documentation at lists.esipfed.org> wrote:
>>>
>>> Well said, I agree with all of your statements here.
>>> (Well, except for the part where you said "And they are
>>> forgetting about all the consumers of data files who
>>> have to deal with different versions of a standard that
>>> have different attribute names and definitions." I don't
>>> think any of us forgot about those people—many of us
>>> were those people who had to deal with existing data
>>> sets—but we had varying opinions about whether breaking
>>> changes were a good idea. In the end, the group decided
>>> that for that version, we would not make any breaking
>>> changes.)
>>>
>>> I expect Chris has a pretty clear picture at this point! :-)
>>>
>>> Thanks Bob for the input.
>>>
>>> John
>>>
>>>> On Oct 12, 2021, at 6:38 AM, Bob Simons - NOAA Federal
>>>> <bob.simons at noaa.gov> wrote:
>>>>
>>>> Regarding "IOOS had recommended creator_url be ...":
>>>> IOOS can recommend whatever they want, but it doesn't
>>>> change the ACDD 1.3 definition of "creator_url" which is
>>>> "The URL of the *person* (or other creator type
>>>> specified by the creator_type attribute) principally
>>>> responsible for creating this data." [emphasis added]
>>>> and which seems to be directly at odds with the IOOS
>>>> recommendation.
>>>>
>>>> Regarding "at least one major user would not accept any
>>>> changes to existing ACDD attributes that would
>>>> invalidate any use that followed a previous version"
>>>> and the desire of some people to make major changes to
>>>> ACDD:
>>>> The person resisting changes to existing attribute
>>>> names and definitions was me. I think the reasons for
>>>> that should be obvious:
>>>> This community of NOAA (especially NCEI with its
>>>> archive), NASA, and hundreds of other groups, has 100's
>>>> of thousands of datasets and 100's of millions of files
>>>> using the ACDD terms as defined in the various versions
>>>> 1.0 - 1.3 of the ACDD standard. If you change one of
>>>> the attribute names or definitions:
>>>>
>>>> * At best you introduce a complication (a file's
>>>> reader has to be aware of the difference between
>>>> ACDD versions and interpret the attribute
>>>> differently according to the stated ACDD version).
>>>> That means crosswalks to other metadata formats
>>>> need to be more sophisticated in order to deal with
>>>> the differences between ACDD versions. That might
>>>> not be too bad for the changes in one version of
>>>> ACDD, but what about after 5,6,7 versions of ACDD?
>>>> * You introduce uncertainty: Did the file's creator
>>>> properly understand the differences between how the
>>>> attribute was defined in different versions of ACDD? This
>>>> makes the crosswalks much harder to write.
>>>>
>>>> You can see both of those problems already (although
>>>> with minor consequences) with the pointless change in
>>>> the spelling of the "acknowledgment" (the US spelling
>>>> of the word) in pre-1.3 versions of ACDD to
>>>> "acknowledgement" (the British spelling) in ACDD 1.3.
>>>> It is now common to see files claiming to be ACDD 1.3
>>>> compliant with "acknowledgment" (incorrect) and others
>>>> using "acknowledgement" (correct).
>>>> It should be obvious that if a definition changed
>>>> (instead of the spelling of the attribute name), it
>>>> would be a far more complex situation where the reader
>>>> must then guess what the file creator intended.
>>>>
>>>> This situation highlights something else: there is a
>>>> huge difference between the perspective of a person
>>>> creating a new data file for a new dataset, and a
>>>> person reading files from various sources. A person
>>>> creating a new data file for a new dataset has no prior
>>>> constraints. All they want to do is express the
>>>> metadata content into the file using the standard. But
>>>> everyone and every project has different needs. For
>>>> them, it's easy to get frustrated with a standard
>>>> because it doesn't fit their idea of what the perfect
>>>> metadata standard would be. Given a blank slate and
>>>> working alone, everyone would create a different
>>>> standard. The hard part of making the standard (and it
>>>> was hard) is that we had to reconcile all these
>>>> different ideas about what the /perfect/ metadata
>>>> standard would be. So "perfect" gets thrown out (since
>>>> it is impossible) and "acceptable compromise" is the
>>>> best we can hope for. (Yes, as my wife says,
>>>> "compromise is when nobody is happy".) People making
>>>> ACDD 1.3 had very diverse ideas about what topics
>>>> should be addressed, what the attribute names should
>>>> be, and especially what the definitions should be.
>>>> Unfortunately, some people naturally retain this idea
>>>> that we should revamp the standard, but they are
>>>> forgetting about all the other people who would revamp
>>>> the standard in a different way. And they are
>>>> forgetting about all the consumers of data files who
>>>> have to deal with different versions of a standard that
>>>> have different attribute names and definitions.
>>>>
>>>> As a great example of not changing attribute names or
>>>> definitions but just adding new attribute names and
>>>> definitions, look at CF, which has been quite stable
>>>> through 9(?) versions. The result is that a writer of a
>>>> file with CF metadata can reliably write attributes to
>>>> the file as they have been for years (although
>>>> periodically adding new attributes to their
>>>> vocabulary), and a reader of a file with CF metadata
>>>> can pretty reliably ignore the stated CF version and
>>>> just see what attributes are present. If a given
>>>> attribute is present, its definition is known. Thank
>>>> goodness!
>>>>
>>>> Ethan Davis had the remarkable luxury of having a blank
>>>> slate and (I think) of working alone when he created
>>>> ACDD 1.0. But now, forever more, new versions of ACDD
>>>> will be painfully hashed out by numerous people working
>>>> toward an acceptable compromise. On behalf of other
>>>> consumers of data files and on behalf of software
>>>> developers who write software that processes data
>>>> files, please, please, please, let's keep existing
>>>> attribute names and definitions stable. If you want to
>>>> add new attribute names and definitions to address new
>>>> concepts, go for it.
>>>>
>>>> Here's a compromise (but where everyone is happy) for
>>>> all of you who really want to make massive changes to
>>>> ACDD (essentially starting from a blank slate): go for
>>>> it! Create your own metadata standard (just as Ethan
>>>> did with ACDD 1.0), but just give it a different name,
>>>> not "ACDD". After all, that is what your new metadata
>>>> standard is: a new metadata standard. If it is a great
>>>> standard, as ACDD 1.0 was, it will fill a niche and be
>>>> widely adopted and possibly supplant ACDD (if it
>>>> addresses the same issues). But effectively killing off
>>>> the current ACDD 1.x by labelling your new and very
>>>> different standard ACDD 2.0, is wrong unless everyone
>>>> in the ESIP Documentation Cluster agrees that it is
>>>> time to kill off ACDD 1.x and go down that different
>>>> route (and you probably won't get my vote). I like
>>>> ACDD (1.0 to 1.3) and its stability is incredibly
>>>> valuable to me. I have a lot invested in ACDD 1.3. I
>>>> know I'm not alone -- think of NASA and NCEI with
>>>> millions of archived files with ACDD 1.x metadata and
>>>> all the software that writes and reads these files.
>>>>
>>>> Best wishes.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Oct 11, 2021 at 4:33 PM John Graybeal via
>>>> Esip-documentation
>>>> <esip-documentation at lists.esipfed.org> wrote:
>>>>
>>>> Hi Chris,
>>>>
>>>> I believe there have been 3, maybe 4 threads in the
>>>> past 2-3 years about updating ACDD. I wouldn't say
>>>> any of them were as action-oriented as
>>>> yours—sometimes interest in a particular attribute,
>>>> other times general interest in whether it's being
>>>> maintained. None has gone so far as to say "I want
>>>> to open a new round of discussion for ACDD." The
>>>> list archive may have some details.
>>>>
>>>> I note that nothing *precludes* your using those
>>>> identifiers for the people or organizations in the
>>>> contributor attributes, all those identifiers can
>>>> be named via URLs, which is consistent with the
>>>> ACDD spec. What are you trying to do exactly that
>>>> isn't already possible?
>>>>
>>>> Within the past week, there was a question in the
>>>> #marinedata Slack channel of the ESIP workspace
>>>> about ORCiDs in netCDF, followed by a long
>>>> discussion about the ACDD 1.3 creator_url. In the
>>>> course of that discussion it was mentioned that
>>>> IOOS had recommended creator_url be
>>>>> The URL of the institution that collected the
>>>>> data. Note that this should always reference an
>>>>> institution URL, and not a personal URL, even
>>>>> if creator_type=person.
>>>> I wasn't there so I can't fairly assess that
>>>> guidance, but it is sufficiently, umm, unexpected
>>>> that it'd be nice to get your needs met by ACDD
>>>> directly.
>>>>
>>>> That said, two considerations about ACDD: (1) At
>>>> the last update round, at least one major user
>>>> would not accept any changes to existing ACDD
>>>> attributes that would invalidate any use that
>>>> followed a previous version. So our ability to
>>>> update certain fields the way many members wanted
>>>> to was effectively blocked. (2) With ESIP's
>>>> Documentation Cluster (Committee?) as the current
>>>> 'standards body' for ACDD, you'd be starting down a
>>>> path that has not been travelled yet, to my knowledge.
>>>>
>>>> I hope that is the right level of information to
>>>> share in response to your query! I think it would
>>>> be great for ACDD to get another round, especially
>>>> if it was clear that a break from the past was
>>>> necessary to improve metadata quality from what the
>>>> current standard can support. Obviously that would
>>>> open up quite a number of questions that just might
>>>> go beyond your own interest. ;-)
>>>>
>>>> John
>>>>
>>>>> On Oct 11, 2021, at 1:02 PM, Chris Turner via
>>>>> Esip-documentation
>>>>> <esip-documentation at lists.esipfed.org> wrote:
>>>>>
>>>>> Hello all,
>>>>>
>>>>> I'm curious about any movement or interest to
>>>>> update the ACDD. I know that v1.3 is 6 years old,
>>>>> and the ESIP wiki makes it look like there hasn't
>>>>> been interest or discussion in this since 2017. Is
>>>>> there still any intent to develop v2.0?
>>>>>
>>>>> My sudden interest in this comes from the need to
>>>>> include identifiers (ORCID, ResearchID, AuthorID,
>>>>> etc) for the person listed in the creator
>>>>> attributes and the people listed in the
>>>>> contributor attributes. I'd like to do this in a
>>>>> community-vetted way, but if there isn't an
>>>>> active community working on ACDD anymore, we can
>>>>> look at including these attributes in one of the
>>>>> netCDF community profiles - probably the IOOS
>>>>> metadata profile
>>>>> <https://ioos.github.io/ioos-metadata/ioos-metadata-profile-v1-2.html>.
>>>>>
>>>>>
>>>>> Thanks for whatever you can tell me.
>>>>>
>>>>> - Chris
>>>>>
>>>>>
>>>>> --
>>>>> Chris Turner
>>>>> Data Librarian | Axiom Data Science
>>>>> chris at axiomdatascience.com
>>>>> <mailto:chris at axiomalaska.com>
>>>>
--
*******************************************************
* Nan Galbraith Information Systems Specialist *
* Upper Ocean Processes Group Mail Stop 29 *
* Woods Hole Oceanographic Institution *
* Woods Hole, MA 02543 (508) 289-2444 *
*******************************************************