<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div><span class="Apple-tab-span" style="white-space:pre"> </span>Hi Lesley-<br class=""><div><blockquote type="cite" class=""><div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 14px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><br class=""><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class="">I am doing some work on MT data in Australia, and what I think we are going to move towards is a DOI for each individual station in an MT survey/network, but that is a very confronting suggestion to some geophysicists so it is softly, softly at the moment. The idea is that when the stations are aggregated into a dataset, this also gets a DOI. </span></div></div></blockquote><div><br class=""></div><span class="Apple-tab-span" style="white-space:pre"> </span>This certainly sounds like a good way to go, given that you have a plan for aggregating a collection of stations. We have found ourselves very reliant on organized governance to get DOIs for networks in place (it's their network, not ours or the Federation's) so just getting that level of registration has been a task. 
Referencing stations happens indirectly through our own database lookups (and those of other federated repositories), so the information is there, even though we don't place a PID on the instrument itself.</div><div><span style="font-family: Calibri, sans-serif; font-size: 11pt;" class=""> </span><br class=""><blockquote type="cite" class=""><div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 14px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class="">The UNAVCO people presented this work on DOIs in either the 2018 or 2019 AGU and I tried to get them to publish something on their composite and aggregate DOIs, but I can’t see that they have done this yet (it’s like trying to get IRIS to publish a referenceable paper on their ‘Dirt-to-Desktop’ concept – hint, hint).</span></div></div></blockquote><div><br class=""></div><span class="Apple-tab-span" style="white-space:pre"> </span>I think you may find that these efforts are still a work in progress: they have seen some success, but are perhaps not yet ready to publish as a finished solution. I couldn't find a direct reference, either. Indeed, the new IRIS MT facility is in the process of designing an effective dirt-to-desktop workflow to serve these datasets. 
</div><div><br class=""></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>The closest to dirt-to-desktop that we have for seismic is our PH5 dataset pipeline for temporary experiments.</div><div><br class=""></div><div><a href="https://www.passcal.nmt.edu/content/ph5-what-it" class="">https://www.passcal.nmt.edu/content/ph5-what-it</a></div><div><a href="http://service.iris.edu/ph5ws/" class="">http://service.iris.edu/ph5ws/</a></div><div><br class=""></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>This involves a partnership between the instrumentation center, the data repository, and the PI to provide tools that allow the PI to largely author and maintain their datasets independently and convey them to the data repository after a series of validation checks. The main idea is to have less of a middle-process that stands between the PI and getting data pushed out for dissemination. It should be noted that we are taking what we have learned from PH5 and exploring an evolution in formatting, which means that someday we will see these datasets transition away from PH5.</div><div><br class=""><blockquote type="cite" class=""><div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 14px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class=""><o:p class=""></o:p></span></div><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class=""><o:p class=""> </o:p></span></div><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class="">This way with something like 
CRediT we can finally start to acknowledge the people that go out in the field and dig the holes and actually collect the data, and more importantly, recognise those who funded the data collection initiative. Once you go into the more highly evolved data products, these people are rarely if ever citable in a machine-readable way (if you are lucky it is in free text in the acknowledgements).</span></div></div></blockquote><div><br class=""></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>One thing we are doing, and will continue to develop, is making it easier for scientists to get a citation for datasets based on the networks they are accessing. Right now, it's a simple web tool and later we will make it more service-oriented. </div><div><br class=""></div><div><span class="Apple-tab-span" style="white-space:pre"> </span><a href="https://fdsn.org/networks/citation/" class="">https://fdsn.org/networks/citation/</a></div><div><br class=""></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>In addition, we are now inserting DOIs for networks into our XML metadata, so the attribution is carried there when the user requests it. Our issue, now, is to engage all of the FDSN networks to ensure that they have an associated DOI. 
There are many that do not or have not registered one.</div><div><br class=""></div><blockquote type="cite" class=""><div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 14px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class=""><o:p class=""></o:p></span></div><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class=""><o:p class=""> </o:p></span></div><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class="">Critical to this is Data Versioning and the NASA processing levels – have you seen the outputs of the RDA Data Versioning Working Group? This WG produced a<span class="Apple-converted-space"> </span><a href="https://rd-alliance.org/group/data-versioning-wg/outcomes/principles-and-best-practices-data-versioning-all-data-sets-big" style="color: blue; text-decoration: underline;" class="">white paper</a><span class="Apple-converted-space"> </span>based on<span class="Apple-converted-space"> </span><a href="https://rd-alliance.org/group/data-versioning-wg/outcomes/compilation-data-versioning-use-cases-rda-data-versioning-working" style="color: blue; text-decoration: underline;" class="">39 use cases</a>.</span></div></div></blockquote><div><br class=""></div><span class="Apple-tab-span" style="white-space:pre"> </span>I have read some of the material coming out of the RDA and looked at one of the papers you referenced here. There are a lot of good conceptualizations in this material. 
I still think that each data repository will see the conditions and criteria differently in terms of versioning, what constitutes a dataset, and how metadata changes affect the identity of the dataset as a whole. In addition, the infrastructural and cooperative demands of detailed PIDs for data and organizations will be formidable for any data center to take on. I do think we can strive for these goals in increments, though.<br class=""><blockquote type="cite" class=""><div class="WordSection1" style="page: WordSection1; caret-color: rgb(0, 0, 0); font-family: Menlo-Regular; font-size: 14px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;"><div style="margin: 0cm; font-size: 11pt; font-family: Calibri, sans-serif;" class=""><span lang="EN-GB" class=""><o:p class=""></o:p></span></div></div></blockquote></div><div><br class=""></div><div><br class=""></div><div><span class="Apple-tab-span" style="white-space:pre"> </span>-Rob</div></body></html>