[Esip-cor] Notifications

John Graybeal jgraybeal at stanford.edu
Wed Feb 6 17:14:45 EST 2019


This is very helpful; I can speak to my two main concerns now.

  3.  So if you wanted a separate message queue, say, one for each ontology in COR, you would have hundreds of inbox ontologies?

The LDN service is not meant to be an inbox for every entry in COR. It is just for the LDNs. It’s disjoint from all other existing entries in COR.

The LDN service is an entirely generic service, as I understand it. Anyone who wants to create an LDN inbox can do so; anyone who wants to publish to that inbox or listen on it can do so too. If tomorrow I want a separate message queue (= inbox) for every ontology in COR, which makes perfect sense as a use case, I could take your software and make that happen. (Or maybe I'd have to create a new deployment of it for each inbox?) The purpose of your current inbox is for testing, but potentially it is the first of many (many many). It is easy to imagine a scaling issue. Hence my second point.
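
To make the "one inbox per ontology" scenario concrete, here is a minimal in-memory sketch. It is not the LDN service's code: the InboxRegistry class, its method names, the example.org IRIs, and the figure of 10 notifications per inbox are all hypothetical, chosen only to show how the number of inboxes and stored notifications grows with the number of ontologies.

```python
from collections import defaultdict
from uuid import uuid4


class InboxRegistry:
    """Hypothetical stand-in for a service hosting many LDN-style inboxes."""

    def __init__(self):
        # inbox IRI -> {notification IRI: payload}
        self._inboxes = defaultdict(dict)

    def publish(self, inbox_iri, payload):
        """Store a notification in the inbox and return its new IRI."""
        notification_iri = f"{inbox_iri.rstrip('/')}/{uuid4()}"
        self._inboxes[inbox_iri][notification_iri] = payload
        return notification_iri

    def listing(self, inbox_iri):
        """Return the notification IRIs currently held by one inbox."""
        return list(self._inboxes.get(inbox_iri, {}))

    def totals(self):
        """Return (number of inboxes, number of notifications overall)."""
        return len(self._inboxes), sum(len(n) for n in self._inboxes.values())


registry = InboxRegistry()
# Assumed numbers: 286 ontologies (the COR count quoted in this thread),
# each receiving an arbitrary 10 notifications.
for i in range(286):
    inbox = f"http://example.org/inbox/ontology-{i}/"
    for n in range(10):
        registry.publish(inbox, {"summary": f"update {n}"})

print(registry.totals())
```

Even with these small assumed numbers, every notification becomes its own entry, so the count of stored items is the product of inboxes and notifications per inbox.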

what happens if it is also populated with <insert large number of entities here>?
I don’t see this as an issue. People can still search for the resource they want to find via the GUI query bar.

Two constant refrains across MMI ORR, COR, and BioPortal, among other community portals I've run:
  1)  "I can't find my resource because there are so many of them."
  2)  "I thought about using your repository, but there's so much junk in there it doesn't look like a quality resource."

Some of these systems' improvements have directly targeted presenting useful resources to browsers and searchers, with limited success. But in the extreme case, if you don't know what string to search for ahead of time, and/or you get 10^x* responses for the string you do enter, it can be literally impossible to find what you need. And certainly the UI browser would be all but useless if the top 100 hits were notifications. These impacts could have direct consequences on adoption of COR by the community.

To think about options, I think it's helpful to stipulate that an LDN RDF graph is a different kind of artifact in the COR, especially if it can be identified and managed as such by the system. For example, it isn't clear that showing the notification graphs on the front page UI is useful, since there is no information in the title that's recognizable to a browser. Similarly, it should be possible to restrict searching to LDN triples or non-LDN triples. With some software changes to COR, LDN could be a very interesting application in the COR context.
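
One way to picture that distinction: if every entry carried a marker saying whether it is a curated ontology or a notification graph, the front page and search could filter on it. The sketch below is only an illustration under that assumption; the Entry record, its kind field, and the search function are hypothetical and are not part of COR or its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    iri: str
    title: str
    kind: str  # hypothetical marker: "ontology" or "notification"


def search(entries, text, kinds=("ontology",)):
    """Return entries whose title contains text, limited to the given kinds.

    By default notification graphs are excluded, so they cannot crowd
    curated ontologies out of the results.
    """
    needle = text.lower()
    return [e for e in entries if e.kind in kinds and needle in e.title.lower()]


entries = [
    Entry("http://example.org/ont/sensors", "Sensor vocabulary", "ontology"),
    Entry("http://example.org/ldn/0001", "sensor update notice", "notification"),
    Entry("http://example.org/ldn/0002", "sensor error notice", "notification"),
]

ontologies_only = search(entries, "sensor")
notifications_only = search(entries, "sensor", kinds=("notification",))
```

Defaulting to ontologies only is a design choice made for this sketch: someone browsing for vocabularies sees the same results as today, and notifications appear only when asked for explicitly.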

But fundamentally, I think it's important we recognize that a notification system is just like a logging system: it needs to be built to handle a very large number of interactions. (Think of someone with a small-but-popular reprocessing tool that is logging status and error notifications every time the tool is run. The log can become big quickly, especially if it is aggregating notifications from the whole world's use of that tool.) I know we aren't there now, but we should understand that implication as we discuss this rather cool technology.

John

* BioPortal has 750 ontologies and 9 million terms; MMI ORR has 200+ ontologies and I'll guess 3,000+ terms; COR has 286 ontologies and I'll guess 10,000+ terms. The COR number of terms will hopefully grow by 2 orders of magnitude before too long, not counting terms embedded in notifications.



On Feb 6, 2019, at 11:23 AM, Mcgibbney, Lewis J (398M) <Lewis.J.Mcgibbney at jpl.nasa.gov> wrote:

Hi John,
No problems. Replies inline

From: John Graybeal <jgraybeal at stanford.edu>
Date: Tuesday, February 5, 2019 at 11:20 PM
To: "Mcgibbney, Lewis J (398M)" <Lewis.J.McGibbney at jpl.nasa.gov>
Cc: Carlos Rueda <carueda at gmail.com>, "esip-cor at lists.esipfed.org" <esip-cor at lists.esipfed.org>, ESIP COR <mmi.mmiorr at gmail.com>
Subject: Re: [Esip-cor] Notifications

Lewis,

I'm sorry, I am new to this terminology and technology so I need to ask more questions.


  1.  What does an "LDN IRI" represent? An ontology used to capture a set of notifications, or a single notification?  It seems the latter. (But then, see #3.)

Yes it’s the latter.


  2.  Do I understand correctly that your inbox ontology represents a single message queue?

It’s merely a structured container which describes a collection of LDNs. The LDNs then exist as other individual named graphs.
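
For readers new to LDN: in the W3C Linked Data Notifications specification, an inbox is a container whose listing names its notifications through the LDP "contains" property. A minimal sketch of reading such a listing follows. The JSON document and example.org IRIs are made up for illustration, and this says nothing about how the COR deployment actually stores or serves its container.

```python
import json

# Hypothetical inbox listing in the JSON-LD shape used by the LDN spec.
listing_json = """
{
  "@context": "http://www.w3.org/ns/ldp",
  "@id": "http://example.org/inbox/",
  "contains": [
    "http://example.org/inbox/5c6ca040",
    "http://example.org/inbox/92d72f00"
  ]
}
"""


def notification_iris(listing_text):
    """Return the IRIs of the notifications an inbox listing contains."""
    listing = json.loads(listing_text)
    contained = listing.get("contains", [])
    # A listing with a single notification may carry a bare string.
    if isinstance(contained, str):
        contained = [contained]
    return contained


iris = notification_iris(listing_json)
```

Each IRI in the listing identifies one notification, which a consumer would then fetch separately.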

  3.  So if you wanted a separate message queue, say, one for each ontology in COR, you would have hundreds of inbox ontologies?

The LDN service is not meant to be an inbox for every entry in COR. It is just for the LDNs. It’s disjoint from all other existing entries in COR.


  4.  Assuming Carlos correctly derived that each ontology with a UUID in it is a single notification, I think I understand how you are trying to label the ontologies. It still seems that there is a bit of overlap, in that what I would call the 'term' pattern in COR is being used for _both_ the ontology ID and the notification ID. (Except it seems we don't say anything about the notification *inside* the ontology, like <notificationX>  <createdBy> <personY> so this is a bit confusing too.) Perhaps closer inspection would reveal they are not the same ID, or my model of the elements is not right, or the triples display I am reading for the ontology is not complete, or I understand it all correctly and it actually makes perfect sense that way (just not to me).

Carlos identified the bug which was causing me confusion. I’ll go and fix this and then hopefully things will become clearer.

It would help if the ontologies that are being created had metadata in them describing what they are, as (I think) I previously noted.

I agree on this John… I can create some of this by default and I will. Do you have suggestions for what you would like to see?
Again, however, I would argue that the LDNs are not ontologies. They are merely RDF graphs.

I think it is worth the mental exercise of playing out scenarios that use this capability in our heads before we go too far down this path. Keeping in mind COR's UI is designed to deal primarily with human-curated ontologies of concepts,

Understood and yes I am very much game for engaging in that effort.

what happens if it is also populated with:
 - content from a notification endpoint that receives a large number of notifications every day/hour/minute?

I don’t see this as an issue. People can still search for the resource they want to find via the GUI query bar.

 - content from a large number of notification endpoints?

Again, I don’t see the issue here. People can still search for the resource they want to find via the GUI query bar.

 - content via lots of different applications supporting notifications with various degrees of clarity?

Same applies. Using COR as the backend for a LDN service does not take anything away from the requirements for COR in the first place.
https://esipfed.github.io/stc/UseCases/STCUseCasesAndRequirements.html

Can the UI, as it is currently written,  support those scenarios in a way that is compatible with existing uses of the COR?

Yes, I think it definitely can.

It feels to me like these are fundamental questions that we should discuss before doing much advertising of LDN within COR.  I'm sorry to want to slow things down, but I've spent several hours trying to understand the technology and this application of it, and I clearly do not understand it at all well yet.

No problem at all. I think this would be useful for us to spend some time discussing at a future committee telecon as well. I did however want to launch the LDN service before the telecon so we had material to talk about and an opportunity to address some bugs.

Given constraints on my time this week, some more time to share and appreciate the details will be very welcome.

Yes John, thanks for writing; I really appreciate it.
Lewis

========================
John Graybeal
Technical Program Manager
Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
Stanford Center for Biomedical Informatics Research
650-736-1632

