[esip-semantictech] [AGENDA] ESIP SemTech Telecon - 2019-05-28

Thu May 30 10:37:22 EDT 2019

Please forgive the delayed response -- I am still traveling :)  Also, thank
you all for engaging in this discussion. I am adding Krzysztof Janowicz in
cc.

> From time to time there are proposals to encode geometry in RDF, seemingly
> with the notion that the RDF stack provides the tools necessary to process
> more or less any data. I’m not so sure.

I think there might be a misunderstanding, and please let me know if I am
missing something -- but the idea is to *not* encode geometry in RDF,
neither as Triples nor in Literals. NeoGeo proposed the former, GeoSPARQL
advises the latter. There are clear and pressing issues with storing large,
complex geometry data in Literals with no apparent benefit. While the idea
may seem okay on paper, it actually fails in practice. We demonstrated a
few of those issues in our workshop paper [1] and have since gathered
plenty more feedback from the community about other latent issues, such as
(just to name a few) the underlying RDBMS not allowing text literals beyond
a few megabytes, or creating named graphs with different levels of geometry
simplification in order to make spatial operations in SPARQL queries
feasible. Whether or not one views the use of literals as appropriate from
a modeling perspective, there are very real technical limitations that
prompted us to rethink the need for geometry in RDF, especially since it
seems these problems might only become apparent (and quite serious) at
scale.
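
To make the size concern concrete, here is a minimal Python sketch (not
taken from our papers). It builds a WKT polygon with many vertices and
checks it against a literal size limit. The 4 MB limit, the function
names, and the example IRI are all assumptions chosen for illustration;
actual limits depend on the triplestore and its backing database.

```python
import math

# Hypothetical limit standing in for "a few megabytes"; real stores differ.
LITERAL_LIMIT_BYTES = 4 * 1024 * 1024


def circle_wkt(n_vertices: int, radius: float = 1.0) -> str:
    """Build a WKT POLYGON that approximates a circle with n_vertices points."""
    pts = [
        (radius * math.cos(2 * math.pi * i / n_vertices),
         radius * math.sin(2 * math.pi * i / n_vertices))
        for i in range(n_vertices)
    ]
    pts.append(pts[0])  # a WKT ring must end on its starting point
    coords = ", ".join(f"{x:.8f} {y:.8f}" for x, y in pts)
    return f"POLYGON(({coords}))"


def fits_in_literal(wkt: str, limit: int = LITERAL_LIMIT_BYTES) -> bool:
    """True if the serialized geometry is small enough to store as a literal."""
    return len(wkt.encode("utf-8")) <= limit


def geometry_reference(iri: str, wkt: str) -> dict:
    """Describe a geometry by IRI plus lightweight metadata, not by its text."""
    return {
        "iri": iri,  # hypothetical dereferenceable IRI for the geometry
        "ring_points": wkt.count(",") + 1,  # includes the closing point
        "byte_size": len(wkt.encode("utf-8")),
    }


if __name__ == "__main__":
    small = circle_wkt(100)
    large = circle_wkt(300_000)
    print(fits_in_literal(small), fits_in_literal(large))
    print(geometry_reference("https://example.org/geometry/1", large))
```

With these assumptions the 300,000-vertex ring is several megabytes of
text, so only the IRI and its metadata would be kept in the graph.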

> If geometry is available then the topology can be pre-computed for all
> geometries and then stored. But it should be noted that essentially this is
> just caching, or an optimization strategy

Thank you for bringing this up because I think it is important to emphasize
that pre-computing topology is not simply about caching. When dealing with
geographic information, topology cannot be computed from geometry alone.
From our latest paper [2]:

> [...] we believe that knowledge graphs and Linked Data more concretely will
> benefit further from topological relations. One could now argue that such
> topological relations can be computed using geometries but not the other
> way around. While this is true in an abstract mathematical sense, it does
> not hold for actual data. In fact, topological relations between places
> cannot be easily computed based on geometry alone. While there are many
> reasons for this (Franklin, 1984; Ubeda and Egenhofer, 1997), our
> argument will focus on the role of domain knowledge, vagueness, and
> uncertainty (Bennett, 2001) and not on computational issues.
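
As a rough illustration of why context matters, the following Python
sketch classifies one region against another using a tolerance. It is
not the method from [2]: the axis-aligned boxes, the relation names, and
the threshold values are simplifying assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box standing in for a real geometry (an assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def area(self) -> float:
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)


def overlap_area(a: Box, b: Box) -> float:
    """Area shared by the two boxes, or 0.0 if they do not intersect."""
    w = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    h = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return max(0.0, w) * max(0.0, h)


def approximate_relation(a: Box, b: Box, threshold: float) -> str:
    """Relate a to b, ignoring slivers smaller than threshold (share of a's area)."""
    if not 0.0 <= threshold < 0.5:
        raise ValueError("threshold must be in [0, 0.5)")
    if a.area == 0.0:
        raise ValueError("a must have a positive area")
    share = overlap_area(a, b) / a.area
    if share >= 1.0 - threshold:
        return "within"
    if share <= threshold:
        return "disjoint"
    return "overlaps"


if __name__ == "__main__":
    lake = Box(0.0, 0.0, 10.0, 10.0)
    forest = Box(0.2, -5.0, 30.0, 30.0)  # misses a thin strip of the lake
    print(approximate_relation(lake, forest, threshold=0.0))
    print(approximate_relation(lake, forest, threshold=0.05))
```

With a threshold of zero the thin strip makes the lake merely overlap the
forest, while a 5% tolerance reports it as within. Which tolerance is
right depends on domain knowledge, not on the geometry.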

 - Blake

[1] https://blake-regalia.net/resource/2017-LDOW_Geometries.pdf
[2] https://blake-regalia.net/resource/2019-TGIS_Topology.pdf

On Tue, May 28, 2019 at 10:20 PM Cox, Simon (L&W, Clayton)
<Simon.Cox at csiro.au> wrote:

> Hi Blake –
>
>
>
> > The main idea is that RDF Literals are not suitable for complex
> > geometry data.
>
>
>
> Let’s wind this back a bit.
>
>
>
> From time to time there are proposals to encode geometry in RDF, seemingly
> with the notion that the RDF stack provides the tools necessary to process
> more or less any data. I’m not so sure. RDF is about relationships and
> logic, and not numerical computation. OTOH, processing geometry is very
> much about numbers. In particular multi-component quantities (vectors). RDF
> is weak on the latter. There is a strong case for recognising the boundary
> between logic and geometry, and applying the appropriate meta-model on each
> side of the boundary – RDF for logic, and something else for geometry. I’m
> fine with literals on the geometry side of the boundary.
>
>
>
> A part of your argument that I do agree with is that computing topological
> relationships on-the-fly is a mug’s game, for all the reasons that you
> showed in your presentation. But again, the GIS world has already been here
> – I think Arc/Info was topological, and only got dumbed down when
> shapefiles appeared and then didn’t recover topology when ArcGIS came
> along.
>
>
>
> The necessary topological relationships are well known (three flavours are
> implemented in GeoSPARQL). If geometry is available (I don’t care if it’s
> in literals or RDF) then the topology can be pre-computed for all
> geometries and then stored. But it should be noted that essentially this is
> just caching, or an optimization strategy (though there may also be some
> cases, e.g. cadastre in many jurisdictions which rely on ‘metes and
> bounds’, i.e. where the topological relationships come first and geometry
> must be computed from them).
>
>
>
> I’m not at all convinced that focussing on the serialization is the actual
> issue here.
>
>
>
> Simon
>
>
>
>
>
> *From:* Blake Regalia [mailto:blake.regalia at gmail.com]
> *Sent:* Wednesday, 29 May, 2019 11:35
> *To:* Cox, Simon (L&W, Clayton) <Simon.Cox at csiro.au>
> *Cc:* Mcgibbney, Lewis J (398M) <lewis.j.mcgibbney at jpl.nasa.gov>; Mike
> Daniels <daniels at ucar.edu>; esip-semanticweb at lists.esipfed.org
> *Subject:* Re: [esip-semantictech] [AGENDA] ESIP SemTech Telecon -
> 2019-05-28
>
>
>
> Simon,
>
>
>
> > Your general approach (which I think is to persist geometry
> > representations outside the context of the feature, and link to them
> > through URIs) makes sense, and I think essentially matches practice in
> > GIS systems for decades now (where geometry was in a separate table).
>
>
>
> The main idea is that RDF Literals are not suitable for complex geometry
> data. Other than that, the data model is nearly the same as GeoSPARQL with
> some extensions to provide metadata (e.g., attributes such as vertex count,
> centroid, area, etc.) about the geometries themselves. Finally, we advise
> that on-demand topology is too expensive on high-resolution geodata (such
> spatial queries are not feasible at scale). For topology to be practical,
> it also needs context beyond the geometries alone (even assuming they are
> cleaned), because of vagueness and uncertainty. Precomputing
> metrically-refined topology with context (e.g., what threshold to use for
> an approximate topological relation between a forest and a lake) is not
> only feasible but can also produce more meaningful relations than strict
> topological relations (e.g., DE-9IM between spatial regions).
>
>
>
> > However, I’m not sure that you need a new vocabulary or namespace. The
> > definition of :hasGeometry in the GeoSPARQL standard[1] (clause 8.3.1.1)
> > is [...]
> > It is an owl:ObjectProperty, but there is no requirement for the object
> > to be a local blank-node. It can be a URI, as in your slide 18.
>
>
>
> Thank you for being so observant! You are correct, it is already possible
> to use URIs (i.e., instead of blank nodes) with GeoSPARQL as described in
> the presentation; as you pointed out, this does not require use of a new
> predicate. In fact, the approach here is fully compatible with GeoSPARQL in
> theory, both in terms of the vocabulary in the data model and the
> extensible value testing functions in SPARQL (e.g., geof:intersection,
> geof:convexHull, etc.).
>
>
>
> The reason we show a new predicate for hasGeometry is mostly to highlight
> to the viewer that we are proposing something new here, not that we intend
> to replace the GeoSPARQL ontology. In other words, these custom predicates
> and classes have only been used for demonstration purposes and
> proof-of-concepts so far. However, we could assume that such a predicate is
> an rdfs:subPropertyOf geosparql:hasGeometry, or that the ago:Geometry class
> is an rdfs:subClassOf geosparql:Geometry if it became necessary to add
> e.g., property restrictions.
>
>
>
> > A set of properties are provided, one of which is geo:hasSerialization.
> > WKT is only provided as an example, and is not mandatory - GML is
> > included in GeoSPARQL as another option, but other representations are
> > not prohibited thanks to the RDF open-world-assumption.
>
>
>
> Yes, I also spoke about GML during the presentation; and again I am not
> saying that anything about GeoSPARQL needs to be changed. I merely hope
> that the community sees the benefits in using dereferenceable IRIs instead
> of blank nodes for geometries (which we recommend for *all* non-point
> features), and that complex geometries pose many challenges when encoded as
> human-readable formats in RDF Literals.
>
>
>
> > You will likely also need to negotiate over the *schematic* form of the
> > representation (e.g. neogeo vs geosparql).
>
>
>
> Very interesting. I had not yet seen the draft for content negotiation by
> profile. The negotiation concept for geometries certainly requires more
> development -- there was also a comment on the call about negotiation and
> available representations of geometries. On a related note, I think there
> are several transactional aspects to consider when geometries are taken out
> of RDF Literals including how to deal with versioning, Linked Data
> Platform, Web Feature Service, and Web Processing Service.
>
>
>
>  - Blake
>
>
>
>
>
> On Tue, May 28, 2019 at 4:09 PM Cox, Simon (L&W, Clayton) <
> Simon.Cox at csiro.au> wrote:
>
> Thanks Blake –
>
> Sorry I missed your presentation. Somehow it had dropped out of my
> calendar.
>
>
>
> I’ve looked through your slides. Your general approach (which I think is
> to persist geometry representations outside the context of the feature, and
> link to them through URIs) makes sense, and I think essentially matches
> practice in GIS systems for decades now (where geometry was in a separate
> table).
>
>
>
> However, I’m not sure that you need a new vocabulary or namespace.   The
> definition of :hasGeometry in the GeoSPARQL standard[1] (clause 8.3.1.1) is
>
>
>
> geo:hasGeometry a rdf:Property, owl:ObjectProperty;
>     rdfs:isDefinedBy <http://www.opengis.net/spec/geosparql/1.0>;
>     rdfs:label "has Geometry"@en;
>     rdfs:comment "A spatial representation for a given feature."@en;
>     rdfs:domain geo:Feature;
>     rdfs:range geo:Geometry .
>
>
>
> It is an owl:ObjectProperty, but there is no requirement for the object to
> be a local blank-node. It can be a URI, as in your slide 18.
>
>
>
> You may worry about the rdfs:range, which is given as geo:Geometry, which
> is defined in clause 8.4. A set of properties are provided, one of which is
> geo:hasSerialization. WKT is only provided as an example, and is not
> mandatory - GML is included in GeoSPARQL as another option, but other
> representations are not prohibited thanks to the RDF
> open-world-assumption.  There is merely the entailment that the object of a
> geo:hasGeometry property is a member of the class geo:Geometry.
>
>
>
> As you note, content negotiation for representations of a geometry will be
> helpful.
>
> However, format (serialization) negotiation using HTTP Accept: is only
> part of the story.
>
> You will likely also need to negotiate over the *schematic* form of the
> representation (e.g. neogeo vs geosparql).
>
> This is the topic of an upcoming W3C note from the Data Exchange Working
> Group ‘Content Negotiation by Profile’[2].
>
>
>
> Simon
>
>
>
> [1] https://portal.opengeospatial.org/files/?artifact_id=47664
>
> [2] https://w3c.github.io/dxwg/conneg-by-ap/
>
>
>
> *From:* esip-semanticweb [mailto:
> esip-semanticweb-bounces at lists.esipfed.org] *On Behalf Of *Blake Regalia
> via esip-semanticweb
> *Sent:* Wednesday, 29 May, 2019 07:13
> *To:* Mcgibbney, Lewis J (398M) <lewis.j.mcgibbney at jpl.nasa.gov>
> *Cc:* esip-semanticweb at lists.esipfed.org; Mike Daniels <daniels at ucar.edu>
> *Subject:* Re: [esip-semantictech] [AGENDA] ESIP SemTech Telecon -
> 2019-05-28
>
>
>
> Slides from today's presentation:
>
>
>
>
> https://www.slideshare.net/BlakeRegalia/towards-a-more-efficient-paradigm-of-storing-and-querying-spatial-data-on-the-semantic-web
>
>
>
>  - Blake Regalia
>
>
>
>
>
> On Thu, May 23, 2019 at 4:12 PM Mcgibbney, Lewis J (398M) <
> lewis.j.mcgibbney at jpl.nasa.gov> wrote:
>
> Hi esip-semanticweb,
>
> This is a courtesy email regarding preparation for our next telecon.
>
> We will be hosting Blake Regalia, UCSB, who will be “…Revisiting the
> Representation of and Need for Raw Geometries on the Linked Data Web”.
>
>
>
> After Blake’s presentation, we will use the remainder of our time to
> discuss the science-on-schema.org proposal which can be found at
> https://docs.google.com/document/d/1O539ROr9W7FUEDzR2ni2H2Doxx_2zK8AF-sma8pBDe0/edit?usp=sharing
>
> Mike Daniels will be joining us for that.
>
>
>
> Our meeting minutes can be found at
> https://docs.google.com/document/d/19agZraGms4vsv7S2SP0SpPWuTtfIQOet4NkyvMUEClQ/edit#heading=h.yn5iw79j9hmd
> SemTech Monthly Telecon
>
>    - 4th Tuesday of each month at 4pm Eastern
>    - GoToMeeting: https://www.gotomeeting.com/join/976796333
>    - Phone Access: United States: +1 (872) 240-3212
>    - Access Code: 976-796-333
>
>
>
> Lewis
>
>
>
> Dr. Lewis John McGibbney Ph.D., B.Sc.(Hons)
>
> Data Scientist III
>
> Computer Science for Data Intensive Applications Group (398M)
>
> Instrument Software and Science Data Systems Section (398)
>
> Jet Propulsion Laboratory
>
> California Institute of Technology
>
> 4800 Oak Grove Drive
>
> Pasadena, California 91109-8099
>
> Mail Stop : 158-256C
>
> Tel:  (+1) (818)-393-7402
>
> Cell: (+1) (626)-487-3476
>
> Fax:  (+1) (818)-393-1190
>
> Email: lewis.j.mcgibbney at jpl.nasa.gov
>
> ORCID: orcid.org/0000-0003-2185-928X
>
>
>
>
>  Dare Mighty Things
>
>