[esip-semanticweb] some ToolMatch info for inferring rules

Wed Mar 28 14:45:32 EDT 2012

A few comments:

(1) Since the classes have to be curated by hand, we would only put
verified usability in the ontology. The inferred properties could be added
by the rules. Consider the following conceptual example of an object
property and its subproperty:

visualizes
 |- visualizesInferred

Where visualizesInferred was generated by rules. This allows us to
distinguish between the two cases.
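
A minimal sketch of that distinction, in plain Python rather than an
ontology language. Only the visualizes / visualizesInferred names come
from the example above; the triples, the dataset names and the matches()
helper are made up for illustration.

```python
# Hypothetical triples: (subject, property, object).
# "visualizes" is curated by hand; "visualizesInferred" is written by rules.
SUBPROPERTY_OF = {"visualizesInferred": "visualizes"}

triples = {
    ("Panoply", "visualizes", "DatasetA"),          # verified by hand
    ("Panoply", "visualizesInferred", "DatasetB"),  # generated by a rule
}

def matches(triples, prop):
    """Return (subject, object) pairs for prop and its direct subproperties."""
    wanted = {prop} | {sub for sub, sup in SUBPROPERTY_OF.items() if sup == prop}
    return {(s, o) for (s, p, o) in triples if p in wanted}

# Asking for the parent property returns both kinds of match;
# asking for the subproperty returns only the rule-generated ones.
all_usable = matches(triples, "visualizes")
inferred_only = matches(triples, "visualizesInferred")
```

Querying the parent property answers "what works with this tool?", while
querying the subproperty isolates what the rules produced.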

(2) But one very useful use case is to query ToolMatch with "find all
inferred usability properties". A human can then review the list of
inferred usabilities, verify them manually, and update the rules and/or
the ontology. This more or less implies a machine-queryable mechanism for
separating the asserted triples from the inferred ones. The sub property
approach loses the inferredness characteristic in the property name.
Flagging inferredness with another property doesn't work as cleanly.

(3) One interesting implementation-specific detail: the Jena Framework
lets us filter the asserted triples out of the inferred graph using the
iterator.filterDrop() capability. Dropping every triple that appears in
the original asserted graph leaves only the inferred triples, from which
we can get the list of inferred usability properties. That solves the
limitation in the approach described in (1) and (2).
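
Conceptually this is a set difference. The sketch below is plain Python,
not the Jena API; the toy rule, the property names (hasFormat, hasLevel)
and the dataset names are all assumptions made for illustration.

```python
# Asserted graph: hand-curated triples (hypothetical names).
asserted = {
    ("DatasetA", "hasFormat", "HDF-EOS5"),
    ("DatasetA", "hasLevel", "L3"),
    ("DatasetB", "hasFormat", "HDF-EOS5"),
    ("DatasetB", "hasLevel", "L3"),
    ("Panoply", "visualizes", "DatasetA"),  # verified by hand
}

def apply_rule(graph):
    """Toy rule: Panoply visualizes any HDF-EOS5 L3 dataset."""
    fmt = {s for (s, p, o) in graph if p == "hasFormat" and o == "HDF-EOS5"}
    lvl = {s for (s, p, o) in graph if p == "hasLevel" and o == "L3"}
    return graph | {("Panoply", "visualizes", d) for d in fmt & lvl}

inferred_graph = apply_rule(asserted)      # asserted plus derived triples
only_inferred = inferred_graph - asserted  # drop everything that was asserted

# The list a human would review before updating the rules or the ontology.
review_list = sorted(t for t in only_inferred if t[1] == "visualizes")
```

Note that a triple which is both asserted and derivable (DatasetA here)
is dropped, so the review list contains only unverified matches.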

(4) With provenance in the mix, we can use it to validate inferred
usability. Seeing successful usages in the lineage, such as Panoply
showing maps of HDF-EOS5 L3 gridded data, would yield useful information
that should be fed back into the ToolMatch rules and/or ontology.
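
One way that feedback loop might look, again as a plain-Python sketch.
The provenance record layout, the drawMap action name and the dataset
names are hypothetical; a real provenance store would look different.

```python
# Hypothetical provenance records: (tool, dataset, action, succeeded).
provenance = [
    ("Panoply", "DatasetB", "drawMap", True),
    ("Panoply", "DatasetC", "drawMap", False),
]

# Inferred usability pairs awaiting review (hypothetical).
inferred = {
    ("Panoply", "DatasetB"),
    ("Panoply", "DatasetC"),
    ("Panoply", "DatasetD"),
}

succeeded = {(t, d) for (t, d, _, ok) in provenance if ok}
failed = {(t, d) for (t, d, _, ok) in provenance if not ok}

confirmed = inferred & succeeded                # candidates to assert in the ontology
contradicted = (inferred & failed) - succeeded  # hints that a rule needs an exception
untested = inferred - succeeded - failed        # no evidence either way
```

Confirmed pairs are candidates for hand verification and assertion;
contradicted pairs point at rules that may need an exception.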

--Hook

On 3/28/12 7:26 AM, "Lynnes, Christopher S. (GSFC-6102)"
<christopher.s.lynnes at nasa.gov> wrote:

>
>On Mar 28, 2012, at 10:17 AM, Eric Rozell wrote:
>
>> I'm wondering if some form of explanation is also going to be a use
>>case for ToolMatch.  It's one thing to say that Panoply is compatible
>>with your dataset, it may be even more useful to say that the reason
>>Panoply is compatible is because there is a netCDF version of your
>>dataset (letting the ToolMatch user know that they should use the netCDF
>>dataset with Panoply).
>> 
>> This is more of a stretch use case, but for the HDF-EOS2 conversion...
>>the explanation would say "Panoply can draw maps of this dataset iff
>> it is converted to netCDF/CF"
>
>This might be of use to application developers in some cases.  (Data
>users seem to care only about which datasets will work with their tool of
>choice, or what tools they might use, not so much why.)
>
>> 
>> As to your question about inferred vs. asserted usability, that would
>>be something the explanation component would take care of.  We could
>>create a sub-property of each of the "compatibility" properties (i.e.,
>>compatible with, visualizes, draws map of, etc.) that is used only in
>>the case of direct assertions.  The explanation component could detect
>>that the match was inferred due to the sub property relationship.
>> 
>> Yet another stretch use case would be to use provenance to keep track
>>of who asserted compatibility (so you can hunt them down when you can't
>>get it to work!).
>
>I think this is a great idea.  Well, not the hunting down part :-), but
>the keeping track part. This is useful in knowing how much to trust the
>assertion.  Also, if we had a provenance-related property, adding who
>asserted the relationship would tell us whether it is asserted vs.
>inferred, yes?
>
>> 
>> 
>> --Eric
>> 
>> 
>> On Mar 28, 2012, at 9:55 AM, Lynnes, Christopher S. (GSFC-6102) wrote:
>> 
>>> Here are some examples from my own investigations.  So far, it seems
>>>like there are always exceptions to the rule, usually where someone has
>>>used an unusual grid (MISR's SOM grid), or sometimes due to bugs like
>>>the OMAEROe/OPeNDAP exception.  That exception may be fixed in the
>>>future, if/when the bug fix comes out and is then deployed to the
>>>OPeNDAP server.
>>> 
>>> Panoply can draw maps of:
>>> *  netCDF gridded data that follows CF-conventions for coordinates
>>> ** note that some HDF-EOS2 Level 3 data can be converted to netCDF/CF
>>>(this is dataset by dataset)
>>> 
>>> * netCDF swath data represented as a grid with CF-compliant coordinates
>>> ** e.g., AIRS Level 2 Standard Retrievals, converted to netCDF
>>> 
>>> * HDF-EOS5 Level 3 gridded data
>>> 
>>> * Most HDF-EOS2 Level 3 (gridded) data that is offered through OPeNDAP
>>> ** Exceptions: ?
>>> 
>>> * Most HDF-EOS5 Level 3 (gridded) data that is offered through OPeNDAP
>>> ** Exceptions:  OMAEROe (Aura OMI Aerosol Global Gridded (0.25 deg
>>>Lat/Lon grids)). Oddly, Panoply CAN draw a map of the same data product
>>>in its HDF-EOS5 form.
>>> 
>>> * Some HDF-EOS2 Level 2 (swath) data
>>> ** Yes: MODIS Level 2 Aerosols (MOD04_L2)
>>> ** No: MISR Level 2 Aerosols (MIL2ASAE)
>>> 
>>> * Some HDF-EOS2 Level 2 (swath) data offered through OPeNDAP
>>> ** Yes:  AIRS Level 2 Standard Retrievals (AIRX2RET)
>>> ** No: MISR Level 2 Aerosols (MIL2ASAE) - due to SOM grid, I think
>>> 
>>> The use case is, for a given choice of dataset, should Panoply show up
>>>on the list of tools that are known to work with it?
>>> 
>>> Bottom line is that I think we can use inference from various
>>>properties of the dataset (format, data structure, convention
>>>compliance, access protocol) to say whether something is *likely* to be
>>>usable by a given tool, but due to tool/library bugs, missteps in data
>>>product and just unusual edge cases, the only way to know for sure is
>>>to try the tool with that dataset and assert the relationship.
>>> 
>>> Is there a way we could distinguish between inferred usability and
>>>asserted usability?
>>> --
>>> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
>>> 
>>> 
>>> _______________________________________________
>>> esip-semanticweb mailing list
>>> esip-semanticweb at lists.esipfed.org
>>> http://www.lists.esipfed.org/mailman/listinfo/esip-semanticweb
>>> 
>> 
>
>--
>Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185