[Esip-discovery] Relating services to datasets served, and datasets to available services

Thu Sep 1 12:51:50 EDT 2011

Hi Guys,

On Sep 1, 2011, at 6:54 AM, Lynnes, Christopher S. (GSFC-6102) wrote:

> Brian,
>  I'm not completely opposed to the two-attribute solution, but unfortunately, adding ESIP-specific attributes to the link element would make me want to go further.  Specifically, I would want to go back to using standard values for the rel attribute, like "enclosure" instead of "http://..../data#" (Note 1). To disambiguate different kinds of enclosures, I would advocate an additional attribute, e.g., esip:enclosure_type="http://..../data#". Likewise, metadata would be rel=describedby, with a disambiguator of esip:describedby_type="http://..../metadata#", and browse would be rel=icon, disambiguated by, well, you get the picture.
>  This would remove the single biggest discrepancy between the ESIP use of OpenSearch and the OGC use of OpenSearch. We might be technically compliant then.
>  If we went that way, then I would back the two-attribute solution.

+1, I would favor standard rel="" values such as enclosure, with the additional metadata that Chris suggests.
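
A rough sketch of how a client might consume links marked up that way, in Python. The entry is made up for illustration: the example.org URLs and the two type URIs are placeholders rather than agreed ESIP values, and esip:enclosure_type is only the attribute name Chris floated above, not something that has been standardized.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ESIP_NS = "http://esipfed.org/ns/discovery/1.1/"

# Hypothetical entry: standard rel values plus the proposed
# disambiguating attribute. All example.org URIs are placeholders.
ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:esip="http://esipfed.org/ns/discovery/1.1/">
  <link rel="enclosure"
        esip:enclosure_type="http://example.org/ns/data#"
        href="http://example.org/granule.hdf"/>
  <link rel="enclosure"
        esip:enclosure_type="http://example.org/ns/opendap#"
        href="http://example.org/opendap/granule.hdf"/>
  <link rel="describedby" href="http://example.org/granule.xml"/>
</entry>
"""


def enclosures(entry_xml):
    """Map each enclosure's disambiguating type URI to its href."""
    entry = ET.fromstring(entry_xml)
    found = {}
    for link in entry.findall(f"{{{ATOM_NS}}}link"):
        # Generic Atom clients only need this standard rel value.
        if link.get("rel") == "enclosure":
            # ESIP-aware clients can additionally read the extra attribute.
            kind = link.get(f"{{{ESIP_NS}}}enclosure_type")
            found[kind] = link.get("href")
    return found


result = enclosures(ENTRY)
```

A generic feed reader would still see two ordinary enclosures here; only ESIP-aware code needs to look at the extra attribute.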

Cheers,
Chris

> 
> Note 1:  "The value "enclosure" signifies that the IRI in the value of the href attribute identifies a related resource that is potentially large in size and might require special handling."
> 
> On Aug 31, 2011, at 9:27 PM, Wilson, Brian D (335G) wrote:
> 
>> 
>> Chris,
>> 
>> The reason (I think) that it relates to DCP-2 is that we need to consistently use the
>> <link> tag in all of the casting/discovery standards.  I would like to have all the power
>> of the two attributes (rel to express purpose and protocol to express plumbing semantics)
>> in designing solutions like my proposed idea to "relate" service and dataset casts.
>> 
>> If any solution besides choice 5 wins for DCP-2, then I'll still want to use the two-attribute
>> technique, but other folks will be doing it differently.
>> 
>> Also, the "collection" rel in DCP-1 is already defined to point to a collection cast in the
>> documentation.  This is exactly what I need.
>> 
>> But let me return to the "point to a data file" case that DCP-2 is directly about.
>> 
>> I would argue that putting the DAP URI in the rel attribute doesn't properly solve the problem.
>> 
>> Eventually, people will have to write scripts to automatically process, display, and use the casts.
>> In DCP-1, multiple "generic" purposes were defined in the ESIP-discovery namespace for what
>> might appear in the 'rel' attribute of a <link> tag.  Like, this points to a 'browse' image, or a
>> 'data' file, or a 'collection'.  I think this is a great and proper use of the rel attribute.
>> 
>> So, any script can look for a <link> tag with rel='data' and know that it has found the link
>> that points to a data item.  Of course, the data item might be a file pointed to by ftp or http,
>> or a DAP URL, or an OGC/WCS GetCoverage call.
>> 
>> If we let all of the possibilities for how a data item might be accessed slip into the 'rel' attribute
>> as separate values, then to find the data item the script will have to look for a potentially
>> open-ended list of 'rel' values and see if one of them is DAP, OGC/WMS, OGC/WCS, webification,
>> etc.  It will never be certain or easy to find the linked-to data item.
>> 
>> I hope we are all in agreement that if there is a data item to be pointed to then it should be
>> the link with rel='data'.
>> 
>> Given that basis, then the protocol to be used to get the data item, if not already
>> completely specified by the scheme in the HREF, can be specified with a versioned URI in
>> the second ('esip:protocol') attribute.
>> 
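To make the two-attribute idea concrete, here is a minimal Python sketch of such a script. It is only an illustration under stated assumptions: the entry is invented, DATA_REL and the DAP protocol URI are example.org placeholders for whatever URIs are eventually agreed, and falling back to the href scheme when esip:protocol is absent is my reading of "if not already completely specified by the scheme in the HREF".

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

ATOM = "{http://www.w3.org/2005/Atom}"
ESIP = "{http://esipfed.org/ns/discovery/1.1/}"
# Placeholder for the agreed 'data' rel URI (hypothetical value).
DATA_REL = "http://example.org/ns/data#"

# Hypothetical entry: two data links served differently, one browse link.
ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:esip="http://esipfed.org/ns/discovery/1.1/">
  <link rel="http://example.org/ns/data#"
        href="ftp://example.org/granule.hdf"/>
  <link rel="http://example.org/ns/data#"
        esip:protocol="http://example.org/protocols/dap/2.0/"
        href="http://example.org/dap/granule.hdf"/>
  <link rel="http://example.org/ns/browse#"
        href="http://example.org/granule.png"/>
</entry>
"""


def data_links(entry_xml):
    """Return (href, protocol) for every link whose purpose is 'data'.

    protocol is the esip:protocol URI when present, otherwise the
    URL scheme of the href.
    """
    out = []
    for link in ET.fromstring(entry_xml).iter(ATOM + "link"):
        # One fixed comparison, however the data happens to be served.
        if link.get("rel") != DATA_REL:
            continue
        href = link.get("href")
        protocol = link.get(ESIP + "protocol") or urlparse(href).scheme
        out.append((href, protocol))
    return out


links = data_links(ENTRY)
```

Supporting a new access method then means recognizing one more protocol value; the test on rel never changes.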
>> The other alternative of jamming the DAP specifier at the end of the 'data' URI both violates
>> the proper use of fragments in URI's, and precludes the *hooks* that both the rel and
>> esip:protocol URI's could be de-referenceable in the future to provide additional information
>> and support extensions or additional purposes.
>> 
>> These decisions are important since currently in DCP's 1 and 2 we are standardizing some
>> ground rules and reusable techniques that will be used in the design of all the casting standards.
>> We need a consistent and powerful set of design conventions.
>> 
>> -- Brian
>> 
>> 
>> 
>> 
>> ________________________________________
>> From: Lynnes, Christopher S. (GSFC-6102) [christopher.s.lynnes at nasa.gov]
>> Sent: Wednesday, August 31, 2011 6:14 AM
>> To: Wilson, Brian D (335G)
>> Cc: esip-discovery at lists.esipfed.org; Manipon, Gerald John M (335H-Affiliate)
>> Subject: Re: [Esip-discovery] Relating services to datasets served, and datasets to available services
>> 
>> Brian,
>> Yes, that all makes a lot of sense.  But how do you see it impacting the DCP-2 proposal, which was targeted at alternate access methods (specifically) for individual files, not collections?
>> 
>> On Aug 30, 2011, at 8:07 PM, Wilson, Brian D (335G) wrote:
>> 
>>> 
>>> Folks,
>>> 
>>> I'd like to float a solution to two problems that are related to DCP-2:
>>> 
>>> - how to have a service cast entry specify what datasets it "serves"
>>> - how to have a dataset (or collection) cast specify what services are
>>>  available to query, access, or transform each dataset.
>>> 
>>> The proposed solution is to reuse the two casting standards and the
>>> OpenSearch protocol.
>>> 
>>> The list of datasets that a particular service allows access to might be
>>> lengthy and computed using a search or semantic lookup process.
>>> Trying to 'name' datasets in the service cast is problematic since one
>>> has the problem of what names to use.
>>> 
>>> So the idea is to hide this 'lookup' behind a URL, which could be
>>> an OpenSearch URL for example.
>>> 
>>> So the scast entry would contain a <link> tag as follows:
>>> 
>>> <link rel="http://esipfed.org/ns/discovery/1.1/collection#"
>>>   type="application/atom+xml"
>>>   xmlns:esip="http://esipfed.org/ns/discovery/1.1/"
>>>   esip:protocol="http://a9.com/-/spec/opensearch/1.1/"
>>>   href="<specific OpenSearch URL that does the appropriate search>" />
>>> 
>>> Here I've reused the "collection" URI that has already been defined, but
>>> we could define a more specific one like "collectionsServed".  Either way,
>>> this is the known link (with rel=) to answer that question.
>>> 
>>> The fact that this <link> is an OpenSearch is expressed in the 'protocol'
>>> attribute using the usual URI for the versioned OpenSearch protocol.
>>> Of course, the <link> could be of some other type; e.g. a direct link to
>>> a collection cast.
>>> 
>>> And this is why I think we need both the 'rel' and 'protocol' attributes.
>>> One needs to specify both that the link's purpose is to answer the collectionsServed
>>> question, and the protocol for getting the answer is an OpenSearch yielding a
>>> feed (the collection cast). By using two attributes, each of these URI's could
>>> also be de-referenceable and point to some additional information.
>>> 
>>> The beauty of hiding the collectionsServed question behind a search link
>>> is that it nicely reuses the OpenSearch protocol and the collection cast format.
>>> The list of collectionsServed is available on demand, it can change without
>>> having to alter and re-publish the service cast, and metadata describing the
>>> collections is immediately available in the usual feed format.
>>> 
>>> A GUI that wants to present metadata about the collections served has it
>>> immediately available.  But service metadata and dataset metadata are
>>> strictly separated into their respective casts, and reuse known formats.
>>> 
>>> Behind the OpenSearch link, the collectionsServed lookup might be a
>>> SQL dbase lookup, or a SPARQL query, or involve some semantic reasoning.
>>> Implementers are free to innovate any way they want to, but in the meantime
>>> we can move forward with standardizing the casting formats.  If they don't
>>> want to use OpenSearch for this link, they can also choose an alternate
>>> protocol, at the risk of requiring users to understand additional request &
>>> response formats (besides OS and collection cast).
>>> 
>>> The reverse problem can be solved in the same way.
>>> 
>>> A link to 'servicesAvailable' could appear in each entry in a collection
>>> cast, as in:
>>> 
>>> <link rel="http://esipfed.org/ns/discovery/1.1/service#"
>>>   type="application/atom+xml"
>>>   xmlns:esip="http://esipfed.org/ns/discovery/1.1/"
>>>   esip:protocol="http://a9.com/-/spec/opensearch/1.1/"
>>>   href="<specific OpenSearch URL that does the appropriate search>" />
>>> 
>>> This time the opensearch yields a service cast listing the services available
>>> for that dataset, with the usual metadata in a known format.
>>> 
>>> This also explains why I have been arguing for standardizing two attributes
>>> in the <link> tag:  rel and esip:protocol.  Given this flexibility and extra
>>> power, we can design solutions to thorny problems like above.  And all of
>>> our URI's will be cleanly defined or reused from W3C, and ultimately
>>> could be de-referenceable for more information if that serves additional
>>> purposes.
>>> 
>>> It occurs to me that we might think about defining 'vendor-specific'
>>> MIME types for the casting (extended Atom) formats.  For example:
>>> "application/vnd.esip.discovery.cast.collection" and
>>> "application/vnd.esip.discovery.cast.service".  However, this seems a
>>> bit ugly and perhaps counterproductive.  For most purposes, it
>>> will be better to use the generic Atom or RSS mime type that more
>>> software will know.
>>> 
>>> -- Brian
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Esip-discovery mailing list
>>> Esip-discovery at lists.esipfed.org
>>> http://www.lists.esipfed.org/mailman/listinfo/esip-discovery
>> 
>> --
>> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
>> 
>> 
> 
> --
> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
> 
> 

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann at nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++