[Esip-discovery] Relating services to datasets served, and datasets to available services
Mattmann, Chris A (388J)
chris.a.mattmann at jpl.nasa.gov
Thu Sep 1 12:51:50 EDT 2011
Hi Guys,
On Sep 1, 2011, at 6:54 AM, Lynnes, Christopher S. (GSFC-6102) wrote:
> Brian,
> I'm not completely opposed to the two-attribute solution, but unfortunately, adding ESIP-specific attributes to the link element would make me want to go further. Specifically, I would want to go back to using standard values for the rel attribute, like "enclosure" instead of "http://..../data#" (Note 1). To disambiguate different kinds of enclosures, I would advocate an additional attribute, e.g., esip:enclosure_type="http://..../data#". Likewise, metadata would be rel=describedby, with a disambiguator of esip:describedby_type="http://..../metadata#" and browse would be rel=icon, disambiguated by, well, you get the picture.
> This would remove the single biggest discrepancy between the ESIP use of OpenSearch and the OGC use of OpenSearch. We might be technically compliant then.
> If we went that way, then I would back the two-attribute solution.
+1, I would favor standard rel="" values such as enclosure, with the additional metadata that Chris suggests.
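
A minimal sketch of how a client might read such links, assuming Python's standard xml.etree module. The sample entry, the example.org hrefs, and the two type URIs are hypothetical placeholders (the real type URIs are elided as "http://..../data#" above); only the ESIP namespace URI is taken from Brian's message quoted below.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ESIP_NS = "http://esipfed.org/ns/discovery/1.1/"  # namespace used in this thread

# Placeholder type URIs: the real ones are elided in the message above.
DATA_TYPE = "http://example.org/ns/data#"
METADATA_TYPE = "http://example.org/ns/metadata#"

# Hypothetical entry using standard rel values plus an ESIP disambiguator.
SAMPLE_ENTRY = f"""
<entry xmlns="{ATOM_NS}" xmlns:esip="{ESIP_NS}">
  <link rel="enclosure" esip:enclosure_type="{DATA_TYPE}"
        href="http://example.org/granule.hdf"/>
  <link rel="describedby" esip:describedby_type="{METADATA_TYPE}"
        href="http://example.org/granule.xml"/>
</entry>
"""


def find_links(entry, rel, type_attr, type_uri):
    """Return hrefs of links with a standard rel and a matching ESIP type attribute."""
    qualified = f"{{{ESIP_NS}}}{type_attr}"
    return [
        link.get("href")
        for link in entry.findall(f"{{{ATOM_NS}}}link")
        if link.get("rel") == rel and link.get(qualified) == type_uri
    ]


entry = ET.fromstring(SAMPLE_ENTRY)
data_hrefs = find_links(entry, "enclosure", "enclosure_type", DATA_TYPE)
metadata_hrefs = find_links(entry, "describedby", "describedby_type", METADATA_TYPE)
```

A generic Atom reader that knows nothing about ESIP would still see an ordinary enclosure here, which is the interoperability point being made.
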
Cheers,
Chris
>
> Note 1: "The value "enclosure" signifies that the IRI in the value of the href attribute identifies a related resource that is potentially large in size and might require special handling."
>
> On Aug 31, 2011, at 9:27 PM, Wilson, Brian D (335G) wrote:
>
>>
>> Chris,
>>
>> The reason (I think) that it relates to DCP-2 is that we need to consistently use the
>> <link> tag in all of the casting/discovery standards. I would like to have all the power
>> of the two attributes (rel to express purpose and protocol to express plumbing semantics)
>> in designing solutions like my proposed idea to "relate" service and dataset casts.
>>
>> If any solution besides choice 5 wins for DCP-2, then I'll still want to use the two-attribute
>> technique, but other folks will be doing it differently.
>>
>> Also, the "collection" rel in DCP-1 is already defined to point to a collection cast in the
>> documentation. This is exactly what I need.
>>
>> But let me return to the "point to a data file" case that DCP-2 is directly about.
>>
>> I would argue that putting the DAP URI in the rel attribute doesn't properly solve the problem.
>>
>> Eventually, people will have to write scripts to automatically process, display, and use the casts.
>> In DCP-1, multiple "generic" purposes were defined in the ESIP-discovery namespace for what
>> might appear in the 'rel' attribute of a <link> tag. Like, this points to a 'browse' image, or a
>> 'data' file, or a 'collection'. I think this is a great and proper use of the rel attribute.
>>
>> So, any script can look for a <link> tag with rel='data' and know that it has found the link
>> that points to a data item. Of course, the data item might be a file pointed to by ftp or http,
>> or a DAP URL, or an OGC/WCS GetCoverage call.
>>
>> If we let all of the possibilities for how a data item might be accessed slip into the 'rel' attribute
>> as separate values, then to find the data item the script will have to look for a potentially
>> open-ended list of 'rel' values and see if one of them is DAP, OGC/WMS, OGC/WCS, webification,
>> etc. It will never be certain or easy to find the linked-to data item.
>>
>> I hope we are all in agreement that if there is a data item to be pointed to then it should be
>> the link with rel='data'.
>>
>> Given that basis, then the protocol to be used to get the data item, if not already
>> completely specified by the scheme in the HREF, can be specified with a versioned URI in
>> the second ('esip:protocol') attribute.
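
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the "data#" rel URI is formed by analogy with the "collection#" URI that appears later in this thread, and the DAP protocol URI, the sample entry, and the example.org hrefs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

ATOM_NS = "http://www.w3.org/2005/Atom"
ESIP_NS = "http://esipfed.org/ns/discovery/1.1/"
DATA_REL = ESIP_NS + "data#"      # assumed by analogy with ".../collection#"
BROWSE_REL = ESIP_NS + "browse#"  # assumed likewise
DAP_PROTOCOL = "http://example.org/protocol/dap/2.0/"  # hypothetical versioned URI

# Hypothetical granule entry: two ways to reach the data, plus a browse image.
SAMPLE_ENTRY = f"""
<entry xmlns="{ATOM_NS}" xmlns:esip="{ESIP_NS}">
  <link rel="{BROWSE_REL}" href="http://example.org/browse/file.png"/>
  <link rel="{DATA_REL}" href="ftp://example.org/data/file.nc"/>
  <link rel="{DATA_REL}" esip:protocol="{DAP_PROTOCOL}"
        href="http://example.org/opendap/file.nc"/>
</entry>
"""


def data_links(entry):
    """Return (href, access) pairs for every link whose rel marks a data item.

    'access' is the esip:protocol URI when present; otherwise the scheme of
    the href is taken to specify the access method completely.
    """
    results = []
    for link in entry.findall(f"{{{ATOM_NS}}}link"):
        if link.get("rel") != DATA_REL:
            continue  # one fixed rel value to test, not an open-ended list
        href = link.get("href")
        protocol = link.get(f"{{{ESIP_NS}}}protocol")
        results.append((href, protocol or urlparse(href).scheme))
    return results


found = data_links(ET.fromstring(SAMPLE_ENTRY))
```

The script tests a single rel value to find the data item and consults the second attribute only to decide how to fetch it, so new access protocols do not change the search logic.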
>>
>> The other alternative of jamming the DAP specifier at the end of the 'data' URI both violates
>> the proper use of fragments in URIs, and precludes the *hooks* that both the rel and
>> esip:protocol URI's could be de-referenceable in the future to provide additional information
>> and support extensions or additional purposes.
>>
>> These decisions are important since currently in DCP's 1 and 2 we are standardizing some
>> ground rules and reusable techniques that will be used in the design of all the casting standards.
>> We need a consistent and powerful set of design conventions.
>>
>> -- Brian
>>
>>
>>
>>
>> ________________________________________
>> From: Lynnes, Christopher S. (GSFC-6102) [christopher.s.lynnes at nasa.gov]
>> Sent: Wednesday, August 31, 2011 6:14 AM
>> To: Wilson, Brian D (335G)
>> Cc: esip-discovery at lists.esipfed.org; Manipon, Gerald John M (335H-Affiliate)
>> Subject: Re: [Esip-discovery] Relating services to datasets served, and datasets to available services
>>
>> Brian,
>> Yes, that all makes a lot of sense. But how do you see it impacting the DCP-2 proposal, which was targeted at alternate access methods (specifically) for individual files, not collections?
>>
>> On Aug 30, 2011, at 8:07 PM, Wilson, Brian D (335G) wrote:
>>
>>>
>>> Folks,
>>>
>>> I'd like to float a solution to two problems that are related to DCP-2:
>>>
>>> - how to have a service cast entry specify what datasets it "serves"
>>> - how to have a dataset (or collection) cast specify what services are
>>> available to query, access, or transform each dataset.
>>>
>>> The proposed solution is to reuse the two casting standards and the
>>> OpenSearch protocol.
>>>
>>> The list of datasets that a particular service allows access to might be
>>> lengthy and computed using a search or semantic lookup process.
>>> Trying to 'name' datasets in the service cast is problematic since one
>>> has the problem of what names to use.
>>>
>>> So the idea is to hide this 'lookup' behind a URL, which could be
>>> an OpenSearch URL for example.
>>>
>>> So the scast entry would contain a <link> tag as follows:
>>>
>>> <link rel="http://esipfed.org/ns/discovery/1.1/collection#"
>>> type="application/atom+xml"
>>> xmlns:esip="http://esipfed.org/ns/discovery/1.1/"
>>> esip:protocol="http://a9.com/-/spec/opensearch/1.1/"
>>> href="<specific OpenSearch URL that does the appropriate search>" />
>>>
>>> Here I've reused the "collection" URI that has already been defined, but
>>> we could define a more specific one like "collectionsServed". Either way,
>>> this is the known link (with rel=) to answer that question.
>>>
>>> The fact that this <link> is an OpenSearch is expressed in the 'protocol'
>>> attribute using the usual URI for the versioned OpenSearch protocol.
>>> Of course, the <link> could be of some other type; e.g. a direct link to
>>> a collection cast.
>>>
>>> And this is why I think we need both the 'rel' and 'protocol' attributes.
>>> One needs to specify both that the link's purpose is to answer the collectionsServed
>>> question, and the protocol for getting the answer is an OpenSearch yielding a
>>> feed (the collection cast). By using two attributes, each of these URI's could
>>> also be de-referenceable and point to some additional information.
>>>
>>> The beauty of hiding the collectionsServed question behind a search link
>>> is that it nicely reuses the OpenSearch protocol and the collection cast format.
>>> The list of collectionsServed is available on demand, it can change without
>>> having to alter and re-publish the service cast, and metadata describing the
>>> collections is immediately available in the usual feed format.
>>>
>>> A GUI that wants to present metadata about the collections served has it
>>> immediately available. But service metadata and dataset metadata are
>>> strictly separated into their respective casts, and reuse known formats.
>>>
>>> Behind the OpenSearch link, the collectionsServed lookup might be a
>>> SQL database lookup, or a SPARQL query, or involve some semantic reasoning.
>>> Implementers are free to innovate any way they want to, but in the meantime
>>> we can move forward with standardizing the casting formats. If they don't
>>> want to use OpenSearch for this link, they can also choose an alternate
>>> protocol, at the risk of requiring users to understand additional request &
>>> response formats (besides OS and collection cast).
>>>
>>> The reverse problem can be solved in the same way.
>>>
>>> A link to 'servicesAvailable' could appear in each entry in a collection
>>> cast, as in:
>>>
>>> <link rel="http://esipfed.org/ns/discovery/1.1/service#"
>>> type="application/atom+xml"
>>> xmlns:esip="http://esipfed.org/ns/discovery/1.1/"
>>> esip:protocol="http://a9.com/-/spec/opensearch/1.1/"
>>> href="<specific OpenSearch URL that does the appropriate search>" />
>>>
>>> This time the OpenSearch yields a service cast listing the services available
>>> for that dataset, with the usual metadata in a known format.
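
Both directions (collections served by a service, and services available for a dataset) follow the same pattern, so one helper can cover them. A rough sketch, not a definitive client: the rel and protocol URIs are the ones shown in the <link> examples above, while the sample entries, titles, and example.org hrefs are hypothetical stand-ins for the "<specific OpenSearch URL ...>" placeholder.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

ATOM_NS = "http://www.w3.org/2005/Atom"
ESIP_NS = "http://esipfed.org/ns/discovery/1.1/"
COLLECTION_REL = ESIP_NS + "collection#"  # service cast -> collections served
SERVICE_REL = ESIP_NS + "service#"        # collection cast -> services available
OPENSEARCH = "http://a9.com/-/spec/opensearch/1.1/"


def sample_entry(title, rel, href):
    """Build a hypothetical cast entry carrying one two-attribute link."""
    return ET.fromstring(f"""
    <entry xmlns="{ATOM_NS}" xmlns:esip="{ESIP_NS}">
      <title>{title}</title>
      <link rel="{rel}" type="application/atom+xml"
            esip:protocol="{OPENSEARCH}" href="{href}"/>
    </entry>
    """)


def related_cast_link(entry, rel):
    """Return (href, protocol) of the first link with the given rel, or None."""
    for link in entry.findall(f"{{{ATOM_NS}}}link"):
        if link.get("rel") == rel:
            return link.get("href"), link.get(f"{{{ESIP_NS}}}protocol")
    return None


def fetch_entry_titles(href):
    """Dereference the link and list entry titles from the returned Atom feed.

    Defined for illustration only; it is not called here because the sample
    URLs do not exist.
    """
    with urlopen(href) as response:
        feed = ET.parse(response).getroot()
    return [e.findtext(f"{{{ATOM_NS}}}title") for e in feed.iter(f"{{{ATOM_NS}}}entry")]


scast_entry = sample_entry(
    "Example subsetting service", COLLECTION_REL,
    "http://example.org/opensearch?servedBy=subsetter")
ccast_entry = sample_entry(
    "Example dataset", SERVICE_REL,
    "http://example.org/opensearch?servicesFor=dataset1")

collections_link = related_cast_link(scast_entry, COLLECTION_REL)
services_link = related_cast_link(ccast_entry, SERVICE_REL)
```

A client that finds esip:protocol equal to the OpenSearch URI knows it can issue the request and parse the response as a feed; any other protocol value would need its own handler.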
>>>
>>> This also explains why I have been arguing for standardizing two attributes
>>> in the <link> tag: rel and esip:protocol. Given this flexibility and extra
>>> power, we can design solutions to thorny problems like the ones above. And all of
>>> our URI's will be cleanly defined or reused from W3C, and ultimately
>>> could be de-referenceable for more information if that serves additional
>>> purposes.
>>>
>>> It occurs to me that we might think about defining 'vendor-specific'
>>> MIME types for the casting (extended Atom) formats. For example:
>>> "application/vnd.esip.discovery.cast.collection" and
>>> "application/vnd.esip.discovery.cast.service". However, this seems a
>>> bit ugly and perhaps counterproductive. For most purposes, it
>>> will be better to use the generic Atom or RSS mime type that more
>>> software will know.
>>>
>>> -- Brian
>>>
>>>
>>>
>>> _______________________________________________
>>> Esip-discovery mailing list
>>> Esip-discovery at lists.esipfed.org
>>> http://www.lists.esipfed.org/mailman/listinfo/esip-discovery
>>
>> --
>> Dr. Christopher Lynnes NASA/GSFC, Code 610.2 phone: 301-614-5185
>>
>>
>
> --
> Dr. Christopher Lynnes NASA/GSFC, Code 610.2 phone: 301-614-5185
>
>
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann at nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++