[Esip-discovery] Relating services to datasets served, and datasets to available services

Wilson, Brian D (335G) bdwilson at jpl.nasa.gov
Thu Sep 1 18:19:00 EDT 2011


You imply that you somehow feel queasy about going further.  I think going further is a good
idea in order to maximize compatibility with vanilla Atom feeds and OpenSearch, while having
the fully specific attributes we need for ESIP and Earth Science purposes.

The logic for a *three* attribute solution would be as follows.

Unfortunately, the scheme in an HREF simply doesn't identify the specific REST protocol being
used in an HTTP URL.  And the MIME type in the 'type' attribute doesn't carry enough
version information either.  Thus, we need a more specific attribute to indicate versioned DAP
protocol, versioned WMS protocol, etc.

Similarly, to be maximally compatible we need to honor the allowed 'rel' attributes, like icon
and enclosure, that are commonly used.  But then a more specific attribute can contain the
more specific sub-purposes we need.

More concretely, some examples would be:

rel="enclosure"  esip:purpose="...data#"  esip:protocol="DAP"  href="..."
rel="enclosure"  esip:purpose="...data#"  esip:protocol="WMS"  href="..."
rel="enclosure"  esip:purpose="...data#"  href="<http/ftp link to data file>"

rel="icon"  esip:purpose="...browse/image#"  esip:protocol="WMS"  href="..."
rel="icon"  esip:purpose="...browse/image#"  href="<http link to png file>"

rel="describedBy"  esip:purpose="...metadata#"  esip:protocol="OpenSearch"  href="..."
rel="describedBy"  esip:purpose="...metadata#"  type="html" href="<documentation web page>"

rel="search"  esip:purpose="...search#"  esip:protocol="OpenSearch"  href="..."

rel="related"  esip:purpose="...service#"  esip:protocol="WMS"  href="..."

rel="related"  esip:purpose="...serviceCast#"  esip:protocol="OpenSearch"  href="..."
rel="related"  esip:purpose="...serviceCast#"  href="..."

rel="related"  esip:purpose="...collectionCast#"  esip:protocol="OpenSearch"  href="..."
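
As a rough sketch of how a script might consume links like the ones above: the Python
below parses a small Atom entry and collects every link whose esip:purpose marks it as
data, whatever protocol serves it.  The namespace URI is the one used elsewhere in this
thread; the expanded purpose URIs (for "...data#" and "...browse/image#"), the short
protocol token, and the example.org hrefs are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ESIP = "http://esipfed.org/ns/discovery/1.1/"  # namespace quoted in the thread
DATA = ESIP + "data#"                          # assumed expansion of "...data#"

# Hypothetical entry; hrefs are placeholders, not real endpoints.
ENTRY = f"""
<entry xmlns="{ATOM}" xmlns:esip="{ESIP}">
  <link rel="enclosure" esip:purpose="{DATA}" esip:protocol="DAP"
        href="http://example.org/dap/granule.nc"/>
  <link rel="enclosure" esip:purpose="{DATA}"
        href="ftp://example.org/pub/granule.nc"/>
  <link rel="icon" esip:purpose="{ESIP}browse/image#"
        href="http://example.org/browse/granule.png"/>
</entry>
"""

def links_for_purpose(entry_xml, purpose):
    """Return (protocol, href) pairs for every link with the given esip:purpose."""
    entry = ET.fromstring(entry_xml.strip())
    found = []
    for link in entry.findall(f"{{{ATOM}}}link"):
        if link.get(f"{{{ESIP}}}purpose") == purpose:
            # esip:protocol is optional; None means the href scheme is enough.
            found.append((link.get(f"{{{ESIP}}}protocol"), link.get("href")))
    return found

data_links = links_for_purpose(ENTRY, DATA)
print(data_links)
```

A generic Atom reader that knows nothing about the esip: attributes would still see two
ordinary enclosures and an icon, which is the compatibility being argued for.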

The rel="service" is reserved for Atom publishing services according to IANA so I didn't
use it in the "service#" example.  We could use the generic rel="related" for those specific
ESIP purposes that don't have an obvious generic counterpart.

The rel="via"  might also be useful.

Also, it seems like we should have ESIP purposes to distinguish between a related
service, here denoted with "service#", and a service cast denoted "serviceCast#",
which contains metadata for a list of services, and might be the result of an OpenSearch.
That last link type is exactly what would be used in a collectionCast to point to the
list of available services.

Similarly, perhaps "collection#" should be used to point back to the collection in a
data granule cast, and "collectionCast#" indicates the usual datasets cast.
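
To make the four purposes concrete, here is a minimal Python sketch of how a client
might interpret a rel="related" link from its purpose and protocol.  The dict stands in
for the attributes of an already-parsed <link> element, the expanded purpose URIs are
assumed (the examples above abbreviate them as "...service#" and so on), and the
descriptions are only one possible reading of the proposal.

```python
ESIP = "http://esipfed.org/ns/discovery/1.1/"  # namespace quoted in the thread

# Hypothetical table: what a client would expect to find at href for each purpose.
EXPECTED_TARGET = {
    ESIP + "service#": "a single service endpoint",
    ESIP + "serviceCast#": "a feed listing services",
    ESIP + "collection#": "the parent collection of a granule",
    ESIP + "collectionCast#": "a feed listing datasets",
}

def describe_related(link):
    """Explain a rel="related" link using its esip:purpose and esip:protocol."""
    if link.get("rel") != "related":
        return None  # other rel values are outside this sketch
    target = EXPECTED_TARGET.get(link.get("esip:purpose"), "an unrecognized resource")
    protocol = link.get("esip:protocol")
    how = f"via {protocol}" if protocol else "by direct fetch"
    return f"{target}, reached {how}"

print(describe_related({
    "rel": "related",
    "esip:purpose": ESIP + "serviceCast#",
    "esip:protocol": "OpenSearch",
}))
```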

I'll have to think a bit more to see how all of the specific rel's in DCP-1 might fit into
generic rels in common use.  The list of IANA-registered link relations is at:

I think the discussion is evolving the design toward a solution that is maximally compatible
with the generic Atom/OpenSearch world, but fully typed and scriptable for ESIP purposes.

I'm all in favor of formulating a DCP-3 along these lines, replacing the table in DCP-1.

 -- Brian

From: Lynnes, Christopher S. (GSFC-6102) [christopher.s.lynnes at nasa.gov]
Sent: Thursday, September 01, 2011 6:54 AM
To: Wilson, Brian D (335G)
Cc: esip-discovery at lists.esipfed.org; Manipon, Gerald John M (335H-Affiliate)
Subject: Re: [Esip-discovery] Relating services to datasets served, and datasets to available services

  I'm not completely opposed to the two-attribute solution, but unfortunately, adding ESIP-specific attributes to the link element would make me want to go further.  Specifically, I would want to go back to using standard values for the rel attribute, like "enclosure" instead of "http://..../data#" (Note 1). To disambiguate different kinds of enclosures, I would advocate an additional attribute, e.g., esip:enclosure_type="http://..../data#". Likewise, metadata would be rel=describedby, with a disambiguator of esip:describedby_type="http://..../metadata#", and browse would be rel=icon, disambiguated by, well, you get the picture.
  This would remove the single biggest discrepancy between the ESIP use of OpenSearch and the OGC use of OpenSearch. We might be technically compliant then.
  If we went that way, then I would back the two-attribute solution.

Note 1:  "The value "enclosure" signifies that the IRI in the value of the href attribute identifies a related resource that is potentially large in size and might require special handling."
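
One consequence of the per-rel naming pattern suggested above is that a client has to
derive the disambiguating attribute name from the rel value.  The Python sketch below
illustrates that, with a dict standing in for the attributes of a parsed <link>
element.  The attribute names esip:enclosure_type and esip:describedby_type come from
the message; the expanded "data#" URI and the example.org hrefs are assumptions.

```python
ESIP = "http://esipfed.org/ns/discovery/1.1/"  # namespace used elsewhere in the thread

def disambiguator(link):
    """Return the ESIP sub-type of a link under the per-rel naming pattern.

    rel="enclosure" pairs with esip:enclosure_type, rel="describedby" with
    esip:describedby_type; a link without the extra attribute yields None.
    """
    rel = link["rel"]
    return link.get(f"esip:{rel.lower()}_type")

data_link = {
    "rel": "enclosure",
    "esip:enclosure_type": ESIP + "data#",    # assumed expansion of "http://..../data#"
    "href": "http://example.org/granule.nc",  # placeholder
}
plain_link = {"rel": "enclosure", "href": "http://example.org/podcast.mp3"}

print(disambiguator(data_link), disambiguator(plain_link))
```

A single fixed attribute name avoids this derivation step, at the cost of one more
ESIP-specific attribute on every link.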

On Aug 31, 2011, at 9:27 PM, Wilson, Brian D (335G) wrote:

> Chris,
> The reason (I think) that it relates to DCP-2 is that we need to consistently use the
> <link> tag in all of the casting/discovery standards.  I would like to have all the power
> of the two attributes (rel to express purpose and protocol to express plumbing semantics)
> in designing solutions like my proposed idea to "relate" service and dataset casts.
> If any solution besides choice 5 wins for DCP-2, then I'll still want to use the two-attribute
> technique, but other folks will be doing it differently.
> Also, the "collection" rel in DCP-1 is already defined to point to a collection cast in the
> documentation.  This is exactly what I need.
> But let me return to the "point to a data file" case that DCP-2 is directly about.
> I would argue that putting the DAP URI in the rel attribute doesn't properly solve the problem.
> Eventually, people will have to write scripts to automatically process, display, and use the casts.
> In DCP-1, multiple "generic" purposes were defined in the ESIP-discovery namespace for what
> might appear in the 'rel' attribute of a <link> tag.  Like, this points to a 'browse' image, or a
> 'data' file, or a 'collection'.  I think this is a great and proper use of the rel attribute.
> So, any script can look for a <link> tag with rel='data' and know that they have found the link
> that points to a data item.  Of course, the data item might be a file pointed to by ftp or http,
> or a DAP URL, or an OGC/WCS GetCoverage call.
> If we let all of the possibilities for how a data item might be accessed slip into the 'rel' attribute
> as separate values, then to find the data item the script will have to look for a potentially
> open-ended list of 'rel' values and see if one of them is DAP, OGC/WMS, OGC/WCS, webification,
> etc.  It will never be certain or easy to find the linked-to data item.
> I hope we are all in agreement that if there is a data item to be pointed to then it should be
> the link with rel='data'.
> Given that basis, then the protocol to be used to get the data item, if not already
> completely specified by the scheme in the HREF, can be specified with a versioned URI in
> the second ('esip:protocol') attribute.
> The other alternative of jamming the DAP specifier at the end of the 'data' URI both violates
> the proper use of fragments in URIs, and precludes the *hooks* whereby both the rel and
> esip:protocol URIs could be de-referenceable in the future to provide additional information
> and support extensions or additional purposes.
> These decisions are important since currently in DCP's 1 and 2 we are standardizing some
> ground rules and reusable techniques that will be used in the design of all the casting standards.
> We need a consistent and powerful set of design conventions.
> -- Brian
> ________________________________________
> From: Lynnes, Christopher S. (GSFC-6102) [christopher.s.lynnes at nasa.gov]
> Sent: Wednesday, August 31, 2011 6:14 AM
> To: Wilson, Brian D (335G)
> Cc: esip-discovery at lists.esipfed.org; Manipon, Gerald John M (335H-Affiliate)
> Subject: Re: [Esip-discovery] Relating services to datasets served, and datasets to available services
> Brian,
>  Yes, that all makes a lot of sense.  But how do you see it impacting the DCP-2 proposal, which was targeted at alternate access methods (specifically) for individual files, not collections?
> On Aug 30, 2011, at 8:07 PM, Wilson, Brian D (335G) wrote:
>> Folks,
>> I'd like to float a solution to two problems that are related to DCP-2:
>> - how to have a service cast entry specify what datasets it "serves"
>> - how to have a dataset (or collection) cast specify what services are
>>   available to query, access, or transform each dataset.
>> The proposed solution is to reuse the two casting standards and the
>> OpenSearch protocol.
>> The list of datasets that a particular service allows access to might be
>> lengthy and computed using a search or semantic lookup process.
>> Trying to 'name' datasets in the service cast is problematic since one
>> has the problem of what names to use.
>> So the idea is to hide this 'lookup' behind a URL, which could be
>> an OpenSearch URL for example.
>> So the scast entry would contain a <link> tag as follows:
>> <link rel="http://esipfed.org/ns/discovery/1.1/collection#"
>>    type="application/atom+xml"
>>    xmlns:esip="http://esipfed.org/ns/discovery/1.1/"
>>    esip:protocol="http://a9.com/-/spec/opensearch/1.1/"
>>    href="<specific OpenSearch URL that does the appropriate search>" />
>> Here I've reused the "collection" URI that has already been defined, but
>> we could define a more specific one like "collectionsServed".  Either way,
>> this is the known link (with rel=) to answer that question.
>> The fact that this <link> is an OpenSearch is expressed in the 'protocol'
>> attribute using the usual URI for the versioned OpenSearch protocol.
>> Of course, the <link> could be of some other type; e.g. a direct link to
>> a collection cast.
>> And this is why I think we need both the 'rel' and 'protocol' attributes.
>> One needs to specify both that the link's purpose is to answer the collectionsServed
>> question, and the protocol for getting the answer is an OpenSearch yielding a
>> feed (the collection cast). By using two attributes, each of these URI's could
>> also be de-referenceable and point to some additional information.
>> The beauty of hiding the collectionsServed question behind a search link
>> is that it nicely reuses the OpenSearch protocol and the collection cast format.
>> The list of collectionsServed is available on demand, it can change without
>> having to alter and re-publish the service cast, and metadata describing the
>> collections is immediately available in the usual feed format.
>> A GUI that wants to present metadata about the collections served has it
>> immediately available.  But service metadata and dataset metadata are
>> strictly separated into their respective casts, and reuse known formats.
>> Behind the OpenSearch link, the collectionsServed lookup might be a
>> SQL dbase lookup, or a SPARQL query, or involve some semantic reasoning.
>> Implementers are free to innovate any way they want to, but in the meantime
>> we can move forward with standardizing the casting formats.  If they don't
>> want to use OpenSearch for this link, they can also choose an alternate
>> protocol, at the risk of requiring users to understand additional request &
>> response formats (besides OS and collection cast).
>> The reverse problem can be solved in the same way.
>> A link to 'servicesAvailable' could appear in each entry in a collection
>> cast, as in:
>> <link rel="http://esipfed.org/ns/discovery/1.1/service#"
>>    type="application/atom+xml"
>>    xmlns:esip="http://esipfed.org/ns/discovery/1.1/"
>>    esip:protocol="http://a9.com/-/spec/opensearch/1.1/"
>>    href="<specific OpenSearch URL that does the appropriate search>" />
>> This time the opensearch yields a service cast listing the services available
>> for that dataset, with the usual metadata in a known format.
>> This also explains why I have been arguing for standardizing two attributes
>> in the <link> tag:  rel and esip:protocol.  Given this flexibility and extra
>> power, we can design solutions to thorny problems like above.  And all of
>> our URI's will be cleanly defined or reused from W3C, and ultimately
>> could be de-referenceable for more information if that serves additional
>> purposes.
>> It occurs to me that we might think about defining 'vendor-specific'
>> MIME types for the casting (extended Atom) formats.  For example:
>> "application/vnd.esip.discovery.cast.collection" and
>> "application/vnd.esip.discovery.cast.service".  However, this seems a
>> bit ugly and perhaps counterproductive.  For most purposes, it
>> will be better to use the generic Atom or RSS mime type that more
>> software will know.
>> -- Brian
>> _______________________________________________
>> Esip-discovery mailing list
>> Esip-discovery at lists.esipfed.org
>> http://www.lists.esipfed.org/mailman/listinfo/esip-discovery
> --
> Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185

Dr. Christopher Lynnes     NASA/GSFC, Code 610.2    phone: 301-614-5185
