[Esip-discovery] open search extensions

Tue Jan 25 00:04:43 EST 2011

Hi all,
I'd like to open up a discussion of the Open Search efforts within 
ESDSWG/ESIP. I am fairly new to this community so please forgive me if 
some of these issues and ideas have been discussed already.

At UNAVCO we have a NASA ROSES grant (GSAC) under which we are defining 
a common web service and HTML API for querying geodesy data 
repositories. We currently have 3 implementations of these GSAC 
repositories running at UNAVCO, Scripps' SOPAC and (soon) NASA's CDDIS 
sites. We also have a 4th implementation, a federated repository that 
can search across multiple external repositories, and will soon have a 
5th implementation, a stand-alone repository for third-party sites. 
Here is UNAVCO's implementation:
http://facility.unavco.org/gsacws/gsacapi/site/form

As part of this work we have developed a facility for repositories to 
describe their query capabilities. This is similar to the Open Search 
initiative but it is a more flexible and richer mechanism. You can see 
the auto-generated API documentation here:
http://facility.unavco.org/gsacws/gsacapi/repository/info
and a write-up on this here:
http://facility.unavco.org/data/gsacws/development/capabilities.html

Note that the site form above, as well as the resource query form:
http://facility.unavco.org/gsacws/gsacapi/resource/form
are generated entirely by reading the declarative specification of 
the capabilities of the underlying repository.

The basic idea is that a data repository describes a query endpoint URL 
and a set of possible URL arguments. The key difference between 
this approach and Open Search is the introduction of a type mechanism. 
As described in the above documentation, each search parameter has a type 
associated with it. Based on the type, a front end can assemble the 
appropriate search interface. So, for example, our site searches have 
parameters such as site code (string), name (string), type 
(enumeration), bounds (spatial bounds), etc.
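
To make the type idea concrete, here is a minimal Python sketch. It is 
not the GSAC code: the Capability class, the type names, the argument 
ids and the enumeration values are all hypothetical, chosen only to 
mirror the site search parameters listed above. The point is that the 
front end knows about types, not about individual parameters.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Capability:
    """One searchable parameter (hypothetical structure, not the GSAC schema)."""
    id: str             # name of the URL argument
    label: str          # human-readable label for the form
    type: str           # "string", "enumeration" or "spatial_bounds"
    values: tuple = ()  # allowed values, used only by enumerations


# Made-up capabilities mirroring the site search example.
SITE_CAPABILITIES = [
    Capability("site.code", "Site Code", "string"),
    Capability("site.name", "Site Name", "string"),
    Capability("site.type", "Site Type", "enumeration", ("gps", "tiltmeter")),
    Capability("bounds", "Bounds", "spatial_bounds"),
]


def render_field(cap):
    """Pick a form widget based only on the capability's type."""
    name = escape(cap.id, quote=True)
    if cap.type == "string":
        return f'<input type="text" name="{name}">'
    if cap.type == "enumeration":
        options = "".join(f"<option>{escape(v)}</option>" for v in cap.values)
        return f'<select name="{name}">{options}</select>'
    if cap.type == "spatial_bounds":
        sides = ("north", "south", "east", "west")
        return " ".join(f'<input type="text" name="{name}.{s}">' for s in sides)
    raise ValueError(f"unknown capability type: {cap.type}")


def render_form(capabilities):
    """Assemble one labelled line per capability."""
    return "\n".join(f"{escape(c.label)}: {render_field(c)}" for c in capabilities)


print(render_form(SITE_CAPABILITIES))
```

Adding a new searchable parameter then only means adding a capability 
to the list; render_field changes only when a new type is introduced.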

As we describe in the write-up, Open Search does support extensions, 
but these are still essentially hard-coded: front ends 
need explicit knowledge of each extension to produce search 
interfaces. With the capabilities/type approach, the only "hard-coding" 
required of a front end is an understanding of the type 
system. This provides much greater flexibility in defining what can be 
searched for. For example, in our resource (i.e., file) search interface 
there are actually two different dates that can be searched for: the data 
date and the archive publish date.
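
As a rough illustration of the two-date case, the Python sketch below 
builds a query URL from typed parameters. Everything in it is an 
assumption for illustration: the endpoint, the argument names 
(resource.datadate, resource.publishdate) and the ".from"/".to" 
convention are hypothetical and are not taken from the GSAC API.

```python
from urllib.parse import urlencode

# Hypothetical capability table: argument id -> type.
RESOURCE_CAPABILITIES = {
    "resource.datadate": "date_range",
    "resource.publishdate": "date_range",
    "site.code": "string",
}


def build_query(endpoint, capabilities, values):
    """Turn typed search values into URL arguments.

    A date_range value is a (start, end) pair and expands into two
    arguments; any other type is passed through as a single argument.
    An unknown argument id raises KeyError.
    """
    args = []
    for cap_id, value in values.items():
        cap_type = capabilities[cap_id]
        if cap_type == "date_range":
            start, end = value
            args.append((cap_id + ".from", start))
            args.append((cap_id + ".to", end))
        else:
            args.append((cap_id, value))
    return endpoint + "?" + urlencode(args)


url = build_query(
    "http://example.org/resource/search",  # placeholder endpoint
    RESOURCE_CAPABILITIES,
    {
        "resource.datadate": ("2010-01-01", "2010-12-31"),
        "resource.publishdate": ("2011-01-01", "2011-01-24"),
    },
)
print(url)
```

Both dates go through the same date_range code path, so a second (or 
third) date parameter needs no new logic in the client.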

What this has enabled us to do is to develop a generic layer called the 
GSAC Service Layer (GSL). The GSL (a Java servlet) reads a set of 
capabilities from an underlying repository implementation and generates 
the appropriate interfaces. Likewise, in our implementation of a 
federated search repository the system reads the capabilities from a set 
of external repositories and merges those capabilities together.
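
One plausible way to do such a merge is sketched below in Python. This 
is only an assumption about how merging could work, not a description 
of what the GSL does: capabilities are matched by id, enumeration 
values are unioned, and a type mismatch is treated as an error. The 
dictionary layout and the example values are hypothetical.

```python
def merge_capabilities(repositories):
    """Merge capability lists from several repositories, keyed by id.

    Each capability is a dict with "id", "type" and optional "values".
    Enumeration values are unioned in first-seen order; the same id
    declared with two different types raises ValueError.
    """
    merged = {}
    for repo_caps in repositories:
        for cap in repo_caps:
            existing = merged.get(cap["id"])
            if existing is None:
                merged[cap["id"]] = {
                    "id": cap["id"],
                    "type": cap["type"],
                    "values": list(cap.get("values", [])),
                }
            elif existing["type"] != cap["type"]:
                raise ValueError(f"conflicting types for {cap['id']}")
            else:
                for value in cap.get("values", []):
                    if value not in existing["values"]:
                        existing["values"].append(value)
    return list(merged.values())


# Two made-up repositories with overlapping capabilities.
repo_a = [
    {"id": "site.code", "type": "string"},
    {"id": "site.type", "type": "enumeration", "values": ["gps"]},
]
repo_b = [
    {"id": "site.type", "type": "enumeration", "values": ["gps", "tiltmeter"]},
    {"id": "bounds", "type": "spatial_bounds"},
]

federated = merge_capabilities([repo_a, repo_b])
print([cap["id"] for cap in federated])
```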

While the goal of having open search interfaces is important, the 
text-search-oriented focus of the Open Search spec is perhaps too 
limited for the complexities inherent in geoscience data systems.

While our major focus has been on development, we'd like to see if some 
of these ideas can gain traction in the broader ESIP/ESDSWG communities 
and would welcome discussion and feedback.

Thanks,
-Jeff