[Esip-preserve] Identifiers and REST

Bruce Barkstrom brbarkstrom at gmail.com
Fri Apr 29 11:08:41 EDT 2011


I was reading Fielding, R. T. and Taylor, R. N., 2002: Principled
Design of the Modern Web Architecture, ACM Trans. Internet
Technology, Vol. 2, 115-150, a fundamental statement that
lays out the definition of REST.  They note:

"Most software systems are created with the implicit assumption
that the entire system is under the control of one entity, or at
least that all entities participating within a system are acting
towards a common goal and not at cross-purposes.  Such an
assumption cannot be safely made when a system runs openly
on the Internet.  Anarchic scalability refers to the need for architectural
elements to continue operating under unanticipated load, or when
given malformed or maliciously constructed data, since they
may be communicating with elements outside their organizational
control.
...
Multiple organizational boundaries imply that multiple trust
boundaries could be present in any communication.
...
Multiple organizational boundaries also mean that the system
must be prepared for gradual and fragmented change." [p. 119]

I think the notion in this quote also implies that one should expect
separate vocabularies and naming conventions inside individual
organizational boundaries.  As an example, I think it's pretty clear
that the part identifiers used by Ford differ from those used by GM
or Toyota - even for parts that probably fulfill the same functional
requirements.

This suggests that each organization will have its own schemas
for workflow identifiers or file identifiers - whether we like it or not.
I suspect the impossibility of obtaining a single schema is clear,
even with the rage for "standards".  Thus, it may be particularly
important to try to design systems keeping three things in mind:

1.  Systems designers need some way of organizing things, so
they need clear definitions of the architectural structures (particularly
the relationships between objects in the architecture) and they
need to provide definitions that are understandable to people outside
of the architecture design group.

2.  Designers should be prepared for multiple identifier schemas and
for translators between them.  This may get interesting if the schemas
make different organizational assumptions.  Putting this a bit more
concretely, it would be sensible to assume that if there is a
hierarchy, then there will be an alternative hierarchy that some
other organization prefers.  For example, one group of librarians
will use a Dewey Decimal hierarchy for classifying nonfiction, while
another will use the Library of Congress classification scheme.
Same objects - different classification.

3.  Gradual and fragmented change also implies fragmented and
gradual identifier schema change.  In one case, there are some
data files that were stored on the old HPSS storage system with
an identifier schema based on the assumption that magnetic tapes
were stable and would be there forever.  Of course, that classification
and naming convention doesn't seem sensible now.  However, to migrate
from the old convention appears expensive - and the people running
the archive don't feel like they've got the time to design a new system.
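
To make points 2 and 3 a bit more concrete, here is a minimal sketch
in Python.  It is only an illustration under stated assumptions: the
names Crosswalk and resolve, the scheme labels, and every identifier
shown are hypothetical and do not come from any real archive.  The
idea is that a translator between two schemas is a two-way lookup
that has to tolerate gaps, and that a retired naming convention can
be kept alive as a table of aliases rather than migrated all at once.

```python
from dataclasses import dataclass, field


@dataclass
class Crosswalk:
    """Two-way lookup between identifiers from two independent schemes."""
    scheme_a: str
    scheme_b: str
    _a_to_b: dict = field(default_factory=dict)
    _b_to_a: dict = field(default_factory=dict)

    def link(self, id_a, id_b):
        # Record that both identifiers name the same object.
        self._a_to_b[id_a] = id_b
        self._b_to_a[id_b] = id_a

    def translate(self, identifier, source):
        # Return the counterpart identifier, or None when no mapping is
        # known.  Gaps are expected, since neither organization controls
        # the other's schema.
        if source == self.scheme_a:
            return self._a_to_b.get(identifier)
        if source == self.scheme_b:
            return self._b_to_a.get(identifier)
        raise ValueError(f"unknown scheme: {source}")


def resolve(identifier, aliases):
    """Follow old-to-new aliases until a current identifier is reached."""
    seen = set()
    while identifier in aliases:
        if identifier in seen:
            raise ValueError(f"alias cycle at {identifier}")
        seen.add(identifier)
        identifier = aliases[identifier]
    return identifier


# Hypothetical identifiers, for illustration only.
crosswalk = Crosswalk("org_a", "org_b")
crosswalk.link("A-0001", "B/17")

# Old tape-based names kept as aliases for newer identifiers.
legacy_aliases = {"tape-7/file-3": "obj-42"}
```

With this sketch, crosswalk.translate("A-0001", "org_a") looks up the
other organization's name for the same object, and an identifier with
no known counterpart yields None instead of an error.  The alias table
lets old identifiers keep resolving while the migration proceeds
gradually.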

Having said this, I don't think one wants to stop work - but new systems
are not silver bullets that are going to transform the world without work.
To put it somewhat facetiously: "take one pill of clarity about current
assumptions and then put your nose back to the grindstone".

Bruce B.

