[Esip-preserve] A Note on Use Cases and Definitions

Bruce Barkstrom brbarkstrom at gmail.com
Sun May 15 14:04:57 EDT 2011


Since I didn't mention it in the previous e-mail, the new page
on Use Cases is on the ESIP Preservation Cluster Wiki at
http://wiki.esipfed.org/index.php/Preservation_Use_Cases

I'd like to suggest we follow the example of the Oxford
English Dictionary (OED) and make sure that every definition
we come up with in things like the Provenance Ontology
and the Content Standard has a reference to one or more instances
of the use of that term in the Use Case catalog.  We need
to make sure that our definitions have ties to the real communities
of data producers and data users - and not just to our own, probably
limited, mental models of the definitions.

As an example, several of the metadata standards, including
PREMIS and OPM, refer to the term "events".  I think "event"
is used by a more diverse set of linguistic communities of practice,
whose use of the term has a longer history than the relatively
recent discussions in the IT community.  Here are some examples:

1.  Single Event Upset (SEU) [http://en.wikipedia.org/wiki/Single_event_upset]
is "a change of state caused by ions or electro-magnetic radiation
striking a sensitive node in a micro-electronic device, such as in a
microprocessor, semiconductor memory, or power transistors."
This kind of event is important in satellite data collection because it
corrupts data - and may lead to exception handling (like having to
turn off instruments so the satellite can be put back into an acceptable
operating condition), entries in exception event logs, and remedial
recovery processing that might be needed to repair data records.
Clearly, repairs to files to correct for SEUs need to be noted in
collection versioning.  Note also that this kind of example has been
well-known in the satellite data collection community since Earth
science data collection from satellites began in the 1950s.

2.  Event: "A change in state arising from a stimulus within the system
or external to the system; or because of the passage of time." [Klein, M. H.,
Ralya, T., Pollak, B., Obenza, R., Harbour, M. G., 1993: "A Practitioner's
Handbook for Real-Time Analysis: Guide to Rate Monotonic Analysis for
Real-Time Systems", Kluwer Academic Publishers, Boston, MA]  In this
case, the term "Event" is used in the design and operation of embedded
software systems, which are typically multi-CPU, concurrent software
systems.  This includes the current design of large-scale data production
and archival systems, almost certainly including NPSS (or whatever NPOESS
has become), as well as EOSDIS.

3.  [Accounting]: "Happenings of consequence can also be classified
into two categories:
- Government-related EVENTS [emphasis added] represent mainly accidents
for which the government is responsible and required by law to reimburse
the injured parties for damages
- Government-acknowledged EVENTS [emphasis added] are occurrences
for which the government is not responsible but elects, as a matter of policy,
to provide relief to the victims.  They include primarily natural disasters,
such as hurricanes and earthquakes."  [Granof, M. H., 2007: "Government
and Not-for-Profit Accounting: Concepts and Practices", Fourth Edition,
John Wiley & Sons, Hoboken, NJ]  In this case, the relevance of the
definition arises from such user classes as individuals who want to
obtain data confirming or denying legal liability for particular events, such
as the opening of the Morganza Spillway in the current Mississippi flood.
Thus, this kind of "event" relates to production history provenance and
provenance of custodianship.

4. [Hierarchical Production Control Systems in Industrial Engineering and
Operations Research]  Here, the reference is to [Gershwin, S. B., 1994:
"Manufacturing Systems Engineering", Prentice-Hall, Englewood Cliffs, NJ]
which does not appear to have a glossary, but refers to events in many places
in the text.  On p. 22, Gershwin notes that "Intuitively, an event is an
occurrence, as we usually think of it.  Technically, it is a subset of the
sample space."  I believe this relies on a formal notion that the state of a
manufacturing system is a probability space and that events mark transitions
from one state to another.  What gets interesting in this book is that
Gershwin introduces the notion that "An activity is a pair of events
associated with a resource."
[p. 363].  This allows him to introduce event hierarchies and use them in
designing hierarchical production control systems.  This theoretical background
ties events in with Work Breakdown Structures that are used
in planning and scheduling production - in Gershwin's case, this use means
essentially any kind of industrial production.  By extension, this includes
data production scheduling for Earth sciences - which is directly related to
the production patterns that shape the organization of our data collections.
Industrial engineering, of course, has formal roots that go back into the early
1800s.
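
To make the "pair of events associated with a resource" idea concrete, here
is a minimal Python sketch.  It is not Gershwin's formalism: the class names,
the numeric time stamp, and the treatment of a hierarchy as one activity's
time span nesting inside another's are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An occurrence that marks a change of state (hypothetical model)."""
    name: str
    time: float  # assumed time stamp, e.g. hours since the start of a run


@dataclass(frozen=True)
class Activity:
    """A pair of events (start, end) associated with a resource."""
    resource: str
    start: Event
    end: Event

    def __post_init__(self):
        # An activity cannot end before it starts.
        if self.end.time < self.start.time:
            raise ValueError("end event precedes start event")

    def duration(self) -> float:
        return self.end.time - self.start.time

    def contains(self, other: "Activity") -> bool:
        """True if `other` lies within this activity's time span."""
        return (self.start.time <= other.start.time
                and other.end.time <= self.end.time)


# Hypothetical data production example: a granule job inside a production run.
run = Activity("processing cluster",
               Event("run start", 0.0), Event("run end", 6.0))
granule = Activity("processing cluster",
                   Event("granule start", 1.0), Event("granule end", 1.5))
```

Here run.contains(granule) is true, which is one simple way a production
run could be treated as the parent of the jobs scheduled within it.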

I'm not sure how best to incorporate the broader community definitions
- although where system designers have to pull in concepts from any of these
existing communities, they may not care what definitions arise from a small
group of people divorced from their work (whether that perception is correct
or not).  It may be
that we can use namespaces to segregate the definitions used by different
communities if we are developing formal ontologies.  However, I think we need to
be able to tie our definitions to concrete examples represented in publicly
accessible use cases.
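
As a rough illustration of both ideas (namespaces per community, and every
definition tied to at least one use case), here is a minimal Python sketch.
The namespace labels, the use case identifier, and the function name are
hypothetical; the definition texts are the ones quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class Definition:
    namespace: str   # hypothetical label for a community of practice
    term: str
    text: str
    # Identifiers of use cases in which the term is actually used.
    use_cases: list = field(default_factory=list)

    @property
    def qualified_name(self) -> str:
        return f"{self.namespace}:{self.term}"


def unsupported(definitions):
    """Return qualified names of definitions that cite no use case."""
    return [d.qualified_name for d in definitions if not d.use_cases]


definitions = [
    Definition(
        "realtime", "event",
        "A change in state arising from a stimulus within the system or "
        "external to the system; or because of the passage of time.",
        use_cases=["UC-EXAMPLE-1"],  # placeholder, not a real catalog entry
    ),
    Definition(
        "manufacturing", "event",
        "Intuitively, an event is an occurrence, as we usually think of it. "
        "Technically, it is a subset of the sample space.",
    ),
]

# Definitions still lacking a tie to the Use Case catalog.
missing = unsupported(definitions)
```

With the sample data above, missing holds only "manufacturing:event", so
the same term can carry different definitions in different namespaces while
still being checked against the use cases.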

Bruce B.

