[Esip-preserve] Identifiers

Bruce Barkstrom brbarkstrom at gmail.com
Tue Feb 21 09:45:27 EST 2012


On the question of the distinction between a granule and a file,
it depends on what response a request for a granule produces.
In my mental model of these items, I distinguish between a data file,
which contains only data, metadata-containing files, and documentation.

As an analogy, I have a table saw in my home shop.  That saw came
with a user manual and a parts list that could be used to order
replacement parts if something broke.  From my standpoint, I
might want to order just the table saw without the parts list or
the manual, since I had copies of those already.  That would be
equivalent to getting just a data file.  If I bought the table saw
in a retail store, it would usually come with the parts list and the
user manual already in the box.  That would be equivalent to obtaining
a granule.

The difference between ordering a data file and ordering a granule
that might include metadata and documentation as well as data
may not be important for orders of single items.  It becomes much
more important if the data required comes as many files.  As long
as the data files are similar enough to each other that a user can
set up production once and be reasonably certain of processing
each file successfully, then there's no need to have more than one
or two copies of the user's manual.
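
To make the bulk-order case concrete, here is a minimal sketch in Python.
It is only an illustration of the distinction above, not a description of
any archive's actual ordering system: the Granule class, the files_to_ship
function, and all file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Granule:
    """Hypothetical granule: a data file plus optional companion files."""
    data_file: str
    metadata_file: Optional[str] = None
    documentation: Optional[str] = None


def files_to_ship(granules: List[Granule], include_docs: bool = True) -> List[str]:
    """List the files for an order, shipping each document only once."""
    shipped: List[str] = []
    docs_seen: Set[str] = set()
    for g in granules:
        # Every granule contributes its own data file.
        shipped.append(g.data_file)
        # Metadata is per-file here, so it travels with each data file.
        if g.metadata_file is not None:
            shipped.append(g.metadata_file)
        # Shared documentation (the "user's manual") is shipped at most once.
        if include_docs and g.documentation is not None:
            if g.documentation not in docs_seen:
                docs_seen.add(g.documentation)
                shipped.append(g.documentation)
    return shipped


# Three similar granules that all point at the same (hypothetical) manual.
order = [
    Granule(f"data_{i:03d}.dat", f"data_{i:03d}.meta", "users_manual.txt")
    for i in range(3)
]
```

Calling files_to_ship(order) lists the manual once for the whole order,
while files_to_ship(order, include_docs=False) corresponds to a user who
already has the manual and wants the data and metadata files alone.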

Hope this clarifies the issue.

Bruce B.

On Mon, Feb 20, 2012 at 1:52 PM, Bruce Barkstrom <brbarkstrom at gmail.com> wrote:
> The attached file contains thoughts on two communities to
> whom the differences we've been discussing are likely to
> matter: archive managers and data producers.  As a data
> producer, it has mattered a great deal whether the discussion
> is about a relational database (as an object that might
> contain a dataset), a file (as another object that might contain
> one), or a collection of files with a very long history (say 20
> years) of structure (and perhaps yet another kind of data set).
>  Somewhat similar concerns arise for
> data managers, which I was for five years.
>
> Bruce B.
>
> On Sat, Feb 18, 2012 at 3:10 AM, Greg Janée <gjanee at eri.ucsb.edu> wrote:
>> Mark A. Parsons wrote:
>>> I don't think there is a falsifiable definition of data set. Or rather all definitions are false. It's very situational.
>>
>> Agreed.  To put it another way, I think this attempt to define "dataset" is doomed because a dataset is a cognitive construct, and cognitive constructs do not have exact definitions and hard boundaries, but look more like overlapping categories that are characterized by exemplars and degrees of membership.
>>
>> Is there a *functional* reason why we need to define terms like "dataset" and "granule"?  I guess a necessary (but not sufficient) condition for me to be convinced by any definitions for "dataset" and "granule" is that there is some kind of functional difference between them; some different functional affordances.
>>
>> From the old Alexandria days I recall a passionate debate over what constituted a "title".  (That may sound quaint now, but I assure you, a librarian armed with an AACR2 reference is a formidable adversary.)  What cut through that particular Gordian knot was looking at the question purely functionally: we only care about titles to the extent that we do something with them.  And the answer at that time was, all we do with titles is display them in search result lists.  Ergo, a "title" is that which you want to see displayed as a search result, no more, no less.  Corollary: a title should be about one line wide when displayed in a typical font size.
>>
>> Regarding data and citation, from a functional perspective I would say that if a particular entity has an identifier, and can be independently referenced (or is independently actionable), and if the entity's provider is committed to maintaining that entity and its identifier and its independent referencability, then the entity is "citable".  Notice that this definition is independent of both the size of the entity and the terminology the provider uses in referring to it.
>>
>> -Greg
>>
