mapping files to objects

Mike McDonald mikemac@titian.engr.sgi.com
Tue, 06 May 1997 12:36:11 -0700


>To: lispos@math.gatech.edu
>Subject: mapping files to objects
>From: cwg@deepeddy.com
>Date: Tue, 06 May 1997 11:40:11 -0500
>
>In the interest of getting something that is (a) implementable on top of the 
>Unix file system and (b) has semantics that can be expanded to The Right 
>Thing (tm) later, I'd like to make a modest proposal (no, we aren't going to 
>eat our young).
>
>1) Let's take the CL path objects and add the notion of a file type to the 
>object.  This file type would *not* appear in the printed representation of 
>the path, but would instead be set and read by accessor functions.  This type 
>could then be mapped into a MIME type or a MacOS style owner/type pair.  
>Initially, I thought it should be a MIME type, but then my stupidity passed 
>and I realized that it should be a CLOS class.  *duh*.  This class would have 
>a method for determining the MIME type or any other foreign typing system 
>necessary.

  If pathname were a CLOS class, then one could subclass it instead of
having fixed slots. For instance, the URL-pathname class could add the
other 4 (I lost count) fields that it needs to keep track of all of
the parts of a URL. The 6 normal components would still contain their
normal contents, so any program that just deals with pathnames would
have a chance of working.
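
  As a rough illustration of the subclassing idea, here is a minimal
sketch in Python rather than CLOS. The class names, the four extra URL
fields, and the helper file_namestring are all assumptions made up for
this example; the message does not say which fields a URL-pathname
would actually need.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Pathname:
    # The six standard Common Lisp pathname components.
    host: Optional[str] = None
    device: Optional[str] = None
    directory: Tuple[str, ...] = ()
    name: Optional[str] = None
    type: Optional[str] = None
    version: Optional[str] = None


@dataclass
class URLPathname(Pathname):
    # Hypothetical extra fields; the real set is an open question.
    scheme: Optional[str] = None
    port: Optional[int] = None
    query: Optional[str] = None
    fragment: Optional[str] = None


def file_namestring(p: Pathname) -> str:
    """Written against the base class only; ignores any subclass fields."""
    if p.name is None:
        return ""
    return p.name if p.type is None else f"{p.name}.{p.type}"


url = URLPathname(host="www.example.com", directory=("docs",),
                  name="index", type="html", scheme="http", port=80)
print(file_namestring(url))
```

  Because the six base components keep their usual meaning, code such as
file_namestring that knows nothing about URLs still gets a sensible
answer from the subclass instance.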

>2) Since a Unix file system doesn't have these semantics in it, let's define a 
>facility which allows one to define mappings between Unix paths and the above 
>defined classes.  A similar facility should be defined between MIME types and 
>classes.

  Embedding or inferring type info from the name is a really, really
bad idea. It's so error-prone as to be useless. If you want to know
the type of data in a file, look in the file!


>3) It now becomes nearly trivial to write the function that I previously 
>proposed to map from a pathname to an object representing that file.

  Not even close. What's a .tgz file? Is it a compressed file or a tar
file or a compressed tar file? Kind of depends on what you want to do
with it, doesn't it? What's a .l file? (It has multiple meanings on
most Unix systems, partly due to us Lispers.)

  A much better approach is the way the 'file' command does it. The
Irix version, for instance, is table-driven and does some pattern
matching. Typically, it looks at the first N characters to see whether
they match a particular string; a leading "%!", for example, means
it's a PostScript file.
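
  The table-driven check can be sketched roughly as follows. This is
Python, not the Irix implementation, and the table format and the
names MAGIC_TABLE, sniff, and sniff_file are assumptions for this
example. Only the "%!" entry comes from the text above; the gzip and
tar signatures are added for illustration.

```python
# Each entry: (byte offset, expected bytes, description).
MAGIC_TABLE = [
    (0, b"%!", "PostScript"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "tar archive"),  # POSIX ustar header field
]


def sniff(data: bytes) -> str:
    """Return the description of the first table entry that matches."""
    for offset, magic, description in MAGIC_TABLE:
        if data[offset:offset + len(magic)] == magic:
            return description
    return "unknown"


def sniff_file(path: str, n: int = 512) -> str:
    """Classify a file by its first n bytes, ignoring its name."""
    with open(path, "rb") as f:
        return sniff(f.read(n))
```

  Under this scheme a .tgz file is reported as gzip data no matter what
it is called; whether it also holds a tar archive can only be learned
by decompressing it and sniffing the result.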

  Mike McDonald
  mikemac@engr.sgi.com