mapping files to objects
Tue, 06 May 1997 15:22:48 -0500
> If pathname were a CLOS class, then one could subclass it instead of
> having fixed slots. For instance, the URL-pathname class could add the
> other 4 (I lost count) fields that it needs to keep track of all of
> the parts of a URL. The 6 normal components would still contain their
> normal contents, so any program that just deals with pathnames would
> have a chance of working.
> Embedding or inferring type info from the name is a really, really
> bad idea. It's so error prone as to be useless. If you want to know
> the type of data in a file, look in the file!
> >3) It now becomes nearly trivial to write the function that I previously
> >proposed to map from a pathname to an object representing that file.
> Not even close. What's a .tgz file? Is it a compressed file or a tar
> file or a compressed tar file? Kind of depends on what you want to do
> with it, doesn't it? What's a .l file? (It has multiple meanings on
> most Unix systems, partly due to us Lispers.)
In the .tgz case, it is kinda all three. Fortunately with multiple
inheritance, it shouldn't be hard to allow it to behave appropriately in all
contexts. In the .l case, you've got a point. Since I use Perl much more
than I use Prolog, I've had to reconfigure which Emacs mode is chosen when it
sees a .pl file. That's exactly the same situation.
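
To make the .tgz point concrete, here is a minimal sketch of the multiple
inheritance idea. It is written in Python purely for illustration (CLOS would
express the same thing with defclass), and every class and method name in it
is hypothetical, not part of any proposal.

```python
class FileObject:
    """Hypothetical base class for an object that represents a file."""

    def __init__(self, path):
        self.path = path


class CompressedFile(FileObject):
    """Something that has to be uncompressed before use."""

    def is_compressed(self):
        return True


class TarArchive(FileObject):
    """Something that contains other files."""

    def is_archive(self):
        return True


class CompressedTarArchive(CompressedFile, TarArchive):
    """A .tgz file: acceptable wherever either parent class is expected."""


tgz = CompressedTarArchive("backup.tgz")
```

Code that only cares about compression can test for CompressedFile, and code
that only cares about archives can test for TarArchive; the same object
satisfies both.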
> A much better approach is the way the 'file' command does it. The
> Irix version, for instance is table driven to do some pattern
> matching. Typically, it looks at the first N characters to see if it
> is a particular string, like "%!" means it's a PostScript file.
I suppose I'd sound like I've been spending too much time on Unix if I were to
say I didn't want to look in the filesystem because of performance issues, and
if I were a good Lisper, I'd be thinking about semantics that guarantee that
things work rather than run fast, right?
I merely didn't want to either (a) read a 10 MB mail file just to determine
that it is a mail file or (b) open/read/close the first block of a mail
file twice. If whoever writes this hunk of code can either keep those cases
from happening or show that they're not important, that's fine with me.
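
One way to avoid both cases might look like the following Python sketch: read
only the first block, and keep it so that later checks reuse it instead of
reopening the file. The class name, the `reads` counter, and the 512-byte
block size are all assumptions made for the example.

```python
BLOCK_SIZE = 512  # assumed size of the "first block"; tune as needed


class SniffedFile:
    """Hypothetical wrapper that reads the head of a file at most once."""

    def __init__(self, path):
        self.path = path
        self._head = None
        self.reads = 0  # number of times the file was actually opened

    def head(self):
        # Open and read only on first use; later calls reuse the cached bytes.
        if self._head is None:
            with open(self.path, "rb") as f:
                self._head = f.read(BLOCK_SIZE)
            self.reads += 1
        return self._head

    def starts_with(self, magic):
        # Table-driven recognisers can all share the same cached block.
        return self.head().startswith(magic)
```

Any number of pattern checks against the same SniffedFile then cost one
open/read/close, no matter how large the file is.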
Okay, let's have it your way. The particular situation that I'm interested in
for a mail program is to be able to recognise various mailfile formats.
An mbox file starts with "From " and has more of them later down in the file.
Much as we hate that, that's what it does and I can recognise it.
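
A minimal Python sketch of that check, assuming it is enough to look at the
first five bytes; it does not verify that more "From " lines follow, and it
would reject an empty mailbox. The function name is made up for the example.

```python
def looks_like_mbox(path):
    """Guess whether path is an mbox file by checking its first five bytes."""
    with open(path, "rb") as f:
        return f.read(5) == b"From "
```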
A qmail Maildir contains subdirectories called "tmp", "cur", "new". It may
have hidden files as well. I think I can recognise that. It might be named
"Maildir" in a user's home directory.
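
As a rough Python sketch, the Maildir test could be nothing more than
checking for the three subdirectories. The function name is hypothetical, and
the sketch ignores the directory's own name and any hidden files.

```python
import os


def looks_like_maildir(path):
    """Guess whether path is a Maildir: it must contain tmp, cur and new."""
    return all(
        os.path.isdir(os.path.join(path, sub)) for sub in ("tmp", "cur", "new")
    )
```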
An MH mail directory may have subfolders that are just directories; it may
have files with names that are all digits (maybe preceded by a comma if
deleted); it may have configuration files with names like forwcomps,
replcomps, components, etc; it may have hidden files with names like
.mh_sequences or .xmhcache; it's probably inside another mail directory;
it might be named "Mail" in a user's home directory; it may be completely
empty. I don't know how to recognise that.
A Usenet newsgroup fits most of the same criteria as an MH mail directory,
except that I probably don't have write permission to the directory. If I can
recognise it, I should be able to treat it mostly the same way. It's known by
being rooted somewhere that's defined in a configuration file in the news
software. If I implement it, I'd like to handle it with a different but
related class from the MH directory. If I don't implement it, I'd like it
*not* to be confused with the similar-looking MH directories, but it's not
clear how to tell them apart.
Maybe I'm just overly fixated on a few screw cases that happen to be some of
the first cases that I'll want to deal with. These are the cases that lead me
to want to find ways to declare the type of a particular file/directory.
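
What I mean by declaring a type could be as small as the following Python
sketch: an explicit table that is consulted first, with guessing from
contents only as a fallback. The registry, both function names, and the type
names used with them are hypothetical.

```python
import os

declared_types = {}  # hypothetical registry: absolute path -> type name


def declare_type(path, type_name):
    """Record an explicit type for a file or directory."""
    declared_types[os.path.abspath(path)] = type_name


def type_of(path, guess=None):
    """Return the declared type of path, else guess(path), else None."""
    key = os.path.abspath(path)
    if key in declared_types:
        return declared_types[key]
    return guess(path) if guess is not None else None
```

A news spool declared as "newsgroup" would then never be mistaken for an MH
folder, whatever a content-based guesser says about it.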
All I really want is a better interface to Unix file systems so I can start
implementing routines that read such files. If we can agree on an interface,
I'll probably throw together something minimal for my own use making lots of
assumptions and get started. I can then replace my minimal stub with a real
implementation when there is one. What I don't want is to just write code
like I would do in C (or did do in Perl) that only sees files as streams of
bytes.
Chris Garrigues O- cwg@DeepEddy.Com
Deep Eddy Internet Consulting +1 512 432 4046
609 Deep Eddy Avenue
Austin, TX 78703-4513 http://www.DeepEddy.Com/~cwg/