Reality Isn't the Intersection of Linux and Windows

Posted by Daniel Lyons on October 26, 2008

I often find myself looking at a programming language and wondering, “will this language be the one I use to implement my operating system which will change the universe?” Now, we both know I’m probably never going to do that, but I think it’s a useful thought experiment anyway.

While looking at this language I get to the part about the filesystem API and I see it has some silly limitations. Like it won’t open directories for reading or writing. Or it only permits opening files for reading or writing, but not appending. This kind of thing.

So right away I have a problem. I haven’t ever bootstrapped my own OS from scratch, but I’m imagining all the code I’m going to have to remove to make this language work in that context. Even a language as stripped down as Io seems to wind up with this cruft inside it, cruft that comes from looking at the world as the intersection of Linux and Windows. Well, they both have files and directories.

In truth, the filesystem seems to have been a major source of innovation and mutual incompatibility for years. Consider some of the strange things in the API. Take modes: on Windows, coming from DOS apparently, you can open a file in “binary” mode or “text” mode. Let’s not forget those ever-so-happy drive letters either. Unix has hard links and soft links; the former means you have several names that all point at the same inode, the latter means you have a file that contains nothing but the path to another file. HFS+ (the Mac one) has symlinks plus “aliases,” which aren’t recognized by the Unix part of the OS, as well as the famous “forks” of a file, data or resource, depending on what information you want.
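To make the hard link versus soft link distinction concrete, here is a minimal sketch in Python. It assumes a Unix-like system whose temporary directory supports both kinds of link; the function name `link_demo` and the file names are made up for illustration.

```python
import os
import tempfile


def link_demo():
    """Create one file, a hard link to it, and a symlink to it, then delete the original."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")

        with open(original, "w") as f:
            f.write("hello")

        os.link(original, hard)     # second name for the same inode
        os.symlink(original, soft)  # tiny file holding the path to the original

        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        link_count = os.stat(original).st_nlink   # names pointing at the inode
        target_is_path = os.readlink(soft) == original

        # Removing the original name leaves the data reachable via the hard link,
        # while the symlink now points at a path that no longer exists.
        os.remove(original)
        with open(hard) as f:
            hard_contents = f.read()
        soft_dangling = not os.path.exists(soft)  # exists() follows the symlink

        return same_inode, link_count, target_is_path, hard_contents, soft_dangling
```

The hard link survives the removal of the original name because both names were equal citizens all along; the symlink was only ever a path, and a path can go stale.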

Take a look at Common Lisp’s filesystem API and you’ll see it has support for host information and version information as well. The Lisp machine didn’t have any trouble addressing files on other computers; now we at least try to make this happen through some kind of VFS layer. Versioned filesystems are almost completely unheard of today unless you’re running VMS or Veritas.

And then you have Reiser4, which wants to let you access the internal structure of the file via the filesystem. I think that sounds cool, but I doubt Reiser4 is going to become mainstream anytime soon.

Then you have the metadata filesystems, ACLs, Unix-style permissions and more. Some of these things are mutually exclusive, and supporting all of them would be incredibly difficult and full of not particularly cross-platform code. R6RS makes the valid and often missed point that filesystems often let you make filenames that aren’t valid in any encoding, so it isn’t particularly wise to depend on UTF-8 strings for filesystem access. This is just the sort of iceberg that hides under the surface and sinks expensive ships whose captains think the filesystem is just another little piece of North Atlantic ice in your run-of-the-mill operating system.
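The filename encoding problem is easy to see in a small sketch. This is Python rather than Scheme, and the byte string `raw` is an invented example of a name a Unix filesystem would happily accept even though it is not valid UTF-8. The `surrogateescape` error handler is one way to carry such bytes through a string type without losing them; it is the handler Python's own `os.fsdecode` and `os.fsencode` use on POSIX systems.

```python
# A made-up filename: 0xFF and 0xFE can never appear in well-formed UTF-8.
raw = b"report-\xff\xfe.txt"


def strict_decode_fails(name: bytes) -> bool:
    """Return True if the name cannot be decoded as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def round_trip(name: bytes) -> bytes:
    """Decode to str and back, smuggling undecodable bytes through as lone surrogates."""
    text = name.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")
```

An API that insists on strict UTF-8 strings simply cannot name this file, while one that treats names as bytes (or escapes them losslessly, as above) can still open, rename, or delete it.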

All the same it really makes me wonder. If you look at RDBMSes, you see the need for this out-of-band data for selecting the database and authenticating. The relational algebra and the relational calculus really have nothing to say about authentication or table spaces. Filesystems don’t even have the benefit of a nice theory underneath them to be compared to.

It would be easy to look at R6RS or any other language’s specification and get discouraged about what OS you could write and host on it. Actually, it would probably be much easier with any other language than Scheme. These standard practices are probably stifling creativity, if there’s any there to be stifled anyway. Don’t let your language, or anything else, dictate terms.