While developing DOSContainer, I restarted several times because I kept underestimating the complexity of the task. The current attempt starts at the very beginning and works up from there. For the PC era, the very beginning is the IBM 5150 with two 160KB 5.25" floppy drives. IBM had sourced an operating system from Microsoft that it would sell as PC-DOS to go with the system. How that deal came to be is subject matter for another article. Right now, let’s dive into what’s needed for IBM PC-DOS 1.00 to accept an empty floppy for use.
Floppy specifications
Floppy disks, for those too young to have seen them in the wild, are a means of removable storage. The IBM PC came with two drives that took single-sided 5.25" floppies with a capacity of 160KB each. That’s where I’m starting with DOSContainer so that’s what I’m dissecting in the most detail I possibly can. As it turns out, there’s quite a lot of detail even before you write anything useful to the disk.
A floppy on the IBM PC is divided into sides, tracks and sectors. Because its drives were single-sided, the original PC could use only one side of the disk. That side is divided into 40 concentric rings, the so-called tracks, and each track is divided into 8 sectors. That’s 40 × 8 = 320 raw sectors, each of which holds 512 bytes of information, adding up to a grand total of 163,840 bytes or 160KB of raw capacity.
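The arithmetic is easy to check in a few lines of Python. This is a minimal sketch: the constant and function names are my own for illustration and do not come from PC-DOS or DOSContainer. It treats the disk as a flat, zero-based sequence of sectors, the same numbering that places the boot sector at sector 0.

```python
# Geometry of the single-sided 160KB floppy described above.
SIDES = 1
TRACKS_PER_SIDE = 40
SECTORS_PER_TRACK = 8
BYTES_PER_SECTOR = 512

TOTAL_SECTORS = SIDES * TRACKS_PER_SIDE * SECTORS_PER_TRACK
TOTAL_BYTES = TOTAL_SECTORS * BYTES_PER_SECTOR


def sector_offset(sector: int) -> int:
    """Byte offset of a zero-based sector number inside a raw disk image."""
    if not 0 <= sector < TOTAL_SECTORS:
        raise ValueError(f"sector {sector} is outside 0..{TOTAL_SECTORS - 1}")
    return sector * BYTES_PER_SECTOR


if __name__ == "__main__":
    print(TOTAL_SECTORS)        # 320
    print(TOTAL_BYTES)          # 163840
    print(TOTAL_BYTES // 1024)  # 160
```

With a raw image file of exactly 163,840 bytes, `sector_offset(n)` gives the position to seek to for sector `n`.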
The file system
In order to make sense of a floppy’s contents, files need to be written to it in a structured manner. IBM PC-DOS 1.00 has a utility to prepare a disk for use named FORMAT. The idea is that you have your system disk in drive A, put a new disk in drive B and then type FORMAT B: in order to prepare the disk for use. The action of “formatting” a disk destroys any and all data that may have been present on the disk, so be careful what you wish for. Once done, the floppy will contain the bare-bones structures needed for the rest of PC-DOS to store and retrieve files on it. The most important structural elements are: the boot sector, two copies of the file allocation table, the root directory and the data region of the disk.
The boot sector
The boot sector is the very first sector on the floppy, located at sector 0. On the original PC this sector is always identical. In later operating systems the contents of the boot sector are adjusted to suit the physical characteristics of the disk, but not so on PC-DOS 1.00. It only knows about the 160KB variant and nothing else. While you could replace the floppy drives in a PC with more versatile ones, you would also need a newer version of the operating system to handle them. Since only one type of floppy exists in the PC-DOS 1.00 universe, the boot sector gets written verbatim from a part of the FORMAT utility onto sector zero of any floppy it formats. You may be surprised to learn that this sector contains executable code even on floppies that contain nothing operating-system related at all.
Another interesting little factoid about the early IBM boot sector is that the author of the executable code in it actually signed his work. When you look at the bytes of an IBM PC-DOS 1.00 boot sector, you’ll find the name of Robert (Bob) O’Rear, Microsoft’s seventh employee.
File Allocation Tables
Sectors 1 and 2 each contain a copy of the File Allocation Table (FAT). The purpose of the File Allocation Table is to keep track of which “cluster” is allocated to which file’s data in the data region of the disk. On an original IBM PC floppy a “cluster” maps onto exactly one sector of data. The original idea behind having two copies of the FAT was to have a backup in case the primary table got damaged. This didn’t exactly pan out in practice, because a disk recovery utility cannot conclusively determine which of the two FATs is the damaged one. Regardless, keeping two copies of the FAT is a practice that has stayed with the file system to this very day, even though recovery utilities generally don’t do much with the second copy.
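The recovery problem is easy to see in code. The sketch below, using hypothetical helper names of my own, pulls both FAT copies out of a raw 160KB image (sectors 1 and 2, as described above) and compares them. A mismatch tells you that something is wrong, but nothing in the data itself says which copy to trust.

```python
BYTES_PER_SECTOR = 512
FAT1_SECTOR = 1  # first copy of the FAT
FAT2_SECTOR = 2  # second copy of the FAT


def read_sector(image: bytes, sector: int) -> bytes:
    """Return the 512 bytes of a zero-based sector from a raw image."""
    start = sector * BYTES_PER_SECTOR
    return image[start:start + BYTES_PER_SECTOR]


def fats_agree(image: bytes) -> bool:
    """True when both FAT copies are byte-for-byte identical."""
    return read_sector(image, FAT1_SECTOR) == read_sector(image, FAT2_SECTOR)


def first_fat_difference(image: bytes):
    """Offset of the first differing byte between the copies, or None."""
    fat1 = read_sector(image, FAT1_SECTOR)
    fat2 = read_sector(image, FAT2_SECTOR)
    for offset, (a, b) in enumerate(zip(fat1, fat2)):
        if a != b:
            return offset
    return None
```

Note that `first_fat_difference` can only report where the copies diverge. Deciding which side is correct would need outside knowledge, such as cross-checking against the directory entries.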
Root directory
Sectors 3, 4, 5 and 6 are occupied by the root directory. This is a sequential table of contents for the floppy. Each entry in the root directory occupies 32 bytes, so the four 512-byte sectors of the root directory add up to 64 entries. That’s also where the ability for IBM PC-DOS 1.00 to store files ends: there is no MKDIR command in PC-DOS 1.00 yet, so you can’t create directories/folders. After storing 64 files, the disk reports as full because the root directory is out of entries, even though the data region may still have sectors available.
An empty root directory in later implementations of FAT file systems can simply be bytes set to a value of zero. In the case of IBM we’re dealing with a corporate behemoth that had very specific ideas on how computers were supposed to work. They invented considerable chunks of the field so that should come as no surprise. In order for a floppy to be usable on an IBM PC, the root directory gets filled with empty placeholder entries that start with a byte of 0xE5, which is the marker for a file or subdirectory that is deleted. Semantically it implies that something existed in this place before, but not anymore. The rest of the 32 bytes for each entry are filled with 0xF6, which is the standard filler byte that IBM used on its bigger systems to indicate “empty space” on storage systems at the time.
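To make the placeholder pattern concrete, here is a minimal Python sketch that builds the bytes of an empty root directory as described above. The names are my own and purely illustrative. The number of entries is derived from the four directory sectors and the 32-byte entry size rather than hard-coded.

```python
BYTES_PER_SECTOR = 512
ROOT_DIR_SECTORS = 4    # sectors 3 through 6
ENTRY_SIZE = 32         # bytes per directory entry
DELETED_MARKER = 0xE5   # first byte of every placeholder entry
FILLER = 0xF6           # remaining 31 bytes of each entry

# How many 32-byte entries fit into the four directory sectors.
ENTRY_COUNT = ROOT_DIR_SECTORS * BYTES_PER_SECTOR // ENTRY_SIZE


def empty_directory_entry() -> bytes:
    """One placeholder entry: 0xE5 followed by 31 filler bytes."""
    return bytes([DELETED_MARKER]) + bytes([FILLER]) * (ENTRY_SIZE - 1)


def empty_root_directory() -> bytes:
    """All placeholder entries back to back, ready to write at sector 3."""
    return empty_directory_entry() * ENTRY_COUNT


if __name__ == "__main__":
    root = empty_root_directory()
    print(len(root))              # total size in bytes
    print(root[:ENTRY_SIZE].hex())
```

Writing the result at the byte offset of sector 3 in a raw image fills sectors 3 through 6 exactly.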
The data region
Right after the root directory you’ll find the data region. These are empty sectors that will contain the actual contents of the files you write to the floppy. Contrary to later FAT implementations, IBM also wipes all of this with the same 0xF6 filler bytes, even though that is not technically required. Setting the cluster value to zero in the FAT should be enough to mark the space in the data region as available, and changing the first character of the root directory entry to 0xE5 marks it as effectively “deleted”. In later FAT implementations this would allow tools to effectively “undelete” files or even “unformat” whole disks. Not so on the early IBM PC: once you format a disk, it’s all wiped into oblivion. While this is a clean approach and prevents confidential data from lingering on a disk, it does put an extra responsibility on the operator of the machine: a format is final.
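Putting the pieces together, the whole disk can be described as a handful of sector ranges. The sketch below is an illustration under the layout given in this article, with names I made up for the purpose. It slices a raw 160KB image into its regions and checks whether the data region consists of nothing but 0xF6 filler, which is what a freshly formatted disk with no files on it would be expected to show.

```python
BYTES_PER_SECTOR = 512
TOTAL_SECTORS = 320
FILLER = 0xF6

# (name, first sector, sector count), following the layout described above.
REGIONS = [
    ("boot", 0, 1),
    ("fat1", 1, 1),
    ("fat2", 2, 1),
    ("root", 3, 4),
    ("data", 7, TOTAL_SECTORS - 7),
]


def split_regions(image: bytes) -> dict:
    """Slice a raw 160KB image into its named regions."""
    if len(image) != TOTAL_SECTORS * BYTES_PER_SECTOR:
        raise ValueError("not a 160KB image")
    regions = {}
    for name, first, count in REGIONS:
        start = first * BYTES_PER_SECTOR
        regions[name] = image[start:start + count * BYTES_PER_SECTOR]
    return regions


def data_region_is_wiped(image: bytes) -> bool:
    """True when every byte of the data region is the 0xF6 filler."""
    data = split_regions(image)["data"]
    return data == bytes([FILLER]) * len(data)
```

The regions are listed in on-disk order and cover all 320 sectors, which leaves 313 sectors for file data.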