Safe writing

Overview

The filesystem ensures that its structure is never in an invalid state on the disk. This includes things like the bitmap, the directory tree and file information (for example size and protection bits). Data blocks, the space which holds the actual data stored in a file, are however not kept completely valid at all times, for performance reasons.

The filesystem keeps its structure valid by never overwriting blocks directly. This means that even if a crash or power loss occurs, the old structure will still be present on the disk. When rebooting your machine, SFS will detect whether any changes were pending and will either discard them if they weren't completed yet or finish them.

However, I already mentioned that SFS doesn't do this for data blocks. This means that if a crash occurs, it is possible that some of the data you were writing to a file has been lost or has partially overwritten existing data in that file.

In the worst case this means the following. Take, for example, a file of 1000 bytes. The last action you did before the crash was to write 2000 bytes starting at position 500; in other words, the first 500 bytes are unmodified and the new file size becomes 2500 bytes.

When a crash occurs immediately after this write action, the file size will still be 1000 bytes, but the bytes from position 500 to 999 will have been overwritten with new data. The reason the file size won't have changed yet is that this change was discarded to keep the structure of the disk valid. The 500 overwritten bytes, however, were written immediately and can't be recovered.
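
To make the example more concrete, here is what such a write action might look like in a small AmigaDOS program (the file name and buffer contents are of course just placeholders):

    #include <dos/dos.h>
    #include <proto/dos.h>

    int main(void)
    {
        static UBYTE newdata[2000];           /* the 2000 new bytes           */
        BPTR fh = Open("data.file", MODE_OLDFILE);

        if (fh != 0)
        {
            Seek(fh, 500, OFFSET_BEGINNING);  /* move to position 500         */
            Write(fh, newdata, 2000);         /* the file grows to 2500 bytes */
            Close(fh);
        }

        /* If a crash happens right after the Write(), the new file size is
           discarded together with the rest of the structure changes, but
           bytes 500-999 of the old contents have already been overwritten. */
        return 0;
    }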

Internals

As was said, Smart Filesystem only ensures that its own structures are kept valid. To do this it keeps track of all changes made to these structures. If a file size needs to be updated in a specific block, then this change is added to a list of changes still to be made. This list is kept in memory until the time comes to commit the changes to disk. The same goes for all other changes made to the filesystem structure: they are all recorded and added to the list in memory.
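
To give an idea of what such a list could look like, here is a small sketch in C. These structures are purely illustrative and are not SFS's actual internals, which, as described below, keep the changes in compressed form:

    #include <exec/types.h>

    /* Illustration only -- not the real SFS structures. */

    struct Change
    {
        struct Change *next;      /* next change in the list             */
        ULONG  block;             /* block number the change applies to  */
        UWORD  offset;            /* byte offset within that block       */
        UWORD  length;            /* number of bytes that differ         */
        UBYTE *newdata;           /* the new bytes themselves            */
    };

    struct ChangeList
    {
        struct Change *first;     /* changes waiting to be committed     */
    };

    void addchange(struct ChangeList *list, struct Change *change)
    {
        /* new changes are simply prepended to the in-memory list */
        change->next = list->first;
        list->first  = change;
    }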

The caching system in SFS is smart enough to distinguish between original blocks and blocks with the latest changes applied to them. When reading new blocks from disk, SFS will also automatically apply any pending changes to these blocks before using them for internal operations.
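
Continuing the illustrative sketch from above, applying the pending changes when a block is read could look roughly like this (readfromdisk() is a made-up name standing in for the low-level device read):

    #include <exec/types.h>
    #include <string.h>

    extern void readfromdisk(ULONG block, UBYTE *buffer);   /* made-up name */

    void readblock(struct ChangeList *list, ULONG block, UBYTE *buffer)
    {
        struct Change *c;

        readfromdisk(block, buffer);

        /* patch the buffer so it contains the latest version of the block */
        for (c = list->first; c != NULL; c = c->next)
        {
            if (c->block == block)
                memcpy(buffer + c->offset, c->newdata, c->length);
        }
    }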

All changes which belong to the same operation are kept together. Creating a new empty file, for example, results in a number of small changes: a fileheader is created, the file is given a node number and the file is linked into a hash chain. Either all of these changes are added to the changes buffer, or none of them are.
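
Sketched in code, this is an all-or-nothing grouping along the following lines (all the function names here are invented for the sake of the example):

    #include <exec/types.h>

    /* Invented names, for illustration only. */
    extern void beginoperation(void);
    extern void endoperation(void);       /* move the whole group into the
                                             changes buffer                */
    extern void discardoperation(void);   /* throw the whole group away    */
    extern BOOL createfileheader(void);
    extern BOOL assignnodenumber(void);
    extern BOOL linkintohashchain(void);

    void createemptyfile(void)
    {
        beginoperation();

        if (createfileheader() && assignnodenumber() && linkintohashchain())
            endoperation();       /* all changes enter the changes buffer  */
        else
            discardoperation();   /* none of the changes are kept          */
    }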

The way the changes are stored in memory is very simple. SFS compares the original and the modified version of a block and stores the difference between them using a quick and very simple compression scheme. This keeps memory consumption low and also speeds up writing the changes to disk, since they take up far less space thanks to this simple compression technique.
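
The exact compression scheme isn't documented on this page, but the general idea of storing only the differences can be sketched like this (a naive version; the real scheme will differ in its details):

    #include <exec/types.h>

    /* Naive difference encoding: each region that differs between the
       original and the modified block is stored as <offset, length, bytes>.
       The output buffer must be large enough for the worst case (up to about
       two and a half times the blocksize with this naive scheme).  Returns
       the number of bytes used in the output buffer. */

    ULONG storedifferences(const UBYTE *original, const UBYTE *modified,
                           ULONG blocksize, UBYTE *output)
    {
        ULONG in = 0, out = 0;

        while (in < blocksize)
        {
            if (original[in] != modified[in])
            {
                ULONG start = in;

                while (in < blocksize && original[in] != modified[in])
                    in++;

                output[out++] = (UBYTE)(start >> 8);         /* offset, high */
                output[out++] = (UBYTE)(start);              /* offset, low  */
                output[out++] = (UBYTE)((in - start) >> 8);  /* length, high */
                output[out++] = (UBYTE)(in - start);         /* length, low  */

                while (start < in)
                    output[out++] = modified[start++];       /* the new data */
            }
            else
            {
                in++;
            }
        }
        return out;
    }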

When the time comes to commit the changes to disk, SFS will first look for a free area on the disk (SFS automatically ensures there is always enough free space for this). In this free area it writes the buffer of changes in its compressed form. Once this buffer has been written correctly, a special block is written to a fixed location. This block is called the Transaction Failure block.

The Transaction Failure block points to the compressed changes which were written earlier to free areas of the disk. The mere presence of this block indicates that there are pending changes in compressed form on the disk, in other words that the last set of changes hasn't been completed yet; hence its name.
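
Purely as an illustration of the idea (the real on-disk layout isn't described on this page), you could picture the Transaction Failure block as something like this:

    #include <exec/types.h>

    /* Illustration only -- not the actual on-disk layout. */
    struct TransactionFailureBlock
    {
        ULONG id;           /* identifies this as a Transaction Failure block */
        ULONG checksum;     /* checksum over this block                       */
        ULONG firstblock;   /* first block of the compressed changes          */
        ULONG blocks;       /* number of blocks the compressed changes occupy */
    };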

After writing the compressed changes and the Transaction Failure block, SFS will start to make the actual changes to its structure on the disk. It will simply overwrite existing blocks now, replacing them with their updated versions.

If this process is interrupted, then the next time SFS is started it will see the Transaction Failure block. It will load the compressed changes from the free area of the disk and continue making the changes (changes which were already made are simply made again). You could compare this to the validating process of FFS, but you'll hardly notice it, since it usually takes only a fraction of a second to complete.
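
The check at startup therefore roughly amounts to the following (again with invented names; the real code obviously does a lot more):

    #include <exec/types.h>

    /* Invented names, for illustration only. */
    extern BOOL transactionfailureblockpresent(void);
    extern void loadcompressedchanges(void);
    extern void applychangestodisk(void);
    extern void removetransactionfailureblock(void);

    void checkvolumeonstartup(void)
    {
        if (transactionfailureblockpresent())
        {
            /* The last set of changes was written to a free area but not yet
               fully applied: load it and apply it.  Changes which were already
               applied before the interruption are simply applied again. */
            loadcompressedchanges();
            applychangestodisk();
            removetransactionfailureblock();
        }
        /* otherwise the structure on the disk is already valid as it is */
    }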

If the process was interrupted before the Transaction Failure block was written, then no changes will have been made yet and SFS will simply use the old structure (this in effect discards the last changes made to the disk).

If, however, everything went smoothly and the system didn't crash during this procedure, the Transaction Failure block is removed again, which indicates that the disk is in a valid state. The whole process of updating the disk in this way usually takes less than a second.

Assumptions

Smart Filesystem makes a few assumptions in order to guarantee that this system of keeping your disk valid at all times works:

  • Writing a single block is atomic. This means that the block is either physically written to disk completely, or not at all. Checksums are used here for extra safety in case this operation turns out not to be atomic (I haven't been able to confirm or deny this yet for hard drives -- such information seems to be hard to find).
  • Write caching is disabled -- this means that everything written to disk (particularly the changes buffer) is indeed physically written immediately, before any other blocks are written. There is a very delicate order in which things need to be written to guarantee that this works; see below.
  • Device drivers which have internal buffers must respect the CMD_UPDATE command which flushes the internal buffers to disk immediately. SFS will use CMD_UPDATE before and after any critical operations.

Order in which things must be written:

  1. Writing all changes to empty areas on the disk.
  2. Writing the Transaction Failure block which indicates there is a valid but unfinished set of compressed changes on the disk. This block points to the blocks written in step 1.
  3. Applying the real modifications to the disk, replacing any blocks which need to be modified.
  4. Removing the Transaction Failure block.

Smart Filesystem assumes that all blocks written in each of the steps above are physically written before the blocks of any of the following steps are physically written to disk. Between the steps SFS will call CMD_UPDATE to flush any buffers the device driver might be using (trackdisk.device, for example, uses such buffers).
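
Sketched in code, the commit sequence with its CMD_UPDATE calls looks roughly like this. CMD_UPDATE and DoIO() are the standard exec device interface; the four write functions are invented names standing in for the real SFS internals:

    #include <exec/types.h>
    #include <exec/io.h>
    #include <proto/exec.h>

    /* Invented names standing in for the real SFS internals. */
    extern void writecompressedchanges(void);
    extern void writetransactionfailureblock(void);
    extern void applychangestodisk(void);
    extern void removetransactionfailureblock(void);

    /* Ask the device driver to flush its internal buffers to disk.  "ioreq"
       is an already opened and initialised IOStdReq for the device the
       volume lives on. */
    void flushdevicebuffers(struct IOStdReq *ioreq)
    {
        ioreq->io_Command = CMD_UPDATE;
        DoIO((struct IORequest *)ioreq);
    }

    void committodisk(struct IOStdReq *ioreq)
    {
        writecompressedchanges();          /* step 1: changes -> free area  */
        flushdevicebuffers(ioreq);

        writetransactionfailureblock();    /* step 2: mark changes pending  */
        flushdevicebuffers(ioreq);

        applychangestodisk();              /* step 3: overwrite real blocks */
        flushdevicebuffers(ioreq);

        removetransactionfailureblock();   /* step 4: disk is valid again   */
        flushdevicebuffers(ioreq);
    }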

Final words

This system is quite safe, but there is still a slight possibility that things go wrong if any of the assumptions Smart Filesystem makes isn't met. Backing up your important data therefore remains important, no matter how safe the filesystem. Even if the chance of failure due to a crash or power loss has in theory been reduced to a fraction of a percent, there is still the possibility of fatal bugs in the filesystem or bad sectors on your disk.

