How to Fragment Your File System

Here is a tiny little python script to generate file system fragmentation.

“But Tiiiiiiim! Tools are supposed to defragment your filesystem! Why would you ever want a script to fragment one?”

In one of the gadgets I’m working on, I had a need to evaluate disk (well, memory card) performance in real-world and worst-case scenarios. If you are sampling high-speed data with a puny microcontroller, you cannot afford your disk going into lalaland while your puny buffer RAM runneth over. While file fragmentation is – in theory – not a big deal for Flash media as it is for spinning rust drives (no mechanical heads to reposition), your filesystem driver still needs to grovel through the filesystem to find the next free block to write. In a typical implementation, writing to a FAT filesystem with a giant file in the middle of it could incur a significant write hiccup as the FS driver encounters the file and has to seek through its entire FAT chain, potentially fetching and parsing numerous sectors of allocation data before finally finding a free cluster for the next write. This script allows testing of such scenarios.

What it does:

Give it the name of a disk to fragment, and it will begin creating files full of junk data on the disk until it receives a write error (normally indicating the disk is full). It will then delete a random subset of these files, leaving free-space holes scattered throughout the disk. For most filesystems and OSes, the freespace will not be automatically consolidated, and will remain fragmented until the remaining files are deleted (or the disk formatted, etc.) or a defragmentation utility is run. You can then evaluate the performance of your (software, device, etc.) on this disk on a realistic simulation of a well-used drive.

Configuration options:

path – set this to the directory to generate the fragments in. On FAT filesystems, it is necessary to use a subdirectory and not the root directory, due to a limitation on the number of files that can be stored in the root directory.

filldensity – this value, ranging from 0.0 to 1.0, sets the percentage of junk files to remain at the end of operation. A higher value means more files left behind, i.e. less freespace gaps.

minfragsize, maxfragsize – this sets the size range of junk files to create. The size of each file created will be selected at random from this range.

CAVEATS:

I only tested this on a Windows PC, for reasonable file sizes (MB, not GB) and card sizes (a few GB). If your device’s size is measured in rooms, gigaquads or Libraries of Congress, it may not work, or your device may be obsolete by the time it finishes. The “junk” to make the junk files is stored in RAM out of laziness; you probably want to fix this if making multi-GB junkfiles.

This script was written to test a FAT-based device. Not all filesystems respond the same way to fragmentation, so YMMV on other filesystem types.

This emulates fragmentation only. Many other factors could affect your embedded Flash media performance, such as Flash cell wear (aka hot count, or total number of write/erase cycles performed), write amplification, operating temperature and/or voltage (depending on the memory technology and controller), phase of the moon, etc. This script does not emulate any of these other factors. On the bright side, it should be a more faithful test for other memory technologies, e.g. FRAM/MRAM, that are fast and relatively insensitive to cell wear, and will better reflect software delays due to filesystem parsing.

Leave a Reply