Fragmentation hurts computer performance and makes your files less recoverable; it can become your worst enemy when it comes to recovering lost data. Why does fragmentation happen, what can be done to recover fragmented files, and how can you avoid fragmentation in the first place? Read on to find out.
Let’s say you are saving a file on a hard drive. If your hard drive is empty, or nearly empty, the operating system will allocate a contiguous block of sectors on the disk and store the file in that block… well, contiguously. But what happens if you’ve written to that hard drive before, adding and deleting data? Sooner or later (depending on the file system used, the size of the hard drive and what kind of operating system you are running), your hard drive will no longer have chunks of free disk space long enough to fit large files. This situation is extremely common, and pretty much every operating system (including those built into portable devices) can deal with it.
When writing a new file, the operating system tries to find a contiguous chunk of free space on the disk in an attempt to store the file in a contiguous manner. However, if the file is larger than any of the available chunks of free space, or if the file grows later on (e.g. you are recording a video), there simply may not be a single contiguous chunk large enough to fit the file without dividing it into smaller fragments. Each of the smaller fragments is then stored in a separate chunk of free space. As a result, the file’s content becomes scattered randomly across the disk surface. This is called fragmentation.
Why Fragmentation Is Evil
Fragmentation is known to slow down file operations on magnetic hard drives, both reads and writes. Why “magnetic hard drives”? Because fragmentation does not have an adverse effect on the performance of solid-state media, and here’s why.
If you are using a traditional magnetic hard drive, it has spinning platters and moving heads inside. When reading a file, the hard drive first positions the heads over the area of the disk containing the data and waits for the heads to stabilize. Only then does it read (or write) the information. If a file is stored in a single contiguous chunk, the disk only needs to position its heads once. Contiguous read operations on modern hard drives are extremely fast; typically, you will see at least 70 to 100 MB/s when reading contiguous files.
If the file has more than one fragment, the hard drive will have to reposition the heads every time it needs to move to the next contiguous block. These operations are relatively slow. As a result, reading a highly fragmented file can be as slow as 5 to 15 MB/s on that very same disk.
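A back-of-the-envelope model shows why the slowdown is so dramatic. The throughput and seek figures below are illustrative assumptions in the same ballpark as the numbers above, not measurements of any particular drive:

```python
# Rough model of read time on a magnetic hard drive: transfer time
# plus one head reposition per fragment. Figures are illustrative
# assumptions, not benchmarks of a real disk.

SEQUENTIAL_MBPS = 100      # assumed contiguous read speed
SEEK_MS = 12               # assumed reposition + settle time per fragment

def read_time_seconds(size_mb: float, fragments: int) -> float:
    """Total time to read a file: transfer plus one seek per fragment."""
    transfer = size_mb / SEQUENTIAL_MBPS
    seeks = fragments * SEEK_MS / 1000
    return transfer + seeks

# A 100 MB file in one piece vs. sliced into 2000 small fragments:
contiguous = read_time_seconds(100, 1)       # ~1.0 s
fragmented = read_time_seconds(100, 2000)    # 25 s
print(f"{100 / contiguous:.0f} MB/s vs {100 / fragmented:.1f} MB/s")
```

With these assumed numbers, the same disk drops from roughly 99 MB/s on a contiguous file to 4 MB/s on a heavily fragmented one, purely because of head repositioning.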
Solid-state media such as SSD drives, USB flash drives and all common memory cards do not have any mechanical moving parts inside. There is no need to wait until something positions or stabilizes. As a result, reading or writing fragmented files is nearly as fast as reading or writing contiguous data. However, fragmentation on solid-state media introduces other issues, data recoverability being one of the most serious ones.
Operating Systems Try to Reduce Fragmentation
How exactly does the operating system choose which fragments of free space to fill with new data? Different operating systems and file systems use different algorithms. Most modern operating systems attempt to minimize fragmentation by choosing an optimal way of allocating free space, though what counts as “optimal” varies between systems. For example, the “best fit” algorithm always prefers the smallest chunk of free clusters that is big enough to fit the file. “Worst fit” is the opposite, giving preference to the largest chunks, ignoring smaller “holes” at the beginning of the disk and adding files to the end of the allocated area until the disk fills up. The “first fit” algorithm writes to the first available chunk that is big enough. There are other algorithms as well, each with its pros and cons, and each optimized for different usage scenarios. Some are faster at writing large files, while others are better at reducing fragmentation for files that are likely to grow, and so on.
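As a toy illustration (not the allocator of any real file system), here is how the three strategies might choose a chunk from a list of free-space extents:

```python
# Toy illustration of free-space allocation strategies. Free space is
# modeled as a list of (offset, length) chunks in clusters; each
# strategy picks a chunk that can hold `size` clusters, or None.

def first_fit(free_chunks, size):
    """First chunk large enough, scanning from the start of the disk."""
    return next((c for c in free_chunks if c[1] >= size), None)

def best_fit(free_chunks, size):
    """Smallest chunk that still fits, leaving the least leftover."""
    fits = [c for c in free_chunks if c[1] >= size]
    return min(fits, key=lambda c: c[1], default=None)

def worst_fit(free_chunks, size):
    """Largest chunk, keeping the small holes untouched for later."""
    fits = [c for c in free_chunks if c[1] >= size]
    return max(fits, key=lambda c: c[1], default=None)

free = [(10, 5), (40, 12), (90, 8)]   # (offset, length) in clusters
print(first_fit(free, 6))  # (40, 12) - first chunk that fits
print(best_fit(free, 6))   # (90, 8)  - tightest fit
print(worst_fit(free, 6))  # (40, 12) - biggest chunk
```

Note how the strategies disagree even on this tiny example: best fit preserves the large extent for a future big file, while worst fit preserves the small holes.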
Importantly, there is nothing you should (or could) do to configure how these algorithms work. But can you do anything at all to reduce fragmentation?
For mechanical (magnetic) hard drives, defragmenting the disk reduces fragmentation a great deal. Defragmentation rearranges data stored on the hard drive so that files are stored in a contiguous manner. It works best if you have enough free space on your hard drive, as the system needs room to copy each file it moves before it can release the space the file previously occupied. As a result, the system may be unable to fully defragment a disk with little free space available (which is one of the reasons it is often advised to keep some percentage of disk space free).
Defragmentation tools can be launched manually or can operate on a schedule. Scheduled defragmentation is a great way to maintain your mechanical hard drives. In recent versions of Windows, the built-in defrag tool can operate on a schedule without the need to install third-party solutions.
Defragmenting Solid-State and Flash-Based Media
What about defragmenting solid-state media such as USB flash sticks or SSD drives? Defragmenting these in traditional fashion makes no sense. In fact, attempting to defragment flash-based storage by shuffling data around will only cause additional (and completely unnecessary) wear to the flash cells, shortening their effective lifespan without making the disk work any faster.
So why do Windows 8 and 8.1 ship with an SSD “defragmenter”? The tool simply issues the TRIM command to the SSD drive, ensuring that any free space on the disk is properly recognized by the SSD controller and its garbage collection algorithms.
Fragmentation and Data Recovery
Finally, we are approaching the main point of this article: recovering information from fragmented storage media. As you can see, you can keep your mechanical hard drives nice and clean with an automated disk defragmenter running on a schedule. There is no such option for SSD drives, however. Being smaller than similarly priced mechanical hard drives, SSDs are much more likely to be filled to capacity. With little free space left on the disk, the operating system has no choice but to fragment data, so any files you write onto an SSD are likely to be heavily fragmented. Add to that the fact that SSD drives physically wipe cells containing deleted data (via garbage collection and TRIM), and your chances of recovering anything from an SSD drive are close to nil.
SSD drives aside, why is fragmentation so bad for recovering deleted files? Because at the time a file is deleted, the system may also erase the file system records that point to the exact physical clusters on the disk occupied by that file. This is not usually a problem with NTFS, which is used on virtually all Windows hard drives, but other file systems such as FAT32 (used on USB flash drives and memory cards up to 32 GB) effectively remove these pointers the moment a file is deleted. As a result, simply following links from the file system and reassembling the file may not be an option.
Oh, great, the file system doesn’t work, so let’s try carving! Well, not so fast. File carving works great when recovering contiguous files, and not-so-great if you are attempting to recover a fragmented file.
When carving, a data recovery tool such as Partition Recovery reads information from the disk one sector after another, looking for familiar signatures that could mark the beginning of a file in a certain format. When it finds something that looks like a file header (e.g. “%PDF” for a PDF document), it analyzes the surrounding data and, if it is indeed a file header, attempts to determine the size of the file. The idea is that, knowing the address of the beginning of a file and knowing its size, we can simply read that many clusters from the hard drive and voila! We have the file back.
This works great if your hard drive is nicely defragmented and the file you’re recovering is stored in one contiguous piece. If, however, the file is sliced into pieces scattered around the disk, the carving tool will read data that belongs to the beginning of the file followed by data that belongs to some other file.
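The naive approach described above can be sketched in a few lines. The JPEG start and end markers below are real signatures, but everything else is a deliberately simplified assumption, including the very assumption fragmentation breaks: that everything between header and footer belongs to one file.

```python
# Minimal sketch of signature-based carving: scan a raw disk image for
# JPEG markers (FF D8 FF header ... FF D9 footer) and assume each file
# is contiguous - exactly the assumption fragmentation violates.

JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"

def carve_jpegs(image: bytes):
    """Yield (offset, data) for every header-to-footer span found."""
    pos = 0
    while (start := image.find(JPEG_HEADER, pos)) != -1:
        end = image.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break  # header without a footer: truncated or overwritten
        yield start, image[start:end + len(JPEG_FOOTER)]
        pos = end + len(JPEG_FOOTER)

# Fake "disk image": 4 bytes of junk, then a 16-byte JPEG-like blob.
disk = b"junk" + b"\xff\xd8\xff\xe0photo-data\xff\xd9" + b"more junk"
for offset, data in carve_jpegs(disk):
    print(offset, len(data))   # prints: 4 16
```

If the blob between header and footer were split across two distant regions of the disk, this carver would happily emit a corrupt file containing whatever happened to sit in between, which is why the smarter modes discussed next matter.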
If you’re using a tool with a somewhat more intelligent carving algorithm, that tool may be able to work around this situation by analyzing the file system and only carving unallocated areas on the disk. This greatly increases chances of successful carving of fragmented files, but not every tool can support this mode. You can usually recognize tools supporting this type of carving if they allow choosing what can be carved: the entire disk surface, allocated areas (files) or unallocated areas (free disk space).
On the other hand, if you are attempting to carve a hard drive that has no file system at all (e.g. after formatting or repartitioning the disk), the entire disk space will be considered “unallocated”, and you won’t have any choice but to scan the full disk surface.
In the real world, recovering deleted files is often possible even if they are fragmented. If, for example, you are trying to recover a recently deleted file, the data recovery tool will start by analyzing the file system. As most hard drives in Windows PCs use NTFS, the tool has a good chance of discovering the correct sequence of records pointing to all the physical clusters on the hard drive that used to belong to that file. As a result, recovering a fragmented file is not as big a problem as it would be on FAT32, or if there were no file system at all.