This article was originally published in Italian (here). Feedback on content and translation is appreciated, and contributions are welcome. In case of updates, the original article must be considered the reference.
Securely erasing data from a disk means using proper methods that prevent their recovery at a later time. Whether we are dealing with single files or the contents of a whole disk, plain “deletion” through the file manager, or even a disk format, may not be enough for the purpose. To remove any trace of the data we wish to “destroy”, one of the existing secure erasure procedures should be used, thus making recovery impossible (or, at least, very difficult).
The technical details of data storage (and, consequently, deletion) on magnetic disks, SSDs or flash drives are quite complex and generally not relevant in practice. I would rather discuss the reasons to erase personal data in a secure way and show (in the next articles) how to use some of the common programs available for Windows and GNU/Linux to accomplish this task.
Photos, documents containing personal information (be it your mail address or bank account number), confidential work documents: any file recently stored (and deleted) on our storage media might be recovered, wholly or partially, after being simply erased (not just moved to the trash) by the file manager. This operation does not actually destroy the data: the data (the individual bytes) are still there, even though they can no longer be identified as a file.
The lines below schematically describe the disk contents before erasing a file:
[INDEX]
[BLOCK 1-9: CONTAINS FILE1.TXT][BLOCK 10-11: FREE SPACE][BLOCK 12-20: CONTAINS PHOTO.JPG]
[END INDEX]
[DATA AREA]
BLOCK |01|02|03|04|05|06|07|08|09|10|11|12|13|14|15|16|17|18|19|20|
DATA  | F| I| L| E| 1| .| T| X| T|  |  | P| H| O| T| O| .| J| P| G|
[END DATA AREA]
If we erase the file PHOTO.JPG, the disk contents are modified as follows:
[INDEX]
[BLOCK 1-9: CONTAINS FILE1.TXT][BLOCK 10-20: FREE SPACE]
[END INDEX]
[DATA AREA]
BLOCK |01|02|03|04|05|06|07|08|09|10|11|12|13|14|15|16|17|18|19|20|
DATA  | F| I| L| E| 1| .| T| X| T|  |  | P| H| O| T| O| .| J| P| G|
[END DATA AREA]
Disk space marked as “empty” still contains the data that were accessible before deletion: in fact, plain deletion does nothing more than mark disk areas that still contain data as available again.
Information on the position and size of files is written in an index (in different ways, depending on the file system). When the information related to a specific file is removed from the index, the file cannot be retrieved any more: the file has been “deleted”.
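The index/data-area scheme can be mimicked with a few lines of Python. This is only a toy model of the diagrams, not how any real file system works: the class `ToyDisk` and all of its methods are hypothetical names made up for illustration, and each "block" holds a single character.

```python
# Toy model of the diagrams: an index plus a data area.
# ToyDisk and its methods are hypothetical, for illustration only.

class ToyDisk:
    def __init__(self, size):
        self.blocks = [" "] * size   # data area, one character per block
        self.index = {}              # file name -> (first block, length)

    def write_file(self, name, start, content):
        for offset, char in enumerate(content):
            self.blocks[start + offset] = char
        self.index[name] = (start, len(content))

    def delete(self, name):
        # Plain deletion: only the index entry is removed,
        # the blocks are left untouched.
        del self.index[name]

    def secure_delete(self, name, filler="0"):
        # Secure deletion: overwrite the blocks, then drop the index entry.
        start, length = self.index.pop(name)
        for i in range(start, start + length):
            self.blocks[i] = filler

    def raw_scan(self):
        # What a recovery tool sees when it ignores the index.
        return "".join(self.blocks)


disk = ToyDisk(20)
disk.write_file("FILE1.TXT", 0, "FILE1.TXT")    # blocks 1-9
disk.write_file("PHOTO.JPG", 11, "PHOTO.JPG")   # blocks 12-20

disk.delete("PHOTO.JPG")
print("PHOTO.JPG" in disk.index)   # the file is gone from the index...
print(disk.raw_scan())             # ...but its bytes are still in the data area
```

Calling `secure_delete` instead of `delete` would leave only filler characters in blocks 12-20, which is the whole point of the methods discussed in the rest of the article.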
Overwriting those areas with new data replaces the old content. Nevertheless, if someone has a magnetic force microscope close at hand to scan the disk surface, it is reportedly possible to find traces of the previous 2 or 3 write operations on the disk. Or, at least, so they say…
If you do not happen to have such a tech gadget in your toolbox, and you are not paranoid either, 3 or 4 overwriting passes make data recovery practically infeasible, at least with the tools usually available to normal computer users. Stricter safety and secrecy requirements (military-grade data, for example) might prescribe methods with a higher number of overwriting passes. If you are paranoid, or you need absolute certainty of the result, the ultimate solution is the physical destruction of the medium: someone is even selling a DIY kit of common tools and operating instructions to disassemble a magnetic hard disk and make it utterly useless.
When might a secure erase be required?
When a storage medium (typically a hard disk) is given away or simply sent out for maintenance: that is, in all those situations in which the storage medium where our data were (are!) saved will no longer be physically in our hands. Also, when we want to selectively remove the traces of specific files stored on our disks.
Have you ever jotted down a password or your ATM PIN in a plain text file, promptly deleting it after copying or using the code? Good, I might just make a cash withdrawal now, from YOUR bank account!
Because of the confidential data that the disks of work computers might hold, these should be securely erased before being refurbished or simply decommissioned.
By contrast, a quick disk clean-up achieved through a disk format or partitioning (e.g. to make a fresh install of the operating system) is enough when the disk remains in our hands. Secure erase cannot repair disk defects (bad sectors) and does not improve disk performance in any way; on the contrary, it requires intensive disk activity to carry out (it can take hours, many hours, to finish!).
Given the final outcome of a secure erase (irreversible data loss, which is exactly what we want!), extreme care must be taken to correctly identify the disk on which we are going to operate.
If you blow up your system disk or your whole photos/music/movies collection, please do not come here in tears: you’ve been warned!
Secure erase by data overwriting
The easiest way to erase data (from now on, “erase” will always mean “erase securely”) is to overwrite them multiple times. The various erasure methods mostly differ in the number of overwriting passes and in the data patterns written to disk. The Gutmann method, for example, consists of 35 overwriting passes (8 with random data, 27 with predefined patterns executed in random order). What are you hiding that is so secret as to torture your disks so cruelly? 😉
Other methods are named after the agencies that regulated the secure erase procedures (DoD, Department of Defense; NSA, National Security Agency) and prescribe fewer passes (usually from 3 to 7) with different alternations of random and patterned data. The aim of such variation is to “confuse” any residual magnetic traces and prevent data recovery.
The procedures implemented in the majority of commercial software providing secure erase features are:
METHOD         │ # OVERWRITINGS   │ DATA PATTERNS
───────────────┼──────────────────┼──────────────────────────────────────────────────────────────────────
DOD-5220.22-M  │ 3                │ Random data
───────────────┼──────────────────┼──────────────────────────────────────────────────────────────────────
Schneier (NSA) │ 7                │ All 0's / All 1's / 5 passes of random data
───────────────┼──────────────────┼──────────────────────────────────────────────────────────────────────
Gutmann        │ 35               │ 4 passes of random data / 27 passes of data with predefined patterns
               │                  │ executed in random order / 4 passes of random data
For further details, please refer to the pages linked below:
http://en.wikipedia.org/wiki/Data_erasure
http://en.wikipedia.org/wiki/Data_remanence
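To give an idea of what “a sequence of overwriting passes” means in practice, here is a minimal Python sketch that overwrites a single file in place. It is an illustration under several assumptions, not the implementation used by any real tool: the function names are made up, the 27 “predefined patterns” are placeholder byte values (not Gutmann's actual patterns), and on SSDs, flash drives or journaling/copy-on-write file systems an in-place overwrite is not guaranteed to land on the same physical blocks.

```python
import os
import random
import tempfile

# In a pass list, None means "a pass of random data";
# a bytes value of length 1 means "a pass filled with that byte".

def schneier_passes():
    # As listed in the table: all 0's, all 1's, then 5 random passes.
    return [b"\x00", b"\xff"] + [None] * 5

def gutmann_like_passes():
    # Same *structure* as the Gutmann row: 4 random + 27 patterned passes
    # in random order + 4 random. The byte values are placeholders only.
    patterned = [bytes([i]) for i in range(1, 28)]
    random.shuffle(patterned)
    return [None] * 4 + patterned + [None] * 4

def overwrite_file(path, passes, chunk_size=64 * 1024):
    """Overwrite the file at `path` in place, once per entry in `passes`."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                data = os.urandom(n) if pattern is None else pattern * n
                f.write(data)
                remaining -= n
            f.flush()
            os.fsync(f.fileno())   # ask the OS to push this pass to the device


# Demo on a throwaway temporary file (never point this at real data!)
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"ATM PIN: 1234")
os.close(fd)

overwrite_file(demo_path, schneier_passes())
with open(demo_path, "rb") as f:
    print(f.read())                # random bytes, same length as before
os.remove(demo_path)               # only now remove the index entry
```

Note the order of operations: the content is overwritten first, and only afterwards is the file removed from the index.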
Considering the data recovery software available to home/corporate users, and in the absence of strict data secrecy requirements, even a single overwriting makes recovery difficult and increases the chance that any recovered data are corrupted. Of course, if very “sensitive” data have to be destroyed (military-grade or corporate files, compromising pictures -who hasn’t got one?- not yet published on some social network) you might say “better safe than sorry”, thus dramatically increasing the number of overwriting passes.
The increase in data density on hard disk platters (10 years ago it was a few GB per platter; today, for the same dimensions, density might reach 500 GB to 1 TB per platter) makes data recovery after a single overwriting even harder. Tracks on the disk surface have become so close to each other that it is practically impossible, even with a physical analysis of the disk surface (what can be so important to justify this?), to detect the remnants of previous writes on the edges of the tracks themselves.
Finally, there is a quick and safe erase method that relies on a set of commands built into the disk firmware (no additional software required): Secure Erase (SE). This command has been included in the firmware of ATA disks ever since capacities exceeded 10-15 GB: every hard disk marketed today can therefore “self-destruct” via SE.
The advantages of SE compared to other methods are basically two:
- faster execution (a single overwriting is enough)
- areas of disk usually reserved and not accessible by deletion/recovery software are erased, too.
This limits the mechanical/thermal stress on the disk and ensures a higher security level, if required.
The drawback is that SE always affects the whole disk: a single “peculiar” file cannot be targeted, unlike with most software using the methods described above. For further details, I suggest reading the CMRR (Center for Magnetic Recording Research) page, which is actually the “Mr-Know-it-all” home as far as hard disks are concerned :-). Specifically, the “Data Sanitization Tutorial” provides technical hints and comparisons between different methods, as well as the legal implications of inappropriate data deletion.
Paranoia and advanced technologies aside, a practical rule of thumb should be kept in mind: the execution time increases in proportion to the number of passes, while the security level (that is, the difficulty of recovering data) grows more and more slowly. The “sanitization” of a whole recent hard disk (which may be up to 3 TB in size) might require hours, if not days, to complete, even with a 2- or 3-pass method. Moreover, the disk would have to withstand quite a high mechanical and thermal stress, which might even be harmful for the disk itself.
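The “hours, if not days” figure is easy to check with some back-of-the-envelope arithmetic. The sketch below assumes a sustained write speed of 100 MB/s, which is my own round number and not a measurement; real disks may be faster or slower, and the function name is made up for the example.

```python
def estimated_hours(disk_bytes, passes, throughput_bytes_per_s):
    """Time to overwrite the whole disk `passes` times, in hours."""
    return disk_bytes * passes / throughput_bytes_per_s / 3600

DISK = 3 * 10**12          # a 3 TB disk
THROUGHPUT = 100 * 10**6   # ASSUMPTION: 100 MB/s sustained write speed

for passes in (1, 3, 7, 35):
    hours = estimated_hours(DISK, passes, THROUGHPUT)
    print(f"{passes:2d} passes: {hours:6.1f} hours")
```

Under this assumption a single pass takes a bit more than 8 hours and a 3-pass method about 25 hours, while 35 passes would keep the disk busy for roughly 12 days: the time grows linearly with the number of passes, the security does not.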
Physical actions on storage media
We have been kind and delicate until now, just trying to “mess up” remaining data by overwriting passes. When this is not enough, due to very strict security requirements, we have to get a bit rougher!
The tortures we can put our storage media (hard disks, CD/DVD, flash drives, etc.) through are destruction by grinding, melting or vaporization, and de-magnetizing (degaussing, unsuitable for CD/DVD). As suggested by CMRR experts, for hard disks simply bending the platters prevents any attempt to recover data.
The “degaussing” method deserves some additional detail: hard disk drives are made up of metal platters on which a small magnetic head “writes” by changing the magnetization of a small area (see the Wikipedia page). If a strong magnetic field is applied to the disk, the state (polarity) of each area is changed, which in turn means that the single bit of information stored there is changed as well. This magnetic field also affects the magnets of the motors that spin the disk and move the head (it seems we’re talking about an old turntable…), making them useless. Not only has the drive been “cleaned up” of all data, but it has also become a shiny paperweight!
Of course, physical destruction of disks requires specific equipment and implies strong security and secrecy requirements that home/corporate users do not need. Unless you have the usual embarrassing photos ready to be tagged on some nosey site… 🙂
In my next articles I will describe some of the software available for Windows and Linux to “wipe” data out.