Skip to main content

File Deletion versus Secure Wiping (and how do I wipe an SSD?)

When is a deleted file actually removed from your device, or at least when does it become unrecoverable? It turns out that this question isn't always easy to answer, nor is a secure file deletion easy to achieve in all circumstances.

To better understand this we have to start from the basic principle that when you delete a file on your computer you are only deleting the pointer to the file, not the actual data. The data on your hard disk drive (HDD) is stored magnetically in sectors on platters that spin round inside the HDD (we'll come onto SSDs in a bit). So, how does the computer know where to look for your file? It has a table of indexes such as the File Allocation Table (FAT) or Master File Table (MFT) in NTFS. When you delete a file in your OS, all you are actually doing is removing its entries from the table of indexes so your OS can't find it any more and doesn't know it's there. However, all the data is still stored on the disk and IS STILL RECOVERABLE! Tools like Piriform's Recuva can scan your disk for orphaned files and file fragments and allow you to recover them.

So, how do you actually securely delete a file so that it is unrecoverable? The most common way to securely delete a file is to overwrite it one or more times with other data before removing the entries in the index table. Different schemes for overwriting the data exist from NIST, the US DoD, HMG, Australian Government, etc. These usually consist of 1-3 rounds of writing all zeros, all ones or random patterns to the sectors, i.e. physically overwriting the data on the disk before 'deleting' it. There are many tools available to securely delete files and securely wipe drives according to these requirements.

Excellent, we've solved the problem of secure file deletion. Or have we? Well, no. There are usually some hidden areas of drives such as bad sectors that haven't actually failed, Host Protected Area (HPA), Device Configuration Overlay (DCO), etc. Interestingly, with DCO it is possible that you have a significantly bigger HDD capacity than is reported by the drive. Some manufacturers will sell bigger HDDs with the capacity artificially reduced for a variety of reasons. However, the important point here is that there are areas of the drive that you cannot normally access, but that may contain remnants of your data.

What of Solid State Drives (SSD)? Are they easier or harder to securely wipe? It turns out that they are much harder to wipe. SSDs can store your data anywhere and the controllers are programmed to 'wear' the drive evenly by keeping track of areas that get a lot of use and moving data around on the drive. So, assuming you keep roughly the same file size, when you edit your file on an HDD the original physical sectors will usually get overwritten with the new version. However, with SSDs, it is likely that the new version will be written to new areas of the disk, leaving the originals intact. It is very difficult to know where an SSD actually writes your data. They also have many hidden areas as above as well as capacity used to cope with failing sectors or evening up the wear. The long and tall of it is that if you use software to overwrite the file, like you would on an HDD, you probably haven't overwritten the data at all, but you will have reduced the life of your drive.

So how do we secure delete a file on an SSD? There aren't that many manufacturers of SSDs and most of them provide utilities to securely wipe their drives using the ATA Secure Erase (SE) command, which is a firmware supported command to securely wipe the whole drive, releasing electrons from the storage cells, thus wiping the storage. That's just wiped our whole drive though; how do I wipe just a file? Well, you can't really. You either wipe the drive or don't bother.

There is a 'gotcha' here as well though. I said earlier that there aren't many SSD manufacturers, but if you go to buy one there seem to be loads. Well, people like HP and IBM rebrand other people's SSDs (I believe they use Crucial). What's the harm in this? Well, they will sometimes re-flash the firmware to have their own feature set. That means that the original manufacturer's Secure Erase software may not work on them and the IBMs and HPs don't always provide an alternative (other than the traditional overwriting you would do on an HDD).

There must be something you can do though, surely? Well, yes there is. If you first encrypt your drive, or use file-level encryption, then the data that is on the drive should be unrecoverable (assuming you haven't stored the keys on the drive as well). This is actually your best bet for an SSD, but also does no harm on a traditional HDD.

OK, so if I want to get rid of a drive that is End of Life, what should I do? If it's an HDD, you should secure wipe it by overwriting the whole drive several times as described above, degauss it (i.e. using electro magnets to wipe the magnetic data on the platters) and then shred the drive. Yes, I did say shred the drive... into tiny pieces. You can get some impressive machinery to do this, or use a service to shred them on site for you. What about SSDs? Use the ATA Secure Erase function from the manufacturer's software and then shred them as before (just make sure the shredding process actually destroys the chips so they can't be re-floated onto another board to read them).

Comments

Popular Posts

Coventry Building Society Grid Card

Coventry Building Society have recently introduced the Grid Card as a simple form of 2-factor authentication. It replaces memorable words in the login process. Now the idea is that you require something you know (i.e. your password) and something you have (i.e. the Grid Card) to log in - 2 things = 2 factors. For more about authentication see this post . How does it work? Very simply is the answer. During the log in process, you will be asked to enter the digits at 3 co-ordinates. For example: c3, d2 and j5 would mean that you enter 5, 6 and 3 (this is the example Coventry give). Is this better than a secret word? Yes, is the short answer. How many people will choose a memorable word that someone close to them could guess? Remember, that this isn't a password as such, it is expected to be a word and a word that means something to the user. The problem is that users cannot remember lots of passwords, so remembering two would be difficult. Also, having two passwords isn't real

How Reliable is RAID?

We all know that when we want a highly available and reliable server we install a RAID solution, but how reliable actually is that? Well, obviously, you can work it out quite simply as we will see below, but before you do, you have to know what sort of RAID are you talking about, as some can be less reliable than a single disk. The most common types are RAID 0, 1 and 5. We will look at the reliability of each using real disks for the calculations, but before we do, let's recap on what the most common RAID types are. Common Types of RAID RAID 0 is the Stripe set, which consists of 2 or more disks with data written in equal sized blocks to each of the disks. This is a fast way of reading and writing data to disk, but it gives you no redundancy at all. In fact, RAID 0 is actually less reliable than a single disk, as all the disks are in series from a reliability point of view. If you lose one disk in the array, you've lost the whole thing. RAID 0 is used purely to speed up dis

Trusteer or no trust 'ere...

...that is the question. Well, I've had more of a look into Trusteer's Rapport, and it seems that my fears were justified. There are many security professionals out there who are claiming that this is 'snake oil' - marketing hype for something that isn't possible. Trusteer's Rapport gives security 'guaranteed' even if your machine is infected with malware according to their marketing department. Now any security professional worth his salt will tell you that this is rubbish and you should run a mile from claims like this. Anyway, I will try to address a few questions I raised in my last post about this. Firstly, I was correct in my assumption that Rapport requires a list of the servers that you wish to communicate with; it contacts a secure DNS server, which has a list already in it. This is how it switches from a phishing site to the legitimate site silently in the background. I have yet to fully investigate the security of this DNS, however, as most