Skip to main content

Secret Sharing Algorithm for Protecting Files in the Cloud

Data stored in the cloud can be compromised or lost (see my previous post). So, we have to come up with a way to secure those files. We can encrypt them before storing them in the cloud, which sorts out the disclosure aspects. However, what if the data is lost due to some catastrophe befalling the cloud service provider? We could store it on more than one cloud service and encrypt it before we send it off. Each of them will have the same file. What if we use an insecure, easily guessable password to protect the file, or the same one to protect all files? I have often thought that secret sharing algorithms could be employed to good effect in these circumstances instead.

What are secret sharing algorithms? They are algorithms that will share a secret between several parties, such that none of them can know the secret without the help of others. Either all or a subset of them will need to get together and put their parts together to obtain the original secret. A simplistic solution can be achieved by XORing the secret with a random number, then giving the result to one party and the random number to the other. Neither one can find out what the secret was without the other. To retrieve the secret they only need to XOR the two parts together again. This can be extended to any number of parties.

A more sophisticated way would be to allow the secret to be retrieved from a subset of the parts distributed. In the previous example, if any of the parties loses their part, or refuses to disclose it, then nobody can reveal the secret. This isn't much good if one of our cloud service providers fails. On the other hand, if we can share the secret between three people, but only require any two to regenerate the original, then we have some redundancy. This is an example of a (k,n) threshold scheme with k=2 and n=3.

How do we achieve this though? Well, Adi Shamir proposed a simple secure secret sharing algorithm. It is based on drawing graphs. To uniquely define a straight line, you need two points on that line. Similarly, to define a parabola you need three points. A cubic requires four, etc. So, we can distribute points on a line to each party we want to share the secret with. The order of the line will determine how many of them need to get together to regenerate it. So, we could define a random straight line and distribute three points on it to three different parties. However, only two of them need to get together to regenerate the original secret.

We set up a (k,n) threshold scheme by setting the free coefficient to be the secret and then choosing random numbers for each of the other coefficients. The polynomial then becomes the following:


where a0 is our secret. Now we can distribute points on the line to each of the n parties simply by calculating y for a series of different values for x. We can use the Lagrange Basis Polynomials to reconstruct the equation of the line from k points. However, we do not need to reconstruct the whole line, we are only interested in the free term. This simplifies the equations that we need to use. For example, if we have a straight line, then we only need two points (x0,y0) and (x1,y1). We can then calculate a0 as follows:
Similarly, for a parabola and three points (x0,y0), (x1,y1) and (x2,y2) we have:
This should be fairly simple to implement and use. You would need to sign up to a few cloud services, but you wouldn't have all your eggs in one basket and you wouldn't be reliant on weak passwords.

Comments

  1. Thanks for the secret secure algorithms for protecting files in the cloud. I think a secure secret sharing system is really helpful to allocate shares so that everyone shares with no extra information about the secret.

    ReplyDelete

Post a Comment

Popular Posts

Coventry Building Society Grid Card

Coventry Building Society have recently introduced the Grid Card as a simple form of 2-factor authentication. It replaces memorable words in the login process. Now the idea is that you require something you know (i.e. your password) and something you have (i.e. the Grid Card) to log in - 2 things = 2 factors. For more about authentication see this post . How does it work? Very simply is the answer. During the log in process, you will be asked to enter the digits at 3 co-ordinates. For example: c3, d2 and j5 would mean that you enter 5, 6 and 3 (this is the example Coventry give). Is this better than a secret word? Yes, is the short answer. How many people will choose a memorable word that someone close to them could guess? Remember, that this isn't a password as such, it is expected to be a word and a word that means something to the user. The problem is that users cannot remember lots of passwords, so remembering two would be difficult. Also, having two passwords isn't real

How Reliable is RAID?

We all know that when we want a highly available and reliable server we install a RAID solution, but how reliable actually is that? Well, obviously, you can work it out quite simply as we will see below, but before you do, you have to know what sort of RAID are you talking about, as some can be less reliable than a single disk. The most common types are RAID 0, 1 and 5. We will look at the reliability of each using real disks for the calculations, but before we do, let's recap on what the most common RAID types are. Common Types of RAID RAID 0 is the Stripe set, which consists of 2 or more disks with data written in equal sized blocks to each of the disks. This is a fast way of reading and writing data to disk, but it gives you no redundancy at all. In fact, RAID 0 is actually less reliable than a single disk, as all the disks are in series from a reliability point of view. If you lose one disk in the array, you've lost the whole thing. RAID 0 is used purely to speed up dis

Trusteer or no trust 'ere...

...that is the question. Well, I've had more of a look into Trusteer's Rapport, and it seems that my fears were justified. There are many security professionals out there who are claiming that this is 'snake oil' - marketing hype for something that isn't possible. Trusteer's Rapport gives security 'guaranteed' even if your machine is infected with malware according to their marketing department. Now any security professional worth his salt will tell you that this is rubbish and you should run a mile from claims like this. Anyway, I will try to address a few questions I raised in my last post about this. Firstly, I was correct in my assumption that Rapport requires a list of the servers that you wish to communicate with; it contacts a secure DNS server, which has a list already in it. This is how it switches from a phishing site to the legitimate site silently in the background. I have yet to fully investigate the security of this DNS, however, as most