Friday, October 30, 2015

securing the hegemonic aether for you peasants against you peasants others...,


So let's start with fully homomorphic encryption. It's a particular type of encryption scheme, different from what you have seen so far, which lets you encrypt data-- so that is the first arrow pointing from the user to the cloud. She encrypts her data and stores it in the cloud. And then the magic sauce is a procedure that lets the cloud take this encrypted data and do computations on the underlying data. Remember, the cloud doesn't see what's inside, and yet, magically, it can do computations on it. And what it gets at the end of it is the encrypted result of this computation. Once the cloud gets the encrypted result, it sends the encrypted result back to the user. Now the user has the key. She can open the box, decrypt, and what does she learn? She learns the result of the computation. We let the user use the cloud for everything that she wanted to do, except now she also has privacy. So the bottom line of fully homomorphic encryption is that it lets you do anything that you want to do on plain text data you can see, it lets you do on encrypted data which you cannot see. 

What kind of security do I want from homomorphic encryption? The standard notion of security, the golden standard these days in cryptography, is the notion of indistinguishablility of ciphertexts, or semantic security. This was a notion developed by Shafi Goldwasser and Silvio Micali back in the '80s. And what that says is that, number one, encryption has to be probabilistic. In other words, if you encrypt a message twice, you should get completely different ciphertexts. Encryption injects randomness into the process, and the ciphertext looks different every time you encrypt it. Now, what that means is that if you see a stream of ciphertexts passing by, you won't even know if there are any repeats in this ciphertext. It doesn't let you figure that out. This is a very, very strong notion of security. Again, just to be clear, indistinguishability of ciphertexts says that you, as an adversary, can pick two messages. You pick. It's your choice. They have to have the same length, because from the ciphertext, you can tell what the length of the message is. So I'm not trying to hide the length information. So you pick two messages, same length. Send it to me. I, as the challenger for you, will pick a random one of the two messages and encrypt them using this probabilistic encryption, and send it back to you. Your job is to figure out which message I encrypted. If you manage to figure this out, you've won. So I say that an encryption system is secure, semantically secure, or indistinguishably secure, if there is no way that you can win in this game. So that is the notion of security I need from encryption. That is the notion of security that I need from fully homomorphic encryption as well. Now that we know what homomorphic encryption is, and what kind of security notion you want it to satisfy, let's take a step towards trying to see how we can achieve it. 

So what is homomorphic encryption? Again, the new magic sauce is a way to take encryptions of plaintexts-- data-- together with the function that you're interested in computing, and somehow coming up with an encryption of f of this data, the function applied on this data. So before we even start talking about how to do this, we have to figure out, what is this function? How do I represent this function? Is it a C program? Is it a Java program? Is it-- what is it? Is it a hardware circuit? How do we represent these functions? Well, for me today, and in all the literature on homomorphic encryption and, in fact, secure multiparty computation, the standard way to represent functions, computations, is through a circuit. So what's a circuit? You have two types of gates here, addition gates and multiplication gates. Now, I can define this over any field. So if I define it over a field of size 2, these are basically XOR and AND gates. But I could do something more general if I wanted. So how is the circuit defined? You basically put together XOR gates, addition gates, and multiplication gates, or AND gates, in this form of a tree or a graph. And you feed the circuit inputs that comes from the top, and every time the circuit computes either the addition or the multiplication of the bits that are fed into it, and that keeps going. So this is a model of computation, and that is a model I am going to work with. When I say function, I mean a circuit computing the function. 

Now, you might say what happens if I want a computer program or extra data? Turns out that I can convert a program into a circuit, as long as I know its running time. So if you don't have infinite loops in your program, you can undraw the program into a circuit, and that's what I will work with. Now, there is a separate line of research, very interesting work, which deals with how to compute on programs directly without undrawing them and turning them into circuits. Once you've fixed your model of computation, once you have circuits, you think about it, and you realize that if you want to compute a circuit on encrypted inputs-- encrypted bits, here, let's say-- all you have to do is to add encrypted bits and multiply encrypted bits. In other the words, you want an encryption system where given encryption of x1 and encryption of x2, two bits or two numbers, x1 and x2, you should be able to turn it into an encryption of x1 plus x2, as well as an encryption of x1 times x2. If you can do this, you can go through the circuit step by step. Every time you will have an encryption of the wire off the circuit and you will keep going. You will get the encrypted result. That's what we're going to do today.