Multiparty Computation: The beacon of privacy solutions explained
By Ivan Damgård, Chief Cryptographer at Partisia Blockchain
You may have heard about Multiparty computation, or MPC, for short, but you may not know exactly what it is. That’s why in this blog post, we are going to explain exactly what multiparty computation is, and discuss the profound and seemingly impossible tasks it can help us accomplish.
Multiparty computation has a number of real world applications. Say, for instance, we have a set of people who all have the same type of job in various companies. They are interested in finding out what the average of their salaries is, in order to guide their future salary negotiations. On the other hand, for obvious privacy reasons, they are not willing to simply reveal their salaries to each other.
So, the question is: can these people interact in some way that allows each individual to find out what the average of all of their salaries is, but in such a way that no one learns any more than that? At first glance, this may seem impossible. How can we compute on data if the data is not available to us? Nevertheless, it can be done, and here is how.
Firstly, we can see that we really only need to compute the sum of all the salaries. If we have the sum, we can divide by the number of participants to get the average, and conversely if you know the average, you can multiply to get the sum, so the sum does not reveal more than does the average.
If, for example, we have 5 people participating, we call them A, B, C, D, and E. We will let X stand for any of these people. Now, for participant X to enter his salary S into the computation, he does the following: he chooses 5 random numbers x1,.., x5, but with the property that
S = x1+ x2 +x3 +x4 +x5
This can be done, for instance, by choosing the first 4 numbers randomly and then choosing the last one so that the sum is correct. Now, you privately send x1 to A, x2 to B, and so on (you also give yourself a number).
Note that this does not reveal your salary to anyone else: For instance, C gets x3, but this is just a random number. To understand this better, consider a really simple case with just two people, Bob and Charlie. Let’s say the secret number is 3 and we give 5 to Bob and -2 to Charlie. Now, from Bob’s point of view, he has no idea which number Charlie received. And since 5 plus whatever can give any result, Bob still has no idea what the secret number is.
This is an example of so-called secret-sharing which is one of the most basic and important tools in MPC. In the example, the secret is S and the xi’s are called shares. For readers with a math background, I should mention that for the above description to be correct, one should actually choose the random xi’s modulo some large enough number p and then the additions should be done modulo p. But I will ignore this here to keep it simple.
So, in MPC lingo, to enter your salary into the computation, participant X will secret-share his salary, to get shares x1,…, x5 and give one share to each participant.
Now we are finally ready to see how the secure computation is done. Once every participant has secret-shared his salary, everyone has received one number (share) from everyone else. In table form, shares held by A, B, C, D and E looks as follows:
So, for instance, (a1,…, a5) are the shares of A’s salary, so they sum to A’s salary. Now, each participant adds all the numbers (shares) they hold, and sends the sum to all the other participants. In other words, the sum of each column in the table is computed and published, so for instance, the results in column sums s1,…, s5.
Now, it turns out that the sum of all the salaries is simply s1+s2+s3+s4+s5.
To see why, notice that if we were to add along each row in the table, we would get the salaries, and adding those gives the result we are after. So this result is actually the sum of all numbers in the above table. But now, notice that when we add the columns and then add column sums, we are just adding everything in a different order. But, of course, the sum is still the same!
The protocol we have seen is a good example of a general principle that MPC constructions often follow: first the inputs are secret shared. And now, instead of computing on the actual data, participants do a corresponding computation on the shares they have. And from those (partial) results they obtain, we can finally reconstruct the actual output we wanted. This keeps the inputs private, because we only compute on shares that do not reveal anything about the actual data.
In real life MPC, more advanced techniques are used where:
- we can do general computation, not just sums.
- we can complete the job, even if some participants crash or malfunction.
- we can at least identify those that did not do what they were supposed to.
However, the general design pattern is still very similar.
Use cases for MPC
The average salary problem is just one example. It turns out that there are tons of situations in real life where MPC could solve the problem presented. Let’s take a look at a few of these situations, so we can see what they have in common.
Online consumer auction
Auctions exist in many variants and are used for all kinds of purposes, but for the purpose of this, let’s concentrate on the simple variant where an item is for sale, and where the highest bid wins. We assume the auction is online (similarly to those that occur on Ebay), where the price starts at some preset amount and people place increasing bids until no one wants to bid more than the holder of the current highest bid. As this is a time consuming process, most of these online auctions allow you to provide the maximum amount you are willing to pay and the system will bid on your behalf up to this max bid. This way, the bidder with the highest chosen strike price of the item wins the auction at a price just slightly above the runner up, which even allows all parties to save time thanks to the automatic bidding.
For such an auction to work in a fair way, it is obvious that the maximum amount you are willing to pay should be kept private. For instance, if the auctioneer knows your maximum strike price and is working with the seller, they can force the price to be just below the highest maximum bid and so force the winner to pay more than if the auction had been honest.
On the other hand, the result of the auction could in principle be computed, given the true maximum value each bidder assigns to the item on sale.
A typical procurement system is a sort of inverted auction, where some party (the buyer) asks companies (sellers and bidders) to bid for a contract, i.e., to make an offer on the price for doing a certain job. In such a case, the lowest bid wins. But on the other hand, bidders are interested in getting as large a price as possible.
Let’s consider the simplest possible auction with a single round of sealed bids and where the bidder with the lowest bid wins and gets compensated according to the lowest bid (a first price sealed bid auction). It is obvious that bids are private information as the seller with the lowest bid could benefit from increasing the bid to just below the second highest bid. Also, if competing sellers have access to the identity of bidders and their sealed bids, it would be easier for them to corrupt the competition and agree on prices outside of the auction.
On the other hand, the true value of the bids are needed to find the winner of the auction.
Assume you run a company. You will naturally be interested in how well you are doing compared to other companies in the same line of business as yours. The comparison may be concerned with a number of different parameters, such as profit relative to size, average salaries, productivity, etc. Other companies will most likely have similar interests in such a comparison, which is known as a benchmark analysis. Such an analysis takes input from all participating companies. Based on this, it tries to compute information on how well a company in the given line of business should be able to perform, and finally each company is told how its performance compares to this “ideal”.
It is clear that each company will insist that its own data is private and must not be leaked to their competitors. On the other hand, the desired results can be computed from the private data and there are several known methods from economics for doing such an analysis efficiently.
In most countries, public institutions, such as the tax authorities or the health care system, keep databases containing information on citizens. In many cases there are advantages one can get from coordinated access to several such databases. Researchers may be able to get statistics they could not get otherwise, or institutions might benefit from an administrative advantage by being able to quickly gather the information they need on a certain individual.
On the other hand, there is clearly a privacy concern here: access to many different databases by a single party opens the possibility to compile complete dossiers on particular citizens, which would be a violation of privacy. In fact, accessing data on the same person in several distinct databases is forbidden by law in several countries precisely because of this concern.
The answer? MPC.
In all of the above scenarios, we see very similar situations: we have a number of parties that each possess some private data. We want to do some computation that needs all the private data as input. The parties are interested in learning the result, or at least some part of it, but still want to keep their private data as confidential as possible.
A very simple (but naïve) solution to this would be to find some party, T, that everyone is willing to trust. Now all parties privately give their input to T, she does the required computation, announces the result to the parties, and forgets about the private data she has seen.
However, this is actually very unsatisfactory: firstly, we have created a single point of attack from where all the private data can potentially be stolen. For instance, the party who manages the bids in a sealed bid auction is such a critical single point. Second, the parties must all completely trust T, both with respect to privacy and correctness of the results. Now, the reason why there are privacy concerns is that the parties do not trust each other in the first place, so why should we believe that they can find a new party they all trust?
On the other hand, the problems we have seen are exactly the kind of problems MPC can solve, but without creating a single point of attack. It is clear that the list of potential applications of MPC technology is essentially endless.