The Trend Towards Blockchain Privacy: Zero Knowledge Proofs

One of the bigger trends in the blockchain world, particularly in financial services and specifically capital markets operations, has been the need for privacy and confidentiality in the course of daily business. As a result, blockchain solutions are being designed with this need as a primary requirement, which has led to the many private blockchain solutions being developed today.
Building for privacy and confidentiality comes with tradeoffs. Mainly you lose transparency, which was the major feature of the first blockchain: bitcoin. As originally designed, a blockchain is a transparency machine. In this system, the computers are distributed and no one entity controls the network. Not only this, but anyone can be a validator and anyone can write to or read from the network. Clients and validators can be anonymous and all the data gets stored locally in every node (replication). This makes all transaction data public.
The security of bitcoin is made possible by a verification process in which all participants can individually and autonomously validate transactions. While bitcoin addresses the privacy problem by issuing pseudonymous addresses, it is still possible to find out whose addresses they are through various techniques.
This is the polar opposite of what is happening in the private blockchain world, where decentralization and transparency are not deemed necessary for many capital markets use cases.
What is important is privacy and confidentiality, latency (speed) and scalability (the ability to maintain high performance as more nodes are added to the blockchain). Encrypted node-to-node (n2n) transactions mean that only the two parties involved in a transaction receive its data. Many of these systems offer opt-ins that let third-party nodes (such as regulators) be a part of the transaction.
Other systems being developed for similar purposes, which have been written about on this blog, have one designated block generator which collects and validates proposed transactions, periodically batching them together into a new-block proposal. Consensus is provided by the generator, which validates the block against the rules agreed to by the nodes (chain cores), together with designated block signers.
In these systems, decentralization is simply not necessary because all the nodes are known parties. In private blockchains the nodes must be known in order to satisfy certain regulatory and compliance requirements. Because identities are known, there are ways for legal recourse even between parties who don't necessarily trust each other. The focus has therefore been on how to preserve privacy and confidentiality while achieving speed, scalability, and network stability.
Strong, durable cryptographic identification
What are cryptography and encryption?
As noted above, privacy and confidentiality are pivotal, so encryption has become a major focus for all blockchains. Many of these solutions use advanced cryptographic techniques that provide strong, mathematically provable guarantees for the privacy of data and transactions.
In a recent blog post titled "A Gentle Reminder About Encryption", Kathleen Breitman of R3CEV succinctly provides a great working definition:
"Encryption refers to the operation of disguising plaintext, information to be concealed. The set of rules to encrypt the text is called the encryption algorithm. The operation of an algorithm depends on the encryption key, or an input to the algorithm with the message. For a user to obtain a message from the output of an algorithm, there must be a decryption algorithm which, when used with a decryption key, reproduces the plaintext."
If an encryption scheme allows operations on the ciphertext to carry through to the underlying plaintext, you get homomorphic encryption, and this (combined with digital signature techniques) is the basis for the cryptographic techniques which will be discussed in this post. Homomorphic encryption allows computations to be done on encrypted data without first having to decrypt it. In other words, this technique preserves the privacy of the data/transaction while computations are performed on it, without revealing that data/transaction. Only those with decryption keys can see exactly what that data/transaction was.
Homomorphic encryption means that decrypt(encrypt(A) + encrypt(B)) == A+B. A scheme with this property is said to be homomorphic under addition.
So the result of a computation performed on the encrypted data, once decrypted, is equal to the result of the same computation performed on the unencrypted data.
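To make the identity above concrete, here is a minimal sketch using a toy version of the Paillier cryptosystem, which is one well-known additively homomorphic scheme (it is my choice for illustration; the post does not name a specific scheme). The primes are tiny and the function names (encrypt, decrypt, add_encrypted) are hypothetical, so treat this as a demonstration of the property and not as usable cryptography. Note that in Paillier the "+" on ciphertexts is implemented as a modular multiplication.

```python
import math
import secrets

# Toy Paillier parameters. These primes are far too small for real use
# and are an assumption made purely for illustration.
P, Q = 1789, 1907
N = P * Q
N2 = N * N
G = N + 1                                        # standard simple generator choice
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                             # inverse of LAM mod N (works because G = N + 1)


def encrypt(m):
    """Encrypt integer m (0 <= m < N) with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Recover the plaintext from ciphertext c using the private values LAM and MU."""
    u = pow(c, LAM, N2)
    return ((u - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Combine two ciphertexts so that the result decrypts to the sum of the plaintexts."""
    return (c1 * c2) % N2


a, b = 1200, 345
total = decrypt(add_encrypted(encrypt(a), encrypt(b)))
print(total == a + b)
```

Whoever runs add_encrypted never sees a or b; only the holder of the private values can decrypt the combined result.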
The key question being asked is: How can you convince a system of a change of state without revealing too much information?
After all, blockchains want to share a (change of) state, not information. On a blockchain, some business process is at state X and now moves to state Y. This needs to be recorded and proved while preserving privacy and without sharing a lot of information. Furthermore, this change of state needs to happen legally, otherwise there is a privacy breach.
Cryptographic techniques like zero knowledge proofs (ZKPs), which use different types of homomorphic encryption, separate:
1) reaching a conclusion on a state of affairs
2) the information needed to reach that state of affairs
3) showing that that state is valid.
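The separation listed above can be sketched with a classic Schnorr-style proof of knowledge, made non-interactive with a hash-based challenge. This is my own illustrative choice and not something the post prescribes: the prover convinces a verifier that it knows the secret x behind a public value y (the conclusion), the verifier checks one equation (the validity), and x itself is never sent (the information). The group parameters are toy-sized and the names prove, verify and challenge are hypothetical.

```python
import hashlib
import secrets

# Toy group: P = 2*Q + 1 with P and Q prime; G generates the subgroup of order Q.
# These sizes are an assumption for illustration and offer no real security.
P = 2039
Q = 1019
G = 4


def challenge(y, t):
    """Derive the challenge from the public values by hashing them."""
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x):
    """Return the public key y and a proof of knowledge of x, without revealing x."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q - 1) + 1   # one-time secret nonce
    t = pow(G, k, P)                   # commitment to the nonce
    c = challenge(y, t)
    s = (k + c * x) % Q                # response; x stays masked by k
    return y, (t, s)


def verify(y, proof):
    """Check G^s == t * y^c (mod P) using only public values."""
    t, s = proof
    c = challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P


secret_x = 777
public_y, proof = prove(secret_x)
print(verify(public_y, proof))
```

The verifier ends up convinced of the statement "the prover knows x such that y = G^x" while only ever handling y, t and s.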
The rest of this post will discuss how the trend towards privacy has led to cryptographic techniques, some old and some new, being used to encrypt transactions and the data associated with them from everyone except the parties involved. The focus will be on Zero Knowledge Proofs, zkSNARKs, Hawk, Confidential Transactions, state channels and homomorphic encryption.
Privacy is the main gap standing in the way of deploying blockchains, and it is the problem that all of the cryptographic solutions discussed below set out to address.
Outside of a blockchain, there are examples of homomorphic encryption in practice. CryptDB is an example of a system that uses homomorphic encryption and other attribute-preserving encryption techniques to query databases securely. It is used in production at Google and Microsoft amongst other places.
It does have limitations though: you have to define the kinds of queries you want ahead of time, and it is easy to leak data. CryptDB provides confidentiality for data content and for names of columns and tables; however, CryptDB does not hide the overall table structure, the number of rows, the types of columns, or the approximate size of data in bytes. One method CryptDB uses to encrypt each data item is onioning, in which each data item is wrapped in layers of increasingly strong encryption.
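The onion idea can be sketched in a few lines. This is a toy, loosely modelled on the description above and not CryptDB's actual code or API: an inner deterministic layer (equal values give equal ciphertexts, so equality queries work) is wrapped in an outer randomized layer (equal values look unrelated). The ciphers here are improvised from HMAC-SHA256 purely for illustration, and every name (det_encrypt, rnd_encrypt, wrap, peel) is hypothetical.

```python
import hashlib
import hmac
import secrets


def _stream(key, nonce, length):
    """Derive a keystream of the requested length from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def det_encrypt(key, plaintext):
    """Deterministic layer: the IV is derived from the plaintext, so equal inputs match."""
    iv = hmac.new(key, b"iv" + plaintext, hashlib.sha256).digest()[:16]
    return iv + _xor(plaintext, _stream(key, iv, len(plaintext)))


def rnd_encrypt(key, plaintext):
    """Randomized layer: a fresh nonce makes equal inputs look unrelated."""
    nonce = secrets.token_bytes(16)
    return nonce + _xor(plaintext, _stream(key, nonce, len(plaintext)))


def layer_decrypt(key, ciphertext):
    """Strip one layer; both layers share the same nonce-plus-body layout."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _stream(key, nonce, len(body)))


K_DET = secrets.token_bytes(32)
K_RND = secrets.token_bytes(32)


def wrap(value):
    """Store a value as RND(DET(value)): strongest layer on the outside."""
    return rnd_encrypt(K_RND, det_encrypt(K_DET, value))


def peel(stored):
    """Remove only the outer layer, exposing the DET ciphertext for equality checks."""
    return layer_decrypt(K_RND, stored)


row_a, row_b = wrap(b"alice"), wrap(b"alice")
print(row_a != row_b, peel(row_a) == peel(row_b))
```

Peeling a layer is also where the leakage mentioned above comes from: once the outer layer is removed, anyone looking at the column can tell which rows hold equal values.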
Confidential Transactions
Gregory Maxwell designed Confidential Transactions (CT), a cryptographic tool to improve the privacy and security of bitcoin-style blockchains. It keeps the amounts transferred visible only to participants in the transaction. CT makes the transaction amounts and balances private on a blockchain through encryption, specifically additively homomorphic encryption. What users can see is the balances of their own accounts and the transactions that they are receiving. Zero knowledge proofs are needed to demonstrate to the blockchain that none of the encrypted outputs contains a negative value.
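A rough sketch of why the additive property matters here, using Pedersen-style commitments over a toy group. The group parameters, the second generator H, and the helper names (commit, product, balanced) are assumptions for illustration and are not taken from any real implementation. Anyone can check that inputs and outputs balance without seeing an amount, and the last case shows why a proof that no output is negative is also needed.

```python
import secrets

# Toy group: P = 2*Q + 1 with P and Q prime; G and H both generate the order-Q subgroup.
# In a real system nobody may know the discrete log of H with respect to G;
# at this size it is trivially computable, so this is illustration only.
P, Q, G, H = 2039, 1019, 4, 9


def commit(value, blind):
    """Commit to value with blinding factor blind: G^value * H^blind mod P."""
    return (pow(G, value % Q, P) * pow(H, blind % Q, P)) % P


def product(commitments):
    """Multiply commitments, which adds the hidden values and blinding factors."""
    acc = 1
    for c in commitments:
        acc = (acc * c) % P
    return acc


def balanced(inputs, outputs):
    """True when the hidden input and output totals (and blinds) match mod Q."""
    return product(inputs) == product(outputs)


r_in = secrets.randbelow(Q)
r_out1 = secrets.randbelow(Q)
r_out2 = (r_in - r_out1) % Q          # blinds must also sum to the input blind

spend = [commit(10, r_in)]
honest = [commit(7, r_out1), commit(3, r_out2)]     # 7 + 3 == 10
inflated = [commit(7, r_out1), commit(4, r_out2)]   # 7 + 4 != 10
negative = [commit(15, r_out1), commit(-5, r_out2)] # 15 + (-5) == 10

print(balanced(spend, honest), balanced(spend, inflated), balanced(spend, negative))
```

The third case balances even though it creates 15 units out of 10 by pairing them with a hidden negative output, which is exactly the situation the zero knowledge proofs mentioned above are there to rule out.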
The problem with Confidential Transactions is that they only allow for very limited proofs, as mentioned above. zkSNARKs and Zero Knowledge Proofs (ZKPs), which will be described in detail below, allow you to prove virtually any kind of transaction validation while keeping all inputs private.