Information processing tasks require various digital resources including storage, network bandwidth and/or computation. In the physical world, we often measure the value of resources in terms of their monetary cost. For example, a kWh might cost 10¢, a particular CPU might cost $283 per unit, and 16GB of a certain DRAM module might cost $324. Understanding the cost of required resources helps us evaluate the utility of performing a particular task.
Wickr’s first research program is designed to accomplish the following goal:
Develop an egalitarian and moderately hard to evaluate cryptographic hash function.
By “egalitarian” we mean that the cost per evaluation of the hash function should vary as little as possible across real world computational devices. In particular, we want to minimize the cost advantage per evaluation of a custom built device (e.g. using an ASIC) over a general purpose off-the-shelf PC. To ensure that honest parties can still evaluate the hash function we ask that the cost not be too high. But to prevent an adversary from evaluating the hash function too often (or at least too rapidly) we still require a moderate cost per fresh evaluation. More practically, we would like the cost to be scalable (e.g. to accommodate for changing technology and different cost/benefit ratios). Therefor, we want the hash function to come equipped with an explicit (set of) parameter(s) with which to scale (different aspects of) the cost. Concretely, one parameter could regulate the amount of memory required to evaluate the function while another regulates the amount of computation required.
Applications: An important meta-use for such a moderately hard hash function is to meter out access to a given resource or capability. Here are some specific practical examples:
- Password Protection: To mitigate brute force guessing of passwords (e.g., via dictionary attack), it has long been standard practice to first apply a moderately hard hash (e.g., PBKDF2, bcrypt, scrypt, Argon2id) to the password before the resulting digest is stored (or used by some higher-level security protocol). This way, if any information that depends on the password hash leaks into the adversaries hands, they will be forced to repeatedly evaluate the moderately hard hash function for each new password guess they make.
- Sybil Attacks and Distributed Consensus: A Sybil attack on a peer-to-peer system takes place when the intended behavior of the system is subverted by an adversary creating (or controlling) a large number of identities. For example, a p2p lottery is vulnerable to Sybil (like) attacks if an adversary can significantly skew their probability of winning by obtaining a disproportionate number of lottery tickets. Recently a great deal of attention has been focused on preventing such attacks on the distributed consensus protocols underlying most major cryptocurrencies (e.g. for Bitcoin, Ethereum, etc.). Roughly speaking, these protocols can be viewed as a sequence of lotteries (e.g. roughly every 10 minutes for Bitcoin) where the winner of a lottery gets to decide on a fresh block of transactions to be add to the blockchain. To prevent any one party having to much influence over how the blockchain grows (e.g. in order to mount double-spending attacks) it is crucial that no one party is able to win these lotteries too often. So, to prevent this, obtaining each new lottery ticket requires first evaluating some underlying moderately hard hash function (e.g. SHA256 for Bitcoin). In other words, the entire security of the cryptocurrency depends on choosing the right underlying hash function. More generally, one way to significantly raise the cost of a Sybil attack is to require (several) evaluations of a moderately hard hash function in order to create an account.
- Denial of Service Mitigation: A common means of mounting denial of service attack is to overwhelm the limited resources of a system with bogus request thereby preventing honest parties from making use of the system. A classic example of such an attack is spam email consumes both the resources of the email servers involved as well as the time and energy of the people receiving the spam email. To raise the cost of mounting such attacks a we can design systems in a way that making use of the limited resources requires multiple evaluations of a moderately hard hash function. For example, to combat spam email a fresh email is only accepted by the server if the email comes with an added nonce x such that hash(email, date, sender, recipient, x) < threshold. On the one hand, on expectation many nonces need to be tried before one satisfying the inequality is found ensuring that the sender of the email insured a moderate amount of cost for this email. On the other hand, testing that an email is accompanied by valid nonce requires only a single evaluation of the hash function. What’s more, by adjusting decreasing/increasing the threshold value the cost to the sender can automatically be adjusted at no further expense to the email server. Similar mechanism can be designed for other services susceptible to DoS attacks such as say an SQL server, Webserver, etc. whenever the invocation of the limited resources is easily tied to a unique string.
For each of these use cases, we see that honest parties too must perform the moderately hard task. This has at least two consequences motivating the goals for this research program as laid out above. First, the cost of performing the task should not be prohibitively large (e.g., password hashing might take about .5 second for a login server or 2 seconds for password-based file/disk encryption). Second, the cost should be egalitarian to ensure that the adversary does not gain any significant advantage by investing in special hardware.
The Path Forward: We break down the task of building and evaluating a suitable moderately hard hash function into can be, achieving via the following pair of steps.
- Develop a new way to measure the resources required to perform a computational task (e.g. memory, computation, network, etc.) that accurately reflect its monetary cost of performing the task on a general purpose and on task-specific hardware. Moreover, the cost predicted by the measure should be as close as possible on the different types of devices, nor should it be possible to achieving significantly lower amortized cost at scale.To get a feel for the challenge here it is instructive to look at how we normally measure computational resources, why that measure fails at the above stated goal and how that failure has resulted in undesirable trends in practice.Some (by no means exhaustive) highlights in the ongoing effort to developing more egalitarian complexity measures can be found in . Broadly speaking, these works aim to leverage the observation that costs associated with memory tend to be much more egalitarian than purely computational ones.
- Define a concrete, practical, moderately hard hash function. That is, a parametrized cryptographic hash function for which the new complexity measure scales smoothly in the functions parameters. To ensure an accurate cost estimate we need that the only computational resources required (in significant quantities) to evaluate the function are those accounted for by the complexity measure. Finally, to prevent adversaries gaining amortized cost advantages at scale the function must also be amortization-free. That is no significant reduction in the resources required per evaluation of the function can be achieved even, say, on related inputs and/or on parallel hardware.Some new (practical) moderately hard functions were proposed in . Other candidates can be found in the list of entrants to the Password Hashing Competition as well as buried inside various proof-of-work based consensus protocols for cryptocurrencies (e.g. .
In a sense, the goal is to build a hash function that represents one (moderately large) unit of computation measured in monetary cost per evaluation.
Traditionally, we often measure computational complexity in terms of the number of computational steps required. For example, we may speak of the number of machine code instructions, circuit gates, swaps, comparisons or the additions and/or multiplication required. Indeed, for a task requiring little other resources (e.g. memory access, network bandwidth, human interaction, etc.) measuring complexity this way does give a reasonably accurate estimate of the cost required to perform the task on different types of hardware.
Unfortunately, these measures end up being far from egalitarian. In fact, regardless of the exact way we measure it pure computation ends up being far cheaper (e.g. 10,000 times so) to perform with ASICs than with general purpose CPUs (at least at scale). In practice, this asymmetry in cost has been exploited quite effectively in the Bitcoin mining industry. In particular, for those willing to invest sufficient funds it often turns out to be far more cost effective to make use of special purpose mining hardware. Unfortunately, the net result is a small subset of people end up with a large amount of influence over the Bitcoin consensus protocol disproportionate to the amount of financial resources they actually spend. This directly contravenes the goal of decentralizing control over Bitcoin. A better understanding of which kinds of computational tasks are more egalitarian can help us build more decentralized blockchain protocols.