On Transaction Fees, And The Fallacy of Market-Based Solutions
Of all the parts of the Ethereum protocol, aside from the mining function, the fee structure is perhaps the least set in stone. The current values, with one crypto operation taking 20 base fees, a new transaction taking 100 base fees, etc, are little more than semi-educated guesses, and harder data on exactly how much computational power a database read, an arithmetic operation and a hash actually take will certainly give us much better estimates of what exactly the ratios between the different computational fees should be. The other part of the question, that of exactly how much the base fee should be, is even more difficult to figure out; we have still not decided whether we want to target a certain block size, a certain USD-denominated level, or some combination of these factors, and it is very difficult to say what base fee, 0.001 or some other value, would be most appropriate. Ultimately, what is becoming more and more clear to us is that some kind of flexible fee system, one that allows consensus-based human intervention after the fact, would be best for the project.
When many people coming from Bitcoin see this problem, however, they wonder why we are having such a hard time with this issue when Bitcoin already has a ready-made solution: make the fees voluntary and market-based. In the Bitcoin protocol, there are no mandatory transaction fees; even an extremely large and computationally arduous transaction can get in with a zero fee, and it is up to the miners to determine what fees they require. The lower a transaction’s fee, the longer it takes for the transaction to find a miner that will let it in, and those who want faster confirmations can pay more. At some point, an equilibrium should be reached. Problem solved. So why not here?
The reality, however, is that in Bitcoin the transaction fee problem is very far from “solved”. The system as described above already has a serious vulnerability: miners have to pay no fees, so a miner can choke the entire network with an extremely large block. In fact, this problem is so serious that Satoshi chose to fix it with the ugliest possible path: set a maximum block size limit of 1 MB, or 7 transactions per second. Now, without the immensely hard-fought and politically laden debate that necessarily accompanies any “hard-forking” protocol change, Bitcoin simply cannot organically adapt to handle anything more than the 7 tx/sec limit that Satoshi originally placed.
And that’s Bitcoin. In Ethereum, the issue is even more problematic due to Turing-completeness. In Bitcoin, one can construct a mathematical proof that a transaction N bytes long will not take more than k*N time to verify for some constant k. In Ethereum, one can construct a transaction in less than 150 bytes that, absent fees, will run forever:
( TO, VALUE, ( PUSH, 0, JMP ), v, r, s )
In case you do not understand that, it’s the equivalent of 10: DO_NOTHING, 20: GOTO 10; an infinite loop. As soon as a miner publishes a block that includes that transaction, the entire network will freeze. In fact, because the halting problem is well known to be undecidable, it is not even possible to construct a filter that weeds out all infinite-looping scripts.
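To make the "absent fees" qualifier concrete, here is a minimal sketch of a toy stack machine that charges a fee per executed step. It is not the real Ethereum VM: only the PUSH and JMP opcode names come from the example above, while `run`, `STEP_FEE` and `fee_budget` are hypothetical names chosen for illustration. The point is only that a prepaid budget stops the loop without anyone having to decide in advance whether the script halts.

```python
# Toy stack machine, NOT the real Ethereum VM. Opcode names follow the
# ( PUSH, 0, JMP ) example; everything else is an assumption.
STEP_FEE = 1  # hypothetical cost charged per executed operation


def run(code, fee_budget):
    """Execute `code` until it ends or the prepaid fee budget runs out.

    Returns (status, steps_executed).
    """
    stack, pc, steps = [], 0, 0
    while pc < len(code):
        if fee_budget < STEP_FEE:
            return "out_of_fees", steps
        fee_budget -= STEP_FEE
        steps += 1
        op = code[pc]
        if op == "PUSH":
            stack.append(code[pc + 1])  # the next cell is the operand
            pc += 2
        elif op == "JMP":
            pc = stack.pop()  # jump to the address on top of the stack
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return "halted", steps


infinite_loop = ["PUSH", 0, "JMP"]
print(run(infinite_loop, fee_budget=1000))  # ('out_of_fees', 1000)
```

With no budget check the same script would never return; with one, the sender simply pays for 1000 steps and the node moves on.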
Thus, computational attacks on Ethereum are trivial, and even more restrictions must be placed in order to ensure that Ethereum remains a workable platform. But wait, you might say, why not just take the 1 MB limit, and convert it into a 1 million x base fee limit? One can even make the system more future-proof by replacing a hard cap with a floating cap of 100 times the moving average of the last 10000 blocks. At this point, we need to get deeper into the economics and try to understand what “market-based fees” are all about.
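For reference, the floating cap proposal could look roughly like the sketch below. The window of 10000 blocks and the multiplier of 100 come from the paragraph above; measuring block size in base-fee units, seeding the window with a single bootstrap value, and the `FloatingCap` class itself are assumptions made for illustration.

```python
from collections import deque

WINDOW = 10_000    # blocks in the moving average (from the text)
MULTIPLIER = 100   # cap = 100 times the moving average (from the text)


class FloatingCap:
    """Tracks recent block sizes and derives the cap for the next block."""

    def __init__(self, initial_size):
        # Assumption: the window starts out seeded with one bootstrap value.
        self.sizes = deque([initial_size], maxlen=WINDOW)
        self.total = initial_size

    def cap(self):
        """Largest block size currently allowed, rounded down."""
        return MULTIPLIER * self.total // len(self.sizes)

    def add_block(self, size):
        """Accept a block of the given size, or reject it if over the cap."""
        if size > self.cap():
            raise ValueError("block exceeds floating cap")
        if len(self.sizes) == WINDOW:
            self.total -= self.sizes[0]  # oldest entry is about to be evicted
        self.sizes.append(size)
        self.total += size
```

Note that every accepted block feeds back into the average, so the cap is only as trustworthy as the blocks that miners choose to publish.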
Crypto, Meet Pigou
In general terms, an idealized market, or at least one specific subset of a market, can be defined as follows. There exists a set of sellers, S(1) … S(n), who are interested in selling a particular resource, where seller S(i) incurs a cost c(i) from giving up that resource. We can say c(1) < c(2) < … < c(n) for simplicity. Similarly, there exist some buyers, B(1) … B(n), who are interested in gaining that resource, where buyer B(i) receives a gain g(i), and g(1) > g(2) > … > g(n). Then, an order matching process happens as follows. First, one locates the last k where g(k) > c(k). Then, one picks a price between those two values, say at p = (g(k) + c(k))/2, and for each i up to k, S(i) and B(i) make a trade, where S(i) gives the resource to B(i) and B(i) pays p to S(i). All parties benefit, and the benefit is the maximum possible; if S(k+1) and B(k+1) also made a transaction, then since c(k+1) > g(k+1) the transaction would actually have negative net value to society. Fortunately, it is in everybody’s interest to make sure that they do not participate in unfavorable trades.
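The matching process just described can be sketched in a few lines. The function name `clear_market` and the sample numbers are made up for illustration; the logic follows the description: sort costs ascending and gains descending, find the last k with g(k) > c(k), and price at the midpoint.

```python
def clear_market(costs, gains):
    """Match the cheapest sellers with the most eager buyers.

    Returns (k, price): the number of trades and the single clearing price,
    or (0, None) if no trade is mutually beneficial.
    """
    costs = sorted(costs)                # c(1) < c(2) < ... < c(n)
    gains = sorted(gains, reverse=True)  # g(1) > g(2) > ... > g(n)
    k = 0
    for c, g in zip(costs, gains):
        if g <= c:
            break  # from here on, trading would destroy value
        k += 1
    if k == 0:
        return 0, None
    price = (gains[k - 1] + costs[k - 1]) / 2  # p = (g(k) + c(k)) / 2
    return k, price


k, price = clear_market(costs=[1, 2, 4, 7], gains=[9, 6, 5, 3])
print(k, price)  # 3 4.5
```

In the sample, the fourth pair is skipped because the seller's cost of 7 exceeds the buyer's gain of 3; every seller who does trade has a cost below 4.5 and every buyer who does trade has a gain above it.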
The question is, is this kind of market the right model for Bitcoin transactions? To answer this question, let us try to put all of the players into roles. The resource is the service of transaction processing, and the people benefitting from the resource, the transaction senders, are also the buyers paying transaction fees. So far, so good. The sellers are obviously the miners. But who is incurring the costs? Here, things get tricky. For each individual transaction that a miner includes, the costs are borne not just by that miner, but by every single node in the entire network. The cost per transaction is tiny; a miner can process a transaction and include it in a block for less than $0.00001. But that cost is being paid by thousands of nodes all around the world.
It gets worse. Suppose that the net cost to the network of processing a transaction is close to $0.05. If the transaction fee were also close to $0.05, the system would still be in balance. But what is the equilibrium transaction fee going to be? For an individual miner, it is around $0.00001, the miner’s own cost. If a transaction with a fee of $0.00001 is included, the miner breaks even, and the remaining $0.04999 worth of costs will be paid by the rest of the network together: a cryptographic tragedy of the commons.
Now, suppose that the mining ecosystem is more oligarchic, with one pool controlling 25% of all mining power. What are the incentives then? Here, it gets more tricky. The mining pool can actually choose to set its minimum fee higher, perhaps at $0.001. The people who were sending transactions with fees between $0.00001 and $0.00099 before now have the incentive to increase their fees to make sure this pool confirms their transactions; otherwise, they would need to wait an average of 3.3 minutes longer. Thus, the fewer miners there are, the higher fees go, even though a reduced number of miners actually means a lower network cost of processing all transactions.
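The 3.3 minute figure can be reproduced with a short calculation. The sketch assumes Bitcoin's 10 minute target block time and treats block discovery as memoryless, so that the miners willing to include a transaction find blocks at a rate proportional to their share of mining power; `expected_wait` is a hypothetical helper name.

```python
BLOCK_INTERVAL_MIN = 10.0  # Bitcoin's target block time, in minutes


def expected_wait(accepting_share):
    """Mean minutes until a block from a miner that accepts the transaction."""
    return BLOCK_INTERVAL_MIN / accepting_share


# A low-fee transaction is rejected by the 25% pool, so only the other 75%
# of mining power will confirm it.
extra = expected_wait(0.75) - expected_wait(1.0)
print(round(extra, 1))  # 3.3
```

The larger the pool's share, the steeper the penalty: at 50% the same calculation gives 10 extra minutes.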
From the above discussion, what should become painfully clear is that transaction processing simply is not a market, and therefore trying to apply market-like mechanisms to it is an exercise in random guessing at best, and a scalability disaster at worst. So what are the alternatives? The economically ideal solution is one that has often been brought up in the context of global warming, perhaps the largest geopolitical tragedy of the commons scenario in the modern world: Pigovian taxes.
Price Setting without A Market
The way a Pigovian tax works is simple. Through some mechanism, the total net cost of consuming a certain quantity of a common resource (eg. network computation, air purity) is calculated. Then, everyone who consumes that resource is required to pay that cost for every unit of the resource that they consume (or for every unit of pollution that they emit). The challenge in Pigovian taxation, however, is twofold. First, who gets the revenue? Second, and more importantly, there is no way to opt out of pollution, and thus no way for the market to extract people’s preferences about how much they would need to gain in order to suffer a given dose of pollution; thus, how do we set the price?
In general, there are three ways of solving this problem:
1. Philosopher kings set the price, and disappear as the price is set in stone forever.
2. Philosopher kings maintain active control over the price.
3. Some kind of democratic mechanism sets the price.
There is also a fourth way, some kind of market mechanism which randomly doles out extra pollution to certain groups and attempts to measure the extent to which people (or network nodes in the context of a cryptocurrency) are willing to go to avoid that pollution; this approach is interesting but heavily underexplored, and I will not attempt to examine it at this point in time.
Our initial strategy was (1). Ripple’s strategy is (2). Now, we are increasingly looking to (3). But how would (3) be implemented? Fortunately, cryptocurrency is all about democratic consensus, and every cryptocurrency already has at least two forms of consensus baked in: proof of work and proof of stake. I will show two very simple protocols for doing this right now:
Proof of work Protocol
- If you mine a block, you have the right to set a value in the “extra data field”, which can be anywhere from 0-32 bytes (this is already in the protocol)
- If the first byte of this data is 0, nothing happens
- If the first byte of this data is 1, we set block.basefee = block.basefee + floor(block.basefee / 65536)
- If the first byte of this data is 255, we set block.basefee = block.basefee - floor(block.basefee / 65536)
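As a minimal sketch, the miner vote rule above could be written as follows. The function name `apply_miner_vote` is hypothetical, and two behaviors are assumptions not stated in the rules: empty extra data counts as no vote, and any first byte other than 1 or 255 leaves the fee unchanged.

```python
STEP_DIVISOR = 65536


def apply_miner_vote(basefee, extra_data):
    """Return the new base fee given a block's extra data (0-32 bytes)."""
    if len(extra_data) > 32:
        raise ValueError("extra data must be at most 32 bytes")
    vote = extra_data[0] if extra_data else 0  # assumption: empty means no vote
    step = basefee // STEP_DIVISOR             # floor(basefee / 65536)
    if vote == 1:
        return basefee + step
    if vote == 255:
        return basefee - step
    return basefee  # 0, and (by assumption) any other byte, changes nothing


print(apply_miner_vote(65_536_000, b"\x01"))  # 65537000
```

Two properties fall out of the arithmetic. Each vote moves the fee by about 1/65536, so doubling it takes roughly 45,000 consecutive up-votes. And because of the floor, a base fee below 65536 units has a step of zero and can never move, so the fee would need to be denominated in sufficiently small units.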
Proof of stake Protocol
- After each block, calculate h = sha256(block.parenthash + address) * block.address_balance(address) for each address
- If h > 2^256 / difficulty, where difficulty is a set constant, that address can sign either 1, 0 or 255 and create a signed object of the form ( val, v, r, s )
- The miner can then include that object in the block header, giving the miner and the stakeholder some minuscule reward.
- If the data is 1, we set block.basefee = block.basefee + floor(block.basefee / 65536)
- If the data is 255, we set block.basefee = block.basefee - floor(block.basefee / 65536)
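The stakeholder version can be sketched the same way. Several details here are assumptions made for illustration: the parent hash and address are treated as byte strings and concatenated, the hash is read as a big-endian integer, the threshold uses integer division, and signature creation and checking for the ( val, v, r, s ) object are left out entirely. The names `is_eligible` and `apply_stake_vote` are hypothetical.

```python
import hashlib

STEP_DIVISOR = 65536


def is_eligible(parent_hash: bytes, address: bytes, balance: int,
                difficulty: int) -> bool:
    """Check h = sha256(parenthash + address) * balance > 2^256 / difficulty."""
    digest = hashlib.sha256(parent_hash + address).digest()
    h = int.from_bytes(digest, "big") * balance
    return h > 2**256 // difficulty


def apply_stake_vote(basefee: int, val: int) -> int:
    """Apply a signed vote of 1 (raise), 255 (lower) or 0 (no change)."""
    step = basefee // STEP_DIVISOR  # floor(basefee / 65536)
    if val == 1:
        return basefee + step
    if val == 255:
        return basefee - step
    return basefee
```

Because the hash is multiplied by the balance, an address's chance of clearing the threshold grows with its holdings, and an address with a zero balance can never vote.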
The two protocols are functionally close to identical; the only difference is that in the proof of work protocol miners decide on the basefee and in the proof of stake protocol ether holders do. The question is, do miners and ether holders have their incentives aligned to set the fee fairly? If transaction fees go to miners, then miners clearly do not. However, if transaction fees are burned, and thus their value goes to all ether holders proportionately through reduced inflation, then perhaps they do. Miners and ether holders both want to see the value of their ether go up, so they want to set a fee that makes the network more useful, both in terms of not making it prohibitively expensive to make transactions and in terms of not setting a high computational load. Thus, in theory, assuming rational actors, we will have fees that are at least somewhat reasonable.
Is there a reason to go one way or the other in terms of miners versus ether holders? Perhaps there is. Miners have the incentive to see the value of ether go as high as possible in the short term, but perhaps not so much in the long term, since a prolonged rise eventually brings competition which cancels out the miners’ increased profit. Thus, miners might end up adopting a looser policy that imposes higher costs (eg. data storage) on miners far down the line. Ether holders, by contrast, seem to have a longer term interest. However, miners are somewhat “locked in” to mining ether specifically, especially if semi-specialized or specialized hardware gets involved, whereas ether holders can easily hop from one market to the other. Furthermore, miners are less anonymous than ether holders. Thus, the issue is not clear cut; if transaction fees are burned one can go either way.