The State of Scaling Ethereum

A concise overview of the challenges and solutions to scaling the Ethereum Network


Ethereum developers have long known that scaling the network is a subject worth discussion and investment. The matter did not quite spill out from the developer community, however, until late 2017, when a decentralized application (dApp) named CryptoKitties attracted so much traffic it began slowing down the network. On top of network latency, the price of gas — the fee required to run each operation within a contract on the Ethereum blockchain — soared as users competed for their transactions to be validated.

Though the story is now over-reported and exhausted, the CryptoKitties situation revealed that Ethereum in its current state may not be prepared for the amount of traffic that would accompany the launch of a successful dApp. Slow speeds and volatile usage-costs drive people away from platforms and applications. DApp developers are charging ahead to release the first widely-adopted application, so Ethereum developers must continue working to scale the blockchain.

The “Trilemma”

One theory of blockchain technology is that a network can only support two of the following three properties: security, decentralization, and scalability. This “trilemma,” as it has become known, has been the challenge for Ethereum developers as they seek to maintain the core tenets of blockchain (decentralization and security) while scaling it for widespread adoption and implementation. Some of the more immediate fixes for scalability, for instance, severely impact security or decentralization:

  1. The use of altcoins is one theoretical solution to scalability concerns. The option is to abandon the idea of a single blockchain on which all transactions occur, and instead adopt a model where multiple altcoins coexist, each operating on its own blockchain. The reduced traffic per blockchain would allow this constellation of blockchains to scale. However, with fewer nodes acting on each blockchain, each blockchain is more susceptible to attack by malicious users. The use of altcoins, therefore, maintains decentralization and improves scalability, but severely impacts security.
  2. Increasing block size is another theoretical solution to scalability concerns. If the Ethereum community voted to increase the size of each block, all nodes could still perform all operations, but more transactions could be performed in the same amount of time, therefore speeding up the network. With larger blocks, however, verifying each block requires more computing resources, and fewer and fewer nodes will be able to supply them. The result would be a future where the network is maintained by a handful of supercomputers with the tremendous processing power needed to verify each block. Increased block size, therefore, maintains security and improves scalability, but severely reduces how decentralized the network is.

The primary concerns in blockchain development have been security and decentralization. As a result, the primary hindrance to scalability is that every node currently must process every transaction. Though undeniably secure and decentralized, this process does not leave much room for scalable growth. The question becomes, therefore, how do we engineer Ethereum to be able to scale without compromising security and decentralization?

There are four primary protocols in development that will address the issues of scalability. Sharding, Plasma, and Raiden were proposed specifically to help Ethereum scale. The fourth protocol, Casper, is much broader in scope, but will have scalability implications on top of many others.

Sharding

Sharding is one method of scaling that maintains all transactions on the original blockchain, and is therefore known as an “on-chain” solution. Sharding addresses the issue that all transactions on Ethereum are processed sequentially, since every node must process every transaction. Sharding allows operations to run simultaneously alongside one another, increasing the number of transactions per second the overall blockchain can process. With sharding, the Ethereum network is divided into multiple groups of nodes. Each of these groups is a shard, and each shard processes all the transactions that occur within that group. This allows the shards to process different transactions simultaneously.

Within each shard, certain nodes called “collators” would regularly create a “collation,” or a set of information about that shard. Each collation contains the following information:

  1. Information about which shard the collation came from.
  2. Information about the state of the shard before the transactions are applied.
  3. Information about the state of the shard after the transactions are applied.
  4. Digital signatures from ⅔ of the collators validating the information in the collation.
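
The four fields listed above can be pictured as a small record. The Python sketch below is illustrative only: the class name `Collation`, its field names, and the `has_enough_signatures` helper are hypothetical, and signatures are modeled as plain collator identifiers rather than real cryptographic signatures.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class Collation:
    """Toy model of a collation; every name here is hypothetical."""
    shard_id: int          # which shard the collation came from
    pre_state_root: str    # shard state before the transactions are applied
    post_state_root: str   # shard state after the transactions are applied
    signers: set = field(default_factory=set)  # stand-in for collator signatures

    def has_enough_signatures(self, total_collators: int) -> bool:
        # Require signatures from at least two thirds of the shard's collators.
        return Fraction(len(self.signers), total_collators) >= Fraction(2, 3)


collation = Collation(
    shard_id=3,
    pre_state_root="0xaaa",
    post_state_root="0xbbb",
    signers={"c1", "c2", "c3", "c4"},
)
print(collation.has_enough_signatures(total_collators=6))  # 4 of 6 collators signed
print(collation.has_enough_signatures(total_collators=9))  # 4 of 9 collators signed
```

With four signers, the collation meets the threshold in a shard of six collators but not in a shard of nine.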

Across the network, the collations from each shard are aggregated into a single block and added to the Ethereum blockchain. Sharding, therefore, allows these groups of nodes to process and verify transactions, while the only information added to the blockchain is the condensed information found in collations. If, for instance, there are ten shards, and each shard processes five transactions, then the next block would include a record of fifty transactions, rather than the handful it could include if every node had to process every transaction sequentially.
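
The ten-shard arithmetic can be checked with a short Python sketch. It is a toy model under stated assumptions: the function `build_block` and the summary fields are hypothetical, and each shard is assumed to process exactly five transactions in the same period.

```python
from typing import Dict, List


def build_block(shard_batches: Dict[int, List[str]]) -> List[dict]:
    """Aggregate one condensed summary per shard into a single block (toy model)."""
    return [
        {"shard_id": shard_id, "tx_count": len(txs)}
        for shard_id, txs in sorted(shard_batches.items())
    ]


# Ten shards, each processing five transactions at the same time.
batches = {sid: [f"tx-{sid}-{i}" for i in range(5)] for sid in range(10)}
block = build_block(batches)
total_transactions = sum(entry["tx_count"] for entry in block)

print(len(block))           # number of collation summaries in the block
print(total_transactions)   # transactions represented by the block
```

The block holds only ten summaries, yet it accounts for fifty transactions.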

Two issues arise with sharding. First, each shard must contain enough nodes to ensure network security. If a shard contains too few nodes, ⅔ of the collators could be compromised and begin acting maliciously. Second, there is no easy way to process a transaction that occurs between two shards instead of within just one (an issue that doesn’t exist with one, whole blockchain). The current method requires a cumbersome series of receipts and proofs.

Plasma

Plasma is another method of scaling that processes transactions “off-chain,” i.e. not on the primary Ethereum blockchain. Plasma allows for many blockchains (called “child chains”) to stem from the original blockchain (called the “root chain”). Each child chain, therefore, can process and maintain its own records of transactions while relying on the underlying security of the root chain. With Plasma, the root chain is the global enforcer of the computation happening on all the child chains. The root chain, however, only needs to perform computation if a dispute arises within one of the child chains. This method allows an entire network of child chains to divide up the transactions on the blockchain in order to optimize speed and efficiency. If the nodes on a child chain desire, they can submit an exit transaction and export a record of their transactions to the root chain.
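
The division of labor between root chain and child chain can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Plasma design itself: the classes `RootChain` and `ChildChain` and their methods are hypothetical, and dispute handling and any cryptographic commitments are left out entirely.

```python
class RootChain:
    """Toy root chain: it only stores what child chains export on exit."""

    def __init__(self):
        self.records = []

    def record_exit(self, chain_name, balances):
        self.records.append((chain_name, dict(balances)))


class ChildChain:
    """Toy child chain that processes its own transfers locally."""

    def __init__(self, name, root, balances):
        self.name = name
        self.root = root
        self.balances = dict(balances)
        self.tx_count = 0  # transactions handled without touching the root chain

    def transfer(self, sender, receiver, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient funds on child chain")
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount
        self.tx_count += 1

    def exit(self):
        # Export the resulting record to the root chain.
        self.root.record_exit(self.name, self.balances)


root = RootChain()
child = ChildChain("child-1", root, {"alice": 100, "bob": 0})
for _ in range(3):
    child.transfer("alice", "bob", 10)
child.exit()

print(child.tx_count)     # transfers processed on the child chain
print(len(root.records))  # records the root chain had to store
```

Three transfers are processed on the child chain, while the root chain stores a single exit record.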

This method has one particular strength. Each Plasma chain can have its own qualities and set of standards. This means that different child chains can support transactions with varying requirements (e.g. privacy), while all still occurring within the same, secure ecosystem.

 

Raiden

Raiden is another off-chain scaling solution that allows nodes to maintain a record between them without requiring the root chain to verify every transaction. Two nodes can open up a “state channel,” a two-way channel between users. “Messages,” in the form of transactions, pass between the two nodes and are signed by each party to ensure immutability. Raiden is particularly useful for payments that are frequent and expected: for example, a user who knows they will pay a company $10 a week for a service, or a user who knows they will spend money at their local grocery store regularly. With transactions recorded and verified between these two nodes instead of on each block, the root chain is freed of an immense amount of traffic. At any time, either participant in a state channel can choose to close the channel, and the net result of all the transactions is exported to the root blockchain and included in the next block. That means that after a year of subscribing to the $10/week service, the user could have the blockchain verify one $520 transaction instead of 52 separate $10 transactions.
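
The subscription example can be worked through in a minimal Python sketch. The `StateChannel` class, its method names, and the up-front `deposit` are assumptions made for illustration, and message signing is omitted; the point is only that many off-chain updates collapse into one on-chain settlement.

```python
class StateChannel:
    """Toy one-directional payment channel; all names are hypothetical."""

    def __init__(self, payer, payee, deposit):
        self.payer = payer
        self.payee = payee
        self.deposit = deposit  # assumed: funds locked when the channel opens
        self.paid = 0           # running total agreed off-chain
        self.updates = 0        # number of off-chain messages exchanged

    def pay(self, amount):
        if self.paid + amount > self.deposit:
            raise ValueError("payment exceeds channel deposit")
        self.paid += amount
        self.updates += 1

    def close(self):
        # Only the net result is exported to the root chain.
        return {"from": self.payer, "to": self.payee, "amount": self.paid}


channel = StateChannel("user", "service", deposit=520)
for _week in range(52):
    channel.pay(10)
settlement = channel.close()

print(channel.updates)       # off-chain payments made during the year
print(settlement["amount"])  # single amount settled on the root chain
```

Fifty-two weekly updates stay between the two parties, and the root chain sees one settlement of 520.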

The Raiden solution comes with one primary caveat and one primary benefit. The caveat is that nodes can only communicate with their “neighbors,” meaning that if node A and node B have a state channel open, and node B and node C have a state channel open, node A cannot send funds directly to node C. However, transactions can be forwarded through channels in such a way that they cannot be stolen or locked up along the way. Node A could send a transaction to node C by using node B as an intermediary in such a way that node B could not possibly steal the funds. As the primary benefit, Raiden drastically reduces gas costs for each transaction. Transactions that happen off-chain between nodes require less gas to process than transactions that occur on the root chain.
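
The text does not name the mechanism that keeps node B from stealing forwarded funds. One common technique for this kind of forwarding is a hash-locked transfer, sketched below in Python as an assumption rather than as a description of Raiden's actual protocol. The class `HashLockedTransfer` is hypothetical, node C is assumed to pick the secret, and the expiry timeouts a real system would need are omitted.

```python
import hashlib
import secrets


class HashLockedTransfer:
    """Funds that can only be claimed by revealing the preimage of lock_hash."""

    def __init__(self, sender, receiver, amount, lock_hash):
        self.sender = sender
        self.receiver = receiver
        self.amount = amount
        self.lock_hash = lock_hash
        self.claimed = False

    def claim(self, secret: bytes) -> bool:
        # The claim succeeds only once, and only with the correct secret.
        if self.claimed or hashlib.sha256(secret).digest() != self.lock_hash:
            return False
        self.claimed = True
        return True


# Node C picks a secret and shares only its hash.
secret = secrets.token_bytes(32)
lock_hash = hashlib.sha256(secret).digest()

# A locks funds for B, and B locks the same amount for C, under the same hash.
a_to_b = HashLockedTransfer("A", "B", 10, lock_hash)
b_to_c = HashLockedTransfer("B", "C", 10, lock_hash)

b_guess_worked = a_to_b.claim(b"wrong guess")  # B cannot claim without the secret
c_was_paid = b_to_c.claim(secret)              # C claims, which reveals the secret to B
b_was_paid = a_to_b.claim(secret)              # B can now claim from A

print(b_guess_worked, c_was_paid, b_was_paid)
```

In this sketch B is reimbursed only by using the secret that C reveals when collecting the payment, so B cannot take A's funds without C being paid first.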

Casper

Casper is a protocol by which Ethereum’s current Proof of Work (PoW) model will change to Proof of Stake (PoS). With PoW, miners must expend energy to solve a cryptographic puzzle and mine a block. They are rewarded if they solve the puzzle, but the process requires immense energy (and will continue to require more and more). This is costly and energy-inefficient: maintaining the PoW model currently costs an estimated $1.2 billion USD per year.

In PoS, “validators” replace miners, and they “validate” (instead of mine) blocks onto the blockchain. Instead of expending energy on a certain block, validators stake their funds on a certain block. The block that has the most funds staked on it is verified and added to the blockchain. Essentially, validators “bet” that a certain block will be added to the chain by locking their funds in a contract until the next block is added. They are rewarded if they placed their bet on the correct block. They lose their funds if they act maliciously by trying to validate a block with incorrect or corrupt information.
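
The betting mechanism described above can be sketched in Python. This is a toy model under stated assumptions: the function `settle_round`, the flat `reward` amount, and the rule that honest validators who backed a losing block simply keep their stake are all hypothetical choices, not Casper's actual reward and penalty rules.

```python
def settle_round(stakes, bets, malicious, reward=5):
    """Pick the block with the most stake behind it and update validator funds.

    stakes:    validator -> funds locked for this round
    bets:      validator -> block the validator staked on
    malicious: set of validators caught validating corrupt blocks
    """
    totals = {}
    for validator, block in bets.items():
        totals[block] = totals.get(block, 0) + stakes[validator]
    winner = max(totals, key=totals.get)  # ties go to the first block seen

    updated = {}
    for validator, stake in stakes.items():
        if validator in malicious:
            updated[validator] = 0               # malicious validators lose their funds
        elif bets.get(validator) == winner:
            updated[validator] = stake + reward  # correct bet is rewarded
        else:
            updated[validator] = stake           # assumed: honest losers keep their stake
    return winner, updated


stakes = {"a": 100, "b": 60, "c": 50, "d": 40}
bets = {"a": "block-X", "b": "block-Y", "c": "block-Y", "d": "block-Z"}
winner, updated = settle_round(stakes, bets, malicious={"d"})

print(winner)
print(updated)
```

Here block-Y wins with 110 staked against 100 for block-X, validators b and c are rewarded, and the malicious validator d loses its funds.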

Conceptually, this shift should protect the blockchain against malicious attacks. With PoW, a failed attack on the blockchain costs the attacker time and power. With PoS, a failed attack on the blockchain directly costs the attacker money, as they immediately lose all the funds staked on the wrong block.

The final rollout of Casper will be preceded by two iterations of the protocol: Casper FFG and Casper CBC. These iterations will be deployed on Ethereum in order to test PoS on the network and identify potential issues before completely switching over.

Casper FFG

Casper FFG (Friendly Finality Gadget) will be the first iteration of Casper, likely released during Ethereum’s next hard fork, Constantinople. In Casper FFG, blocks are still mined with PoW. However, every fifty blocks, validators step in to test the PoS mechanism. This “checkpoint” uses the PoS protocol to assess and confirm finality. Finality means that an operation is complete and entirely immutable. In FFG, the validators stake funds to finalize the previous fifty blocks in the chain.
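
The checkpoint schedule can be illustrated with a short Python sketch. Only the fifty-block interval comes from the description above. The helper names are hypothetical, the two-thirds stake threshold is an assumption not stated here, and counting the checkpoint block itself among the fifty finalized blocks is one possible reading.

```python
CHECKPOINT_INTERVAL = 50  # validators step in every fifty blocks


def is_checkpoint(height: int) -> bool:
    return height > 0 and height % CHECKPOINT_INTERVAL == 0


def blocks_finalized_at(height: int) -> range:
    """Blocks covered by the checkpoint at `height` (assumed to include it)."""
    if not is_checkpoint(height):
        raise ValueError("height is not a checkpoint")
    return range(height - CHECKPOINT_INTERVAL + 1, height + 1)


def has_supermajority(voting_stake: int, total_stake: int) -> bool:
    # Assumed threshold: at least two thirds of all staked funds must vote.
    return 3 * voting_stake >= 2 * total_stake


covered = blocks_finalized_at(100)
print(is_checkpoint(100), is_checkpoint(120))
print(covered[0], covered[-1], len(covered))
print(has_supermajority(700, 1000), has_supermajority(600, 1000))
```

Under these assumptions, the checkpoint at block 100 finalizes blocks 51 through 100 once enough stake has voted for it.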

Casper CBC

Casper CBC (Correct-by-Construction) will be the second iteration of Casper. Typically, a protocol is formally specified and then proven to satisfy all the given properties. With CBC, the PoS protocol is only partially specified at first, and is then refined until it satisfies the properties it was meant to have. Essentially, instead of being fully defined from the beginning, the protocol is actively and constantly derived. This is achieved through the implementation of a proof known as an “ideal adversary,” which is able to raise exceptions, faults, and future failures of the protocol.

The final Casper protocol will likely be deployed with learnings from both FFG and CBC. The protocol is much broader in scope than just scalability, including energy and security improvements as well. Because each node will need less energy to add a block to the chain, however, the protocol should also ease the network's current scalability constraints. Though Casper is not being developed specifically to address scaling concerns, it should have a positive impact on the ability of the network to handle higher traffic.

Looking Down the Road

The four proposals above are not mutually exclusive — they can and likely will all be implemented to some degree to help the Ethereum network scale over time. Scaling will be top of mind for Ethereum developers in 2018. As more and more popular dApps are developed and launched, we will see a continuous fine-tuning of the scaling options available in order to allow Ethereum its full potential.