A concise overview of the challenges and solutions to scaling the Ethereum Network
Ethereum developers have long known that scaling the network is a subject worth discussion and investment. The matter did not quite spill out from the developer community, however, until late 2017, when a decentralized application (dApp) named CryptoKitties attracted so much traffic it began slowing down the network. On top of network latency, the price of gas — the fee required to run each operation within a contract on the Ethereum blockchain — soared as users competed for their transactions to be validated.
Though the story is now over-reported and exhausted, the CryptoKitties situation revealed that Ethereum in its current state may not be prepared for the amount of traffic that would accompany the launch of a successful dApp. Slow speeds and volatile usage-costs drive people away from platforms and applications. DApp developers are charging ahead to release the first widely-adopted application, so Ethereum developers must continue working to scale the blockchain.
One theory of blockchain technology is that a network can fully provide only two of the following three properties: security, decentralization, and scalability. This “trilemma,” as it has become known, has been the central challenge for Ethereum developers as they seek to maintain the core tenets of blockchain (decentralization and security) while scaling it for widespread adoption and implementation. Some of the more immediate fixes for scalability, for instance, severely impact security or decentralization:
The use of altcoins is one theoretical solution to scalability concerns. The option is to abandon the idea of a single blockchain on which all transactions occur, and instead adopt a model where multiple altcoins coexist, each operating on its own blockchain. The reduced traffic per blockchain would allow this constellation of blockchains to scale. However, with fewer nodes securing each blockchain, each one is more susceptible to attack by malicious users. The use of altcoins, therefore, maintains decentralization and improves scalability, but severely impacts security.
Increasing block size is another theoretical solution to scalability concerns. If the Ethereum community agreed to increase the size of each block, all nodes could still perform all operations, but more transactions could be included in the same amount of time, therefore speeding up the network. With larger block sizes, however, each block requires more computing power, bandwidth, and storage to verify, and fewer and fewer nodes will be able to supply those resources. The result would be a future where the network is maintained by a handful of supercomputers with the tremendous processing power needed to verify each block. Increased block size, therefore, maintains security and improves scalability, but severely reduces how decentralized the network is.
The primary concerns in early blockchain development were security and decentralization. As a consequence, the primary hindrance to scalability is that every node currently must process every transaction. Though undeniably secure and decentralized, this process does not leave much room for scalable growth. The question then becomes: how do we engineer Ethereum to scale without compromising security and decentralization?
There are four primary protocols in development that will address the issues of scalability. Sharding, Plasma, and Raiden were proposed specifically to help Ethereum scale. The fourth protocol, Casper, is much broader in scope, but will have scalability implications on top of many others.
Sharding is one method of scaling that keeps all transactions on the original blockchain, and is therefore known as an “on-chain” solution. Sharding addresses the issue that all transactions on Ethereum are processed sequentially, since every node must process every transaction. Sharding allows operations to run in parallel, therefore increasing the number of transactions per second the overall blockchain can process. With sharding, the Ethereum network is divided into multiple groups of nodes. Each of these groups is a shard, and each shard processes all the transactions that occur within that group. This allows the shards to process different transactions simultaneously.
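To make the idea of independent shards concrete, here is a minimal sketch in Python. It is not how Ethereum assigns accounts to shards: the shard count, the `shard_of` rule, and the `(sender, receiver, amount)` transaction format are all assumptions chosen for illustration, and cross-shard transfers are simply rejected.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_of(account: str) -> int:
    """Hypothetical rule: map an account name to a shard with a stable byte sum."""
    return sum(account.encode()) % NUM_SHARDS


def process_shard(transactions):
    """Apply one shard's transfers in order; return each account's net balance change."""
    changes = defaultdict(int)
    for sender, receiver, amount in transactions:
        changes[sender] -= amount
        changes[receiver] += amount
    return dict(changes)


def process_all(transactions):
    """Group transactions by shard, then process the groups concurrently."""
    by_shard = defaultdict(list)
    for tx in transactions:
        sender, receiver, _ = tx
        if shard_of(sender) != shard_of(receiver):
            raise ValueError("cross-shard transfers are out of scope for this sketch")
        by_shard[shard_of(sender)].append(tx)

    shard_ids = sorted(by_shard)
    with ThreadPoolExecutor() as pool:
        # Each shard only touches its own accounts, so no coordination is needed.
        results = pool.map(process_shard, (by_shard[s] for s in shard_ids))
        return dict(zip(shard_ids, results))
```

Calling `process_all([("a", "e", 5), ("b", "f", 3)])` places the two transfers in different shards under this toy rule, so neither has to wait for the other. The threads here only illustrate that the shards are independent; in a real network the parallelism comes from separate groups of nodes.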
Within each shard, certain nodes called “collators” would regularly create a “collation,” or a set of information about that shard. Each collation contains the following information:
Information about which shard the collation came from.
Information about the state of the shard before the transactions are applied.
Information about the state of the shard after the transactions are applied.
Digital signatures from at least ⅔ of the collators validating the information in the collation.
Across the network, the collations from each shard are aggregated into a single block and added to the Ethereum blockchain. Sharding, therefore, allows these groups of nodes to process and verify transactions, while the only information added to the blockchain is the condensed information found in the collations. If, for instance, there are ten shards, and each shard processes five transactions, then the next block would record fifty transactions on the blockchain, rather than the handful it could record if every node processed the transactions sequentially.
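The collation fields and the aggregation step described above can be sketched as follows. This is an illustrative model, not the actual sharding specification: the `Collation` class, the field names, and `build_block` are hypothetical, and signatures are represented as a plain set of collator names rather than real cryptographic signatures.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Collation:
    shard_id: int             # which shard the collation came from
    pre_state_root: str       # shard state before the transactions are applied
    post_state_root: str      # shard state after the transactions are applied
    tx_count: int             # number of transactions summarized
    signers: FrozenSet[str]   # collators who signed (stand-in for signatures)


def has_quorum(collation: Collation, collators_in_shard: int) -> bool:
    """True if at least two thirds of the shard's collators signed."""
    # Integer arithmetic avoids floating-point rounding at the 2/3 boundary.
    return 3 * len(collation.signers) >= 2 * collators_in_shard


def build_block(collations: List[Collation], collators_per_shard: int) -> dict:
    """Aggregate every collation that reached quorum into one block summary."""
    accepted = [c for c in collations if has_quorum(c, collators_per_shard)]
    return {
        "collations": accepted,
        "tx_count": sum(c.tx_count for c in accepted),
    }
```

With ten collations of five transactions each, all signed by two of three collators, `build_block` reports fifty transactions, matching the example in the text. A collation signed by only one of three collators would be left out.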
Two issues arise with sharding. First, each shard must contain enough nodes to ensure network security. If a shard contains too few nodes, ⅔ of the collators could be compromised and begin acting maliciously. Second, there is no easy way to process a transaction that occurs between two shards instead of within just one (an issue that doesn’t exist with one, whole blockchain). The current method requires a cumbersome series of receipts and proofs.
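The first issue can be illustrated with a rough calculation. The sketch below assumes a deliberately simple model that is not taken from any Ethereum specification: every node in a shard acts as a collator, and each node is independently malicious with the same probability. Under those assumptions, the chance that at least ⅔ of a shard is malicious is a binomial tail probability.

```python
from math import comb


def takeover_probability(shard_size: int, malicious_fraction: float) -> float:
    """Probability that at least 2/3 of a shard's nodes are malicious.

    Assumes each node is malicious independently with probability
    `malicious_fraction` (a simplifying assumption for illustration).
    """
    p = malicious_fraction
    threshold = (2 * shard_size + 2) // 3  # smallest integer >= 2/3 of shard_size
    return sum(
        comb(shard_size, k) * p**k * (1 - p) ** (shard_size - k)
        for k in range(threshold, shard_size + 1)
    )


if __name__ == "__main__":
    # Compare small and large shards when 30% of all nodes are malicious.
    for size in (3, 30, 300):
        print(size, takeover_probability(size, 0.3))
```

Under this model the probability falls sharply as the shard grows, which is the intuition behind requiring enough nodes per shard.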
Plasma is another method of scaling that processes transactions “off-chain,” i.e. not on the primary Ethereum blockchain. Plasma allows for many blockchains (called “child chains”) to stem from the original blockchain (called the “root chain”). Each child chain, therefore, can process and maintain its own records of transactions while relying on the underlying security of the root chain. With Plasma, the root chain is the global enforcer of the computation happening on all the child chains. The root chain, however, only needs to perform computation if a dispute arises within one of the child chains. This method allows an entire network of child chains to divide up all the transactions on the blockchain in order to optimize speed and efficiency. If the nodes on a child chain desire, they can submit an exit transaction and export a record of their transactions to the root chain.
This method has one particular strength. Each Plasma chain can have its own qualities and set of standards. This means that different child chains can support transactions with varying requirements (e.g. privacy), while all still occurring within the same, secure ecosystem.
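The relationship between a root chain and its child chains can be sketched roughly as follows. This is a heavily simplified illustration, not the Plasma design itself: the `RootChain` and `ChildChain` classes and their methods are hypothetical, a single SHA-256 hash stands in for the commitments a real child chain would publish, and exits and challenge periods are omitted.

```python
import hashlib
import json


def commitment(transactions) -> str:
    """Hash a list of transactions into a short, fixed-size digest."""
    payload = json.dumps(transactions, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


class RootChain:
    """Stores only commitments; does real work only when a dispute arises."""

    def __init__(self):
        self.commitments = {}  # (child_id, block_number) -> digest

    def submit(self, child_id, block_number, digest):
        self.commitments[(child_id, block_number)] = digest

    def resolve_dispute(self, child_id, block_number, claimed_transactions) -> bool:
        """Return True if the claimed transactions match what was committed."""
        stored = self.commitments.get((child_id, block_number))
        return stored == commitment(claimed_transactions)


class ChildChain:
    """Processes and stores its own transactions off the root chain."""

    def __init__(self, child_id, root: RootChain):
        self.child_id = child_id
        self.root = root
        self.blocks = []

    def seal_block(self, transactions) -> int:
        """Record a block locally and publish only its digest to the root chain."""
        block_number = len(self.blocks)
        self.blocks.append(list(transactions))
        self.root.submit(self.child_id, block_number, commitment(self.blocks[-1]))
        return block_number
```

In this sketch the root chain never sees the child chain's transactions unless someone calls `resolve_dispute`, at which point it recomputes the digest and compares it with the one committed earlier. A tampered transaction list produces a different digest and the check fails.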