
BNB Smart Chain Annual Storage Report 2024

2024.3.15  •  7 min read

Introduction

In 2023, BNB Smart Chain (BSC) maintained consistent traffic volumes, with a notable increase in market activity in December driven by inscriptions. These developments significantly influenced BSC's storage demands over the past year. This report addresses three questions:

  1. How do the storage statistics differ from the previous year?
  2. What caused the differences in storage?
  3. What challenges does BSC face, and which directions are proposed to resolve them?

Storage Overview

All storage statistics were generated on 31st December 2023 from a full node running the Path-based Storage Scheme (PBSS) and PebbleDB, synced to block 34840595.

The following table shows an overview of the storage result:

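| Database | Category | Size | Count |
|---|---|---|---|
| Key-Value store | Headers | 72.28MiB | 90009 |
| Key-Value store | Bodies | 12.40GiB | 90009 |
| Key-Value store | Receipt lists | 7.73GiB | 90009 |
| Key-Value store | Difficulties | 4.03MiB | 90009 |
| Key-Value store | Block number -> hash | 3.61MiB | 90007 |
| Key-Value store | Block hash -> number | 1.33GiB | 34840598 |
| Key-Value store | Transaction index | 176.04GiB | 5183543985 |
| Key-Value store | Bloombit index | 8.12GiB | 17426746 |
| Key-Value store | Contract codes | 20.23GiB | 2590028 |
| Key-Value store | Hash trie nodes | 0.00B | 0 |
| Key-Value store | Path trie state lookups | 3.52MiB | 90001 |
| Key-Value store | Path trie account nodes | 40.34GiB | 349647355 |
| Key-Value store | Path trie storage nodes | 473.95GiB | 4718092104 |
| Key-Value store | Trie preimages | 819.00B | 13 |
| Key-Value store | Account snapshot | 13.17GiB | 257244258 |
| Key-Value store | Storage snapshot | 246.98GiB | 3468787109 |
| Key-Value store | Clique snapshots | 0.00B | 0 |
| Key-Value store | Parlia snapshots | 100.79MiB | 34105 |
| Key-Value store | Singleton metadata | 401.62MiB | 17 |
| Light client | CHT trie nodes | 3.39GiB | 33630011 |
| Light client | Bloom trie nodes | 8.65GiB | 9334268 |
| Ancient store (Chain) | Bodies | 797.23GiB | 34750596 |
| Ancient store (Chain) | Receipts | 664.62GiB | 34750596 |
| Ancient store (Chain) | Diffs | 356.49MiB | 34750596 |
| Ancient store (Chain) | Headers | 20.21GiB | 34750596 |
| Ancient store (Chain) | Hashes | 1.23GiB | 34750596 |
| Ancient store (State) | Account Data | 1.52GiB | 90000 |
| Ancient store (State) | Storage Data | 1.63GiB | 90000 |
| Ancient store (State) | History Meta | 248.81MiB | 90000 |
| Ancient store (State) | Account Index | 2.03GiB | 90000 |
| Ancient store (State) | Storage Index | 3.65GiB | 90000 |
| Total | | 2.45TiB | |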


The following visualization shows the storage distribution of each major component:

As shown, block data takes up the majority of the storage, followed by the world state and metadata. Compared with the storage layout in December 2022, as published in the BNB Smart Chain Annual Storage Report 2023, the changes are as follows:

  • The total storage size increased from 1.73TB (corrected to account for a ~130GB transaction index) to 2.45TB, a growth rate of 41.6%.
  • The storage capacity of each major storage component is shown below, and the growth rates are 42.6%, 42.5%, 42.9%, and 34.4% respectively.

Block Data

The following graph shows the year-over-year block data comparison:

In 2023, BSC's block data storage requirements grew notably. Block bodies expanded the most, by 256GB, a 46.4% growth rate. Receipts, headers, and contract codes grew by 185GB, 6.68GB, and 4.73GB, or 37.95%, 49.1%, and 30.5% respectively. This pace is slower than in 2022, which is attributed to lower transactions per second (TPS) during the bear market.

The substantial block size presents several challenges. One key issue is the necessity to store all blocks from the Genesis block to the most recent, consuming extensive disk space that will only continue to grow. However, executing the most recent blocks does not require access to historical block data. This situation presents an opportunity to explore optimization techniques that could potentially reduce the storage needs of a node by excluding this historical data.
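As a rough illustration of how much space historical block data occupies, the sketch below sums the "Ancient store (Chain)" rows of the overview table and compares them with the reported total. The names `parse_size`, `ANCIENT_CHAIN`, and `ancient_chain_share` are made up for this example, and treating the entire ancient chain store as data a node could drop is a simplifying assumption rather than something this report states.

```python
# Sizes copied from the "Ancient store (Chain)" rows of the overview table.
ANCIENT_CHAIN = {
    "Bodies": "797.23GiB",
    "Receipts": "664.62GiB",
    "Diffs": "356.49MiB",
    "Headers": "20.21GiB",
    "Hashes": "1.23GiB",
}
TOTAL = "2.45TiB"  # reported total storage size

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def parse_size(text: str) -> float:
    """Convert a string such as '797.23GiB' into a byte count."""
    # Try longer unit names first so 'GiB' is not mistaken for 'B'.
    for unit in sorted(UNITS, key=len, reverse=True):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNITS[unit]
    raise ValueError(f"unknown unit in {text!r}")


def ancient_chain_share() -> float:
    """Fraction of total storage held by the ancient chain store."""
    ancient = sum(parse_size(size) for size in ANCIENT_CHAIN.values())
    return ancient / parse_size(TOTAL)


if __name__ == "__main__":
    print(f"ancient chain data: {ancient_chain_share():.1%} of total storage")
```

With the figures from the table, the ancient chain store works out to a little under 60% of total storage, which is why excluding historical blocks is an attractive optimization target.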

Furthermore, block size grows with transaction throughput. According to the average block size and daily transaction charts on BscScan, the average block size is around 40k-50k and the average TPS is around 44. In December, the block size peaked at about 250k and the TPS exceeded 1k, in line with the heightened activity across the entire crypto market. Higher TPS means more block data, which demands more disk bandwidth and more disk space.

Turning to the database mechanics: recent blocks are first stored in a key-value (KV) database. Once these blocks age beyond a certain point, termed the ancient threshold, they are moved to the ancient database. This transfer wastes some disk bandwidth, because the same block data is written first to the KV database and later to the ancient database. EIP-4844 also has storage implications. Once BSC adopts EIP-4844, block size is expected to increase because of blobs. Although the storage required for blobs may not grow over time, it will still require additional disk space from node operators.
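The hand-off from the KV database to the ancient database can be sketched as a simple rule on block numbers. The threshold of 90,000 blocks used here is an inference, not a value this report states: the overview table shows 34,750,596 items in each ancient chain table at head block 34840595, which is consistent with keeping the most recent 90,000 blocks in the KV store. The function names are hypothetical.

```python
# Assumption: inferred from the overview table, not stated in the report.
ANCIENT_THRESHOLD = 90_000


def store_for_block(number: int, head: int,
                    threshold: int = ANCIENT_THRESHOLD) -> str:
    """Return which database is expected to hold a given block."""
    if number < 0 or number > head:
        raise ValueError("block number outside the chain")
    return "ancient" if number <= head - threshold else "kv"


def blocks_to_freeze(frozen: int, head: int,
                     threshold: int = ANCIENT_THRESHOLD) -> range:
    """Block numbers that should move from the KV store to the ancient store.

    `frozen` is the number of blocks already in the ancient store, which is
    also the number of the first block that is still in the KV store.
    """
    first_kept = head - threshold + 1  # oldest block allowed to stay in KV
    return range(frozen, max(frozen, first_kept))


if __name__ == "__main__":
    head = 34_840_595
    print(store_for_block(34_750_595, head))               # at the boundary
    print(len(blocks_to_freeze(34_750_596, head + 3)))     # after 3 new blocks
```

Every block that crosses the boundary is read from the KV store, appended to the ancient store, and then deleted from the KV store, which is the bandwidth cost mentioned above.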

World State

Trie


| | Dec. 2022 | Dec. 2023 | Growth rate |
|---|---|---|---|
| EOA accounts | 87,190,393 | 152,436,001 | 74.8% |
| Contract accounts | 47,329,085 | 104,809,811 | 121.4% |
| Total KV pairs | 3,449,013,209 | 5,068,274,292 | 46.9% |
| Total Size | 360.68GB | 514.29GB | 42.6% |

The table above shows a surge in the number of accounts, particularly contract accounts, which increased by 121.4%. This indicates healthy growth and activity within the BNB Chain ecosystem even during the bear market. However, it also increased the trie storage size, which grew by 42.6%.
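The growth rates in the trie table follow directly from the two yearly figures. The short sketch below recomputes them as a sanity check; the dictionary and function names are illustrative only.

```python
# (Dec. 2022, Dec. 2023) values copied from the trie table above.
TRIE_STATS = {
    "EOA accounts": (87_190_393, 152_436_001),
    "Contract accounts": (47_329_085, 104_809_811),
    "Total KV pairs": (3_449_013_209, 5_068_274_292),
    "Total size (GB)": (360.68, 514.29),
}


def growth_rate(before: float, after: float) -> float:
    """Year-over-year growth in percent."""
    return (after - before) / before * 100


if __name__ == "__main__":
    for name, (before, after) in TRIE_STATS.items():
        print(f"{name}: {growth_rate(before, after):.1f}%")
```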

Diving deeper into the Merkle Patricia Trie (MPT) composition, the following diagram shows the proportion of trie nodes on each trie level:

The deeper a node sits in the trie, the longer it takes to read, which may impact node performance. Most trie nodes are concentrated in the 7th and 8th levels of the trie, which is still considered normal.
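A back-of-the-envelope estimate helps explain why the 7th and 8th levels dominate. Assuming keys are hashed (and therefore spread roughly uniformly) and each branch node has 16 children, a trie holding n keys needs about log16(n) levels before keys become unique. This ignores extension and leaf node compression, so it is only a rough guide; the function name is hypothetical.

```python
import math


def estimated_depth(key_count: int, branching: int = 16) -> float:
    """Approximate number of branch levels needed to separate `key_count`
    uniformly distributed keys in a trie with the given branching factor."""
    if key_count < 1:
        raise ValueError("key_count must be positive")
    return math.log(key_count, branching)


if __name__ == "__main__":
    # Account snapshot count from the overview table (one entry per account).
    accounts = 257_244_258
    print(f"estimated account trie depth: {estimated_depth(accounts):.2f}")
```

For the roughly 257 million accounts in the overview table, the estimate is close to 7 levels. Each additional factor of 16 in key count adds about one more level, and therefore one more node read per lookup.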

Snapshot


| | Dec. 2022 | Dec. 2023 | Growth rate |
|---|---|---|---|
| Account snapshot size | 6.71GB | 13.17GB | 96.3% |
| Storage snapshot size | 174.86GB | 246.98GB | 41.2% |
| Total KV pairs | 2,577,621,332 | 3,726,031,367 | 44.6% |

The snapshot is a flat key-value representation of the trie, so the growth in the number of accounts in the trie also increases the account snapshot size.
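To make the difference between the trie and its flat snapshot concrete, here is a toy model. It is a simplification under stated assumptions: SHA-256 stands in for Keccak-256, the trie has only branch nodes and leaves (no extension nodes, no Merkle hashing), and "reads" simply counts nodes visited. `ToyTrie` and `hashed_path` are hypothetical names, not BSC code.

```python
import hashlib


def hashed_path(key: bytes) -> tuple:
    """Nibble path of the hashed key (SHA-256 as a stand-in for Keccak-256)."""
    digest = hashlib.sha256(key).digest()
    return tuple(n for byte in digest for n in (byte >> 4, byte & 0x0F))


class ToyTrie:
    """Hex trie: branch nodes are dicts, leaves are (path, value) tuples."""

    def __init__(self):
        self.root = {}

    def put(self, key: bytes, value) -> None:
        path, node, depth = hashed_path(key), self.root, 0
        while True:
            child = node.get(path[depth])
            if child is None or (isinstance(child, tuple) and child[0] == path):
                node[path[depth]] = (path, value)  # new leaf or overwrite
                return
            if isinstance(child, tuple):
                # Two keys share this prefix: push the old leaf one level down.
                branch = {child[0][depth + 1]: child}
                node[path[depth]] = branch
                child = branch
            node, depth = child, depth + 1

    def get(self, key: bytes):
        """Return (value, nodes_read) for a lookup that walks the trie."""
        path, node, reads = hashed_path(key), self.root, 1
        for nibble in path:
            child = node.get(nibble)
            if child is None:
                return None, reads
            reads += 1
            if isinstance(child, tuple):
                return (child[1] if child[0] == path else None), reads
            node = child
        return None, reads


def build(count: int):
    """Populate a trie and a flat snapshot with the same `count` entries."""
    trie, snapshot = ToyTrie(), {}
    for i in range(count):
        key = i.to_bytes(8, "big")
        trie.put(key, i)
        snapshot[hashed_path(key)] = i  # flat: one entry per key
    return trie, snapshot
```

A trie lookup touches one node per level, while the snapshot answers the same query with a single keyed read. The price is that the snapshot stores every account and storage slot a second time, which is why its size tracks the number of accounts.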

Big contract accounts

Contract storage is unbounded, so a single contract can grow as large as, or even larger than, the entire account trie. In light of this, we analyzed "big contract accounts": contracts whose storage is very large because of the volume of KV pairs they have written.

These contracts, with their significant storage demands and complex, multi-layered MPT structures, could lead to storage amplification issues, adversely affecting node performance. Presented below is a table detailing the number and proportion of trie nodes for the top 20 contracts:

| Contract Address Hash | Total Trie Nodes | Percentage |
|---|---|---|
| 0xe9dae3d797a6bf53395810df9d7048f18ac98f1bd211dc87dfad3532aa88d237 | 292687327 | 6.203% |
| 0xe3ee5c338fb03ba97621fbf6b62c153a7a9b3c4dc567d43368d31a1ae9a2d6b5 | 127974389 | 2.712% |
| 0xbe09a843e96d820323ffaac74f0f119734db1f158ac0d0d5b627ac7f3bcc82c2 | 97475866 | 2.066% |
| 0x9944875b9e5ab4adbba2b96063da62b3027becaed0108d94caa199e447f3899b | 89336533 | 1.893% |
| 0xcbfc208cdd69e775207d3575299a371560c11e9896b0a4163c2b845a7d9700ff | 81506522 | 1.727% |
| 0xa2aea0f231dc891cdb73930caa95a9cc139c3a15aa82bdd058ed70f340639f03 | 64950309 | 1.376% |
| 0xe9f236c88a4a8a733cdc8006ea8ea015b72d5af7ce2349c63fbf18d8e8caf967 | 51406538 | 1.089% |
| 0xd97dd5b88bb7ee807775844477cb799dbe99670ce8b2c117353e135807c96749 | 50664326 | 1.074% |
| 0xc874e65ccffb133d9db4ff637e62532ef6ecef3223845d02f522c55786782911 | 50360139 | 1.067% |
| 0xd463275379920234d812dc6067bd870fd827f413d7522b5ea4fa1344b0f67e98 | 49206262 | 1.043% |
| 0x4f0461659e231d1a2414365e75f957f73cf742123e96266b388f745e748e5cb5 | 46347263 | 0.982% |
| 0x6d6171b4266182a5688e6c28a1b19b90ef55d7c9477b203ac2efc5c767268a21 | 42535827 | 0.901% |
| 0x056c4f19188880933e0d07f50b427ecd7f0e76a51114ebe3009810fab290f238 | 42060518 | 0.891% |
| 0x659dd7cc4344b94968d04d592683ceb1d3cf2c537d3a70f6008bbbcd9257ee91 | 38665970 | 0.819% |
| 0xfe1c2c3bf003e59420de2a964984544a947ac6de636a2dedb89b689ab278b65e | 36522794 | 0.774% |
| 0xb391b79f572b5a9730880e7ce4da4a9f128b595f4ba8cc8c74cd195b50f6912e | 32918172 | 0.698% |
| 0xb23ca34dfccaab5e20e02f61e2d9f76422f560e5407906b35398e774c27b40ae | 30919935 | 0.655% |
| 0xf7c451c1298c0a97d0dfbe0a4bec252fd1544432b7f968ec6dabe904165d3f69 | 30332874 | 0.643% |
| 0xca7707f73fe46dcd03ecacc1ba26184f023fd3281fdfecb67a08d576d101af9a | 30243859 | 0.641% |
| Total | | 27.356% |
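The percentages in this table appear to be each contract's trie node count divided by the total number of path trie storage nodes in the overview table (4718092104). That denominator is an inference from the numbers, not something the report states, and it reproduces the listed values only to within rounding. A minimal sketch, with illustrative names:

```python
# Assumption: denominator is the "Path trie storage nodes" count from the
# overview table; the report does not say how the percentages were derived.
STORAGE_TRIE_NODES = 4_718_092_104

# Node counts of the three largest contracts in the table above.
TOP_CONTRACT_NODES = [292_687_327, 127_974_389, 97_475_866]


def share_of_storage_nodes(nodes: int, total: int = STORAGE_TRIE_NODES) -> float:
    """Percentage of all storage trie nodes owned by one contract."""
    return nodes / total * 100


if __name__ == "__main__":
    for rank, nodes in enumerate(TOP_CONTRACT_NODES, start=1):
        print(f"rank {rank}: {share_of_storage_nodes(nodes):.3f}%")
```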

Since the database only stores the hash of the account address, it is not easy to obtain the original account address directly. We attempted to identify the original addresses of these large players and have listed the top 5 below:

| Contract Address | Total Trie Nodes | Percentage |
|---|---|---|
| XEN Crypto: bXEN Token (0x2AB0e9e4eE70FFf1fB9D67031E44F6410170d00e) | 292687327 | 6.203% |
| CryptoMines Worker (0x6053b8FC837Dc98C54F7692606d632AC5e760488) | 127974389 | 2.712% |
| PancakeSwap: Prediction V2 (0x18B2A687610328590Bc8F2e5fEdDe3b582A49cdA) | 97475866 | 2.066% |
| Shido - Shido Network (0xE71A487706A065aE0947576F8E591732360d39fb) | 89336533 | 1.893% |
| Bomb Crypto: BHERO (0x30Cc0553F6Fa1fAF6d7847891b9b36eb559dC618) | 81506522 | 1.727% |
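Because a hash cannot be inverted, recovering an address from its hash means hashing candidate addresses and comparing the results. The sketch below assumes the hash is Keccak-256 of the 20-byte address, which is how go-ethereum keys its state trie; the report itself does not name the hash function. Python's `hashlib` has no Keccak-256 (its `sha3_256` uses different padding), so a compact pure-Python sponge is included. `match_hashes` and the `candidates` mapping are hypothetical helpers, not the tooling used for this report.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state: bytearray) -> bytearray:
    """Keccak-f[1600] permutation on a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & MASK64 & row[(x + 2) % 5])
        # iota (round constants generated by an LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes, suffix: int = 0x01) -> bytes:
    """Keccak-256 as used by Ethereum (suffix 0x01; 0x06 would give SHA3-256)."""
    rate = 136  # bytes absorbed per permutation for a 256-bit digest
    padded = bytearray(data) + bytes([suffix])
    padded += bytes(-len(padded) % rate)
    padded[-1] ^= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = _keccak_f1600(state)
    return bytes(state[:32])


def match_hashes(candidates: dict, target_hashes: list) -> dict:
    """Map each target hash to the label of a candidate address, or None."""
    known = {"0x" + keccak256(bytes.fromhex(address[2:])).hex(): label
             for label, address in candidates.items()}
    return {target: known.get(target) for target in target_hashes}
```

In practice the candidate set would come from an external source such as a list of known contract addresses, since the node's own database holds only the hashes.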

Future Development

Blockchains are heavily I/O-bound. Higher transaction throughput requires more disk bandwidth, and a larger database degrades both database performance and overall system performance.

Sound data storage design and efficient use of disk bandwidth are key to improving overall system throughput. Based on the analysis in this report, below are some proposals and directions for further research:

  1. Separated databases for block data and state data. Block data is stored sequentially, while state data is stored randomly in the database. Splitting the database by data pattern would make disk bandwidth usage more efficient and improve overall performance.
  2. Segmented history data maintenance. This can help resolve the problem of ever-growing historical block storage on BSC for validators and full nodes, which would only need to maintain a limited range of blocks.
  3. State expiry at the contract level to reduce the current world state size. The world state keeps growing, which will impact the network's performance, so a strategy is needed to keep it under control. Some storage tries are rarely or no longer used, and their state data can expire to reduce the overall state size.
  4. Build a high-performance state database. Currently, the state data is built on the MPT and stored in a generic store such as LevelDB. The index performance is not good enough, and our team is working on a new solution.
  5. Integrate the state snapshot into the trie database. The state snapshot is used to improve execution performance, and its persistent data overlaps with the trie database. In addition, both the state snapshot and the trie database rely on similarly complicated recovery mechanisms to ensure recoverability after a panic. Integrating the state snapshot into the trie database would therefore improve robustness and simplicity.
  6. Improve the performance of storage tries with huge numbers of KV pairs. A storage trie with a huge number of KV pairs produces too many MPT levels, which hurts access performance.
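To give a feel for contract-level state expiry, the sketch below selects storage tries that have not been accessed within a fixed window of blocks and totals the space they occupy. The report does not specify an expiry mechanism, so everything here is an assumption made for illustration: the per-trie access tracking, the fixed block window, and the names `StorageTrieInfo`, `expired_tries`, and `reclaimable_bytes`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTrieInfo:
    contract: str           # label for the contract (hypothetical)
    size_bytes: int         # on-disk size of the contract's storage trie
    last_access_block: int  # last block that read or wrote the trie


def expired_tries(tries, head_block: int, expiry_window: int):
    """Storage tries untouched for more than `expiry_window` blocks."""
    return [t for t in tries if head_block - t.last_access_block > expiry_window]


def reclaimable_bytes(tries, head_block: int, expiry_window: int) -> int:
    """Total size of the tries that would expire under this policy."""
    return sum(t.size_bytes for t in expired_tries(tries, head_block, expiry_window))


if __name__ == "__main__":
    # Made-up sample data, not measurements from BSC.
    sample = [
        StorageTrieInfo("active-dex", 5_000_000, 34_840_000),
        StorageTrieInfo("abandoned-game", 80_000_000, 20_000_000),
    ]
    print(reclaimable_bytes(sample, head_block=34_840_595, expiry_window=10_000_000))
```

A real design would also need a way to revive expired state when a contract is used again, which this sketch does not attempt to model.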

Looking Forward

In 2023, BSC adopted PBSS and PebbleDB to make blockchain state storage more efficient. As we move into 2024, the continuous and rapid growth of blockchain data presents a significant challenge for maintaining BSC's performance. It is crucial for all stakeholders to collaborate on innovative solutions that improve BSC's efficiency and cost-effectiveness. Together, let's commit to making BSC more robust and sustainable.
