Solana Internals Part 4: The Bank — A Key Component
January 30, 2022
Following Part 3: the TPU, this article elaborates on the bank module, a core component of the Solana blockchain.
What’s a bank?
The importance of the bank module cannot be overstated:
It manages the state of all accounts and programs, executes the on-chain programs, and tracks their progress.
At a high level, a bank relates to a block produced by a single leader and each bank (except for the genesis bank) points back to a parent bank.
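The parent relationship can be pictured with a small sketch. This is not the real Bank struct: ToyBank, child_of, and ancestor_slots are hypothetical names, and the only thing modeled is that every bank except genesis holds a reference to its parent.

```rust
use std::sync::Arc;

/// Simplified stand-in for a bank: one per slot, linked to its parent.
/// (Hypothetical type; the real Bank carries far more state.)
struct ToyBank {
    slot: u64,
    parent: Option<Arc<ToyBank>>, // None only for the genesis bank
}

impl ToyBank {
    /// The genesis bank has no parent.
    fn genesis() -> Arc<ToyBank> {
        Arc::new(ToyBank { slot: 0, parent: None })
    }

    /// Create a bank for `slot` that points back to `parent`.
    fn child_of(parent: &Arc<ToyBank>, slot: u64) -> Arc<ToyBank> {
        Arc::new(ToyBank { slot, parent: Some(Arc::clone(parent)) })
    }

    /// Slots of this bank and all of its ancestors, newest first.
    fn ancestor_slots(&self) -> Vec<u64> {
        let mut slots = vec![self.slot];
        let mut current = self.parent.as_ref();
        while let Some(bank) = current {
            slots.push(bank.slot);
            current = bank.parent.as_ref();
        }
        slots
    }
}
```

Walking the parent pointers from any bank always ends at the genesis bank, which is why slots may be skipped (for example 0, 1, 3) without breaking the chain.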
The bank is the main entrypoint for processing verified transactions. In Bank::process_transactions, it creates an InvokeContext to process each transaction.
InvokeContext in detail
invoke_context.process_instruction is a key function: among other things, it processes each instruction, verifies that the called program has not misbehaved, maintains a cache to store compiled instructions, and returns how many compute units were used.
It has a few parameters: the instruction_data and instruction_accounts, the program_indices (for retrieving the invoked program_id), the compute_units_consumed (for recording the compute units used for executing the instruction, initially 0), and timings (for the execution time info).
// Process instruction
let mut compute_units_consumed = 0;
To invoke an instruction on a program, invoke_context.process_instruction first uses the program’s owner to load the program, and then calls the program’s entrypoint.
The owner of the called program is one of the following:
the native loader (which loads the built-in programs)
built-in programs (e.g., system instruction)
If it is the native loader (NativeLoader1111111111111111111111111111111), then the called program is itself a built-in, and the corresponding built-in program’s entrypoint process_instruction will be called:
Otherwise, the owner is a built-in program (such as the BPF loader), and the owner’s entrypoint process_instruction will be called:
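The dispatch decision described above can be sketched as follows. This is a simplified illustration rather than the runtime's code: the Entrypoint signature, the Builtin struct, the short placeholder ids SYSTEM_PROGRAM and BPF_LOADER, and the two toy entrypoints are all assumptions made for the example. Only the native loader id is taken from the text.

```rust
/// Hypothetical entrypoint signature for this sketch.
type Entrypoint = fn(&[u8]) -> Result<String, String>;

const NATIVE_LOADER: &str = "NativeLoader1111111111111111111111111111111";
// Placeholder ids (assumptions, not real addresses).
const SYSTEM_PROGRAM: &str = "system_program";
const BPF_LOADER: &str = "bpf_loader";

struct Builtin {
    program_id: &'static str,
    entrypoint: Entrypoint,
}

fn system_program_entry(_data: &[u8]) -> Result<String, String> {
    Ok("system program ran".to_string())
}

fn bpf_loader_entry(_data: &[u8]) -> Result<String, String> {
    Ok("bpf loader ran".to_string())
}

fn toy_builtins() -> Vec<Builtin> {
    vec![
        Builtin { program_id: SYSTEM_PROGRAM, entrypoint: system_program_entry },
        Builtin { program_id: BPF_LOADER, entrypoint: bpf_loader_entry },
    ]
}

/// If the owner is the native loader, the program itself is the built-in
/// to run; otherwise the owner (a loader) is the built-in to run.
fn builtin_to_run<'a>(program_id: &'a str, owner_id: &'a str) -> &'a str {
    if owner_id == NATIVE_LOADER { program_id } else { owner_id }
}

fn dispatch(
    builtins: &[Builtin],
    program_id: &str,
    owner_id: &str,
    data: &[u8],
) -> Result<String, String> {
    let target = builtin_to_run(program_id, owner_id);
    match builtins.iter().find(|b| b.program_id == target) {
        Some(builtin) => (builtin.entrypoint)(data),
        None => Err(format!("unsupported program id: {}", target)),
    }
}
```

In this sketch a call to the system program (owned by the native loader) reaches system_program_entry directly, while a call to a user program owned by the BPF loader reaches bpf_loader_entry, which would then be responsible for running the user's byte code.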
The process_instruction function takes three parameters as input: first_instruction_account (the invoked program_id), the instruction_data, and the invoke_context itself:
Consider system_instruction.process_instruction. It handles the following instructions:
CreateAccount // Create a new account
Assign // Assign account to a program
Transfer // Transfer lamports
CreateAccountWithSeed // Create a new account derived from seeds
AdvanceNonceAccount // Update a stored nonce
WithdrawNonceAccount(u64) // Withdraw funds from a nonce account
Allocate // Allocate space in a (possibly new) account
AssignWithSeed // Assign account to a program based on a seed
TransferWithSeed // Transfer lamports from a derived address
The most frequently used instructions are CreateAccount, Transfer, and Allocate.
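To give a feel for what handling these three instructions involves, here is a minimal sketch over an in-memory ledger. Everything in it is an assumption made for illustration: ToySystemInstruction, ToyAccount, Ledger, and the string account names are hypothetical, and the real system program performs many more checks (signers, owners, size limits, rent).

```rust
use std::collections::HashMap;

/// Toy subset of the system instructions (hypothetical type).
#[derive(Debug, PartialEq)]
enum ToySystemInstruction {
    CreateAccount { lamports: u64, space: usize },
    Transfer { lamports: u64 },
    Allocate { space: usize },
}

#[derive(Debug, Default, Clone, PartialEq)]
struct ToyAccount {
    lamports: u64,
    data: Vec<u8>,
}

type Ledger = HashMap<String, ToyAccount>;

/// Remove `lamports` from `from`, failing on unknown payer or low balance.
fn debit(ledger: &mut Ledger, from: &str, lamports: u64) -> Result<(), String> {
    let payer = ledger
        .get_mut(from)
        .ok_or_else(|| "unknown payer".to_string())?;
    if payer.lamports < lamports {
        return Err("insufficient funds".to_string());
    }
    payer.lamports -= lamports;
    Ok(())
}

/// Apply one instruction; `from` funds it and `to` is the target account.
fn process(
    ledger: &mut Ledger,
    from: &str,
    to: &str,
    instruction: ToySystemInstruction,
) -> Result<(), String> {
    match instruction {
        ToySystemInstruction::CreateAccount { lamports, space } => {
            if ledger.contains_key(to) {
                return Err("account already in use".to_string());
            }
            debit(ledger, from, lamports)?;
            ledger.insert(to.to_string(), ToyAccount { lamports, data: vec![0; space] });
            Ok(())
        }
        ToySystemInstruction::Transfer { lamports } => {
            debit(ledger, from, lamports)?;
            ledger.entry(to.to_string()).or_default().lamports += lamports;
            Ok(())
        }
        ToySystemInstruction::Allocate { space } => {
            let account = ledger
                .get_mut(to)
                .ok_or_else(|| "unknown account".to_string())?;
            if !account.data.is_empty() {
                return Err("account already has data".to_string());
            }
            account.data = vec![0; space];
            Ok(())
        }
    }
}
```

Note that every branch either moves lamports between two accounts or leaves balances untouched, so the total never changes. That property is exactly what the accounting rules later in this article enforce.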
The bpf_loader.process_instruction function is used to execute user-deployed smart contracts (i.e., the BPF byte code):
The function calls process_instruction_common, which creates a BpfExecutor passing the program data and calls its execute function:
In BpfExecutor.execute, it creates a vm and executes the program by either vm.execute_program_jit or vm.execute_program_interpreted.
How’s the BPF code executed?
Importantly, the BPF byte code is not executed by the Linux kernel, but by a BPF virtual machine (EbpfVm).
By default, use_jit is false, so vm.execute_program_interpreted is used, i.e., the BPF code is interpreted by the VM. This suggests that Solana has considerable room to improve performance further, e.g., by executing the BPF code natively in the Linux kernel (though more technical details and security safeguards need to be fleshed out there).
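To make "interpreted" concrete, here is a toy sketch of an interpreter loop with an instruction budget. It is not eBPF and does not use the rbpf API: the ToyOp instruction set, the single register, and the max_instructions budget are all invented for illustration, and the JIT path is deliberately left unmodeled.

```rust
/// A made-up, four-opcode instruction set (not eBPF).
#[derive(Clone, Copy)]
enum ToyOp {
    LoadImm(u64),
    Add(u64),
    Mul(u64),
    Exit,
}

/// Interpret `program` one op at a time.
/// Returns (register value, number of ops executed) on Exit.
fn execute_interpreted(program: &[ToyOp], max_instructions: u64) -> Result<(u64, u64), String> {
    let mut register = 0u64;
    let mut executed = 0u64;
    for op in program {
        // Stop as soon as the budget is used up, before running the op.
        if executed >= max_instructions {
            return Err("instruction budget exceeded".to_string());
        }
        executed += 1;
        match *op {
            ToyOp::LoadImm(value) => register = value,
            ToyOp::Add(value) => register = register.wrapping_add(value),
            ToyOp::Mul(value) => register = register.wrapping_mul(value),
            ToyOp::Exit => return Ok((register, executed)),
        }
    }
    Err("program ended without Exit".to_string())
}

/// Mirrors the use_jit switch: only the interpreted path exists here.
fn execute(program: &[ToyOp], use_jit: bool, max_instructions: u64) -> Result<(u64, u64), String> {
    if use_jit {
        Err("JIT path is not modeled in this sketch".to_string())
    } else {
        execute_interpreted(program, max_instructions)
    }
}
```

The per-op match is the cost an interpreter pays on every instruction; a JIT compiler pays a translation cost once and then runs native code, which is where the potential speedup comes from.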
Note that rbpf is not audited and contains numerous unsafe Rust blocks. Any error in rbpf may cause severe vulnerabilities, such as integer overflows and memory corruption. See this post by BlockSec for an example.
Dealing with cross program invocation (CPI)
When the invoked program calls another program through invoke or invoke_signed, that program will be loaded and its entry point will be called.
Internally, this is done by a syscall to sol_invoke_signed_rust, which will call invoke_context.process_instruction again.
The syscall.function is retrieved from the syscall_registry, which has been initialized with numerous built-in system calls such as sol_invoke_signed_c, sol_invoke_signed_rust, sol_create_program_address, sol_keccak256, etc.
A full list of registered system calls can be found in syscall.rs.
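The register-then-look-up pattern can be sketched as below. This is an assumption-laden illustration, not the actual registry: ToySyscallRegistry, the Syscall signature, and the toy_len and toy_sum functions are hypothetical, and the real registry is keyed and typed differently.

```rust
use std::collections::HashMap;

/// Hypothetical syscall signature: bytes in, one number out.
type Syscall = fn(&[u8]) -> u64;

/// A name-to-function table, filled once before programs run.
struct ToySyscallRegistry {
    by_name: HashMap<&'static str, Syscall>,
}

impl ToySyscallRegistry {
    fn new() -> Self {
        ToySyscallRegistry { by_name: HashMap::new() }
    }

    /// Register a syscall; registering the same name twice is an error.
    fn register(&mut self, name: &'static str, function: Syscall) -> Result<(), String> {
        if self.by_name.contains_key(name) {
            return Err(format!("syscall {} already registered", name));
        }
        self.by_name.insert(name, function);
        Ok(())
    }

    /// Retrieve the function for a name, if one was registered.
    fn lookup(&self, name: &str) -> Option<Syscall> {
        self.by_name.get(name).copied()
    }
}

// Two stand-in syscalls for the example.
fn toy_len(input: &[u8]) -> u64 {
    input.len() as u64
}

fn toy_sum(input: &[u8]) -> u64 {
    input.iter().map(|byte| *byte as u64).sum()
}
```

A program that requests a name missing from the table simply gets None back, so only the calls the runtime chose to expose are reachable from on-chain code.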
After processing a CPI, the results (updates on all involved accounts) will be copied back to the caller:
The InvokeContext has a transaction_context to track the current calling context, and to ensure that
the call depth is limited to max_invoke_depth set in the compute budget (max_invoke_depth: 4)
it disallows reentrancy unless the caller is calling itself
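Both checks can be sketched with a simple stack of program ids. ToyCallStack and CallError are hypothetical names, and the exact off-by-one convention of the real depth check is not reproduced here: this sketch simply allows at most MAX_INVOKE_DEPTH frames.

```rust
/// Depth limit from the text (max_invoke_depth: 4).
const MAX_INVOKE_DEPTH: usize = 4;

#[derive(Debug, PartialEq)]
enum CallError {
    CallDepth,
    ReentrancyNotAllowed,
}

/// Program ids of the calls currently in progress, outermost first.
#[derive(Default)]
struct ToyCallStack {
    programs: Vec<String>,
}

impl ToyCallStack {
    /// Enter a call to `program_id`, enforcing both rules.
    fn push(&mut self, program_id: &str) -> Result<(), CallError> {
        if self.programs.len() >= MAX_INVOKE_DEPTH {
            return Err(CallError::CallDepth);
        }
        // Direct recursion: the current caller is the same program.
        let is_direct_self_call = self
            .programs
            .last()
            .map_or(false, |top| top == program_id);
        // Reentrancy: the program is already somewhere on the stack.
        let already_on_stack = self.programs.iter().any(|p| p == program_id);
        if already_on_stack && !is_direct_self_call {
            return Err(CallError::ReentrancyNotAllowed);
        }
        self.programs.push(program_id.to_string());
        Ok(())
    }

    /// Leave the most recent call.
    fn pop(&mut self) {
        self.programs.pop();
    }
}
```

With this rule, A calling B calling A is rejected, while B calling B is allowed because the callee is the program at the top of the stack.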
Verify the calling program hasn’t misbehaved
The design of Solana for verifying an instruction call is similar to transactional memory: it executes the instruction first and then verifies the results to ensure that the invoked program has not misbehaved.
The bank is also responsible for verifying the results of an instruction and of every CPI against the accounting rules, a list of properties critical to Solana.
Depending on the invocation level, it will call either verify or verify_and_update (which also updates the results if verified):
Important: the accounting rules
The bank maintains the state of the instruction accounts before and after the instruction execution, and checks the following rules:
It verifies the invariant that the total sum of all the lamports did not change:
It verifies all executable accounts have zero outstanding references:
It verifies that only the owner of the account may change owner and only if the account is writable and only if the account is not executable and only if the data is zero-initialized or empty:
A program that does not own an account cannot debit it:
The balance of read-only and executable accounts may not change:
Account data size cannot exceed a maximum length:
Only the owner of the account can change the size of the data:
Only the system program can change the size of the data and only if the system program owns the account:
Only the owner may change account data and if the account is writable and if the account is not executable:
Executable is one-way (false->true) and only the account owner may set it:
No one modifies rent_epoch:
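A subset of these rules can be sketched as a comparison of each account's state before and after execution. This is a simplified illustration, not the runtime's verification code: ToyAccount, toy_account, verify_account, and verify_instruction are hypothetical, the data and size rules are omitted, and the error names are chosen to resemble Solana's instruction errors without claiming a one-to-one mapping.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ToyAccount {
    lamports: u64,
    owner: String,
    data: Vec<u8>,
    executable: bool,
    rent_epoch: u64,
}

#[derive(Debug, PartialEq)]
enum VerifyError {
    UnbalancedInstruction,
    ModifiedProgramId,
    ExternalAccountLamportSpend,
    ReadonlyLamportChange,
    ExecutableModified,
    RentEpochModified,
}

/// Helper: a plain, non-executable account with empty data.
fn toy_account(lamports: u64, owner: &str) -> ToyAccount {
    ToyAccount { lamports, owner: owner.to_string(), data: vec![], executable: false, rent_epoch: 0 }
}

/// Check one account's before/after state against a subset of the rules.
fn verify_account(
    program_id: &str,
    is_writable: bool,
    pre: &ToyAccount,
    post: &ToyAccount,
) -> Result<(), VerifyError> {
    let owned_by_program = pre.owner == program_id;
    let can_modify = owned_by_program && is_writable && !pre.executable;

    // Owner may change only via the current owner, on a writable,
    // non-executable account whose data is empty or all zeros.
    if pre.owner != post.owner && !(can_modify && post.data.iter().all(|b| *b == 0)) {
        return Err(VerifyError::ModifiedProgramId);
    }
    // A program cannot debit an account it does not own.
    if !owned_by_program && post.lamports < pre.lamports {
        return Err(VerifyError::ExternalAccountLamportSpend);
    }
    // Balances of read-only and executable accounts may not change.
    if pre.lamports != post.lamports && (!is_writable || pre.executable) {
        return Err(VerifyError::ReadonlyLamportChange);
    }
    // Executable is one-way (false -> true), set only by the owner.
    if pre.executable != post.executable && !can_modify {
        return Err(VerifyError::ExecutableModified);
    }
    // No one modifies rent_epoch.
    if pre.rent_epoch != post.rent_epoch {
        return Err(VerifyError::RentEpochModified);
    }
    Ok(())
}

/// Check every (is_writable, pre, post) triple, then the lamport invariant.
fn verify_instruction(
    program_id: &str,
    accounts: &[(bool, ToyAccount, ToyAccount)],
) -> Result<(), VerifyError> {
    let (mut pre_sum, mut post_sum) = (0u128, 0u128);
    for (is_writable, pre, post) in accounts {
        verify_account(program_id, *is_writable, pre, post)?;
        pre_sum += pre.lamports as u128;
        post_sum += post.lamports as u128;
    }
    if pre_sum != post_sum {
        return Err(VerifyError::UnbalancedInstruction);
    }
    Ok(())
}
```

The program runs first and is judged afterwards: nothing stops it from writing to its in-memory copies of the accounts, but any change that violates a rule causes the whole instruction to fail.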
The bank lifecycle
At a high level, the life cycle of a bank includes the following phases:
Open: A new bank is created and transactions are applied to it until either the bank reaches the tick count when the node is the leader for that slot, or the node has applied all transactions present in all entries in the slot.
Committed: The accounts touched by a transaction are committed back to the bank only if all instructions in the transaction succeed; the results are then stored to the accounts store.
Frozen: Once it is complete, the bank can be frozen. Once frozen, no more transactions can be applied and no further state changes can be made. At the frozen step, rent is applied and the various sysvar special accounts are updated to the new state of the system.
Rooted: Once frozen, and if the bank has received the appropriate number of votes, it can become rooted. At this point, it can no longer be removed from the chain and the state is finalized.
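The phases above behave like a small state machine, which the following sketch models. It is only an illustration: ToyBank, BankState, and the VOTES_TO_ROOT threshold are hypothetical, a transaction is reduced to a label plus its per-instruction results, and the real voting and rooting logic is far more involved.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum BankState {
    Open,
    Frozen,
    Rooted,
}

/// Hypothetical vote threshold; the real rule is not modeled here.
const VOTES_TO_ROOT: u32 = 3;

struct ToyBank {
    state: BankState,
    committed: Vec<String>,
    votes: u32,
}

impl ToyBank {
    fn new() -> Self {
        ToyBank { state: BankState::Open, committed: Vec::new(), votes: 0 }
    }

    /// Commit a transaction only if the bank is open and every
    /// instruction in it succeeded (all-or-nothing).
    fn commit_transaction(
        &mut self,
        label: &str,
        instruction_results: &[Result<(), String>],
    ) -> Result<(), String> {
        if self.state != BankState::Open {
            return Err("bank is not open".to_string());
        }
        for result in instruction_results {
            if let Err(reason) = result {
                return Err(reason.clone());
            }
        }
        self.committed.push(label.to_string());
        Ok(())
    }

    /// Open -> Frozen; has no effect in any other state.
    fn freeze(&mut self) {
        if self.state == BankState::Open {
            self.state = BankState::Frozen;
        }
    }

    fn vote(&mut self) {
        self.votes += 1;
    }

    /// Frozen -> Rooted, once enough votes have been recorded.
    fn try_root(&mut self) -> Result<(), String> {
        if self.state != BankState::Frozen {
            return Err("only a frozen bank can be rooted".to_string());
        }
        if self.votes < VOTES_TO_ROOT {
            return Err("not enough votes".to_string());
        }
        self.state = BankState::Rooted;
        Ok(())
    }
}
```

The transitions only move forward: an open bank accepts transactions, a frozen bank accepts none, and a rooted bank is final.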
We will continue to introduce the architecture of Solana and its technical components in the next article.
Soteria is founded by leading minds in the fields of blockchain security and software verification.