Move++ - Enabling Highly Secure Cross-Shard Transactions Based on Move

Introduction

We are working on applying sharding/multi-chain technologies to Libra in order to achieve better scalability and more decentralization. One major motivation is that we found the Move language to be a much better fit for cross-shard interoperability than existing alternatives (e.g., EVM). One appealing result is that we can enforce the safety properties of Move resources not only in a single-chain setup (current Libra) but also in a sharding/multi-chain setup, i.e., we can safely move resources from one chain to another.

Note that safely exchanging assets between different chains is a major challenge in cross-chain interoperability, and many designs aim to address it. We believe the safety properties enabled by Move shed light on a new direction for multi-chain interoperability.

Further, to simplify the use of cross-shard communication, we also add a few features to Move (mostly async support); we call the result Move++. We would like to share the idea, and any comments are welcome!

System Setup

We consider a sharding blockchain network setup, which has N + 1 chains running in parallel:

  • N shard chains, each of which runs an instance of the Libra chain with
    – Simplified version of LibraBFT with smaller validator size (e.g., 10)
    – Libra Account Model + Move VM
  • 1 root chain, which runs an instance of Libra with
    – Full LibraBFT with about 100 validators
    – Libra Account Model + Move VM

In addition, the ledger/block of the root chain contains the latest views of all shard chains, and all shard chains follow the views defined by the root chain. This ensures the network has a consistent view of all ledgers of N + 1 chains.

Note that this kind of sharding setup is common in most sharding/multi-chain designs (such as Eth2.0/Polkadot).

Cross-Shard Communications

We also consider the following cross-shard/chain communication model:

  • C1: A shard chain can read any data of the root chain (e.g., import the module of root chain)
  • C2: A shard chain can issue any transaction script to another shard chain (e.g., schedule an async call), and such a script is eventually executed at the destination shard.

There are a couple of discussions on how these can be achieved. We omit the details here; interested readers can refer to the following links (link1, link2).

Example on Moving Tokens Between Shard Chains

Let us consider the following MyToken module:

module MyToken {
    resource T {
        value: u64;
    }
    mint(amount: u64): Self.T;
    withdraw_from_sender(amount: u64): Self.T;
    deposit(payee: address, token: Self.T);
}

where a user can move tokens to another account by running the following script on the current Libra blockchain:

import 0x1234.MyToken
main() {
  let token: MyToken.T;
  token = MyToken.withdraw_from_sender(100);
  MyToken.deposit(0x1111,  move(token));
}

Now, let us move to the multi-chain/sharding context. First of all, we deploy the same module on the root chain:

module MyToken {
    resource T {
        value: u64;
    }
    mint(amount: u64): Self.T;
    withdraw_from_sender(amount: u64): Self.T;
    deposit(payee: address, token: Self.T);
}

Note that the module must be deployed on the root chain so that all shard chains can read/import the module.

Second, if a user wants to transfer tokens to another account on the root chain, the code is the same as before. But suppose the transfer happens on a shard chain and the payee account is in the same shard (an in-shard transaction). The transaction script will then be:

import root.0x1234.MyToken
main(payee: address, amount: u64) {
  let token: MyToken.T;
  token = MyToken.withdraw_from_sender(amount);
  MyToken.deposit(copy(payee),  move(token));
}

where one major change is that instead of importing local modules, the transaction will import the module from the root chain following C1. This ensures that no matter which shard chain runs the transactions, it will always move the same resources defined by the module of the root chain.

Third, suppose a user wants to transfer tokens to an account on a different shard chain. The code then becomes:

import root.0x1234.MyToken
main(payee: address, payee_chain_id: u32, amount: u64) {
  let token: MyToken.T;
  let chain: Chain;
  token = MyToken.withdraw_from_sender(amount);
  chain = get_chain_by_id(payee_chain_id);
  chain.call_async(move || MyToken.deposit(copy(payee), move(token)));
}

where the major differences compared to the in-shard transaction script are:

  • The script obtains a target chain object
  • The script creates a deposit object and sends it to the target chain. The deposit object contains
    – The resources/structs that are going to be moved/copied to the target shard (token/payee)
    – A script to be run in the target shard (MyToken.deposit(copy(payee), move(token))), where the moved/copied resources/structs are recovered when the script runs at the target shard (C2)
    – The script and the resources are expressed as a Rust-style lambda expression, where (move || … ) means that all resources used in the lambda are moved into it and stored in the deposit

Upon processing the deposit object at the target shard, the corresponding resources (MyToken here) are moved/merged into the recipient's account. Since all the move operations are governed by the same module, resource safety is preserved across the cross-shard transaction.

Future Topics

We are working on an MVP based on the Libra code to enable the above features. Meanwhile, several topics still need to be discussed:

  • Global/local resource separation
  • Cross-shard transaction gas metering
  • Failure handling if the target shard fails to process the script

Thanks for writing up this interesting idea! I share your intuition that Move’s resources and tree-shaped state may enable some new kinds of off-chain protocols. Although we’re not looking to experiment with this in Libra right now, I’m very happy to see you explore it. Looking forward to seeing the code MVP, but a few questions based on the above:

  • What does chain.call_async do? Does it produce a transaction that can only be executed on chain.id? And if so, (1) how is the transaction authenticated and (2) how is replay protection enforced?
  • Are there restrictions on what kinds of values can be captured in a lambda expression/what kind of code can appear in a lambda expression?
  • What happens if a lambda expression is created, but not used? Are the resources it captures lost?

One remark on the proposed extensions to Move: I’m guessing/hoping that you will actually be able to achieve what you are looking for without changing the Move VM. In Libra, we have organized the execution component into three different layers:

Move VM (Rust code): understands addresses, resources, gas metering
Libra adapter (Rust code that calls Move code published on chain): transactions, events, crypto primitives, native functions
Libra core modules (Move code published on chain): accounts, currency, validator set management, gas prices

The Move VM is hard to change. But it is quite easy for other blockchains to customize their Move-based execution layer by writing their own version of the Libra adapter or Libra core modules. In particular: the Libra adapter can define “native functions” that are callable from Move code, but are implemented in Rust. I think it should be possible to implement call_async (or an equivalent variant) as a native function without introducing closures in the Move VM. Let us know if we can help with code pointers or guidance on writing your own Libra adapter!

Many thanks for the detailed response and questions! Some of the questions are critical and closely tied to our implementation. To answer them, I will describe in more detail the implementation we are working on. Please note that

  1. the discussion is based on a chained block model, but it should apply to Libra's accumulator model as well;
  2. many parts of the design are still evolving, so any suggestions and feedback are welcome!

These are critical problems that apply to any cross-chain protocol. In our design, chain.call_async emits a "deposit" object (similar to emitting an event or receipt in Eth), where the deposit object contains:

  1. A list of resources/structs that are moved/copied to the deposit object;
  2. Original tx information such as sender/gas price/gas remaining;
  3. Bytecode of the script that will be executed at the destination shard; and
  4. Destination shard id.

Similar to receipts in Eth, the list of the deposit objects will be Merklized and its hash will be stored in the block header.
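
As a rough illustration of the deposit object and its commitment in the block header, here is a minimal Python sketch. The field names, the byte encoding, the use of SHA-256, and the rule of pairing an odd node with itself are all assumptions made for the example; the actual design may serialize and hash deposits differently.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Deposit:
    """Hypothetical deposit object; field names are illustrative only."""
    resources: bytes      # serialized resources/structs moved or copied
    sender: bytes         # original tx sender
    gas_price: int        # original tx gas price
    gas_remaining: int    # gas left to pay for execution at the destination
    script: bytes         # bytecode to run at the destination shard
    dest_shard_id: int    # destination shard id

    def leaf_hash(self) -> bytes:
        # Length-prefix the byte fields so the encoding is unambiguous.
        h = hashlib.sha256()
        for part in (self.resources, self.sender, self.script):
            h.update(len(part).to_bytes(4, "big"))
            h.update(part)
        for n in (self.gas_price, self.gas_remaining, self.dest_shard_id):
            h.update(n.to_bytes(8, "big"))
        return h.digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root of a binary Merkle tree; an odd node is paired with itself."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

The value returned by merkle_root over all deposits of a block is what would be stored in the shard block header in this sketch.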

Authentication of a Deposit Object

Authentication of a deposit object means that anyone (especially the validators at the destination shard) can verify the existence (completeness) or non-existence (soundness) of a deposit object in the ledger agreed on by the majority of the network's operators. In our sharding design, that majority is 2f + 1 of the root chain BPs (67 of 100 BPs, with f = 33).

Now let us define “committed deposit object”:
Definition 1 (Committed deposit object): A deposit object is committed if

  • The deposit object is included in a source shard block
  • The source shard block is included in a root block, where the root block is committed by the root chain consensus (e.g., 2f + 1 root chain BPs)
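
The quorum figure can be checked with a few lines of Python. This is a sketch under the standard BFT assumption n ≥ 3f + 1, where a commit needs 2f + 1 votes; the function names are hypothetical.

```python
def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def commit_quorum(n: int) -> int:
    """Votes needed to commit a block: 2f + 1."""
    return 2 * max_faulty(n) + 1


# Root chain with 100 BPs, and a shard with 10 validators.
root_quorum = commit_quorum(100)
shard_quorum = commit_quorum(10)
```

With 100 root chain BPs this gives f = 33 and a quorum of 67, matching the figure quoted above.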

Now given a committed deposit object, we could verify the deposit object by

  • Cryptographic inclusion proofs of the deposit object in a shard block; and
  • Cryptographic inclusion proofs of the shard block in a committed root block.

Deterministic Replay

To achieve deterministic replay, we would enforce the following primitives:

  • Exactly-once delivery: The deposit object will be delivered and executed at a destination shard exactly once.
  • Ordered delivery: The destination shard processes deposit objects in a deterministic order enforced by the consensus. Note that this differs from user-submitted transactions, whose order a validator may choose freely.
  • Eventual delivery: All the deposit objects to a destination shard should be processed eventually. This can be driven by the transaction fee carried by the deposit object.

To illustrate how to achieve these primitives, let us define “receive deposit queue”:
Definition 2 (Receive deposit queue): Given a shard id i and a committed root chain block, a validator of shard i that runs as a light client of every other shard chain and as a full client of the root chain can deterministically recover a receive deposit queue by (1) querying and verifying all deposit objects sent to shard i from other shard chains, and (2) ordering the deposits by the following tuple:

(root_block_height, shard_block_height, deposit_index)

where

  • deposit_index is the index of a deposit object in a shard block
  • shard_block_height is the height of the shard block that includes the deposit
  • root_block_height is the height of the root block that contains the shard block of the deposit; and
  • the comparison first compares root_block_height, and then shard_block_height, and then deposit_index.
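
The ordering rule maps directly onto a lexicographic sort. The Python sketch below uses hypothetical names; note that the tuple above does not say how to order two deposits from different source shards that share all three values, so the sketch breaks such ties by source shard id, which is purely an assumption.

```python
from typing import List, NamedTuple


class QueuedDeposit(NamedTuple):
    root_block_height: int   # height of the root block containing the shard block
    shard_block_height: int  # height of the shard block containing the deposit
    deposit_index: int       # index of the deposit within the shard block
    source_shard: int        # assumption: used only as a tie-breaker
    payload: str             # stand-in for the deposit contents


def receive_deposit_queue(deposits: List[QueuedDeposit]) -> List[QueuedDeposit]:
    """Order deposits by (root_block_height, shard_block_height, deposit_index)."""
    return sorted(
        deposits,
        key=lambda d: (d.root_block_height, d.shard_block_height,
                       d.deposit_index, d.source_shard),
    )
```

Because the key is a total order, every validator recovers the same queue regardless of the order in which it fetched the deposits.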

Now, for each shard block, besides including the previous shard block in the header, we also include a committed root chain block, so that the corresponding receive deposit queue of a shard block is deterministic. To make replay possible, we make sure the queue never shrinks for future shard blocks in a shard. This is achieved by enforcing the following constraint in the shard block header:

  • The height of the committed root block for each shard block in a shard chain must be non-decreasing.

Now a simple implementation that ensures deterministic replay is for a shard block to process all new deposits in the receive deposit queue whenever the queue grows (i.e., when the block includes a newer committed root chain block). By incentivizing the destination shard validators to include newly committed root chain blocks, we can achieve exactly-once, ordered, and eventual delivery of deposit objects.
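
A minimal Python sketch of this processing rule follows. It assumes the receive deposit queue for a newer root block always extends the queue for an older one (which the root-height-first ordering is meant to give), and it tracks a cursor of already-processed deposits. Class and method names are hypothetical.

```python
from typing import List


class ShardState:
    """Tracks how much of the receive deposit queue a shard has processed."""

    def __init__(self) -> None:
        self.root_height = 0          # committed root height of the last shard block
        self.cursor = 0               # number of deposits already processed
        self.processed: List[str] = []

    def apply_block(self, root_height: int, queue: List[str]) -> List[str]:
        """Process a shard block that references the given root height.

        `queue` is the full receive deposit queue recovered at that height.
        Returns the deposits executed by this block.
        """
        if root_height < self.root_height:
            raise ValueError("committed root height must be non-decreasing")
        if len(queue) < self.cursor:
            raise ValueError("receive deposit queue must not shrink")
        new_deposits = queue[self.cursor:]   # everything not yet processed
        self.processed.extend(new_deposits)
        self.cursor = len(queue)
        self.root_height = root_height
        return new_deposits
```

Each deposit is executed exactly once because the cursor only moves forward, and in order because the queue itself is ordered.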

A further issue is the hot-shard problem, where many shards simultaneously send large numbers of cross-shard deposits to one destination shard and the validators at that shard may be overwhelmed. This can be addressed with rate-limiting techniques, whose details we omit here.

Execution of a Deposit at the Destination Shard

Processing a deposit at the destination shard is similar to executing a user-submitted transaction, except that the main function of the deposit script takes the moved/copied resources and structs as input arguments. Take the following transaction as an example:

import root.0x1234.MyToken
main(payee: address, payee_chain_id: u32, amount: u64) {
  let token: MyToken.T;
  let chain: Chain;
  token = MyToken.withdraw_from_sender(amount);
  chain = get_chain_by_id(payee_chain_id);
  chain.call_async(move || MyToken.deposit(copy(payee), move(token)));
}

The script that is run at the destination shard becomes:

import root.0x1234.MyToken
main(payee: address, token: MyToken.T) {
  MyToken.deposit(copy(payee), move(token));
}

There should not be any restrictions on which variables can be captured. However, as a general principle of the Move language, once a resource is moved into a lambda (i.e., a deposit object), it cannot be used anymore in the remaining code of the tx. The code in the lambda expression can be anything, as long as it forms a valid script.

The resources are held by the deposit object once the deposit is emitted, and when the deposit object is processed at the destination shard, the resources are moved into the script (as input arguments) executed at the destination shard. This ensures the following invariant for a fungible token (e.g., LIBRA):

sum of balances of all accounts of all chains (root + shard chains) + sum of balances of all unprocessed deposits = C

where the unprocessed deposits are the deposits that are emitted at source shard, but not processed at the destination shard. And as long as the deposits are eventually delivered and processed, we will guarantee that no resources will be lost.
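
The invariant can be illustrated with a toy Python model in which a cross-shard transfer first debits the sender and parks the amount in an unprocessed deposit, and later credits the payee when the deposit is processed. This is a sketch with hypothetical names, not the proposed implementation; accounts are keyed by (chain_id, address).

```python
from typing import Dict, List, Tuple

Account = Tuple[int, str]  # (chain_id, address)


class Network:
    def __init__(self, balances: Dict[Account, int]) -> None:
        self.balances = dict(balances)
        self.pending: List[Tuple[Account, int]] = []  # unprocessed deposits

    def send_cross_shard(self, src: Account, dest: Account, amount: int) -> None:
        """Source shard: withdraw and emit a deposit holding the tokens."""
        if self.balances.get(src, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[src] -= amount
        self.pending.append((dest, amount))

    def process_next(self) -> None:
        """Destination shard: move the held tokens to the payee."""
        dest, amount = self.pending.pop(0)
        self.balances[dest] = self.balances.get(dest, 0) + amount

    def total(self) -> int:
        """Sum of all account balances plus all unprocessed deposits."""
        return sum(self.balances.values()) + sum(a for _, a in self.pending)
```

At every point in the transfer, total() stays equal to the initial supply C; only its split between accounts and pending deposits changes.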

Many thanks for offering to help. It would be very helpful if you could guide us in adding several native functions to

  • import root block module
  • issue an async_call to a shard
  • determine which chain the code is running on
  • etc

Further, to reduce complexity, we will probably start with a C-style callback instead of a closure:

import root.0x1234.MyToken
main(payee: address, payee_chain_id: u32, amount: u64) {
  let token: MyToken.T;
  let chain: Chain;
  token = MyToken.withdraw_from_sender(amount);
  chain = get_chain_by_id(payee_chain_id);
  chain.call_async(main_dest, copy(payee), move(token));
}
main_dest(payee: address, token: MyToken.T) {
  MyToken.deposit(copy(payee), move(token));
}