How to create a new testnet from scratch

Hi, I found the instructions to launch a node (https://developers.libra.org/docs/my-first-transaction#run-a-local-validator-node), but how can I launch more than one and create a new testnet?

2 Likes

Hi @jrosich! If you just want to run a cluster locally, you can specify the -n option. For example, `cargo run -p libra_swarm -- -s -n 4` will do the config management for you and spawn a network of 4 validators locally.

3 Likes

Hi, what I want to do is run a testnet with my friends around the world. Can we do that?

1 Like

How do I run a real node so I can participate with my peers directly?

Sure, it’s possible. Unfortunately, we don’t have a nice tutorial for that yet.
But basically you’ll need to replicate what libra_swarm does for you behind the scenes.
It includes:

  • generate a keypair for the genesis account: `cargo run --bin generate_keypair -- -o <path_to_output_file>`
  • generate configs for the nodes of your future network. You can do this via `cargo run --bin libra-config`. Here you’ll need to specify the number of nodes in the cluster, the path to the genesis account, and a base template (an example is available in config/data/configs/node.config.toml)
  • distribute configs for each node to your friends
  • finally, run each node locally via `cargo run --bin libra_node -- -f <path_to_config> -p <peer_id>` (a rough end-to-end sketch follows this list)
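
Putting those steps together, a minimal sketch of the whole flow might look like the following. The libra-config flag names here are assumptions for illustration (check `cargo run --bin libra-config -- --help` on your revision), and the paths are placeholders:

  # 1. generate the keypair for the genesis (mint) account
  cargo run --bin generate_keypair -- -o /opt/libra/mint.key

  # 2. generate per-node configs from a base template (flag names illustrative)
  cargo run --bin libra-config -- -b config/data/configs/node.config.toml -m /opt/libra/mint.key -o /opt/libra/configs -n 4

  # 3. copy each generated config directory to the machine that will run that node

  # 4. on each machine, start that node with its own config and peer id
  cargo run --bin libra_node -- -f <path_to_config> -p <peer_id>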

Hopefully, we will have it documented better soon

5 Likes

Having this process well-documented would be greatly appreciated!

What is the peer_id in the last step?

@trusty @jrosich

Sorry for the self-promotion, but I have this process (mostly) well-documented here: https://github.com/jonrau1/AWS-Libra-Blockchain

This shows you how to set up the local testnet, share the config files, and set up basic security & networking architecture to connect remote clients to it

1 Like

Hi @jonrau1. I actually saw your repo. Great writeup. However, I’m mostly interested in setting up a cluster of validator nodes, e.g. multiple EC2 instances.

Meaning we can set up this node, let others connect to it (explorers), and let third parties create custom modules/smart contracts on it?

@trusty

You can specify multiple validators on your testnet with the -n flag. I haven’t put mine through the wringer as hard as I would like, but you shouldn’t run into a lot of latency, especially with the (kind of) SBFT consensus mechanism under the covers of LibraBFT versus PoW/PoS/PoA.

I am working on a new change right now to put multiple nodes behind an ALB. I’m not sure how well it will work given that every time you run swarm the Admission Control Service Port changes. @phoenix, can we override that value in the *.node.config.toml file?

I at least want to make sure I can come into it from behind an ALB
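
For reference, the knob I’m hoping to pin down looks roughly like this in the generated config (section and key names recalled from the upstream template, so treat them as unverified):

  [admission_control]
  address = "0.0.0.0"
  admission_control_service_port = 8000

If that key is honored, pinning it to a fixed port would make the ALB target group configuration straightforward.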

@lucas

My distro is dealing with RFC1918 address spaces, but I am experimenting with connecting to the Public IP. Theoretically, yes, you could put an explorer on top of it. I am not sure how to execute a Move module / smart contract yet, but I imagine that would be possible on it too

1 Like

AFAIK the -n flag only spins up multiple nodes on the same host. I’m trying to do the same thing as you. Something like multiple hosts behind an ALB.

Working on it right now: having issues connecting to my public IPv4 address, but I can connect to the private one no problem at all. I have Allow All Traffic between both SGs as well, super weird

Testing out resolving with an NLB right now, since it’s all TCP anyway

(though you can resolve an Internal ALB by IP from the external NLB and go from target groups in there)

EDIT – I guess in the same subnet/VPC it won’t let you egress the VPC and return to it, but from another VPC I could resolve the Public IP address

Working through getting a health check override on the targets so I can resolve via the NLB; 443 and the service port are both draining my targets. Researching how to overcome this.

@trusty

So I found that in the same VPC I could not resolve the Public IP address, but I was able to from another VPC as well as “on-prem” (my laptop), so that was pretty easy. Though, when I quit and reconnect, it forgets my old wallet, which doesn’t happen with the RFC1918 host target.

Regarding load balancing: I first tried an IP-target NLB with the RFC1918 address of the local blockchain node; that would not resolve no matter what. To get the target to register, I had to create an override health check on port 80 and throw NGINX on the instance, but I made the SG rule only allow HTTP traffic from my VPC /16

I swapped to an instance target with the listener. That worked, but every time I quit the client and reconnect to the NLB DNS A record, it creates a new wallet for me.

Will be experimenting with the ALB instead, definitely need session stickiness of some sort

(or to remember to load mnemonics)
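
On that note, the CLI client can persist accounts across sessions with its mnemonic commands. Syntax recalled from the Libra client of that era, so double-check with `help` in your build:

  libra% account write /path/to/client.mnemonic    # save the current accounts' recovery mnemonic
  libra% account recover /path/to/client.mnemonic  # restore those accounts in a new session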

Just FYI: count me in if anyone wants one more validator node joining in. So far there are still issues completing all transactions from clients to the validator node. Working on it.

@phoenix Do you have any information about the genesis.blob file and how to generate it?

For anyone interested in this, we are working on setting up an independent from-scratch testnet for OpenLibra. We’ve put together a script, adapted from the Libra Terraform recipe, which generates the relevant keys and configuration files using libra-config, including genesis.blob, using a method more or less like the one described in this post. You can find our configuration here:

  • build.sh: runs generate-keypair and libra-config to create mint.key and the node configurations respectively, à la libra-swarm, with -n 4 targeting a 4-node initial testnet (sketched after this list)
  • node.config.toml: this is more or less copypasta from the upstream https://github.com/libra/libra
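
In outline, the script boils down to something like the following. The flag spellings are a sketch of how our build.sh invokes the tools, not a canonical upstream interface, so verify them against your checkout:

  # generate the mint (faucet) keypair used for the genesis account
  cargo run --bin generate-keypair -- -o mint.key

  # generate validator configs (including genesis.blob) for a 4-node testnet;
  # assumed flags: -b base template, -m mint key, -o output dir, -n node count
  cargo run --bin libra-config -- -b node.config.toml -m mint.key -o validator-configs -n 4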

Unfortunately this no longer works:

It seems the -p <peer_id> option has been removed (and the binary has been renamed to libra-node, with a hyphen instead of an underscore).
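
So, assuming the config flag is unchanged, the run step from earlier in this thread presumably becomes:

  cargo run --bin libra-node -- -f <path_to_config>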

When I try to boot a node, with role = "validator" in its node.config.toml, I’m getting:

C1016 21:52:48.863359 140326530447104 common/crash_handler/src/lib.rs:36] details = '''panicked at 'assertion failed: !self.proposers.is_empty()', consensus/src/chained_bft/chained_bft_smr.rs:141:9'''

Okay, I’ve gotten a bit farther… it seems huge swaths of my node.config.toml were missing/misconfigured. I now have 4 nodes running in a cluster, each with their own node/consensus keys, and from the Debug output of NodeConfig on startup everything looks plausible.

I’m now stuck on an endless loop of this on all currently running nodes:

W1017 18:00:45.828016 140075560072960 consensus/src/chained_bft/liveness/pacemaker.rs:350] Round 1 has timed out, broadcasting new round message to all replicas
W1017 18:00:45.828615 140075560072960 consensus/src/chained_bft/event_processor.rs:392] Round 1 timed out: already executed and voted at this round, expected round proposer was ["8deeeaed", "57ff8374"], broadcasting new round to all replicas
W1017 18:00:45.833560 140075560072960 network/src/protocols/direct_send/mod.rs:315] DirectSend to peer 1e5d5a74 failed: Peer not connected
W1017 18:00:45.834594 140075560072960 network/src/protocols/direct_send/mod.rs:315] DirectSend to peer 57ff8374 failed: Peer not connected
W1017 18:00:45.835450 140075560072960 network/src/protocols/direct_send/mod.rs:315] DirectSend to peer 8deeeaed failed: Peer not connected
W1017 18:00:45.836275 140075560072960 network/src/protocols/direct_send/mod.rs:315] DirectSend to peer ab0d6a54 failed: Peer not connected
W1017 18:00:46.830899 140075560072960 consensus/src/chained_bft/liveness/pacemaker.rs:350] Round 1 has timed out, broadcasting new round message to all replicas

In seed_peers.config.toml I have the following, which matches the Terraform config:

[seed_peers]
8deeeaed65f0cd7484a9e4e5ac51fbac548f2f71299a05e000156031ca78fb9f = ["/ip4/10.138.0.12/tcp/6180"]

When I start a non-seed node under strace, I don’t see it making any connect() calls to anything but localhost, and I see no attempts on port 6180 whatsoever (only localhost ports 6182, 6184, and 6191, all successful).
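
(For the record, I’m tracing it roughly like this, watching for the network syscalls:)

  strace -f -e trace=network cargo run --bin libra-node -- -f node.config.toml 2>&1 | grep connect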

The only reference I see to the IP address configured in seed_peers.config.toml is in the write() call it uses to print the NodeConfig:

[pid  4436] write(1, "})), network_identity_pubkey: X2"..., 1017})), network_identity_pubkey: X25519StaticPublicKey(X25519PublicKey(PublicKey(MontgomeryPoint([122, 233, 30, 41, 7, 70, 71, 85, 201, 221, 195, 206, 47, 27, 64, 54, 36, 185, 204, 198, 27, 75, 232, 200, 56, 140, 226, 140, 71, 172, 49, 32])))) }} }, network_peers_file: "network_peers.config.toml", seed_peers: SeedPeersConfig { seed_peers: {"8deeeaed65f0cd7484a9e4e5ac51fbac548f2f71299a05e000156031ca78fb9f": ["/ip4/10.138.0.12/tcp/6180"]} }, seed_peers_file: "seed_peers.config.toml" }]

I’ve tried using external IP addresses in addition to RFC 1918 addresses to no avail (and confirmed both are connectable on 6180 with netcat). I’ve even tried listing every peer’s IP address in seed_peers.config.toml to no avail.

It seems particularly weird that peers are seemingly saying Peer not connected about what is ostensibly their own peer ID. I’m also curious why the peer_id they print on boot seems to correspond to the network_peers.config.toml's ni (network identity?) entry under the correspondingly named table. Should I be using the ni value in seed_peers.config.toml instead (rather than the one that libra-config uses in the generated key filenames)? Trying that quickly, it also doesn’t help.

1 Like

I figured out the problem.

If is_permissioned is set to false, the seed nodes are never configured.
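
In other words, the network section of each node’s node.config.toml needs something like this for the seed peers to actually be used (the exact section name may differ by revision; is_permissioned is the key that mattered here):

  # in node.config.toml, within the validator network configuration
  is_permissioned = true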