Ledger State Si map key collision

Technical whitepaper, page 14, footnote 7:

The hash of the address is used as the key. Even though addresses are hashes of the public key, an adversary could create a new account at an address that is a near collision with a real address. While no transactions can be sent from this account because the private key corresponding to its address is unknown, the existence of this near-collision in the tree increases the proof length for the nearby address. Hashing the address before using it as a key reduces the impact of this type of attack.

Does the use of one more hash really make a difference in this regard? An adversary could still brute-force key generation to get an account that would give a key that is a near collision to another account in the map.

To really deter an attacker from generating keys until they get a near collision with a real account, the single round of extra hashing would have to be replaced by a KDF or other form of cost that is configured to be negligible for regular users but costly for an adversary.

1 Like

Hi @Gustav-Simonsson,

Near-collision in the sparse merkle tree is not a security problem. Instead, it is a potential waste of storage, computation power or bandwidth. And once it happens, it can never be corrected unless a hard fork. So I’d say it is potentially a DOS attack vector.

It is true that one more round of hash means that an adversary could still brute-force near collision. But the question is how near is near enough to be effective. From the block hash of Bitcoin, we assume that brute-forcing around 70 bits of prefix of a hash is already super difficult. And after that much of computation, that’s just one single malicious account.

Without that round of hash, an adversary could easily make large bunch of malicious addresses that are just few bits away from each other, or in other words, more than 200 bits of common prefix. Only that could lead to some real serious problems to the sparse merkle tree.

So a single more round of hash is effective enough to mitigate the potential attack vector.

That makes sense, thanks for clarifying! Could be worth to add a note in the paper on the number of common prefix bits that would potentially be a DoS for the tree.

Aside from creating (unspendable) accounts close to existing account addresses, some users of existing systems have (not maliciously) expended a surprising amount of computing power on vanity addresses (e.g. https://en.bitcoin.it/wiki/Vanitygen) and this will likely happen in libra too. While this wouldn’t be a problem for the tree its still nice that the 2nd round of hashing avoids such addresses affecting the structure of the tree.