Implemented updateMany method as required in issue #117 #314

Merged
merged 13 commits into from
Sep 9, 2024
53 changes: 53 additions & 0 deletions packages/lean-imt/src/lean-imt.ts
@@ -233,6 +233,59 @@ export default class LeanIMT<N = bigint> {
this._nodes[this.depth] = [node]
}

/**
* Updates m leaves all at once.
Member

Re the time complexity of the algorithm.

n: Number of leaves of the tree.

I think that the worst case for updating multiple elements at once is updating all the leaves, so the number of leaves to update is bounded by n.

In that worst case, calling the update function in a loop would be O(n log n), because you need to go from a leaf to the root (log n operations) n times.

I think that the worst case for this updateMany function is O(n): since you would be updating all the leaves, it's like rebuilding the tree. Rebuilding the tree is O(n) because the number of operations when building a tree from the leaves is n + ceiling(n/2) + ceiling(n/4) + ... + ceiling(n/(2^d)), which is <= 2*n + O(log n) and therefore O(n).

Number of operations when building the tree (or updating all the elements in the tree at once):

$$ n + \lceil \frac{n}{2} \rceil + \lceil \frac{n}{4} \rceil + \dots + \lceil \frac{n}{2^d} \rceil = \sum_{k=0}^{d} \lceil \frac{n}{2^k} \rceil $$

where $d$ is the depth of the tree, so $d = O(\log n)$. Since $\lceil x \rceil \leq x + 1$:

$$\sum_{k=0}^{d} \lceil \frac{n}{2^k} \rceil \leq \sum_{k=0}^{d} (\frac{n}{2^k} + 1) = \sum_{k=0}^{d} \frac{n}{2^k} + \sum_{k=0}^{d} 1$$

The two sums are bounded separately:

$$\sum_{k=0}^{d} \frac{n}{2^k} = n \sum_{k=0}^{d} \frac{1}{2^k} < 2n$$

$$\sum_{k=0}^{d} 1 = d + 1 = O(\log n)$$

Therefore:

$$\sum_{k=0}^{d} \lceil \frac{n}{2^k} \rceil \leq 2n + O(\log n) = O(n)$$

For more reference on this, you can take a look at the InsertMany time complexity in the LeanIMT paper: https://github.com/vplasencia/leanimt-paper/blob/main/paper/leanimt-paper.pdf

Does it make sense to explain it like this?

If you use n instead of m, you will also get O(n). Does it make sense to modify your proof assuming that the worst case for m is n since we are calculating big O notation?
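The bound on the rebuild cost can be checked numerically. The following is a minimal sketch and not part of the PR: `nodesWhenRebuilding` is a hypothetical helper that sums ceiling(n/2^k) level by level, assuming the depth d is the number of halvings needed to reach a single root, and the loop below compares the total against 2n + d + 1.

```typescript
// Hypothetical helper (not part of LeanIMT): counts the nodes written when
// every leaf of a tree with n leaves is updated, i.e. the sum of
// ceil(n / 2^k) for k = 0..d.
// Assumption: d is the number of times n must be halved (rounding up)
// before a single root node remains.
function nodesWhenRebuilding(n: number): { total: number; depth: number } {
    let total = 0
    let depth = 0
    let width = n // number of nodes on the current level

    while (width > 1) {
        total += width
        width = Math.ceil(width / 2) // the level above has half as many nodes
        depth += 1
    }
    total += width // the root level (adds 0 for an empty tree)

    return { total, depth }
}

// Compare the exact count with the bound 2n + d + 1 from the derivation.
for (const n of [1, 5, 8, 1000]) {
    const { total, depth } = nodesWhenRebuilding(n)
    console.log(`n=${n} total=${total} bound=${2 * n + depth + 1}`)
}
```

The count includes the leaf level itself, matching the first term n of the sum.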

Contributor Author

If you set m = n, the complexity as I stated it becomes O(n) since log(n)-log(m) = 0. I think it makes sense to add this case separately to the complexity statement, but maintaining the whole explanation with a smaller m.

That is, my proof is a generalization of the case m=n, so I don't see why we should take that out. Completely agree with mentioning that the algorithm is O(n), though.

Member

Isn't the worst case the one where the leaves to update share as few ancestors as possible at every level? In that case, the time complexity would be quite similar to that of the update function in a loop, i.e. O(n * log n).

Member

Ok, I also meant n as the size of the subset of nodes to be updated (and not the number of all leaves). We are on the same page 👍🏽

Contributor Author

Yes, but as the number of leaves to update increases, the condition of "having few common ancestors" becomes harder and harder to satisfy, to the point where common ancestors inevitably appear. As a result, we never reach n*log(n) operations.
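To make the point about shared ancestors concrete, here is a rough sketch (an illustration, not code from the PR). `countRewrittenNodes` is a hypothetical helper that mirrors the level-by-level set propagation of `updateMany` and counts how many non-leaf nodes get rewritten. The naive figure assumes that each separate `update` call rewrites one node per level, and `size` is assumed to be the number of leaves in the tree.

```typescript
// Hypothetical helper (not part of LeanIMT): compares how many non-leaf nodes
// are rewritten by a batched update versus one 'update' call per leaf.
function countRewrittenNodes(indices: number[], size: number): { batched: number; naive: number } {
    // Depth of a tree with 'size' leaves: halve (rounding up) until one node remains.
    let depth = 0
    for (let width = size; width > 1; width = Math.ceil(width / 2)) {
        depth += 1
    }

    // Parents of the modified leaves; the Set merges leaves that share a parent.
    let current = new Set<number>(indices.map((i) => i >> 1))
    let batched = 0

    for (let level = 1; level <= depth; level += 1) {
        batched += current.size // every outdated node on this level is rewritten once
        const next = new Set<number>()
        current.forEach((i) => next.add(i >> 1))
        current = next
    }

    // Assumption: each separate 'update' call walks all 'depth' levels.
    return { batched, naive: indices.length * depth }
}

// Four leaves spread as far apart as possible in a tree with 1024 leaves.
console.log(countRewrittenNodes([0, 256, 512, 768], 1024))

// Every leaf of the same tree.
const allLeaves = Array.from({ length: 1024 }, (_, i) => i)
console.log(countRewrittenNodes(allLeaves, 1024))
```

With few, widely spread leaves the two counts stay close; as more leaves are updated, the paths merge on the upper levels and the batched count falls well below the naive one.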

Member

Hey @ChinoCribioli, could you update the updateMany function documentation with something similar to what you have at the beginning, and move the time complexity analysis to a new issue called something like "UpdateMany time complexity", adding a comment with your proof so that we can discuss it there? Then, if you want, you could add it to the LeanIMT paper. What do you think?

Contributor Author

@ChinoCribioli ChinoCribioli Sep 2, 2024

@vplasencia Done. I changed the documentation to a more minimalist description and created this issue to discuss the deeper complexity analysis.

After the discussion is closed, you may reach out to me to add the whole analysis to the LeanIMT paper.

Member

Great! Thank you very much! 🙏

* It is more efficient than calling the {@link LeanIMT#update} method m times because it
* avoids recomputing intermediate nodes several times, which would happen when updating
* leaves with common ancestors. The naive approach of calling 'update' m times has
* complexity O(m*log(n)) (where n is the number of leaves of the tree), which becomes
* O(n*log(n)) when m ~ n. With this approach, the cost is O(n) in that case because every
* node is updated at most once and there are around 2*n nodes in the tree.
* @param indices The list of indices of the respective leaves.
* @param leaves The list of leaves to be updated.
*/
public updateMany(indices: number[], leaves: N[]) {
requireDefined(leaves, "leaves")
requireDefined(indices, "indices")
requireArray(leaves, "leaves")
requireArray(indices, "indices")

if (leaves.length !== indices.length) {
throw new Error("There is no correspondence between indices and leaves")
}
// This will keep track of the outdated nodes of each level.
let modifiedIndices = new Set<number>()
for (let i = 0; i < indices.length; i += 1) {
requireNumber(indices[i], `index ${i}`)
if (indices[i] < 0 || indices[i] >= this.size) {
throw new Error(`Index ${i} is out of range`)
}
if (modifiedIndices.has(indices[i])) {
throw new Error(`Leaf ${indices[i]} is repeated`)
}
modifiedIndices.add(indices[i])
}

modifiedIndices.clear()
// First, modify the first level, which consists only of raw, un-hashed values
for (let leaf = 0; leaf < indices.length; leaf += 1) {
this._nodes[0][indices[leaf]] = leaves[leaf]
modifiedIndices.add(indices[leaf] >> 1)
}

// Now update each node of the corresponding levels
for (let level = 1; level <= this.depth; level += 1) {
const newModifiedIndices: number[] = []
for (const index of modifiedIndices) {
const leftChild = this._nodes[level - 1][2 * index]
const rightChild = this._nodes[level - 1][2 * index + 1]
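// Compare against undefined explicitly: a falsy value such as 0n is still a valid right child.
this._nodes[level][index] = rightChild !== undefined ? this._hash(leftChild, rightChild) : leftChild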
newModifiedIndices.push(index >> 1)
}
modifiedIndices = new Set<number>(newModifiedIndices)
}
}

/**
* It generates a {@link LeanIMTMerkleProof} for a leaf of the tree.
* That proof can be verified by this tree using the same hash function.
114 changes: 114 additions & 0 deletions packages/lean-imt/tests/lean-imt.test.ts
@@ -220,6 +220,120 @@ describe("Lean IMT", () => {
})
})

describe("# updateMany", () => {
it(`Should not update any leaf if one of the parameters is not defined`, () => {
const tree = new LeanIMT(poseidon, leaves)

const fun1 = () => tree.updateMany([1], undefined as any)
const fun2 = () => tree.updateMany(undefined as any, [BigInt(1)])

expect(fun1).toThrow("Parameter 'leaves' is not defined")
expect(fun2).toThrow("Parameter 'indices' is not defined")
})

it(`Should not update any leaf if the parameters are not arrays`, () => {
const tree = new LeanIMT(poseidon, leaves)

const fun1 = () => tree.updateMany([3], BigInt(1) as any)
const fun2 = () => tree.updateMany(3 as any, [BigInt(1)])

expect(fun1).toThrow("Parameter 'leaves' is not an Array instance")
expect(fun2).toThrow("Parameter 'indices' is not an Array instance")
})

it(`Should not update any leaf if the parameters are of different size`, () => {
const tree = new LeanIMT(poseidon, leaves)

const fun1 = () => tree.updateMany([1, 2, 3], [BigInt(1), BigInt(2)])
const fun2 = () => tree.updateMany([1], [])

expect(fun1).toThrow("There is no correspondence between indices and leaves")
expect(fun2).toThrow("There is no correspondence between indices and leaves")
})

it(`Should not update any leaf if some index is not a number`, () => {
const tree = new LeanIMT(poseidon, leaves)

const fun1 = () => tree.updateMany([1, "hello" as any, 3], [BigInt(1), BigInt(2), BigInt(3)])
const fun2 = () => tree.updateMany([1, 2, undefined as any], [BigInt(1), BigInt(2), BigInt(3)])

expect(fun1).toThrow("Parameter 'index 1' is not a number")
expect(fun2).toThrow("Parameter 'index 2' is not a number")
})

it(`Should not update any leaf if some index is out of range`, () => {
const tree = new LeanIMT(poseidon, leaves)

const fun1 = () => tree.updateMany([-1, 2, 3], [BigInt(1), BigInt(2), BigInt(3)])
const fun2 = () => tree.updateMany([1, 200000, 3], [BigInt(1), BigInt(2), BigInt(3)])
const fun3 = () => tree.updateMany([1, 2, tree.size], [BigInt(1), BigInt(2), BigInt(3)])

expect(fun1).toThrow("Index 0 is out of range")
expect(fun2).toThrow("Index 1 is out of range")
expect(fun3).toThrow("Index 2 is out of range")
})

it(`Should not update any leaf when passing an empty list`, () => {
const tree = new LeanIMT(poseidon, leaves)
const previousRoot = tree.root

tree.updateMany([], [])

expect(tree.root).toBe(previousRoot)
})

it(`'updateMany' with 1 change should be the same as 'update'`, () => {
const tree1 = new LeanIMT(poseidon, leaves)
const tree2 = new LeanIMT(poseidon, leaves)

tree1.update(4, BigInt(-100))
tree2.updateMany([4], [BigInt(-100)])
expect(tree1.root).toBe(tree2.root)

tree1.update(0, BigInt(24))
tree2.updateMany([0], [BigInt(24)])
expect(tree1.root).toBe(tree2.root)
})

it(`'updateMany' should be the same as executing the 'update' function multiple times`, () => {
const tree1 = new LeanIMT(poseidon, leaves)
const tree2 = new LeanIMT(poseidon, leaves)

const indices = [0, 2, 4]

const nodes = [BigInt(10), BigInt(11), BigInt(12)]

for (let i = 0; i < indices.length; i += 1) {
tree1.update(indices[i], nodes[i])
}
tree2.updateMany(indices, nodes)

expect(tree1.root).toBe(tree2.root)
})

it(`'updateMany' with repeated indices should fail`, () => {
const tree = new LeanIMT(poseidon, leaves)

const fun = () => tree.updateMany([4, 1, 4], [BigInt(-100), BigInt(-17), BigInt(1)])

expect(fun).toThrow("Leaf 4 is repeated")
})

it(`Should update leaves correctly`, () => {
const tree = new LeanIMT(poseidon, leaves)

const updateLeaves = [BigInt(24), BigInt(-10), BigInt(100000)]
tree.updateMany([0, 1, 4], updateLeaves)

const h1_0 = poseidon(updateLeaves[0], updateLeaves[1])
const h1_1 = poseidon(leaves[2], leaves[3])
const h2_0 = poseidon(h1_0, h1_1)
const updatedRoot = poseidon(h2_0, updateLeaves[2])

expect(tree.root).toBe(updatedRoot)
})
})

describe("# generateProof", () => {
it(`Should not generate any proof if the index is not defined`, () => {
const tree = new LeanIMT(poseidon, leaves)