Parallel Execution of Transactions #152

ajlopez · 2019-12-17T18:28:55Z

Description

This improvement proposals specifies a way to run transactions IN A BLOCK in parallel. To do so, the transactions in a block should be distributed in one more than one lists of transactions. To know these lists, two approaches are proposed:

The block builder informs the lists, having executed the transactions in a sequential way, and knowing the data they read/write
The block executor make a guess, and then, if the execution of transactions results in the incompatibility of parallelization, back to normal sequential execution.

Notably, this proposal could be implemented without using any internal information about token contracts or other specific information from transactions. In some way, it is transaction agnostic.

And many of these improvements could be implemented as a soft fork.

Notation

Ta, Tb: Different transactions (different lower case letter)
Ta < Tb: The first transactions is before than the second one in the block
Ta << Tb: Tb SHOULD be executed AFTER Ta
Ta || Tb: Tb COULD be executed in parallel with Ta
Ta <> Tb: Ta CANNOT be executed in parallel with Tb

Data

A datum can be read or write/change during the execution of transaction. Account balance, storage cell are examples of datum.

Some facts

If Ta and Tb have the same sender, and Ta nonce is less than Tb nonce, then Ta << Tb
If Ta reads datum X and Tb writes datum X, then Ta <> Tb
If Ta writes datum X and Tb writes datum X, then Ta <> Tb
If Ta does not invoke a contract, and Ta sender != Tb sender, then Ta || Tb
If not (Ta << Tb) AND not (Ta <> Tb) then Ta || Tb

These relations could be discovered during the sequencial execution of transactions during the block building process in miners. Then, the miner, given the data used by each transaction and any read/write conflict, could divide the transactions in many lists of transactions, that could be run in parallel. Such sublists could be informed inside the block, ie: additional data in the Remasc transaction. The rest of the nodes, could read such extra information and execute the block in parallel, then the state root at the end is the same than the state root obtained from sequential full execution of the block. The nodes that not have implemented this parallel execution approach, still could verify the block using sequential execution.

The other solution is: the block is unchanged, but each execution node could make a guess about the grouping of transactions, without executing them. During the execution, they could check if two parallelized transactions Ta Tb broke the above condition about read/write the same datum. In such case, the execution node back to sequential execution.

The first solution (mining added clue), could be improved with a reward for the miner that adds that information. Except for this reward, all the above algorithm and both solutions could be implemented as a soft fork.

SergioDemianLerner · 2019-12-24T17:07:37Z

Let's first number the rules for convenience:

R1. If Ta and Tb have the same sender, and Ta nonce is less than Tb nonce, then Ta << Tb
R2. If Ta reads datum X and Tb writes datum X, then Ta <> Tb
R3. If Ta writes datum X and Tb writes datum X, then Ta <> Tb
R4. If Ta does not invoke a contract, and Ta sender != Tb sender, then Ta || Tb
R5. If not (Ta << Tb) AND not (Ta <> Tb) then Ta || Tb

Now my comments:

In the definition of "<<" by SHOULD you mean MUST.
The relation Ta || Tb seems unnecessary to be defined explicitly by R4. Every two transactions where there is not statement Ta <> Tb means they could be tried to be concurrently executed. Therefore R4 is unnecessary.
The relation R5 should be replaced simply by If (Ta << Tb) then Ta <> Tb

SergioDemianLerner · 2019-12-24T17:10:42Z

Also it's important for any new RSKIP that tries to solve the same problem of another RSKIP by different means to reference the previous attempts and highlight the differences. In this case RSKIP3, RSKIP4 and RSKIP144.

SergioDemianLerner · 2019-12-24T17:21:20Z

Another problem is that there is no rule that makes unparallelizable two transactions Ta and Tb if:

Ta creates a contract and address X and passes some value to X, and then suicides it.
Tb reads the balance of X

In this case, there is no "datum" that both had accessed. Therefore there must be a clear definition of what a "datum" is and a hint of how these datum changes will be tracked.

ajlopez · 2020-02-04T13:25:53Z

Another problem is that there is no rule that makes unparallelizable two transactions Ta and Tb if:

Ta creates a contract and address X and passes some value to X, and then suicides it.

Tb reads the balance of X

In this case, there is no "datum" that both had accessed. Therefore there must be a clear definition of what a "datum" is and a hint of how these datum changes will be tracked.

The datum is the account state of X, The execution engine has the list of keys accessed (for read or for write), and in the above case, it has: X was accessed for read and write by Ta, then X was access for read, I wrote explicitly that account balance is to be considered a datum. We could extend it to the account state in full

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parallel Execution of Transactions #152

Parallel Execution of Transactions #152

ajlopez commented Dec 17, 2019 •

edited

Loading

SergioDemianLerner commented Dec 24, 2019

SergioDemianLerner commented Dec 24, 2019

SergioDemianLerner commented Dec 24, 2019

ajlopez commented Feb 4, 2020

Parallel Execution of Transactions #152

Parallel Execution of Transactions #152

Comments

ajlopez commented Dec 17, 2019 • edited Loading

Description

Notation

Data

Some facts

SergioDemianLerner commented Dec 24, 2019

SergioDemianLerner commented Dec 24, 2019

SergioDemianLerner commented Dec 24, 2019

ajlopez commented Feb 4, 2020

ajlopez commented Dec 17, 2019 •

edited

Loading