Scale exploration with Gini impurity of policy values #44

Merged
merged 9 commits into from
Aug 15, 2024

XInTheDark
Contributor

First, when computing the policy values for a position, we also calculate their Gini impurity, defined as (1 - sum of squares of policy values). A high Gini impurity indicates that the policy is spread across many strong candidate moves; a low Gini impurity indicates that it is concentrated on one or a few moves.
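As a minimal sketch of the definition above (not the code in this patch), the Gini impurity of a policy can be computed as follows. The function name `gini_impurity` and the assumption that the policy is a list of probabilities summing to 1 are illustrative only.

```python
def gini_impurity(policy):
    """Return 1 - sum(p_i^2) for a policy given as probabilities.

    Assumes (hypothetically) that `policy` is already normalised,
    i.e. its entries are non-negative and sum to 1.
    """
    return 1.0 - sum(p * p for p in policy)


# One move holds all the policy mass: impurity is 0.
forced = gini_impurity([1.0])

# Four equally likely moves: impurity is 1 - 4 * (1/4)^2 = 0.75.
balanced = gini_impurity([0.25, 0.25, 0.25, 0.25])
```

For a uniform policy over n moves the value is 1 - 1/n, so it approaches but never reaches 1.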

This Gini impurity is then used to adjust the exploration scaling through a logarithmic formula. For higher values of Gini impurity, we decrease the exploration value so that the search focuses on variations with high q values. Conversely, in positions with low Gini impurity, where one move is rated much better than the others, we increase the exploration value so that other potential lines are not discarded prematurely.
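The shape of such a scaling can be sketched as below. This only illustrates a logarithmic, decreasing, capped multiplier; the function name `exploration_scale` and the constants `base`, `slope`, `eps`, and `cap` are hypothetical placeholders, not the tuned values used in the patch.

```python
import math


def exploration_scale(gini, base=0.5, slope=1.5, eps=1e-3, cap=2.0):
    """Map a Gini impurity in [0, 1) to an exploration multiplier.

    All constants are hypothetical. `eps` keeps the logarithm finite
    when gini == 0, and `cap` bounds the boost for very sharp policies.
    """
    return min(base - slope * math.log(gini + eps), cap)


# Sharp policy (low impurity): the multiplier hits the cap.
sharp = exploration_scale(0.0)

# Flat policy (high impurity): the multiplier is smaller.
flat = exploration_scale(0.75)
```

Because the logarithm is increasing in `gini`, the multiplier decreases as impurity grows, which matches the behaviour described above.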

The idea to use Gini impurity was first proposed and tested by @Viren6.

This patch was shown to affect the quality of the data produced, so it has been intentionally excluded from datagen.

Passed STC: https://montychess.org/tests/view/66aef6280f6f1e65cfa2b1f8
LLR: 2.93 (-2.94,2.94) <0.00,4.00>
Total: 6944 W: 1626 L: 1460 D: 3858
Ptnml(0-2): 54, 787, 1643, 915, 73

Passed LTC: https://montychess.org/tests/view/66af1fd90f6f1e65cfa2b235
LLR: 2.93 (-2.94,2.94) <1.00,5.00>
Total: 8718 W: 1877 L: 1705 D: 5136
Ptnml(0-2): 40, 931, 2255, 1083, 50

Rebased STC: https://montychess.org/tests/view/66b053380f6f1e65cfa2b657
LLR: 2.91 (-2.94,2.94) <0.00,4.00>
Total: 9216 W: 2083 L: 1913 D: 5220
Ptnml(0-2): 82, 998, 2291, 1142, 95

2nd Rebased STC: https://montychess.org/tests/view/66bd91bf68e8f7e2fe23cfde
LLR: 3.03 (-2.94,2.94) <0.00,4.00>
Total: 4448 W: 1158 L: 982 D: 2308
Ptnml(0-2): 55, 487, 997, 597, 88
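For readers who want a rough feel for the size of the gain, the W/L/D totals above can be converted to a point estimate of the Elo difference with the standard logistic formula. This is only a sketch: the helper name `elo_from_wld` is an assumption, and the estimate ignores the pentanomial statistics and error bars that the test framework reports.

```python
import math


def elo_from_wld(wins, losses, draws):
    """Point estimate of the Elo difference from win/loss/draw counts."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average score per game
    return 400.0 * math.log10(score / (1.0 - score))


# Totals from the first STC run listed above.
stc_elo = elo_from_wld(1626, 1460, 3858)

# Sanity check: the pentanomial counts cover game pairs, so they
# should sum to half the number of games.
stc_pairs = 54 + 787 + 1643 + 915 + 73
```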

Bench: 1317423

@jw1912 jw1912 merged commit 7b92be7 into official-monty:master Aug 15, 2024
3 checks passed
Viren6 pushed a commit that referenced this pull request Aug 16, 2024
Viren6 added a commit that referenced this pull request Aug 16, 2024
Co-Authored-By: Viren6 <[email protected]>
2 participants