
Performance improvement: Non-recursively drop TokenStreams with a visitor #489

Closed

Conversation

WalkerKnapp

Hello! While running the benchmarks in the syn repository under a profiler on my system, I noticed that an outsized amount of time is spent in the non-recursive Drop implementation for TokenStream that was introduced in #232 to solve the stack overflow noted in #55. A large portion of this time can be eliminated by using a visitor pattern to drop segments of the TokenStream instead of flattening groups into the inner Vec.

This adds some complexity, which may be undesirable, but (at least in my tests) it yields good results when repeatedly running those benchmarks.
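For context, the existing non-recursive Drop from #232 can be sketched roughly like this (a minimal toy model with hypothetical types, not proc-macro2's actual code): every token is moved into a shared worklist Vec, and nested groups are flattened into it, so no destructor ever recurses no matter how deeply the input is nested.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts tokens dropped, just for the demo below.
static TOKENS_DROPPED: AtomicUsize = AtomicUsize::new(0);

// Toy stand-ins for proc_macro2::TokenTree / TokenStream.
enum TokenTree {
    Ident(String),
    Group(TokenStream),
}

#[derive(Default)]
struct TokenStream {
    inner: Vec<TokenTree>,
}

impl Drop for TokenStream {
    fn drop(&mut self) {
        // Flatten nested groups into one worklist instead of letting
        // each Group's destructor recurse into its children.
        let mut stack = std::mem::take(&mut self.inner);
        while let Some(tree) = stack.pop() {
            TOKENS_DROPPED.fetch_add(1, Ordering::Relaxed);
            if let TokenTree::Group(mut group) = tree {
                // Steal the group's tokens; the group itself then
                // drops with an empty Vec, so its own Drop does no work.
                stack.append(&mut group.inner);
            }
        }
    }
}

// Builds a stream nested `depth` Groups deep with one leaf token.
fn nest(depth: usize) -> TokenStream {
    let mut ts = TokenStream {
        inner: vec![TokenTree::Ident("leaf".to_string())],
    };
    for _ in 0..depth {
        ts = TokenStream {
            inner: vec![TokenTree::Group(ts)],
        };
    }
    ts
}

fn main() {
    // A depth this large would overflow the stack under a naive
    // recursive Drop; the worklist handles it iteratively.
    drop(nest(100_000));
    // prints "dropped 100001 tokens" (100_000 Groups + 1 Ident)
    println!("dropped {} tokens", TOKENS_DROPPED.load(Ordering::Relaxed));
}
```

Note the cost this sketch makes visible: every non-Group token takes an extra hop through the worklist Vec before it is finally dropped, which is the overhead this PR is aiming to avoid.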

Running syn's rust benchmark with proc-macro2 1.0.92:

tokenstream_parse: elapsed=0.602s (+/- 0.005s)
syn_parse: elapsed=2.558s (+/- 0.024s)

Running syn's rust benchmark with this fork:

tokenstream_parse: elapsed=0.563s (+/- 0.005s)
syn_parse: elapsed=2.236s (+/- 0.030s)

This yields about a 7% performance improvement in the tokenstream_parse test and a 12% performance improvement in the syn_parse test on my system.

Opening as a draft to get feedback and testing with other use-cases. If no regressions are found, I will mark it ready for review. Thanks!

@dtolnay
Owner

Thanks for the PR!

If I understand correctly, the speedup in this PR comes from moving non-Group tokens from heap to stack directly, instead of heap to heap to stack. In real-world inputs there are always far more non-Group tokens than Group tokens, so that reduction in moving matters.

To implement that, you have created a linked list of Groups, where a new linked list node is heap-allocated for every Group token visited (other than top-level ones).

I don't think allocating a separate linked list node for every Group is the best representation for this algorithm. Would you be willing to try benchmarking this against #490?
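The linked-list representation described above might look roughly like this (a hypothetical sketch with invented names, not the actual code from this PR or #490): each nested Group suspends the current stream behind a freshly heap-allocated list node, and non-Group tokens are dropped in place without passing through a shared worklist.

```rust
// Toy token types; a Group owns its sub-stream directly.
enum TokenTree {
    Ident(String),
    Group(Vec<TokenTree>),
}

// One node per stream whose traversal is in progress. Every nested
// Group visited costs one Box allocation, which is the overhead
// questioned above.
struct Node {
    tokens: std::vec::IntoIter<TokenTree>,
    next: Option<Box<Node>>,
}

// Iteratively drops a stream, returning how many tokens were visited.
fn drop_stream(top: Vec<TokenTree>) -> usize {
    let mut visited = 0;
    let mut head = Some(Box::new(Node { tokens: top.into_iter(), next: None }));
    while let Some(mut node) = head.take() {
        match node.tokens.next() {
            Some(TokenTree::Group(inner)) => {
                visited += 1;
                // Descend into the group; the current stream waits
                // behind a newly allocated list node.
                head = Some(Box::new(Node {
                    tokens: inner.into_iter(),
                    next: Some(node),
                }));
            }
            Some(other) => {
                visited += 1;
                drop(other); // non-Group token freed immediately
                head = Some(node);
            }
            None => head = node.next, // stream exhausted; resume parent
        }
    }
    visited
}

fn main() {
    let mut ts = vec![TokenTree::Ident("leaf".to_string())];
    for _ in 0..100_000 {
        ts = vec![TokenTree::Group(ts)];
    }
    // Traversal is iterative, so deep nesting cannot overflow the stack.
    println!("visited {} tokens", drop_stream(ts)); // visited 100001 tokens
}
```

A Vec used as an explicit stack of pending iterators would give the same traversal order while amortizing those per-Group allocations, which appears to be the direction #490 was benchmarked against.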

@WalkerKnapp
Author

Sure! This is what I'm seeing with that branch on my machine:

tokenstream_parse: elapsed=0.563s (+/- 0.039s)
syn_parse: elapsed=2.240s (+/- 0.047s)

Seems to be within the margin of error of my fork, and I'd agree that your solution is quite a bit simpler :). I'll close this as obsolete in that case, but I'm glad to have been able to spark the idea!

@dtolnay
Owner

dtolnay commented Jan 11, 2025

Published in 1.0.93.

I am glad you are profiling syn benchmarks. I have not looked at that in years.

You might be interested in picking up #336.
