Speeding up file walk #589

Closed
wants to merge 7 commits into from

deniszh (Member) commented Jun 13, 2024

This is similar to my old PR #329, but it uses github.com/charlievieth/fastwalk, which is still being updated, instead of cwalk, which is 4 years old.

Why is it needed? On really big and powerful servers with many metrics, the file walk is slow. I tried WalkDir, which is faster nowadays, but fastwalk is what really gives a performance gain.

For example, with a little over 55M metrics, file_scan_runtime was 28302 seconds; after this change it is 2069 seconds (roughly 13.7 times faster).

The file walk is sane about the number of workers: a minimum of 4, otherwise equal to the number of CPUs, but never more than 32.
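
The worker rule described above can be sketched as a simple clamp. This is only an illustration of the stated bounds (minimum 4, CPU count in between, maximum 32); `clampWorkers` is a hypothetical helper name, not a function taken from this PR or from fastwalk.

```go
package main

import (
	"fmt"
	"runtime"
)

// clampWorkers is a hypothetical helper illustrating the rule from the
// description: at least 4 workers, at most 32, otherwise the CPU count.
func clampWorkers(numCPU int) int {
	const (
		minWorkers = 4
		maxWorkers = 32
	)
	if numCPU < minWorkers {
		return minWorkers
	}
	if numCPU > maxWorkers {
		return maxWorkers
	}
	return numCPU
}

func main() {
	// runtime.NumCPU reports the logical CPUs usable by this process.
	fmt.Println("workers on this machine:", clampWorkers(runtime.NumCPU()))
}
```

The lower bound keeps some parallelism on small machines, since a file walk is mostly I/O bound; the upper bound presumably avoids oversubscribing the disk on very large hosts.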

deniszh (Member, Author) commented Jun 17, 2024

OK, I'll probably close this for now.
Reason: the concurrent file walk creates too much contention on the trie index, which was not designed for parallel inserts. I tried to isolate the index as a whole behind a mutex, but the contention still looks too high. So I'll put this aside and maybe return to it soon, once the indices are optimized.
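
To illustrate the problem, here is a minimal sketch of several walker goroutines inserting into one trie guarded by a single mutex. It is a toy model, not the project's actual index: `lockedTrie`, `Insert`, and `insertConcurrently` are hypothetical names, and the dot-separated metric names are an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// node is one name component in a toy trie (not the real index).
type node struct {
	children map[string]*node
	leaf     bool
}

// lockedTrie guards the whole trie with one mutex, so inserts from
// all walker goroutines are fully serialized.
type lockedTrie struct {
	mu    sync.Mutex
	root  *node
	count int
}

func newLockedTrie() *lockedTrie {
	return &lockedTrie{root: &node{children: map[string]*node{}}}
}

// Insert adds a dot-separated metric name. The global lock is held for
// the entire descent, which is where parallel walkers pile up.
func (t *lockedTrie) Insert(metric string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	cur := t.root
	for _, part := range strings.Split(metric, ".") {
		next, ok := cur.children[part]
		if !ok {
			next = &node{children: map[string]*node{}}
			cur.children[part] = next
		}
		cur = next
	}
	if !cur.leaf {
		cur.leaf = true
		t.count++
	}
}

// Len returns the number of distinct metrics inserted so far.
func (t *lockedTrie) Len() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.count
}

// insertConcurrently simulates `workers` walkers, each reporting
// `perWorker` distinct metrics to the shared trie.
func insertConcurrently(t *lockedTrie, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				t.Insert(fmt.Sprintf("host%d.cpu.metric%d", w, i))
			}
		}(w)
	}
	wg.Wait()
}

func main() {
	idx := newLockedTrie()
	insertConcurrently(idx, 8, 1000)
	fmt.Println("metrics indexed:", idx.Len())
}
```

The result is correct, but only one goroutine can modify the trie at a time, so adding walkers mostly adds time spent waiting on the lock rather than insert throughput.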

@deniszh deniszh closed this Jun 17, 2024
@deniszh deniszh deleted the fastwalk branch June 22, 2024 19:13