Consider pros and cons of alternative data structure for pool #6
You don't perform operations on the pool. It needs exactly one operation:

The pros and cons of alternative data structures are not about making any other operations efficient.

Custom hash function hashset/dict
Pro: Could be faster than the current implementation.
Con: Work to implement basically from the ground up.

TreeSet/dict
Pro: Comparisons could be faster than the current hash function.
Con: Gets slower the more elements it contains. Technically so does a hashmap, but this is worse.

Trie
All pros and cons of TreeSet/dict.
Pro: Uses less memory in storage.
Con: Big con:

There are a few other finite-automata-ish string-storing structures with the same kind of problems as tries, but perhaps more so.
Yeah, but if the pool is stored in a data structure that makes it easy to obtain each element's rank within the pool, then one can sort an InternedStrings vector quite fast using counting sort. Suppose there are only 3 possible elements "a", "b", "c", so their ranks are 1, 2, 3. We can scan through the vector once and count how many times each rank occurs, and then use the counts to write out the sorted vector, exactly as counting sort does. I don't think this is currently possible. Nice properties for the pool data structure are: easy to look up, easy to insert, easy to delete, and easy to find rank.
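The counting sort idea above can be sketched roughly as follows. This is only an illustration in Python, not code from the package: the function name `counting_sort_by_rank` and the `pool_rank` mapping are hypothetical, and it assumes the pool can hand out dense 0-based ranks (0, 1, 2 rather than the 1, 2, 3 used in the comment).

```python
from typing import Dict, List, Optional, Sequence


def counting_sort_by_rank(values: Sequence[str], rank: Dict[str, int]) -> List[str]:
    """Sort `values` given each distinct element's rank in the pool.

    Assumes (hypothetically) that `rank` maps every pool element to a
    unique integer in 0..k-1, where k is the number of pool elements.
    """
    k = len(rank)

    # Invert the mapping so that by_rank[r] is the element with rank r.
    by_rank: List[Optional[str]] = [None] * k
    for element, r in rank.items():
        by_rank[r] = element

    # One pass over the data: count how often each rank occurs.
    counts = [0] * k
    for v in values:
        counts[rank[v]] += 1

    # Emit each element as many times as it was counted, in rank order.
    out: List[str] = []
    for r, c in enumerate(counts):
        out.extend([by_rank[r]] * c)
    return out


pool_rank = {"a": 0, "b": 1, "c": 2}
data = ["c", "a", "b", "a", "c", "c"]
result = counting_sort_by_rank(data, pool_rank)
```

The work is one pass over the vector plus one pass over the pool, so it only pays off when the pool is small relative to the vector being sorted.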
Rather than talking about properties of the pool, do you want to take a step outwards and talk about properties of the InternedString? Right now, the property of the interned string is:
You are suggesting adding:
Right?
Yeah
Due to the removal of the InternedString type, this is no longer directly possible. I'm also dubious about using tries to improve performance or to reduce memory use.
I used to use B+ trees for interning (this was for huge tables with millions of entries; the B+ trees were buffered in shared memory, with terabytes of disk).
Currently the pool is a WeakRefDict. I'm wondering if it could be made into something else so that more operations, e.g. sorting/grouping, can be done efficiently. A B-tree, perhaps?
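To make the question concrete, here is a minimal sketch of an ordered pool that supports lookup, insert, delete, and rank. It is written in Python purely for illustration and makes several assumptions: the class `SortedPool` and its methods are hypothetical, a sorted list with binary search stands in for a B-tree, and the weak-reference behaviour of the current pool is not modelled at all.

```python
import bisect
from typing import List


class SortedPool:
    """Hypothetical ordered string pool backed by a sorted list."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def intern(self, s: str) -> str:
        """Return the pooled copy of `s`, inserting it if it is absent."""
        i = bisect.bisect_left(self._items, s)
        if i < len(self._items) and self._items[i] == s:
            return self._items[i]
        self._items.insert(i, s)  # O(n) here; a balanced tree would avoid this
        return s

    def rank(self, s: str) -> int:
        """Return the 0-based position of `s` among the pooled strings."""
        i = bisect.bisect_left(self._items, s)
        if i == len(self._items) or self._items[i] != s:
            raise KeyError(s)
        return i

    def remove(self, s: str) -> None:
        """Delete `s` from the pool; ranks of later elements shift down."""
        del self._items[self.rank(s)]


pool = SortedPool()
for word in ["b", "c", "a", "b"]:
    pool.intern(word)
ranks = [pool.rank(w) for w in ["a", "b", "c"]]
```

Note that ranks are only stable while the pool is unchanged: interning a new string shifts the rank of everything that sorts after it, so any sort that relies on ranks has to read them at sort time.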