26/6/17
- Basic functions added: createNode, nextPrime, isPrime, etc. (helper functions).
- Important functionality added: free_hash, put, delete_key, print_hash (rough sketch of the node side below)
- Next: create unit tests for the current functions, add get (lookup)
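A rough sketch of what the chaining structures probably looked like at this stage. Only createNode is a name from this log; the struct layout, field names, and int keys/values are my assumptions.

    /* Sketch of a separate-chaining node. Field names are assumptions. */
    #include <stdlib.h>

    typedef struct Node {
        int key;
        int value;
        struct Node *next;   /* next node in the same bucket's chain */
    } Node;

    /* Allocate and initialise a single chain node. */
    Node *createNode(int key, int value)
    {
        Node *n = malloc(sizeof(Node));
        if (n == NULL)
            return NULL;
        n->key = key;
        n->value = value;
        n->next = NULL;
        return n;
    }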
27/6/17
- Important functionality added: get (sketch below)
- Also created tests + load factor tracking
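A minimal sketch of get over a chained table. Only the name get comes from the log; the struct layout and the trivial modulo hash are assumptions.

    #include <stddef.h>

    typedef struct Node {
        int key;
        int value;
        struct Node *next;
    } Node;

    typedef struct Hash {
        Node **buckets;   /* array of chain heads */
        int size;         /* number of buckets */
    } Hash;

    /* Return a pointer to the stored value, or NULL if the key is absent. */
    int *get(Hash *h, int key)
    {
        Node *cur = h->buckets[(unsigned int)key % h->size];
        while (cur != NULL) {
            if (cur->key == key)
                return &cur->value;
            cur = cur->next;
        }
        return NULL;
    }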
3/7/17
- Added more unit tests
- Load factor checking is more efficient (no division on every put - sketch below)
- Added copyHash and automatic resizing. Not entirely sure it's spreading the keys out properly though
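The no-division check is along these lines. The 0.75 threshold and the field names are assumptions, but the idea (cross-multiply instead of computing the load factor) is the standard one.

    /* Sketch of a load-factor check without a division per insert. */
    typedef struct Hash {
        int size;        /* number of buckets */
        int num_elems;   /* number of stored keys */
    } Hash;

    /* True when num_elems / size > 3/4, written as a multiply
       instead of recomputing the load factor on every put. */
    int needs_resize(const Hash *h)
    {
        return 4 * h->num_elems > 3 * h->size;
    }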
5/7/17
- Added more unit tests + timing tests (basic shape of a timing test below)
- Fixed auto resizing, no memory leaks under Valgrind
- Added some Python scripts
- Thinking about organising everything better
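The timing tests are basically this shape on the C side; clock() is used here as an illustration, and the stub put() plus the element count are placeholders, not the repo's real harness.

    #include <stdio.h>
    #include <time.h>

    /* Stub standing in for the real put(); replace with the table's insert. */
    static void put(int key, int value) { (void)key; (void)value; }

    int main(void)
    {
        const int n = 1000000;
        clock_t start = clock();
        for (int i = 0; i < n; i++)
            put(i, i * 2);
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("%d inserts took %.3f s\n", n, secs);
        return 0;
    }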
14/7/17
- Added more timing tests
- Perl script to test both implementations
- Researching open addressing
- Currently beating Python's dict, but it can be optimised further (the lead turned out to be down to Python's rand in the benchmark)
23/7/17
- Currently ~10% slower than Python's dict
- Added optimisations to the modulus (one typical change of this kind sketched below)
- Changed some conditionals to loops
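I don't have the exact diff to hand; one typical modulus optimisation of this kind (not necessarily the one applied here) replaces a wrap-around % with a compare.

    /* before: i = (i + 1) % size;   (a division on every step) */
    int next_index(int i, int size)
    {
        i++;
        if (i == size)    /* cheaper than the modulus */
            i = 0;
        return i;
    }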
25/7/17
- Adding open addressing (put, free, createHash) - rough put sketch below
- Added some testing
- Completed some manual tests with Valgrind
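The open-addressing put is roughly this shape (linear probing, int keys). The name put comes from the log; the Entry/OAHash layout, the used flag and the hash are assumptions, and resizing is left out to keep the sketch short.

    typedef struct Entry {
        int key;
        int value;
        int used;          /* 0 = empty slot, 1 = occupied */
    } Entry;

    typedef struct OAHash {
        Entry *slots;
        int size;          /* total slots */
        int num_elems;     /* occupied slots */
    } OAHash;

    /* Insert or update key; returns 0 on success, -1 if the table is full. */
    int put(OAHash *h, int key, int value)
    {
        int i = (unsigned int)key % h->size;
        for (int probes = 0; probes < h->size; probes++) {
            Entry *e = &h->slots[i];
            if (!e->used) {              /* free slot: claim it */
                e->key = key;
                e->value = value;
                e->used = 1;
                h->num_elems++;
                return 0;
            }
            if (e->key == key) {         /* existing key: update */
                e->value = value;
                return 0;
            }
            if (++i == h->size)          /* wrap around */
                i = 0;
        }
        return -1;                       /* table full */
    }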
31/7/17
- Official collision tests
- Read "https://probablydance.com/2017/02/26/i-wrote-the-fastest-hashtable/" about a fast table. It uses an upper limit on probe count plus Robin Hood hashing
- Decided to implement the probe-count limit, which is now included (sketch below)
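The probe-count limit boils down to something like this: if an insert needs more probes than the limit, grow the table and rehash instead of probing further. Using log2 of the table size as the limit follows the article's suggestion; the constant actually used here, and the helper names, are assumptions.

    #include <math.h>

    /* Probe limit tied to the table size, log2(size) per the article. */
    int probe_limit(int size)
    {
        return (int)log2((double)size);   /* e.g. 10 for a 1024-slot table */
    }

    /* Checked inside put(): once the probe distance of the key being
       inserted exceeds the limit, resize and rehash rather than continue. */
    int over_probe_limit(int probes, int size)
    {
        return probes > probe_limit(size);
    }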
1/8/17
- Robin hood hashing included
- Competing against g_hash_table in C as well
- Current status: open addr very close to Python dict. with size init. much faster
open addr = 1/2 speed of g_hash_table!!! Reading about macros and full-on optimizations
- started doing perf profiling (kcachegrind and valgrind --tool=callgrind). E.g kcachegrind callgrind.out.*, etc
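The Robin Hood part amounts to tracking how far each entry sits from its ideal slot and, during insert, swapping whenever the resident entry is "richer" (closer to home) than the one being inserted. A sketch with my own field names; the hash and resizing are omitted/assumed, and it assumes the table never completely fills.

    typedef struct RHEntry {
        int key;
        int value;
        int used;    /* 0 = empty, 1 = occupied */
        int dist;    /* stored entry's distance from its ideal slot */
    } RHEntry;

    typedef struct RHHash {
        RHEntry *slots;
        int size;
        int num_elems;
    } RHHash;

    /* Insert key/value, displacing "richer" entries as we go. */
    void rh_put(RHHash *h, int key, int value)
    {
        int i = (unsigned int)key % h->size;
        int dist = 0;

        for (;;) {
            RHEntry *e = &h->slots[i];
            if (!e->used) {                  /* empty slot: place here */
                e->key = key;
                e->value = value;
                e->dist = dist;
                e->used = 1;
                h->num_elems++;
                return;
            }
            if (e->key == key) {             /* update existing key */
                e->value = value;
                return;
            }
            if (e->dist < dist) {            /* resident is richer: swap */
                int tk = e->key, tv = e->value, td = e->dist;
                e->key = key; e->value = value; e->dist = dist;
                key = tk; value = tv; dist = td;
            }
            if (++i == h->size)              /* wrap around */
                i = 0;
            dist++;
        }
    }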
1/8/17 - 9/8/17
- Optimising well: swapping pointers, precomputing values
- For collisions, getting close to g_hash_table (~0.52 vs ~0.33 for 1000000 inserts)
- HOWEVER, an assertion failed once and I couldn't get it to fire again after many tries; this needs looking into
10/8/17
- Milestone moment: faster put than Python's dict!!!
- Getting closer to g_hash_table at ~0.50, have broken 0.5 once or twice
- Optimised some loops, doing fewer conditional checks in put. NOTE: I've primarily been working on open addressing for a while
- #defined some constants like sizeof(int)
10/8/17 - cont:
- Fixed some big bugs I introduced today: mixed up size and num_elem
- Speed slowed back to ~0.52, but for non-colliding keys speed is still good
17/8/17 - 21/8/17
- Fixed my memory leak! Thanks to some good people on SO
- Experimenting with power-of-2 table sizes - much faster for collision-heavy hashing... did some research and it seems to be the norm (indexing sketch below)
- SHOULD FOCUS on completing everything before focusing on speed. String keys, string values, etc!!!
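Power-of-2 sizing pays off because the slot index becomes a mask instead of a modulus. A sketch; the bit-mixing step and its constant are an assumption (a weak hash clusters badly under masking, so mixing the bits first is the usual companion change).

    /* With size a power of two, hash % size becomes hash & (size - 1). */
    unsigned int slot_index(unsigned int hash, unsigned int size)
    {
        /* size must be a power of two, e.g. 1024 */
        return hash & (size - 1);
    }

    /* Fibonacci-style multiplicative mixing before masking. */
    unsigned int mix(unsigned int x)
    {
        return x * 2654435761u;   /* Knuth's multiplicative constant */
    }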
24/8/17 - 2/9/17
- Now on par with C's glib hash (g_hash_table). This is solid stuff; I did have to go back and fix a bug I made ages ago, which really sucked
- String keys are in (they take much longer because of the extra mallocs - sketch of the key copy below)
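The extra mallocs come from copying each key on insert (and every copy has to be freed again in free_hash). A sketch of that copy step; the names here are assumptions.

    #include <stdlib.h>
    #include <string.h>

    typedef struct StrEntry {
        char *key;    /* heap-allocated copy of the caller's string */
        int value;
        int used;
    } StrEntry;

    /* Copy a key into freshly malloc'd storage (a strdup stand-in),
       so the table doesn't depend on the caller's buffer staying alive. */
    char *copy_key(const char *key)
    {
        size_t len = strlen(key) + 1;
        char *out = malloc(len);
        if (out != NULL)
            memcpy(out, key, len);
        return out;
    }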