
test: add a memory leak test #51

Open · wants to merge 3 commits into master
Conversation

okdistribute (Member) commented Aug 4, 2020

This is a failing test for the memory leak. The system will print Killed after less than 30 seconds; check /var/log/syslog for the memory leak message. Related to cabal-club/commons#15

If the issue is having too many hypercores open at once, an approach like a manifest and request middleware might also fix the memory leak for most production use cases: #26

Or, some way to replicate in batches and close the feeds that aren't actively replicating.

To mimic a real-life scenario for cabal, about 70% of the feeds in this test are empty -- only 30% of them have one log entry.
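
For readers skimming this PR, here is a rough sketch of the shape of such a test, assuming multifeed's usual constructor/writer API and random-access-memory storage; the peer count and fill ratio are illustrative, not the exact values from the commits:

    // Illustrative sketch only -- not the exact test added in this PR.
    var multifeed = require('multifeed')
    var ram = require('random-access-memory')

    var NUM_PEERS = 500 // the PR picks its own number
    var feeds = []

    for (var i = 0; i < NUM_PEERS; i++) {
      var multi = multifeed(ram, { valueEncoding: 'json' })
      feeds.push(multi)
      multi.writer('local', function (err, writer) {
        if (err) throw err
        // ~70% of the feeds stay empty; ~30% get a single log entry
        if (Math.random() < 0.3) writer.append({ type: 'chat/text', text: 'hi' })
      })
    }
    // ...the test then replicates every multifeed with its neighbours and
    // watches memory climb until the process is killed.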

okdistribute requested a review from a user on August 4, 2020 16:40

hackergrrl (Member) commented Aug 22, 2020

This is great @okdistribute.

I poked at this a bit and wrote a patch! I noticed something big first: my machine can't even allocate 500 multifeeds in memory. Allocating and syncing everything at once also overwhelmed my computer, so I swapped run-parallel for async-series.

I also made the numbers a bit smaller, and used process.memoryUsage() to see how much heap the syncs were sucking up, instead of relying on the process getting killed. (My computer doesn't kill it nicely; instead it eats up my swap and freezes everything.)

From there, I noticed that the memory use goes up non-linearly with the number of syncs. For example, 10 syncs uses 34mb of heap, 20 uses 114mb, and 40 syncs uses 447mb.
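
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the idea, assuming the syncs are expressed as callback tasks and run with the run-series module (the patch itself may differ in the details):

    // Sketch: log heap use before and after running the syncs in series.
    var series = require('run-series')

    var replications = [] // fill with (done) => { ... } sync tasks, one per peer pair

    function heapMb () {
      return (process.memoryUsage().heapUsed / 1e6).toFixed(1) + ' mb'
    }

    console.log('pre-sync heap', heapMb())
    series(replications, function (err) {
      if (err) throw err
      console.log('post-sync heap', heapMb())
    })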

I didn't want to push on top of your work, and have been wanting to mess with non-GitHub workflows, so here's a patch you can apply locally to see if you like it:

    curl -sL http://eight45.net/pub/0001-multifeed.patch | git am

okdistribute (Member, Author) commented:

Thanks @noffle! Feel free to push the patch here. I agree about smaller increments; I picked a big number out of laziness 😅

lejeunerenard (Contributor) commented:

@okdistribute this is a great example of how easy it is to bloat a multifeed with feeds! It's clear that when you add more peers, the memory usage balloons.

I did some poking around and think this test might be measuring more than you intend, since it simulates multiple peers in one program (and hence one heap). So the final heapUsed value includes all the simulated peers, i.e. the entire network. Here's my analysis to explain:

@noffle was right that the memory goes up non-linearly with the number of multifeed peers. Looking at a heap snapshot, I found that ~89% of the memory used was retained by Feeds (a.k.a. hypercores). So I ran a series of tests and got the following results:

| Multifeeds (n) | Feeds (f) | Post-sync memory |
| --- | --- | --- |
| 2 | 6 | 9.438872 mb |
| 4 | 17 | 15.238376 mb |
| 10 | 74 | 46.004 mb |
| 20 | 249 | 146.085864 mb |
| 30 | 524 | 290.418824 mb |
| 40 | 899 | 492.932712 mb |

See [1] below for how f correlates with n.
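
As an aside, heap snapshots like the one analyzed above can be written straight from the test with Node's built-in v8 module and then opened in the Memory tab of Chrome DevTools (a general technique, not necessarily how the numbers above were gathered):

    // Write a .heapsnapshot file; Node generates the file name.
    var v8 = require('v8')
    var file = v8.writeHeapSnapshot()
    console.log('heap snapshot written to', file)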

Suggested Changes to the Test

I believe that not all multifeeds should be included in the memory results; instead, each multifeed should be deleted once it has replicated to one persistent multifeed, to simulate transient peers connecting. Here is an example patch that deletes all but the fully synced multifeed:

    console.log('pre-sync heap', process.memoryUsage().heapUsed / 1000000, 'mb')
    series(replications, (err) => {
      if (err) throw err
      console.log('replicated! everything!')

      console.log('pre-closing heap', process.memoryUsage().heapUsed / 1000000, 'mb')

      // Keep one fully synced multifeed and close the rest, dropping the
      // array's references so they can be garbage collected.
      let keep = feeds[0]
      let closures = []
      for (let i = feeds.length - 1; i > 0; i--) {
        closures.push((done) => {
          let a = feeds[i]
          delete feeds[i] // drop the reference before closing
          a.close(done)
        })
      }

      series(closures, (err) => {
        if (err) throw err
        global.gc() // requires running node with --expose-gc

        console.log('keep.feeds().length', keep.feeds().length)
        console.log('post-closing heap', process.memoryUsage().heapUsed / 1000000, 'mb')
        debugger
        t.end()
      })
    })

This requires the --expose-gc flag when running node; without it, heapUsed still includes multifeeds that no longer have references. Even with it, garbage collection doesn't reclaim every floating reference: I still see approximately 24 Feeds in the heap at the breakpoint above. After multiple GCs it eventually goes down to the expected 21 Feeds. The resulting 'post-closing' heapUsed for 20 multifeeds total is 20.720344 mb, compared to a 'pre-closing' value of 147.788704 mb. I get similar results with other values of n:

| Multifeeds (n) | Feeds (f) | Post-closing memory |
| --- | --- | --- |
| 10 | 14 | 14.090496 mb |
| 20 | 24 | 20.720344 mb |
| 30 | 34 | 27.0382 mb |
| 40 | 43 | 33.731832 mb |
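
A side note on the --expose-gc requirement mentioned above: global.gc is only defined when node is started with that flag, so a small guard (purely illustrative, not part of the patch) keeps the test runnable either way:

    // e.g. `node --expose-gc test.js` -- the file name here is illustrative.
    if (typeof global.gc === 'function') {
      global.gc() // force a collection so heapUsed reflects the closed multifeeds
    } else {
      console.log('run node with --expose-gc to force garbage collection')
    }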

Concluding Remarks

Sorry for the tome, but I hope this helps! Right now a middleware layer would be great for weeding out feeds that a peer doesn't want to host. Alternatively, people can save a bit of memory if the storage is file based (though I believe memory still increases over time with random-access-file as the storage).

Footnote

[1] The number of feeds (f) grows much faster than the number of multifeeds (n), which makes sense given they replicate feeds to one another. I initially expected f = n² + n, since every multifeed has one root feed plus n feeds, one for each multifeed it sees via replication (including itself). Instead, f = 2n + (1 + 2 + ... + (n - 1)) + (n - 1), where the 2n accounts for the root and local feed of each node, and the (1 + 2 + ... + (n - 1)) + (n - 1) accounts for the feeds replicated from others. That sum follows from the replication being done in series, each node syncing only with its previous and next neighbours when they exist: the first node only gets the second node's feeds, each subsequent node adds one more feed to the running total, and the last node is the exception because it has no following node to sync with.
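
A quick way to check that formula against the table above, using the closed form of the sum (1 + 2 + ... + (n - 1) = n(n - 1)/2, so f(n) = 2n + n(n - 1)/2 + (n - 1)):

    // Sanity check of the feed-count formula against the observed values.
    function f (n) {
      return 2 * n + (n * (n - 1)) / 2 + (n - 1)
    }
    var observed = { 2: 6, 4: 17, 10: 74, 20: 249, 30: 524, 40: 899 }
    Object.keys(observed).forEach(function (k) {
      var n = Number(k)
      console.log(n, f(n) === observed[k]) // prints true for every row
    })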

The multifeeds might have fully replicated prior to @noffle's rework to run them in series, since the parallel replication streams may have stayed open long enough for the feeds to propagate. One potential fix is to sync the last node with the nodes it never connected to, so that they all end up with the same feeds. Though this may not matter if we narrow the test to a single multifeed.

cblgh (Contributor) commented Dec 14, 2020

mafintosh says:

@cblgh you might wanna rerun that leak test tho, we have no known leaks in [hyper-]core since 1-2 months ago

cblgh (Contributor) commented Dec 15, 2020

& a comment from @RangerMauve

I was trying to find a leak in multifeed actually, but I couldn't find anything when I did heap dumps. Like, I had it running for a while and connecting and disconnecting periodically, but it didn't look like anything new was being kept other than system garbage like compiled code. :person_shrugging:
