
Revise write-cache role, functionality, etc #2337

Open
notimetoname opened this issue May 11, 2023 · 3 comments
Labels: epic (A collection of related issues), I2 (Regular impact), neofs-storage (Storage node application issues), S1 (Highly significant), U2 (Seriously planned)

notimetoname commented May 11, 2023

It's an epic now, let's solve it

Original issue

There is no strict theory (or at least none that I know of) describing how the write-cache should work, what problems it should solve, etc.

At least:

  • it has a strange limit on the number of objects it can hold (its capacity divided by the average object size it can store; see the sketch after this list);
  • the only way space is cleaned up in the WC is by exceeding its object number limit;
  • it is not a "write" cache but a "read-write" cache (see the previous point), which means it is always filled with something even when there is no load at all (whereas it could, e.g., flush objects to the blobstor to be ready for the next load peak instead);
  • it has some interfaces that are not used at all;
  • its initialization may take an incredibly long time because it iterates over every object it stores.
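To illustrate the first point, here is a minimal sketch of the kind of capacity-derived object limit being described; the constants and names are hypothetical and are not taken from the actual writecache code.

```go
// Hypothetical illustration only: an object-count cap derived from the
// cache capacity and an assumed "average" object size, as described in
// the first point above. Not the actual writecache implementation.
package main

import "fmt"

const (
	capacityBytes  = 1 << 30 // assumed write-cache capacity: 1 GiB
	avgObjectBytes = 1 << 20 // assumed average object size: 1 MiB
)

func main() {
	// The cache ends up limited to a fixed number of objects regardless
	// of how large the stored objects actually are.
	maxObjects := capacityBytes / avgObjectBytes
	fmt.Println("object limit:", maxObjects) // 1024
}
```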
@notimetoname notimetoname added discussion Open discussion of some problem triage neofs-storage Storage node application issues labels May 11, 2023
notimetoname commented:
Well, WC even has a data race:

WARNING: DATA RACE
Read at 0x00c000270043 by goroutine 31:
  testing.(*common).logDepth()
      /usr/local/go/src/testing/testing.go:889 +0x4e7
  testing.(*common).log()
      /usr/local/go/src/testing/testing.go:876 +0xa4
  testing.(*common).Logf()
      /usr/local/go/src/testing/testing.go:927 +0x6a
  testing.(*T).Logf()
      <autogenerated>:1 +0x75
  go.uber.org/zap/zaptest.testingWriter.Write()
      /home/carpawell/go/pkg/mod/go.uber.org/[email protected]/zaptest/logger.go:130 +0x12c
  go.uber.org/zap/zaptest.(*testingWriter).Write()
      <autogenerated>:1 +0x7e
  go.uber.org/zap/zapcore.(*ioCore).Write()
      /home/carpawell/go/pkg/mod/go.uber.org/[email protected]/zapcore/core.go:99 +0x199
  go.uber.org/zap/zapcore.(*CheckedEntry).Write()
      /home/carpawell/go/pkg/mod/go.uber.org/[email protected]/zapcore/entry.go:255 +0x2ce
  go.uber.org/zap.(*Logger).Debug()
      /home/carpawell/go/pkg/mod/go.uber.org/[email protected]/logger.go:212 +0x6d
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).flushDB()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush.go:137 +0x40a
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).runFlushLoop.func1()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush.go:51 +0x12b

Previous write at 0x00c000270043 by goroutine 8:
  testing.tRunner.func1()
      /usr/local/go/src/testing/testing.go:1433 +0x7e4
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:476 +0x32
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1493 +0x47

Goroutine 31 (running) created at:
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).runFlushLoop()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush.go:42 +0x204
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.(*cache).Init()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/writecache.go:148 +0x38
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.TestFlush.func1()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush_test.go:67 +0x882
  github.com/nspcc-dev/neofs-node/pkg/local_object_storage/writecache.TestFlush.func4()
      /home/carpawell/NSPCC/git/neofs-node/pkg/local_object_storage/writecache/flush_test.go:103 +0x83
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1446 +0x216
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1493 +0x47

@roman-khimov roman-khimov added U2 Seriously planned S1 Highly significant I2 Regular impact and removed triage labels Dec 21, 2023
@roman-khimov roman-khimov added epic A collection of related issues and removed discussion Open discussion of some problem labels Nov 20, 2024
roman-khimov commented:
What it can do is:

  • help with load spikes
  • reduce latency for smaller write loads
  • level writing to the slower medium (push to it at a more even, constant rate; a rough flush-rate sketch follows below)

That's about it. There is no magic: it cannot make writing faster in general, and eventually you'll run out of space and drop down to the primary storage performance level. It only works if it's located on a faster drive, since it has no magic technology that can make writes to the same medium faster. Usually this means the primary storage is on an HDD with the writecache on an SSD, which is a nice combination; HDD/HDD and SSD/SSD won't give any benefit.
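As a rough sketch of the third point above (leveling writes to the slower medium), a flusher could drain its queue at a fixed rate instead of in bursts. The types and functions below are hypothetical illustrations, not the project's API.

```go
// Hypothetical sketch of rate-leveled flushing; cache, flushOne and the
// queue layout are assumptions for illustration, not the real writecache.
package main

import (
	"context"
	"fmt"
	"time"
)

type cache struct {
	queue chan []byte // objects waiting to be written to the slow medium
}

// flushOne stands in for writing a single object to the primary storage.
func flushOne(obj []byte) {
	fmt.Println("flushed", len(obj), "bytes")
}

// runLeveledFlush drains the queue at a fixed rate instead of in bursts,
// smoothing the write load that reaches the slower medium.
func (c *cache) runLeveledFlush(ctx context.Context, perSecond int) {
	ticker := time.NewTicker(time.Second / time.Duration(perSecond))
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			select {
			case obj := <-c.queue:
				flushOne(obj)
			default: // nothing cached right now, nothing to flush
			}
		}
	}
}

func main() {
	c := &cache{queue: make(chan []byte, 16)}
	c.queue <- []byte("object payload")

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	c.runLeveledFlush(ctx, 10) // at most 10 flushes per second
}
```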

What we can do to improve it is:

  • drop BoltDB from it completely (providing flush/migration in the new version); it sucks for SSDs, as we know from Combined writing for FSTree #2814
  • make its flush loop delete the objects it has flushed (it looks like currently this is only done on Init, which defeats the purpose completely); see the sketch below
  • adjust flusher behavior to utilize all of the underlying blobstor capacity (make it more aggressive in general)

This will remove the artificial limits and mostly solve the init problem at the same time (most of the time the cache will be empty).
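Here is a minimal sketch of the second bullet: evict an object from the cache as soon as it has been persisted to the primary storage, so the cache drains towards empty (and Init stays fast). The in-memory map and persist function are hypothetical, not the actual flusher.

```go
// Hypothetical sketch: a flush pass that evicts objects right after they
// are persisted to the primary storage, so the cache drains towards empty.
package main

import (
	"fmt"
	"sync"
)

type memCache struct {
	mu   sync.Mutex
	objs map[string][]byte // object address -> payload
}

// persist stands in for writing the object to the blobstor.
func persist(addr string, payload []byte) error {
	fmt.Println("persisted", addr, "-", len(payload), "bytes")
	return nil
}

// flushAndEvict writes every cached object to the primary storage and
// removes it from the cache on success, rather than keeping it around
// until some separate cleanup (e.g. on Init) runs.
func (c *memCache) flushAndEvict() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for addr, payload := range c.objs {
		if err := persist(addr, payload); err != nil {
			continue // keep the object, retry on the next flush cycle
		}
		delete(c.objs, addr)
	}
}

func main() {
	c := &memCache{objs: map[string][]byte{"addr-1": []byte("data")}}
	c.flushAndEvict()
	fmt.Println("objects left in cache:", len(c.objs)) // 0
}
```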

roman-khimov commented:
Other things to consider:

  • make it an engine-level thing (not shard level)
  • don't mess with metabase storage IDs, keep the writecache index in memory only (see the sketch below)
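For the second point, here is a minimal sketch of a purely in-memory writecache index (object address mapped to a location inside the cache), so the metabase never has to record writecache storage IDs. The types and location format are hypothetical.

```go
// Hypothetical sketch: an in-memory index of cached objects, so nothing
// about the write cache has to be recorded in the metabase.
package main

import (
	"fmt"
	"sync"
)

type memIndex struct {
	mu  sync.RWMutex
	loc map[string]string // object address -> location inside the cache
}

func (i *memIndex) put(addr, where string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	i.loc[addr] = where
}

func (i *memIndex) get(addr string) (string, bool) {
	i.mu.RLock()
	defer i.mu.RUnlock()
	where, ok := i.loc[addr]
	return where, ok
}

func main() {
	idx := &memIndex{loc: make(map[string]string)}
	idx.put("addr-1", "fstree/ab/cd")
	if where, ok := idx.get("addr-1"); ok {
		fmt.Println("cached at", where) // the metabase never needs this ID
	}
}
```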

@roman-khimov roman-khimov added this to the v0.45.0 milestone Dec 28, 2024