
registry: add disk-backed storage #488

Closed
ammario opened this issue Jul 25, 2019 · 16 comments
Labels
enhancement New feature or request lifecycle/stale

Comments

@ammario
Contributor

ammario commented Jul 25, 2019

The current implementation is a big memory hog for my tests.

I'm thinking we create NewMemory() and NewDisk(root string) functions, and deprecate the current New() function.
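
A minimal sketch of what those constructors could look like (signatures are illustrative only, assuming the existing New() http.Handler sticks around as the in-memory default; this is not the final API):

// Hypothetical constructor signatures; bodies and the final API are TBD.

// NewMemory serves the registry out of in-memory maps (today's behaviour).
func NewMemory() http.Handler { return New() }

// NewDisk serves the registry with blobs and manifests stored under root.
func NewDisk(root string) http.Handler {
  // A disk-backed store would be wired in here; this sketch just falls
  // back to the in-memory implementation.
  return New()
}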

If this idea is OK with you @jonjohnsonjr, I can implement it.

@ammario ammario changed the title registry: disk-backed storage registry: add disk-backed storage Jul 25, 2019
@jonjohnsonjr
Collaborator

I've wanted to take advantage of the handler code (and registry) but have swappable implementations of the actual storage. There's more than just disk vs memory, e.g. I'd like to be able to use this to run a proxy registry.

We could start by defining an actual interface for the content (instead of just maps) and use that interface from the handlers.

Something like:

type blobStore interface {
  Get(key string) (io.ReadSeeker, error)
  Put(key string, blob io.ReadCloser) error
  Delete(key string) error
}

// Wrapper for current implementation.
type memBlobStore struct {
  contents map[string][]byte
}

func (bs *memBlobStore) Get(key string) (io.ReadSeeker, error) {
  if b, ok := bs.contents[key]; ok {
    return bytes.NewReader(b), nil
  }
  return nil, errNotFound
}

func (bs *memBlobStore) Put(key string, blob io.ReadCloser) error {
  b, err := ioutil.ReadAll(blob)
  if err != nil {
    return err
  }
  bs.contents[key] = b
  return nil
}

func (bs *memBlobStore) Delete(key string) error {
  delete(bs.contents, key)
  return nil
}

(I'm not 100% sold on this as a good blobs API, just a strawman.)

If that is easy enough, we can figure out an API for the manifests as well.
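
For illustration, a manifest API could mirror the blob one (names here are hypothetical, not part of the package):

// Hypothetical counterpart to blobStore; a manifest is addressed by
// repository plus a reference (tag or digest) and carries a media type.
type manifestStore interface {
  Get(repo, reference string) (mediaType string, body []byte, err error)
  Put(repo, reference, mediaType string, body []byte) error
  Delete(repo, reference string) error
}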

I'm not actually that familiar with this code, so I'd defer to @clrprod's thoughts.

@clrprod
Contributor

clrprod commented Jul 25, 2019

sgtm - as long as you're adding a second implementation (e.g. postgres or another registry). I generally don't like unused abstraction.

@clrprod
Contributor

clrprod commented Jul 25, 2019

Ah, I didn't see the original comment; yeah, a disk-backed version is totally doable. Maybe default to ioutil.TempDir(), but allow for an alternate path?

@clrprod
Contributor

clrprod commented Jul 25, 2019

I'd probably do something like

type option struct {
  disk string
}

type Option func(o *option)

func Dir(path string) Option {
  return func(o *option) {
    o.disk = path
  }
}

func TempDir() Option {
  d, err := ioutil.TempDir("", "registry")
  if err != nil {
    panic(err) // fine for a test-oriented helper; New could also surface the error
  }
  return Dir(d)
}

New(o ...Option)
TLS(domain string, o Option)

@clrprod
Contributor

clrprod commented Jul 25, 2019

Then you can call New(Dir("/foo")) and store on disk, or New(TempDir()) if you just want a random dir, while leaving the default behaviour in memory.

The underlying implementation could implement Jon's interface with memory and disk (or a different interface, if that doesn't quite work).
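
A rough sketch of what a disk-backed implementation of the blobStore interface above might look like (layout and error handling are illustrative only, not the actual package code):

// Hypothetical disk-backed blob store.
type diskBlobStore struct {
  root string
}

func (bs *diskBlobStore) path(key string) string {
  // Note: digest keys contain ':', which is fine on Unix but would need
  // escaping on Windows.
  return filepath.Join(bs.root, key)
}

func (bs *diskBlobStore) Get(key string) (io.ReadSeeker, error) {
  f, err := os.Open(bs.path(key))
  if err != nil {
    if os.IsNotExist(err) {
      return nil, errNotFound
    }
    return nil, err
  }
  // This sketch never closes the returned *os.File; a real implementation
  // would need a closing strategy.
  return f, nil
}

func (bs *diskBlobStore) Put(key string, blob io.ReadCloser) error {
  defer blob.Close()
  f, err := os.Create(bs.path(key))
  if err != nil {
    return err
  }
  defer f.Close()
  _, err = io.Copy(f, blob)
  return err
}

func (bs *diskBlobStore) Delete(key string) error {
  return os.Remove(bs.path(key))
}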

@clrprod
Contributor

clrprod commented Jul 25, 2019

But yeah - do it, I'll gladly look forward to the PR :D

@imjasonh
Collaborator

This seems related to https://godoc.org/github.com/google/go-containerregistry/pkg/v1/cache#NewFilesystemCache -- if you decide on a different API to describe filesystem-backed...things..., please consider updating that API to match it.

@jonjohnsonjr
Collaborator

> please consider updating that API to match it.

I'd love it if the registry package could remain dependent only on the stdlib 😄 but maybe we can simplify that interface.

@ammario
Contributor Author

ammario commented Jul 25, 2019

Functional options sound good to me.

Will definitely abstract away the storage.

I don't think we should have TempDir(), because the caller should really be the one removing a temporary directory when they're done.
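
For example, a test can own the directory's lifecycle itself (a sketch; Dir is the option proposed above, not the current API):

func TestDiskBackedRegistry(t *testing.T) {
  dir, err := ioutil.TempDir("", "registry")
  if err != nil {
    t.Fatal(err)
  }
  defer os.RemoveAll(dir) // the caller, not the library, cleans up

  s := httptest.NewServer(registry.New(registry.Dir(dir))) // Dir is the proposed option, not today's API
  defer s.Close()

  // ... push to and pull from s.URL ...
}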

@clrprod
Contributor

clrprod commented Jul 25, 2019 via email

ammario added a commit to ammario/go-containerregistry that referenced this issue Jul 25, 2019
@jonjohnsonjr jonjohnsonjr added the enhancement New feature or request label Sep 11, 2019
@mamoit

mamoit commented Jul 28, 2020

Sorry for the necromancy, but this is still an issue, right?
I see there were some efforts to abstract out the storage layer, but they don't seem to have been merged.
I'm coming from kaniko's issue 862, which I think may be related to this one.
Building large images (with CUDA's 1.5 GB dev image, for example) is prohibitive right now.

@imjasonh
Collaborator

This issue covers the registry package, which is intended to provide an OCI registry server, typically for tests.

If you're hitting kaniko memory usage limits, it's probably because they're buffering layers in memory, which they could get around either by writing layer data to disk (e.g., with tarball.Write), or by streaming layers to the registry using the stream package. I'm not sure if they're already using either or both now.
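
For reference, a rough sketch of the streaming approach with this repo's stream package (the destination reference and the tarball producer are placeholders, not kaniko's actual code):

pr, pw := io.Pipe()
go func() {
  // ... write the uncompressed layer tarball to pw ...
  pw.Close()
}()

// stream.NewLayer wraps the pipe without buffering it; digest and size are
// only known once the stream has been consumed.
layer := stream.NewLayer(pr)

img, err := mutate.AppendLayers(empty.Image, layer)
if err != nil {
  return err
}

ref, err := name.ParseReference("example.com/repo:tag") // placeholder destination
if err != nil {
  return err
}

// remote.Write consumes and uploads the streamed layer.
return remote.Write(ref, img)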

@jonjohnsonjr
Collaborator

We might want to create a new issue from this -- it seems like we're the ones who should be using a stream.Layer? See

b := w.Bytes()
// gzip the contents, then create the layer
opener := func() (io.ReadCloser, error) {
  return v1util.GzipReadCloser(ioutil.NopCloser(bytes.NewReader(b))), nil
}
layer, err = tarball.LayerFromOpener(opener)
if err != nil {
  return nil, fmt.Errorf("creating layer: %v", err)
}

At some point I thought it was necessary to hack this commit together: jonjohnsonjr@542d74b

It may be that naively changing the layertime implementation to use stream.Layer doesn't work as expected.

@mamoit

mamoit commented Jul 28, 2020

@jonjohnsonjr it was a comment on the issue opened on the kaniko repo by @nathan-c that led me to mutate.go and then here, when I searched for memory-related issues, but I found myself way out of my league while trying to investigate this.
What you are saying makes sense to me, but I'm quite new to golang, so I get lost quite fast.
I can open a separate issue since this is off topic, but the information that I can provide is quite limited.

@jonjohnsonjr
Collaborator

> I can open a separate issue since this is off topic, but the information that I can provide is quite limited.

No worries. Filed #749 to track this separately.

@github-actions

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
