
Shareable Controller stream interfaces #1189

Closed
clux opened this issue Apr 7, 2023 · 11 comments · Fixed by #1449
Labels
question Direction unclear; possibly a bug, possibly could be improved. runtime controller runtime related

Comments

@clux
Member

clux commented Apr 7, 2023

Would you like to work on this feature?

Maybe

What problem are you trying to solve?

As a final step toward stream sharing (#1080), we need the output of StreamExt::stream_subscribe to be accepted by the Controller stream interfaces:

  • Controller::for_stream
  • Controller::watches_stream
  • Controller::owns_stream

This is currently not possible because StreamExt::stream_subscribe produces Stream<Item = Result<Arc<K>, Error>>, whereas the input streams are normal watcher streams of the form Stream<Item = Result<K, Error>>.
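
To make the mismatch concrete, a rough sketch of the two item types (paraphrased, not the exact kube-runtime signatures):

    use std::sync::Arc;
    use futures::stream::BoxStream;

    // What Controller::{for_stream, watches_stream, owns_stream} accept today:
    type WatcherItems<K, E> = BoxStream<'static, Result<K, E>>;

    // What StreamExt::stream_subscribe hands out:
    type SubscribedItems<K, E> = BoxStream<'static, Result<Arc<K>, E>>;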

Arc wrapping can help reduce memory consumption and remove awkward arc-wrap points:

  • reflector stores already arc-wrap, by cloning objects passed through from streams
  • we already have to arc-wrap in Controller before calling out to the reconcile and error_policy fns anyway

Describe the solution you'd like

We need the streams that watcher produces to ultimately end up with equal types for both the subscribe path and the non-subscribe path. Options are considered below, but personally, I think that if it's viable, we should experiment with arc-wrapping earlier.

Describe alternatives you've considered

1. Always Arc wrap output from EventFlatten

Arc-wrap once the stream is intended to be consumed, i.e. when it is flattened (::applied_objects() or ::touched_objects()). It's late enough that it won't really affect any store implementations, and it will work with ::stream_subscribe.
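
A user-space approximation of the effect (a sketch only; the real change would live inside EventFlatten rather than at the call site):

    use std::sync::Arc;
    use futures::TryStreamExt;
    use k8s_openapi::api::core::v1::Pod;
    use kube::{
        runtime::{watcher, WatchStreamExt},
        Api, Client,
    };

    async fn example() -> Result<(), kube::Error> {
        let client = Client::try_default().await?;
        let pods: Api<Pod> = Api::default_namespaced(client);
        // applied_objects() flattens watcher::Event into individual objects;
        // map_ok(Arc::new) emulates EventFlatten yielding Arc<K> directly.
        let _stream = watcher(pods, watcher::Config::default())
            .applied_objects()
            .map_ok(Arc::new);
        Ok(())
    }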

While this is less breaking than the suggestion below, it also creates another copy of the object, as each Store clones:

watcher::Event::Applied(obj) => {
    let key = ObjectRef::from_obj_with(obj, self.dyntype.clone());
    let obj = Arc::new(obj.clone());
    self.store.write().insert(key, obj);
}
watcher::Event::Deleted(obj) => {
    let key = ObjectRef::from_obj_with(obj, self.dyntype.clone());
    self.store.write().remove(&key);
}
watcher::Event::Restarted(new_objs) => {
    let new_objs = new_objs
        .iter()
        .map(|obj| {
            (
                ObjectRef::from_obj_with(obj, self.dyntype.clone()),
                Arc::new(obj.clone()),
            )
        })
        .collect();
    // quote truncated in the original; the collected map then replaces the store contents
    *self.store.write() = new_objs;
}
2. Always Arc wrap output from watcher

A much more fundamental change to our API, but it would provide a much more consistent API, and possibly clone fewer huge Kubernetes objects while passing them through streams.

If we do it this deep, then we basically add it to step_trampolined, where the objects are first handled:

async fn step_trampolined<A>(
    api: &A,
    wc: &Config,
    state: State<A::Value>,
) -> (Option<Result<Event<A::Value>>>, State<A::Value>)
where
    A: ApiMode,
    A::Value: Resource + 'static,
{
    match state {
        State::Empty => match api.list(&wc.to_list_params()).await {
            Ok(list) => {
                if let Some(resource_version) = list.metadata.resource_version {
                    (Some(Ok(Event::Restarted(list.items))), State::InitListed {
                        resource_version,
                    })
                } else {
                    (Some(Err(Error::NoResourceVersion)), State::Empty)
                }
            }
            Err(err) => {
                if std::matches!(err, ClientErr::Api(ErrorResponse { code: 403, .. })) {
                    warn!("watch list error with 403: {err:?}");
                } else {
                    debug!("watch list error: {err:?}");
                }
                (Some(Err(err).map_err(Error::InitialListFailed)), State::Empty)
            }
        },
        State::InitListed { resource_version } => {
            match api.watch(&wc.to_watch_params(), &resource_version).await {
                Ok(stream) => (None, State::Watching {
                    resource_version,
                    stream,
                }),
                Err(err) => {
                    if std::matches!(err, ClientErr::Api(ErrorResponse { code: 403, .. })) {
                        warn!("watch initlist error with 403: {err:?}");
                    } else {
                        debug!("watch initlist error: {err:?}");
                    }
                    (
                        Some(Err(err).map_err(Error::WatchStartFailed)),
                        State::InitListed { resource_version },
                    )
                }
            }
        }
        State::Watching {
            resource_version,
            mut stream,
        } => match stream.next().await {
            Some(Ok(WatchEvent::Added(obj) | WatchEvent::Modified(obj))) => {
                let resource_version = obj.resource_version().unwrap();
                (Some(Ok(Event::Applied(obj))), State::Watching {
                    resource_version,
                    stream,
                })
            }
            Some(Ok(WatchEvent::Deleted(obj))) => {
                let resource_version = obj.resource_version().unwrap();
                (Some(Ok(Event::Deleted(obj))), State::Watching {
                    resource_version,
                    stream,
                })
            }
            Some(Ok(WatchEvent::Bookmark(bm))) => (None, State::Watching {
                resource_version: bm.metadata.resource_version,
                stream,
            }),
            Some(Ok(WatchEvent::Error(err))) => {
                // HTTP GONE, means we have desynced and need to start over and re-list :(
                let new_state = if err.code == 410 {
                    State::Empty
                } else {
                    State::Watching {
                        resource_version,
                        stream,
                    }
                };
                if err.code == 403 {
                    warn!("watcher watchevent error 403: {err:?}");
                } else {
                    debug!("error watchevent error: {err:?}");
                }
                (Some(Err(err).map_err(Error::WatchError)), new_state)
            }
            Some(Err(err)) => {
                if std::matches!(err, ClientErr::Api(ErrorResponse { code: 403, .. })) {
                    warn!("watcher error 403: {err:?}");
                } else {
                    debug!("watcher error: {err:?}");
                }
                (Some(Err(err).map_err(Error::WatchFailed)), State::Watching {
                    resource_version,
                    stream,
                })
            }
            None => (None, State::InitListed { resource_version }),
        },
    }
}
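
The type-level effect would be roughly the following (a hypothetical sketch; this Event mirrors the watcher::Event shape of the time, and arc_wrap_event is illustrative, not an existing function):

    use std::sync::Arc;

    enum Event<K> {
        Applied(K),
        Deleted(K),
        Restarted(Vec<K>),
    }

    // Objects get arc-wrapped as soon as the watcher materialises them,
    // so every downstream consumer shares a single allocation per object.
    fn arc_wrap_event<K>(ev: Event<K>) -> Event<Arc<K>> {
        match ev {
            Event::Applied(obj) => Event::Applied(Arc::new(obj)),
            Event::Deleted(obj) => Event::Deleted(Arc::new(obj)),
            Event::Restarted(objs) => Event::Restarted(objs.into_iter().map(Arc::new).collect()),
        }
    }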

This approach has the potential to reduce cloning between streams and reflectors. Given that reflector stores and Kubernetes objects account for most of the memory usage of typical Rust controllers, this could be a worthwhile avenue.

However, I'm not sure whether this is a practical solution.

3. Internally Arc with Arc wrapper helper for WatchStreamExt

  1. Create a new WatchStreamExt::arc (or something like that) to map the Ok K elements to an Arc<K>.
  2. Internally in Controller, call ::arc on all managed streams
  3. Tell users of the unstable streams interface that they need an extra ::arc() call to be compatible

Fairly simple: just propagate the Arc inside Controller only, and do it at an earlier point.

This does not solve the memory issues with Store, and it puts the onus on the user to figure out when to arc-wrap and when not to.
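
A rough shape for such an adapter (hypothetical; WatchStreamExt::arc is a proposed name, not an existing API, and ArcStreamExt here is a stand-in trait):

    use std::sync::Arc;
    use futures::{Stream, TryStreamExt};

    trait ArcStreamExt: Stream {
        // Maps the Ok items of a fallible stream into Arc<K>.
        fn arc<K, E>(self) -> impl Stream<Item = Result<Arc<K>, E>>
        where
            Self: Stream<Item = Result<K, E>> + Sized,
        {
            self.map_ok(Arc::new)
        }
    }

    impl<S: Stream> ArcStreamExt for S {}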

Documentation, Adoption, Migration Strategy

Examples, doc tests, kube.rs controller guide updates where relevant.
I suspect this can be done in a mostly non-disruptive way. We have arc-wrapped a bunch of things in the past, and it's generally been a good idea.

Target crate for feature

kube-runtime

@clux clux added question Direction unclear; possibly a bug, possibly could be improved. runtime controller runtime related labels Apr 7, 2023
@louismrose

I think I'm running into this problem now, as I'm trying to use a StreamSubscribe as input to a reflector. Any thoughts on which direction you'd like to go in here @clux?

@clux
Member Author

clux commented Jun 27, 2023

Hey @louismrose.

I am still not sure which one works out to be the best. Ideally, a PoC for 2 would really help disambiguate the problem space because that is the most breaking change, and it would help us get a feel for whether it is a justifiable one.

If it isn't justifiable, then we can probably do 1 or 3 pretty easily at the cost of some user ergonomics.

@clux clux added the help wanted Not immediately prioritised, please help! label Jun 27, 2023
@mateiidavid
Contributor

I'd be happy to try and get a PoC for 2 if that helps, unless someone else wants to try their hand, in which case I'm fine with stepping aside :)

@clux
Member Author

clux commented Jun 29, 2023

No one else has jumped at this for two months so I'd say go for it @mateiidavid :-)

@louismrose

Thanks @mateiidavid. I had a quick attempt at this, but it's a bit beyond my level of Rust at the moment!

@clux
Member Author

clux commented Jul 18, 2023

@mateiidavid did you get anywhere with this? :D

@mateiidavid
Contributor

@clux thanks for the nudge. My progress has been slow :( It's been a few busy weeks at work and I've struggled to find the time for it. I know you're all probably eager to see this through. My motivation hasn't wavered though, and I'm still committed to getting this done; give me a few more days to fully mobilise and I'll get something written up here.

@clux
Member Author

clux commented Jul 18, 2023

Hehe, that's great. DW, just checking in :-)

@mateiidavid
Contributor

mateiidavid commented Jul 24, 2023

Just to give you an update: I've been making some progress here. I'm trying to get solution (2) to work, since big object stores and the associated memory pressure are actually the bane of my existence right now. I have some additional motivation in the tank to do this as efficiently as possible.

It's a bit harder than I expected, but so far (2) seems to also imply (1). Here's an overview:

  • wiring up the watcher (step_trampolined) has been pretty straightforward
  • naturally, once the watcher interface changes, the reflector also needs some tweaks (apply_watcher_event now gets an Arc); kind of a net-zero change here, since we can just clone to add the object to the store
  • the reflector() util also needs to change based on the above, plus some other small tidbits (e.g. wait())
  • controller runtime: the flatten interface and the triggers, and of course the stream interfaces for the Controller

Wiring up the watcher and reflector is for the most part pretty straightforward. The controller machinery is a bit more involved; I'll spend some time studying the internals in parallel and see what I come up with, but it does seem to at least also change the flatten interface.

What I'd ideally like to do is get a branch to look at the diff and then think things through in terms of where things can go wrong. Keep ya all posted.

Update:

Okay, so I got it to pass check. You can see the diff in my fork branch. It ended up not being as gnarly as I thought; I had to add some Sync bounds and change some of the helper functions to use a generic type for associated type bounds. I'll change one of the examples to see if it works.

Update 2: ugh, it seems like the subscribe interface wraps the entire event in an Arc; I'll have to make some changes to the branch.

Update 3: Arc'd the output from watchers and again fell into the trap of not checking the subscribe signature 🙃 It seems like the subscriber adapter actually Arcs the results. We could probably Arc the results in the watcher as well and propagate that upstream. We'd presumably have to change the subscriber interface to take in a stream that's already Arc'd, and avoid cloning, if we want to share watcher streams? IMO, having the watcher produce Arc<Result<_>> is going to be painful; perhaps Result<Arc<Event<K>>, Arc<Error>> might be easier to work with? I feel like I'm not really on the right path, but it's sort of coming together. Let me know if anyone has any thoughts on this. See the diff above for the most recent update.

@clux clux linked a pull request Jul 31, 2023 that will close this issue
@github-project-automation github-project-automation bot moved this to Defining in Kube Roadmap Sep 13, 2023
@clux clux moved this from Defining to In Progress in Kube Roadmap Sep 13, 2023
@clux
Member Author

clux commented Feb 16, 2024

A fourth potential avenue was brought up in the meeting yesterday:

  1. impl Stream for Store somehow (channels on write side that stream impl can read from)
  2. clone reader
  3. stream from reader
  4. allow stream interfaces in Controller that take a reader stream (somehow)

This could be an easier approach, since the Arc-wrapping seems to be very much a "break the world" type of change (one that would not be kind to downstream users), and it's not even clear we want people to re-use the same object that deeply anyway.

With a store-stream approach (if it works out), users could still have the option to memory-prune before objects go into the store, while many consumers pull copies from the store into the controller.
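
A hypothetical sketch of the write-side channel idea (SharedStore and the async-broadcast choice are illustrative, not the actual design):

    use std::sync::Arc;
    use async_broadcast::{broadcast, Receiver, Sender};

    struct SharedStore<K> {
        tx: Sender<Arc<K>>,
    }

    impl<K> SharedStore<K> {
        fn new(capacity: usize) -> (Self, Receiver<Arc<K>>) {
            let (tx, rx) = broadcast(capacity);
            (Self { tx }, rx)
        }

        // Called on the store's write path whenever an object is applied.
        fn publish(&self, obj: Arc<K>) {
            // Drops the update if the channel is full; a real design would
            // need an explicit backpressure policy here.
            let _ = self.tx.try_broadcast(obj);
        }

        // Each subscriber gets its own stream of store updates
        // (async_broadcast::Receiver implements futures::Stream).
        fn subscribe(&self) -> Receiver<Arc<K>> {
            self.tx.new_receiver()
        }
    }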

@clux clux removed the help wanted Not immediately prioritised, please help! label Mar 30, 2024
@clux clux linked a pull request Apr 17, 2024 that will close this issue
@clux clux removed a link to a pull request Apr 17, 2024
@clux
Member Author

clux commented Apr 17, 2024

We are avoiding the original intent of this story (to arc-wrap at a deep level) and instead doing this additively with shared stores in #1449.

This will likely be available under a feature flag in the next version. See #1080 for more. Closing this via #1449.

@clux clux linked a pull request Apr 17, 2024 that will close this issue
@clux clux changed the title Arc wrap watcher and/or Controller stream interfaces Shareable Controller stream interfaces Apr 17, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Kube Roadmap Apr 18, 2024