Add new CRD to create repairs schedules #1437

Open
c3-clement opened this issue Oct 23, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@c3-clement
Contributor

c3-clement commented Oct 23, 2024

What is missing?

It's not possible to create repair schedules in a Kubernetes-native way.

Why do we need it?

At my company, we want to be able to start a full repair of an entire Cassandra cluster.
Ideally, this ability should be exposed through a Kubernetes-based API.

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: K8OP-281

@c3-clement added the "enhancement" label on Oct 23, 2024
@c3-clement
Contributor Author

Hi @adejanovski @burmanm,

I would like to work on this enhancement.
First, could you confirm that it makes sense to add repair support to the K8ssandraTask CRD?

What do you think of this implementation proposal?

  • Add a repair command to the CassandraCommand enum
  • Add repair parameters to the CassandraTask CRD
  • Invoke POST /api/v1/ops/node/repair to start repairs
  • Invoke GET /api/v0/ops/executor/job?job_id=${id} to track the progress of the repair job (exactly like for compaction)
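The proposed flow can be sketched as the two HTTP requests it would issue. This is a minimal sketch, not operator code: only the two paths come from the proposal above. The base URL, the JSON field names (keyspace, tables, full) and the helper names are assumptions for illustration and should be checked against the management API's OpenAPI spec.

```python
import json
from urllib.parse import urlencode

# Assumed address of the management API on a Cassandra pod (hypothetical).
BASE_URL = "http://localhost:8080"


def build_repair_request(keyspace, tables, full=True):
    """Describe the POST that would start a repair on one node.

    The body field names are assumptions, not confirmed by this thread.
    """
    body = {"keyspace": keyspace, "tables": list(tables), "full": full}
    return {
        "method": "POST",
        "url": f"{BASE_URL}/api/v1/ops/node/repair",
        "body": json.dumps(body, sort_keys=True),
    }


def build_job_status_request(job_id):
    """Describe the GET that would poll the job started by the repair call."""
    query = urlencode({"job_id": job_id})
    return {
        "method": "GET",
        "url": f"{BASE_URL}/api/v0/ops/executor/job?{query}",
    }
```

The helpers only describe the requests; sending them and interpreting the job status is left to whatever HTTP client the operator already uses.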

@adejanovski
Contributor

Hi @c3-clement,

Reaper is what handles repairs in K8ssandra. We don't want to create yet another repair orchestrator in the form of CassandraTasks.
If we want to be able to run repairs for all keyspaces, then there are probably some changes we need to do in Reaper to accommodate this. It's been asked quite a few times.

I think running nodetool garbagecollect would be a better idea than running a major compaction btw. Have you considered it?

@c3-clement
Contributor Author

Hi @adejanovski, thanks for the feedback!

> Reaper is what handles repairs in K8ssandra. We don't want to create yet another repair orchestrator in the form of CassandraTasks.

Got it, makes sense.

> If we want to be able to run repairs for all keyspaces, then there are probably some changes we need to do in Reaper to accommodate this. It's been asked quite a few times.

We don't actually need to run repairs for all keyspaces, but we do need to run a full repair for a specific keyspace and a specific table across all Cassandra nodes.
As a workaround, we manually run nodetool repair --full keyspace table on every Cassandra node.

Is there a way to achieve this with the Reaper API?

> I think running nodetool garbagecollect would be a better idea than running a major compaction btw. Have you considered it?

I will tell my team about it, thanks for the suggestion!

@adejanovski
Contributor

> We don't actually need to run repair for all keyspaces, but we need to run a full repair for a specific keyspace and a specific table across all Cassandra nodes.

What makes you think that's not what Reaper is doing? It's doing full repairs but splits the work in discrete chunks (subranges).
Once the execution is finished, it's the same thing as running nodetool repair --full -pr everywhere
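The idea of splitting a repair into subranges can be illustrated with a toy example. This is a sketch under simplifying assumptions, not Reaper's segment logic: it treats the work as a single integer token range (start, end], ignores ring wraparound and per-node ownership, and the function name is hypothetical. It only shows that the subranges together cover the same tokens as the original range.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous subranges.

    Each subrange is returned as a (start, end) pair; the end of one
    subrange is the start of the next, so no token is skipped or repeated.
    """
    if parts < 1 or parts > end - start:
        raise ValueError("parts must be between 1 and the range width")
    width = end - start
    # Integer arithmetic keeps the boundaries exact for large token values.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))
```

For example, split_range(0, 100, 4) yields (0, 25), (25, 50), (50, 75) and (75, 100); repairing each of those in turn touches the same tokens as repairing (0, 100] in one go.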

@c3-clement
Contributor Author

> We don't actually need to run repair for all keyspaces, but we need to run a full repair for a specific keyspace and a specific table across all Cassandra nodes.

> What makes you think that's not what Reaper is doing? It's doing full repairs but splits the work in discrete chunks (subranges). Once the execution is finished, it's the same thing as running nodetool repair --full -pr everywhere

So, according to the Reaper API reference, to start a full repair programmatically I need to invoke POST /repair_run with incrementalRepair set to false?
Then I need to start the repair with PUT /repair_run/{id}/state/start, and I can monitor its progress with GET /repair_run/{id}.

Is there no way to abstract this process with one of the K8ssandra custom resources?
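The three calls described above can be sketched as plain request descriptions. This is a hedged sketch, not a verified client: the Reaper host, the helper names and the exact query parameter names (clusterName, keyspace, tables, owner) are assumptions to check against the Reaper REST documentation for the version in use. The final path segment of the state call is taken as a parameter; it defaults to RUNNING here on the assumption that Reaper expects a state name there, whereas the comment above writes it as "start".

```python
from urllib.parse import quote, urlencode

# Hypothetical Reaper address; replace with the real service URL.
REAPER_URL = "http://reaper.example:8080"


def build_create_run_request(cluster, keyspace, tables, owner):
    """Describe the POST that would create a full (non-incremental) repair run."""
    params = {
        "clusterName": cluster,
        "keyspace": keyspace,
        "tables": ",".join(tables),
        "owner": owner,
        "incrementalRepair": "false",
    }
    return ("POST", f"{REAPER_URL}/repair_run?{urlencode(params)}")


def build_set_state_request(run_id, state="RUNNING"):
    """Describe the PUT that would move a repair run to the given state."""
    run = quote(str(run_id), safe="")
    return ("PUT", f"{REAPER_URL}/repair_run/{run}/state/{quote(state, safe='')}")


def build_get_run_request(run_id):
    """Describe the GET that would report the progress of a repair run."""
    return ("GET", f"{REAPER_URL}/repair_run/{quote(str(run_id), safe='')}")
```

Watching the network calls made by the Reaper UI is a practical way to confirm the real parameter names and the state value before wiring these into anything.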

@adejanovski
Contributor

You can do that for sure; there's indeed no Kubernetes-native way to do this at the moment.

You should probably create repair schedules btw instead of creating the repair runs yourself, or even turn on the autoscheduler so that each new keyspace will get a schedule created automatically.

Lastly you can do all this from the UI if that's useful and watch the network calls to get your payloads right.

I've been thinking myself of making it possible to create schedules and repairs through the Kubernetes API, I think that would be a nice addition to Reaper.
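Creating a schedule rather than a one-off run could look like the sketch below. It is an illustration under assumptions, not a confirmed payload: the /repair_schedule path and the parameter names (clusterName, keyspace, owner, scheduleDaysBetween, incrementalRepair, tables) are recalled from the Reaper REST documentation and should be verified, and the host and helper name are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical Reaper address; replace with the real service URL.
REAPER_URL = "http://reaper.example:8080"


def build_create_schedule_request(cluster, keyspace, owner, days_between, tables=()):
    """Describe the POST that would create a recurring full-repair schedule."""
    params = {
        "clusterName": cluster,
        "keyspace": keyspace,
        "owner": owner,
        "scheduleDaysBetween": str(days_between),
        "incrementalRepair": "false",
    }
    # Leaving tables out is assumed to mean "all tables in the keyspace".
    if tables:
        params["tables"] = ",".join(tables)
    return ("POST", f"{REAPER_URL}/repair_schedule?{urlencode(params)}")
```

A custom resource for schedules would essentially carry these same fields in its spec and let a controller issue the request.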

@c3-clement changed the title from "Add repair support in K8ssandraTask" to "Add new CRD to create repairs schedules" on Oct 25, 2024
@c3-clement
Contributor Author

c3-clement commented Oct 25, 2024

Hi @adejanovski, I really appreciated your feedback, thanks again!

I've updated the issue title and description, as K8ssandraTask is not responsible for repairs.

To unblock the team that needs the repair and compaction API, I'll implement an internal CRD to abstract repair and compaction.

2 participants