
[RHCLOUD-35640] Update custom default group logic #1245

Open: lpichler wants to merge 19 commits into base: master
Conversation

@lpichler (Contributor) commented Oct 17, 2024

This PR adds replication for:

  • creation of the custom default group:
    • adding roles to the default group
    • removing roles from the default group
  • removal of custom groups

Link(s) to Jira

Description of Intent of Change(s)

The what, why and how.

Local Testing

  • add roles to the default group:

POST http://localhost:8000/api/rbac/v1/groups/<default_group_uuid>/roles/
Content-Type: application/json

{"roles":["d414435a-00fc-4dd5-82dd-c5c345ad1338"]}

  • remove roles from the default group:

DELETE http://localhost:8000/api/rbac/v1/groups/<custom_default_group_uuid>/roles/?roles=d414435a-00fc-4dd5-82dd-c5c345ad1338

  • remove a custom group:

DELETE http://localhost:8000/api/rbac/v1/groups/<custom_default_group_uuid>/
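For repeated local testing, the calls above could be scripted. This is a hypothetical helper sketch (not part of the PR) that builds the same three requests with only the standard library; the group uuid placeholders remain whatever your local environment assigns:

```python
# Sketch: build the local-testing requests programmatically.
# Sending them is a separate, explicit step (urlopen).
import json
import urllib.request

BASE = "http://localhost:8000/api/rbac/v1/groups"
ROLE_UUID = "d414435a-00fc-4dd5-82dd-c5c345ad1338"  # example role uuid from above


def add_roles_request(group_uuid: str, role_uuids: list) -> urllib.request.Request:
    """POST /groups/<uuid>/roles/ with a JSON body listing role uuids."""
    body = json.dumps({"roles": role_uuids}).encode()
    return urllib.request.Request(
        f"{BASE}/{group_uuid}/roles/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def remove_roles_request(group_uuid: str, role_uuids: list) -> urllib.request.Request:
    """DELETE /groups/<uuid>/roles/?roles=<comma-separated uuids>."""
    url = f"{BASE}/{group_uuid}/roles/?roles={','.join(role_uuids)}"
    return urllib.request.Request(url, method="DELETE")


def delete_group_request(group_uuid: str) -> urllib.request.Request:
    """DELETE /groups/<uuid>/ to remove a custom group."""
    return urllib.request.Request(f"{BASE}/{group_uuid}/", method="DELETE")


# Example send (requires the local server to be running):
# urllib.request.urlopen(add_roles_request("<default_group_uuid>", [ROLE_UUID]))
```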

Checklist

  • if API spec changes are required, is the spec updated?
  • are there any pre/post merge actions required? if so, document here.
  • are these changes covered by unit tests?
  • if warranted, are documentation changes accounted for?
  • does this require migration changes?
    • if yes, are they backwards compatible?
  • is there known, direct impact to dependent teams/components?
    • if yes, how will this be handled?

Secure Coding Practices Checklist Link

Secure Coding Practices Checklist

  • Input Validation
  • Output Encoding
  • Authentication and Password Management
  • Session Management
  • Access Control
  • Cryptographic Practices
  • Error Handling and Logging
  • Data Protection
  • Communication Security
  • System Configuration
  • Database Security
  • File Management
  • Memory Management
  • General Coding Practices

@lpichler lpichler changed the title [RHCLOUD-35640] Update custom default group logic [WIP] [RHCLOUD-35640] Update custom default group logic Oct 17, 2024
group_uuid = uuid4()
if settings.PRINCIPAL_CLEANUP_UPDATE_ENABLED_UMB and settings.REPLICATION_TO_RELATION_ENABLED:
    tenant_bootstrap_service = V2TenantBootstrapService(OutboxReplicator())
    bootstrapped_tenant = tenant_bootstrap_service.bootstrap_tenant(tenant)
lpichler (Contributor, Author) commented:

@alechenninger can we always count on it that whenever a tenant is created, the default and root workspaces are also created?

Contributor commented:

If it's after my CJI job runs, I think yes. But right now it is not guaranteed that a tenant has those workspaces created.
However, it is okay either way. They are guaranteed to coexist with the tenant mapping: if the tenant mapping exists, so do the default/root workspaces. If not, we create them.

Collaborator commented:

Right, by definition a tenant is not considered "bootstrapped" (in v2) if it doesn't have the built-in workspaces. That's part of the process. So bootstrap_tenant must create the built-in workspaces as well as the mapping.

    # This assumes that Tenant Mapping exist

Not sure what is meant by the comment exactly – but my interpretation is that it may be misleading. The method below does not assume the TenantMapping exists; it creates it if it doesn't.

lpichler (Contributor, Author) commented Oct 23, 2024:

This method is called only from the API endpoint, and in that situation the TenantMapping has to have been created already.
But maybe in some edge cases that is not true?

I can remove the comment. My main concern was whether the tenant could be created while the default workspace is not, or whether the tenant mapping could be created while the default workspace is not. If the tenant mapping is not created but the default workspace already exists, this method will fail – but it looks like that is not possible.

alechenninger (Collaborator) commented Oct 23, 2024:

This method is called only from the API endpoint, and in that situation the TenantMapping has to have been created already. But maybe in some edge cases that is not true?

Okay, yeah, I see the issue. We need to update the settings check above to check V2_BOOTSTRAP_TENANT, I think. As of the latest doc, I think this would only happen during migration steps. There is a window where a default group may attempt to be created for an existing tenant that is not yet bootstrapped.

We could keep this as is, and then to your point I think it should never happen (at least after replication is enabled). The user import job will take care of race conditions in that case (it will detect default groups added before bootstrapping). But I think it is a little safer to bootstrap the tenant here in that case, to ensure we use the uuid from the tenant mapping. And we know we cannot get into trouble there: because of the 1-1 unique constraint on the tenant mapping, any concurrent transaction creating one (e.g. the user import) will fail.

My main concern was whether the tenant could be created while the default workspace is not, or whether the tenant mapping could be created while the default workspace is not. If the tenant mapping is not created but the default workspace already exists, this method will fail – but it looks like that is not possible.

Yes, correct – we should never have workspaces without the TenantMapping. That's why the bootstrap service was created: to consolidate the logic that enforces those kinds of invariants.
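The invariant being discussed can be illustrated with a minimal get-or-create sketch. The class, field, and method names below are simplified hypothetical stand-ins, not the actual bootstrap service code:

```python
# Sketch: bootstrap_tenant creates the mapping and the built-in
# workspaces together, idempotently, so neither can exist alone.
from dataclasses import dataclass


@dataclass
class BootstrappedTenant:
    mapping: dict        # stand-in for TenantMapping
    workspaces: tuple    # built-in workspaces created with it


class TenantBootstrapService:
    def __init__(self):
        self._bootstrapped = {}  # tenant_name -> BootstrappedTenant

    def bootstrap_tenant(self, tenant_name: str) -> BootstrappedTenant:
        """Get-or-create: return the existing record, or create the
        mapping and built-in workspaces in one step."""
        if tenant_name not in self._bootstrapped:
            self._bootstrapped[tenant_name] = BootstrappedTenant(
                mapping={"default_group_uuid": f"{tenant_name}-default-group"},
                workspaces=("root", "default"),
            )
        return self._bootstrapped[tenant_name]
```

Calling `bootstrap_tenant` twice for the same tenant returns the same record, which is the property that makes it safe to call from the API endpoint even when the tenant was already bootstrapped.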

@lpichler lpichler force-pushed the update_custom_default_group_logic branch from c9e05da to 3dad051 on October 17, 2024 12:39

rbac/management/group/definer.py (outdated, resolved)
rbac/management/relation_replicator/relation_replicator.py (outdated, resolved)
rbac/management/group/definer.py (outdated, resolved)
@lpichler (Contributor, Author) commented:

/retest

@lpichler lpichler force-pushed the update_custom_default_group_logic branch 6 times, most recently from 89aa42a to 66742b0 on October 22, 2024 09:31
@lpichler lpichler changed the title [WIP] [RHCLOUD-35640] Update custom default group logic [RHCLOUD-35640] Update custom default group logic Oct 22, 2024
@lpichler lpichler force-pushed the update_custom_default_group_logic branch 2 times, most recently from 3727059 to 192d992 on October 22, 2024 12:05
@lpichler lpichler force-pushed the update_custom_default_group_logic branch from 192d992 to 10f693c on October 22, 2024 12:22
@lpichler (Contributor, Author) commented:

/retest



 @transaction.atomic
-def add_roles(group, roles_or_role_ids, tenant, user=None):
+def add_roles(group, roles_or_role_ids, tenant, user=None, relations=None):
Collaborator commented:

Main thing I am wondering about is if we can avoid passing relations here all the way through.

It is a bit awkward right now. At the very least I would rename the parameter. But maybe we can have this return relations, and then have an outer method consolidate them into a single replication event. I will take a closer look tomorrow.

Sorry for the short comment – need to run now and just wanted to jot this down for you to think on :-). Catch up more tomorrow :-).

lpichler (Contributor, Author) commented:

Main thing I am wondering about is if we can avoid passing relations here all the way through.

Yes, definitely, I agree! I wanted to know whether the changes are okay with regard to the algorithm and relation generation. Anyway, I will rework it to avoid passing relations.

alechenninger (Collaborator) commented Oct 24, 2024:

So as I was looking at this I think I have a suggestion for this, which I think would also make the logic easier to understand/validate.

First, it involves using two outbox events. One event will separate out the default access, replicating that default access as new role bindings. The other will update those role bindings to add or remove roles.

I do not think it is worth the complexity to combine those outbox events. Using two separate events does two things: (1) the additional access or removed access will be "more" eventually consistent with RBAC, by one additional event, and (2) it adds potentially redundant work. As for (1), it is already eventually consistent. Because both events are committed in one transaction, they will still be correctly ordered with respect to concurrent changes (e.g. adding/removing users from the group which is itself eventually consistent for default groups anyway). So, it actually doesn't change the semantics in any way to combine those events. To any perspective, the result is the same: access is the default, and then it changes at some point in the future, regardless of whether that happens with one replication event or two.

As for (2) (performance)... I'm just thinking it's premature optimization to think about that much otherwise. The worst case scenario is that you remove all access that was just added. So you'd start by adding all role bindings, and then you'd remove them. Realistically, if you had default access for 30 roles, that means adding ~90 tuples and then removing them. In the grand scheme of things, probably not a big deal. And if it's a problem, we can always improve it later.

So given that, here is what I would suggest:

  1. Change the RelationApiDualWriteGroupHandler, but not quite as much as in this PR currently. Maybe just keep the separate replicate piece, or let it take multiple roles at once. So change replicate_added_role to accept multiple roles, and then we just add them all together in one method call. Or separate add_role from replicate_added_role and then keep the separate replicate() method. (But we don't need the custom group param or other stuff.)
  2. In clone_default_group_in_public_schema, replicate an event that (a) removes default access and (b) re-adds the roles. I think this should be able to reuse the old RelationApiDualWriteGroupHandler code and call replicate_added_roles with all the roles (or whatever, based on the above), similar to what add_roles used to do. I would probably add an argument or method to the group handler for removing default access, and then remove the default_bindings_from_mapping method from the bootstrap service. The logic there doesn't really have to be kept in the bootstrap service, for a few reasons:
     • The group handler already knows that default access goes to the default workspace, and it already looks up the default workspace.
     • It already has the group uuid and the mapping for the default role binding uuid.
     • The only missing piece then is the platform default role uuid. We could consider just not removing this part, since strictly speaking all we have to remove is the binding relationship. Regardless, the platform default role uuid is already repeated by the seeding process, so it's something that should really be exposed outside of the bootstrap service in a reusable place anyway.
  3. Then, adjust add_roles and remove_roles only to use the modified methods per 1 above. (It would be an unrelated improvement to have a single outbox event for those – I hadn't realized, or had forgotten, that it was doing multiple.)

One easy way to improve the performance, too, would be to mint a ReplicationEvent and optionally allow passing that into the add/remove roles methods. Then, if present, "merge" it with the one generated by that method. This would decouple it from the logic and keep the "redundant" work purely in memory confined to a single CPU.

So in summary, that would look something like this (pseudo-ish code; adjust as needed per any different API choices you make):

def clone_default_group_in_public_schema(group, tenant) -> Optional[Group]:
    """Clone the default group for a tenant into the public schema."""
    bootstrapped_tenant = None
    if settings.V2_TENANT_BOOTSTRAP:
        tenant_bootstrap_service = V2TenantBootstrapService(OutboxReplicator())
        bootstrapped_tenant = tenant_bootstrap_service.bootstrap_tenant(tenant)
        group_uuid = bootstrapped_tenant.mapping.default_group_uuid
    else:
        group_uuid = uuid4()

    public_tenant = Tenant.objects.get(tenant_name="public")
    # ... all the same group and policy set up stuff
    dual_write_handler = RelationApiDualWriteGroupHandler(group, ReplicationEventType.CUSTOMIZE_DEFAULT_GROUP)
    dual_write_handler.replicate_added_roles(public_default_roles, remove_default_access_from=bootstrapped_tenant.mapping)
    return group

And then inside replicate_added_roles, if the mapping argument was provided we would either just remove a tuple from this group to the default role binding, or additionally remove the default role and default group subject as well. But if that's a pain or the logic is messy it's really optional–it's just an optimization to avoid extra tuples lying around. But they won't hurt I think.

And then add/remove_roles are mostly untouched I think (aside from any updates to dual write handler interface).

WDYT?
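The "mint a ReplicationEvent and merge it" idea mentioned above could be sketched roughly like this. The names, the tuple shape, and the `replicate_added_roles` signature are hypothetical illustrations, not the actual handler API:

```python
# Sketch: an optional incoming event absorbs the relations a method
# would replicate, so the caller can emit a single outbox event.
from dataclasses import dataclass, field


@dataclass
class ReplicationEvent:
    event_type: str
    add: list = field(default_factory=list)     # relation tuples to add
    remove: list = field(default_factory=list)  # relation tuples to remove

    def merge(self, other: "ReplicationEvent") -> "ReplicationEvent":
        """Fold another event's relations into this one, in order."""
        self.add.extend(other.add)
        self.remove.extend(other.remove)
        return self


def replicate_added_roles(group_uuid, role_uuids, event=None):
    """Build the relations for the added roles; merge into `event` if given."""
    local = ReplicationEvent(
        "CUSTOMIZE_DEFAULT_GROUP",
        add=[(group_uuid, "binding", r) for r in role_uuids],
    )
    if event is not None:
        return event.merge(local)  # caller replicates the merged event once
    return local                   # caller replicates this event directly
```

This keeps the "redundant" add-then-remove work purely in memory, as suggested, while the decision of how many outbox rows to write stays with the outermost caller.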

@lpichler lpichler force-pushed the update_custom_default_group_logic branch from 10f693c to ed35176 Compare October 24, 2024 12:30
Comment on lines +138 to +139
roles = Role.objects.filter(policies__group=custom_group)
system_roles = roles.filter(tenant=Tenant.objects.get(tenant_name="public"))
Collaborator commented:

Curious – do you need the additional filter on tenant?

Comment on lines +282 to +285
bootstrap_service = V2TenantBootstrapService(replicator=NoopReplicator())
bootstrapped_tenant = bootstrap_service.bootstrap_tenant(self.group.tenant)
relations_to_add = bootstrap_service.default_bindings_from_mapping(bootstrapped_tenant)
self.group_relations_to_add.extend(relations_to_add)
alechenninger (Collaborator) commented Oct 25, 2024:

I think this logic is technically correct and working, but I wonder if it should really go through the bootstrap service. I kind of touched on this in the other comment: while that service does rely on this logic, it's a little awkward to make it the main source for something relatively simple. I'm curious what it would look like if we instead queried for the tenant's mapping here and created the relationships to remove inline.

We can again consider only adding/removing the binding tuple, which would simplify this, if needed. Removing the other tuples is not required, it's only a database storage optimization, which is probably a minor concern. So it might be more worth it to leave them and simplify the code and maintenance of tuples.

@@ -353,6 +353,7 @@
V2_MIGRATION_RESOURCE_EXCLUDE_LIST = ENVIRONMENT.get_value("V2_MIGRATION_RESOURCE_EXCLUDE_LIST", default="").split(",")
V2_BOOTSTRAP_TENANT = ENVIRONMENT.bool("V2_BOOTSTRAP_TENANT", default=False)
V1_BOOTSTRAP_ADD_USER_ID = ENVIRONMENT.bool("V1_BOOTSTRAP_ADD_USER_ID", default=False)
LOG_REPLICATION = ENVIRONMENT.bool("LOG_REPLICATION", default=False)
alechenninger (Collaborator) commented Oct 25, 2024:

We could consider an "aggregate replicator" that passes the event to multiple implementations, and then optionally use it to combine the logging replicator with another impl, as an elegant way to add logging to any replicator. I'm not sure it's worth it, though – and if it is, maybe it can be its own small PR.

Another option would be to add normal logging statements to the OutboxReplicator and configure that with normal logging configuration.
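The aggregate-replicator idea is essentially the composite pattern, and could be sketched like this. The interface is hypothetical; a simple recording implementation stands in for the real replicators (e.g. OutboxReplicator):

```python
# Sketch: fan one replication event out to several replicator
# implementations, so logging can be layered onto any real one.
import logging

logger = logging.getLogger(__name__)


class RecordingReplicator:
    """Stand-in replicator that records (and could log) every event."""

    def __init__(self):
        self.seen = []

    def replicate(self, event):
        self.seen.append(event)
        logger.debug("replicated: %r", event)


class AggregateReplicator:
    """Passes each event to every configured replicator, in order."""

    def __init__(self, *replicators):
        self._replicators = replicators

    def replicate(self, event):
        for replicator in self._replicators:
            replicator.replicate(event)
```

With this shape, `AggregateReplicator(LoggingReplicator(), OutboxReplicator())` would add logging without touching the outbox implementation, which is the "elegant way" described above; the alternative in the comment (plain logging statements inside OutboxReplicator) avoids the extra class at the cost of coupling.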
