Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add repository_version param as a building context #1687

Merged
merged 1 commit into from
Oct 4, 2024

Conversation

git-hyagi
Copy link
Contributor

closes: #479

Comment on lines 52 to 54
Non-staff users, lacking read access to the `artifacts` endpoint, may encounter restricted
functionality as they are prohibited from listing artifacts uploaded to Pulp and utilizing
them within the build process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functionality will no longer be limited by the access to the artifacts endpoint.

Comment on lines 66 to 51
TASK_HREF=$(http --form POST :$REPO_HREF'build_image/' -F "containerfile@./Containerfile" \
repo_version=$REPO_VERSION | jq -r '.task')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TASK_HREF=$(http --form POST :$REPO_HREF'build_image/' -F "containerfile@./Containerfile" \
repo_version=$REPO_VERSION | jq -r '.task')
TASK_HREF=$(http --form POST :$REPO_HREF'build_image/' "containerfile@./Containerfile" \
repo_version=$REPO_VERSION | jq -r '.task')

@@ -696,6 +696,7 @@ class ContainerRepositoryViewSet(
"condition": [
"has_model_or_obj_perms:container.build_image_containerrepository",
"has_model_or_obj_perms:container.view_containerrepository",
"has_repository_model_or_obj_perms:file.view_filerepository",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to this line, I am receiving an HTTP 500 error.

[pulp]  | pulp [d8943394f02c4759b3deac8f518d2218]: django.request:ERROR: Internal Server Error: /pulp/api/v3/repositories/container/container/01907884-8984-7905-bc76-58b8c747c347/build_image/
[pulp]  | Traceback (most recent call last):
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/exception.py", line 55, in inner
[pulp]  |     response = get_response(request)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/django/core/handlers/base.py", line 197, in _get_response
[pulp]  |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/django/views/decorators/csrf.py", line 56, in wrapper_view
[pulp]  |     return view_func(*args, **kwargs)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/viewsets.py", line 124, in view
[pulp]  |     return self.dispatch(request, *args, **kwargs)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 509, in dispatch
[pulp]  |     response = self.handle_exception(exc)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 469, in handle_exception
[pulp]  |     self.raise_uncaught_exception(exc)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 480, in raise_uncaught_exception
[pulp]  |     raise exc
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 497, in dispatch
[pulp]  |     self.initial(request, *args, **kwargs)
[pulp]  |   File "/src/pulpcore/pulpcore/app/viewsets/base.py", line 300, in initial
[pulp]  |     super().initial(request, *args, **kwargs)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 415, in initial
[pulp]  |     self.check_permissions(request)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_framework/views.py", line 332, in check_permissions
[pulp]  |     if not permission.has_permission(request, self):
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_access_policy/access_policy.py", line 69, in has_permission
[pulp]  |     allowed = self._evaluate_statements(statements, request, view, action)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_access_policy/access_policy.py", line 113, in _evaluate_statements
[pulp]  |     matched = self._get_statements_matching_conditions(
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_access_policy/access_policy.py", line 262, in _get_statements_matching_conditions
[pulp]  |     passed = self._check_condition(condition, request, view, action)
[pulp]  |   File "/usr/local/lib/python3.9/site-packages/rest_access_policy/access_policy.py", line 286, in _check_condition
[pulp]  |     result = method(request, view, action, arg)
[pulp]  |   File "/src/pulpcore/pulpcore/app/global_access_conditions.py", line 407, in has_repository_model_or_obj_perms
[pulp]  |     return request.user.has_perm(permission) or has_repository_obj_perms(
[pulp]  |   File "/src/pulpcore/pulpcore/app/global_access_conditions.py", line 367, in has_repository_obj_perms
[pulp]  |     plugin_repository = Repository.objects.get(pk=view.kwargs["repository_pk"]).cast()
[pulp]  | KeyError: 'repository_pk'

Reproducer:

pulp user create --username test --password krokodyl1 --email [email protected]
pulp user role-assignment add --username test --role "container.containerdistribution_collaborator" --object ""
pulp user role-assignment add --username test --role "container.containerrepository_content_manager" --object ""

FILE_REPO=$(pulp file repository create --name foo --autopublish | jq -r '.pulp_href')
pulp file content upload --repository foo --file ./setup.py --relative-path setup.py

CONTAINER_REPO=$(pulp container repository create --name build | jq -r '.pulp_href') 

http -a test:krokodyl1 --form POST :5001${CONTAINER_REPO}build_image/ "containerfile@./Containerfile" repo_version=${FILE_REPO}versions/1/

I do not think we can inject the policy from other entities' contexts here (repo_version=FILE_HREF). The policy is available only for the current viewset scope (CONTAINER_REPO). I think we need to run permission checks manually. Similarly to

repositories_by_namespace = get_objects_for_user(
or create a customized policy condition, like in
def has_namespace_or_obj_perms(request, view, action, permission):
.

@mdellweg, any ideas on how to properly add permission checks for entities created from within a different plugin? In our case, we want to check if a user has permissions to read pulp_file repository and build an image inside pulp_container repository.

file_bindings.RepositoriesFileApi.modify(file_repo_with_auto_publish.pulp_href, data).task
)

build_response = container_repository_api.build_image(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind adding a test case that triggers users' permissions checks before building an image?

with user_helpless, pytest.raises(APIException):
    container_repository_api.build_image()
    
with user_manager:
    container_repository_api.build_image()

...

Comment on lines 815 to 826
if "repo_version" in data:
version_href = data["repo_version"]
repo_version_artifacts = RepositoryVersion.objects.get(pk=version_href.pk).artifacts
files = FileContent.objects.filter(
digest__in=repo_version_artifacts.values("sha256")
).values("_artifacts__pk", "relative_path")
if len(files) == 0:
raise serializers.ValidationError(
_("No file found for the specified repository version.")
)
for file in files:
artifacts[str(file["_artifacts__pk"])] = file["relative_path"]
Copy link
Member

@lubosmj lubosmj Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image building is still considered to be a tech-preview. Have you considered changing the current API to better fit our needs? Instead of passing a list of artifacts to the task, we can try passing only the reference to the repository version, after verifying user's permissions. The task will then unpack the repository version and run the suggested queries ^. Running these queries while serializing input data may block api-app during the execution. We can outsource the evaluation to the tasking system freely.

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch 2 times, most recently from b947807 to 3bb3329 Compare July 10, 2024 13:50
Copy link
Member

@lubosmj lubosmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! I left a couple of small suggestions. Also, I would still ditch the artifacts parameter from the build endpoint. What do you think, @ipanova, should we keep it or not?

Comment on lines 137 to 144
repo_version_artifacts = RepositoryVersion.objects.get(pk=repo_version).artifacts
files = FileContent.objects.filter(
digest__in=repo_version_artifacts.values("sha256")
).values("_artifacts__pk", "relative_path")
if len(files) == 0:
raise ValidationError(_("No file found for the specified repository version."))
for file in files:
artifacts[str(file["_artifacts__pk"])] = file["relative_path"]
Copy link
Member

@lubosmj lubosmj Jul 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we have the list of artifacts already available from RepositoryVersion.objects.get(pk=repo_version).artifacts. Have you considered iterating over this queryset instead of devising a new dictionary and then running another DB queries for getting the corresponding artifacts in the for loop below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I could not find the relative_path (that will be used in the Containerfile) of each file in the artifacts queryset. Is there a better way to find it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something along these lines:

for content in FileContent.objects.filter(pk__in=repository_version.content).prefetch_related("contentartifact_set"):
    content._artifacts.get(), content.relative_path
    ...

But, you will need to check whether prefetch_related or select_related is working in this scenario. And, whether content._artifacts.get() hits the database or not. The idea is to hit the database fewer times.


Sometimes, there is already implemented the same thing that you are trying accomplish. In this case, skimming through the pulpcore code might be helpful:

  1. https://github.com/pulp/pulpcore/blob/ac08d63534227c07e67e01ca8a0afdf97c038389/pulp_file/app/tasks/publishing.py#L67-L82
  2.  pulpcore/app/models/repository.py:                    content_batch_qs.prefetch_related("contentartifact_set")
     pulpcore/app/serializers/content.py:        Content.objects.prefetch_related("_artifacts").all()
    

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you may want to verify whether the artifacts are in place or not. A user could sync a repository with the on-demand policy and then use it as the build context for pulp-container. We should be careful about this.

Artifact.objects.filter(pk__in=artifacts.keys()).touch()
elif serializer.validated_data.get("repo_version"):
repo_version = serializer.validated_data["repo_version"]
self._check_file_repo_permission(request, repo_version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing this, we may want to declare a new entry inside the access policy, like we did for the repository viewset:

"action": ["sync"],
"principal": "authenticated",
"effect": "allow",
"condition": [
"has_model_or_obj_perms:container.sync_containerrepository",
"has_remote_param_model_or_obj_perms:container.view_containerremote",
"has_model_or_obj_perms:container.view_containerrepository",
],
},

Comment on lines 59 to 60
# workaround for drf-spectacular not allowing artifacts=None and raising an exception
# (a bytes-like object is required, not 'NoneType')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean that when I want to build a new image from a Containerfile that does not require any additional artifacts, I cannot use the python bindings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. Testing (from main branch) with:

    build_response = container_repository_api.build_image(
        repository.pulp_href, containerfile=containerfile_name, artifacts=None
    )

will fail:

E               TypeError: a bytes-like object is required, not 'NoneType'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so unfortunate. Is there any chance that we provide an invalid openapi schema for the generation?

@@ -0,0 +1 @@
Added a feature to allow using file `repository_version` as a building context of a Containerfile.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Added a feature to allow using file `repository_version` as a building context of a Containerfile.
Added support for using the file `repository_version` as the build context for a Containerfile.

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch 5 times, most recently from 0e87500 to a988728 Compare July 17, 2024 19:13
Comment on lines 126 to 128
serializer = serializers.OCIBuildImageSerializer(
data=request.data, context={"request": request}
)
Copy link
Member

@lubosmj lubosmj Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we cannot make this more generic, we should rename the function to something like "has_image_build_file_repo_perms".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad!

@@ -0,0 +1 @@
Removed support for using raw artifacts as the build context for a Containerfile.
Copy link
Member

@lubosmj lubosmj Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed. You can mention that we replaced "artifacts" by "repo_version" in the feature changelog.

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch from a988728 to 010d8fb Compare July 18, 2024 15:02
Comment on lines 1 to 2
Replaced `artifacts` by `repository_version` as the parameter to provide the build context
for a Containerfile.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of repository_version? pulp-container's one? 😜

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops ... good catch!

@@ -41,12 +41,18 @@ CMD ["cat", "/inside-image.txt"]' >> Containerfile
## Build an OCI image

```bash
TASK_HREF=$(http --form POST :$REPO_HREF'build_image/' containerfile@./Containerfile \
artifacts="{\"$ARTIFACT_HREF\": \"foo/bar/example.txt\"}" | jq -r '.task')
ARTIFACT_SHA256=$(http :$ARTIFACT_HREF | jq -r '.sha256')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer care about artifacts, do we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum ... we don't, I kept this artifact just as a sample/doc to populate the repository:
https://github.com/pulp/pulp_container/pull/1687/files/010d8fb88c2361abbe8f4cefc5a6b62ba42308ee#diff-8d94fc0843f51b6f430b79c609efee42205be12f7fee3369c1dd7824eeabe779R47-R48

I'll update it to create the file content "directly".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do. We should forget about using the artifacts endpoint.

Comment on lines 128 to 130
obj = view.get_object() if view.detail else None
context = {"request": request}
serializer = view.serializer_class(instance=obj, data=request.data, context=context)
Copy link
Member

@lubosmj lubosmj Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help me understand this. This is invoking ContainerRepositorySerializer instead of OCIBuildImageSerializer, correct? Is the repo_version always initialized during the serialization when provided? Or, is this function always returning True?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help me understand this. This is invoking ContainerRepositorySerializer instead of OCIBuildImageSerializer, correct?

Hum... I'm seeing the view.serializer_class returning OCIBuildImageSerializer in the debugger.

Is the repo_version always initialized during the serialization when provided?

Yes, when we provide a repo_version, the serializer will:
https://github.com/pulp/pulp_container/pull/1687/files/010d8fb88c2361abbe8f4cefc5a6b62ba42308ee#diff-1689ddc49f7d0e21a80ae6006966cfb47766de4a004c7bb1b8e7f442b285384eR752
https://github.com/pulp/pulp_container/pull/1687/files/010d8fb88c2361abbe8f4cefc5a6b62ba42308ee#diff-1689ddc49f7d0e21a80ae6006966cfb47766de4a004c7bb1b8e7f442b285384eR780-R781

    repo_version = RepositoryVersionRelatedField(...)
... 
        if "repo_version" in data:
        |   data["repo_version"] = data["repo_version"].pk

Or, is this function always returning True?

If my tests were valid, it is returning False if the user has no permission to the file repo.

Running the following test returned "You do not have permission to perform this action.":

pulp user create --username test --password krokodyl1 --email [email protected]
pulp user role-assignment add --username test --role "container.containerdistribution_collaborator" --object ""
pulp user role-assignment add --username test --role "container.containerrepository_content_manager" --object ""
echo 'Hello world!' > /tmp/example.txt

FILE_REPO=$(pulp file repository create --name foo --autopublish | jq -r '.pulp_href')
pulp file content upload --repository foo --file /tmp/example.txt --relative-path example4.txt

CONTAINER_REPO=$(pulp container repository create --name build | jq -r '.pulp_href')
mkdir -p /tmp/test
echo 'FROM centos:7
COPY example4.txt /inside-image.txt
CMD ["cat", "/inside-image.txt"]' > /tmp/test/Containerfile

http -a test:krokodyl1 --form POST :5001${CONTAINER_REPO}build_image/ "containerfile@/tmp/test/Containerfile" repo_version=${FILE_REPO}versions/1/

and then it worked (returned "202 Accepted") with:

pulp user role-assignment add --username test --role "file.filerepository_viewer" --object ""
http -a test:krokodyl1 --form POST :5001${CONTAINER_REPO}build_image/ "containerfile@/tmp/test/Containerfile" repo_version=${FILE_REPO}versions/1/

Copy link
Member

@lubosmj lubosmj Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for clarifying this. I thought that it would just take the default repository viewset serializer from

serializer_class = serializers.ContainerRepositorySerializer

Since this works out of the box and it is generic enough, do you think we should move this access policy check to pulpcore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... considering it could be helpful to other plugins I would vote yes, but considering the need to do a serialization I am not confident that it is a good approach (at least, not yet to be used by other plugins).
Do you think the serialization is fine and we should not be too worried about it?

Copy link
Member

@lubosmj lubosmj Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, something similar is already used in https://github.com/pulp/pulpcore/blob/92785d9e3ed917b8b0bec4c24c3669f594a81018/pulpcore/app/global_access_conditions.py#L146. But, I guess this whole discussion needs someone who has better expertise in this area...

@lubosmj
Copy link
Member

lubosmj commented Jul 23, 2024

This is blocked by pulp/pulpcore#5618.

Comment on lines 119 to 127
def test_build_image_from_repo_version_with_creator_user(
build_image,
containerfile_name,
create_file_and_container_repos_with_sample_data,
delete_orphans_pre,
gen_user,
):
"""Test if a user (with the expected permissions) can build an OCI image."""
container_repo, file_repo = create_file_and_container_repos_with_sample_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the pulp_container's container_repo fixture instead.

def test_build_image(
pulpcore_bindings,
@pytest.fixture
def create_file_and_container_repos_with_sample_data(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def create_file_and_container_repos_with_sample_data(
def populated_file_repo(

You can use the pulpcore's file_repo fixture and upload content to it. There is no need to enable auto-publishing.


def _copy_file_from_artifact(context_path, relative_path, artifact):
dest_path = os.path.join(context_path, relative_path)
dirs = os.path.split(dest_path)[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dirs = os.path.split(dest_path)[0]
dirs = os.path.dirname(dest_path)

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch from c4b90a1 to fc6d605 Compare July 24, 2024 12:14
lubosmj
lubosmj previously approved these changes Jul 24, 2024
Copy link
Member

@lubosmj lubosmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me! Let's wait for other members to review this.

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch from fc6d605 to ef4a129 Compare July 26, 2024 11:29
Comment on lines 1 to 3
Replaced `artifacts` by `repository_version` (i.e., a file plugin repository version HREF)
as the parameter to provide the build context for a Containerfile.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that "artifacts" is removed in turn. You should add a removal snippet.

Remind me: Image building is still tech-preview?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, image building is still tech-preview. Should we add a removal snippet even this way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say, yes.

@@ -956,7 +956,7 @@ def build_image(self, request, pk):
"containerfile_pk": str(containerfile.pk),
"tag": tag,
"repository_pk": str(repository.pk),
"artifacts": artifacts,
"repo_version": repo_version.pk,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"repo_version": repo_version.pk,
"repository_version_pk": repository_version.pk,

It's easier to follow around if we don't change the name in the middle.
And maybe this should be build_context_pk instead.

@@ -750,11 +749,10 @@ class OCIBuildImageSerializer(ValidateFieldsMixin, serializers.Serializer):
tag = serializers.CharField(
required=False, default="latest", help_text="A tag name for the new image being built."
)
artifacts = serializers.JSONField(
repository_version = RepositoryVersionRelatedField(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we call that build_context?


containerfile_path = os.path.join(working_directory, "Containerfile")
if repo_version:
file_repository = RepositoryVersion.objects.get(pk=repo_version)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming here could also be more consistent:

Suggested change
file_repository = RepositoryVersion.objects.get(pk=repo_version)
build_context = RepositoryVersion.objects.get(pk=build_context_pk)

content_artifacts = ContentArtifact.objects.filter(
content__in=file_repository.content
).order_by("-content__pulp_created")
for content in content_artifacts.select_related("artifact").iterator():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too: using the name "content" is misleading since it's content artifact.
BTW: I think it would be better to loop over actual FileContent to be future proof.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... even if with FileContent we would need to do more database hits?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we add a filter(content__pulp_type="file") to be sure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Comment on lines 52 to 57
if not containerfile_name:
containerfile_name = ""

if not build_context:
build_context = b""

build_response = container_repository_api.build_image(
container_container_repository_href=repository,
containerfile=containerfile,
containerfile_name=containerfile_name,
build_context=build_context,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if not containerfile_name:
containerfile_name = ""
if not build_context:
build_context = b""
build_response = container_repository_api.build_image(
container_container_repository_href=repository,
containerfile=containerfile,
containerfile_name=containerfile_name,
build_context=build_context,
)
build_response = container_repository_api.build_image(
container_container_repository_href=repository,
containerfile=containerfile,
containerfile_name=containerfile_name or "",
build_context=build_context or "",
)

Comment on lines 130 to 131
else:
containerfile = Artifact.objects.get(pk=containerfile_artifact_pk)
Copy link
Member

@lubosmj lubosmj Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
else:
containerfile = Artifact.objects.get(pk=containerfile_artifact_pk)
elif containerfile_artifact_pk:
containerfile = Artifact.objects.get(pk=containerfile_artifact_pk)
else:
raise RuntimeError("Neither an artifact nor temporary file for the Containerfile was specified.")

Comment on lines 202 to 203
with pytest.raises(ApiException) as e:
build_image(
repository=container_repo.pulp_href,
containerfile_name="Non_existing_file",
build_context=f"{populated_file_repo.pulp_href}versions/2/",
)
assert "Could not find the Containerfile" in e.value.body
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also check that this returns an HTTP 400 error?

@@ -0,0 +1 @@
Removed support for using raw artifacts as the build context and for the Containerfile.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth mentioning that this concerns the image building feature.

Comment on lines 1 to 5
Replaced `artifacts` by `build_context` (i.e., a file plugin repository version HREF)
as the parameter to provide the build context for a Containerfile.
Added the `containerfile_name` as the parameter to provide the relative-path of a File Content
from the provided `build_context`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth mentioning that this concerns the image building feature.

Comment on lines 1 to 5
Updated the image building feature:
* Replaced `artifacts` by `build_context` (i.e., a file plugin repository version HREF)
as the parameter to provide the build context for a Containerfile.
* Added the `containerfile_name` as the parameter to provide the relative-path of a File Content
from the provided `build_context`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedro-psb, is this bullet-point list in the changelog going to be properly rendered by mkdocs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the bullet paragraph has a 4 space indentation it will render correctly.

Updated the image building feature:
    * Replaced `artifacts` by `build_context` (i.e., a file plugin repository version HREF)
    as the parameter to provide the build context for a Containerfile.
    * Added the `containerfile_name` as the parameter to provide the relative-path of a File Content
    from the provided `build_context`.

Otherwise it'll become a flat list.

containerfile = Artifact.objects.get(pk=containerfile_artifact_pk)
else:
raise RuntimeError(
"Neither an artifact or temporary file for the Containerfile was specified."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Neither an artifact or temporary file for the Containerfile was specified."
"Neither an artifact nor temporary file for the Containerfile was specified."

The last nitpick.

Copy link
Member

@lubosmj lubosmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

Comment on lines 11 to 16
To make a file available during the build process, we need to add it to a versioned file repository,
then pass it as the `build_context` query parameter of the build container API, and finally specify the
```
ADD/COPY <file relative-path> <location in container>
```
instruction in the Containerfile.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What is a file? We should re-phrase it to something like "To pass arbitrary files or artifacts to the image building context, the build_context property can be provided in the request payload, containing a reference to a file repository."
  2. build_context is not a query parameter. It is a "JSON property" in the request body, see the difference.

Comment on lines 83 to 84
File repositories synced with on-demand policy will not automatically pull the missing artifacts.
Trying to build using a file that is not yet pulled will fail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
File repositories synced with on-demand policy will not automatically pull the missing artifacts.
Trying to build using a file that is not yet pulled will fail.
File repositories synced with the on-demand policy will not automatically download the missing artifacts.
Trying to build an image using a file that has not yet been downloaded will fail.

Comment on lines 796 to 798
# the "has_repo_or_repo_ver_param_model_or_obj_perms" permission condition function
# expects a "repo" or "repository_version" arguments, so we need to pass "build_context"
# as "repository_version" to be able to validate the permissions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# the "has_repo_or_repo_ver_param_model_or_obj_perms" permission condition function
# expects a "repo" or "repository_version" arguments, so we need to pass "build_context"
# as "repository_version" to be able to validate the permissions
# the "has_repo_or_repo_ver_param_model_or_obj_perms" permission condition function
# expects a "repository" or "repository_version" arguments, so we need to pass "build_context"
# as "repository_version" to be able to validate the permissions

I find this comment redundant. It would be better to open a new feature request in pulpcore and add a TODO comment referencing it.

if containerfile_name and content_artifact.relative_path == containerfile_name:
data["containerfile_artifact_pk"] = str(content_artifact.artifact.pk)

data["content_artifact_pks"].append(content_artifact.pk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When serializing data and dispatching a task from the viewset, there is a small window where someone could remove a file repository and we would end up with a list of non-existing keys. The task might fail when reading these keys. However, I am still unsure it is worth moving this code back to the task.

lubosmj
lubosmj previously approved these changes Sep 11, 2024
Copy link
Member

@lubosmj lubosmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

Comment on lines 1 to 5
Updated the image building feature:
* Replaced `artifacts` by `build_context` (i.e., a file plugin repository version HREF)
as the parameter to provide the build context for a Containerfile.
* Added the `containerfile_name` as the parameter to provide the relative-path of a File Content
from the provided `build_context`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to all of us:
Towncrier does not do Markdown validation. So we need to be on the lookout if this produces a nice and readable changelog in the end.

Comment on lines 58 to 61
!!! note

If `TMPFILE_PROTECTION_TIME` is set to 0 (default value), the automatic cleanup is disabled,
meaning all the uploaded Containerfile used for the builds will be kept in `FILE_UPLOAD_TEMP_DIR`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is something we should change.
Tmpfiles should only be left over when a task failed or was canceled. After successful completion, the task should really clean up the kitchen. Don't bother the user with janitorial work.

"tag": tag,
"repository_pk": str(repository.pk),
"artifacts": artifacts,
"content_artifact_pks": content_artifact_pks,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I was thinking we were going to pass the build context and the containerfile_name to the task.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should keep the old task untouched to be deprecated and removed in a few releases and design a new one with a clean interface next to it.


return data

def deferred_validation(self, data):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deferred validation will only run, if we call validate from the task context.
Actually on this kind of Serializer, there is no deferred_validation at all. Look for the Content upload serializer hierarchy to see where it's implemented.

@lubosmj
Copy link
Member

lubosmj commented Sep 26, 2024

Is this PR ready for review?

@git-hyagi git-hyagi marked this pull request as ready for review September 27, 2024 19:56
Copy link
Member

@mdellweg mdellweg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some cosmetic suggestions.

Overall, looks very good!

functionality as they are prohibited from listing artifacts uploaded to Pulp and utilizing
them within the build process.
File repositories synced with the on-demand policy will not automatically download the missing artifacts.
Trying to build an image using a file that has not yet been downloaded will fail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we intend to lift that limitation some day?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good improvement.
We could “force” (or automate) the download by Pulp if the artifacts were not found, or maybe do not allow the usage of on-demand repositories for the build machinery.

elif "containerfile_artifact" in data:
data["containerfile_artifact"].touch()
else:
if bool(data.get("containerfile", None)) == bool(data.get("containerfile_name", None)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poor mans xor... 🤣

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤣

Comment on lines 127 to 129
if containerfile_tempfile_pk:
containerfile_artifact = PulpTemporaryFile.objects.get(pk=containerfile_tempfile_pk)

if not containerfile_tempfile_pk and not containerfile_name:
raise RuntimeError("Neither a name nor temporary file for the Containerfile was specified.")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if containerfile_tempfile_pk:
containerfile_artifact = PulpTemporaryFile.objects.get(pk=containerfile_tempfile_pk)
if not containerfile_tempfile_pk and not containerfile_name:
raise RuntimeError("Neither a name nor temporary file for the Containerfile was specified.")
if not containerfile_tempfile_pk and not containerfile_name:
raise RuntimeError("Neither a name nor temporary file for the Containerfile was specified.")
if containerfile_tempfile_pk:
containerfile_artifact = PulpTemporaryFile.objects.get(pk=containerfile_tempfile_pk)

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch 2 times, most recently from 5b0d910 to 60d5a97 Compare October 3, 2024 10:06
@@ -0,0 +1,2 @@
Support for utilizing raw artifacts, as the build context and for the Containerfile, during image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have still duplicated changelog entries. Please, resvole this.

@git-hyagi git-hyagi force-pushed the build-context-from-repoversion branch from 60d5a97 to 65ed701 Compare October 4, 2024 10:52
@lubosmj lubosmj merged commit d149619 into pulp:main Oct 4, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

I can specify a repository_version as the build context for container images
4 participants