Adds tools for editing tags. #129

Closed
54 changes: 52 additions & 2 deletions dockerfile_parse/parser.py
@@ -401,7 +401,7 @@ def parent_images(self, parents):
            lines[instr['startline']:instr['endline']+1] = [instr['content']]

        self.lines = lines

Contributor

This change shouldn't be here

Contributor Author (@tim-vk, Nov 22, 2022)

> can you please describe the use case a little bit more?

The use case I have: I have a Dockerfile with images and tags (that are not "latest", since we want to pin our images to specific tags).
Every once in a while I want to (semi-automatically) update these tags within the Dockerfile, e.g.:

  • Load dockerfile
  • Get base image
  • Update base image to latest (or other specific) tag (0.1.0 -> 0.2.0).

While dockerfile-parse has the tools to load the Dockerfile and get the image, it was lacking tools to split this image up into its base components (registry/image:tag). Currently I'm using this in a separate script, but I doubt I'm the only person who would like to just update the tags of an image.
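
As a rough illustration of the workflow described above (load, find the base image, update the tag), here is a minimal standard-library sketch. It does not use dockerfile-parse at all; the helper name `set_final_tag` and the sample Dockerfile are made up for this example, and it assumes one FROM instruction per line with no `$ARG` substitution or line continuations.

```python
import re

# Hypothetical helper, not part of dockerfile-parse.
# Groups: (FROM keyword plus optional --platform flag), (image), (rest of line).
FROM_RE = re.compile(r"^(\s*FROM\s+(?:--platform=\S+\s+)?)(\S+)(.*)$", re.IGNORECASE)


def set_final_tag(dockerfile_text, new_tag):
    """Return the Dockerfile text with the tag of the last FROM line replaced."""
    lines = dockerfile_text.splitlines()
    # Walk backwards so the first match is the final build stage.
    for index in range(len(lines) - 1, -1, -1):
        match = FROM_RE.match(lines[index])
        if not match:
            continue
        prefix, image, rest = match.groups()
        if "@" in image:
            # Digest-pinned images have no tag to update; refuse rather than guess.
            raise ValueError("final stage is pinned by digest: " + image)
        # A ":" only separates a tag when it comes after the last "/";
        # otherwise it belongs to a registry port such as "localhost:5000".
        slash = image.rfind("/")
        colon = image.rfind(":")
        bare = image[:colon] if colon > slash else image
        lines[index] = f"{prefix}{bare}:{new_tag}{rest}"
        break
    return "\n".join(lines)


before = (
    "FROM golang:1.19 AS builder\n"
    "RUN make\n"
    "FROM registry.example.com:5000/app/base:0.1.0\n"
    "CMD run\n"
)
after = set_final_tag(before, "0.2.0")
print(after)
```

Only the final stage is touched here; earlier stages such as `builder` keep their tags.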

> From my point of view, the tag is tied to the image and not directly to the Dockerfile; for me the tag is an attribute of the image and not a direct dependency of the Dockerfile.

My personal use case is heavily "tag" focused. The way I see it, the tag is a separate input for the FROM keyword (as per the Docker documentation).

> IMO a better solution would be to pass the image string into an "ImageNameParser" and return the tag from there, to keep responsibilities separated.

I think the first thing we'd need to agree upon is: "Is the tag part of the image name"?

If no: Then a separate "ImageNameParser" is the better structure. (Would this be a separate repo?)
If yes: Then I think this PR would be in the right direction.

Edit:

I was looking back at my other PR and remembered something:

Ideally I would change the entire "image_from" function to a "parse_FROM" function which returns all values that you could give the FROM keyword (platform, image, digest, tag and name). But that would have been massively breaking (since this entire project assumes "tag" is part of your image, and not a separate entity). So I opted for this solution.

Contributor

Yes, dockerfile-parse works with the image name as a whole, including the digest or tag.

I'm not very eager to make it more granular, because then we might also split the image name into registry, namespace, image, and tag/digest. There are multiple parts, many of them optional.

I'd like to keep dockerfile-parse simple and rather use a third-party library to manipulate the image string and just pass the result to dockerfile-parse.
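
To illustrate why the optional parts make a finer split fiddly, here is a deliberately naive sketch (the helper name `naive_split` is made up for illustration). Splitting on the last `:` works for a plain `image:tag`, but it mistakes a registry port or the tail of a digest for a tag.

```python
def naive_split(image):
    """Split on the last ':' with no further checks (deliberately naive)."""
    bare, _, tag = image.rpartition(":")
    return bare, tag


for image in ("fedora:36", "localhost:5000/foo/bar", "foo/bar@sha256:abc123"):
    print(image, "->", naive_split(image))
```

The port case can be filtered out because `5000/foo/bar` contains a `/`, which a tag may not. The digest case is harder to catch that way, since `abc123` on its own looks like a perfectly ordinary tag.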

Contributor

We are using it like that: dockerfile-parse just provides the raw string, and we manipulate it with separate code:
https://github.com/containerbuildsystem/osbs-client/blob/master/osbs/utils/__init__.py#L372

Contributor

This is the naming convention we use:

    Naming Conventions
    ==================
    registry.somewhere/namespace/image_name:tag
    |-----------------|                          registry, reg_uri
                      |---------|                namespace
    |--------------------------------------|     repository
                      |--------------------|     image name
                                            |--| tag
                      |------------------------| image
    |------------------------------------------| image
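
A minimal sketch of how an image string could be split along this convention. This is not the osbs-client implementation; `ImageParts` and `split_image` are hypothetical names. It assumes the common heuristic that the first path component is a registry only if it contains a `.` or `:` or equals `localhost`, and it ignores digests (`@sha256:...`).

```python
from typing import NamedTuple, Optional


class ImageParts(NamedTuple):
    """Hypothetical container for the parts named in the convention above."""
    registry: Optional[str]
    namespace: Optional[str]
    repo: str
    tag: Optional[str]


def split_image(image):
    # Everything after the last "/" is "image_name[:tag]".
    head, _, last = image.rpartition("/")
    repo, _, tag = last.partition(":")
    components = head.split("/") if head else []
    registry = None
    # Assumed heuristic: a leading component with "." or ":" (or "localhost")
    # is a registry; anything else is part of the namespace.
    if components and (
        "." in components[0] or ":" in components[0] or components[0] == "localhost"
    ):
        registry = components.pop(0)
    namespace = "/".join(components) or None
    return ImageParts(registry, namespace, repo, tag or None)


parts = split_image("registry.somewhere/namespace/image_name:tag")
print(parts)
```

Because the result is a named tuple, `parts._replace(tag="0.2.0")` yields a copy with only the tag changed, which is roughly the "manipulate separately, then write the string back" approach described in the thread.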

Contributor Author

Is there a specific third-party "image parser" that you know of that works with this convention?

I find it odd that osbs-client (your link) contains the logic to split the image up into registry, tag, name, etc., which seems to be at odds with the "keep responsibility separated" mentality.

What I apparently keep running up against is that this repo doesn't seem to be a generic "dockerfile parser" but an "osbs dockerfile parser", which decreases its usability somewhat.

Contributor

I don't follow what you mean.

What I want to avoid is having a separate method for everything: get_tag, set_tag, get_registry, set_registry, etc.

We are modifying the registry part, and we didn't put that code into dockerfile-parse, because it's better to have a single class for manipulating image names and then just write the string representation back into dockerfile-parse.

This also works for all stages in the Dockerfile, not just the final stage.

What could be good is to modify dockerfile-parse to return the image name as an object with methods to modify the image, or to make this more object-oriented and return an object per stage.

I'd like to keep some OOP decoupling here between images and Dockerfile statements.

Contributor Author (@tim-vk, Nov 26, 2022)

I think I'm starting to get your intention.

But if you truly want to decouple Dockerfile statements, I would expect something like get_FROM_arguments(), which would return all the possible arguments that are allowed for the FROM statement (see the Dockerfile reference), or a class (FromArguments) where you can set the different arguments.
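
A rough sketch of what such a FromArguments class might look like. Nothing like this exists in dockerfile-parse; the class, its fields, and the regex are assumptions for illustration, based on the documented `FROM [--platform=<platform>] <image> [AS <name>]` form. The image string, including any tag or digest, is kept whole here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed pattern for a single-line FROM instruction.
FROM_RE = re.compile(
    r"^\s*FROM\s+"
    r"(?:--platform=(?P<platform>\S+)\s+)?"
    r"(?P<image>\S+)"
    r"(?:\s+AS\s+(?P<name>\S+))?\s*$",
    re.IGNORECASE,
)


@dataclass
class FromArguments:
    """Hypothetical holder for the arguments of one FROM instruction."""
    image: str
    platform: Optional[str] = None
    name: Optional[str] = None

    @classmethod
    def parse(cls, line):
        match = FROM_RE.match(line)
        if match is None:
            raise ValueError(f"not a FROM instruction: {line!r}")
        return cls(match["image"], match["platform"], match["name"])

    def __str__(self):
        # Render the instruction back, skipping arguments that are unset.
        parts = ["FROM"]
        if self.platform:
            parts.append(f"--platform={self.platform}")
        parts.append(self.image)
        if self.name:
            parts += ["AS", self.name]
        return " ".join(parts)


args = FromArguments.parse("FROM --platform=linux/amd64 fedora:36 AS builder")
args.image = "fedora:37"
print(args)
```

Each argument can then be changed on its own, and the instruction is re-rendered from the object instead of being patched as a string.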

Regardless of the approach, what I don't really get is: why is the ImageName class included in osbs-client and not in dockerfile-parse (or why isn't it a separate repo)?

The thing that's bugging me the most in the current set-up is that being able to change just the tag (or just the registry, or another separate component of "image") seems to be a fairly common and normal use case. To achieve that, you either need to add your own functions (which I am sharing here) or install osbs-client (which doesn't seem to be available via pip?). This brings me back to my earlier question: is dockerfile-parse meant as just a supplementary tool for osbs-client, or as a tool for anyone who wants to parse Dockerfiles (and nothing else)?

Edit:
Just spitballing:

Would it be more fitting for this project if the ImageName class from osbs-client was moved to util.py from this project?

Or these tag functions to util.py?

Contributor (@MartinBasti, Nov 28, 2022)

> Why is the ImageName class included in osbs-client and not in dockerfile-parse (or why isn't it a separate repo)?

Historical reasons, I don't know. Before, it was part of atomic-reactor.

> Would it be more fitting for this project if the ImageName class ...

This would work for me, sounds like a great idea. Let's wait and see what the other maintainers think about it.

Member

moving ImageName here and returning base/parent images with it sounds good to me

    @property
    def is_multistage(self):
        return len(self.parent_images) > 1
@@ -428,6 +428,21 @@ def baseimage(self, new_image):
            raise RuntimeError('No stage defined to set base image on')
        images[-1] = new_image
        self.parent_images = images

    @property
    def basetag(self):
        """
        :return: tag of the base image, i.e. of the final stage FROM instruction
        """
        _, tag = tag_from(self.baseimage)
        return tag

    @basetag.setter
    def basetag(self, new_tag):
        """
        only change the tag of the final stage FROM instruction
        """
        self.baseimage = tag_to(self.baseimage, new_tag)

    @property
    def cmd(self):
@@ -882,7 +897,42 @@ def image_from(from_value):
    match = re.match(regex, from_value)
    return match.group('image', 'name') if match else (None, None)


def tag_from(from_value):
    """
    :param from_value: string like "registry:port/image:tag AS name"
    :return: tuple of the image without its tag and the tag,
             e.g. ("registry:port/image", "tag")
    """
    image, _ = image_from(from_value)
    bare, _, tag = image.rpartition(":") if image and ":" in image else (None, None, None)

    # check if a tag was actually present; a registry port such as
    # "registry:5000/image" is rejected by valid_tag() because of the "/".
    # NOTE: a digest reference ("image@sha256:...") is not handled specially.
    if not valid_tag(tag) or not bare:
        return (image, None)

    return (bare, tag)


def valid_tag(tag):
    """
    :param tag: tag to be checked for validity
    :return: True or False
    """
    regex = re.compile(r"""(?x)  # verbose (readable) regex
        ^(?P<tag>[a-zA-Z0-9\_][a-zA-Z0-9\.\_\-]*)$  # letters, digits, . _ and - (. and - not leading)
        """)
    match = re.match(regex, tag) if tag else None
    # a tag may contain at most 128 characters
    return bool(match) and len(match.group('tag')) <= 128


def tag_to(image, new_tag):
    """
    :param image: string like "image:tag" or "image"
    :param new_tag: string like "latest"
    :return: string like "image:new_tag", or "image" if no tag was given
    """
    bare, _ = tag_from(image)
    return ":".join(filter(None, [bare.strip() if bare else None, new_tag.strip() if new_tag else None]))


def _endline(line):
    """
    Make sure the line ends with a single newline.
139 changes: 138 additions & 1 deletion tests/test_parser.py
@@ -20,6 +20,9 @@

from dockerfile_parse import DockerfileParser
from dockerfile_parse.parser import image_from
from dockerfile_parse.parser import tag_from
from dockerfile_parse.parser import tag_to
from dockerfile_parse.parser import valid_tag
from dockerfile_parse.constants import COMMENT_INSTRUCTION
from dockerfile_parse.util import b2u, u2b, Context
from tests.fixtures import dfparser, instruction
@@ -312,11 +315,22 @@ def test_get_baseimg_from_df(self, dfparser):
                          "LABEL a b\n"]
        assert dfparser.baseimage == 'fedora:latest'

    def test_get_basetag_from_df(self, dfparser):
        dfparser.lines = ["From fedora:latest\n",
                          "LABEL a b\n"]
        assert dfparser.basetag == 'latest'

    def test_get_baseimg_from_arg(self, dfparser):
        dfparser.lines = ["ARG BASE=fedora:latest\n",
                          "FROM $BASE\n",
                          "LABEL a b\n"]
        assert dfparser.baseimage == 'fedora:latest'

    def test_get_basetag_from_arg(self, dfparser):
        dfparser.lines = ["ARG BASE=fedora:latest\n",
                          "FROM $BASE\n",
                          "LABEL a b\n"]
        assert dfparser.basetag == 'latest'

    def test_get_baseimg_from_build_arg(self, tmpdir):
        tmpdir_path = str(tmpdir.realpath())
@@ -328,6 +342,16 @@ def test_get_baseimg_from_build_arg(self, tmpdir):
        assert dfp.baseimage == 'fedora:latest'
        assert not dfp.args

    def test_get_basetag_from_build_arg(self, tmpdir):
        tmpdir_path = str(tmpdir.realpath())
        b_args = {"BASE": "fedora:latest"}
        dfp = DockerfileParser(tmpdir_path, env_replace=True, build_args=b_args)
        dfp.lines = ["ARG BASE=centos:latest\n",
                     "FROM $BASE\n",
                     "LABEL a b\n"]
        assert dfp.basetag == 'latest'
        assert not dfp.args

    def test_set_no_baseimage(self, dfparser):
        dfparser.lines = []
        with pytest.raises(RuntimeError):
@@ -468,6 +492,114 @@ def test_image_from(self, from_value, expect):
        result = image_from(from_value)
        assert result == expect

    @pytest.mark.parametrize(('from_value', 'expect'), [
        (
            "",
            (None, None),
        ), (
            " ",
            (None, None),
        ), (
            " foo",
            ('foo', None),
        ), (
            "foo:bar as baz ",
            ('foo', 'bar'),
        ), (
            "foo as baz",
            ('foo', None),
        ), (
            "foo and some other junk",  # we won't judge
            ('foo', None),
        ), (
            "registry.example.com:5000/foo/bar",
            ('registry.example.com:5000/foo/bar', None),
        ), (
            "registry.example.com:5000/foo/bar:baz",
            ('registry.example.com:5000/foo/bar', "baz"),
        ), (
            "localhost:5000/foo/bar:baz",
            ('localhost:5000/foo/bar', "baz"),
        )
    ])
    def test_tag_from(self, from_value, expect):
        result = tag_from(from_value)
        assert result == expect

    @pytest.mark.parametrize(('from_image', 'from_tag', 'expect'), [
        (
            " ",
            " ",
            "",
        ), (
            "foo",
            None,
            'foo',
        ), (
            "foo",
            "bar",
            'foo:bar',
        ), (
            "foo",
            "",
            'foo',
        ), (
            "foo:bar",
            "baz",
            'foo:baz',
        ), (
            "registry.example.com:5000/foo/bar",
            "baz",
            'registry.example.com:5000/foo/bar:baz',
        ), (
            "localhost:5000/foo/bar",
            "baz",
            'localhost:5000/foo/bar:baz',
        ), (
            "nonvalid1@%registry.example.com:5000/foo/bar",
            "baz",
            'nonvalid1@%registry.example.com:5000/foo/bar:baz',
        ), (
            "registry.example.com:5000/foo/bar",
            "baz",
            'registry.example.com:5000/foo/bar:baz',
        ), (
            "registry.example.com:5000/foo/bar:baz",
            "bap",
            'registry.example.com:5000/foo/bar:bap',
        )
    ])
    def test_tag_to(self, from_image, from_tag, expect):
        result = tag_to(from_image, from_tag)
        assert result == expect


    @pytest.mark.parametrize(('tag', 'expect'), [
        (
            "Tag",
            True
        ), (
            "tAg.",
            True
        ), (
            "tag-tag",
            True
        ), (
            ".notTag",
            False
        ), (
            "not/tag",
            False
        )
    ])
    def test_valid_tag(self, tag, expect):
        result = valid_tag(tag)
        assert result == expect

    def test_parent_images(self, dfparser):
        FROM = ('my-builder:latest', 'rhel7:7.5')
        template = dedent("""\
@@ -507,8 +639,9 @@ def test_parent_images_missing_from(self, dfparser):
        assert dfparser.content.count('FROM') == 4

    def test_modify_instruction(self, dfparser):
-        FROM = ('ubuntu', 'fedora:')
+        FROM = ('ubuntu', 'fedora:theBest')
        CMD = ('old❤cmd', 'new❤command')
        TAG = ('theBest', 'newtag')
        df_content = dedent("""\
            FROM {0}
            CMD {1}""").format(FROM[0], CMD[0])
@@ -518,6 +651,10 @@ def test_modify_instruction(self, dfparser):
        assert dfparser.baseimage == FROM[0]
        dfparser.baseimage = FROM[1]
        assert dfparser.baseimage == FROM[1]

        assert dfparser.basetag == TAG[0]
        dfparser.basetag = TAG[1]
        assert dfparser.basetag == TAG[1]

        assert dfparser.cmd == CMD[0]
        dfparser.cmd = CMD[1]