Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yarn 1.x: Parse packages from yarn.lock #693

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

taylormadore
Copy link
Contributor

@taylormadore taylormadore commented Oct 21, 2024

This PR creates "Packages" for a Yarn 1.x project and all project dependencies. These "Packages" will later be resolved into SBOM Components in future PRs.

The yarn.lock file is the principal source for the fully resolved set of direct project dependencies and transitive dependencies (Yarn workspaces are an exception to this and must be parsed from package.json).

The pyarn library will be used to parse yarn.lock. There is a related PR in that repository that must be merged and released first. It adds handling for additional styles of path-based version specifiers.

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

Copy link
Collaborator

@a-ovchinnikov a-ovchinnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, a couple of small nitpicks.

cachi2/core/package_managers/yarn_classic/main.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/yarn_classic/resolver.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/yarn_classic/resolver.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/yarn_classic/resolver.py Outdated Show resolved Hide resolved
requirements-extras.txt Outdated Show resolved Hide resolved
@taylormadore taylormadore marked this pull request as ready for review October 23, 2024 13:45
cachi2/core/package_managers/yarn_classic/main.py Outdated Show resolved Hide resolved
pyproject.toml Outdated
@@ -28,6 +28,7 @@ dependencies = [
"gitpython",
"packageurl-python",
"packaging",
"pyarn@https://github.com/taylormadore/pyarn/archive/refs/heads/add-alternate-file-dep-style.tar.gz",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only raising a comment here so that we don't forget to update this before merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to the released version

pyproject.toml Show resolved Hide resolved
requirements-extras.txt Show resolved Hide resolved
@@ -186,6 +187,22 @@ def _get_packages_from_lockfile(source_dir: RootedPath) -> list[YarnClassicPacka
return [_classify_pyarn_package(source_dir, package) for package in pyarn_packages]


def _get_main_package(source_dir: RootedPath) -> WorkspacePackage:
"""Return a WorkspacePackage for the main package in package.json."""
# TODO: Use yarn classic project files once available
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm lacking context with this message, can you please elaborate for me?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled in the PackageJson class from the yarn module here. It should eventually be replaced by whatever the result of #705 is once merged

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, apparently github failed to include my response in this thread when submitting the review, so repeating:

I pulled in the PackageJson class from the yarn module here. It should eventually be replaced by whatever the result of #705 is once merged

Oh, but this means it's going to be super easy to forget to update this accordingly given there's already working logic. This creates a direct dependency on #705 then.

Comment on lines 118 to 184
def _classify_pyarn_package(source_dir: RootedPath, package: PYarnPackage) -> YarnClassicPackage:
"""Return a YarnClassicPackage corresponding to a PYarnPackage."""

def assert_package_has_relative_path(package: PYarnPackage) -> None:
if package.path and Path(package.path).is_absolute():
raise PackageRejected(
(
f"The package {package.name}@{package.version} has an absolute path "
f"({package.path}), which is not permitted."
),
solution="Ensure that file/link packages in yarn.lock do not have absolute paths.",
)

if _is_from_npm_registry(package.url):
return RegistryPackage(
name=package.name,
version=package.version,
integrity=package.checksum,
)
elif package.path is not None:
# Ensure path is not absolute
assert_package_has_relative_path(package)
# Ensure path is within the repository root
path = source_dir.join_within_root(package.path)
# File packages have a url, whereas link packages do not
if package.url:
return FilePackage(
name=package.name,
version=package.version,
relpath=path.subpath_from_root,
integrity=package.checksum,
)
return LinkPackage(
name=package.name,
version=package.version,
relpath=path.subpath_from_root,
)
elif _is_git_url(package.url):
return GitPackage(
name=package.name,
version=package.version,
url=package.url,
)
elif _is_tarball_url(package.url):
return UrlPackage(
name=package.name,
version=package.version,
url=package.url,
integrity=package.checksum,
)
else:
raise UnexpectedFormat(
(
"Cachi2 could not determine the package type for the following package in "
f"yarn.lock: {vars(package)}"
),
solution=(
"Ensure yarn.lock is well-formed and if so, report this error to the Cachi2 team"
),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: Do you think that strictly from OOP's POV we'd benefit from a YarnClassicPackageFactory factory class holding this logic? Functionally, there's no difference, so you can ignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

cachi2/core/package_managers/yarn_classic/resolver.py Outdated Show resolved Hide resolved
Copy link
Member

@eskultety eskultety left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, but we should IMO wait for #705 and merge the pyarn change in the meantime.

@taylormadore taylormadore force-pushed the parse-yarn-lock branch 3 times, most recently from 740a843 to 5a83727 Compare October 30, 2024 20:08
The workspace path globs in package.json are relative to the workspace
root. Workspace paths need to be resolved relative to the workspace root
(the request package path) rather than the request source dir to prevent
workspaces in package subdirectories from being missed by the glob

Signed-off-by: Taylor Madore <[email protected]>
Move responsibility for the prefetch and component resolution from
fetch_yarn_sources to be more consistent with how other package managers
are implemented

Signed-off-by: Taylor Madore <[email protected]>
pip-compile appears to want to add propcache to requirements.txt and
also update the ordering and comments of some of the other requirements

Signed-off-by: Taylor Madore <[email protected]>
These correspond to the different package types that can be resolved in
a Yarn 1.x yarn.lock file

Signed-off-by: Taylor Madore <[email protected]>
Resolves containerbuildsystem#629

Use the pyarn library to parse yarn.lock and get the list of project
dependencies. Next classify them into specific package types that can
later be used to generate SBOM components with PURLs.

Some of the logic used to classify the package types was inspired by or
taken directly from the official yarn sources.

Signed-off-by: Taylor Madore <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants