Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access additional license data #23

Merged
merged 5 commits into from
Oct 6, 2022

Conversation

pavelmemory
Copy link

@pavelmemory pavelmemory commented Jul 18, 2022

The change in this PR addresses a problem described in the issue "Make license database public".
In order to get access to additional license data the two new exportable functions were added:

  • LicenseURLs - it returns a list of URLs where the license can be found in the internet
  • LicenseName - it returns the full name of the license
    The values are get from the JSON representation of the license stored in the https://github.com/spdx/license-list-data/archive/v$(SPDX_DATA_VERSION).tar.gz as all other information about the licenses. But to get the fully qualified URLs the third column was added to the urls.csv file and it contains schema of the URL. That is done to have possibility re-build the full URL and do not change existing format of the second column that may be used by someone already.
    Both functions return a new ErrUnknownLicenseID error in case the license ID provided as a parameter doesn't exist in the database.

The change also includes upgrade of the SPDX_DATA_VERSION to the latest at the moment 3.17 version. After the update the tool begun to panic because of the index access to the empty slice. The fix has been applied to solve that issue.
After the upgrade the expected confidence in the tests were adjusted to the new results.

The one question that remains open here is why .mit-license.org literal is used in the licensedb/internal/assets/extract_urls.go? Should it be changed to the ://mit-license.org to have a proper URL format?

Pavlo Strokov added 2 commits July 15, 2022 17:03
SPDX version upgraded to the 3.17 version.
After the upgrade the licensedb.Detect() function begun to
panic with:
panic: runtime error: index out of range [0] with length 0.
The root cause was in the queryLicenseAbstract method of the
database. After adding a length check to amount of matches
found the issue disappeared.
The upgrade also affected a confidence of the matched licenses
in the TestCmdMain, so they were adjusted accordingly to pass.
In order to extend functionality the two new exportable functions
were added to get more info about the license.
LicenseURLs returns a list of URLs for the license and
LicenseName returns a full name of the license. Both accepts a
license identifier as an input parameter and do search over the
license database. If there is no info about provided license the
error ErrUnknownLicenseID is returned.
The extract_urls.go was extended to produce one more column in
urls.csv file that includes schema of the URL. It is needed to
construct the full URL that will be returned from the LicenseURLs.
@pavelmemory
Copy link
Author

pavelmemory commented Jul 18, 2022

@bzz could you please take a look at this PR?
Unfortunately the list of the maintainers doesn't exist by the reference that is mentioned in the Contributing Guidelines

@pavelmemory
Copy link
Author

Hi @lafriks! Thanks for taking an eye on my changes. Could you please explain why .mit-license.org is used instead of ://mit-license.org (the last paragraph in the PR description)?

Copy link
Collaborator

@vmarkovtsev vmarkovtsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

@pavelmemory
Copy link
Author

pavelmemory commented Jul 24, 2022

Seems like Lint job fails because golangci/golangci-lint-action needs to be updated or the configuration needs to be changed to use the newer version of the golangci-lint. I don't have permissions for doing this, so I rely on maintainers of the repository to do this.

Update: I have increased version of the golangci-lint tool to 1.47.0 version and fixed the issues.

@vmarkovtsev
Copy link
Collaborator

I don't think I have ever supported windows, it's ok to remove the CI checks for that OS.

Pavlo Strokov added 3 commits August 1, 2022 13:15
The Race action executes tests with -race flag and sometimes
can take more than 10m which is the default test timeout.
To mitigate that possibility the timeout is increased to 60m
by explicit usage of the -timeout flag.
The version of the golangci-lint tool upgraded to a newer 1.47.0 version.
As there were couple of lint issues (unsafeptr: possible misuse of reflect.SliceHeader (govet)), those were fixed to pass the static
code check.
Widnows was not ever explicitly supported by the author
and causes troubles on CI action runs. Those we drop it
as redundant.
@pavelmemory pavelmemory force-pushed the access-additional-license-data branch from 8fedebb to a7c9e28 Compare August 1, 2022 10:17
@pavelmemory
Copy link
Author

I don't think I have ever supported windows, it's ok to remove the CI checks for that OS.

@vmarkovtsev I have removed Windows OS from the actions, hope the CI will pass this time.

@To1ne
Copy link

To1ne commented Aug 11, 2022

Hey @vmarkovtsev, is there anything we can do to help get this PR over the finish line? @pavelmemory made the necessary changes to drop CI for Windows (this would actually close #22).

@bzz
Copy link
Member

bzz commented Oct 6, 2022

As CI is green now and there are new tests and review ✅ - I'll be merging this, if there is no further discussion.

As this change includes new API - it's a bit hard for me to say off hand how backward-compatible is it, and if it is - I assume v4.4.0 needs to be released, otherwise, it's v5.0.0.

WDYT?

@bzz bzz mentioned this pull request Oct 6, 2022
@lafriks
Copy link
Contributor

lafriks commented Oct 6, 2022

API seems to be backward compatible as only adds new methods so imho v4.4.0 is way to go

@bzz bzz closed this Oct 6, 2022
@bzz bzz reopened this Oct 6, 2022
@bzz
Copy link
Member

bzz commented Oct 6, 2022

Closed&re-opened to trigger the CI, going to merge as soon as it passes.
Then rebase #24 -> ✅ -> 🔪 v4.4.0

@bzz
Copy link
Member

bzz commented Oct 6, 2022

Linter complains on the licensedb/filer/filer.go that was not changed in this PR and thus is not relevant (would be nice to have a separate issue for handling it in the future).

@bzz
Copy link
Member

bzz commented Oct 6, 2022

...wonder what makes Race CI profile run 20min here, while it's only 6m in #24 🤔

@bzz bzz merged commit a3a1cc6 into go-enry:master Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants