Files changed in a PR are too many to be fully detected by dorny/paths-filter #227

Open
hwhsu1231 opened this issue Mar 7, 2024 · 2 comments

hwhsu1231 commented Mar 7, 2024

Problem Description

Recently, I found that when a PR changes too many files, dorny/paths-filter@v3 (currently de90cc6) seems to miss some of them during filtering, which leads to incorrect filter results.

What happened?

Over the past few months, I have been trying to create an automation project that localizes CMake documentation.

Workflow to create a PR

First, I wrote a workflow file named ci-sphinx-update.yml, which essentially executes the following steps in order:

  1. Generate/update .pot files by running the sphinx-build command with the gettext builder
  2. Generate/update .po files from the .pot files by running the msgcat or msgmerge command
  3. Create a PR from a feature branch to the master branch with peter-evans/create-pull-request@v6

Thus, if this workflow runs from scratch to generate the .pot/.po files, the resulting PR should include both .pot and .po files. The output log of peter-evans/create-pull-request@v6 confirms this. Below is an excerpt from that log, which shows that the PR changed 4182 files in total.

Log of 'peter-evans/create-pull-request@v6':
  [343874be-8420-4a7d-aea3-c36361be72f7 3a5b420f6] pot(3.1): Update pot from Sphinx
   Author: docs-l10n[bot] <157310748+docs-l10n[bot]@users.noreply.github.com>
   4182 files changed, 245931 insertions(+)
   create mode 100644 l10n/3.1/crowdin.yml
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_command.po
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_target.po
   create mode 100644 l10n/3.1/po/ja/LC_MESSAGES/command/add_definitions.po
   ...
   ...
   ...
   create mode 100644 l10n/3.1/pot/variable/PROJECT_VERSION_TWEAK.pot
   create mode 100644 l10n/3.1/pot/variable/UNIX.pot
   create mode 100644 l10n/3.1/pot/variable/WIN32.pot
   create mode 100644 l10n/3.1/pot/variable/WINCE.pot
   create mode 100644 l10n/3.1/pot/variable/WINDOWS_PHONE.pot
   create mode 100644 l10n/3.1/pot/variable/WINDOWS_STORE.pot
   create mode 100644 l10n/3.1/pot/variable/XCODE_VERSION.pot
   create mode 100644 l10n/3.1/version.json

Workflow to check status

Next, I also wrote a workflow file named ci-check-status.yml, which uses dorny/paths-filter@v3 to filter .pot files as follows:

- name: Check for *.pot files changed
  id: filter
  if: ${{ steps.evprt.outputs.VERSION != '' }}
  uses: dorny/paths-filter@v3
  with:
    filters: |
      pot:
        - 'l10n/${{ steps.evprt.outputs.VERSION }}/pot/**'

However, when ci-check-status.yml was triggered and tried to filter the .pot files changed in the PR, it missed all of them and returned false. Below is an excerpt from the log, which shows that dorny/paths-filter@v3 detected only 3000 changed files.

Log of 'dorny/paths-filter@v3':
Run dorny/paths-filter@v3
  with:
    filters: pot:
    - 'l10n/3.1/pot/**'
  
    token: ***
    list-files: none
    initial-fetch-depth: 100
Fetching list of changed files for PR#446 from Github API
  Invoking listFiles(pull_number: 446, per_page: 100)


  Received 100 items
  [added] l10n/3.1/crowdin.yml
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_compile_options.po
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_command.po
  [added] l10n/3.1/po/ja/LC_MESSAGES/command/add_custom_target.po
  ...
  ...
  ...
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_LINKER_FLAGS.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_LINKER_FLAGS_CONFIG.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_PREFIX.po
  [added] l10n/3.1/po/zh_TW/LC_MESSAGES/variable/CMAKE_SHARED_MODULE_SUFFIX.po
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
  Received 0 items
Detected 3000 changed files
Results:
Filter pot = false
  Matching files: none
Changes output set to []

Conclusion

From the output logs of the two workflows, I infer that because the PR changes too many files (4182 in total), dorny/paths-filter@v3 cannot load all of them (it detected at most 3000).
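
One possible explanation, offered as a sketch rather than a confirmed diagnosis: GitHub's REST documentation for the "List pull requests files" endpoint states that responses include a maximum of 3000 files, which matches the 30 non-empty pages of 100 items in the log above. If the listing is also returned in path order (an assumption based only on the order seen in the log), every path under `l10n/3.1/po/` comes before any path under `l10n/3.1/pot/`, so the cap could cut off all .pot files. The Python below simulates that situation without calling the API; the file names and the 3200/980 split between .po and .pot files are made up for illustration, and the prefix check merely stands in for the `l10n/3.1/pot/**` filter.

```python
# Sketch only: simulates a capped, paginated file listing. It does not call
# the GitHub API, and the po/pot split below is made up for illustration.

API_FILE_CAP = 3000  # documented maximum for "List pull requests files"
PER_PAGE = 100       # page size shown in the paths-filter log


def visible_files(changed, cap=API_FILE_CAP, per_page=PER_PAGE):
    """Return the files a paginated listing yields before the cap is reached."""
    seen = []
    page = 0
    while len(seen) < cap:
        batch = changed[page * per_page:(page + 1) * per_page]
        if not batch:  # ran out of files before hitting the cap
            break
        seen.extend(batch[:cap - len(seen)])
        page += 1
    return seen


# Hypothetical PR contents: 1 + 3200 + 980 + 1 = 4182 files in total.
changed = sorted(
    ["l10n/3.1/crowdin.yml", "l10n/3.1/version.json"]
    + [f"l10n/3.1/po/ja/LC_MESSAGES/file_{i:04d}.po" for i in range(3200)]
    + [f"l10n/3.1/pot/file_{i:04d}.pot" for i in range(980)]
)

POT_PREFIX = "l10n/3.1/pot/"  # stands in for the 'l10n/3.1/pot/**' filter
visible = visible_files(changed)
pot_visible = [f for f in visible if f.startswith(POT_PREFIX)]
pot_total = [f for f in changed if f.startswith(POT_PREFIX)]

print(len(changed), len(visible), len(pot_visible), len(pot_total))
```

With these made-up inputs the script prints `4182 3000 0 980`: all 980 .pot files exist in the PR, yet none of them falls inside the first 3000 entries, which would produce `Filter pot = false` exactly as in the log.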

My questions are as follows:

  1. Is my inference correct?
  2. If so, how can I solve this problem?
  3. If it cannot be solved, is it considered a bug?
  4. If it is indeed a bug, I hope it can be fixed as soon as possible.
hwhsu1231 (author) commented

Might be related to: https://github.com/orgs/community/discussions/57830


kelchm commented Apr 7, 2024

I ran into this as well -- I've not had time to dig into it very much yet, but I did observe that falling back to the git-based change detection sidesteps the issue.

To do that, simply set the token param to an empty string:

      - uses: dorny/paths-filter@v3
        id: filter
        with:
          token: ''
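
To cross-check what a git-based comparison reports, independent of the API listing, something like the following Python sketch could be run in a checkout of the PR branch. It is not the action's own implementation; the helper names `parse_name_only`, `changed_files`, and `any_under` are hypothetical, and it assumes both refs are available locally (for example, the base branch has been fetched with enough history to find a merge base).

```python
import subprocess


def parse_name_only(output):
    """Split NUL-separated `git diff --name-only -z` output into paths."""
    return [path for path in output.split("\0") if path]


def changed_files(base, head="HEAD"):
    """List files changed on `head` since its merge base with `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "-z", f"{base}...{head}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)


def any_under(paths, prefix):
    """Return True if at least one path starts with the given directory prefix."""
    return any(path.startswith(prefix) for path in paths)
```

For example, `any_under(changed_files("origin/master"), "l10n/3.1/pot/")` would report whether any file under the .pot directory differs from the base branch; `origin/master` is assumed here to be the PR's base ref.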
