-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MonthlyCommits is unsafe #96
Comments
Almost done. @mauricioaniche I'm extending ChangeSet to include message and author/committer objects modeled on JGit's See the branch MoreFilters in my fork if you want a sneak preview. Still fixing a test failure. |
davisjam
added a commit
to davisjam/repodriller
that referenced
this issue
Oct 18, 2017
A Changeset now includes a message as well as author/committer objects modeled on JGit's PersonIdent. This is a breaking change for anyone who uses ChangeSet. The payoff is that on complex git-based repositories, users can now better handle the distributed nature of git. The time-based CommitRanges provided by RepoDriller now use the committer time when doing time comparisons. This is what the user generally intends: "the time at which this commit entered the repository". Addresses mauricioaniche#96.
davisjam
added a commit
to davisjam/repodriller
that referenced
this issue
Oct 19, 2017
A Changeset now includes a message as well as author/committer objects modeled on JGit's PersonIdent. This is a breaking change for anyone who uses ChangeSet. The payoff is that on complex git-based repositories, users can now better handle the distributed nature of git. The time-based CommitRanges provided by RepoDriller now use the committer time when doing time comparisons. This is what the user generally intends: "the time at which this commit entered the repository". Addresses mauricioaniche#96.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Example: I cloned the popular libuv library and tried to extract monthly commits. I ended up with 16 commits even though the project has had regular commits since 2011 (expected commits = 6 years x 12 months/year = 72 commits).
Explanation: The date being used by RepoDriller uses the author commit time rather than the committer commit time. Suppose someone forks libuv and adds commits. Years later they submit a PR with the fix. Taking a real example, suppose a commit authored in 2015 was merged in 2017. When MonthlyCommits encounters this commit, it will accept it as "at least X months before" the last commit (which was in 2017). This 2015 commit will then cause all of the remaining commits from 2017 and 2016 to be skipped.
Suggested fix: Add both author and committer time fields to ChangeSet. Use the committer time in MonthlyCommits.
Initial exploration: Changing from
getAuthorIdent
togetCommitterIdent
inGitRepository.convertToDate
seems to do the right thing: 16 -> 77 commits found, and the timestamps are well-ordered.Thanks @ayaankazerouni for pointing out the difference between author and committer idents.
The text was updated successfully, but these errors were encountered: