-
-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mtree: implement manifest comparisons #48
Conversation
@cyphar how much of this can be incrementally done? If I am merging other fixes and features, I'd hate for so much rebasing for you 😬 |
Right now, this is good enough for my usecase (you could in principle do all of the comparisons you ever need manually). But there's still a bunch of code refactoring that needs to take place (especially in the tar handling code). |
ok. so, much tar refactoring and fixes have happened in #55 |
You can merge #55 first, because I'm also doing a clean-up of the I'm also going to change the comparison interfaces. You don't need interfaces here. |
The purpose of |
Okay, I'll see how much my |
@cyphar please2rebase? |
@cyphar would love to see this merged 😸 |
I know, and I've been drowing in other issues since I started this PR. 😉 I'll take a look at this next week. |
The comparison code fails some tar to regular dir tests because the tar code appears to be broken. I'll open an issue about it separately. Also Go |
@vbatts I've reimplemented everything using |
(i've opened #78 which will address the CI failure) |
go vet for go1.7.3 complains |
I'm reimplementing it using |
as i'm checking, a simple use of [vbatts@bananaboat] {(46d449e...)} ~/src/vb/go-mtree$ ./gomtree -c
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x402b7b]
goroutine 1 [running]:
panic(0x511c00, 0xc420016110)
/home/vbatts/go1.7/src/runtime/panic.go:500 +0x1a1
main.main()
/home/vbatts/src/vb/go-mtree/cmd/gomtree/main.go:315 +0xfab |
Fixed. I forgot about that usage of the CLI. |
couple more issues: [vbatts@bananaboat] {(5024980...)} ~/src/vb/go-mtree$ ./gomtree -f ~/mtree.mtree -p .
2016/10/31 13:12:59 Unknown keyword "flags" for file "."
[vbatts@bananaboat] {(5024980...)} ~/src/vb/go-mtree$ ./gomtree -f ~/mtree.mtree -p $(pwd)
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x70 pc=0x4949b5]
goroutine 1 [running]:
panic(0x511be0, 0xc420018110)
/home/vbatts/go1.7/src/runtime/panic.go:500 +0x1a1
github.com/vbatts/go-mtree.Walk.func1.3(0x7bdf00, 0xc420079450, 0x7ffc68bcfad2, 0x1c, 0xc4200dd240, 0xc4200d4ce0, 0xc4200cec80, 0x0, 0x0)
/home/vbatts/src/github.com/vbatts/go-mtree/walk.go:184 +0x155
github.com/vbatts/go-mtree.Walk.func1(0x7ffc68bcfad2, 0x1c, 0x7bdf00, 0xc420079450, 0x0, 0x0, 0xc420000340, 0x4d4109)
/home/vbatts/src/github.com/vbatts/go-mtree/walk.go:188 +0x6d4
github.com/vbatts/go-mtree.walk(0xc4200d4ce0, 0x7ffc68bcfad2, 0x1c, 0x7bdf00, 0xc420079450, 0xc4200d0140, 0x1, 0xc4200d0140)
/home/vbatts/src/github.com/vbatts/go-mtree/walk.go:237 +0x97
github.com/vbatts/go-mtree.startWalk(0xc4200d4ce0, 0x7ffc68bcfad2, 0x1c, 0xc4200d0140, 0x3, 0xc42009a400)
/home/vbatts/src/github.com/vbatts/go-mtree/walk.go:232 +0xdf
github.com/vbatts/go-mtree.Walk(0x7ffc68bcfad2, 0x1c, 0x0, 0x0, 0x0, 0xc42000a600, 0x9, 0x10, 0x215544b200791e8, 0xc8, ...)
/home/vbatts/src/github.com/vbatts/go-mtree/walk.go:208 +0x281
main.main()
/home/vbatts/src/vb/go-mtree/cmd/gomtree/main.go:294 +0x1508 |
oh. nm. hold that thought. ... |
(it was something else) |
@vbatts What remaining issues did you have with this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall i'm ready to merge it. The panics ought to get gone and a few other questions. I'm still yet to fully compare the behavoir of the cli of this PR vs. master
// ensure that we don't accidentally expose pointers to the caller that are | ||
// internal data. | ||
func ePtr(e Entry) *Entry { return &e } | ||
func sPtr(s string) *string { return &s } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would not do this. Since these functions are not pointer receivers to begin with, then you may already have a copy of the object, not the original.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. This is just a function, not a function of Entry. Still, perhaps it is not necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "not returning the actual pointer" is intentional. The purpose of these is so that I can return a pointer (namely so I can say whether or not an entry is nil
) that is not actually able to access the entries we're iterating over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that, though it is misleading as if it is the pointer to the item passed.
Also, not sure that it matters, but the ptr to an empty string is string is not nil. So, I'm not sure whether there is ever a case that a ptr to a string is returned, even though the string is empty (I realize that the predicate does not check the string contents)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that the callers of ePtr
can return nil
or a pointer to an entry. However, they don't want to leak the pointer to the actual entry and rather want to return a pointer to a copy. Because structs are passed by value in arguments, it works out.
As a point of comparison, how would you implement returning a "safe" pointer to an entry in the case of this line?
though it is misleading as if it is the pointer to the item passed.
I've seen this pattern in a few places. Problem is that I don't think there's another nice way of doing this without three lines of temporary variable stuff. To be fair, I was confused about it for a while because of how much I write C code ;).
// determined by the ordering of parameters to Compare). | ||
func (i InodeDelta) Old() *Entry { | ||
if i.diff == Modified || i.diff == Missing { | ||
return ePtr(i.old) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shorthand seems like it is more cumbersome than just &i.old
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my above point. We don't want to return the actual pointer because then the caller could modify it (messing around with the mtree.DirectoryHierarchy
).
// present in one manifest [Missing, Extra], keys only present in one of the | ||
// manifests [Modified] or a difference between the keys of the same object in | ||
// both manifests [Modified]. | ||
type InodeDelta struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only a slight nit. As the inode themselves are not being compared, only the attributes of the inodes, is it confusing to have this name be InodeDelta
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. It's ultimately up to you, but IMO we are comparing inodes here (namely their attributes, not just the contents). Which is kinda what the point of an inode is (to provide attributes of data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair.
key := kv.Keyword() | ||
|
||
// We cannot just take &kv here because it's the iterator. I | ||
// learned this the hard way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh. Thanks for such a comment. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can remove it, it just took me a while to figure that out. :P
// cached version of inode.keys. | ||
func compareEntry(oldEntry, newEntry Entry) []KeyDelta { | ||
// Represents the new and old states for an entry's keys. | ||
type stateT struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
particular reason for defining this type inside the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only used inside the function. I get that it's not the prettiest thing in the world but because it's effectively just a way of holding a tuple I don't really see the benefit of making it a type external to the function. Your call though.
|
||
diffs["tar_time"].New = newTime | ||
} else { | ||
panic("time and tar_time set in the same manifest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ouch, panic()
? 😬
Is there no error condition? perhaps this compareEntry()
could have the capacity to also return error
:-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I'll make these return errors. 😛 Real programmers don't need error handling.
for name, diff := range diffs { | ||
// Invalid | ||
if diff.Old == nil && diff.New == nil { | ||
panic("invalid state: both old and new are nil") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for path, diff := range diffs { | ||
// Invalid | ||
if diff.Old == nil && diff.New == nil { | ||
panic("invalid state: both old and new are nil") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
} | ||
return &result, nil | ||
// TODO: Handle tar_time, if necessary. | ||
return Compare(dh, newDh, keywords) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is much nicer of an API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cheers. 😸
@vbatts I fixed up the panics and comments. I think that your other comments about the |
Understood On Fri, Nov 4, 2016, 10:25 Aleksa Sarai [email protected] wrote:
|
@cyphar i was about to merge it, and just clean up the |
@vbatts Ah okay, fair enough. Once you get a test case tell me and I'll fix it. Even if you can just give me an example of the regression I can work on fixing it this week. |
@cyphar so with #82, I rebased your branch on this, and and see a difference like:
|
I merged the tests, so you can now rebase on master. |
Fix a bug in the parser that caused all iterators to have to handle the /set and /unset semantics separately. In addition, provide a helper function to correctly generate the merged set of keywords for a particular entry. Signed-off-by: Aleksa Sarai <[email protected]>
This is part of a patchset that refactors all of the checking logic into comparison operations. Essentially, provide a Compare(...) function that allows for two different manifests to be compared. Extra and missing entries are supported in addition to the standard modified entry, and by implementing as a manifest comparison there is no double-scanning of the manifest source. The main annoyance is that we have to also include tar_time handling, which has not been abstracted inside keywords.go. This is a bit ugly right now, but works fine for the moment. Signed-off-by: Aleksa Sarai <[email protected]>
Switch the commandline to use the .Compare() API when checking specification files against the state of a tar archive or other archive. The main purpose is to completely remove the check.go code from being necessary (outside of a wrapper). Signed-off-by: Aleksa Sarai <[email protected]>
While the full testing is broken due to bugs in the tar DH generator, we ignore known bugs in the tar generator to at least allow us to test some of the other semantics of Compare. Signed-off-by: Aleksa Sarai <[email protected]>
This removes all of the special handling code for both TarCheck() and Check() so that everything now uses the new (generic) Compare() code. In addition, the tests had to be modified to reflect the new classes of errors. Signed-off-by: Aleksa Sarai <[email protected]>
@vbatts I've figured it out. It's not because of any of my changes. It's because the tar spec generation is utterly broken. Look at the output of a
The main thing to note is that directory entries don't contain the keyword that we were asked to check against. I've added a commit which ignores keywords that are missing from a manifest, but this isn't a nice solution to the problem. Of course, another issue is that you actually can't get |
Due to several unsolveable problems in tar generation, such as the size=... keyword, we have to special case quite a few things in the checking code. We might want to move this to mtree properly (but I'm hesitant about ignoring errors that only happen for tar DHes). Signed-off-by: Aleksa Sarai <[email protected]>
Yeah. It is one of the cases that we have to only check what we can or Having a way to give a keyword context of being supported or not would be On Wed, Nov 9, 2016, 20:08 Aleksa Sarai [email protected] wrote:
|
@vbatts Fixing the supported / unsupported issue should be solved separately IMO. It will require changes to how we generate spec files. We could do something like using |
Oh, certainly agree to bikeshed it separately, just acknowledging it is an On Wed, Nov 9, 2016, 21:13 Aleksa Sarai [email protected] wrote:
|
LGTM |
Thanks @cyphar ! |
I'll get to work on #83 soon. |
This patchset implements manifest comparison as a first-class primitive and then refactors all of the other checking and validation code into manifest comparison operations. This ensures consistency of comparisons, less code to maintain as well as providing some extra features to the client that are not available already. It also increases the usefulness of
go-mtree
as a standalone library..Set
parsing issue.Compare()
.tar_mtime
.Compare
:Check
TarCheck
Check
API to use[]Difference
or makeFailure
more generic as in check: implement "missing" and "present" failure types #34.Supersedes #34.
Signed-off-by: Aleksa Sarai [email protected]