-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regression causing deadlock between File.Close and File.WriteTo #603
Comments
🤔 Right… so to address this, we would need to trigger a signal to close any outstanding Actually, let’s look at the behavior of package main
import (
"io"
"fmt"
"os"
"sync"
)
func main() {
f, err := os.Open("/dev/zero")
if err != nil {
panic(err)
}
rd, wr := io.Pipe()
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
_, err := f.WriteTo(wr)
_ = wr.CloseWithError(err)
}()
lr := &io.LimitedReader{ R: rd, N: 42 }
b, err := io.ReadAll(lr)
if err != nil {
panic(err)
}
fmt.Println(string(b))
err = f.Close()
if err != nil {
panic(err)
}
wg.Wait()
} output:
It seems here, while the 🤔 P.S.: Adding a |
Min-repro:
This locks up:
Adding an It seems that things were only ever working because the |
… tests Before this change, upgrading to v1.13.7 caused a deadlock in the tests. This was caused by additional locking in the sftp package exposing a bad choice by the rclone code. See pkg/sftp#603 and thanks to @puellanivis for the fix suggestion.
Thank you so much for making a repro @puellanivis You are right, closing things in a different order makes the deadlock go away --- a/backend/sftp/sftp.go
+++ b/backend/sftp/sftp.go
@@ -2101,10 +2101,10 @@ func (file *objectReader) Read(p []byte) (n int, err error) {
// Close a reader of a remote sftp file
func (file *objectReader) Close() (err error) {
- // Close the sftpFile - this will likely cause the WriteTo to error
- err = file.sftpFile.Close()
// Close the pipeReader so writes to the pipeWriter fail
_ = file.pipeReader.Close()
+ // Close the sftpFile - this will likely cause the WriteTo to error
+ err = file.sftpFile.Close()
// Wait for the background process to finish
<-file.done
// Show connection no longer in use I think the new way is probably the best way and the fact that it worked before was due to an implementation detail of the sftp library. I'm happy with that fix and so we can close this if you are happy? |
Yeah, I’m still thinking I’d rather repeat the I’ll timebox this to a week. I would like to at least get |
Sounds like a plan! For my use, I'll put the above tweak in and update rclone to v1.13.7 for its next release. I haven't looked at the v2 branch yet - should I? |
Yeah, I think the tweak of closing the read pipe first is definitely the better solution in general. 👍 Yes, you can look at the |
#604 works! 😁 Just like the |
I noticed when I upgraded the dependencies for rclone that the sftp server tests have started deadlocking.
I used git bisect to narrow this down to this commit d1903fb
This made an enormous backtrace which I removed all traces without pkg/sftp in and deduplicated the traces which left us with three.
I think backtrace 1 is irrelevant. What appears to have happened is that a goroutine started (backtrace 3) which is holding the
File
lockThe code from rclone which causes this is this (backtrace 3)
And here (backtrace 2)
So the problem is that rclone expects the line
err = file.sftpFile.Close()
to close the sftpFile
and return an error, while now what happens is that because of the locking added in d1903fb theClose
waits indefinitely for theFile.WriteTo()
to finish which it won't since rclone has stopped putting data into it.So I think this is a regression - this has worked perfectly for many years!
I haven't made a small reproduction - I probably can now I understand what is going on - but maybe you've got an immediate idea on how to solve this problem?
Backtrace 1
goroutine 56 gp=0xc0004048c0 m=nil [sync.Cond.Wait]:
runtime.gopark(0xc0000a5ce0?, 0x1?, 0x20?, 0x24?, 0x23?)
go1.23/src/runtime/proc.go:424 +0xce fp=0xc000352430 sp=0xc000352410 pc=0x473b6e
runtime.goparkunlock(...)
go1.23/src/runtime/proc.go:430
sync.runtime_notifyListWait(0xc00011e0d0, 0x32c)
go1.23/src/runtime/sema.go:587 +0x159 fp=0xc000352480 sp=0xc000352430 pc=0x4753b9
sync.(*Cond).Wait(0x0?)
go1.23/src/sync/cond.go:71 +0x85 fp=0xc0003524c0 sp=0xc000352480 pc=0x484dc5
golang.org/x/crypto/ssh.(*buffer).Read(0xc00007c060, {0xc0001167f8, 0x4, 0x4})
/home/ncw/go/pkg/mod/golang.org/x/[email protected]/ssh/buffer.go:94 +0x1ed fp=0xc000352548 sp=0xc0003524c0 pc=0x66eb6d
golang.org/x/crypto/ssh.(*channel).ReadExtended(0xc0003e40c0, {0xc0001167f8?, 0x1c0004048c0?, 0xc0003525d8?}, 0x40811e?)
/home/ncw/go/pkg/mod/golang.org/x/[email protected]/ssh/channel.go:363 +0x93 fp=0xc0003525a8 sp=0xc000352548 pc=0x672553
golang.org/x/crypto/ssh.(*channel).Read(0x408245?, {0xc0001167f8?, 0x471f94?, 0xc000400e08?})
/home/ncw/go/pkg/mod/golang.org/x/[email protected]/ssh/channel.go:540 +0x25 fp=0xc0003525e0 sp=0xc0003525a8 pc=0x673265
github.com/pkg/sftp.(*conn).Read(0x0?, {0xc0001167f8?, 0xc000352648?, 0x475e49?})
<autogenerated>:1 +0x29 fp=0xc000352610 sp=0xc0003525e0 pc=0x6b0029
io.ReadAtLeast({0xb6e6c0, 0xc0000ea240}, {0xc0001167f8, 0x4, 0x4}, 0x4)
go1.23/src/io/io.go:335 +0x90 fp=0xc000352658 sp=0xc000352610 pc=0x4a93f0
io.ReadFull(...)
go1.23/src/io/io.go:354
github.com/pkg/sftp.recvPacket({0xb6e6c0, 0xc0000ea240}, 0x0, 0x2?)
+0x99 fp=0xc0003526d0 sp=0xc000352658 pc=0x6aae39
github.com/pkg/sftp.(*conn).recvPacket(...)
github.com/pkg/sftp.(*clientConn).recv(0xc0000ea240)
+0xc5 fp=0xc000352790 sp=0xc0003526d0 pc=0x6a89e5
github.com/pkg/sftp.NewClientPipe.func1()
+0x4e fp=0xc0003527e0 sp=0xc000352790 pc=0x69faee
runtime.goexit({})
go1.23/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003527e8 sp=0xc0003527e0 pc=0x47bf21
created by github.com/pkg/sftp.NewClientPipe in goroutine 46
+0x349
Backtrace 2
goroutine 1762 gp=0xc0003a96c0 m=nil [sync.Mutex.Lock]:
runtime.gopark(0x0?, 0xc000034700?, 0xc0?, 0x20?, 0xc00001fbb8?)
go1.23/src/runtime/proc.go:424 +0xce fp=0xc00001fad8 sp=0xc00001fab8 pc=0x473b6e
runtime.goparkunlock(...)
go1.23/src/runtime/proc.go:430
runtime.semacquire1(0xc00038e06c, 0x0, 0x3, 0x1, 0x15)
go1.23/src/runtime/sema.go:178 +0x225 fp=0xc00001fb40 sp=0xc00001fad8 pc=0x450725
sync.runtime_SemacquireMutex(0xc00001fbb0?, 0xe5?, 0xb?)
go1.23/src/runtime/sema.go:95 +0x25 fp=0xc00001fb78 sp=0xc00001fb40 pc=0x475105
sync.(*Mutex).lockSlow(0xc00038e068)
go1.23/src/sync/mutex.go:173 +0x15d fp=0xc00001fbc8 sp=0xc00001fb78 pc=0x4864fd
sync.(*Mutex).Lock(...)
go1.23/src/sync/mutex.go:92
sync.(*RWMutex).Lock(0x2?)
go1.23/src/sync/rwmutex.go:148 +0x31 fp=0xc00001fbf8 sp=0xc00001fbc8 pc=0x4877d1
github.com/pkg/sftp.(*File).Close(0xc00038e050)
+0x3c fp=0xc00001fc58 sp=0xc00001fbf8 pc=0x6a2d1c
github.com/rclone/rclone/backend/sftp.(*objectReader).Close(0xc0002ac680)
github.com/rclone/rclone/backend/sftp/sftp.go:2105 +0x1f fp=0xc00001fc90 sp=0xc00001fc58 pc=0x8cacbf
github.com/rclone/rclone/lib/readers.(*LimitedReadCloser).Close(0xc000338480)
github.com/rclone/rclone/lib/readers/limited.go:17 +0x24 fp=0xc00001fce0 sp=0xc00001fc90 pc=0x7c8d44
github.com/rclone/rclone/fstest/fstests.ReadObject({0xb73360, 0xf3e900}, 0xc000312ea0, {0xb77540, 0xc000486e80}, 0xffffffffffffffff, {0xc00061e350, 0x1, 0x1})
github.com/rclone/rclone/fstest/fstests/fstests.go:293 +0x2f5 fp=0xc00001fdb0 sp=0xc00001fce0 pc=0x925895
github.com/rclone/rclone/fstest/fstests.Run.func13.14.36(0xc000312ea0)
github.com/rclone/rclone/fstest/fstests/fstests.go:1802 +0x310 fp=0xc00001ff70 sp=0xc00001fdb0 pc=0x933db0
testing.tRunner(0xc000312ea0, 0xc000486e00)
go1.23/src/testing/testing.go:1690 +0xf4 fp=0xc00001ffc0 sp=0xc00001ff70 pc=0x526174
testing.(*T).Run.gowrap1()
go1.23/src/testing/testing.go:1743 +0x25 fp=0xc00001ffe0 sp=0xc00001ffc0 pc=0x527165
runtime.goexit({})
go1.23/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00001ffe8 sp=0xc00001ffe0 pc=0x47bf21
created by testing.(*T).Run in goroutine 1096
go1.23/src/testing/testing.go:1743 +0x390
Backtrace 3
goroutine 1676 gp=0xc000160e00 m=nil [select]:
runtime.gopark(0xc000416cf0?, 0x2?, 0x98?, 0x6b?, 0xc000416cb4?)
go1.23/src/runtime/proc.go:424 +0xce fp=0xc000416b50 sp=0xc000416b30 pc=0x473b6e
runtime.selectgo(0xc000416cf0, 0xc000416cb0, 0xc0001ef470?, 0x1, 0x40f905?, 0x1)
go1.23/src/runtime/select.go:335 +0x7a5 fp=0xc000416c78 sp=0xc000416b50 pc=0x44f685
io.(*pipe).write(0xc000112f00, {0xc0003f0000, 0x5f, 0x8000})
go1.23/src/io/pipe.go:86 +0x1e5 fp=0xc000416d50 sp=0xc000416c78 pc=0x4ab185
io.(*PipeWriter).Write(0xc00038e050?, {0xc0003f0000?, 0xc0003f0000?, 0x8000?})
go1.23/src/io/pipe.go:161 +0x1a fp=0xc000416d80 sp=0xc000416d50 pc=0x4ab59a
github.com/pkg/sftp.(*File).writeToSequential(0xc00038e050, {0xb6e140, 0xc000112f00})
+0xfb fp=0xc000416df8 sp=0xc000416d80 pc=0x6a43fb
github.com/pkg/sftp.(*File).WriteTo(0xc00038e050, {0xb6e140, 0xc000112f00})
+0x3a5 fp=0xc000416fa8 sp=0xc000416df8 pc=0x6a4865
github.com/rclone/rclone/backend/sftp.(*Fs).newObjectReader.func1()
github.com/rclone/rclone/backend/sftp/sftp.go:2085 +0x30 fp=0xc000416fe0 sp=0xc000416fa8 pc=0x8cac10
runtime.goexit({})
go1.23/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000416fe8 sp=0xc000416fe0 pc=0x47bf21
created by github.com/rclone/rclone/backend/sftp.(*Fs).newObjectReader in goroutine 1762
github.com/rclone/rclone/backend/sftp/sftp.go:2082 +0x199
The text was updated successfully, but these errors were encountered: