Readdir support #11 #12

Merged 2 commits on Aug 7, 2022
7 changes: 6 additions & 1 deletion Cargo.toml
@@ -14,13 +14,18 @@ version = "0.0.1"

[dependencies]
cfg-if = "1.0.0"
cvt = "0.1.1"

[dev-dependencies]
tempfile = "3.3.0"

[target.'cfg(not(windows))'.dependencies]
cvt = "0.1.1"
libc = "0.2.121"
# Saves nontrivial unsafe and platform specific code (Darwin vs other Unixes,
# MAX_PATH and more) : consider it weak and something we can remove if expedient
# later.
nix = { version = "0.24.2", default-features = false, features = ["dir"] }


[target.'cfg(windows)'.dependencies]
ntapi = "0.3.7"
2 changes: 1 addition & 1 deletion README.md
@@ -6,7 +6,7 @@ filesystem code, since otherwise the state of the filesystem path that
operations are executed against can change silently, leading to TOC-TOU race
conditions. For Unix these calls are readily available in the libc crate, but
for Windows some more plumbing is needed. This crate provides a unified
Rust-y interface to these calls.
Rust-y and safe interface to these calls.

## MSRV policy

115 changes: 112 additions & 3 deletions src/lib.rs
@@ -14,6 +14,7 @@
//! unified Rust-y interface to these calls.

use std::{
ffi::OsStr,
fs::File,
io::{Error, ErrorKind, Result},
path::Path,
@@ -23,11 +24,11 @@ cfg_if::cfg_if! {
if #[cfg(windows)] {
mod win;

use win::OpenOptionsImpl;
use win::{OpenOptionsImpl, ReadDirImpl, DirEntryImpl};
} else {
mod unix;

use unix::OpenOptionsImpl;
use unix::{OpenOptionsImpl, ReadDirImpl, DirEntryImpl};
}
}

@@ -183,6 +184,11 @@ impl OpenOptions {
/// This will honour the options set for creation/append etc, but will only
/// operate relative to d. To open a file with an absolute path, use the
/// stdlib fs::OpenOptions.
///
/// Note: On Windows this uses low level APIs that do not perform path
/// separator translation: if passing a path containing a separator, it must
/// be a platform native one. e.g. `foo\\bar` on Windows, vs `foo/bar` on
/// most other OS's.
pub fn open_at<P: AsRef<Path>>(&self, d: &mut File, p: P) -> Result<File> {
self._impl.open_at(d, OpenOptions::ensure_root(p.as_ref())?)
}
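
A minimal sketch of the separator note above, using only the standard library: building the relative path from components lets `PathBuf` insert the platform-native separator, so no translation is needed before the path reaches `open_at`. `native_relative` is a hypothetical helper for illustration and is not part of this crate.

```rust
use std::path::PathBuf;

/// Hypothetical helper: join path components with the platform-native
/// separator (`\` on Windows, `/` on most other OSes).
fn native_relative(components: &[&str]) -> PathBuf {
    // PathBuf's FromIterator pushes each component in turn, inserting
    // std::path::MAIN_SEPARATOR between them.
    components.iter().collect()
}
```

The result could then be passed as the `p` argument of `open_at`, for example `native_relative(&["foo", "bar"])`.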
@@ -198,6 +204,66 @@ impl OpenOptions {
}
}

/// Iterate over the contents of a directory. Created by calling read_dir() on
/// an opened directory. Each item yielded by the iterator is an io::Result to
/// allow communication of io errors as the iterator is advanced.
///
/// To the greatest extent possible the underlying OS semantics are preserved.
/// That means that `.` and `..` entries are exposed, and that no sort order is
/// guaranteed by the iterator.
#[derive(Debug)]
pub struct ReadDir<'a> {
    _impl: ReadDirImpl<'a>,
}

impl<'a> ReadDir<'a> {
    pub fn new(d: &'a mut File) -> Result<Self> {
        Ok(ReadDir {
            _impl: ReadDirImpl::new(d)?,
        })
    }
}

impl Iterator for ReadDir<'_> {
    type Item = Result<DirEntry>;

    fn next(&mut self) -> Option<Result<DirEntry>> {
        self._impl
            .next()
            .map(|entry| entry.map(|_impl| DirEntry { _impl }))
    }
}
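
Because `.` and `..` are yielded and no order is guaranteed, callers that want a stable listing have to filter and sort themselves. A minimal sketch of that post-processing, written against a plain iterator of names so it stands alone; `visible_sorted` is a hypothetical helper, not something this PR adds.

```rust
use std::ffi::OsString;
use std::io;

/// Hypothetical helper: collect entry names, drop `.` and `..`, and sort.
fn visible_sorted<I>(names: I) -> io::Result<Vec<OsString>>
where
    I: IntoIterator<Item = io::Result<OsString>>,
{
    let mut out = Vec::new();
    for name in names {
        // Propagate the first IO error reported while iterating.
        let name = name?;
        if name != "." && name != ".." {
            out.push(name);
        }
    }
    // The OS gives no ordering guarantee, so impose one here.
    out.sort();
    Ok(out)
}
```

With the API in this diff, the input could presumably be produced with `read_dir(&mut d)?.map(|e| e.map(|e| e.name().to_os_string()))`.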

/// The returned type for each entry found by [`read_dir`].
///
/// Each entry represents a single entry inside the directory. Platforms that
/// provide rich metadata may in future expose this through methods or extension
/// traits on DirEntry.
///
/// For now however, only the [`name()`] is exposed. This does not imply any

I see your point about it being racy, but does open_at give you a way to avoid the race consistently?

If there's an entry that's flipping between being a symlink, directory, file, and something else, and I want to readlink, readdir, read, or error out... I'd need a nofollow option to start with, and then I guess the application code needs to retry various options until something succeeds? Seems a bit messy?

(Probably not a reason to block addition of this with just the name to start with, just a thought.)

Owner Author

> Something like sourcefrog@88faff1 works on Linux at least.

That's ... fascinating. What bakes my brain here is that this is a new fd - where is the state coming from? (The previous iteration over the DIR* contents should be wiped out when fdopendir is called: https://docs.rs/nix/0.24.2/src/nix/dir.rs.html#57)

And we never actually read from the original fd: the pattern is:

  • open dir (fd 3)
  • fcntl(3, F_DUPFD_CLOEXEC, 0) = 4
  • fdopendir(4) = DIR*
  • drop for Dir calls closedir correctly, which closes fd 4.

It works on Linux here outside docker. Doesn't work within docker. That means we don't require a kernel version change to trigger the situation.

Linux tr2vm 5.13.0-52-generic #59-Ubuntu SMP Wed Jun 15 20:17:13 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
libc6:amd64 2.34-0ubuntu3.2

The failing docker container contents:
ii libc6:amd64 2.35-0ubuntu3.1 amd64 GNU C Library: Shared libraries

Owner Author

> I see your point about it being racy, but does open_at give you a way to avoid the race consistently?
>
> If there's an entry that's flipping between being a symlink, directory, file, and something else, and I want to readlink, readdir, read, or error out... I'd need a nofollow option to start with, and then I guess the application code needs to retry various options until something succeeds? Seems a bit messy?
>
> (Probably not a reason to block addition of this with just the name to start with, just a thought.)

NoFollow is absolutely needed yes. We also need an inode abstraction (to permit safeish open_at(child, '..')) to walk back up the tree. But then you don't need to retry, you just open, interrogate, and if a symlink follow the pointer from userspace, otherwise process the object you obtained from open_at.
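
The "open, then interrogate" half of this can be sketched with the standard library alone. This is only an illustration under stated assumptions: `Kind` and `classify` are hypothetical names, and the sketch does not handle symlinks, which would need the NoFollow support discussed here.

```rust
use std::fs::File;
use std::io;

/// Hypothetical classification of an already-opened object.
#[derive(Debug, PartialEq)]
enum Kind {
    Dir,
    File,
    Other,
}

/// Hypothetical helper: decide how to process a handle after opening it.
fn classify(f: &File) -> io::Result<Kind> {
    // File::metadata queries the open handle rather than a path, so the
    // answer describes the object we hold even if the name is swapped later.
    let ft = f.metadata()?.file_type();
    Ok(if ft.is_dir() {
        Kind::Dir
    } else if ft.is_file() {
        Kind::File
    } else {
        Kind::Other
    })
}
```

The point of the design is that the type check happens after the open, on the handle, instead of trusting what a directory listing reported earlier.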

Owner Author

Installed an impish container (same as host os that 'works') and it still fails - so this is looking very specific to docker.

Going to try a couple things :). Also I've realised nix has poor readdir hygiene, filing a bug there.

Owner Author

This leaves a strong hint (from the NetBSD man page in particular) nix-rust/nix#1784

So I think seek might be right, though I'm still suspicious of underlying things here.

sourcefrog, Aug 7, 2022

It failed for me first time, outside of Docker, on Pop_OS 22.04:

Linux lift 5.18.10-76051810-generic #202207071639~1659108431~22.04~c9172fb SMP PREEMPT_DYNAMIC Fri J x86_64 x86_64 x86_64 GNU/Linux

I can't immediately cite chapter and verse but I'm not surprised that dup'd file descriptors share a seek position.

https://man7.org/linux/man-pages/man2/dup.2.html

   The dup() system call allocates a new file descriptor that refers to the same open file description as the descriptor oldfd. 

https://man7.org/linux/man-pages/man2/open.2.html

    A call to open() creates a new open file description, an entry in
   the system-wide table of open files.  The open file description
   records the file offset and the file status flags (see below).  A
   file descriptor is a reference to an open file description; this
   reference is unaffected if pathname is subsequently removed or
   modified to refer to a different file.  For further details on
   open file descriptions, see NOTES.

So my model was that you have a single open file object in the kernel with two fd numbers pointing to it. If you read from it by one, you have to seek it back in the other.

That does not explain why it would succeed in some cases though!
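
The shared-offset model can be checked with a small standard-library sketch. `File::try_clone` is documented to return a handle that shares the underlying file, so reads and seeks through one affect the other. `offset_after_read_via_clone` is a hypothetical helper, and it uses a regular file rather than a directory stream, so it illustrates the model and does not reproduce the readdir behaviour above.

```rust
use std::fs::File;
use std::io::{self, Read, Seek};
use std::path::Path;

/// Read `n` bytes through a duplicate of the handle, then report the
/// position as seen through the original handle.
fn offset_after_read_via_clone(path: &Path, n: usize) -> io::Result<u64> {
    let mut original = File::open(path)?;
    // The clone is a second descriptor for the same open file description.
    let mut duplicate = original.try_clone()?;
    let mut buf = vec![0u8; n];
    duplicate.read_exact(&mut buf)?;
    // The original was never read from, yet its position has moved.
    original.stream_position()
}
```

To reuse the original handle from the start after such a read, it would have to be seeked back explicitly.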

> NoFollow is absolutely needed yes. We also need an inode abstraction (to permit safeish open_at(child, '..')) to walk back up the tree. But then you don't need to retry, you just open, interrogate, and if a symlink follow the pointer from userspace, otherwise process the object you obtained from open_at.

But with O_NOFOLLOW you'll fail to open the file. You need to then retry with readlinkat.

I split out #13 on this too.

The other case I thought of where statat will help beyond fstat is a file that you don't have permission to read.

Owner Author

Yes, the success in some cases is very confusing. We may need to add a note that the same file object is in use and thus the file location offset can be updated by readdir.

O_PATH permits fstat after opening, but the portability concerns are there :/.

The advice for programs that want to be secure and run on Darwin, from that stackexchange, is to tell folk that they need to grant read access to permit such software to work, or document its vulnerability to races on Darwin.

/// additional IO for most workloads: metadata returned from a directory listing
/// is inherently racy: presuming that what was a dir, or symlink etc when the
/// directory was listed, will still be the same when opened is fallible.
/// Instead, use open_at to open the contents, and then process based on the
/// type of content found.
#[derive(Debug)]
pub struct DirEntry {
    _impl: DirEntryImpl,
}

impl DirEntry {
    pub fn name(&self) -> &OsStr {
        self._impl.name()
    }
}

/// Read the children of the directory d.
///
/// See [`ReadDir`] and [`DirEntry`] for details.
pub fn read_dir(d: &mut File) -> Result<ReadDir> {

I somewhat would have expected this would be a method on a specific DirHandle object, not a generic fs::File. Or, maybe it would be an extension trait of File... (I realize this is the style it's using now.)

Owner Author

Yeah, so I can understand that.

There is a consistency question. The stdlib's OpenOptions struct is basically a programmable factory. I've chosen to put mkdir_at and open_at (and in future symlink_at and delete_at) on a similar struct, at least for now. That permits a foreign File-like type that implements AsDeref to be used, I guess, which an extension trait wouldn't so much.

readdir doesn't seem to need OpenOptions which is why I made it a free function for now.

The type it works on is File because one doesn't know when a path is opened whether it is a directory or not. We could do a type conversion thing where we check the metadata, but std lib Files don't have free metadata last I checked, so we'd be adding a syscall hidden behind a type conversion, which feels strange to me.

You could have a struct DirHandle(File) that allows construction from a File without checking the on-disk type, just letting later calls fail if it's not really a dir.

It might make client code feel more natural: dir_handle.readlink_at(name). Maybe this is imposing too much of an opinion on the client.

Of course on Linux (at least) it doesn't have to be a dir; you can have an O_PATH.

Owner Author

Perhaps. We can try a few things and see what's nice. e.g.

struct FileAndError { f: File, e: io::Error}
DirHandle::from(File) -> Result<DirHandle, FileAndError>

would certainly work. I'm just not sure it's better yet :)
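
One way the idea above might look in compilable form, as a sketch only: it uses `TryFrom` because `From` cannot return a `Result`, and the metadata check is the extra syscall mentioned earlier in this thread. `DirHandle` and `FileAndError` follow the names floated here; nothing in this PR defines them.

```rust
use std::convert::TryFrom;
use std::fs::File;
use std::io;

/// Returned on failure so the caller gets the File back along with the error.
#[derive(Debug)]
struct FileAndError {
    f: File,
    e: io::Error,
}

/// Hypothetical newtype marking a File that was a directory when checked.
#[derive(Debug)]
struct DirHandle(File);

impl TryFrom<File> for DirHandle {
    type Error = FileAndError;

    fn try_from(f: File) -> Result<Self, Self::Error> {
        // One metadata query on the open handle decides the conversion.
        match f.metadata() {
            Ok(m) if m.is_dir() => Ok(DirHandle(f)),
            Ok(_) => Err(FileAndError {
                f,
                e: io::Error::new(io::ErrorKind::Other, "not a directory"),
            }),
            Err(e) => Err(FileAndError { f, e }),
        }
    }
}
```

The unchecked alternative suggested above would simply wrap the File and let later calls fail, trading the syscall for a later error.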

    ReadDir::new(d)
}

pub mod os {
cfg_if::cfg_if! {
if #[cfg(windows)] {
@@ -214,14 +280,15 @@ pub mod testsupport;
#[cfg(test)]
mod tests {
use std::{
ffi::OsStr,
fs::{rename, File},
io::{Error, ErrorKind, Result, Seek, SeekFrom, Write},
path::PathBuf,
};

use tempfile::TempDir;

use crate::{testsupport::open_dir, OpenOptions, OpenOptionsWriteMode};
use crate::{read_dir, testsupport::open_dir, DirEntry, OpenOptions, OpenOptionsWriteMode};

/// Create a directory parent, open it, then rename it to renamed-parent and
/// create another directory in its place. returns the file handle and the
@@ -464,4 +531,46 @@ mod tests {
}
Ok(())
}

    #[test]
    fn readdir() -> Result<()> {
        let (_tmp, mut parent_dir, _pathname) = setup()?;
        assert_eq!(
            2, // . and ..
            read_dir(&mut parent_dir)?
                .collect::<Result<Vec<DirEntry>>>()?
                .len()
        );
        let dir_present =
            |children: &Vec<DirEntry>, name: &OsStr| children.iter().any(|e| e.name() == name);

        let mut options = OpenOptions::default();
        options.create_new(true).write(OpenOptionsWriteMode::Write);
        options.open_at(&mut parent_dir, "1")?;
        options.open_at(&mut parent_dir, "2")?;
        options.open_at(&mut options.mkdir_at(&mut parent_dir, "child")?, "3")?;
        let children = read_dir(&mut parent_dir)?.collect::<Result<Vec<_>>>()?;
        assert_eq!(
            5,
            children.len(),
            "directory contains 5 entries (., .., 1, 2, child)"
        );
        assert!(dir_present(&children, OsStr::new("1")), "{:?}", children);
        assert!(dir_present(&children, OsStr::new("2")), "{:?}", children);
        assert!(
            dir_present(&children, OsStr::new("child")),
            "{:?}",
            children
        );

        {
            let mut child = OpenOptions::default()
                .read(true)
                .open_at(&mut parent_dir, "child")?;
            let children = read_dir(&mut child)?.collect::<Result<Vec<_>>>()?;
            assert_eq!(3, children.len(), "{:?}", children);
            assert!(dir_present(&children, OsStr::new("3")), "{:?}", children);
        }
        Ok(())
    }
}
107 changes: 106 additions & 1 deletion src/unix.rs
@@ -1,9 +1,11 @@
use std::{
ffi::CString,
ffi::{CString, OsStr, OsString},
fs::File,
io::Result,
marker::PhantomData,
os::unix::prelude::{AsRawFd, FromRawFd, OsStrExt},
path::Path,
ptr,
};

// This will probably take a few iterations to get right. The idea: always use
@@ -180,6 +182,109 @@ impl OpenOptionsExt for OpenOptions {
}
}

#[derive(Debug)]
pub(crate) struct ReadDirImpl<'a> {
    // Since we clone the FD, the original FD is now separate. In theory.
    // However for Windows we use the File directly, thus here we need to
    // pretend.
    _phantom: PhantomData<&'a File>,
    // Set to None after we closedir it. Perhaps we should impl Send and Sync
    // because the data referenced is owned by libc?
    dir: Option<ptr::NonNull<libc::DIR>>,
}

impl<'a> ReadDirImpl<'a> {
    pub fn new(dir_file: &'a mut File) -> Result<Self> {
        // closedir closes the FD; make a new one that we can close when done with.
        let new_fd =
            cvt_r(|| unsafe { libc::fcntl(dir_file.as_raw_fd(), libc::F_DUPFD_CLOEXEC, 0) })?;
        let mut dir = Some(
            ptr::NonNull::new(unsafe { libc::fdopendir(new_fd) }).ok_or_else(|| {
                let _droppable = unsafe { File::from_raw_fd(new_fd) };
                std::io::Error::last_os_error()
            })?,
        );

        // If dir_file has had operations on it - such as open_at - its pointer
        // might not be at the start of the dir, and fdopendir is documented
        // (e.g. BSD man pages) to not rewind the fd - and our cloned fd
        // inherits the pointer.
        if let Some(d) = dir.as_mut() {
            unsafe { libc::rewinddir(d.as_mut()) };
        }

        Ok(ReadDirImpl {
            _phantom: PhantomData,
            dir,
        })
    }

    fn close_dir(&mut self) -> Result<()> {
        if let Some(ref mut dir) = self.dir {
            let result = unsafe { libc::closedir(dir.as_mut()) };
            // call made, clear state
            self.dir = None;
            cvt_r(|| result)?;
        }
        Ok(())
    }
}

impl Drop for ReadDirImpl<'_> {
    fn drop(&mut self) {
        // like the stdlib, we eat errors occurring during drop, as there is no
        // way to get error handling.
        let _ = self.close_dir();
    }
}

impl Iterator for ReadDirImpl<'_> {
    type Item = Result<DirEntryImpl>;

    fn next(&mut self) -> Option<Self::Item> {
        let dir = unsafe { self.dir?.as_mut() };
        // the readdir result is only guaranteed valid within the same thread
        // and until other calls are made on the same dir stream. Thus we
        // perform the required work inside next, allowing the next call to
        // readdir to be managed by the single mutable borrower rule in Rust.
        // readdir requires errno set to zero.
        nix::Error::clear();
        ptr::NonNull::new(unsafe { libc::readdir(dir) })
            .map(|e| {
                Ok(DirEntryImpl {
                    name: unsafe {
                        // Step one: C pointer to CStr - referenced data, length not known.
                        let c_str = std::ffi::CStr::from_ptr(e.as_ref().d_name.as_ptr());
                        // Step two: OsStr: referenced data, length calculated
                        let os_str = OsStr::from_bytes(c_str.to_bytes());
                        // Step three: owned copy
                        os_str.to_os_string()
                    },
                })
            })
            .or_else(|| {
                // NULL result, an error IFF errno has been set.
                let err = std::io::Error::last_os_error();
                if err.raw_os_error() == Some(0) {
                    None
                } else {
                    Some(Err(err))
                }
            })
    }
}
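
The three conversion steps in `next` can be tried in isolation. A minimal, Unix-only sketch: `name_from_buffer` is a hypothetical helper, and the slice stands in for the `d_name` field. Unlike the code above, it first checks for a NUL terminator so that `CStr::from_ptr` cannot read past the end of the slice.

```rust
use std::ffi::{CStr, OsStr, OsString};
use std::os::raw::c_char;
use std::os::unix::ffi::OsStrExt;

/// Hypothetical helper: copy a NUL-terminated name out of a C char buffer.
fn name_from_buffer(buf: &[c_char]) -> Option<OsString> {
    // Refuse buffers without a terminator; from_ptr would scan past the end.
    if !buf.contains(&0) {
        return None;
    }
    // Step one: C pointer to CStr, borrowing the buffer up to the first NUL.
    let c_str = unsafe { CStr::from_ptr(buf.as_ptr()) };
    // Step two: view the bytes as an OsStr, still borrowed.
    let os_str = OsStr::from_bytes(c_str.to_bytes());
    // Step three: owned copy that outlives the buffer.
    Some(os_str.to_os_string())
}
```

Taking the owned copy before returning matters for the same reason given in the comments above: the buffer behind a readdir result is only valid until the next call on the same stream.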

#[derive(Debug)]
pub(crate) struct DirEntryImpl {
    name: OsString,
}

impl DirEntryImpl {
    pub fn name(&self) -> &OsStr {
        &self.name
    }
}

#[cfg(test)]
mod tests {
use std::{