Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsafety improvements #1683

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Unsafety improvements #1683

wants to merge 5 commits into from

Conversation

Radvendii
Copy link
Contributor

As a followup to #1681, I've implemented a couple of the easy improvements.

  • annotated std::mem::transmute so the types are present locally
  • put lifetime annotations to disambiguate

When doing the latter, I noticed that one of the comments is wrong. I'm not sure how to interpret the lifetime that is correct, though.

@github-actions github-actions bot temporarily deployed to pull request October 16, 2023 19:43 Inactive
Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the interner part is confusing because 'a and 'self (although the later doesn't really make sense for a value - which is in part why there's no such keyword in Rust) are in fact morally the same. At least that's what I understand from the code.

What we want to express is that the &str we are storing inside the vector and the hashmap live as long as the the whole interner, because we store them inside an arena that is owned by the interner. So, for all intent of purpose, 'a = 'self. But you can't express that in Rust without an explicit lifetime, at least to the best of my knowledge. Hence, we need to use this additional parameter, to juggle between them depending on the requirement of the type signature, and sometimes force the compiler to accept the conversion using transmute.

I believe you're right that what we want in the second transmute is to get out an &'a to be able to store it inside the vector and the hashmap.

I'm not sure about the signature change of lookup, though: after all both lifetimes are "valid" (in that both are possible and sound), but I tend to think that 'a is an implementation detail/Rust work-around that leaks. As a consumer of an interner, what you want is to create stuff that live as long as the interner, I think, and you don't really care about the 'a lifetime. So using the same as self and transmuting looks nicer. It doesn't matter now, because we only have one, static interner, but let's imagine that interner is a library, and that we might make use of it locally (e.g. if one day we compile to bytecode - we could use an interner for one of the pases, but not the others).

core/src/environment.rs Outdated Show resolved Hide resolved
@Radvendii Radvendii force-pushed the unsafety_improvements branch from 42a0556 to b652cbe Compare November 6, 2023 19:29
@github-actions github-actions bot temporarily deployed to pull request November 6, 2023 19:32 Inactive
@Radvendii
Copy link
Contributor Author

for all intent of purpose, 'a = 'self

Ohhh no. Okay, so I played around some more and discovered why the lifetimes were the way they were originally. With the changes I made, one can write this code:

let interner = Interner::new();
let sym = interner.intern("temporary");
let smuggler = interner.lookup(sym);
drop(interner);
println!(smuggler); // use-after-free!!!

It's more clear what's going on if I add in a couple of explicit lifetime annotations:

let interner = Interner::<'static>::new();
let sym = interner.intern("temporary");
let smuggler: &'static str = interner.lookup(sym);
drop(interner);
println!(smuggler); // use-after-free!!!

There is no guarantee that 'self: 'a, only that 'a: 'self.

So it's definitely not the case that 'a = 'self, but we don't care because we never actually use that 'a lifetime. The external-facing API converts it to a different lifetime, but it wasn't clearly documented why all this was the way it was. I will revert my changes and update the comments.

@yannham
Copy link
Member

yannham commented Nov 7, 2023

So it's definitely not the case that 'a = 'self

Yes, sorry, I think I didn't express myself correctly. What I meant is that, as the writer of the API, you want to think of 'a as 'self. It is there just as an implementation detail because we can't really express 'self in Rust currently. However, indeed, for the compiler, they have no reason to be the same.

However you're right that it's not just a question of taste or API usage, because you can obviously pick 'a however you want, which I totally missed, so the change was in fact unsound, as you say. At first it just didn't make a lot of sense.

@Radvendii
Copy link
Contributor Author

I managed to produce UB without unsafe even with the old lifetime arrangement, although it's not likely to ever happen.

let i1 = Interner::new();
let i2 = Interner::new();
let sym = i1.intern("foo");
let foo = i1.lookup(sym);
*i1.0.write().unwrap() = i2.0.into_inner().unwrap();
dbg!(foo);

The issue here is that we assume the lifetime of the InnerInterner is the same as the lifetime of the Interner, but in fact it's not necessarily. Maybe the fix for this is to make InnerInterner a private struct field.


There's another problem, that's potentially worse / easier to trigger. I wasn't able to actually trigger it but I'm not entirely sure why. Self-referential structs can be problematic if they're moved. Something like

pub struct Foo<'a> {
    num: u32,
    num_ptr_opt: Option<&'a u32>,
}
impl<'a> Foo<'a> {
    pub fn new(num: u32) -> Self {
        let mut foo = Foo {
          num,
          num_ptr_opt: None
       };
       foo.num_ptr_opt = unsafe { std::mem::transmute<&'_ u32, &'a u32>(foo.num) };
       foo
    }
    pub fn get_num_ptr(&self) -> &u32 {
         self.num_ptr_opt.unwrap()
    }
}

Seems safe, because the data lives as long as the struct, and so the pointer should always point to valid memory. But it's not, because one can write

let foo = Foo::new(0);
let num_ptr = foo.get_num_ptr();
let foo = std::hint::black_box(foo); // force move
dbg!(num_ptr); // points to moved / freed memory

My guess as to why this doesn't happen with Interner is that the memory is allocated on the heap, and only pointed to by the struct. However, I've found people saying that the pointed-to heap memory should also be considered invalidated, and I haven't been able to find anything contradicting this.

I tried running a test of this behaviour with Interner through miri and it passed 🤔.

@github-actions github-actions bot temporarily deployed to pull request November 7, 2023 21:00 Inactive
Comment on lines -243 to +245
pub(crate) struct Interner<'a>(RwLock<InnerInterner<'a>>);
// NOTE: We set the lifetime parameter of InnerInterner to 'static since
// it's arbitrary, and there's no reason to expose it to users of Interner
pub(crate) struct Interner(RwLock<InnerInterner<'static>>);
Copy link
Member

@yannham yannham Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Radvendii , it seems that it used to be the fact that the lifetime of the interner and the inner interner were the same one, before this change. Or am I misunderstanding something? Does your first example of UB still happens with the original code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it still happens with the original code.

Remember that in Interner<'a>(RwLock<InnerInterner<'a>) , there is nothing limiting 'a to the liferime of the Interner. If rust needs it to be 'static it will just insert 'static in here anyways, so specifying 'static doesn't allow any behaviour that wasn't allowed before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. But it's a private item already, as is InnerInterner, so you can only access the InnerInterner of an interner from within the identifier::interner module, is that right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact, I start to believe that the lifetime of InnerInterner is useless, because we only ever instantiate it to static. So maybe we should just bite the bullet and use static references, saying that they aren't static at all, but only the actual implementation of InnerInterner can touch it. And document as well that an Interner should never ever change the underlying inner interner after construction. However, the inner interner being private should make it relatively ok. Another solution is to inline the inner interner in the Interner to avoid this situation, but this means more locks, and more coupling. I don't know if this is worth the price: in any case, we should make it clear that this module has unsafe parts and that you shouldn't touch it unless you know what you're doing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility is to explore Pin, which believe allows to express those kind of constraints (for example you can't swap a Pin<..> with something else, just drop it, which might be sufficient to avoid your UB example ?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And finally, maybe we should use NonNull instead of references with lifetimes, which would avoid to lie about their lifetime (would still require unsafe, but at least won't make it possible to do conversion to &'static from safe code)

@yannham
Copy link
Member

yannham commented Nov 8, 2023

The issue here is that we assume the lifetime of the InnerInterner is the same as the lifetime of the Interner, but in fact it's not necessarily. Maybe the fix for this is to make InnerInterner a private struct field.

Cf my in-code comment, but it seems this UB has been introduced by this PR, and wasn't here before?

My guess as to why this doesn't happen with Interner is that the memory is allocated on the heap, and only pointed to by the struct. However, I've found people saying that the pointed-to heap memory should also be considered invalidated, and I haven't been able to find anything contradicting this.

Ah, and I found other people claiming that Box guarantees that the content will never be moved 😅

https://stackoverflow.com/questions/64854198/can-box-move-its-contents-when-the-whole-box-is-moved

Also, I didn't have time to dive as deep as you into the problem, so I might be misinterpreting, but it seems the answer you linked doesn't really says that stuff on the heap can be moved. It says that it's unsafe to have the content of a Box being moved "under the rug" (even it's not really moved, but the ownership is transferred), and you can do that because you have mutable alias to the content stored inside the box (and with a 'static lifetime!).

Yes, it's undefined behavior. Box isn't just managing an allocation; it also (currently) signals a claim of unique, non-aliasing access to the contents. You violate that non-aliasing by creating the writer and then moving the buffer — even though the heap memory is not actually touched, the move of buffer is counted invalidating all references into it.

It seems what we do here is different, because we never ever expose a mutable static reference, nor do we mutably alias anything, so nothing can take out the backing storage of e.g. the Vec or the HashMap during its lifetime. We just store and hand over immutable (self-)references of heap-allocated content we own, which looks like it should be ok, provided it's indeed guaranteed to not ever move. However, if you're not sure, I'm happy to introduce ourobouros, Pin or whatnot to make sure we do things the right way.

That being said: https://www.reddit.com/r/rust/comments/1479b5n/ouroboros_is_also_unsound/
It seems self-referential struct is a tricky matter, for everyone 😄

@yannham
Copy link
Member

yannham commented Nov 8, 2023

summoning @jneem to have a second look at those tricky soundness issues

@jneem
Copy link
Member

jneem commented Nov 8, 2023

@Radvendii I couldn't reproduce your "Foo" example on the playground. The signature of get_num_ptr makes its return value borrow from the Foo, so then you can't move the Foo away while that borrow is live. I think that same thing applies to Interner::lookup. The 'a lifetime is not really relevant here, AFAICT: it's the signature fn lookup<'b>(&'b self, sym: Symbol) -> &'b str that makes it work.

My understanding of Box (which seems compatible with both SO answers) is that the pointed-to data never moves (expressed in the fact that Box<T>: Unpin for all T), but it can get freed and thus invalidated. In our current interner code, the arena-allocated data will be valid as long as the arena is alive, which will be for as long as the InnerInterner is alive. And so I think this means the signature of InnerInterner::lookup is fine. (The lifetime 'a seems pretty irrelevant here, AFAICT)

I wonder, though, if it would be better to just take a dependency on an existing interner crate. This one seems to be widely used, for example.

@yannham
Copy link
Member

yannham commented Nov 16, 2023

I wonder, though, if it would be better to just take a dependency on an existing interner crate. This one seems to be widely used, for example.

I think the original assessment was that we needed something very simple, but now, it looks like it's not as simple as initially thought (and the person who wrote this code left Tweag some time ago, although to be fair, it's well-written).

A second point is that intern might make use of an arena, which could theoretically speed up allocations and have better cache locality than our current "just allocate anywhere in the heap" (non-)strategy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants