Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve wording and description of atomics in the Arc chapter #430

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 12 additions & 8 deletions src/arc-mutex/arc-clone.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,14 +21,18 @@ We can update the atomic reference count as follows:
let old_rc = inner.rc.fetch_add(1, Ordering::???);
```

But what ordering should we use here? We don't really have any code that will
need atomic synchronization when cloning, as we do not modify the internal value
while cloning. Thus, we can use a Relaxed ordering here, which implies no
happens-before relationship but is atomic. When `Drop`ping the Arc, however,
we'll need to atomically synchronize when decrementing the reference count. This
is described more in [the section on the `Drop` implementation for
`Arc`](arc-drop.md). For more information on atomic relationships and Relaxed
ordering, see [the section on atomics](../atomics.md).
But what ordering should we use here? We don't really have any code
that will need atomic synchronization when cloning, as we do not
modify the internal value while cloning. Additionally, we already know
the reference count is at least one, by virtue of having a
`&Arc<T>`--and it will stay that way in sound code as long as that
reference still lives. Thus, we can use a Relaxed ordering here, which
implies no happens-before relationship but is atomic. When `Drop`ping
the Arc, however, we'll need to atomically synchronize when
decrementing the reference count. This is described more in [the
section on the `Drop` implementation for `Arc`](arc-drop.md). For more
information on atomic relationships and Relaxed ordering, see [the
section on atomics](../atomics.md).

Thus, the code becomes this:

Expand Down
56 changes: 24 additions & 32 deletions src/arc-mutex/arc-drop.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,45 +32,37 @@ if inner.rc.fetch_sub(1, Ordering::Release) != 1 {
}
```

We then need to create an atomic fence to prevent reordering of the use of the
data and deletion of the data. As described in [the standard library's
implementation of `Arc`][3]:
> This fence is needed to prevent reordering of use of the data and deletion of
> the data. Because it is marked `Release`, the decreasing of the reference
> count synchronizes with this `Acquire` fence. This means that use of the data
> happens before decreasing the reference count, which happens before this
> fence, which happens before the deletion of the data.
>
> As explained in the [Boost documentation][1],
>
> > It is important to enforce any possible access to the object in one
> > thread (through an existing reference) to *happen before* deleting
> > the object in a different thread. This is achieved by a "release"
> > operation after dropping a reference (any access to the object
> > through this reference must obviously happened before), and an
> > "acquire" operation before deleting the object.
>
> In particular, while the contents of an Arc are usually immutable, it's
> possible to have interior writes to something like a Mutex<T>. Since a Mutex
> is not acquired when it is deleted, we can't rely on its synchronization logic
> to make writes in thread A visible to a destructor running in thread B.
>
> Also note that the Acquire fence here could probably be replaced with an
> Acquire load, which could improve performance in highly-contended situations.
> See [2].
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html
> [2]: https://github.com/rust-lang/rust/pull/41714
[3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467
Why do we use `Release` here? Well, the `Release` ordering ensures
that any writes to the data from other threads happen-before this
decrementing of the reference count.

If we succeed, however, we need further guarantees. We must use an
`Acquire` fence, which ensures that the decrement of the reference
count happens-before our deletion of the data.

To do this, we do the following:

```rust
# use std::sync::atomic::Ordering;
<!-- ignore: simplified code -->
```rust,ignore
use std::sync::atomic;
atomic::fence(Ordering::Acquire);
```

We could have used `AcqRel` for the `fetch_sub` operation, but that
would give more guarantees than we need when we *don't* succeed. On
some platforms, using an `AcqRel` ordering for *every* `Drop` may have
an impact on performance. While this is a niche optimization, it can't
hurt--also, it helps to further convey the guarantees necessary not
only to the processor but also to readers of the code.

With the combination of these two synchronization points, we ensure
that, to our thread, the following order of events manifests:
- Data used by our/other thread
- Reference count decremented by our thread
- Data deleted by our thread
This way, we ensure that our data is not dropped while it is still
in use.

Finally, we can drop the data itself. We use `Box::from_raw` to drop the boxed
`ArcInner<T>` and its data. This takes a `*mut T` and not a `NonNull<T>`, so we
must convert using `NonNull::as_ptr`.
Expand Down
40 changes: 25 additions & 15 deletions src/arc-mutex/arc-layout.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ Naively, it would look something like this:
use std::sync::atomic;

pub struct Arc<T> {
ptr: *mut ArcInner<T>,
ptr: *const ArcInner<T>,
}

pub struct ArcInner<T> {
Expand All @@ -35,24 +35,34 @@ pub struct ArcInner<T> {
}
```

This would compile, however it would be incorrect. First of all, the compiler
will give us too strict variance. For example, an `Arc<&'static str>` couldn't
be used where an `Arc<&'a str>` was expected. More importantly, it will give
incorrect ownership information to the drop checker, as it will assume we don't
own any values of type `T`. As this is a structure providing shared ownership of
a value, at some point there will be an instance of this structure that entirely
owns its data. See [the chapter on ownership and lifetimes](../ownership.md) for
all the details on variance and drop check.
This would compile, however it would be incorrect--it will give
incorrect ownership information to the drop checker, as it will assume
we don't own any values of type `T`. As this is a structure providing
shared ownership of a value, at some point there will be an instance
of this structure that entirely owns its data.

To fix the first problem, we can use `NonNull<T>`. Note that `NonNull<T>` is a
wrapper around a raw pointer that declares that:
To fix the problem, we can include a `PhantomData` marker containing an
`ArcInner<T>`. This will tell the drop checker that we have some notion of
ownership of a value of `ArcInner<T>` (which itself contains some `T`).

We should also use `NonNull<T>`, as it helps convey to the reader, and
the compiler, more guarantees about our inner pointer. Note that
`NonNull<T>` is a wrapper around a raw pointer that declares that:

* We are covariant over `T`
* We are covariant over `T`. This property is important to retain from
a `*const T` so that, for example, we could use an `Arc<&'static T>`
where an `Arc<&'a T>` was needed. This is perhaps a bit contrived but
there are cases when this could be useful, especially when dealing
with structures generic over lifetimes.
* Our pointer is never null

To fix the second problem, we can include a `PhantomData` marker containing an
`ArcInner<T>`. This will tell the drop checker that we have some notion of
ownership of a value of `ArcInner<T>` (which itself contains some `T`).
For more information on variance and the drop check, see [the chapter
on ownership and lifetimes](../ownership.md).

This can lead to some helpful compiler optimizations for layout (for
example, the null pointer optimization for `Option<NonNull<T>>` would
carry forth to an `Option<Arc<T>>` and perhaps even machine code
optimizations in certain cases.

With these changes we get our final structure:

Expand Down