Module left_right::aliasing
Types that aid in aliasing values across the two left-right data copies.
This module primarily revolves around the Aliased type and its associated DropBehavior trait.
The basic flow of using it is as follows.
In general, each value in your data structure should be stored wrapped in an Aliased, with an
associated type D that has DropBehavior::DO_DROP set to false. In Absorb::absorb_first, you then
simply drop any removed Aliased<T, D> as normal. The backing T will not be dropped.
In Absorb::absorb_second, you first cast your datastructure from
&mut DataStructure<Aliased<T, D>>
to
&mut DataStructure<Aliased<T, D2>>
where <D2 as DropBehavior>::DO_DROP is true. This time, any Aliased<T, D2> that you drop will
drop the inner T, but this should be safe since the only other alias was dropped in
absorb_first. This is where the invariant that absorb_* is deterministic becomes extremely
important!
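The flow above can be sketched without the crate itself. The block below is a toy model, not the crate's implementation: MiniAliased, NoDrop, DoDrop, cast_to_dropping, Tracked and two_phase_demo are all hypothetical names, a plain Vec stands in for the data structure, and the local DropBehavior trait only models the DO_DROP constant. The pointer cast carries the same caveat as the real one: it is only likely sound.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::mem::MaybeUninit;
use std::rc::Rc;

/// Stand-in for the crate's `DropBehavior` trait (only the constant is modelled).
trait DropBehavior {
    const DO_DROP: bool;
}

// Hypothetical marker types; in a real crate these must stay private.
struct NoDrop;
struct DoDrop;
impl DropBehavior for NoDrop {
    const DO_DROP: bool = false;
}
impl DropBehavior for DoDrop {
    const DO_DROP: bool = true;
}

/// Simplified stand-in for `Aliased<T, D>`: a bitwise copy of a `T` that
/// drops the `T` only when `D::DO_DROP` is true.
#[repr(transparent)]
struct MiniAliased<T, D: DropBehavior> {
    value: MaybeUninit<T>,
    _behavior: PhantomData<D>,
}

impl<T, D: DropBehavior> MiniAliased<T, D> {
    fn new(value: T) -> Self {
        MiniAliased { value: MaybeUninit::new(value), _behavior: PhantomData }
    }

    /// Creates a second handle to the same `T` without cloning it.
    ///
    /// Safety: the caller must ensure the `T` is dropped at most once, and
    /// only after every other alias is gone.
    unsafe fn alias(&self) -> Self {
        MiniAliased { value: unsafe { std::ptr::read(&self.value) }, _behavior: PhantomData }
    }
}

impl<T, D: DropBehavior> Drop for MiniAliased<T, D> {
    fn drop(&mut self) {
        if D::DO_DROP {
            // SAFETY: by the contract of `alias`, this is the last alias.
            unsafe { self.value.assume_init_drop() }
        }
    }
}

/// The cast performed at the start of the second phase. It relies on both
/// `Vec` types having the same layout, which the language does not promise.
fn cast_to_dropping<T>(v: &mut Vec<MiniAliased<T, NoDrop>>) -> &mut Vec<MiniAliased<T, DoDrop>> {
    unsafe { &mut *(v as *mut Vec<MiniAliased<T, NoDrop>> as *mut Vec<MiniAliased<T, DoDrop>>) }
}

/// Test value that counts how many times it has been dropped.
struct Tracked(Rc<Cell<usize>>);
impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the drop count after the first and after the second phase.
fn two_phase_demo() -> (usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let mut left = vec![MiniAliased::<_, NoDrop>::new(Tracked(Rc::clone(&drops)))];
    // SAFETY: the shared `Tracked` is dropped exactly once, in the second phase.
    let mut right = vec![unsafe { left[0].alias() }];

    // "absorb_first": remove from one copy; the backing value survives.
    drop(left.pop());
    let after_first = drops.get();

    // "absorb_second": cast, then remove from the other copy; the value is dropped.
    drop(cast_to_dropping(&mut right).pop());
    let after_second = drops.get();

    (after_first, after_second)
}
```

Calling two_phase_demo() shows the backing value surviving the first removal and being dropped exactly once by the second. If the two phases removed different elements, the same code would either leak a value or drop one that the other copy still exposes to readers.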
Sounds nice enough, right? Well, you have to be really careful when working with this type. There are two primary things to watch out for:
Mismatched dropping
If absorb_first and absorb_second do not drop exactly the same aliased values for a given
operation from the oplog, unsoundness ensues. Specifically, what will happen is that
absorb_first does not drop some aliased t: T, but absorb_second does. Since absorb_second
assumes that t no longer has any aliases (it expects that absorb_first got rid of such an
alias), it will drop the T. But that T is still in the “other” data copy, and may still get
accessed by readers, who will then be accessing a dropped value, which is unsound.
While it might seem like it is easy to ensure that absorb_first and absorb_second do the
same thing, it is not. A good example of this is non-deterministic (likely malicious)
implementations of traits that you’d expect to be deterministic like Hash or Eq. Imagine
someone writes an implementation like:
use std::sync::atomic::{AtomicBool, Ordering::SeqCst};
static SNEAKY: AtomicBool = AtomicBool::new(false);

#[derive(Eq, Hash)]
struct Sneaky(Vec<u8>);
impl PartialEq for Sneaky {
    fn eq(&self, other: &Self) -> bool {
        // If other code has set SNEAKY to true, report "not equal" once
        // (and reset the flag), even when the contents are identical.
        if SNEAKY.swap(false, SeqCst) {
            false
        } else {
            self.0 == other.0
        }
    }
}
Will your absorb_* calls still do the same thing? If the answer is no, then your
datastructure is unsound.
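To make the problem concrete, here is a small sketch built on the Sneaky type from above. The helper compare_twice is hypothetical; it arms the flag and then compares the same pair of values twice, the way absorb_first and absorb_second might.

```rust
use std::sync::atomic::{AtomicBool, Ordering::SeqCst};

static SNEAKY: AtomicBool = AtomicBool::new(false);

#[derive(Eq, Hash)]
struct Sneaky(Vec<u8>);

impl PartialEq for Sneaky {
    fn eq(&self, other: &Self) -> bool {
        // Lies once whenever the flag has been armed.
        if SNEAKY.swap(false, SeqCst) {
            false
        } else {
            self.0 == other.0
        }
    }
}

/// Compares the same two equal values twice, arming the flag beforehand.
fn compare_twice() -> (bool, bool) {
    let a = Sneaky(vec![1, 2, 3]);
    let b = Sneaky(vec![1, 2, 3]);
    SNEAKY.store(true, SeqCst);
    let first = a == b; // the flag is set, so this comparison lies
    let second = a == b; // the flag was reset, so this one is honest
    (first, second)
}
```

The two comparisons disagree even though nothing about the values changed. A lookup-then-remove in absorb_first could therefore miss an entry that the same lookup in absorb_second finds.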
Every datastructure is different, so it is difficult to give good advice on how to achieve
determinism. My general advice is to never call user-defined methods in absorb_second. Call
them in absorb_first, and use the &mut O to stash the results in the oplog itself. That way,
in absorb_second, you can use those cached values instead. This may be hard to pull off for
complicated datastructures, but it does tend to guarantee determinism.
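One way to picture the stashing advice is sketched below. Everything here is hypothetical: Store, Op, its cached_index field, CountingKey and cached_lookup_demo are illustrative names, and the two methods only loosely mirror Absorb::absorb_first and Absorb::absorb_second (they are inherent methods and leave out the reference to the other copy). The key type counts calls to eq so that the absence of user code in the second phase can be observed.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Key whose `eq` counts its invocations; a stand-in for arbitrary user code.
struct CountingKey {
    id: u32,
    eq_calls: Rc<Cell<usize>>,
}

impl PartialEq for CountingKey {
    fn eq(&self, other: &Self) -> bool {
        self.eq_calls.set(self.eq_calls.get() + 1);
        self.id == other.id
    }
}

/// Hypothetical oplog entry. `cached_index` starts out as `None` and is
/// filled in during the first phase.
enum Op<K> {
    Remove { key: K, cached_index: Option<usize> },
}

/// Hypothetical data structure: a flat list of keys.
struct Store<K> {
    items: Vec<K>,
}

impl<K: PartialEq> Store<K> {
    /// First phase: may call user code, and records what it decided.
    fn absorb_first(&mut self, op: &mut Op<K>) {
        match op {
            Op::Remove { key, cached_index } => {
                let found = self.items.iter().position(|k| *k == *key);
                *cached_index = found;
                if let Some(i) = found {
                    self.items.remove(i);
                }
            }
        }
    }

    /// Second phase: replays the recorded decision and never calls `eq`.
    fn absorb_second(&mut self, op: Op<K>) {
        match op {
            Op::Remove { cached_index, .. } => {
                if let Some(i) = cached_index {
                    self.items.remove(i);
                }
            }
        }
    }
}

/// Returns (eq calls after phase one, eq calls after phase two,
/// remaining items in the first copy, remaining items in the second copy).
fn cached_lookup_demo() -> (usize, usize, usize, usize) {
    let calls = Rc::new(Cell::new(0));
    let key = |id| CountingKey { id, eq_calls: Rc::clone(&calls) };

    // Two copies with identical contents, as the two data copies would have.
    let mut first_copy = Store { items: vec![key(1), key(2), key(3)] };
    let mut second_copy = Store { items: vec![key(1), key(2), key(3)] };

    let mut op = Op::Remove { key: key(2), cached_index: None };
    first_copy.absorb_first(&mut op);
    let after_first = calls.get();
    second_copy.absorb_second(op);
    let after_second = calls.get();

    (after_first, after_second, first_copy.items.len(), second_copy.items.len())
}
```

Because the second phase only replays an index, a misbehaving eq cannot make the two copies diverge. The trade-off is that the cached index is only meaningful if both copies are in the same state when the operation is applied, which is the same determinism assumption the rest of this section relies on.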
If that is unrealistic, mark the constructor for your data structure as unsafe, with a safety
comment that says that the inner types must have deterministic implementations of certain
traits. It’s not ideal, but hopefully consumers know what types they are using your
datastructures with, and will be able to check that their implementations are indeed not
malicious.
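A minimal sketch of what such a constructor could look like follows. AliasedSet is a hypothetical type invented for this illustration, and the wording of the safety section is only a suggestion.

```rust
use std::hash::Hash;

/// Hypothetical data structure that relies on `K`'s `Hash` and `Eq`
/// in both of its absorb phases.
pub struct AliasedSet<K> {
    keys: Vec<K>,
}

impl<K: Eq + Hash> AliasedSet<K> {
    /// Creates an empty set.
    ///
    /// # Safety
    ///
    /// `K`'s implementations of `Hash` and `Eq` must be deterministic:
    /// given the same inputs they must always produce the same result.
    /// Otherwise the two absorb phases may drop different values.
    pub unsafe fn new() -> Self {
        AliasedSet { keys: Vec::new() }
    }

    /// Returns true if the set holds no keys.
    pub fn is_empty(&self) -> bool {
        self.keys.is_empty()
    }
}
```

A caller then has to write `unsafe { AliasedSet::<u32>::new() }`, which puts the obligation to check the key type's trait implementations at the call site.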
Unsafe casting
The instructions above say to cast from
&mut DataStructure<Aliased<T, D>>
to
&mut DataStructure<Aliased<T, D2>>
That cast is unsafe, and rightly so! While it is likely that the cast is safe, that is far
from obvious, and it’s worth spending some time on why, since it has implications for how you
use Aliased in your own crate.
The cast is only sound if the two types are laid out exactly the same in memory, but that is
harder to guarantee than you might expect. The Rust compiler does not guarantee that
A<Aliased<T>> and A<T> are laid out the same in memory for any arbitrary A, even if both A
and Aliased are #[repr(transparent)]. The primary reason for this is associated types.
Imagine that I write this code:
trait Wonky { type Weird; }
struct A<T: Wonky>(T, T::Weird);
impl<T> Wonky for Aliased<T> { type Weird = u32; }
impl<T> Wonky for T { type Weird = u16; }
Clearly, these types will end up with different memory layouts, since one will contain a u32
and the other a u16 (let’s ignore the fact that this particular example requires
specialization). This, in turn, means that it is not generally safe to transmute between a
wrapper around one type and that same wrapper around a different type with the same layout! You
can see this discussed in far more detail here if you’re curious:
https://github.com/jonhoo/rust-evmap/pull/83#issuecomment-735504638
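The same effect can be reproduced on stable Rust by swapping the blanket impls for concrete ones. In the sketch below, Wrapper is a hypothetical #[repr(transparent)] stand-in for Aliased, and sizes is a helper added only to compare the two instantiations.

```rust
use std::mem::size_of;

trait Wonky {
    type Weird;
}

/// Hypothetical transparent wrapper standing in for `Aliased<T>`.
#[allow(dead_code)]
#[repr(transparent)]
struct Wrapper<T>(T);

// Concrete impls instead of the overlapping blanket impls from the text,
// so that no specialization is needed.
impl Wonky for u8 {
    type Weird = u16;
}
impl Wonky for Wrapper<u8> {
    type Weird = u32;
}

#[allow(dead_code)]
struct A<T: Wonky>(T, T::Weird);

/// Sizes of `A<u8>` and `A<Wrapper<u8>>`, in that order.
fn sizes() -> (usize, usize) {
    (size_of::<A<u8>>(), size_of::<A<Wrapper<u8>>>())
}
```

Even though Wrapper<u8> and u8 have identical layouts, the two instantiations of A differ in size, so a cast between references to them would be unsound.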
Now, if we can find a way to guarantee that the types have the same layout, this problem
changes, but how might we go about this? Our saving grace is that we are casting between
A<Aliased<T, D>> and A<Aliased<T, D2>> where we control both D and D2. If we ensure that both
those types are private, there is no way for any code external to your crate to implement a
trait for one type but not the other. And thus there’s no way (that I know of) to make it
unsound to cast between the types!
Now, I only say that this is likely sound because the language does not actually give this as a guarantee at the moment, though wiser minds seem to suggest that this might be okay.
But this warrants repeating: your D types for Aliased must be private.
Structs
Aliased: A T that is aliased.
Traits
DropBehavior: Dictates the dropping behavior for the implementing type when used with Aliased.