Module left_right::aliasing

Types that aid in aliasing values across the two left-right data copies.

This module primarily revolves around the Aliased type and its associated DropBehavior trait. The basic flow of using it is as follows.

In general, each value in your data structure should be stored wrapped in an Aliased, with an associated type D that has DropBehavior::DO_DROP set to false. In Absorb::absorb_first, you then simply drop any removed Aliased<T, D> as normal. The backing T will not be dropped.

In Absorb::absorb_second, you first cast your datastructure from

&mut DataStructure<Aliased<T, D>>

to

&mut DataStructure<Aliased<T, D2>>

where <D2 as DropBehavior>::DO_DROP is true. This time, any Aliased<T, D2> that you drop will drop the inner T, but this should be safe since the only other alias was dropped in absorb_first. This is where the invariant that absorb_* is deterministic becomes extremely important!
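As a rough illustration of the flow above, here is a self-contained sketch that does not use the real left_right types. MiniAliased, NoDrop, DoDrop, Counted, and drop_counts are hypothetical names invented for this example; only the DropBehavior trait name and its DO_DROP constant mirror this module. For brevity it re-aliases a single value with a different drop behavior rather than casting a whole data structure.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::mem::MaybeUninit;
use std::rc::Rc;

/// Stand-in for this module's `DropBehavior` trait (same constant name).
trait DropBehavior {
    const DO_DROP: bool;
}

/// Hypothetical marker for the first absorb pass: never drops `T`.
struct NoDrop;
/// Hypothetical marker for the second absorb pass: drops `T`.
struct DoDrop;

impl DropBehavior for NoDrop {
    const DO_DROP: bool = false;
}
impl DropBehavior for DoDrop {
    const DO_DROP: bool = true;
}

/// Hypothetical, simplified stand-in for `Aliased<T, D>`.
#[repr(transparent)]
struct MiniAliased<T, D: DropBehavior> {
    inner: MaybeUninit<T>,
    _marker: PhantomData<D>,
}

impl<T, D: DropBehavior> MiniAliased<T, D> {
    fn new(value: T) -> Self {
        MiniAliased { inner: MaybeUninit::new(value), _marker: PhantomData }
    }

    /// Creates a second handle to the same `T` with drop behavior `D2`.
    ///
    /// # Safety
    /// At most one alias of a value may ever be dropped with
    /// `DO_DROP == true`, and no alias may be used after that drop.
    unsafe fn alias<D2: DropBehavior>(&self) -> MiniAliased<T, D2> {
        MiniAliased {
            // Bitwise copy; `T` itself is not cloned.
            inner: unsafe { std::ptr::read(&self.inner) },
            _marker: PhantomData,
        }
    }
}

impl<T, D: DropBehavior> Drop for MiniAliased<T, D> {
    fn drop(&mut self) {
        if D::DO_DROP {
            // SAFETY: `inner` was initialized in `new`, and the contract of
            // `alias` says only one dropping alias exists.
            unsafe { self.inner.assume_init_drop() }
        }
    }
}

/// Test helper that counts how many times it has been dropped.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the drop count after the "first pass" and after the "second pass".
fn drop_counts() -> (usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let first_copy: MiniAliased<Counted, NoDrop> =
        MiniAliased::new(Counted(Rc::clone(&drops)));
    // SAFETY: `first_copy` is `NoDrop`, so `second_copy` is the only alias
    // that will ever drop the value.
    let second_copy: MiniAliased<Counted, DoDrop> = unsafe { first_copy.alias() };

    drop(first_copy); // like absorb_first: the backing value survives
    let after_first = drops.get();
    drop(second_copy); // like absorb_second: the backing value is dropped
    let after_second = drops.get();
    (after_first, after_second)
}
```

Calling drop_counts() reports no drops after the first handle goes away and exactly one after the second, which is the behavior the two absorb passes rely on.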

Sounds nice enough, right? Well, you have to be really careful when working with this type. There are two primary things to watch out for:

Mismatched dropping

If absorb_first and absorb_second do not drop exactly the same aliased values for a given operation from the oplog, unsoundness ensues. Specifically, what will happen is that absorb_first does not drop some aliased t: T, but absorb_second does. Since absorb_second assumes that t no longer has any aliases (it expects that absorb_first got rid of such an alias), it will drop the T. But that T is still in the “other” data copy, and may still get accessed by readers, who will then be accessing a dropped value, which is unsound.

While it might seem like it is easy to ensure that absorb_first and absorb_second do the same thing, it is not. A good example of this is non-deterministic (likely malicious) implementations of traits that you’d expect to be deterministic like Hash or Eq. Imagine someone writes an implementation like:

use std::sync::atomic::{AtomicBool, Ordering::SeqCst};
static SNEAKY: AtomicBool = AtomicBool::new(false);

#[derive(Eq, Hash)]
struct Sneaky(Vec<u8>);
impl PartialEq for Sneaky {
    fn eq(&self, other: &Self) -> bool {
        if SNEAKY.swap(false, SeqCst) {
            false
        } else {
            self.0 == other.0
        }
    }
}

Will your absorb_* calls still do the same thing? If the answer is no, then your datastructure is unsound.
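To make the risk concrete, the sketch below repeats the Sneaky type (without the derives, which are not needed here) and runs the same lookup twice. The find_twice helper is hypothetical and only stands in for "the same search performed once per absorb pass"; it is not part of left_right.

```rust
use std::sync::atomic::{AtomicBool, Ordering::SeqCst};

static SNEAKY: AtomicBool = AtomicBool::new(false);

struct Sneaky(Vec<u8>);

impl PartialEq for Sneaky {
    fn eq(&self, other: &Self) -> bool {
        // Lies exactly once whenever the flag has been set.
        if SNEAKY.swap(false, SeqCst) {
            false
        } else {
            self.0 == other.0
        }
    }
}

/// Hypothetical helper: performs the same lookup twice, the way two absorb
/// passes would, with the flag flipped in between.
fn find_twice() -> (bool, bool) {
    let stored = vec![Sneaky(vec![1, 2, 3])];
    let needle = Sneaky(vec![1, 2, 3]);

    let found_in_first_pass = stored.contains(&needle);
    SNEAKY.store(true, SeqCst); // something flips the flag between passes
    let found_in_second_pass = stored.contains(&needle);

    (found_in_first_pass, found_in_second_pass)
}
```

The first lookup finds the value and the second does not, so a removal keyed on this comparison would drop an alias in one pass but not in the other.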

Every datastructure is different, so it is difficult to give good advice on how to achieve determinism. My general advice is to never call user-defined methods in absorb_second. Call them in absorb_first, and use the &mut O to stash the results in the oplog itself. That way, in absorb_second, you can use those cached values instead. This may be hard to pull off for complicated datastructures, but it does tend to guarantee determinism.

If that is unrealistic, mark the constructor for your data structure as unsafe, with a safety comment that says that the inner types must have deterministic implementations of certain traits. It’s not ideal, but hopefully consumers know what types they are using your datastructures with, and will be able to check that their implementations are indeed not malicious.
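One possible shape for the "stash the results in the oplog" advice is sketched below, using plain Vecs in place of the two data copies. RemoveOp, first_pass, second_pass, Flaky, and remaining_ids are all hypothetical names for this illustration, and nothing here is aliased; the point is only that the second pass replays a decision instead of recomputing it.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical oplog entry: remove `key`, with room to cache the position
/// that the first pass decided on.
struct RemoveOp<K> {
    key: K,
    cached_position: Option<usize>,
}

/// First pass: may call user-defined `==`, and records what it decided.
fn first_pass<K: PartialEq>(copy: &mut Vec<K>, op: &mut RemoveOp<K>) {
    op.cached_position = copy.iter().position(|k| *k == op.key);
    if let Some(index) = op.cached_position {
        copy.remove(index);
    }
}

/// Second pass: there is no `PartialEq` bound, so it cannot call the
/// user-defined comparison even by accident. It only replays the cache.
fn second_pass<K>(copy: &mut Vec<K>, op: RemoveOp<K>) {
    if let Some(index) = op.cached_position {
        copy.remove(index);
    }
}

/// Hypothetical key whose `==` is honest only while `honest` is set.
struct Flaky {
    id: u32,
    honest: Rc<Cell<bool>>,
}

impl PartialEq for Flaky {
    fn eq(&self, other: &Self) -> bool {
        self.honest.get() && self.id == other.id
    }
}

/// Applies one removal to two identical copies and returns the ids left in each.
fn remaining_ids() -> (Vec<u32>, Vec<u32>) {
    let honest = Rc::new(Cell::new(true));
    let key = |id: u32| Flaky { id, honest: Rc::clone(&honest) };

    let mut left = vec![key(1), key(2), key(3)];
    let mut right = vec![key(1), key(2), key(3)];
    let mut op = RemoveOp { key: key(2), cached_position: None };

    first_pass(&mut left, &mut op);
    honest.set(false); // from here on, `==` always answers `false`
    second_pass(&mut right, op);

    (
        left.iter().map(|k| k.id).collect(),
        right.iter().map(|k| k.id).collect(),
    )
}
```

Even though the comparison starts lying between the passes, both copies end up with the same elements, because only the first pass ever consulted it. Dropping the trait bound from the second-pass function is a cheap way to have the compiler enforce the rule.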

Unsafe casting

The instructions above say to cast from

&mut DataStructure<Aliased<T, D>>

to

&mut DataStructure<Aliased<T, D2>>

That cast is unsafe, and rightly so! While it is likely that the cast is safe, that is far from obvious, and it’s worth spending some time on why, since it has implications for how you use Aliased in your own crate.

The cast is only sound if the two types are laid out exactly the same in memory, but that is harder to guarantee than you might expect. The Rust compiler does not guarantee that A<Aliased<T>> and A<T> are laid out the same in memory for any arbitrary A, even if both A and Aliased are #[repr(transparent)]. The primary reason for this is associated types. Imagine that I write this code:

trait Wonky { type Weird; }
struct A<T: Wonky>(T, T::Weird);

impl<T> Wonky for Aliased<T> { type Weird = u32; }
impl<T> Wonky for T { type Weird = u16; }

Clearly, these types will end up with different memory layouts, since one will contain a u32 and the other a u16 (let’s ignore the fact that this particular example requires specialization). This, in turn, means that it is not generally safe to transmute between a wrapper around one type and that same wrapper around a different type with the same layout! You can see this discussed in far more detail here if you’re curious:

https://github.com/jonhoo/rust-evmap/pull/83#issuecomment-735504638
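The associated-type problem can be reproduced on stable Rust without specialization by using two distinct inner types. In this sketch, Wrapper, Plain, and AliasedLike are hypothetical stand-ins for A, T, and Aliased<T>; the two inner types have identical layout, yet the wrappers around them do not.

```rust
use std::mem::size_of;

trait Wonky {
    type Weird;
}

/// Stand-in for the generic wrapper `A` from the text.
struct Wrapper<T: Wonky>(T, T::Weird);

/// Two single-field newtypes with identical layout.
#[repr(transparent)]
struct Plain(u8);
#[repr(transparent)]
struct AliasedLike(u8);

impl Wonky for Plain {
    type Weird = u16;
}
impl Wonky for AliasedLike {
    type Weird = u32;
}

/// Sizes of the two inner types, which match.
fn inner_sizes() -> (usize, usize) {
    (size_of::<Plain>(), size_of::<AliasedLike>())
}

/// Sizes of the wrapper around each inner type, which do not match.
fn wrapper_sizes() -> (usize, usize) {
    (size_of::<Wrapper<Plain>>(), size_of::<Wrapper<AliasedLike>>())
}
```

Because the wrapper sizes differ, transmuting between Wrapper<Plain> and Wrapper<AliasedLike> would be unsound even though transmuting between the inner types alone would be fine.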

Now, if we can find a way to guarantee that the types have the same layout, this problem changes, but how might we go about this? Our saving grace is that we are casting between A<Aliased<T, D>> and A<Aliased<T, D2>> where we control both D and D2. If we ensure that both those types are private, there is no way for any code external to your crate to implement a trait for one type but not the other. And thus there’s no way (that I know of) to make it unsound to cast between the types!

Now, I only say that this is likely sound because the language does not actually give this as a guarantee at the moment, though wiser minds seem to suggest that this might be okay.

But this warrants repeating: your D types for Aliased must be private.
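A minimal sketch of what keeping the D types private could look like, with a Vec standing in for the data structure. Everything here (the store module, MiniAliased, NoDrop, DoDrop, as_dropping, Copies, Counted) is hypothetical and simplified, and the pointer cast rests on the same "likely sound, not guaranteed" layout assumption discussed above.

```rust
use std::cell::Cell;
use std::rc::Rc;

mod store {
    use std::marker::PhantomData;
    use std::mem::MaybeUninit;

    // The trait and both markers are private to this module, so outside
    // code cannot implement a trait for one marker but not the other.
    trait DropBehavior { const DO_DROP: bool; }
    struct NoDrop;
    struct DoDrop;
    impl DropBehavior for NoDrop { const DO_DROP: bool = false; }
    impl DropBehavior for DoDrop { const DO_DROP: bool = true; }

    // Hypothetical, simplified stand-in for `Aliased<T, D>`.
    #[repr(transparent)]
    struct MiniAliased<T, D: DropBehavior> {
        inner: MaybeUninit<T>,
        _marker: PhantomData<D>,
    }

    impl<T, D: DropBehavior> Drop for MiniAliased<T, D> {
        fn drop(&mut self) {
            if D::DO_DROP {
                // SAFETY: callers keep at most one dropping alias per value.
                unsafe { self.inner.assume_init_drop() }
            }
        }
    }

    /// Reinterprets a non-dropping copy as a dropping one.
    ///
    /// # Safety
    /// Assumes both `Vec` types share a layout (not guaranteed by the
    /// language), and that the other copy has already let go of every
    /// value that will be removed through the returned reference.
    unsafe fn as_dropping<T>(
        copy: &mut Vec<MiniAliased<T, NoDrop>>,
    ) -> &mut Vec<MiniAliased<T, DoDrop>> {
        let ptr = copy as *mut Vec<MiniAliased<T, NoDrop>>;
        unsafe { &mut *ptr.cast::<Vec<MiniAliased<T, DoDrop>>>() }
    }

    /// Public face of the sketch; neither marker appears in its API.
    pub struct Copies<T> {
        first: Vec<MiniAliased<T, NoDrop>>,
        second: Vec<MiniAliased<T, NoDrop>>,
    }

    impl<T> Copies<T> {
        pub fn new() -> Self {
            Copies { first: Vec::new(), second: Vec::new() }
        }

        pub fn push(&mut self, value: T) {
            let original = MiniAliased { inner: MaybeUninit::new(value), _marker: PhantomData };
            // SAFETY: both handles are `NoDrop`; `pop` and `Drop` below
            // make sure each value is dropped exactly once.
            let alias = MiniAliased {
                inner: unsafe { std::ptr::read(&original.inner) },
                _marker: PhantomData,
            };
            self.first.push(original);
            self.second.push(alias);
        }

        pub fn pop(&mut self) {
            // First pass: the removed handle is `NoDrop`, so `T` survives.
            self.first.pop();
            // Second pass: the same removal through the dropping view drops `T`.
            let dropping = unsafe { as_dropping(&mut self.second) };
            dropping.pop();
        }
    }

    impl<T> Drop for Copies<T> {
        fn drop(&mut self) {
            // Drop each remaining value exactly once, through one copy only.
            let dropping = unsafe { as_dropping(&mut self.second) };
            dropping.clear();
        }
    }
}

/// Test helper that counts how many times it has been dropped.
struct Counted(Rc<Cell<usize>>);
impl Drop for Counted {
    fn drop(&mut self) { self.0.set(self.0.get() + 1); }
}

/// Returns the drop count after one `pop` and after tearing everything down.
fn drops_after_pop_and_teardown() -> (usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let mut copies = store::Copies::new();
    copies.push(Counted(Rc::clone(&drops)));
    copies.push(Counted(Rc::clone(&drops)));
    copies.pop();
    let after_pop = drops.get();
    drop(copies);
    (after_pop, drops.get())
}
```

Code outside the store module can name Copies but not NoDrop, DoDrop, or the trait, so it has no way to attach differing associated types to the two markers.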

Structs

Aliased

A T that is aliased.

Traits

DropBehavior

Dictates the dropping behavior for the implementing type when used with Aliased.