Natural aliasing model
Draft 2025-07-07
Motivation
I have, in a sense, conflicting feelings about Rust. In my opinion it is the most expressive compiled language I've yet seen as of 2025. It is really a miracle that such a complicated programming language became mainstream; it is proof that a language's complexity can be beneficial, up to the point of defining its public image. However, I can't get rid of the occasional feeling that some suboptimal decisions were made in Rust's development. Furthermore, Rust's aim at everlasting stability makes me more sensitive to such decisions.
More than a year after my initial suspicions, I have now found a way to substantiate some of my alternative vision of the language's type system. In this text I'll touch upon several aspects of it:

- Why and how `&Cell` is a natural extension of mutable borrows;
- An alternative approach to thread-safety, more general than `Send`;
- Why and how `Send` futures may contain `!Send` types like `Rc`;
- Why and how the hypothesized `Forget` marker trait does and does not prevent memory leaks and use-after-free bugs;
- The general role of more or less copyable types in Rust's type system;
- Self-referential types;
- Etc.
To put it simply: this text is all about the abstraction of memory aliasing. Although I am not a good writer, I've tried to explain things in a manner familiar to a Rust programmer. Nonetheless, due to my lack of experience, I expect this text to contain a good amount of flaws and errors.
Introduction
Let's first focus on the power of `Cell`.

In the usual memory-safe languages (Java, JavaScript, Python, etc.) objects are conceptualized as arbitrarily aliased pointers with reference counting or GC, much like `Rc<Cell<T>>`. I have found this approach to lack the control needed to manage the complexity of those object semantics. Rust grants me this control with lifetimes and a sophisticated library of generic types.
My grudge with the object semantics of other memory-safe languages comes down to this:

```js
a.c = 13
b.c = 42
assert(a.c == 13) // may fail if `b = a`
```
I find this failing code very unintuitive, because I assume that the names `a` and `b` represent two distinct entities. But in JavaScript such aliasing is permitted. It does, however, become intuitive once I am aware that such aliasing is taking place:
```js
b = a
a.c = 13
b.c = 42
assert(a.c != 13)
```
Rust allows us to distinguish between aliased and unaliased borrows:
```rust
// compare usual code
fn assert_unaliased(a: &mut S, b: &mut S) {
    a.c = 13;
    b.c = 42;
    assert_eq!(a.c, 13); // won't fail
}

fn assert_maybe_aliased(a: &Cell<S>, b: &Cell<S>) {
    a.set(S { c: 13, ..a.get() });
    b.set(S { c: 42, ..b.get() });
    assert_eq!(a.get().c, 13); // may fail
}
```
To achieve this, Rust makes mutable borrows uncopyable, ensuring a mutable borrow is aliased in a context exclusively under one variable's name. This rule relates to the second JS case, where we were aware of aliasing taking place, as it nails down the aliasing information in at least one important way. But what if it were more than one way?
Simple aliasing
Consider adding a marker lifetime to the `Cell<'a, T>` type to establish aliasing at the type level. Although I am simplifying, it now becomes possible to express aliasing requirements like:
```rust
fn assert_aliased<'a>(a: &Cell<'a, S>, b: &Cell<'a, S>) {
    a.set(S { c: 13, ..a.get() });
    b.set(S { c: 42, ..b.get() });
    assert_eq!(a.get().c, 42); // changed to check that `a` contains the value from `b`; won't fail
}
```
The same marker lifetime establishes that these cells alias the same memory region; the compiler would complain otherwise, provided such a `Cell` is designed properly (as `GhostCell` is). This syntax essentially expresses the notion "whatever you put into `a` or `b`, you will get back from `a` and `b`", for aliasing references `a` and `b`. However this notion is indivisible: you cannot look at only one of the two variables without knowing what happens to the other.
Instead of picking memory regions at random, programmers rely on memory allocators to ensure their memory is generally unaliased. Aliasing information is essential for developing a reasonable program. Nonetheless I will now almost contradict myself: you can absolutely define a reasonable subroutine working on aliased memory, although to do that you have to make clear to the user what you are doing. Part of that is the understanding that the `&Cell`s outside of the subroutine call aren't used until the subroutine returns.
Better `Send`
This view comes with a cool consequence: an alternative definition of thread-safe/unsafe types. It would be safe to send a value across a thread boundary only if its aliased memory region isn't aliased anywhere else. To avoid talking about plain borrows, consider an `Rc<'a, T>` implemented using the new `Cell<'a, usize>` as a reference counter. It is safe to send `a: Rc<'a, T>` to another thread if there isn't any other `b: Rc<'a, T>` left on the old thread. But more than that: if there is another `b: Rc<'a, T>`, we could still send both of them, `(a, b)`, across threads. I have found the type annotation for higher-ranked lifetimes, `(a, b): for<'a> (Rc<'a, T>, Rc<'a, T>)`, although formally ambiguous, to be quite fitting.
Now you can see for yourself why `&mut T` would be just a non-copyable version of `for<'a> &Cell<'a, T>`.
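The standard library already hints at this correspondence: `Cell::from_mut` freezes a `&mut T` into an aliasable `&Cell<T>`, after which copies of the reference behave exactly like the aliased cells from the earlier example.

```rust
use std::cell::Cell;

fn demo(a: &mut u32) {
    let cell: &Cell<u32> = Cell::from_mut(a); // &mut T -> &Cell<T>
    let (b, c) = (cell, cell); // shared references are freely copyable
    b.set(13);
    c.set(42);
    assert_eq!(b.get(), 42); // `b` and `c` visibly alias
}
```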
From this we could even recover the original `Send`-oriented design. A `!Send` implementation on a type essentially says that the utilized memory region may be aliased (non-atomically, without synchronization) from the current thread. This stems from the assumption that a function body always executes on the same thread until it finishes. That assumption is the reason for some limitations around `Send` on stackless (async blocks) and stackful coroutines.
This also allows storing `!Send` types in thread locals, which then becomes the evident cornerstone of the problems with async and `Send`. The solution to that problem would be to abstract the assumption into a type: say, a `ThreadLocalKey<'a>` zero-sized type that would allow thread-unsafe access to thread locals. But you should never be able to prove that the `'a` aliasing lifetime does not occur somewhere else, so you would never be able to send such a key across threads. Any function requiring thread-unsafe access to thread locals would have to receive this type through its arguments. This would then be reflected in the function signature, which would indicate whether the function body is sendable across threads or not.
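To make this concrete, here is a minimal sketch of what such a key could look like. Everything below is my assumption for illustration; no `ThreadLocalKey` exists in today's std, and `'a` stands for the aliasing lifetime described above.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

// Zero-sized witness of thread-unsafe access to the current thread's locals.
// Since the `'a` aliasing lifetime can never be proven unaliased, a value of
// this type can never be sent to another thread.
#[derive(Clone, Copy)]
pub struct ThreadLocalKey<'a> {
    _alias: PhantomData<&'a ()>,
    _not_send: PhantomData<*mut ()>, // !Send + !Sync
}

thread_local! {
    static COUNTER: Cell<u32> = const { Cell::new(0) };
}

// The requirement for thread-unsafe access now shows up in the signature:
// a caller can only pass the key if it holds one itself.
fn bump_counter(_key: ThreadLocalKey<'_>) -> u32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}
```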
This way you could imagine a `Future` receiving a `ThreadLocalKey<'a>` through its `poll` method, which explains why storing any thread-unsafe type `T: 'a` should make the compiler assume the future is thread-unsafe as a whole. Unless, that is, the future's internal structure contains only types with `for<'a>`-bounded aliasing lifetimes!
You should notice that the thread-safety property of a type can now be defined solely from the type's boundary, i.e. its safe public interface. I will name this rule the fundamental aliasing rule; although pretentious, in the context of our theory it is worth its name.
Unfortunately it's not possible to realize such thread-safety checking in the type system today. It would require extending the capabilities of lifetimes, potentially even allowing self-referential types to be defined in a safe way, or even introducing another kind of aliasing lifetime.
Compound aliasing and borrows
On that note, this analogously explains why regular lifetimes inside an async block are "squashed" to `'static` from the outside perspective: such lifetimes simply aren't reflected in the future's type boundary.
But to dive a bit deeper, we have to develop this connection between borrows and aliasing further. What does (re)borrowing actually mean? To see, let's investigate the difference between two aliasing cell references and one mutable reborrow of a mutable reference:
```rust
// notice the symmetry between `a` and `b`
fn assert_aliased_cell<'a>(a: &Cell<'a, S>, b: &Cell<'a, S>) {
    a.set(S { c: 13, ..a.get() });
    b.set(S { c: 42, ..b.get() });
    assert_eq!(a.get().c, 42); // ok!
}

fn assert_aliased_mut(a: &mut S) {
    a.c = 13;
    let b = &mut *a; // reborrow
    b.c = 42;
    assert_eq!(a.c, 42); // obviously ok!
}

// what if we swap `a` and `b`?
// now notice the antisymmetry
fn assert_aliased_mut_bad(b: &mut S) {
    let a = &mut *b; // reborrow
    a.c = 13;
    b.c = 42; // compiler error!
    assert_eq!(a.c, 42);
}
```
So it looks like it isn't actually correct to call mutable references unique. Rather, mutable borrows allow aliasing in a compound fashion. Take the `assert_aliased_mut` example: from `a`'s perspective `b` aliases it, while from `b`'s point of view nothing aliases it at the moment; it is exclusive. At that moment it is as reasonable to look at `b` alone as it is to look at both `a` and `b`, while considering only `a` won't tell you much about the program's behavior. In this sense `a`'s aliasing info is included in `b`'s aliasing info.
Immutable borrows
Immutable borrows allow us to worry less about aliasing. Rather, restricting the mutability of a reference allows us to disregard any aliasing information on that borrow. That is, the aliasing information on an immutable borrow is quite trivial, limited to the compound aliasing of the mutable references it borrows from. An even more trivial case would be a `static SOME_REFERENCE: &'static T = &T {}`; static immutable references are ideally what a programmer would like to see. This is the kind of aliasing functional programming languages use, where every variable can be interpreted "at face value".
Allocations and `Forget`
So what about a `Box` we would only read from? Would that be the same as a static immutable reference? Obviously not. If you've got a hang of Rust, you might draw a comparison between mutable borrows and memory allocators. In a sense, a memory allocation is a borrow of the memory allocator, or rather of a part of its managed memory. That's why it's sometimes more compelling to implement custom memory allocators using mutable borrows instead of some kind of `Box` type, as bumpalo does. The only difference between a `Box` and a mutable borrow is the party responsible for the bookkeeping: either the compiler or the runtime. However, if something isn't handled by the compiler, it becomes syntactically invisible to us, which explains why memory leaks are considered safe. As part of that, the function `std::mem::forget` allows anything to escape syntax and render its runtime effects invisible.
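As a minimal sketch of that comparison, here is an allocator that is literally a mutable borrow: allocation splits a piece off a borrowed buffer, so the compiler does the bookkeeping. This illustrates the pattern only; it is not bumpalo's actual API.

```rust
use std::mem::MaybeUninit;

// Allocate by splitting a mutable borrow of preallocated memory; each
// returned `&mut T` is a compiler-tracked, unaliased part of the buffer.
fn alloc<'a, T>(buf: &mut &'a mut [MaybeUninit<T>], value: T) -> &'a mut T {
    let slice = std::mem::take(buf);
    let (slot, rest) = slice.split_first_mut().expect("arena exhausted");
    *buf = rest;
    slot.write(value)
}

fn main() {
    let mut storage = [MaybeUninit::<u32>::uninit(); 8];
    let mut arena = &mut storage[..];
    let a = alloc(&mut arena, 13);
    let b = alloc(&mut arena, 42);
    assert_eq!(*a + *b, 55); // two live, unaliased allocations
    // note: nothing runs drop for the values; the arena silently
    // "forgets" them, which is exactly the leak discussed below
}
```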
In order to guarantee the absence of memory leaks, the compiler should be aware of this kind of aliasing information too, just as it is for `&mut T`. This entails the style of API used by the aforementioned memory allocators and arenas, maybe with some portion of runtime bookkeeping via a custom `Box` type with a lifetime. This is where the hypothetical `Forget` trait comes to the rescue. While it was satisfying to realize that `Forget` was tightly involved with lifetimes, its lack of connection to memory leaking was uncanny. But now there's an answer: it comes from the allocator interface design. If allocation weren't a free function, but were designed as explained above, `!Forget` would have prevented those leaks. Noticeably, if you consider the rule that the aliasing information of a type is closed under its public interface, it would be fine to forget allocations if we also forget the allocator itself.

Although that warrants the question: "wouldn't the allocator need to allocate its memory from somewhere in order to hold it?" The answer is yes. An allocator is by definition the way of tracking unaliased memory, so for every allocator we should establish that there are no intersections between allocators, for which we need an allocator. This leaves us to conclude that there has to be a chain of runtime and compile-time allocators, with a primordial allocator at the beginning. I'll argue this primordial allocator is your consistent decision on the division of memory between allocators, possibly leaving a single runtime allocator over the entire physical memory.
`Copy`

Another funny thing to consider is the absence of a `Copy` impl on a type as being closed under its API. That wouldn't make much sense for actual pointers, until we consider pointers as indices. Global memory can be thought of as a singleton byte array that we index using load and store instructions. And in reverse, if we ever consider indices to be pointers, with multiple memories, copying a whole memory region keeps the stored pseudo-pointers valid. But alas, I find this thinking still too unclear to implement.
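A hypothetical illustration of this "indices as pointers" view: handles into a vector-backed memory stay valid even after the whole region is copied. All names below are made up for the example.

```rust
// Copying the whole "memory" keeps every stored pseudo-pointer (index) valid.
#[derive(Clone, Copy)]
struct Handle(usize);

#[derive(Clone)]
struct Memory {
    cells: Vec<u32>,
}

impl Memory {
    fn store(&mut self, v: u32) -> Handle {
        self.cells.push(v);
        Handle(self.cells.len() - 1)
    }
    fn load(&self, h: Handle) -> u32 {
        self.cells[h.0]
    }
}

fn main() {
    let mut a = Memory { cells: Vec::new() };
    let h = a.store(13);
    let b = a.clone(); // copy the entire memory region
    assert_eq!(a.load(h), b.load(h)); // `h` is valid in both copies
}
```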
Ownership and self-referential types
What is ownership, really? Coming from the section above, I hope you will consider the argument that it is about giving/taking something and giving/taking it back. In order to give something you have to take it first, and so in order to give something back, you need to take back what you gave. The first statement is about the composition of constructors: how the constructor of a structure utilizes its fields' constructors. But the second one is more interesting, as it stands for the composition of destructors. Rust automates destructor nesting largely through the implicit destruction of variables, although there is probably a fitting alternative. No matter; we can still make sense of it in a few new ways.
One way is to reexamine so-called self-referential types. Take the infamous doubly-linked list, for example. The list as a whole contains a collection of heap-allocated nodes, each with a value field and nullable next and previous pointers to the respective nodes. There is a consistent way of deallocating all of these nodes: recursively deallocate the tail of the sequence, and once we reach an empty next node, start deallocating the nodes themselves. It is just as if it were a singly-linked list without the previous-node pointer, which forms a tree of nodes. Usually deallocation of a doubly-linked list is handled with a loop instead, but that is the same as taking the tail out of the head node and relying on tail call optimization, as the sketch below illustrates.
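A minimal sketch of that destructor, written in the loop form described above (the non-owning `prev` pointers take no part in destruction, so they are omitted):

```rust
struct Node {
    value: u32,
    next: Option<Box<Node>>, // owning, `Box`-like edge of the ownership tree
}

struct List {
    head: Option<Box<Node>>,
}

impl Drop for List {
    fn drop(&mut self) {
        // take the tail out of each head, so no recursion is needed
        let mut next = self.head.take();
        while let Some(mut node) = next {
            next = node.next.take(); // detach before `node` is deallocated
        }
    }
}
```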
To some extent this way of thinking, converting types with reference cycles into a tree of references, is unavoidable, because of our conceptualization of ownership. At the very least it refines our thinking, letting us compose destructors and reason about them separately. This nested ownership of types may resonate with other aspects of the Rust language, even hypothetical ones like structured concurrency for async code.
Returning to the doubly-linked list, my suggestion for coming up with a safe syntax for self-referential types in this case would be to regard list nodes in two different modes: as a singly-linked list node, with the next pointer resembling a `Box`; and as a doubly-linked list node, with next and previous pointers as arbitrarily aliasing mutable borrows. At the top level, you would view the list of nodes in the second mode by creating a `&Cell` borrow of the list's head in singly-linked mode. This is kinda what `GhostCell` does already. This also sits well with my intuition about async blocks with references to other local variables, which has yet to be put on paper.
`Move` and runtime ownership tracking
I guess this is an appropriate place to mention that the program stack is also an allocator. Many uncomfortable consequences stem from this nuance, like restricting values from moving between scopes while borrowed. But it seems possible to somewhat alleviate this using a primitive like `RefOnce` or `&own T`, which I have found a use for in one of my libraries. This makes me think that, if stack frame allocation had a syntax with lifetimes, the inability to move a type would be expressed as the inability to place the type into something with a bigger lifetime. Otherwise you could witness that type in an outlived/invalid state, which `RefOnce` avoids by borrowing only the memory for that type to occupy.
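A minimal sketch of such a primitive, under my assumptions about its shape (the real `RefOnce` may differ): it borrows only the memory a value occupies, so the value can be moved out without moving the borrowed slot.

```rust
use std::mem::MaybeUninit;

// An owning reference into borrowed stack memory: the slot outlives the
// value, and `take` moves the value out exactly once.
struct RefOnce<'a, T> {
    slot: &'a mut MaybeUninit<T>,
}

impl<'a, T> RefOnce<'a, T> {
    fn new(slot: &'a mut MaybeUninit<T>, value: T) -> Self {
        slot.write(value);
        RefOnce { slot }
    }

    fn take(self) -> T {
        // SAFETY: `new` initialized the slot, and `take` consumes `self`,
        // so the value is read out exactly once.
        unsafe { self.slot.assume_init_read() }
    }
}
// (a full version would also implement `Drop` to destroy a value
// that was never taken)
```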
And again, back to `Forget`. One of this trait's flaws would have been the unclear semantics of what one type would require of another type to be forgettable. For example, an `Rc` can be placed into itself, introducing a reference cycle. To handle this, `Rc<'a, T>` with the aforementioned aliasing lifetime would have to be restricted from being put into itself, somehow using lifetimes to track down such a case. But it becomes obvious once we remember that `Rc` shifts the responsibility of tracking ownership to the runtime, which usually isn't aware of any of the syntactic scopes we keep in mind when thinking about ownership. To properly understand how `Rc`s track a memory allocation, you would need to keep all of them in mind. More appropriately, you would reason about `Rc`s as aliasing mutable borrows of the allocated memory. Precisely upon dropping `Rc`s, the runtime narrows down the contexts its allocated memory belongs to, sort of like the memory is in superposition until then. On the second-to-last drop of an `Rc` we would know one definite context where its allocated memory is placed, which could be either the `Rc` itself or some other syntactic context we have a hold on. This thinking also extends to channels like MPSC, which exhibit similarly unclear/runtime ownership.
Aliasing topology
I hope it is clear to you by now why looking at aliasing variables separately hurts a programmer's ability to reason about a program's behavior. To be more precise, you have to know what happens to the different aliases to construct a sound program. While it is possible to write a public library working with aliased memory, it is the library users' task to put the pieces together into a complete program. Otherwise we would call it possible memory corruption.
If you have ever delved into topology, you might recognize that neighborhoods of aliased variables can be expressed with some topology. Naively, we could say two variables alias the same memory whenever they alias the same memory addresses. This entails a map \(m\) from the collection of aliasing variables \(V\) to the powerset of the address space \(2^A\). However, this doesn't account for the compound aliasing of reborrowing.
So instead of a mapping to the boolean domain \(2\), we should map from the address space to a topological space of aliasing constructions, defined as follows: the points are strings of the form `b*(f[oi]*)?`, or `0` (using regex notation), with the `0` string standing for an unaliased memory address; the (infinite) open sets are defined to hold, for each of their points, every valid string we can get by appending some suffix. This way, mutable (aliasing) borrows map to strings `b+`, with each `b` symbol corresponding to one reborrow, and immutable borrows map to strings `b*f[oi]*`, where `f` stands for freezing a mutable reference into an immutable one, followed by some `[oi]*` sequence. Whenever copying or reborrowing an immutable borrow, we assign the old one a new string with `o` appended and the new one a string with `i` appended, which ensures that every such immutable-borrow variable forms a singleton open set.
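For a quick illustration, take `assert_aliased_mut` from earlier: the argument `a: &mut S` gets the string `b`, and its reborrow `let b = &mut *a` gets `bb` by appending one symbol. Every open set containing `a`'s string `b` also contains `bb` and all its further extensions, while the reborrow generates a neighborhood of suffix-extensions that excludes `a`. This is the topological restatement of compound aliasing: `a`'s aliasing info is included in its reborrow's info, but not vice versa.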
There is a smallest fitting topology \(\tau\) (a set of open subsets), with open sets defined from preimages under the continuous map \(m\), which I will name the alias map. For any set of aliasing variables \(V\) we will call this \(\tau_V\) the aliasing topology on the space \(V\).

This description, sadly, is too mechanical to be a good mathematical definition. Although I lack confidence in defining it this way, I suspect the aliasing topology can be expressed as sieves of an appropriate category, and the alias map as a Grothendieck construction. Moreover, while my intuition about this subject is grounded in a Grothendieck topos of sheaves on a site, I have yet to develop the confidence to express my ideas that way. But I hope more knowledgeable people will connect the dots if such an interpretation turns out to be appropriate for the subject of natural aliasing.
Justification for the fundamental aliasing rule
Now, to define the product types (pairs and tuples) of this theory, it is most fitting to define the alias map of a pair of variables as the union of the alias maps of each variable. This allows us to disregard individual members of a tuple and view it only as a whole. It also means that the alias map of a pair of borrowed and borrowing variables is the same as the alias map of the borrowing variable by itself, which should make sense if you remember compound aliasing.
Another important type construction is exponential types, i.e. closures. Closures are important for type erasure of a variable or tuple; we consider any constructions of a closure identical by their alias maps. This makes it possible to abstract any function call as a `FnOnce()` closure and disregard the internal contents of the closure except for its captured variables. An important consequence to note: β-reduction on such a closure is able to change its alias map, which is fine as long as the closure's alias map constitutes an open set in the aliasing topology. Nonetheless, this constitutes the ability to think about the aliasing of variables solely through the public interfaces of their constructions.

The aliasing topology also establishes determinism for applications of the β-reduction rule, which is another way to say that if we know variables are unaliased, we can use memory to store and load values in a deterministic and consistent way.
Afterword
I would appreciate and credit your contributions if you share useful improvements to this text with me. I hope all this abstract nonsense helps guide the future of rust-lang and of other languages, as there are lots of implications for memory-safe language design to consider.
The asynchronous drop glue generation design
This text aims to explain the design of my asynchronous drop prototype, which I have been working on for some time now.
Public interface of AsyncDrop
I've tried to make interface of asynchronous drops as similar to the
synchronous drops as possible. Take a look at the definition of the most
important public trait of my prototype (AsyncDrop
trait):
/// Custom code within the asynchronous destructor.
#[lang = "async_drop"]
pub trait AsyncDrop {
/// A future returned by the [`AsyncDrop::async_drop`] to be part
/// of the async destructor.
type Dropper<'a>: Future<Output = ()>
where
Self: 'a;
/// Constructs the asynchronous destructor for this type.
fn async_drop(self: Pin<&mut Self>) -> Self::Dropper<'_>;
}
Given that we don't have `async` as a keyword generic, I've had to define an entirely new trait. It's somewhat similar to `AsyncFnMut`, as that trait also mirrors `FnMut`. Both of these async traits stay close to the desugared interface of async functions, returning from a sync method a future object of the trait's associated type. I've also wrapped the mutable reference to self into `Pin` just to be safe; it may turn out to be useful or detrimental.
Let's imagine its implementation for a new task handle in tokio that cancels the task on drop:
impl<T> AsyncDrop for CancellableJoinHandle<T> {
    type Dropper<'a> = impl Future<Output = ()> where Self: 'a;
fn async_drop(self: Pin<&mut Self>) -> Self::Dropper<'_> {
async move {
self.join_handle.abort();
let _ = Pin::new(&mut self.join_handle).await;
}
}
}
Here we wrap `tokio::task::JoinHandle`, using `JoinHandle::abort` to cancel the task if possible, and then awaiting its end. The `impl_trait_in_assoc_type` feature is used here to avoid implementing the future manually; perhaps this can be simplified further with return-position `impl Trait` and async methods in traits.
Choosing against `poll_drop`
You may wonder about a possible alternative design for async drop, usually named `poll_drop`:
#[lang = "async_drop"]
pub trait AsyncDrop {
fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}
We decided against it, since it would require embedding the state of asynchronous destruction into the type itself. For example, `Vec<T>` would need to store an additional index to know which element is currently in the process of asynchronous destruction (unless we `poll_drop` every element on each parent call, but I imagine that could become expensive quickly, and it is not exactly symmetrical to how the regular `Drop` functions). Each element of the vector would also require additional space for its embedded asynchronous destructor, even though only one would be utilized at a time.
However, there is indeed one benefit of `poll_drop`, which I hypothesize could serve as a supplemental interface; see below.
Asynchronous drop glue
To run async drop glue on a type we can use the public `async_drop` or `async_drop_in_place` functions, just as with the regular variant of drop. These are the async implementations:
pub async fn async_drop<T>(to_drop: T) {
    let mut to_drop = MaybeUninit::new(to_drop);
// SAFETY: we store the object within the `to_drop` variable
unsafe { async_drop_in_place(to_drop.as_mut_ptr()).await };
}
#[lang = "async_drop_in_place"]
pub unsafe fn async_drop_in_place<T: ?Sized>(
to_drop: *mut T,
) -> <T as AsyncDestruct>::AsyncDestructor {
// ...
}
I assume you understand how the `async_drop` function works. The hard part, however, lies with `async_drop_in_place`. It is not an async function but merely returns an object of the `AsyncDestruct::AsyncDestructor` type, presumably a future. You may also notice that we don't have the syntax `T: AsyncDestruct`. Let's take a closer look at the `AsyncDestruct` trait and its associated type:
#[lang = "async_destruct"]
trait AsyncDestruct {
type AsyncDestructor: Future<Output = ()>;
}
This trait is internal to the compiler. `AsyncDestructor` is the actual future of the async drop glue, the code deinitializing the `Self` object. It is implemented for every type, thus it does not require trait bounds to be usable on any type. The compiler implements it the same way as the also-internal `DiscriminantKind` trait. I should also mention that `async_drop_in_place`'s body is generated by the compiler as well, but this time the same way `drop_in_place` is generated (via a shim).
But what type should we assign to `AsyncDestructor`? `async_drop_in_place` simply creates the asynchronous destructor future and does not execute it. I haven't yet found a way to generate coroutines solely from the compiler, but I was advised to compose `core` library types to create such futures. I've defined various future combinators to chain or defer futures, or to choose between two futures, and by combining them I've implemented asynchronous destructors for ADTs and other types. Some code couldn't be offloaded to `core`, though (I think). For example, I've had to precompute a pointer to each field ahead of time inside the `async_drop` method.
#[lang = "async_drop_chain"]
async fn chain<F, G>(first: F, second: G)
where
F: IntoFuture<Output = ()>,
G: IntoFuture<Output = ()>,
{
first.await;
second.await;
}
#[lang = "async_drop_either"]
async unsafe fn either<O: IntoFuture<Output = ()>, M: IntoFuture<Output = ()>, T>(
other: O,
matched: M,
this: *mut T,
discriminant: <T as DiscriminantKind>::Discriminant,
) {
if unsafe { discriminant_value(&*this) } == discriminant {
drop(other);
matched.await
} else {
drop(matched);
other.await
}
}
#[lang = "async_drop_defer"]
async unsafe fn defer<T: ?Sized>(to_drop: *mut T) {
unsafe { async_drop_in_place(to_drop) }.await
}
Since async drop glue could hypothetically, in the future, be executed automatically within the cleanup branches used for unwinding, one property I believe the `AsyncDestructor` future should have is that, instead of panicking, it must simply return `Poll::Ready(())` on every poll after the future completes. I've called this property future idempotency, since it makes sense, and there is a special fuse combinator that wraps any regular future to provide this guarantee.
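Here is a sketch of a fuse combinator with this idempotency property, under my assumptions about its shape (the actual lang-item combinator may differ):

```rust
use core::future::Future;
use core::pin::Pin;
use core::task::{Context, Poll};

struct Fuse<F> {
    inner: Option<F>,
}

impl<F: Future<Output = ()>> Future for Fuse<F> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // SAFETY: `inner` is structurally pinned; we never move it, only
        // drop it in place once it completes.
        let this = unsafe { self.get_unchecked_mut() };
        match this.inner.as_mut() {
            Some(f) => match unsafe { Pin::new_unchecked(f) }.poll(cx) {
                Poll::Ready(()) => {
                    this.inner = None; // never poll the inner future again
                    Poll::Ready(())
                }
                Poll::Pending => Poll::Pending,
            },
            // already completed: stay `Ready` instead of panicking
            None => Poll::Ready(()),
        }
    }
}
```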
As of right now (2024-03-29), async drop glue for coroutines (async blocks) and dynamic types (`dyn Trait`) is not implemented. Coroutines have their own special code for generating even the regular drop glue, extracting a `coroutine_drop` branch from the coroutine's MIR; another person is working on it. For dynamic types support I have a hypothetical design, which I describe below. Automatic async drops at the end of a scope aren't implemented either.
Combinator table
| Combinator | Description |
|---|---|
| `either` | Used by async destructors for enums to choose which variant's destructor to execute depending on the enum's discriminant value |
| `chain` | Used by async destructors for ADTs to chain the fields' async destructors |
| `fuse` | Used by async destructors to return `Poll::Ready(())` on every poll after completion |
| `noop` | Used by async destructors for trivially destructible types and empty types |
| `slice` | Used by async destructors for slices and arrays |
| `surface_async_drop_in_place` | Used by async destructors to execute the surface-level `AsyncDrop::Dropper` future of a type |
| `surface_drop_in_place` | Used by async destructors to execute the surface-level `Drop::drop` of a type |
You might ask whether we even need the `noop` combinator: couldn't we simply not instantiate an async destructor for trivially destructible types? But no, this is not possible, since the user may call `async_drop_in_place` on any type, and it has to return some future type.

See the current implementations of these combinators in library/core/src/future/async_drop.rs.
Visibility problem
If you compare the public interface for interacting with value discriminants within the `core` library with the interface described here, you may notice the usage of a trait's associated type instead of a generic type. Directly using this associated type may actually be problematic, as it can possibly leak its special trait and method implementations. I also believe it would be better to keep the `AsyncDestruct` trait private. At last, perhaps it would be more convenient to use a generic type instead, as with `Discriminant`.
To solve this problem, the only way right now would be to define a wrapper struct `AsyncDropInPlace<T>` around it and forward its `Future` implementation to the actual async destructor of type `T`. We would also have a new wrapper function `async_drop_in_place` to return that wrapper type, and we would rename the compiler-generated function which previously held this name into `async_drop_in_place_raw`.
However, this `AsyncDropInPlace` could still leak some details of the stored inner value, such as any auto trait implementation and the drop check. These can either be left as is (current behavior) or be suppressed with a `PhantomData<*mut ()>` field and a noop `Drop` implementation on it. I am not sure which one should be chosen.
Generation of `async_drop_in_place_raw`

The body of the `async_drop_in_place_raw` function is generated by the compiler within compiler/rustc_mir_transform/src/shim.rs. `AsyncDestructorCtorShimBuilder` is the core structure for generating this code in the form of MIR. Let's take a look at what kind of code is generated for an enum:
chain(
surface_async_drop_in_place(to_drop),
either(
chain(
async_drop_in_place_raw((*(to_drop as *mut T::Variant1)).field0),
async_drop_in_place_raw((*(to_drop as *mut T::Variant1)).field1),
),
chain(
async_drop_in_place_raw((*(to_drop as *mut T::Variant0)).field0),
async_drop_in_place_raw((*(to_drop as *mut T::Variant0)).field1),
),
to_drop,
variant0_discriminant,
),
)
As you can see, it is simply an expression. We can express the execution of a single expression with a stack machine, which is exactly how `AsyncDestructorCtorShimBuilder` functions. It stores a stack of operands, which are either a moved local, a copied local, or a const value (like a discriminant). We allocate and deallocate storage for moved locals on pushes to and pops from the builder's stack. We can assign a value to a local, putting it at the top of the stack, or combine operands (popping them first) with a function call, which puts the combinator value at the top of the stack too. The order of arguments of a function call can be summarized as the top operand of the stack being the last argument. At the end we return the single remaining stack operand.
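To make the stack discipline concrete, here is a hypothetical trace of building the async destructor for a two-field struct (the exact builder steps may differ):

```text
push to_drop                      stack: [to_drop]
call surface_async_drop_in_place  stack: [surface]
push pointer to field0            stack: [surface, p0]
call async_drop_in_place_raw      stack: [surface, d0]
push pointer to field1            stack: [surface, d0, p1]
call async_drop_in_place_raw      stack: [surface, d0, d1]
call chain                        stack: [surface, chain(d0, d1)]
call chain                        stack: [chain(surface, chain(d0, d1))]
return                            the single remaining operand
```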
This stack machine also allows us to easily create a cleanup branch to drop operands during unwind without redundant drops by reusing drops for stored locals on the stack, forming a kind of tree inside of the MIR control-flow graph.
What's next?
ADT async destructor types
As I've said, those future combinators are just a patchwork for the current inability to generate ADT futures on the fly. Defining such components inside `core` is fine in some cases, like for the async destructor of a slice. But for ADTs, tuples, and closures the proper solution would be to define a new type kind named something like `AdtAsyncDestructor`. Given one of those types we could generate a consistent state for the async destructor and then generate its `Future::poll` function. This way we won't need to calculate and store all the pointers to each field ahead of time.
Ideas for the future
Should `async_drop_in_place` work with references?
Since `async_drop_in_place` returns an async destructor future that should reference the dropped object, perhaps it would be more beneficial to have `async_drop_in_place` take a `&mut ManuallyDrop<T>` reference instead. It would be less unsafe, and we wouldn't have to deal with raw pointers infecting async destructor types with `!Send` and `!Sync`.
Async drop glue for dyn Trait
The problem with dynamic types is basically the loss of type information. We cannot know the size and alignment of the `<dyn Trait as AsyncDestruct>::AsyncDestructor` type, thus we cannot know how much stack or coroutine-local space we should allocate for its storage. One approach would be to have `type AsyncDestructor = Box<dyn Future>` for dynamic types, which may not be ideal. But perhaps, before we coerce static types into dynamic ones, we could have a wrapper type which contains space both for `T` and for `<T as AsyncDestruct>::AsyncDestructor`?
#[lang = "PollDestruct"]
trait PollDestruct {
fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}
struct PollDrop<T> {
async_dtor: Option<<T as AsyncDestruct>::AsyncDestructor>,
value: MaybeUninit<T>,
_pinned: PhantomPinned,
}
impl<T> PollDestruct for PollDrop<T> {
    fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        unsafe {
            let this = self.get_unchecked_mut();
            // take the pointer first to avoid borrowing `this` twice
            let value_ptr = this.value.as_mut_ptr();
            let dtor = this
                .async_dtor
                .get_or_insert_with(|| unsafe { async_drop_in_place(value_ptr) });
            Pin::new_unchecked(dtor).poll(cx)
        }
    }
}
// Have a `PollDrop<Fut> -> Box<dyn Future<Output = ()> + PollDestruct>`
And like that we have embedded enough space and type information to unsize these types and work with them, while they can still be asynchronously destroyed.
Exclusively async drop
It's almost pointless to implement `AsyncDrop` on your type while it remains perfectly valid to synchronously drop it. There could be a way to restrict sync drops of a type by implementing `!Destruct` for it. The compiler should emit an error wherever it tries to synchronously drop a `?Destruct` value. It would still be fine to asynchronously drop such values, which would be done (semi)automatically inside async code. While this approach, as far as I know, preserves backwards compatibility, it would require users to manually add support for `T: ?Destruct` types in their code, which is the reason new `?Trait` bounds are considered unergonomic by many rustc lead developers. Perhaps it would be fine to have `T: Destruct` by default for synchronous functions and `T: ?Destruct` by default for asynchronous ones in the next edition?
But my mentor suggests trying a different approach: emitting such errors after monomorphization of a generic function, perhaps as a temporary measure before a proper type-level solution is enabled. It does sound like how C++ templates work, which come with issues of their own, but Rust already allows post-monomorphization errors, like linker errors.
Automatic async drop and implicit cancellation points
The core feature of the hypothetical async drop mechanism is considered to be automatic async cleanup, which requires adding implicit await points inside async code wherever it destroys an object with an async drop implementation. Currently, every await point also creates a cancellation point, where the future can be cancelled with `drop` if it is suspended there. Implicit cancellation points within async code would probably make it much more difficult to maintain the cancellation safety of your async code, because you would not see where exactly it can suspend. The simplest solution would be for implicit await points not to generate cancellation points. This is possible if such an async block implements `!Destruct` (see above) and can only be asynchronously dropped. Then, if the user starts an async drop of that future while it is suspended on an implicit await point, the future will continue as usual until it either returns or suspends on an explicit await point. The user will have to explicitly call and await `async_drop` to allow cancellation during suspension.
Drop of async destructors
How should the drop of an async destructor function? The simplest solution would probably be that an async drop of an async destructor simply continues execution of that destructor.
Conclusion
There are still a lot of questions to be answered, but it's important not to give up. I would also like to mention that this text is based on similar works by many other people; references can be found in this MCP: Low level components for async drop.
The destruction guarantee and linear types formulation
Myosotis
Background
Currently there is a consensus about the absence of a drop guarantee. To be precise, in today's Rust you can forget a value via `core::mem::forget` or via some other safe contraption, like cyclic shared references with `Rc`/`Arc`.

As you may know, in the early days of Rust the destruction guarantee was intended to exist. Instead of today's `std::thread::scope` there was `std::thread::scoped`, which worked in a similar manner except that it used a guard value with a drop implementation to join the spawned thread, so that the thread wouldn't refer to any local stack variable after the parent thread exited the scope and destroyed them. Due to the absence of the drop guarantee it was found to be unsound and was removed from the standard library.[1] Let's name these two approaches guarded closure and guard object. Note also that C++20 has the analogous `std::jthread` guard object.

There is also a discussion among Rust theorists about linear types, which leads them to research (or maybe revisit) a possible `Leak` trait. I've noticed some confusion, and thus hesitation, when people try to define what leaking a value means. I will try to clarify and define what a leak actually means.
Problem
There is a class of problems that we will try to solve. In particular, we return some object from a function or a method that mutably (exclusively) borrows one of the function's arguments. While the returned object is alive we cannot refer to the borrowed value, which can be a useful property to exploit: you can invalidate some invariant of a borrowed type and then restore it inside the returned object's drop. This is a fine concept until you realize that in some circumstances drop is not called, which in turn means that the invalidated invariant of the borrowed type must never be able to cause undefined behavior (UB for short), even if left untreated. However, if drop were guaranteed, we could mess with the borrowed type's invariant, knowing that the cleanup will restore it and make it impossible to cause UB afterwards. I found one example of this in the once-planned feature `Vec::drain_range`.
One other special case would be an owned scoped thread. It may be included in the class of problems mentioned, but I am not sure. Anyway, in the most trivial case this is the same as the once-deleted `std::thread::{scoped, JoinGuard}` described above. However, many C APIs may use this in some sense via the callback registration pattern, most common for multithreaded client handles. The absence of a drop guarantee thus implies a `'static` lifetime for a callback, so that the user can't use invalidated references inside of the callback, if the client uses the guard object API patternP.S. (see example).
Solution
From now on I will use the term "destruction guarantee" instead of "drop guarantee", because it more precisely describes the underlying concept. The difference between drop and destruction is that the former relates only to the drop functionality of Rust, while the latter also covers any consuming function that destroys the object in the sense defined by the library authors, in other words, a destructor. Such destructors may even disable the drop code and clean up in some other way.
Most importantly in these two cases objects with the destruction guarantee would be bounded by lifetime arguments. So to define the destruction guarantee:
The destruction guarantee asserts that the bounding lifetime of an object must end only after the object is destroyed by drop or by any other valid destructor. Breaking this guarantee in any way can lead to UB.
Notice what this implies for `T: 'static` types. Since the static lifetime never ends, or ends only after the end of the program's execution, drop may never be called. This property does not conflict with the described use cases: `JoinGuard<'static, T>` indeed doesn't ever require to be destroyed, since there would be no references that could be invalidated.
In the context of the discussion around the `Leak` trait, some argue it is possible to implement `core::mem::forget` via threads and an infinite loop.[2] That forget implementation won't violate the destruction guarantee as defined above: either you use regular threads, which require `F: 'static`, or you use scoped threads, which would join this never-completing thread; thus no drop and no lifetime end. The definition only establishes an order between an object's destruction and the end of a lifetime, not the existence of the lifetime's end within any finite execution time. My further advice is, in general, to think not in terms of execution time but in terms of semantic lifetimes, whose role is to conservatively establish the order of events, if those ever happen. Otherwise you will be fundamentally limited by the halting problem.
On the topic of abort or exit: it shouldn't be considered an end of any lifetime, since otherwise abort, and even spontaneous termination of a program like SIGTERM, would become unsafe.
To move forward, let's determine the conditions required for the destruction guarantee. The Rust language already makes sure you can never use a value whose bounding lifetime has ended. Drop, as a fallback to other destructors, is only ever run on owned values, so for a drop to run on a value, the value must remain transitively owned by functions' stack/local values. If you are familiar with tracing garbage collection, this is similar: the required alive value should be traceable from the function stack. The value must not own itself, nor be owned by something that owns itself, at least before the end of its bounding lifetime; otherwise drop would not be called. The last statement can be simplified, given that the owner of a value must transitively satisfy these requirements too, leaving us with just this: the value must not own itself. Also, remember that `'static` values can be moved into a static context, like static variables, whose lifetime exceeds the lifetime of the program's execution itself, so consider that analogous to calling `std::process::exit()` before `'static` ends.
Trivial implementation
One trivial implementation might have already crept into your mind.
#![feature(auto_traits, negative_impls)]
use core::marker::PhantomData;
unsafe auto trait Leak {}
#[repr(transparent)]
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Unleak<T>(pub T, PhantomUnleak);
impl<T> Unleak<T> {
pub const fn new(v: T) -> Self {
Unleak(v, PhantomUnleak)
}
}
// This is the essential part of the `Unleak` design.
unsafe impl<T: 'static> Leak for Unleak<T> {}
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct PhantomUnleak;
impl !Leak for PhantomUnleak {}
struct Variance<Contra, Co> {
process: fn(Contra) -> String,
// invalidate `Co` type's safety invariant before restoring it
// inside of the drop
queue: Unleak<Co>,
}
struct VarianceAlt<Contra, Co> {
process: fn(Contra) -> String,
queue: Co,
_unleak: PhantomUnleak,
}
unsafe impl<Contra, Co: 'static> Leak for VarianceAlt<Contra, Co> {}
// not sure about variance here
struct JoinGuard<'a, T: 'a> {
// ...
_marker: PhantomData<fn() -> T>,
_unleak: PhantomData<Unleak<&'a ()>>,
_unsend: PhantomData<*mut ()>,
}
unsafe impl<T: 'static> Send for JoinGuard<'static, T> {}
unsafe impl<'a, T> Sync for JoinGuard<'a, T> {}
// We are outside of the main function
fn main() {}
This is an automatic trait, which means it is implemented for types in a manner similar to `Send`.[3] The name `Leak` is subject to possible future change; I used it because it came up as `Leak` in many people's thoughts. Since `T: !Leak` types could still leak in a practical sense, it could be renamed to `Forget`. Other variants could be `Lose`, `!Trace` or `!Reach` (the last two as in tracing GC), maybe with an -able suffix?P.S.
This trait would help to forbid `!Leak` values from using problematic functionality:

- Obviously, `core::mem::forget` should have a `T: Leak` bound on its generic type argument (see the sketch after this list);
- `core::mem::ManuallyDrop::new` should have a leak bound on the input type, though perhaps an author has some destructor besides the drop that would benefit from a `ManuallyDrop::new_unchecked` fallback;
- `Rc` and `Arc` may themselves be put inside of the contained value, creating an ownership loop, although there should probably be an unsafe fallback (constructors) for cases where ownership cycles are guaranteed to be broken before cleanup;
- Channel types like those inside `std::sync::mpsc` with a shared buffer of `T` are problematic, since you can send a receiver through its sender back to itself, thus creating an ownership cycle that leaks the shared buffer;
- Rendezvous channels seem to lack this flaw, because they wait for the other thread/task to be ready to take a value instead of running off right after sending it.

In any case, the library itself dictates the appropriate bounds for its types.
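As a sketch of the first bullet, the adjusted signature would look like this; every existing type satisfies the implicit `Leak` bound, so no current caller breaks:

```rust
pub fn forget<T: Leak>(t: T) {
    // same approach as today's implementation: inhibit the destructor
    let _ = core::mem::ManuallyDrop::new(t);
}
```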
Given that `!Leak` implies new restrictions compared to current Rust value semantics, by default every type is assumed to be `T: Leak`, kinda like with `Sized`, i.e. an implicit `Leak` trait bound on every type and type argument unless specified otherwise (`T: ?Leak`). I am pretty sure this feature would not introduce any breaking changes. This means working with new `!Leak` types is opt-in, kinda like how library APIs may consider adding `?Sized` support after release. There could be a way to disable implicit `T: Leak` bounds between editions, although I do not see it as a desirable change, since `!Leak` types would be a small minority in my vision.
The `Unleak` wrapper type
To make a `!Leak` struct you would need to use the new `Unleak` wrapper type:
#[repr(transparent)]
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Unleak<T>(pub T, PhantomUnleak);
impl<T> Unleak<T> {
pub const fn new(v: T) -> Self {
Unleak(v, PhantomUnleak)
}
}
// This is the essential part of the `Unleak` design.
unsafe impl<T: 'static> Leak for Unleak<T> {}
#[derive(Default, Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct PhantomUnleak;
impl !Leak for PhantomUnleak {}
This wrapper makes it easy to define `!Leak` data structures. It implements `Leak` for the `'static` case for you. As a rule of thumb, you determine which field (it should contain the struct's lifetime or generic type argument) requires the destruction guarantee, so if you invalidate a safety invariant of a borrowed type, make sure this borrow is under `Unleak`. To illustrate how `Unleak` helps, you can look at this example:
struct Variance<Contra, Co> {
process: fn(Contra) -> String,
// invalidate `Co` type's safety invariant before restoring it
// inside of the drop
queue: Unleak<Co>,
}
If you are aware of variance, then you should know that contravariant lifetimes (which are placed inside the arguments of a function pointer) can be extended via subtyping up to the `'static` lifetime; this also applies to the lifetime bounds of generic type arguments. So it would be useless to mark this function pointer with `Unleak`. If we just had `PhantomUnleak` instead, this is what the example above would look like:
struct VarianceAlt<Contra, Co> {
process: fn(Contra) -> String,
queue: Co,
_unleak: PhantomUnleak,
}
unsafe impl<Contra, Co: 'static> Leak for VarianceAlt<Contra, Co> {}
It now requires an unsafe impl with somewhat unclear type bounds. If the user forgets to add the `Leak` implementation, the type becomes as restricted as any `!Leak` type even when the type itself is `'static`, granting nothing of value. If the user messes up and doesn't add the appropriate `'static` bounds, it may lead to an unsound API. `Unleak`, on the other hand, automatically ensures that `T: 'static => T: Leak`. So `PhantomUnleak` should probably be private/unstable.
Now, given this slightly awkward situation around `T: 'static => T: Leak`, impl and dyn trait types can sometimes be meaningless, like `Box<dyn Debug + ?Leak>` or `-> impl Debug + ?Leak`, because those are static unless you add an explicit `+ 'a` lifetime bound. There should probably be a lint warning the user about that.
One thing we should be aware of in the future is users' desire to make their types `!Leak` without actually needing it. An appropriate example would be `MutexGuard<'a, T>` being `!Leak`. It is not required, since it is actually safe to forget a value of this type or to never unlock a mutex, but it could exist. In such a case you can safely violate the `!Leak` bound, making it useless in practice. Thus unnecessary `!Leak` impls should be avoided. To address users' underlying itch to do this, they should be informed that forgetting or leaking a value is already undesirable and can be considered a logic bug.
Of course there should be an unsafe `core::mem::forget_unchecked` for any value, if you really know what you're doing, because there are still ways to implement `core::mem::forget` for any type with unsafe code, for example with `core::ptr::write`. There should probably also be a safe `core::mem::forget_static`, since you can basically do that already using a thread with an endless loop. However, `?Leak` types implement `Leak` for static lifetimes, transitively from `Unleak`, to satisfy any function's bounds over types.
// not sure about variance here
struct JoinGuard<'a, T: 'a> {
// ...
_marker: PhantomData<fn() -> T>,
_unleak: PhantomData<Unleak<&'a ()>>,
}
While implementing `!Leak` types you should also make sure you cannot move a value of this type into itself. In particular, `JoinGuard` may be made `!Send` to ensure that the user won't send the `JoinGuard` into its inner thread, creating a reference to itself and thus escaping the parent thread while holding live references to the parent thread's local variables.
// not sure about variance here
struct JoinGuard<'a, T: 'a> {
// ...
_marker: PhantomData<fn() -> T>,
_unleak: PhantomData<Unleak<&'a ()>>,
_unsend: PhantomData<*mut ()>,
}
unsafe impl<T: 'static> Send for JoinGuard<'static, T> {}
unsafe impl<'a, T> Sync for JoinGuard<'a, T> {}
There is also a way to forbid `JoinGuard` from moving into its thread if we bound it by a different lifetime, shorter than the input closure's lifetime. See the prototyped `thread::SendJoinGuard` in the leak-playground docs and repo. Because there's no `Leak` trait outside of this repo and external libraries cannot account for it, the safe usage of `!Leak` types is sometimes enforced manually. There are also some new possible features for tokio in leak_playground_tokio, like non-static task support. The doctest code behaves as intended (except for the internally-unleak future examples), but I have no formal proof of it being 100% valid.
One other consequence: if a drop of a `!Leak` object panics, it should be safe to use the referred-to object afterwards, basically meaning that panic or unwind is a valid exit path from the drop implementation. If a `!Leak` type invalidates some safety invariant of a borrowed object, then even if the drop implementation panics, it should restore this invariant, maybe even by replacing the borrowed value with a default or empty value, or with a good old manual `std::process::abort`. If designed otherwise, the code should abort on a panic from a drop of a `!Leak` value. So you would have to be careful with panics too. This also applies to any other destructor.
Internally `Unleak` coroutines
Consider one other example from leak_playground_std:
fn _internal_unleak_future() -> impl std::future::Future<Output = ()> + Leak {
async {
let num = std::hint::black_box(0);
let bor = Unleak::new(&num);
let () = std::future::pending().await;
assert_eq!(*bor.0, 0);
}
}
During the execution of a future, local variables have non-static lifetimes; however, after the future yields, these lifetimes become static unless they refer to something outside of itP.S.. This is an example of sound and safe lifetime extension, thus making the whole future `Leak`. However, when we use `JoinGuard` it becomes a little bit trickier:
fn _internal_join_guard_future() -> impl std::future::Future<Output = ()> + Leak {
async {
let local = 42;
let thrd = JoinGuard::spawn({
let local = &local;
move || {
let _inner_local = local;
}
});
let () = std::future::pending().await;
drop(thrd); // This statement is for verbosity and `thrd`
// should drop there implicitly anyway
}
}
The code above may lead to use-after-free if we forget this future, meaning that the memory holding it is deallocated without cancelling (i.e. dropping) the future first; the spawned `thrd` would then refer to the future's deallocated local state, since we haven't joined that thread. But remember that a self-referential (`!Unpin`) future is pinned forever after it starts, which means it is guaranteed that there is no way (or at least should be no way) to forget and deallocate the underlying value in safe code (see pin's drop guarantee). However, outside of the rust-lang project some people may not follow this rule, because they don't know about it or maybe discard it purposefully (the Rust police are coming for you). Maybe in the future it will be possible to somehow relax this rule in some cases, but that is a different problem.
Extensions and alternatives
DISCLAIMER: This section is optional as it contains unpolished concepts, which are not essential for understanding the overall design of proposed feature.
`Disowns` (and `NeverGives`) trait(s)
If you think about `Rc` long enough, the `T: Leak` bound starts to feel unnecessarily strong. Maybe we could add a trait signifying that your type can never own an `Rc` of self, which would allow us to have a new bound:
impl<T> Rc<T> {
fn new(v: T) -> Self
where
T: Disowns<Rc<T>>
{
// ...
}
}
By analogy, to make sure the closure you pass into a spawned thread can never capture anything that could give it its own join guard:
pub fn scoped<F>(f: F) -> JoinGuard<F>
where
F: NeverGives<JoinGuard<F>>
{
// ...
}
To help you with understanding:

```
<fn(T)>: NeverGives<T> + Disowns<T>
<fn() -> T>: !NeverGives<T> + Disowns<T>
T: !NeverGives<T> + !Disowns<T>
trait NeverGives<T>: Disowns<T>
```
Custom Rc trait
Or, to generalize, maybe there should be a custom automatic trait for `Rc`, so that anything implementing it is safely allowed to be held within an `Rc`:
impl<T> Rc<T> {
fn new(v: T) -> Self
where
T: AllowedInRc
{
// ...
}
}
impl<T> Arc<T> {
fn new(v: T) -> Self
where
T: AllowedInRc + Send + Sync
{
// ...
}
}
Ranked Leak trait
While we may allow `T: Leak` types to be held within an `Rc`, a `U: Leak2` would not be, given that `Rc<T>: Leak2`. And so on. This allows us to forbid recursive types, but it also forbids data types nested deeply enough within `Rc`s. This is similar to the von Neumann hierarchy of sets, where every set has some rank ordinal. Maybe there could be an `unsafe auto trait Leak<const N: usize> {}` for that?
Turning drop invocations into compiler errors
Perhaps we could have some automatic trait `RoutineDrop` which, if unimplemented for a type, means that dropping a value of that type results in a compiler error. This may be useful with the hypothetical async drop. It could also help expand linear type functionality.
Forward compatibility
Since I wrote this text in terms of destructors, it should play nicely with the hypothetical async drop. It could then be the case that the `JoinGuard` logic extends to an analogous `AwaitGuard` representing async tasks.
Possible problems
Some current std library functionality relies upon forgetting values, as `Vec` does in some cases, like a panic during an element's drop. I'm not sure if anyone relies upon this, so we could use abort instead. Alternatively, we could add a `std::mem::is_leak::<T>() -> bool` to determine whether we can forget values and then act accordingly.

Currently the internally-unleak future examples emit errors where they shouldn't, or should emit different errors, so I guess some compiler hacking is required. There could also be some niche compilation case where the compiler assumes every type is `Leak` and purposefully forgets a value.
Terminology
- ^ Linear type: A value of which should be used at least once, generally speaking. "Use" is usually defined by the context.
- ^ Drop guarantee: The guarantee that drop is run on every created value, unless that value's drop is a noop. This text uses the term only in reference to older discussions; I use "destruction guarantee" instead, to be more precise and to avoid confusion in future discussions about async drop.
- ^ Guarded closure: A pattern of safe library APIs in Rust. It is a mechanism to guarantee that the library's cleanup code runs after user code (a closure) has used some special object. It is usually employed only when this guarantee is required for API safety, because it is unnecessarily unwieldy otherwise. For example:
```rust
// WARNING: Yes I know you can rewrite this more efficiently,
// it's just a demonstration
fn main() {
    let mut a = 0;
    foo::scope(|foo| {
        for _ in 0..10 {
            a += foo.get_secret();
            // cannot forget(foo) since we only have a reference to it
        }
    });
    println!("a = {a}");
}

// Implementation
mod foo {
    use std::marker::PhantomData;
    use std::panic::{catch_unwind, resume_unwind, AssertUnwindSafe};

    pub struct Foo<'scope, 'env> {
        secret: u32,
        // use lifetimes to avoid the error
        // strange lifetimes to achieve invariance over them
        _scope: PhantomData<&'scope mut &'scope ()>,
        _env: PhantomData<&'env mut &'env ()>,
    }

    impl Foo<'_, '_> {
        pub fn get_secret(&self) -> u32 {
            // There should be much more complex code
            self.secret
        }

        fn cleanup(&self) {
            println!("Foo::cleanup");
        }
    }

    pub fn scope<'env, F, T>(f: F) -> T
    where
        F: for<'scope> FnOnce(&'scope Foo<'scope, 'env>) -> T,
    {
        let foo = Foo {
            secret: 42,
            _scope: PhantomData,
            _env: PhantomData,
        };
        // AssertUnwindSafe is fine because we rethrow the panic
        let res = catch_unwind(AssertUnwindSafe(|| f(&foo)));
        foo.cleanup();
        match res {
            Ok(v) => v,
            Err(payload) => resume_unwind(payload),
        }
    }
}
```
Output:

```
Foo::cleanup
a = 420
```
- ^ Guard object: A pattern of library APIs like `std::sync::MutexGuard`. Usually these borrow some local state (like a `std::sync::Mutex`) and restore it within their drop implementation. Since Rust value semantics allow objects to be forgotten, the cleanup code within the drop implementation should not be essential to preserving the safety of your API. However, this proposal aims to relax this restriction, given a new backwards-compatible set of rules.
- ^ Callback registration: A pattern of library APIs, especially in C. It is usually represented as setting some function as a callback for incoming responses on some client handle. tigerbeetle_unofficial_core::Client would be an example of that.
- ^ Undefined behavior or UB
References
Postscript
- ^ It is safe to forget an unforgettable type as long as it can outlive, broadly speaking, any usage of the type's instance. That usage may be a thread manager running the thread's closure for a bit, which is where that `'static` lifetime comes from. Another example would be forgetting a guard object as long as the guarded object is forgotten too. I have modified leak_playground_std's `Unleak` to accommodate this feature.
- ^ During the discussion about this post people expressed the opinion that the `Leak` name is very misleading and that `Forget` would have been a better name. I will refer to it as such in my future texts and code.
- ^ I am now convinced there is at least a family of auto traits for which, to determine whether some coroutine implements the trait, the coroutine's local state should be ignored even if it crosses an await/yield point; thus I consider the questionable argument about lifetimes inside coroutines transforming into `'static` to be obsolete. I'll give an explanation of this peculiar feature in one of my next posts.
Credits
Thanks to @petrochenkov for reviewing and discussing this proposal with me.