Read-copy update (RCU) vaguely resembles a reader-writer lock [1], but one in which readers do not exclude writers. This change in semantic permits RCU readers to be exceedingly fast and scalable. In fact, in the most aggressive case, rcu_read_lock() and rcu_read_unlock() generate no code, and rcu_dereference() emits but a single load instruction. This most aggressive case is achieved in production via Linux-kernel CONFIG_PREEMPT_NONE=y builds. Additional information on RCU is presented at the end of this post.
Can Rust Ownership be Adapted to RCU?
The fact that RCU readers do not exclude writers opens the door to data races between RCU readers and updaters. This in turn suggests that least some RCU use cases are likely to pose challenges to Rust’s ownership semantics, however, discussions with Wedson Almeida Filho at Linux Plumbers Conference suggested that these semantics might be flexed a little bit, perhaps in a manner similar to the way that Linux-kernel RCU flexes lockdep’s semantics. In addition, the data races between the read-side rcu_dereference() primitive and the update-side rcu_assign_pointer() primitive might reasonably be resolved by having the Rust implementation of these two primitives use unsafe mode. Or, perhaps better yet, the Rust implementation of these two primitives might simply wrapper the Linux-kernel primitives in order to get the benefit of existing tools such as the aforementioned lockdep and the sparse static-analysis checker. One downside of this approach is potential performance degradation due to the extra function call implied by the wrappering. Longer term, LTO might eliminate this function call, but if the initial uses of Rust are confined to performance-insensitive device drivers, the extra overhead should not be a problem.
Linux-Kernel Tooling and RCU
Here are a few examples of these Linux-kernel tools:
- A pointer can be marked __rcu, which will cause the sparse static-analysis checker to complain if that pointer is accessed directly, instead of via rcu_dereference() and rcu_assign_pointer(). This checking can help developers avoid accidental destruction of the address and data dependencies on which many RCU use cases depend.
- In most RCU use cases, the rcu_dereference() primitive that accesses RCU-protected pointers must itself be protected by rcu_read_lock(). The Linux-kernel lockdep facility has been therefore adapted to complain about unprotected uses of rcu_dereference().
- Passing the same pointer twice in quick succession to a pair of call_rcu() invocations is just as bad as doing the same with a pair of kfree() invocations: Both situations are a double free. The Linux kernel’s debug-objects facility checks for this sort of abuse.
This is by no means a complete list. Although I am by no means claiming that these sorts of tools provide the same degree of safety as does Rust safe mode, it is important to understand that Rust unsafe mode is not to be compared to standard C, but rather to the dialect of C used in the Linux kernel, augmented by the associated tools, processes, and coding guidelines.
RCU Use Cases
There are in fact Rust implementations of variants of RCU, including:
- crossbeam-rs
- ArcSwapAny
I currently have no opinion on whether or not these could be used within the Linux kernel, and any such opinion is irrelevant. You see, Rust’s use of RCU within the Linux kernel would need to access RCU-protected data structures provided by existing C code. In this case, it will not help for Rust code to define its own RCU. It must instead interoperate with the existing Linux-kernel RCU implementations.
But even interoperating with Linux-kernel RCU is not trivial. To help with this task, this section presents a few RCU use cases within the Linux kernel.
The first use case is the textbook example where once an RCU-protected object is exposed to readers, its data fields remain constant. This approach allows Rust’s normal read-only mode of ownership to operate as intended. The pointers will of course change as objects are inserted or deleted, but these are handled by rcu_dereference() and rcu_assign_pointer() and friends. These special primitives might be implemented using unsafe Rust code or C code, as the case might be.
In-place updates of RCU-protected objects are sometimes handled using the list_replace_rcu() primitive. In this use case, a new version of the object is allocated, the old version is copied to the new version, any required updates are carried out on the new version, and then list_replace_rcu() is used to make the new version available to RCU readers. Readers then see either the old version or the new version, but either way they see an object with constant data fields, again allowing Rust ownership to work as intended.
However, one complication arises for objects removed from an RCU-protected data structure. Such objects are not owned by the updater until after a grace period elapses, and they can in fact be accessed by readers in the meantime. It is not clear to me how Rust would handle this. For purposes of comparison, within the Linux kernel, the Kernel Concurrency Sanitizer (KCSAN) handles this situation using dynamic runtime checking. With this checking enabled, when an updater takes full ownership of a structure before readers are done with it, KCSAN reports a data race. These issues should be most immediately apparent to anyone attempting to create a Rust wrapper for the call_rcu() function.
Because RCU readers do not exclude RCU updaters, it is possible for an RCU reader to upgrade itself to an updater while traversing an RCU-protected data structure. This is usually handled by a lock embedded in the object itself. From what I understand, the easiest way to apply Rust ownership to these situations is to make the RCU-protected object contain a structure that in turn contains the data protected by the lock. People wishing to apply Rust to these sorts of situations are invited to review the Linux-kernel source code to check for possible complications, for example, cases where readers locklessly read fields that are updated under the protection of the lock, and cases where multiple locks provide one type of protection or another to overlapping subsets of fields in the RCU-protected object.
Again because RCU readers do not exclude RCU updaters, there can also be lockless in-place updates (either from readers or updaters) to RCU-protected objects, for example, to set flags, update statistics, or to provide type-safe memory for any number of lockless algorithms (for example, using SLAB_TYPESAFE_BY_RCU). Rust could presumably use appropriate atomic or volatile operations in this case.
One important special case of RCU readers locklessly reading updated-in-place data is the case where readers cannot be allowed to operate on an object that has been removed from the RCU-protected data structure. These situations normally involve a per-object flag and lock. Updaters acquire the object’s lock, remove the object from the structure, set the object’s flag, and release the lock. Readers check the flag, and if the flag is set, pretend that they did not find the corresponding object. This class of use cases illustrate how segregating read-side and update-side fields of an RCU-protected object can be non-trivial.
RCU is sometimes used to protect combinations of data structures, sometimes involving nested RCU readers. In some cases, a given RCU read-side critical section might span a great many functions across several translation units.
It is not unusual for RCU use cases to include a search function that is invoked both by readers and updaters. This function will therefore sometimes be within an RCU read-side critical section and other times be protected by the update-side lock (or by some other update-side synchronization mechanism).
Sequence locking is sometimes used in conjunction with RCU, so that RCU protects traversal of the data structure and sequence locking detects updates profound enough to cause problems for the RCU traversals. The poster boy for this use case is in the Linux kernel’s directory-entry cache. In this case, certain sequences of rename operations could fool readers into traversing pathnames that never actually existed. Using the sequence lock named rename_lock to protect such traversals allows readers to reject such bogus pathnames. More information on the directory-entry cache is available in Neil Brown’s excellent Linux Weekly News series (part 1, part 2, part 3). One key take-away of this use case is that sequence locking is sometimes used to protect arbitrarily large data structures.
Sequence locking can be used to easily provide readers with a consistent view of a data structure, for a very wide range of definitions of “consistent”. However, all of these techniques allow updaters to block readers. There are a number of other approaches to read-side consistency both within and across data structures, including the Issaquah Challenge.
One final use case is phased state change, where RCU readers check a state variable and take different actions depending on the current state. Updaters can change the state variable, but must wait for the completion of all readers that might have seen the old value before taking further action. The rcu-sync functionality (rcu_sync_init() and friends) implements a variant of RCU-protected phased state change. This use case is notable in that there are no linked data structures and nothing is ever freed.
This is by no means an exhaustive list of RCU use cases. RCU has been in the Linux kernel for almost 20 years, and during that time kernel developers have come up with any number of new ways of applying RCU to concurrency problems.
Rust RCU Options
The most straightforward approach is to provide Rust wrappers for the relevant C-language RCU API members. As noted earlier, wrappering the low-overhead read-side API members will introduce function-call overhead in non-LTO kernels, but this should not be a problem for performance-insensitive device drivers. However, fitting RCU into Rust’s ownership model might not be as pretty as one might hope.
A more utopian approach would be to extend Rust’s ownership model, one way or another, to understand RCU. One approach would be for Rust to gain “type safe” and “guaranteed existence” ownership modes, which would allow Rust to better diagnose abuses of the RCU API. For example, this might help with pointers to RCU-protected objects being erroneously leaked from RCU read-side critical sections.
So how might a pointer to an RCU-protected object be non-erroneously leaked from an RCU read-side critical section, you ask? One way is to acquire a reference via a reference count within that object before exiting that critical section. Another way is to acquire a lock within that object before exiting that critical section.
So what does the Linux kernel currently do to find this type of pointer leak? One approach is to enable KCSAN and build the kernel with CONFIG_RCU_STRICT_GRACE_PERIOD=y, and to run this on a small system. Nevertheless, this might be one area where Rust could improve Linux-kernel diagnostics, albeit not without effort.
For More Information
Additional information on RCU may be found here:
- Linux-kernel documentation, with special emphasis on Linux-kernel RCU’s requirements.
- The following sections of perfbook discuss other aspects of RCU:
- Section 9.5 (“Read-Copy Update”).
- Section 10.3 (“Read-Mostly Data Structures”).
- Section 13.5 (“RCU Rescues”).
- Section 15.4.2 (“RCU [memory-ordering properties ]”).
- The RCU API, 2019 Edition.
- Userspace RCU.
- Folly-library RCU.
- Section 9.6.3.3 of perfbook lists several other RCU implementations.
- Proposed Wording for Concurrent Data Structures: Read-Copy-Update (RCU) (C++ Working Paper).
RCU’s semantics ordered by increasing formality:
- The “Fundamental Requirements” section of the aforementioned Linux-kernel RCU requirements. Extremely informal, but you have to start somewhere!
- RCU Semantics: A First Attempt. Also quite informal, but with a bit more math involved.
- Verifying Highly Concurrent Algorithms with Grace (extended version). Formal semantics expressed in separation logic.
- Frightening Small Children and Disconcerting Grown-ups: Concurrency in the Linux Kernel. Executable formal semantics expressed in Cat language. This paper also includes a proof that the traditional “wait for all pre-existing readers” semantic is equivalent to these executable formal semantics within the context of the Linux-kernel memory model.
- Linux-kernel memory model (LKMM), see especially the Sync-rcu and Sync-srcu relations in file linux-kernel.cat.
Furthermore, significant portions of Linux-kernel RCU have been subjected to mechanical proofs of correctness:
- Lihao Liang applied the C Bounded Model Checker (CBMC) to Tree RCU (paper).
- Michalis Kokologiannakis applied Nidhugg to Tree RCU (paper).
- Lance Roy applied CBMC to Classic SRCU, represented by Linux-kernel commit 418b2977b343 (“rcutorture: Add CBMC-based formal verification for SRCU”). Sadly, the scripting has not aged well.
For more information, see Verification Challenge 6.
For those interested in the userspace RCU library, there have been both manual and mechanical proofs of correctness. A manual proof of correctness of userspace RCU in the context of a linked list has also been carried out.
Endnotes
[1] |
At its core, RCU is nothing more nor less than a way of waiting for already-started things to finish. Within the Linux kernel, something to be waited on is delimited by rcu_read_lock() and rcu_read_unlock(). For their part, synchronize_rcu() and call_rcu() do the waiting, synchronously and asynchronously, respectively.
It turns out that you can do quite a lot with this simple capability, including approximating some reader-writer-lock use cases, emulating a number of reference-counting use cases, providing type-safe memory, and providing existence guarantees, this last sometimes being pressed into service as a poor-man’s garbage collector. Section 9.3.3 (“RCU Usage”) of perfbook provides more detail on these and other RCU use cases.
|
History
October 12, 2021: Self-review. Note that some of the comments are specific to earlier versions of this blog post.
October 13, 2021: Add RCU-protected phased state change use case.
October 18, 2021: Add read/update common code and Tasseroti citation.
You must be logged in to post a comment.