//

Rust: Learnings on generics and references

25.10.2022 | 8 minutes of reading time

Rust: Generics and references

TL;DR: When in doubt, use struct S<T> { t: T } instead of struct S<'a, T> {t:&'a T}?

Rust is one of the few modern languages that does not hide the complexity of object lifetime and cleanup. Instead, it is designed such that problems like use after free or concurrent mutation of state are almost impossible. This usually goes as far as: if your rust code compiles, then it won't crash. This is something one usually does not experience – neither interpreted languages, nor with low level compiled languages.

However, the aforementioned complexity also collects its toll when learning the language. Designing an API may feel a lot more difficult because of the constraints enforced by the compiler – especially when new to Rust. Another reason for that might be that experience with other languages (even the non-GC ones like C++) may sometimes lead you into the wrong direction, which is what this blog post is about.

The following tries to summarize one of my key learnings from working on rs-plode, a small toy project for computing graphs layouts. With a background in C++, which like Rust uses the & for references, I probably went the wrong way with my initial API draft :-).

layouted graph Sample graph (tree) generated with rs-plode. The animation depicts the adaptions of node positions implemented according to Fruchterman & Reingold.

In the following, I will assume that the reader has a basic understanding for Rusts traits, trait bounds, references, and borrowing. Far from being an expert, I hope I mostly get things right – Please correct me if I don't! If you want to play along and try out some things on your own, I can highly recommend the Rust Playground.

In rs-plode I wanted the API to be flexible and intuitive to use. I came up with the something similar as this:

1/// Hold the location of the nodes as determined by a layout engine.
2///
3/// The graph is needed such that  later on, lines between connected nodes can be rendered.
4struct Layout<'a, G: Graph> {
5    graph: &'a G,
6    locations: Vec<(f32, f32)>, // not really relevant here 
7}
8
9/// Implementing this trait allows the user to call layout on his graph structures.
10trait Graph {
11    fn nodes(&self) -> ...; // allows iteration over node
12    fn edges(&self) -> ...; // allows iteration over edges
13
14    fn layout<'a, E: LayoutEngine>(&'a self, engine: E) -> Layout<'a, Self> {
15        engine.layout(self)
16    }
17}
18
19/// Defines how to arrange nodes in a 2D space.
20trait LayoutEngine {
21    fn layout<'a, G: Graph>(self, graph: &'a G) -> Layout<'a, G>;
22}
23

This allows to implement different layout modes, and they then can be used with anything that implements the Graph trait. The returned Layout will contain a reference to the graph (e.g. to render the edges between the nodes) and the location of the nodes. In hindsight, I think the design (holding a reference in the Layout) is flawed, and the following paragraphs will try to explain.

Heads-up on references

The rust documentation states, that "references are a first class citizen of the language". One way to think of this is as "a reference to a type T is its own distinct type" denoted as &T (in C++ the corresponding entity would probably be std::reference_wrapper<T>). A good demonstration for this distinction are function arguments:

1struct S {}
2
3fn foo(s: &S) {} // function wants a reference to S, not an S.
4
5fn main() {
6    let s: S = S {}; // a value of type S
7    // foo(s); // call foo with value => error
8    foo(&s); // call foo with reference => works
9}
10

The &s creates a reference to s – it borrows s – so the argument type matches the signature of foo.

Implicit vs Explicit reference usage

Explicit reference

In the scenario of the introduction the generic layout function returns a Layout that contains a handle on the graph argument – i.e. it enriches the argument with some extra information. Let's reword and simplify this (drop the trait bound for now):

1fn enrich_explicit<'a, G>(g: &'a G) -> Explicit<'a, G> {
2    Explicit { g /* other fields */ }
3}
4
5struct Explicit<'a, G> {
6    g: &'a G
7    /* other fields */
8}
9

Because references are first class, the type parameter G of that struct may still be both, a value G or a reference type (&G). I.e. the field g will then be a reference to a value (&'a G), or a reference to another reference (&'a &G). That is probably not what was intended.

1fn main() {
2    let hello: String = "Hello World".to_string();
3    enrich_explicit(hello); // Error: expected reference, found struct
4    enrich_explicit(&hello); // returns Explicit<'a, String> where 'a denotes the lifetime of the hello variable.
5    enrich_explicit(&&hello); // returns Explicit<'a, &String> where 'a denotes the lifetime of the temporary &hello (???).
6}
7

You might wonder how &&hello is even allowed, isn't that a reference to a temporary: &{&hello}. Rust got you covered!.

Implicit reference

The alternative to above Explicit struct is to simply omit the reference and lifetime from the struct declaration and the function signature:

1
2fn enrich_implicit<G>(g: G) -> Implicit<G> {
3    Implicit { g }
4}
5
6struct Implicit<G> {
7    g: G
8}
9
10fn main() {
11    let hello: String = "Hello World".to_string();
12    enrich_implicit(&&hello); // returns Implicit<&&String>
13    enrich_implicit(&hello); // returns Implicit<&String>
14    enrich_implicit(hello); // works, but needs to be last, as it consumes hello.
15}
16

Again, the type parameter G can be both, a plain value, or a reference which leads to a couple of interesting properties discussed in the next paragraph.

Comparison and Trait Bounds

So now we have two possibilities to store references in a generic struct:

1struct Explicit<'a, G> {
2    g: &'a G
3}
4
5struct Implicit<G> {
6    g: G
7}
8

The obvious arguments for one or the other approach are:

  • The implicit approach is fully transparent or symmetric w.r.t. how it deals with ownership and references. It is more generic in the sense that the user decides what to store.
  • With the implicit code the user is only confronted with lifetime complexity when references are actually used as type arguments. Even then, the lifetimes are a "part of the type parameter".
  • A good library/API does not unnecessarily constrain the user. I.e. if the remaining invariants of the API are indifferent w.r.t. &G and G, then the choice should be left to the user.
  • The explicit code needs the lifetime annotations. While not a big deal in the small example, this can get out of hand and quite annoying if needed to be repeated often.
  • Explicit code is often easier to understand compared to its implicit counterpart.

Seems like overall, the implicit approach is simpler and more flexible and a clear winner. However, we dropped the trait bounds and adding them back in will be the next step!

1pub trait Graph {}
2
3pub struct Implicit<G: Graph> {
4    g: G
5}
6
7pub struct Explicit<'a, G: Graph> {
8    g: &'a G
9}
10
11impl Graph for Vec<(usize, usize)> {}
12
13fn main() {
14    let graph: Vec<(usize, usize)> = vec![(0, 1), (1, 2), (2, 1)]; // a triangle
15    Implicit { g: graph.clone() };  // Works: G=Vec<(usize, usize)> which implements Graph
16    // Implicit {t: &graph }; // Error: G=&Vec<(usize, usize)> and "Graph not implemented for &Vec<(usize, usize)>"
17  
18    // Explicit { t: graph.clone() }; // Error: expected ref, found Vec<(usize, usize)>
19    Explicit { g: &graph };     // Works: G=&Vec<(usize, usize)> which implements Graph
20}
21

Because Graph is only implemented for Vec<(usize, usize)>, we lost the previous flexibility. I.e. we cannot put a &Vec<(usize, usize)> into Implicit anymore. However, if it makes sense for the trait Graph, we can provide blanket implementations for reference types:

1impl<G> Graph for &G where G: Graph
2{}
3
4fn main() {
5    let graph: Vec<(usize, usize)> = vec![(0, 1), (1, 2), (2, 1)]; // a triangle
6    Implicit { t: graph.clone() }; // Works: G=Vec<(usize, usize)> which implements Graph
7    Implicit { t: &graph };        // Works: G=&Vec<(usize, usize)> which implements Graph, because Vec<(usize, usize)> implements Graph
8
9    // Explicit { t: graph.clone() }; // Error: expected ref, found Vec<(usize,usize)>
10    Explicit { t: &graph };           // Works: G=&Vec<(usize, usize)> which implements Graph
11}
12

And we can do so even for smart pointers:

1use std::rc::Rc;
2
3impl<G> Graph for Rc<G> where G: Graph {
4}
5
6fn main() {
7    let graph: Vec<(usize, usize)> = vec![(0, 1), (1, 2), (2, 1)]; // a triangle
8    Implicit { t: graph.clone() }; // Works: G=Rc<Vec<(usize, usize)>> which implements Graph because Vec<(usize, usize)> implements Graph
9    // Explicit { t: rc.clone() }; // Error: expected ref, found Rc
10    Explicit { t: &graph }.foo();  // Works: G=Rc<Vec<(usize, usize)>> which implements Graph because Vec<(usize, usize)> implements Graph
11}
12

Note that it is not recommended to provide multiple blanket implementations via the Deref trait because of reasons.

The last example already hints at the rule of thumb "use implicit over explicit" - Imagining an usecase where it makes sense to store a &Rc<G> is quite hard...

Conclusion

In hindsight, think I got lured into the struct S<'a, T> {t:&'a T} approach by my (slightly outdated) C++ gut feelings. I'll have to go back and fix that in rs-plode GH-1. This issue also points out the biggest drawback of the explicit reference approach:

1// This won't work, as layout is bound to lifetime of graph
2fn get_some_layout()->Layout<'?, SomeGraph>{
3   let graph = SomeGraph::default();
4   return graph.layout(SomeEngine{}); 
5}
6

Another (probably good) option could be to align with some of the rust standard library design decisions where the choice of ownership is left to the user:

1trait Graph {
2 fn layout<'a, E: LayoutEngine>(&'a self, engine: E) -> Layout<&'a Self>
3 fn into_layout<E:LayoutEngine>(self, engine:E)-> Layout<Self>
4}
5

Besides the TL;DR from the beginning, a major other conclusion I draw from this endeavor is that writing a small application in rust with a given set of libraries is rather straight forward - compared to coming up with a good, flexible and generic library design of your own!

share post

Likes

0

//

More articles in this subject area\n

Discover exciting further topics and let the codecentric world inspire you.

//

Gemeinsam bessere Projekte umsetzen

Wir helfen Deinem Unternehmen

Du stehst vor einer großen IT-Herausforderung? Wir sorgen für eine maßgeschneiderte Unterstützung. Informiere dich jetzt.

Hilf uns, noch besser zu werden.

Wir sind immer auf der Suche nach neuen Talenten. Auch für dich ist die passende Stelle dabei.