Rust – an introduction

A few weeks ago it was that time again. After many years of software development in Java and Groovy, and then mainly in JavaScript and TypeScript on both the backend and the frontend, I decided to learn a new programming language. Every now and then it makes sense to think outside the box, pick up new concepts, and thereby gain different perspectives and new insights.

You might wonder why I chose Rust for this purpose. Compared to Java and JavaScript, it is a rather low-level language. There is no garbage collector and no managed runtime environment. The source code is compiled directly to machine code for the target system using the LLVM compiler infrastructure, which is why Rust is also called a systems programming language. In that respect, Rust is much closer to C/C++ than to Java or JavaScript. Yet Rust has held the top spot as the most loved programming language in Stack Overflow’s annual developer survey for five years in a row. In this article I would like to outline possible reasons for this.

My attention was first drawn to Rust when I was looking into Deno. This new runtime for JavaScript and TypeScript is implemented largely in Rust and attempts to address many of the conceptual limitations and shortcomings of Node.js. As an introductory read, I can recommend the blog article by my colleague Felix Magnus (in German).

Rust is from Mozilla

Rust was developed at Mozilla and is still a relatively young programming language. The first numbered pre-alpha version of the Rust compiler appeared in January 2012, and the first stable release was published on May 15, 2015. Stable releases have been published every six weeks since then. New features are developed in nightly Rust and then spend six weeks in a beta release before they reach stable. A new Rust edition is produced every two to three years. It combines the features shipped in previous releases into one neat package, with fully updated documentation and tooling. After the first edition in 2015, one more edition has appeared so far, in 2018, and the next one is expected in 2021.

Even though Mozilla had to lay off 250 employees last year due to the long-term effects of the COVID-19 pandemic, and the Rust team was equally affected, the future of this programming language is by no means endangered. Mozilla, which uses Rust in the development of its Firefox browser, is no longer the only company relying on it. Many well-known companies implement software with Rust today. Amazon, for example, uses Rust in several of its infrastructure services (e.g. S3, Route 53, CloudFront and others). Google uses Rust for parts of its experimental operating system Fuchsia, and Microsoft has also experimented with Rust for secure and safety-critical software components. A long list of companies using Rust in production can be found here.

In order to put the development of Rust on a sound footing for the future, the Rust Foundation was founded in February 2021. This non-profit organization has taken over ownership of all Rust trademarks and domain names from Mozilla and bears financial responsibility for their costs. The goal of the foundation is to manage and further develop the Rust programming language and its ecosystem as well as to support the maintainers of the project. The founding members of the foundation are AWS, Huawei, Google, Microsoft, and Mozilla. Other companies interested in the development of Rust can contribute through the foundation.

Rust is mature

Rust is easy to install. It has very good documentation, first-class development tools, and Cargo, which serves as package manager, build tool, and test runner in one. The installation already brings everything you need as a developer. After that, all you really need to do is install an IDE, if you don’t already have one, and you’re ready to go. To try out simple snippets, the online Playground is sufficient. The standard library of Rust is very powerful. Many things you need for your project can already be found there. If something is missing, you can probably find it on crates.io. This central registry contains the packages (crates) that the community has published. You can compare it with the npm registry in the JavaScript world.

Rust uses an ownership concept for memory management

In Rust, the developer decides whether data lives on the stack or on the heap. The compiler determines at compile time when memory is no longer needed and inserts the code that frees it. This allows efficient use of memory as well as very fast memory access.

This does not mean you have to be afraid of a malloc/free hell like in C. Rust manages memory in an elegant way via an ownership concept and the so-called borrow checker. That does not mean that ownership and the borrow checker are trivial. On the contrary, they make it more difficult to get started with Rust than with other programming languages. Once you have understood them, however, you have to worry about memory management about as little as in programming languages with a garbage collector. Since at least one more blog article would be necessary to describe this topic, which is central to learning Rust, I would like to refer here to the official documentation.
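To give at least a first impression, here is a minimal sketch of moving and borrowing a value. The function names consume, borrow_len and append_exclamation are made up for this illustration and do not come from any library.

```rust
// Takes ownership of the String; the caller can no longer use it afterwards.
fn consume(s: String) -> usize {
    s.len()
} // `s` goes out of scope here and its heap memory is freed

// Only borrows the text; the caller keeps ownership.
fn borrow_len(s: &str) -> usize {
    s.len()
}

// Borrows mutably: only one mutable borrow may exist at a time.
fn append_exclamation(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut greeting = String::from("hello");

    let n = borrow_len(&greeting); // shared borrow, ownership stays here
    append_exclamation(&mut greeting); // exclusive borrow
    println!("{} (length before the change: {})", greeting, n);

    let moved = greeting; // ownership moves to `moved`
    // println!("{}", greeting); // would not compile: the value was moved
    println!("{}", consume(moved));
}
```

No explicit free appears anywhere: the memory of the string is released when its current owner goes out of scope.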

Security and performance

Rust is a statically typed programming language with very good type inference, i.e. you rarely have to specify types explicitly. Most of the time, the compiler can determine them itself from the context.

Due to its strong type system in combination with the ownership concept and the borrow checker, Rust provides very strong memory safety, which is checked and enforced at compile time. When a (safe) Rust program compiles, you can be sure that every reference points to an existing, valid value, quite unlike in many other programming languages. The values null and undefined do not exist in Rust. Instead, you use the type Option<T>. A value of this type is either Some with a value of type T, or None. The compiler verifies that the None case is handled in the source code.
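A small sketch of how Option<T> is typically used follows. The functions first_even and describe are hypothetical helpers written only for this example.

```rust
// Returns the first even number in the slice, or None if there is none.
fn first_even(numbers: &[i32]) -> Option<i32> {
    numbers.iter().copied().find(|n| n % 2 == 0)
}

// The compiler forces us to handle both the Some and the None case.
fn describe(numbers: &[i32]) -> String {
    match first_even(numbers) {
        Some(n) => format!("first even number: {}", n),
        None => String::from("no even number"),
    }
}

fn main() {
    println!("{}", describe(&[1, 3, 4, 5]));
    println!("{}", describe(&[1, 3, 5]));

    // Alternatively, supply a default for the None case.
    let fallback = first_even(&[]).unwrap_or(0);
    println!("{}", fallback);
}
```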

Rust can also rule out data races at compile time. More general race conditions, such as deadlocks, are not covered by this guarantee. A data race occurs when

  • two or more pointers access the same data at the same time,
  • at least one of the pointers is used to write to the data, and
  • no mechanism is used to synchronise access to the data.

A closely related example that does not even need threads is iterating over a vector while adding elements to it or removing elements from it. The same borrowing rules reject such code.
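The following sketch shows the idea. The rejected loop is only included as a comment, because it does not compile. The function remove_even is a made-up name, and retain is just one possible way to express the operation safely.

```rust
// Removes all even numbers from the vector in place.
fn remove_even(v: &mut Vec<i32>) {
    // Rejected by the borrow checker: the loop holds a shared borrow of `v`
    // while `remove` needs exclusive (mutable) access to it.
    //
    // for (i, x) in v.iter().enumerate() {
    //     if x % 2 == 0 {
    //         v.remove(i);
    //     }
    // }

    // Accepted: `retain` takes the exclusive borrow for the whole operation.
    v.retain(|x| x % 2 != 0);
}

fn main() {
    let mut numbers = vec![1, 2, 3, 4, 5];
    remove_even(&mut numbers);
    println!("{:?}", numbers);
}
```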

Because Rust detects many more problems at compile time than other programming languages, working with it can be very frustrating, especially for a novice. However, errors at compile time are much better than sporadic problems at runtime on production systems. The error messages from the Rust compiler are generally very good and usually help the developer locate the problem.

The two most important goals in the development of Rust were robustness and maximum performance. All abstractions offered by the programming language show the same performance as the corresponding handwritten code (zero-cost abstractions). In addition, the safe and efficient handling of concurrent programming is another important goal of Rust. Thanks to its strict type checking and the ownership concept, Rust can find many concurrency problems at compile time.
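As an illustration of the concurrency guarantees, here is a minimal sketch in which several threads increment a shared counter. The helper parallel_count is hypothetical. Arc and Mutex come from the standard library; if you tried to share a plain mutable integer between the threads instead, the program would not compile.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `threads` threads that each increment a shared counter
// `increments` times, then returns the final count.
fn parallel_count(threads: usize, increments: usize) -> usize {
    // Arc provides shared ownership across threads,
    // Mutex synchronises access to the value inside.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for all threads to finish.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```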

Rust is expression-oriented

Rust is an expression-oriented language. This means that most language constructs are expressions that are evaluated to a value. For example, in Rust, even a block enclosed in curly braces to create a new scope is an expression:
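A minimal sketch of such a block expression could look like this. The function names block_value and classify are made up for the illustration.

```rust
fn block_value() -> i32 {
    // The block is an expression; its value is assigned to `y`.
    let y = {
        let x = 3;
        x + 1 // no semicolon: this is the value of the block
    };
    y
}

// `if` is an expression as well, so it can be used as a return value.
fn classify(n: i32) -> &'static str {
    if n > 3 { "big" } else { "small" }
}

fn main() {
    let y = block_value();
    println!("{} is {}", y, classify(y)); // prints "4 is big"
}
```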

Statements, unlike expressions, are commands that perform an action but do not produce a value. There are only two kinds of statements in Rust: declaration statements and expression statements. The former are, for example, definitions of functions or variables like let x = 5;. Since a let statement does not produce a value, the compiler reports an error for let x = (let y = 5);. If you omit the second let and write let x = y = 5; (with y declared beforehand as a mutable variable), the code compiles but is not very meaningful: the assignment y = 5 is an expression that evaluates to the unit value () rather than to 5, so x ends up as ().

A program is always a sequence of statements, not expressions. That’s why there is a semicolon at the end of each program line in Rust, which turns expressions into statements and separates them from each other. These statements are then called expression statements. This also explains why the semicolon is omitted at the end of a function and in the last line in the code block above:
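A function of this kind might look like the following sketch. The names add_one and log_value are chosen only for this example.

```rust
// The last expression, without a semicolon, is the return value.
fn add_one(x: i32) -> i32 {
    x + 1
}

// A function without a declared return type returns the unit type ().
fn log_value(x: i32) {
    println!("value: {}", x);
}

fn main() {
    let y = add_one(5);
    log_value(y);

    // The "nothing" returned by log_value is the single value ().
    let unit: () = log_value(y);
    assert_eq!(unit, ());
}
```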

The expression x + 1 is the return value of this function as well as in the code block further above. If you add a semicolon after this expression, it is a statement. But then the compiler would show an error, because the function would return nothing instead of an i32 integer. By the way, functions that return nothing have the so-called unit type (). This type has exactly one value – the empty tuple ().

With the return expression a function is terminated early and the return value is passed to the calling function.

If you write a semicolon after the return expression, the semantics do not change. The resulting expression statement still executes the return expression, i.e. the function is terminated and the return value is passed to the calling function.
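As a small sketch of an early return (first_negative is a hypothetical helper):

```rust
// Returns the first negative number, leaving the loop and the function
// as soon as one is found.
fn first_negative(numbers: &[i32]) -> Option<i32> {
    for &n in numbers {
        if n < 0 {
            return Some(n); // early exit; the semicolon is optional here
        }
    }
    None // regular return value: last expression without semicolon
}

fn main() {
    println!("{:?}", first_negative(&[3, -7, 5, -1]));
    println!("{:?}", first_negative(&[1, 2, 3]));
}
```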

Neither classes nor interfaces

In Rust there are no classes that implement interfaces or inherit data and methods from other classes. Instead of class there is struct, which defines the data (fields) of a type; its behavior (methods) is added in separate impl blocks. Structs cannot inherit from each other. Instead of interfaces, structs can implement traits which, like interfaces, define the same behavior (method signatures) for different types. Thus, a trait tells the Rust compiler what behavior a type has and can share with other types. Traits can also contain default implementations of methods (comparable to default methods in Java interfaces), so the same code can be reused by many different types. In addition, traits can be implemented for already existing types, for example for types from the standard library.
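The following sketch illustrates these points. The trait Describe and the struct Circle are invented for this example and are not part of the standard library.

```rust
trait Describe {
    // Must be provided by every implementing type.
    fn name(&self) -> String;

    // Default implementation that all implementors inherit.
    fn describe(&self) -> String {
        format!("This is {}.", self.name())
    }
}

// The struct only defines the data ...
struct Circle {
    radius: f64,
}

// ... methods are added in an impl block.
impl Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Describe for Circle {
    fn name(&self) -> String {
        format!("a circle with radius {}", self.radius)
    }
}

// A trait can also be implemented for an existing type such as i32.
impl Describe for i32 {
    fn name(&self) -> String {
        format!("the integer {}", self)
    }
}

fn main() {
    let circle = Circle { radius: 2.0 };
    println!("{} Area: {:.2}", circle.describe(), circle.area());
    println!("{}", 42_i32.describe());
}
```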

Functional programming

The design of Rust has been inspired by many existing languages and techniques. However, a major influence was functional programming whose principles and concepts can be found in Rust as follows:

  • All variables and references are immutable by default, unless they are explicitly qualified with mut.
  • You can classify types by the number of values they can have. For example, the unit type in Rust has only one value, bool has two values, u8 has 256 values, and so on. The types that Rust comes with can also be combined by addition and multiplication (in terms of number of values). When combining data types in this way, the laws of algebra apply, which is why they are called Algebraic Data Types (ADTs). In Rust one uses enums to add types and tuples or structs to multiply types. The Rust standard library already contains many ADTs like Option or Result.
  • A feature of Rust that plays very well with the algebraic data types is pattern matching. It uses patterns in a special syntax to extract or destructure data from the structure of types. In the simplest case, it is very similar to JavaScript’s destructuring:

    However, the syntax of the patterns in combination with the match expression is much more powerful in Rust:

    If you omit the last line in the match expression here, there is a compiler error, since match expressions must be exhaustive, i.e. all possibilities for the value must always be considered.

  • Functions have a type and can be assigned to variables or referenced by variables like all other values. They are so-called first-class citizens.

  • Functions can also be passed as arguments to other functions or returned by other functions (so-called higher order functions).

  • Lambda expressions are anonymous functions that are specified directly at the point where they are called or passed as an argument to another function. As in other languages, they are also used in Rust to encapsulate a few lines of code which are passed to functions and methods. For example, to double all the values in a list of numbers and then sum them up, you simply write (1..101).map(|x| x * 2).fold(0, |x, y| x + y). The expression 1..101 is here the range of numbers from 1 to 100.

  • With the help of lambda expressions functions can also be applied partially.

  • Anonymous functions are not only easier to write than named functions. They also have access to the scope or environment in which they are defined.

    The anonymous function in this example can access the variable v defined outside of it. It is therefore a so-called closure that captures its environment. The code may seem a bit contrived but there are many use cases for closures. For example, a program can define a closure in the main thread and then execute it in a new thread. Since a closure captures its environment, the child thread can use variables defined in the parent thread.

  • For groups of values, Rust provides various data structures, the so-called collections. In contrast to the array and tuple types built into the language, the data managed by the collections of the standard library is stored on the heap. This means that the amount of data does not have to be known at compile time. It can grow or shrink during program execution. Such collections are, for example, vectors (dynamic arrays), queues, lists, maps, sets, and strings.

  • Iterators can be used to walk through the elements of the collections and perform operations on them. As in other functional programming languages, iterators are lazy in Rust, i.e. they have no effect until methods are called that actually use or consume the iterator.

    In this example, the expression (1..) describes an infinite sequence of integer values starting at 1. However, this infinite sequence is never actually materialized, which would not even be possible. Instead, a RangeFrom value is created, which is itself an iterator. When this iterator is used, the three operations filter, take and map are applied in order. Since take(5) terminates the iteration after the fifth element that passes the filter, only the first five odd numbers are multiplied by themselves. At the time filter, take and map are called, only iterators are created and chained. The operations defined in this way are executed on the elements only when the resulting iterator is actually consumed. In the example above, this happens only when the collect method is called, which converts the iterator into a vector.
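The following sketch brings several of the points from the list above together: destructuring and exhaustive match expressions, partial application with a closure, a closure that captures a variable v from its environment, and a lazy iterator chain. All names (Shape, area, add_to, scale_all, first_odd_squares and so on) are assumptions made for this illustration.

```rust
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
    Point,
}

// The match must be exhaustive: removing the last arm is a compile error.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
        Shape::Point => 0.0,
    }
}

fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Partial application: fix the first argument of `add` with a closure.
fn add_to(a: i32) -> impl Fn(i32) -> i32 {
    move |b| add(a, b)
}

// The closure captures `v` from the surrounding scope.
fn scale_all(values: &[i32], v: i32) -> Vec<i32> {
    let scale = |x: &i32| x * v;
    values.iter().map(scale).collect()
}

// Doubles the numbers 1 to 100 and sums them up.
fn double_sum() -> i32 {
    (1..101).map(|x| x * 2).fold(0, |x, y| x + y)
}

// Lazy iterator chain: nothing is computed until `collect` is called.
fn first_odd_squares(n: usize) -> Vec<u64> {
    (1u64..)
        .filter(|x| x % 2 == 1)
        .take(n)
        .map(|x| x * x)
        .collect()
}

fn main() {
    // Simple destructuring of a tuple, similar to JavaScript.
    let (x, y) = (1, 2);
    println!("{} {}", x, y);

    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rectangle { width: 2.0, height: 3.0 },
        Shape::Point,
    ];
    for shape in &shapes {
        println!("{:.2}", area(shape));
    }

    let add_five = add_to(5);
    println!("{}", add_five(3)); // prints 8
    println!("{:?}", scale_all(&[1, 2, 3], 10)); // prints [10, 20, 30]
    println!("{}", double_sum()); // prints 10100
    println!("{:?}", first_odd_squares(5)); // prints [1, 9, 25, 49, 81]
}
```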

Rust supports polymorphism – even without classes

Rust uses generics to parameterize structs, enums, functions, and methods over types. This reduces code duplication without sacrificing type safety. At compile time, the compiler generates a specialized copy of the generic code for each concrete type it is used with (monomorphization). Type erasure as in Java does not take place.

Traits can require that the parameter of a generic type has a certain behavior, i.e. that it implements certain methods with a defined signature. Traits used in this way are constraints or bounds for the parameters of generics and make it possible to abstract over different types. Values of different types can thus be used interchangeably, provided they implement the required traits. In this way, Rust implements polymorphism (so-called bounded parametric polymorphism).
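A minimal sketch of generics with trait bounds follows. The function largest and the struct Pair are hypothetical names; PartialOrd, Copy and Display are traits from the standard library.

```rust
use std::fmt::Display;

// Works for every type T that can be compared and copied.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let mut max = iter.next()?; // None for an empty slice
    for item in iter {
        if item > max {
            max = item;
        }
    }
    Some(max)
}

// A generic struct; `show` is only available if T can be displayed.
struct Pair<T> {
    first: T,
    second: T,
}

impl<T: Display> Pair<T> {
    fn show(&self) -> String {
        format!("({}, {})", self.first, self.second)
    }
}

fn main() {
    println!("{:?}", largest(&[3, 7, 2]));
    println!("{:?}", largest(&[1.5, 0.5]));
    println!("{}", Pair { first: "a", second: "b" }.show());
}
```

The compiler generates separate machine code for largest with i32 and with f64, so no runtime type information is needed here.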

The following code example shows how to implement the Summary trait for the three types Point, Vec, and LinkedList. The type of the elements of Vec and LinkedList is a generic parameter that must implement the ToString trait from the standard library. The to_string method defined in this trait is used in the method summary to convert the elements of the vector or list to a string.
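The original listing is not reproduced here, so what follows is a reconstruction sketched from the description, not the author's exact code. The output format of summary, the fields of Point and the helper summarize_all are assumptions.

```rust
use std::collections::LinkedList;

trait Summary {
    fn summary(&self) -> String;
}

struct Point {
    x: i32,
    y: i32,
}

impl Summary for Point {
    fn summary(&self) -> String {
        format!("Point({}, {})", self.x, self.y)
    }
}

// Implemented for every Vec whose elements can be converted to a string.
impl<T: ToString> Summary for Vec<T> {
    fn summary(&self) -> String {
        let items: Vec<String> = self.iter().map(|e| e.to_string()).collect();
        format!("Vec[{}]", items.join(", "))
    }
}

impl<T: ToString> Summary for LinkedList<T> {
    fn summary(&self) -> String {
        let items: Vec<String> = self.iter().map(|e| e.to_string()).collect();
        format!("LinkedList[{}]", items.join(", "))
    }
}

// Calls `summary` on each trait object via dynamic dispatch.
fn summarize_all(items: &[&dyn Summary]) -> Vec<String> {
    items.iter().map(|item| item.summary()).collect()
}

fn main() {
    let point = Point { x: 1, y: 2 };
    let vector = vec![1, 2, 3];
    let mut list = LinkedList::new();
    list.push_back("a");
    list.push_back("b");

    // A vector of trait objects holding three different concrete types.
    let mut summaries: Vec<&dyn Summary> = Vec::new();
    summaries.push(&point);
    summaries.push(&vector);
    summaries.push(&list);

    for line in summarize_all(&summaries) {
        println!("{}", line);
    }
}
```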

The summaries vector in the example above contains references to objects that must implement the Summary trait (so-called trait objects). The keyword dyn tells the Rust compiler that dynamic (runtime) dispatch must be used for calling the method summary, because the objects in the vector can be of different types (Point, Vec, and LinkedList). Rust must hold pointers to virtual method tables (vtables) and then dispatch method calls based on the runtime type of the object. Only in this way is the correct summary method called when iterating over the vector elements.

Metaprogramming

Macros are another very powerful feature of Rust. However, their syntax is intimidating and overwhelming, especially for beginners. With macros you can write code that generates Rust code at compile time. Because of this indirection, macro definitions are generally more difficult to read, understand, and maintain than function definitions. The generation of code by code is often referred to as metaprogramming.

Unlike the C preprocessor, Rust’s macros are not simple text replacements but part of the normal compilation process. This means they behave more like functions, inserted into the code before it is compiled to binary – not as text but directly into the Abstract Syntax Tree (AST). This provides better type safety and minimizes unexpected behavior.

In Rust there are two very different types of macros:

  • Declarative macros use a construct similar to the match expression to generate repetitive code or define domain-specific languages (DSL).
  • Procedural macros allow you to operate on the tokens of the Rust code passed to the macro. Essentially, a procedural macro is a function from one TokenStream to another TokenStream, with the output replacing the macro call. Procedural macros are much more powerful than declarative macros but also more complex.
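As a sketch of the declarative kind, here is a small macro written with macro_rules!. The macro max_of is a made-up example. It accepts a variable number of arguments, which a normal function could not do.

```rust
// Returns the largest of one or more expressions.
macro_rules! max_of {
    // Base case: a single expression.
    ($x:expr) => {
        $x
    };
    // Recursive case: compare the first expression with the maximum of the rest.
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    println!("{}", max_of!(3));
    println!("{}", max_of!(1, 7, 4));
    println!("{}", max_of!(2.5, 1.5));
}
```

The patterns on the left-hand side resemble the arms of a match expression, except that they match on Rust syntax instead of values.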

Similar to functions, macros can reduce the number of lines of code you have to write. For example, the macro vec! generates approximately the following code to create and initialise a vector:
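Roughly speaking, and as a simplification of what the real macro expands to, vec![1, 2, 3] corresponds to the hand-written code in the following sketch. The function build_by_hand is a name chosen for this illustration.

```rust
// Approximately what `vec![1, 2, 3]` saves you from writing.
fn build_by_hand() -> Vec<i32> {
    let mut v = Vec::new();
    v.push(1);
    v.push(2);
    v.push(3);
    v
}

fn main() {
    let from_macro = vec![1, 2, 3];
    assert_eq!(from_macro, build_by_hand());
    println!("{:?}", from_macro);
}
```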

Note: To distinguish the call of a macro from a function call, a ! is appended to the macro name.

Macros also have some capabilities that functions do not. For example, unlike functions, a variable number of parameters can be passed to macros. You can see this in several places in the code examples in this article. There, the macros println! and format! are called with a different number of arguments depending on the first parameter – the format string.

Since the macro code is generated before the compiler interprets the meaning of this code, a macro can implement a trait for a type. A function cannot, because it is called at runtime, whereas the trait implementation must already exist at compile time. For example, if you want to print an object to the console with the println! macro (e.g. for debugging), this object must implement the Debug trait. However, you can save yourself the boilerplate code necessary for this. With the derive macro for Debug, the implementation of the Debug trait is generated automatically, as you can see in this example:
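A minimal sketch of this, with a Point struct assumed for illustration:

```rust
// The derive attribute generates the Debug implementation at compile time.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn debug_string(p: &Point) -> String {
    format!("{:?}", p)
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?}", p); // prints "Point { x: 1, y: 2 }"
    println!("{}", debug_string(&p));
}
```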

Macros are a big and important topic in Rust. You can find a good introduction to them here.

Rust is very versatile

The Rust compiler generates a native executable for the respective platform that needs no separate runtime (such as the JVM or Node.js). Thus, a Docker image running a Rust application can be many times smaller than one for a corresponding Java or Node.js application. An Alpine Docker image with a simple Node.js web service quickly grows to more than 200 MB in size, whereas a corresponding web service written in Rust and packaged in the same base image is less than 20 MB in size. This makes Rust a very good match for services in the cloud or in a Kubernetes cluster.

In Rust you can implement very resource-efficient (low footprint) applications. You have full access to the hardware, similar to C/C++ (although you should only use it if really necessary), and platform support is excellent. This makes Rust particularly well suited for embedded software, such as microcontroller applications or sensors on IoT devices, but also for agents that monitor systems. Furthermore, since no garbage collector briefly freezes the application to free memory, Rust can also be used to implement real-time applications or even OS kernels.

Rust source code can be compiled not only into machine code for the supported platforms. It is also possible to compile it into WebAssembly, a binary format that loads very efficiently and can now be executed in all modern web browsers. WebAssembly promises to execute performance-critical parts of a web application, such as animations and simulations, with near-native performance in web browsers alongside JavaScript. This also makes Rust interesting for developers who want to implement games for the browser. Other programming languages, such as Go or C#, can also be compiled to WebAssembly. However, they generate larger WebAssembly binaries than Rust, since at least parts of their runtime environment must be included in the file as well. Rust has an advantage there, since it does not need its own runtime environment. Similarly good results can probably only be achieved with C or C++ at the moment.

The book Rust and WebAssembly explains everything you need to know about compiling Rust to WebAssembly. You should already have some knowledge of Rust and be familiar with JavaScript, HTML, and CSS, but you don’t have to be an expert in any of these areas. In the book you implement Conway’s Game of Life step by step as a browser app. Performance-critical parts are implemented in Rust and compiled to WebAssembly. I had a lot of fun working through the tutorial and I highly recommend doing the same. If you are short on time and still want to take a look at the implementation, you can find my solution with some minor improvements in this GitHub repo. In the readme of the repo you will find everything you need to know to start the app in the browser.

Your next steps

If I have piqued your interest and you want to learn more about the programming language Rust, you should probably read the blog article by Elisabeth Schulz next. It compares Rust with Java, describes various features of the language and explains some concepts, like the already mentioned ownership. If you want to go really deep, I recommend The Rust Programming Language, affectionately called “the book”. It gives an overview of the language, explains concepts and principles, and lets you build some sample projects on your way to a deeper understanding of the language. However, if reading several hundred pages about a programming language is not your thing, then probably Rust By Example is for you. It contains a lot of code examples and also some exercises.

Finally I would like to point out two more websites that might be helpful for getting into Rust. At cheat.rs you will find a very good and extensive cheat sheet that will help you get started with your first Rust project. However, if you first want to compare some programming idioms of your current favorite language with Rust, you should have a look at programming-idioms.org. Last but not least, have fun coding in Rust!

For almost 20 years, Falk Edelmann worked as a developer and architect in one of the largest software companies in the world and was involved in the development of some very successful products. He can bring his extensive knowledge and years of experience to every project and drive it forward.

Post by Falk Edelmann
