Let’s dive into Rust pointers, a concept that all developers work with in Rust. In this post, we’ll give you an overview of pointers and then focus on smart pointers. Pointers in Rust can be a bit confusing for new developers, but don’t worry! Our goal with this post is to help you gain a better understanding of them.
If you’re not familiar with the terms “stack” and “heap,” I recommend checking out this article before continuing with this post.
So, what exactly is a pointer in Rust? When you create a new variable, Rust stores the data in memory. If you need a reference, it returns a reference to the memory location. In this context, a pointer in Rust is a reference that points to where the data exists.
As an analogy, imagine that you, the reader, are a variable and you know where the milk in the supermarket is. In this scenario, you act as a pointer because you point to where the milk is. The same logic applies to pointers in Rust: the variable holds the knowledge of where the data exists in memory.
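
In code, the analogy might look like the following minimal sketch. The variable name `a` and the value `hello rust` follow the rest of this post; `address_of` is a hypothetical helper used only for illustration, and the address printed will differ from run to run.

```rust
/// Hypothetical helper: returns the memory address a string reference points to.
fn address_of(text: &str) -> usize {
    text.as_ptr() as usize
}

fn main() {
    // `a` does not contain the text itself; it knows where the text is stored.
    let a: &'static str = "hello rust";

    println!("a points to {:#x}", address_of(a));
    println!("the value at that location is {}", a);
}
```
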
Suppose we create a variable named `a` which holds the string value `hello rust` and is stored at a memory location (e.g., `0xd3e100`). Rust stores the reference on the stack. For a string literal like this one, the text itself lives in the program's static memory; an owned `String` would keep its text on the heap instead. You can imagine it looks something like this:
| location | value |
| --- | --- |
| 0xd3e100 | hello rust |
We can also create a pointer that stores the memory location of another reference.
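
A minimal sketch of such a pointer, assuming `b` is created by borrowing `a`. The names `a` and `b` follow the surrounding text; `value_behind` is a hypothetical helper.

```rust
/// Hypothetical helper: follows a reference to a reference back to the text.
fn value_behind(b: &&str) -> String {
    // `*b` is `a` again (a `&str`), which can be copied into a `String`.
    (*b).to_string()
}

fn main() {
    let a: &'static str = "hello rust";
    // `b` stores the location of `a`, not the text itself.
    let b: &&'static str = &a;

    // `{:p}` prints the address held by `b`, which is where `a` lives.
    println!("b points to {:p}", b);
    println!("following b leads to: {}", value_behind(b));
}
```
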
| location | value |
| --- | --- |
| 0xd3e100 | hello rust |
| 0xd3e101 | 0xd3e100 |
However, `b` does not simply take over the type of `a`. It keeps the “ground type” of `str`, but wraps it in one more reference: if `a` has the type `&'static str`, then a `b` that borrows `a` has the type `&&'static str`. To access the value of variable `a` from variable `b`, we need to dereference it. Dereferencing means accessing the value a reference points to rather than the reference itself. In Rust, it is possible to create a chain of references, which could have a type starting with `&&&`. To access the value behind such a type, you dereference it by writing `***` before the variable.
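
The chain described above can be sketched as follows. The function `read_through` and the variable names are hypothetical, and an `i32` is used instead of a string to keep the types short.

```rust
/// Walks a chain of three references back to the integer.
fn read_through(chain: &&&i32) -> i32 {
    // Each `*` removes one level of reference: &&&i32 -> &&i32 -> &i32 -> i32.
    ***chain
}

fn main() {
    let value = 42;
    let one = &value; // &i32
    let two = &one; // &&i32
    let three = &two; // &&&i32

    println!("{}", read_through(three));
    // `Display` is implemented for references, so printing needs no explicit `*`.
    println!("{}", three);
}
```
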
Pointers provide the flexibility to store data on the heap, particularly when the size of the data is not known at compile time or might change over time while the memory location stays the same. A great example of this scenario is when you box a value and work with `dyn` traits, as explained in more detail here. On that page, a client is boxed so that it can be used as a `dyn` trait object.
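
A minimal sketch of boxing a client might look like this. The `Client` trait, the `HttpClient` struct, and `make_client` are hypothetical stand-ins, not the code from the linked page.

```rust
/// Hypothetical trait standing in for the client from the linked page.
trait Client {
    fn name(&self) -> String;
}

/// Hypothetical implementation of the trait.
struct HttpClient {
    host: String,
}

impl Client for HttpClient {
    fn name(&self) -> String {
        format!("http client for {}", self.host)
    }
}

/// Returns a boxed trait object: the concrete type is hidden behind the pointer.
fn make_client(host: &str) -> Box<dyn Client> {
    Box::new(HttpClient {
        host: host.to_string(),
    })
}

fn main() {
    // `a` is a pointer on the stack; the `HttpClient` itself lives on the heap.
    let a: Box<dyn Client> = make_client("example.com");
    println!("{}", a.name());
}
```
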
This creates something like this:
| location | value |
| --- | --- |
| a | 0xd3e100 |
References
In Rust, the most common pointer you’ll come across is the reference. You can identify a reference by looking at the type of the variable: if the type starts with `&`, it’s a reference. For instance, the type `&i32` is a reference pointing to data of type `i32`. If the type starts with `&` or `&'a`, you are working with a shared (immutable) reference. If it starts with `&mut` or `&'a mut`, you are working with a mutable reference.
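
As a small sketch of both kinds of reference (the function names `read` and `add_one` are hypothetical):

```rust
/// Takes a shared reference: the value can be read but not changed.
fn read(value: &i32) -> i32 {
    *value
}

/// Takes a mutable reference: the value can be changed in place.
fn add_one(value: &mut i32) {
    *value += 1;
}

fn main() {
    let mut number = 41;

    add_one(&mut number); // passes a `&mut i32`
    let shared: &i32 = &number; // a shared `&i32`
    println!("{}", read(shared));
}
```
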
Raw pointers
Another type of pointer is the raw pointer. Raw pointers are unsafe and not recommended unless you have a clear understanding of what you are doing. They can be null or unaligned, meaning that the memory location they point to may not exist or may be incorrect. You can identify a raw pointer by the types `*const T` or `*mut T`. A raw pointer is essentially a pointer without any guarantee that the data it points to still exists. For this reason, they are considered harmful to your code unless you are fully aware of what you are doing. Creating a raw pointer is allowed in safe code, but reading or modifying the data behind it must be wrapped in an `unsafe {}` block for the code to compile.
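
A minimal sketch of modifying data through a raw pointer. The function `double_via_raw` is a hypothetical example, and the pointer here is known to be valid because it was just created from a live reference.

```rust
/// Doubles a value by going through a raw pointer instead of the reference.
fn double_via_raw(value: &mut i32) -> i32 {
    // Creating a raw pointer is safe.
    let ptr: *mut i32 = value;
    // Dereferencing it is not, so the compiler requires an `unsafe` block.
    unsafe {
        *ptr *= 2;
        *ptr
    }
}

fn main() {
    let mut number = 21;
    println!("{}", double_via_raw(&mut number));

    // A raw pointer may be null, so check before dereferencing one you did not create.
    let nothing: *const i32 = std::ptr::null();
    println!("is null: {}", nothing.is_null());
}
```
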
Smart pointers
Let’s continue this post with smart pointers, which are pointers with some extra intelligence (obviously). These pointers come with additional metadata and functionality, such as reference counting, which makes sure that data is not discarded while somebody is still using it. So, smart pointers are basically pointers with handy features like keeping track of how many owners are using the data or helping to prevent data races.
One smart pointer that you might not have considered is `String`. The interesting thing about `String` is that it guarantees its contents are always valid UTF-8, and it stores its length and capacity as metadata next to the pointer to its heap data.
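
The metadata can be inspected directly. In this small sketch, `len_and_capacity` is a hypothetical helper; `len`, `capacity`, and `with_capacity` are standard `String` methods.

```rust
/// Returns (length in bytes, capacity in bytes) of a `String`.
fn len_and_capacity(text: &String) -> (usize, usize) {
    (text.len(), text.capacity())
}

fn main() {
    // Reserve room for at least 16 bytes up front.
    let mut text = String::with_capacity(16);
    text.push_str("hello rust");

    let (len, capacity) = len_and_capacity(&text);
    println!("len = {len}, capacity = {capacity}");

    // `String` dereferences to `str`, so `str` methods work directly.
    println!("{}", text.to_uppercase());
}
```
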
Box
While discussing pointers, let’s also talk about `Box` in Rust. `Box` allows us to store data on the heap, even when the size of that data is not known at compile time. Boxes are particularly useful when working with dynamic (`dyn`) traits. The structs that implement a trait can differ significantly in size, as explained here, so to store such values behind a single type we need to box them.
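
A minimal sketch of boxing data, assuming a hypothetical `Shape` trait with two implementations of different sizes. Boxing lets both live in the same `Vec`.

```rust
/// Hypothetical trait used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

struct Rectangle {
    width: f64,
    height: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

/// Sums the areas of shapes whose concrete types differ.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    // A plain value can be boxed too; `*` reads it back from the heap.
    let boxed_number: Box<i32> = Box::new(5);
    println!("{}", *boxed_number + 1);

    // Every element is a pointer of the same size, whatever it points to.
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Square(2.0)),
        Box::new(Rectangle { width: 1.0, height: 3.0 }),
    ];
    println!("{}", total_area(&shapes));
}
```
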
Rc and Arc
The next topic to discuss is `Rc` and `Arc`. Both are similar in that they allow multiple ownership of the same data through reference counting. They both return a pointer that the user can work with.
`Rc` stands for reference counted. It increases the number of owners of a memory location by 1 every time the `Rc` is cloned, and decreases it by 1 every time one of those clones is dropped. The data is only deallocated once the count reaches zero, which prevents it from being freed while it is still needed.
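
The counting can be observed with `Rc::strong_count`. In this small sketch, `count_owners` is a hypothetical helper.

```rust
use std::rc::Rc;

/// Returns the owner count before cloning, while a clone is alive,
/// and after that clone has been dropped.
fn count_owners() -> (usize, usize, usize) {
    let first = Rc::new(String::from("hello rust"));
    let before = Rc::strong_count(&first);

    let during = {
        // Cloning an `Rc` copies the pointer and bumps the count, not the data.
        let _second = Rc::clone(&first);
        Rc::strong_count(&first)
    }; // `_second` is dropped here, so the count goes back down.

    let after = Rc::strong_count(&first);
    (before, during, after)
}

fn main() {
    let (before, during, after) = count_owners();
    println!("{before} -> {during} -> {after}");
}
```
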
However, the problem with `Rc` is that it is designed for single-threaded use only. Its counter is not updated atomically, so the compiler refuses to compile code that sends an `Rc` to another thread. To address this, there is a smarter solution called `Arc`.
`Arc` stands for atomically reference counted and works like `Rc`, but it is thread-safe because the count is updated with atomic operations.
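
A minimal sketch of sharing read-only data between threads with `Arc`. The function `sum_on_threads` and its parameters are hypothetical.

```rust
use std::sync::Arc;
use std::thread;

/// Sums the shared numbers once on each of `workers` threads
/// and adds the per-thread results together.
fn sum_on_threads(numbers: Vec<i32>, workers: usize) -> i32 {
    let shared = Arc::new(numbers);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own pointer to the same data.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.iter().sum::<i32>())
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .sum()
}

fn main() {
    println!("{}", sum_on_threads(vec![1, 2, 3], 4));
}
```

Replacing `Arc` with `Rc` in this sketch should fail to compile, because `Rc` cannot be moved into a spawned thread.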
Something to keep in mind when working with `Arc` and `Rc` is that you should only use `Arc` if you really need to share data between threads, as its atomic operations make it more expensive than `Rc`.
Ref and RefMut
Let’s talk about `Ref` and `RefMut`, two types you can use together with `RefCell`. `RefCell` itself is not a smart pointer, but when you use the methods `borrow` and `borrow_mut`, you obtain smart pointers.
`RefCell` can be compared to `Box`, but there is a significant difference between them. With `Box`, the borrowing rules are checked at compile time, while `RefCell` checks them at runtime. This means that instead of receiving a compile error, your code will panic. `RefCell` is useful when you need to work around the compile-time rules, for example to change a value through a shared reference. `RefCell` does rely on `unsafe {}` code under the hood (it is built on `UnsafeCell`). Therefore, if you are not completely certain about what you are doing, it is better to avoid using `RefCell`.
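
The runtime check can be observed without a panic by using `try_borrow_mut`, which returns a `Result`. In this small sketch, `second_mut_borrow_is_refused` is a hypothetical helper.

```rust
use std::cell::RefCell;

/// Returns true if a second mutable borrow is refused while the first is alive.
fn second_mut_borrow_is_refused() -> bool {
    let cell = RefCell::new(1);
    let _first = cell.borrow_mut();

    // `try_borrow_mut` reports the conflict as an `Err` instead of panicking;
    // calling `borrow_mut` here would panic at runtime.
    let refused = cell.try_borrow_mut().is_err();
    refused
}

fn main() {
    println!("second borrow refused: {}", second_mut_borrow_is_refused());
}
```
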
When you use the `borrow` method provided by `RefCell`, you get a `Ref`, a smart pointer that gives read access to the value. The `RefCell` keeps track of how many `Ref`s are currently alive, which prevents the value from being mutably borrowed while it is still being read.
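When you use the `borrow_mut` method, you get a `RefMut`, which allows you to change the value within the `RefCell`. However, only one `RefMut` may exist at a time, and none while a `Ref` is alive; if you try to create another one, the code will panic. A short sketch of both borrows in use:

```rust
use std::cell::{Ref, RefCell, RefMut};

fn main() {
    let cell = RefCell::new(5);
    let mut writer: RefMut<i32> = cell.borrow_mut(); // the only RefMut allowed right now
    *writer += 1;
    drop(writer); // release the mutable borrow before reading
    let reader: Ref<i32> = cell.try_borrow().unwrap(); // try_borrow returns Err while a RefMut is alive
    println!("{}", *reader);
}
```
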
Thank you for reading this article. I hope you found it informative, or perhaps it served as a refresher for some. If you have any input or feedback, please don’t hesitate to reach out to me on Twitter. If you think there are any changes that could be made, please feel free to contact me or click the “suggest change” button below the title.