May 2019

Rust: A Deeper Dive

Previously, we discussed getting up and running with Rust for an image color analyzer. Today, I would like to dig more into Rust and really start to gain an appreciation for some of its more advanced features. We will take a deeper dive into Rust macros, we'll talk about measuring performance, and then mention one last crate to really improve your Rust.

The Value of Macros

In the previous post we mentioned macros. Today, we are going to dive in. Rust macros have a slightly awkward syntax, and again I will encourage you to visit the following resources to get familiar (here or here). Remember, macros are a code generation tool: macros are expanded at compile time to produce the code that your program actually runs. This means that your macro takes code as input and returns code that replaces it in the Rust program. Cargo Expand is an invaluable tool for viewing the Rust code that your macro will expand to. Below is an example of a basic macro. Take your time to make sure you understand the syntax, but also note that we start to gain an appreciation for the power of macros. Since macros operate on code, we can do nifty things like capture both the value of a variable and the name of the variable passed in. In this example, our not_12_macro! can tell the user of the code that their user-defined variable is not 12. If a reference to the name or syntax of the input would be of particular use to the caller of your function, macros can be very helpful. Think of something like assert! or json! and we can begin to understand the value.
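The original example isn't reproduced here, so below is a minimal sketch of what such a macro might look like. The not_12_macro! name comes from the post; the variable name and the exact message are my own assumptions. The key trick is stringify!, which turns the input tokens into a string literal at compile time, giving us the variable's *name* alongside its value.

```rust
// A declarative macro that reports both the name and the value of the
// expression passed in. `stringify!($var)` becomes the literal source
// text of the argument, captured at compile time.
macro_rules! not_12_macro {
    ($var:expr) => {
        if $var != 12 {
            format!("{} = {}, which is not 12", stringify!($var), $var)
        } else {
            String::from("ok: value is 12")
        }
    };
}

fn main() {
    let user_defined_variable = 7;
    // prints: user_defined_variable = 7, which is not 12
    println!("{}", not_12_macro!(user_defined_variable));
}
```

Notice that an ordinary function could never report the caller's variable name; only a macro, which sees the code itself, can do that.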

A Step Further

Now that we have a sense of what macros are capable of, let us build something with better utility. For this we will move on to procedural macros. Procedural macros can be used to extend interfaces to user-defined types. For instance, in our previous post we mentioned adding the #[derive(Serialize)] attribute to a struct to enable it to be serialized as JSON. But we sort of skipped over what is actually happening when we do this. So here we are going to do it for ourselves, first-hand.

For this example, we will use the infamous rot13 cipher. The rules are simple: take any alphabetical character (in English) and rotate it 13 places, wrapping back around to 'a' if we pass 'z'. It is straightforward to build a nice crate around this functionality. Dealing with string types is easy, but it would be really obnoxious if everybody who wanted to use our crate then had to implement a rot13 interface for their own user-defined types. There's got to be a better way! Here is the power of procedural macros: we can let others import our functionality with the ease of #[derive(Rot13)], and we will generate the Rust code to implement the interface for their type. It turns out that this is rather easy, and the code to do so in an extended example is below. Notice that we are able to define our Rot13 interface for two user-defined types: Message and Text. Specifically, the entry point for such a macro is #[proc_macro_derive(Name)], in our case #[proc_macro_derive(Rot13)].
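The extended example itself isn't shown here, but to make the idea concrete, here is a sketch of the trait our crate might export and the impl that #[derive(Rot13)] would generate for a user's type. The Rot13 trait and the Message type come from the post; the body field and method shape are my assumptions (a real derive would live in a separate proc-macro crate using syn and quote).

```rust
// Hypothetical sketch: the trait our rot13 crate would export.
trait Rot13 {
    fn rot13(&self) -> Self;
}

// Rotate an ASCII letter 13 places, wrapping past 'z' (or 'Z') back to 'a' (or 'A').
fn rot13_char(c: char) -> char {
    match c {
        'a'..='z' => (((c as u8 - b'a' + 13) % 26) + b'a') as char,
        'A'..='Z' => (((c as u8 - b'A' + 13) % 26) + b'A') as char,
        _ => c, // leave digits, punctuation, etc. untouched
    }
}

// A user-defined type, as in the extended example (field name assumed).
struct Message {
    body: String,
}

// This is roughly the code the procedural macro would emit so that the
// user never has to write it by hand.
impl Rot13 for Message {
    fn rot13(&self) -> Self {
        Message {
            body: self.body.chars().map(rot13_char).collect(),
        }
    }
}

fn main() {
    let m = Message { body: String::from("Hello, world") };
    println!("{}", m.rot13().body); // Uryyb, jbeyq
}
```

The derive macro's job is just to stamp out that impl block for whatever struct it is attached to, which is why the caller only ever sees #[derive(Rot13)].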

CPU && Memory Profiling

Next, I would like us to dig into some performance tooling around Rust. For a language whose big draw is performance, I was naturally curious to know where my code from the previous image analyzer project spent most of its time. I figured the Euclidean distance calculations (most of the logic) were pretty cheap, but I was curious nonetheless. I found a great blog post on how to do exactly this. Since Rust can call into C code, it was nearly as simple as installing google-perftools. The exact steps are listed below.

  1. Install google-perftools and libgoogle-perftools-dev to develop against
  2. Introduce code to start/stop profiling (e.g. here)
  3. Install pprof with go get -u
  4. Take the profile output file and use pprof to generate an understandable file (e.g. pprof -svg -output ./profile/analyze.svg profile/analyze)

The above steps will generate an output SVG similar to the one below.

Interestingly enough, the results show something very surprising about my code: 60% of server time was spent in HashMap::get(). This is a considerable amount of time. My (yet-to-be-optimized) code performs many HashMap lookups for each pixel in the image (one for each of the ~10 colors available to check against). That of course means (hundreds of?) thousands of lookups for every image, which leads to two conclusions: 1) we need to stop doing so many lookups, and 2) should it really be that slow? #1 should be very fixable, so naturally I was more concerned with #2. Going straight to the Rust HashMap documentation, we can see right away that it says:

The default hashing algorithm is currently SipHash 1-3, though this is subject to change at any point in the future. While its performance is very competitive for medium sized keys, other hashing algorithms will outperform it for small keys such as integers

My use case involves rather small keys, so it felt like swapping in a new hashing algorithm might make for a considerable improvement. FNV seemed like a great option. I changed my default hasher to FNV and rebuilt the chart, only to find out that my code really was the problem: time spent in HashMap::get had only budged slightly, to 56% (the results can be examined here).
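The post links to its results rather than showing the hasher swap itself, so here is a sketch of what that change might look like. To keep the example dependency-free I hand-roll a minimal FNV-1a hasher; in a real project you would pull in the fnv crate, which exposes an equivalent FnvHashMap alias. The nice part is that the HashMap API is completely unchanged — only the third type parameter differs.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Minimal FNV-1a hasher (the `fnv` crate provides a production version).
struct FnvHasher(u64);

impl Default for FnvHasher {
    fn default() -> Self {
        FnvHasher(0xcbf29ce484222325) // FNV-1a offset basis
    }
}

impl Hasher for FnvHasher {
    fn finish(&self) -> u64 {
        self.0
    }
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
        }
    }
}

// Swapping the hasher is just a type alias; every call site stays the same.
type FnvHashMap<K, V> = HashMap<K, V, BuildHasherDefault<FnvHasher>>;

fn main() {
    let mut colors: FnvHashMap<(u8, u8, u8), &str> = FnvHashMap::default();
    colors.insert((255, 0, 0), "red");
    colors.insert((0, 0, 255), "blue");
    println!("{:?}", colors.get(&(255, 0, 0))); // Some("red")
}
```

Because SipHash's DoS resistance comes at a cost for tiny keys like RGB tuples, this is exactly the kind of swap the documentation hints at — even if, in my case, it only moved the needle a few percent.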


One Last Crate

It wouldn't feel right without leaving you with one great crate recommendation. Who needs code review! There are some seriously great static analysis and code linting tools in most sophisticated languages, and Clippy seems to raise the bar. Below is an example of the output from running Clippy on a previous version of the image color analyzer. Notice how the warnings are a much-needed extension to the compiler: it tells me about unnecessary and excessively verbose code, potential bugs, and redundant logic. Most importantly, each warning comes with a dedicated link with advice for resolving the issue. Clippy makes a great addition to any team project or CI pipeline.

Clippy output