Note: anytime you see 🙋♂️ RFC, that’s a “Request For Comments” about a topic I didn’t understand or take the time to look into. Please feel free to add what you know!
This is a followup to the last post. Instead of using the template rust_ruby_example gem, we’ll make one from scratch. Make sure to go back over the “Using a rubygems fork” section because we’ll be using it heavily during this post, as well!
Requirements
Requirements / dependencies / utilities I used and their versions on macOS Monterey v12.1 (21C52) as of 2022-02-02:
- Bundler version 2.4.0.dev
- gem version 3.4.0.dev
- cargo 1.58.0 (f01b232bc 2022-01-19)
- rustc 1.58.1 (db9d1b20b 2022-01-20)
Generating a new gem
Let’s see what it takes to write a Rust gem from scratch. Thankfully, Bundler has a generator for making new gems, and we can look at the rust_ruby_example gem for pointers on how to get the Rust parts working.
We’re still going to be doing some string manipulation, but this time we’ll just shuffle the characters. Full disclosure: I also don’t know enough of Ruby’s C API to do much more than that. That’s an adventure for another day!
Let’s start out by asking bundler to make a new gem. We’ll call it rust_shuffle, or ruffle for short.
If this is the first time you’ve created a new gem with bundler, it may ask you a few configuration questions first. Your answers are saved to your user’s profile at ~/.bundle/config and can be changed with the bundle config subcommand. For our purposes, the only one that matters is using rspec as the test framework.
Speaking of rspec, let’s $ bundle install in our new ruffle directory so we can fetch the rspec gem:
Well! Looks like we’ll be using some good old-fashioned EDD (Error Driven Development) to get this gem ship-shape.
The problem here is that the gemspec, a metadata file, needs to be filled out before the gem is valid. We’re going to do it the easy way and simply remove all the TODOs from the gemspec. Bundler also checks that the URLs parse correctly, so we’ll be replacing all the URLs with "https://example.com":
And now we can fetch our dependencies:
Now, what happens when we run our specs?
The test “Ruffle does something useful” in the file spec/ruffle_spec.rb on line 8 fails. Harsh.
Adding a #shuffle method in Rust
First, let’s change the “does something useful” test to test something useful. Replace spec/ruffle_spec.rb with the following:
Now let’s run the test, like good EDD practitioners – red, green, deploy. Then right back to EDD, this time in prod (it’s a virtuous cycle):
You may see a warning like --pattern spec/**{,/*/**}/*_spec.rb failed. This is a result of running rspec through rake. While it’s annoying, it’s not a showstopper.
Initializing a Rust project
Now let’s add some Rust code! For the purposes of this tutorial, we’ll add it directly to the root directory of the gem, but there is a standard project structure for gem native extensions.
(🙋♂️ RFC: should a Rust extension follow the rake-compiler project structure?)
You can initialize a Rust project with:
We pass the --lib argument to cargo to tell it that we want our crate to be a library, not a binary. Whereas a binary crate results in an executable, a library crate is either compiled directly into another Rust program or, as we’ll need here, built as a shared library such as a .so or .dll file.
Now we have two new files: Cargo.toml and src/lib.rs. cargo has also modified the .gitignore file to ignore build artifacts and Cargo.lock, a lockfile for Cargo dependencies.
Adding a rake compile task
Let’s add a rake task to compile our Rust code. Rakefiles are Ruby files that define the tasks that rake can run. If we inspect ours, it looks fairly barebones:
Basically it requires some default tasks. It also defines the :spec task as the default task, or the task that’s run when you execute rake without any arguments. We’ve already used it, in fact.
Let’s add a compile task. Make sure the RUBYGEMS_PATH environment variable is set from the last post. If you don’t have it, make sure to export it:
Now we can reference that:
💡 Additional notes – *cargo_builder_gem?
We have to make sure we’re using the cargo-builder branch of rubygems. If we simply shell out to gem, we’ll end up using our default system gem.
I was hoping to use the rubygems internals to build the gem instead of relying on shelling out to the utility. I got as far as require 'rubygems/ext' in order to use the Gem::Ext::CargoBuilder class, but then realized that, because gem and bundler are just aliases to the cargo-builder branch of our rubygems repo and not a system-wide installation, I don’t have access to it within Ruby itself. It will be much easier once the PR is merged into the repository proper.
Now, there is a way using the setup.rb script in the rubygems repository, but it requires replacing your default rubygems installation. As long as this article is, it would be even longer with the caveats and restoration that would take, so I chose not to use setup.rb.
And now to test:
💡 Additional notes – "true"?
EDIT – Thanks Zach! I’ve changed it to backticks 😊
The call to check whether the ruffle gem is installed is system 'gem', 'list', '-i', '^ruffle$'. Unfortunately, the actual shell command prints true or false to stdout, so I can’t swallow it by, for example, assigning it to a variable. I’m too lazy, and this post has taken too long already, to figure this out right now.
My apologies 🙇♂️
🙋♂️ RFC: what’s a better way to do this?
At this point, however, it’s not actually building the extension. If it were, it would print the message "Building native extensions. This could take a while...".
Fixing crate compilation
There are two more things we need to change in ruffle.gemspec before it’ll work. First, we have to add "Cargo.toml" to spec.extensions:
But this still isn’t enough. If we run $ rake compile now, we get a cryptic error message:
The problem is that we aren’t telling the gemspec to package the Cargo.lock file in the final .gem file, and building the crate requires that lockfile to be present. So how do we add the lockfile to the finished gem? By adding it to the spec.files array in the gemspec. You can read more about that here.
There are two ways we can do this:
- Add the Cargo.lock and Cargo.toml files to the git history. The default logic for spec.files relies on the git history. It uses `git ls-files -z`.split("\x0"), which only reports files that have been added to git.
- Specifically add ["Cargo.lock", "Cargo.toml", "src/lib.rs"] to spec.files in the gemspec.
Since we haven’t been paying much attention to git and this is a simple enough project with no Ruby files, we’ll go with option #2. In ruffle.gemspec, find the code that assigns to spec.files:
…and replace it with this:
We need to set the crate-type to "cdylib" in order to tell the Rust compiler that the output should be a shared library that can be used from other languages. As per the Rust docs on Linkage:
> --crate-type=cdylib, #[crate_type = "cdylib"] – A dynamic system library will be produced. This is used when compiling a dynamic library to be loaded from another language. This output type will create *.so files on Linux, *.dylib files on macOS, and *.dll files on Windows.
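To make "loaded from another language" a bit more concrete, here is a minimal sketch of the kind of function a cdylib exposes. The function name ruffle_add is made up for illustration and is not part of the gem; the only assumptions are the standard #[no_mangle] attribute and the extern "C" ABI.

```rust
use std::os::raw::c_long;

// Hypothetical export, for illustration only (not part of the ruffle gem).
// With `crate-type = ["cdylib"]` under `[lib]` in Cargo.toml, this symbol
// ends up in the resulting .so / .dylib / .dll.
//
// `#[no_mangle]` keeps the symbol name exactly `ruffle_add`, and
// `extern "C"` selects the C calling convention, so a non-Rust host
// can look the function up by name and call it.
#[no_mangle]
pub extern "C" fn ruffle_add(a: c_long, b: c_long) -> c_long {
    // wrapping_add avoids a panic on overflow; unwinding across an
    // FFI boundary is something to avoid.
    a.wrapping_add(b)
}
```

The Init_ function of a Ruby extension is exported in much the same way, which is why the crate type matters so much here.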
So next, we add crate-type to Cargo.toml:
Now we can finally count on rake to build the crate:
💡 Additional details – what happens when we don’t set crate-type?
When we run $ rake compile, we get another error:
$ rake compile
# ...omitted
Compiling ruffle v0.1.0 (/Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/ruffle-0.1.0)
Finished release [optimized] target(s) in 0.47s
Dynamic library not found for Rust extension (in /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/extensions/x86_64-darwin-21/3.0.0/ruffle-0.1.0)
Make sure you set "crate-type" in Cargo.toml to "cdylib"
The error message came with instructions this time! Thanks, @ianks!
Defining the Ruffle module in Rust
While the crate is compiling now, our tests are still failing because we haven’t actually done anything with the Rust code. Since we’ll be creating the Ruby data structures on the Rust side, we’ll use a Ruby API to interface with Ruby internals. Enter rb-sys, Rust bindings for Ruby that have been automatically generated using rust-bindgen.
Add it under your dependencies in Cargo.toml:
Next, we’ll just copy from rust_ruby_example/src/lib.rs, the example gem from the last post, except we’ll substitute “ruffle” wherever it says “rust_ruby_example” and “shuffle” wherever it says “reverse.”
Now we just remove the lib/ruffle.rb file:
💡 Additional details – why delete lib/ruffle.rb?
The respond_to test won’t pass because Ruby’s require method will look in lib first and find the empty lib/ruffle.rb file before it searches available gems (you can read more about how Ruby’s require works here: https://ryanbigg.com/2017/11/how-require-loads-a-gem).
Here’s the proof:
$ rake compile spec
# ...omitted
Ruffle
has a #shuffle method (FAILED - 1)
Failures:
1) Ruffle has a #shuffle method
Failure/Error: expect(Ruffle).to respond_to(:shuffle)
expected Ruffle to respond to :shuffle
# ./spec/ruffle_spec.rb:5:in `block (2 levels) in <top (required)>'
(Note that this leaves lib/ruffle/version.rb, which could be confusing for anyone reading your code.)
Now, let’s run rake spec:
We’re green! ✅ And all with Rust code.
Implementing Ruffle#shuffle
Next, let’s add a test that describes the behavior of the method:
Technically, this could fail if the shuffled string randomly returns the original string. If this happens, you’ve won the random number generator lottery! Take a screenshot 📸
And when we run rake spec:
Because we’ve simply copied over rust_ruby_example’s pub_reverse code, it’s just reversing the string.
Let’s see if we can modify our pub_shuffle method in Rust to look like what we want. Right now we have:
Instead of let shuffled = ruby_string.chars().rev().collect::<String>() on line 4, we would want something like let shuffled = ruby_string.chars().shuffle().collect::<String>(). Unfortunately, Rust’s Iterator trait has no such method.
As per this StackOverflow answer, you can use the rand crate’s rand::seq::SliceRandom trait to provide a shuffle method on Vecs. StackOverflow user Vladimir Matveev’s answer looks like this:
Shuffling a vector is a randomized operation, so we need a random number generator (RNG), and Rust doesn’t have an RNG in its standard library. The de facto standard RNG crate in Rust is rand. Let’s add it to our dependencies:
In the original code, the chars are reversed and collected into an owned String.
However, we need to actually mutate the Vec in place instead of using fancy functional Iterator methods.
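If "mutate in place" feels abstract, here is a rough sketch of what an in-place shuffle does under the hood. To keep it dependency-free I've swapped in a tiny hand-rolled xorshift generator; that generator, the XorShift64 type, and the shuffle_in_place function are my own illustrative names, not the rand crate's API, and none of this is meant for real randomness needs.

```rust
// Tiny xorshift generator so the sketch runs without any crates.
// This is a stand-in for a real RNG and must be seeded with a
// non-zero value. Do not use it for anything security-sensitive.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

// Fisher-Yates shuffle: walk from the last index down to 1 and swap
// each element with a randomly chosen element at or before it.
// The slice is modified in place; nothing new is allocated.
fn shuffle_in_place<T>(items: &mut [T], rng: &mut XorShift64) {
    for i in (1..items.len()).rev() {
        // The modulo introduces a slight bias; fine for an illustration.
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}
```

You would call it with something like shuffle_in_place(&mut chars, &mut XorShift64(42)). Note that the function takes a mutable slice and returns nothing, which is exactly why it can't be chained like rev() in an iterator pipeline.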
In rust_ruby_example the input goes from a Ruby VALUE type:
To a CString type with this function call:
Finally to a Rust String type with the final function call:
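For readers who haven't crossed the C string boundary before, here is a standalone sketch of the last hop, from a C-style pointer to an owned Rust String, using only std::ffi. In the gem the pointer would come from the rb-sys bindings; the helper name c_string_to_rust is hypothetical and only there for illustration.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Copy a NUL-terminated C string into an owned Rust `String`.
/// Returns `None` if the bytes are not valid UTF-8.
///
/// # Safety
/// `ptr` must be non-null and point to a valid NUL-terminated string
/// that stays alive for the duration of the call.
unsafe fn c_string_to_rust(ptr: *const c_char) -> Option<String> {
    // Borrow the bytes up to the NUL terminator without copying.
    let borrowed = unsafe { CStr::from_ptr(ptr) };
    // Validate UTF-8, then copy into an owned String.
    borrowed.to_str().ok().map(str::to_owned)
}
```

Returning an Option instead of calling unwrap() is a design choice for this sketch: panicking inside code that Ruby calls into is best avoided.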
We want to shuffle the characters of the string, so we’ll declare a Vec<char> and request the characters. Calling collect() converts the iterator into the requested type, as long as that type implements the FromIterator trait for the iterator’s items:
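Here's a small standalone sketch of how the requested type steers collect(). The function names are mine, chosen just for this example; the behavior is plain standard library.

```rust
/// Collect the characters of `input` into a Vec, then rebuild a String.
fn round_trip(input: &str) -> String {
    // The type annotation selects the `FromIterator<char>` impl for Vec<char>.
    let chars: Vec<char> = input.chars().collect();
    // The turbofish selects the `FromIterator<char>` impl for String.
    chars.into_iter().collect::<String>()
}

/// The return type alone is enough for `collect()` to know what to build.
fn reversed(input: &str) -> String {
    input.chars().rev().collect()
}
```

Same iterator, same collect() call, different targets: the compiler picks the implementation based on the type you ask for.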
💡 Additional notes about chars()
Note that Rust’s chars() method does not handle Unicode grapheme clusters (thanks, Wesley!). As per Rust’s documentation on the chars method:
Remember, chars might not match your intuition about characters:
let y = "y̆";
let mut chars = y.chars();
assert_eq!(Some('y'), chars.next()); // not 'y̆'
assert_eq!(Some('\u{0306}'), chars.next());
assert_eq!(None, chars.next());
The more or less canonical way to handle this is to use the unicode-segmentation crate, as per this StackOverflow answer.
This is also a handy introduction to Unicode: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
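To see why this matters for our gem, here is a quick sketch, standard library only, of what happens when you rearrange a string one char at a time. The helper names are made up for the example.

```rust
/// Number of Unicode scalar values (`char`s) in `s`.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

/// Reverse `s` one `char` at a time, the same granularity our shuffle uses.
fn reverse_by_char(s: &str) -> String {
    s.chars().rev().collect()
}
```

For the input "y\u{0306}" (a y followed by a combining breve), scalar_count sees two chars even though a reader sees one character, and reverse_by_char moves the combining mark in front of the y so it no longer attaches to it. A char-level shuffle can split characters apart in the same way.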
Now we can bring the rand functionality into scope at the top of our file:
And then call shuffle on the chars vec:
So your function should now look like this:
Now we need to convert the Vec<char> to a Rust String and store its length as a c_long type, which in this case is just a type alias for i64:
Then we construct a new CString from the shuffled Rust String:
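As a rough standalone sketch of these two steps (not the gem's actual code), the conversion could look something like this. The function name to_c_parts is hypothetical; everything it calls is in the standard library.

```rust
use std::ffi::CString;
use std::os::raw::c_long;

/// Turn shuffled characters into the pieces a C API typically wants:
/// a NUL-terminated string and its length.
/// Returns `None` if the text contains an interior NUL byte.
fn to_c_parts(chars: Vec<char>) -> Option<(CString, c_long)> {
    // Vec<char> -> String via FromIterator.
    let shuffled: String = chars.into_iter().collect();
    // `len()` is the length in bytes, not in characters.
    let size = shuffled.len() as c_long;
    // `CString::new` appends the NUL terminator and fails on interior NULs.
    let c_string = CString::new(shuffled).ok()?;
    Some((c_string, size))
}
```

One thing worth noticing: the length is measured in bytes, so a string with multi-byte characters has a size larger than its character count.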
And finally return the Ruby String:
…and with that, the final function looks like this:
And don’t forget to check your Init_ruffle function that actually initializes the Ruby Ruffle module and defines the method:
(It’s unmodified from our earlier template code.)
Now to recompile it and run the specs:
And we’re passing!
I’m sure there will be a lot of patterns emerging on how to organize the code once people start creating their own Rust gems – this is by no means a definitive one, just the one I copied from @ianks’s rust_ruby_example gem. I’m a novice as well! But hopefully you got as much out of reading this as I did out of writing it. And with @ianks’s pull request approved and passing CI, hopefully it’ll be only a matter of time before everyone gets to play with the new functionality!
The commit messages are pretty bad because it was a spike for me, but you can find all the code here: https://github.com/briankung/ruffle