Sneak preview: Writing Ruby gem native extensions in Rust

Note: anytime you see 🙋‍♂️ RFC, that’s a “Request For Comments” about a topic I didn’t understand or take the time to look into. Please feel free to add what you know!

If this post tickles your fancy, check out the follow-up post: Writing a Rust gem from scratch

In December 2021, Ian Ker-Seymer (@ianks) submitted a pull request to enable native extensions in Rust!

I was so excited, I had to try it out, even though it hadn’t been merged yet. A lot of maintainers are showing interest and pitching in, so I have high hopes for it being merged into main. So here are my notes on writing a Rust gem extension.

Requirements

Requirements / dependencies / utilities I used and their versions on macOS Monterey v12.1 (21C52) as of 2022-01-29:

  • Bundler version 2.4.0.dev
  • gem version 3.4.0.dev
  • cargo 1.58.0 (f01b232bc 2022-01-19)
  • rustc 1.58.1 (db9d1b20b 2022-01-20)

Using a rubygems fork

Warning! The following is based on ianks’s development branch of Rubygems. The feature may have changed – or not exist at all – by the time you read this. I’ll modify this warning if the feature ends up being merged.

Find somewhere cozy to clone @ianks’s cargo-builder branch of rubygems and run the following:

$ git clone --branch cargo-builder git@github.com:ianks/rubygems.git
Cloning into 'rubygems'
remote: Enumerating objects: 224785, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 224785 (delta 3), reused 3 (delta 3), pack-reused 224780
Receiving objects: 100% (224785/224785), 191.82 MiB | 38.18 MiB/s, done.
Resolving deltas: 100% (124370/124370), done.
$ cd rubygems

Aliasing your default rubygems

We need to be able to use the gem and bundle commands from the cargo-builder branch of rubygems. As per the directions in the CONTRIBUTING.md:

To run commands like gem install from the repo:

ruby -Ilib bin/gem install

To run commands like bundle install from the repo:

ruby bundler/spec/support/bundle.rb install

But this is a hassle, so we’ll use aliases instead of typing all of this out every time. cd into your rubygems directory and alias these commands with the following:

export RUBYGEMS_PATH="$(pwd)"
alias gem="ruby -I$RUBYGEMS_PATH/lib $RUBYGEMS_PATH/bin/gem"
alias bundle="ruby $RUBYGEMS_PATH/bundler/spec/support/bundle.rb"

We’re going to use the RUBYGEMS_PATH variable later on, so keep that handy! Now if you check the version numbers of your default gems, they should be as follows:

$ gem --version
3.4.0.dev
$ bundle --version
Bundler version 2.4.0.dev

Note that these aliases won’t be present in a new terminal shell!

Compiling an example gem

We’re ready to test the functionality of a Rust-based gem. For starters, let’s use the rust_ruby_example gem that I’ve extracted from @ianks’s pull request:

$ git clone https://github.com/briankung/rust_ruby_example
$ cd rust_ruby_example

Let’s confirm that it does, indeed, allow us to run Rust code from Ruby.

First, we need to build the gem. We do this by pointing the gem command at a .gemspec file. Luckily, the repo has one of those:

$ gem build rust_ruby_example.gemspec --output rust_ruby_example.gem
WARNING: licenses is empty, but is recommended. Use a license identifier from
http://spdx.org/licenses or 'Nonstandard' for a nonstandard license.
WARNING: no homepage specified
WARNING: See https://guides.rubygems.org/specification-reference/ for help
Successfully built RubyGem
Name: rust_ruby_example
Version: 0.1.0
File: rust_ruby_example.gem

We also explicitly name the output file, otherwise we get something like rust_ruby_example-0.1.0.gem, which is just a tad bit more awkward.

And we’re done!

…well, not exactly. As it turns out, extensions aren’t compiled until you install the gem. It makes sense that building the gem and installing the gem are two separate steps. So next we need to install it:

$ gem install rust_ruby_example.gem
Building native extensions. This could take a while...
Successfully installed rust_ruby_example-0.1.0
1 gem installed

Firing up cargo took a minute or so on my machine.

Potential errors

Speaking of cargo, if you don’t have it installed, you may see a message that looks like this:

$ gem install rust_ruby_example.gem
Building native extensions. This could take a while...
ERROR: Error installing rust_ruby_example.gem:
ERROR: Failed to build gem native extension.
current directory: /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/rust_ruby_example-0.1.0
cargo rustc --target-dir /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/extensions/x86_64-darwin-21/3.0.0/rust_ruby_example-0.1.0 --manifest-path /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/rust_ruby_example-0.1.0/Cargo.toml --lib --release --locked -- -C linker\=clang -C link-arg\=-fdeclspec -L native\=/Users/brian/.rbenv/versions/3.0.0/lib -L native\=/Users/brian/.rbenv/versions/3.0.0/lib -L native\=/usr/local/opt/icu4c/lib -C link_arg\=-Wl,-undefined,dynamic_lookup -C link_arg\=-Wl,-multiply_defined,suppress -C debuginfo\=1
cargo failedNo such file or directory - cargo
Gem files will remain installed in /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/rust_ruby_example-0.1.0 for inspection.
Results logged to /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/extensions/x86_64-darwin-21/3.0.0/rust_ruby_example-0.1.0/gem_make.out

This means that you don’t have cargo installed, or rubygems couldn’t find cargo in your $PATH. Make sure to install Rust and come back when you’re done!

You may also see an error like this:

$ gem install rust_ruby_example.gem
Building native extensions. This could take a while...
ERROR: Error installing rust_ruby_example.gem:
ERROR: Failed to build gem native extension.
No builder for extension 'Cargo.toml'
Gem files will remain installed in /Library/Ruby/Gems/2.6.0/gems/rust_ruby_example-0.1.0 for inspection.
Results logged to /Library/Ruby/Gems/2.6.0/extensions/universal-darwin-21/2.6.0/rust_ruby_example-0.1.0/gem_make.out

The key here is the message "No builder for extension 'Cargo.toml'". If you see it, double-check that bundle --version and gem --version match the versions above: your current version of the gem utility is missing @ianks’s CargoBuilder addition.

Inspecting rust_ruby_example code

rust_ruby_example includes some sample code in src/lib.rs:

#[no_mangle]
unsafe extern "C" fn pub_reverse(_klass: VALUE, mut input: VALUE) -> VALUE {
    let ruby_string = cstr_to_string(rb_string_value_cstr(&mut input));
    let reversed = ruby_string.to_string().chars().rev().collect::<String>();
    let reversed_cstring = CString::new(reversed).unwrap();
    let size = ruby_string.len() as c_long;
    rb_utf8_str_new(reversed_cstring.as_ptr(), size)
}

If you’ve never seen Ruby internal code before, a few of these methods look like exactly what you’d call in C code, courtesy of a library called rb-sys. The key here is in the name of the method – pub_reverse reverses strings. Here’s where the reversal actually happens:

let reversed = ruby_string.to_string().chars().rev().collect::<String>();
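To see what that one line does on its own, here is a minimal sketch in plain Rust with no Ruby involved. The helper name reverse_chars is my own and is not part of rust_ruby_example. It also hints at why pub_reverse can get away with reusing the original string's length: reversing by Unicode scalar value reorders the characters without changing the total byte count.

```rust
/// Reverses a string by Unicode scalar value, like the line in pub_reverse.
/// (Hypothetical helper for illustration; not part of the example gem.)
fn reverse_chars(input: &str) -> String {
    input.chars().rev().collect::<String>()
}

fn main() {
    let word = "rust_ruby_example";
    println!("{}", reverse_chars(word));

    // "é" written as one precomposed scalar (U+00E9), two bytes in UTF-8.
    let accented = "caf\u{e9}";
    let reversed = reverse_chars(accented);

    // The scalars change order, but the byte count stays the same.
    println!(
        "{} -> {} ({} bytes -> {} bytes)",
        accented,
        reversed,
        accented.len(),
        reversed.len()
    );
}
```

One caveat: because this works on scalar values rather than grapheme clusters, text that uses combining marks can come out looking odd after the reversal.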

There’s also an initialization function, Init_rust_ruby_example, to actually define the Ruby modules and methods. Let’s piece together what it’s doing. Here are the relevant lines for declaring a Ruby module:

let name = CString::new("RustRubyExample").unwrap();
// …
let klass = unsafe { rb_define_module(name.as_ptr()) };

…and the rest is all adding the reverse method to the module:

// Name of the function
let function_name = CString::new("reverse").unwrap();
// transmute the function for unknown reasons
let callback = unsafe {
    std::mem::transmute::<
        unsafe extern "C" fn(VALUE, VALUE) -> VALUE,
        unsafe extern "C" fn() -> VALUE,
    >(pub_reverse)
};
// …Bind the transmuted function as a module function on the RustRubyExample module
unsafe { rb_define_module_function(klass, function_name.as_ptr(), Some(callback), 1) }

Note that everything has to be translated into data structures that Matz’s Ruby (MRI) understands. That includes the module, the module function, and even the string name for the function.
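One detail about those string names that I think is worth spelling out: each CString is bound to a variable before as_ptr() is called on it. Here is a small sketch of why that matters. The fake_define function is a made-up stand-in for a C function that takes a NUL-terminated name; it is not an rb-sys API.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function that receives a NUL-terminated
/// name. It is NOT part of rb-sys; it only copies the name for inspection.
fn fake_define(names: &mut Vec<String>, name: *const c_char) {
    // Safety: every caller in this sketch passes a pointer to a live CString.
    let copied = unsafe { CStr::from_ptr(name) }
        .to_string_lossy()
        .into_owned();
    names.push(copied);
}

fn register_all() -> Vec<String> {
    let mut names = Vec::new();

    // Bind each CString to a variable so it outlives the raw pointer.
    let module_name = CString::new("RustRubyExample").unwrap();
    let function_name = CString::new("reverse").unwrap();

    fake_define(&mut names, module_name.as_ptr());
    fake_define(&mut names, function_name.as_ptr());

    // By contrast, storing `CString::new("reverse").unwrap().as_ptr()` in a
    // variable would leave a dangling pointer: the temporary CString is
    // dropped at the end of that statement.
    names
}

fn main() {
    println!("{:?}", register_all());
}
```

CString::new also returns an error if the input contains an interior NUL byte, which is what the unwrap() calls in the gem code are glossing over.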

(🙋‍♂️ RFC: what is the purpose of std::mem::transmute here?)

From a comment I wrote on the follow up post:

After staring at the C header file where rb_define_module_function is defined – I don’t know C 😰 – I think it’s necessary because Rust won’t let you pass a function pointer with arbitrary arity, but the C code just assumes that you can. Note that the last argument in rb_define_module_function is an arity indicator. So the transmutation is just ceremony to get a function pointer – any function pointer – past Rust’s type system. That’s my guess, anyway.

EDIT – Seems right! https://twitter.com/_ianks/status/1489419634168184834
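Here is a rough sketch of that idea in plain Rust, with no Ruby involved. Everything in it is an assumption made for illustration: Value stands in for Ruby's VALUE, and Method, erase, and call are invented names that imitate storing a type-erased function pointer next to an arity and restoring the real signature before calling it.

```rust
// Stand-in for Ruby's VALUE; an assumption for illustration only.
type Value = usize;

// Type-erased callback: "some function pointer that returns a Value".
type AnyFn = unsafe extern "C" fn() -> Value;

// A callback with a receiver plus one argument, shaped like pub_reverse.
unsafe extern "C" fn add_one(_receiver: Value, input: Value) -> Value {
    input + 1
}

/// Hypothetical registry entry: the erased pointer plus its arity.
struct Method {
    callback: AnyFn,
    arity: i32,
}

/// Erases the argument list so callbacks of any arity fit the same field.
fn erase(f: unsafe extern "C" fn(Value, Value) -> Value) -> AnyFn {
    unsafe { std::mem::transmute::<unsafe extern "C" fn(Value, Value) -> Value, AnyFn>(f) }
}

/// Uses the stored arity to recover the real signature before calling.
fn call(method: &Method, receiver: Value, args: &[Value]) -> Value {
    assert_eq!(args.len() as i32, method.arity);
    match method.arity {
        1 => {
            let f = unsafe {
                std::mem::transmute::<AnyFn, unsafe extern "C" fn(Value, Value) -> Value>(
                    method.callback,
                )
            };
            // The receiver always goes first, even if the callee ignores it.
            unsafe { f(receiver, args[0]) }
        }
        _ => panic!("arity not supported in this sketch"),
    }
}

fn main() {
    let method = Method { callback: erase(add_one), arity: 1 };
    println!("{}", call(&method, 0, &[41]));
}
```

The transmute itself does no work at runtime. It only changes what the type system believes about the pointer, which is why the arity has to travel alongside it.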


Trying out rust_ruby_example

Alright, if you’ve seen this message:

$ gem install rust_ruby_example.gem
Building native extensions. This could take a while...
Successfully installed rust_ruby_example-0.1.0
1 gem installed

…you’re ready to go! Fire up IRB and require rust_ruby_example to take it for a test drive:

$ irb
irb(main):001:0> require 'rust_ruby_example'
=> true
irb(main):002:0> RustRubyExample.reverse("rust_ruby_example")
=> "elpmaxe_ybur_tsur"

It reverses the string, as promised. It works!

…or does it? Let’s see if it’s really doing our bidding by modifying the code.

Adding a lowercase method

Let’s add a RustRubyExample.lowercase method. It will be exactly the same as RustRubyExample.reverse, except it converts case-convertible text to lower case.

It should work like this:

$ irb
irb(main):001:0> require 'rust_ruby_example'
=> true
irb(main):002:0> RustRubyExample.lowercase("RustRubyExample")
=> "rustrubyexample"

And we can confirm that it currently does not work:

$ irb
irb(main):001:0> require 'rust_ruby_example'
=> true
irb(main):002:0> RustRubyExample.lowercase("RustRubyExample")
Traceback (most recent call last):
4: from /Users/brian/.rbenv/versions/3.0.0/bin/irb:23:in `<main>'
3: from /Users/brian/.rbenv/versions/3.0.0/bin/irb:23:in `load'
2: from /Users/brian/.rbenv/versions/3.0.0/lib/ruby/gems/3.0.0/gems/irb-1.3.0/exe/irb:11:in `<top (required)>'
1: from (irb):2:in `<main>'
NoMethodError (undefined method `lowercase' for RustRubyExample:Module)

So let’s add it. Once again we need #[no_mangle] to tell the compiler not to alter the name of the function once it’s been compiled. Mangling essentially namespaces function names so there are no name collisions in the final binary. In our case, however, we want to be able to refer to the function by the name we give it in C Ruby, so we don’t want that name to be mangled.

Add this block of code between the pub_reverse and Init_rust_ruby_example functions in src/lib.rs:

// in src/lib.rs
#[no_mangle]

We’re also going to copy the function signature:

// in src/lib.rs
#[no_mangle]
unsafe extern "C" fn pub_lowercase(_klass: VALUE, mut input: VALUE) -> VALUE {
// …
}

(🙋‍♂️ RFC: why does this need the _klass argument?)

After figuring out the purpose of the call to std::mem::transmute, the _klass argument isn’t too confusing. It has a leading underscore because it’s unused, but the type of functions with that arity on the C side requires a receiver object for the method, even if it goes unused.


Next we take the Ruby VALUE input and cast it to a Rust string, then lowercase it using standard Rust String methods:

// in src/lib.rs
#[no_mangle]
unsafe extern "C" fn pub_lowercase(_klass: VALUE, mut input: VALUE) -> VALUE {
    let ruby_string = cstr_to_string(rb_string_value_cstr(&mut input));
    let lowercased = ruby_string.to_lowercase();
    // …
}

…and the rest is all glue code to convert it to a C string and then to a Ruby string:

// in src/lib.rs
#[no_mangle]
unsafe extern "C" fn pub_lowercase(_klass: VALUE, mut input: VALUE) -> VALUE {
    let ruby_string = cstr_to_string(rb_string_value_cstr(&mut input));
    let lowercased = ruby_string.to_lowercase();
    // Measure the lowercased string: lowercasing can change the byte length
    let size = lowercased.len() as c_long;
    let lowercased_cstring = CString::new(lowercased).unwrap();
    rb_utf8_str_new(lowercased_cstring.as_ptr(), size)
}
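The Rust-to-C half of that glue can be tried out without Ruby at all. The sketch below uses only the standard library; lowercase_c_string is a name I made up for illustration. It returns the byte length next to the C string because the size handed to a function like rb_utf8_str_new should describe the string that is actually passed, and lowercasing, unlike reversing, does not always preserve the byte length.

```rust
use std::ffi::{CStr, CString};

/// Lowercases a NUL-terminated string and returns the new C string together
/// with the byte length a C API would need to be told about.
/// (Hypothetical helper for illustration; not part of the example gem.)
fn lowercase_c_string(input: &CStr) -> (CString, usize) {
    let rust_string = input.to_string_lossy().into_owned();
    let lowercased = rust_string.to_lowercase();
    // Take the length before `lowercased` is moved into the CString.
    let size = lowercased.len();
    let c_string = CString::new(lowercased).expect("no interior NUL bytes");
    (c_string, size)
}

fn main() {
    let input = CString::new("RustRubyExample").unwrap();
    let (output, size) = lowercase_c_string(&input);
    println!("{:?} is {} bytes", output, size);

    // U+0130 (capital I with dot above) lowercases to "i" plus a combining
    // dot, so the lowercased form is longer in bytes than the input.
    let dotted = CString::new("\u{130}").unwrap();
    let (_, lowered_size) = lowercase_c_string(&dotted);
    println!("{} bytes in, {} bytes out", dotted.as_bytes().len(), lowered_size);
}
```

For plain ASCII input like "RustRubyExample" the two lengths are identical, so the difference only shows up with certain non-ASCII characters.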

We also need to add the method to the Ruby module. We can do that by duplicating the relevant code in Init_rust_ruby_example:

// in src/lib.rs
#[allow(non_snake_case)]
#[no_mangle]
pub extern "C" fn Init_rust_ruby_example() {
    let name = CString::new("RustRubyExample").unwrap();
    // …Code for defining "RustRubyExample.reverse" omitted
    let function_name = CString::new("lowercase").unwrap();
    let callback = unsafe {
        std::mem::transmute::<
            unsafe extern "C" fn(VALUE, VALUE) -> VALUE,
            unsafe extern "C" fn() -> VALUE,
        >(pub_lowercase)
    };
    let klass = unsafe { rb_define_module(name.as_ptr()) };
    unsafe { rb_define_module_function(klass, function_name.as_ptr(), Some(callback), 1) }
}

Now to build and reinstall it:

$ gem build rust_ruby_example.gemspec --output rust_ruby_example.gem
# Warnings omitted
Successfully built RubyGem
Name: rust_ruby_example
Version: 0.1.0
File: rust_ruby_example.gem
$ gem install rust_ruby_example.gem
Building native extensions. This could take a while...
Successfully installed rust_ruby_example-0.1.0
1 gem installed

Finally, let’s test our new functionality:

$ irb
irb(main):001:0> require 'rust_ruby_example'
=> true
irb(main):002:0> RustRubyExample.lowercase("RustRubyExample")
=> "rustrubyexample"

Awesome!

Conclusion

This is all possible due to Ian’s long, hard slog to get this into rubygems proper: https://github.com/rubygems/rubygems/pull/5175. This could be a hugely impactful addition to rubygems that’s been stewing since 2019 and it’s mostly been Ian’s efforts to get it there. Thanks, Ian!

If you got this far, check out the follow-up post: Writing a Rust gem from scratch
