rewrite_expr tutorial - C2Rust Manual

c2rust refactor provides a general-purpose rewriting command, rewrite_expr, for transforming expressions. In its most basic form, rewrite_expr replaces one expression with another, everywhere in the crate:

rewrite_expr '1+1' '2'

Here, all instances of the expression 1+1 (the "pattern") are replaced with 2 (the "replacement").

rewrite_expr parses both the pattern and the replacement as Rust expressions, and compares the structure of the expression instead of its raw text when looking for occurrences of the pattern. This lets it recognize that 1 + 1 and 1 + /* comment */ 1 both match the pattern 1+1 (despite being textually distinct), while 1+11 does not (despite being textually similar).

In rewrite_expr's expression pattern, any name beginning with double underscores is a metavariable. Just as a variable in an ordinary Rust match expression will match any value (and bind it for later use), a metavariable in an expression pattern will match any Rust code. For example, the expression pattern __x + 1 will match any expression that adds 1 to something:

rewrite_expr '__x + 1' '11'

In these examples, the __x metavariable matches the expressions 1, 2 * 3, 4 + 5, and f().

When a metavariable matches against some piece of code, the code it matches is bound to the variable for later use. Specifically, rewrite_expr's replacement argument can refer back to those metavariables to substitute in the matched code:

rewrite_expr '__x + 1' '11 * __x'

In each case, the expression bound to the __x metavariable is substituted into the right-hand side of the multiplication in the replacement.

Finally, the same metavariable can appear multiple times in the pattern. In that case, the pattern matches only if each occurrence of the metavariable matches the same expression. For example:

rewrite_expr '__x + __x' '2 * __x'

Here a + a and f() + f() are both replaced, but f() + 1 is not because __x cannot match both f() and 1 at the same time.
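The binding rule for repeated metavariables can be sketched with a toy matcher. This is only an illustration under stated assumptions: the Expr type, match_expr, and the helper constructors are hypothetical names invented here, and the code is not how c2rust refactor is implemented.

```rust
use std::collections::HashMap;

// A toy expression tree (an assumption for illustration, not c2rust's AST).
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Lit(i64),
    Var(String),
    Call(String),
    Add(Box<Expr>, Box<Expr>),
}

type Bindings = HashMap<String, Expr>;

// Structurally match `pat` against `target`, recording metavariable bindings.
fn match_expr(pat: &Expr, target: &Expr, binds: &mut Bindings) -> bool {
    match (pat, target) {
        // A name beginning with double underscores is a metavariable.
        (Expr::Var(name), _) if name.starts_with("__") => {
            // Already bound: this occurrence must match the same expression.
            if let Some(prev) = binds.get(name) {
                return prev == target;
            }
            // First occurrence: bind the matched code for later use.
            binds.insert(name.clone(), target.clone());
            true
        }
        // Match both operands of an addition recursively.
        (Expr::Add(pl, pr), Expr::Add(tl, tr)) => {
            match_expr(pl, tl, binds) && match_expr(pr, tr, binds)
        }
        // Anything else must be structurally identical.
        _ => pat == target,
    }
}

// Small constructors to keep the examples readable.
fn var(name: &str) -> Expr {
    Expr::Var(name.to_string())
}

fn call(name: &str) -> Expr {
    Expr::Call(name.to_string())
}

fn add(l: Expr, r: Expr) -> Expr {
    Expr::Add(Box::new(l), Box::new(r))
}

fn matches(pat: &Expr, target: &Expr) -> bool {
    match_expr(pat, target, &mut Bindings::new())
}

fn main() {
    let pat = add(var("__x"), var("__x")); // __x + __x
    assert!(matches(&pat, &add(var("a"), var("a")))); // a + a
    assert!(matches(&pat, &add(call("f"), call("f")))); // f() + f()
    assert!(!matches(&pat, &add(call("f"), Expr::Lit(1)))); // f() + 1
}
```

The second occurrence of __x is compared against the expression already bound by the first, which is why f() + 1 is rejected in this sketch.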

Suppose we wish to add an argument to an existing function. All current callers of the function should pass a default value of 0 for this new argument. We can update the existing calls like this:

rewrite_expr 'my_func(__x, __y)' 'my_func(__x, __y, 0)'

Every call to my_func now passes a third argument, and we can update the definition of my_func to match.
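As a sketch of that final manual step, the definition might be updated as follows. The parameter name z, the return type, and the body are assumptions made here so the effect can be observed; the tutorial elides the real body.

```rust
// Hypothetical updated definition. The tutorial elides the body of my_func,
// so returning the sum is purely an assumption that makes the effect visible.
fn my_func(x: i32, y: i32, z: i32) -> i32 {
    x + y + z
}

fn main() {
    // The three call sites as they look after the rewrite.
    let x = 123;
    let a = my_func(1, 2, 0);
    let b = my_func(x, x, 0);
    let c = my_func(
        0,
        {
            let y = x;
            y + y
        },
        0,
    );
    println!("{} {} {}", a, b, c);
}
```

Because every rewritten call passes 0 for the new argument, callers should behave as before as long as 0 is treated as the default inside the function.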

rewrite_expr supports several special matching forms that can appear in patterns to add extra restrictions to matching.

A pattern such as def!(::foo::f) matches any ident or path expression that resolves to the function whose absolute path is ::foo::f. For example, to replace all expressions referencing the function foo::f with ones referencing foo::g:

rewrite_expr 'def!(::foo::f)' '::foo::g'

This works for all direct references to f, whether by relative path (foo::f), absolute path (::foo::f), or imported identifier (just f, with use foo::f in scope). It can even handle imports under a different name (f2 with use foo::f as f2 in scope), since it checks only the path of the referenced definition, not the syntax used to reference it.
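To make the four spellings concrete, here is a small self-contained crate in which each function reaches the same foo::f. The function names and the return value are made up for illustration. The tutorial writes the absolute path as ::foo::f; this sketch assumes a current Rust edition, where the crate-root form is spelled crate::foo::f.

```rust
mod foo {
    pub fn f() -> i32 {
        42
    }
}

// Relative path.
fn via_relative_path() -> i32 {
    foo::f()
}

// Absolute path (the tutorial's `::foo::f`, spelled `crate::foo::f` here).
fn via_absolute_path() -> i32 {
    crate::foo::f()
}

// Imported identifier.
fn via_import() -> i32 {
    use foo::f;
    f()
}

// Import under a different name.
fn via_renamed_import() -> i32 {
    use foo::f as f2;
    f2()
}

fn main() {
    // All four spellings resolve to the same definition.
    assert_eq!(via_relative_path(), 42);
    assert_eq!(via_absolute_path(), 42);
    assert_eq!(via_import(), 42);
    assert_eq!(via_renamed_import(), 42);
}
```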

When rewrite_expr attempts to match def!(path) against some expression e, it actually completely ignores the content of e itself. Instead, it performs these steps:

1. Check rustc's name resolution results to find the definition d that e resolves to. (If e doesn't resolve to a definition, then the matching fails.)
2. Construct an absolute path dpath referring to d. For definitions in the current crate, this path looks like ::mod1::def1. For definitions in other crates, it looks like ::crate1::mod1::def1.
3. Match dpath against the path pattern provided as the argument of def!. Then e matches def!(path) if dpath matches path, and fails to match otherwise.

Matching with def! can sometimes fail in surprising ways, since the user-provided path is matched against a generated path that may not appear explicitly anywhere in the source code. For example, this attempt to match HashMap::new does not succeed:

rewrite_expr
    'def!(::std::collections::hash_map::HashMap::new)()'
    '::std::collections::hash_map::HashMap::with_capacity(10)'

The debug_match_expr command exists to diagnose such problems. It takes only a pattern, and prints information about attempts to match it at various points in the crate:

debug_match_expr 'def!(::std::collections::hash_map::HashMap::new)()'

Here, its output includes this line:

def!(): trying to match pattern path(::std::collections::hash_map::HashMap::new) against AST path(::std::collections::HashMap::new)

This reveals the problem: the absolute path def! generates for HashMap::new uses the reexport at std::collections::HashMap, not the canonical definition at std::collections::hash_map::HashMap. Updating the previous rewrite_expr command allows it to succeed:

rewrite_expr
    'def!(::std::collections::HashMap::new)()'
    '::std::collections::HashMap::with_capacity(10)'
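The two paths involved name the same type, which is why either one compiles in ordinary code even though only one of them matches in def!. The following sketch checks this with TypeId; the aliases Reexported and Canonical and the function same_type are names chosen here for illustration.

```rust
use std::any::TypeId;
use std::collections::hash_map::HashMap as Canonical;
use std::collections::HashMap as Reexported;

// True if both paths refer to one and the same type.
fn same_type() -> bool {
    TypeId::of::<Reexported<i32, i32>>() == TypeId::of::<Canonical<i32, i32>>()
}

fn main() {
    assert!(same_type());

    // The replacement expression from the command above type-checks against
    // a variable declared through the other path.
    let m: Canonical<i32, i32> = ::std::collections::HashMap::with_capacity(10);
    assert!(m.capacity() >= 10);
}
```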

The argument to def! is a path pattern, which can contain metavariables just like the overall expression pattern. For instance, we can rewrite all calls to functions from the foo module:

rewrite_expr 'def!(::foo::__name)()' '123'

Since every definition in the foo module has an absolute path of the form ::foo::(something), they all match the expression pattern def!(::foo::__name).

Like any other metavariable, the ones in a def! path pattern can be used in the replacement expression to substitute in the captured name. For example, we can replace all references to items in the foo module with references to the same-named items in the bar module:

rewrite_expr 'def!(::foo::__name)' '::bar::__name'

Note, however, that each metavariable in a path pattern can match only a single ident. This means foo::__name will not match the path to an item in a submodule, such as foo::one::two. Handling these would require an additional rewrite step, such as rewrite_expr 'def!(::foo::__name1::__name2)' '::bar::__name1::__name2'.
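The one-ident-per-metavariable rule can be illustrated with a toy path matcher. This is a sketch under assumptions: match_path is a hypothetical helper that works on plain strings, it ignores repeated metavariables, and it does not reflect how c2rust refactor represents paths internally.

```rust
use std::collections::HashMap;

// Match a path pattern such as "::foo::__name" against an absolute path,
// segment by segment. Each metavariable binds exactly one ident.
fn match_path(pattern: &str, path: &str) -> Option<HashMap<String, String>> {
    let pat_segs: Vec<&str> = pattern.split("::").collect();
    let path_segs: Vec<&str> = path.split("::").collect();

    // One metavariable covers one segment, so the lengths must agree.
    if pat_segs.len() != path_segs.len() {
        return None;
    }

    let mut binds = HashMap::new();
    for (p, s) in pat_segs.iter().zip(&path_segs) {
        if p.starts_with("__") {
            // Metavariable: capture the ident for use in the replacement.
            binds.insert(p.to_string(), s.to_string());
        } else if p != s {
            // Literal segment: must be identical.
            return None;
        }
    }
    Some(binds)
}

fn main() {
    let binds = match_path("::foo::__name", "::foo::f").expect("should match");
    assert_eq!(binds.get("__name").map(String::as_str), Some("f"));

    // A single metavariable cannot span the two idents `one::two`.
    assert!(match_path("::foo::__name", "::foo::one::two").is_none());

    // Two metavariables handle the submodule case.
    assert!(match_path("::foo::__name1::__name2", "::foo::one::two").is_some());
}
```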

A pattern of the form typed!(e, ty) matches any expression that matches the pattern e, but only if the type of that expression matches the pattern ty. For example, we can perform a rewrite that only affects i32s:

rewrite_expr 'typed!(__e, i32)' '0'

Every expression matches the metavariable __e, but only the i32s (whether literals or variables of type i32) are affected by the rewrite.

Internally, typed! works much like def!. To match an expression e against typed!(e_pat, ty_pat), rewrite_expr follows these steps:

1. Consult rustc's typechecking results to get the type of e. Call that type rustc_ty.
2. rustc_ty is an internal, abstract representation of the type, which is not suitable for matching. Construct a concrete representation of rustc_ty, and call it ty.
3. Match e against e_pat and ty against ty_pat. Then e matches typed!(e_pat, ty_pat) if both matches succeed, and fails to match otherwise.

When matching fails unexpectedly, debug_match_expr is once again useful for understanding the problem. For example, this rewriting command has no effect:

rewrite_expr "typed!(__e, &'static str)" '"hello"'

Passing the same pattern to debug_match_expr produces output that includes the following:

typed!(): trying to match pattern type(&'static str) against AST type(&str)

Now the problem is clear: the concrete type representation constructed for matching omits lifetimes. Replacing &'static str with &str in the pattern causes the rewrite to succeed:

rewrite_expr 'typed!(__e, &str)' '"hello"'

The expression pattern and type pattern arguments of typed!(e, ty) are handled using the normal rewrite_expr matching engine, which means they can contain metavariables and other special matching forms. For example, metavariables can capture both parts of the expression and parts of its type for use in the replacement:

rewrite_expr
    'typed!(Vec::with_capacity(__n), ::std::vec::Vec<__ty>)'
    '::std::iter::repeat(<__ty>::default())
        .take(__n)
        .collect::<Vec<__ty>>()'

Notice that the rewritten code has the correct element type in the call to default, even in cases where the type is not written explicitly in the original expression! The matching of typed! obtains the inferred type information from rustc, and those inferred types are captured by metavariables in the type pattern.
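To see what the replacement expression does once __ty and __n are filled in, the sketch below wraps it in a generic function. The function name replacement is an assumption made here. Note that the two expressions are not equivalent at run time: Vec::with_capacity(n) yields an empty vector, while the replacement yields n default-valued elements, so this rewrite serves mainly to illustrate type capture.

```rust
// The replacement expression, with `T` standing in for the captured __ty
// and `n` for the captured __n.
fn replacement<T: Default + Clone>(n: usize) -> Vec<T> {
    ::std::iter::repeat(<T>::default())
        .take(n)
        .collect::<Vec<T>>()
}

fn main() {
    // The element type is never written at the call site of with_capacity;
    // it comes from the annotation, much like the inferred type in typed!.
    let original: Vec<u8> = Vec::with_capacity(4);
    let rewritten: Vec<u8> = replacement(4);

    assert_eq!(original.len(), 0);
    assert_eq!(rewritten, vec![0u8; 4]);
}
```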

This example demonstrates usage of def! and typed!.

Suppose we have some unsafe code that uses transmute to convert a raw pointer that may be null (*const T) into an optional reference (Option<&T>). This conversion is better expressed using the as_ref method of *const T, and we'd like to apply this transformation automatically.

Here is a basic first attempt:

rewrite_expr 'transmute(__e)' '__e.as_ref()'

This has two major shortcomings, which we will address in order:

1. It works only on code that calls exactly transmute(foo). The instances that import std::mem and call mem::transmute(foo) do not get rewritten.
2. It rewrites transmutes between any types, not just *const T to Option<&T>. Only transmutes between those types should be replaced with as_ref.

We want to rewrite calls to std::mem::transmute, regardless of how those calls are written. This is a perfect use case for def!:

rewrite_expr 'def!(::std::intrinsics::transmute)(__e)' '__e.as_ref()'

Now our rewrite catches all uses of transmute, whether they're written as transmute(foo), mem::transmute(foo), or even ::std::mem::transmute(foo).

Notice that we refer to transmute as std::intrinsics::transmute: this is the location of its original definition, which is re-exported in std::mem. See the "def!: debugging match failures" section for an explanation of how we discovered this.
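For reference, the three spellings mentioned above look like this in ordinary code. The functions bare, via_module, and fully_qualified are hypothetical names, and the u32 to [u8; 4] conversion is chosen only because it is a harmless transmute to demonstrate with.

```rust
use std::mem;

// Called as plain `transmute`, via a local import.
fn bare(x: u32) -> [u8; 4] {
    use std::mem::transmute;
    unsafe { transmute(x) }
}

// Called through the imported `mem` module.
fn via_module(x: u32) -> [u8; 4] {
    unsafe { mem::transmute(x) }
}

// Called through the full path.
fn fully_qualified(x: u32) -> [u8; 4] {
    unsafe { ::std::mem::transmute(x) }
}

fn main() {
    let x = 0x0102_0304u32;
    // All three spellings call the same function, so the results agree.
    assert_eq!(bare(x), x.to_ne_bytes());
    assert_eq!(via_module(x), x.to_ne_bytes());
    assert_eq!(fully_qualified(x), x.to_ne_bytes());
}
```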

We now have a command for rewriting all transmute calls, but we'd like it to rewrite only transmutes from *const T to Option<&T>. We can achieve this by filtering the input and output types with typed!:

rewrite_expr '
    typed!(
        def!(::std::intrinsics::transmute)(
            typed!(__e, *const __ty)
        ),
        ::std::option::Option<&__ty>
    )
' '__e.as_ref()'

Now only those transmutes that turn *const T into Option<&T> are affected by the rewrite. And because typed! has access to the results of type inference, this works even on transmute calls that are not fully annotated (transmute(foo), not just transmute::<*const T, Option<&T>>(foo)).
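The following sketch compares the code shape that the pattern matches with the shape that the replacement produces. The function names before and after are chosen here for illustration; both are unsafe because the caller must guarantee that a non-null pointer is valid for the returned lifetime.

```rust
use std::mem;
use std::ptr;

// The shape the pattern matches: transmute from *const T to Option<&T>.
unsafe fn before<'a>(p: *const u32) -> Option<&'a u32> {
    unsafe { mem::transmute::<*const u32, Option<&'a u32>>(p) }
}

// The shape the replacement produces.
unsafe fn after<'a>(p: *const u32) -> Option<&'a u32> {
    unsafe { p.as_ref() }
}

fn main() {
    let value = 7u32;
    let valid: *const u32 = &value;
    let null: *const u32 = ptr::null();

    unsafe {
        // A valid pointer becomes Some(&value) either way.
        assert_eq!(before(valid), Some(&7));
        assert_eq!(after(valid), Some(&7));
        // A null pointer becomes None either way.
        assert_eq!(before(null), None);
        assert_eq!(after(null), None);
    }
}
```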

The marked! form is simple: marked!(e, label) matches an expression only if the expression matches the pattern e and is marked with the given label. See the documentation on marks and select for more information.

Several other refactoring commands use the same pattern-matching engine as rewrite_expr:

rewrite_ty PAT REPL works like rewrite_expr, except it matches and replaces type annotations instead of expressions.
abstract SIG PAT replaces expressions matching a pattern with calls to a newly-created function.
type_fix_rules uses type patterns to find the appropriate rule to fix each type error.
select's match_expr and similar filters use syntax patterns to identify nodes to mark.

Result of rewrite_expr '1+1' '2':

 fn main() {
-    println!("{}", 1 + 1);
+    println!("{}", 2);
-    println!("{}", 1 + /*comment*/ 1);
+    println!("{}", 2);
     println!("{}", 1 + 11);
 }

Result of rewrite_expr '__x + 1' '11':

 fn f() -> i32 {
     123
 }

 fn main() {
-    println!("a = {}", 1 + 1);
+    println!("a = {}", 11);
-    println!("b = {}", 2 * 3 + 1);
+    println!("b = {}", 11);
-    println!("c = {}", 4 + 5 + 1);
+    println!("c = {}", 11);
-    println!("d = {}", f() + 1);
+    println!("d = {}", 11);
 }

Result of rewrite_expr '__x + 1' '11 * __x':

 fn f() -> i32 {
     123
 }

 fn main() {
-    println!("a = {}", 1 + 1);
+    println!("a = {}", 11 * 1);
-    println!("b = {}", 2 * 3 + 1);
+    println!("b = {}", 11 * (2 * 3));
-    println!("c = {}", 4 + 5 + 1);
+    println!("c = {}", 11 * (4 + 5));
-    println!("d = {}", f() + 1);
+    println!("d = {}", 11 * f());
 }

Result of rewrite_expr '__x + __x' '2 * __x':

 fn f() -> i32 {
     123
 }

 fn main() {
     let a = 2;
-    println!("{}", 1 + 1);
+    println!("{}", 2 * 1);
-    println!("{}", a + a);
+    println!("{}", 2 * a);
-    println!("{}", f() + f());
+    println!("{}", 2 * f());
     println!("{}", f() + 1);
 }

 fn my_func(x: i32, y: i32) {
     /* ... */
 }
 fn main() {
-    my_func(1, 2);
+    my_func(1, 2, 0);
     let x = 123;
-    my_func(x, x);
+    my_func(x, x, 0);
-    my_func(0, {
+    my_func(
+        0,
+        {
-        let y = x;
+            let y = x;
-        y + y
+            y + y
+        },
+        0,
-    });
+    );
 }

C2Rust Manual

Metavariables

Using bindings

Multiple occurrences

Example: adding a function argument

Special matching forms

`def!`

Under the hood

Debugging match failures

Metavariables

`typed!`

Under the hood

Debugging match failures

Metavariables

Example: `transmute` to `<*const T>::as_ref`

Initial attempt

Identifying `transmute` calls with `def!`

Filtering `transmute` calls by type

`marked!`

Other commands

Result of the first HashMap::new rewrite attempt, which leaves the code unchanged:

 use std::collections::hash_map::HashMap;

 fn main() {
     let m: HashMap<i32, i32> = HashMap::new();
 }

Result of the corrected HashMap::new rewrite:

 use std::collections::hash_map::HashMap;

 fn main() {
-    let m: HashMap<i32, i32> = HashMap::new();
+    let m: HashMap<i32, i32> = ::std::collections::HashMap::with_capacity(10);
 }

Result of rewrite_expr 'def!(::foo::__name)' '::bar::__name':

 mod foo {
     fn f() {
         /* ... */
     }
     fn g() {
         /* ... */
     }
 }

 mod bar {
     fn f() {
         /* ... */
     }
     fn g() {
         /* ... */
     }
 }

 fn main() {
-    foo::f();
+    ::bar::f();
-    foo::g();
+    ::bar::g();
 }

Result of the initial attempt, rewrite_expr 'transmute(__e)' '__e.as_ref()':

 use std::mem;

 unsafe fn foo(ptr: *const u32) {
     let r: &u32 = mem::transmute::<*const u32, Option<&u32>>(ptr).unwrap();

     let opt_r2: Option<&u32> = mem::transmute(ptr);
     let r2 = opt_r2.unwrap();
     let ptr2: *const u32 = mem::transmute(r2);

     {
         use std::mem::transmute;
-        let opt_r3: Option<&u32> = transmute(ptr);
+        let opt_r3: Option<&u32> = ptr.as_ref();
         let r3 = opt_r2.unwrap();
     }

     /* ... */
 }
