What is C2Rust?

C2Rust helps you migrate C99-compliant code to Rust. It provides:

  • a C to Rust translator
  • a Rust code refactoring tool
  • tools to cross-check execution of the C code against the new Rust code

The translator (or transpiler) produces unsafe Rust code that closely mirrors the input C code. The primary goal of the translator is to produce code that is functionally identical to the input C code. Generating safe or idiomatic Rust is not a goal for the translator. Rather, we think the best approach is to gradually rewrite the translated Rust code using dedicated refactoring tools. To this end, we are building a refactoring tool that rewrites unsafe auto-translated Rust into safer idioms.
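To make the distinction concrete, here is a minimal hand-written sketch of what "unsafe Rust that mirrors the C" can look like next to a refactored safe version. It is not actual translator output; the function names sum and sum_safe and the C source in the comment are assumptions made for illustration.

```rust
use std::os::raw::c_int;

// Illustrative, hand-written code in the style of a direct translation of:
//   int sum(const int *xs, int n) {
//       int total = 0;
//       for (int i = 0; i < n; i++) total += xs[i];
//       return total;
//   }
// It keeps the raw pointer, the C integer types, and the C calling convention.
pub unsafe extern "C" fn sum(xs: *const c_int, n: c_int) -> c_int {
    let mut total: c_int = 0;
    let mut i: c_int = 0;
    while i < n {
        // The caller must guarantee that `xs` points to at least `n` elements.
        total += *xs.offset(i as isize);
        i += 1;
    }
    total
}

// A possible refactored form: the pointer/length pair becomes a slice,
// so no `unsafe` is needed.
pub fn sum_safe(xs: &[i32]) -> i32 {
    xs.iter().sum()
}
```

Both functions compute the same result for valid input; only the second lets the compiler check the bounds and the lifetime of the data.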

Some refactoring will have to be done by hand, which may introduce errors. We provide plugins for clang and rustc so you can compile and run two binaries and check that they behave identically (at the level of function calls). For details on cross-checking see the cross-checks directory and the cross checking tutorial.
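The idea of comparing behavior at the level of function calls can be sketched with a toy, in-process example. This is not how the clang and rustc plugins work internally; CallEvent, square, and the two sum_squares variants are hypothetical names, and a real cross-check compares two separate binaries.

```rust
// One recorded event per function entry: the function's name and its argument.
#[derive(Debug, Clone, PartialEq)]
pub struct CallEvent {
    pub function: &'static str,
    pub arg: i64,
}

// Shared helper. Every call is recorded in the trace.
fn square(x: i64, trace: &mut Vec<CallEvent>) -> i64 {
    trace.push(CallEvent { function: "square", arg: x });
    x * x
}

// Variant A: index-based loop, in the style of translated C.
pub fn sum_squares_a(xs: &[i64], trace: &mut Vec<CallEvent>) -> i64 {
    trace.push(CallEvent { function: "sum_squares", arg: xs.len() as i64 });
    let mut total = 0;
    let mut i = 0;
    while i < xs.len() {
        total += square(xs[i], trace);
        i += 1;
    }
    total
}

// Variant B: the same function after a manual refactoring to iterators.
pub fn sum_squares_b(xs: &[i64], trace: &mut Vec<CallEvent>) -> i64 {
    trace.push(CallEvent { function: "sum_squares", arg: xs.len() as i64 });
    xs.iter().map(|&x| square(x, trace)).sum()
}
```

If a refactoring changed the order or the arguments of the calls, the two traces would differ at the first divergent event, which points at the function where behavior changed.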

Here's the big picture:

[Figure: C2Rust overview]

To learn more, check out our RustConf'18 talk on YouTube and try the C2Rust translator online at www.c2rust.com.

Installation

Prerequisites

C2Rust requires LLVM 6 or 7 and its corresponding libraries and clang compiler. Python 3.4 or later, CMake 3.4.3 or later, and openssl (1.0) are also required. These prerequisites may be installed with the following commands, depending on your platform:

  • Ubuntu 16.04, 18.04 & 18.10:

      apt install build-essential llvm-6.0 clang-6.0 libclang-6.0-dev cmake libssl-dev pkg-config
    
  • Arch Linux:

      pacman -S base-devel llvm clang cmake openssl
    
  • OS X: Xcode command-line tools and a recent LLVM (we recommend the Homebrew version) are required.

      xcode-select --install
      brew install llvm python3 cmake openssl
    

Finally, a Rust installation with Rustup is required on all platforms. You will also need to install rustfmt:

    rustup component add rustfmt-preview

Building C2Rust

cargo build --release

This builds the c2rust tool in the target/release/ directory.

On OS X with Homebrew LLVM, you need to point the build system at the LLVM installation as follows:

LLVM_CONFIG_PATH=/usr/local/opt/llvm/bin/llvm-config cargo build

If you have trouble with cargo build, the developer docs provide more details on the build system.

Translating C to Rust

To translate C files specified in compile_commands.json (see below), run the c2rust tool with the transpile subcommand:

c2rust transpile compile_commands.json

(The c2rust refactor tool is also available for refactoring Rust code; see refactoring.)

The translator requires the exact compiler commands used to build the C code. To provide this information, you will need a standard compile_commands.json file. Many build systems can automatically generate this file, as it is used by many other tools, but see below for recommendations on how to generate this file for common build processes.

Once you have a compile_commands.json file describing the C build, translate the C code to Rust with the following command:

c2rust transpile path/to/compile_commands.json

To generate a Cargo.toml template for a Rust library, add the -e option:

c2rust transpile --emit-build-files path/to/compile_commands.json

To generate a Cargo.toml template for a Rust binary, do this:

c2rust transpile --main myprog path/to/compile_commands.json

Here, --main myprog tells the transpiler to use the main function from myprog.rs as the entry point.

The translated Rust files will not depend directly on each other like normal Rust modules. They will export and import functions through the C API. These modules can be compiled together into a single static Rust library or binary.

There are several known limitations in this translator. The translator will emit a warning and attempt to skip function definitions that cannot be translated.

Generating compile_commands.json files

The compile_commands.json file can be automatically created using either cmake, intercept-build, or bear.

It may be a good idea to remove optimization flags (-OX) from the compile commands file, as there are optimization builtins that we do not support translating.
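As a rough sketch of that cleanup step, the helper below drops optimization flags from a single compile command string. The name strip_opt_flags is hypothetical and not part of c2rust, and splitting on whitespace is a simplifying assumption that does not handle quoted arguments.

```rust
// Hypothetical helper: remove optimization flags such as -O2, -Os, or -Ofast
// from one compile command. The match is case-sensitive, so the output flag
// `-o` is kept.
pub fn strip_opt_flags(command: &str) -> String {
    command
        .split_whitespace()
        .filter(|arg| !arg.starts_with("-O"))
        .collect::<Vec<_>>()
        .join(" ")
}
```

The same filter could be applied to each "command" entry of compile_commands.json before running the transpiler.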

... with cmake

When creating the initial build directory with cmake, specify -DCMAKE_EXPORT_COMPILE_COMMANDS=1. This only works for projects that are configured to build with cmake. It works on Linux and macOS.

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=1 ...

... with intercept-build

intercept-build (part of the scan-build tool) is recommended for non-cmake projects. intercept-build is bundled with clang under tools/scan-build-py, but a standalone version can be easily installed via pip with:

pip install scan-build

Usage:

intercept-build <build command>

You can also use intercept-build to generate a compilation database for compiling a single C file, for example:

intercept-build sh -c "cc program.c"

... with bear (Linux only)

If you have bear installed, it can be used similarly to intercept-build:

bear <build command>

C2Rust Transpiler

Basic Usage

The transpiler module is invoked using the transpile sub-command of c2rust:

c2rust transpile [args] compile_commands.json [-- extra-clang-args]

The following arguments control the basic transpiler behavior:

  • --emit-modules - Emit each translated Rust file as a module (the default is to make each file its own crate).
  • --fail-on-error - Fail instead of warning if a source file cannot be fully translated.
  • --reduce-type-annotations - Do not emit explicit type annotations when unnecessary.
  • --translate-asm - Translate C inline assembly into corresponding Rust inline assembly. The translated assembly is unlikely to work as-is due to differences between GCC and LLVM (used in Rust) inline assembly styles, but it can provide a starting point for manual translation.
  • -f <regex>, --filter <regex> - Only translate files that match the given regular expression.

Creating cargo build files

The transpiler can create skeleton cargo build files for the translated Rust sources, controlled by the following options:

  • -e, --emit-build-files - Emit cargo build files to build the translated Rust code as a library. Build files are emitted in the directory specified by --output-dir, or if not specified, the directory containing compile_commands.json. This will not overwrite existing files, so remove these build files before re-creating build files. (implies --emit-modules)
  • -m <main_module>, --main <main_module> - Emit cargo build files to build the translated Rust code as a binary. The main function must be found in the specified module (C source file) <main_module>. <main_module> should be the bare module name, not including the .rs extension. Build files are emitted in the directory specified by --output-dir, or if not specified, the directory containing compile_commands.json. This will not overwrite existing files, so remove this build file directory before re-creating build files. (implies --emit-build-files)

Cross-check instrumentation

The transpiler can instrument the transpiled Rust code for cross-checking. The following options control this instrumentation:

  • -x, --cross-checks - Add macros and build files for cross-checking.
  • --use-fakechecks - Link against the fakechecks library for cross-checking instead of using the default online checks.
  • -X <config>, --cross-check-config <config> - Use the given config file as the cross-checking config.

For Developers

The c2rust-transpile library uses the c2rust-ast-exporter library to translate C code to Rust. The ast-exporter library links against the native clang compiler front end to parse C code and exports the AST for use in the transpiler, which is then implemented purely in Rust.

Known Limitations of Translation

This document tracks things that we know the translator can't handle, as well as things it probably won't ever handle.

Unimplemented

  • variadic function definitions (blocking Rust issue)
  • preserving comments (work in progress)
  • long double and _Complex types (partially blocked by Rust language)
  • Non-x86/64 SIMD functions/types, and x86/64 SIMD functions/types that have no Rust equivalent

Unimplemented, might be implementable but very low priority

  • GNU packed structs (Rust has #[repr(packed)] compatible with #[repr(C)])
  • inline functions (Rust has #[inline])
  • restrict pointers (Rust has references)
  • inline assembly
  • macros
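For reference, the Rust feature mentioned in the packed-structs item can be sketched as follows. The struct names and the C declaration in the comment are made up for illustration; this shows the language feature, not translator output.

```rust
use std::mem::{align_of, size_of};

// Rust counterpart of a hypothetical C declaration:
//   struct header { uint8_t tag; uint32_t len; } __attribute__((packed));
#[repr(C, packed)]
#[derive(Clone, Copy)]
pub struct PackedHeader {
    pub tag: u8,
    pub len: u32,
}

// The same fields without packing, for comparison. The compiler may insert
// padding after `tag` so that `len` is properly aligned.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct PaddedHeader {
    pub tag: u8,
    pub len: u32,
}

// Size and alignment of the packed layout.
pub fn packed_layout() -> (usize, usize) {
    (size_of::<PackedHeader>(), align_of::<PackedHeader>())
}

// Fields of a packed struct are read by value; taking a reference to a
// possibly misaligned field is rejected by the compiler.
pub fn packed_len(h: &PackedHeader) -> u32 {
    h.len
}
```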

Likely won't ever support

  • longjmp/setjmp: Although there are LLVM intrinsics for these, it is unclear how they interact with Rust (especially idiomatic Rust).
  • jumps into and out of statement expressions: We support GNU C statement expressions, but we cannot handle jumping into or out of them. Both entry into and exit from the expression must happen through the usual fall-through evaluation of the expression.
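To illustrate the fall-through case that is supported, a GNU C statement expression corresponds naturally to a Rust block expression. The sketch below is hand-written under that assumption and is not taken from translator output; square_of_successor is a hypothetical name.

```rust
// C (GNU extension):
//   int y = ({ int t = x + 1; t * t; });
// In Rust, a block is an expression whose value is its final expression.
// Control enters at the top of the block and leaves at the bottom.
pub fn square_of_successor(x: i32) -> i32 {
    let y = {
        let t = x + 1;
        t * t
    };
    y
}
```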

C2Rust-Bitfields Crate

C2Rust-Bitfields enables you to write structs containing bitfields. It has three primary goals:

  • Byte compatibility with equivalent C bitfield structs
  • The ability to take references/pointers to non-bitfield fields
  • Methods to read from and write to bitfields

We currently provide a single custom derive, BitfieldStruct, as well as a dependent field attribute, bitfield.

Requirements

  • Rust 1.30+
  • Rust Stable, Beta, or Nightly
  • Little Endian Architecture

Example

Suppose you want to write a super compact date struct which only takes up three bytes. In C this would look like this:

struct date {
    unsigned char day: 5;
    unsigned char month: 4;
    unsigned short year: 15;
} __attribute__((packed));

Clang helpfully provides us with this information:

*** Dumping AST Record Layout
         0 | struct date
     0:0-4 |   unsigned char day
     0:5-8 |   unsigned char month
    1:1-15 |   unsigned short year
           | [sizeof=3, align=1]

And this is enough to build our Rust struct:

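extern crate c2rust_bitfields;
extern crate libc;

// Bring the custom derive into scope.
use c2rust_bitfields::BitfieldStruct;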

#[repr(C, align(1))]
#[derive(BitfieldStruct)]
struct Date {
    #[bitfield(name = "day", ty = "libc::c_uchar", bits = "0..=4")]
    #[bitfield(name = "month", ty = "libc::c_uchar", bits = "5..=8")]
    #[bitfield(name = "year", ty = "libc::c_ushort", bits = "9..=23")]
    day_month_year: [u8; 3]
}

fn main() {
    let mut date = Date {
        day_month_year: [0; 3]
    };

    date.set_day(18);
    date.set_month(7);
    date.set_year(2000);

    assert_eq!(date.day(), 18);
    assert_eq!(date.month(), 7);
    assert_eq!(date.year(), 2000);
}

Furthermore, C bitfield rules for overflow and signed integers are taken into account.
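To show what such generated accessors have to do, here is a dependency-free sketch that packs the same three fields by hand. It is an illustration of the bit layout only and makes no claim about how the derive is implemented; ManualDate, get_bits, and set_bits are hypothetical names. Values wider than a field are truncated, as with unsigned C bitfields.

```rust
// Read bits `lo..=hi` of `bytes`, where bit 0 is the least significant bit
// of byte 0 (little-endian bit numbering).
fn get_bits(bytes: &[u8], lo: usize, hi: usize) -> u32 {
    let mut value = 0u32;
    for (i, bit) in (lo..=hi).enumerate() {
        let is_set = (bytes[bit / 8] >> (bit % 8)) & 1;
        value |= (is_set as u32) << i;
    }
    value
}

// Write the low `hi - lo + 1` bits of `value` into bits `lo..=hi` of `bytes`.
// Higher bits of `value` are ignored.
fn set_bits(bytes: &mut [u8], lo: usize, hi: usize, value: u32) {
    for (i, bit) in (lo..=hi).enumerate() {
        let mask = 1u8 << (bit % 8);
        if (value >> i) & 1 == 1 {
            bytes[bit / 8] |= mask;
        } else {
            bytes[bit / 8] &= !mask;
        }
    }
}

// Three bytes holding day (bits 0..=4), month (5..=8), and year (9..=23).
pub struct ManualDate {
    bytes: [u8; 3],
}

impl ManualDate {
    pub fn new() -> Self { ManualDate { bytes: [0; 3] } }
    pub fn day(&self) -> u8 { get_bits(&self.bytes, 0, 4) as u8 }
    pub fn set_day(&mut self, v: u8) { set_bits(&mut self.bytes, 0, 4, v as u32) }
    pub fn month(&self) -> u8 { get_bits(&self.bytes, 5, 8) as u8 }
    pub fn set_month(&mut self, v: u8) { set_bits(&mut self.bytes, 5, 8, v as u32) }
    pub fn year(&self) -> u16 { get_bits(&self.bytes, 9, 23) as u16 }
    pub fn set_year(&mut self, v: u16) { set_bits(&mut self.bytes, 9, 23, v as u32) }
}
```

Note that the month field straddles the boundary between the first and second byte, which is why the accessors work on bit positions rather than on whole bytes.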

This crate can generate no_std compatible code when the no_std feature flag is provided.

Tests

Since Rust doesn't support a build.rs exclusively for tests, you must manually compile the C test code and link it in.

$ clang tests/bitfields.c -c -fPIC -o tests/bitfields.o
$ ar -rc tests/libtest.a tests/bitfields.o
$ RUSTFLAGS="-L `pwd`/tests" cargo test

Acknowledgements

This crate is inspired by the rust-bitfield, packed_struct, and bindgen crates.

C2Rust Refactoring Tool

This is a refactoring tool for Rust programs, aimed at removing unsafety from automatically-generated Rust code.

Usage

c2rust refactor command line usage is as follows:

c2rust refactor [flags] <command> [command args] -- <input file> [rustc flags]

Flags for c2rust refactor are described by c2rust refactor --help.

See the command documentation for a list of commands, including complete usage and descriptions. Multiple commands can be separated by an argument consisting of a single semicolon, as in c2rust refactor cmd1 arg1 \; cmd2 arg2. (Note the semicolon needs to be escaped to prevent it from being interpreted by the shell.)

c2rust refactor requires rustc command line arguments for the program to be refactored, so that it can use rustc to load and typecheck the source code. For projects built with cargo, pass the --cargo flag to c2rust refactor and it will obtain the right arguments from cargo automatically. Otherwise, you must provide the rustc arguments on the c2rust refactor command line, after a -- separator.

Marks

Some commands require the user to "mark" some AST nodes for them to operate on. For example, the rename_struct command requires that the user mark the declaration of the struct that should be renamed.

Each mark associates a "label" with a specific AST node (identified by its NodeId). Labels are used to distinguish different types of marks, and a single node can have any number of marks with distinct labels. For example, when running the func_to_method command, which turns functions into methods in an inherent impl, the user must mark the functions to move with the target label and must mark the destination impl with the dest label. Nodes marked with other labels will be ignored. The set of labels recognized by a command is described in the command's documentation; by default, most commands that use marks operate on target.

The most flexible way of marking nodes is by using the select command. See the command documentation and src/select/mod.rs for details. Note that marks are not preserved across c2rust refactor invocations, so you usually want to run select followed by the command of interest using the ; separator mentioned above.
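One way to picture marks is as a set of (node, label) pairs. The sketch below models that idea together with the mark-manipulation commands described later in this document (copy_marks, rename_marks, delete_marks, clear_marks). It is a conceptual model, not the tool's data structure; MarkSet and the u32 NodeId alias are assumptions.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for an AST node identifier.
pub type NodeId = u32;

// A set of (node, label) pairs. A node may carry several distinct labels.
#[derive(Default)]
pub struct MarkSet {
    marks: HashSet<(NodeId, String)>,
}

impl MarkSet {
    pub fn add(&mut self, node: NodeId, label: &str) {
        self.marks.insert((node, label.to_string()));
    }

    pub fn has(&self, node: NodeId, label: &str) -> bool {
        self.marks.contains(&(node, label.to_string()))
    }

    // Like `copy_marks OLD NEW`: every node bearing `old` also gets `new`.
    pub fn copy_marks(&mut self, old: &str, new: &str) {
        let copies: Vec<(NodeId, String)> = self
            .marks
            .iter()
            .filter(|m| m.1 == old)
            .map(|m| (m.0, new.to_string()))
            .collect();
        self.marks.extend(copies);
    }

    // Like `delete_marks MARK`: remove `label` wherever it appears.
    pub fn delete_marks(&mut self, label: &str) {
        self.marks.retain(|m| m.1 != label);
    }

    // Like `rename_marks OLD NEW`: replace `old` with `new` on every node.
    pub fn rename_marks(&mut self, old: &str, new: &str) {
        if old == new {
            return;
        }
        self.copy_marks(old, new);
        self.delete_marks(old);
    }

    // Like `clear_marks`: remove all marks from all nodes.
    pub fn clear_marks(&mut self) {
        self.marks.clear();
    }
}
```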

Refactoring Commands

abstract

Usage: abstract SIG PAT [BODY]

Replace all instances of pat with calls to a new function whose name and signature are given by sig. Example:

Input:

 1 + 2

After running abstract 'add(x: u32, y: u32) -> u32' 'x + y':

 add(1, 2)

 // Elsewhere:
 fn add(x: u32, y: u32) -> u32 { x + y }

All type and value parameter names in sig act as bindings when matching pat. The captured exprs and types are passed as parameters when building the new call expression. The body of the function is body, if provided, otherwise pat itself.

Non-ident patterns in sig are not supported. It is also an error for any type parameter's name to collide with any value parameter.

If matching with pat fails to capture expressions for any of the value parameters of sig, it is an error. If it fails to capture for a type parameter, the parameter is filled in with _ (infer).

autoretype

Usage: autoretype 'A: T'...

Marks: A... (specified in command)

Change the type of nodes with mark A to the new type T, propagating changes and inserting casts when possible to satisfy type checking. Multiple simultaneous retypings can be specified in this command as separate arguments. Each argument should be of the form label: type, where label is a mark label and type can be parsed as a valid Rust type.

bitcast_retype

Usage: bitcast_retype PAT REPL

Marks: may read marks depending on PAT

For every type in the crate matching PAT, change the type to REPL. PAT and REPL are types, and can use placeholders in the manner of rewrite_ty. For each definition whose type has changed, it also inserts mem::transmute calls at each use of the definition to fix discrepancies between the old and new types. (This implies that the original type and its replacement must be transmutable to each other.)
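The effect of an inserted mem::transmute can be sketched by hand. The example assumes that only the field's type annotation matched PAT (for instance because PAT was restricted with a mark), and the names Counter and read are hypothetical; it is not actual tool output.

```rust
use std::mem;

// Before (hypothetical input): the field is declared as `i32`.
//   pub struct Counter { pub value: i32 }
//   pub fn read(c: &Counter) -> i32 { c.value }

// After retyping the field from `i32` to `u32`:
pub struct Counter {
    pub value: u32,
}

pub fn read(c: &Counter) -> i32 {
    // The use site still expects the old type, so the value is reinterpreted
    // bit for bit. `u32` and `i32` have the same size, which `transmute`
    // requires.
    unsafe { mem::transmute::<u32, i32>(c.value) }
}
```

Because the conversion reinterprets bits rather than converting values, a stored 0xFFFF_FFFF reads back as -1.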

bytestr_to_str

Usage: bytestr_to_str

Marks: target

Convert bytestring literal expressions marked target to string literal expressions.

Note the mark must be placed on the expression, as it is currently difficult to mark a literal node.

canonicalize_externs

Usage: canonicalize_externs MOD_PATH

Marks: target

Replace foreign items ("externs") with references to externs in a different crate or module.

For each foreign fn or static marked target, if a foreign item with the same symbol exists in the module at MOD_PATH (which can be part of an external crate), it deletes the marked foreign item and replaces all its uses with uses of the matching foreign item in MOD_PATH. If a replacement item has a different type than the original, it also inserts the necessary casts at each use of the item.

canonicalize_structs

Usage: canonicalize_structs

Marks: target

For each type definition marked target, delete all other type definitions with the same name, and replace their uses with uses of the target type.

This only works when all the identically-named types have the same definition, such as when all are generated from #includes of the same C header.

Example:

 mod a {
     pub struct Foo { ... }  // Foo: target
 }

 mod b {
     struct Foo { ... }  // same as ::a::Foo

     unsafe fn use_foo(x: &Foo) { ... }
 }

After running canonicalize_structs:

 mod a {
     pub struct Foo { ... }
 }

 mod b {
     // 1. `struct Foo` has been deleted
     // 2. `use_foo` now references `::a::Foo` directly
     unsafe fn use_foo(x: &::a::Foo) { ... }
 }

Note that this transform does not check or adjust item visibility. If the target type is not visible throughout the crate, this may introduce compile errors.

char_literals

Obsolete - the translator now does this automatically.

Usage: char_literals

Replace integer literals cast to libc::c_char with actual char literals. For example, replaces 65 as libc::c_char with 'A' as libc::c_char.

clear_marks

Usage: clear_marks

Marks: clears all marks

Remove all marks from all nodes.

commit

Usage: commit

Write the current crate to disk (by rewriting the original source files), then read it back in, clearing all marks. This can be useful as a "checkpoint" between two sets of transformations, if applying both sets of changes at once proves to be too much for the rewriter.

This is only useful when the rewrite mode is inplace. Otherwise the "write" part of the operation won't actually change the original source files, and the "read" part will revert the crate to its original form.

convert_cast_as_ptr

convert_format_args

Usage: convert_format_args

Marks: target

For each function call, if one of its argument expressions is marked target, then parse that argument as a printf format string, with the subsequent arguments as the format args. Replace both the format string and the args with an invocation of the Rust format_args! macro.

This transformation applies casts to the remaining arguments to account for differences in argument conversion behavior between C-style and Rust-style string formatting. However, it does not attempt to convert the format_args! output into something compatible with the original C function. This results in a type error, so this pass should usually be followed up by an additional rewrite to change the function being called.

Example:

 printf("hello %d\n", 123);

If the string "hello %d\n" is marked target, then running convert_format_args will replace this call with

 printf(format_args!("hello {:}\n", 123 as i32));

At this point, it would be wise to replace the printf expression with a function that accepts the std::fmt::Arguments produced by format_args!.
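A replacement function of that kind could look like the sketch below. The name print_args and its return value (the number of bytes written) are assumptions chosen for illustration; they are not part of c2rust.

```rust
use std::fmt;
use std::io::{self, Write};

// Hypothetical stand-in for `printf` that accepts the `fmt::Arguments`
// produced by `format_args!`, writes the formatted text to stdout, and
// returns the number of bytes written.
pub fn print_args(args: fmt::Arguments) -> io::Result<usize> {
    let text = fmt::format(args);
    io::stdout().write_all(text.as_bytes())?;
    Ok(text.len())
}
```

With this in place, the rewritten call becomes print_args(format_args!("hello {:}\n", 123 as i32)), which type-checks because the callee now takes fmt::Arguments instead of a C format string.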

convert_printfs

copy_marks

Usage: copy_marks OLD_MARK NEW_MARK

Marks: reads OLD_MARK; sets NEW_MARK

For every node bearing OLD_MARK, also apply NEW_MARK.

create_item

Usage: create_item ITEMS <inside/after> [MARK]

Marks: MARK/target

Parse ITEMS as item definitions, and insert the parsed items either inside (as the first child of) or after (as a sibling of) the AST node bearing MARK (default: target). Supports adding items to both mods and blocks.

Note that other itemlikes, such as impl and trait items, are not handled by this command.

delete_items

Usage: delete_items

Marks: target

Delete all items marked target from the AST. This handles items in both mods and blocks, but doesn't handle other itemlikes.

delete_marks

Usage: delete_marks MARK

Marks: clears MARK

Remove MARK from every node where it appears.

fix_unused_unsafe

Usage: fix_unused_unsafe

Find unused unsafe blocks and turn them into ordinary blocks.

fold_let_assign

Usage: fold_let_assign

Fold together lets with no initializer or a trivial one, and subsequent assignments. For example, replace let x; x = 10; with let x = 10;.

func_to_method

Usage: func_to_method

Marks: target, dest

Turn functions marked target into static methods (no self) in the impl block marked dest. Turn functions that have an argument marked target into methods, replacing the named argument with self. Rewrite all uses of marked functions to call the new method versions.

Marked arguments of type T, &T, and &mut T (where T is the Self type of the dest impl) will be converted to self, &self, and &mut self respectively.
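The before and after shapes can be sketched side by side. This is hand-written for illustration, not tool output; the Rect type and all function names are hypothetical, and the free functions are kept alongside the methods only so that both forms compile in one file.

```rust
pub struct Rect {
    pub w: u32,
    pub h: u32,
}

// Before: free functions operating on `Rect`.
pub fn rect_new(w: u32, h: u32) -> Rect {
    Rect { w, h }
}

pub fn rect_area(r: &Rect) -> u32 {
    r.w * r.h
}

pub fn rect_scale(r: &mut Rect, k: u32) {
    r.w *= k;
    r.h *= k;
}

// After: the same functions as methods in an inherent impl.
impl Rect {
    // No marked argument: becomes a static method without `self`.
    pub fn new(w: u32, h: u32) -> Rect {
        Rect { w, h }
    }

    // Marked argument of type `&Rect` becomes `&self`.
    pub fn area(&self) -> u32 {
        self.w * self.h
    }

    // Marked argument of type `&mut Rect` becomes `&mut self`.
    pub fn scale(&mut self, k: u32) {
        self.w *= k;
        self.h *= k;
    }
}
```

Call sites change accordingly, for example from rect_area(&r) to r.area().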

generalize_items

Usage: generalize_items VAR [TY]

Marks: target

Replace marked types with generic type parameters.

Specifically: add a new type parameter called VAR to each item marked target, replacing type annotations inside that item that are marked target with uses of the type parameter. Also update all uses of target items, passing TY as the new type argument when used inside a non-target item, and passing the type variable VAR when used inside a target item.

If TY is not provided, it defaults to a copy of the first type annotation that was replaced with VAR.

Example:

 struct Foo {    // Foo: target
     x: i32,     // i32: target
     y: i32,
 }

 fn f(foo: Foo) { ... }  // f: target

 fn main() {
     f(...);
 }

After running generalize_items T:

 // 1. Foo gains a new type parameter `T`
 struct Foo<T> {
     // 2. Marked type annotations become `T`
     x: T,
     y: i32,
 }

 // 3. `f` gains a new type parameter `T`, and passes
 // it through to uses of `Foo`
 fn f<T>(foo: Foo<T>) { ... }
 struct Bar<T> {
     foo: Foo<T>,
 }

 fn main() {
     // 4. Uses outside target items use `i32`, the
     // first type that was replaced with `T`.
     f::<i32>(...);
 }

ionize

Usage: ionize

Marks: target

Convert each union marked target to a type-safe Rust enum. The generated enums will have as_variant and as_variant_mut methods for each union field, which panic if the enum is not the named variant. Also updates assignments to union variables to assign one of the new enum variants, and updates uses of union fields to call the new methods instead.
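The generated shape can be approximated by hand as follows. The union in the comment, the enum name Value, and the accessor names as_i, as_i_mut, as_f, and as_f_mut are assumptions for illustration; the exact names the tool generates may differ.

```rust
// Before (hypothetical input):
//   union Value { i: i32, f: f32 }

// After: a type-safe enum with one variant per union field.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    I(i32),
    F(f32),
}

impl Value {
    // Each accessor panics if the enum currently holds a different variant.
    pub fn as_i(&self) -> &i32 {
        match *self {
            Value::I(ref x) => x,
            _ => panic!("Value is not the `i` variant"),
        }
    }

    pub fn as_i_mut(&mut self) -> &mut i32 {
        match *self {
            Value::I(ref mut x) => x,
            _ => panic!("Value is not the `i` variant"),
        }
    }

    pub fn as_f(&self) -> &f32 {
        match *self {
            Value::F(ref x) => x,
            _ => panic!("Value is not the `f` variant"),
        }
    }

    pub fn as_f_mut(&mut self) -> &mut f32 {
        match *self {
            Value::F(ref mut x) => x,
            _ => panic!("Value is not the `f` variant"),
        }
    }
}
```

An assignment such as v.i = 5 on the union then corresponds to v = Value::I(5), and a read of v.i corresponds to *v.as_i().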

let_x_uninitialized

Obsolete - the translator now does this automatically.

Usage: let_x_uninitialized

For each local variable that is uninitialized (let x;), add mem::uninitialized() as an initializer expression.

link_funcs

Usage: link_funcs

Link up function declarations and definitions with matching symbols across modules. For every foreign fn whose symbol matches a fn definition elsewhere in the program, it replaces all uses of the foreign fn with a direct call of the fn definition, and deletes the foreign fn.

Example:

 mod a {
     #[no_mangle]
     unsafe extern "C" fn foo() { ... }
 }

 mod b {
     extern "C" {
         // This resolves to `a::foo` during linking.
         fn foo();
     }

     unsafe fn use_foo() {
         foo();
     }
 }

After running link_funcs:

 mod a {
     #[no_mangle]
     unsafe extern "C" fn foo() { ... }
 }

 mod b {
     // 1. Foreign fn `foo` has been deleted
     unsafe fn use_foo() {
         // 2. `use_foo` now calls `foo` directly
         ::a::foo();
     }
 }

link_incomplete_types

Usage: link_incomplete_types

Link up type declarations and definitions with matching names across modules. For every foreign type whose name matches a type definition elsewhere in the program, it replaces all uses of the foreign type with the type definition, and deletes the foreign type.

Example:

 mod a {
     struct Foo { ... }
 }

 mod b {
     extern "C" {
         type Foo;
     }

     unsafe fn use_foo(x: &Foo) { ... }
 }

After running link_incomplete_types:

 mod a {
     struct Foo { ... }
 }

 mod b {
     // 1. Foreign type `Foo` has been deleted
     // 2. `use_foo` now references `Foo` directly
     unsafe fn use_foo(x: &::a::Foo) { ... }
 }

mark_arg_uses

Usage: mark_arg_uses ARG_IDX MARK

Marks: reads MARK; sets/clears MARK

For every fn definition bearing MARK, apply MARK to expressions passed in as argument ARG_IDX in calls to that function. Removes MARK from the original function.

mark_callers

Usage: mark_callers MARK

Marks: reads MARK; sets/clears MARK

For every fn definition bearing MARK, apply MARK to call expressions that call that function. Removes MARK from the original function.

mark_field_uses

Obsolete - use select with match_expr!(typed!(::TheStruct).field) instead

Usage: mark_field_uses FIELD MARK

Marks: reads MARK; sets/clears MARK

For every struct definition bearing MARK, apply MARK to expressions that use FIELD of that struct. Removes MARK from the original struct.

mark_pub_in_mod

Obsolete - use select instead.

Usage: mark_pub_in_mod MARK

Marks: reads MARK; sets MARK

In each mod bearing MARK, apply MARK to every public item in the module.

mark_related_types

Usage: mark_related_types [MARK]

Marks: MARK/target

For each type annotation bearing MARK (default: target), apply MARK to all other type annotations that must be the same type according to (a simplified version of) Rust's typing rules.

For example, in this code:

 fn f(x: i32, y: i32) -> i32 {
     x
 }

The i32 annotations on x and the return type of f are related, because changing these annotations to two unequal types would produce a type error. But the i32 annotation on y is unrelated, and can be changed independently of the other two.

mark_uses

Usage: mark_uses MARK

Marks: reads MARK; sets/clears MARK

For every top-level definition bearing MARK, apply MARK to uses of that definition. Removes MARK from the original definitions.

ownership_annotate

Usage: ownership_annotate [MARK]

Marks: MARK/target

Run ownership analysis on functions bearing MARK (default: target), and add attributes to each function describing its inferred ownership properties. See analysis/ownership/README.md for details on ownership inference.

ownership_mark_pointers

Usage: ownership_mark_pointers [MARK]

Marks: reads MARK/target; sets ref, mut, and box

Run ownership analysis on functions bearing MARK (default: target), then for each pointer type appearing in their argument and return types, apply one of the marks ref, mut, or box, reflecting the results of the ownership analysis. See analysis/ownership/README.md for details on ownership inference.

ownership_split_variants

Usage: ownership_split_variants [MARK]

Marks: MARK/target

Run ownership analysis on functions bearing MARK (default: target), and split each ownership-polymorphic function into multiple monomorphic variants. See analysis/ownership/README.md for details on ownership inference.

pick_node

Test command - not intended for general use.

Usage: pick_node KIND FILE LINE COL

Find a node of kind KIND at location FILE:LINE:COL. If successful, logs the node's ID and span at level info.

print_marks

Test command - not intended for general use.

Usage: print_marks

Marks: reads all

Logs the ID and label of every mark, at level info.

print_spans

Test command - not intended for general use.

Usage: print_spans

Print IDs, spans, and pretty-printed source for all exprs, pats, tys, stmts, and items.

reconstruct_for_range

Usage: reconstruct_for_range

Replaces i = start; while i < end { ...; i += step; } with for i in (start .. end).step_by(step) { ...; }.
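The two loop forms can be compared directly in a small sketch. The function names are hypothetical, and the equivalence assumes that step is nonzero and that the loop body does not modify i: with a zero step the while loop never terminates, whereas step_by panics.

```rust
// Loop shape before the rewrite: explicit counter and `while`.
pub fn visit_while(start: usize, end: usize, step: usize) -> Vec<usize> {
    let mut seen = Vec::new();
    let mut i = start;
    while i < end {
        seen.push(i);
        i += step;
    }
    seen
}

// Loop shape after the rewrite: a range with `step_by`.
pub fn visit_for(start: usize, end: usize, step: usize) -> Vec<usize> {
    let mut seen = Vec::new();
    for i in (start..end).step_by(step) {
        seen.push(i);
    }
    seen
}
```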

reconstruct_while

Obsolete - the translator now does this automatically.

Usage: reconstruct_while

Replaces all instances of loop { if !cond { break; } ... } with while loops.

remove_null_terminator

Usage: remove_null_terminator

Marks: target

Remove a trailing \0 character from marked string and bytestring literal expressions.

Note the mark must be placed on the expression, as it is currently difficult to mark a literal node.

remove_redundant_casts

remove_redundant_let_types

remove_unused_labels

Usage: remove_unused_labels

Removes loop labels that are not used in a named break or continue.

rename_items_regex

Usage: rename_items_regex PAT REPL [FILTER]

Marks: reads FILTER

Replace PAT (a regular expression) with REPL in all item names. If FILTER is provided, only items bearing the FILTER mark will be renamed.

rename_marks

Usage: rename_marks OLD_MARK NEW_MARK

Marks: reads/clears OLD_MARK; sets NEW_MARK

For every node bearing OLD_MARK, remove OLD_MARK and apply NEW_MARK.

rename_struct

Obsolete - use rename_items_regex instead.

Usage: rename_struct NAME

Marks: target

Rename the struct marked target to NAME. Only supports renaming a single struct at a time.

rename_unnamed

Usage: rename_unnamed

Renames every identifier named unnamed throughout the crate, so that each anonymous type in the crate gets a unique name. Run this command after transpiling with c2rust-transpile. It is mainly intended for use with the reorganize_definitions pass, although it can run on any c2rust-transpiled project.

Example:

pub mod foo {
    pub struct unnamed {
        a: i32
    }
}

pub mod bar {
    pub struct unnamed {
        b: usize
    }
}

Becomes:

pub mod foo {
    pub struct unnamed {
        a: i32
    }
}

pub mod bar {
    pub struct unnamed_1 {
        b: usize
    }
}

reorganize_definitions

replace_items

Usage: replace_items

Marks: target, repl

Replace all uses of items marked target with reference to the item marked repl, then remove all target items.

retype_argument

Usage: retype_argument NEW_TY WRAP UNWRAP

Marks: target

For each argument marked target, change the type of the argument to NEW_TY, and use WRAP and UNWRAP to convert values to and from the original type of the argument at call sites and within the function body.

WRAP should contain an expression placeholder __old, and should convert __old from the argument's original type to NEW_TY. UNWRAP should contain an expression placeholder __new, and should perform the opposite conversion.
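As a hand-written sketch of the intended result, consider retyping a C-style integer flag to bool with NEW_TY = bool, WRAP = '__old != 0', and UNWRAP = '__new as i32'. The function names are hypothetical, and the _before and _after suffixes exist only so that both versions compile side by side.

```rust
// Before: `verbose` is a C-style integer flag.
pub fn describe_before(verbose: i32) -> &'static str {
    if verbose != 0 { "verbose" } else { "quiet" }
}

pub fn caller_before() -> &'static str {
    describe_before(1)
}

// After retyping the marked argument to `bool`:
pub fn describe_after(verbose: bool) -> &'static str {
    // Inside the body, uses of the argument go through UNWRAP
    // (`__new as i32`) to recover a value of the original type.
    if (verbose as i32) != 0 { "verbose" } else { "quiet" }
}

pub fn caller_after() -> &'static str {
    // At call sites, the original argument expression goes through WRAP
    // (`__old != 0`) to produce a value of the new type.
    describe_after(1 != 0)
}
```

The conversions leave redundant expressions such as (verbose as i32) != 0 behind, which later rewrites can simplify.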

retype_return

Usage: retype_return NEW_TY WRAP UNWRAP

Marks: target

For each function marked target, change the return type of the function to NEW_TY, and use WRAP and UNWRAP to convert values to and from the original return type at call sites and within the function body.

WRAP should contain an expression placeholder __old, and should convert __old from the function's original return type to NEW_TY. UNWRAP should contain an expression placeholder __new, and should perform the opposite conversion.

retype_static

Usage: retype_static NEW_TY REV_CONV_ASSIGN CONV_RVAL CONV_LVAL [CONV_LVAL_MUT]

Marks: target

For each static marked target, change the type of the static to NEW_TY, using the remaining arguments (which are all expression templates) to convert between the old and new types at the definition and use sites.

The expression arguments are used as follows:

  • REV_CONV_ASSIGN: In direct assignments to the static and in its initializer expression, the original assigned value is wrapped (as __old) in REV_CONV_ASSIGN to produce a value of type NEW_TY.
  • CONV_RVAL: In rvalue contexts, the static is wrapped (as __new) in CONV_RVAL to produce a value of the static's old type.
  • CONV_LVAL and CONV_LVAL_MUT are similar to CONV_RVAL, but for immutable and mutable lvalue contexts respectively. Especially for CONV_LVAL_MUT, the result of wrapping should be an lvalue expression (such as a dereference or field access), not a temporary, as otherwise updates to the static could be lost. CONV_LVAL_MUT is not required for immutable statics, which cannot appear in mutable lvalue contexts.

rewrite_expr

Usage: rewrite_expr PAT REPL [FILTER]

Marks: reads FILTER, if set; may read other marks depending on PAT

For every expression in the crate matching PAT, replace it with REPL. PAT and REPL are both Rust expressions. PAT can use placeholders to capture nodes from the matched AST, and REPL can refer to those same placeholders to substitute in the captured nodes. See the matcher module for details on AST pattern matching.

If FILTER is provided, only expressions marked FILTER will be rewritten. This usage is obsolete - change PAT to marked!(PAT, FILTER) to get the same behavior.

Example:

 fn double(x: i32) -> i32 {
     x * 2
 }

After running rewrite_expr '$e * 2' '$e + $e':

 fn double(x: i32) -> i32 {
     x + x
 }

Here $e * 2 matches x * 2, capturing x as $e. Then x is substituted for $e in $e + $e, producing the final expression x + x.

rewrite_stmts

rewrite_ty

Usage: rewrite_ty PAT REPL [FILTER]

Marks: reads FILTER, if set; may read other marks depending on PAT

For every type in the crate matching PAT, replace it with REPL. PAT and REPL are both Rust types. PAT can use placeholders to capture nodes from the matched AST, and REPL can refer to those same placeholders to substitute in the captured nodes. See the matcher module for details on AST pattern matching.

If FILTER is provided, only types marked FILTER will be rewritten. This usage is obsolete - change PAT to marked!(PAT, FILTER) to get the same behavior.

See the documentation for rewrite_expr for an example of this style of rewriting.

select

Usage: select MARK SCRIPT

Marks: sets MARK; may set/clear other marks depending on SCRIPT

Run node-selection script SCRIPT, and apply MARK to the nodes it selects. See select::SelectOp, select::Filter, and select::parser for details on select script syntax.

select_phase2

Usage: select_phase2 MARK SCRIPT

Marks: sets MARK; may set/clear other marks depending on SCRIPT

Works like select, but stops the compiler's analyses before typechecking happens. This means type information will not be available, and script commands that refer to it will fail.

set_mutability

Usage: set_mutability MUT

Marks: target

Set the mutability of all items marked target to MUT. MUT is either imm or mut. This command only affects static items (including extern statics).

set_visibility

Usage: set_visibility VIS

Marks: target

Set the visibility of all items marked target to VIS. VIS is a Rust visibility qualifier such as pub, pub(crate), or the empty string.

Doesn't handle struct field visibility, for now.

sink_lets

Usage: sink_lets

For each local variable with a trivial initializer, move the local's declaration to the innermost block containing all its uses.

"Trivial" is currently defined as no initializer (let x;) or an initializer without any side effects. This transform requires trivial assignments to avoid reordering side effects.
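The following hand-written sketch shows the kind of change this describes; it is not actual tool output, and the function names scaled_before and scaled_after are invented for the example. The local tmp has no initializer, so it counts as trivial, and all of its uses sit inside the if block.

```rust
// Before: `tmp` is declared at the top of the function, as translated C
// code often does, even though it is only used inside the `if` block.
pub fn scaled_before(flag: bool) -> i32 {
    let tmp: i32; // trivial: no initializer
    let mut total = 0;
    if flag {
        tmp = 21;
        total += tmp * 2;
    }
    total
}

// After (hand-written approximation): the declaration of `tmp` now sits in
// the innermost block that contains all of its uses.
pub fn scaled_after(flag: bool) -> i32 {
    let mut total = 0;
    if flag {
        let tmp: i32;
        tmp = 21;
        total += tmp * 2;
    }
    total
}
```

Because the declaration has no side effects, moving it cannot change the order in which anything observable happens, which is why the transform restricts itself to trivial initializers.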

sink_unsafe

Usage: sink_unsafe

Marks: target

For functions marked target, convert unsafe fn f() { ... } into fn f() { unsafe { ... } }. Useful once unsafe argument handling has been eliminated from the function.
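A minimal hand-written sketch of the two shapes, assuming a function that is unsafe only because it touches a static mut. The names CALLS, count_call_before, and count_call_after are invented for the example.

```rust
static mut CALLS: u32 = 0;

// Before: the whole function is `unsafe`, so every caller needs an
// `unsafe` block even though the arguments (there are none) are harmless.
pub unsafe fn count_call_before() -> u32 {
    CALLS += 1;
    CALLS
}

// After: the function itself is safe to call, and the unsafety is confined
// to a block covering the original body.
pub fn count_call_after() -> u32 {
    unsafe {
        CALLS += 1;
        CALLS
    }
}
```

The body is unchanged; only the location of the unsafe keyword moves. Whether the resulting safe signature is actually sound still has to be judged by the person running the command.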

static_collect_to_struct

Usage: static_collect_to_struct STRUCT VAR

Marks: target

Collect marked statics into a single static struct.

Specifically:

  1. Find all statics marked target. For each one, record its name, type, and initializer expression, then delete it.
  2. Generate a new struct definition named STRUCT. For each marked static, include a field of STRUCT with the same name and type as the static.
  3. Generate a new static mut named VAR whose type is STRUCT. Initialize it using the initializer expressions for the marked statics.
  4. For each marked static foo, replace uses of foo with VAR.foo.

Example:

 static mut FOO: i32 = 100;
 static mut BAR: bool = true;

 unsafe fn f() -> i32 {
     FOO
 }

After running static_collect_to_struct Globals G, with both statics marked:

 struct Globals {
     FOO: i32,
     BAR: bool,
 }

 static mut G: Globals = Globals {
     FOO: 100,
     BAR: true,
 };

 unsafe fn f() -> i32 {
     G.FOO
 }

static_to_local

Usage: static_to_local

Marks: target

Delete each static marked target. For each function that uses a marked static, insert a new local variable definition replicating the marked static.

Example:

 static mut FOO: i32 = 100;  // FOO: target

 unsafe fn f() -> i32 {
     FOO
 }

 unsafe fn g() -> i32 {
     FOO + 1
 }

After running static_to_local:

 // `FOO` deleted

 // `f` gains a new local, replicating `FOO`.
 unsafe fn f() -> i32 {
     let FOO: i32 = 100;
     FOO
 }

 // If multiple functions use `FOO`, each one gets its own copy.
 unsafe fn g() -> i32 {
     let FOO: i32 = 100;
     FOO + 1
 }

static_to_local_ref

Usage: static_to_local_ref

Marks: target, user

For each function marked user, replace uses of statics marked target with uses of newly-introduced reference arguments. Afterward, no user function directly accesses any target static. At call sites of user functions, a reference to the original static is passed in for each new argument if the caller is not itself a user function; otherwise, the caller's own reference argument is passed through. Note this sometimes results in functions gaining arguments corresponding to statics that the function itself does not use, but that its callees do.

Example:

 static mut FOO: i32 = 100;  // FOO: target

 unsafe fn f() -> i32 {  // f: user
     FOO
 }

 unsafe fn g() -> i32 {  // g: user
     f()
 }

 unsafe fn h() -> i32 {
     g()
 }

After running static_to_local_ref:

 static mut FOO: i32 = 100;

 // `f` is a `user` that references `FOO`, so it
 // gains a new argument `FOO_`.
 unsafe fn f(FOO_: &mut i32) -> i32 {
     // References to `FOO` are replaced with `*FOO_`
     *FOO_
 }

 // `g` is a `user` that references `FOO` indirectly,
 // via fellow `user` `f`.
 unsafe fn g(FOO_: &mut i32) -> i32 {
     // `g` passes through its own `FOO_` reference
     // when calling `f`.
     f(FOO_)
 }

 // `h` is not a `user`, so its signature is unchanged.
 unsafe fn h() -> i32 {
     // `h` passes in a reference to the original
     // static `FOO`.
     g(&mut FOO)
 }

struct_assign_to_update

Usage: struct_assign_to_update

Replace all struct field assignments with functional update expressions.

Example:

 let mut x: S = ...;
 x.f = 1;
 x.g = 2;

After running struct_assign_to_update:

 let mut x: S = ...;
 x = S { f: 1, ..x };
 x = S { g: 2, ..x };

struct_merge_updates

Usage: struct_merge_updates

Merge consecutive struct updates into a single update.

Example:

 let mut x: S = ...;
 x = S { f: 1, ..x };
 x = S { g: 2, ..x };

After running struct_merge_updates:

 let mut x: S = ...;
 x = S { f: 1, g: 2, ..x };
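To check that the merged form behaves like the two separate updates, here is a small runnable sketch. The struct S below is a stand-in with fields f and g as in the example plus an extra field h that neither update touches; the function names are invented.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct S {
    pub f: i32,
    pub g: i32,
    pub h: i32,
}

// Two consecutive functional updates, as in the "before" form.
pub fn separate_updates(mut x: S) -> S {
    x = S { f: 1, ..x };
    x = S { g: 2, ..x };
    x
}

// A single merged update, as in the "after" form.
pub fn merged_update(mut x: S) -> S {
    x = S { f: 1, g: 2, ..x };
    x
}
```

Both functions leave h untouched and set f and g, so they return equal values for any input.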

test_analysis_ownership

Test command - not intended for general use.

Usage: test_analysis_ownership

Runs the ownership analysis and dumps the results to stderr.

test_analysis_type_eq

Test command - not intended for general use.

Usage: test_analysis_type_eq

Runs the type_eq analysis and logs the result (at level info).

test_debug_callees

Test command - not intended for general use.

Usage: test_debug_callees

Inspect the details of each Call expression. Used to debug RefactorCtxt::opt_callee_info.

test_f_plus_one

Test command - not intended for general use.

Usage: test_f_plus_one

Replace the expression f(__x) with __x + 1 everywhere it appears.

test_insert_remove_args

Test command - not intended for general use.

Usage: test_insert_remove_args INS REM

In each function marked target, insert new arguments at each index listed in INS (a comma-separated list of integers), then delete the arguments whose original indices are listed in REM.

This is used for testing sequence rewriting of fn argument lists.

test_one_plus_one

Test command - not intended for general use.

Usage: test_one_plus_one

Replace the expression 2 with 1 + 1 everywhere it appears.

test_reflect

Test command - not intended for general use.

Usage: test_reflect

Applies path and ty reflection on every expr in the program.

test_replace_stmts

Test command - not intended for general use.

Usage: test_replace_stmts OLD NEW

Replace statement(s) OLD with NEW everywhere it appears.

test_typeck_loop

Test command - not intended for general use.

Usage: test_typeck_loop

Runs a no-op typechecking loop for three iterations. Used to test the typechecking loop and AST re-analysis code.

type_fix_rules

Usage: type_fix_rules RULE...

Attempts to fix type errors in the crate using the provided rules. Each rule has the form "ectx, actual_ty, expected_ty => cast_expr".

  • ectx is one of rval, lval, lval_mut, or *, and determines in what kinds of expression contexts the rule applies.
  • actual_ty is a pattern to be matched against the (reflected) actual expression type.
  • expected_ty is a pattern to be matched against the (reflected) expected expression type.
  • cast_expr is a template for generating a cast expression.

For expressions in context ectx, whose actual type matches actual_ty and whose expected type matches expected_ty (and where actual != expected), the expr is substituted into cast_expr to replace the original expr with one of the expected type. During substitution, cast_expr has access to variables captured from both actual_ty and expected_ty, as well as __old containing the original (ill-typed) expression.
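As an illustration only, suppose a rule of the form "rval, i32, u32 => __old as u32". This particular rule is an assumption made up for the example, not one taken from the tool's documentation. The hand-written sketch below shows the kind of fix it would describe; takes_u32 and caller are invented names.

```rust
pub fn takes_u32(n: u32) -> u32 {
    n + 1
}

pub fn caller(x: i32) -> u32 {
    // Before the fix this read `takes_u32(x)`, which does not typecheck:
    // the actual type of `x` is `i32` but the expected type is `u32`.
    // Under the assumed rule, `__old` is `x`, so the argument becomes
    // `x as u32`.
    takes_u32(x as u32)
}
```

The cast makes the call typecheck, but it is only as correct as the rule: here a negative x would wrap to a large u32, so rules like this deserve review before being applied crate-wide.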

uninit_to_default

Obsolete - works around translator problems that no longer exist.

Usage: uninit_to_default

In local variable initializers, replace mem::uninitialized() with an appropriate default value of the variable's type.

wrap_api

Usage: wrap_api

Marks: target

For each function foo marked target:

  1. Reset the function's ABI to "Rust" (the default)
  2. Remove any #[no_mangle] or #[export_name] attributes
  3. Generate a new wrapper function called foo_wrapper with foo's old ABI and an #[export_name="foo"] attribute.

Calls to foo are left unchanged. The result is that callers from C use the wrapper function, while internal calls use foo directly, and the signature of foo can be changed freely without affecting external callers.

wrap_extern

Usage: wrap_extern

Marks: target, dest

For each foreign function marked target, generate a wrapper function in the module marked dest, and rewrite all uses of the function to call the wrapper instead.

Example:

 extern "C" {
     fn foo(x: i32) -> i32;
 }

 mod wrappers {
     // empty
 }

 fn main() {
     let x = unsafe { foo(123) };
 }

After transformation, with fn foo marked target and mod wrappers marked dest:

 extern "C" {
     fn foo(x: i32) -> i32;
 }

 mod wrappers {
     unsafe fn foo(x: i32) -> i32 {
         ::foo(x)
     }
 }

 fn main() {
     let x = unsafe { ::wrappers::foo(123) };
 }

Note that this also replaces the function in expressions that take its address, which may cause problems because the wrapper function has a different type than the original (it lacks the extern "C" ABI qualifier).

wrapping_arith_to_normal

Usage: wrapping_arith_to_normal

Replace all uses of wrapping arithmetic methods with ordinary arithmetic operators. For example, replace x.wrapping_add(y) with x + y.
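The two forms agree whenever the operation does not overflow, but not otherwise. The sketch below, with invented function names, shows both sides of that.

```rust
// Form found in translated code: overflow wraps around silently.
pub fn sum_wrapping(x: i32, y: i32) -> i32 {
    x.wrapping_add(y)
}

// Form produced by the rewrite: ordinary `+`.
pub fn sum_normal(x: i32, y: i32) -> i32 {
    x + y
}

// Reports whether `x + y` would overflow, without triggering a panic.
pub fn would_overflow(x: i32, y: i32) -> bool {
    x.checked_add(y).is_none()
}
```

With ordinary operators, an overflowing addition panics in builds that have overflow checks enabled (the default for debug builds), so this command is best applied where the original C code is known not to rely on wraparound.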

Reference

Module Refactor

Refactoring module

Tables

Stmt AST Stmt
Expr AST Expr

Fields

refactor Global refactoring state

Class RefactorState

RefactorState:run_command (name, args) Run a builtin refactoring command
RefactorState:transform (callback) Run a custom refactoring transformation

Class MatchCtxt

MatchCtxt:parse_stmts (pat) Parse statements and add them to this MatchCtxt
MatchCtxt:parse_expr (pat) Parse an expression and add it to this MatchCtxt
MatchCtxt:fold_with (needle, crate, callback) Find matches of pattern within crate and rewrite using callback
MatchCtxt:get_expr (Expression) Get matched binding for an expression variable
MatchCtxt:get_stmt (Statement) Get matched binding for a statement variable
MatchCtxt:try_match (pat, target) Attempt to match target against pat, updating bindings if matched.
MatchCtxt:subst (replacement) Substitute the currently matched AST node with a new AST node

Class TransformCtxt

TransformCtxt:replace_stmts_with (needle, callback) Replace matching statements using given callback
TransformCtxt:replace_expr_with (needle, callback) Replace matching expressions using given callback
TransformCtxt:match () Create a new, empty MatchCtxt
TransformCtxt:get_ast (node) Retrieve a Lua version of an AST node

Tables

Stmt

AST Stmt

Fields:

  • type: "Stmt"
  • kind (string): StmtKind of this statement

StmtKind::Local only:

  • ty (AstNode): Type of local (optional)
  • init (AstNode): Initializer of local (optional)
  • pat (AstNode): Name of local

StmtKind::Item only:

  • item (AstNode): Item node

StmtKind::Semi and StmtKind::Expr only:

  • expr (AstNode): Expression in this statement

Expr

AST Expr

Fields:

  • type: "Expr"
  • kind (string): ExprKind of this expression

ExprKind::Lit only:

  • value: Literal value of this expression

Fields

refactor

Global refactoring state

  • refactor: RefactorState object

Class RefactorState

Refactoring context

RefactorState:run_command (name, args)

Run a builtin refactoring command

Parameters:

  • name (string): Command to run
  • args ({string,...}): List of arguments for the command

RefactorState:transform (callback)

Run a custom refactoring transformation

Parameters:

  • callback (function(TransformCtxt,AstNode)): Transformation function called with a fresh TransformCtxt and the crate to be transformed.

Class MatchCtxt

A match context

MatchCtxt:parse_stmts (pat)

Parse statements and add them to this MatchCtxt

Parameters:

  • pat (string): Pattern to parse

Returns:

  • AstNode: The parsed statements

MatchCtxt:parse_expr (pat)

Parse an expression and add it to this MatchCtxt

Parameters:

  • pat (string): Pattern to parse

Returns:

  • AstNode: The parsed expression

MatchCtxt:fold_with (needle, crate, callback)

Find matches of pattern within crate and rewrite using callback

Parameters:

  • needle (AstNode): Pattern to search for
  • crate (AstNode): Crate to fold over
  • callback (function(AstNode,MatchCtxt)): Function called for each match. Takes the matching node and a new MatchCtxt for that match.

MatchCtxt:get_expr (Expression)

Get matched binding for an expression variable

Parameters:

  • Expression (string): variable pattern

Returns:

  • AstNode: Expression matched by this binding

MatchCtxt:get_stmt (Statement)

Get matched binding for a statement variable

Parameters:

  • Statement (string): variable pattern

Returns:

  • AstNode: Statement matched by this binding

MatchCtxt:try_match (pat, target)

Attempt to match target against pat, updating bindings if matched.

Parameters:

  • pat (AstNode): AST (potentially with variable bindings) to match with
  • target (AstNode): AST to match against

Returns:

  • bool: true if match was successful

MatchCtxt:subst (replacement)

Substitute the currently matched AST node with a new AST node

Parameters:

  • replacement (AstNode): New AST node to replace the currently matched AST. May include variable bindings if these bindings were matched by the search pattern.

Returns:

  • AstNode: New AST node with variable bindings replaced by their matched values

Class TransformCtxt

Transformation context

TransformCtxt:replace_stmts_with (needle, callback)

Replace matching statements using given callback

Parameters:

  • needle (string): Statements pattern to search for, may include variable bindings
  • callback (function(AstNode,MatchCtxt)): Function called for each match. Takes the matching node and a new MatchCtxt for that match. See MatchCtxt:fold_with

TransformCtxt:replace_expr_with (needle, callback)

Replace matching expressions using given callback

Parameters:

  • needle (string): Expression pattern to search for, may include variable bindings
  • callback (function(AstNode,MatchCtxt)): Function called for each match. Takes the matching node and a new MatchCtxt for that match. See MatchCtxt:fold_with

TransformCtxt:match ()

Create a new, empty MatchCtxt

Returns:

  • MatchCtxt: New match context

TransformCtxt:get_ast (node)

Retrieve a Lua version of an AST node

Parameters:

  • node (AstNode): AST node handle

Returns:

  • Struct representation of this AST node. Valid return types are Stmt and Expr.
    generated by LDoc 1.4.6 Last updated 2019-02-21 10:38:12

    c2rust refactor provides a general-purpose rewriting command, rewrite_expr, for transforming expressions. In its most basic form, rewrite_expr replaces one expression with another, everywhere in the crate:

    rewrite_expr '1+1' '2'
    

    Before:

        fn main() {
            println!("{}", 1 + 1);
            println!("{}", 1 + /*comment*/ 1);
            println!("{}", 1 + 11);
        }

    After:

        fn main() {
            println!("{}", 2);
            println!("{}", 2);
            println!("{}", 1 + 11);
        }

    Here, all instances of the expression 1+1 (the "pattern") are replaced with 2 (the "replacement").

    rewrite_expr parses both the pattern and the replacement as Rust expressions, and compares the structure of the expression instead of its raw text when looking for occurrences of the pattern. This lets it recognize that 1 + 1 and 1 + /*comment*/ 1 both match the pattern 1+1 (despite being textually distinct), while 1 + 11 does not (despite being textually similar).
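The idea of comparing structure rather than text can be sketched with a toy expression tree. This is a simplified model written for illustration, not c2rust's implementation: the Expr type and the lit, add, and rewrite helpers are all invented here, and real Rust ASTs are far richer.

```rust
// A toy expression tree: integer literals and additions only.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
}

pub fn lit(n: i64) -> Expr {
    Expr::Lit(n)
}

pub fn add(l: Expr, r: Expr) -> Expr {
    Expr::Add(Box::new(l), Box::new(r))
}

// Replace every subtree that is structurally equal to `pat` with `repl`.
// Whitespace and comments never reach the tree, so they cannot affect
// whether two expressions compare equal.
pub fn rewrite(e: &Expr, pat: &Expr, repl: &Expr) -> Expr {
    if e == pat {
        return repl.clone();
    }
    match e {
        Expr::Lit(_) => e.clone(),
        Expr::Add(l, r) => add(rewrite(l, pat, repl), rewrite(r, pat, repl)),
    }
}
```

In this model, 1 + 1 and 1 + /*comment*/ 1 parse to the same tree, Add(Lit(1), Lit(1)), while 1 + 11 parses to Add(Lit(1), Lit(11)), which is a different tree even though the text looks similar.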

    Metavariables

    In rewrite_expr's expression pattern, any name beginning with double underscores is a metavariable. Just as a variable in an ordinary Rust match expression will match any value (and bind it for later use), a metavariable in an expression pattern will match any Rust code. For example, the expression pattern __x + 1 will match any expression that adds 1 to something:

    rewrite_expr '__x + 1' '11'
    

    Before:

        fn f() -> i32 {
            123
        }

        fn main() {
            println!("a = {}", 1 + 1);
            println!("b = {}", 2 * 3 + 1);
            println!("c = {}", 4 + 5 + 1);
            println!("d = {}", f() + 1);
        }

    After:

        fn f() -> i32 {
            123
        }

        fn main() {
            println!("a = {}", 11);
            println!("b = {}", 11);
            println!("c = {}", 11);
            println!("d = {}", 11);
        }

    In these examples, the __x metavariable matches the expressions 1, 2 * 3, 4 + 5, and f().

    Using bindings

    When a metavariable matches against some piece of code, the code it matches is bound to the variable for later use. Specifically, rewrite_expr's replacement argument can refer back to those metavariables to substitute in the matched code:

    rewrite_expr '__x + 1' '11 * __x'
    

    Before:

        fn f() -> i32 {
            123
        }

        fn main() {
            println!("a = {}", 1 + 1);
            println!("b = {}", 2 * 3 + 1);
            println!("c = {}", 4 + 5 + 1);
            println!("d = {}", f() + 1);
        }

    After:

        fn f() -> i32 {
            123
        }

        fn main() {
            println!("a = {}", 11 * 1);
            println!("b = {}", 11 * (2 * 3));
            println!("c = {}", 11 * (4 + 5));
            println!("d = {}", 11 * f());
        }

    In each case, the expression bound to the __x metavariable is substituted into the right-hand side of the multiplication in the replacement.

    Multiple occurrences

    Finally, the same metavariable can appear multiple times in the pattern. In that case, the pattern matches only if each occurrence of the metavariable matches the same expression. For example:

    rewrite_expr '__x + __x' '2 * __x'
    

    Before:

        fn f() -> i32 {
            123
        }

        fn main() {
            let a = 2;
            println!("{}", 1 + 1);
            println!("{}", a + a);
            println!("{}", f() + f());
            println!("{}", f() + 1);
        }

    After:

        fn f() -> i32 {
            123
        }

        fn main() {
            let a = 2;
            println!("{}", 2 * 1);
            println!("{}", 2 * a);
            println!("{}", 2 * f());
            println!("{}", f() + 1);
        }

    Here 1 + 1, a + a, and f() + f() are all replaced, but f() + 1 is not, because __x cannot match both f() and 1 at the same time.
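The matching, binding, and substitution behavior described in this section can be modeled in a few dozen lines. The sketch below is a toy written for illustration and makes no claim about how c2rust implements it: the Expr type, the Bindings alias, and the try_match and subst functions are invented here (the last two merely echo the names used in the scripting API).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Lit(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    // A metavariable such as `__x`; appears only in patterns and replacements.
    Meta(String),
}

// Code captured so far, keyed by metavariable name.
pub type Bindings = HashMap<String, Expr>;

// Match `target` against `pat`, recording what each metavariable captures.
// Callers should pass fresh bindings per attempt: a failed match may leave
// partial captures behind.
pub fn try_match(pat: &Expr, target: &Expr, b: &mut Bindings) -> bool {
    match (pat, target) {
        (Expr::Meta(name), _) => {
            // A repeated metavariable must capture the same expression again.
            if let Some(bound) = b.get(name) {
                return bound == target;
            }
            b.insert(name.clone(), target.clone());
            true
        }
        (Expr::Lit(p), Expr::Lit(t)) => p == t,
        (Expr::Var(p), Expr::Var(t)) => p == t,
        (Expr::Add(pl, pr), Expr::Add(tl, tr))
        | (Expr::Mul(pl, pr), Expr::Mul(tl, tr)) => {
            try_match(pl, tl, b) && try_match(pr, tr, b)
        }
        _ => false,
    }
}

// Build the replacement, substituting captured code for each metavariable.
pub fn subst(repl: &Expr, b: &Bindings) -> Expr {
    match repl {
        Expr::Meta(name) => b.get(name).cloned().unwrap_or_else(|| repl.clone()),
        Expr::Add(l, r) => Expr::Add(Box::new(subst(l, b)), Box::new(subst(r, b))),
        Expr::Mul(l, r) => Expr::Mul(Box::new(subst(l, b)), Box::new(subst(r, b))),
        _ => repl.clone(),
    }
}
```

With the pattern __x + __x, matching a + a binds __x to a on the left operand and then confirms that the right operand is the same expression; matching a + 1 fails at that second check.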

    Example: adding a function argument

    Suppose we wish to add an argument to an existing function. All current callers of the function should pass a default value of 0 for this new argument. We can update the existing calls like this:

    rewrite_expr 'my_func(__x, __y)' 'my_func(__x, __y, 0)'
    

    Before:

        fn my_func(x: i32, y: i32) {
            /* ... */
        }

        fn main() {
            my_func(1, 2);
            let x = 123;
            my_func(x, x);
            my_func(0, {
                let y = x;
                y + y
            });
        }

    After:

        fn my_func(x: i32, y: i32) {
            /* ... */
        }

        fn main() {
            my_func(1, 2, 0);
            let x = 123;
            my_func(x, x, 0);
            my_func(
                0,
                {
                    let y = x;
                    y + y
                },
                0,
            );
        }

    Every call to my_func now passes a third argument, and we can update the definition of my_func to match.

    Special matching forms

    rewrite_expr supports several special matching forms that can appear in patterns to add extra restrictions to matching.

    def!

    A pattern such as def!(::foo::f) matches any ident or path expression that resolves to the function whose absolute path is ::foo::f. For example, to replace all expressions referencing the function foo::f with ones referencing foo::g:

    rewrite_expr 'def!(::foo::f)' '::foo::g'
    

    Before:

        mod foo {
            fn f() {
                /* ... */
            }
            fn g() {
                /* ... */
            }
        }

        fn main() {
            use self::foo::f;
            // All these calls get rewritten
            f();
            foo::f();
            ::foo::f();
        }

        mod bar {
            fn f() {}

            fn f_caller() {
                // This call does not...
                f();
                // But this one still does
                super::foo::f();
            }
        }

    After:

        mod foo {
            fn f() {
                /* ... */
            }
            fn g() {
                /* ... */
            }
        }

        fn main() {
            use self::foo::f;
            // All these calls get rewritten
            f();
            ::foo::g();
            ::foo::g();
        }

        mod bar {
            fn f() {}

            fn f_caller() {
                // This call does not...
                f();
                // But this one still does
                ::foo::g();
            }
        }

    This works for all direct references to f, whether by relative path (foo::f), absolute path (::foo::f), or imported identifier (just f, with use foo::f in scope). It can even handle imports under a different name (f2 with use foo::f as f2 in scope), since it checks only the path of the referenced definition, not the syntax used to reference it.

    Under the hood

    When rewrite_expr attempts to match def!(path) against some expression e, it actually completely ignores the content of e itself. Instead, it performs these steps:

    1. Check rustc's name resolution results to find the definition d that e resolves to. (If e doesn't resolve to a definition, then the matching fails.)
    2. Construct an absolute path dpath referring to d. For definitions in the current crate, this path looks like ::mod1::def1. For definitions in other crates, it looks like ::crate1::mod1::def1.
    3. Match dpath against the path pattern provided as the argument of def!. Then e matches def!(path) if dpath matches path, and fails to match otherwise.
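The three steps can be modeled with a lookup table standing in for rustc's name resolution. This is a toy sketch under that assumption: the Resolutions type and the matches_def function are invented for illustration, paths are plain strings, and metavariables in the path pattern are not modeled.

```rust
use std::collections::HashMap;

// Stand-in for name resolution results: maps an expression as written at a
// use site to the absolute path of the definition it resolves to.
pub type Resolutions = HashMap<String, String>;

// Decide whether `expr` matches `def!(pattern_path)`.
pub fn matches_def(pattern_path: &str, expr: &str, res: &Resolutions) -> bool {
    match res.get(expr) {
        // Steps 1 and 2: `expr` resolves to a definition whose absolute
        // path is `dpath`. Step 3: compare that path with the pattern.
        // The text of `expr` itself plays no further role.
        Some(dpath) => dpath.as_str() == pattern_path,
        // `expr` does not resolve to a definition, so matching fails.
        None => false,
    }
}
```

Because only the generated absolute path is compared, f and foo::f match the same pattern when they resolve to the same definition, and a pattern that spells the path differently from the generated one does not match even if it names the same item.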

    Debugging match failures

    Matching with def! can sometimes fail in surprising ways, since the user-provided path is matched against a generated path that may not appear explicitly anywhere in the source code. For example, this attempt to match HashMap::new does not succeed:

    rewrite_expr
        'def!(::std::collections::hash_map::HashMap::new)()'
        '::std::collections::hash_map::HashMap::with_capacity(10)'
    

    Before and after (the code is left unchanged):

        use std::collections::hash_map::HashMap;

        fn main() {
            let m: HashMap<i32, i32> = HashMap::new();
        }

    The debug_match_expr command exists to diagnose such problems. It takes only a pattern, and prints information about attempts to match it at various points in the crate:

    debug_match_expr 'def!(::std::collections::hash_map::HashMap::new)()'
    

    Here, its output includes this line:

    def!(): trying to match pattern path(::std::collections::hash_map::HashMap::new) against AST path(::std::collections::HashMap::new)
    

    This reveals the problem: the absolute path def! generates for HashMap::new uses the reexport at std::collections::HashMap, not the canonical definition at std::collections::hash_map::HashMap. Updating the previous rewrite_expr command allows it to succeed:

    rewrite_expr
        'def!(::std::collections::HashMap::new)()'
        '::std::collections::HashMap::with_capacity(10)'
    

    Before:

        use std::collections::hash_map::HashMap;

        fn main() {
            let m: HashMap<i32, i32> = HashMap::new();
        }

    After:

        use std::collections::hash_map::HashMap;

        fn main() {
            let m: HashMap<i32, i32> = ::std::collections::HashMap::with_capacity(10);
        }

    Metavariables

    The argument to def! is a path pattern, which can contain metavariables just like the overall expression pattern. For instance, we can rewrite all calls to functions from the foo module:

    rewrite_expr 'def!(::foo::__name)()' '123'
    

    Before:

        mod foo {
            fn f() {
                /* ... */
            }
            fn g() {
                /* ... */
            }
        }

        mod bar {
            fn f() {
                /* ... */
            }
            fn g() {
                /* ... */
            }
        }

        fn main() {
            foo::f();
            foo::g();
        }

    After:

        mod foo {
            fn f() {
                /* ... */
            }
            fn g() {
                /* ... */
            }
        }

        mod bar {
            fn f() {
                /* ... */
            }
            fn g() {
                /* ... */
            }
        }

        fn main() {
            123;
            123;
        }

    Since every definition in the foo module has an absolute path of the form ::foo::(something), they all match the expression pattern def!(::foo::__name).

    Like any other metavariable, the ones in a def! path pattern can be used in the replacement expression to substitute in the captured name. For example, we can replace all references to items in the foo module with references to the same-named items in the bar module:

    rewrite_expr 'def!(::foo::__name)' '::bar::__name'
    

      mod foo {
          fn f() {
              /* ... */
          }
          fn g() {
              /* ... */
          }
      }

      mod bar {
          fn f() {
              /* ... */
          }
          fn g() {
              /* ... */
          }
      }

      fn main() {
    -     foo::f();
    +     ::bar::f();
    -     foo::g();
    +     ::bar::g();
      }

    Note, however, that each metavariable in a path pattern can match only a single ident. This means foo::__name will not match the path to an item in a submodule, such as foo::one::two. Handling these would require an additional rewrite step, such as rewrite_expr 'def!(::foo::__name1::__name2)' '::bar::__name1::__name2'.
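
    The one-ident-per-metavariable rule can be pictured with a toy matcher that compares a path pattern and a path segment by segment. This is a simplified sketch for illustration, not c2rust's implementation; the function name match_path is hypothetical, and the sketch ignores details such as a metavariable appearing twice in one pattern:

```rust
/// Toy model of a path pattern: each segment is a literal or a metavariable.
/// Returns the captured (metavariable, ident) pairs on success.
fn match_path<'a>(pattern: &'a str, path: &'a str) -> Option<Vec<(&'a str, &'a str)>> {
    let pat: Vec<&str> = pattern.trim_start_matches("::").split("::").collect();
    let segs: Vec<&str> = path.trim_start_matches("::").split("::").collect();
    // One pattern segment per path segment: a metavariable never spans two idents.
    if pat.len() != segs.len() {
        return None;
    }
    let mut captures = Vec::new();
    for (p, s) in pat.iter().zip(segs.iter()) {
        if p.starts_with("__") {
            // Metavariable: matches any single ident and records it.
            captures.push((*p, *s));
        } else if p != s {
            // Literal segment: must be identical.
            return None;
        }
    }
    Some(captures)
}

fn main() {
    assert_eq!(match_path("::foo::__name", "::foo::f"), Some(vec![("__name", "f")]));
    assert_eq!(match_path("::foo::__name", "::foo::one::two"), None);
    assert_eq!(
        match_path("::foo::__name1::__name2", "::foo::one::two"),
        Some(vec![("__name1", "one"), ("__name2", "two")])
    );
}
```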

    typed!

    A pattern of the form typed!(e, ty) matches any expression that matches the pattern e, but only if the type of that expression matches the pattern ty. For example, we can perform a rewrite that only affects i32s:

    rewrite_expr 'typed!(__e, i32)' '0'
    

      fn main() {
    -     let x = 100_i32;
    +     let x = 0;

    -     let z = x + y;
    +     let z = 0;

          let a = "hello";
          let b = format!("{}, {}", a, "world");
      }

    Every expression matches the metavariable __e, but only the i32s (whether literals or variables of type i32) are affected by the rewrite.

    Under the hood

    Internally, typed! works much like def!. To match an expression e against typed!(e_pat, ty_pat), rewrite_expr follows these steps:

    1. Consult rustc's typechecking results to get the type of e. Call that type rustc_ty.
    2. rustc_ty is an internal, abstract representation of the type, which is not suitable for matching. Construct a concrete representation of rustc_ty, and call it ty.
    3. Match e against e_pat and ty against ty_pat. Then e matches typed!(e_pat, ty_pat) if both matches succeed, and fails to match otherwise.
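
    The three steps can be sketched with a toy model in which expressions and types are plain strings and a lookup table stands in for rustc's typechecking results. Everything here (TypeTable, pat_matches, typed_matches, example_types) is hypothetical and only mirrors the control flow described above, not the real matching engine:

```rust
use std::collections::HashMap;

/// Toy stand-in for the typechecking results: expression text -> printed type.
type TypeTable = HashMap<&'static str, &'static str>;

/// Toy pattern: "__"-prefixed patterns match anything, others match literally.
fn pat_matches(pat: &str, text: &str) -> bool {
    pat.starts_with("__") || pat == text
}

/// typed!(e_pat, ty_pat) succeeds only if both the expression and its type match.
fn typed_matches(types: &TypeTable, e: &str, e_pat: &str, ty_pat: &str) -> bool {
    // Steps 1-2: look up the type of `e` and use its concrete (printed) form.
    let ty = match types.get(e) {
        Some(ty) => *ty,
        None => return false,
    };
    // Step 3: both matches must succeed.
    pat_matches(e_pat, e) && pat_matches(ty_pat, ty)
}

/// A few expressions with the types a typechecker would assign them.
fn example_types() -> TypeTable {
    let mut t = TypeTable::new();
    t.insert("100_i32", "i32");
    t.insert("x + y", "i32");
    t.insert("\"hello\"", "&str");
    t
}

fn main() {
    let types = example_types();
    // `__e` matches every expression, so the type pattern decides the outcome.
    assert!(typed_matches(&types, "x + y", "__e", "i32"));
    assert!(!typed_matches(&types, "\"hello\"", "__e", "i32"));
}
```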

    Debugging match failures

    When matching fails unexpectedly, debug_match_expr is once again useful for understanding the problem. For example, this rewriting command has no effect:

    rewrite_expr "typed!(__e, &'static str)" '"hello"'
    

    fn main() {
        let a = "hello";
        let b = format!("{}, {}", a, "world");
    }

    Passing the same pattern to debug_match_expr produces output that includes the following:

    typed!(): trying to match pattern type(&'static str) against AST type(&str)
    

    Now the problem is clear: the concrete type representation constructed for matching omits lifetimes. Replacing &'static str with &str in the pattern causes the rewrite to succeed:

    rewrite_expr 'typed!(__e, &str)' '"hello"'
    

      fn main() {
          let a = "hello";
    -     let b = format!("{}, {}", a, "world");
    +     let b = format!("{}, {}", "hello", "hello");
      }

    Metavariables

    The expression pattern and type pattern arguments of typed!(e, ty) are handled using the normal rewrite_expr matching engine, which means they can contain metavariables and other special matching forms. For example, metavariables can capture both parts of the expression and parts of its type for use in the replacement:

    rewrite_expr
        'typed!(Vec::with_capacity(__n), ::std::vec::Vec<__ty>)'
        '::std::iter::repeat(<__ty>::default())
            .take(__n)
            .collect::<Vec<__ty>>()'
    

      fn main() {
    -     let v: Vec<&'static str> = Vec::with_capacity(20);
    +     let v: Vec<&'static str> = ::std::iter::repeat(<&str>::default())
    +         .take(20)
    +         .collect::<Vec<&str>>();

    -     let v: Vec<_> = Vec::with_capacity(10);
    +     let v: Vec<_> = ::std::iter::repeat(<i32>::default())
    +         .take(10)
    +         .collect::<Vec<i32>>();
          // Allow `v`'s element type to be inferred
          let x: i32 = v[0];
      }

    Notice that the rewritten code has the correct element type in the call to default, even in cases where the type is not written explicitly in the original expression! The matching of typed! obtains the inferred type information from rustc, and those inferred types are captured by metavariables in the type pattern.
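
    This example is about capturing types, and it does not claim that the two expressions are equivalent. In standard Rust they are not: Vec::with_capacity only reserves space, while the replacement produces a vector that already holds the default-initialized elements. A short check in plain Rust, with __ty fixed to i32 (the helper name repeated_defaults is ours):

```rust
/// The replacement expression from the example, with __ty = i32 and __n = n.
fn repeated_defaults(n: usize) -> Vec<i32> {
    ::std::iter::repeat(<i32>::default())
        .take(n)
        .collect::<Vec<i32>>()
}

fn main() {
    let before: Vec<i32> = Vec::with_capacity(10);
    let after = repeated_defaults(10);
    // with_capacity only reserves space, so the original vector is empty...
    assert_eq!(before.len(), 0);
    // ...while the replacement holds ten default-initialized elements.
    assert_eq!(after, vec![0; 10]);
}
```

    When writing a rewrite_expr command for real code, it is worth confirming in this way that the replacement preserves the behavior you care about.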

    Example: transmute to <*const T>::as_ref

    This example demonstrates usage of def! and typed!.

    Suppose we have some unsafe code that uses transmute to convert a raw pointer that may be null (*const T) into an optional reference (Option<&T>). This conversion is better expressed using the as_ref method of *const T, and we'd like to apply this transformation automatically.

    Initial attempt

    Here is a basic first attempt:

    rewrite_expr 'transmute(__e)' '__e.as_ref()'
    

      use std::mem;

      unsafe fn foo(ptr: *const u32) {
          let r: &u32 = mem::transmute::<*const u32, Option<&u32>>(ptr).unwrap();

          let opt_r2: Option<&u32> = mem::transmute(ptr);
          let r2 = opt_r2.unwrap();
          let ptr2: *const u32 = mem::transmute(r2);

          {
              use std::mem::transmute;
    -         let opt_r3: Option<&u32> = transmute(ptr);
    +         let opt_r3: Option<&u32> = ptr.as_ref();
              let r3 = opt_r2.unwrap();
          }

          /* ... */
      }

    This has two major shortcomings, which we will address in order:

    1. It works only on code that calls exactly transmute(foo). The instances that import std::mem and call mem::transmute(foo) do not get rewritten.
    2. It rewrites transmutes between any types, not just *const T to Option<&T>. Only transmutes between those types should be replaced with as_ref.

    Identifying transmute calls with def!

    We want to rewrite calls to std::mem::transmute, regardless of how those calls are written. This is a perfect use case for def!:

    rewrite_expr 'def!(::std::intrinsics::transmute)(__e)' '__e.as_ref()'
    

      use std::mem;

      unsafe fn foo(ptr: *const u32) {
    -     let r: &u32 = mem::transmute::<*const u32, Option<&u32>>(ptr).unwrap();
    +     let r: &u32 = ptr.as_ref().unwrap();

    -     let opt_r2: Option<&u32> = mem::transmute(ptr);
    +     let opt_r2: Option<&u32> = ptr.as_ref();
          let r2 = opt_r2.unwrap();
    -     let ptr2: *const u32 = mem::transmute(r2);
    +     let ptr2: *const u32 = r2.as_ref();

          {
              use std::mem::transmute;
    -         let opt_r3: Option<&u32> = transmute(ptr);
    +         let opt_r3: Option<&u32> = ptr.as_ref();
              let r3 = opt_r2.unwrap();
          }

          /* ... */
      }

    Now our rewrite catches all uses of transmute, whether they're written as transmute(foo), mem::transmute(foo), or even ::std::mem::transmute(foo).

    Notice that we refer to transmute as std::intrinsics::transmute: this is the location of its original definition, which is re-exported in std::mem. See the "def!: debugging match failures" section for an explanation of how we discovered this.

    Filtering transmute calls by type

    We now have a command for rewriting all transmute calls, but we'd like it to rewrite only transmutes from *const T to Option<&T>. We can achieve this by filtering the input and output types with typed!:

    rewrite_expr '
        typed!(
            def!(::std::intrinsics::transmute)(
                typed!(__e, *const __ty)
            ),
            ::std::option::Option<&__ty>
        )
    ' '__e.as_ref()'
    

      use std::mem;

      unsafe fn foo(ptr: *const u32) {
    -     let r: &u32 = mem::transmute::<*const u32, Option<&u32>>(ptr).unwrap();
    +     let r: &u32 = ptr.as_ref().unwrap();

    -     let opt_r2: Option<&u32> = mem::transmute(ptr);
    +     let opt_r2: Option<&u32> = ptr.as_ref();
          let r2 = opt_r2.unwrap();
          let ptr2: *const u32 = mem::transmute(r2);

          {
              use std::mem::transmute;
    -         let opt_r3: Option<&u32> = transmute(ptr);
    +         let opt_r3: Option<&u32> = ptr.as_ref();
              let r3 = opt_r2.unwrap();
          }

          /* ... */
      }

    Now only those transmutes that turn *const T into Option<&T> are affected by the rewrite. And because typed! has access to the results of type inference, this works even on transmute calls that are not fully annotated (transmute(foo), not just transmute::<*const T, Option<&T>>(foo)).
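
    For readers unfamiliar with the target of this rewrite, here is a minimal standalone sketch of how as_ref behaves on a raw pointer in standard Rust: it yields None for a null pointer and Some(&T) otherwise. The function read_or_zero is a hypothetical helper written for this illustration:

```rust
/// Reads through a possibly-null pointer, falling back to 0 when it is null.
///
/// Safety: `ptr` must be null or point to a valid, live u32.
unsafe fn read_or_zero(ptr: *const u32) -> u32 {
    // as_ref returns Option<&u32>: None for null, Some(&value) otherwise.
    match unsafe { ptr.as_ref() } {
        Some(r) => *r,
        None => 0,
    }
}

fn main() {
    let value = 7_u32;
    let valid: *const u32 = &value;
    let null: *const u32 = std::ptr::null();
    unsafe {
        assert_eq!(read_or_zero(valid), 7);
        assert_eq!(read_or_zero(null), 0);
    }
}
```

    Compared with transmute, as_ref states the intent directly and makes the null case visible in the type, which is why it is the preferred spelling for this conversion.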

    marked!

    The marked! form is simple: marked!(e, label) matches an expression only if the expression matches the pattern e and is marked with the given label. See the documentation on marks and select for more information.

    Other commands

    Several other refactoring commands use the same pattern-matching engine as rewrite_expr:

    • rewrite_ty PAT REPL (docs) works like rewrite_expr, except it matches and replaces type annotations instead of expressions.
    • abstract SIG PAT (docs) replaces expressions matching a pattern with calls to a newly-created function.
    • type_fix_rules (docs) uses type patterns to find the appropriate rule to fix each type error.
    • select's match_expr (docs) and similar filters use syntax patterns to identify nodes to mark.

    Many refactoring commands in c2rust refactor are designed to work only on selected portions of the crate, rather than affecting the entire crate uniformly. To support this, c2rust refactor has a mark system, which allows marking AST nodes (such as functions, expressions, or type annotations) with simple string labels. Certain commands add or remove marks, while others check the existing marks to identify nodes to transform.

    For example, in a program containing several byte string literals, you can use select to mark a specific one:

    select target 'item(B2); desc(expr);'
    

    static B1: &'static [u8] = b"123";
    static B2: &'static [u8] = b"abc";
    static B3: &'static [u8] = b"!!!";

    Then, you can use bytestr_to_str to change only the marked byte string to an ordinary string literal, leaving the others unaffected:

    bytestr_to_str
    

      static B1: &'static [u8] = b"123";
    - static B2: &'static [u8] = b"abc";
    + static B2: &'static [u8] = "abc";
      static B3: &'static [u8] = b"!!!";

    This ability to limit transformations to specific parts of the program is useful for refactoring a large codebase incrementally, on a module-by-module or function-by-function basis.

    The remainder of this tutorial describes select and related mark-manipulation commands. For details of how marks affect various transformation commands, see the command documentation or read about the marked! pattern for rewrite_expr and other pattern-matching commands.

    Marks

    A "mark" is a short string label that is associated with a node in the AST. Marks can be applied to nodes of most kinds, including items, expressions, patterns, type annotations, and so on. The mark string can be any valid Rust identifier, though most commands that process marks use short words such as target, dest, or new. It's possible to apply multiple distinct marks to the same node, and it's also possible to mark children of marked nodes separately from their parents (for example, to mark an expression and one of its subexpressions).

    Here are some examples.

    select target 'crate; desc(match_expr(2 + 2));'
    

    fn f() -> Option<i32> {
        Some(2 + 2)
    }

    fn g() -> i32 {
        match f() {
            Some(x) => x,
            None => 0,
        }
    }

    The mark indicators in the diff show that the expression 2 + 2 has been marked. Hover over the indicators for more details, such as the label of the added mark.

    As mentioned above, most kinds of nodes can be marked, not only expressions. Here we mark a function, a pattern, and a type annotation:

    select a 'item(f);' ;
    select b 'item(g); desc(match_ty(i32));' ;
    select c 'item(g); desc(match_pat(Some(x)));' ;
    

    fn f() -> Option<i32> {
        Some(2 + 2)
    }

    fn g() -> i32 {
        match f() {
            Some(x) => x,
            None => 0,
        }
    }

    As mentioned above, it's possible to mark the same node twice with different labels. (Marking it twice with the same label is no different from marking it once.) Here's an example of marking a function multiple times:

    select a 'item(f);' ;
    select a 'item(f);' ;
    select b 'item(f);' ;
    

    fn f() -> Option<i32> {
        Some(2 + 2)
    }

    fn g() -> i32 {
        match f() {
            Some(x) => x,
            None => 0,
        }
    }

    As you can see by hovering over the indicators, labels a and b were both added to the function f.
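
    One way to picture these rules is to treat the marks as a set of (node, label) pairs: adding the same pair twice changes nothing, while a second label on the same node is a new entry. The sketch below is a toy model under that assumption, not c2rust's internal data structure; the Marks type and the numeric node id are hypothetical:

```rust
use std::collections::HashSet;

/// Toy model of a mark store: a set of (node id, label) pairs.
#[derive(Default)]
struct Marks {
    set: HashSet<(u32, String)>,
}

impl Marks {
    /// Adds a mark; returns false if the node already carried this label.
    fn add(&mut self, node: u32, label: &str) -> bool {
        self.set.insert((node, label.to_string()))
    }

    /// Lists the labels on a node, sorted for stable output.
    fn labels(&self, node: u32) -> Vec<&str> {
        let mut out: Vec<&str> = self
            .set
            .iter()
            .filter(|(n, _)| *n == node)
            .map(|(_, l)| l.as_str())
            .collect();
        out.sort();
        out
    }
}

fn main() {
    let f = 1; // hypothetical node id for the function `f`
    let mut marks = Marks::default();
    assert!(marks.add(f, "a"));
    assert!(!marks.add(f, "a")); // same label again: no change
    assert!(marks.add(f, "b"));
    assert_eq!(marks.labels(f), vec!["a", "b"]);
}
```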

    Marks on a node have no connection to marks on its parent or child nodes. We can, for example, mark an expression like 2 + 2, then separately mark its subexpressions with either the same or different labels:

    select a 'item(f); desc(match_expr(2 + 2));' ;
    select a 'item(f); desc(match_expr(2)); first;' ;
    select b 'item(f); desc(match_expr(2)); last;' ;
    

    fn f() -> Option<i32> {
        Some(2 + 2)
    }

    fn g() -> i32 {
        match f() {
            Some(x) => x,
            None => 0,
        }
    }

    Hovering over the mark indicators shows precisely what has happened: we marked both 2 + 2 and the first 2 with the label a, and marked the second 2 with the label b.

    The select command

    The select command provides a simple scripting language for applying marks to specific nodes. The basic syntax of the command is:

    select LABEL SCRIPT
    

    select runs a SCRIPT (written in the language described below) to obtain a set of AST nodes, then marks every node in the set with LABEL, which should be a single identifier such as target.

    More concretely, when running the script, select maintains a "current selection", which is a set of AST nodes. Script operations (described below) can extend or modify the current selection. At the end of the script, select marks every node in the current selection with LABEL.

    We next describe a few common select script patterns, followed by details on the available operations and filters.

    Common patterns

    Selecting an item by path

    For items such as functions, type declarations, or traits, the item(path) operation selects the item by its path:

    select target 'item(f);' ;
    select target 'item(T);' ;
    select target 'item(S);' ;
    select target 'item(m::g);' ;
    

    fn f() {}
    trait T {}
    struct S {}
    mod m {
        fn g() {}
    }

    Note that this only works for the kinds of items that can be imported via use. It doesn't handle other kinds of item-like nodes, such as impl methods, which cannot be imported directly.

    Selecting all nodes matching a filter

    The operations crate; desc(filter); together select all nodes (or, equivalently, all descendants of the crate) that match a filter. For example, we can select all expressions matching the pattern 2 + 2 using a match_expr filter:

    select target 'crate; desc(match_expr(2 + 2));'
    

    fn f() -> i32 {
        2 + 2
    }

    const FOUR: i32 = 2 + 2;

    static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4];

    Here we see that crate; desc(filter); can find matching nodes anywhere in the crate: inside function bodies, constant declarations, and even inside the length expression of an array type annotation.

    Selecting filtered nodes inside a parent node

    In the previous example, crate; desc(filter); is made up of two separate script operations. crate selects the entire crate:

    select target 'crate;'
    

    fn f() -> i32 {
        2 + 2
    }

    const FOUR: i32 = 2 + 2;

    static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4];

    Then desc(filter) looks for descendants of selected nodes that match filter, and replaces the current selection with the nodes it finds:

    clear_marks ;
    select target 'crate; desc(match_expr(2 + 2));'
    

    fn f() -> i32 {
        2 + 2
    }

    const FOUR: i32 = 2 + 2;

    static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4];

    (Note: we use clear_marks here only for illustration purposes, to make the diff clearly show the changes between the old and new versions of our select command.)

    Combining desc with operations other than crate allows selecting descendants of only specific nodes. For example, we can find expressions matching 2 + 2, but only within the function f:

    select target 'item(f); desc(match_expr(2 + 2));'
    

    fn f() -> i32 {
        2 + 2
    }

    const FOUR: i32 = 2 + 2;

    static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4];

    In a more complex example, we can use multiple desc calls to target an expression inside of a specific method (recall that methods can't be selected directly with item(path)). We first select the module containing the impl:

    select target 'item(m);'
    

    fn f() -> i32 {
        2 + 2
    }

    mod m {
        struct S;
        impl S {
            fn f(&self) -> i32 {
                2 + 2
            }
        }
    }

    Then we select the method of interest, using the name filter (described below):

    clear_marks ;
    select target 'item(m); desc(name("f"));'
    

    fn f() -> i32 {
        2 + 2
    }

    mod m {
        struct S;
        impl S {
            fn f(&self) -> i32 {
                2 + 2
            }
        }
    }

    And finally, we select the expression inside the method:

    clear_marks ;
    select target 'item(m); desc(name("f")); desc(match_expr(2 + 2));'
    

    fn f() -> i32 {
        2 + 2
    }

    mod m {
        struct S;
        impl S {
            fn f(&self) -> i32 {
                2 + 2
            }
        }
    }

    Combined with some additional filters described below, this approach is quite effective for marking nodes that can't be named with an ordinary import path, such as impl methods or items nested inside functions.

    Script operations

    A select script can consist of any number of operations, which will be run in order to completion. (There is no control flow in select scripts.) Each operation ends with a semicolon, much like Rust statements.

    The remainder of this section documents each script operation.

    crate

    crate (which takes no arguments) adds the root node of the entire crate to the current selection. All functions, modules, and other declarations are descendants of this single root node.

    Example:

    select target 'crate;'
    

    fn f() -> i32 {
        123
    }
    mod m {
        static S: i32 = 0;
    }

    item

    item(p) adds the item identified by the path p to the current selection. The path is resolved the same way as in Rust's use declarations, except that only plain paths are supported, not wildcards or curly-braced blocks.

    select target 'item(m::S);'
    

    fn f() -> i32 {
        123
    }
    mod m {
        static S: i32 = 0;
    }

    Because the item operation only adds to the current selection (as opposed to replacing the current selection with a set containing only the identified item), we can run item multiple times to select several different items at once:

    select target 'item(f); item(m::S); item(m);'
    

    fn f() -> i32 {
        123
    }
    mod m {
        static S: i32 = 0;
    }

    child

    child(f) checks each child of each currently selected node against the filter f, and replaces the current selection with the set of matching children.

    This can be used, for example, to select a static's type annotation without selecting type annotations that appear inside its initializer:

    select target 'item(S); child(kind(ty));'
    

    static S: i32 = 123_u8 as i32;
    const C: u32 = 0;

    To illustrate how this works, here is the AST for the static S item:

    • item static S
      • identifier S (the name of the static)
      • type i32 (the type annotation of the static)
      • expression 123_u8 as i32 (the initializer of the static)
        • expression 123_u8 (the input of the cast expression)
        • type i32 (the target type of the cast expression)

    The static's type annotation is a direct child of the static (and has kind ty, matching the kind(ty) filter), so the type annotation is selected by the example command above. The target type for the cast is not a direct child of the static - rather, it's a child of the initializer expression, which is a child of the static - so it is ignored.

    desc

    desc(f) ("descendant") checks each descendant of each currently selected node against the filter f, and replaces the current selection with the set of matching descendants. This is similar to child, but checks for matching descendants at any depth, not only matching direct children.

    Using the same example as for child, we see that desc selects more nodes:

    select target 'item(S); desc(kind(ty));'
    

    static S: i32 = 123_u8 as i32;
    const C: u32 = 0;

    Specifically, it selects both the type annotation of the static and the target type of the cast expression, as both are descendants of the static (though at different depths). Of course, it still does not select the type annotation of the const C, which is not a descendant of static S at any depth.

    Note that desc only considers the strict descendants of selected nodes: it does not consider a node to be a "depth-zero" descendant of itself. So, for example, the following command selects nothing:

    select target 'item(S); desc(item_kind(static));'
    

    static S: i32 = 123_u8 as i32;
    const C: u32 = 0;

    S itself is a static, but contains no additional statics inside of it, and desc does not consider S itself when looking for item_kind(static) descendants.
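
    The difference between child and desc, including the strict-descendant rule, can be reproduced on a toy tree shaped like the AST listed in the child section. This is a sketch for intuition only: the Node type and the string kind tags are hypothetical and much simpler than the real AST and filter language:

```rust
/// Toy AST node: a kind tag, the source text, and child nodes.
struct Node {
    kind: &'static str,
    text: &'static str,
    children: Vec<Node>,
}

fn node(kind: &'static str, text: &'static str, children: Vec<Node>) -> Node {
    Node { kind, text, children }
}

/// Direct children of `n` whose kind matches.
fn child<'a>(n: &'a Node, kind: &str) -> Vec<&'a Node> {
    n.children.iter().filter(|c| c.kind == kind).collect()
}

/// Strict descendants of `n` (never `n` itself) whose kind matches.
fn desc<'a>(n: &'a Node, kind: &str) -> Vec<&'a Node> {
    let mut out = Vec::new();
    for c in &n.children {
        if c.kind == kind {
            out.push(c);
        }
        out.extend(desc(c, kind));
    }
    out
}

/// Toy tree for `static S: i32 = 123_u8 as i32;`.
fn static_s() -> Node {
    node("item", "static S", vec![
        node("ident", "S", vec![]),
        node("ty", "i32", vec![]),
        node("expr", "123_u8 as i32", vec![
            node("expr", "123_u8", vec![]),
            node("ty", "i32", vec![]),
        ]),
    ])
}

fn main() {
    let s = static_s();
    assert_eq!(child(&s, "ty").len(), 1); // only the annotation
    assert_eq!(child(&s, "ty")[0].text, "i32");
    assert_eq!(desc(&s, "ty").len(), 2); // annotation and cast target
    assert_eq!(desc(&s, "item").len(), 0); // S is not its own descendant
}
```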

    filter

    filter(f) checks each currently selected node against the filter f, and replaces the current selection with the set of matching nodes. Equivalently, filter(f) removes from the current selection any nodes that don't match f.

    Most uses of the filter operation can be replaced by passing a more appropriate filter expression to desc or child, so the examples in this section are somewhat contrived. (filter can still be useful in combination with marked, described below, or in more complex select scripts.)

    Here is a slightly roundabout way to select all items named f. First, we select all items:

    select target 'crate; desc(kind(item));'
    

    fn f() {}
    fn g() {}

    mod m {
        fn f() {}
    }

    Then, we use filter to keep only items named f:

    clear_marks ;
    select target 'crate; desc(kind(item)); filter(name("f"));'
    

    fn f() {}
    fn g() {}

    mod m {
        fn f() {}
    }

    With this command, only descendants of crate matching both filters kind(item) and name("f") are selected. (This could be written more simply as crate; desc(kind(item) && name("f"));.)
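
    The equivalence between the two-step script and the combined filter is the familiar one between chained filters and a single conjunction. A toy illustration in plain Rust, where the Node type, the extra "block" node, and the flattened node list are hypothetical stand-ins for the crate in the example above:

```rust
/// Toy node: only the two properties the filters inspect.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Node {
    kind: &'static str,
    name: &'static str,
}

/// Flattened stand-in for the descendants of the crate.
fn crate_descendants() -> Vec<Node> {
    vec![
        Node { kind: "item", name: "f" },
        Node { kind: "item", name: "g" },
        Node { kind: "item", name: "m" },
        Node { kind: "item", name: "f" }, // m::f
        Node { kind: "block", name: "" }, // a non-item node, for contrast
    ]
}

/// Like `desc(kind(item)); filter(name("f"));`: select, then narrow.
fn two_step(nodes: &[Node]) -> Vec<Node> {
    let selected: Vec<Node> = nodes.iter().copied().filter(|n| n.kind == "item").collect();
    selected.into_iter().filter(|n| n.name == "f").collect()
}

/// Like `desc(kind(item) && name("f"));`: one combined filter.
fn one_step(nodes: &[Node]) -> Vec<Node> {
    nodes
        .iter()
        .copied()
        .filter(|n| n.kind == "item" && n.name == "f")
        .collect()
}

fn main() {
    let all = crate_descendants();
    assert_eq!(two_step(&all), one_step(&all));
    assert_eq!(one_step(&all).len(), 2); // f and m::f
}
```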

    first and last

    first replaces the current selection with a set containing only the first selected node. last does the same with the last selected node. "First" and "last" are determined by a postorder traversal of the AST, so sibling nodes are ordered as expected, and a parent node comes "after" all of its children.
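
    To make the ordering concrete, the following toy sketch performs a postorder traversal of a tiny tree for the expression 2 + 2. The Node type and the labels are hypothetical; the point is only that both operands are visited before the parent expression:

```rust
/// Toy AST node with a label and children.
struct Node {
    label: &'static str,
    children: Vec<Node>,
}

/// Collects labels in postorder: all children first, then the node itself.
fn postorder(n: &Node, out: &mut Vec<&'static str>) {
    for c in &n.children {
        postorder(c, out);
    }
    out.push(n.label);
}

/// Labels of the expression `2 + 2` and its operands in postorder.
fn two_plus_two_order() -> Vec<&'static str> {
    let leaf = |label| Node { label, children: vec![] };
    let sum = Node {
        label: "2 + 2",
        children: vec![leaf("left 2"), leaf("right 2")],
    };
    let mut out = Vec::new();
    postorder(&sum, &mut out);
    out
}

fn main() {
    let order = two_plus_two_order();
    assert_eq!(order, vec!["left 2", "right 2", "2 + 2"]);
    // If all three nodes were selected, `first` would keep the left operand
    // and `last` would keep the parent expression.
    assert_eq!(order.first(), Some(&"left 2"));
    assert_eq!(order.last(), Some(&"2 + 2"));
}
```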

    The first and last operations are most useful for finding places to insert new nodes (such as with the create_item command) while ignoring details such as the specific names or kinds of the nodes around the insertion point. For example, we can use last to easily select the last item in a module. First, we select all the module's items:

    select target 'item(m); child(kind(item));'
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    Then we use last to select only the last such child:

    clear_marks ;
    select target 'item(m); child(kind(item)); last;'
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    Now we could use create_item to insert a new item after the last existing one.

    marked

    marked(l) adds all nodes marked with label l to the current selection. This is useful for more complex marking operations, since (together with the delete_marks command) it allows using temporary marks to manipulate multiple sets of nodes simultaneously.

    For example, suppose we wish to select both the first and the last item in a module. Normally, this would require duplicating the select command, since both first and last replace the entire current selection with the single first or last item. This would be undesirable if the operations for setting up the initial set of items were fairly complex. But with marked, we can save the selection before running first and restore it afterward.

    We begin by selecting all items in the module and saving that selection by marking it with the tmp_all_items label:

    select tmp_all_items 'item(m); child(kind(item));'
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    Next, we use marked to retrieve the tmp_all_items set and take the first item from it. This reduces the current selection to only a single item, but the tmp_all_items marks remain intact for later use.

    select target 'marked(tmp_all_items); first;'
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    We do the same to mark the last item with target:

    select target 'marked(tmp_all_items); last;'
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    Finally, we clean up, removing the tmp_all_items marks using the delete_marks command:

    delete_marks tmp_all_items
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    Now the only marks remaining are the target marks on the first and last items of the module, as we originally intended.

    reset

    reset clears the current selection, leaving any existing marks in place. This is only useful in combination with mark and unmark, as otherwise the operations before a reset have no effect.

    mark and unmark

    These operations allow select scripts to manipulate marks directly, rather than relying solely on the automatic marking of selected nodes at the end of the script. mark(l) marks all nodes in the current selection with label l (immediately, rather than waiting until the select command is finished), and unmark(l) removes label l from all selected nodes.

    mark, unmark, and reset can be used to effectively combine multiple select commands in a single script. Here's the "first and last" example from the marked section, using only a single select command:

    select _dummy '
        item(m); child(kind(item)); mark(tmp_all_items); reset;
        marked(tmp_all_items); first; mark(target); reset;
        marked(tmp_all_items); last; mark(target); reset;
        marked(tmp_all_items); unmark(tmp_all_items); reset;
    '
    

    mod m {
        fn f() {}
        static S: i32 = 0;
        const C: i32 = 1;
    }

    Note that we pass _dummy as the LABEL argument of select, since the desired target marks are applied using the mark operation, rather than relying on the implicit marking done by select.
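
    To make the interaction between the current selection and the marks more concrete, here is a small model of the operations used in the script above. It is only a sketch: the State type is hypothetical, nodes are plain integer IDs, and we assume IDs are numbered in traversal order so that first and last can simply take the smallest and largest selected ID.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical model of select-script state: a current selection plus labeled marks.
#[derive(Default)]
struct State {
    selection: BTreeSet<u32>,               // node IDs currently selected
    marks: BTreeMap<String, BTreeSet<u32>>, // label -> marked node IDs
}

impl State {
    // marked(l): add all nodes marked with `l` to the current selection.
    fn marked(&mut self, label: &str) {
        if let Some(nodes) = self.marks.get(label) {
            self.selection.extend(nodes.iter().copied());
        }
    }
    // mark(l): immediately mark every selected node with `l`.
    fn mark(&mut self, label: &str) {
        self.marks
            .entry(label.to_string())
            .or_default()
            .extend(self.selection.iter().copied());
    }
    // unmark(l): remove label `l` from every selected node.
    fn unmark(&mut self, label: &str) {
        if let Some(nodes) = self.marks.get_mut(label) {
            for id in &self.selection {
                nodes.remove(id);
            }
        }
    }
    // reset: empty the current selection, leaving marks untouched.
    fn reset(&mut self) {
        self.selection.clear();
    }
    // first / last: keep only the first or last selected node
    // (assumes IDs follow traversal order).
    fn first(&mut self) {
        let keep = self.selection.iter().next().copied();
        self.selection = keep.into_iter().collect();
    }
    fn last(&mut self) {
        let keep = self.selection.iter().next_back().copied();
        self.selection = keep.into_iter().collect();
    }
}
```

    Starting from a selection holding the three items of m and replaying the four lines of the script leaves target on only the first and last item, with no tmp_all_items marks remaining.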

    unmark is also useful in combination with marked to interface with non-select mark manipulation commands. For example, suppose we want to mark all occurrences of 2 + 2 that are passed as arguments to a function f. One option is to do this using the mark_arg_uses command, with additional processing by select before and after. Here we start by marking the function f:

    select target 'item(f);'
    

    fn f(x: i32) {
        // ...
    }

    fn g(x: i32) {
        // ...
    }

    fn main() {
        f(1);
        f(2 + 2);
        g(2 + 2);
        let x = 2 + 2;
    }

    Next, we run mark_arg_uses to replace the mark on f with a mark on each argument expression passed to f:

    mark_arg_uses 0 target
    

    fn f(x: i32) {
        // ...
    }

    fn g(x: i32) {
        // ...
    }

    fn main() {
        f(1);
        f(2 + 2);
        g(2 + 2);
        let x = 2 + 2;
    }

    And finally, we use select again to mark only those arguments that match 2 + 2:

    select target 'marked(target); unmark(target); filter(match_expr(2 + 2));'
    

    fn f(x: i32) {
        // ...
    }

    fn g(x: i32) {
        // ...
    }

    fn main() {
        f(1);
        f(2 + 2);
        g(2 + 2);
        let x = 2 + 2;
    }

    Beginning the script with marked(target); unmark(target); copies the set of target-marked nodes into the current selection, then removes the existing marks. The remainder of the script can then operate as usual, manipulating only the current selection with no need to worry about additional marks being already present.

    Filters

    Boolean operators

    Filter expressions can be combined using the boolean operators &&, ||, and !. A node matches the filter f1 && f2 only if it matches f1 and also matches f2, and so on.
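
    The way boolean operators compose can be sketched with a tiny evaluator. The Node and Filter types and the eval function below are hypothetical and far simpler than the real filter language; in particular, Name does an exact string comparison here, whereas the real name filter takes a regular expression.

```rust
// Hypothetical miniature of filter expressions; not the c2rust implementation.
struct Node {
    kind: &'static str,
    name: &'static str,
}

enum Filter {
    Kind(&'static str),
    Name(&'static str), // exact match in this sketch
    And(Box<Filter>, Box<Filter>),
    Or(Box<Filter>, Box<Filter>),
    Not(Box<Filter>),
}

// Evaluate a filter against a single node.
fn eval(filter: &Filter, node: &Node) -> bool {
    match filter {
        Filter::Kind(k) => node.kind == *k,
        Filter::Name(n) => node.name == *n,
        Filter::And(a, b) => eval(a, node) && eval(b, node),
        Filter::Or(a, b) => eval(a, node) || eval(b, node),
        Filter::Not(a) => !eval(a, node),
    }
}
```

    In this model, And(Kind("item"), Name("f")) plays the role of kind(item) && name("f").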

    kind

    kind(k) matches AST nodes whose node kind is k. The supported node kinds are:

    • item - a top-level item, as in struct Foo { ... } or fn foo() { ... }. Includes both items in modules and items defined inside functions or other blocks, but does not include "item-like" nodes inside traits, impls, or extern blocks.
    • trait_item - an item inside a trait definition, such as a method or associated type declaration
    • impl_item - an item inside an impl block, such as a method or associated type definition
    • foreign_item - an item inside an extern block ("foreign module"), such as a C function or static declaration
    • stmt
    • expr
    • pat - a pattern, including single-ident patterns like foo in let foo = ...;
    • ty - a type annotation, such as Foo in let x: Foo = ...;
    • arg - a function or method argument declaration
    • field - a struct, enum variant, or union field declaration
    • itemlike - matches nodes whose kind is any of item, trait_item, impl_item, or foreign_item
    • any - matches any node

    The node kind k can be used alone as shorthand for kind(k). For example, the operation desc(item); is the same as desc(kind(item));.

    item_kind

    item_kind(k) matches itemlike AST nodes whose subkind is k. The itemlike subkinds are:

    • extern_crate
    • use
    • static
    • const
    • fn
    • mod
    • foreign_mod
    • global_asm
    • ty - type alias definition, as in type Foo = Bar;
    • existential - existential type definition, as in existential type Foo: Bar;. Note that existential types are currently an unstable language feature.
    • enum
    • struct
    • union
    • trait - ordinary trait Foo { ... } definition, including unsafe trait
    • trait_alias - trait alias definition, as in trait Foo = Bar;. Note that trait aliases are currently an unstable language feature.
    • impl - including both trait and inherent impls
    • mac - macro invocation. Note that select works on the macro-expanded AST, so macro invocations are never present under normal circumstances.
    • macro_def - 2.0/decl_macro-style macro definition, as in macro foo(...) { ... }. Note that 2.0-style macro definitions are currently an unstable language feature.

    Note that a single item_kind filter can match multiple distinct node kinds, as long as the subkind is correct. For example, item_kind(fn) will match fn items, method trait_items and impl_items, and fn declarations inside extern blocks (foreign_items). Similarly, item_kind(ty) matches ordinary type alias definitions, associated type declarations (in traits) and definitions (in impls), and foreign type declarations inside extern blocks.

    item_kind filters match only those nodes that also match kind(itemlike), as other node kinds have no itemlike subkind.

    The itemlike subkind k can be used alone as shorthand for item_kind(k). For example, the operation desc(fn); is the same as desc(item_kind(fn));.

    pub and mut

    pub matches any item, impl item, or foreign item whose visibility is pub. It currently does not support struct fields, even though they can also be declared pub.

    mut matches static mut items, static mut foreign item declarations, and mutable binding patterns such as the mut foo in let mut foo = ...;.

    name

    name(re) matches itemlikes, arguments, and fields whose name matches the regular expression re. For example, name("[fF].*") matches fn f() { ... } and struct Foo { ... }, but not trait Bar { ... }. It currently does not support general binding patterns, aside from those in function arguments.

    path and path_prefix

    path(p) matches itemlikes and enum variants whose absolute path is p.

    path_prefix(n, p) is similar to path(p), but drops the last n segments of the node's path before comparing to p.
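
    The comparison performed by path_prefix can be sketched as follows. This is an assumption-laden illustration: path_prefix_matches is a hypothetical helper, paths are modeled as "::"-separated strings, and we assume that dropping more segments than the path contains never matches.

```rust
// Sketch of the comparison described for path_prefix(n, p).
// `path_prefix_matches` is a hypothetical helper, not a c2rust function.
fn path_prefix_matches(node_path: &str, n: usize, p: &str) -> bool {
    let segments: Vec<&str> = node_path.split("::").collect();
    if n > segments.len() {
        return false; // assumption: cannot drop more segments than exist
    }
    // Drop the last `n` segments, then compare what remains to `p`.
    let prefix = &segments[..segments.len() - n];
    let wanted: Vec<&str> = p.split("::").collect();
    prefix == wanted.as_slice()
}
```

    With n = 0 this reduces to the exact comparison done by path(p).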

    has_attr

    has_attr(a) matches itemlikes, exprs, and field declarations that have an attribute named a.

    match_*

    match_expr(e) uses rewrite_expr-style AST matching to compare exprs to e, and matches any node where AST matching succeeds. For example, match_expr(__e + 1) matches the expressions 1 + 1, x + 1, and f() + 1, but not 2 + 2.

    match_pat, match_ty, and match_stmt are similar, but operate on pat, ty, and stmt nodes respectively.
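
    As a simplified picture of this kind of matching, the sketch below matches a toy expression type against a pattern in which any variable whose name starts with a double underscore (like __e above) acts as a wildcard. The Expr type and this match_expr function are hypothetical; the real matcher operates on the compiler's AST, and this sketch does not track what each wildcard was bound to.

```rust
// Toy expression AST; the real matcher works on rustc's AST, not on this type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Var(String),
    Call(String), // a call with no arguments, such as f()
    Add(Box<Expr>, Box<Expr>),
}

// Returns true if `expr` matches `pattern`. A variable whose name starts with
// "__" matches any expression; everything else must match structurally.
fn match_expr(pattern: &Expr, expr: &Expr) -> bool {
    match (pattern, expr) {
        (Expr::Var(name), _) if name.starts_with("__") => true,
        (Expr::Add(pl, pr), Expr::Add(el, er)) => match_expr(pl, el) && match_expr(pr, er),
        _ => pattern == expr,
    }
}
```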

    marked

    marked(l) matches nodes that are marked with the label l.

    any_child, all_child, any_desc, and all_desc

    any_child(f) matches nodes that have a child that matches f. all_child(f) matches nodes where all children of the node match f.

    any_desc and all_desc are similar, but consider all descendants instead of only direct children.
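
    The difference between the child and descendant variants can be sketched on a toy tree. The Node type and the four functions below are hypothetical stand-ins named after the filters. Note that, in this sketch, all_child and all_desc are vacuously true for a node with no children; whether the real filters behave the same way is an assumption.

```rust
// Toy tree used to illustrate the four filters; names are hypothetical.
struct Node {
    is_fn: bool,
    children: Vec<Node>,
}

// True if at least one direct child satisfies `f`.
fn any_child(n: &Node, f: &dyn Fn(&Node) -> bool) -> bool {
    n.children.iter().any(|c| f(c))
}

// True if every direct child satisfies `f`.
fn all_child(n: &Node, f: &dyn Fn(&Node) -> bool) -> bool {
    n.children.iter().all(|c| f(c))
}

// True if some descendant (child, grandchild, ...) satisfies `f`.
fn any_desc(n: &Node, f: &dyn Fn(&Node) -> bool) -> bool {
    n.children.iter().any(|c| f(c) || any_desc(c, f))
}

// True if every descendant satisfies `f`.
fn all_desc(n: &Node, f: &dyn Fn(&Node) -> bool) -> bool {
    n.children.iter().all(|c| f(c) && all_desc(c, f))
}
```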

    Other commands

    In addition to select, c2rust refactor contains a number of other mark-manipulation commands. A few of these can be replicated with appropriate select scripts (though using the command is typically easier), but some are more complex.

    copy_marks

    copy_marks OLD NEW adds a mark with label NEW to every node currently marked with OLD.

    delete_marks

    delete_marks OLD removes the label OLD from every node that is currently marked with it.

    rename_marks

    rename_marks OLD NEW behaves like copy_marks OLD NEW followed by delete_marks OLD: it adds a mark with label NEW to every node marked with OLD, then removes OLD from each such node.
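
    The relationship between these three commands can be sketched over a simple set of (node ID, label) pairs. The Marks type and the function bodies are assumptions made for illustration; only the command names come from the tool.

```rust
use std::collections::BTreeSet;

// Marks modeled as (node ID, label) pairs; this representation is an assumption.
type Marks = BTreeSet<(u32, String)>;

// Add a `new` mark to every node currently marked `old`.
fn copy_marks(marks: &mut Marks, old: &str, new: &str) {
    let copies: Vec<(u32, String)> = marks
        .iter()
        .filter(|(_, label)| label.as_str() == old)
        .map(|(id, _)| (*id, new.to_string()))
        .collect();
    marks.extend(copies);
}

// Remove the `old` label from every node that has it.
fn delete_marks(marks: &mut Marks, old: &str) {
    marks.retain(|(_, label)| label.as_str() != old);
}

// rename_marks is copy_marks followed by delete_marks.
fn rename_marks(marks: &mut Marks, old: &str, new: &str) {
    copy_marks(marks, old, new);
    delete_marks(marks, old);
}
```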

    mark_uses

    mark_uses LABEL transfers LABEL marks from definitions to uses. That is, it finds each definition marked with LABEL, marks each use of such a definition with LABEL, then removes LABEL from the definitions. For example, if a static FOO: ... = ... is marked with target, then mark_uses target will add a target mark to every expression FOO that references the marked definition and then remove target from FOO itself.

    For the purposes of this command, a "use" of a definition is a path or identifier that resolves to that definition. This includes expressions (both paths and struct literals), patterns (paths to constants, structs, and enum variants), and type annotations. When a function definition is marked, only the function path itself (the foo::bar in foo::bar(x)) is considered a use, not the entire call expression. Method calls (whether using dotted or UFCS syntax) normally can't be handled at all, as their resolution is "type-dependent" (however, the mark_callers command can sometimes work when mark_uses does not).

    mark_callers

    mark_callers LABEL transfers LABEL marks from function or method definitions to uses. That is, it works like mark_uses, but is specialized to functions and methods. mark_callers uses a more sophisticated means of name resolution that allows it to detect uses via type-dependent method paths, which mark_uses cannot handle.

    For purposes of mark_callers, a "use" is a function call (foo::bar()) or method call (x.foo()) expression where the function or method being called is one of the marked definitions.

    mark_arg_uses

    mark_arg_uses INDEX LABEL transfers LABEL marks from function or method definitions to the argument in position INDEX at each use. That is, it works like mark_callers, but marks the expression passed as argument INDEX instead of the entire call site.

    INDEX is zero-based. However, the self/receiver argument of a method call counts as the first argument (index 0), with the first argument in parentheses having index 1 (arg0.f(arg1, arg2)). For ordinary function calls (including UFCS method calls), the first argument has index 0 (f(arg0, arg1, arg2)).
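
    The indexing rule can be sketched with a toy model of call sites. The Call type and arg_at function are hypothetical and exist only to show how INDEX maps onto function and method calls.

```rust
// Hypothetical call-site model: method calls carry a receiver, plain calls do not.
enum Call<'a> {
    Function { args: Vec<&'a str> },
    Method { receiver: &'a str, args: Vec<&'a str> },
}

// Pick the expression at position `index`, following the rule described above:
// the receiver of a method call is index 0 and shifts the other arguments by one.
fn arg_at<'a>(call: &Call<'a>, index: usize) -> Option<&'a str> {
    match call {
        Call::Function { args } => args.get(index).copied(),
        Call::Method { receiver, args } => {
            if index == 0 {
                Some(*receiver)
            } else {
                args.get(index - 1).copied()
            }
        }
    }
}
```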

    The analysis::ownership module implements a pointer analysis for inferring ownership information in code using raw pointers. The goal is to take code that has been automatically translated from C, and thus uses only raw pointers, and infer which of those raw pointers should be changed to safe &, &mut, or Box pointers. Pointers can appear in a number of places in the input program, but this analysis focuses mainly on function signatures and struct field types.

    Design

    The goal of the analysis is to assign to each raw pointer type constructor a permission value, one of READ, WRITE, and MOVE, corresponding to the Rust pointer types &, &mut, and Box. These permissions form a trivial lattice, where READ < WRITE < MOVE. The READ permission indicates that the pointed-to data may be read, the WRITE permission indicates that the pointed-to data may be modified, and the MOVE permission indicates that the pointed-to data may be "moved", or consumed in a linear-typed fashion. The MOVE permission also includes the ability to free the pointed-to data, which amounts to "moving to nowhere".
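
    One way to picture this lattice is as an ordered Rust enum. The Perm type and rust_pointer function below are an illustrative sketch, not the data structures used by the analysis itself.

```rust
// The three permissions, ordered READ < WRITE < MOVE via the derived `Ord`
// (variants compare in declaration order).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Perm {
    Read,
    Write,
    Move,
}

// The Rust pointer type that each permission corresponds to.
fn rust_pointer(p: Perm) -> &'static str {
    match p {
        Perm::Read => "&T",
        Perm::Write => "&mut T",
        Perm::Move => "Box<T>",
    }
}
```

    Because the lattice is a simple chain, the minimum of two permissions is just the lower of the two, for example Perm::Write.min(Perm::Move) is Perm::Write.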

    Here is a simple example to illustrate the major features of the analysis:

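    // Assumes malloc, free, and c_void are provided by the libc crate.
    use libc::{c_void, free, malloc};
    use std::mem::size_of;

    struct Array {
        data: *mut i32,
    }

    unsafe fn new_array(len: usize) -> *mut Array {
        // malloc returns *mut c_void, so cast to the intended pointer types.
        let data = malloc(size_of::<i32>() * len) as *mut i32;
        let arr = malloc(size_of::<Array>()) as *mut Array;
        (*arr).data = data;
        arr
    }

    unsafe fn delete_array(arr: *mut Array) {
        free((*arr).data as *mut c_void);
        free(arr as *mut c_void);
    }

    unsafe fn element_ptr(arr: *mut Array, idx: usize) -> *mut i32 {
        // offset takes an isize count.
        (*arr).data.offset(idx as isize)
    }

    unsafe fn get(arr: *mut Array, idx: usize) -> i32 {
        let elt: *mut i32 = element_ptr(arr, idx);
        *elt
    }

    unsafe fn set(arr: *mut Array, idx: usize, val: i32) {
        let elt: *mut i32 = element_ptr(arr, idx);
        *elt = val;
    }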
    

    The analysis infers pointer permissions by observing how pointers are used, and applying the rules of the Rust reference model. For instance, the set function's elt pointer must have permission WRITE (or higher), because there is a write to the pointed-to data. Similarly, delete_array's first call to free requires that the pointer in the Array::data field must have permission MOVE. Furthermore, the first free also requires arr to have permission MOVE, because consuming the pointer (*arr).data constitutes a move out of *arr. (In general, the pointer permission sets an upper bound on the permissions of all pointers within the pointed-to data. For example, if arr has permission READ, then *(*arr).data can only be read, not written or moved.)

    The element_ptr function presents an interesting case for analysis, because it is used polymorphically: in get, we would like element_ptr to take a READ *mut Array and return a READ *mut i32, whereas in set we would like the same function to take and return WRITE pointers. In strictly const-correct C code, get and set would respectively call separate const and non-const variants of element_ptr, but a great deal of C code is not const-correct.

    This analysis handles functions like element_ptr by allowing inferred function signatures to be permission polymorphic. Signatures may include permission parameters, which can be instantiated separately at each call site, subject to a set of constraints. For example, here is the inferred polymorphic signature of element_ptr, with permission annotations written in comments (since there is no Rust syntax for them):

    fn element_ptr /* <s0, s1> */ (arr: /* s0 */ *mut Array,
                                   idx: usize)
                                   -> /* s1 */ *mut i32
        /* where s1 <= s0 */;
    

    The function has two permission parameters, s0 and s1, which are the permissions of the argument and return pointers respectively. The signature includes the constraint s1 <= s0, indicating that the output pointer's permission is no higher than that of the input pointer. The function is called in get with permission arguments s0 = s1 = READ and in set with s0 = s1 = WRITE.

    Rust does not support any analogue of the permission polymorphism used in this analysis. To make the results useful in actual Rust code, the analysis includes a monomorphization step, which chooses a set of concrete instantiations for each polymorphic function, and selects an instantiation to use for each call site. In the example above, element_ptr would have both READ, READ and WRITE, WRITE instantiations, with the first being used for the callsite in get and the second at the callsite in set.

    Implementation

    The analysis first computes a polymorphic signature for each function, then monomorphizes to produce functions that can be handled by Rust's type system.

    Both parts of the analysis operate on constraint sets, which contain constraints of the form p1 <= p2. The permissions p1, p2 can be concrete permissions (READ, WRITE, MOVE), permission variables, or expressions of the form min(p1, p2) denoting the less-permissive of two permission values.
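
    A constraint set of this shape can be sketched as follows. The PermExpr type and the eval and satisfied functions are hypothetical illustrations, and variables are identified by plain string names, which is an assumption of this sketch rather than how the analysis represents them.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Perm {
    Read,
    Write,
    Move,
}

// A permission expression: concrete, a named variable, or min of two expressions.
enum PermExpr {
    Concrete(Perm),
    Var(&'static str),
    Min(Box<PermExpr>, Box<PermExpr>),
}

// Evaluate an expression under an assignment of variables to concrete
// permissions. Returns None if a variable is unassigned.
fn eval(e: &PermExpr, env: &HashMap<&'static str, Perm>) -> Option<Perm> {
    match e {
        PermExpr::Concrete(p) => Some(*p),
        PermExpr::Var(v) => env.get(v).copied(),
        PermExpr::Min(a, b) => Some(eval(a, env)?.min(eval(b, env)?)),
    }
}

// Check that every constraint `lhs <= rhs` holds under the assignment.
fn satisfied(cs: &[(PermExpr, PermExpr)], env: &HashMap<&'static str, Perm>) -> bool {
    cs.iter().all(|(l, r)| match (eval(l, env), eval(r, env)) {
        (Some(a), Some(b)) => a <= b,
        _ => false,
    })
}
```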

    Permission variables appear on pointer type constructors in the types of static variables and struct fields ("static" variables), in the types within function signatures ("sig"), in the types of temporaries and local variables ("local"), and at callsites for instantiating a permission polymorphic function ("inst"). Variables are marked with their origin, as variables from different locations are handled in different phases of the analysis.

    The overall goal of the analysis is to produce assignments to static and sig variables that satisfy all the relevant constraints (or multiple assignments, when monomorphizing polymorphic functions).

    Polymorphic signatures

    The permission variables of each function's polymorphic signature are easily determined: for simplicity, the analysis introduces one variable for each occurrence of a pointer type constructor in the function signature. Cases that might otherwise involve a single variable appearing at multiple locations in the signature are instead handled by adding constraints between the variables. The main task of the first part of the analysis is to compute the constraints over the signature variables of each function. This part of the analysis must also build an assignment of permission values to all static vars, which are not involved in any sort of polymorphism.

    Constraints arise mainly at assignments and function call expressions.

    At assignments, the main constraint is that, if the assigned value has a pointer type, the permission on the LHS pointer type must be no greater than the permission on the RHS pointer type (lhs <= rhs). In other words, an assignment of a pointer may downgrade the permission value of that pointer, but may never upgrade it. In non-pointer types, and in the pointed-to type of an outermost pointer type, all permission values occurring in the two types must be equal (lhs <= rhs and rhs <= lhs).

    Assignments also introduce two additional constraints, both relating to path permissions. The path permission for an expression is the minimum of the permission values on all pointers dereferenced in the expression. For example, in *(*x).f, the path permission is the minimum of the permission on the local variable x and the permission on the struct field f. The calculation of path permissions reflects the transitive nature of access restrictions in Rust: for example, if a struct field x.f has type &mut T, but x is an immutable reference (&S), then only immutable access is allowed to *x.f.

    The two additional constraints introduced by assignments are (1) the path permission of the LHS must be no lower than WRITE, and (2) the path permission of the RHS must be no lower than the permission of the LHS pointer type. Constraint (1) prevents writing through a READ pointer, or through any path containing a READ pointer. Constraint (2) prevents assigning a WRITE pointer accessed through a READ path (or a MOVE pointer accessed through a WRITE or READ path) to a WRITE pointer variable, which would allow bypassing the READ restriction.
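
    The assignment rules can be condensed into a small check. This is a sketch under stated assumptions: the Assignment struct and both functions are hypothetical, and an expression that dereferences no pointers is given path permission MOVE (the top of the lattice), which is our own choice for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Perm {
    Read,
    Write,
    Move,
}

// Path permission: the minimum permission over all pointers dereferenced in an
// expression. With no dereferences we use MOVE (an assumption of this sketch).
fn path_permission(derefs: &[Perm]) -> Perm {
    derefs.iter().copied().min().unwrap_or(Perm::Move)
}

// Permissions involved in one pointer assignment `lhs = rhs`.
struct Assignment {
    lhs_path: Perm, // path permission of the LHS expression
    lhs_ptr: Perm,  // permission on the LHS pointer type
    rhs_path: Perm, // path permission of the RHS expression
    rhs_ptr: Perm,  // permission on the RHS pointer type
}

fn assignment_ok(a: &Assignment) -> bool {
    a.lhs_ptr <= a.rhs_ptr           // main constraint: never upgrade a pointer
        && a.lhs_path >= Perm::Write // (1) the LHS must be reachable for writing
        && a.rhs_path >= a.lhs_ptr   // (2) the RHS path must allow the LHS permission
}
```

    For example, with x a READ pointer and f a WRITE pointer field, the assignment q = (*x).f is rejected when q is a WRITE pointer (constraint 2 fails) but accepted when q is a READ pointer.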

    Function calls require additional work. At each call site, the analysis copies in the callee's constraints, substituting a fresh "instantiation" ("inst") variable for each variable in the callee's signature. It then links the new inst variables to the surrounding locals by processing a "pseudo-assignment" from each argument expression to the corresponding formal parameter type in the substituted signature, and from the return type to the lvalue expression where the result is to be stored. The effect is to allow the analysis to "reason through" the function call, relating the (local) return value to the caller's argument expressions. Copying the constraints instead of relying on a concrete instantiation permits precise reasoning about polymorphic functions that call other polymorphic functions.

    The final step for each function is to simplify the constraint set by eliminating "local", "inst", and "static" permission variables. Local variables have no connection to types outside the current function, and can be simplified away without consequence. Eliminating static and instantiation variables requires fixed-point iteration, which is described below. The result of the simplification is a set of constraints over only the function's sig variables, which is suitable for use as the constraint portion of the function signature.

    Since each function's signature depends on the signatures of its callees, and functions may be recursive, a fixed-point iteration step is required to compute the final constraint set for each function. To simplify the implementation, the polymorphic signature construction part of the analysis is split into two phases. The intraprocedural phase visits every function once and generates constraints for that function, but doesn't copy in constraints from callees, which may not have been processed yet. This phase records details of each call site for later use. The intraprocedural phase eliminates local variables at the end of each function, but it does not have enough information to safely eliminate static and inst variables. The interprocedural phase updates each function in turn, substituting in callees' sig constraints and simplifying away static and inst variables to produce a new, more accurate set of sig constraints for the current function, and iterates until it reaches a fixed point. The interprocedural phase also computes an assignment of concrete permission values to static variables, during the process of removing static variables from functions' constraint sets.

    Monomorphization

    The first part of the analysis infers a permission polymorphic signature for each function, but Rust does not support this form of polymorphism. To make the analysis results applicable to actual Rust code, the analysis must provide enough information to allow monomorphizing functions - that is, producing multiple copies of each function with different concrete instantiations of the permission variables.

    Monomorphization begins by collecting all "useful" monomorphic signatures for each function. The analysis identifies all signature variables that appear in output positions (in the return type, or behind a pointer whose permission value is always at least WRITE), then enumerates all assignments to those output variables that are allowed by the function's constraints. For each combination of outputs, it finds the least-restrictive valid assignment of permissions to the remaining (input) variables. For example, given this function:

    fn element_ptr /* <s0, s1> */ (arr: /* s0 */ *mut Array,
                                   idx: usize)
                                   -> /* s1 */ *mut i32
        /* where s1 <= s0 */;
    

    The only output variable is s1, which appears in the return type. The monomorphization step will try each assignment to s1 that is allowed by the constraints. Since the only constraint is s1 <= s0, READ, WRITE, and MOVE are all valid. For each of these, it finds the least restrictive assignment to s0 that is compatible with the assignment to s1. For example, when s1 = MOVE, only s0 = MOVE is valid, so the analysis records MOVE, MOVE as a monomorphization for the element_ptr function. When s1 = WRITE, both s0 = MOVE and s0 = WRITE satisfy the constraints, but s0 = WRITE is less restrictive - it allows calling the function with both MOVE and WRITE pointers, while setting s0 = MOVE allows only MOVE pointers. So the analysis records arguments WRITE, WRITE as another monomorphization, and by similar logic records READ, READ as the final one.
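
    The enumeration just described can be sketched for a signature with one input variable s0 and one output variable s1. The monomorphizations function is a hypothetical simplification: it takes the constraint as a closure over (s0, s1) and, for each output value, picks the lowest input permission that satisfies it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Perm {
    Read,
    Write,
    Move,
}

// All permissions in ascending order.
const ALL: [Perm; 3] = [Perm::Read, Perm::Write, Perm::Move];

// For each output permission s1, find the least restrictive input s0 allowed
// by the constraint, that is, the lowest s0 for which `allowed(s0, s1)` holds.
// Returns the resulting (s0, s1) pairs.
fn monomorphizations(allowed: impl Fn(Perm, Perm) -> bool) -> Vec<(Perm, Perm)> {
    let mut result = Vec::new();
    for &s1 in ALL.iter() {
        // ALL is sorted ascending, so the first valid s0 is the lowest one.
        if let Some(&s0) = ALL.iter().find(|&&s0| allowed(s0, s1)) {
            result.push((s0, s1));
        }
    }
    result
}
```

    Calling monomorphizations(|s0, s1| s1 <= s0) yields the three signatures READ, READ; WRITE, WRITE; and MOVE, MOVE.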

    The next step of monomorphization is to select a monomorphic variant to call at each callsite of each monomorphized function. Given a pair of functions:

    fn f /* <s0, s1> */ (arr: /* s0 */ *mut Array) -> /* s1 */ *mut i32
            /* where s1 <= s0 */ {
        g(arr)
    }
    
    fn g /* <s0, s1> */ (arr: /* s0 */ *mut Array) -> /* s1 */ *mut i32
            /* where s1 <= s0 */ {
        ...
    }
    

    For pointer permissions to line up properly, a monomorphic variant of f specialized to READ, READ will need to call a variant of g also specialized to READ, READ, and a variant of f specialized to WRITE, WRITE will need to call a WRITE, WRITE variant of g.

    To infer this information, the analysis separately considers each monomorphic signature of each function. It performs a backtracking search to select, for each callsite in the function, a monomorphic signature of the callee, such that all of the calling function's constraints are satisfied, including constraints setting the caller's sig variables equal to the concrete permissions in the monomorphic signature. The table of callee monomorphization selections is included in the analysis results so that callsites can be updated appropriately when splitting functions for monomorphization.
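
    The shape of such a search can be sketched as below. Everything here is a simplification we are assuming for illustration: a signature is just an (argument, return) pair, the compatible function encodes the constraints for a caller that passes its argument straight to the callee and returns the result, and the caller's constraints are supplied as a closure over the choices made so far.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Perm {
    Read,
    Write,
    Move,
}

type Sig = (Perm, Perm); // (argument permission, return permission)

// Hypothetical constraint for `fn f(arr) -> ptr { g(arr) }`: the argument may
// be downgraded when passed to the callee, and the callee's result may be
// downgraded when returned from the caller.
fn compatible(caller: Sig, callee: Sig) -> bool {
    callee.0 <= caller.0 && caller.1 <= callee.1
}

// Backtracking search: `sites[i]` lists the candidate monomorphic signatures
// of the callee at call site i. Choose one per site so that `ok` accepts the
// choices; on success, `chosen` holds the selection.
fn select(sites: &[Vec<Sig>], chosen: &mut Vec<Sig>, ok: &dyn Fn(&[Sig]) -> bool) -> bool {
    if chosen.len() == sites.len() {
        return true;
    }
    for &cand in &sites[chosen.len()] {
        chosen.push(cand);
        if ok(chosen.as_slice()) && select(sites, chosen, ok) {
            return true;
        }
        chosen.pop(); // backtrack and try the next candidate
    }
    false
}
```

    With a single call to g and the caller specialized to WRITE, WRITE, the search rejects g's READ, READ variant and settles on WRITE, WRITE, matching the example above.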

    Annotations

    The ownership analysis supports annotations to specify the permission types of functions and struct fields. These annotations serve two purposes. First, the user can annotate functions to provide custom signatures for functions on which the analysis produces inaccurate results. Signatures provided this way will be propagated throughout the analysis, so manually correcting a single wrongly-inferred function can fix the inference results for its callers as well. Second, the ownership system provides an ownership_annotate command that adds annotations to functions reflecting their inferred signatures. The user can then read the generated annotations to check the analysis results, and optionally edit them to improve precision, before proceeding with further code transformations.

    There are four annotation types currently supported by the ownership system.

    • #[ownership_static(<perms>)] provides concrete permission values for all pointer types in a static declaration or struct field. The perms argument is a comma-separated sequence of concrete permission tokens (READ, WRITE, MOVE). The given permission values will be applied to the pointers in the static or field type, following a preorder traversal of the type. For example:

      struct S {
          #[ownership_static(READ, WRITE, MOVE)]
          f: *mut (*mut u8, *mut u16)
      }
      

      Here the outermost pointer will be given permission READ, the pointer to u8 will be given permission WRITE, and the pointer to u16 will be given permission MOVE.

    • #[ownership_constraints(<constraints>)] provides the signature constraints for the annotated function, overriding polymorphic signature inference. The argument constraints is a comma-separated sequence of constraints of the form le(<perm1>, <perm2>), each representing a single constraint perm1 <= perm2. The permissions used in each constraint may be any combination of concrete permissions (READ, WRITE, MOVE), permission variables (_0, _1, ...), or expressions of the form min(p1, p2, ...). (The permission syntax is limited by the requirement for compatibility with Rust's attribute syntax.)

      The permission variables used in constraints always refer to signature variables of the annotated function. A signature variable is introduced for each pointer type constructor in the function's signature, and they are numbered according to a preorder traversal of each node in the argument and return types of the function. This example shows the location of each variable in a simple signature:

      fn get_err(arr: /* _0 */ *mut Array,
                 element_out: /* _1 */ *mut /* _2 */ *mut i32)
                 -> /* _3 */ *const c_char;
      
    • #[ownership_mono(<suffix>, <perms>)] supplies a monomorphic signature to be used for the annotated function. The suffix argument is a quoted string, which (if non-empty) will be used when splitting polymorphic functions into monomorphic variants to construct a name for the monomorphized copy of the function. The perms argument is a comma-separated list of concrete permission tokens, giving the permissions to be used in the function signature in this monomorphization.

      The ownership_mono annotation can appear multiple times on a single function to provide multiple monomorphic signatures. However, if it appears at all, monomorphization inference will be completely overridden for the annotated function, and only the provided signatures will be used in callee argument inference and later transformations.

      Example:

      #[ownership_mono("mut", WRITE, WRITE)]
      #[ownership_mono("", READ, READ)]
      fn first(arr: *mut Array) -> *mut i32;
      

      This function will have two monomorphic variants, one where both pointers' permission values are WRITE and one where both are READ. When the ownership_split_variants command splits the function into its monomorphic variants, the WRITE variant will be named first_mut and the READ variant will keep the original name first.

    • #[ownership_variant_of(<name>)] is used to combine source-level functions into variant groups. See the section on variant groups for details.
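
    The preorder numbering of signature variables described for ownership_constraints can be sketched as follows. This is only an illustration, not the analysis's actual code: the Ty enum and the signature_vars function are hypothetical names, and the model covers nothing but raw pointers and named leaf types.

```rust
/// A tiny, hypothetical model of Rust types: just enough to show pointer nesting.
#[derive(Debug)]
pub enum Ty {
    /// A non-pointer type such as `i32` or `Array`.
    Named(&'static str),
    /// `*mut T`
    MutPtr(Box<Ty>),
    /// `*const T`
    ConstPtr(Box<Ty>),
}

/// Visits `ty` in preorder and assigns the next variable index to every
/// pointer type constructor it meets (outer pointer first, then its pointee).
fn number_ty(ty: &Ty, next: &mut usize, out: &mut Vec<String>) {
    let (kind, inner) = match ty {
        Ty::Named(_) => return, // leaf types introduce no variable
        Ty::MutPtr(inner) => ("*mut", inner),
        Ty::ConstPtr(inner) => ("*const", inner),
    };
    out.push(format!("_{} {}", *next, kind));
    *next += 1;
    number_ty(inner, next, out);
}

/// Numbers all pointer constructors of a signature: arguments left to right,
/// then the return type.
pub fn signature_vars(args: &[Ty], ret: &Ty) -> Vec<String> {
    let mut next = 0;
    let mut out = Vec::new();
    for arg in args {
        number_ty(arg, &mut next, &mut out);
    }
    number_ty(ret, &mut next, &mut out);
    out
}
```

    Applied to the get_err signature shown above (arguments *mut Array and *mut *mut i32, return type *const c_char), the sketch assigns _0 to the pointer in arr, _1 and _2 to the outer and inner pointers of element_out, and _3 to the return pointer.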

    Variant Groups

    The "variant group" mechanism allows combining several source-level functions into a single logical function for purposes of the analysis. This is useful for combining a function that was previously split into monomorphic variants back into a single logical function. This allows for a sort of "modular refactoring", in which the user focuses on one module at a time, analyzing, annotating, and splitting variants in only that module before moving on to another.

    As a concrete example of the purpose of this feature, consider the following code:

    fn f(arr: *mut Array) -> *mut i32 { ... g(arr) ... }
    
    fn g(arr: *mut Array) -> *mut i32 { ... }
    

    The user works first on (the module containing) g, resulting in splitting g into two variants:

    fn f(arr: *mut Array) -> *mut i32 { ... g_mut(arr) ... }
    
    fn g(arr: *mut Array) -> *mut i32 { ... }
    fn g_mut(arr: *mut Array) -> *mut i32 { ... }
    

    Note that, because there is still only one variant of f, the transformation must choose a single g variant for f to call. In this case, it chose the g_mut variant.

    Later, the user works on f. If g and g_mut are treated as separate functions, then there are two possibilities. First, if the constraints on g_mut are set up (or inferred) to require WRITE permission for arr, then only a WRITE variant of f will be generated. Or second, if the constraints are relaxed, then f may get both READ and WRITE variants, but both will (wrongly) call g_mut.

    Treating g and g_mut as two variants of a single function allows the analysis to switch between g variants in the different variants of f, resulting in correct code like the following:

    fn f(arr: *mut Array) -> *mut i32 { ... g(arr) ... }
    fn f_mut(arr: *mut Array) -> *mut i32 { ... g_mut(arr) ... }
    
    fn g(arr: *mut Array) -> *mut i32 { ... }
    fn g_mut(arr: *mut Array) -> *mut i32 { ... }
    

    The ownership_split_variants command automatically annotates the split functions so they will be combined into a variant group during further analysis. Variant groups can also be constructed manually using the #[ownership_variant_of(<name>)] annotation, where name is an arbitrary quoted string. All source-level functions bearing an ownership_variant_of annotation with the same name will form a single variant group, which will be treated as a single function throughout the analysis. However, signature inference for the variants themselves is not well supported. Thus, each variant must have an ownership_mono annotation, and exactly one function in each variant group must also have an ownership_constraints annotation. Together, these provide enough information that inference is not required. Note that unlike non-variant functions, variants may not have multiple ownership_mono annotations, as each variant is expected to correspond to a single monomorphization of the original function.
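
    The rules for manually constructed variant groups can be summarized in a small checker. This is a sketch under stated assumptions: FnAnnotations and check_variant_groups are hypothetical names, and the struct only records which annotations a function carries rather than parsing real attributes.

```rust
use std::collections::BTreeMap;

/// Hypothetical summary of the ownership annotations found on one function.
pub struct FnAnnotations {
    pub name: &'static str,
    /// Argument of `#[ownership_variant_of(...)]`, if present.
    pub variant_of: Option<&'static str>,
    /// Number of `#[ownership_mono(...)]` annotations.
    pub mono_count: usize,
    /// Whether `#[ownership_constraints(...)]` is present.
    pub has_constraints: bool,
}

/// Groups functions by variant-group name and checks the two rules from the
/// text: every variant has exactly one `ownership_mono`, and exactly one
/// member of each group has `ownership_constraints`.
pub fn check_variant_groups(
    funcs: &[FnAnnotations],
) -> Result<BTreeMap<&'static str, Vec<&'static str>>, String> {
    let mut groups: BTreeMap<&'static str, Vec<&FnAnnotations>> = BTreeMap::new();
    for f in funcs {
        if let Some(group) = f.variant_of {
            groups.entry(group).or_default().push(f);
        }
    }

    let mut result = BTreeMap::new();
    for (group, members) in &groups {
        for f in members {
            if f.mono_count != 1 {
                return Err(format!(
                    "variant `{}` has {} ownership_mono annotations, expected 1",
                    f.name, f.mono_count
                ));
            }
        }
        let with_constraints = members.iter().filter(|f| f.has_constraints).count();
        if with_constraints != 1 {
            return Err(format!(
                "group `{}` has {} ownership_constraints annotations, expected 1",
                group, with_constraints
            ));
        }
        result.insert(*group, members.iter().map(|f| f.name).collect());
    }
    Ok(result)
}
```

    Functions without an ownership_variant_of annotation are ignored by the checker, since the single-ownership_mono restriction applies only to variants.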

    The "Collection Hack"

    The analysis as described so far tries to mimic the Rust ownership model as implemented in the Rust compiler. However, collection data structures in Rust often use unsafe code to bypass parts of the ownership model. A particularly common case is in removal methods, such as Vec::pop:

    impl<T> Vec<T> {
        fn pop(&mut self) -> Option<T> { ... }
    }
    

    This method moves a T out of self's internal storage, but only takes self by mutable reference. Under the "normal" rules, this is impossible, and the analysis described above will infer a stricter signature for the raw pointer equivalent:

    fn pop(this: /* MOVE */ *mut Vec) -> /* MOVE */ *mut c_void { ... }
    

    The analysis as implemented includes a small adjustment (the "collection hack") to let it infer the correct signature for such methods.

    The collection hack is this: when handling a pointer assignment, instead of constraining the path permission of the RHS to be at least the permission of the LHS, we constrain it to be at least min(lhs_perm, WRITE). The result is that it becomes possible to move a MOVE pointer out of a struct when only WRITE permission is available for the pointer to that struct. Then the analysis will infer the correct type for pop:

    fn pop(this: /* WRITE */ *mut Vec) -> /* MOVE */ *mut c_void { ... }
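
    The effect of the adjusted constraint can be seen in a minimal model of the permission ordering. The sketch assumes the ordering READ < WRITE < MOVE and uses hypothetical names (Perm, required_rhs_perm_strict, required_rhs_perm_hack, assignment_allowed); it does not reproduce the real constraint solver.

```rust
/// Concrete permissions, ordered READ < WRITE < MOVE (assumed ordering;
/// the derived `Ord` follows declaration order).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Perm {
    Read,
    Write,
    Move,
}

/// Without the hack: the RHS path permission must be at least the LHS permission.
pub fn required_rhs_perm_strict(lhs: Perm) -> Perm {
    lhs
}

/// With the collection hack: the RHS path permission must be at least
/// min(lhs_perm, WRITE).
pub fn required_rhs_perm_hack(lhs: Perm) -> Perm {
    lhs.min(Perm::Write)
}

/// An assignment is accepted when the available path permission meets the bound.
pub fn assignment_allowed(rhs_path: Perm, required: Perm) -> bool {
    rhs_path >= required
}
```

    In this model, moving a MOVE pointer out through a WRITE path is rejected under the strict rule and accepted under the hack, while a READ path is still rejected in both cases.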
    

    This is the top-level directory for all cross-checking components, and contains the following:

    • A clang plugin that automatically inserts cross-check instrumentation into C code.

    • An equivalent rustc compiler plugin for Rust.

    • The libfakechecks cross-checking backend library that prints out all cross-checks to standard output. This library is supported by both the C and Rust compiler plugins.

    • Our experimental fork of the ReMon MVEE modified for C/Rust side-by-side checking, along with the mvee-configs directory that contains some MVEE configuration examples.

    Cross-checking Tutorial

    Introduction

    The C2Rust transpiler aims to convert C code to semantically equivalent unsafe Rust code, and later incremental refactoring passes gradually transform this code into safer Rust code. However, the initial Rust translation might not be a perfect semantic match to the original C code, and the refactoring passes may also change the code in ways that break semantics. Cross-checking is an automated way to verify that the translated program behaves the same as the original C code.

    The way cross-checking achieves this goal is by comparing the execution traces of all versions (henceforth called "variants") of the program (original C, unsafe and refactored Rust) and checking for any differences. Our cross-checking implementation modifies the source code of the program at compile-time (separately during C and Rust compiler invocation) so that the variants output the traces at run-time, and then checks the traces against each other either online during execution (using the ReMon MVEE), or offline by comparing log files. The C2Rust cross-checkers currently instrument function entry and exit points, function return values, and function call arguments (currently experimental and disabled by default, but can be enabled per argument, function or file).

    Example

    To illustrate how cross-checking works, let us take the following code snippet:

    int foo() {
        return 1;
    }
    

    Calling the foo function will cause the following cross-check events to be emitted:

    XCHECK(Ent):193491849/0x0b887389
    XCHECK(Exi):193491849/0x0b887389
    XCHECK(Ret):8680820740569200759/0x7878787878787877
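
    The entry and exit values in this output coincide with the 32-bit djb2 hash of the string "foo". The sketch below reproduces that value; the function names djb2 and entry_event are illustrative, the 32-bit wrapping arithmetic is an assumption inferred from the sample output, and the return-value check is not modeled.

```rust
/// Classic djb2 string hash (start at 5381, multiply by 33, add each byte),
/// computed with 32-bit wrapping arithmetic.
pub fn djb2(s: &str) -> u32 {
    s.bytes()
        .fold(5381u32, |h, b| h.wrapping_mul(33).wrapping_add(u32::from(b)))
}

/// Formats a function-entry event in the same shape as the sample output:
/// decimal value, a slash, then the zero-padded hexadecimal value.
pub fn entry_event(function_name: &str) -> String {
    let h = djb2(function_name);
    format!("XCHECK(Ent):{}/0x{:08x}", h, h)
}
```

    Calling entry_event("foo") yields XCHECK(Ent):193491849/0x0b887389, matching the first line of the output above.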
    

    Building code with cross-checks

    C2Rust contains one cross-checking implementation per language, in the form of a compiler plugin in both cases. We provide a clang plugin for C code, and a rustc plugin for Rust code.

    Building C code

    To build C variants with cross-checks enabled, first build the cross-checking plugin using $C2RUST/scripts/build_cross_checks.py, then run clang (or pass it to the build system) with the following options:

    • -Xclang -load -Xclang $C2RUST/build/clang-xcheck-plugin.$(uname -n)/plugin/CrossChecks.so to load the plugin
    • -Xclang -add-plugin -Xclang crosschecks to activate the plugin
    • -Xclang -plugin-arg-crosschecks -Xclang <...> for every additional option to pass to the plugin
    • -ffunction-sections may be required to correctly deduplicate some linkonce functions inserted by the plugin

    Note that every option passed to clang requires a -Xclang prefix before the actual option, so that the compiler driver passes it to the clang backend correctly. We provide a cc_wrapper.sh script in the plugin source code directory that inserts these automatically, as well as several project-specific scripts in directories under examples/.

    Additionally, the following arguments should be passed to the linker:

    • The cross-checking runtime library from $C2RUST/build/clang-xcheck-plugin.$(uname -n)/runtime/libruntime.a
    • A cross-checking backend library that provides the rb_xcheck function, e.g., libfakechecks for offline logging or libclevrbuf for online MVEE-based checks

    Building Rust code

    Building Rust code with cross-checks is simpler than building C code, and only requires a few additions to Cargo.toml and the main Rust source file. Add the following to your Cargo.toml file (replacing $C2RUST with the actual path to this repository):

    [dependencies.c2rust-xcheck-plugin]
    path = "$C2RUST/cross-checks/rust-checks/rustc-plugin"
    
    [dependencies.c2rust-xcheck-derive]
    path = "$C2RUST/cross-checks/rust-checks/derive-macros"
    
    [dependencies.c2rust-xcheck-runtime]
    path = "$C2RUST/cross-checks/rust-checks/runtime"
    features = ["libc-hash", "fixed-length-array-hash"]
    

    and this preamble to your lib.rs or main.rs:

    #![feature(plugin, custom_attribute)]
    #![cross_check(yes)]
    
    #[macro_use] extern crate c2rust_xcheck_derive;
    #[macro_use] extern crate c2rust_xcheck_runtime;
    

    You may also add #![plugin(c2rust_xcheck_plugin(...))] to pass additional arguments to the cross-checking plugin.

    Cross-check configuration

    Cross-checks can be customized at a fine granularity using cross-check configuration files or inline attributes.

    Running cross-checked programs

    Offline mode

    When cross-checking in offline mode, all variants are executed independently on the same inputs, and their cross-checks are written to either standard output or log files. After running all the variants, divergence can be detected by manually comparing the logs for mismatches. There are several backend libraries that support different types of logging outputs:

    • libfakechecks outputs a list of the cross-checks linearly to either standard output or a file (specified using the FAKECHECKS_OUTPUT_FILE environment variable)
    • zstd-logging library from cross-checks/rust-checks/backends (can also be used with the clang plugin) outputs a binary encoding of the cross-checks that is compressed using zstd, and is much more space-efficient than the text output of libfakechecks. The compressed output files can be converted to text using the xcheck-printer tool.

    Before running the C and Rust variants, you may need to load one of these libraries using LD_PRELOAD, unless you have already linked against it and passed in its path using -rpath (this is fairly easy to do for a C build, but more complicated when using Cargo for Rust code). For example:

    $ env LD_PRELOAD=$C2RUST/cross-checks/libfakechecks/libfakechecks.so ./a.out
    

    Running each variant with cross-checks enabled will print a list of cross-check results to the specified output. A simple diff or cmp command will show differences in cross-checks, if any.
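
    When the logs are long, it can help to locate only the first mismatch. The following is a minimal sketch of such a comparison for text logs like those written by libfakechecks; first_divergence is a hypothetical helper, not part of C2Rust.

```rust
/// Compares two text logs line by line and returns the first position where
/// they differ: the 1-based line number plus the line from each log
/// (`None` if that log ended early). Returns `None` when the logs match.
pub fn first_divergence<'a>(
    log_a: &'a str,
    log_b: &'a str,
) -> Option<(usize, Option<&'a str>, Option<&'a str>)> {
    let mut lines_a = log_a.lines();
    let mut lines_b = log_b.lines();
    let mut line_no = 1;
    loop {
        match (lines_a.next(), lines_b.next()) {
            (None, None) => return None, // both logs ended without a mismatch
            (a, b) if a == b => line_no += 1,
            (a, b) => return Some((line_no, a, b)),
        }
    }
}
```

    Everything after the first divergent cross-check is usually a consequence of it, so the first mismatch is the natural place to start investigating.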

    Online (MVEE) mode

    The other execution mode for cross-checks is the online mode, where a monitor program (the MVEE) runs all variants in parallel with exactly the same inputs (by intercepting input system calls like read and replicating their return values) and cross-checks all the output system calls and instrumentation points inserted by our plugins. This approach has several advantages over offline mode:

    • Input operations are fully replicated, including those from stateful resources like sockets; only the master variant performs each actual operation, and each other variant only gets a copy of the data.
    • Outputs are cross-checked but not duplicated, so each output operation is only executed by the master variant; the others are only cross-checked for matching outputs. For example, only the master variant opens and writes to output files.
    • The lock-step MVEE automatically eliminates most sources of non-determinism, like threading and non-deterministic syscalls, e.g., reading from /dev/urandom (see the Troubleshooting section below for more details)

    However, the main disadvantage of this approach is that some applications may not run correctly under the MVEE, due to either incomplete support from the MVEE or fundamental MVEE limitations. In such cases, we recommend using offline mode instead.

    To run your application inside our MVEE, first build it following the instructions in its README. After building it successfully, write an MVEE configuration file for your application (there is a sample file in the MVEE directory, and a few others in our examples directory), then run the MVEE:

    $ ./MVEE/bin/Release/MVEE -f <path/to/MVEE_config.ini> -N<number of variants> -- <variant arguments>
    

    The MVEE.ini configuration file is fairly self-explanatory, but a few settings are worth noting:

    • xchecks_initially_enabled controls whether system call replication and cross-checks are active from program start. Setting it to false disables them up to the first function cross-check (usually for the main function), which is the recommended setting for cross-language checks. This is because the Rust runtime performs a few additional system calls that C code does not, and the MVEE would terminate with divergence if cross-checks were enabled from the start.
    • relaxed_mman_checks and unsynced_brk disable MVEE cross-checks on the mmap family of calls and brk, respectively, and should both be set to true if the Rust code performs significantly different memory allocations.
    • path specifies the path to the variant's executable, and should be specified separately per variant (ReMon also supports running multiple variants for the same binary, but with different command line arguments; this is not used by C2Rust cross-checks). All variants should be files in the same directory, otherwise the MVEE will abort with a divergence inside the ELF loader.
    • argv specifies the arguments to pass to each variant (can be configured per-variant or globally for all variants).
    • env specifies the environment variables to pass to the variants, and should at least contain a LD_LIBRARY_PATH entry for the libclevrbuf.so library, and a LD_PRELOAD entry for the zeroing allocator libzero_malloc.so, like this:
    {
      "variant": {
        "global": {
          "exec": {
            "env": [
              "LD_LIBRARY_PATH=../../../cross-checks/ReMon/libclevrbuf",
              "LD_PRELOAD=../../../cross-checks/zero-malloc/target/release/libzero_malloc.so"
            ]
          }
        }
      }
    }
    

    Troubleshooting

    In case you run into any issues while building or running the variants, please refer to this section for possible fixes.

    Build failures

    Builds may occasionally fail because of partially or completely unsupported features in our plugins:

    • Bitfields in C structures: these do not currently have a Rust equivalent, and the transpiler converts them to regular integer types. The clang plugin will exit with an error when trying to cross-check a bitfield.
    • Variadic functions: Rust does not support these yet, and our clang plugin cannot handle them either.
    • Fixed-sized arrays of large or unusual sizes: as of the writing of this document, Rust does not have const generics yet, and we need them to support arbitrary-sized arrays on the Rust side. Until then, the rustc plugin runtime only supports cross-checking on fixed-sized arrays from a limited set of sizes (all integers up to 32, all powers of 2 up to 1024).
    • Function pointers with more than 12 arguments: these require variadic generics in Rust.

    In all these cases, we recommend that you either disable cross-checks for values of these types, or manually provide a custom cross-check function (see the cross-check configuration for more details).
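
    As a quick way to tell whether a given array length falls inside the supported set quoted above (all integers up to 32, all powers of 2 up to 1024), the rule can be written as a predicate. The name is_supported_array_len is hypothetical, and treating length 0 as supported is an assumption.

```rust
/// Returns true if `len` is in the set described in the text: any length up
/// to 32, or a power of two up to 1024. Length 0 is assumed to be supported.
pub fn is_supported_array_len(len: usize) -> bool {
    len <= 32 || (len.is_power_of_two() && len <= 1024)
}
```

    For instance, [u8; 24] and [u8; 512] would be covered, while [u8; 100] and [u8; 2048] would need cross-checks disabled or a custom cross-check function.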

    Inline functions in C headers

    Cross-checker output for C variants may sometimes contain additional cross-checks for inline functions from system headers which are missing from the corresponding Rust translation. In such cases, we recommend manually disabling cross-checks for the C inline functions using an external cross-check configuration file.

    Non-determinism

    Ideally, divergence (cross-check differences between variants) is only caused by actual semantic differences between the variants. However, there is another cause of unintended divergence: non-determinism in program execution. In most cases, non-determinism will simply cause each variant to produce different cross-checks, but it may also occasionally cause crashes. There are many causes of program non-determinism that interfere with cross-checking:

    • Calls to time, RNG, PID (getpid() and friends) and other non-deterministic OS functions and operations, e.g., reads from /dev/urandom
    • File descriptors. We have hard-coded a fixed cross-check value for all FILE* objects for this exact reason.
    • Threading, i.e., non-deterministic thread scheduling
    • Pointer-to-integer casts under ASLR.

    Generally, running the variants in online mode inside the MVEE fixes these issues by replicating all system calls between the variants, which ensures that they all receive the same values from the OS. In case the MVEE does not support a specific application and you need to run it in offline mode (or for any other reason), the recommended fix is to remove all such non-determinism from your code manually, e.g., replace all reads from /dev/urandom with constant values.

    To verify that non-determinism truly is the cause of divergence, we recommend running each separate variant multiple times and cross-checking it against itself. If non-determinism really is the problem, each run will produce different cross-checks.

    A note on ASLR and pointers: our cross-checkers currently check pointers by dereference instead of by address, thereby making the checks insensitive to ASLR. However, manually casting pointers to integers poses problems because integers cannot be dereferenced. Integers are cross-checked by value regardless of their source, and their values will differ across runs when they originate from pointers with ASLR enabled.

    Uninitialized memory

    Uninitialized memory is one common source of non-determinism, since an uninitialized value may have different actual values across different runs of a program. Since our plugins cross-check pointers by dereferencing them, invalid pointers can also crash our cross-checking runtime.

    To eliminate this problem, we force zero-initialization for all C and Rust values. The plugins enforce this for stack values (function-local variables), and all global values are already zero-initialized (as required by the C standard), which only leaves heap-allocated values, i.e., those allocated using malloc. We provide a zeroing malloc wrapper under cross-checks/zero-malloc which can be preloaded into an application using LD_PRELOAD. This wrapper library intercepts all memory allocation calls, and zeroes the allocated buffer after each call. To use this in Rust executables in place of the default jemalloc, add the following lines to your code to use the system allocator, which our library intercepts:

    #[global_allocator]
    static A: ::std::alloc::System = ::std::alloc::System;
    

    We recommend that you use our zeroing memory allocator for all cross-checks.
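
    For Rust-only experiments where preloading is inconvenient, a similar effect can be approximated inside the program itself. The sketch below is an alternative technique, not the zero-malloc library: ZeroingAlloc is a hypothetical wrapper around the standard System allocator that returns zeroed memory for every allocation. It does not affect allocations made by C code linked into the same process.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Hypothetical allocator that forwards to the system allocator but always
/// hands out zero-initialized memory.
pub struct ZeroingAlloc;

unsafe impl GlobalAlloc for ZeroingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // `alloc_zeroed` returns memory whose bytes are all zero.
        unsafe { System.alloc_zeroed(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Route every Rust heap allocation in this program through the wrapper.
#[global_allocator]
static GLOBAL: ZeroingAlloc = ZeroingAlloc;
```

    Because the default realloc implementation of GlobalAlloc allocates through alloc and copies the old contents, the grown tail of a reallocated buffer is zeroed as well.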

    Pointer aliasing

    Some C code will use pointers of some type T to refer to values of another type U, tricking our runtime into cross-checking the values incorrectly. This may not only cause divergence, but also potential crashes when our runtime attempts to cross-check actual integers as invalid pointers (see example below). If the value of the integer incidentally represents a valid memory address, the runtime will try to cross-check that memory as a T; otherwise, the runtime will most likely crash.

    struct T {
      int n;
    };
    struct U {
      char *s;
    };
    void foo(struct U *x) {
      // Cross-check problem here: will try to cross-check `x` as a `struct U*`,
      // when it's a `struct T*`, so the check will most likely crash
      // when attempting to dereference x->s
    }
    int main() {
      struct T x = { 0x1234 };
      foo((struct U*)&x);
      return 0;
    }
    

    Our cross-checking runtimes can recover from attempts to dereference invalid pointers, but rely on the pointer-tracer tool that uses ptrace to check and restart all invalid pointer dereferences. To use this recovery feature, you must use pointer-tracer to start the variants:

    $ $C2RUST/cross-checks/pointer-tracer/target/release/pointer-tracer ./a.out
    

    Alternatively, some issues caused by pointer aliasing can be fixed by disabling cross-checks altogether for certain types and values, or by providing custom cross-check functions for certain types. For example, one common pattern is the tagged union, where multiple structures have an overlapping prefix with a tag, followed by a type-specific part:

    enum Tag {
      TAG_A,
      TAG_B
    };
    struct TypeA {
      // Common part
      enum Tag tag;
      char *foo;
      // Specific part
      int valA;
    };
    struct TypeB {
      // Common part
      enum Tag tag;
      char *foo;
      // Specific part
      void *valB;
    };
    

    For this example, you can either disable cross-checks manually for all the type-specific fields, e.g., valA and valB above, or provide a custom cross-check function that cross-checks each value based on its tag.

    End-of-buffer pointers

    Another common C pattern related to memory allocations is the pointer to end of buffer pattern:

    #include <stdlib.h>

    struct Buf {
      char *start;
      char *end;
    };
    void alloc(struct Buf *buf, unsigned size) {
      buf->start = malloc(size);
      // One past the last allocated byte
      buf->end = buf->start + size;
    }
    

    In this example, cross-checks will diverge on the end pointer, since it points to the first byte after the allocation returned by malloc. Since we only require the allocation itself to be zero-initialized, the value of that byte is undefined, and could change at any time during the execution of the variant.

    For any pointers outside allocated memory, we recommend disabling cross-checks altogether.

    Benign dangling pointers

    The cross-checking runtimes may attempt to cross-check pointers to values that have been deallocated, e.g., by calling free, but still linger in memory without being used by the program. This means that the runtimes may dereference these pointers, even if the program never does, which may lead to divergence since the allocator is free to reuse that memory.

    #include <stdlib.h>

    struct Data {
      int allocated;
      char *buf;
    };
    struct Data *free_data(struct Data *data) {
      data->allocated = 0;
      free(data->buf);
      // Potential non-determinism: data->buf has been freed,
      // but our runtime will try to dereference it
      return data;
    }
    

    As of the writing of this document, we have no automatic way to detect when the runtimes attempt to dereference deallocated memory, so we recommend manually disabling cross-checks when this occurs.

    Compiler builtins and optimizations

    In some cases, clang and rustc may optimize the generated code differently in ways that produce divergence. For example, clang (more specifically, an LLVM optimization pass enabled by clang) converts single-argument printf calls to direct calls to puts, so printf("Hello world!\n"); becomes puts("Hello world!");. The two functions have different internal implementations and make different syscalls (mainly the write syscall), so this optimization causes divergence. We recommend compiling all C code with the -fno-builtin argument to prevent this.

    Cross-checking Configuration

    In many cases, we can add identical cross-checks to the original C and the transpiled Rust code, e.g., when the C code is translated naively to perfectly equivalent Rust code, and everything just works. However, this might not always be the case, and we need to handle mismatches such as:

    • Type mismatches between C and Rust, e.g., a C const char* (with or without an attached length parameter) being translated to a str. Additionally, if a string+length value pair (with the types const char* and size_t) gets translated to a single str, we may want to omit the cross-check on the length parameter.
    • Whole functions added or removed by the transpiler or refactoring tool, e.g., helpers.

    Note that this list is not exhaustive, so there may be many more cases of mismatches.

    To handle all these cases, we need a language that lets us add new cross-checks, or modify or delete existing ones.

    The cross-check language

    The cross-check metadata is stored as a YAML encoding of an array of configuration entries. Each configuration entry describes the cross-check settings for one specific item, such as a function or structure.

    An example configuration file for a function foo with 3 arguments a, alen and b looks something like:

    main.c:
      - item: defaults
        disable_xchecks: true
    
      - item: function
        name: foo
        disable_xchecks: false
        args:
          a: default
          alen: none
          b: default
        return: no
    
    main.rs:
      - item: function
        name: foo
        args:
          a: default
          b: default
        return: no
    

    Inline vs external configuration

    We can store the cross-check configuration entries in a few places:

    • Externally in separate configuration files.
    • Inline in the source code, attached to the checked functions and structures.

    Each approach has advantages and drawbacks. Inline configuration entries are simpler to maintain, but do not scale as well to larger codebases or more complex cross-check configuration entries. Conversely, external configuration entries are more flexible and can potentially express complex configurations in a cleaner and more elegant way, but can easily get out of sync with their corresponding source code. We currently support both approaches, with external configuration settings taking priority over inline attributes where both are present.

    In the current implementation of the Rust cross-checker, inline configuration settings are passed to the enclosing scope's #[cross_check] attribute, e.g.:

    
    #[cross_check(yes, entry(djb2="foo"))]
    fn bar() { }
    
    #[cross_check(yes, entry(fixed=0x1234))]
    fn baz() { }

    Configuration file format

    At the top level, each configuration file is a YAML associative array mapping file names to their configuration entries. Each array element maps a file name (represented as a string) to a list of individual items, each item representing a Rust/C scope entity, i.e., function or structure. Each item is encoded in YAML as an associative array. All items have a few common array members:

    • item specifies the type of the current item, e.g., function, struct or others.
    • name specifies the name of the item, i.e., the name of the function or structure.

    Function cross-check configuration

    Function cross-checks are configured using entries with item: function. Function entries support the following fields:

    Field | Role
    ----- | ----
    disable_xchecks | Disables all cross-checks for this function and everything in it if set to true.
    entry | Configures the function entry cross-check (see below for information on accepted values).
    exit | Configures the function exit cross-check.
    all_args | Specifies a cross-check override for all of this function's arguments. For example, setting all_args: none disables cross-checks for all arguments.
    args | An associative array that maps argument names to their corresponding cross-checks. This can be used to customize the cross-checks for some of the function arguments individually. This setting overrides both the global default and the one specified in all_args for the current function.
    return | Configures the function return value cross-check.
    ahasher and shasher | Override the default values for the aggregate and simple hasher for this function (see the hashing documentation for the meaning of these fields).
    nested | Recursively configures the items nested inside the current item. Since Rust allows arbitrarily deep function and structure nesting, we use this to recursively configure nested functions.
    entry_extra | Specifies a list of additional custom cross-checks to perform after the argument cross-checks. Each cross-check accepts an optional tag parameter that overrides the default UNKNOWN tag.
    exit_extra | Specifies a list of additional custom cross-checks to perform on function return.
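
    The precedence between the global default, all_args and args can be illustrated with a small lookup. This is a sketch with hypothetical types (XCheck, FunctionConfig, effective_arg_check); the real plugins read these settings from YAML, which is not modeled here.

```rust
use std::collections::HashMap;

/// A reduced set of cross-check kinds, enough to show precedence.
#[derive(Debug, Clone, PartialEq)]
pub enum XCheck {
    Default,
    None,
    Fixed(u64),
}

/// Hypothetical in-memory form of the argument settings of one function entry.
pub struct FunctionConfig {
    /// Value of `all_args`, if the entry sets it.
    pub all_args: Option<XCheck>,
    /// Value of `args`: per-argument overrides keyed by argument name.
    pub args: HashMap<String, XCheck>,
}

/// Picks the cross-check for one argument: a per-argument entry in `args`
/// wins over `all_args`, which in turn wins over the global default.
pub fn effective_arg_check(
    global_default: &XCheck,
    cfg: &FunctionConfig,
    arg: &str,
) -> XCheck {
    cfg.args
        .get(arg)
        .or(cfg.all_args.as_ref())
        .unwrap_or(global_default)
        .clone()
}
```

    With all_args: none and an args entry only for a, argument a keeps its own setting while every other argument has its cross-check disabled.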

    Structure cross-check configuration

    Structure entries configure cross-checks for Rust structure, tuple and enumeration types, and are tagged with item: struct. For a general overview of cross-checking for structures (aggregate types), see the hashing documentation. Structure entries support the following fields:

    Field | Role
    ----- | ----
    disable_xchecks | Disables automatic cross-check emission for this structure (this is generally best left out, unless the default is true and needs to be reset to false).
    field_hasher | Configures the replacement hasher for this structure. The hasher is a Rust object that implements the cross_check_runtime::hash::CrossCheckHasher trait.
    custom_hash | Specifies a function to call to hash objects of this type, instead of the default implementation. This function should have the signature fn foo<XCHA, XCHS>(arg: &T, depth: usize) -> u64 where T is the name of the current type. XCHA and XCHS are type parameters passed by the caller that specify the aggregate and simple hasher to use for this computation (and can be overridden using ahasher and shasher below).
    fields | An associative array that specifies custom hash computations for some or all of the structure's fields. Accepts values in the format of cross-check types.
    ahasher and shasher | Override the aggregate and simple hasher for the default hash implementation for the current type (mainly useful if field_hasher is left out). These are recursively passed to the hash function call for each structure field.

    The field_hasher and custom_hash fields provide two alternative methods of customizing the hashing algorithm for a given structure: users may either provide a custom implementation of CrossCheckHasher and pass that to field_hasher, or implement a hashing function and pass it to custom_hash. The two alternatives are mostly equivalent, and users may use whichever is more convenient. Additionally, users can choose to completely disable the automatic derivation of CrossCheckHash, and manually implement CrossCheckHash for some of the types instead.

    Cross-check types

    There are several types of cross-check implemented in the compiler:

    Check | Value Type | Behavior
    ----- | ---------- | --------
    default | - | Lets the compiler perform the default cross-check.
    none or disabled | - | Disables cross-checking or hashing for the current value.
    fixed | u64 | Sets the cross-checked value to the given 64-bit integer.
    djb2 | String | Sets the cross-checked value to the djb2 hash of the given string. This is mainly useful for overriding function entry cross-checks, in case the function names don't match between languages.
    as_type | String | Performs the default value cross-check, but after casting the value to the given type, e.g., cast it to a u32 then cross-check it as a u32.
    custom | String | Parses the given string as a C or Rust expression and uses it to compute the cross-checked value. In most cases, the string is inserted verbatim into the cross-check code, e.g., for function argument cross-checks.

    Each cross-check is encoded in YAML as either a single word with the type, e.g., default, or a single-element associative array mapping the type to its argument, e.g., { fixed: 0x1234 }.

    More cross-check types may be added as needed.
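As a rough illustration of the djb2 cross-check type, the sketch below computes the classic djb2 string hash (start at 5381, then multiply by 33 and add each byte). The function name and the 64-bit wrapping arithmetic are assumptions made for this example; the exact variant used by the C2Rust runtime is not stated here.

```rust
/// Classic djb2 string hash: start at 5381, then hash = hash * 33 + byte.
/// Wrapping arithmetic keeps the result within 64 bits (an assumption of
/// this sketch, not a statement about the C2Rust runtime).
pub fn djb2(s: &str) -> u64 {
    s.bytes()
        .fold(5381u64, |h, b| h.wrapping_mul(33).wrapping_add(u64::from(b)))
}
```

With a function like this, an entry such as { djb2: "baz" } would make a function named baz1 produce the same entry value as a function named baz on the other side.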

    Custom hash functions for structures

    If custom_hash: { custom: "hash_foo" } is a configuration entry for structure Foo, then the compiler will insert a call to hash_foo to perform the cross checks. This function should have the following signature:

    
    fn hash_foo<XCHA, XCHS>(foo: &Foo, depth: usize) -> u64 { ... }

    The hash function receives a reference to a Foo object and a maximum depth, and should return the 64-bit hash value for the given object.
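A minimal sketch of such a function is shown here. The Foo structure and its fields are hypothetical and exist only for illustration; the type parameters XCHA and XCHS are accepted to match the signature but are not used.

```rust
/// Hypothetical structure used only for illustration.
pub struct Foo {
    pub id: u32,
    pub weight: u16,
}

/// Custom hash for `Foo`, following the signature described above.
/// `XCHA` and `XCHS` are unused in this sketch.
pub fn hash_foo<XCHA, XCHS>(foo: &Foo, _depth: usize) -> u64 {
    // Pack both fields into one 64-bit value. `Foo` contains no nested
    // values, so the depth argument is not needed here.
    (u64::from(foo.id) << 16) | u64::from(foo.weight)
}
```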

    Custom hash functions for structure fields

    If bar: { custom: "hash_bar" } is a configuration entry for field bar, then the compiler will insert a call to hash_bar to compute the hash for bar. This function should have the following signature:

    
    fn hash_bar<XCHA, XCHS, S, F>(h: &mut XCHA, foo: &S, bar: &F, depth: usize)
           where XCHA: cross_check_runtime::hash::CrossCheckHasher { ... }

    The function receives the following arguments:

    • The current aggregate hasher for this structure. The function can call the hasher's write_u64 function as many times as needed.
    • The structure containing this field. This argument has generic type S, so the same function can be reused for different structures.
    • The field itself, with generic type F. The function may require additional type bounds for F to make it compatible with its callers.
    • The maximum hashing depth (explained in the hashing documentation).
    • The type parameters XCHA and XCHS bound to the current aggregate and simple value hasher for the current invocation.

    This function should not return the hash value of the field. Instead, the function should call the hasher's write_u64 method directly.
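The following sketch shows one possible shape for a field hash function. To keep it self-contained, it declares a local stand-in for the CrossCheckHasher trait that models only the write_u64 method mentioned above; RecordingHasher and the choice to hash a string field by its length are hypothetical.

```rust
/// Stand-in for `cross_check_runtime::hash::CrossCheckHasher`; only the
/// `write_u64` method mentioned above is modeled (assumption).
pub trait CrossCheckHasher {
    fn write_u64(&mut self, value: u64);
}

/// Toy hasher that records every value it is fed (hypothetical).
#[derive(Default)]
pub struct RecordingHasher {
    pub written: Vec<u64>,
}

impl CrossCheckHasher for RecordingHasher {
    fn write_u64(&mut self, value: u64) {
        self.written.push(value);
    }
}

/// Field hash function: feeds the field's length into the aggregate hasher
/// instead of returning a hash value. The extra bound on `F` is the kind of
/// additional type bound described in the argument list above.
pub fn hash_bar<XCHA, XCHS, S, F>(h: &mut XCHA, _foo: &S, bar: &F, _depth: usize)
where
    XCHA: CrossCheckHasher,
    F: AsRef<str>,
{
    h.write_u64(bar.as_ref().len() as u64);
}
```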

    Per-file default settings

    The special defaults item type specifies the default cross-check settings for all items in a file. We currently support the following entries:

    Field Role
    disable_xchecks Disables all cross-checks for this file. Can be individually overridden per function or structure.
    entry Configures the default entry cross-check for all functions in this file.
    exit Similarly configures the function exit cross-check.
    all_args Specifies a cross-check override for all arguments to all functions in this file. For example, setting all_args: default enables cross-checks for all arguments.
    return Configures the function return value cross-check.

    More examples

    Function example

    Example configuration for a function baz1(a, b):

    main.rs:
      - item: function
        name: baz1
        entry: { djb2: "baz" }    # Cross-check the function as "baz"
        args:
          a: { custom: "foo(a)" } # Cross-check a as foo(a)
          b: none                 # Do not cross-check b
        entry_extra:              # Cross-check foo(b) with a FUNCTION_ARG tag
          - { custom: "foo(b)", tag: FUNCTION_ARG }
          - { custom: "a" }       # Cross-check the value "a" with UNKNOWN_TAG
    

    Structure example

    Example configuration for a structure Foo (illustrated on an object foo of type Foo):

    main.rs:
      - item: struct
        name: Foo
        field_hasher: "FooHasher"  # Use FooHasher as the aggregate hasher
        fields:
          a: { fixed: 0x12345678 } # Use 0x12345678 as the hash of foo.a
          b: { custom: "hash_b" }  # Hash foo.b using hash_b(foo.b)
          c: none                  # Ignore foo.c when hashing foo
    

    Inline cross-check configuration

    In addition to the external configuration format, a subset of cross-checks can also be configured inline in the program source code. The compiler plugin provides a custom #[cross_check] attribute used to annotate functions, structures and fields with custom cross-check metadata.

    Inline function configuration

    The #[cross_check] function attribute currently supports the following arguments:

    Argument Type Role
    none or disabled Disable cross-checks for this function and all its sub-items (this attribute is inherited). Each sub-item can individually override this with yes or enabled.
    yes or enabled Enable cross-checks for this function and its sub-items. Each nested item can also override this setting with none or disabled.
    entry XCheckType Cross-check to use on function entry, same as for external configuration.
    exit XCheckType Cross-check to use on function exit, same as for external configuration.
    all_args XCheckType Enable cross-checks for this function's arguments (disabled by default). Takes the cross-check type as its argument.
    args(...) Per-argument cross-check overrides (same as for external configuration).
    return XCheckType Cross-check to perform on the function return value, same as for external configuration.
    ahasher and shasher String Same as for external configuration.
    entry_extra and exit_extra Same as for external configuration.

    Function example

    
    #[cross_check(yes, entry(djb2="foo"))] // Cross-check this function as "foo"
    fn foo1() {
      #[cross_check(none)]
      fn bar() { ... }
      bar();

      #[cross_check(yes, all_args(default), args(a(fixed=0x123)))]
      fn baz(a: u8, b: u16, c: u32) { ... }
      baz(1, 2, 3);
    }

    Inline structure configuration

    The compiler plugin also supports a subset of the full external configuration settings as #[cross_check] arguments:

    Argument Type Role
    field_hasher String Same as for external configuration.
    custom_hash String Same as for external configuration.
    ahasher and shasher String Same as for external configuration.

    The #[cross_check] attribute can also be attached to structure fields to configure hashing:

    Argument Type Role
    none or disabled This field is skipped during hashing.
    fixed u64 Fixed 64-bit integer to use as the hash value for this field. Identical to the fixed external cross-check type.
    custom_hash String Same as for external configuration.

    Structure example

    
    #[cross_check(field_hasher="MyHasher")]
    struct Foo {
      #[cross_check(none)]
      foo: u64,

      #[cross_check(fixed=0x1234)]
      bar: String,

      #[cross_check(custom_hash="hash_baz")]
      baz: String,
    }

    Caveats

    Duplicate items

    At any level or scope, there may be duplicate items, i.e., multiple items with the same names. It is not clear at this point how to best handle this case, since we have several conflicting requirements. On the one hand, we may wish to allow the configuration for one source file to be spread across multiple configuration files, and entries from later configuration files to be appended or replace entries from earlier files. On the other hand, we may have identically-named structures or functions in nested scopes that we want to configure separately. For an example, consider the following code:

    fn foo(x: u32) -> u32 {
        if x > 22 {
            fn bar(x: u32) -> u32 {
                x - 22
            }
            bar(x)
        } else {
            fn bar(x: u32) -> u32 {
                x + 34
            }
            bar(x)
        }
    }
    

    In this example, there are two distinct foo::bar functions, and we wish to configure them separately. However, at the top level of a file, there may only be one foo function, so we can merge all entries for foo together. Alternatively, we could check for multiple top-level items with the same name and exit with an error if we encounter any duplicates.

    Configuration priority

    Currently, if a certain cross-check is configured using both an external entry and an inline #[cross_check(...)] attribute, the external entry takes priority. Alternatively, we may reverse this priority, or exit with an error if both are present.

    Scope configuration inheritance

    The configuration settings described above apply to the scope of an item. Most settings apply exclusively to the scope itself and not to any of its nested sub-items; for example, args and all_args settings only apply to the current function (e.g., foo above, but neither of the bar functions). A few settings, however, apply to everything inside the scope. These attributes are internally "inherited" from each scope by its child scopes. Currently, the only inherited attributes are disable_xchecks (so that disabling cross-checks for a module or function disables them for everything inside that module or function), ahasher and shasher.
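The inheritance rule can be sketched as follows. ScopeConfig and inherit_from are hypothetical names invented for this example and do not reflect the plugin's internal representation; the sketch only shows that disable_xchecks, ahasher and shasher fall back to the parent scope while a scope-local setting such as all_args does not.

```rust
/// Hypothetical per-scope settings; `None` means "not set in this scope".
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ScopeConfig {
    // Inherited settings.
    pub disable_xchecks: Option<bool>,
    pub ahasher: Option<String>,
    pub shasher: Option<String>,
    // Scope-local setting: never inherited by child scopes.
    pub all_args: Option<String>,
}

impl ScopeConfig {
    /// Compute the effective configuration of a child scope (`self`)
    /// nested inside `parent`.
    pub fn inherit_from(&self, parent: &ScopeConfig) -> ScopeConfig {
        ScopeConfig {
            // A value set on the child wins; otherwise use the parent's.
            disable_xchecks: self.disable_xchecks.or(parent.disable_xchecks),
            ahasher: self.ahasher.clone().or_else(|| parent.ahasher.clone()),
            shasher: self.shasher.clone().or_else(|| parent.shasher.clone()),
            // Not inherited: only the child's own value counts.
            all_args: self.all_args.clone(),
        }
    }
}
```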

    Custom cross-check parameters

    Custom cross-check definitions have a different format for each language. The rustc plugin accepts any Rust expression that is valid on function entry as a custom cross-check.

    The clang plugin, on the other hand, only accepts a limited subset of C expressions: each cross-check specification contains the name of the function to call, optionally followed by a list of parameters to pass to the function, e.g., function or function(arg1, arg2, ...). Each parameter is the name of a global variable or function argument, and is optionally preceded by & (to pass the parameter by address instead of value) or by * (to dereference the value if it is a pointer).
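To make the accepted subset concrete, here is a small parser sketch for specifications of the form function or function(arg1, &arg2, *arg3). It is written in Rust purely for illustration; the clang plugin itself is not implemented this way, and all type and function names below are hypothetical.

```rust
/// How a parameter is passed to the custom cross-check function.
#[derive(Debug, PartialEq)]
pub enum ParamMode {
    ByValue,
    ByAddress, // written as `&name`
    Deref,     // written as `*name`
}

#[derive(Debug, PartialEq)]
pub struct Param {
    pub mode: ParamMode,
    pub name: String,
}

#[derive(Debug, PartialEq)]
pub struct CustomCheck {
    pub function: String,
    pub params: Vec<Param>,
}

/// Accepts C-style identifiers: a letter or `_`, then letters, digits or `_`.
fn is_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Parse `function` or `function(arg1, &arg2, *arg3)`.
/// Returns `None` for anything outside this subset.
pub fn parse_custom_check(spec: &str) -> Option<CustomCheck> {
    let spec = spec.trim();
    let (function, args) = match spec.find('(') {
        None => (spec, ""),
        Some(open) => {
            // The parameter list must end with the closing parenthesis.
            let rest = spec[open + 1..].strip_suffix(')')?;
            (spec[..open].trim_end(), rest)
        }
    };
    if !is_ident(function) {
        return None;
    }
    let mut params = Vec::new();
    if !args.trim().is_empty() {
        for raw in args.split(',') {
            let raw = raw.trim();
            let (mode, name) = if let Some(n) = raw.strip_prefix('&') {
                (ParamMode::ByAddress, n.trim_start())
            } else if let Some(n) = raw.strip_prefix('*') {
                (ParamMode::Deref, n.trim_start())
            } else {
                (ParamMode::ByValue, raw)
            };
            if !is_ident(name) {
                return None;
            }
            params.push(Param { mode, name: name.to_string() });
        }
    }
    Some(CustomCheck { function: function.to_string(), params })
}
```

A general expression such as f(a + 1) is rejected by this sketch, which mirrors the restriction that each parameter is just a name with an optional & or * prefix.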

    Anonymous structures

    C allows developers to define anonymous structures that define the type for a single value, e.g.:

    struct {
      int x;
    } y;
    

    For a variety of reasons, we need to assign names to these structures ourselves. The most important reason is that we need to identify these structures in the external configuration files. We assign the names using one of the following formats, depending on the context where the anonymous structure is defined:

    Assigned name Meaning
    Foo$field$x This structure defines the type for the field x of the outer structure Foo. Note that Foo itself may also be an anonymous structure that follows the same naming policy.
    foo$arg$x This structure defines the type for the argument x of function foo (as illustrated below).
    foo$result This structure defines the return type for function foo.

    Examples

    struct Foo {
      struct {                  // This gets named `Foo$field$x`
        int a;
      } x;
    };
    
    struct { int a; }           // This gets the `foo$result` name
    foo(struct { int b; } x) { // The `x` argument type gets the `foo$arg$x` name
    }
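The naming scheme can be summarized with a short sketch. AnonContext and anon_struct_name are hypothetical helpers written for this example; they only reproduce the three name formats from the table above and are not part of either plugin.

```rust
/// Where an anonymous structure is defined (hypothetical helper type).
pub enum AnonContext<'a> {
    /// Type of field `field` in the outer structure `outer`.
    Field { outer: &'a str, field: &'a str },
    /// Type of argument `arg` of function `function`.
    Arg { function: &'a str, arg: &'a str },
    /// Return type of function `function`.
    ReturnType { function: &'a str },
}

/// Build the assigned name for an anonymous structure.
pub fn anon_struct_name(ctx: &AnonContext<'_>) -> String {
    match ctx {
        AnonContext::Field { outer, field } => format!("{}$field${}", outer, field),
        AnonContext::Arg { function, arg } => format!("{}$arg${}", function, arg),
        AnonContext::ReturnType { function } => format!("{}$result", function),
    }
}
```

Because the outer structure may itself be anonymous, the result of one call can be fed back in as the outer name, which yields nested names such as Foo$field$x$field$y.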
    

    Cross-checking hashing algorithm

    For a given value x of a type T, our cross-checking implementation needs to hash x to a hash value H(x) of fixed size (64 bits in the current implementation), regardless of the size and layout of T. This document describes the design and implementation of the type-aware hashing algorithms used by the cross-checker.

    Using an established hash function over the raw bytes of x has a few disadvantages:

    • C/Rust structures contain padding bytes between consecutive fields (due to alignment requirements), and we must not include this padding in the hash.
    • Pointer addresses are non-deterministic due to ASLR and other factors, so we must hash them by dereference instead of address.

    For these reasons, we have chosen to design our own type-aware hashing algorithms. The algorithms hash each value differently depending on its type, and are implemented by functions with the following signature:

    uint64_t __c2rust_hash_T(T x, size_t depth);
    

    We use recursive hashing algorithms for complex types. To prevent infinite recursion and long hashing times, we limit the recursion depth to a fixed value. When recursion reaches this limit, the hash function returns a constant hash instead of going deeper.

    We distinguish between the following kinds of types:

    • Simple types, e.g., integers, booleans, characters, floats, are trivial types which can be hashed directly by value. In the current implementation, we hash these values by XORing them with a constant that depends on the type (see the C and Rust implementations for details). Since simple types cannot recurse, we perform no depth checks for this case.

    • Aggregate (or non-trivial) types:

      • Structures. We hash the contents of each structure by recursively hashing each field (with depth increased by one), then aggregating all the hashes into one. We currently use the JodyHash function for the latter.

      • Fixed-size arrays are hashed in fundamentally the same way as structures, by recursively hashing each array element then aggregating the resulting hashes.

      • Pointers. We avoid hashing pointers by address for the reasons listed above. Instead, we hash each pointer by recursively hashing its dereferenced value (with depth increased by one). We have two special cases here that we need to handle:

        • Null pointers, which our hash functions check and return a special hard-coded hash value for.
        • Non-null invalid pointers. Our cross-checking implementation will crash when dereferencing these pointers. However, running the crashing program using either the pointer-tracer tool or the MVEE will fix the crashes and safely hash these pointers by returning another special hard-coded value.

    Other data types, e.g., unions and structures containing bitfields, are difficult to hash programmatically and require the user to specify a manual hash function.

    The cross-checking configuration settings can be used to specify different hashing algorithms for simple and aggregate types.
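The overall scheme can be sketched in a few lines of Rust. This is an illustration under several assumptions, not the actual implementation: the XHash trait, all constants, and the depth limit are hypothetical; Option<Box<T>> stands in for a nullable pointer; an FNV-1a-style mixing step stands in for JodyHash; and non-null invalid pointers are not modeled.

```rust
pub const MAX_DEPTH: usize = 8; // hypothetical recursion limit
pub const LEAF_HASH: u64 = 0x4c45_4146; // returned once the limit is reached
pub const NULL_HASH: u64 = 0x4e55_4c4c; // returned for null pointers
const U32_TAG: u64 = 0x7533_325f_7461_6721; // per-type XOR constant

/// Type-aware hash, mirroring `__c2rust_hash_T(T x, size_t depth)`.
pub trait XHash {
    fn xhash(&self, depth: usize) -> u64;
}

// Simple type: hashed directly by value, no depth check needed.
impl XHash for u32 {
    fn xhash(&self, _depth: usize) -> u64 {
        u64::from(*self) ^ U32_TAG
    }
}

// Pointer: hashed by dereference, never by address.
impl<T: XHash> XHash for Option<Box<T>> {
    fn xhash(&self, depth: usize) -> u64 {
        match self {
            None => NULL_HASH,
            Some(_) if depth >= MAX_DEPTH => LEAF_HASH,
            Some(inner) => inner.xhash(depth + 1),
        }
    }
}

/// Aggregation step: FNV-1a-style mixing, used here in place of JodyHash.
fn mix(acc: u64, h: u64) -> u64 {
    (acc ^ h).wrapping_mul(0x0000_0100_0000_01b3)
}

/// Example aggregate type: a singly linked list node.
pub struct Node {
    pub value: u32,
    pub next: Option<Box<Node>>,
}

// Structure: hash each field one level deeper, then aggregate.
impl XHash for Node {
    fn xhash(&self, depth: usize) -> u64 {
        if depth >= MAX_DEPTH {
            return LEAF_HASH;
        }
        let acc = 0xcbf2_9ce4_8422_2325; // FNV offset basis
        let acc = mix(acc, self.value.xhash(depth + 1));
        mix(acc, self.next.xhash(depth + 1))
    }
}
```

Two lists with equal contents hash to the same value regardless of where their nodes live in memory, which is the property the cross-checker needs when comparing a C run against a Rust run.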

    Rustc cross-checker compiler plugin

    This is a simple cross-check inserter for Rust code that is implemented as a Rust compiler plugin.

    Usage

    To use the compiler plugin, you need to take several steps. First, add the plugin as a Cargo dependency to your Cargo.toml file:

    [dependencies]
    c2rust-xcheck-plugin = { path = ".../C2Rust/cross-checks/rust-checks/rustc-plugin" }
    c2rust-xcheck-derive = { path = ".../C2Rust/cross-checks/rust-checks/derive-macros" }
    c2rust-xcheck-runtime = { path = ".../C2Rust/cross-checks/rust-checks/runtime" }
    

    with ... as the full path to the C2Rust repository. Next, add the following preamble to your main.rs or lib.rs file:

    
    #![feature(plugin)]
    #![plugin(c2rust_xcheck_plugin)]

    #[macro_use]
    extern crate c2rust_xcheck_derive;
    #[macro_use]
    extern crate c2rust_xcheck_runtime;

    Cross-checker options

    Cross-checking is enabled and configured using the #[cross_check] directive, which can either be enabled globally (using #![cross_check] at the beginning of main.rs or lib.rs) or individually per function (the per-function settings override the global ones).

    The directive optionally takes the following options:

    • yes and enabled enable cross-checking for the current scope (crate or function).
    • none and disabled disable cross-checking for the current scope.
    • entry(djb2="foo") sets the cross-checking name for the current function entry point to the DJB2 hash of foo.
    • entry(fixed=NNN) sets the cross-checking ID for the current function entry point to NNN.

    Example:

    
    #[cross_check(yes, entry(djb2="foo"))]
    fn bar() { }

    #[cross_check(yes, entry(fixed=0x1234))]
    fn baz() { }

    #[cross_check(none)]
    fn foo() { }

    Clang plugin for crosschecking on C programs

    This is a cross-check inserter for C programs implemented as a clang compiler plugin.

    Building and running the plugin

    1. Build libfakechecks (optional, useful for testing):

       $ cd ../../libfakechecks
       $ make all
      
    2. Build the clang plugin using the build script:

       $ ../../../scripts/build_cross_checks.py
      
    3. To compile code using the plugin, either wrap the compilation command with the cc_wrapper.sh script from this directory:

      $ cc_wrapper.sh <path/to/clang> .../CrossChecks.so <rest of command line...>
    

    or add the following arguments manually to the clang command line, e.g., using CFLAGS:

    -Xclang -load -Xclang .../CrossChecks.so -Xclang -add-plugin -Xclang crosschecks
    

    and link against libruntime.a. In both cases, the target binary must then be linked against one of the rb_xcheck implementation libraries: libfakechecks.so or libclevrbuf.so.

    Testing

    This plugin can be tested in this directory by running make test.

    Example translations

    The following example translations illustrate how to run C2Rust on real codebases. Each example has been modified if necessary to prepare it for translation with C2Rust and each has accompanying documentation on how to translate the example.

    The robotfindskitten example is accompanied by a demonstration of the refactoring tool rewriting the unsafe translated Rust into idiomatic, safe Rust.

    json-c library

    Translating json-c

    # in examples/json-c/repo:
    ../configure    # use the custom c2rust configure script
    intercept-build make
    make check
    python3 ../translate.py
    ninja -C rust
    

    This will produce rust/libjson-c.so.4.0.0.

    Running tests

    # in examples/json-c/repo:
    
    # Replace the C libjson-c.so with a symlink to the Rust one.
    # You only need to do this the first time.
    rm .libs/libjson-c.so.4.0.0
    ln -s ../rust/libjson-c.so.4.0.0 .libs/libjson-c.so.4.0.0
    
    # Run tests
    make check
    

    If you modify the C files, make check will try to rebuild some targets and will then fail because translate.py deleted the object files. If this happens, run make clean && make, then repeat the "running tests" steps from the top.

    url parser

    Getting Started

    If the repo submodule appears to be empty or out of date, you may need to run git submodule update --init path/to/repo.

    Transpiling

    $ intercept-build make
    $ c2rust transpile compile_commands.json
    $ rustc test.rs
    

    qsort

    This tiny project provides an example of how to use CMake to build a C project and to generate the clang "compile_commands.json" file which is used by tools like the c2rust-ast-exporter.

    Build with the following commands:

    $ mkdir ../build
    $ cd ../build
    $ cmake ../qsort -DCMAKE_EXPORT_COMPILE_COMMANDS=1
    $ cmake --build .
    $ c2rust transpile compile_commands.json
    

    tmux

    Checking out the tmux sources

    Only Linux is supported at the moment, but OS X might work with some tweaks.

    In path/to/examples/tmux, initialize the git submodule:

    git submodule update --init repo

    Create a Makefile

    in tmux/repo:

    ./autogen.sh && ./configure

    Create a compile_commands.json

    in tmux/repo:

    intercept-build make check

    If your compile_commands.json enables optimizations (-O2, -O3, etc.), you will need to remove them so that unsupported compiler_builtins are less likely to be generated and leave you in an uncompilable state.

    Run rm *.o compat/*.o here to get rid of the gcc-generated object files, or else you may see CRITICAL:root:error: some ELF objects were not compiled with clang: in the next step.

    Generate Rust Code

    in tmux:

    Run ./translate.py to translate all required C files into the tmux/repo/rust/src and tmux/repo/rust/src/compat directories.

    Run Tmux

    Run cargo run to build and execute tmux.

    grabc

    Getting Started

    If the repo submodule appears to be empty or out of date, you may need to run git submodule update --init path/to/repo.

    Transpiling

    The steps to get the transpiled code are as follows:

    $ intercept-build make
    $ c2rust transpile compile_commands.json
    $ rustc grabc.rs -L/usr/X11R6/lib -lX11
    

    If you want to have the transpiler create a crate:

    $ intercept-build make
    $ c2rust transpile compile_commands.json --emit-build-files -m grabc --output-dir rust
    $ cd rust
    $ RUSTFLAGS="-L/usr/X11R6/lib -lX11" cargo build
    

    libxml2

    Checking out the libxml2 sources

    In path/to/examples/libxml2, initialize the git submodule:

    git submodule update --init repo

    Create a Makefile

    in libxml2/repo:

    ./autogen.sh

    and optionally ./configure (autogen.sh currently runs this automatically, so you're not required to).

    Create a compile_commands.json

    in libxml2/repo:

    intercept-build make check

    If your compile_commands.json enables optimizations (-O2), you will need to remove them so that unsupported compiler_builtins are less likely to be generated and leave you in an uncompilable state.

    Run rm .libs/*.o here to get rid of the gcc-generated object files, or else you may see CRITICAL:root:error: some ELF objects were not compiled with clang: in the next step.

    Generate Rust Code

    in libxml2:

    Run ./translate.py to translate all required C files (including tests) into the libxml2/repo/rust/src and libxml2/repo/rust/examples directories.

    Fix Known Translation Issues

    in libxml2:

    Run ./patch_translated_code.py to apply patches for some known issues in the generated code.

    Run Libxml2 C Tests

    Since each of these tests has its own main file, we decided to move them to the Rust examples directory instead of trying to wrap them in the test framework.

    You can run a test like so: cargo run --example EXAMPLE where EXAMPLE is one of the files in libxml2/repo/rust/examples, not including the file extension.

    Outstanding Test Issues

    Runnable

    • testReader seems to be mostly working identically but with some slight differences. Try testReader --valid test/japancrlf.xml. It produces an extra "Ns verBoom: Validation failed: no DTD found !, (null), (null)"

    Working

    • runtest seems to be consistently successful now
    • testRelax seems to work equivalently with files as in C
    • testXPath seems to work equivalently with files as in C
    • xmllint seems to work equivalently with files as in C
    • testSAX prints out nothing on success, just like C version
    • testModule prints "Success!"
    • testHTML works with input files from test/HTML and produces same output as C version
    • testRegexp works with files from test/regexp and produces same output as C version
    • testrecurse prints "Total 9 tests, no errors"
    • testlimits prints "Total 514 tests, no errors"
      • Note: text output seems noticeably slower than the C version
    • testThreads prints nothing (but no longer prints parsing errors)
    • testapi runs successfully and prints "Total: 1172 functions, 280928 tests, 0 errors"
    • testC14N prints parsed output when given a file to read from test/c14n
    • testSchemas no longer crashes when provided a file from test/schemas/*.xsd
    • testchar prints tests completed
    • testdict prints "dictionary tests succeeded 20000 strings"
    • testAutomata takes a file from test/automata and produces equivalent output to C run
    • testURI waits on input from stdin, needs example input from test/URI. See Makefile.am and result/URI/uri.data for examples

    Working cross-checks

    • testchar all cross-checks match
    • testdict all cross-checks match
    • testapi all cross-checks match (345 million)
    • runtest all cross-checks match
    • testlimits all cross-checks match, but requires -fno-builtin as a compiler argument
    • testSAX works
    • testHTML works
    • testRegexp works
    • testModule requires testdso.so, doesn't work yet
    • testAutomata works
    • testSchemas works on all files from test/schemas
    • testRelax works on all files from test/relaxng
    • testURI works
    • testC14N works
    • testXPath works on files under test/XPath/expr and test/xmlid
    • testThreads deadlocks, still investigating
    • xmllint does not compile

    Snudown

    To build snudown with the C2Rust translator and/or cross-checks, initialize the git submodule by running git submodule update --init path/to/repo.

    Make sure to build the derive-macros, runtime and rustc-plugin projects in the cross-checks folder beforehand. The runtime project must be built with the libc-hash feature (e.g. cargo build --features libc-hash).

    Next, cd into the repo directory and run python setup.py build with one of the following arguments:

    • --translate to translate the C code to Rust without any checks
    • --clang-crosschecks to build the C version of snudown with full cross-checking
    • --rust-crosschecks to translate to cross-checked Rust code
    • --use-fakechecks may be appended to use the fakechecks library to print out the cross-checks, instead of libclevrbuf from the MVEE
    • running with no flags will build the C version of the code
    • Note that -f may need to be appended to the end of the command to force a rebuild, if building multiple times consecutively

    After building any of the 3 versions, run python setup.py test to test it.

    genann (Neural Network Library)

    Getting Started

    If the repo submodule appears to be empty or out of date, you may need to run git submodule update --init path/to/repo.

    Transpiling

    # generate compile_commands.json
    $ intercept-build make
    $ c2rust transpile compile_commands.json --emit-build-files
    

    Testing

    Instead of translating with --emit-build-files to generate a library crate, you can build with --main exampleN where N is one of 1, 3, or 4 (example2.c seems to never halt in both C and Rust but translates and executes just fine). This will create a binary crate that will run the specified example.

    lil (Little Interpreted Language)

    Getting Started

    If the repo submodule appears to be empty or out of date, you may need to run git submodule update --init path/to/repo.

    Transpiling

    $ intercept-build make
    $ c2rust transpile compile_commands.json --emit-build-files -m main --output-dir rust
    $ cd rust
    $ cargo build
    

    xzoom

    Getting Started

    If the repo submodule appears to be empty or out of date, you may need to run git submodule update --init path/to/repo.

    Required Manual Changes

    You may need to add #include <unistd.h> to xzoom.c for it to properly generate (otherwise main_0 goes missing). This include is normally only added with the TIMER macro enabled, but seems to be required for standard functionality. (We could fork the repo if we want to make this change explicit for the purposes of automated testing.)

    Required Dependencies

    • clang >= 5.0
    • sed

    Transpiling

    $ clang -MJ compile_commands.o.json xzoom.c -L/usr/X11R6/lib -lX11
    $ sed -e '1s/^/[\n/' -e '$s/,$/\n]/' *.o.json > compile_commands.json
    $ c2rust transpile compile_commands.json
    $ rustc xzoom.rs  -L/usr/X11R6/lib -lX11
    

    Refactoring robotfindskitten

    This section details the refactoring script used to transform the initial Rust translation of robotfindskitten, as generated by c2rust transpile, into a safe Rust program. We divide the refactoring process into several major steps:

    • ncurses macro cleanup: The ncurses library implements parts of its API using C preprocessor macros, and a few of those macros expand to relatively complex code. We replace these expanded macro bodies with calls to equivalent functions, which are easier to recognize and refactor.

    • String formatting: robotfindskitten calls several printf-style string-formatting functions. We replace these unsafe variable-argument function calls with safe wrappers using Rust's format family of macros. Aside from improving memory safety, this also allows the Rust compiler to more accurately typecheck the format arguments, which is helpful for later type-directed refactoring passes.

    • Static string constants: robotfindskitten has two global variables containing string constants, which are translated to Rust as static mut definitions containing C-style *const c_char pointers. We refactor to remove both sources of unsafety, replacing raw pointers with checked &'static str references and converting the mutable statics to immutable ones.

    • Heap allocations: robotfindskitten uses a heap allocated array to track the objects in the game world. This array is represented as a raw pointer, and the underlying storage is managed explicitly with malloc and free. We replace the array with a memory-safe collection type, avoiding unsafe FFI calls and preventing out-of-bounds memory accesses.

    • Using the pancurses library: Calling ncurses library functions directly through the Rust FFI requires unsafe code at every call site. We replace unsafe ncurses function calls with calls to the safe wrappers provided by the pancurses crate.

    • Moving global state to the stack: robotfindskitten uses mutable global variables to store the game state, which turn into unsafe static mut definitions in Rust. We collect all such variables into a single stack-allocated struct, which can be mutated without unsafety.

    • libc calls: We replace calls to miscellaneous libc functions, such as sleep and rand, with calls to safe Rust equivalents.

    • Function argument types: Two remaining functions in robotfindskitten take raw pointers as arguments. We change each function's signature to use only safe Rust types, and update their callers to match.

    • String conversion cleanup: Several of the previous refactoring passes insert conversions between Rust and C string types. In several places, these conversions form cycles, such as &str -> *const c_char -> &str, which are both redundant and a source of unsafe code. We remove such conversion cycles to avoid unnecessary raw pointer manipulation.

    • Final cleanup: At this point, we have removed all the unsafe code we can. Only a few cleanup steps remain, such as removing unused unsafe qualifiers and deleting unused extern "C" definitions. In the end, we are left with a correct Rust translation of robotfindskitten that contains only a single line of unsafe code.
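As a small illustration of the string conversion cleanup step listed above, the sketch below shows a &str -> *const c_char -> &str cycle next to its simplified form. Both functions are hypothetical examples written for this document and are not taken from the robotfindskitten translation.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// A redundant conversion cycle: &str -> *const c_char -> &str.
/// It needs an allocation and an `unsafe` block only to get the
/// original text back.
pub fn with_cycle(s: &str) -> String {
    let c_string = CString::new(s).expect("no interior NUL byte");
    let ptr: *const c_char = c_string.as_ptr();
    // SAFETY: `ptr` points into `c_string`, which is NUL-terminated and
    // still alive at this point.
    let back = unsafe { CStr::from_ptr(ptr) }
        .to_str()
        .expect("valid UTF-8");
    back.to_owned()
}

/// The same operation once the cycle has been removed: no raw pointers
/// and no unsafe code.
pub fn without_cycle(s: &str) -> String {
    s.to_owned()
}
```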

    ncurses macro cleanup

    robotfindskitten uses a variety of macros provided by the ncurses library. Since c2rust transpile runs the C preprocessor before translating to Rust, the expansions of those macros effectively get inlined in the Rust code at each call site. In many cases, this is harmless: for example, move(y, x) expands to wmove(stdscr, y, x), which is not much harder to refactor than the original. However, the attr_get and attrset macros are more complex: they expand to multiple lines of code involving several conditionals and complex expressions. In this step, we convert the expanded code into simple function calls, which are easier to manipulate in later refactoring passes.

    Fortunately, the ncurses library provides functions implementing the same operations as the troublesome macros, and we can call those functions through Rust's FFI. We begin by providing Rust declarations for these foreign functions. For ease of reading, we put the new declarations just after the existing extern "C" block:

    select target 'crate; child(foreign_mod); last;' ;
    create_item
        '
            extern "C" {
                fn wattr_get(win: *mut WINDOW, attrs: *mut attr_t,
                    pair: *mut libc::c_short, opts: *mut libc::c_void) -> libc::c_int;
                fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
            }
        '
        after ;
    

    Diff #1

    src/robotfindskitten.rs
      )]
      #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
      extern crate libc;
      extern "C" {
          pub type ldat;
          #[no_mangle]
          fn printf(_: *const libc::c_char, ...) -> libc::c_int;
          #[no_mangle]
          fn cbreak() -> libc::c_int;

          #[no_mangle]
          fn time(__timer: *mut time_t) -> time_t;
          #[no_mangle]
          fn sleep(__seconds: libc::c_uint) -> libc::c_uint;
      }
    + extern "C" {
    +     fn wattr_get(
    +         win: *mut WINDOW,
    +         attrs: *mut attr_t,
    +         pair: *mut libc::c_short,
    +         opts: *mut libc::c_void,
    +     ) -> libc::c_int;
    +     fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
    + }
      pub type __time_t = libc::c_long;
      pub type chtype = libc::c_ulong;
    +
    + #[repr(C)]
      #[derive(Copy, Clone)]
    - #[repr(C)]
      pub struct _win_st {
          pub _cury: libc::c_short,
          pub _curx: libc::c_short,
          pub _maxy: libc::c_short,
          pub _maxx: libc::c_short,

          pub _pary: libc::c_int,
          pub _parent: *mut WINDOW,
          pub _pad: pdat,
    113
        pub _yoffset: libc::c_short,
    123
        pub _yoffset: libc::c_short,
    114
    }
    124
    }
    125
    126
    #[repr(C)]
    115
    #[derive(Copy, Clone)]
    127
    #[derive(Copy, Clone)]
    116
    #[repr(C)]
    117
    pub struct pdat {
    128
    pub struct pdat {
    118
        pub _pad_y: libc::c_short,
    129
        pub _pad_y: libc::c_short,
    119
        pub _pad_x: libc::c_short,
    130
        pub _pad_x: libc::c_short,
    120
        pub _pad_top: libc::c_short,
    131
        pub _pad_top: libc::c_short,
    121
        pub _pad_left: libc::c_short,
    132
        pub _pad_left: libc::c_short,

    156
    /*Screen dimensions.*/
    167
    /*Screen dimensions.*/
    157
    /*Macros for generating numbers in different ranges*/
    168
    /*Macros for generating numbers in different ranges*/
    158
    /*Row constants for the animation*/
    169
    /*Row constants for the animation*/
    159
    /*This struct contains all the information we need to display an object
    170
    /*This struct contains all the information we need to display an object
    160
    on the screen*/
    171
    on the screen*/
    172
    173
    #[repr(C)]
    161
    #[derive(Copy, Clone)]
    174
    #[derive(Copy, Clone)]
    162
    #[repr(C)]
    163
    pub struct screen_object {
    175
    pub struct screen_object {
    164
        pub x: libc::c_int,
    176
        pub x: libc::c_int,
    165
        pub y: libc::c_int,
    177
        pub y: libc::c_int,
    166
        pub color: libc::c_int,
    178
        pub color: libc::c_int,
    167
        pub bold: bool,
    179
        pub bold: bool,

    Now we can use rewrite_expr to find Rust code that comes from the expansions of the wattrset macro and replace it with calls to the wattrset function:

    rewrite_expr
        '
            if !(__win as *const libc::c_void).is_null() {
                (*__win)._attrs = __attrs
            } else {
            }
        '
        'wattrset(__win, __attrs as libc::c_int)' ;
    

    Diff #2

    src/robotfindskitten.rs
             new |= 1u64 << 14i32 + 8i32
         }
         if o.bold {
             new |= 1u64 << 13i32 + 8i32
         }
    -    if !(stdscr as *const libc::c_void).is_null() {
    -        (*stdscr)._attrs = new
    -    } else {
    -    };
    +    wattrset(stdscr, new as libc::c_int);
         if in_place {
             printw(
                 b"%c\x00" as *const u8 as *const libc::c_char,
                 o.character as libc::c_int,
             );

                 b"%c\x00" as *const u8 as *const libc::c_char,
                 o.character as libc::c_int,
             );
             wmove(stdscr, o.y, o.x);
         }
    -    if !(stdscr as *const libc::c_void).is_null() {
    -        (*stdscr)._attrs = old
    -    } else {
    -    };
    +    wattrset(stdscr, old as libc::c_int);
     }
     #[no_mangle]
     pub unsafe extern "C" fn instructions() {
         let mut dummy: libc::c_char = 0;
         mvprintw(

    The __win and __attrs metavariables in the pattern correspond to the arguments of the original C macro, and are used in the replacement to construct the equivalent Rust function call.

    Next, we do the same thing for the more complicated wattr_get macro:

    rewrite_expr
        '
            if !(__win as *const libc::c_void).is_null() {
                if !(&mut __attrs as *mut attr_t as *const libc::c_void).is_null() {
                    __attrs = (*__win)._attrs
                } else {
                };
                if !(&mut __pair as *mut libc::c_short as *const libc::c_void).is_null() {
                    __pair = (((*__win)._attrs as libc::c_ulong
                        & ((1u32 << 8i32).wrapping_sub(1u32) << 0i32 + 8i32) as libc::c_ulong)
                        >> 8i32) as libc::c_int as libc::c_short
                } else {
                };
            } else {
            }
        '
        'wattr_get(__win, &mut __attrs, &mut __pair, ::std::ptr::null_mut())' ;
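
    The arithmetic in this pattern is easier to follow in isolation. The sketch below reuses the mask expression from the expanded macro and shows that it selects bits 8 through 15 of the attribute word, which are then shifted down to give the color pair number. The function names color_mask and pair_number are hypothetical and chosen only for this illustration.

```rust
use std::os::raw::{c_short, c_ulong};

/// The mask expression from the expanded macro. `+` binds tighter than `<<`
/// in Rust, so this evaluates to 0xff << 8.
fn color_mask() -> c_ulong {
    ((1u32 << 8i32).wrapping_sub(1u32) << 0i32 + 8i32) as c_ulong
}

/// Extracts the color pair number stored in bits 8..=15 of an attribute word.
fn pair_number(attrs: c_ulong) -> c_short {
    ((attrs & color_mask()) >> 8i32) as i32 as c_short
}

fn main() {
    // Pair 5 combined with the bit used for bold text elsewhere in the
    // translated code (1 << 13 + 8).
    let attrs: c_ulong = (5 << 8) | (1 << 13 + 8);
    println!("mask = {:#x}, pair = {}", color_mask(), pair_number(attrs));
}
```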
    

    We are now done with this bit of cleanup, so we write the changes to disk before continuing:

    commit ;
    

    Diff #4

    src/robotfindskitten.rs
     )]
     #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
     extern crate libc;
     extern "C" {
         pub type ldat;
         #[no_mangle]
         fn printf(_: *const libc::c_char, ...) -> libc::c_int;
         #[no_mangle]
         fn cbreak() -> libc::c_int;

         #[no_mangle]
         fn sleep(__seconds: libc::c_uint) -> libc::c_uint;
     }
     extern "C" {
         fn wattr_get(
             win: *mut WINDOW,
             attrs: *mut attr_t,
             pair: *mut libc::c_short,
             opts: *mut libc::c_void,

    String formatting

    robotfindskitten calls several printf-style variable-argument functions to perform string formatting. Since variable-argument function calls are considered unsafe in Rust, we must replace these with Rust-style string formatting using format! and related macros. Specifically, for each string formatting function such as printf, we will create a safe wrapper fmt_printf that takes a Rust fmt::Arguments object, and replace printf(...) calls with fmt_printf(format_args!(...)). This approach isolates all the unsafety into the fmt_printf wrapper, where it can be eliminated by later passes.

    The replacement itself happens in two steps. First, we convert printf calls from printf(<C format args...>) to printf(format_args!(<Rust format args...>)). Note that the code does not typecheck in this intermediate state: C's printf function cannot accept the std::fmt::Arguments produced by the format_args! macro. The second step then replaces the printf call with a call to the fmt_printf wrapper, which does accept std::fmt::Arguments.

    printf format argument conversion

    We run a few commands to mark the nodes involved in string formatting, before finally running the convert_format_args command to perform the actual transformation.

    First, we use select and mark_arg_uses to mark the first argument of every printf call as a target:

    select target 'item(printf);' ;
    mark_arg_uses 0 target ;
    

    Diff #5

    src/robotfindskitten.rs
     unsafe extern "C" fn finish(mut sig: libc::c_int) {
         endwin();
         printf(
             b"%c%c%c\x00" as *const u8 as *const libc::c_char,
             27i32,
             '(' as i32,
             'B' as i32,
         );
         exit(0i32);

                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong)
             {
                 printf(
                     b"Run-time parameter must be between 0 and %d.\n\x00" as *const u8
                         as *const libc::c_char,
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong),
                 );
                 exit(0i32);

         }
         srand(time(0 as *mut time_t) as libc::c_uint);
         printf(
             b"%c%c%c\x00" as *const u8 as *const libc::c_char,
             27i32,
             '(' as i32,
             'U' as i32,
         );
         initialize_ncurses();

    convert_format_args will treat the target argument at each call site as a printf-style format string, and will treat all later arguments as format args.
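
    To give a feel for the format string half of this conversion, here is a deliberately simplified sketch. It is not the actual convert_format_args implementation: the convert_format function is hypothetical and handles only the %d, %c, %s and %% conversions, with no flags, widths or precisions.

```rust
/// Hypothetical, simplified converter from a printf-style format string to a
/// Rust format string. Supports only %d, %c, %s and %%.
fn convert_format(c_fmt: &str) -> String {
    let mut out = String::new();
    let mut chars = c_fmt.chars();
    while let Some(ch) = chars.next() {
        match ch {
            '%' => match chars.next() {
                // "%%" is a literal percent sign in C.
                Some('%') => out.push('%'),
                // Each supported conversion becomes a Rust placeholder.
                Some('d') | Some('c') | Some('s') => out.push_str("{:}"),
                Some(other) => panic!("unsupported conversion: %{}", other),
                None => out.push('%'),
            },
            // Braces are special in Rust format strings and must be doubled.
            '{' => out.push_str("{{"),
            '}' => out.push_str("}}"),
            _ => out.push(ch),
        }
    }
    out
}

fn main() {
    println!("{}", convert_format("Run-time parameter must be between 0 and %d.\n"));
}
```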

    Next, we mark the format string literal with fmt_str, which tells convert_format_args the exact string literal it should use as the format string. This usually is not the same as the target argument, since c2rust-transpile inserts several casts to turn a Rust string literal into a *const libc::c_char.

    select fmt_str 'marked(target); desc(expr && !match_expr(__e as __t));' ;
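
    The sketch below shows, in isolation, the shape of expression this selection has to see through: a byte string literal wrapped in two casts. Reading the pointer back with CStr confirms that the innermost literal carries the format string. The function name literal_behind_casts is hypothetical.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Builds a format argument the way the translated code does and recovers the
/// text of the literal underneath the casts.
fn literal_behind_casts() -> &'static str {
    // Outer expression: two `as` casts. Innermost expression: the literal.
    let target: *const c_char = b"%c%c%c\x00" as *const u8 as *const c_char;
    // Sound here because the literal is 'static and ends with a NUL byte.
    let c_str: &'static CStr = unsafe { CStr::from_ptr(target) };
    c_str.to_str().expect("format string is valid UTF-8")
}

fn main() {
    println!("{}", literal_behind_casts());
}
```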
    

    Diff #6

    src/robotfindskitten.rs
     unsafe extern "C" fn finish(mut sig: libc::c_int) {
         endwin();
         printf(
             b"%c%c%c\x00" as *const u8 as *const libc::c_char,
             27i32,
             '(' as i32,
             'B' as i32,
         );
         exit(0i32);

                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong)
             {
                 printf(
                     b"Run-time parameter must be between 0 and %d.\n\x00" as *const u8
                         as *const libc::c_char,
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong),
                 );
                 exit(0i32);

         }
         srand(time(0 as *mut time_t) as libc::c_uint);
         printf(
             b"%c%c%c\x00" as *const u8 as *const libc::c_char,
             27i32,
             '(' as i32,
             'U' as i32,
         );
         initialize_ncurses();

    With both target and fmt_str marks in place, we can apply the actual transformation:

    convert_format_args ;
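
    Besides rewriting the format string, the command casts each argument to suit its conversion specifier. For %c, a C int argument becomes a Rust char by way of u8, because Rust permits integer-to-char casts only from u8. The sketch below isolates that cast; the helper name c_int_to_char is hypothetical.

```rust
/// Mirrors the cast applied to a `%c` argument. C's printf converts the `int`
/// argument to `unsigned char`, and the `as u8` step truncates the same way.
fn c_int_to_char(value: i32) -> char {
    value as u8 as char
}

fn main() {
    // The escape sequence printed by `finish`: ESC ( B
    let text = format!(
        "{:}{:}{:}",
        c_int_to_char(27),
        c_int_to_char('(' as i32),
        c_int_to_char('B' as i32)
    );
    assert_eq!(text, "\u{1b}(B");
    println!("{:?}", text);
}
```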
    

    Diff #7

    src/robotfindskitten.rs
             );
         };
     }
     unsafe extern "C" fn finish(mut sig: libc::c_int) {
         endwin();
    -    printf(
    -        b"%c%c%c\x00" as *const u8 as *const libc::c_char,
    -        27i32,
    -        '(' as i32,
    -        'B' as i32,
    -    );
    +    printf(format_args!(
    +        "{:}{:}{:}",
    +        27i32 as u8 as char, '(' as i32 as u8 as char, 'B' as i32 as u8 as char
    +    ));
         exit(0i32);
     }
     #[no_mangle]
     pub unsafe extern "C" fn initialize_arrays() {
         let mut counter: libc::c_int = 0;

             if num_bogus < 0i32
                 || num_bogus as libc::c_ulong
                     > (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong)
             {
    -            printf(
    -                b"Run-time parameter must be between 0 and %d.\n\x00" as *const u8
    -                    as *const libc::c_char,
    +            printf(format_args!(
    +                "Run-time parameter must be between 0 and {:}.\n",
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
    -                    .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong),
    -            );
    +                    .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong)
    +                    as libc::c_int
    +            ));
                 exit(0i32);
             }
         }
         srand(time(0 as *mut time_t) as libc::c_uint);
    -    printf(
    -        b"%c%c%c\x00" as *const u8 as *const libc::c_char,
    -        27i32,
    -        '(' as i32,
    -        'U' as i32,
    -    );
    +    printf(format_args!(
    +        "{:}{:}{:}",
    +        27i32 as u8 as char, '(' as i32 as u8 as char, 'U' as i32 as u8 as char
    +    ));
         initialize_ncurses();
         initialize_arrays();
         initialize_robot();
         initialize_kitten();
         initialize_bogus();

    Finally, we clean up after this step by clearing all the marks:

    clear_marks ;
    

    commit would also clear the marks, but we don't want to commit these changes until we've fixed the type errors introduced in this step.

    Creating a printf wrapper

    As a reminder, we currently have code that looks like this:

    printf(format_args!("Hello, {}!\n", "world"))
    

    printf itself can't accept the std::fmt::Arguments returned by format_args!, so we will define a wrapper that does accept std::fmt::Arguments and then rewrite these printf calls to call the wrapper instead.

    First, we insert the wrapper:

    select target 'crate; child(foreign_mod); last;' ;
    create_item
        '
            fn fmt_printf(args: ::std::fmt::Arguments) -> libc::c_int {
                print!("{}", args);
                0
            }
        '
        after ;
    

    Diff #9

    src/robotfindskitten.rs
         #[no_mangle]
         fn sleep(__seconds: libc::c_uint) -> libc::c_uint;
     }
     extern "C" {
         fn wattr_get(
             win: *mut WINDOW,
             attrs: *mut attr_t,
             pair: *mut libc::c_short,
             opts: *mut libc::c_void,
         ) -> libc::c_int;
         fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
     }
    +fn fmt_printf(args: ::std::fmt::Arguments) -> libc::c_int {
    +    print!("{}", args);
    +    0
    +}
     pub type __time_t = libc::c_long;
     pub type chtype = libc::c_ulong;
     
     #[repr(C)]
     #[derive(Copy, Clone)]

    Since Rust provides a print! macro with similar functionality to printf, our "wrapper" actually just calls print! directly, avoiding the string conversions necessary to call the actual C printf. (See the next subsection for an example of a "real" wrapper function.)
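
    One difference from C is worth noting: printf returns the number of characters written, while the wrapper above always returns 0. The sketch below is a variation on the idea, not the wrapper used in this tutorial. The hypothetical fmt_write function accepts the same std::fmt::Arguments value, writes it to any std::io::Write sink, and reports the number of bytes written.

```rust
use std::fmt;
use std::io::{self, Write};

/// Hypothetical variant of the wrapper: formats `args`, writes the result to
/// `out`, and returns the number of bytes written.
fn fmt_write<W: Write>(out: &mut W, args: fmt::Arguments) -> io::Result<usize> {
    let text = fmt::format(args);
    out.write_all(text.as_bytes())?;
    Ok(text.len())
}

fn main() {
    // A Vec<u8> stands in for stdout so the result can be inspected.
    let mut buffer = Vec::new();
    let written = fmt_write(&mut buffer, format_args!("Hello, {}!\n", "world")).unwrap();
    assert_eq!(written, 14);
    assert_eq!(buffer, b"Hello, world!\n");
}
```

    Writing to a generic sink keeps the function safe and makes it easy to test without capturing standard output.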

    With the wrapper in place, we can now update the call sites:

    rewrite_expr 'printf' 'fmt_printf' ;
    

    Diff #10

    src/robotfindskitten.rs
             );
         };
     }
     unsafe extern "C" fn finish(mut sig: libc::c_int) {
         endwin();
    -    printf(format_args!(
    +    fmt_printf(format_args!(
             "{:}{:}{:}",
             27i32 as u8 as char, '(' as i32 as u8 as char, 'B' as i32 as u8 as char
         ));
         exit(0i32);
     }

             if num_bogus < 0i32
                 || num_bogus as libc::c_ulong
                     > (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong)
             {
    -            printf(format_args!(
    +            fmt_printf(format_args!(
                     "Run-time parameter must be between 0 and {:}.\n",
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong)
                         as libc::c_int
                 ));
                 exit(0i32);
             }
         }
         srand(time(0 as *mut time_t) as libc::c_uint);
    -    printf(format_args!(
    +    fmt_printf(format_args!(
             "{:}{:}{:}",
             27i32 as u8 as char, '(' as i32 as u8 as char, 'U' as i32 as u8 as char
         ));
         initialize_ncurses();
         initialize_arrays();

    Now that we've finished this step and the crate typechecks again, we can safely commit the changes:

    commit ;
    

    Diff #11

    src/robotfindskitten.rs
         #[no_mangle]
         fn sleep(__seconds: libc::c_uint) -> libc::c_uint;
     }
     extern "C" {
         fn wattr_get(
             win: *mut WINDOW,
             attrs: *mut attr_t,
             pair: *mut libc::c_short,
             opts: *mut libc::c_void,
         ) -> libc::c_int;
         fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
     }
     fn fmt_printf(args: ::std::fmt::Arguments) -> libc::c_int {
         print!("{}", args);
         0
     }
     pub type __time_t = libc::c_long;
     pub type chtype = libc::c_ulong;

    Other string formatting functions

    Aside from printf, robotfindskitten also uses the ncurses printw and mvprintw string-formatting functions. The refactoring script for printw is similar to the previous two steps combined:

    select target 'item(printw);' ;
    mark_arg_uses 0 target ;
    select fmt_str 'marked(target); desc(expr && !match_expr(__e as __t));' ;
    
    convert_format_args ;
    
    clear_marks ;
    
    select target 'crate; child(foreign_mod); last;' ;
    create_item
        '
            fn fmt_printw(args: ::std::fmt::Arguments) -> libc::c_int {
                unsafe {
                    ::printw(b"%s\0" as *const u8 as *const libc::c_char,
                             ::std::ffi::CString::new(format!("{}", args))
                                 .unwrap().as_ptr())
                }
            }
        '
        after ;
    rewrite_expr 'printw' 'fmt_printw' ;
    commit ;
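
    The string conversion inside fmt_printw can be studied without ncurses. The sketch below performs the same format-then-CString step but returns the Result instead of unwrapping it, since CString::new fails if the formatted text contains an interior NUL byte (in which case the wrapper above would panic). The function name format_to_c_string is hypothetical.

```rust
use std::ffi::{CString, NulError};
use std::fmt;

/// Formats `args` into an owned, NUL-terminated C string.
/// Returns an error if the formatted text contains an interior NUL byte.
fn format_to_c_string(args: fmt::Arguments) -> Result<CString, NulError> {
    CString::new(fmt::format(args))
}

fn main() {
    let ok = format_to_c_string(format_args!("{:}% done", 95)).unwrap();
    assert_eq!(ok.to_str().unwrap(), "95% done");

    // An interior NUL cannot be represented in a C string.
    assert!(format_to_c_string(format_args!("a\0b")).is_err());
}
```

    Passing the formatted text as the argument of a fixed "%s" format, as fmt_printw does, means that a literal % in the output (as in "95% done") is not reinterpreted by printw as a conversion specifier.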
    

    Diff #12

    src/robotfindskitten.rs
             pair: *mut libc::c_short,
             opts: *mut libc::c_void,
         ) -> libc::c_int;
         fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
     }
    +fn fmt_printw(args: ::std::fmt::Arguments) -> libc::c_int {
    +    unsafe {
    +        ::printw(
    +            b"%s\0" as *const u8 as *const libc::c_char,
    +            ::std::ffi::CString::new(format!("{}", args))
    +                .unwrap()
    +                .as_ptr(),
    +        )
    +    }
    +}
     fn fmt_printf(args: ::std::fmt::Arguments) -> libc::c_int {
         print!("{}", args);
         0
     }
     pub type __time_t = libc::c_long;

             b"robotfindskitten v%s\n\n\x00" as *const u8 as *const libc::c_char,
             ver,
         );
         counter = 0i32;
         while counter <= COLS - 1i32 {
    -        printw(b"%c\x00" as *const u8 as *const libc::c_char, 95i32);
    +        fmt_printw(format_args!("{:}", 95i32 as u8 as char));
             counter += 1
         }
         counter = 0i32;
         while counter < num_bogus {
             draw(bogus[counter as usize]);

         if o.bold {
             new |= 1u64 << 13i32 + 8i32
         }
         wattrset(stdscr, new as libc::c_int);
         if in_place {
    -        printw(
    -            b"%c\x00" as *const u8 as *const libc::c_char,
    -            o.character as libc::c_int,
    -        );
    +        fmt_printw(format_args!(
    +            "{:}",
    +            o.character as libc::c_int as u8 as char
    +        ));
         } else {
             mvprintw(
                 o.y,
                 o.x,
                 b"%c\x00" as *const u8 as *const libc::c_char,

             0i32,
             0i32,
             b"robotfindskitten v%s\n\x00" as *const u8 as *const libc::c_char,
             ver,
         );
    -    printw(
    -        b"By the illustrious Leonard Richardson (C) 1997, 2000\n\x00" as *const u8
    -            as *const libc::c_char,
    -    );
    +    fmt_printw(format_args!(
    +        "By the illustrious Leonard Richardson (C) 1997, 2000\n"
    +    ));
    -    printw(
    -        b"Written originally for the Nerth Pork robotfindskitten contest\n\n\x00" as *const u8
    -            as *const libc::c_char,
    -    );
    +    fmt_printw(format_args!(
    +        "Written originally for the Nerth Pork robotfindskitten contest\n\n"
    +    ));
    -    printw(b"In this game, you are robot (\x00" as *const u8 as *const libc::c_char);
    +    fmt_printw(format_args!("In this game, you are robot ("));
         draw_in_place(robot);
    -    printw(b"). Your job is to find kitten. This task\n\x00" as *const u8 as *const libc::c_char);
    +    fmt_printw(format_args!("). Your job is to find kitten. This task\n"));
    -    printw(
    -        b"is complicated by the existence of various things which are not kitten.\n\x00"
    -            as *const u8 as *const libc::c_char,
    -    );
    +    fmt_printw(format_args!(
    +        "is complicated by the existence of various things which are not kitten.\n"
    +    ));
    -    printw(
    -        b"Robot must touch items to determine if they are kitten or not. The game\n\x00"
    -            as *const u8 as *const libc::c_char,
    -    );
    +    fmt_printw(format_args!(
    +        "Robot must touch items to determine if they are kitten or not. The game\n"
    +    ));
    -    printw(
    -        b"ends when robotfindskitten. Alternatively, you may end the game by hitting\n\x00"
    -            as *const u8 as *const libc::c_char,
    -    );
    +    fmt_printw(format_args!(
    +        "ends when robotfindskitten. Alternatively, you may end the game by hitting\n"
    +    ));
    -    printw(
    -        b"the Esc key. See the documentation for more information.\n\n\x00" as *const u8
    -            as *const libc::c_char,
    -    );
    +    fmt_printw(format_args!(
    +        "the Esc key. See the documentation for more information.\n\n"
    +    ));
    -    printw(b"Press any key to start.\n\x00" as *const u8 as *const libc::c_char);
    +    fmt_printw(format_args!("Press any key to start.\n"));
         wrefresh(stdscr);
    1264
        dummy = wgetch(stdscr) as libc::c_char;
    1268
        dummy = wgetch(stdscr) as libc::c_char;
    1265
        wclear(stdscr);
    1269
        wclear(stdscr);
    1266
    }
    1270
    }
    1267
    #[no_mangle]
    1271
    #[no_mangle]

    Apart from replacing the name printf with printw, the only notable difference from the printf script is the body of fmt_printw. There is no convenient replacement for printw in the Rust standard library, so we instead call the original printw function, passing it a fixed "%s" format string and the result of Rust string formatting (converted to a C string) as the argument.
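
    The string-conversion step inside fmt_printw can be looked at in isolation. The sketch below is a minimal illustration using only the standard library; the helper name to_c_string is hypothetical and is not generated by the refactoring script.

```rust
use std::ffi::CString;
use std::fmt;

/// Hypothetical helper (not part of the refactoring script): renders Rust
/// format arguments into an owned, NUL-terminated C string.
fn to_c_string(args: fmt::Arguments) -> CString {
    // `format!` produces a Rust `String`; `CString::new` appends the NUL
    // terminator and fails if the text already contains an interior NUL byte.
    CString::new(format!("{}", args)).expect("interior NUL in formatted text")
}
```

    A call such as to_c_string(format_args!("v{}", 1)) yields a value whose .as_ptr() can be handed to a C function expecting a const char*. The pointer is only valid while the CString is alive, which is why the generated wrapper builds the CString inside the call expression itself.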

    The mvprintw replacement is similar, with just two extra arguments (y and x) passed through to the wrapper:

    select target 'item(mvprintw);' ;
    mark_arg_uses 2 target ;
    select fmt_str 'marked(target); desc(expr && !match_expr(__e as __t));' ;
    
    convert_format_args ;
    
    clear_marks ;
    
    select target 'crate; child(foreign_mod); last;' ;
    create_item
        '
            fn fmt_mvprintw(y: libc::c_int, x: libc::c_int,
                            args: ::std::fmt::Arguments) -> libc::c_int {
                unsafe {
                    ::mvprintw(y, x, b"%s\0" as *const u8 as *const libc::c_char,
                             ::std::ffi::CString::new(format!("{}", args))
                                 .unwrap().as_ptr())
                }
            }
        '
        after ;
    rewrite_expr 'mvprintw' 'fmt_mvprintw' ;
    commit ;
    

    Diff #13

    src/robotfindskitten.rs
    86
            pair: *mut libc::c_short,
    86
            pair: *mut libc::c_short,
    87
            opts: *mut libc::c_void,
    87
            opts: *mut libc::c_void,
    88
        ) -> libc::c_int;
    88
        ) -> libc::c_int;
    89
        fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
    89
        fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
    90
    }
    90
    }
    91
    fn fmt_mvprintw(y: libc::c_int, x: libc::c_int, args: ::std::fmt::Arguments) -> libc::c_int {
    92
        unsafe {
    93
            ::mvprintw(
    94
                y,
    95
                x,
    96
                b"%s\0" as *const u8 as *const libc::c_char,
    97
                ::std::ffi::CString::new(format!("{}", args))
    98
                    .unwrap()
    99
                    .as_ptr(),
    100
            )
    101
        }
    102
    }
    91
    fn fmt_printw(args: ::std::fmt::Arguments) -> libc::c_int {
    103
    fn fmt_printw(args: ::std::fmt::Arguments) -> libc::c_int {
    92
        unsafe {
    104
        unsafe {
    93
            ::printw(
    105
            ::printw(
    94
                b"%s\0" as *const u8 as *const libc::c_char,
    106
                b"%s\0" as *const u8 as *const libc::c_char,
    95
                ::std::ffi::CString::new(format!("{}", args))
    107
                ::std::ffi::CString::new(format!("{}", args))

    1163
    #[no_mangle]
    1175
    #[no_mangle]
    1164
    pub static mut num_bogus: libc::c_int = 0;
    1176
    pub static mut num_bogus: libc::c_int = 0;
    1165
    #[no_mangle]
    1177
    #[no_mangle]
    1166
    pub unsafe extern "C" fn initialize_screen() {
    1178
    pub unsafe extern "C" fn initialize_screen() {
    1167
        let mut counter: libc::c_int = 0;
    1179
        let mut counter: libc::c_int = 0;
    1168
        mvprintw(
    1180
        fmt_mvprintw(
    1169
            0i32,
    1181
            0i32,
    1170
            0i32,
    1182
            0i32,
    1171
            b"robotfindskitten v%s\n\n\x00" as *const u8 as *const libc::c_char,
    1183
            format_args!("robotfindskitten v{:}\n\n", unsafe {
    1184
                ::std::ffi::CStr::from_ptr(ver as *const libc::c_char)
    1185
                    .to_str()
    1186
                    .unwrap()
    1172
            ver,
    1187
            }),
    1173
        );
    1188
        );
    1174
        counter = 0i32;
    1189
        counter = 0i32;
    1175
        while counter <= COLS - 1i32 {
    1190
        while counter <= COLS - 1i32 {
    1176
            fmt_printw(format_args!("{:}", 95i32 as u8 as char));
    1191
            fmt_printw(format_args!("{:}", 95i32 as u8 as char));
    1177
            counter += 1
    1192
            counter += 1

    1221
            fmt_printw(format_args!(
    1236
            fmt_printw(format_args!(
    1222
                "{:}",
    1237
                "{:}",
    1223
                o.character as libc::c_int as u8 as char
    1238
                o.character as libc::c_int as u8 as char
    1224
            ));
    1239
            ));
    1225
        } else {
    1240
        } else {
    1226
            mvprintw(
    1241
            fmt_mvprintw(
    1227
                o.y,
    1242
                o.y,
    1228
                o.x,
    1243
                o.x,
    1229
                b"%c\x00" as *const u8 as *const libc::c_char,
    1244
                format_args!("{:}", o.character as libc::c_int as u8 as char),
    1230
                o.character as libc::c_int,
    1231
            );
    1245
            );
    1232
            wmove(stdscr, o.y, o.x);
    1246
            wmove(stdscr, o.y, o.x);
    1233
        }
    1247
        }
    1234
        wattrset(stdscr, old as libc::c_int);
    1248
        wattrset(stdscr, old as libc::c_int);
    1235
    }
    1249
    }
    1236
    #[no_mangle]
    1250
    #[no_mangle]
    1237
    pub unsafe extern "C" fn instructions() {
    1251
    pub unsafe extern "C" fn instructions() {
    1238
        let mut dummy: libc::c_char = 0;
    1252
        let mut dummy: libc::c_char = 0;
    1239
        mvprintw(
    1253
        fmt_mvprintw(
    1240
            0i32,
    1254
            0i32,
    1241
            0i32,
    1255
            0i32,
    1242
            b"robotfindskitten v%s\n\x00" as *const u8 as *const libc::c_char,
    1256
            format_args!("robotfindskitten v{:}\n", unsafe {
    1257
                ::std::ffi::CStr::from_ptr(ver as *const libc::c_char)
    1258
                    .to_str()
    1259
                    .unwrap()
    1243
            ver,
    1260
            }),
    1244
        );
    1261
        );
    1245
        fmt_printw(format_args!(
    1262
        fmt_printw(format_args!(
    1246
            "By the illustrious Leonard Richardson (C) 1997, 2000\n"
    1263
            "By the illustrious Leonard Richardson (C) 1997, 2000\n"
    1247
        ));
    1264
        ));
    1248
        fmt_printw(format_args!(
    1265
        fmt_printw(format_args!(

    1301
    }
    1318
    }
    1302
    #[no_mangle]
    1319
    #[no_mangle]
    1303
    pub unsafe extern "C" fn message(mut message_0: *mut libc::c_char) {
    1320
    pub unsafe extern "C" fn message(mut message_0: *mut libc::c_char) {
    1304
        wmove(stdscr, 1i32, 0i32);
    1321
        wmove(stdscr, 1i32, 0i32);
    1305
        wclrtoeol(stdscr);
    1322
        wclrtoeol(stdscr);
    1306
        mvprintw(
    1323
        fmt_mvprintw(
    1307
            1i32,
    1324
            1i32,
    1308
            0i32,
    1325
            0i32,
    1309
            b"%.*s\x00" as *const u8 as *const libc::c_char,
    1326
            format_args!("{:.*}", COLS as usize, unsafe {
    1327
                ::std::ffi::CStr::from_ptr(message_0 as *const libc::c_char)
    1328
                    .to_str()
    1329
                    .unwrap()
    1310
            COLS,
    1330
            }),
    1311
            message_0,
    1312
        );
    1331
        );
    1313
        wmove(stdscr, robot.y, robot.x);
    1332
        wmove(stdscr, robot.y, robot.x);
    1314
        wrefresh(stdscr);
    1333
        wrefresh(stdscr);
    1315
    }
    1334
    }
    1316
    #[no_mangle]
    1335
    #[no_mangle]

    Static string constant - ver

    robotfindskitten defines a static string constant, ver, to store the game's version. Using ver is currently unsafe, first because its Rust type is a raw pointer (*mut c_char), and second because it's mutable. To make ver usage safe, we first change its type to &'static str (and fix up the resulting type errors), and then we change it from a static mut to an ordinary immutable static. Note that we must change the type first because Rust does not allow raw pointers to be stored in safe (non-mut) statics.
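
    As a rough picture of where this is heading, the sketch below shows an immutable &'static str static being read without any unsafe block. The names VER and version are stand-ins chosen for this illustration, and the trimming of the trailing NUL is an assumption made here for display purposes, not something the refactoring performs.

```rust
// Stand-in for the refactored `ver`: an immutable static holding a string
// slice. The trailing NUL mirrors the C string terminator.
static VER: &str = "1.7320508.406\u{0}";

/// Reads the static without `unsafe`: an immutable static of a `Sync` type
/// such as `&'static str` can be accessed from safe code.
fn version() -> &'static str {
    // Drop the C terminator before showing the text to Rust callers.
    VER.trim_end_matches('\u{0}')
}
```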

    We change the type using rewrite_ty:

    select target 'item(ver); child(ty);' ;
    rewrite_ty 'marked!(*mut libc::c_char)' "&'static str" ;
    delete_marks target ;
    

    Diff #14

    src/robotfindskitten.rs
    203
        pub y: libc::c_int,
    203
        pub y: libc::c_int,
    204
        pub color: libc::c_int,
    204
        pub color: libc::c_int,
    205
        pub bold: bool,
    205
        pub bold: bool,
    206
        pub character: libc::c_char,
    206
        pub character: libc::c_char,
    207
    }
    207
    }
    208
    static mut ver: *mut libc::c_char =
    208
    static mut ver: &'static str =
    209
        b"1.7320508.406\x00" as *const u8 as *const libc::c_char as *mut libc::c_char;
    209
        b"1.7320508.406\x00" as *const u8 as *const libc::c_char as *mut libc::c_char;
    210
    #[inline]
    210
    #[inline]
    211
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    211
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    212
        return strtol(
    212
        return strtol(
    213
            __nptr,
    213
            __nptr,

    The combination of select and the marked! matching form ensures that only ver's type annotation is modified. We delete the mark afterward, since it's no longer needed.

    Simply replacing *mut c_char with &str introduces type errors throughout the crate. The initializer for ver still has type *mut c_char, and all uses of ver are still expecting a *mut c_char.

    Fixing ver's initializer

    Fixing the ver initializer is straightforward: we remove all the casts, then convert the byte string literal (b"...", a reference to a u8 array) to an ordinary string literal. For the casts, we mark all cast expressions in ver's definition, then replace each one with its subexpression:

    select target 'item(ver); desc(match_expr(__e as __t));' ;
    rewrite_expr 'marked!(__e as __t)' '__e' ;
    delete_marks target ;
    

    Diff #15

    src/robotfindskitten.rs
    203
        pub y: libc::c_int,
    203
        pub y: libc::c_int,
    204
        pub color: libc::c_int,
    204
        pub color: libc::c_int,
    205
        pub bold: bool,
    205
        pub bold: bool,
    206
        pub character: libc::c_char,
    206
        pub character: libc::c_char,
    207
    }
    207
    }
    208
    static mut ver: &'static str =
    208
    static mut ver: &'static str = b"1.7320508.406\x00";
    209
        b"1.7320508.406\x00" as *const u8 as *const libc::c_char as *mut libc::c_char;
    210
    #[inline]
    209
    #[inline]
    211
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    210
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    212
        return strtol(
    211
        return strtol(
    213
            __nptr,
    212
            __nptr,
    214
            0 as *mut libc::c_void as *mut *mut libc::c_char,
    213
            0 as *mut libc::c_void as *mut *mut libc::c_char,

    Only the byte string literal remains, so we mark it and change it to an ordinary str literal:

    select target 'item(ver); child(expr);' ;
    bytestr_to_str ;
    delete_marks target ;
    

    Diff #16

    src/robotfindskitten.rs
    203
        pub y: libc::c_int,
    203
        pub y: libc::c_int,
    204
        pub color: libc::c_int,
    204
        pub color: libc::c_int,
    205
        pub bold: bool,
    205
        pub bold: bool,
    206
        pub character: libc::c_char,
    206
        pub character: libc::c_char,
    207
    }
    207
    }
    208
    static mut ver: &'static str = b"1.7320508.406\x00";
    208
    static mut ver: &'static str = "1.7320508.406\u{0}";
    209
    #[inline]
    209
    #[inline]
    210
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    210
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    211
        return strtol(
    211
        return strtol(
    212
            __nptr,
    212
            __nptr,
    213
            0 as *mut libc::c_void as *mut *mut libc::c_char,
    213
            0 as *mut libc::c_void as *mut *mut libc::c_char,
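
    The diff above shows that the conversion keeps the explicit terminator, writing it as \u{0} instead of \x00. A small standard-library sketch (the function names are hypothetical) confirms that the two literals hold the same bytes:

```rust
/// The literal as the translator emits it: a byte string with an explicit NUL.
fn translated_bytes() -> &'static [u8] {
    b"1.7320508.406\x00"
}

/// The same text after conversion to an ordinary string literal; the NUL is
/// still present, now spelled as a Unicode escape.
fn converted_str() -> &'static str {
    "1.7320508.406\u{0}"
}
```

    Because the byte content is unchanged, code that later takes a pointer to the string data still sees a NUL-terminated C string.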

    Fixing ver's uses

    ver's initializer is now well-typed, but its uses are still expecting a *mut c_char instead of a &str. To fix these up, we use the type_fix_rules command, which rewrites expressions anywhere a type error occurs:

    type_fix_rules '*, &str, *const __t => __old.as_ptr()' ;
    

    Diff #17

    src/robotfindskitten.rs
    1178
        let mut counter: libc::c_int = 0;
    1178
        let mut counter: libc::c_int = 0;
    1179
        fmt_mvprintw(
    1179
        fmt_mvprintw(
    1180
            0i32,
    1180
            0i32,
    1181
            0i32,
    1181
            0i32,
    1182
            format_args!("robotfindskitten v{:}\n\n", unsafe {
    1182
            format_args!("robotfindskitten v{:}\n\n", unsafe {
    1183
                ::std::ffi::CStr::from_ptr(ver as *const libc::c_char)
    1183
                ::std::ffi::CStr::from_ptr(ver.as_ptr() as *const libc::c_char)
    1184
                    .to_str()
    1184
                    .to_str()
    1185
                    .unwrap()
    1185
                    .unwrap()
    1186
            }),
    1186
            }),
    1187
        );
    1187
        );
    1188
        counter = 0i32;
    1188
        counter = 0i32;

    1251
        let mut dummy: libc::c_char = 0;
    1251
        let mut dummy: libc::c_char = 0;
    1252
        fmt_mvprintw(
    1252
        fmt_mvprintw(
    1253
            0i32,
    1253
            0i32,
    1254
            0i32,
    1254
            0i32,
    1255
            format_args!("robotfindskitten v{:}\n", unsafe {
    1255
            format_args!("robotfindskitten v{:}\n", unsafe {
    1256
                ::std::ffi::CStr::from_ptr(ver as *const libc::c_char)
    1256
                ::std::ffi::CStr::from_ptr(ver.as_ptr() as *const libc::c_char)
    1257
                    .to_str()
    1257
                    .to_str()
    1258
                    .unwrap()
    1258
                    .unwrap()
    1259
            }),
    1259
            }),
    1260
        );
    1260
        );
    1261
        fmt_printw(format_args!(
    1261
        fmt_printw(format_args!(

    Here we run type_fix_rules with only one rule: in any position (*), if an expression has type &str but is expected to have a raw pointer type (*const __t), then append a call to .as_ptr() to the original expression (__old). This turns out to be enough to fix all the errors at uses of ver.
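
    The rewritten use sites can be reproduced with the standard library alone. In this sketch, VER and ver_as_c_str are names invented for the illustration; the body follows the same pattern as the diff above (as_ptr, a cast to *const c_char, then CStr::from_ptr).

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Stand-in for `ver` after its type and initializer have been rewritten.
static VER: &str = "1.7320508.406\u{0}";

/// Views the static as a C string, as the rewritten call sites do.
fn ver_as_c_str() -> &'static CStr {
    // `str::as_ptr` yields `*const u8`; the cast adapts it to `*const c_char`.
    // This is sound only because VER ends in a NUL byte. Without the
    // terminator, `from_ptr` would read past the end of the string data.
    unsafe { CStr::from_ptr(VER.as_ptr() as *const c_char) }
}
```

    This is why the earlier steps preserve the \u{0} at the end of the literal: the .as_ptr() rule passes on a bare pointer with no length, so the consumer still relies on the terminator.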

    Making ver immutable

    Now that all type errors have been corrected, we can finish our refactoring of ver. We make it immutable, then commit the changes.

    select target 'item(ver);' ;
    set_mutability imm ;
    
    commit ;
    

    Diff #18

    src/robotfindskitten.rs
    203
        pub y: libc::c_int,
    203
        pub y: libc::c_int,
    204
        pub color: libc::c_int,
    204
        pub color: libc::c_int,
    205
        pub bold: bool,
    205
        pub bold: bool,
    206
        pub character: libc::c_char,
    206
        pub character: libc::c_char,
    207
    }
    207
    }
    208
    static mut ver: &'static str = "1.7320508.406\u{0}";
    208
    static ver: &'static str = "1.7320508.406\u{0}";
    209
    #[inline]
    209
    #[inline]
    210
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    210
    unsafe extern "C" fn atoi(mut __nptr: *const libc::c_char) -> libc::c_int {
    211
        return strtol(
    211
        return strtol(
    212
            __nptr,
    212
            __nptr,
    213
            0 as *mut libc::c_void as *mut *mut libc::c_char,
    213
            0 as *mut libc::c_void as *mut *mut libc::c_char,

    Static string array - messages

    Aside from ver, robotfindskitten contains a static array of strings, called messages. Like ver, accessing messages is unsafe because each element is a raw *mut c_char pointer and because messages itself is a static mut.

    We rewrite the type and initializer of messages using the same strategy as for ver:

    select target 'item(messages); child(ty); desc(ty);' ;
    rewrite_ty 'marked!(*mut libc::c_char)' "&'static str" ;
    delete_marks target ;
    select target 'item(messages); child(expr); desc(expr);' ;
    rewrite_expr 'marked!(__e as __t)' '__e' ;
    bytestr_to_str ;
    delete_marks target ;
    

    Diff #19

    src/robotfindskitten.rs
    218
    will happen.*/
    218
    will happen.*/
    219
    /*Also, take note that robotfindskitten.c and configure.in
    219
    /*Also, take note that robotfindskitten.c and configure.in
    220
    currently have the version number hardcoded into them, and they
    220
    currently have the version number hardcoded into them, and they
    221
    should reflect MESSAGES. */
    221
    should reflect MESSAGES. */
    222
    /* Watch out for fenceposts.*/
    222
    /* Watch out for fenceposts.*/
    223
    static mut messages: [*mut libc::c_char; 406] = [
    223
    static mut messages: [&'static str; 406] = [
    224
        b"\"I pity the fool who mistakes me for kitten!\", sez Mr. T.\x00" as *const u8
    224
        "\"I pity the fool who mistakes me for kitten!\", sez Mr. T.\u{0}",
    225
            as *const libc::c_char as *mut libc::c_char,
    225
        "That\'s just an old tin can.\u{0}",
    226
        b"That\'s just an old tin can.\x00" as *const u8 as *const libc::c_char as *mut libc::c_char,
    226
        "It\'s an altar to the horse god.\u{0}",
    227
        b"It\'s an altar to the horse god.\x00" as *const u8 as *const libc::c_char
    228
            as *mut libc::c_char,
    229
        b"A box of dancing mechanical pencils. They dance! They sing!\x00" as *const u8
    227
        "A box of dancing mechanical pencils. They dance! They sing!\u{0}",
    230
            as *const libc::c_char as *mut libc::c_char,
    228
        "It\'s an old Duke Ellington record.\u{0}",
    231
        b"It\'s an old Duke Ellington record.\x00" as *const u8 as *const libc::c_char
    232
            as *mut libc::c_char,

    933
            as *mut libc::c_char,
    625
        "It\'s your favorite game -- robotfindscatan!\u{0}",
    934
        b"It\'s your favorite game -- robotfindscatan!\x00" as *const u8 as *const libc::c_char
    626
        "Just a man selling an albatross.\u{0}",
    935
            as *mut libc::c_char,
    627
        "The intermission from a 1930s silent movie.\u{0}",
    936
        b"Just a man selling an albatross.\x00" as *const u8 as *const libc::c_char
    628
        "It\'s an inverted billiard ball!\u{0}",
    937
            as *mut libc::c_char,
    629
        "The spectre of Sherlock Holmes wills you onwards.\u{0}",
    938
        b"The intermission from a 1930s silent movie.\x00" as *const u8 as *const libc::c_char
    939
            as *mut libc::c_char,
    940
        b"It\'s an inverted billiard ball!\x00" as *const u8 as *const libc::c_char
    941
            as *mut libc::c_char,
    942
        b"The spectre of Sherlock Holmes wills you onwards.\x00" as *const u8 as *const libc::c_char
    943
            as *mut libc::c_char,
    944
    ];
    630
    ];
    945
    /*
    631
    /*
    946
     *Function definitions
    632
     *Function definitions
    947
     */
    633
     */
    948
    /*Initialization and setup functions*/
    634
    /*Initialization and setup functions*/

    We use type_fix_rules to fix up the uses of messages, as we did for ver:

    type_fix_rules
        '*, &str, *const __t => __old.as_ptr()'
        '*, &str, *mut __t => __old.as_ptr() as *mut __t' ;
    

    Diff #20

    src/robotfindskitten.rs
    1069
                }
    1069
                }
    1070
                _ => {
    1070
                _ => {
    1071
                    message(
    1071
                    message(
    1072
                        messages[bogus_messages[(*(*screen.offset(check_x as isize))
    1072
                        messages[bogus_messages[(*(*screen.offset(check_x as isize))
    1073
                            .offset(check_y as isize)
    1073
                            .offset(check_y as isize)
    1074
                            - 2i32) as usize] as usize],
    1074
                            - 2i32) as usize] as usize]
    1075
                            .as_ptr() as *mut i8,
    1075
                    );
    1076
                    );
    1076
                }
    1077
                }
    1077
            }
    1078
            }
    1078
            return;
    1079
            return;
    1079
        }
    1080
        }

    Here we needed a second rule for *mut pointers, similar to the one for *const, because robotfindskitten mistakenly declares messages as an array of char* instead of const char*.
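
    The effect of the second rule can be sketched without ncurses. Below, message_len is a hypothetical stand-in for a C-style function that takes a char* but only reads through it, and MESSAGES is a two-element excerpt of the real array; neither name comes from the refactored program.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Two-element excerpt standing in for the 406-element `messages` array.
static MESSAGES: [&str; 2] = [
    "That's just an old tin can.\u{0}",
    "It's an altar to the horse god.\u{0}",
];

/// Stand-in for a C-style consumer declared with `char*` rather than
/// `const char*`. It only reads from the pointer.
unsafe fn message_len(message: *mut c_char) -> usize {
    unsafe { CStr::from_ptr(message as *const c_char) }.to_bytes().len()
}

/// Applies the same fix the `*mut` rule inserts: `.as_ptr() as *mut _`.
fn len_of(index: usize) -> usize {
    // The cast only changes the pointer type. The string data is still
    // immutable, so the callee must never write through this pointer.
    unsafe { message_len(MESSAGES[index].as_ptr() as *mut c_char) }
}
```

    The cast is acceptable here only because the callee never modifies the string. Writing through a pointer obtained this way would be undefined behavior.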

    With all type errors fixed, we can make messages immutable and commit the changes:

    select target 'item(messages);' ;
    set_mutability imm ;
    
    commit ;
    

    Diff #21

    src/robotfindskitten.rs
    218
    will happen.*/
    218
    will happen.*/
    219
    /*Also, take note that robotfindskitten.c and configure.in
    219
    /*Also, take note that robotfindskitten.c and configure.in
    220
    currently have the version number hardcoded into them, and they
    220
    currently have the version number hardcoded into them, and they
    221
    should reflect MESSAGES. */
    221
    should reflect MESSAGES. */
    222
    /* Watch out for fenceposts.*/
    222
    /* Watch out for fenceposts.*/
    223
    static mut messages: [&'static str; 406] = [
    223
    static messages: [&'static str; 406] = [
    224
        "\"I pity the fool who mistakes me for kitten!\", sez Mr. T.\u{0}",
    224
        "\"I pity the fool who mistakes me for kitten!\", sez Mr. T.\u{0}",
    225
        "That\'s just an old tin can.\u{0}",
    225
        "That\'s just an old tin can.\u{0}",
    226
        "It\'s an altar to the horse god.\u{0}",
    226
        "It\'s an altar to the horse god.\u{0}",
    227
        "A box of dancing mechanical pencils. They dance! They sing!\u{0}",
    227
        "A box of dancing mechanical pencils. They dance! They sing!\u{0}",
    228
        "It\'s an old Duke Ellington record.\u{0}",
    228
        "It\'s an old Duke Ellington record.\u{0}",

    Heap allocations

    The screen variable stores a heap-allocated two-dimensional array, represented in C as an int**. In Rust, this becomes *mut *mut c_int, which is unsafe to access. We replace it with CArray<CArray<c_int>>, where CArray is a memory-safe collection type provided by the c2rust_runtime library. CArray is convenient for this purpose because it supports C-style initialization and access patterns (including pointer arithmetic) while still guaranteeing memory safety.
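
    To make the goal concrete, the sketch below shows what bounds-checked access to a two-dimensional grid looks like in safe Rust. It deliberately uses Vec as a stand-in and does not use or reproduce the c2rust_runtime API; new_screen and cell are hypothetical names for this illustration only.

```rust
use std::os::raw::c_int;

/// Stand-in for the `screen` grid: `cols` columns of `lines` cells each,
/// with every cell initialized to -1.
fn new_screen(cols: usize, lines: usize) -> Vec<Vec<c_int>> {
    vec![vec![-1; lines]; cols]
}

/// Checked read: an out-of-range coordinate yields `None` instead of
/// reading whatever memory happens to follow the allocation.
fn cell(screen: &[Vec<c_int>], x: usize, y: usize) -> Option<c_int> {
    screen.get(x)?.get(y).copied()
}
```

    With raw *mut *mut c_int pointers, the equivalent out-of-range read compiles and silently touches unrelated memory. A checked collection turns that case into a value the caller can handle.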

    We actually perform the conversion from *mut to CArray in two steps. First, we replace *mut with the simpler CBlockPtr type, also defined in c2rust_runtime. CBlockPtr provides some limited bounds checking, but otherwise functions much like a raw pointer. It serves as a useful intermediate step, letting us fix up the differences between the raw-pointer and CArray APIs in two stages instead of attempting to do it all at once. Once screen has been fully converted to CBlockPtr<CBlockPtr<c_int>>, we finish the conversion to CArray in the second step.

    As a preliminary, we need to add an import of the c2rust_runtime library:

    select target 'crate;' ;
    create_item 'extern crate c2rust_runtime;' inside ;
    

    Diff #22

    src/robotfindskitten.rs
    5
        non_snake_case,
    5
        non_snake_case,
    6
        non_upper_case_globals,
    6
        non_upper_case_globals,
    7
        unused_mut
    7
        unused_mut
    8
    )]
    8
    )]
    9
    #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
    9
    #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
    10
    extern crate c2rust_runtime;
    10
    extern crate libc;
    11
    extern crate libc;
    11
    extern "C" {
    12
    extern "C" {
    12
        pub type ldat;
    13
        pub type ldat;
    13
        #[no_mangle]
    14
        #[no_mangle]
    14
        fn printf(_: *const libc::c_char, ...) -> libc::c_int;
    15
        fn printf(_: *const libc::c_char, ...) -> libc::c_int;

    Now we can proceed with the actual refactoring.

    Converting to CBlockPtr

    We further break down the transition from *mut *mut c_int to CBlockPtr<CBlockPtr<c_int>> into two steps, first converting the inner pointer (leaving the overall type as *mut CBlockPtr<c_int>) and then the outer. We change the type annotation first, as we did for ver and messages:

    select target 'item(screen); child(ty);' ;
    rewrite_ty 'marked!(*mut *mut __t)'
        '*mut ::c2rust_runtime::CBlockPtr<__t>' ;
    

    Diff #23

    src/robotfindskitten.rs
    759
    /* This array contains our internal representation of the screen. The
    759
    /* This array contains our internal representation of the screen. The
    760
    array is bigger than it needs to be, as we don't need to keep track
    760
    array is bigger than it needs to be, as we don't need to keep track
    761
    of the first few rows of the screen. But that requires making an
    761
    of the first few rows of the screen. But that requires making an
    762
    offset function and using that everywhere. So not right now. */
    762
    offset function and using that everywhere. So not right now. */
    763
    #[no_mangle]
    763
    #[no_mangle]
    764
    pub static mut screen: *mut *mut libc::c_int =
    764
    pub static mut screen: *mut ::c2rust_runtime::CBlockPtr<libc::c_int> =
    765
        0 as *const *mut libc::c_int as *mut *mut libc::c_int;
    765
        0 as *const *mut libc::c_int as *mut *mut libc::c_int;
    766
    #[no_mangle]
    766
    #[no_mangle]
    767
    pub unsafe extern "C" fn initialize_robot() {
    767
    pub unsafe extern "C" fn initialize_robot() {
    768
        robot.x = rand() % (COLS - 1i32) + 1i32;
    768
        robot.x = rand() % (COLS - 1i32) + 1i32;
    769
        robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    769
        robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;

    This introduces type errors, letting us easily find (and fix) related expressions using type_fix_rules:

    type_fix_rules
        'rval, *mut __t, ::c2rust_runtime::CBlockPtr<__u> =>
            unsafe { ::c2rust_runtime::CBlockPtr::from_ptr(__old) }'
        'rval, *mut __t, *mut __u => __old as *mut __u'
        ;
    

    Diff #24

    src/robotfindskitten.rs
    707
       };
    707
       };
    708
        let mut i: libc::c_int = 0i32;
    708
        let mut i: libc::c_int = 0i32;
    709
        screen = malloc(
    709
        screen = malloc(
    710
            (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    710
            (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    711
                .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong),
    711
                .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong),
    712
        ) as *mut *mut libc::c_int;
    712
        ) as *mut *mut libc::c_int as *mut ::c2rust_runtime::CBlockPtr<i32>;
    713
        i = 0i32;
    713
        i = 0i32;
    714
        while i < COLS - 1i32 + 1i32 {
    714
        while i < COLS - 1i32 + 1i32 {
    715
            let ref mut fresh0 = *screen.offset(i as isize);
    715
            let ref mut fresh0 = *screen.offset(i as isize);
    716
            *fresh0 = malloc(
    716
            *fresh0 = unsafe {
    717
                ::c2rust_runtime::CBlockPtr::from_ptr(malloc(
    717
                (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
    718
                    (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
    718
                    .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong),
    719
                        .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong),
    719
            ) as *mut libc::c_int;
    720
                ) as *mut libc::c_int)
    721
            };
    720
            i += 1
    722
            i += 1
    721
        }
    723
        }
    722
        empty.x = -1i32;
    724
        empty.x = -1i32;
    723
        empty.y = -1i32;
    725
        empty.y = -1i32;
    724
        empty.color = 0i32;
    726
        empty.color = 0i32;

    760
    array is bigger than it needs to be, as we don't need to keep track
    762
    array is bigger than it needs to be, as we don't need to keep track
    761
    of the first few rows of the screen. But that requires making an
    763
    of the first few rows of the screen. But that requires making an
    762
    offset function and using that everywhere. So not right now. */
    764
    offset function and using that everywhere. So not right now. */
    763
    #[no_mangle]
    765
    #[no_mangle]
    764
    pub static mut screen: *mut ::c2rust_runtime::CBlockPtr<libc::c_int> =
    766
    pub static mut screen: *mut ::c2rust_runtime::CBlockPtr<libc::c_int> =
    765
        0 as *const *mut libc::c_int as *mut *mut libc::c_int;
    767
        0 as *const *mut libc::c_int as *mut *mut libc::c_int as *mut ::c2rust_runtime::CBlockPtr<i32>;
    766
    #[no_mangle]
    768
    #[no_mangle]
    767
    pub unsafe extern "C" fn initialize_robot() {
    769
    pub unsafe extern "C" fn initialize_robot() {
    768
        robot.x = rand() % (COLS - 1i32) + 1i32;
    770
        robot.x = rand() % (COLS - 1i32) + 1i32;
    769
        robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    771
        robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    770
        robot.character = '#' as i32 as libc::c_char;
    772
        robot.character = '#' as i32 as libc::c_char;

    The first rule provided here handles the later part of screen's initialization, where the program allocates a *mut c_int array (now CBlockPtr<c_int>) for each row of the screen. The second rule handles the earlier part, where it allocates the top-level *mut *mut c_int (now *mut CBlockPtr<c_int>). Both allocations now need a cast, since the type of the rows has changed.
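As a rough illustration of what the two casts do, the sketch below uses a hypothetical `RowPtr` wrapper in place of `CBlockPtr<c_int>`. The wrapper, its `repr(transparent)` layout, and the `wrap_rows` function are assumptions made for this example; they are not taken from `c2rust_runtime`.

```rust
// Hypothetical stand-in for a block pointer type. The `repr(transparent)`
// layout is an assumption of this sketch, not a documented property of
// CBlockPtr.
#[repr(transparent)]
struct RowPtr(*mut i32);

impl RowPtr {
    // Counterpart of the `from_ptr` call inserted by the first rule.
    unsafe fn from_ptr(p: *mut i32) -> RowPtr {
        RowPtr(p)
    }
}

fn wrap_rows() -> i32 {
    let mut a = 7;
    let mut b = 9;
    // Old-style table of raw row pointers; the second slot is filled below.
    let mut rows: Vec<*mut i32> = vec![&mut a as *mut i32, std::ptr::null_mut()];
    // Second rule: the `*mut *mut i32` table pointer is cast to a pointer
    // to the new row type.
    let table = rows.as_mut_ptr() as *mut RowPtr;
    unsafe {
        // First rule: a raw row pointer is wrapped before being stored.
        *table.offset(1) = RowPtr::from_ptr(&mut b);
        *(*table.offset(0)).0 + *(*table.offset(1)).0
    }
}
```

The pointer cast in this sketch is only sound because the wrapper is declared to have the same layout as the raw pointer it replaces.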

    One category of type errors remains: the initialization code tries to dereference the result of offsetting the array pointer, which is not possible directly with the CBlockPtr API. We add the necessary method call using rewrite_expr:

    rewrite_expr
        '*typed!(__e, ::c2rust_runtime::block_ptr::CBlockOffset<__t>)'
        '*__e.as_mut()' ;
    

    Diff #25

    src/robotfindskitten.rs
         empty.character = ' ' as i32 as libc::c_char;
         counter = 0i32;
         while counter <= COLS - 1i32 {
             counter2 = 0i32;
             while counter2 <= LINES - 1i32 {
    -            *(*screen.offset(counter as isize)).offset(counter2 as isize) = -1i32;
    +            *(*screen.offset(counter as isize))
    +                .offset(counter2 as isize)
    +                .as_mut() = -1i32;
                 counter2 += 1
             }
             counter += 1
         }
         counter = 0i32;

         robot.x = rand() % (COLS - 1i32) + 1i32;
         robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
         robot.character = '#' as i32 as libc::c_char;
         robot.color = 0i32;
         robot.bold = 0 != 0i32;
    -    *(*screen.offset(robot.x as isize)).offset(robot.y as isize) = 0i32;
    +    *(*screen.offset(robot.x as isize))
    +        .offset(robot.y as isize)
    +        .as_mut() = 0i32;
     }
     /*Global variables. Bite me, it's fun.*/
     #[no_mangle]
     pub static mut robot: screen_object = screen_object {
         x: 0,

     #[no_mangle]
     pub unsafe extern "C" fn initialize_kitten() {
         loop {
             kitten.x = rand() % (COLS - 1i32) + 1i32;
             kitten.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -        if !(*(*screen.offset(kitten.x as isize)).offset(kitten.y as isize) != -1i32) {
    +        if !(*(*screen.offset(kitten.x as isize))
    +            .offset(kitten.y as isize)
    +            .as_mut()
    +            != -1i32)
    +        {
                 break;
             }
         }
         loop {
             kitten.character = (rand() % (126i32 - '!' as i32 + 1i32) + '!' as i32) as libc::c_char;
             if !(0 == validchar(kitten.character)) {
                 break;
             }
         }
    -    *(*screen.offset(kitten.x as isize)).offset(kitten.y as isize) = 1i32;
    +    *(*screen.offset(kitten.x as isize))
    +        .offset(kitten.y as isize)
    +        .as_mut() = 1i32;
         kitten.color = rand() % 6i32 + 1i32;
         kitten.bold = 0 != if 0 != rand() % 2i32 { 1i32 } else { 0i32 };
     }
     #[no_mangle]
     pub static mut kitten: screen_object = screen_object {

             loop {
                 bogus[counter as usize].x = rand() % (COLS - 1i32) + 1i32;
                 bogus[counter as usize].y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
                 if !(*(*screen.offset(bogus[counter as usize].x as isize))
                     .offset(bogus[counter as usize].y as isize)
    +                .as_mut()
                     != -1i32)
                 {
                     break;
                 }
             }
             *(*screen.offset(bogus[counter as usize].x as isize))
    -            .offset(bogus[counter as usize].y as isize) = counter + 2i32;
    +            .offset(bogus[counter as usize].y as isize)
    +            .as_mut() = counter + 2i32;
             loop {
                 index = (rand() as libc::c_ulong).wrapping_rem(
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong),
                 ) as libc::c_int;

             if !(old_x == robot.x && old_y == robot.y) {
                 if wmove(stdscr, old_y, old_x) == -1i32 {
                 } else {
                     waddch(stdscr, ' ' as i32 as chtype);
                 };
    -            *(*screen.offset(old_x as isize)).offset(old_y as isize) = -1i32;
    +            *(*screen.offset(old_x as isize))
    +                .offset(old_y as isize)
    +                .as_mut() = -1i32;
                 draw(robot);
                 wrefresh(stdscr);
    -            *(*screen.offset(robot.x as isize)).offset(robot.y as isize) = 0i32;
    +            *(*screen.offset(robot.x as isize))
    +                .offset(robot.y as isize)
    +                .as_mut() = 0i32;
                 old_x = robot.x;
                 old_y = robot.y
             }
             input = wgetch(stdscr)
         }

             }
         }
         if check_y < 3i32 || check_y > LINES - 1i32 || check_x < 0i32 || check_x > COLS - 1i32 {
             return;
         }
    -    if *(*screen.offset(check_x as isize)).offset(check_y as isize) != -1i32 {
    +    if *(*screen.offset(check_x as isize))
    +        .offset(check_y as isize)
    +        .as_mut()
    +        != -1i32
    +    {
    -        match *(*screen.offset(check_x as isize)).offset(check_y as isize) {
    +        match *(*screen.offset(check_x as isize))
    +            .offset(check_y as isize)
    +            .as_mut()
    +        {
                 0 => {}
                 1 => {
                     /*We didn't move, or we're stuck in a
                     time warp or something.*/
                     wmove(stdscr, 1i32, 0i32);

                 }
                 _ => {
                     message(
                         messages[bogus_messages[(*(*screen.offset(check_x as isize))
                             .offset(check_y as isize)
    +                        .as_mut()
                             - 2i32) as usize] as usize]
                             .as_ptr() as *mut i8,
                     );
                 }
             }

    Here, the pattern filters for dereferences of CBlockOffset expressions, which result from calling offset on a CBlockPtr, and adds a call to as_mut() before the dereference.
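To see why the extra call is needed, here is a minimal toy model of the pattern. `BlockPtr` and `BlockOffset` are illustrative stand-ins written for this sketch; they are not the real `c2rust_runtime` types, and the bounds-checking behaviour shown is an assumption.

```rust
// Toy stand-ins for illustration only; NOT the real c2rust_runtime types.
struct BlockPtr<T> {
    data: Vec<T>,
}

// Result of `offset`: it records the block and an index, but it is not
// itself dereferenceable, so `*ptr.offset(i)` alone is a type error.
struct BlockOffset<'a, T> {
    block: &'a mut BlockPtr<T>,
    index: usize,
}

impl<T> BlockPtr<T> {
    fn offset(&mut self, i: isize) -> BlockOffset<'_, T> {
        BlockOffset { block: self, index: i as usize }
    }
}

impl<'a, T> BlockOffset<'a, T> {
    // Converts the offset into a plain mutable reference. In this toy
    // version the indexing operation is where the bounds check happens.
    fn as_mut(self) -> &'a mut T {
        let BlockOffset { block, index } = self;
        &mut block.data[index]
    }
}

fn demo_as_mut() -> Vec<i32> {
    let mut row = BlockPtr { data: vec![-1, -1, -1] };
    // Same shape as the rewritten code: offset, then as_mut, then deref.
    *row.offset(1).as_mut() = 0;
    row.data
}
```

In this model, `*row.offset(1) = 0` would not compile, which mirrors the category of type error that the `rewrite_expr` command removes.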

    The conversion of screen to *mut CBlockPtr<c_int> is now complete. The conversion to CBlockPtr<CBlockPtr<c_int>> uses a similar refactoring script:

    select target 'crate; item(screen); child(ty);' ;
    rewrite_ty 'marked!(*mut __t)'
        '::c2rust_runtime::CBlockPtr<__t>' ;
    type_fix_rules
        'rval, *mut __t, ::c2rust_runtime::CBlockPtr<__u> =>
            unsafe { ::c2rust_runtime::CBlockPtr::from_ptr(__old) }'
        'rval, *mut __t, *mut __u => __old as *mut __u'
        ;
    rewrite_expr
        '*typed!(__e, ::c2rust_runtime::block_ptr::CBlockOffset<__t>)'
        '*__e.as_mut()' ;
    

    Diff #26

    src/robotfindskitten.rs
             color: 0,
             bold: false,
             character: 0,
         };
         let mut i: libc::c_int = 0i32;
    -    screen = malloc(
    -        (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    -            .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong),
    -    ) as *mut *mut libc::c_int as *mut ::c2rust_runtime::CBlockPtr<i32>;
    +    screen = unsafe {
    +        ::c2rust_runtime::CBlockPtr::from_ptr(malloc(
    +            (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    +                .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong),
    +        ) as *mut *mut libc::c_int
    +            as *mut ::c2rust_runtime::CBlockPtr<i32>)
    +    };
         i = 0i32;
         while i < COLS - 1i32 + 1i32 {
    -        let ref mut fresh0 = *screen.offset(i as isize);
    +        let ref mut fresh0 = *screen.offset(i as isize).as_mut();
             *fresh0 = unsafe {
                 ::c2rust_runtime::CBlockPtr::from_ptr(malloc(
                     (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
                         .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong),
                 ) as *mut libc::c_int)

         empty.character = ' ' as i32 as libc::c_char;
         counter = 0i32;
         while counter <= COLS - 1i32 {
             counter2 = 0i32;
             while counter2 <= LINES - 1i32 {
    -            *(*screen.offset(counter as isize))
    +            *(*screen.offset(counter as isize).as_mut())
                     .offset(counter2 as isize)
                     .as_mut() = -1i32;
                 counter2 += 1
             }
             counter += 1

     /* This array contains our internal representation of the screen. The
     array is bigger than it needs to be, as we don't need to keep track
     of the first few rows of the screen. But that requires making an
     offset function and using that everywhere. So not right now. */
     #[no_mangle]
    -pub static mut screen: *mut ::c2rust_runtime::CBlockPtr<libc::c_int> =
    -    0 as *const *mut libc::c_int as *mut *mut libc::c_int as *mut ::c2rust_runtime::CBlockPtr<i32>;
    +pub static mut screen: ::c2rust_runtime::CBlockPtr<::c2rust_runtime::CBlockPtr<libc::c_int>> = unsafe {
    +    ::c2rust_runtime::CBlockPtr::from_ptr(
    +        0 as *const *mut libc::c_int as *mut *mut libc::c_int
    +            as *mut ::c2rust_runtime::CBlockPtr<i32>,
    +    )
    +};
     #[no_mangle]
     pub unsafe extern "C" fn initialize_robot() {
         robot.x = rand() % (COLS - 1i32) + 1i32;
         robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
         robot.character = '#' as i32 as libc::c_char;
         robot.color = 0i32;
         robot.bold = 0 != 0i32;
    -    *(*screen.offset(robot.x as isize))
    +    *(*screen.offset(robot.x as isize).as_mut())
             .offset(robot.y as isize)
             .as_mut() = 0i32;
     }
     /*Global variables. Bite me, it's fun.*/
     #[no_mangle]

     #[no_mangle]
     pub unsafe extern "C" fn initialize_kitten() {
         loop {
             kitten.x = rand() % (COLS - 1i32) + 1i32;
             kitten.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -        if !(*(*screen.offset(kitten.x as isize))
    +        if !(*(*screen.offset(kitten.x as isize).as_mut())
                 .offset(kitten.y as isize)
                 .as_mut()
                 != -1i32)
             {
                 break;

             kitten.character = (rand() % (126i32 - '!' as i32 + 1i32) + '!' as i32) as libc::c_char;
             if !(0 == validchar(kitten.character)) {
                 break;
             }
         }
    -    *(*screen.offset(kitten.x as isize))
    +    *(*screen.offset(kitten.x as isize).as_mut())
             .offset(kitten.y as isize)
             .as_mut() = 1i32;
         kitten.color = rand() % 6i32 + 1i32;
         kitten.bold = 0 != if 0 != rand() % 2i32 { 1i32 } else { 0i32 };
     }

                 }
             }
             loop {
                 bogus[counter as usize].x = rand() % (COLS - 1i32) + 1i32;
                 bogus[counter as usize].y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -            if !(*(*screen.offset(bogus[counter as usize].x as isize))
    +            if !(*(*screen.offset(bogus[counter as usize].x as isize).as_mut())
                     .offset(bogus[counter as usize].y as isize)
                     .as_mut()
                     != -1i32)
                 {
                     break;
                 }
             }
    -        *(*screen.offset(bogus[counter as usize].x as isize))
    +        *(*screen.offset(bogus[counter as usize].x as isize).as_mut())
                 .offset(bogus[counter as usize].y as isize)
                 .as_mut() = counter + 2i32;
             loop {
                 index = (rand() as libc::c_ulong).wrapping_rem(
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)

             if !(old_x == robot.x && old_y == robot.y) {
                 if wmove(stdscr, old_y, old_x) == -1i32 {
                 } else {
                     waddch(stdscr, ' ' as i32 as chtype);
                 };
    -            *(*screen.offset(old_x as isize))
    +            *(*screen.offset(old_x as isize).as_mut())
                     .offset(old_y as isize)
                     .as_mut() = -1i32;
                 draw(robot);
                 wrefresh(stdscr);
    -            *(*screen.offset(robot.x as isize))
    +            *(*screen.offset(robot.x as isize).as_mut())
                     .offset(robot.y as isize)
                     .as_mut() = 0i32;
                 old_x = robot.x;
                 old_y = robot.y
             }

             }
         }
         if check_y < 3i32 || check_y > LINES - 1i32 || check_x < 0i32 || check_x > COLS - 1i32 {
             return;
         }
    -    if *(*screen.offset(check_x as isize))
    +    if *(*screen.offset(check_x as isize).as_mut())
             .offset(check_y as isize)
             .as_mut()
             != -1i32
         {
    -        match *(*screen.offset(check_x as isize))
    +        match *(*screen.offset(check_x as isize).as_mut())
                 .offset(check_y as isize)
                 .as_mut()
             {
                 0 => {}
                 1 => {

                     wclrtoeol(stdscr);
                     play_animation(input);
                 }
                 _ => {
                     message(
    -                    messages[bogus_messages[(*(*screen.offset(check_x as isize))
    +                    messages[bogus_messages[(*(*screen.offset(check_x as isize).as_mut())
                             .offset(check_y as isize)
                             .as_mut()
                             - 2i32) as usize] as usize]
                             .as_ptr() as *mut i8,
                     );

    The only change is in the rewrite_ty step.

    There's one last bit of cleanup to perform: now that screen has the desired CBlockPtr<CBlockPtr<c_int>> type, we can rewrite the allocations that initialize it. At this point the allocations use the unsafe malloc function followed by the unsafe CBlockPtr::from_ptr, but we can change that to use the safe CBlockPtr::alloc method instead:

    rewrite_expr 'malloc(__e) as *mut __t as *mut __u' 'malloc(__e) as *mut __u' ;
    rewrite_expr
        '::c2rust_runtime::CBlockPtr::from_ptr(malloc(__e) as *mut __t)'
        '::c2rust_runtime::CBlockPtr::alloc(
            __e as usize / ::std::mem::size_of::<__t>())'
        ;
    

    Diff #27

    src/robotfindskitten.rs
             bold: false,
             character: 0,
         };
         let mut i: libc::c_int = 0i32;
         screen = unsafe {
    -        ::c2rust_runtime::CBlockPtr::from_ptr(malloc(
    +        ::c2rust_runtime::CBlockPtr::alloc(
                 (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    -                .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong),
    -        ) as *mut *mut libc::c_int
    -            as *mut ::c2rust_runtime::CBlockPtr<i32>)
    +                .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong) as usize
    +                / ::std::mem::size_of::<::c2rust_runtime::CBlockPtr<i32>>(),
    +        )
         };
         i = 0i32;
         while i < COLS - 1i32 + 1i32 {
             let ref mut fresh0 = *screen.offset(i as isize).as_mut();
             *fresh0 = unsafe {
    -            ::c2rust_runtime::CBlockPtr::from_ptr(malloc(
    +            ::c2rust_runtime::CBlockPtr::alloc(
                     (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
    -                    .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong),
    -            ) as *mut libc::c_int)
    +                    .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong) as usize
    +                    / ::std::mem::size_of::<libc::c_int>(),
    +            )
             };
             i += 1
         }
         empty.x = -1i32;
         empty.y = -1i32;

    This doesn't remove the unsafe blocks wrapping each allocation - we leave those until the end of our refactoring, when we remove unnecessary unsafe blocks throughout the entire crate at once.
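The replacement expression divides the original `malloc` argument, a byte count, by the element size, which suggests that `alloc` takes an element count. The sketch below checks that arithmetic in isolation; `element_count` and `row_len` are hypothetical helpers written for this example, not part of any library.

```rust
use std::mem::size_of;

// Hypothetical helper mirroring the rewritten argument expression
// `__e as usize / size_of::<__t>()`. It assumes T is not zero-sized.
fn element_count<T>(byte_count: usize) -> usize {
    byte_count / size_of::<T>()
}

fn row_len(lines: i32) -> usize {
    // Same shape as the translated malloc argument: sizeof(int) * LINES.
    let bytes = (size_of::<i32>() as u64).wrapping_mul(lines as u64) as usize;
    element_count::<i32>(bytes)
}
```

Note that for the outer allocation the byte count is computed with the size of `*mut libc::c_int` but divided by the size of `CBlockPtr<i32>`, so the result equals the number of columns only if those two sizes match.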

    At this point, the refactoring of screen to CBlockPtr<CBlockPtr<c_int>> is done, and we can commit the changes:

    commit ;
    

    Diff #28

    src/robotfindskitten.rs
         unused_mut
     )]
     #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
     extern crate c2rust_runtime;
     extern crate libc;
     extern "C" {
         pub type ldat;
         #[no_mangle]
         fn printf(_: *const libc::c_char, ...) -> libc::c_int;

    Converting to CArray

    The CArray and CBlockPtr APIs are deliberately quite similar, which makes this part of the screen refactoring fairly straightforward.

    First, we replace all uses of CBlockPtr with CArray, both in types and in function calls:

    rewrite_ty '::c2rust_runtime::CBlockPtr<__t>' '::c2rust_runtime::CArray<__t>' ;
    rewrite_expr
        '::c2rust_runtime::CBlockPtr::from_ptr'
        '::c2rust_runtime::CArray::from_ptr' ;
    rewrite_expr
        '::c2rust_runtime::CBlockPtr::alloc'
        '::c2rust_runtime::CArray::alloc' ;
    

    Diff #29

    src/robotfindskitten.rs
             bold: false,
             character: 0,
         };
         let mut i: libc::c_int = 0i32;
         screen = unsafe {
    -        ::c2rust_runtime::CBlockPtr::alloc(
    +        ::c2rust_runtime::CArray::alloc(
                 (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
                     .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong) as usize
    -                / ::std::mem::size_of::<::c2rust_runtime::CBlockPtr<i32>>(),
    +                / ::std::mem::size_of::<::c2rust_runtime::CArray<i32>>(),
             )
         };
         i = 0i32;
         while i < COLS - 1i32 + 1i32 {
             let ref mut fresh0 = *screen.offset(i as isize).as_mut();
             *fresh0 = unsafe {
    -            ::c2rust_runtime::CBlockPtr::alloc(
    +            ::c2rust_runtime::CArray::alloc(
                     (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
                         .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong) as usize
                         / ::std::mem::size_of::<libc::c_int>(),
                 )
             };

    767
    /* This array contains our internal representation of the screen. The
    767
    /* This array contains our internal representation of the screen. The
    768
    array is bigger than it needs to be, as we don't need to keep track
    768
    array is bigger than it needs to be, as we don't need to keep track
    769
    of the first few rows of the screen. But that requires making an
    769
    of the first few rows of the screen. But that requires making an
    770
    offset function and using that everywhere. So not right now. */
    770
    offset function and using that everywhere. So not right now. */
    771
    #[no_mangle]
    771
    #[no_mangle]
    772
    pub static mut screen: ::c2rust_runtime::CBlockPtr<::c2rust_runtime::CBlockPtr> = unsafe {
    772
    pub static mut screen: ::c2rust_runtime::CArray<::c2rust_runtime::CArray> = unsafe {
    773
        ::c2rust_runtime::CBlockPtr::from_ptr(
    773
        ::c2rust_runtime::CArray::from_ptr(
    774
            0 as *const *mut libc::c_int as *mut *mut libc::c_int
    774
            0 as *const *mut libc::c_int as *mut *mut libc::c_int as *mut ::c2rust_runtime::CArray<i32>,
    775
                as *mut ::c2rust_runtime::CBlockPtr<i32>,
    776
        )
    775
        )
    777
    };
    776
    };
    778
    #[no_mangle]
    777
    #[no_mangle]
    779
    pub unsafe extern "C" fn initialize_robot() {
    778
    pub unsafe extern "C" fn initialize_robot() {
    780
        robot.x = rand() % (COLS - 1i32) + 1i32;
    779
        robot.x = rand() % (COLS - 1i32) + 1i32;

    Next, we fix up calls to offset. Unlike CBlockPtr (and raw pointers in general), CArray distinguishes between mutable and immutable offset pointers. We handle this by simply replacing all offset calls with offset_mut:

    rewrite_expr
        'typed!(__e, ::c2rust_runtime::CArray<__t>).offset(__f)'
        '__e.offset_mut(__f)' ;
    

    Diff #30

    src/robotfindskitten.rs
    @@ -713,11 +713,11 @@
                     / ::std::mem::size_of::<::c2rust_runtime::CArray<i32>>(),
             )
         };
         i = 0i32;
         while i < COLS - 1i32 + 1i32 {
    -        let ref mut fresh0 = *screen.offset(i as isize).as_mut();
    +        let ref mut fresh0 = *screen.offset_mut(i as isize).as_mut();
             *fresh0 = unsafe {
                 ::c2rust_runtime::CArray::alloc(
                     (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
                         .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong) as usize
                         / ::std::mem::size_of::<i32>(),
    @@ -732,12 +732,12 @@
         empty.character = ' ' as i32 as libc::c_char;
         counter = 0i32;
         while counter <= COLS - 1i32 {
             counter2 = 0i32;
             while counter2 <= LINES - 1i32 {
    -            *(*screen.offset(counter as isize).as_mut())
    -                .offset(counter2 as isize)
    +            *(*screen.offset_mut(counter as isize).as_mut())
    +                .offset_mut(counter2 as isize)
                     .as_mut() = -1i32;
                 counter2 += 1
             }
             counter += 1
         }
    @@ -779,12 +779,12 @@
         robot.x = rand() % (COLS - 1i32) + 1i32;
         robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
         robot.character = '#' as i32 as libc::c_char;
         robot.color = 0i32;
         robot.bold = 0 != 0i32;
    -    *(*screen.offset(robot.x as isize).as_mut())
    -        .offset(robot.y as isize)
    +    *(*screen.offset_mut(robot.x as isize).as_mut())
    +        .offset_mut(robot.y as isize)
             .as_mut() = 0i32;
     }
     /*Global variables. Bite me, it's fun.*/
     #[no_mangle]
     pub static mut robot: screen_object = screen_object {
    @@ -797,12 +797,12 @@
     #[no_mangle]
     pub unsafe extern "C" fn initialize_kitten() {
         loop {
             kitten.x = rand() % (COLS - 1i32) + 1i32;
             kitten.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -        if !(*(*screen.offset(kitten.x as isize).as_mut())
    -            .offset(kitten.y as isize)
    +        if !(*(*screen.offset_mut(kitten.x as isize).as_mut())
    +            .offset_mut(kitten.y as isize)
                 .as_mut()
                 != -1i32)
             {
                 break;
             }
    @@ -811,12 +811,12 @@
             kitten.character = (rand() % (126i32 - '!' as i32 + 1i32) + '!' as i32) as libc::c_char;
             if !(0 == validchar(kitten.character)) {
                 break;
             }
         }
    -    *(*screen.offset(kitten.x as isize).as_mut())
    -        .offset(kitten.y as isize)
    +    *(*screen.offset_mut(kitten.x as isize).as_mut())
    +        .offset_mut(kitten.y as isize)
             .as_mut() = 1i32;
         kitten.color = rand() % 6i32 + 1i32;
         kitten.bold = 0 != if 0 != rand() % 2i32 { 1i32 } else { 0i32 };
     }
     #[no_mangle]
    @@ -852,21 +852,25 @@
                 }
             }
             loop {
                 bogus[counter as usize].x = rand() % (COLS - 1i32) + 1i32;
                 bogus[counter as usize].y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -            if !(*(*screen.offset(bogus[counter as usize].x as isize).as_mut())
    -                .offset(bogus[counter as usize].y as isize)
    -                .as_mut()
    +            if !(*(*screen
    +                .offset_mut(bogus[counter as usize].x as isize)
    +                .as_mut())
    +            .offset_mut(bogus[counter as usize].y as isize)
    +            .as_mut()
                     != -1i32)
                 {
                     break;
                 }
             }
    -        *(*screen.offset(bogus[counter as usize].x as isize).as_mut())
    -            .offset(bogus[counter as usize].y as isize)
    -            .as_mut() = counter + 2i32;
    +        *(*screen
    +            .offset_mut(bogus[counter as usize].x as isize)
    +            .as_mut())
    +        .offset_mut(bogus[counter as usize].y as isize)
    +        .as_mut() = counter + 2i32;
             loop {
                 index = (rand() as libc::c_ulong).wrapping_rem(
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong),
                 ) as libc::c_int;
    @@ -1008,17 +1012,17 @@
             if !(old_x == robot.x && old_y == robot.y) {
                 if wmove(stdscr, old_y, old_x) == -1i32 {
                 } else {
                     waddch(stdscr, ' ' as i32 as chtype);
                 };
    -            *(*screen.offset(old_x as isize).as_mut())
    -                .offset(old_y as isize)
    +            *(*screen.offset_mut(old_x as isize).as_mut())
    +                .offset_mut(old_y as isize)
                     .as_mut() = -1i32;
                 draw(robot);
                 wrefresh(stdscr);
    -            *(*screen.offset(robot.x as isize).as_mut())
    -                .offset(robot.y as isize)
    +            *(*screen.offset_mut(robot.x as isize).as_mut())
    +                .offset_mut(robot.y as isize)
                     .as_mut() = 0i32;
                 old_x = robot.x;
                 old_y = robot.y
             }
             input = wgetch(stdscr)
    @@ -1081,17 +1085,17 @@
             }
         }
         if check_y < 3i32 || check_y > LINES - 1i32 || check_x < 0i32 || check_x > COLS - 1i32 {
             return;
         }
    -    if *(*screen.offset(check_x as isize).as_mut())
    -        .offset(check_y as isize)
    +    if *(*screen.offset_mut(check_x as isize).as_mut())
    +        .offset_mut(check_y as isize)
             .as_mut()
             != -1i32
         {
    -        match *(*screen.offset(check_x as isize).as_mut())
    -            .offset(check_y as isize)
    +        match *(*screen.offset_mut(check_x as isize).as_mut())
    +            .offset_mut(check_y as isize)
                 .as_mut()
             {
                 0 => {}
                 1 => {
                     /*We didn't move, or we're stuck in a
    @@ -1100,12 +1104,12 @@
                     wclrtoeol(stdscr);
                     play_animation(input);
                 }
                 _ => {
                     message(
    -                    messages[bogus_messages[(*(*screen.offset(check_x as isize).as_mut())
    -                        .offset(check_y as isize)
    +                    messages[bogus_messages[(*(*screen.offset_mut(check_x as isize).as_mut())
    +                        .offset_mut(check_y as isize)
                             .as_mut()
                             - 2i32) as usize] as usize]
                             .as_ptr() as *mut i8,
                     );
                 }

    This works fine for robotfindskitten, though in other codebases it may be necessary to properly distinguish mutable and immutable uses of offset.
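
To illustrate why the distinction can matter, here is a minimal sketch built on a hypothetical MiniArray type. It is a Vec-backed stand-in invented for this example, not the actual c2rust_runtime::CArray API, whose real signatures may differ. In the sketch, offset needs only shared access, while offset_mut demands exclusive access:

```rust
/// Hypothetical stand-in for an array wrapper; NOT the c2rust_runtime API.
struct MiniArray<T> {
    data: Vec<T>,
}

impl<T> MiniArray<T> {
    fn new(data: Vec<T>) -> Self {
        MiniArray { data }
    }

    /// Shared access: callable through `&self`, yields a read-only reference.
    fn offset(&self, i: isize) -> &T {
        &self.data[i as usize]
    }

    /// Exclusive access: requires `&mut self`, yields a writable reference.
    fn offset_mut(&mut self, i: isize) -> &mut T {
        &mut self.data[i as usize]
    }
}

/// Read-only use: `offset` is enough, so a shared borrow suffices.
fn is_empty_cell(screen: &MiniArray<i32>, x: isize) -> bool {
    *screen.offset(x) == -1
}

/// Write use: only `offset_mut` hands out a reference we can assign through.
fn place(screen: &mut MiniArray<i32>, x: isize, value: i32) {
    *screen.offset_mut(x) = value;
}
```

A function that only holds a shared reference, like is_empty_cell in this sketch, would stop compiling if its offset call were blindly rewritten to offset_mut. That is the situation in which mutable and immutable uses would have to be told apart.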

    With this change, the code typechecks with screen's new memory-safe type, so we could stop here. However, unlike CBlockPtr, CArray supports array indexing (arr[i]) in place of the convoluted *arr.offset(i).as_mut() syntax. So we perform a simple rewrite to make the code a little easier to read:

    rewrite_expr
        'typed!(__e, ::c2rust_runtime::CArray<__t>).offset_mut(__f).as_mut()'
        '&mut __e[__f as usize]' ;
    rewrite_expr '*&mut __e' '__e' ;
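
As a rough illustration of what these two rewrites accomplish, the sketch below uses hypothetical MiniArray and SlotMut types. Both are stand-ins invented for this example, not the real c2rust_runtime API. Assuming the array type implements Index and IndexMut, the nested offset_mut/as_mut chain and the indexed form write to the same element:

```rust
use std::ops::{Index, IndexMut};

/// Hypothetical stand-in; NOT the real c2rust_runtime::CArray.
struct MiniArray<T> {
    data: Vec<T>,
}

/// Hypothetical handle returned by `offset_mut` in this sketch.
struct SlotMut<'a, T> {
    slot: &'a mut T,
}

impl<'a, T> SlotMut<'a, T> {
    fn as_mut(&mut self) -> &mut T {
        &mut *self.slot
    }
}

impl<T> MiniArray<T> {
    fn offset_mut(&mut self, i: isize) -> SlotMut<'_, T> {
        SlotMut { slot: &mut self.data[i as usize] }
    }
}

impl<T> Index<usize> for MiniArray<T> {
    type Output = T;
    fn index(&self, i: usize) -> &T {
        &self.data[i]
    }
}

impl<T> IndexMut<usize> for MiniArray<T> {
    fn index_mut(&mut self, i: usize) -> &mut T {
        &mut self.data[i]
    }
}

/// Before: the shape of the code prior to the indexing rewrite.
fn set_verbose(screen: &mut MiniArray<MiniArray<i32>>, x: i32, y: i32, v: i32) {
    *(*screen.offset_mut(x as isize).as_mut())
        .offset_mut(y as isize)
        .as_mut() = v;
}

/// After: `&mut e[i as usize]` replaces the chain, then `*&mut e` becomes `e`.
fn set_indexed(screen: &mut MiniArray<MiniArray<i32>>, x: i32, y: i32, v: i32) {
    screen[x as isize as usize][y as isize as usize] = v;
}
```

The redundant "as isize as usize" double cast in set_indexed is deliberate: the first rewrite keeps the original offset argument verbatim and only appends "as usize".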
    

    Diff #31

    src/robotfindskitten.rs
    @@ -713,11 +713,11 @@
                     / ::std::mem::size_of::<::c2rust_runtime::CArray<i32>>(),
             )
         };
         i = 0i32;
         while i < COLS - 1i32 + 1i32 {
    -        let ref mut fresh0 = *screen.offset_mut(i as isize).as_mut();
    +        let ref mut fresh0 = screen[i as isize as usize];
             *fresh0 = unsafe {
                 ::c2rust_runtime::CArray::alloc(
                     (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
                         .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong) as usize
                         / ::std::mem::size_of::<i32>(),
    @@ -732,13 +732,11 @@
         empty.character = ' ' as i32 as libc::c_char;
         counter = 0i32;
         while counter <= COLS - 1i32 {
             counter2 = 0i32;
             while counter2 <= LINES - 1i32 {
    -            *(*screen.offset_mut(counter as isize).as_mut())
    -                .offset_mut(counter2 as isize)
    -                .as_mut() = -1i32;
    +            screen[counter as isize as usize][counter2 as isize as usize] = -1i32;
                 counter2 += 1
             }
             counter += 1
         }
         counter = 0i32;
    @@ -779,13 +777,11 @@
         robot.x = rand() % (COLS - 1i32) + 1i32;
         robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
         robot.character = '#' as i32 as libc::c_char;
         robot.color = 0i32;
         robot.bold = 0 != 0i32;
    -    *(*screen.offset_mut(robot.x as isize).as_mut())
    -        .offset_mut(robot.y as isize)
    -        .as_mut() = 0i32;
    +    screen[robot.x as isize as usize][robot.y as isize as usize] = 0i32;
     }
     /*Global variables. Bite me, it's fun.*/
     #[no_mangle]
     pub static mut robot: screen_object = screen_object {
         x: 0,
    @@ -797,27 +793,21 @@
     #[no_mangle]
     pub unsafe extern "C" fn initialize_kitten() {
         loop {
             kitten.x = rand() % (COLS - 1i32) + 1i32;
             kitten.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -        if !(*(*screen.offset_mut(kitten.x as isize).as_mut())
    -            .offset_mut(kitten.y as isize)
    -            .as_mut()
    -            != -1i32)
    -        {
    +        if !(screen[kitten.x as isize as usize][kitten.y as isize as usize] != -1i32) {
                 break;
             }
         }
         loop {
             kitten.character = (rand() % (126i32 - '!' as i32 + 1i32) + '!' as i32) as libc::c_char;
             if !(0 == validchar(kitten.character)) {
                 break;
             }
         }
    -    *(*screen.offset_mut(kitten.x as isize).as_mut())
    -        .offset_mut(kitten.y as isize)
    -        .as_mut() = 1i32;
    +    screen[kitten.x as isize as usize][kitten.y as isize as usize] = 1i32;
         kitten.color = rand() % 6i32 + 1i32;
         kitten.bold = 0 != if 0 != rand() % 2i32 { 1i32 } else { 0i32 };
     }
     #[no_mangle]
     pub static mut kitten: screen_object = screen_object {
    @@ -852,25 +842,19 @@
                 }
             }
             loop {
                 bogus[counter as usize].x = rand() % (COLS - 1i32) + 1i32;
                 bogus[counter as usize].y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    -            if !(*(*screen
    -                .offset_mut(bogus[counter as usize].x as isize)
    -                .as_mut())
    -            .offset_mut(bogus[counter as usize].y as isize)
    -            .as_mut()
    +            if !(screen[bogus[counter as usize].x as isize as usize]
    +                [bogus[counter as usize].y as isize as usize]
                     != -1i32)
                 {
                     break;
                 }
             }
    -        *(*screen
    -            .offset_mut(bogus[counter as usize].x as isize)
    -            .as_mut())
    -        .offset_mut(bogus[counter as usize].y as isize)
    -        .as_mut() = counter + 2i32;
    +        screen[bogus[counter as usize].x as isize as usize]
    +            [bogus[counter as usize].y as isize as usize] = counter + 2i32;
             loop {
                 index = (rand() as libc::c_ulong).wrapping_rem(
                     (::std::mem::size_of::<[*mut libc::c_char; 406]>() as libc::c_ulong)
                         .wrapping_div(::std::mem::size_of::<*mut libc::c_char>() as libc::c_ulong),
                 ) as libc::c_int;
    @@ -1012,18 +996,14 @@
             if !(old_x == robot.x && old_y == robot.y) {
                 if wmove(stdscr, old_y, old_x) == -1i32 {
                 } else {
                     waddch(stdscr, ' ' as i32 as chtype);
                 };
    -            *(*screen.offset_mut(old_x as isize).as_mut())
    -                .offset_mut(old_y as isize)
    -                .as_mut() = -1i32;
    +            screen[old_x as isize as usize][old_y as isize as usize] = -1i32;
                 draw(robot);
                 wrefresh(stdscr);
    -            *(*screen.offset_mut(robot.x as isize).as_mut())
    -                .offset_mut(robot.y as isize)
    -                .as_mut() = 0i32;
    +            screen[robot.x as isize as usize][robot.y as isize as usize] = 0i32;
                 old_x = robot.x;
                 old_y = robot.y
             }
             input = wgetch(stdscr)
         }
    @@ -1085,32 +1065,24 @@
             }
         }
         if check_y < 3i32 || check_y > LINES - 1i32 || check_x < 0i32 || check_x > COLS - 1i32 {
             return;
         }
    -    if *(*screen.offset_mut(check_x as isize).as_mut())
    -        .offset_mut(check_y as isize)
    -        .as_mut()
    -        != -1i32
    -    {
    -        match *(*screen.offset_mut(check_x as isize).as_mut())
    -            .offset_mut(check_y as isize)
    -            .as_mut()
    -        {
    +    if screen[check_x as isize as usize][check_y as isize as usize] != -1i32 {
    +        match screen[check_x as isize as usize][check_y as isize as usize] {
                 0 => {}
                 1 => {
                     /*We didn't move, or we're stuck in a
                     time warp or something.*/
                     wmove(stdscr, 1i32, 0i32);
                     wclrtoeol(stdscr);
                     play_animation(input);
                 }
                 _ => {
                     message(
    -                    messages[bogus_messages[(*(*screen.offset_mut(check_x as isize).as_mut())
    -                        .offset_mut(check_y as isize)
    -                        .as_mut()
    +                    messages[bogus_messages[(screen[check_x as isize as usize]
    +                        [check_y as isize as usize]
                             - 2i32) as usize] as usize]
                             .as_ptr() as *mut i8,
                     );
                 }
             }

    Finally, we remove unsafety from screen's static initializer. It currently calls CArray::from_ptr(0 as *mut _), which is unsafe because CArray::from_ptr requires its pointer argument to satisfy certain properties. But CArray also provides a safe method specifically for initializing a CArray to null, which we can use instead:

    rewrite_expr
        '::c2rust_runtime::CArray::from_ptr(cast!(0))'
        '::c2rust_runtime::CArray::empty()' ;
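
To show why a dedicated null constructor can be safe while a general pointer constructor cannot, here is a minimal sketch using a hypothetical MiniArray handle. The type, its fields, and the from_raw signature are assumptions made for this example and do not reflect the real c2rust_runtime::CArray implementation:

```rust
use std::ptr;

/// Hypothetical nullable array handle; NOT the real c2rust_runtime::CArray.
struct MiniArray<T> {
    ptr: *mut T,
    len: usize,
}

impl<T> MiniArray<T> {
    /// Unsafe: the caller must guarantee that `ptr` is either null or valid
    /// for `len` elements. The type cannot check this itself.
    const unsafe fn from_raw(ptr: *mut T, len: usize) -> Self {
        MiniArray { ptr, len }
    }

    /// Safe: a null pointer with length zero satisfies the invariant
    /// trivially, so no obligation is placed on the caller.
    const fn empty() -> Self {
        MiniArray { ptr: ptr::null_mut(), len: 0 }
    }

    fn is_empty(&self) -> bool {
        self.ptr.is_null() || self.len == 0
    }
}

/// The initializer itself needs no `unsafe` block once `empty` is used.
static mut SCREEN: MiniArray<i32> = MiniArray::empty();

/// Reading a `static mut` is still unsafe, independent of the initializer.
fn screen_is_empty() -> bool {
    // SAFETY: single-threaded sketch; nothing mutates SCREEN concurrently.
    unsafe { (&*ptr::addr_of!(SCREEN)).is_empty() }
}
```

In this sketch the unsafety that remains comes only from SCREEN being a static mut, not from how its initial value is constructed.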
    

    Diff #32

    src/robotfindskitten.rs
    @@ -765,15 +765,12 @@
     /* This array contains our internal representation of the screen. The
     array is bigger than it needs to be, as we don't need to keep track
     of the first few rows of the screen. But that requires making an
     offset function and using that everywhere. So not right now. */
     #[no_mangle]
    -pub static mut screen: ::c2rust_runtime::CArray<::c2rust_runtime::CArray<i32>> = unsafe {
    -    ::c2rust_runtime::CArray::from_ptr(
    -        0 as *const *mut libc::c_int as *mut *mut libc::c_int as *mut ::c2rust_runtime::CArray<i32>,
    -    )
    -};
    +pub static mut screen: ::c2rust_runtime::CArray<::c2rust_runtime::CArray<i32>> =
    +    unsafe { ::c2rust_runtime::CArray::empty() };
     #[no_mangle]
     pub unsafe extern "C" fn initialize_robot() {
         robot.x = rand() % (COLS - 1i32) + 1i32;
         robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
         robot.character = '#' as i32 as libc::c_char;

    This completes the refactoring of screen, as all raw pointer manipulations have been replaced with safe CArray method calls. The only remaining unsafety arises from the fact that screen is a static mut, which we address in a later refactoring step.

    commit ;
    

    Using the pancurses library

    The pancurses library provides safe wrappers around ncurses APIs. Since the pancurses and ncurses APIs are so similar, we can automatically convert the unsafe ncurses FFI calls in robotfindskitten to safe pancurses calls, avoiding the need to maintain safe wrappers in robotfindskitten itself.

    There are two preliminary steps before we do the actual conversion. First, we must import the pancurses library:

    select target 'crate;' ;
    create_item 'extern crate pancurses;' inside ;
    

    Diff #33

    src/robotfindskitten.rs
    @@ -7,10 +7,11 @@
         unused_mut
     )]
     #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
     extern crate c2rust_runtime;
     extern crate libc;
    +extern crate pancurses;
     extern "C" {
         pub type ldat;
         #[no_mangle]
         fn printf(_: *const libc::c_char, ...) -> libc::c_int;
         #[no_mangle]

    And second, we must create a global variable to store the main pancurses Window:

    select target 'crate;' ;
    create_item 'static mut win: Option<::pancurses::Window> = None;' inside ;
    

    Diff #34

    src/robotfindskitten.rs
    @@ -5,10 +5,11 @@
         non_snake_case,
         non_upper_case_globals,
         unused_mut
     )]
     #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
    +static mut win: Option<::pancurses::Window> = None;
     extern crate c2rust_runtime;
     extern crate libc;
     extern crate pancurses;
     extern "C" {
         pub type ldat;

    pancurses doesn't have an equivalent of the global stdscr window that ncurses provides. Instead, the pancurses initialization function creates an initial Window object that must be passed around to each function that updates the display. We store that initial Window in the global win variable so that it's accessible everywhere that stdscr is used.

    Note that making win a static mut makes it unsafe to access. However, a later refactoring pass will gather up all static muts, including win, and collect them into a stack-allocated struct, at which point accessing win will no longer be unsafe.
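
To make the trade-off concrete, here is a minimal sketch contrasting a `static mut` global with state held in a struct. It does not use the real pancurses crate: `Window` is a hypothetical stand-in that only counts refreshes, and `State`, `refresh_global`, and `refresh_with_state` are illustrative names, not part of the tutorial or of the refactoring tool's output.

```rust
/// Hypothetical stand-in for `pancurses::Window`; it only counts refreshes.
pub struct Window {
    pub refreshes: u32,
}

impl Window {
    pub fn new() -> Window {
        Window { refreshes: 0 }
    }

    pub fn refresh(&mut self) {
        self.refreshes += 1;
    }
}

// Global version: every access needs `unsafe`, because nothing stops two
// callers from touching the static at the same time.
// (Named `WIN` here only to follow Rust's naming convention for statics.)
static mut WIN: Option<Window> = None;

pub fn refresh_global() -> u32 {
    unsafe {
        // Go through a raw pointer rather than referencing the static directly.
        let slot = &mut *std::ptr::addr_of_mut!(WIN);
        let win = slot.get_or_insert_with(Window::new);
        win.refresh();
        win.refreshes
    }
}

/// Hypothetical struct that owns what used to be global state.
pub struct State {
    pub win: Option<Window>,
}

// Struct version: the borrow checker tracks access, so no `unsafe` is needed.
pub fn refresh_with_state(state: &mut State) -> u32 {
    let win = state.win.get_or_insert_with(Window::new);
    win.refresh();
    win.refreshes
}
```

Calling `refresh_with_state(&mut state)` requires the caller to hold a `State` value, which is what makes the later move to a stack-allocated struct possible without `unsafe`.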

    General library calls

    We convert ncurses library calls to pancurses ones in a few stages.

    First, for functions that don't require a window object, we simply replace each ncurses function with its equivalent in the pancurses library:

    rewrite_expr 'nonl' '::pancurses::nonl' ;
    rewrite_expr 'noecho' '::pancurses::noecho' ;
    rewrite_expr 'cbreak' '::pancurses::cbreak' ;
    rewrite_expr 'has_colors' '::pancurses::has_colors' ;
    rewrite_expr 'start_color' '::pancurses::start_color' ;
    rewrite_expr 'endwin' '::pancurses::endwin' ;
    rewrite_expr 'init_pair' '::pancurses::init_pair' ;
    

    Diff #35

    src/robotfindskitten.rs
    638
    #[no_mangle]
    638
    #[no_mangle]
    639
    pub unsafe extern "C" fn initialize_ncurses() {
    639
    pub unsafe extern "C" fn initialize_ncurses() {
    640
        signal(2i32, Some(finish));
    640
        signal(2i32, Some(finish));
    641
        initscr();
    641
        initscr();
    642
        keypad(stdscr, 0 != 1i32);
    642
        keypad(stdscr, 0 != 1i32);
    643
        nonl();
    643
        ::pancurses::nonl();
    644
        intrflush(stdscr, 0 != 0i32);
    644
        intrflush(stdscr, 0 != 0i32);
    645
        noecho();
    645
        ::pancurses::noecho();
    646
        cbreak();
    646
        ::pancurses::cbreak();
    647
        if has_colors() {
    647
        if ::pancurses::has_colors() {
    648
            start_color();
    648
            ::pancurses::start_color();
    649
            init_pair(
    649
            ::pancurses::init_pair(
    650
                0i32 as libc::c_short,
    650
                0i32 as libc::c_short,
    651
                0i32 as libc::c_short,
    651
                0i32 as libc::c_short,
    652
                0i32 as libc::c_short,
    652
                0i32 as libc::c_short,
    653
            );
    653
            );
    654
            init_pair(
    654
            ::pancurses::init_pair(
    655
                2i32 as libc::c_short,
    655
                2i32 as libc::c_short,
    656
                2i32 as libc::c_short,
    656
                2i32 as libc::c_short,
    657
                0i32 as libc::c_short,
    657
                0i32 as libc::c_short,
    658
            );
    658
            );
    659
            init_pair(
    659
            ::pancurses::init_pair(
    660
                1i32 as libc::c_short,
    660
                1i32 as libc::c_short,
    661
                1i32 as libc::c_short,
    661
                1i32 as libc::c_short,
    662
                0i32 as libc::c_short,
    662
                0i32 as libc::c_short,
    663
            );
    663
            );
    664
            init_pair(
    664
            ::pancurses::init_pair(
    665
                6i32 as libc::c_short,
    665
                6i32 as libc::c_short,
    666
                6i32 as libc::c_short,
    666
                6i32 as libc::c_short,
    667
                0i32 as libc::c_short,
    667
                0i32 as libc::c_short,
    668
            );
    668
            );
    669
            init_pair(
    669
            ::pancurses::init_pair(
    670
                7i32 as libc::c_short,
    670
                7i32 as libc::c_short,
    671
                7i32 as libc::c_short,
    671
                7i32 as libc::c_short,
    672
                0i32 as libc::c_short,
    672
                0i32 as libc::c_short,
    673
            );
    673
            );
    674
            init_pair(
    674
            ::pancurses::init_pair(
    675
                5i32 as libc::c_short,
    675
                5i32 as libc::c_short,
    676
                5i32 as libc::c_short,
    676
                5i32 as libc::c_short,
    677
                0i32 as libc::c_short,
    677
                0i32 as libc::c_short,
    678
            );
    678
            );
    679
            init_pair(
    679
            ::pancurses::init_pair(
    680
                4i32 as libc::c_short,
    680
                4i32 as libc::c_short,
    681
                4i32 as libc::c_short,
    681
                4i32 as libc::c_short,
    682
                0i32 as libc::c_short,
    682
                0i32 as libc::c_short,
    683
            );
    683
            );
    684
            init_pair(
    684
            ::pancurses::init_pair(
    685
                3i32 as libc::c_short,
    685
                3i32 as libc::c_short,
    686
                3i32 as libc::c_short,
    686
                3i32 as libc::c_short,
    687
                0i32 as libc::c_short,
    687
                0i32 as libc::c_short,
    688
            );
    688
            );
    689
        };
    689
        };
    690
    }
    690
    }
    691
    unsafe extern "C" fn finish(mut sig: libc::c_int) {
    691
    unsafe extern "C" fn finish(mut sig: libc::c_int) {
    692
        endwin();
    692
        ::pancurses::endwin();
    693
        fmt_printf(format_args!(
    693
        fmt_printf(format_args!(
    694
            "{:}{:}{:}",
    694
            "{:}{:}{:}",
    695
            27i32 as u8 as char, '(' as i32 as u8 as char, 'B' as i32 as u8 as char
    695
            27i32 as u8 as char, '(' as i32 as u8 as char, 'B' as i32 as u8 as char
    696
        ));
    696
        ));
    697
        exit(0i32);
    697
        exit(0i32);

    Next, functions taking a window are replaced with method calls on the static win variable we defined earlier:

    rewrite_expr 'wrefresh(stdscr)' 'win.refresh()' ;
    rewrite_expr 'wrefresh(curscr)' 'win.refresh()' ;
    rewrite_expr 'keypad(stdscr, __bf)' 'win.keypad(__bf)' ;
    rewrite_expr 'wmove(stdscr, __my, __mx)' 'win.mv(__my, __mx)' ;
    rewrite_expr 'wclear(stdscr)' 'win.clear()' ;
    rewrite_expr 'wclrtoeol(stdscr)' 'win.clrtoeol()' ;
    rewrite_expr 'waddch(stdscr, __ch)' 'win.addch(__ch)' ;
    
    rewrite_expr
        'wattr_get(stdscr, __attrs, __pair, __e)'
        '{
            let tmp = win.attrget();
            *__attrs = tmp.0;
            *__pair = tmp.1;
            0
        }' ;
    rewrite_expr
        'wattrset(stdscr, __attrs)'
        'win.attrset(__attrs as ::pancurses::chtype)' ;
    

    Diff #36

    src/robotfindskitten.rs
    637
    /*Initialization and setup functions*/
    637
    /*Initialization and setup functions*/
    638
    #[no_mangle]
    638
    #[no_mangle]
    639
    pub unsafe extern "C" fn initialize_ncurses() {
    639
    pub unsafe extern "C" fn initialize_ncurses() {
    640
        signal(2i32, Some(finish));
    640
        signal(2i32, Some(finish));
    641
        initscr();
    641
        initscr();
    642
        keypad(stdscr, 0 != 1i32);
    642
        win.keypad(0 != 1i32);
    643
        ::pancurses::nonl();
    643
        ::pancurses::nonl();
    644
        intrflush(stdscr, 0 != 0i32);
    644
        intrflush(stdscr, 0 != 0i32);
    645
        ::pancurses::noecho();
    645
        ::pancurses::noecho();
    646
        ::pancurses::cbreak();
    646
        ::pancurses::cbreak();
    647
        if ::pancurses::has_colors() {
    647
        if ::pancurses::has_colors() {

    890
           draw(bogus[counter as usize]);
    890
           draw(bogus[counter as usize]);
    891
            counter += 1
    891
            counter += 1
    892
        }
    892
        }
    893
        draw(kitten);
    893
        draw(kitten);
    894
        draw(robot);
    894
        draw(robot);
    895
        wrefresh(stdscr);
    895
        win.refresh();
    896
    }
    896
    }
    897
    #[no_mangle]
    897
    #[no_mangle]
    898
    pub unsafe extern "C" fn draw(mut o: screen_object) {
    898
    pub unsafe extern "C" fn draw(mut o: screen_object) {
    899
        full_draw(o, 0 != 0i32);
    899
        full_draw(o, 0 != 0i32);
    900
    }
    900
    }

    923
           new |= 1u64 << 14i32 + 8i32
    923
           new |= 1u64 << 14i32 + 8i32
    924
        }
    924
        }
    925
        if o.bold {
    925
        if o.bold {
    926
            new |= 1u64 << 13i32 + 8i32
    926
            new |= 1u64 << 13i32 + 8i32
    927
        }
    927
        }
    928
        wattrset(stdscr, new as libc::c_int);
    928
        win.attrset(new as libc::c_int as ::pancurses::chtype);
    929
        if in_place {
    929
        if in_place {
    930
            fmt_printw(format_args!(
    930
            fmt_printw(format_args!(
    931
                "{:}",
    931
                "{:}",
    932
                o.character as libc::c_int as u8 as char
    932
                o.character as libc::c_int as u8 as char
    933
            ));
    933
            ));

    935
           fmt_mvprintw(
    935
           fmt_mvprintw(
    936
                o.y,
    936
                o.y,
    937
                o.x,
    937
                o.x,
    938
                format_args!("{:}", o.character as libc::c_int as u8 as char),
    938
                format_args!("{:}", o.character as libc::c_int as u8 as char),
    939
            );
    939
            );
    940
            wmove(stdscr, o.y, o.x);
    940
            win.mv(o.y, o.x);
    941
        }
    941
        }
    942
        wattrset(stdscr, old as libc::c_int);
    942
        win.attrset(old as libc::c_int as ::pancurses::chtype);
    943
    }
    943
    }
    944
    #[no_mangle]
    944
    #[no_mangle]
    945
    pub unsafe extern "C" fn instructions() {
    945
    pub unsafe extern "C" fn instructions() {
    946
        let mut dummy: libc::c_char = 0;
    946
        let mut dummy: libc::c_char = 0;
    947
        fmt_mvprintw(
    947
        fmt_mvprintw(

    973
       ));
    973
       ));
    974
        fmt_printw(format_args!(
    974
        fmt_printw(format_args!(
    975
            "the Esc key. See the documentation for more information.\n\n"
    975
            "the Esc key. See the documentation for more information.\n\n"
    976
        ));
    976
        ));
    977
        fmt_printw(format_args!("Press any key to start.\n"));
    977
        fmt_printw(format_args!("Press any key to start.\n"));
    978
        wrefresh(stdscr);
    978
        win.refresh();
    979
        dummy = wgetch(stdscr) as libc::c_char;
    979
        dummy = wgetch(stdscr) as libc::c_char;
    980
        wclear(stdscr);
    980
        win.clear();
    981
    }
    981
    }
    982
    #[no_mangle]
    982
    #[no_mangle]
    983
    pub unsafe extern "C" fn draw_in_place(mut o: screen_object) {
    983
    pub unsafe extern "C" fn draw_in_place(mut o: screen_object) {
    984
        full_draw(o, 0 != 1i32);
    984
        full_draw(o, 0 != 1i32);
    985
    }
    985
    }

    991
       let mut input: libc::c_int = 0;
    991
       let mut input: libc::c_int = 0;
    992
        input = wgetch(stdscr);
    992
        input = wgetch(stdscr);
    993
        while input != 27i32 && input != 'q' as i32 && input != 'Q' as i32 {
    993
        while input != 27i32 && input != 'q' as i32 && input != 'Q' as i32 {
    994
            process_input(input);
    994
            process_input(input);
    995
            if !(old_x == robot.x && old_y == robot.y) {
    995
            if !(old_x == robot.x && old_y == robot.y) {
    996
                if wmove(stdscr, old_y, old_x) == -1i32 {
    996
                if win.mv(old_y, old_x) == -1i32 {
    997
                } else {
    997
                } else {
    998
                    waddch(stdscr, ' ' as i32 as chtype);
    998
                    win.addch(' ' as i32 as chtype);
    999
                };
    999
                };
    1000
                screen[old_x as isize as usize][old_y as isize as usize] = -1i32;
    1000
                screen[old_x as isize as usize][old_y as isize as usize] = -1i32;
    1001
                draw(robot);
    1001
                draw(robot);
    1002
                wrefresh(stdscr);
    1002
                win.refresh();
    1003
                screen[robot.x as isize as usize][robot.y as isize as usize] = 0i32;
    1003
                screen[robot.x as isize as usize][robot.y as isize as usize] = 0i32;
    1004
                old_x = robot.x;
    1004
                old_x = robot.x;
    1005
                old_y = robot.y
    1005
                old_y = robot.y
    1006
            }
    1006
            }
    1007
            input = wgetch(stdscr)
    1007
            input = wgetch(stdscr)
    1008
        }
    1008
        }
    1009
        message(b"Bye!\x00" as *const u8 as *const libc::c_char as *mut libc::c_char);
    1009
        message(b"Bye!\x00" as *const u8 as *const libc::c_char as *mut libc::c_char);
    1010
        wrefresh(stdscr);
    1010
        win.refresh();
    1011
        finish(0i32);
    1011
        finish(0i32);
    1012
    }
    1012
    }
    1013
    #[no_mangle]
    1013
    #[no_mangle]
    1014
    pub unsafe extern "C" fn message(mut message_0: *mut libc::c_char) {
    1014
    pub unsafe extern "C" fn message(mut message_0: *mut libc::c_char) {
    1015
        wmove(stdscr, 1i32, 0i32);
    1015
        win.mv(1i32, 0i32);
    1016
        wclrtoeol(stdscr);
    1016
        win.clrtoeol();
    1017
        fmt_mvprintw(
    1017
        fmt_mvprintw(
    1018
            1i32,
    1018
            1i32,
    1019
            0i32,
    1019
            0i32,
    1020
            format_args!("{:.*}", COLS as usize, unsafe {
    1020
            format_args!("{:.*}", COLS as usize, unsafe {
    1021
                ::std::ffi::CStr::from_ptr(message_0 as *const libc::c_char)
    1021
                ::std::ffi::CStr::from_ptr(message_0 as *const libc::c_char)
    1022
                    .to_str()
    1022
                    .to_str()
    1023
                    .unwrap()
    1023
                    .unwrap()
    1024
            }),
    1024
            }),
    1025
        );
    1025
        );
    1026
        wmove(stdscr, robot.y, robot.x);
    1026
        win.mv(robot.y, robot.x);
    1027
        wrefresh(stdscr);
    1027
        win.refresh();
    1028
    }
    1028
    }
    1029
    #[no_mangle]
    1029
    #[no_mangle]
    1030
    pub unsafe extern "C" fn process_input(mut input: libc::c_int) {
    1030
    pub unsafe extern "C" fn process_input(mut input: libc::c_int) {
    1031
        let mut check_x: libc::c_int = robot.x;
    1031
        let mut check_x: libc::c_int = robot.x;
    1032
        let mut check_y: libc::c_int = robot.y;
    1032
        let mut check_y: libc::c_int = robot.y;
    1033
        match input {
    1033
        match input {
    1034
            12 => {
    1034
            12 => {
    1035
                wrefresh(curscr);
    1035
                win.refresh();
    1036
            }
    1036
            }
    1037
            259 | 107 | 75 | 16 => check_y -= 1,
    1037
            259 | 107 | 75 | 16 => check_y -= 1,
    1038
            262 | 121 | 89 => {
    1038
            262 | 121 | 89 => {
    1039
                check_x -= 1;
    1039
                check_x -= 1;
    1040
                check_y -= 1
    1040
                check_y -= 1

    1070
           match screen[check_x as isize as usize][check_y as isize as usize] {
    1070
           match screen[check_x as isize as usize][check_y as isize as usize] {
    1071
                0 => {}
    1071
                0 => {}
    1072
                1 => {
    1072
                1 => {
    1073
                    /*We didn't move, or we're stuck in a
    1073
                    /*We didn't move, or we're stuck in a
    1074
                    time warp or something.*/
    1074
                    time warp or something.*/
    1075
                    wmove(stdscr, 1i32, 0i32);
    1075
                    win.mv(1i32, 0i32);
    1076
                    wclrtoeol(stdscr);
    1076
                    win.clrtoeol();
    1077
                    play_animation(input);
    1077
                    play_animation(input);
    1078
                }
    1078
                }
    1079
                _ => {
    1079
                _ => {
    1080
                    message(
    1080
                    message(
    1081
                        messages[bogus_messages[(screen[check_x as isize as usize]
    1081
                        messages[bogus_messages[(screen[check_x as isize as usize]

    1093
    #[no_mangle]
    1093
    #[no_mangle]
    1094
    pub unsafe extern "C" fn play_animation(mut input: libc::c_int) {
    1094
    pub unsafe extern "C" fn play_animation(mut input: libc::c_int) {
    1095
        let mut counter: libc::c_int = 0;
    1095
        let mut counter: libc::c_int = 0;
    1096
        counter = 4i32;
    1096
        counter = 4i32;
    1097
        while counter > 0i32 {
    1097
        while counter > 0i32 {
    1098
            if wmove(stdscr, 1i32, 50i32 + counter + 1i32) == -1i32 {
    1098
            if win.mv(1i32, 50i32 + counter + 1i32) == -1i32 {
    1099
            } else {
    1099
            } else {
    1100
                waddch(stdscr, ' ' as i32 as chtype);
    1100
                win.addch(' ' as i32 as chtype);
    1101
            };
    1101
            };
    1102
            wmove(stdscr, 1i32, 50i32 + counter);
    1102
            win.mv(1i32, 50i32 + counter);
    1103
            if input == 0o405i32 || input == 0o402i32 || input == 0o540i32 || input == 0o535i32 {
    1103
            if input == 0o405i32 || input == 0o402i32 || input == 0o540i32 || input == 0o535i32 {
    1104
                draw_in_place(kitten);
    1104
                draw_in_place(kitten);
    1105
            } else {
    1105
            } else {
    1106
                draw_in_place(robot);
    1106
                draw_in_place(robot);
    1107
            }
    1107
            }
    1108
            if wmove(stdscr, 1i32, 50i32 - counter) == -1i32 {
    1108
            if win.mv(1i32, 50i32 - counter) == -1i32 {
    1109
            } else {
    1109
            } else {
    1110
                waddch(stdscr, ' ' as i32 as chtype);
    1110
                win.addch(' ' as i32 as chtype);
    1111
            };
    1111
            };
    1112
            wmove(stdscr, 1i32, 50i32 - counter + 1i32);
    1112
            win.mv(1i32, 50i32 - counter + 1i32);
    1113
            if input == 0o405i32 || input == 0o402i32 || input == 0o540i32 || input == 0o535i32 {
    1113
            if input == 0o405i32 || input == 0o402i32 || input == 0o540i32 || input == 0o535i32 {
    1114
                draw_in_place(robot);
    1114
                draw_in_place(robot);
    1115
            } else {
    1115
            } else {
    1116
                draw_in_place(kitten);
    1116
                draw_in_place(kitten);
    1117
            }
    1117
            }
    1118
            wrefresh(stdscr);
    1118
            win.refresh();
    1119
            sleep(1i32 as libc::c_uint);
    1119
            sleep(1i32 as libc::c_uint);
    1120
            counter -= 1
    1120
            counter -= 1
    1121
        }
    1121
        }
    1122
        wmove(stdscr, 1i32, 0i32);
    1122
        win.mv(1i32, 0i32);
    1123
        waddnstr(
    1123
        waddnstr(
    1124
            stdscr,
    1124
            stdscr,
    1125
            b"You found kitten! Way to go, robot!\x00" as *const u8 as *const libc::c_char,
    1125
            b"You found kitten! Way to go, robot!\x00" as *const u8 as *const libc::c_char,
    1126
            -1i32,
    1126
            -1i32,
    1127
        );
    1127
        );
    1128
        wrefresh(stdscr);
    1128
        win.refresh();
    1129
        finish(0i32);
    1129
        finish(0i32);
    1130
    }
    1130
    }
    1131
    unsafe fn main_0(mut argc: libc::c_int, mut argv: *mut *mut libc::c_char) -> libc::c_int {
    1131
    unsafe fn main_0(mut argc: libc::c_int, mut argv: *mut *mut libc::c_char) -> libc::c_int {
    1132
        if argc == 1i32 {
    1132
        if argc == 1i32 {
    1133
            num_bogus = 20i32
    1133
            num_bogus = 20i32
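
The wattr_get rewrite above replaces a C-style call that writes through out-pointers with a Rust block expression. The following sketch shows that shape in isolation. `Window` here is a hypothetical stand-in whose `attrget` returns a tuple; the field types are assumptions chosen for illustration, and `attr_get_compat` is an invented helper name.

```rust
/// Hypothetical stand-in for `pancurses::Window`; attribute types are assumed.
pub struct Window {
    pub attrs: u32,
    pub pair: i16,
}

impl Window {
    /// Returns the current attributes and color pair as a tuple.
    pub fn attrget(&self) -> (u32, i16) {
        (self.attrs, self.pair)
    }
}

/// Mirrors the shape of the rewritten call: store the tuple fields through
/// the out-parameters, then evaluate to `0` as a C-style success code.
pub fn attr_get_compat(win: &Window, attrs: &mut u32, pair: &mut i16) -> i32 {
    // A block is an expression; its value is the final `0`.
    let status = {
        let tmp = win.attrget();
        *attrs = tmp.0;
        *pair = tmp.1;
        0
    };
    status
}
```

Because the block evaluates to `0`, it can stand in wherever the original call's integer result was used, so surrounding code that checks the return value still type-checks.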

    For simplicity, we write win.f(...) in the rewrite_expr replacement arguments, even though win is actually an Option<Window>, not a Window. Later, we replace win with win.as_ref().unwrap() throughout the crate to correct the resulting type errors.
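
As a small illustration of why that later replacement is needed, the sketch below uses a hypothetical stand-in `Window` type (not the real pancurses one) and an invented helper `width`. Calling a `Window` method directly on an `Option<Window>` does not compile, while going through `as_ref().unwrap()` does.

```rust
/// Hypothetical stand-in for `pancurses::Window`.
pub struct Window {
    pub max_x: i32,
}

impl Window {
    pub fn get_max_x(&self) -> i32 {
        self.max_x
    }
}

pub fn width(win: &Option<Window>) -> i32 {
    // `win.get_max_x()` would be a type error: `Option<Window>` has no such
    // method. `as_ref()` turns `&Option<Window>` into `Option<&Window>`
    // without moving the window out, and `unwrap()` panics if it is `None`.
    win.as_ref().unwrap().get_max_x()
}
```

The `unwrap()` means any use of the window before initialization panics at runtime instead of being caught at compile time.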

    We next replace some ncurses global variables with calls to corresponding pancurses functions:

    rewrite_expr 'LINES' 'win.get_max_y()' ;
    rewrite_expr 'COLS' 'win.get_max_x()' ;
    

    Diff #37

    src/robotfindskitten.rs
    709
       };
    709
       };
    710
        let mut i: libc::c_int = 0i32;
    710
        let mut i: libc::c_int = 0i32;
    711
        screen = unsafe {
    711
        screen = unsafe {
    712
            ::c2rust_runtime::CArray::alloc(
    712
            ::c2rust_runtime::CArray::alloc(
    713
                (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    713
                (::std::mem::size_of::<*mut libc::c_int>() as libc::c_ulong)
    714
                    .wrapping_mul((COLS - 1i32 + 1i32) as libc::c_ulong) as usize
    714
                    .wrapping_mul((win.get_max_x() - 1i32 + 1i32) as libc::c_ulong)
    715
                    as usize
    715
                    / ::std::mem::size_of::<::c2rust_runtime::CArray<i32>>(),
    716
                    / ::std::mem::size_of::<::c2rust_runtime::CArray<i32>>(),
    716
            )
    717
            )
    717
        };
    718
        };
    718
        i = 0i32;
    719
        i = 0i32;
    719
        while i < COLS - 1i32 + 1i32 {
    720
        while i < win.get_max_x() - 1i32 + 1i32 {
    720
            let ref mut fresh0 = screen[i as isize as usize];
    721
            let ref mut fresh0 = screen[i as isize as usize];
    721
            *fresh0 = unsafe {
    722
            *fresh0 = unsafe {
    722
                ::c2rust_runtime::CArray::alloc(
    723
                ::c2rust_runtime::CArray::alloc(
    723
                    (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
    724
                    (::std::mem::size_of::<libc::c_int>() as libc::c_ulong)
    724
                        .wrapping_mul((LINES - 1i32 + 1i32) as libc::c_ulong) as usize
    725
                        .wrapping_mul((win.get_max_y() - 1i32 + 1i32) as libc::c_ulong)
    726
                        as usize
    725
                        / ::std::mem::size_of::<i32>(),
    727
                        / ::std::mem::size_of::<i32>(),
    726
                )
    728
                )
    727
            };
    729
            };
    728
            i += 1
    730
            i += 1
    729
        }
    731
        }

    731
       empty.y = -1i32;
    733
       empty.y = -1i32;
    732
        empty.color = 0i32;
    734
        empty.color = 0i32;
    733
        empty.bold = 0 != 0i32;
    735
        empty.bold = 0 != 0i32;
    734
        empty.character = ' ' as i32 as libc::c_char;
    736
        empty.character = ' ' as i32 as libc::c_char;
    735
        counter = 0i32;
    737
        counter = 0i32;
    736
        while counter <= COLS - 1i32 {
    738
        while counter <= win.get_max_x() - 1i32 {
    737
            counter2 = 0i32;
    739
            counter2 = 0i32;
    738
            while counter2 <= LINES - 1i32 {
    740
            while counter2 <= win.get_max_y() - 1i32 {
    739
                screen[counter as isize as usize][counter2 as isize as usize] = -1i32;
    741
                screen[counter as isize as usize][counter2 as isize as usize] = -1i32;
    740
                counter2 += 1
    742
                counter2 += 1
    741
            }
    743
            }
    742
            counter += 1
    744
            counter += 1
    743
        }
    745
        }

    771
    #[no_mangle]
    773
    #[no_mangle]
    772
    pub static mut screen: ::c2rust_runtime::CArray<::c2rust_runtime::CArray<i32>> =
    774
    pub static mut screen: ::c2rust_runtime::CArray<::c2rust_runtime::CArray<i32>> =
    773
        unsafe { ::c2rust_runtime::CArray::empty() };
    775
        unsafe { ::c2rust_runtime::CArray::empty() };
    774
    #[no_mangle]
    776
    #[no_mangle]
    775
    pub unsafe extern "C" fn initialize_robot() {
    777
    pub unsafe extern "C" fn initialize_robot() {
    776
        robot.x = rand() % (COLS - 1i32) + 1i32;
    778
        robot.x = rand() % (win.get_max_x() - 1i32) + 1i32;
    777
        robot.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    779
        robot.y = rand() % (win.get_max_y() - 1i32 - 3i32 + 1i32) + 3i32;
    778
        robot.character = '#' as i32 as libc::c_char;
    780
        robot.character = '#' as i32 as libc::c_char;
    779
        robot.color = 0i32;
    781
        robot.color = 0i32;
    780
        robot.bold = 0 != 0i32;
    782
        robot.bold = 0 != 0i32;
    781
        screen[robot.x as isize as usize][robot.y as isize as usize] = 0i32;
    783
        screen[robot.x as isize as usize][robot.y as isize as usize] = 0i32;
    782
    }
    784
    }

    790
       character: 0,
    792
       character: 0,
    791
    };
    793
    };
    792
    #[no_mangle]
    794
    #[no_mangle]
    793
    pub unsafe extern "C" fn initialize_kitten() {
    795
    pub unsafe extern "C" fn initialize_kitten() {
    794
        loop {
    796
        loop {
    795
            kitten.x = rand() % (COLS - 1i32) + 1i32;
    797
            kitten.x = rand() % (win.get_max_x() - 1i32) + 1i32;
    796
            kitten.y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    798
            kitten.y = rand() % (win.get_max_y() - 1i32 - 3i32 + 1i32) + 3i32;
    797
            if !(screen[kitten.x as isize as usize][kitten.y as isize as usize] != -1i32) {
    799
            if !(screen[kitten.x as isize as usize][kitten.y as isize as usize] != -1i32) {
    798
                break;
    800
                break;
    799
            }
    801
            }
    800
        }
    802
        }
    801
        loop {
    803
        loop {

    839
               if !(0 == validchar(bogus[counter as usize].character)) {
    841
               if !(0 == validchar(bogus[counter as usize].character)) {
    840
                    break;
    842
                    break;
    841
                }
    843
                }
    842
            }
    844
            }
    843
            loop {
    845
            loop {
    844
                bogus[counter as usize].x = rand() % (COLS - 1i32) + 1i32;
    846
                bogus[counter as usize].x = rand() % (win.get_max_x() - 1i32) + 1i32;
    845
                bogus[counter as usize].y = rand() % (LINES - 1i32 - 3i32 + 1i32) + 3i32;
    847
                bogus[counter as usize].y = rand() % (win.get_max_y() - 1i32 - 3i32 + 1i32) + 3i32;
    846
                if !(screen[bogus[counter as usize].x as isize as usize]
    848
                if !(screen[bogus[counter as usize].x as isize as usize]
    847
                    [bogus[counter as usize].y as isize as usize]
    849
                    [bogus[counter as usize].y as isize as usize]
    848
                    != -1i32)
    850
                    != -1i32)
    849
                {
    851
                {
    850
                    break;
    852
                    break;

    879
                   .to_str()
    881
                   .to_str()
    880
                    .unwrap()
    882
                    .unwrap()
    881
            }),
    883
            }),
    882
        );
    884
        );
    883
        counter = 0i32;
    885
        counter = 0i32;
    884
        while counter <= COLS - 1i32 {
    886
        while counter <= win.get_max_x() - 1i32 {
    885
            fmt_printw(format_args!("{:}", 95i32 as u8 as char));
    887
            fmt_printw(format_args!("{:}", 95i32 as u8 as char));
    886
            counter += 1
    888
            counter += 1
    887
        }
    889
        }
    888
        counter = 0i32;
    890
        counter = 0i32;
    889
        while counter < num_bogus {
    891
        while counter < num_bogus {

    1015
       win.mv(1i32, 0i32);
    1017
       win.mv(1i32, 0i32);
    1016
        win.clrtoeol();
    1018
        win.clrtoeol();
    1017
        fmt_mvprintw(
    1019
        fmt_mvprintw(
    1018
            1i32,
    1020
            1i32,
    1019
            0i32,
    1021
            0i32,
    1020
            format_args!("{:.*}", COLS as usize, unsafe {
    1022
            format_args!("{:.*}", win.get_max_x() as usize, unsafe {
    1021
                ::std::ffi::CStr::from_ptr(message_0 as *const libc::c_char)
    1023
                ::std::ffi::CStr::from_ptr(message_0 as *const libc::c_char)
    1022
                    .to_str()
    1024
                    .to_str()
    1023
                    .unwrap()
    1025
                    .unwrap()
    1024
            }),
    1026
            }),
    1025
        );
    1027
        );

    1061
                       as *mut libc::c_char,
    1063
                       as *mut libc::c_char,
    1062
                );
    1064
                );
    1063
                return;
    1065
                return;
    1064
            }
    1066
            }
    1065
        }
    1067
        }
    1066
        if check_y < 3i32 || check_y > LINES - 1i32 || check_x < 0i32 || check_x > COLS - 1i32 {
    1068
        if check_y < 3i32
    1069
            || check_y > win.get_max_y() - 1i32
    1070
            || check_x < 0i32
    1071
            || check_x > win.get_max_x() - 1i32
    1072
        {
    1067
            return;
    1073
            return;
    1068
        }
    1074
        }
    1069
        if screen[check_x as isize as usize][check_y as isize as usize] != -1i32 {
    1075
        if screen[check_x as isize as usize][check_y as isize as usize] != -1i32 {
    1070
            match screen[check_x as isize as usize][check_y as isize as usize] {
    1076
            match screen[check_x as isize as usize][check_y as isize as usize] {
    1071
                0 => {}
    1077
                0 => {}

    Finally, we handle a few special cases.

    waddnstr takes a string argument, which in general could be any *const c_char. However, robotfindskitten calls it only on string literals, which lets us perform a more specialized rewrite that avoids unsafe C string conversions:

    rewrite_expr
        'waddnstr(stdscr, __str as *const u8 as *const libc::c_char, __n)'
        "win.addnstr(::std::str::from_utf8(__str).unwrap().trim_end_matches('\0'),
                     __n as usize)" ;
    

    Diff #38

    src/robotfindskitten.rs
             win.refresh();
             sleep(1i32 as libc::c_uint);
             counter -= 1
         }
         win.mv(1i32, 0i32);
    -    waddnstr(
    -        stdscr,
    -        b"You found kitten! Way to go, robot!\x00" as *const u8 as *const libc::c_char,
    -        -1i32,
    +    win.addnstr(
    +        ::std::str::from_utf8(b"You found kitten! Way to go, robot!\x00")
    +            .unwrap()
    +            .trim_end_matches('\u{0}'),
    +        -1i32 as usize,
         );
         win.refresh();
         finish(0i32);
     }
     unsafe fn main_0(mut argc: libc::c_int, mut argv: *mut *mut libc::c_char) -> libc::c_int {
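    The string conversion used in this rewrite can be tried in isolation. The following is a minimal sketch, not part of the refactoring script: the helper name literal_to_str is hypothetical, and it assumes the input is a valid UTF-8 byte-string literal ending in a NUL terminator, as the translated C string literals do.

```rust
/// Hypothetical helper: turn a NUL-terminated byte-string literal into a
/// `&str` with the trailing terminator removed, using only safe code.
fn literal_to_str(bytes: &[u8]) -> &str {
    ::std::str::from_utf8(bytes)
        // Panics if the bytes are not valid UTF-8.
        .unwrap()
        // Strips every trailing NUL character, not just one.
        .trim_end_matches('\0')
}

fn main() {
    let s = literal_to_str(b"You found kitten! Way to go, robot!\x00");
    assert_eq!(s, "You found kitten! Way to go, robot!");
    println!("{}", s);
}
```

    Because the conversion goes through from_utf8 rather than a raw pointer cast, no unsafe block is needed at the call site.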

    intrflush has no pancurses equivalent, so we replace it with a no-op of the same type:

    rewrite_expr 'intrflush(__e, __f)' '0' ;
    

    Diff #39

    src/robotfindskitten.rs
     pub unsafe extern "C" fn initialize_ncurses() {
         signal(2i32, Some(finish));
         initscr();
         win.keypad(0 != 1i32);
         ::pancurses::nonl();
    -    intrflush(stdscr, 0 != 0i32);
    +    0;
         ::pancurses::noecho();
         ::pancurses::cbreak();
         if ::pancurses::has_colors() {
             ::pancurses::start_color();
             ::pancurses::init_pair(

    That covers all of the "ordinary" ncurses functions used in robotfindskitten. The remaining subsections cover the more complex cases.

    String formatting

    We previously replaced calls to the ncurses printw and mvprintw string-formatting functions with code using Rust's safe string formatting macros. That removed unsafety from the call sites, but it relied on wrapper functions (fmt_printw and fmt_mvprintw) that call unsafe code internally. Now that we are using the pancurses library, we can replace those wrappers with safer equivalents.

    select target 'item(fmt_printw);' ;
    create_item '
        fn fmt_printw(args: ::std::fmt::Arguments) -> libc::c_int {
            unsafe {
                win.printw(&format!("{}", args))
            }
        }
    ' after ;
    delete_items ;
    clear_marks ;
    
    select target 'item(fmt_mvprintw);' ;
    create_item '
        fn fmt_mvprintw(y: libc::c_int, x: libc::c_int,
                        args: ::std::fmt::Arguments) -> libc::c_int {
            unsafe {
                win.mvprintw(y, x, &format!("{}", args))
            }
        }
    ' after ;
    delete_items ;
    clear_marks ;
    

    Diff #40

    src/robotfindskitten.rs
         unused_mut
     )]
     #![feature(const_raw_ptr_to_usize_cast, extern_types, libc)]
     static mut win: Option<::pancurses::Window> = None;
     extern crate c2rust_runtime;
     extern crate libc;
     extern crate pancurses;
     extern "C" {
         pub type ldat;
         #[no_mangle]

             opts: *mut libc::c_void,
         ) -> libc::c_int;
         fn wattrset(win: *mut WINDOW, attrs: libc::c_int) -> libc::c_int;
     }
     fn fmt_mvprintw(y: libc::c_int, x: libc::c_int, args: ::std::fmt::Arguments) -> libc::c_int {
    -    unsafe {
    -        ::mvprintw(
    -            y,
    -            x,
    -            b"%s\0" as *const u8 as *const libc::c_char,
    -            ::std::ffi::CString::new(format!("{}", args))
    -                .unwrap()
    -                .as_ptr(),
    -        )
    -    }
    +    unsafe { win.mvprintw(y, x, &format!("{}", args)) }
     }
     fn fmt_printw(args: ::std::fmt::Arguments) -> libc::c_int {
    -    unsafe {
    -        ::printw(
    -            b"%s\0" as *const u8 as *const libc::c_char,
    -            ::std::ffi::CString::new(format!("{}", args))
    -                .unwrap()
    -                .as_ptr(),
    -        )
    -    }
    +    unsafe { win.printw(&format!("{}", args)) }
     }
     fn fmt_printf(args: ::std::fmt::Arguments) -> libc::c_int {
         print!("{}", args);
         0
     }

    The wrappers still use unsafe code to access win, a static mut, but no longer make FFI calls or manipulate raw C strings. When we later remove all static muts from the program, these functions will become entirely safe.
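    To illustrate the shape of these wrappers without depending on pancurses or on the static mut, here is a minimal sketch. Everything except the std::fmt::Arguments pattern is an assumption made for illustration: fake_printw and the out buffer are hypothetical stand-ins for win.printw and the terminal window, and the return type is simplified to i32.

```rust
use std::fmt;

// Hypothetical stand-in for `win.printw`: appends the text to a buffer
// instead of drawing to a terminal, and returns a status code.
fn fake_printw(out: &mut String, s: &str) -> i32 {
    out.push_str(s);
    0
}

// Same shape as the rewritten wrapper: render the pre-built format
// arguments into an owned String, then hand a `&str` to the backend.
fn fmt_printw(out: &mut String, args: fmt::Arguments) -> i32 {
    fake_printw(out, &format!("{}", args))
}

fn main() {
    let mut screen = String::new();
    // Call sites build the arguments with `format_args!`.
    let rc = fmt_printw(&mut screen, format_args!("robot at ({}, {})", 3, 7));
    assert_eq!(rc, 0);
    assert_eq!(screen, "robot at (3, 7)");
}
```

    Passing the window (here, the buffer) as an explicit parameter is one way such a wrapper could become entirely safe; it is shown only as a sketch of the idea, not as the approach the later refactoring steps take.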

    Input handling

    Adapting ncurses-based input handling to use pancurses requires some extra care. The pancurses getch function returns a Rust enum, while the ncurses version simply returns an integer. robotfindskitten matches those integers against various ncurses keycode constants, which, after macro expansion, become integer literals in the Rust code.

    The more idiomatic approach would be to replace each integer literal with the matching pancurses::Input enum variant when switching from ncurses getch to the pancurses version. However, we instead take the easier approach of converting pancurses::Input values back to ncurses integer keycodes, so the existing robotfindskitten input handling code can remain unchanged.

    First, we inject a translation function from pancurses to ncurses keycodes:

    select target 'item(initialize_ncurses);' ;
    create_item '
        fn encode_input(inp: Option<::pancurses::Input>) -> libc::c_int {
            use ::pancurses::Input::*;
            let inp = match inp {
                Some(x) => x,
                None => return -1,
            };
            match inp {
                // TODO: unicode inputs in the range 256 .. 512 can
                // collide with ncurses special keycodes
                Character(c) => c as u32 as libc::c_int,
                Unknown(i) => i,
                special => {
                    let idx = ::pancurses::SPECIAL_KEY_CODES.iter()
                        .position(|&k| k == special).unwrap();
                    let code = idx as i32 + ::pancurses::KEY_OFFSET;
                    if code > ::pancurses::KEY_F15 {
                        code + 48
                    } else {
                        code
                    }
                },
            }
        }
    ' after ;
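    The lookup logic in encode_input can be exercised on its own with a trimmed-down model. This is a sketch under stated assumptions, not the pancurses API: the Input enum below is a hypothetical stand-in with only two special keys, and the SPECIAL_KEY_CODES table and the KEY_OFFSET value are made up for illustration. The adjustment for codes above KEY_F15 is omitted because the model has no function keys.

```rust
// Hypothetical, trimmed-down stand-in for `pancurses::Input`.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Input {
    Character(char),
    Unknown(i32),
    KeyDown,
    KeyUp,
}

// Assumed values for illustration only; the real table and offset are
// defined by pancurses.
const SPECIAL_KEY_CODES: [Input; 2] = [Input::KeyDown, Input::KeyUp];
const KEY_OFFSET: i32 = 258;

// Map an optional input event back to an integer keycode.
fn encode_input(inp: Option<Input>) -> i32 {
    match inp {
        // No input available: mirror the -1 error return.
        None => -1,
        // Ordinary characters map to their code point.
        Some(Input::Character(c)) => c as u32 as i32,
        // Unrecognized codes pass through unchanged.
        Some(Input::Unknown(i)) => i,
        // Special keys: position in the table plus the base offset.
        Some(special) => {
            let idx = SPECIAL_KEY_CODES
                .iter()
                .position(|&k| k == special)
                .expect("special key missing from table");
            idx as i32 + KEY_OFFSET
        }
    }
}

fn main() {
    assert_eq!(encode_input(None), -1);
    assert_eq!(encode_input(Some(Input::Character('q'))), 113);
    assert_eq!(encode_input(Some(Input::Unknown(7))), 7);
    assert_eq!(encode_input(Some(Input::KeyDown)), 258);
    assert_eq!(encode_input(Some(Input::KeyUp)), 259);
}
```

    Because the result is a plain integer, existing code that compares keycodes against integer literals can keep working unchanged, which is the point of taking this route.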
    

    Diff #41

    src/robotfindskitten.rs
      *Function definitions
      */
     /*Initialization and setup functions*/
     #[no_mangle]
     pub unsafe extern "C" fn initialize_ncurses() {
         signal(2i32, Some(finish));
         initscr();
         win.keypad(0 != 1i32);
         ::pancurses::nonl();

                 3i32 as libc::c_short,
                 0i32 as libc::c_short,
             );
         };
     }
    +fn encode_input(inp: Option<::pancurses::Input>) -> libc::c_int {
    +    use pancurses::Input::*;
    +    let inp = match inp {
    +        Some(x) => x,
    +        None => return -1,
    +    };
    +    match inp {
    +        // TODO: unicode inputs in the range 256 .. 512 can
    +        // collide with ncurses special keycodes
    +        Character(c) => c as u32 as libc::c_int,
    +        Unknown(i) => i,
    +        special => {
    +            let idx = ::pancurses::SPECIAL_KEY_CODES
    +                .iter()
    +                .position(|&k| k == special)
    +                .unwrap();
    +            let code = idx as i32 + ::pancurses::KEY_OFFSET;
    +            if code > ::pancurses::KEY_F15 {
    +                code + 48
    +            } else {
    +                code
    +            }
    +        }
    +    }
    +}
    675
    unsafe extern "C" fn finish(mut sig: libc::c_int) {
    700
    unsafe extern "C" fn finish(mut sig: libc::c_in