Many refactoring commands in c2rust refactor
are designed to work only on
selected portions of the crate, rather than affecting the entire crate
uniformly. To support this, c2rust refactor
has a mark system, which
allows marking AST nodes (such as functions, expressions, or type annotations)
with simple string labels. Certain commands add or remove marks, while others
check the existing marks to identify nodes to transform.
For example, in a program containing several byte string literals, you can use
select
to mark a specific one:
select target 'item(B2); desc(expr);'
1 | static B1: &'static [u8] = b"123"; | 1 | static B1: &'static [u8] = b"123"; |
2 | static B2: &'static [u8] = b"abc"; | 2 | static B2: &'static [u8] = ▶b"abc"◀; |
3 | static B3: &'static [u8] = b"!!!"; | 3 | static B3: &'static [u8] = b"!!!"; |
Then, you can use bytestr_to_str
to change only the marked byte string to an
ordinary string literal, leaving the others unaffected:
bytestr_to_str
1 | static B1: &'static [u8] = b"123"; | 1 | static B1: &'static [u8] = b"123"; |
2 | static B2: &'static [u8] = ▶b"abc"◀; | 2 | static B2: &'static [u8] = ▶"abc"◀; |
3 | static B3: &'static [u8] = b"!!!"; | 3 | static B3: &'static [u8] = b"!!!"; |
This ability to limit transformations to specific parts of the program is useful for refactoring a large codebase incrementally, on a module-by-module or function-by-function basis.
The remainder of this tutorial describes select
and related mark-manipulation
commands. For details of how marks affect various transformation commands, see
the command documentation or read about the
marked!
pattern for rewrite_expr
and other
pattern-matching commands.
Marks
A "mark" is a short string label that is associated with a node in the AST.
Marks can be applied to nodes of most kinds, including items, expressions,
patterns, type annotations, and so on. The mark string can be any valid Rust
identifier, though most commands that process marks use short words such as
target
, dest
, or new
. It's possible to apply multiple distinct marks to
the same node, and it's also possible to mark children of marked nodes
separately from their parents (for example, to mark an expression and one of
its subexpressions).
Here are some examples.
select target 'crate; desc(match_expr(2 + 2));'
1 | fn f() -> Option<i32> { | 1 | fn f() -> Option<i32> { |
2 | Some(2 + 2) | 2 | Some(▶2 + 2◀) |
3 | } | 3 | } |
4 | 4 | ||
5 | fn g() -> i32 { | 5 | fn g() -> i32 { |
6 | match f() { | 6 | match f() { |
7 | Some(x) => x, | 7 | Some(x) => x, |
8 | None => 0, | 8 | None => 0, |
9 | } | 9 | } |
10 | } | 10 | } |
The ▶ ...
◀ indicators in the diff show
that the expression 2 + 2
has been marked. Hover over the indicators for
more details, such as the label of the added mark.
As mentioned above, most kinds of nodes can be marked, not only expressions. Here we mark a function, a pattern, and a type annotation:
select a 'item(f);' ;
select b 'item(g); desc(match_ty(i32));' ;
select c 'item(g); desc(match_pat(Some(x)));' ;
1 | fn f() -> Option<i32> { | 1 | ▶fn f() -> Option<i32> { |
2 | Some(2 + 2) | 2 | Some(2 + 2) |
3 | } | 3 | }◀ |
4 | 4 | ||
5 | fn g() -> i32 { | 5 | fn g() -> ▶i32◀ { |
6 | match f() { | 6 | match f() { |
7 | Some(x) => x, | 7 | ▶Some(x)◀ => x, |
8 | None => 0, | 8 | None => 0, |
9 | } | 9 | } |
10 | } | 10 | } |
As mentioned above, it's possible to mark the same node twice with different labels. (Marking it twice with the same label is no different from marking it once.) Here's an example of marking a function multiple times:
select a 'item(f);' ;
select a 'item(f);' ;
select b 'item(f);' ;
1 | fn f() -> Option<i32> { | 1 | ▶fn f() -> Option<i32> { |
2 | Some(2 + 2) | 2 | Some(2 + 2) |
3 | } | 3 | }◀ |
4 | 4 | ||
5 | fn g() -> i32 { | 5 | fn g() -> i32 { |
6 | match f() { | 6 | match f() { |
7 | Some(x) => x, | 7 | Some(x) => x, |
8 | None => 0, | 8 | None => 0, |
9 | } | 9 | } |
10 | } | 10 | } |
As you can see by hovering over the indicators, labels a
and b
were both
added to the function f
.
Marks on a node have no connection to marks on its parent or child nodes. We
can, for example, mark an expression like 2 + 2
, then separately mark its
subexpressions with either the same or different labels:
select a 'item(f); desc(match_expr(2 + 2));' ;
select a 'item(f); desc(match_expr(2)); first;' ;
select b 'item(f); desc(match_expr(2)); last;' ;
1 | fn f() -> Option<i32> { | 1 | fn f() -> Option<i32> { |
2 | Some(2 + 2) | 2 | Some(▶▶2◀ + ▶2◀◀) |
3 | } | 3 | } |
4 | 4 | ||
5 | fn g() -> i32 { | 5 | fn g() -> i32 { |
6 | match f() { | 6 | match f() { |
7 | Some(x) => x, | 7 | Some(x) => x, |
8 | None => 0, | 8 | None => 0, |
9 | } | 9 | } |
10 | } | 10 | } |
Hovering over the mark indicators shows precisely what has happened: we marked
both 2 + 2
and the first 2
with the label a
, and marked the second 2
with the label b
.
The select
command
The select
command provides a simple scripting language for applying marks to
specific nodes. The basic syntax of the command is:
select LABEL SCRIPT
select
runs a SCRIPT
(written in the language described below) to obtain a
set of AST nodes, then marks every node in the set with LABEL
, which should
be a single identifier such as target
.
More concretely, when running the script, select
maintains a "current
selection", which is a set of AST nodes. Script operations (described below)
can extend or modify the current selection. At the end of the script, select
marks every node in the current selection with LABEL
.
We next describe a few common select script patterns, followed by details on the available operations and filters.
Common patterns
Selecting an item by path
For items such as functions, type declarations, or traits, the item(path)
operation selects the item by its path:
select target 'item(f);' ;
select target 'item(T);' ;
select target 'item(S);' ;
select target 'item(m::g);' ;
1 | fn f() {} | 1 | ▶fn f() {}◀ |
2 | trait T {} | 2 | ▶trait T {}◀ |
3 | struct S {} | 3 | ▶struct S {}◀ |
4 | mod m { | 4 | mod m { |
5 | fn g() {} | 5 | ▶fn g() {}◀ |
6 | } | 6 | } |
Note that this only works for the kinds of items that can be imported via
use
. It doesn't handle other kinds of item-like nodes, such as impl methods,
which cannot be imported directly.
Selecting all nodes matching a filter
The operations crate; desc(filter);
together select all nodes (or,
equivalently, all descendants of the crate) that match a filter. For example,
we can select all expressions matching the pattern 2 + 2
using a match_expr
filter:
select target 'crate; desc(match_expr(2 + 2));'
1 | fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 2 + 2 | 2 | ▶2 + 2◀ |
3 | } | 3 | } |
4 | 4 | ||
5 | const FOUR: i32 = 2 + 2; | 5 | const FOUR: i32 = ▶2 + 2◀; |
6 | 6 | ||
7 | static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4]; | 7 | static ARRAY: [u8; ▶2 + 2◀] = [1, 2, 3, 4]; |
Here we see that crate; desc(filter);
can find matching items anywhere in the
crate: inside function bodies, constant declarations, and even inside the
length expression of an array type annotation.
Selecting filtered nodes inside a parent node
In the previous example, crate; desc(filter);
is made up of two separate
script operations. crate
selects the entire crate:
select target 'crate;'
1 | fn f() -> i32 { | 1 | ▶fn f() -> i32 { |
2 | 2 + 2 | 2 | 2 + 2 |
3 | } | 3 | } |
4 | 4 | ||
5 | const FOUR: i32 = 2 + 2; | 5 | const FOUR: i32 = 2 + 2; |
6 | 6 | ||
7 | static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4]; | 7 | static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4];◀ |
Then desc(filter)
looks for descendants of selected nodes that match
filter
, and replaces the current selection with the nodes it finds:
clear_marks ;
select target 'crate; desc(match_expr(2 + 2));'
1 | ▶fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 2 + 2 | 2 | ▶2 + 2◀ |
3 | } | 3 | } |
4 | 4 | ||
5 | const FOUR: i32 = 2 + 2; | 5 | const FOUR: i32 = ▶2 + 2◀; |
6 | 6 | ||
7 | static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4];◀ | 7 | static ARRAY: [u8; ▶2 + 2◀] = [1, 2, 3, 4]; |
(Note: we use clear_marks
here only for illustration purposes, to make the
diff clearly show the changes between the old and new versions of our select
command.)
Combining desc
with operations other than crate
allows selecting
descendants of only specific nodes. For example, we can find expressions
matching 2 + 2
, but only within the function f
:
select target 'item(f); desc(match_expr(2 + 2));'
1 | fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 2 + 2 | 2 | ▶2 + 2◀ |
3 | } | 3 | } |
4 | 4 | ||
5 | const FOUR: i32 = 2 + 2; | 5 | const FOUR: i32 = 2 + 2; |
6 | 6 | ||
7 | static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4]; | 7 | static ARRAY: [u8; 2 + 2] = [1, 2, 3, 4]; |
In a more complex example, we can use multiple desc
calls to target an
expression inside of a specific method (recall that methods can't be selected
directly with item(path)
). We first select the module containing the impl:
select target 'item(m);'
1 | fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 2 + 2 | 2 | 2 + 2 |
3 | } | 3 | } |
4 | 4 | ||
5 | mod m { | 5 | ▶mod m { |
6 | struct S; | 6 | struct S; |
7 | impl S { | 7 | impl S { |
8 | fn f(&self) -> i32 { | 8 | fn f(&self) -> i32 { |
9 | 2 + 2 | 9 | 2 + 2 |
10 | } | 10 | } |
11 | } | 11 | } |
12 | } | 12 | }◀ |
Then we select the method of interest, using the name
filter (described
below):
clear_marks ;
select target 'item(m); desc(name("f"));'
1 | fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 2 + 2 | 2 | 2 + 2 |
3 | } | 3 | } |
4 | 4 | ||
5 | ▶mod m { | 5 | mod m { |
6 | struct S; | 6 | struct S; |
7 | impl S { | 7 | impl S { |
8 | fn f(&self) -> i32 { | 8 | ▶fn f(&self) -> i32 { |
9 | 2 + 2 | 9 | 2 + 2 |
10 | } | 10 | }◀ |
11 | } | 11 | } |
12 | }◀ | 12 | } |
And finally, we select the expression inside the method:
clear_marks ;
select target 'item(m); desc(name("f")); desc(match_expr(2 + 2));'
1 | fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 2 + 2 | 2 | 2 + 2 |
3 | } | 3 | } |
4 | 4 | ||
5 | mod m { | 5 | mod m { |
6 | struct S; | 6 | struct S; |
7 | impl S { | 7 | impl S { |
8 | ▶fn f(&self) -> i32 { | 8 | fn f(&self) -> i32 { |
9 | 2 + 2 | 9 | ▶2 + 2◀ |
10 | }◀ | 10 | } |
11 | } | 11 | } |
12 | } | 12 | } |
Combined with some additional filters described below, this approach is quite effective for marking nodes that can't be named with an ordinary import path, such as impl methods or items nested inside functions.
Script operations
A select
script can consist of any number of operations, which will be run in
order to completion. (There is no control flow in select
scripts.) Each
operation ends with a semicolon, much like Rust statements.
The remainder of this section documents each script operation.
crate
crate
(which takes no arguments) adds the root node of the entire crate to
the current selection. All functions, modules, and other declarations are
descendants of this single root node.
Example:
select target 'crate;'
1 | fn f() -> i32 { | 1 | ▶fn f() -> i32 { |
2 | 123 | 2 | 123 |
3 | } | 3 | } |
4 | mod m { | 4 | mod m { |
5 | static S: i32 = 0; | 5 | static S: i32 = 0; |
6 | } | 6 | }◀ |
item
item(p)
adds the item identified by the path p
to the current selection.
The provided path is handled like in Rust's use
declarations (except that
only plain paths are supported, not wildcards or curly-braced blocks).
select target 'item(m::S);'
1 | fn f() -> i32 { | 1 | fn f() -> i32 { |
2 | 123 | 2 | 123 |
3 | } | 3 | } |
4 | mod m { | 4 | mod m { |
5 | static S: i32 = 0; | 5 | ▶static S: i32 = 0;◀ |
6 | } | 6 | } |
Because the item
operation only adds to the current selection (as opposed to
replacing the current selection with a set containing only the identified
item), we can run item
multiple times to select several different items at
once:
select target 'item(f); item(m::S); item(m);'
1 | fn f() -> i32 { | 1 | ▶fn f() -> i32 { |
2 | 123 | 2 | 123 |
3 | } | 3 | }◀ |
4 | mod m { | 4 | ▶mod m { |
5 | static S: i32 = 0; | 5 | ▶static S: i32 = 0;◀ |
6 | } | 6 | }◀ |
child
child(f)
checks each child of each currently selected node against the filter
f
, and replaces the current selection with the set of matching children.
This can be used, for example, to select a static
's type annotation without
selecting type annotations that appear inside its initializer:
select target 'item(S); child(kind(ty));'
1 | static S: i32 = 123_u8 as i32; | 1 | static S: ▶i32◀ = 123_u8 as i32; |
2 | const C: u32 = 0; | 2 | const C: u32 = 0; |
To illustrate how this works, here is the AST for the static S
item:
- item
static S
- identifier
S
(the name of thestatic
) - type
i32
(the type annotation of thestatic
) - expression
123_u8 as i32
(the initializer of thestatic
)- expression
123_u8
(the input of the cast expression) - type
i32
(the target type of the cast expression)
- expression
- identifier
The static
's type annotation is a direct child of the static (and has
kind ty
, matching the kind(ty)
filter), so the type annotation is selected
by the example command above. The target type for the cast is not a direct
child of the static - rather, it's a child of the initializer expression, which
is a child of the static - so it is ignored.
desc
desc(f)
("descendant") checks each descendant of each currently selected node
against the filter f
, and replaces the current selection with the set of
matching descendants. This is similar to child
, but checks for matching
descendants at any depth, not only matching direct children.
Using the same example as for child
, we see that desc
selects more nodes:
select target 'item(S); desc(kind(ty));'
1 | static S: i32 = 123_u8 as i32; | 1 | static S: ▶i32◀ = 123_u8 as ▶i32◀; |
2 | const C: u32 = 0; | 2 | const C: u32 = 0; |
Specifically, it selects both the type annotation of the static
and the
target type of the cast expression, as both are descendants of the static
(though at different depths). Of course, it still does not select the type
annotation of the const C
, which is not a descendant of static S
at any
depth.
Note that desc
only considers the strict descendants of marked nodes - that
is, it does not consider a node to be a "depth-zero" descendant of itself. So,
for example, the following command selects nothing:
select target 'item(S); desc(item_kind(static));'
1 | static S: i32 = 123_u8 as i32; | 1 | static S: i32 = 123_u8 as i32; |
2 | const C: u32 = 0; | 2 | const C: u32 = 0; |
S
itself is a static
, but contains no additional statics inside of it, and
desc
does not consider S
itself when looking for item_kind(static)
descendants.
filter
filter(f)
checks each currently selected node against the filter f
, and
replaces the current selection with the set of matching nodes. Equivalently,
filter(f)
removes from the current selection any nodes that don't match f
.
Most uses of the filter
operation can be replaced by passing a more
appropriate filter expression to desc
or child
, so the examples in this
section are somewhat contrived. (filter
can still be useful in combination
with marked
, described below, or in more complex select scripts.)
Here is a slightly roundabout way to select all items named f
. First, we
select all items:
select target 'crate; desc(kind(item));'
1 | fn f() {} | 1 | ▶fn f() {}◀ |
2 | fn g() {} | 2 | ▶fn g() {}◀ |
3 | 3 | ||
4 | mod m { | 4 | ▶mod m { |
5 | fn f() {} | 5 | ▶fn f() {}◀ |
6 | } | 6 | }◀ |
Then, we use filter
to keep only items named f
:
clear_marks ;
select target 'crate; desc(kind(item)); filter(name("f"));'
1 | ▶fn f() {}◀ | 1 | ▶fn f() {}◀ |
2 | ▶fn g() {}◀ | 2 | fn g() {} |
3 | 3 | ||
4 | ▶mod m { | 4 | mod m { |
5 | ▶fn f() {}◀ | 5 | ▶fn f() {}◀ |
6 | }◀ | 6 | } |
With this command, only descendants of crate matching both filters kind(item)
and name("f")
are selected. (This could be written more simply as crate; desc(kind(item) && name("f"));
.)
first
and last
first
replaces the current selection with a set containing only the first
selected node. last
does the same with the last selected node. "First" and
"last" are determined by a postorder traversal of the AST, so sibling nodes are
ordered as expected, and a parent node come "after" all of its children.
The first
and last
operations are most useful for finding places to insert
new nodes (such as with the create_item
command)
while ignoring details such as the specific names or kinds of the nodes around
the insertion point. For example, we can use last
to easily select the last
item in a module. First, we select all the module's items:
select target 'item(m); child(kind(item));'
1 | mod m { | 1 | mod m { |
2 | fn f() {} | 2 | ▶fn f() {}◀ |
3 | static S: i32 = 0; | 3 | ▶static S: i32 = 0;◀ |
4 | const C: i32 = 1; | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
Then we use last
to select only the last such child:
clear_marks ;
select target 'item(m); child(kind(item)); last;'
1 | mod m { | 1 | mod m { |
2 | ▶fn f() {}◀ | 2 | fn f() {} |
3 | ▶static S: i32 = 0;◀ | 3 | static S: i32 = 0; |
4 | ▶const C: i32 = 1;◀ | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
Now we could use create_item
to insert a new item
after the last existing one.
marked
marked(l)
adds all nodes marked with label l
to the current selection.
This is useful for more complex marking operations, since (together with the
delete_marks
command) it allows using temporary marks to manipulate multiple
sets of nodes simultaneously.
For example, suppose we wish to select both the first and the last item in a
module. Normally, this would require duplicating the select
command, since
both first
and last
replace the entire current selection with the single
first or last item. This would be undesirable if the operations for setting up
the initial set of items were fairly complex. But with marked
, we can save
the selection before running first
and restore it afterward.
We begin by selecting all items in the module and saving that selection by
marking it with the tmp_all_items
label:
select tmp_all_items 'item(m); child(kind(item));'
1 | mod m { | 1 | mod m { |
2 | fn f() {} | 2 | ▶fn f() {}◀ |
3 | static S: i32 = 0; | 3 | ▶static S: i32 = 0;◀ |
4 | const C: i32 = 1; | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
Next, we use marked
to retrieve the tmp_all_items
set and take the first
item from it. This reduces the current selection to only a single item, but
the tmp_all_items
marks remain intact for later use.
select target 'marked(tmp_all_items); first;'
1 | mod m { | 1 | mod m { |
2 | ▶fn f() {}◀ | 2 | ▶fn f() {}◀ |
3 | ▶static S: i32 = 0;◀ | 3 | ▶static S: i32 = 0;◀ |
4 | ▶const C: i32 = 1;◀ | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
We do the same to mark the last item with target
:
select target 'marked(tmp_all_items); last;'
1 | mod m { | 1 | mod m { |
2 | ▶fn f() {}◀ | 2 | ▶fn f() {}◀ |
3 | ▶static S: i32 = 0;◀ | 3 | ▶static S: i32 = 0;◀ |
4 | ▶const C: i32 = 1;◀ | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
Finally, we clean up, removing the tmp_all_items
marks using the
delete_marks
command:
delete_marks tmp_all_items
1 | mod m { | 1 | mod m { |
2 | ▶fn f() {}◀ | 2 | ▶fn f() {}◀ |
3 | ▶static S: i32 = 0;◀ | 3 | static S: i32 = 0; |
4 | ▶const C: i32 = 1;◀ | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
Now the only marks remaining are the target
marks on the first and last items
of the module, as we originally intended.
reset
reset
clears the set of marked nodes. This is only useful in combination
with mark
and unmark
, as otherwise the operations before a reset
have no
effect.
mark
and unmark
These operations allow select
scripts to manipulate marks directly, rather
than relying solely on the automatic marking of selected nodes at the end of
the script. mark(l)
marks all nodes in the current selection with label l
(immediately, rather than waiting until the select
command is finished), and
unmark(l)
removes label l
from all selected nodes.
mark
, unmark
, and reset
can be used to effectively combine multiple
select
commands in a single script. Here's the "first and last" example from
the marked
section, using only a single select
command:
select _dummy '
item(m); child(kind(item)); mark(tmp_all_items); reset;
marked(tmp_all_items); first; mark(target); reset;
marked(tmp_all_items); last; mark(target); reset;
marked(tmp_all_items); unmark(tmp_all_items); reset;
'
1 | mod m { | 1 | mod m { |
2 | fn f() {} | 2 | ▶fn f() {}◀ |
3 | static S: i32 = 0; | 3 | static S: i32 = 0; |
4 | const C: i32 = 1; | 4 | ▶const C: i32 = 1;◀ |
5 | } | 5 | } |
Note that we pass _dummy
as the LABEL
argument of select
, since the
desired target
marks are applied using the mark
operation, rather than
relying on the implicit marking done by select
.
unmark
is also useful in combination with marked
to interface with
non-select
mark manipulation commands. For example, suppose we want to mark
all occurrences of 2 + 2
that are passed as arguments to a function f
. One
option is to do this using the mark_arg_uses
command, with additional
processing by select
before and after. Here we start by marking the function
f
:
select target 'item(f);'
1 | fn f(x: i32) { | 1 | ▶fn f(x: i32) { |
2 | // ... | 2 | // ... |
3 | } | 3 | }◀ |
4 | 4 | ||
5 | fn g(x: i32) { | 5 | fn g(x: i32) { |
6 | // ... | 6 | // ... |
7 | } | 7 | } |
8 | 8 | ||
9 | fn main() { | 9 | fn main() { |
10 | f(1); | 10 | f(1); |
11 | f(2 + 2); | 11 | f(2 + 2); |
12 | g(2 + 2); | 12 | g(2 + 2); |
13 | let x = 2 + 2; | 13 | let x = 2 + 2; |
14 | } | 14 | } |
Next, we run mark_arg_uses
to replace the mark on f
with a mark on each
argument expression passed to f
:
mark_arg_uses 0 target
1 | ▶fn f(x: i32) { | 1 | fn f(x: i32) { |
2 | // ... | 2 | // ... |
3 | }◀ | 3 | } |
4 | 4 | ||
5 | fn g(x: i32) { | 5 | fn g(x: i32) { |
6 | // ... | 6 | // ... |
7 | } | 7 | } |
8 | 8 | ||
9 | fn main() { | 9 | fn main() { |
10 | f(1); | 10 | f(▶1◀); |
11 | f(2 + 2); | 11 | f(▶2 + 2◀); |
12 | g(2 + 2); | 12 | g(2 + 2); |
13 | let x = 2 + 2; | 13 | let x = 2 + 2; |
14 | } | 14 | } |
And finally, we use select
again to mark only those arguments that match 2 + 2
:
select target 'marked(target); unmark(target); filter(match_expr(2 + 2));'
1 | fn f(x: i32) { | 1 | fn f(x: i32) { |
2 | // ... | 2 | // ... |
3 | } | 3 | } |
4 | 4 | ||
5 | fn g(x: i32) { | 5 | fn g(x: i32) { |
6 | // ... | 6 | // ... |
7 | } | 7 | } |
8 | 8 | ||
9 | fn main() { | 9 | fn main() { |
10 | f(▶1◀); | 10 | f(1); |
11 | f(▶2 + 2◀); | 11 | f(▶2 + 2◀); |
12 | g(2 + 2); | 12 | g(2 + 2); |
13 | let x = 2 + 2; | 13 | let x = 2 + 2; |
14 | } | 14 | } |
Beginning the script with marked(target); unmark(target);
copies the set of
target
-marked nodes into the current selection, then removes the existing
marks. The remainder of the script can then operate as usual, manipulating
only the current selection with no need to worry about additional marks being
already present.
Filters
Boolean operators
Filter expressions can be combined using the boolean operators &&
, ||
, and
!
. A node matches the filter f1 && f2
only if it matches f1
and also
matches f2
, and so on.
kind
kind(k)
matches AST nodes whose node kind is k
. The supported node kinds
are:
item
- a top-level item, as instruct Foo { ... }
orfn foo() { ... }
. Includes both items in modules and items defined inside functions or other blocks, but does not include "item-like" nodes inside traits, impls, orextern
blocks.trait_item
- an item inside a trait definition, such as a method or associated type declarationimpl_item
- an item inside an impl block, such as a method or associated type definitionforeign_item
- an item inside anextern block
("foreign module"), such as a C function or static declarationstmt
expr
pat
- a pattern, including single-ident patterns likefoo
inlet foo = ...;
ty
- a type annotation, such asFoo
inlet x: Foo = ...;
arg
- a function or method argument declarationfield
- a struct, enum variant, or union field declarationitemlike
- matches nodes whose kind is any ofitem
,trait_item
,impl_item
, orforeign_item
any
- matches any node
The node kind k
can be used alone as shorthand for kind(k)
. For example,
the operation desc(item);
is the same as desc(kind(item));
.
item_kind
item_kind(k)
matches itemlike AST nodes whose subkind is k
. The itemlike
subkinds are:
extern_crate
use
static
const
fn
mod
foreign_mod
global_asm
ty
- type alias definition, as intype Foo = Bar;
existential
- existential type definition, as inexistential type Foo: Bar;
. Note that existential types are currently an unstable language feature.enum
struct
union
trait
- ordinarytrait Foo { ... }
definition, includingunsafe trait
trait_alias
- trait alias definition, as intrait Foo = Bar;
Note that trait aliases are currently an unstable language feature.impl
- including both trait and inherent implsmac
- macro invocation. Note thatselect
works on the macro-expanded AST, so macro invocations are never present under normal circumstances.macro_def
- 2.0/decl_macro
-style macro definition, as inmacro foo(...) { ... }
. Note that 2.0-style macro definitions are currently an unstable language feature.
Note that a single item_kind
filter can match multiple distinct node kinds,
as long as the subkind is correct. for example, item_kind(fn)
will match
fn
item
s, method trait_item
s and impl_item
s, and fn
declarations
inside extern
blocks (foreign_item
s). similarly, item_kind(ty)
matches
ordinary type
alias definitions, associated type declarations (in traits) and
definitions (in impls), and foreign type declarations inside extern
blocks.
item_kind
filters match only those nodes that also match kind(itemlike)
, as
other node kinds have no itemlike subkind.
The itemlike subkind k
can be used alone as shorthand for item_kind(k)
.
For example, the operation desc(fn);
is the same as desc(item_kind(fn));
.
pub
and mut
pub
matches any item, impl item, or foreign item whose visibility is pub
.
It currently does not support struct fields, even though they can also be
declared pub
.
mut
matches static mut
items, static mut
foreign item declarations, and
mutable binding patterns such as the mut foo
in let mut foo = ...;
.
name
name(re)
matches itemlikes, arguments, and fields whose name matches the
regular expression re
. For example, name("[fF].*")
matches fn f() { ... }
and struct Foo { ... }
, but not trait Bar { ... }
. It currently does not
support general binding patterns, aside from those in function arguments.
path
and path_prefix
path(p)
matches itemlikes and enum variants whose absolute path is p
.
path_prefix(n, p)
is similar to path(p)
, but drops the last n
segments
of the node's path before comparing to p
.
has_attr
has_attr(a)
matches itemlikes, exprs, and field declarations that have an
attribute named a
.
match_*
match_expr(e)
uses rewrite_expr
-style AST matching
to compare exprs to e
, and matches any node where AST matching succeeds. For
example, match_expr(__e + 1)
matches the expressions 1 + 1
, x + 1
, and
f() + 1
, but not 2 + 2
.
match_pat
, match_ty
, and match_stmt
are similar, but operate on pat, ty,
and stmt nodes respectively.
marked
marked(l)
matches nodes that are marked with the label l
.
any_child
, all_child
, any_desc
, and all_desc
any_child(f)
matches nodes that have a child that matches f
.
all_child(f)
matches nodes where all children of the node match f
.
any_desc
and all_desc
are similar, but consider all descendants instead of
only direct children.
Other commands
In addition to select
, c2rust refactor
contains a number of other
mark-manipulation commands. A few of these can be replicated with appropriate
select
scripts (though using the command is typically easier), but some are
more complex.
copy_marks
copy_marks OLD NEW
adds a mark with label NEW
to every node currently
marked with OLD
.
delete_marks
delete_marks OLD
removes the label OLD
from every node that is currently
marked with it.
rename_marks
rename_marks OLD NEW
behaves like copy_marks OLD NEW
followed by
delete_marks OLD
: it adds a mark with label NEW
to every node marked with
OLD
, then removes OLD
from each such node.
mark_uses
mark_uses LABEL
transfers LABEL
marks from definitions to uses. That is,
it finds each definition marked with LABEL
, marks each use of such a
definition with LABEL
, then removes LABEL
from the definitions. For
example, if a static FOO: ... = ...
is marked with target
, then mark_uses target
will add a target
mark to every expression FOO
that references the
marked definition and then remove target
from FOO
itself.
For the purposes of this command, a "use" of a definition is a path or
identifier that resolves to that definition. This includes expressions
(both paths and struct literals), patterns (paths to constants, structs, and
enum variants), and type annotations. When a function definition is marked,
only the function path itself (the foo::bar
in foo::bar(x)
) is considered a
use, not the entire call expression. Method calls (whether using dotted or
UFCS syntax) normally can't be handled at all, as their resolution is
"type-dependent" (however, the mark_callers
command can sometimes work when
mark_uses
does not).
mark_callers
mark_callers LABEL
transfers LABEL
marks from function or method
definitions to uses. That is, it works like mark_uses
, but is specialized to
functions and methods. mark_callers
uses more a more sophisticated means of
name resolution that allows it to detect uses via type-dependent method paths,
which mark_uses
cannot handle.
For purposes of mark_callers
, a "use" is a function call (foo::bar()
) or
method call (x.foo()
) expression where the function or method being called is
one of the marked definitons.
mark_arg_uses
mark_arg_uses INDEX LABEL
transfers LABEL
marks from function or method
definitions to the argument in position INDEX
at each use. That is, it works
like mark_callers
, but marks the expression passed as argument INDEX
instead of the entire call site.
INDEX
is zero-based. However, the self
/receiver argument of a method call
counts as the first argument (index 0), with the first argument in parentheses
having index 1 (arg0.f(arg1, arg2)
). For ordinary function calls (including
UFCS method calls), the first argument has index 0 (f(arg0, arg1, arg2)
)