Move semantics

Value types

In C/C++, values can be classified into two main categories: lvalues and rvalues. An lvalue is a value that has a memory address we can refer to (e.g. variables), and an rvalue is a value that does not (e.g. numeric literals and temporary objects).

Look at the following function definition:

void func(int a) { }

Here, we call it by giving it an rvalue as argument:

func(7);

Below, we call it by giving it an lvalue as argument:

int var = 93;
func(var);

To help us remember the difference between lvalues and rvalues, we may think of an assignment (the = operation): the left operand (lvalue) must have a memory address, while the right operand (rvalue) does not need to have one.

int b = 4; // Ok: The left operand (b) is an lvalue.
7 = b; // Error: The left operand (7) is an rvalue.
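Another way to see the difference is to try to take the address of a value. The sketch below is only an illustration: the function name addressOfVar is made up for this example.

```cpp
int var = 93;

// Returns the address of var. This compiles because var is an lvalue.
int *addressOfVar()
{
    return &var;
}

// The line below would not compile if uncommented: the literal 7 is an
// rvalue, so there is no address to take.
// int *addressOfLiteral() { return &7; }
```

The address-of operator (&) accepts var, but the compiler rejects it when it is applied to the literal 7.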

Rvalue references

Note that rvalue references (and therefore move semantics too) are a feature of C++11.

A normal function argument can receive both lvalues and rvalues.

void func(int a) // Argument by value.
{}

int var = 8;
func(42); // Ok, receives an rvalue.
func(var); // Ok, receives an lvalue.

A function argument that is a (non-constant) reference can only receive lvalues.

void func(int & a) // Argument by reference.
{}

int var = 8;
func(42); // Error: Reference receiving an rvalue.
func(var); // Ok: Receives an lvalue.

Now, is there a way for a function argument to receive only rvalues? Since C++11, yes: by defining the argument as an rvalue reference. We define an argument as an rvalue reference by putting two ampersands (&&) between its type and its name.

void func(int && a) // Argument by rvalue reference.
{}

int var = 8;
func(42); // Ok, receives an rvalue.
func(var); // Error: an rvalue reference argument can not receive an lvalue.
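When a function is overloaded on both kinds of reference, the compiler picks the overload that matches the category of the argument. The following is a minimal sketch of that behaviour; the names which, viaConstRef and passNamed are made up for this illustration.

```cpp
#include <string>

int var = 8;

// Chosen when the argument is an lvalue.
std::string which(int &) { return "lvalue"; }

// Chosen when the argument is an rvalue.
std::string which(int &&) { return "rvalue"; }

// A constant reference accepts both lvalues and rvalues.
std::string viaConstRef(const int &) { return "const reference"; }

// Inside this function, the argument a has a name, so the expression a
// is an lvalue, even though its type is int &&.
std::string passNamed(int && a) { return which(a); }
```

Here, which(42) selects the int && overload and which(var) selects the int & overload. The call passNamed(42) ends up in the int & overload, because a named rvalue reference is itself an lvalue. This is why an explicit cast is needed to pass such an argument on as an rvalue.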

The problem

Before looking at what move semantics are, let us have a look at the problem they resolve.

Earlier, we saw that the default copy constructor and copy assignment operator (copy operator) are problematic when the class they belong to manages dynamically allocated memory. We learned how to resolve that problem by defining them ourselves. In some situations, however, a performance problem remains. Look at the following code:

#include <iostream>
#include <cstdlib> // malloc, free
#include <cstring> // memcpy

class A
{
public:
    A(unsigned nbBytes) : _nbBytes(nbBytes)
    {
        std::cout << "A(" << nbBytes << ")\n";
        _data = (char*)std::malloc(_nbBytes);
    }

    // Copy operator.
    A & operator=(const A & a)
    {
        // Nothing to do when an object is assigned to itself.
        if(this == &a)
            return *this;

        std::cout << "Copy of the data.\n";
        std::free(_data);
        _nbBytes = a._nbBytes;

        if(_nbBytes) // If != 0.
        {
            // Allocates a memory space.
            _data = (char*)std::malloc(_nbBytes);
            // Copies the data from the object to copy.
            std::memcpy(_data, a._data, _nbBytes);
        }
        else
            _data = 0;

        return *this;
    }

    // Copy constructor.
    A(const A & a)
    {
        std::cout << "A(const A &)\n";
        // To avoid the attempt to free an invalid memory space.
        _data = 0;
        // Calls the copy operator.
        (*this) = a;
    }

    ~A()
    {
        std::cout << "~A()\n";
        std::free(_data);
    }

protected:
    char *_data;
    unsigned _nbBytes;
};

int main()
{
    A a(10);
    a = A(40);
    return 0;
}

Above, an object of type A, named a, is created and allocates an array of 10 bytes. Then, a temporary object is created and allocates 40 bytes. The temporary object is assigned to a, so a frees the bytes it allocated, allocates a new memory space big enough to hold the bytes allocated by the temporary object, and copies them. Then, the temporary object is destroyed and the memory it allocated is freed.

Basically, the temporary object allocated memory only for the object a to copy it. Since the object is temporary and is going to be destroyed right after the assignment, we could (and should) simply 'steal' the memory space from it instead of copying it. Now, how do we know whether the object to copy is temporary (and therefore going to be destroyed right after it is copied) or not? Remember that temporary objects are rvalues, and that rvalue references only accept rvalues.

All we have to do is define, inside the class, a special constructor overload (called the move constructor) and a special assignment operator overload (called the move assignment operator, or move operator for short), each taking a single rvalue reference to an object of the class as argument. We then define them so that they 'steal' the dynamically allocated memory space(s) of the object received as argument. To do so, we first make the pointer(s) of the destination object point to the memory space(s) pointed to by the pointer(s) of the object received as argument. Then, we set those pointer(s) in the object received as argument to the address 0, so that when it is destroyed, it does not free the stolen memory space(s).

#include <iostream>
#include <cstdlib> // malloc, free
#include <cstring> // memcpy

class A
{
public:
    A(unsigned nbBytes) : _nbBytes(nbBytes)
    {
        std::cout << "A(" << nbBytes << ")\n";
        _data = (char*)std::malloc(_nbBytes);
    }

    // Copy operator.
    A & operator=(const A & a)
    {
        // Nothing to do when an object is assigned to itself.
        if(this == &a)
            return *this;

        std::cout << "Copy of the data.\n";
        std::free(_data);
        _nbBytes = a._nbBytes;

        if(_nbBytes) // If != 0.
        {
            // Allocates a memory space.
            _data = (char*)std::malloc(_nbBytes);
            // Copies the data from the object to copy.
            std::memcpy(_data, a._data, _nbBytes);
        }
        else
            _data = 0;

        return *this;
    }

    // Copy constructor.
    A(const A & a)
    {
        std::cout << "A(const A &)\n";
        // To avoid the attempt to free an invalid memory space.
        _data = 0;
        // Calls the copy operator.
        (*this) = a;
    }

    // Move operator.
    A & operator=(A && a)
    {
        std::cout << "A & operator=(A &&)\n";

        // Nothing to do when an object is moved into itself.
        if(this == &a)
            return *this;

        std::cout << "Object moved. (Memory space stolen).\n";

        // The memory was allocated with malloc, so it is released with free.
        std::free(_data);

        // Steals its dynamically allocated memory space.
        _data = a._data;
        _nbBytes = a._nbBytes;

        /* Makes the pointer of the object received in argument
           point to the address 0, so it does not free the stolen
           memory space. */
        a._data = 0;
        a._nbBytes = 0;

        return (*this);
    }

    // Move constructor.
    A(A && a)
    {
        std::cout << "A(A &&)\n";
        // To avoid the attempt to free an invalid memory space.
        _data = 0;
        /* Calls the move operator. We explicitly cast the argument
           into an rvalue, to make sure the move operator is used
           rather than the copy operator. */
        (*this) = (A&&)a;
    }

    ~A()
    {
        std::cout << "~A()\n";
        std::free(_data);
    }

protected:
    char *_data;
    unsigned _nbBytes;
};

int main()
{
    A a(10);
    a = A(40);
    return 0;
}

Now, when an object of type A is assigned from, or constructed from, a temporary object, it steals the array of the temporary object rather than copying it. The content of the temporary object is thus 'relocated' (moved) into the destination object rather than copied. This is why they are called the move constructor and the move operator.
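To check which overload the compiler selects, we can count the calls instead of printing messages. The sketch below uses a made-up type Tracker that owns no memory at all; its counters and the two helper functions are assumptions introduced only for this illustration.

```cpp
// Hypothetical type that only counts how it is copied and moved.
struct Tracker
{
    static int copies;
    static int moves;

    Tracker() {}
    Tracker(const Tracker &) { ++copies; }
    Tracker(Tracker &&) { ++moves; }
    Tracker & operator=(const Tracker &) { ++copies; return *this; }
    Tracker & operator=(Tracker &&) { ++moves; return *this; }
};

int Tracker::copies = 0;
int Tracker::moves = 0;

// Assigns a temporary object and returns the number of moves performed.
int movesWhenAssigningTemporary()
{
    Tracker::copies = Tracker::moves = 0;
    Tracker t;
    t = Tracker(); // The right operand is an rvalue: move operator.
    return Tracker::moves;
}

// Assigns a named object and returns the number of copies performed.
int copiesWhenAssigningNamed()
{
    Tracker::copies = Tracker::moves = 0;
    Tracker t;
    Tracker other;
    t = other; // The right operand is an lvalue: copy operator.
    return Tracker::copies;
}
```

Assigning a temporary goes through the move operator only, while assigning a named object goes through the copy operator only.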

Now, this is better, but there is still a problem. Look at the following main function (Using the class A defined above):

int main()
{
    A * ptr = new A(50);

    A a(*ptr);

    delete ptr;

    return 0;
}

Even though the object pointed to by ptr is deleted right after it is given to the constructor of the object a, the copy constructor is called rather than the move constructor, because *ptr is an lvalue. The object pointed to by ptr is therefore copied instead of 'moved'. How do we manually make the move constructor be used instead of the copy constructor? By explicitly casting the object pointed to by ptr into an rvalue when we give it to the constructor of a.

int main()
{
    A * ptr = new A(50);

    A a((A&&)*ptr);

    delete ptr;

    return 0;
}

Note that it is also possible to explicitly cast an lvalue into an rvalue by using the function std::move, declared in the header file <utility>:

A a(std::move(*ptr));
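The effect of std::move on a named object can be observed by comparing addresses before and after the move. The following is a minimal sketch, assuming a made-up class Buffer (a cut-down relative of the class A above, with accessors added and copying disabled for brevity) and a made-up helper moveKeepsSameBlock.

```cpp
#include <cstdlib> // std::malloc, std::free
#include <utility> // std::move

// Hypothetical minimal class, used only for this illustration.
class Buffer
{
public:
    explicit Buffer(unsigned nbBytes)
        : _data((char*)std::malloc(nbBytes)), _nbBytes(nbBytes) {}

    // Move constructor: takes over the memory of the other object.
    Buffer(Buffer && other) : _data(other._data), _nbBytes(other._nbBytes)
    {
        other._data = 0;
        other._nbBytes = 0;
    }

    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer &) = delete;
    Buffer & operator=(const Buffer &) = delete;

    ~Buffer() { std::free(_data); }

    const char *data() const { return _data; }
    unsigned size() const { return _nbBytes; }

private:
    char *_data;
    unsigned _nbBytes;
};

// Moves a named object and reports whether the new object ended up
// with the very same memory block, leaving the source empty.
bool moveKeepsSameBlock()
{
    Buffer source(50);
    const char *block = source.data();

    // source is an lvalue; std::move casts it into an rvalue.
    Buffer destination(std::move(source));

    return destination.data() == block
        && destination.size() == 50
        && source.data() == 0
        && source.size() == 0;
}
```

Note that std::move does not move anything by itself: it only performs the cast, and the move constructor does the actual work. After the move, the object source still exists, but it no longer owns any memory, so it should not be used as if it still held its data.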