Move Semantics in Modern C++

2018/9/15 posted in C++

C++11: rvalue, rvalue reference, std::move, std::forward

Move semantics allows an object, under certain conditions, to take ownership of another object's external (heap) resources. This helps turn expensive copies into cheap moves.

Move semantics offers no advantage if the object doesn't own external resources.

class Foo {
    int a;
    char b[64];
};

As above, a Foo instance owns no external resources: a and the array b are stored inside the object itself. Moving a Foo therefore means copying a and b, so this class cannot benefit from move semantics.
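
The claim can be checked with type traits. The following is a minimal sketch, not part of the original example: Foo is written as a struct so its members are accessible, and the helper source_unchanged_after_move is a hypothetical name introduced for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <utility>

struct Foo {
    int a;
    char b[64];
};

// Foo stores everything inline, so its implicit move constructor
// is a plain member-wise copy.
static_assert(std::is_trivially_copyable<Foo>::value,
              "Foo has no external resource to steal");
static_assert(std::is_trivially_move_constructible<Foo>::value,
              "moving a Foo is the same as copying it");

// Hypothetical helper: "moves" a Foo and reports whether the source
// still holds exactly the same data afterwards.
bool source_unchanged_after_move() {
    Foo src{};
    src.a = 7;
    std::strcpy(src.b, "inline data");

    Foo dst(std::move(src));  // copies a and all 64 bytes of b

    return src.a == 7 && std::strcmp(src.b, "inline data") == 0 &&
           dst.a == 7 && std::strcmp(dst.b, "inline data") == 0;
}
```

Calling assert(source_unchanged_after_move()) from main should pass: the "moved-from" Foo is untouched because there was nothing to transfer.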

When should it be used?

A typical use is moving resources from one object to another instead of copying.

template <class T>
void swap(T& a, T& b) {
    T tmp(a);   // make 1 copy of a
    a = b;      // make 1 copy of b, discard 1 copy of a
    b = tmp;    // make 1 copy of tmp, discard 1 copy of b
}

template <class T>
void swap(T& a, T& b) {
    T tmp(std::move(a));    // steal a's resources
    a = std::move(b);       // steal b's resources
    b = std::move(tmp);     // steal tmp's resources
}

Think of what happens when T is vector<int> of size N. The first version reads and writes about 3*N elements; the second reads and writes only the few pointers that describe each vector's heap buffer, regardless of N. Of course, class T needs to know how to do the moving: it must have a move constructor and a move assignment operator for this to work. Without them, std::move silently falls back to copying.
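
To see that the move-based version really exchanges buffers instead of copying elements, one can compare the vectors' data() pointers before and after. This is a sketch under assumptions: swap_by_move and buffers_were_exchanged are hypothetical names (the former avoids clashing with std::swap), and the pointer check relies on std::vector with the default allocator handing over its buffer on move, which is what mainstream standard libraries do.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Same body as the move-based swap above, under a hypothetical name.
template <class T>
void swap_by_move(T& a, T& b) {
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}

// Returns true if the two vectors traded heap buffers: after the swap,
// each vector points at the buffer the other one used to own.
bool buffers_were_exchanged() {
    std::vector<int> a(1000, 1);
    std::vector<int> b(2000, 2);
    const int* a_buf = a.data();
    const int* b_buf = b.data();

    swap_by_move(a, b);

    return a.data() == b_buf && b.data() == a_buf &&
           a.size() == 2000 && b.size() == 1000;
}
```

No element is copied here; only the bookkeeping pointers change hands, which is why the cost does not depend on N.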

Rvalue and Lvalue

lvalue: A value that resides in memory (heap or stack) and is addressable. Lvalues correspond to objects you can refer to, either by name or through a pointer or reference.

rvalue: A value that is not an lvalue, typically a temporary or a literal such as 42. The name comes from the fact that such values historically appeared only on the right side of an assignment. In C++11, rvalues indicate objects eligible for move operations, while lvalues generally don't.

Rvalue Reference

An rvalue reference is a new kind of reference that binds only to rvalues. The syntax is X&& (not a reference to a reference; there is no such thing in C++). The old reference X& is now known as an lvalue reference.

Moving from lvalues is potentially dangerous, but moving from rvalues is harmless. We'd better avoid moving from lvalues, or at least make moving from lvalues explicit, so that we no longer move by accident.
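
The binding rules can be observed with a pair of overloads. The sketch below is illustrative only: bound_as, make_name and binding_rules_hold are hypothetical names, and std::string stands in for any class type.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair that reports which kind of reference was bound.
const char* bound_as(std::string&)  { return "lvalue reference"; }
const char* bound_as(std::string&&) { return "rvalue reference"; }

std::string make_name() { return "temporary"; }

bool binding_rules_hold() {
    std::string name = "named";

    std::string from_name = bound_as(name);             // named object: lvalue
    std::string from_temp = bound_as(make_name());      // function result: rvalue
    std::string from_cast = bound_as(std::move(name));  // explicit cast: rvalue

    return from_name == "lvalue reference" &&
           from_temp == "rvalue reference" &&
           from_cast == "rvalue reference";
}
```

Note that the third call only selects the rvalue overload; nothing is moved, because bound_as never touches its argument.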

Implicit Conversions

void foo(std::string&& r);
foo("hello world");

In the above example, "hello world" is an lvalue of type const char[12]. Since there is an implicit conversion from const char[12] through const char* to std::string, a temporary std::string is created, and r is bound to that temporary.
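
A runnable variant of this example might look as follows. The names consume and literal_binds_through_temporary are assumptions made for the sketch; the static_assert checks the literal's type as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// A string literal is an lvalue, so decltype yields a reference to the array.
static_assert(std::is_same<decltype("hello world"), const char (&)[12]>::value,
              "11 characters plus the terminating null");

// Hypothetical sink: takes ownership of the string it is handed.
std::size_t consume(std::string&& r) {
    std::string owned(std::move(r));  // r is a named parameter, hence an lvalue
    return owned.size();
}

bool literal_binds_through_temporary() {
    // The literal is converted to a temporary std::string,
    // and r binds to that temporary.
    return consume("hello world") == 11;
}
```

The call compiles even though the literal itself is an lvalue, because the rvalue reference binds to the converted temporary, not to the array.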

Move Constructor

In C++11, the unsafe auto_ptr was deprecated in favor of unique_ptr, which takes advantage of rvalue references.

Let's kick off with a simplified unique_ptr:

template<typename T>
class unique_ptr
{
    T* ptr;
public:
    T* operator->() const { return ptr; }
    T& operator*() const { return *ptr; }
    explicit unique_ptr(T* p = nullptr) { ptr = p; }
    ~unique_ptr() { delete ptr; }

The move constructor, note the rvalue reference

    unique_ptr(unique_ptr&& source){
        ptr = source.ptr;
        source.ptr = nullptr;
    }

This move constructor can only be supplied with rvalues

unique_ptr<A> a(new A);
unique_ptr<A> aa(a); // error, a is an lvalue

The move constructor transfers ownership of a managed resource into the current object.

Move Assignment Operator

A move assignment operator's job is to release its old resource and take over the new resource from its argument:

    unique_ptr& operator=(unique_ptr&& source) {
        if (this != &source) {
            delete ptr;         // release the old resource
            ptr = source.ptr;   // get the new resource
            source.ptr = nullptr;
        }
        return *this;
    }
};
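
Since the class above is split across several snippets, here is one way it could be assembled into a single compilable unit. Treat it as a sketch: the name simple_unique_ptr (chosen to avoid clashing with std::unique_ptr), the get() accessor, the explicitly deleted copy operations, the noexcept markers and the Tracked helper are all additions made for this illustration.

```cpp
#include <cassert>
#include <utility>

template <typename T>
class simple_unique_ptr {
    T* ptr;
public:
    explicit simple_unique_ptr(T* p = nullptr) : ptr(p) {}
    ~simple_unique_ptr() { delete ptr; }

    // Copying is disabled: two owners would delete the same object.
    simple_unique_ptr(const simple_unique_ptr&) = delete;
    simple_unique_ptr& operator=(const simple_unique_ptr&) = delete;

    // Move constructor: take the pointer and leave the source empty.
    simple_unique_ptr(simple_unique_ptr&& source) noexcept : ptr(source.ptr) {
        source.ptr = nullptr;
    }

    // Move assignment: release the old resource, then take the new one.
    simple_unique_ptr& operator=(simple_unique_ptr&& source) noexcept {
        if (this != &source) {
            delete ptr;
            ptr = source.ptr;
            source.ptr = nullptr;
        }
        return *this;
    }

    T* operator->() const { return ptr; }
    T& operator*() const { return *ptr; }
    T* get() const { return ptr; }  // assumption: accessor added for the checks
};

// Counts live instances so leaks and double deletes become visible.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

bool ownership_transfers_cleanly() {
    {
        simple_unique_ptr<Tracked> a(new Tracked);
        simple_unique_ptr<Tracked> b(std::move(a));  // move constructor
        if (a.get() != nullptr || b.get() == nullptr) return false;

        simple_unique_ptr<Tracked> c(new Tracked);
        c = std::move(b);                            // move assignment
        if (b.get() != nullptr || Tracked::live != 1) return false;
    }
    return Tracked::live == 0;  // everything was deleted exactly once
}
```

Setting source.ptr to nullptr is what keeps the moved-from object safe to destroy: delete on a null pointer does nothing.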

Why std::move

Sometimes we want the compiler to treat an lvalue as if it were an rvalue, so it can invoke the move constructor. C++11 offers std::move, which simply casts an lvalue to an rvalue. It does not move anything by itself. Maybe it should have been named std::cast_to_rvalue, but we are stuck with the name by now.

unique_ptr<A> a(new A);
unique_ptr<A> b(a);            // error
unique_ptr<A> c(std::move(a)); // okay, explicitly write std::move
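
That std::move is only a cast can be demonstrated directly. The sketch below uses std::unique_ptr from <memory> rather than the simplified class, and move_is_only_a_cast is a hypothetical name.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

bool move_is_only_a_cast() {
    std::unique_ptr<int> a(new int(42));

    // The expression std::move(a) merely has rvalue reference type.
    static_assert(
        std::is_same<decltype(std::move(a)), std::unique_ptr<int>&&>::value,
        "std::move yields an rvalue reference");

    // Binding a reference to the cast result runs no constructor,
    // so a still owns its int.
    std::unique_ptr<int>&& ref = std::move(a);
    bool untouched = (a != nullptr) && (ref.get() == a.get());

    // Ownership changes hands only when a move constructor consumes the rvalue.
    std::unique_ptr<int> c(std::move(a));

    return untouched && a == nullptr && c != nullptr && *c == 42;
}
```

The transfer happens inside the move constructor of c, not inside std::move.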

Moving Out of Functions

If a function returns by value, the object at the call site is initialized from the expression in the return statement. When that expression is an rvalue, the move constructor is used (or the compiler elides the move entirely):

unique_ptr<A> make() {
    return unique_ptr<A>(new A); // temporary
}

unique_ptr<A> c(make()); // temporary is moved into c

Perhaps surprisingly, local objects can also be implicitly moved out of functions

unique_ptr<A> make() {
    unique_ptr<A> result(new A); // local
    return result;               // no std::move
}

C++11 has a special rule: when a return statement names a local object, that object is treated as an rvalue, so it can be moved out of the function without writing std::move.

Note that the return type of make is a value, not a reference. An rvalue reference is still a reference. If you tricked the compiler into accepting code like the following, the caller would end up with a dangling reference:
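
A compilable version of this pattern, using std::unique_ptr<int> in place of the simplified class and the hypothetical names make_answer and local_is_moved_out, could look like this:

```cpp
#include <cassert>
#include <memory>

std::unique_ptr<int> make_answer() {
    std::unique_ptr<int> result(new int(42));  // local, move-only object
    return result;  // no std::move needed: moved (or elided) automatically
}

bool local_is_moved_out() {
    std::unique_ptr<int> c(make_answer());
    return c != nullptr && *c == 42;
}
```

std::unique_ptr cannot be copied, so the fact that make_answer compiles at all shows that the local is treated as an rvalue in the return statement.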

// DO NOT DO THIS!
unique_ptr<A>&& wrong_make() {
    unique_ptr<A> wrong_idea(new A);
    return std::move(wrong_idea);
}

Universal Reference

Let me introduce the universal reference: auto&& or T&& (where T is a deduced template parameter), which can be either an rvalue reference or an lvalue reference.

template<typename T>
void foo(T&& arg);      // arg is universal reference

foo(make());            // arg is unique_ptr<A>&&

unique_ptr<A> a(new A);
foo(a);                 // arg is unique_ptr<A>&
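
What T is deduced to in each call can be inspected with a type trait. In this sketch, deduced_as_lvalue_ref and deduction_follows_value_category are hypothetical names, and std::unique_ptr<int> replaces unique_ptr<A>.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical probe: reports whether T was deduced as an lvalue reference.
template <typename T>
bool deduced_as_lvalue_ref(T&&) {
    return std::is_lvalue_reference<T>::value;
}

bool deduction_follows_value_category() {
    std::unique_ptr<int> a(new int(1));

    // Lvalue argument: T = std::unique_ptr<int>&
    bool from_lvalue = deduced_as_lvalue_ref(a);

    // Rvalue argument: T = std::unique_ptr<int>
    bool from_rvalue = deduced_as_lvalue_ref(std::unique_ptr<int>(new int(2)));

    return from_lvalue && !from_rvalue;
}
```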

Reference Collapsing

If the argument is an lvalue of type X, due to a special rule, T is deduced to be X&, hence T&& would mean something like X& &&. But since C++ still has no notion of a reference to a reference, the type X& && is collapsed into X&. If the argument is an rvalue, T is deduced to be plain X, so T&& is simply X&&. This is called the reference collapsing rule, which is essential for perfect forwarding.

auto&& x = a;      // auto is unique_ptr<A>&: unique_ptr<A>& && x = a; -> unique_ptr<A>& x = a;
auto&& y = make(); // auto is unique_ptr<A>:  unique_ptr<A>&& y = make(); (nothing to collapse)
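
The collapsing rules can be verified at compile time through type aliases, since writing int& && directly is not allowed. The aliases LRef and RRef and the function auto_follows_same_rules are names assumed for this sketch.

```cpp
#include <cassert>
#include <type_traits>

using LRef = int&;
using RRef = int&&;

// Only "&& applied to &&" stays an rvalue reference.
static_assert(std::is_same<LRef&,  int&>::value,  "&  with &  gives &");
static_assert(std::is_same<LRef&&, int&>::value,  "&  with && gives &");
static_assert(std::is_same<RRef&,  int&>::value,  "&& with &  gives &");
static_assert(std::is_same<RRef&&, int&&>::value, "&& with && gives &&");

bool auto_follows_same_rules() {
    int a = 0;
    auto&& x = a;  // auto = int&, and int& && collapses to int&
    auto&& y = 0;  // auto = int, so y is int&&

    return std::is_same<decltype(x), int&>::value &&
           std::is_same<decltype(y), int&&>::value;
}
```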

Forwarding References

A function parameter such as arg in the code that follows is, like every named variable, an lvalue. Remember, a named rvalue reference is itself an lvalue. Every call to overloaded(arg) inside foo will thus select the lvalue overload, no matter what the caller passed in. Perfect forwarding means we forward not just the object but also its value category (lvalue or rvalue). std::forward<T> is used in templates to restore that category so that the correct overload is invoked.

#include <iostream>
#include <utility>

void overloaded(const int& x) { std::cout << "[lvalue]"; }
void overloaded(int&& x) { std::cout << "[rvalue]"; }

auto&& ref = 8;              // ref has type int&&
// a named rvalue reference is itself an lvalue
overloaded(ref);             // [lvalue]
// cast it back to an rvalue explicitly
overloaded(std::move(ref));  // [rvalue]

template <typename T>
void foo(T&& arg) {
  // NOTE: arg is a named parameter, so it is always an lvalue
  overloaded(arg);
  // the result is an rvalue only if T is a non-reference type (here int)
  overloaded(std::forward<T>(arg));
}

int a;
foo(a); // calling foo with lvalue: [lvalue][lvalue]
foo(0); // calling foo with rvalue: [lvalue][rvalue]
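
A common place where this matters is a factory that passes its argument on to a constructor. The following is a minimal sketch; Holder, make_holder and category_is_preserved are hypothetical names introduced here, not part of any library.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that records whether its string was copied or moved in.
struct Holder {
    std::string value;
    bool moved_in;
    explicit Holder(const std::string& s) : value(s), moved_in(false) {}
    explicit Holder(std::string&& s) : value(std::move(s)), moved_in(true) {}
};

// Hypothetical factory: forwards arg with the caller's value category.
template <typename T>
Holder make_holder(T&& arg) {
    return Holder(std::forward<T>(arg));
}

bool category_is_preserved() {
    std::string name = "kept";

    Holder from_lvalue = make_holder(name);                 // copies name
    Holder from_rvalue = make_holder(std::string("temp"));  // moves the temporary

    return !from_lvalue.moved_in && name == "kept" &&
           from_rvalue.moved_in && from_rvalue.value == "temp";
}
```

Replacing std::forward<T>(arg) with plain arg would make both calls copy, while replacing it with std::move(arg) would silently steal from the caller's name.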

Do Remember

std::move performs an unconditional cast to an rvalue. In and of itself, it doesn’t move anything.

std::forward casts its argument to an rvalue only if that argument is bound to an rvalue.

Both std::move and std::forward do nothing at runtime.
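
To make the "just a cast" point concrete, here is roughly how the two functions could be written. This is a simplified sketch under the hypothetical names my_move and my_forward, not the standard library's actual source; in particular, the real std::forward has an additional overload for rvalue arguments.

```cpp
#include <cassert>
#include <type_traits>

// Unconditional cast to an rvalue reference.
template <typename T>
constexpr typename std::remove_reference<T>::type&& my_move(T&& t) noexcept {
    return static_cast<typename std::remove_reference<T>::type&&>(t);
}

// Conditional cast: T&& collapses to an lvalue reference when T is X&,
// and is an rvalue reference when T is a plain X.
template <typename T>
constexpr T&& my_forward(typename std::remove_reference<T>::type& t) noexcept {
    return static_cast<T&&>(t);
}

bool casts_have_expected_types() {
    int n = 0;
    return std::is_same<decltype(my_move(n)), int&&>::value &&
           std::is_same<decltype(my_forward<int&>(n)), int&>::value &&
           std::is_same<decltype(my_forward<int>(n)), int&&>::value;
}
```

Both bodies consist of a single static_cast, which is why neither function needs to generate any work at runtime.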