March | 2013 | The /under/documented side of programming

C++11’s Rvalues have been covered pretty well in various articles, books and Q&A boards. Unfortunately, there are still common mistakes made with respect to their use in move constructors and move-assignment functions. So I just thought I’d focus a little time on showing what can go wrong when Rvalue references are misused and how to avoid it. If you haven’t read up on them, or don’t fully understand what Rvalues are, you must look at these resources:

C++ Rvalue References Explained – Thomas Becker
Universal References in C++11 – Scott Meyers (video)
Note: don’t be dissuaded by his assertion that what he is telling you is a lie – in fact, it’s just a much simpler way of thinking of the end result of reference collapsing.
C++ Primer, 5th Edition (2012, Stanley B. Lippman) – especially pages 688 – 694
(optional of course)

Note that I’ve included links to info on universal references and reference collapsing because this is probably an even bigger source of confusion. You may not actually encounter this problem if you don’t use template functions all that often, but it’s very important to know that what looks like an Rvalue reference parameter (T&&) in a template function can really become an Lvalue reference when an Lvalue is passed in, and of course this is why std::forward() exists (perfect forwarding).
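To make the reference-collapsing point concrete, here is a minimal sketch. The names sink() and pass_along() are made up for this illustration and don't come from any of the linked resources; the only library call used is std::forward().

```cpp
#include <string>
#include <utility>

// Hypothetical overload pair, used only to report which kind of reference arrived.
inline std::string sink(const std::string&) { return "lvalue"; }
inline std::string sink(std::string&&)      { return "rvalue"; }

// T&& here is a forwarding ("universal") reference, not a plain Rvalue reference.
// When an Lvalue is passed, T deduces to std::string& and
// "std::string& &&" collapses to "std::string&".
template <typename T>
std::string pass_along(T&& arg)
{
    // std::forward<T> hands the argument on with the caller's original value category.
    return sink(std::forward<T>(arg));
}
```

Calling pass_along() with a named string picks the Lvalue overload, while calling it with a temporary or with std::move(s) picks the Rvalue overload.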

Move-function Declarations

Okay, with that material behind us, let’s examine now how move constructor and move-assignment functions look in a typical class:

   // Move constructor
   Object(Object&&) noexcept;
   // Move-assignment operator
   Object& operator=(Object&&) noexcept;

These two functions look pretty much like their copy siblings, with the exception that there is no const specifier and we are of course using two ampersands. The noexcept specifier is also a new C++11 addition. Its use on move functions is optional; however, without it the standard library may fall back to copying in places where a throwing move would break its exception-safety guarantees. For example, when std::vector reallocates, it copies its elements instead of moving them if their move constructor is not noexcept (and they are copyable). If you have a compiler which doesn’t support noexcept, I’d suggest using a ‘NO_EXCEPT‘ type macro which perhaps uses the old throw() modifier on non-conforming compilers.
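The effect of noexcept can be checked without a container. The sketch below uses two made-up types (NoexceptMove and ThrowingMove) and a made-up helper (relocate_copies()); std::is_nothrow_move_constructible and std::move_if_noexcept() are the real standard facilities, and the latter expresses the same "move only if it cannot throw" choice that containers make when relocating elements.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type whose move constructor promises not to throw.
struct NoexceptMove {
    bool copied = false;
    NoexceptMove() = default;
    NoexceptMove(const NoexceptMove&) : copied(true) {}
    NoexceptMove(NoexceptMove&&) noexcept : copied(false) {}
};

// Same type, but the move constructor is missing noexcept.
struct ThrowingMove {
    bool copied = false;
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove&) : copied(true) {}
    ThrowingMove(ThrowingMove&&) : copied(false) {}
};

static_assert(std::is_nothrow_move_constructible<NoexceptMove>::value,
              "noexcept move constructor is detected");
static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value,
              "move constructor without noexcept counts as potentially throwing");

// Returns true if the new object was built by the copy constructor.
template <typename T>
bool relocate_copies()
{
    T source;
    // Yields an Rvalue only when moving cannot throw (or T cannot be copied).
    T target(std::move_if_noexcept(source));
    return target.copied;
}
```

With these definitions, relocate_copies<NoexceptMove>() reports a move and relocate_copies<ThrowingMove>() reports a copy.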

Now, with the declaration out of the way, it’s time to move onto the definition of the functions..

A basic move constructor

A move constructor is fairly simple in that we are basically initializing everything with moves:

   // MOVE constructor
   Object(Object&& moveFrom) noexcept :
     m_Data(std::move(moveFrom.m_Data)),
     m_Val(std::move(moveFrom.m_Val))
   {
   // POD types can simply be copied, then invalidated
      m_ValidFlag = std::move(moveFrom.m_ValidFlag);
      m_NativeType = std::move(moveFrom.m_NativeType);
      moveFrom.m_ValidFlag = false;
      moveFrom.m_NativeType = 0;
   }

The above code demonstrates a few things:

1st, the member objects are initialized in the member initialization list rather than in the function body. This ensures that we call the right object constructors before the function body runs. If we instead assign to an object inside the constructor body, that object has already been default-constructed, so we wind up initializing it twice. Certain compilers will optimize the first initialization out, but in practice you should always initialize objects in the initialization list.

2nd, every time the Rvalue reference is used in any expression, it is surrounded by std::move(). This is where a lot of programmers make mistakes initially (including me!). You need to literally spam your move functions with std::move() every time the Rvalue reference is used – either as a whole, or when used to reference individual member variables.

The reason for all the calls to std::move() is of course the fact that Rvalue references behave like Lvalues in expressions – this is covered in Thomas Becker’s article, and can be most easily remembered as ‘named variables are Lvalues‘ (or at least, act like them). std::move() is really just a one-line Rvalue-reference cast, so it comes at no cost.

3rd: Although this is optional, it’s a good idea to use std::move() even on native data types and POD (plain old data) types, to keep things uniform and consistent. For these types a move is just a copy, so you’ll still need to manually invalidate this data in the Rvalue object (as needed).
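The second and third points can be tried out in isolation. In this sketch the functions which(), without_move(), with_move() and take(), as well as the Pod struct, are hypothetical names invented for the demonstration; only the member names echo the constructor above.

```cpp
#include <string>
#include <utility>

// Hypothetical overload pair that reports which one was selected.
inline const char* which(const std::string&) { return "copy overload"; }
inline const char* which(std::string&&)      { return "move overload"; }

// 'named' is declared as an Rvalue reference, but used as an expression it is an Lvalue.
inline const char* without_move(std::string&& named) { return which(named); }

// std::move() casts it back to an Rvalue, so the move overload is chosen.
inline const char* with_move(std::string&& named)    { return which(std::move(named)); }

// Hypothetical POD holder, mirroring m_NativeType and m_ValidFlag above.
struct Pod {
    int  m_NativeType;
    bool m_ValidFlag;
};

// "Moving" built-in types is just a copy: the source keeps its old values
// unless we reset them by hand.
inline Pod take(Pod&& moveFrom, bool invalidate)
{
    Pod result;
    result.m_NativeType = std::move(moveFrom.m_NativeType);
    result.m_ValidFlag  = std::move(moveFrom.m_ValidFlag);
    if (invalidate) {
        moveFrom.m_NativeType = 0;
        moveFrom.m_ValidFlag  = false;
    }
    return result;
}
```

Passing false for invalidate leaves the source object looking fully valid after the "move", which is exactly the situation the manual reset in the constructor body avoids.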

A basic move-assignment function

A move-assignment function is a bit more complicated than the move-constructor. Here we know ahead of time that the object has already been initialized, so we need to clean up or destroy the object’s current resources before we replace them with the new data. This is more or less the same behavior as a copy-assignment function, but of course here we are ‘consuming’ the passed object’s data:

   // MOVE-assignment operator
   Object& operator=(Object&& moveFrom) noexcept
   {
   // Avoid copies to self (same as for COPY-assignment)
      if (this != &moveFrom)
      {
         // Clean up our resources first
         destroy_this();

         // MOVE object members
         m_Data = std::move(moveFrom.m_Data);
         m_Val = std::move(moveFrom.m_Val);

         // POD types can simply be copied, then invalidated
         m_ValidFlag = std::move(moveFrom.m_ValidFlag);
         m_NativeType = std::move(moveFrom.m_NativeType);
         moveFrom.m_ValidFlag = false;
         moveFrom.m_NativeType = 0;
      }
      return *this;
   }

Just as a copy-assignment function has a check for self-assignment, so too does a move-assignment function – although the chances of it happening are pretty slim. Once it is determined that this function isn’t a self-assignment, the first thing it does is of course destroy the current resources (destroy_this();). After that, we can then move-assign the internal objects with the Rvalue object’s members, and then work on the POD types. We could as well call some internal Rvalue function inside each object to do the moves, but here we assume that each object has its own move-assignment function, or at the very least a copy-assignment function.
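For something that compiles on its own, here is a small sketch of the same pattern applied to a class that owns a raw array. The Buffer class and its members are assumptions made up for this example; only the shape (self-check, destroy_this(), move, invalidate) follows the function above.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class, written only to illustrate the pattern.
class Buffer {
public:
    explicit Buffer(std::size_t n) : m_Data(new int[n]()), m_Size(n) {}
    ~Buffer() { destroy_this(); }

    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    Buffer(Buffer&& moveFrom) noexcept
        : m_Data(std::move(moveFrom.m_Data)), m_Size(std::move(moveFrom.m_Size))
    {
        // Raw pointers and integers are only copied, so invalidate the source.
        moveFrom.m_Data = nullptr;
        moveFrom.m_Size = 0;
    }

    Buffer& operator=(Buffer&& moveFrom) noexcept
    {
        if (this != &moveFrom)       // avoid destroying our own data on self-assignment
        {
            destroy_this();          // release what we currently own
            m_Data = std::move(moveFrom.m_Data);
            m_Size = std::move(moveFrom.m_Size);
            moveFrom.m_Data = nullptr;
            moveFrom.m_Size = 0;
        }
        return *this;
    }

    std::size_t size() const { return m_Size; }
    bool empty() const { return m_Data == nullptr; }

private:
    void destroy_this()
    {
        delete[] m_Data;
        m_Data = nullptr;
        m_Size = 0;
    }

    int*        m_Data;
    std::size_t m_Size;
};
```

Without the self-assignment check, moving a Buffer onto itself would free its array and then "take over" a null pointer, leaving it empty.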

Make sure moves are persistent!

Things get complicated when we have to think about which objects accept Rvalues and which ones don’t. When using the C++ standard library, pretty much every class has Rvalue assignment operators and constructors, so we can safely assume that a move function will in fact be called when needed. However, with your own objects and with other object libraries, you will need to check if this is the case: if a class has copy functions but no move functions, std::move() quietly ends up calling the copy functions instead. Sometimes where move functions are lacking, there may be other special functions that do what we intend – for example, acquireObject() and release() may be move-like functions we’d need to explicitly call in such cases. This can of course become tedious very quickly, but until everything we use is updated with move capability, you’ll need to examine object declarations closely.
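A quick way to see the silent fallback is to count copies. LegacyType and Wrapper below are hypothetical classes invented for this sketch; LegacyType stands in for any pre-C++11 class that declares copy functions but no move functions.

```cpp
#include <utility>

// Hypothetical pre-C++11 style class: user-declared copy functions, no move functions.
// Declaring the copy constructor suppresses the implicit move constructor.
struct LegacyType {
    int copies = 0;
    LegacyType() = default;
    LegacyType(const LegacyType& other) : copies(other.copies + 1) {}
    LegacyType& operator=(const LegacyType& other)
    {
        copies = other.copies + 1;
        return *this;
    }
};

// Hypothetical class that does everything "right" in its own move constructor.
struct Wrapper {
    LegacyType m_Legacy;
    Wrapper() = default;
    Wrapper(Wrapper&& moveFrom)
        : m_Legacy(std::move(moveFrom.m_Legacy))   // binds to LegacyType(const LegacyType&)
    {}
};
```

Even though Wrapper uses std::move() correctly, moving a Wrapper increments the copy counter, because the Rvalue simply binds to LegacyType's const reference copy constructor.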

Derived classes and moves

Another area where I see problems with move functions is derived classes. We obviously want the parent object (or super class) to move right along with the derived (or subclass) object. Assuming we built the parent object and included the necessary move constructor, the above move constructor can be simply modified with a call to the parent’s move constructor as the first item in the initialization list:

   // MOVE constructor
   Object(Object&& moveFrom) : Parent(std::move(moveFrom)),
     m_Data(std::move(moveFrom.m_Data)),
     m_Val(std::move(moveFrom.m_Val))
   {   /* body - see above */   }

Nothing much else to say here, other than the obvious: always use std::move()!

The move-assignment function will need to do extra work as well. The simplest form of moving the Parent object’s data is to manually invoke a move-assignment operator, something like this: Parent::operator=(std::move(moveFrom));. Obviously, there are many cases where that will not work, and really it boils down to how the data in the parent class interrelates with the data in the derived class. Just remember to destroy all resources, use std::move() on every assignment, and you should be fine.
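As a rough sketch of that simplest form, assuming the parent and derived data are independent of each other: the Parent and Object classes below are stripped-down stand-ins written for this example, with made-up name() and data() accessors.

```cpp
#include <string>
#include <utility>

// Hypothetical parent class with its own move-assignment operator.
class Parent {
public:
    explicit Parent(std::string name) : m_Name(std::move(name)) {}

    Parent& operator=(Parent&& moveFrom) noexcept
    {
        if (this != &moveFrom)
            m_Name = std::move(moveFrom.m_Name);
        return *this;
    }

    const std::string& name() const { return m_Name; }

private:
    std::string m_Name;
};

// Hypothetical derived class that forwards the move to its parent first.
class Object : public Parent {
public:
    Object(std::string name, std::string data)
        : Parent(std::move(name)), m_Data(std::move(data)) {}

    Object& operator=(Object&& moveFrom) noexcept
    {
        if (this != &moveFrom)
        {
            // Move the Parent part. This only touches Parent's members,
            // so moveFrom.m_Data is still intact afterwards.
            Parent::operator=(std::move(moveFrom));
            // Then move the members declared in this class.
            m_Data = std::move(moveFrom.m_Data);
        }
        return *this;
    }

    const std::string& data() const { return m_Data; }

private:
    std::string m_Data;
};
```

Leaving out the std::move() in the Parent::operator= call would not compile here, since this Parent has no copy-assignment operator; in a class that has one, it would silently copy instead.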

The copy/move right-vs-wrong-way example program

Okay, since I figure the best way to show the right and wrong way to implement move functions is to give an example piece of code, I’ve written one up which hopefully clarifies everything. You can skip to the link below to see the program and its output, but I should of course cover what it is first.

In the example program, there’s 1 parent class (SuperClass), and 2 derived classes (SubClass and SubFailClass), one which acts appropriately, the other which misbehaves by incorrectly handling moves in both the constructor and assignment operator.

    SuperClass
    |       |
SubClass  SubFailClass

Inside SuperClass are 4 data members: a std::string, an m_Data member whose type is the template parameter T, and two basic types:

std::string m_Str;
T m_Data;
int m_Int;
bool m_bValid;

The main areas of focus in the source code are the copy and move constructors, and the copy and move assignment operators. Here’s how the base class copy/move members look:

SuperClass(const SuperClass& copyFrom) : m_Str(copyFrom.m_Str), m_Data(copyFrom.m_Data)
 {  m_Int = copyFrom.m_Int;  m_bValid = copyFrom.m_bValid; }

SuperClass(SuperClass&& moveFrom) noexcept
: m_Str(std::move(moveFrom.m_Str)), m_Data(std::move(moveFrom.m_Data))
{
   m_Int = std::move(moveFrom.m_Int);
   m_bValid = std::move(moveFrom.m_bValid);
}
SuperClass& operator=(const SuperClass& copyFrom)
{
   if (this != &copyFrom)
      assign(copyFrom);
   return *this;
}
SuperClass& operator=(SuperClass&& moveFrom) noexcept
{
   if (this != &moveFrom)
      assign(std::move(moveFrom));
   return *this;
}

As discussed, std::move() is used in the move variants to re-cast the ‘moveFrom‘ parameter (or one of its members) into an Rvalue reference in each expression. Also, instead of making changes in the copy- and move-assignment functions, the assign() functions are called to make a copy or move. That function is relatively predictable for copy: just destroy the object’s contents and replace them with those of the copyFrom object:

void assign(const SuperClass& copyFrom)
{
   // Our data will be overwritten, so it's best to free the resources first
   destroy();
   // Now we can copy
   m_Str = copyFrom.m_Str;
   m_Data = copyFrom.m_Data;
   m_Int = copyFrom.m_Int;
   m_bValid = copyFrom.m_bValid;
}

The move variant of course moves all the members, and invalidates the Rvalue object’s POD data (a bool and an int here):

void assign(SuperClass&& moveFrom)
{
   // Our data will be overwritten, so it's best to free the resources first
   destroy();
   // move for non-POD and unknown types:
   m_Str = std::move(moveFrom.m_Str);
   m_Data = std::move(moveFrom.m_Data);

   // for POD members we can simply copy-assign, then invalidate the temporary
   m_Int = std::move(moveFrom.m_Int);
   m_bValid = std::move(moveFrom.m_bValid);

   moveFrom.m_Int = 0;
   moveFrom.m_bValid = false;
}

The derived classes SubClass and SubFailClass are basically superficial class objects. Their only purpose here is to demonstrate the correct and incorrect way to invoke move functions in the base class. The assignment operators are simple in that they call the parent class’s assign() function; however, the way they call it is what’s important. One properly invokes it using std::move(), the other fails. Also, here’s the way their constructors look:

SubClass(const SubClass& copyFrom) : Parent(copyFrom) {}
SubClass(SubClass&& moveFrom) noexcept : Parent(std::move(moveFrom)) {}

SubFailClass(const SubFailClass& copyFrom) : Parent(copyFrom) {}
SubFailClass(SubFailClass&& moveFrom) noexcept : Parent(moveFrom) {}

The latter class is the obvious failure here (as its name would indicate) – std::move() isn’t used!
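The difference between the two constructors can be observed directly with a flag. Base, Good and Bad below are simplified stand-ins invented for this sketch (they are not the SuperClass, SubClass and SubFailClass from the example program), and constructedByMove is a made-up member.

```cpp
#include <utility>

// Hypothetical base class that records which constructor built it.
struct Base {
    bool constructedByMove = false;
    Base() = default;
    Base(const Base&) : constructedByMove(false) {}
    Base(Base&&) noexcept : constructedByMove(true) {}
};

// Correct: std::move() turns the named parameter back into an Rvalue.
struct Good : Base {
    Good() = default;
    Good(Good&& moveFrom) noexcept : Base(std::move(moveFrom)) {}
};

// Wrong: 'moveFrom' is an Lvalue inside the constructor,
// so Base(const Base&) is selected and the base part is copied.
struct Bad : Base {
    Bad() = default;
    Bad(Bad&& moveFrom) noexcept : Base(moveFrom) {}
};
```

Both derived classes compile without complaint, which is what makes the mistake in the second one so easy to miss.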

The rest of the code in the example program just demonstrates how copies and moves happen through some helper functions. The functions are documented enough so that hopefully it all makes sense. Also, every function call is reported through cout, so it should make following the program’s inner workings more understandable.

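Note that the output for constructors and destructors happens in reverse order. Also very important – a moved object should not contain a string at destruction. (Strictly speaking, the standard only promises that a moved-from std::string is left in a valid but unspecified state, but in practice it ends up empty.) As you’ll see by the output, that’s not the case for SubFailClass, since it fails to do a proper move.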

..and finally..

The example program and its output can be found on the online Coliru compiler.

UPDATE: Look at the output from the alternate version because the old version now doesn’t give the output we want. The reason is the compiler now optimizes temporary returns – normally a good thing – but for demonstration purposes, doesn’t give the output we want. The source code for the original is also available at Pastebin.

Kudos to you if you sat through that long post! Until next time..

Ascend4nt's Programming

Rvalues in move constructors and move-assignment functions