Perfect forwarding allows template functions to forward the arguments “as is” to any other function they call. This helps minimize the number of unnecessary copies and conversions when delegating information to other functions. In a quest to get rid of copying completely in a library I was writing, I came across the problem of perfect forwarding to functions launched on a separate thread.
Overview
Before tackling the problem of perfect forwarding in the next part, I will quickly overview rvalues, present a way to measure copies/moves, and finally observe a “problem” with std::async. In the second part of the article we will cover perfect forwarding and revisit the problem in that context. Finally, a generic solution will be presented.
This article is geared towards library writers and those who write generic template code in C++11. Most of the code from both parts of the article can be found in this gist.
Lvalues and Rvalues
Let’s start with a quick recap of lvalues and rvalues. (You can skip this section if you have a firm grasp of the concept.)
With C++11 we have to be able to distinguish lvalues from rvalues. The names hint at the fact that lvalues are values that tend to appear on the left-hand side of an assignment expression, whereas rvalues are those that appear on the right. In other words, lvalues are values that you can freely refer to in their scope because they are bound to an identifier, such as variables and functions.
Rvalues cannot be directly referred to because they are temporaries that are not bound to an identifier, such as values returned by a function or the direct result of an inline construction.
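As a rough illustration of the distinction, here is a minimal sketch. The names make_name and lvalue_rvalue_demo are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: the returned std::string is a temporary (an rvalue).
std::string make_name() { return "temporary"; }

// Hypothetical demo function showing which references bind to which values.
std::size_t lvalue_rvalue_demo() {
    std::string name = "bound";               // `name` is an lvalue: it has an identifier
    std::string& ref = name;                  // OK: lvalue reference binds to an lvalue
    // std::string& bad = make_name();        // error: non-const lvalue reference to an rvalue
    const std::string& kept = make_name();    // OK: const reference extends the temporary's lifetime
    std::string&& tmp = std::string("inline"); // OK: rvalue reference binds to a temporary
    assert(&ref == &name);                    // `ref` is just another name for `name`
    return ref.size() + kept.size() + tmp.size();
}
```

Calling lvalue_rvalue_demo() from main exercises each binding; the commented-out line is the one the compiler rejects.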
With C++11, functions can accept rvalue references (T&&) in addition to lvalue references (T&). Since rvalues are temporaries, this allows us to transfer ownership of the temporary instead of performing a needless copy. The new notation lets a function distinguish between lvalues (T&) that already exist in an outer scope and rvalues (T&&) that are yet to be bound to a scope.
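To make the overload distinction concrete, here is a small sketch. The function describe is a hypothetical name chosen for illustration; it only reports which overload was selected.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: the compiler picks one based on the value category.
std::string describe(std::string&)  { return "lvalue"; }
std::string describe(std::string&&) { return "rvalue"; }

void overload_demo() {
    std::string name = "named";
    assert(describe(name) == "lvalue");                 // named variable
    assert(describe(std::string("temp")) == "rvalue");  // temporary
    assert(describe(std::move(name)) == "rvalue");      // lvalue cast to an rvalue
}
```

Note that std::move on its own moves nothing here; it only casts, and describe never takes ownership of its argument.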
To facilitate this transfer of ownership, the function std::move is available. A typical use is object construction, where a value that is no longer needed by the caller is moved into the new object rather than copied.
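The following is a minimal sketch of std::move during object construction. The Document class and move_construction_demo are hypothetical and not taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class that takes ownership of its constructor argument.
class Document {
public:
    // Accept an rvalue and move it into the member instead of copying it.
    explicit Document(std::vector<std::string>&& lines) : lines_(std::move(lines)) {}
    std::size_t size() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

std::size_t move_construction_demo() {
    std::vector<std::string> lines{"a", "b", "c"};
    Document doc(std::move(lines));  // the vector's buffer is handed over, not copied
    // `lines` is now in a valid but unspecified state; do not rely on its contents.
    return doc.size();
}
```

Inside the constructor, the parameter lines is itself a named value, so a second std::move is needed to move it into the member.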
Given this quick overview, it should be apparent that moves help save unnecessary copies, which is essential if you want to write an efficient library. This is the “value of rvalues”. For a more focused overview, you could also look at a larger article by Alex Allain about move semantics.
Profiling copies/moves
To ensure optimal performance, I wrote tests to count how many copies or moves occurred during the invocation of various API calls. To carry out these tests, I created a simple class that keeps a shared count of the number of moves and copies performed on it.
In case you skipped the details, the usage is simple: construct a checker, pass it through the code under test, and then read the shared copy and move counts.
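The original class is not reproduced here, so what follows is a reconstruction sketch under assumptions. Only the name move_checker comes from the article; the members, the copies() and moves() accessors, and the demo function are my own guesses at a workable shape.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Reconstruction sketch (assumed interface): every copy or move of a checker
// bumps counters shared by all objects descended from the same original.
class move_checker {
    struct counts { std::atomic<int> copies{0}; std::atomic<int> moves{0}; };
    std::shared_ptr<counts> counts_;
    std::vector<int> payload_;
public:
    move_checker() : counts_(std::make_shared<counts>()), payload_{1, 2, 3} {}
    move_checker(const move_checker& o) : counts_(o.counts_), payload_(o.payload_) {
        ++counts_->copies;
    }
    // The counter pointer is copied (not moved) so a moved-from checker can still report.
    move_checker(move_checker&& o) noexcept
        : counts_(o.counts_), payload_(std::move(o.payload_)) {
        ++counts_->moves;
    }
    move_checker& operator=(const move_checker&) = delete;

    int copies() const { return counts_->copies.load(); }
    int moves() const { return counts_->moves.load(); }

    // Iteration support so the checker can be handed to code expecting an iterable.
    std::vector<int>::const_iterator begin() const { return payload_.begin(); }
    std::vector<int>::const_iterator end() const { return payload_.end(); }
};

void move_checker_demo() {
    move_checker original;
    move_checker copied(original);            // one copy
    move_checker moved(std::move(original));  // one move
    assert(moved.copies() == 1);
    assert(moved.moves() == 1);
    assert(original.copies() == 1);           // counts are shared, even after the move
}
```

Atomic counters are used because later tests read the counts after work has happened on another thread.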
Armed with the move_checker, I was able to profile my code and make sure there were no extraneous copies. During the rest of the article, I will provide asserts with the actual copy/move counts a particular piece of code produces.
std::thread and std::async
The next piece of the puzzle is launching functions on other threads. Thankfully, C++11 comes with its own standard implementation of threads, allowing for easy execution of functions on other threads and easy passing of arguments to them. Here I run a function that prints the contents of an iterable on another thread using std::async.
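The sketch below moves a checker into std::async. The move_checker here is a trimmed stand-in I wrote for this example, and printContents is an assumed implementation based on its description in the text. The sketch asserts only that no copies occur, since the exact number of internal moves is a detail of the standard library implementation and may differ from the two counted in this article.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <iostream>
#include <memory>
#include <utility>
#include <vector>

// Trimmed stand-in for move_checker (an assumption, not the original class).
struct move_checker {
    struct counts { std::atomic<int> copies{0}; std::atomic<int> moves{0}; };
    std::shared_ptr<counts> c = std::make_shared<counts>();
    std::vector<int> data{1, 2, 3};
    move_checker() = default;
    move_checker(const move_checker& o) : c(o.c), data(o.data) { ++c->copies; }
    move_checker(move_checker&& o) noexcept : c(o.c), data(std::move(o.data)) { ++c->moves; }
    std::vector<int>::const_iterator begin() const { return data.begin(); }
    std::vector<int>::const_iterator end() const { return data.end(); }
};

// Assumed implementation: prints every element of an iterable.
template <typename Iterable>
void printContents(const Iterable& iterable) {
    for (const auto& item : iterable) std::cout << item << ' ';
    std::cout << '\n';
}

// Returns the number of moves observed; copies are asserted to be zero.
int async_move_demo() {
    move_checker checker;
    auto done = std::async(std::launch::async,
                           printContents<move_checker>, std::move(checker));
    done.get();  // wait for the task, which also makes the counts safe to read
    assert(checker.c->copies.load() == 0);
    return checker.c->moves.load();
}
```

The template argument has to be spelled out as printContents<move_checker> because std::async cannot deduce which specialization of a function template to call.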
Notice that to run the function on another thread, the arguments have to be available on the other thread. I can move the checker into another thread if I don’t need it. As expected, no copies are performed. The two moves are accounted for:
- One move into the std::async function itself
- Another move into the newly created thread.
Since our printContents function takes an object by const&, it is normal to expect that no copies or moves are performed; we just access the object through a reference from another thread. Let’s try it.
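Here is a sketch of that experiment, passing the checker as a plain lvalue. As before, move_checker is a trimmed stand-in and printContents is an assumed implementation; neither is the original code.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <iostream>
#include <memory>
#include <utility>
#include <vector>

// Trimmed stand-in for move_checker (an assumption, not the original class).
struct move_checker {
    struct counts { std::atomic<int> copies{0}; std::atomic<int> moves{0}; };
    std::shared_ptr<counts> c = std::make_shared<counts>();
    std::vector<int> data{1, 2, 3};
    move_checker() = default;
    move_checker(const move_checker& o) : c(o.c), data(o.data) { ++c->copies; }
    move_checker(move_checker&& o) noexcept : c(o.c), data(std::move(o.data)) { ++c->moves; }
    std::vector<int>::const_iterator begin() const { return data.begin(); }
    std::vector<int>::const_iterator end() const { return data.end(); }
};

// Assumed implementation: prints every element of an iterable.
template <typename Iterable>
void printContents(const Iterable& iterable) {
    for (const auto& item : iterable) std::cout << item << ' ';
    std::cout << '\n';
}

// Returns the number of copies made while handing an lvalue to std::async.
int async_lvalue_demo() {
    move_checker checker;
    // `checker` is passed as an lvalue, even though printContents takes const&.
    auto done = std::async(std::launch::async, printContents<move_checker>, checker);
    done.get();
    return checker.c->copies.load();  // the copy std::async makes of its argument
}
```

The parameter type of printContents does not matter here: std::async stores its own copy of each argument before the function is ever invoked.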
Whoops! Where did that copy come from? This manifested as a cryptic compilation error when I was doing perfect forwarding (we’ll get to that shortly). As a result I posted a question on Stack Overflow, and the answer is relevant here.
“… async will always make a copy of [ non-const lvalue references ] internally … to ensure they exist and are valid throughout the running time of the thread created.” (jogojapan)
To ensure users don’t shoot themselves in the foot, the async function preemptively copies an lvalue argument in the event that the lvalue goes out of scope, and is destroyed before the thread completes its function. The breakdown of the numbers above is thus:
- A local copy is created in the std::async function
- The copy is then moved into the new thread.
This is the safe route and minimizes unintended errors for the average user of the async API. However, if you’re a library writer, you may want to avoid making an expensive copy. In that case you can either pass a pointer or wrap the reference with std::ref (as suggested in a comment by tshino below).
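One way to see the effect of std::ref is to have the task report the address of the object it receives. The sketch below does this with a plain std::vector; address_of and ref_avoids_copy_demo are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <vector>

// Hypothetical helper: reports which object the callee actually sees.
const std::vector<int>* address_of(const std::vector<int>& v) { return &v; }

bool ref_avoids_copy_demo() {
    std::vector<int> values{1, 2, 3};

    // Plain lvalue: std::async stores its own copy, so the callee sees a different object.
    auto by_copy = std::async(std::launch::async, address_of, values);
    const bool copied = by_copy.get() != &values;

    // std::ref: only a std::reference_wrapper is stored, so the callee sees `values` itself.
    // This is safe here only because `values` outlives the task (get() is called first).
    auto by_ref = std::async(std::launch::async, address_of, std::ref(values));
    const bool same = by_ref.get() == &values;

    return copied && same;
}
```

The returned pointers are only compared, never dereferenced. With std::ref, the caller takes on the responsibility of keeping the referenced object alive until the task finishes.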
Note that this would not work for rvalues, as std::ref cannot hold an rvalue reference. To summarize the local solution:
To avoid an extra copy when passing lvalue arguments through std::async that you know will outlive the thread, you can wrap them with std::ref.
Continued in Part 2
Now that we have come upon the problem and seen a simple local solution, we’ll consider it in a more generic context of perfect forwarding in Part 2 of the article.