Inheritance and Virtual Table are often used to create interface in C++ polymorphic class
What if ... there were another way to do this ?
easier, cleaner, faster and more reliable
This article explains how to useCRTP
, [std::variant
](https://en.cppreference.com/w/cpp/utility/variant andstd::visit
to increase code performance.
Inheritance is a mechanism that allows developers to create a hierarchy between classes, using "is-a" relationships. The class being inherited from is called the parent class (or base class), and the class inheriting is called the child class (or derived class).
This is useful for many purposes :
-
Code reusability
A child class will inherit datas and functions, so there's no need to duplicate code. Thus, the code base is faster to create, and easier to read & maintain. -
Make the code more "human friendly"
An "Is-a" relationship is meaningful for anyone who is familiar with OOP. It allows to create a class hierarchy that best reflects the way we mentally organize informations. A
Cat
is aFeline
, aFeline
is anAnimal
, and anAnimal
is aLifeForm
.class LifeForm{}; class Animal : public LifeForm{}; class Feline : public Animal{}; class Cat : public Feline{}; void func() { Cat my_cat; }
-
Static polymorphism
Static polymorphism is a polymorphism resolved at compile time.
In the previous code snippet,
my_cat
is variable ofCat
type.
my_cat
is also aFeline
, anAnimal
, and aLifeForm
.void feed (Animal & any_animal) { // feed `any_animal` } void func() { cat my_cat; feed(my_cat); }
-
Runtime polymorhpsim & vTable
Ah, here's the big thing.
Runtime polymorphism is a polymorphism resolved at runtime. How ? Using vTables. Virtual tables (vTable) is a lookup table of functions pointers used to resolve function calls in a dynamic (late) binding way. When compiling a class, the compiler (at compile time thus) creates a static array that contains one entry for each virtual function that the class can call.
struct Animal { virtual void move_forward() = 0; }; struct Fish : public Animal { void move_forward() override { // swim using fins } }; struct Snake : public Animal { void move_forward() override { // use serpentine method } }; struct Feline : public Animal { void move_forward() override { // walk using paws } }; void func() { using pointer_type = std::unique_ptr<Animal>; pointer_type my_animal(new Snake()); my_animal->move_forward(); // use serpentine method }
As mentionned in the previous section, a virtual function call typically means a call via a function pointer stored in a vTable.
- Pros : Allows runtime polymorphism
- Cons : Bad performance impact
Why ?
When called, a virtual function require first to read the adress of the function. Thus, while the function instruction are loaded into the memory, the CPU will idle, waiting.
Great ! We now can summarize our list of requierement, and dive into code experiments !
Requierements:
- Interface
- Polymorphism
- Minimal amount of easy-to-read code
- Better performances than vtables
Let's use the following use-case:
- An actor that can receive, queue, and reacts to messages, and update itself as well.
Actor scenario:
- Receives and queues 3 messages, one after another
- Updates itself
- Handles queued messages
Additionaly, we want Actor
to be polymorphic, in order to handle a collection or them.
And we want to be able to add or remove them dynamically from the collection.
For example purpose, we use integer (int
) as message type, just like:
using message_type = int;
Using inheritance, our abstract class may looks like this:
struct Actor
{
virtual ~Actor() = default;
virtual void update() = 0;
void handle_queued_messages() { /* handle queued message one after another */ }
void receive_message (message_type && msg)
{ // queue `msg` into `pending_messages`
pending_messages.emplace(std::forward<message_type>(msg));
}
private:
std::queue<message_type> pending_messages;
virtual void handle_one_message (message_type && msg) = 0;
};
struct A : Actor
{
void update() override { /* impl ... */ }
void handle_one_message (message_type && msg) override { /* impl ... */ }
};
struct B : Actor
{
void update() override { /* impl ... */ }
void handle_one_message (message_type && msg) override { /* impl ... */ }
};
using container_type = std::vector<std::unique_ptr<using_inheritance::Actor>>;
container_type actors;
// fill actors with A-s and B-s
for (auto & active_actor : actors)
{ // broadcast messages ...
active_actor->receive_message(41);
active_actor->receive_message(42);
active_actor->receive_message(43);
}
for (auto & active_actor : actors)
{
active_actor->update();
active_actor->handle_all_messages();
}
In the code snippet above, we can see two issues :
- Pointers indirection when dereferencing
active_actor
for each member function call. - Virtual Table usage with pure virtual functions calls :
update()
andhandle_one_message(msg)
.
Curiously recurring template pattern (CRTP), is a C++ idiom in which a class derive from a template class instanciation that use the first one as template argument.
It allows safe, static downcasting, from the base class into the derived one.
If you want more informations about CRTP, please consider reading this blog serie, from fluentcpp.com.
template <typename T>
class Base
{};
class Derived : public Base<Derived>
{};
The main advantage of doing such thing is that from the Base
class perspective, the Derived
object is itself but downcasted.
Thus, by design, because the base is always inherited from by its template parameter, we can use static_cast
instead of dynamic_cast
.
In summary, using static_cast
, the Base
class can access the Derived
class by downcasting itself into the Derived
class.
template <typename T>
struct Base
{
void do_stuff()
{
T & as_derived = static_cast<T&>(*this);
// do stuffs with `as_derived`
}
};
Also, we need to use two CRTP best-practices :
- Base class has private
constructor
- Base class is
friend
with its derived classes
This way, we ensure template
Base
class is always instanciated by the class it is derived from.
Below is legal (compiling) code. Check it on godbolt.
template <typename T>
struct Base
{};
struct Impl1 : public Base<Impl1>
{};
struct Impl2 : public Base<Impl1> // oops ! Shoud be base<impl_2>
{};
But if we use the tricks mentioned above, Impl2
does not compile anymore.
template <typename T>
struct Base
{
private:
base() = default;
friend T;
};
Looks good so far. Let's see our current implementation progress (collapsible, click to expand) :
Interface
template <typename T>
struct Actor
{
void update()
{
as_underlying().update();
}
void handle_all_messages()
{
while (!pending_messages.empty())
{
auto message = std::move(pending_messages.front());
pending_messages.pop();
handle_one_message(std::move(message));
}
}
void receive_message(message_type && msg)
{
pending_messages.emplace(std::forward<message_type>(msg));
}
private:
friend T;
Actor() = default;
std::queue<message_type> pending_messages;
inline T & as_underlying()
{
return static_cast<T&>(*this);
}
void handle_one_message(message_type && msg)
{
as_underlying().handle_one_message(std::forward<message_type>(msg));
}
};
Derived classes
struct A : Actor<A>
{
using Actor::Actor;
void update(){ /* impl ...*/ }
private:
friend struct Actor<A>;
void handle_one_message(message_type && msg){ /* impl ...*/ }
};
struct B : Actor<B>
{
using Actor::Actor;
void update(){ /* impl ...*/ }
private:
friend struct Actor<B>;
void handle_one_message(message_type && msg){ /* impl ...*/ }
};
Let's have a look to our checklist of requierements :
- Interface
- Polymorphism
- Minimal amount of easy-to-read code
- Better performances than vtables
CRTP looks great, but in opposition to inheritance, we lost polymorphism.
Remember, we need to get multiple implementation of actor into a container.
using container_type = std::vector<std::unique_ptr<Actor</* ? */>>>;
Also, what if we could avoid the usage of pointers as container's value_type
?
Well, we know our implementation types at compile time.
So, an all designated solution might be to use std::variant.
template <typename ... Ts>
using poly_T = std::variant<Ts...>;
using container_type = std::vector
<
poly_T<A, B>
>;
So we can get the following usage : (Test it on Godbolot)
container_type actors
{
A{},
B{},
A{}
/* etc ... */
};
actors.emplace_back(A{});
actors.emplace_back(B{});
Before we can check our "polymorphism" from our list of requierement,
we need to find a synthax to call member functions.
Once again, the STL provides an all designated solution, using std::variant's std::visit.
template <class R, class Visitor, class... Variants>
constexpr R visit (Visitor&& my_visitor, Variants&&... my_var);
std::visit
applies the visitormy_visitor
to thestd::variant my_var
TheVisitor
is any callable that covers every possible alternatives ofVariants
Thus, in order to interact with our std::variant
, we can define visitors in many ways.
In our specific case, we just need generic lambdas that were introduced with C++14.
Implementing our use case, we can write the following code:
container_type actors; /* contains many A-s and B-s */
for (auto & active_actor : actors)
{ // broadcast messages ...
std::visit([](auto & act)
{
act.receive_message(41);
act.receive_message(42);
act.receive_message(43);
}, active_actor);
}
for (auto & active_actor : actors)
{ // update, then handle pending messages
std::visit([](auto & act)
{
act.update();
act.handle_all_messages();
}, active_actor);
}
Another great advantage of this design is that we can handle the template parameters of std::variant
differently (here, A
and B
),
without using dynamic_cast.
Sometimes, you may want to have additional code for a specific type.
In legacy code (using inheritance), we would do the following ugly thing:
struct Base
{
virtual ~Base() = default;
};
struct A : public Base{};
struct B : public Base{};
void handle_base_value(Base * value)
{
if (A * value_as_A_ptr = dynamic_cast<A*>(value))
; // deal with value_as_A_ptr
else if (B * value_as_B_ptr = dynamic_cast<B*>(value))
; // deal with value_as_B_ptr
else
; // deal with other types
}
void func()
{
Base * my_value = new A();
handle_base_value(my_value);
}
Using std::variant
and std::visit
we don't need dynamic_cast
and pointers anymore, because types are known at compile time.
for (auto & active_actor : actors)
{
std::visit( [](auto & act)
{
using T = std::decay_t<decltype(arg)>;
if constexpr (std::is_same_v<T, A>)
; // `act` is an `A`
else if constexpr (std::is_same_v<T,B>)
; // `act` is a `B`
else
// handle other types or stop compilation
static_assert(always_false<T>::value, "non-exhaustive visitor!");
}, active_actor);
}
Alternatively, we still can define visitor types the old way, defining all operator()
overloads so it is exhaustive.
using visitor_type = struct
{
void operator() (using_CRTP_and_variants::A &) {}
void operator() (using_CRTP_and_variants::B &) {}
};
for (auto & active_actor : actors)
{
std::visit(visitor_type{}, active_actor);
}
In order to avoid if-constexpr and reduce the amount of code, another alternative is to use the convinient overloaded lambas trick:
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
for (auto & active_actor : actors)
{
std::visit(overloaded
{
[](A & arg){ /* `arg` is an A */ },
[](B & arg){ /* `arg` is a B */ },
[](auto & ){ /* other types */ }
}, active_actor);
}
The shorter the better, right ?
Let's have a look to our checklist :
- Interface
- Polymorphism
- Minimal amount of easy-to-read code
- Better performances than inheritance
Let's try the snippet (see below) on quick-bench.com, using C++20 standard and O3 optimization level.
Compiler | STL | CTRP + variants | vtable | how much faster? | link |
---|---|---|---|---|---|
Clang 7.0 | Libc++ (LLVM) | 345 | 502 | 1.5 times | test it ! |
GCC 8.2 | libstdC++ (GNU) | 281 | 667 | 2.4 times faster | test it ! |
From 1.5 to 2.4 times faster. Now we can check our "Better performances" checkbox.
- Interface
- Polymorphism
- Minimal amount of easy-to-read code
- Better performances than inheritance
See the benchmark complete snippet
// gcc : CRTP_and_variant is 2.5 times faster (libstdc++, GNU)
// clang : CRTP_and_variant is 1.4 times faster (libc++, LLVM)
#include <queue>
#include <variant>
#include <iostream>
#include <memory>
using message_type = int;
namespace using_CRTP_and_variants
{
template <typename T>
struct Actor
{
void update()
{
as_underlying().update();
}
void handle_all_messages()
{ // internal states only
while (!pending_messages.empty())
{
auto message = std::move(pending_messages.front());
pending_messages.pop();
handle_one_message(std::move(message));
}
}
void receive_message (message_type && msg)
{
pending_messages.emplace(std::forward<message_type>(msg));
}
private:
friend T;
Actor() = default;
std::queue<message_type> pending_messages;
inline T & as_underlying()
{
return static_cast<T&>(*this);
}
inline T const & as_underlying() const
{
return static_cast<T const &>(*this);
}
void handle_one_message (message_type && msg)
{
as_underlying().handle_one_message(std::forward<message_type>(msg));
}
};
struct A : Actor<A>
{
using Actor::Actor;
void update()
{
//std::cout << "A : update()\n";
}
private:
friend struct Actor<A>;
void handle_one_message (message_type && msg)
{
//std::cout << "A : handle_one_message : " << msg << '\n';
}
};
struct B : Actor<B>
{
using Actor::Actor;
void update()
{
//std::cout << "B : update()\n";
}
private:
friend struct Actor<B>;
void handle_one_message (message_type && msg)
{
//std::cout << "B : handle_one_message : " << msg << '\n';
}
};
}
namespace using_inheritance
{
struct Actor
{
virtual ~Actor() = default;
virtual void update() = 0;
void handle_all_messages()
{ // internal states only
while (!pending_messages.empty())
{
auto message = std::move(pending_messages.front());
pending_messages.pop();
handle_one_message(std::move(message));
}
}
void receive_message (message_type && msg)
{
pending_messages.emplace(std::forward<message_type>(msg));
}
private:
std::queue<message_type> pending_messages;
virtual void handle_one_message(message_type && msg) = 0;
};
struct A : Actor
{
void update() override
{
//std::cout << "A : update()\n";
}
void handle_one_message(message_type && msg) override
{
//std::cout << "A : handle_one_message : " << msg << '\n';
}
};
struct B : Actor
{
void update() override
{
//std::cout << "B : update()\n";
}
void handle_one_message(message_type && msg) override
{
//std::cout << "B : handle_one_message : " << msg << '\n';
}
};
}
template <typename ... Ts>
using poly_T = std::variant<Ts...>;
static void test_CRTP_and_variants(benchmark::State& state) {
using container_type = std::vector<poly_T
<
using_CRTP_and_variants::A,
using_CRTP_and_variants::B>
>;
container_type actors
{
using_CRTP_and_variants::A{},
using_CRTP_and_variants::B{},
using_CRTP_and_variants::A{},
using_CRTP_and_variants::B{},
using_CRTP_and_variants::A{},
using_CRTP_and_variants::B{},
using_CRTP_and_variants::A{},
using_CRTP_and_variants::B{},
using_CRTP_and_variants::A{},
using_CRTP_and_variants::B{}
};
for (auto _ : state) {
for (auto & active_actor : actors)
{ // broadcast messages ...
std::visit([](auto & act)
{
act.receive_message(41);
act.receive_message(42);
act.receive_message(43);
}, active_actor);
}
for (auto & active_actor : actors)
{
std::visit([](auto & act)
{
act.update();
act.handle_all_messages();
}, active_actor);
}
benchmark::DoNotOptimize(actors);
}
}
// Register the function as a benchmark
BENCHMARK(test_CRTP_and_variants);
static void test_inheritance(benchmark::State& state) {
using container_type = std::vector<std::unique_ptr<using_inheritance::Actor>>;
container_type actors;
{
actors.emplace_back(std::make_unique<using_inheritance::A>());
actors.emplace_back(std::make_unique<using_inheritance::B>());
actors.emplace_back(std::make_unique<using_inheritance::A>());
actors.emplace_back(std::make_unique<using_inheritance::B>());
actors.emplace_back(std::make_unique<using_inheritance::A>());
actors.emplace_back(std::make_unique<using_inheritance::B>());
actors.emplace_back(std::make_unique<using_inheritance::A>());
actors.emplace_back(std::make_unique<using_inheritance::B>());
actors.emplace_back(std::make_unique<using_inheritance::A>());
actors.emplace_back(std::make_unique<using_inheritance::B>());
}
for (auto _ : state) {
for (auto & active_actor : actors)
{ // broadcast messages ...
active_actor->receive_message(41);
active_actor->receive_message(42);
active_actor->receive_message(43);
}
for (auto & active_actor : actors)
{
active_actor->update();
active_actor->handle_all_messages();
}
}
}
BENCHMARK(test_inheritance);
According to our design, we can define an interface, create many polymorphic implementations with a minimal amount of readable code.
As a result, we end with better performances than the old fashion way "inheritance with virtual table".
However, is it worthy ?
- no more virtual table
std::visit
flexibility- Better performance
sizeof(std::variant<...>)
We can already see an issue : std::variants<...>
size on the stack.
Indeed, the size of a std::variant
is slightly greater than the type with the largest alignement it contains,
as it must also store the information of which type it currently contains.
-
CRTP can hide/shadow functions
CRTP can hide/shadow member functions, and thus lead to unexpected behaviors.
In order to avoid this, as saw previously, we used explicit base contructor call andfriend
ship.
However, according to Herb Sutter's talks "Thoughts on a more powerful and simpler C++", we may find a way to solve this in the future using metaclasses. Maybe...
We still can do some static checks using SFINAE with std::void_t and std::experimental::is_detected
todo : MAJ the TOC