Last updated: 14 July 2022
Polymorphism is centred on the idea of function dispatch. Whenever we write code that invokes different behaviour depending on the type of an object, we are performing a polymorphic operation. Because we might not necessarily know the type of an object, polymorphic code exhibits different behaviour that depends on other (compile time or run time) factors.
It’s easier to show by example.
struct Expr {
    enum Op { PLUS, MINUS, VALUE } op;
    Expr* lhs;  // Only if `op != VALUE`
    Expr* rhs;  // Only if `op != VALUE`
    int value;  // Only if `op == VALUE`

    int evaluate() const {
        switch (op) {  // <-- Polymorphism!
        case PLUS:     // <-- type
            return lhs->evaluate() + rhs->evaluate();
        case MINUS:    // <-- type
            return lhs->evaluate() - rhs->evaluate();
        case VALUE:    // <-- type
            return value;
        }
    }
};
struct SortedInts {
    // v-- Polymorphism!
    bool (*comp)(int, int);
    // Note: we can use a template
    // instead, which would let us
    // use function objects as well
    std::vector<int> ints;

    void insert(int x) {
        // don't @ me for this algo
        ints.push_back(x);
        std::sort(ints.begin(), ints.end(), comp);
    }
};
template <typename T>
T min(T a, T b) {
    //       v-- Polymorphism!
    return a < b ? a : b;
}
int main() {
    using namespace std::literals;
    // T = int,
    // < compares ints
    min(1, 2);
    // T = double,
    // < compares doubles
    min(1.0, 2.0);
    // T = std::string,
    // < compares std::strings
    min("Hi"s, "mark"s);
    // Different assembly is
    // emitted in all 3 cases
}
// Polymorphism?
// vv vv
std::cout << "Hello" << "world\n"s;
template <typename T>
void print(T x) { std::cout << x; }
// ^^
// Polymorphism!
// Polymorphism! Polymorphism?
// v~~~~~~~~~~~v v
std::unordered_map<std::string, int> m1;
std::unordered_map<int, int> singtel;
// Polymorphism?
m1.emplace("Hi", 1);
// emplace calls
// `std::hash<std::string>{}("Hi"s)`
// which is some string hashing algo
singtel.emplace(2, 3);
// emplace calls
// `std::hash<int>{}(2)`
// which is the identity function
In this lecture, we cover polymorphism, starting with philosophy, moving on to relatively language-agnostic patterns for obtaining polymorphism, and finally showing the C++ language features that help us achieve polymorphism.
Polymorphism is centred on the idea of dispatch, but there are (at least) two places where dispatch can be done.
The first is during compile time, where the compiler inspects the types of the arguments and decides what implementation to call.
This is also called static dispatch, and here we sometimes qualify the definition further to specify that the compiler is inspecting the static types of the arguments during the dispatch.
Everything we mentioned in the previous lecture more or less falls into this category.
The second is during runtime, where we use the “runtime type” of the arguments to decide what implementation to call. We do this by finding some way to encode the “runtime type” of an object into the object itself.
At runtime, the static types of the arguments have obviously been fixed, so static dispatch, even if it occurred, does not affect what this second round of dispatch will do.
Philosophically, I am rather loose about what runtime information we might inspect in order to determine the “runtime type”, as I think it’s better not to pigeonhole yourself into one mode of thinking. This might be the dynamic type, or some other information entirely, like function pointers or enum values.
// Static dispatch, static types
// decide which impl to call
int min(int, int);
long min(long, long);
// Dispatch happens here
min(1, 2);

// Dynamic dispatch, runtime
// information decides
struct Expr {
    void* data_ptr;
    int (*evaluate_ptr)(void*);

    int evaluate() {
        // Dispatch happens here
        return evaluate_ptr(data_ptr);
    }
};
Within static polymorphism, there is another dimension: ad-hoc vs parametric. The distinction is that parametric polymorphism allows us to write a single implementation that operates identically over a range of types. We know this in C++ as templates, since their bodies are identical (specialisations notwithstanding), but they can be instantiated with different types.
On the other hand, ad-hoc polymorphism is the case where a single polymorphic function can have potentially distinct implementations based on the input types; in C++ we know this as function overloading.
Similarly, we can have either closed or open dynamic polymorphism.
With dynamic polymorphism, the compiler will have fixed some dispatching algorithm at compile time, but this dispatching algorithm will call an “actual” implementation at runtime.
The question is how many different “actual” implementations are possible with such a system. If the number of possible actual implementations is arbitrary, we say that the polymorphism is open. Otherwise, we say it is closed.
struct Expr {
    void* data;
    int (*evaluate_ptr)(void* data);

    int evaluate() {
        // Many implementations
        return evaluate_ptr(data);
    }
};

struct Expr {
    enum Op { PLUS, MINUS, VALUE } op;
    Expr* lhs;
    Expr* rhs;
    int value;

    int evaluate() {
        switch (op) {
            // At most 3 implementations
        }
    }
};
Lastly, C++ also has a little-known kind of polymorphism — keyword polymorphism. This refers to the case where a single keyword has different behaviour depending on where it’s used. Here’s a (probably non-exhaustive) list of them:
- requires
- noexcept
- template
- virtual
- class / typename (in template parameters)
- enum class
- typename
- static
- ... (template <typename...>, void foo(Args...), void printf(const char*, ...))
- -> (foo->bar, trailing return types)
- auto
As you can see, there are many dimensions to polymorphism :D. C++ is probably the language with the greatest amount of keyword polymorphism (that we know of, at least).
Is this polymorphic?
int add(int a, int b) { return a + b; }
long add(long a, long b) { return a + b; }
add overloads
Most people would say no. And yet, if we invoked add with
unknown types, we require static dispatch to be done before we can see
what implementation is used, which is polymorphism.
template <typename T>
requires std::same_as<int, T> || std::same_as<long, T>
T f(T a, T b) {
    return add(a, b);
}
add with unknown types
However, the only polymorphic part is the add itself. While
the individual implementations of add wouldn’t be considered
“polymorphic” by most people, the effect of having both overloads means
that add may be used in a context where the type is unknown. In such cases,
the use of add would definitely be considered polymorphic.
Just like how lvalues and lvalue references are different kinds of terminology (value category vs type), there is also a slight nuance here. We might have non-polymorphic code that makes use of polymorphic functions, or polymorphic code that doesn’t use any polymorphic functions.
template <class T>
bool is_int(T) {
    return false;
}

template <>
bool is_int(int) {
    return true;
}

int my_min(int a, int b) {
    return std::min(a, b);
}

template <typename T>
T my_max(T a, T b) {
    return -std::min(-a, -b);
}
All the tools that we covered in the previous lecture (templates,
constexpr, SFINAE, concepts, etc.) can be used to implement
static polymorphism in C++.
Since templates are instantiated at compile-time with constant types and values, they are clearly an example of static polymorphism; at the callsite, the compiler performs the instantiation (deduction and substitution) and calls exactly that function. Specifically, this is a form of parametric polymorphism.
In this section, we’ll talk more specifically about how these tools enable you to write polymorphic code.
println example
Revisiting our println example yet again, we’ll annotate each
place in the code where some kind of polymorphism appears:
template <typename Arg>  // <<-- (1) -- vv
void print_one(const char*& fmt, size_t& len, Arg&& arg) {
    for (size_t i = 0; i < len; i++) {
        if (fmt[i] == '{' && i + 1 < len && fmt[i + 1] == '}') {
            //                                    vv -- (2)
            std::cout << std::string_view(fmt, i) << arg;
            //                             (3) -- ^^
            fmt += (i + 2);
            len -= (i + 2);
            return;  // handle one "{}" per argument
        }
    }
}

void println(const char* fmt) {
    //        vv     vv -- (4)
    std::cout << fmt << "\n";
}

template <typename... Args>  // << vv -- (5)
void println(const char* fmt, Args&&... args) {
    auto len = strlen(fmt);
    //  vv -- (6)
    (print_one(fmt, len, std::forward<Args>(args)), ...);
    std::cout << "\n";
    //        ^^ -- (7)
}

int main() {
    // vv -- (8)
    println("hello, {} #{}", "world", 42);
}
println function from our previous lecture
Let’s go through each site:

1. print_one is a polymorphic function, since it is templated
2. A call to operator<< (which is overloaded)
3. A call to operator<<, since it varies depending on the type of arg
4. Calls to the overloaded operator<<
5. println is a polymorphic function. We can also consider it “multiply polymorphic”, since it is also polymorphic on the number of arguments (but still static, since it is known at compile-time)
6. A call to the polymorphic print_one
7. A call to the overloaded operator<<
8. A call to the polymorphic println
As you can see, we’ve covered the different ways that polymorphism can appear in code; in this example, templates were the main mechanism that we used to achieve static polymorphism. In fact, all static polymorphism in C++ must be done with either overloading or templates.
constexpr
Another way we can implement static polymorphism is through
constexpr, in a way that is very similar to our switch-case example above. Rather
than switching on the runtime value of a type variable, we
now switch on the compile-time type of the expression. For example:
struct UnaryOp {};
struct BinaryOp {};

template <typename Expr>
auto evaluate(const Expr& expr) {
    if constexpr (std::is_same_v<Expr, UnaryOp>) {
        // do some stuff
    } else if constexpr (std::is_same_v<Expr, BinaryOp>) {
        // do some other stuff
    } else {
        // do some other other stuff
        (void) expr;
    }
}
constexpr
Since we can make templated functions but also overload them, we can get a combination of both ad-hoc and parametric polymorphism in C++.
Together with the overload resolution rules we covered last lecture, this lets us have finer-grained control over exactly which function is called given a set of argument types. For example, we can have a set of overloads to handle specific cases, while also having a templated function as a fallback.
Note that explicit function specialisation serves much the same purpose as overloading, and can be thought of as a form of ad-hoc polymorphism as well (since the specialisations are distinct implementations).
Here, we introduce yet another initialism into our ever-growing collection — the Curiously Recurring Template Pattern, or CRTP. The pattern is essentially to make a templated base class, and template it with the derived class, like this:
template <typename Impl>
struct Base {
    void foo() {
        auto derived = static_cast<Impl*>(this);
        derived->implementation();
    }
    // more methods and stuff
};

struct Derived : Base<Derived> {
    void implementation() {
        std::cout << "uwu\n";
    }
    // more methods and stuff
};
Why would we want to do something like that? Consider what we know from non-virtual inheritance (the kind introduced in Lecture 2):
struct Base {
    void foo() {
        std::cout << "Base::foo\n";
        this->impl();
    }
    void impl() {
        std::cout << "Base::impl\n";
    }
};

struct Derived : Base {
    void impl() {
        std::cout << "Derived::impl\n";
    }
};

Base b{};
b.foo();
std::cout << "\n";

Derived d{};
d.foo();
$ ./main.out
Base::foo
Base::impl
Base::foo
Base::impl
What we might have wanted to see was
Derived::impl in the second call, but we got
Base::impl instead. If we think about it, this is completely
expected; the base class will perform static dispatch, and will
only ever call the base class methods — even if the runtime type of the
object is Derived.
This pattern is common when we have a derived class that needs to override some specific behaviour, but still wants to re-use most of the implementation from the base class (eg. setup/teardown code).
How CRTP works is that we rely on the derived class to provide some (or all) of the actual implementation, while the base class is responsible for either providing the interface, or handling any setup or wrapping that needs to happen.
We make use of
static_cast
here to cast the base class’s
this pointer
to a pointer to the derived class, which is safe as long as a class
only ever inherits from the base instantiated with itself.
Then, we are able use the derived class — call both member and static functions, access fields, etc. This happens entirely at compile-time, since the type of the derived class is known (as a template argument) by the compiler, making CRTP a kind of static polymorphism. At each call site we know exactly what derived type we have, and thus exactly what derived implementation method to call.
You may wonder how the base class can even know about the methods of the derived class, let alone call them. This is possible thanks to the way C++ compiles templates; the body of a templated function is only instantiated when it is called — lazily.
So, as long as the definition of the derived class is available by the time you call the methods, then everything works!
template <typename T>
struct Foo {
    void foo() {
        std::cout << "hello\n";
    }
    void poison() {
        static_assert(std::is_same_v<T, char>);
    }
};

int main() {
    Foo<int> f{};
    f.foo();
}
The above code works without issues! Of course, if you actually try to
call poison(), then you get a compile error.
As mentioned above, one of the main reasons that CRTP is employed is to reuse code from the base class while customising some aspect of it, or providing some functionality that the base class requires. For example, we might have code that is designed to download stuff from websites:
template <typename T>
struct Downloader {
    void download() {
        auto impl = static_cast<T*>(this);
        connect_to_website();
        impl->authenticate();
        do_download();
    }

protected:
    void send(...) {}
    void cookie(...) {}
};

struct Luminus : Downloader<Luminus> {
    void authenticate() {
        cookie("oauth", "asdf");
    }
};

struct Github : Downloader<Github> {
    void authenticate() {
        send("torvalds", "*******");
        cookie("my_cookie", 6969);
        cookie("csrf", "aaaaaaaa");
    }
};
Here, we can factor out the common code (opening the socket, sending HTTP requests, that kind of thing) into the base class. Then, we can use the fact that the base class knows exactly what the derived class is (via the template argument) to call website-specific functionality (like authentication).
Another good example of CRTP used in the interface/code-reuse way is in
Boost’s iterator_facade.
Another instance where CRTP is commonly used is when we want the derived class to be correctly “preserved” when returning from a base class method. Let’s implement our own printing class that makes use of method chaining:
struct MyBadCout {
    MyBadCout& print(auto x) {
        std::cout << x;
        return *this;
    }
};

struct ColourfulBadCout : MyBadCout {
    ColourfulBadCout& set_colour(int colour) {
        std::cout << "\x1b[" << colour << "m";
        return *this;
    }
};
The problem with this class is that once we call print(), we
can no longer call set_colour() any more:
ColourfulBadCout colr{};
colr.set_colour(1)
    .print("hello");
    // .set_colour(10); <-- this won't compile!
This is because print() always returns an instance of
MyBadCout, and never anything else. We can solve this with
CRTP, and now our original example will compile:
template <typename T>
struct MyCout {
    T& print(auto x) {
        std::cout << x;
        return static_cast<T&>(*this);
    }
};

struct BasicPrinter : MyCout<BasicPrinter> {};

struct ColourfulPrinter : MyCout<ColourfulPrinter> {
    ColourfulPrinter& set_colour(int colour) {
        std::cout << "\x1b[" << colour << "m";
        return *this;
    }
};

ColourfulPrinter colr{};
colr.set_colour(1)
    .print("hello")
    .set_colour(10)
    .print(" world");
Note that we had to create a BasicPrinter here, because
MyCout needs to be instantiated with some class.
Even though CRTP seems quite powerful, there are still some limitations, the main one being that now there are lots of templates in your code, and any user of the base class must now be templated:
template <typename T>
void do_stuff(const Base<T>& obj) {
    // ...
}
This is because Base itself isn’t a valid type, and we can
only refer to a specific instantiation of it. This also applies to storage
— it’s now hard to have a container of bases (that can be
any derived type). For instance,
std::vector<Base<A>*> can only store instances of
A and not B, which sort of defeats the purpose.
One way around it is to make a new, “AbstractBase” class that the base CRTP class inherits from:
struct AbstractBase {};

template <typename T>
struct Base : AbstractBase {
    // ...
};
Now, we can have std::vector<AbstractBase*>, or pass it
to functions that are not templated. Of course, this doesn’t avoid the
issue entirely, only postpones it. When we want to actually use the
object, we still need to cast it — either directly to the derived type, or
to a
Base<T>.
The main thing to keep in mind is that at the end of the day, all dynamic polymorphism depends on some runtime state in order to execute different code. This can be achieved in many different ways.
We’ll first show examples of closed polymorphism patterns, and later show examples of open polymorphism patterns.
We’ve already shown many examples of this so far.
struct Expr {
    enum Op { PLUS, MINUS, VALUE } op;
    Expr* lhs;  // Only if `op != VALUE`
    Expr* rhs;  // Only if `op != VALUE`
    int value;  // Only if `op == VALUE`

    int evaluate() const {
        switch (op) {
        case PLUS:
            return lhs->evaluate() + rhs->evaluate();
        case MINUS:
            return lhs->evaluate() - rhs->evaluate();
        case VALUE:
            return value;
        }
    }
};

int main() {
    Expr a{.op = Expr::VALUE, .value = 5};
    Expr b{.op = Expr::VALUE, .value = 8};
    Expr c{.op = Expr::PLUS, .lhs = &a, .rhs = &b};
    std::cout << "5 + 8 = " << c.evaluate() << std::endl;
}
We can also use a union to shrink the struct layout, by overlapping the fields that are never used at the same time.
struct Expr {
    struct BinaryArgs {
        Expr* lhs;
        Expr* rhs;
    };
    union {
        BinaryArgs args;
        int value;
    };
    enum Op { PLUS, MINUS, VALUE } op;
    // ... evaluate, etc. ...
};

int main() {
    Expr a{.value = 5, .op = Expr::VALUE};
    Expr b{.value = 8, .op = Expr::VALUE};
    Expr c{.args = {&a, &b}, .op = Expr::PLUS};
    std::cout << "5 + 8 = " << c.evaluate() << std::endl;
}
std::variant
Rather than managing the tagged union ourselves, we could also throw the
problem at the standard library. While we get much better exception safety
and type checking, this comes at a cost of slightly more annoying code,
since we have to use std::visit or std::get and
all that jazz.
template <typename... T>
struct overloaded : T... {
    using T::operator()...;
};

template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;
struct Expr {
    struct PlusExpr {
        Expr* lhs;
        Expr* rhs;
    };
    struct MinusExpr {
        Expr* lhs;
        Expr* rhs;
    };
    struct ValueExpr {
        int value;
    };

    std::variant<PlusExpr, MinusExpr, ValueExpr> expr;

    // ... evaluate ...
};
int evaluate() const {
    return std::visit(
        overloaded{
            [](const PlusExpr& expr) {
                return expr.lhs->evaluate() + expr.rhs->evaluate();
            },
            [](const MinusExpr& expr) {
                return expr.lhs->evaluate() - expr.rhs->evaluate();
            },
            [](const ValueExpr& expr) {
                return expr.value;
            },
        },
        expr);
}
int main() {
    Expr a{Expr::ValueExpr{5}};
    Expr b{Expr::ValueExpr{8}};
    Expr c{Expr::PlusExpr{&a, &b}};
    std::cout << "5 + 8 = " << c.evaluate() << std::endl;
}
std::variant for closed polymorphism
Since tags can only encode a limited number of enumeration values that are fixed at the point of declaration, they cannot be used to implement open polymorphism.
Similarly, switch case statements are useless because they only have a fixed number of branches.
Instead, we need to use function pointers, so that the number of possible
behaviours simply depends on how many functions we can create, which is
effectively unlimited. To give some runtime object a particular
behaviour, we simply need to give it access to this function pointer.
There are many ways this function pointer can be encoded. In general, there are two dimensions to this:
|  | Internal (in struct) | External (alongside pointer) |
|---|---|---|
| Directly | Linux container_of, Zig @fieldParentPtr | Not really used |
| Through a vtable | C++ virtual | Rust dyn Trait, Go interface |
We’ll actually first go through an even more naive pattern, where the data struct and the polymorphic handle are the same struct. We’ll call this the “naive” approach; it’s a sort of combination of both internal and external. It has the downsides of both, and is arguably not even openly polymorphic :P
Another thing we’ll discuss is when we want a particular concrete object to support multiple orthogonal interfaces. This turns out to be very easy in the external case (e.g. Rust and Go), but very tricky in the internal case (e.g. Linux, Zig, and C++).
Finally, throughout these examples, try to be aware of where the code that does the dispatching lives. You will see subtle differences in all the solutions we’re discussing today.
The most basic way to implement dynamic polymorphism is to store function pointers directly in the struct we want to imbue with polymorphic behaviour. We’ll also just use this struct as the handle, which means that the dispatching code lives in the same struct.
Here’s an example (and we will reuse this interface for the rest of the section):
struct Expr {
    // Contains the "runtime type"
    int (*evaluate_ptr)(Expr* lhs, Expr* rhs, int value);
    void (*print_ptr)(Expr* lhs, Expr* rhs, int value);

    // Data fields that are needed for correct `Expr`ing
    Expr* lhs;
    Expr* rhs;
    int value;

    int evaluate() {
        // Dispatcher
        return evaluate_ptr(lhs, rhs, value);
    }
    void print() {
        // Dispatcher
        print_ptr(lhs, rhs, value);
    }
};
While this is certainly powerful enough to create what we want, in order to actually get anything useful out of this struct, we need to populate the function pointer member fields with actual functions that do the behaviour we want.
It’s unreasonable to simply ship this and expect the user of our library to do it all. So we can create some factory methods:
Expr make_plus(Expr* lhs, Expr* rhs) {
    return {
        [](Expr* lhs, Expr* rhs, int) {
            return lhs->evaluate() + rhs->evaluate();
        },
        [](Expr* lhs, Expr* rhs, int) {
            std::cout << '(';
            lhs->print();
            std::cout << " + ";
            rhs->print();
            std::cout << ')';
        },
        lhs,
        rhs,
        0,
    };
}

Expr make_minus(Expr* lhs, Expr* rhs) {
    return {
        [](Expr* lhs, Expr* rhs, int) {
            return lhs->evaluate() - rhs->evaluate();
        },
        [](Expr* lhs, Expr* rhs, int) {
            std::cout << '(';
            lhs->print();
            std::cout << " - ";
            rhs->print();
            std::cout << ')';
        },
        lhs,
        rhs,
        0,
    };
}

Expr make_value(int value) {
    return {
        [](Expr*, Expr*, int value) { return value; },
        [](Expr*, Expr*, int value) { std::cout << value; },
        nullptr,
        nullptr,
        value,
    };
}
In order to use this, the user can now do the following:
int main() {
    Expr a = make_value(3);
    Expr b = make_value(23);
    Expr c = make_plus(&a, &b);
    c.print();
    std::cout << " = " << c.evaluate() << std::endl;
}
$ ./internal-direct-1.out
(3 + 23) = 26
Since the polymorphism is open, if the user is unhappy with the library, they might choose to add new possible behaviours, though they cannot change the actual interface.
Expr make_times(Expr* lhs, Expr* rhs) {
    return {
        [](Expr* lhs, Expr* rhs, int) {
            return lhs->evaluate() * rhs->evaluate();
        },
        [](Expr* lhs, Expr* rhs, int) {
            std::cout << '(';
            lhs->print();
            std::cout << " * ";
            rhs->print();
            std::cout << ')';
        },
        lhs,
        rhs,
        0,
    };
}

int main() {
    Expr a = make_value(3);
    Expr b = make_value(23);
    Expr c = make_times(&a, &b);
    c.print();
    std::cout << " = " << c.evaluate() << std::endl;
}
$ ./internal-direct-2.out
(3 * 23) = 69
Notice that in our example, we had a total of 6 custom behaviours:

- evaluate and print for plus
- evaluate and print for minus
- evaluate and print for value
We can put the pointers to these implementations in
evaluate_ptr and print_ptr as required, but
there’s no reason we would ever want to mix and match:
evaluate_ptr = plus_behaviour_for_evaluate;
print_ptr = minus_behaviour_for_print;
I’m pointing this out because as the size of our interface grows, so does
the number of possible pairings. Suppose we added 2 more functions to our
interface, say take_derivative and simplify; the
number of possible pairings would grow from
3² = 9 to
3⁴ = 81, which is a lot!
Knowing that most pairings are useless, we notice that we only really have as many possible runtime states as the number of distinct expression types. In our case, that’s 3: plus, minus, and value.
Rather than storing the function pointers separately, we can decrease size
of the Expr struct by storing a pointer to the function
pointers. A single pointer is sufficient as we won’t have exponentially
many distinct expression types.
We call the struct of function pointers a virtual table, or vtable.
struct Expr {
    struct Vtable {
        int (*evaluate_ptr)(Expr* lhs, Expr* rhs, int value);
        void (*print_ptr)(Expr* lhs, Expr* rhs, int value);
    };

    // Expr now only needs 1 pointer instead of 2
    // to store dynamic type information
    Vtable* vtable_ptr;

    Expr* lhs;
    Expr* rhs;
    int value;

    int evaluate() {
        return vtable_ptr->evaluate_ptr(lhs, rhs, value);
    }
    void print() {
        vtable_ptr->print_ptr(lhs, rhs, value);
    }
};
// Each dynamic type first creates a global vtable struct
// containing behaviour function pointers
Expr::Vtable impl_expr_for_plus{
    [](Expr* lhs, Expr* rhs, int) {
        return lhs->evaluate() + rhs->evaluate();
    },
    [](Expr* lhs, Expr* rhs, int) {
        std::cout << '(';
        lhs->print();
        std::cout << " + ";
        rhs->print();
        std::cout << ')';
    },
};

Expr make_plus(Expr* lhs, Expr* rhs) {
    // Each instance can just point to the corresponding vtable
    return {&impl_expr_for_plus, lhs, rhs, 0};
}
int main() {
    Expr a = make_value(3);
    Expr b = make_value(23);
    Expr c = make_plus(&a, &b);
    c.print();
    std::cout << " = " << c.evaluate() << std::endl;
}
$ ./internal-vtable-1.out
(3 + 23) = 26
So far, our behaviour functions simply take in as parameters the members
of the Expr struct.
int evaluate_behaviour(Expr* lhs, Expr* rhs, int value);
void print_behaviour(Expr* lhs, Expr* rhs, int value);
This has some downsides. For one, some fields may not be used by some
behaviours (e.g. print for values doesn’t touch
lhs or rhs), so the caller is passing unnecessary
information to print. Another downside is that the behaviour
functions cannot mutate the Expr struct containing the data
itself, as the fields have all been copied out.
To solve this, we can have our behaviour functions take in a pointer to
the Expr itself, just like a member function. We can add a
new function to our interface that actually mutates the Expr,
called triple.
int evaluate_behaviour(Expr* self);
void print_behaviour(Expr* self);
void triple_behaviour(Expr* self);
Let’s see what this looks like:
struct Expr {
    int (*evaluate_ptr)(Expr* self);
    void (*print_ptr)(Expr* self);
    void (*triple_ptr)(Expr* self);

    Expr* lhs;
    Expr* rhs;
    int value;

    int evaluate() {
        return evaluate_ptr(this);
    }
    void print() {
        print_ptr(this);
    }
    void triple() {
        triple_ptr(this);
    }
};
Expr make_plus(Expr* lhs, Expr* rhs) {
    return {
        [](Expr* self) {
            return self->lhs->evaluate() + self->rhs->evaluate();
        },
        [](Expr* self) {
            std::cout << '(';
            self->lhs->print();
            std::cout << " + ";
            self->rhs->print();
            std::cout << ')';
        },
        [](Expr* self) {
            self->lhs->triple();
            self->rhs->triple();
        },
        lhs,
        rhs,
        0,
    };
}

// ...

Expr make_value(int value) {
    return {
        [](Expr* self) { return self->value; },
        [](Expr* self) { std::cout << self->value; },
        [](Expr* self) { self->value *= 3; },
        nullptr,
        nullptr,
        value,
    };
}
int main() {
    Expr a = make_value(3);
    Expr b = make_value(23);
    Expr c = make_plus(&a, &b);
    c.triple();
    std::cout << "Tripled: ";
    c.print();
    std::cout << " = " << c.evaluate() << std::endl;
}
$ ./internal-direct-3.out
Tripled: (9 + 69) = 78
Notice that whenever we created a value Expr, we still had to
fill in the lhs and rhs fields and set them to
nullptr even
though they are unused.
In fact, in every single type of Expr, we were either leaving
value unused or leaving lhs and
rhs unused. That’s a waste of memory!
It would be far better if we could instead only store the fields necessary
for that particular type of Expr.
Earlier, we showed how vtable pointers can optimise the struct layout by pulling the common function pointer pairings out to an external struct.
Similarly, we can pull out the information required for each type of
Expr into external structs, and store a pointer to it in
Expr.
struct PlusExpr {
    Expr* lhs;
    Expr* rhs;
};

struct MinusExpr {
    Expr* lhs;
    Expr* rhs;
};

struct ValueExpr {
    int value;
};
We can obviously implement their behaviours quite simply, because the types are no longer polymorphic:
struct PlusExpr {
    Expr* lhs;
    Expr* rhs;

    int evaluate() {
        return lhs->evaluate() + rhs->evaluate();
    }
};

struct ValueExpr {
    int value;

    int evaluate() {
        return value;
    }
};
But we get stuck when we try to actually create a PlusExpr of
two ValueExprs:
ValueExpr a{3};
ValueExpr b{23};
PlusExpr c{ ??? };
We cannot construct the PlusExpr, since it
requires polymorphic Expr* pointers
Well that’s obvious! We created non-polymorphic types when we needed
polymorphic ones. PlusExpr cannot be written with anything
other than a polymorphic type!
We now have to look at Expr itself. We need it to store a
pointer to some unknown type, because we’re trying to implement open
polymorphism. Since no single type will work, we use void*,
which means “pointer to unknown type”.
struct Expr {
    int (*evaluate_ptr)(void* self);
    void* expr_ptr;

    int evaluate() { return evaluate_ptr(expr_ptr); }
};
We can now create instances of Expr by creating adapters
between the concrete types like PlusExpr to
Expr, and the other way around from void* to
PlusExpr as well.
We’ll also change PlusExpr to simply take an
Expr directly, now that the Expr is really a fat
pointer to some polymorphic Expr type.
struct PlusExpr {
    Expr lhs;
    Expr rhs;

    int evaluate() {
        return lhs.evaluate() + rhs.evaluate();
    }

    Expr toExpr() {
        // Adapter to Expr
        return {
            // When Expr::evaluate is called, we have to
            // adapt the void* back to a PlusExpr
            [](void* self) {
                return reinterpret_cast<PlusExpr*>(self)->evaluate();
            },
            reinterpret_cast<void*>(this)};
    }
};
Now it works:
int main() {
    ValueExpr a{3};
    ValueExpr b{23};
    PlusExpr c{a.toExpr(), b.toExpr()};
    c.print();
    std::cout << " = " << c.evaluate() << std::endl;
}
$ ./external-direct-1.out
(3 + 23) = 26
Notice that when we switched to fat pointers, we only really use the
polymorphic Expr type when we need to make use of
polymorphism. PlusExpr needs to take in two handles to
“actual” expressions that are fixed in size so that it can operate on any
kind of Expr. However, when we create stuff like
ValueExpr, we need a concrete instance to exist so that the
data is stored somewhere.
Just as we can use vtables to optimise the struct layout for internally stored function pointers, they also work with externally stored function pointers (fat pointers).
struct Expr {
    int (*evaluate_ptr)(void* self);
    void (*print_ptr)(void* self);
    void* expr_ptr;

    int evaluate() {
        return evaluate_ptr(expr_ptr);
    }
    void print() {
        print_ptr(expr_ptr);
    }
};

struct Expr {
    struct Vtable {
        int (*evaluate_ptr)(void* self);
        void (*print_ptr)(void* self);
    };

    Vtable* vtable_ptr;
    void* expr_ptr;

    int evaluate() {
        return vtable_ptr->evaluate_ptr(expr_ptr);
    }
    void print() {
        vtable_ptr->print_ptr(expr_ptr);
    }
};
Initialising the vtable pointer is the same as before:
struct PlusExpr {
    Expr lhs;
    Expr rhs;
    // ...

    static Expr::Vtable impl_expr;

    Expr toExpr() {
        return {&impl_expr, reinterpret_cast<void*>(this)};
    }
};

Expr::Vtable PlusExpr::impl_expr{
    [](void* self) {
        return reinterpret_cast<PlusExpr*>(self)->evaluate();
    },
    [](void* self) { reinterpret_cast<PlusExpr*>(self)->print(); },
};
std::function is a fat pointer
std::function
is a polymorphic fat pointer for any callable type.
For example, we might have the following function object:
struct EvaluableValueExpr {
    int value;

    int operator()() {
        return value;
    }
};
A version of ValueExpr that uses
operator()
instead of evaluate
We can create a std::function from this as follows:
// int() is the type of a function
// that has no parameters and returns int
EvaluableValueExpr a{3};
std::function<int()> a_func{a};
// The type can be inferred thanks to CTAD
std::function a_func2{a};
Creating a std::function fat pointer
The way this works is pretty much the same as how we created our
Expr fat pointer. A subtle difference is that
std::function will copy the given callable and own the copy,
whereas our Expr pointer simply stored a (non-owning)
pointer.
We can use this to adapt PlusExpr as well:
struct EvaluablePlusExpr {
std::function<int()> lhs;
std::function<int()> rhs;
int operator()() {
return lhs() + rhs();
}
};
EvaluableValueExpr a{3};
EvaluableValueExpr b{23};
EvaluablePlusExpr c{a, b};
std::cout << "(3 + 23) = " << c() << std::endl;
$ ./function-1.out
(3 + 23) = 26
Using std::function to adapt PlusExpr
Here’s a simplified implementation of std::function that does
NOT own the callable, and instead just takes a pointer to it:
#include <iostream>
#include <utility>
template <typename Fn>
struct SimpleFunction;
template <typename Ret, typename... Args>
struct SimpleFunction<Ret(Args...)> {
void* callable_ptr;
Ret (*fn_ptr)(void*, Args...);
// What gets called when we do simple_func_obj(...)
auto operator()(Args... args) -> decltype(auto) {
// We call the fn_ptr we stored, which takes in the state pointer plus
// any additional arguments, just like a member function
return fn_ptr(callable_ptr, std::forward<Args>(args)...);
}
// But what is fn_ptr initialised to? This is a function that casts the
// type of callable_ptr from void* to the correct Callable* type, so
// that it can be called.
// This is a templated global variable that dispatches the fn_ptr call
// to a call of the correct type for Callables.
template <typename Callable>
static inline Ret (*fn_ptr_for_func_object)(void*, Args...) =
[](void* callable_ptr, Args... args) {
// Here is where we cast the void* back to a Callable*, so that it
// can be called.
return (*static_cast<Callable*>(callable_ptr))(
std::forward<Args>(args)...);
};
// In the constructor, we simply have to store the pointer to the
// callable in callable_ptr, and initialise fn_ptr with our templated
// global variable, instantiated for the Callable type.
template <typename Callable>
SimpleFunction(Callable* c)
: callable_ptr{reinterpret_cast<void*>(c)},
fn_ptr{fn_ptr_for_func_object<Callable>} {}
};
int main() {
int i = 23;
auto mult_i = [&i](int j) { i *= j; };
SimpleFunction<void(int)> mult_func{&mult_i};
mult_func(3);
std::cout << "(23 * 3) = " << i << std::endl;
}
$ ./simple-function.out
(23 * 3) = 69
With fat pointers, it becomes very easy to support multiple interfaces. All we need is more fat pointer types, each with their own interface. Any type that implements that interface can expose a function to cast to that fat pointer type.
For example, while Exprs are both evaluable and printable,
there are other types that are only printable (like
SimpleString), or some types that are only evaluable (like
SimpleFunction).
Let’s model that!
struct Evaluable {
int (*evaluate_ptr)(void* self);
void* data_ptr;
int evaluate() {
return evaluate_ptr(data_ptr);
}
};
struct Printable {
void (*print_ptr)(void* self);
void* data_ptr;
void print() {
print_ptr(data_ptr);
}
};
struct EvaluablePrintable {
int (*evaluate_ptr)(void* self);
void (*print_ptr)(void* self);
void* data_ptr;
int evaluate() {
return evaluate_ptr(data_ptr);
}
void print() {
print_ptr(data_ptr);
}
Evaluable toEvaluable() {
return {evaluate_ptr, data_ptr};
}
Printable toPrintable() {
return {print_ptr, data_ptr};
}
};
Evaluable and Printable fat
pointers, as well as their combination
We can then add the appropriate cast functions to the various
Exprs, as well as to SimpleString and
SimpleFunction (stubbed).
struct PlusExpr {
EvaluablePrintable lhs;
EvaluablePrintable rhs;
// ...
EvaluablePrintable toEvaluablePrintable() {
return {
[](void* self) {
return reinterpret_cast<PlusExpr*>(self)->evaluate();
},
[](void* self) { reinterpret_cast<PlusExpr*>(self)->print(); },
reinterpret_cast<void*>(this),
};
}
Evaluable toEvaluable() {
return toEvaluablePrintable().toEvaluable();
}
Printable toPrintable() {
return toEvaluablePrintable().toPrintable();
}
};
PlusExpr
struct SimpleString {
std::string str;
void print() {
std::cout << str;
}
Printable toPrintable() {
return {
[](void* self) {
reinterpret_cast<SimpleString*>(self)->print();
},
reinterpret_cast<void*>(this),
};
}
};
SimpleString
struct SimpleFunction {
std::function<int()> fn;
int evaluate() {
return fn();
}
Evaluable toEvaluable() {
return {
[](void* self) {
return reinterpret_cast<SimpleFunction*>(self)->evaluate();
},
reinterpret_cast<void*>(this),
};
}
};
SimpleFunction
This lets us do some really cool stuff: we can store vectors containing both expressions and strings or functions!
int main() {
ValueExpr a{3};
ValueExpr b{23};
PlusExpr c{a.toEvaluablePrintable(), b.toEvaluablePrintable()};
SimpleString s{"wow"};
SimpleFunction f{[]() { return 42; }};
// ,- Being able to do this is the crux of interfaces!
// | This is also called "type erasure"
// v
std::vector<Printable> printables{c.toPrintable(), s.toPrintable()};
std::vector<Evaluable> evaluables{c.toEvaluable(), f.toEvaluable()};
for (auto& p : printables) {
std::cout << "Printing: ";
p.print();
std::cout << std::endl;
}
for (auto& e : evaluables) {
std::cout << "Evaluating: " << e.evaluate() << std::endl;
}
}
$ ./external-multiple-1.out
Printing: (3 + 23)
Printing: wow
Evaluating: 26
Evaluating: 42
We’ve shown how multiple interfaces are possible with externally stored function pointers. The last combination, and the trickiest, is storing multiple interfaces internally.
Before we move on, let’s first solve the problem of supporting the interface pattern, but where we only have one such interface.
Here’s the very first solution for open polymorphism we’ve seen in this lecture.
struct Expr {
int (*evaluate_ptr)(Expr* lhs, Expr* rhs, int value);
void (*print_ptr)(Expr* lhs, Expr* rhs, int value);
Expr* lhs;
Expr* rhs;
int value;
};
Let’s consider how we might separate the
EvaluablePrintable interface from this
Expr class.
To be clear about what we want to achieve, imagine that we created a new
type called TernaryExpr in order to get around the limitation
that Expr supports only up to two sub-expressions.
struct TernaryExpr {
int (*evaluate_ptr)(Expr* arg1, Expr* arg2, Expr* arg3, int value);
void (*print_ptr)(Expr* arg1, Expr* arg2, Expr* arg3, int value);
Expr* arg1;
Expr* arg2;
Expr* arg3;
int value;
};
TernaryExpr struct layout
In order for our existing Exprs to coexist with the new
TernaryExpr, we might want to create a
std::vector containing both such objects, so we can iterate
over them simultaneously. Since Expr and
TernaryExpr have different sizes, let’s settle for a
std::vector of pointers.
We also don’t want to simply use a std::vector of fat
pointers, since we’ve already done that earlier.
std::vector<SomeTypeGoesHere*> exprs_and_new_exprs;
// ...
for (SomeTypeGoesHere* p : exprs_and_new_exprs) {
p->print();
std::cout << " = " << p->evaluate() << std::endl;
}
Iterating over Exprs and
TernaryExprs at the same time
Whatever p->evaluate() calls, it is some dispatcher code
that needs to call evaluate_ptr of the object. Right now, the
evaluate_ptrs of Expr and
TernaryExpr don’t even have the same types, which means the
dispatcher code, whatever it is, can’t possibly perform the function call
correctly.
We can fix that by using void* pointers to the object rather
than passing the full contents of the object as the argument.
struct Expr {
int (*evaluate_ptr)(void* self);
void (*print_ptr)(void* self);
Expr* lhs;
Expr* rhs;
int value;
};
struct TernaryExpr {
int (*evaluate_ptr)(void* self);
void (*print_ptr)(void* self);
Expr* arg1;
Expr* arg2;
Expr* arg3;
int value;
};
Now, notice that the structs share a common base! Let’s simply factor that out into a base class.
struct EvalPrint {
int(*evaluate_ptr) //
(void* self);
void(*print_ptr) //
(void* self);
};
struct Expr {
EvalPrint fns;
Expr* lhs;
Expr* rhs;
int value;
};
struct TernaryExpr {
EvalPrint fns;
Expr* arg1;
Expr* arg2;
Expr* arg3;
int value;
};
Refactoring Expr and
TernaryExpr to share a common base
Now, notice that EvalPrint is at offset 0 in both
Expr and TernaryExpr. This means that any
pointer to the EvalPrint sub-object in an
Expr has the same value as a pointer to Expr,
and it means we can do the following:
Expr a = make_...;
EvalPrint* a_interface_ptr = &a.fns;
// This is legal!
Expr* a_ptr = reinterpret_cast<Expr*>(a_interface_ptr);
a_ptr->evaluate();
Casting an EvalPrint* to an
Expr*
We now have all the ingredients necessary to build our
std::vector! Let’s put the dispatcher code in the
EvalPrint struct, and get evaluate_ptr for
Expr and TernaryExpr to cast the
EvalPrint* it receives to Expr* and
TernaryExpr* respectively, before performing the call.
struct EvalPrint {
int (*evaluate_ptr)(EvalPrint* self);
void (*print_ptr)(EvalPrint* self);
// For fat pointers, note how we would
// have passed expr_ptr or similar.
// Here, we just pass `this`
int evaluate() {
return evaluate_ptr(this);
}
void print() {
print_ptr(this);
}
};
struct Expr {
EvalPrint fns;
Expr* lhs;
Expr* rhs;
int value;
// Convenience methods
int evaluate() {
return fns.evaluate();
}
void print() {
fns.print();
}
};
struct TernaryExpr {
EvalPrint fns;
Expr* arg1;
Expr* arg2;
Expr* arg3;
int value;
// Convenience methods
int evaluate() {
return fns.evaluate();
}
void print() {
fns.print();
}
};
Expr make_plus(Expr* lhs, Expr* rhs) {
// Note that the old `make_plus` would've accepted an `Expr* self`
// Here, we accept an `EvalPrint*` since the dispatcher code doesn't
// know whether we were an `Expr` or a `TernaryExpr`.
EvalPrint fns{
[](EvalPrint* void_self) {
Expr* self = reinterpret_cast<Expr*>(void_self);
return self->lhs->evaluate() + self->rhs->evaluate();
},
[](EvalPrint* void_self) {
Expr* self = reinterpret_cast<Expr*>(void_self);
std::cout << '(';
self->lhs->print();
std::cout << " + ";
self->rhs->print();
std::cout << ')';
},
};
return {fns, lhs, rhs, 0};
}
So this solves the puzzle of how we can get interface pointers where the
interface is simply embedded at the top of the concrete object. We can
easily convert between an interface pointer and concrete object pointer by
simply using reinterpret_cast.
Now, what about concrete types that support multiple interfaces? Can we do something like the following?
struct Eval {
int (*eval_ptr)(Eval* self);
int evaluate() {
return eval_ptr(this);
}
};
struct Print {
void (*print_ptr)(Print* self);
void print() {
print_ptr(this);
}
};
struct EvalPrint {
Eval e;
Print p;
int evaluate() {
return e.evaluate();
}
void print() {
return p.print();
}
};
struct Expr {
EvalPrint fns;
Expr* lhs;
Expr* rhs;
int value;
int evaluate() {
return fns.evaluate();
}
void print() {
fns.print();
}
};
The problem now is that a pointer to the Print inside an
Expr is not necessarily the same as a pointer to the
Expr itself. So if we were to naively port over our type
casting dispatch handlers, it is going to explode.
Expr make_plus(Expr* lhs, Expr* rhs) {
EvalPrint fns{
{// Note the change in types, this takes in Eval*
[](Eval* eval_self) {
Expr* self = reinterpret_cast<Expr*>(eval_self);
return self->lhs->evaluate() + self->rhs->evaluate();
}},
{// This takes in Print*
[](Print* print_self) {
// Here lies the bug.
// The Print subobject does not live at offset 0!
Expr* self = reinterpret_cast<Expr*>(print_self);
// So at this point, self is an illegal pointer.
std::cout << '(';
self->lhs->print();
std::cout << " + ";
self->rhs->print();
std::cout << ')';
}},
};
return {fns, lhs, rhs, 0};
}
int main() {
Expr a = make_value(3);
Expr b = make_value(23);
Expr c = make_plus(&a, &b);
c.print();
}
$ ./internal-multiple-2-broken.asan
=================================================================
==1337==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffffffea78 at pc 0x0000004ceae9 bp 0x7fffffffe850 sp 0x7fffffffe848
READ of size 4 at 0x7fffffffea78 thread T0
#0 0x4ceae8 in make_value(int)::$_3::operator()(Print*) const /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:88:28
#1 0x4ce684 in make_value(int)::$_3::__invoke(Print*) /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:86:8
#2 0x4d0756 in Print::print() /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:18:5
#3 0x4d0718 in EvalPrint::print() /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:33:14
#4 0x4ceb24 in Expr::print() /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:50:9
#5 0x4ce9ad in make_plus(Expr*, Expr*)::$_1::operator()(Print*) const /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:70:21
#6 0x4ce434 in make_plus(Expr*, Expr*)::$_1::__invoke(Print*) /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:64:8
#7 0x4d0756 in Print::print() /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:18:5
#8 0x4d0718 in EvalPrint::print() /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:33:14
#9 0x4ceb24 in Expr::print() /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:50:9
#10 0x4ce814 in main /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:99:5
#11 0x7ffff7b46082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
#12 0x41d37d in _start (/__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.asan+0x41d37d)
Address 0x7fffffffea78 is located in stack of thread T0 at offset 152 in frame
#0 0x4ce69f in main /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:95
This frame has 3 object(s):
[32, 72) 'a' (line 96)
[112, 152) 'b' (line 97) <== Memory access at offset 152 overflows this variable
[192, 232) 'c' (line 98)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /__w/ccc-2/ccc-2/lectures/l10/open/internal-multiple-2-broken.cpp:88:28 in make_value(int)::$_3::operator()(Print*) const
Shadow bytes around the buggy address:
0x10007fff7cf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d30: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
=>0x10007fff7d40: 00 00 00 00 00 f2 f2 f2 f2 f2 00 00 00 00 00[f2]
0x10007fff7d50: f2 f2 f2 f2 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3
0x10007fff7d60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007fff7d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==1337==ABORTING
reinterpret_casting
Print* to Expr* is a bug
However, it turns out to be quite easy to account for the discrepancy by simply subtracting away the offset from the invalid pointer to get back a pointer that is actually valid.
Expr make_plus(Expr* lhs, Expr* rhs) {
EvalPrint fns{
{[](Eval* eval_self) {
// This adjustment subtracts 0,
// but we might as well include it for symmetry
Expr* self = reinterpret_cast<Expr*>(
reinterpret_cast<char*>(eval_self) - offsetof(Expr, fns.e));
return self->lhs->evaluate() + self->rhs->evaluate();
}},
{[](Print* print_self) {
// Here is the magic adjustment
// We first cast it to `char*` so that the pointer arithmetic
// operates on individual bytes.
// After that we have a valid Expr* pointer,
// so we cast it to `Expr*`
Expr* self = reinterpret_cast<Expr*>(
reinterpret_cast<char*>(print_self) - offsetof(Expr, fns.p));
// Now self is a valid pointer
std::cout << '(';
self->lhs->print();
std::cout << " + ";
self->rhs->print();
std::cout << ')';
}},
};
return {fns, lhs, rhs, 0};
}
int main() {
Expr a = make_value(3);
Expr b = make_value(23);
Expr c = make_plus(&a, &b);
c.print();
}
$ ./internal-multiple-3.asan
(3 + 23)
Using offsetof to compute the difference
between the Print* and Expr* for
Exprs
We’ve covered many different ways to do open dynamic polymorphism in this
lecture. Let’s run through them one last time. In these examples,
Data stores the actual data that the concrete object will
hold. Handles are polymorphic, and dispatch happens through them.
// Most basic
// No data struct. Simply ensure
// Handle has all necessary fields
struct Handle {
fn ptr
fn ptr
data field
data field
};
// Vtable optimisation
struct Handle {
struct Vtable {
fn ptr
fn ptr
};
Vtable* v ptr
data field
data field
};
// Fat pointer
struct Data {
data field
data field
};
struct Handle {
fn ptr
fn ptr
Data*
};
// Fat pointer with vtable
struct Data {
data field
data field
};
struct Handle {
struct Vtable {
fn ptr
fn ptr
};
Vtable*
Data*
};
// Embedded interface
struct Data {
struct Iface {
fn ptr
fn ptr
};
Iface
data field
data field
};
struct Handle {
Iface*
};
// Multiple
// embedded interfaces
struct Data {
struct IfaceA {
fn ptr
};
struct IfaceB {
fn ptr
};
IfaceA
IfaceB
data field
data field
};
struct HandleA {
IfaceA*
};
struct HandleB {
IfaceB*
};
// Embedded interface
// with vtable
struct Data {
struct Vtable {
fn ptr
fn ptr
};
struct Iface {
Vtable*
};
Iface
data field
data field
};
// Note:
// double pointer :^)
struct Handle {
Iface*
};
virtual:
Built-in vtables for C++
Under the hood, C++
virtual simply
uses the function-pointer-in-vtable pattern to implement virtual methods.
Of course, it also stores the type tag (the dynamic type of the object)
alongside the vtable pointer.
In this section, we cover roughly how
virtual
methods are compiled by showing how they map to the vtable pattern we
showed earlier.
Let’s look at a simple case of using virtual methods. As seasoned OOP developers, we’ll dispense with the classic “animal taxonomy” example, and revisit our expressions from above:
struct Expr {
virtual int evaluate() = 0;
virtual void print() = 0;
};
struct Value : Expr {
Value(int x) : value(x) {}
virtual int evaluate() override {
return value;
}
virtual void print() override {
std::cout << value;
}
int value;
};
struct Plus : Expr {
Plus(Expr* l, Expr* r) //
: lhs(l), rhs(r) {}
virtual int evaluate() override {
return lhs->evaluate() //
+ rhs->evaluate();
}
virtual void print() override {
std::cout << '(';
lhs->print();
std::cout << " + ";
rhs->print();
std::cout << ')';
}
Expr* lhs;
Expr* rhs;
};
We’ve implemented evaluate and print from above,
and they do exactly what you’d expect. To make a function virtual, we
prepend the
virtual
specifier to it, and to override the base implementation, we append
override (in
the same place as we would put
const).
Note that if a method has the same declaration as a virtual function in a
base class, then that method is also
virtual, even if it did not specify the
virtual
keyword. Similarly, it will override base class implementations whether or
not the
override
keyword is present.
Finally, note the use of
= 0
in declaring the evaluate and print methods in
the base Expr class — doing this declares them as
pure virtual functions. If a class has at least one such
function, then it is abstract, and instances of it cannot be
created. If you’re familiar with Java, then these mechanics should not be
surprising.
Let’s run some of this code:
Expr* a = new Value(69);
Expr* b = new Value(420);
Expr* c = new Plus(a, b);
c->print();
std::cout << " = " //
<< c->evaluate() //
<< "\n";
$ ./expr.out | sed '1,/<1/d;/1>/,$d'
(69 + 420) = 489
Even though the static types of a, b,
and c were all Expr, we called the correct
functions at runtime due to dynamic dispatch, which is done
automatically by the compiler thanks to us using
virtual
functions.
When calling a virtual method (eg. expr->evaluate()), the
compiler still does member lookup as usual. However, after lookup
is completed, if the function is virtual, it then inserts code to perform
the dynamic dispatch instead.
This means that, in the following example:
struct Animal {
virtual void eat() { /* ... */ }
};
struct Bird : Animal {
virtual void fly() { /* ... */ }
};
// later:
Animal* a = new Bird();
a->fly(); // won't compile!
We can’t call fly() on a, since its
static type is still Animal, and member lookup for a
fly function will fail. However, since
Bird still inherits from Animal, a call like
a->eat() works: member lookup finds the eat
virtual function, and the compiler inserts the appropriate dynamic
dispatch code.
What virtual corresponds to
virtual on a
method roughly corresponds to adding dispatcher code.
override-ing a
virtual
method corresponds to changing some behaviour in the vtable corresponding
to that virtual, and the pure specifier
= 0
corresponds to having no vtable for that class, making that class
non-constructible.
Our earlier example roughly corresponds to the following “manually written” polymorphism:
struct Expr {
struct Vtable {
int (*evaluate_ptr)(Expr*);
void (*print_ptr)(Expr*);
};
Vtable* vtable_ptr;
Expr() = delete;
protected:
Expr(Vtable* vtable_ptr) : vtable_ptr{vtable_ptr} {}
public:
int evaluate() {
return vtable_ptr->evaluate_ptr(this);
}
void print() {
vtable_ptr->print_ptr(this);
}
};
struct Value : Expr {
int value;
Value(int x) : Expr{&impl_Expr_for_Value}, value(x) {}
static int evaluate_impl(Expr* expr) {
return static_cast<Value*>(expr)->value;
}
static void print_impl(Expr* expr) {
std::cout << static_cast<Value*>(expr)->value;
}
static inline Expr::Vtable impl_Expr_for_Value{
evaluate_impl,
print_impl,
};
};
struct Plus : Expr {
Expr* lhs;
Expr* rhs;
Plus(Expr* l, Expr* r) //
: Expr{&impl_Expr_for_Plus}, lhs(l), rhs(r) {}
static int evaluate_impl(Expr* expr) {
Plus* self = static_cast<Plus*>(expr);
return self->lhs->evaluate() //
+ self->rhs->evaluate();
}
static void print_impl(Expr* expr) {
Plus* self = static_cast<Plus*>(expr);
std::cout << '(';
self->lhs->print();
std::cout << " + ";
self->rhs->print();
std::cout << ')';
}
static inline Expr::Vtable impl_Expr_for_Plus{
evaluate_impl,
print_impl,
};
};
How
virtual
corresponds to the manual vtable pattern
In terms of memory layout, a Plus object looks something
like this:
This shouldn’t be surprising, since the vtable implementation of dynamic
dispatch was already discussed above. In this case, since
Plus didn’t introduce any new virtual functions (it only
overrode those from the base class), it does not have a vtable pointer of
its own.
Let’s go back to animal taxonomy for a moment:
struct Animal {
~Animal() {
std::cout << "x_x\n";
}
virtual void eat() {
std::cout << "nom nom\n";
}
virtual void sleep() {
std::cout << "zzz\n";
}
};
struct Bird : Animal {
~Bird() {
std::cout << "splat\n";
}
virtual void eat() override {
std::cout << "gobble\n";
}
virtual void fly() {
std::cout << "wheee\n";
}
};
Animal and Bird classes
In this case, the layout would look something like this:
Here, since Bird introduces a new virtual method
(fly), it also has a vtable pointer. Note that the vtable
impl Animal for Bird points to Animal::sleep(),
since Bird did not override this particular function.
As an optimisation, we can actually collapse the two vtables into one, and
have only one vtable pointer for the entire struct. Since
Bird inherits from Animal, it would make sense
that the vtable for Bird would also “inherit” the vtable for
Animal, ie. contain the same functions.
In fact, all major compilers perform this optimisation, so that the actual layout looks like this:
Nothing much has changed, we simply put the vtable for
Animal directly into Bird.
dynamic_cast:
checking the type at runtime
Sometimes, it might be necessary to check whether an object has a specific dynamic type; this can be useful if we only need to filter for one or two classes in the hierarchy and don’t want to make a new method to abstract out this tiny behaviour.
To do that, we can use
dynamic_cast, which is another kind of cast in C++; it allows you to cast between
types in an inheritance hierarchy depending on the
dynamic type of an object. Implementation-wise, it simply checks
the hidden type tag at runtime to see if the target type is the same as
the actual runtime type.
For instance, if we wanted to check if an expression was specifically a
Value, we could do this:
Expr* a = new Value(100);
Expr* b = new Plus(a, a);
auto va = dynamic_cast<Value*>(a);
auto vb = dynamic_cast<Value*>(b);
std::cout << "va = " //
<< (void*) va //
<< "\n";
std::cout << "vb = " //
<< (void*) vb //
<< "\n";
$ ./expr.out | sed '1,/<2/d;/2>/,$d'
va = 0x407310
vb = (nil)
dynamic_cast
It’s pretty simple — if the object indeed has that type, then it returns a pointer to it (note that because of multiple inheritance, the pointer returned is not necessarily the pointer given). Otherwise, it returns a null pointer.
Of course, “OOP experts” would say not to use
dynamic_cast
at all, since we should just rely on dynamic dispatch entirely.
Fortunately, C++ still lets you do it (:
Note that we can also use
static_cast to
perform a downcast in the same way that we used
dynamic_cast. However, you should be sure that the dynamic type of the
object is what you claim it to be. Otherwise, you get undefined behaviour!
Lastly, note that the conversion from a derived class to a base class like this:
Base* b = new Derived();
is allowed implicitly.
Just like in Java, we can declare classes and variables as
final, and they have the same behaviour:
class Foo final { // note the location of 'final'
};
// not allowed! 'Foo' is final
class Bar : Foo {
};
Again,
final on a
method works the same way:
class Foo {
virtual void foo() final;
};
class Bar : Foo {
// not allowed! 'Foo::foo' is final
virtual void foo() override;
};
There’s one glaring flaw with our examples above (both the animals and the expressions), though. What happens when we want to destruct our objects?
Naturally, if we have a Bird object, we’d like to call its
destructor so that it can clean up any resources:
struct Animal {
~Animal() {
std::cout << "x_x\n";
}
};
struct Bird : Animal {
~Bird() {
std::cout << "splat\n";
}
};
Animal* a = new Bird();
a->eat();    // eat and sleep as defined earlier
a->sleep();
delete a;
$ ./animal.out
gobble
zzz
x_x
Unfortunately, we didn’t call the right destructor! Just like normal functions, destructor calls are also looked up by their static type only. If the compiler doesn’t find a virtual function, then it just performs static dispatch.
Here, since the destructor wasn’t virtual (it was the implicitly declared
compiler-generated one), we only call ~Animal when we delete
it, instead of ~Bird. To fix this, all we need to do is mark
the destructor in the base class as
virtual:
struct Animal {
virtual ~Animal() {
std::cout << "x_x\n";
}
};
// later
Animal* a = new Bird();
a->eat();
a->sleep();
delete a;
$ ./animal2.out
gobble
zzz
splat
x_x
Note that we didn’t need to make the Bird destructor
virtual — this is due to the “implicitly virtual if any base class
functions are virtual” mechanism discussed earlier. Also, we see that
~Bird calls ~Animal, which is completely in-line
with typical destructor behaviour; we just made the initial dispatch
happen dynamically instead of statically.
In short, whenever you’re using any virtual functions, you should almost always have a virtual destructor as well. In fact, compilers can even warn you:
clang++ -g -stdlib=libc++ -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Wno-unused-variable -Wno-unused-but-set-variable -c -o animal.o animal.cpp
clang++ -g -stdlib=libc++ -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Wno-unused-variable -Wno-unused-but-set-variable animal.o -o animal.out
animal.cpp:50:3: warning: delete called on non-final 'Animal' that has virtual functions but non-virtual destructor [-Wdelete-non-abstract-non-virtual-dtor]
delete a;
^
1 warning generated.
Everything is rather simple when we’re only dealing with single inheritance, but what if we have multiple inheritance? For example, let’s expand our animal taxonomy:
struct Animal {
int hp;
virtual ~Animal() {
std::cout << "x_x\n";
}
virtual void eat() {
std::cout << "nom nom\n";
}
virtual void sleep() {
std::cout << "zzz\n";
}
};
struct Beaked {
virtual ~Beaked() {}
virtual void peck() {
std::cout << "whoosh\n";
}
};
struct Bird : Animal, Beaked {
~Bird() {
std::cout << "splat\n";
}
virtual void fly() {
std::cout << "wheee\n";
}
virtual void eat() override {
std::cout << "gobble\n";
}
virtual void peck() override {
std::cout << "ouch\n";
}
};
Of course, the code still works as we expect and the output is the same. However, the layout in memory is now quite different:
Note that we added an int field (hp) to the
Animal class, and that base top-offset for our
impl Beaked for Bird is now 16. For multiple inheritance, the
vtable must also store the offset from the base pointer for each subclass;
in this case, the Beaked object really only starts after the
Animal object in a Bird.
We can verify this with
dynamic_cast:
Animal* birb = new Bird();
auto anim = dynamic_cast<Animal*>(birb);
auto beak = dynamic_cast<Beaked*>(birb);
std::cout << "birb = " //
<< (void*) birb << "\n";
std::cout << "anim = " //
<< (void*) anim << "\n";
std::cout << "beak = " //
<< (void*) beak << "\n";
$ ./animal3.out
birb = 0x4062a0
anim = 0x4062a0
beak = 0x4062b0
Beaked is
offset from the address of Bird
Indeed, we see that the address of beak is higher than that
of birb. As for why the offset is 16 and not just 4: the
Animal sub-object carries its own vtable pointer (8 bytes on
this platform) in front of the 4-byte hp field, and the total
is then padded up for alignment. Either way, we do see that the address of
the Beaked sub-object is not the same as the
Animal (which is the same as Bird).
The actual implementation of the “base top-offset” varies, but one method
is to have vtable point to a thunk (small function), which first
performs the offset to
this, and then calls the actual implementation.
Alternatively, we can implement the base-to-top offset in our dispatch handler like so:
struct Expr {
struct Vtable {
ptrdiff_t base_top_offset;
int (*evaluate_ptr)(Expr*);
void (*print_ptr)(Expr*);
};
Vtable* vtable_ptr;
Expr() = delete;
protected:
Expr(Vtable* vtable_ptr) : vtable_ptr{vtable_ptr} {}
public:
int evaluate() {
return vtable_ptr->evaluate_ptr(this);
}
void print() {
vtable_ptr->print_ptr(this);
}
};
struct Value : Expr {
int value;
Value(int x) : Expr{&impl_Expr_for_Value}, value(x) {}
static int evaluate_impl(Expr* expr) {
Value* self =
reinterpret_cast<Value*>(reinterpret_cast<char*>(expr) -
expr->vtable_ptr->base_top_offset);
return self->value;
}
static void print_impl(Expr* expr) {
Value* self =
reinterpret_cast<Value*>(reinterpret_cast<char*>(expr) -
expr->vtable_ptr->base_top_offset);
std::cout << self->value;
}
static inline Expr::Vtable impl_Expr_for_Value{
0,
evaluate_impl,
print_impl,
};
};
We used
dynamic_cast
above, but
static_cast
would have also worked, and it does perform the correct address offsetting
in multiple-inheritance scenarios:
Bird* birb = new Bird();
auto beak1 = static_cast<Beaked*>(birb);
auto beak2 = (Beaked*) birb;
std::cout << "birb = " //
<< (void*) birb << "\n";
std::cout << "beak1 = " //
<< (void*) beak1 << "\n";
std::cout << "beak2 = " //
<< (void*) beak2 << "\n";
$ ./animal4.out
birb = 0x4062a0
beak1 = 0x4062b0
beak2 = 0x4062b0
static_cast
also performs the offset computation
As an aside, C-style casts (like
(Beaked*) birb) will first try
static_cast
before
reinterpret_cast, so in this case it does the correct thing. If we explicitly use
reinterpret_cast
however, we don’t get the automatic offset computation!
Bird* birb = new Bird();
auto beak = reinterpret_cast<Beaked*>(birb);
std::cout << "birb = " //
<< (void*) birb << "\n";
std::cout << "beak = " //
<< (void*) beak << "\n";
$ ./animal5.out
birb = 0x4062a0
beak = 0x4062a0
reinterpret_cast
does not do offset computation!
Thankfully though, the compiler helpfully warns us:
clang++ -g -stdlib=libc++ -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Wno-unused-variable -Wno-unused-but-set-variable -c -o animal5.o animal5.cpp
clang++ -g -stdlib=libc++ -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Wno-unused-variable -Wno-unused-but-set-variable animal5.o -o animal5.out
animal5.cpp:50:15: warning: 'reinterpret_cast' from class 'Bird *' to its base at non-zero offset 'Beaked *' behaves differently from 'static_cast' [-Wreinterpret-base-class]
auto beak = reinterpret_cast<Beaked*>(birb);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
animal5.cpp:50:15: note: use 'static_cast' to adjust the pointer correctly while upcasting
auto beak = reinterpret_cast<Beaked*>(birb);
^~~~~~~~~~~~~~~~
static_cast
1 warning generated.
The compiler warns us about reinterpret_cast in this case
Since C++ allows for multiple inheritance, we can run into the “dreaded” diamond inheritance problem:
struct Widget {
virtual void event() {
std::cout << "boop\n";
};
};
struct Image : Widget {};
struct Button : Widget {
virtual void event() override {
std::cout << "clicked\n";
};
};
struct ImageButton : Image, Button {};
In the memory layout diagram on the right (vtable pointers omitted), we
see that there are two instances of the Widget class in an
ImageButton — one belonging to the Image base
class, and one belonging to Button.
This is clearly suboptimal, especially if Widget has many
fields. Even worse, ImageButton becomes a pain to use:
clang++ -g -stdlib=libc++ -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Wno-unused-variable -Wno-unused-but-set-variable -c -o widget-broken.o widget-broken.cpp
widget-broken.cpp:24:7: error: member 'event' found in multiple base classes of different types
ib->event();
^
widget-broken.cpp:6:16: note: member found by ambiguous name lookup
virtual void event() {
^
widget-broken.cpp:14:16: note: member found by ambiguous name lookup
virtual void event() override {
^
1 error generated.
make: *** [Makefile:25: widget-broken.o] Error 1
There are two ways around it; either we perform a cast to the appropriate base class whose method we want to call:
auto ib = new ImageButton();
static_cast<Image*>(ib)->event();
static_cast<Button*>(ib)->event();
$ ./wig.out | sed '1,/<1/d;/1>/,$d'
boop
clicked
Or we explicitly qualify the function call:
auto ib = new ImageButton();
ib->Image::event();
ib->Button::event();
$ ./wig.out | sed '1,/<2/d;/2>/,$d'
boop
clicked
Either way, this is not very nice to use. Enter virtual inheritance, where we simply make Image and Button inherit from Widget virtually:
struct Image : virtual Widget {};
struct Button : virtual Widget {
virtual void event() override {
std::cout << "clicked\n";
};
};
auto ib = new ImageButton();
ib->event();
$ ./wig2.out | sed '1,/<1/d;/1>/,$d'
clicked
Virtual inheritance ensures that there is only one instance of the base
class (in this case, Widget) per object. Since there is now no
longer any ambiguity, we can also just call event() without
qualification or casting, and it’ll do the right thing.
Note that we had to add virtual to the classes inheriting from Widget — not the top-level class. To guarantee that only one instance of Widget exists in your class hierarchy, every class deriving from Widget should use virtual inheritance.
The layout and implementation of virtual inheritance is quite complex, and we won’t go into it too much (and most of the time, it’s not necessary to know the full details). For starters, the layout now looks something like this:
Notice that there is now only one copy of Widget in the
object. Also, there are two new pointers in Image and
Button — these tell each class the location of its base class (both
inherit Widget). This is necessary because it is
impossible to know the exact offset at compile-time. It’s called a virtual
base pointer because… it points to the virtual base class, though these
are commonly stored as offsets instead.
Also omitted from the diagram are other offsets, notably one in each base class giving the offset to the start of the complete object, which is required for some pointer operations.
Since virtual methods require an extra level of indirection that cannot be resolved at compile-time, they generally have a non-zero performance penalty. Furthermore, they can’t be inlined (again, because in general we do not know the target at compile-time), so an optimisation opportunity is lost.
This can materialise as instruction cache misses or branch mis-predictions, or it can be a complete non-issue if the compiler was able to sufficiently optimise the code. As with all performance-sensitive code though, please benchmark your own use-case first!
Since virtual methods are so widespread (not just in C++, but also in other languages like Java), compiler designers have come up with many novel ways to optimise them, and you should not abandon virtual dispatch just because it might be slow.
Reference: Eli Bendersky - The cost of dynamic (virtual calls) vs. static (CRTP) dispatch in C++
We won’t repeat the article above, but the TL;DR is that because everything is known at compile-time for CRTP, the compiler is able to optimise a lot better — mostly by inlining code. The additional cost of an indirect call, especially in a tight loop, is very high.
Directly stored function pointers require fewer indirections, but vtable pointers are more space-efficient (when there is more than one function pointer).
Reference: Arthur O’Dwyer - When can the C++ compiler devirtualise a call?
In some situations, the compiler can devirtualise a virtual function call — that is, transform it into a normal function call that doesn’t go through the vtable and doesn’t depend on the dynamic type of the object; we’ll cover some cases here.
The first (and simplest) case is when the compiler knows exactly what the dynamic type of the object is, like in these cases:
// this case is trivial
Derived d;
d.foo();
// a sufficiently smart compiler can also detect this case
Base* b = new Derived();
b->foo();
// it needs to be a little smarter to still handle this case
Derived d1, d2;
Base* b = rand() % 2 ? &d1 : &d2;
b->foo();
Another case is when the compiler knows that a given class is a “leaf” class — that is, no other class derives from it. One easy way to guarantee this is to mark the method final:
struct Derived : Base {
void foo() final {
// ...
}
};
Derived* d = ...;
d->foo();
Another more esoteric case is when a class has internal linkage — if nobody outside of the translation unit can even name the class, then of course nobody else can derive from it.
Reference/case study: allocgate (zig)
When comparing between internally stored function pointers (embedded interfaces) and vtable pointers (either internal like C++, or external as fat pointers like Rust), the primary concern is usually object size — copying around the list of function pointers (especially if there are many) can be slow.
However, a large factor is also performance, specifically the ability for the compiler to devirtualise the virtual call. As mentioned above, compiler authors have come up with many techniques and heuristics to detect and optimise virtual calls, but of course they can only detect a certain “shape” of usage.
In Zig’s case above, the LLVM backend was unable to optimise the embedded interface pointers well, whereas it devirtualised the fat-pointer implementation much more effectively.
© 14 July 2022, Thomas Tan, Ng Zhia Yang, All Rights Reserved