Written by:
Last updated: 13 June 2022
Welcome to C++ crash course!
This course runs twice a week for 7 weeks (14 lectures total). The lecture timings are:
Throughout this course, we will cover:
We will release these lecture notes to those that attend. Please fill out the registration form posted at the door.
In this lecture, we cover:
As a refresher, the compilation process of a C or C++ program roughly looks like the following:
Let’s start with a really simple single file C++ program.
extern "C" int puts(const char*);
void hello(const char*);
extern const char* message;
int main() {
hello(message);
}
void hello(const char* str) {
puts(str);
}
const char* message = "Hello simple world!";
This file consists of several declarations and definitions.
At the top, we have a declaration of the puts function whose
definition is not included in this file itself.
extern "C" int puts(const char*);
Declaration of puts, definition is in libc
Next are declarations for the hello function and the
message variable.
void hello(const char*);
extern const char* message;
hello and message
Note the extern on message! If we left it out,
like so:
const char* message;
Without extern, this is actually a definition of message, and it is implicitly the following:
// Note: this definition only holds for globals
const char* message = 0; // kinda, close enough
// Note: for locals, the above would be something like this instead
const char* message = <something indeterminate>; // i.e. it's uninitialized memory
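The difference between a declaration and a definition, and the zero-initialisation of globals, can be sketched in a single file. The names below (greeting, default_message, first_char_of_greeting, demo_declarations) are made up for illustration and are not part of the lecture's program.

```cpp
#include <cassert>

// Declaration only: promises that `greeting` is defined somewhere.
extern const char* greeting;

// Definition without an initialiser: globals are zero-initialised,
// so this pointer starts out null.
const char* default_message;

// Uses `greeting` before its definition appears; the declaration suffices.
char first_char_of_greeting() {
  return greeting[0];
}

// The definition that the `extern` declaration above refers to.
const char* greeting = "Hello";

// Small self-check showing the expected values.
void demo_declarations() {
  assert(default_message == nullptr);
  assert(first_char_of_greeting() == 'H');
}
```

Calling demo_declarations() from main should pass both assertions. Removing the last definition of greeting would still compile, but would fail at link time.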
This is followed by definitions, which are declarations that also supply the body of a function or the initial value of a variable.
int main() {
hello(message);
}
void hello(const char* str) {
puts(str);
}
main and hello
const char* message = "Hello simple world!";
message
In particular, note the following:
hello and message are each declared twice: once as a forward declaration, once as a definition.
main is defined only once, without a separate forward declaration.
In the next section, we look at what some of those magic words mean, such
as extern "C" and extern, and also
introduce the rest of the magic.
Here’s a hello world program showcasing many different concepts in the C++ compilation process. Please take note of the various kinds of declarations and definitions in the program.
// a.cpp
#include "b.h"
#include "puts.h"
static const char message[] = "Welcome to C++ crash course!";
void print_hello() {
puts(message);
}
int main() {
print_hello();
hello::print_hello();
hello::print_goodbye();
return hello::exit_code;
}
// puts.h
#pragma once
extern "C" int puts(const char*);
// b.h
#pragma once
#include "puts.h"
namespace hello {
void print_hello();
inline const char goodbye_message[] = "Goodbye.";
inline void print_goodbye() {
puts(goodbye_message);
}
extern int exit_code;
} // namespace hello
// b.cpp
#include "b.h"
#include "puts.h"
static const char message[] = "Hello world!";
namespace hello {
void print_hello() {
puts(message);
print_goodbye();
}
int exit_code = 0;
} // namespace hello
To compile it, we can run:
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -o a.o a.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -o b.o b.cpp
$ rm -f b.a # in case the .a already exists, make sure we remove it first
$ ar rcs b.a b.o
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror a.o b.a -o a.out
Then we can run it:
$ ./a.out
Welcome to C++ crash course!
Hello world!
Goodbye.
Goodbye.
Let’s break down what’s included in this program.
There are 4 files in this program, but notice that we’ve only specified
a.cpp and b.cpp in the command line.
The .cpp files are each considered a
translation unit. Each translation unit is compiled to its own
.o object file.
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -o a.o a.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -o b.o b.cpp
The first set of flags emits debug information (-g), selects the C++20
standard (-std=c++20), and enables a good set of warnings, treating them
as errors (-Werror). The -c flag tells clang to “compile” an object file
without linking. This invocation of the compiler consists of 2
stages: preprocessing and compilation.
The other two files, puts.h and b.h, are not
translation units; instead, they are copy-pasted into the
.cpp files that #include them during the preprocessing stage.
This means that the two translation units are being compiled as if the following was the input to the compiler:
For a.cpp:
// a.cpp
// b.h
// puts.h
extern "C" int puts(const char*);
namespace hello {
void print_hello();
inline const char goodbye_message[] = "Goodbye.";
inline void print_goodbye() {
puts(goodbye_message);
}
extern int exit_code;
} // namespace hello
static const char message[] = "Welcome to C++ crash course!";
void print_hello() {
puts(message);
}
int main() {
print_hello();
hello::print_hello();
hello::print_goodbye();
return hello::exit_code;
}
a.cpp translation unit after preprocessing
(roughly)
For b.cpp:
// b.cpp
// b.h
// puts.h
extern "C" int puts(const char*);
namespace hello {
void print_hello();
inline const char goodbye_message[] = "Goodbye.";
inline void print_goodbye() {
puts(goodbye_message);
}
extern int exit_code;
} // namespace hello
static const char message[] = "Hello world!";
namespace hello {
void print_hello() {
puts(message);
print_goodbye();
}
int exit_code = 0;
} // namespace hello
b.cpp translation unit after preprocessing
(roughly)
Notice that puts.h is only included once, even though both
a.cpp and b.h include it. This is due to the
#pragma once directive. In other projects, you may notice
#define and #ifndef to achieve a similar (and
subtly different) effect. Which style you should use mostly depends on the
prevailing coding convention.
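As a rough sketch of the #ifndef/#define style, the snippet below pastes the contents of a hypothetical header twice into one file, mimicking what the preprocessor sees when the header is included twice. The macro name CCC_EXAMPLE_GREETING_H and the function greeting_length are invented for this example.

```cpp
#include <cassert>

// --- first "inclusion" of the hypothetical header ---
#ifndef CCC_EXAMPLE_GREETING_H
#define CCC_EXAMPLE_GREETING_H
int greeting_length() { return 5; }
#endif

// --- second "inclusion": the guard macro is already defined, so the
// preprocessor drops everything up to the matching #endif ---
#ifndef CCC_EXAMPLE_GREETING_H
#define CCC_EXAMPLE_GREETING_H
int greeting_length() { return 5; }  // would otherwise be a redefinition error
#endif

// Only one definition survived preprocessing, so this compiles and runs.
void demo_include_guard() {
  assert(greeting_length() == 5);
}
```

Deleting the guard lines around the second copy should make the compiler reject the file with a redefinition error, which is the problem both #pragma once and include guards exist to prevent.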
After preprocessing and compilation, we have two object files,
a.o and b.o.
We can use objdump -t <.o file> to view the symbol
table of an object file.
$ objdump -t a.o
a.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 a.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l O .rodata 000000000000001d _ZL7message
0000000000000000 l d .text._ZN5hello13print_goodbyeEv 0000000000000000 .text._ZN5hello13print_goodbyeEv
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_ranges 0000000000000000 .debug_ranges
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 g F .text 0000000000000015 _Z11print_hellov
0000000000000000 *UND* 0000000000000000 puts
0000000000000020 g F .text 000000000000002b main
0000000000000000 *UND* 0000000000000000 _ZN5hello11print_helloEv
0000000000000000 w F .text._ZN5hello13print_goodbyeEv 0000000000000015 _ZN5hello13print_goodbyeEv
0000000000000000 *UND* 0000000000000000 _ZN5hello9exit_codeE
0000000000000000 w O .rodata._ZN5hello15goodbye_messageE 0000000000000009 _ZN5hello15goodbye_messageE
In particular, note that we have the following lines corresponding to the
functions and variables declared or defined in a.cpp:
$ objdump -t a.o
a.o: file format elf64-x86-64
SYMBOL TABLE:
...
0000000000000000 l O .rodata 000000000000001d _ZL7message
0000000000000000 l d .text._ZN5hello13print_goodbyeEv 0000000000000000 .text._ZN5hello13print_goodbyeEv
0000000000000000 g F .text 0000000000000015 _Z11print_hellov
0000000000000000 *UND* 0000000000000000 puts
0000000000000020 g F .text 000000000000002b main
0000000000000000 *UND* 0000000000000000 _ZN5hello11print_helloEv
0000000000000000 w F .text._ZN5hello13print_goodbyeEv 0000000000000015 _ZN5hello13print_goodbyeEv
0000000000000000 *UND* 0000000000000000 _ZN5hello9exit_codeE
0000000000000000 w O .rodata._ZN5hello15goodbye_messageE 0000000000000009 _ZN5hello15goodbye_messageE
objdump of a.o
Notice that some of the names are garbled, such as
_ZL7message, whereas others are not, such as
puts or main. This is an example of C++’s name
mangling. To demangle the names, we can use the tool c++filt.
$ objdump -t a.o | c++filt
a.o: file format elf64-x86-64
SYMBOL TABLE:
...
0000000000000000 l O .rodata 000000000000001d message
0000000000000000 l d .text._ZN5hello13print_goodbyeEv 0000000000000000 .text._ZN5hello13print_goodbyeEv
0000000000000000 g F .text 0000000000000015 print_hello()
0000000000000000 *UND* 0000000000000000 puts
0000000000000020 g F .text 000000000000002b main
0000000000000000 *UND* 0000000000000000 hello::print_hello()
0000000000000000 w F .text._ZN5hello13print_goodbyeEv 0000000000000015 hello::print_goodbye()
0000000000000000 *UND* 0000000000000000 hello::exit_code
0000000000000000 w O .rodata._ZN5hello15goodbye_messageE 0000000000000009 hello::goodbye_message
objdump of a.o with c++filt applied
Notice that some of these symbols are marked as *UND*, which
stands for undefined. These symbols correspond exactly to the declarations
in the a.cpp translation unit that do not (yet) have a
definition.
In b.h and b.cpp, we declared and defined
functions and variables in the hello namespace. Namespaces
are a convenient way for libraries to prevent name clashes with other
libraries or user code. For example, the C++ standard library uses the
std namespace, the Boost libraries use the
boost namespace, and the Abseil libraries use the
absl namespace.
This allows us to define two functions both called
print_hello, but since the one that lives in
b.cpp is in the hello namespace, we can see that
there is no longer a name clash, and that the symbol names for these two
functions in the object file are distinct (_Z11print_hellov
and _ZN5hello11print_helloEv).
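A minimal sketch of the same idea in a single file: two functions share a name, but one lives in a namespace, so they are distinct entities. The function names here are hypothetical and return integers only so that the result is easy to check.

```cpp
#include <cassert>

// Lives in the global namespace.
int print_hello_id() { return 1; }

namespace hello {

// Same name and signature, but a different entity: hello::print_hello_id.
int print_hello_id() { return 2; }

// Inside the namespace, the unqualified name finds the namespace's version.
int unqualified_call() { return print_hello_id(); }

// A leading :: explicitly names the global version.
int global_call() { return ::print_hello_id(); }

}  // namespace hello

void demo_namespaces() {
  assert(print_hello_id() == 1);
  assert(hello::print_hello_id() == 2);
  assert(hello::unqualified_call() == 2);
  assert(hello::global_call() == 1);
}
```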
extern "C"
As we have seen in the previous subsection, one reason C++ performs name mangling is to implement namespaces.
However, sometimes we want to specifically interface with other libraries
on the system, libc being the main example. In such cases, we
want to tell C++ to compile code that can interface with other libraries.
Most libraries will expose C bindings or C++ bindings.
If you need to interface with a C library (a library with C bindings),
then you need to use declarations that do not mangle names, such as
extern "C" int puts(const char*);.
Otherwise, you simply need the C++ declarations as is.
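Besides namespaces, name mangling is also what lets overloaded functions coexist, since the parameter types are encoded in the symbol name. The sketch below contrasts two ordinary C++ overloads with one function given C linkage; add, ccc_add and demo_linkage are illustrative names, not part of the lecture's program.

```cpp
#include <cassert>

// Two C++ overloads: their mangled symbol names encode the parameter types,
// so both can live in the same object file.
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// With C linkage the symbol is the plain name `ccc_add`, so a second
// extern "C" function with this name but other parameters is not allowed.
extern "C" int ccc_add(int a, int b) { return a + b; }

void demo_linkage() {
  assert(add(1, 2) == 3);
  assert(add(1.5, 2.0) == 3.5);
  assert(ccc_add(2, 3) == 5);
}
```

Running objdump -t on the resulting object file should show two mangled add symbols and one unmangled ccc_add.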
One benefit of using C bindings is that it’s much more widely supported in other languages, such as Rust, Go, Python, Zig, Swift, Hare, Nim, Ruby, Crystal, Java, C#, PHP, R, Kotlin, Dart, … … … yes, pretty much every language in existence (with FFI support).
This means that it is theoretically possible that libc is
written in some other language that isn’t C (e.g. the best language, Zig).
Usually, libraries will provide their own C header files. For
libc, these can be found under /usr/include in
most (or all?) Linux distributions.
$ grep '\bputs\b' /usr/include/stdio.h
extern int puts (const char *__s);
puts is actually declared in the standard libc headers
#include <...>?
#include <...> functions the same as #include "..." except that the file
named in ... is only searched for in the system include directories (such
as /usr/include) and any additional paths specified on the command line
using the -I compiler flag.
On the other hand, #include "..." will first search for ... as a path
relative to the including file, and if it cannot be found there, it will
be searched for using the same rules as <...>.
inline
Notice that the variable hello::goodbye_message and function
hello::print_goodbye are defined in both translation units.
This might be surprising if you’ve seen the following error message before:
// a.cpp
extern "C" int puts(const char*);
void print_hello() {
puts("Hello");
}
int main() {
print_hello();
}
// b.cpp
extern "C" int puts(const char*);
void print_hello() {
puts("Hello");
}
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o a.o a.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o b.o b.cpp
$ objdump -t a.o | c++filt | grep print
0000000000000000 g F .text 0000000000000015 print_hello()
$ objdump -t b.o | c++filt | grep print
0000000000000000 g F .text 0000000000000015 print_hello()
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror a.o b.o -o a.out
/usr/bin/ld: b.o: in function `print_hello()':
/__w/ccc-2/ccc-2/lectures/l01/duplicate/b.cpp:3: multiple definition of `print_hello()'; a.o:/__w/ccc-2/ccc-2/lectures/l01/duplicate/a.cpp:3: first defined here
clang: error: linker command failed with exit code 1 (use -v to see invocation)
This can be fixed by declaring both functions as inline.
// a.cpp
extern "C" int puts(const char*);
inline void print_hello() {
puts("Hello");
}
int main() {
print_hello();
}
// b.cpp
extern "C" int puts(const char*);
inline void print_hello() {
puts("Hello");
}
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o a.o a.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o b.o b.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror a.o b.o -o a.out
inline is used
Note that if inline is used, there is still only one
definition in the whole program; it is just that the definition is “included
inline” with the declaration in every translation unit that uses it, and
the linker keeps a single copy. If inline is used but the
definitions are not the same, the program is invalid and its behaviour is
undefined: strange behaviour can happen, and error
messages may or may not be printed.
inline in the complicated hello example and
it still worked!
That is because of the way you are allowed to override symbols in a static library. But that’s a little too advanced for now, so let’s ignore that.
static
The variable message is defined in both translation units as
well. However, we cannot use inline as the two translation
units have different definitions.
Here is what would happen if we did:
inline const char message[] = "Welcome to C++ crash course!";
inline in a.cpp
inline const char message[] = "Hello world!";
inline in b.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o a.o a.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o b.o b.cpp
$ rm -f b.a
$ ar rcs b.a b.o
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror a.o b.a -o a.out
$ ./a.out
Welcome to C++ crash course!
Welcome to C++ crash course!
Goodbye.
Goodbye.
Instead, we have defined the two variables with static.
Unlike inline, where there is still only one definition,
using static makes the symbol local to the current
translation unit, and thus there are actually two distinct variables that
are both named message.
Note: In this section, we show a lot of
objdump output. Rest assured that
you do not need to know these details! But
we’re showing this so you can get a general appreciation of
what’s going on, and hopefully this gives you more context on how other
parts of the language work.
In this program, we see that a.cpp is using some functions
that were defined in b.cpp, specifically
hello::print_hello().
How does main() know how to call
hello::print_hello() if it doesn’t exist? Let’s take a look
at the assembly using objdump -d.
$ objdump -d a.o | awk '/main>:/,/^$/ { print }'
0000000000000020 <main>:
20: 55 push %rbp
21: 48 89 e5 mov %rsp,%rbp
24: 48 83 ec 10 sub $0x10,%rsp
28: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
2f: e8 00 00 00 00 callq 34 <main+0x14>
34: e8 00 00 00 00 callq 39 <main+0x19>
39: e8 00 00 00 00 callq 3e <main+0x1e>
3e: 8b 04 25 00 00 00 00 mov 0x0,%eax
45: 48 83 c4 10 add $0x10,%rsp
49: 5d pop %rbp
4a: c3 retq
main in a.o
The instructions at 2f, 34, and
39 are all call instructions, corresponding to
the 3 function calls in the program, but they just have zeroes instead of
the relative address of the function to be called.
In other words, the compiler doesn’t yet know the address of
hello::print_hello(), so it cannot fill in the call target.
Now if we look at the disassembly for the final executable, we see:
$ objdump -d a.out | awk '/main>:/,/^$/ { print }'
0000000000401150 <main>:
401150: 55 push %rbp
401151: 48 89 e5 mov %rsp,%rbp
401154: 48 83 ec 10 sub $0x10,%rsp
401158: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
40115f: e8 cc ff ff ff callq 401130 <_Z11print_hellov>
401164: e8 37 00 00 00 callq 4011a0 <_ZN5hello11print_helloEv>
401169: e8 12 00 00 00 callq 401180 <_ZN5hello13print_goodbyeEv>
40116e: 8b 04 25 34 40 40 00 mov 0x404034,%eax
401175: 48 83 c4 10 add $0x10,%rsp
401179: 5d pop %rbp
40117a: c3 retq
40117b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
main in a.out
Now the call instructions contain the actual (relative) addresses of all the functions in the program.
This process of putting actual addresses in the required places in the assembly is called relocation, and is the primary job of the linker.
Let’s see exactly how this is done.
The first thing to note is that the compiler creates placeholder symbols for all the undefined symbols, as we have seen earlier. This allows the linker to know which definitions to associate to all the symbols across all translation units.
We can see from objdump that hello::print_hello() is
marked as *UND* in a.o, whereas it is defined in
b.o.
$ objdump -t a.o | c++filt | grep hello::print_hello
0000000000000000 *UND* 0000000000000000 hello::print_hello()
$ objdump -t b.o | c++filt | grep hello::print_hello
0000000000000000 g F .text 000000000000001a hello::print_hello()
hello::print_hello()
undefined in a.o but defined in b.o
Running objdump on the final executable, we see that the
linker has merged these two symbols and now there is just one
hello::print_hello()
in the program.
$ objdump -t a.out | c++filt | grep hello::print_hello
00000000004011a0 g F .text 000000000000001a hello::print_hello()
hello::print_hello()
defined in a.out
The next thing the compiler does is to record all the places that the
linker needs to “fill in” inside a data structure called the relocation
table. We can view this information by passing --reloc in
addition to -d.
$ objdump -d --reloc a.o | awk '/main>:/,/^$/ { print }'
0000000000000020 <main>:
20: 55 push %rbp
21: 48 89 e5 mov %rsp,%rbp
24: 48 83 ec 10 sub $0x10,%rsp
28: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
2f: e8 00 00 00 00 callq 34 <main+0x14>
30: R_X86_64_PLT32 _Z11print_hellov-0x4
34: e8 00 00 00 00 callq 39 <main+0x19>
35: R_X86_64_PLT32 _ZN5hello11print_helloEv-0x4
39: e8 00 00 00 00 callq 3e <main+0x1e>
3a: R_X86_64_PLT32 _ZN5hello13print_goodbyeEv-0x4
3e: 8b 04 25 00 00 00 00 mov 0x0,%eax
41: R_X86_64_32S _ZN5hello9exit_codeE
45: 48 83 c4 10 add $0x10,%rsp
49: 5d pop %rbp
4a: c3 retq
main in
a.o
Check your understanding: Why is forgetting to define a function that is already declared not a compile time error, but is instead a link time error?
void f();
int main() {
f();
return 0;
}
undefined.cpp
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror -c -MMD -o undefined.o undefined.cpp
$ # No error at compile time!
$ clang++ -g -std=c++20 -Wpedantic -Wall -Wextra -Wconversion -Werror undefined.o -o undefined.out
/usr/bin/ld: undefined.o: in function `main':
/__w/ccc-2/ccc-2/lectures/l01/undefined-fun-error/undefined.cpp:4: undefined reference to `f()'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
$ # We got an error at link time
Let’s take one last look at objdump output, but now looking
at the function we didn’t define, puts.
$ objdump -t a.out | c++filt | grep 'puts'
0000000000000000 F *UND* 0000000000000000 puts@@GLIBC_2.2.5
puts is still undefined even in the final
executable
Notice that it is still undefined, even though we now have an executable. What’s going on?
The way this works is that puts is “linked at runtime” by the dynamic
loader when the program starts, a process called loading.
We can view the list of shared libraries that the executable will try to
load by using ldd.
$ ldd a.out
linux-vdso.so.1 (0x00007ffff7fcd000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007ffff7dde000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007ffff7c8f000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007ffff7c74000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7a82000)
/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fcf000)
We can see that libc.so.6 will be loaded, and we can find the
puts symbol in there, again with objdump:
$ # This time, we use -T instead of -t in order to display dynamic symbols.
$ objdump -T "$(ldd a.out | grep libc.so | tr ' ' '\n' | grep /)" | grep '\bputs\b'
0000000000084420 w DF .text 00000000000001dc GLIBC_2.2.5 puts
You should know:
extern)
extern)
What an inline declaration is (a declaration with the (one unique) definition included)
What a static declaration is (a declaration local to the translation unit)
What #include and #pragma once do
What extern "C" does
// stack.cpp
#include <cstddef>
#include <iostream>
int* make_heap_int(int n);
int* make_dangling_pointer(int n);
void add_ten_bad(int x);
void add_ten(int* x);
int main() {
{
int* px = make_heap_int(69);
{
int* dangled = make_dangling_pointer(420);
std::cout << "*dangled = " << *dangled << "\n";
}
std::cout << "*px = " << *px << "\n";
add_ten_bad(*px);
std::cout << "*px = " << *px << "\n";
add_ten(px);
std::cout << "*px = " << *px << "\n";
{
int y = 69;
std::cout << "y = " << y << "\n";
add_ten_bad(y);
std::cout << "y = " << y << "\n";
add_ten(&y);
std::cout << "y = " << y << "\n";
}
int** ppx = new int*;
*ppx = px;
std::cout << "**ppx = " << **ppx << "\n";
delete ppx;
delete px;
}
}
int* make_heap_int(int n) {
int x = n + 1;
int* pn = new int;
*pn = x;
return pn;
}
int* make_dangling_pointer(int n) {
int x = n + 1;
return &x;
}
void add_ten_bad(int x) {
x += 10;
}
void add_ten(int* x) {
*x += 10;
}
Some of you may have learnt from CS1101S (or SICP JS) that the environment model can be used to model how JavaScript objects are arranged in memory.
C and C++ do not work the same way. Since these are lower level languages that allow for direct access to memory and memory addresses, a relatively abstract model like the environment model would not model the semantics of a C++ program accurately.
On the other hand, a model that directly models the way the operating
system gives the program pages to use (via mmap) and how the
program subsequently manages the memory (new and
delete updating a bunch of data structures) would be far too
detailed, and in some cases, even inaccurate!
Instead, we use the stack and heap model to visualise how objects are laid out in memory.
int* make_heap_int(int n) {
int x = n + 1;
int* pn = new int;
*pn = x;
return pn;
}
int* make_dangling_pointer(int n) {
int x = n + 1;
return &x;
}
int* px = make_heap_int(69);
{
int* dangled = make_dangling_pointer(420);
std::cout << "*dangled = " << *dangled << "\n";
}
Immediately after entering make_heap_int, the state of memory
looks something like this:
The first statement, int x = n + 1;, introduces a new local variable. This pushes a new object onto the
stack, and so we have:
Now new int allocates a new object on the heap, and returns the address of
that object. Unlike objects on the stack, which are allocated and
deallocated in last-in, first-out order, objects on the heap can be
allocated and deallocated in any order.
Unlike objects on the stack, which are usually referred to directly by
name, we have a pointer to the object we just allocated on the heap. To
work with it, we need to dereference the pointer, e.g. *pn = x;.
Now we return pn;, so we have:
We can also create pointers by using & to get the address
of an object.
In make_dangling_pointer, we misuse
& to get the address of an object on the stack.
When make_dangling_pointer returns, the objects on its part of the stack
are deallocated, and so the return value now points to an object that has
been deallocated. This is what we call a dangling pointer, and using it in
any way is always a bug.
Make sure you understand how the following code works:
void add_ten_bad(int x) {
x += 10;
}
void add_ten(int* x) {
*x += 10;
}
std::cout << "*px = " << *px << "\n";
add_ten_bad(*px);
std::cout << "*px = " << *px << "\n";
add_ten(px);
std::cout << "*px = " << *px << "\n";
{
int y = 69;
std::cout << "y = " << y << "\n";
add_ten_bad(y);
std::cout << "y = " << y << "\n";
add_ten(&y);
std::cout << "y = " << y << "\n";
}
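To check your reading of the code above, here is a small variation with the expected values written as assertions. The function names are changed (add_ten_to_copy, add_ten_in_place) and the by-value version returns its local copy, purely so that both effects can be observed; this is a sketch, not the lecture's exact code.

```cpp
#include <cassert>

// Receives a copy of the caller's int; only the copy is changed.
int add_ten_to_copy(int x) {
  x += 10;
  return x;
}

// Receives the address of the caller's int and changes it in place.
void add_ten_in_place(int* x) {
  *x += 10;
}

void demo_add_ten() {
  int y = 69;
  int copy_result = add_ten_to_copy(y);
  assert(copy_result == 79);  // the copy was modified...
  assert(y == 69);            // ...but the caller's y is untouched
  add_ten_in_place(&y);
  assert(y == 79);            // modified through the pointer
}
```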
Since pointers themselves are just objects, we can create a pointer to a pointer object.
int** ppx = new int*;
*ppx = px;
std::cout << "**ppx = " << **ppx << "\n";
We now have 2 objects on the heap. Unless we clean them up, they will stay around and consume RAM, so it’s important to manage your memory and ensure that every allocated object is deallocated whenever there is no longer a need for it.
delete ppx;
delete px;
delete
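The pointer-to-pointer steps and the matching cleanup can be put together in one function, sketched below. The function name and the starting value 70 are arbitrary choices for this example.

```cpp
#include <cassert>

int read_through_pointer_to_pointer() {
  int* px = new int{70};     // heap object 1: an int
  int** ppx = new int*{px};  // heap object 2: a pointer to that int
  **ppx += 10;               // follow both pointers and modify the int
  int result = *px;          // both paths reach the same int
  delete ppx;                // frees only the pointer object
  delete px;                 // frees the int itself
  return result;
}

void demo_pointer_to_pointer() {
  assert(read_through_pointer_to_pointer() == 80);
}
```

Each new is paired with exactly one delete. Deleting ppx does not free the int it points to, which is why px needs its own delete.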
The stack and heap model we showed does not correspond 100% to the actual state of the memory, but it is a reasonable first approximation.
Since it is likely you will be exposed to more assembly, we’ll show you a little taste of it and explain how arguments are passed to a function.
For learning about the lower level details of how C++ is compiled, Compiler Explorer (aka Godbolt) is extremely useful.
// calling.cpp
#include <cstddef>  // for size_t
#include <cstdint>
#include <iostream>
struct NumberList {
int numbers[10];
};
int multiply_sum_numbers(NumberList nums, int mult);
NumberList multiply_numbers(NumberList nums, int mult);
int sum_six(int a, int b, int c, int d, int e, int f);
int sum_eight(int a, int b, int c, int d, int e, int f, int g, int h);
int identity(int a);
int main() {
{
int one_to_six_sum = sum_six(1, 2, 3, 4, 5, 6);
std::cout << "sum(1..6) = " << one_to_six_sum << "\n";
if (one_to_six_sum < 25) {
int one_to_eight_sum = sum_eight(1, 2, 3, 4, 5, 6, 7, 8);
std::cout << "sum(1..8) = " << one_to_eight_sum << "\n";
}
}
{
NumberList numbers{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}};
int multiply_sum = multiply_sum_numbers(numbers, 2);
std::cout << "multiply_sum = " << multiply_sum << "\n";
}
{
NumberList numbers{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}};
NumberList new_numbers = multiply_numbers(numbers, 2);
for (size_t i = 0; i < 10; i++) {
std::cout << "new_nums[" << i << "] = " << new_numbers.numbers[i]
<< "\n";
}
}
}
NumberList multiply_numbers(NumberList nums, int mult) {
int* num = &nums.numbers[0];
while (num != &nums.numbers[10]) {
*num *= mult;
++num;
}
return nums;
}
int multiply_sum_numbers(NumberList nums, int mult) {
int sum = 0;
int* num = &nums.numbers[0];
while (num != &nums.numbers[10]) {
*num *= mult;
sum += *num;
++num;
}
return sum;
}
int identity(int a) {
return a;
}
int sum_six(int a, int b, int c, int d, int e, int f) {
return a + b + c + d + e + f;
}
int sum_eight(int a, int b, int c, int d, int e, int f, int g, int h) {
return a + b + c + d + e + f + g + h;
}
When there are 6 or fewer integer (or pointer) arguments, the (System V) calling convention passes them in registers.
int one_to_six_sum = sum_six(1, 2, 3, 4, 5, 6);
$ cat calling.s | c++filt | grep -B6 -A1 'call.*sum_six('
mov edi, 1
mov esi, 2
mov edx, 3
mov ecx, 4
mov r8d, 5
mov r9d, 6
call sum_six(int, int, int, int, int, int)
mov dword ptr [rbp - 8], eax
Notice the mov instructions that load the arguments into registers before the call.
When there are more than 6 arguments, the (System V) calling convention spills the excess arguments onto the stack.
int one_to_eight_sum = sum_eight(1, 2, 3, 4, 5, 6, 7, 8);
$ cat calling.s | c++filt | grep -B8 -A1 'call.*sum_eight('
mov edi, 1
mov esi, 2
mov edx, 3
mov ecx, 4
mov r8d, 5
mov r9d, 6
mov dword ptr [rsp], 7
mov dword ptr [rsp + 8], 8
call sum_eight(int, int, int, int, int, int, int, int)
mov dword ptr [rbp - 12], eax
Notice the mov to stack locations after the movs
to registers.
When an argument is too large to fit in registers (such as our 40-byte NumberList), it doesn’t get allocated a register and instead is always put on the stack.
NumberList numbers{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}};
int multiply_sum = multiply_sum_numbers(numbers, 2);
$ cat calling.s | c++filt | grep -B8 -A1 'call.*multiply_sum_numbers('
mov rcx, qword ptr [rbp - 72]
mov rax, rsp
mov qword ptr [rax + 32], rcx
movups xmm0, xmmword ptr [rbp - 104]
movups xmm1, xmmword ptr [rbp - 88]
movups xmmword ptr [rax + 16], xmm1
movups xmmword ptr [rax], xmm0
mov edi, 2
call multiply_sum_numbers(NumberList, int)
mov dword ptr [rbp - 60], eax
Notice that the compiler uses a series of mov to copy the
array to [rax + *] (a stack location). The callee knows where
to access this array using the same rules as if there were more than 6
arguments. The second argument is passed as if it was the first argument
(in rdi), since the real first argument didn’t use up this
register.
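At the C++ level, this copy onto the stack is what gives pass-by-value its meaning: the callee works on its own copy of the struct. A small sketch, using a hypothetical function double_and_sum in place of the lecture's multiply_sum_numbers:

```cpp
#include <cassert>

struct NumberList {
  int numbers[10];
};

// Takes its argument by value, so `nums` is the callee's private copy.
int double_and_sum(NumberList nums) {
  int sum = 0;
  for (int& n : nums.numbers) {
    n *= 2;  // modifies the copy only
    sum += n;
  }
  return sum;
}

void demo_pass_by_value() {
  NumberList numbers{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}};
  assert(double_and_sum(numbers) == 110);
  // The caller's array is unchanged, because the callee modified a copy.
  assert(numbers.numbers[0] == 1);
  assert(numbers.numbers[9] == 10);
}
```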
Functions usually start with a few instructions that save registers
belonging to the caller. These are called callee-saved registers: if the
callee wants to use them, it must restore their original values before
returning. There are also registers that are caller-saved: the caller must
save these before the call instruction if it still needs their values, and
the callee is free to use these registers for its own purposes.
An example of a callee-saved register is rbp, and an example
of a caller-saved register is rax.
int identity(int a) {
return a;
}
$ cat calling.s | c++filt | grep -v '.cfi_' | awk '/identity.*:/,/ret$/ { print }'
identity(int): # @identity(int)
# %bb.0:
push rbp
mov rbp, rsp
mov dword ptr [rbp - 4], edi
mov eax, dword ptr [rbp - 4]
pop rbp
ret
Notice the way rbp is saved using the
push instruction, which saves it to the stack. Just before
the function returns, it restores rbp with the
pop instruction.
On the other hand, the callee is freely using rax without
saving and restoring its value.
Return values are usually returned in rax, and slightly larger
return values are returned in two parts, in rax and
rdx.
int sum_six(int a, int b, int c, int d, int e, int f) {
return a + b + c + d + e + f;
}
$ cat calling.s | c++filt | grep -v '.cfi_' | awk '/sum_six.*:/,/ret$/ { print }'
sum_six(int, int, int, int, int, int): # @sum_six(int, int, int, int, int, int)
# %bb.0:
push rbp
mov rbp, rsp
mov dword ptr [rbp - 4], edi
mov dword ptr [rbp - 8], esi
mov dword ptr [rbp - 12], edx
mov dword ptr [rbp - 16], ecx
mov dword ptr [rbp - 20], r8d
mov dword ptr [rbp - 24], r9d
mov eax, dword ptr [rbp - 4]
add eax, dword ptr [rbp - 8]
add eax, dword ptr [rbp - 12]
add eax, dword ptr [rbp - 16]
add eax, dword ptr [rbp - 20]
add eax, dword ptr [rbp - 24]
pop rbp
ret
Notice that the sum is stored in eax, the lower 32 bits of rax.
Finally, just as large arguments are passed on the stack, large return values are returned on the stack, but this requires some coordination with the caller.
First the caller needs to allocate space on the stack, then it passes the
pointer to this space as a hidden first argument (rdi). The
second argument (2) which was originally passed in
rdi now gets allocated the second register used for argument
passing (rsi).
NumberList numbers{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}};
NumberList new_numbers = multiply_numbers(numbers, 2);
$ cat calling.s | c++filt | grep -B8 -A1 'call.*multiply_numbers('
mov rax, rsp
mov qword ptr [rax + 32], rcx
movups xmm0, xmmword ptr [rbp - 224]
movups xmm1, xmmword ptr [rbp - 208]
movups xmmword ptr [rax + 16], xmm1
movups xmmword ptr [rax], xmm0
lea rdi, [rbp - 184]
mov esi, 2
call multiply_numbers(NumberList, int)
mov qword ptr [rbp - 232], 0
The callee may then return the result simply by storing it to this memory location.
Now let’s explain the actual call instruction itself:
$ cat calling.s | c++filt | grep 'call.*multiply_sum_numbers('
call multiply_sum_numbers(NumberList, int)
Notice that at the end of multiply_sum_numbers, we have a
corresponding ret, which jumps back to inside
main, after the call.
We know how call knows where to jump to: it simply needs to
jump to multiply_sum_numbers. But
multiply_sum_numbers could be called from multiple places, so
how does ret know to jump back to main in this case?
The way this works is that call actually pushes the address of the
instruction following the call (the return address) onto the stack. ret
then pops this address off the stack and jumps to it.
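The return address can be observed from C++ with a compiler-specific builtin. The sketch below assumes GCC or Clang: __builtin_return_address, the noinline attribute and the asm statement are extensions of those compilers, not standard C++, and the function names are made up for this example.

```cpp
#include <cassert>

// GCC/Clang builtin (not standard C++): yields the return address recorded
// by the call instruction for the current invocation of this function.
__attribute__((noinline)) void* return_address_of_this_call() {
  // Empty asm statement acting as an optimisation barrier, so that the two
  // calls below are not merged into one by the optimiser.
  asm volatile("" ::: "memory");
  return __builtin_return_address(0);
}

void demo_return_addresses() {
  void* after_first_call = return_address_of_this_call();
  void* after_second_call = return_address_of_this_call();
  // Two different call sites record two different return addresses.
  assert(after_first_call != after_second_call);
}
```

The same function is called twice, yet it returns to a different place each time, because each call instruction records the address of the instruction that follows it.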
Broadly speaking, C++ has a few categories of types:
built-in types (int, float)
pointer types (int*)
array types (int[10])
class types (struct foo { int x; })
reference types (int&)
There is also the void type, which doesn’t really fit into
any category – it represents the absence of a value.
The built-in types, also known as the fundamental types of C++, are language-defined types that are available without any headers (and their type names are keywords). There are also compound types like structs and arrays that are composed of these fundamental types (or other compound types). We’ll cover these below.
Arithmetic types are the numeric types of C++. There are integral and floating-point arithmetic types, and for the integral ones, there are signed and unsigned variants.
As a refresher, signed types can be used to represent both positive and
negative numbers, while unsigned types can only represent positive numbers
(and 0). Floating-point types do not have signed/unsigned variants. By
default (ie. without the unsigned specifier), types are
signed.
This is the list of built-in arithmetic types:
char, unsigned char, signed char
short, unsigned short
int, unsigned int
long, unsigned long
long long, unsigned long long
float
double
long double
Unfortunately, there isn’t a fixed size for each of these types; the C++
standard only guarantees that each arithmetic type has a
minimum size (eg. int is only guaranteed to hold
values in the range [-32768, +32767]). Their actual sizes
(and ranges) are platform-specific. A list of common platforms and their
integer sizes can be found
here.
Thankfully, there is a standard library header,
<cstdint> that defines the fixed-sized integer types
(u?)int(8|16|32|64)_t (eg. uint8_t,
int64_t) that are guaranteed to be exactly that size. When
you need to ensure that your variable can represent numbers of a certain
range, or to ensure conformance to some binary format, you should use
these types.
As an aside, size_t should be the type used for representing
“sizes”, or “counts”. For example, you should use size_t to
hold the number of elements in an array, rather than int or
long. This type is declared in the
<cstddef> header.
Unlike in C, bool is an actual type that can hold either the
value true or false. While it is possible to
think of these as 1 and 0 respectively (and converting them to an
integer does yield exactly 1 and 0), the standard does not require their
in-memory representation to use exactly these values.
The standard specifies how boolean types are represented:
Type bool is a distinct type that has the same object representation, value representation, and alignment requirements as an implementation-defined unsigned integer type. The values of type bool are true and false.
// pointers.cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
static void print_message(const char* msg);
static void call_function(void (*fn)(const char*));
int main() {
int foo = 69;
int* p_foo = &foo;
std::cout << "----- pointers -----\n";
std::cout << "foo = " << foo << "\n"; // prints 69
std::cout << "p_foo = " << p_foo << "\n"; // prints (eg.) 0x1000000
std::cout << "*p_foo = " << *p_foo << "\n"; // prints 69
std::cout << "pointer arithmetic:\n";
std::cout << "p_foo = " << p_foo << "\n"; // (eg.) 0x1000000
std::cout << "p_foo+1 = " << p_foo + 1 << "\n"; // 0x1000004
std::cout << "p_foo+9 = " << p_foo + 9 << "\n"; // 0x1000024
// two-star programmer
int** p_p_foo = &p_foo;
std::cout << "**p_p_foo = " << **p_p_foo << "\n";
std::cout << "p_foo = " << p_foo << "\n";
std::cout << "p_p_foo = " << p_p_foo << "\n";
std::cout << "\n";
{
int x = 10;
int y = 20;
int* px = &x;
const int* pcx = &x; // pointer-to-const X
int const* pcx2 = &x; // (also) pointer-to-const X
int* const cpx = &x; // const-pointer-to X
const int* const cpcx = &x; // const-pointer-to-const X
px = &y; // works, pointer is not const
*px = 10; // works, pointed-to is not const
pcx = &y; // works, pointer is not const
*cpx = 10; // works, pointed-to is not const
/* *pcx = 10; */ // doesn't work, pointed-to is const
/* cpx = &y; */ // doesn't work, pointer is const
/* cpcx = &y; */ // doesn't work, pointer is const
/* *cpcx = 3; */ // doesn't work, pointed-to is const
}
std::cout << "\n----- function pointers -----\n";
{
call_function(print_message);
call_function(&print_message);
void (*foo)(const char*) = print_message;
call_function(foo);
call_function(*foo);
call_function(**foo);
// 20-star programmer
call_function(********************foo);
}
}
static void print_message(const char* msg) {
std::cout << "hello, the message is '" << msg << "'\n";
}
static void call_function(void (*fn)(const char*)) {
fn("there is no message");
(*****fn)("uwu");
}
In C++, you can take a pointer to any type; a pointer is conceptually just
the memory location (ie. address) of another object — which is the thing
it points to. For example, an int* points to an integer
object, an int** points to an integer pointer, and so on.
To get a pointer value, you can take the address of an object
using the & operator, eg. &x. The type
of the pointer from such an operation is naturally a pointer to the type
of the object; eg. if x is an int, then
&x is an expression of type int*. To
dereference a pointer (and get the pointed-to value), use the unary
* operator.
int foo = 69;
int* p_foo = &foo;
std::cout << "----- pointers -----\n";
std::cout << "foo = " << foo << "\n"; // prints 69
std::cout << "p_foo = " << p_foo << "\n"; // prints (eg.) 0x1000000
std::cout << "*p_foo = " << *p_foo << "\n"; // prints 69
Unlike with C’s NULL constant, nullptr is an
actual keyword in C++, and is specifically designed for pointer types.
It doesn’t rely on implicit void*-to-T* conversions, but it
can be assigned to any pointer type!
Prefer to use nullptr instead of 0.
You can also take the address of a pointer itself (they are not special), yielding a pointer to a pointer:
// two-star programmer
int** p_p_foo = &p_foo;
std::cout << "**p_p_foo = " << **p_p_foo << "\n";
std::cout << "p_foo = " << p_foo << "\n";
std::cout << "p_p_foo = " << p_p_foo << "\n";
C++ allows arithmetic on pointers, effectively treating them as arrays.
That is, if we have some int* x with an address of
0x1000, then x + 1 would be 0x1004,
not 0x1001 (assuming that sizeof(int) == 4). In
general, the increment corresponds to the size of the pointed-to type.
std::cout << "p_foo = " << p_foo << "\n"; // (eg.) 0x1000000
std::cout << "p_foo+1 = " << p_foo + 1 << "\n"; // 0x1000004
std::cout << "p_foo+9 = " << p_foo + 9 << "\n"; // 0x1000024
Variables can be made const so that they can’t be modified, and this
applies to pointers as well. However, since we are dealing with pointers,
we should also consider whether or not the pointed-to object can be
modified (inner const-ness). Since there are two places we can
(potentially) put const, we have 4 possible combinations:
int x = 10;
int y = 20;
int* px = &x;
const int* pcx = &x; // pointer-to-const X
int const* pcx2 = &x; // (also) pointer-to-const X
int* const cpx = &x; // const-pointer-to X
const int* const cpcx = &x; // const-pointer-to-const X
px = &y; // works, pointer is not const
*px = 10; // works, pointed-to is not const
pcx = &y; // works, pointer is not const
*cpx = 10; // works, pointed-to is not const
/* *pcx = 10; */ // doesn't work, pointed-to is const
/* cpx = &y; */ // doesn't work, pointer is const
/* cpcx = &y; */ // doesn't work, pointer is const
/* *cpcx = 3; */ // doesn't work, pointed-to is const
In the diagram below, red denotes things that cannot be changed: a red box means the pointer itself can’t be changed, while a red arrow means the thing that is pointed to cannot be changed.
Note that const int* x and int const* x are
equivalent declarations; see
this page
(or search “east-const” and “west-const”) for more information.
Functions are special because they are not objects, and so you cannot make variables, arrays, etc. of function type. However, you can take a pointer to the function, which acts just like a normal pointer. Suppose we have the following functions:
static void print_message(const char* msg) {
std::cout << "hello, the message is '" << msg << "'\n";
}
static void call_function(void (*fn)(const char*)) {
fn("there is no message");
(*****fn)("uwu");
}
The first simply takes a message and prints it, while the second takes a
function pointer, and calls it with a message. The syntax for a function
pointer type is a little weird, but it declares fn as a
pointer to a function returning void and taking a
const char* parameter.
Note that we don’t actually need to dereference the pointer to call it — function pointers are special in this regard. However, we can also dereference it as many times as we want, and it still behaves like a function.
The reason for this behaviour is that a “function value” usually immediately turns into (through implicit conversion) a function pointer, so any expression that uses it — like a dereference — will just yield another pointer, which can be dereferenced again.
Next, we can call our wrapper function:
call_function(print_message);
call_function(&print_message);
Again, note that taking the address of the function with
& is optional. We can also create local variables of
function-pointer type, which behave in much the same way:
void (*foo)(const char*) = print_message;
call_function(foo);
call_function(*foo);
call_function(**foo);
// 20-star programmer
call_function(********************foo);
// references.cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
static void print_message(const char* msg);
static void call_function(void (&fn)(const char*));
int main() {
int foo = 69;
int& r_foo = foo;
std::cout << "r_foo = " << r_foo << "\n";
// addresses are the same
std::cout << "&foo = " << &foo << "\n";
std::cout << "&r_foo = " << &r_foo << "\n";
// transparently "decays" to an int
int uwu = r_foo;
std::cout << "uwu = " << uwu << "\n";
// make references from pointers by dereferencing
int* p_foo = &foo;
int& r_foo2 = *p_foo;
int bar = 123;
r_foo = bar;
std::cout << "foo = " << foo << "\n"; // prints 123
int x = 10;
const int& rcx = x; // reference-to-const X
int const& rcx2 = x; // (also) reference-to-const X
/* rcx = 10; */ // doesn't work, referenced-to is const
/* int& const crx = x; */ // illegal -- cannot have const reference
{
void (*ptr)(const char*) = print_message;
call_function(print_message);
call_function(*ptr);
/* call_function(&print_message); */ // does not compile
/* call_function(ptr); */ // does not compile
}
}
static void print_message(const char* msg) {
std::cout << "hello, the message is '" << msg << "'\n";
}
static void call_function(void (&fn)(const char*)) {
fn("there is no message");
(*****fn)("uwu");
}
References are new in C++ (they do not exist in C); they can be thought of as “transparent pointers”. They implicitly convert to a “normal” value (ie. a non-reference) when required, and the address of a reference is the address of the value it refers to. In this sense, a reference is not a real object, since it is not required to have storage.
In fact, you cannot have references to references, pointers to references, or arrays of references. This means that a reference does not have an “address” or “memory location” at all according to the standard.
int foo = 69;
int& r_foo = foo;
std::cout << "r_foo = " << r_foo << "\n";
// addresses are the same
std::cout << "&foo = " << &foo << "\n";
std::cout << "&r_foo = " << &r_foo << "\n";
// transparently "decays" to an int
int uwu = r_foo;
std::cout << "uwu = " << uwu << "\n";
// make references from pointers by dereferencing
int* p_foo = &foo;
int& r_foo2 = *p_foo;
Another quirk of references (that stems from the above) is that they cannot be “re-pointed” to refer to something else. After they are initialized, they act as an alias for the referred-to object, and any assignments to the reference will change the object itself.
int bar = 123;
r_foo = bar;
std::cout << "foo = " << foo << "\n"; // prints 123
A side effect of this is that references must be initialized, and so it is illegal to have uninitialized or null references:
int& x; // will not compile
Similar to pointers, references also have a concept of “inner” const-ness. However, since references themselves cannot be reassigned (see above: you can’t “repoint” a reference), it doesn’t make sense to make the reference itself const.
int x = 10;
const int& rcx = x; // reference-to-const X
int const& rcx2 = x; // (also) reference-to-const X
/* rcx = 10; */ // doesn't work, referenced-to is const
/* int& const crx = x; */ // illegal -- cannot have const reference
Much like function pointers, function references also exist. You can take a reference to a function just like any other value; since functions implicitly convert to function pointers though, you can also dereference function references, which might be unintuitive.
static void print_message(const char* msg) {
std::cout << "hello, the message is '" << msg << "'\n";
}
static void call_function(void (&fn)(const char*)) {
fn("there is no message");
(*****fn)("uwu");
}
However, a pointer is not a reference, and so you cannot pass a function pointer to someone expecting a function reference.
void (*ptr)(const char*) = print_message;
call_function(print_message);
call_function(*ptr);
/* call_function(&print_message); */ // does not compile
/* call_function(ptr); */ // does not compile
To get a function reference from a function pointer, simply dereference the pointer.
// arrays.cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
static void print_array(int* array, size_t num_elements);
static void print_array_better(int (&array)[3]);
static void print_matrix(int (*mat)[2]);
static void print_matrix_better(int (&mat)[3][2]);
int main() {
{
int ax[3] = {100, 200, 420};
int bx[5] = {69};
std::cout << "ax[0] = " << ax[0] << "\n";
int* ptr_ax0 = ax;
int* ptr_first = &ax[0];
int* ptr_ax3 = ax + 3;
int* ptr_fourth = &ax[3];
std::cout << "ptr_ax0 = " << ptr_ax0 << "\n"; // 0x1000
std::cout << "ptr_first = " << ptr_first << "\n"; // 0x1000
std::cout << "ptr_ax3 = " << ptr_ax3 << "\n"; // 0x100C
std::cout << "ptr_fourth = " << ptr_fourth << "\n"; // 0x100C
std::cout << "\n----- array-to-pointer decay -----\n";
print_array(ax, 3);
std::cout << "\n----- references-to-array -----\n";
print_array_better(ax);
// print_array_better(bx); // doesn't compile
}
std::cout << "\n----- multidimensional arrays -----\n";
{
int matrix[3][2] = {{1, 2}, {3, 4}, {5, 6}};
std::cout << "matrix[2][1] = " << matrix[2][1] << "\n";
print_matrix(matrix);
print_matrix_better(matrix);
std::cout << "\n----- multidimensional arrays (safety) -----\n";
int smol_matrix[][2] = {{69, 420}, {42, 1337}};
print_matrix(smol_matrix); // oops
// print_matrix_better(smol_matrix); // doesn't compile
}
}
static void print_array(int* array, size_t num_elements) {
for (size_t i = 0; i < num_elements; i++)
std::cout << "array[" << i << "] = " << array[i] << "\n";
std::cout << "last_elem = " << array[num_elements - 1] << "\n";
// std::cout << "last_elem = " << array[num_elements] << "\n";
}
static void print_array_better(int (&array)[3]) {
size_t num_elements = sizeof(array) / sizeof(array[0]);
for (size_t i = 0; i < num_elements; i++)
std::cout << "array[" << i << "] = " << array[i] << "\n";
// std::cout << "last_elem = " << array[2] << "\n";
// std::cout << "last_elem = " << array[10] << "\n";
}
static void print_matrix(int (*mat)[2]) {
std::cout << "print_matrix:\n";
std::cout << " " << mat[0][0] << " " << mat[0][1] << "\n";
std::cout << " " << mat[1][0] << " " << mat[1][1] << "\n";
std::cout << " " << mat[2][0] << " " << mat[2][1] << "\n";
}
static void print_matrix_better(int (&mat)[3][2]) {
std::cout << "print_matrix_better:\n";
std::cout << " " << mat[0][0] << " " << mat[0][1] << "\n";
std::cout << " " << mat[1][0] << " " << mat[1][1] << "\n";
std::cout << " " << mat[2][0] << " " << mat[2][1] << "\n";
// std::cout << "kekw = " << mat[3][3] << "\n";
}
Unlike in many higher-level languages, arrays in C++ need to have a fixed size that
is part of the variable declaration (and hence part of its type);
int x[10] declares an array of 10 integers. Multi-dimensional
arrays are also possible, eg. int x[10][20]; this creates an
array of 10 arrays of 20 integers.
The “value” of an array is actually a memory location (address), of the
first element in the array. That is, &x[0] and
static_cast<int*>(x) have the same “value”.
int ax[3] = {100, 200, 420};
int bx[5] = {69};
std::cout << "ax[0] = " << ax[0] << "\n";
int* ptr_ax0 = ax;
int* ptr_first = &ax[0];
int* ptr_ax3 = ax + 3;
int* ptr_fourth = &ax[3];
std::cout << "ptr_ax0 = " << ptr_ax0 << "\n"; // 0x1000
std::cout << "ptr_first = " << ptr_first << "\n"; // 0x1000
std::cout << "ptr_ax3 = " << ptr_ax3 << "\n"; // 0x100C
std::cout << "ptr_fourth = " << ptr_fourth << "\n"; // 0x100C
As seen from above, the “value” of an array is exactly the address of its
first element. See also that arrays can implicitly convert to pointers (to
the same element type), and pointer arithmetic on an array works as if it
was done on the pointer. In this sense, &x[3] is
equivalent to x + 3.
Note that objects of array type cannot be reassigned; you can only assign
to the elements of the array. Furthermore, you cannot create arrays of
references, functions, or void.
Arrays simply place their elements contiguously in memory, with
increasing addresses. Assuming we have an array
int x[6] and sizeof(int) == 4, then the layout
in memory looks something like this:
If the element type of an array is itself an array, then you get multidimensional arrays. The following snippet declares a 3-by-2 (3 rows, 2 columns) matrix:
int matrix[3][2] = {{1, 2}, {3, 4}, {5, 6}};
std::cout << "matrix[2][1] = " << matrix[2][1] << "\n";
It is convenient to remember this as declaring an “array of 3 (array of 2 (int))”.
The usage (subscripting) follows the declaration; the first subscript
([2] in our case) indexes the first, “outer” array, and the
second one ([1]) indexes the “inner” array. Remember that
multidimensional arrays are just arrays of arrays.
With multidimensional arrays, there are two ways to place the elements:
as an array of rows, or as an array of columns. These are called row-major
and column-major respectively. C and C++ use row-major, which for some
array int A[4][3] looks like the following:
Each distinct colour is a “row” (there are 4 rows and 3 columns), and the rows are laid out contiguously in memory.
The other layout method which is used by (eg.) Fortran is column-major, where columns are stored contiguously in memory. The colours still represent rows, but note how contiguous memory locations make up the columns now:
As mentioned above, arrays are able to implicitly convert to a pointer to
their element type, eg. int x[10] can implicitly convert to
int*.
This is important because in function declarations, parameters of type
array-of-T are
defined by the standard
to be replaced by pointer-to-T. That is, you cannot actually
pass arrays to functions, only the pointer. By extension, this means that
arrays are always pass-by-reference!
To get around the pointer decay issue, you can just wrap your array in a struct:
struct Array { int x[10]; };
void foo(Array array) { /* ... */ }
Alternatively, use std::array, which we’ll introduce later
:P
For example, in the snippet below, we have a function to print an array:
static void print_array(int* array, size_t num_elements) {
for (size_t i = 0; i < num_elements; i++)
std::cout << "array[" << i << "] = " << array[i] << "\n";
std::cout << "last_elem = " << array[num_elements - 1] << "\n";
// std::cout << "last_elem = " << array[num_elements] << "\n";
}
However, this requires passing the length of the array separately, which makes us susceptible to bugs:
int x[3] = { 1, 2, 3 };
print_array(x, 5);
The compiler is unable to error (or even warn us) about this, because it doesn’t know how long the array actually is.
For multidimensional arrays, only the first “level” of an array can decay
this way; eg. for int x[10][20], it will decay to
int (*x)[20] (ptr-to-array of 20 ints), and not
int**. The reason is that the size of the inner array is
needed to ensure that the correct offset can be calculated when indexing
them.
This means that int[5][3] does not decay to
int**! A pointer-to-array and pointer-to-pointer are
fundamentally different types, so there is no implicit conversion here.
Given the following:
int x[3][7];
int (*px)[7] = x;
int* a = &x[2][4];
int* b = &px[2][4];
We would expect a and b to hold identical addresses.
How are these addresses calculated? The array bounds must be
known, because the offset (in bytes, from the start of x) is
4 * sizeof(int) + 2 * (sizeof(int[7])).
Here, int[7] is the type of the inner array. If
array-to-pointer decay also affected inner levels, then the size of the
inner array would be lost and the offset calculation would not work.
To get around this problem, we can make the function take in a reference to an array instead:
static void print_array_better(int (&array)[3]) {
size_t num_elements = sizeof(array) / sizeof(array[0]);
for (size_t i = 0; i < num_elements; i++)
std::cout << "array[" << i << "] = " << array[i] << "\n";
// std::cout << "last_elem = " << array[2] << "\n";
// std::cout << "last_elem = " << array[10] << "\n";
}
Note that the parentheses around the declaration are necessary due to C++’s weird declaration syntax. This ensures that the function can only accept an array that contains exactly 3 elements.
Three shall be the number thou shalt count, and the number of the counting shall be three. Four shalt thou not count, neither count thou two, excepting that thou then proceed to three. Five is right out.
This makes the following calls fail to compile:
int x[4] = { 1, 2, 3, 4 };
int y[2] = { 69, 420 };
print_array_better(x);
print_array_better(y);
It also means that we do not have to manually pass the array size to the function, since it is part of the parameter’s type.
To create a function accepting arrays of any size (and not just 3), you can use a templated function — stick around to find out how those work :D
template <size_t N>
void print_array_betterer(int (&arr)[N]) {
// ...
}
#include <cstdint>
#include <iostream>
int main() {
std::cout << "sizeof(uint8_t): " << sizeof(uint8_t) << "\t";
std::cout << "alignof(uint8_t): " << alignof(uint8_t) << "\n";
std::cout << "sizeof(uint16_t): " << sizeof(uint16_t) << "\t";
std::cout << "alignof(uint16_t): " << alignof(uint16_t) << "\n";
std::cout << "sizeof(uint32_t): " << sizeof(uint32_t) << "\t";
std::cout << "alignof(uint32_t): " << alignof(uint32_t) << "\n";
std::cout << "sizeof(double): " << sizeof(double) << "\t";
std::cout << "alignof(double): " << alignof(double) << "\n";
}
Before we talk about structures and their layout, we must discuss alignment and padding. Each type has both a size and an alignment, and for primitive types these are usually the same. Eg:
std::cout << "sizeof(uint8_t): " << sizeof(uint8_t) << "\t";
std::cout << "alignof(uint8_t): " << alignof(uint8_t) << "\n";
std::cout << "sizeof(uint16_t): " << sizeof(uint16_t) << "\t";
std::cout << "alignof(uint16_t): " << alignof(uint16_t) << "\n";
std::cout << "sizeof(uint32_t): " << sizeof(uint32_t) << "\t";
std::cout << "alignof(uint32_t): " << alignof(uint32_t) << "\n";
std::cout << "sizeof(double): " << sizeof(double) << "\t";
std::cout << "alignof(double): " << alignof(double) << "\n";
$ ./sizeof.out | head -n4
sizeof(uint8_t): 1 alignof(uint8_t): 1
sizeof(uint16_t): 2 alignof(uint16_t): 2
sizeof(uint32_t): 4 alignof(uint32_t): 4
sizeof(double): 8 alignof(double): 8
Observe that we can use the sizeof operator to get the size of a type
(it can also be used on an expression, eg. sizeof(x)), and the alignof
operator to get the alignment of a type.
The alignment of a type is the number of bytes between addresses that an object of that type can be allocated at. For example, if a type has an alignment of 8, then its objects must be allocated at addresses which are a multiple of 8. The C++ standard mandates that the alignment of a type is always a power-of-2.
Object alignment requirements are mainly a result of hardware; on some systems, accessing misaligned objects in memory (eg. a 4-byte word starting at an odd address) generates a CPU exception, and on other systems it works but is substantially slower.
The size of a type is simpler; it is simply the number of bytes that the object takes up in memory. Note that the size of a type must always be a multiple of its alignment (and hence at least as large as it); otherwise, objects of that type could not be placed in an array, since there would need to be padding between them.
For the types above, notice that the alignment of the type is the same as its size — this is not true for structs, which we will discuss now.
// structs.cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
int main() {
struct Box {
int8_t a;
int32_t b;
int8_t c;
};
Box box{42, 17, 69};
box.a = 100;
std::cout << "box.a = " << +box.a << "\n";
std::cout << "box.b = " << +box.b << "\n";
std::cout << "box.c = " << +box.c << "\n";
Box* box_ptr = &box;
std::cout << "box_ptr->b = " << box_ptr->b << "\n";
std::cout << "\n";
// sizeof can take both a type and an expression
std::cout << "sizeof(Box) = " << sizeof(Box) << "\t";
std::cout << "sizeof(int8_t) = " << sizeof(int8_t) << "\t";
std::cout << "sizeof(int32_t) = " << sizeof(int32_t) << "\n";
std::cout << "alignof(Box) = " << alignof(Box) << "\t";
std::cout << "alignof(int8_t) = " << alignof(int8_t) << "\t";
std::cout << "alignof(int32_t) = " << alignof(int32_t) << "\n";
std::cout << "\n";
std::cout //
<< "offsetof(Box, a) = " //
<< offsetof(Box, a) << "\n";
std::cout //
<< "offsetof(Box, b) = " //
<< offsetof(Box, b) << "\n";
std::cout //
<< "offsetof(Box, c) = " //
<< offsetof(Box, c) << "\n";
struct BoxActual {
int8_t a;
uint8_t _padding1[3];
int32_t b;
int8_t c;
uint8_t _padding2[3];
};
std::cout //
<< "sizeof(Box) = " //
<< sizeof(Box) << "\n";
std::cout //
<< "sizeof(BoxActual) = " //
<< sizeof(BoxActual) << "\n";
struct Box2 {
int8_t a;
int8_t c;
int32_t b;
};
std::cout //
<< "sizeof(Box2) = " //
<< sizeof(Box2) << "\n";
std::cout //
<< "alignof(Box2) = " //
<< alignof(Box2) << "\n";
std::cout << "\n----- structs and const-ness ----\n";
{
struct Thing {
int x;
int& rx;
int* px;
int ax[3];
};
int aoeu = 10;
Thing thing{aoeu, aoeu, &aoeu, {10, 20, 30}};
thing.x = 10;
thing.rx = 20;
*thing.px = 30;
thing.ax[0] = 1;
thing.ax[2] = 1;
const Thing const_thing{aoeu, aoeu, &aoeu, {10, 20, 30}};
// const_thing.x = 1; // doesn't work
// const_thing.px = nullptr; // doesn't work
*const_thing.px = 10; // works!
// const_thing.ax[0] = 1; // doesn't work
const_thing.rx = 10; // works!
const Thing* ptr_const_thing = &thing;
// ptr_const_thing->x = 10; // doesn't work
*ptr_const_thing->px = 10; // works
ptr_const_thing->rx = 10; // works
}
}
Structs allow you to group multiple objects together to create a new
compound object. A struct can contain fields as well
as methods (covered later), and other things (static members, nested
types, etc.).
To start with, a basic struct definition looks like this:
struct Box {
int8_t a;
int32_t b;
int8_t c;
};
You can initialize structs simply by providing their members in order:
Box box{42, 17, 69};
box.a = 100;
std::cout << "box.a = " << +box.a << "\n";
std::cout << "box.b = " << +box.b << "\n";
std::cout << "box.c = " << +box.c << "\n";
$ ./structs.out | head -n3
box.a = 100
box.b = 17
box.c = 69
In the example above, box is initialized with
a = 42, b = 17, c = 69. You can
also assign (or refer) to a specific field by using the . operator.
Unlike some other languages, C++ (and C) does not automatically
dereference pointers to structs. To access fields from pointer-to-struct,
use the -> operator:
Box* box_ptr = &box;
std::cout << "box_ptr->b = " << box_ptr->b << "\n";
An important aspect of understanding how structs work is how their fields are laid out in memory. While the fields will appear in the order that they are declared in the struct, there may be padding bytes between each field, and after the last field.
We already covered
sizeof and
alignof above,
so let’s look at the size and alignment of our Box struct:
std::cout << "sizeof(Box) = " << sizeof(Box) << "\t";
std::cout << "sizeof(int8_t) = " << sizeof(int8_t) << "\t";
std::cout << "sizeof(int32_t) = " << sizeof(int32_t) << "\n";
std::cout << "alignof(Box) = " << alignof(Box) << "\t";
std::cout << "alignof(int8_t) = " << alignof(int8_t) << "\t";
std::cout << "alignof(int32_t) = " << alignof(int32_t) << "\n";
$ ./structs.out | sed -n '6,7p'
sizeof(Box) = 12 sizeof(int8_t) = 1 sizeof(int32_t) = 4
alignof(Box) = 4 alignof(int8_t) = 1 alignof(int32_t) = 4
While the total size of the fields in the Box is only 6 (1 +
4 + 1), its actual size is 12! Also note that the alignment of a struct is
the alignment of its most-aligned member, which in this case is the
int32_t that
is 4-byte aligned.
Back to the size: the reason for it is padding, that is, extra
bytes inserted into the struct between (and after) fields to preserve the
alignment requirements of those fields. The layout of our Box in memory
really looks something like this:
struct BoxActual {
int8_t a;
uint8_t _padding1[3];
int32_t b;
int8_t c;
uint8_t _padding2[3];
};
std::cout //
<< "sizeof(Box) = " //
<< sizeof(Box) << "\n";
std::cout //
<< "sizeof(BoxActual) = " //
<< sizeof(BoxActual) << "\n";
$ ./structs.out | sed -n '12,13p'
sizeof(Box) = 12
sizeof(BoxActual) = 12
Box with padding illustrated
We see that there are two extra padding regions, 3 bytes each, that have
been inserted before field b and after field c.
The first padding is necessary to ensure that b exists on a
4-byte boundary (because it is 4-byte aligned), and the second padding
ensures that the size of the overall struct is a multiple of its
alignment.
The “equivalent” struct is on the right, with the padding bytes explicitly
shown. Note that the contents of these padding bytes cannot be relied on.
As a consequence, it is also incorrect to perform a byte-wise memory
comparison (using memcmp) of two struct objects if they have
padding, since their padding bytes might differ (even though their fields
are identical).
We can also confirm this by using the offsetof macro (from
<cstddef>), like so:
std::cout //
<< "offsetof(Box, a) = " //
<< offsetof(Box, a) << "\n";
std::cout //
<< "offsetof(Box, b) = " //
<< offsetof(Box, b) << "\n";
std::cout //
<< "offsetof(Box, c) = " //
<< offsetof(Box, c) << "\n";
$ ./structs.out | sed -n '9,11p'
offsetof(Box, a) = 0
offsetof(Box, b) = 4
offsetof(Box, c) = 8
It is possible to optimise the size of structs by rearranging fields, if you are aware of how padding works; if we look at an alternative arrangement that puts the two smaller fields first, we get a more compact layout with less padding:
struct Box2 {
int8_t a;
int8_t c;
int32_t b;
};
std::cout //
<< "sizeof(Box2) = " //
<< sizeof(Box2) << "\n";
std::cout //
<< "alignof(Box2) = " //
<< alignof(Box2) << "\n";
$ ./structs.out | sed -n '15,16p'
sizeof(Box2) = 8
alignof(Box2) = 4
Box2 with padding illustrated
// unions.cpp
#include <bit>
#include <iostream>
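// rip apple: fallback for standard libraries without std::bit_cast.
// (strictly speaking, user code is not allowed to add names to namespace std)
#if !defined(__cpp_lib_bit_cast)
#include <cstring>
namespace std {
template <typename To, typename From>
To bit_cast(const From& from) {
static_assert(sizeof(To) == sizeof(From));
// copy the bytes instead of reinterpret_cast, which would itself be UB here
To to;
std::memcpy(&to, &from, sizeof(To));
return to;
}
} // namespace std
#endif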
int main() {
union Box {
int x;
double y;
char zz[20];
};
std::cout << "sizeof(Box) = " //
<< sizeof(Box) << "\n";
std::cout << "alignof(Box) = " //
<< alignof(Box) << "\n";
Box box; // no member is active
box.x = 420; // x is active
box.y = 3.1; // y is active
// this is undefined behaviour!
std::cout << "box.x = " << box.x << "\n";
box.zz[0] = 0x2c; // zz is active
box.zz[1] = 0x0f; // zz is (still) active
box.zz[2] = 0x01; // ...
box.zz[3] = 0x00; // ...
// this is undefined behaviour!!
std::cout << "box.x = " << box.x << "\n";
{
char bytes[4]{0x2c, 0x0f, 0x01, 0x00};
int foo = std::bit_cast<int>(bytes);
std::cout << "foo = " << foo << "\n";
}
}
The last important compound type in C++ is the union type. Unions are superficially similar to structs, except that their memory layout (and indeed, their usage) is completely different.
Whereas struct members each have a unique offset (from the start of the struct), all union members begin at the same offset; hence you can think of it as a union of its fields, where their storage overlaps. As with structs, the alignment of a union is the alignment of its most-aligned field. The size, however, is determined by its largest member and not by the sum: it is the size of that member, rounded up to a multiple of the union’s alignment. This is why the Box below, whose largest member is 20 bytes, has a size of 24.
union Box {
int x;
double y;
char zz[20];
};
std::cout << "sizeof(Box) = " //
<< sizeof(Box) << "\n";
std::cout << "alignof(Box) = " //
<< alignof(Box) << "\n";
$ ./unions.out | head -n2
sizeof(Box) = 24
alignof(Box) = 8
The most common use of unions is to implement a type that can hold one of
several variant types, like enum in Rust. Note that plain unions are
typically not enough to implement something like this, but we’ll cover the
details a little later.
One important thing to note about unions is how they interact with object lifetimes. Each union object has (at most) one “active” member at a time:
Box box; // no member is active
box.x = 420; // x is active
box.y = 3.1; // y is active
However, what happens when you access a member of a union that is not active? Well, that is undefined behaviour:
box.y = 3.1; // y is active
// this is undefined behaviour!
std::cout << "box.x = " << box.x << "\n";
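The actual rules for determining which variant member of a union is active are somewhat involved, but the “short version” is that a member becomes active when its lifetime begins (for example, when you assign to it as above), and stays active until another member's lifetime begins.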
For more details, you can look at the following cppreference pages:
Also note that, by these rules, type-punning as done in C (ie. reading from an inactive member of a union) is undefined behaviour under C++!
That being said, most mainstream compilers usually allow you to do this as a non-standard extension, like so:
box.zz[0] = 0x2c; // zz is active
box.zz[1] = 0x0f; // zz is (still) active
box.zz[2] = 0x01; // ...
box.zz[3] = 0x00; // ...
// this is undefined behaviour!!
std::cout << "box.x = " << box.x << "\n";
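Again, this type punning is undefined behaviour under the C++ standard, even
though most major compilers support it. The standards-compliant way to
do this kind of type punning is with std::bit_cast (available since C++20),
which requires both types to have the same size and to be trivially copyable: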
char bytes[4]{0x2c, 0x0f, 0x01, 0x00};
int foo = std::bit_cast<int>(bytes);
std::cout << "foo = " << foo << "\n";
// enums.cpp
#include <cstdint>
#include <iostream>
namespace AA {
namespace BB {
enum Colour {
Red, //
Blue,
Pink
};
} // namespace BB
} // namespace AA
int main() {
enum Fruit1 {
Apple, //
Pear = 10,
Grape = Pear + 10,
Guava = 10,
Peach
};
std::cout << "Apple = " //
<< Apple << "\n";
std::cout << "Pear = " //
<< Pear << "\n";
std::cout << "Grape = " //
<< Grape << "\n";
std::cout << "Peach = " //
<< Fruit1::Peach << "\n";
std::cout << "Guava = " //
<< Guava << "\n";
std::cout << "Pink = " << AA::BB::Pink << "\n";
std::cout << "\n";
enum class Fruit2 {
Apple, //
Pear = 10,
Grape = 20,
Peach
};
std::cout << "Apple = " << static_cast<int>(Fruit2::Apple) << "\n";
std::cout << "Pear = " << static_cast<int>(Fruit2::Pear) << "\n";
std::cout << "Grape = " << static_cast<int>(Fruit2::Grape) << "\n";
std::cout << "Peach = " << static_cast<int>(Fruit2::Peach) << "\n";
std::cout << "sizeof(Fruit1) = " << sizeof(Fruit1) << "\n";
enum class Fruit3 : std::uint8_t {
Apple, //
Pear = 10,
Grape = 20,
Peach
};
std::cout << "sizeof(Fruit3) = " << sizeof(Fruit3) << "\n";
}
C++ has two kinds of enumerations: the ones inherited from C (unscoped enums), and the ones “new” to C++ (scoped enums). Enumerations are generally used to give names to values; in C++, these values are limited to integer values (in other languages like Rust or Swift, you can make them values of any type).
We’ll start with unscoped enums:
enum Fruit1 {
Apple, //
Pear = 10,
Grape = Pear + 10,
Guava = 10,
Peach
};
std::cout << "Apple = " //
<< Apple << "\n";
std::cout << "Pear = " //
<< Pear << "\n";
std::cout << "Grape = " //
<< Grape << "\n";
std::cout << "Peach = " //
<< Fruit1::Peach << "\n";
std::cout << "Guava = " //
<< Guava << "\n";
$ ./enums.out | head -n5
Apple = 0
Pear = 10
Grape = 20
Peach = 11
Guava = 10
The important thing to note here is that the enumerator names are
available without qualification, ie. it was not necessary to write
Fruit1::Apple,
rather Apple sufficed. That’s why they’re called unscoped
enums after all :D
(Note: you can still refer to them with qualification, eg.
Fruit1::Apple, but this isn’t necessary.)
The next thing to realise is that you can assign integer values to each
enumerator, eg. Pear is 10, and
Grape is 20. If not explicitly specified, the
first enumerator gets a value of 0 (as we can see here). Subsequent
enumerators are one greater than their predecessor, which is why
Peach has a value of 11: it follows Guava, which is 10 (and
10 + 1 == 11). Also note that you can refer to earlier
enumerators when defining later ones (as Grape = Pear + 10 does), and that
two enumerators may share the same value (Pear and Guava are both 10).
One way to get around the lack of scoping is to place the enum in a namespace, like so:
namespace AA {
namespace BB {
enum Colour {
Red, //
Blue,
Pink
};
} // namespace BB
} // namespace AA
This way, you can use
AA::BB::Pink
to refer to the enumerator:
std::cout << "Pink = " << AA::BB::Pink << "\n";
The problem with unscoped enumerations becomes apparent when we want multiple enum declarations with the same enumerator name in the same scope, or do not want to bother with making a separate namespace.
To declare a scoped enumeration, simply use
enum class
instead of just enum, like so:
enum class Fruit2 {
Apple, //
Pear = 10,
Grape = 20,
Peach
};
Now, the enumerators are not available without qualifying them.
std::cout << "Apple = " << static_cast<int>(Fruit2::Apple) << "\n";
std::cout << "Pear = " << static_cast<int>(Fruit2::Pear) << "\n";
std::cout << "Grape = " << static_cast<int>(Fruit2::Grape) << "\n";
std::cout << "Peach = " << static_cast<int>(Fruit2::Peach) << "\n";
You might have noticed that for printing the unscoped enumerations, we did
not have to do any special casts or conversions, and we were able to print
them. This is because unscoped enumerations implicitly convert to
their underlying type, which std::cout knows how to
print.
For scoped enums on the other hand, this implicit conversion does not
happen, and we had to manually perform a static_cast to
int in order to get std::cout to print them. As
a fun exercise in C++ error messages, try removing the cast and see how
many pages of errors you get :D
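The default underlying type of a scoped enumeration is int. For an unscoped enumeration without an explicit underlying type, the underlying type is implementation-defined, but it is typically int (or unsigned int) when all the values fit, as is the case here: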
std::cout << "sizeof(Fruit1) = " << sizeof(Fruit1) << "\n";
$ ./enums.out | sed -n '12p'
sizeof(Fruit1) = 4
This might seem like a waste of space, since we only have a handful of enumerators, and we’re using 32 whole bits to track them. We can explicitly specify the underlying type of an enum, like this:
enum class Fruit3 : std::uint8_t {
Apple, //
Pear = 10,
Grape = 20,
Peach
};
std::cout << "sizeof(Fruit3) = " << sizeof(Fruit3) << "\n";
Now, this only takes 1 byte:
$ ./enums.out | tail -n1
sizeof(Fruit3) = 1
A dangling pointer can never be safely used again, even if it happens to point to a new object in the same location, unless a very particular set of conditions holds; even then it is only sometimes okay, provided that other precautions are taken where necessary.
© 13 June 2022, Ng Zhia Yang, Tan Chee Kun Thomas, All Rights Reserved