C++ in Game Development
"C++ for game developers: Unreal Engine architecture, ECS patterns, data-oriented design, real-time constraints, frame budgets, and performance-critical idioms."
Why C++ dominates game development
Most major game engines (Unreal Engine, Unity's native runtime, id Tech, CryEngine, Frostbite, Godot's core) are written in C++. The reasons:
- Zero-overhead abstractions: inlining, templates, and CRTP let abstractions compile down to machine code comparable to hand-written C.
- Memory control: custom allocators, placement new, and manual lifetime management.
- Platform reach: one codebase can target consoles, PC, mobile, and WebAssembly.
- Determinism: no garbage-collector pauses, so frame timing is more predictable.
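To make the zero-overhead point concrete, here is a minimal CRTP sketch. The names `Shape`, `Square`, `Rect`, and `total_area` are hypothetical and chosen only for illustration. Whether the calls are actually inlined depends on the compiler and optimization level.

```cpp
#include <cassert>
#include <cstddef>

// CRTP base: the derived type is a template parameter, so area() is
// resolved at compile time and no vtable or virtual call is involved.
template <typename Derived>
struct Shape {
    float area() const {
        return static_cast<const Derived&>(*this).area_impl();
    }
};

struct Square : Shape<Square> {
    float side;
    explicit Square(float s) : side(s) {}
    float area_impl() const { return side * side; }
};

struct Rect : Shape<Rect> {
    float w, h;
    Rect(float w_, float h_) : w(w_), h(h_) {}
    float area_impl() const { return w * h; }
};

// Sums the areas of a homogeneous array. Each instantiation sees the
// concrete type, so the compiler is free to inline area_impl().
template <typename T, std::size_t N>
float total_area(const T (&shapes)[N]) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < N; ++i) sum += shapes[i].area();
    return sum;
}
```

Calling `total_area` on an array of `Square` and on an array of `Rect` produces two separate instantiations, which is the trade-off CRTP makes: more generated code in exchange for no runtime dispatch.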
Frame budget thinking
A game at 60 fps has about 16.7 ms per frame (1000 ms / 60). At 120 fps: 8.3 ms. Every system (AI, physics, rendering, audio) competes for this budget.
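The arithmetic above can be captured in a small helper, sketched here together with an RAII timer that accumulates elapsed milliseconds. `frame_budget_ms`, `ScopedTimer`, and `within_budget` are hypothetical names, and the idea of giving a system a fixed share of the frame is an assumption for illustration, not a rule from any engine.

```cpp
#include <cassert>
#include <chrono>

// Frame budget in milliseconds for a target frame rate.
constexpr double frame_budget_ms(double fps) { return 1000.0 / fps; }

// RAII timer: adds the elapsed time in milliseconds to an accumulator
// when it goes out of scope. steady_clock is monotonic.
class ScopedTimer {
    using Clock = std::chrono::steady_clock;
    Clock::time_point start_;
    double& out_ms_;
public:
    explicit ScopedTimer(double& out_ms)
        : start_(Clock::now()), out_ms_(out_ms) {}
    ~ScopedTimer() {
        std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        out_ms_ += elapsed.count();
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
};

// True if a system's measured time fits inside its share of the frame.
// `share` is a fraction of the whole frame, e.g. 0.125 for one eighth.
constexpr bool within_budget(double used_ms, double fps, double share) {
    return used_ms <= frame_budget_ms(fps) * share;
}
```

A typical use would wrap one system's update in a block containing a `ScopedTimer`, then compare the accumulated value against `within_budget` at the end of the frame.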
// Typical frame budget (AAA game at 60fps, ~16ms total)
// Rendering: 8-10ms (GPU-bound, ~5ms CPU submission)
// Physics: 1-2ms
// Animation: 1-2ms
// AI/gameplay: 1-2ms
// Audio: 0.5ms
// Remaining: 0ms margin — everything is hand-tuned
// Measure with a monotonic clock (steady_clock never jumps backwards)
auto start = std::chrono::steady_clock::now();
update_ai(entities);
auto end = std::chrono::steady_clock::now();
auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
Data-Oriented Design (DOD)
Object-Oriented Design groups data by type hierarchy. Data-Oriented Design groups data by access pattern.
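As a rough way to quantify "grouping by access pattern", the sketch below estimates what fraction of the bytes pulled through the cache a position-only pass actually uses, first when positions are embedded in a large per-entity struct and then when they are stored in their own array. `Vec3`, `FatEntity`, and `useful_fraction` are hypothetical, the 300-byte struct size is an assumption, and the model ignores cache-line granularity and prefetching.

```cpp
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };            // 12 bytes

// Hypothetical "everything in one struct" entity: a position plus
// 288 bytes of unrelated fields, 300 bytes in total.
struct FatEntity {
    Vec3 position;
    float other_fields[72];
};

static_assert(sizeof(Vec3) == 12, "sketch assumes a packed 3-float vector");
static_assert(sizeof(FatEntity) == 300, "sketch assumes a 300-byte entity");

// Fraction of streamed bytes that a loop actually uses when it reads
// `used_bytes` out of every `stride_bytes`.
constexpr double useful_fraction(std::size_t used_bytes, std::size_t stride_bytes) {
    return static_cast<double>(used_bytes) / static_cast<double>(stride_bytes);
}

// Positions embedded in FatEntity: about 4% of streamed bytes are useful.
constexpr double embedded_use = useful_fraction(sizeof(Vec3), sizeof(FatEntity));

// Positions in their own array: every streamed byte is useful.
constexpr double separate_use = useful_fraction(sizeof(Vec3), sizeof(Vec3));
```

Under these assumptions, a position pass over the embedded layout wastes about 96% of the memory traffic, which is the gap a data-oriented layout aims to close.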
// AoS: Array of Structs (cache-unfriendly for transform updates)
struct Entity {
    Vector3 position;    // 12 bytes
    Quaternion rotation; // 16 bytes
    Vector3 scale;       // 12 bytes
    int health;          // 4 bytes
    // ...50 more fields
};
Entity entities[10000];
// A position update pulls all 10000 full structs through the cache but
// uses only 12 bytes of each. For a struct of roughly 300 bytes, about
// 96% of the fetched data is wasted.
// SoA: Struct of Arrays (cache-friendly)
struct EntityPool {
    Vector3 positions[10000];    // 120 KB, tightly packed
    Vector3 velocities[10000];   // read by update_positions
    Quaternion rotations[10000];
    Vector3 scales[10000];
    int healths[10000];
    int count = 0;               // number of live entities
};
// The position update streams only positions and velocities, so nearly
// every byte fetched into cache is used
void update_positions(EntityPool& pool, float dt) {
    for (int i = 0; i < pool.count; ++i) {
        pool.positions[i] += pool.velocities[i] * dt;
    }
}
Entity-Component-System (ECS)
ECS is the architectural pattern that puts DOD into practice at scale.
- Entity — just an ID (uint32_t or uint64_t)
- Component — plain data struct, no logic
- System — processes all entities that have a specific set of components
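The three roles in the list above can be sketched without any library. This is a deliberately naive illustration: `World` and its `std::unordered_map` storage are hypothetical and are not how production ECS libraries store components (those typically use sparse sets or archetype tables to keep data contiguous).

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Entity = std::uint32_t;        // an entity is only an ID

struct Position { float x, y; };     // components: plain data, no logic
struct Velocity { float dx, dy; };

// Hypothetical minimal world: one container per component type,
// keyed by entity ID.
struct World {
    Entity next_id = 0;
    std::unordered_map<Entity, Position> positions;
    std::unordered_map<Entity, Velocity> velocities;

    Entity create() { return next_id++; }
};

// System: processes every entity that has both Position and Velocity.
// Entities without a Velocity are never visited.
inline void movement_system(World& world, float dt) {
    for (auto& [entity, vel] : world.velocities) {
        auto it = world.positions.find(entity);
        if (it == world.positions.end()) continue;
        it->second.x += vel.dx * dt;
        it->second.y += vel.dy * dt;
    }
}
```

The point of the sketch is the separation of concerns: behavior lives in `movement_system`, data lives in the component containers, and an entity is nothing more than the key that ties them together.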
EnTT — the go-to C++ ECS library
#include <entt/entt.hpp>
// Components — plain data
struct Position { float x, y; };
struct Velocity { float dx, dy; };
struct Health { int hp, max_hp; };
entt::registry registry;
// Create entities
auto player = registry.create();
registry.emplace<Position>(player, 0.f, 0.f);
registry.emplace<Velocity>(player, 1.f, 0.f);
registry.emplace<Health>(player, 100, 100);
auto enemy = registry.create();
registry.emplace<Position>(enemy, 10.f, 5.f);
registry.emplace<Health>(enemy, 50, 50);
// Systems: iterate only entities with the required components
void movement_system(entt::registry& reg, float dt) {
    auto view = reg.view<Position, Velocity>();
    for (auto [entity, pos, vel] : view.each()) {
        pos.x += vel.dx * dt;
        pos.y += vel.dy * dt;
    }
}
// Group: usually faster still, because the registry keeps the owned
// Position components packed in iteration order
void movement_system_grouped(entt::registry& reg, float dt) {
    auto group = reg.group<Position>(entt::get<Velocity>);
    group.each([dt](auto& pos, const auto& vel) {
        pos.x += vel.dx * dt;
        pos.y += vel.dy * dt;
    });
}
Memory management patterns
Games pre-allocate as much as possible up front. Avoid new, and anything that calls it, in hot paths.
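The simplest form of pre-allocation is reserving container capacity at load time and refusing to grow afterwards. The sketch below assumes a hypothetical `ParticleBuffer` with a fixed maximum; the names and the "return false when full" policy are illustrative choices, not a prescribed design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float x, y, life; };

// Hypothetical particle buffer whose storage is allocated once, up front.
class ParticleBuffer {
    std::vector<Particle> items_;
    std::size_t max_;
public:
    explicit ParticleBuffer(std::size_t max_particles) : max_(max_particles) {
        items_.reserve(max_particles);   // the only heap allocation
    }

    // Returns false instead of growing, so the hot path never reallocates.
    bool spawn(const Particle& p) {
        if (items_.size() == max_) return false;
        items_.push_back(p);
        return true;
    }

    void clear() { items_.clear(); }     // drops elements, keeps capacity
    std::size_t size() const { return items_.size(); }
    const Particle* data() const { return items_.data(); }
};
```

Because `spawn` never exceeds the reserved capacity, pointers into the buffer stay valid for the lifetime of the buffer, which is often as valuable as the avoided allocation itself.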
Pool allocator
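#include <cassert>
#include <cstddef>
template<typename T, std::size_t Capacity>
class PoolAllocator {
    // Raw, correctly aligned storage; no T exists until placement new runs
    alignas(T) std::byte storage_[sizeof(T) * Capacity];
    T* free_list_[Capacity];   // stack of free slots
    std::size_t free_count_;
public:
    PoolAllocator() : free_count_(Capacity) {
        T* base = reinterpret_cast<T*>(storage_);
        for (std::size_t i = 0; i < Capacity; ++i)
            free_list_[i] = base + i;
    }
    // Returns uninitialized memory for one T
    T* allocate() {
        assert(free_count_ > 0 && "pool exhausted");
        return free_list_[--free_count_];
    }
    // p must come from allocate() on this pool, with its object already destroyed
    void deallocate(T* p) {
        assert(free_count_ < Capacity && "double free or foreign pointer");
        free_list_[free_count_++] = p;
    }
};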
// Usage: O(1) allocation, no heap fragmentation
PoolAllocator<Bullet, 1024> bullet_pool;
Bullet* b = bullet_pool.allocate();
new (b) Bullet{pos, vel};   // placement new constructs in the pool slot
// ...
b->~Bullet();               // explicit destructor call
bullet_pool.deallocate(b);
Frame allocator (linear/bump allocator)
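#include <cstddef>
#include <memory>
class FrameAllocator {
    std::byte* buf_;
    std::byte* cur_;
    std::size_t cap_;
public:
    explicit FrameAllocator(std::size_t size)
        : buf_(new std::byte[size]), cur_(buf_), cap_(size) {}
    ~FrameAllocator() { delete[] buf_; }
    FrameAllocator(const FrameAllocator&) = delete;
    FrameAllocator& operator=(const FrameAllocator&) = delete;
    // Returns nullptr when the frame buffer is exhausted
    void* alloc(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        void* p = cur_;
        std::size_t space = cap_ - static_cast<std::size_t>(cur_ - buf_);
        // std::align needs lvalues: it bumps p up to the alignment and shrinks space
        if (!std::align(align, size, p, space)) return nullptr;
        cur_ = static_cast<std::byte*>(p) + size;
        return p;
    }
    void reset() { cur_ = buf_; } // O(1): frees the whole frame, runs no destructors
};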
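FrameAllocator frame_alloc(4 * 1024 * 1024); // 4 MB per frame
// Per-frame temporary allocations
void update_frame() {
    void* mem = frame_alloc.alloc(sizeof(NavPath), alignof(NavPath));
    if (mem) {
        auto* temp_path = new (mem) NavPath;
        // ... use temp_path during this frame ...
        temp_path->~NavPath(); // reset() runs no destructors; skip only for trivially destructible types
    }
    frame_alloc.reset(); // everything allocated this frame is gone
}
Unreal Engine C++ patterns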
AActor and UObject hierarchy
// Actor: has a transform, can tick, can be placed in a level
UCLASS()
class AMyCharacter : public ACharacter
{
    GENERATED_BODY()
public:
    // Exposed to Blueprint and the editor (BlueprintReadWrite requires non-private access)
    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Combat")
    float AttackDamage = 50.f;
    UFUNCTION(BlueprintCallable, Category = "Movement")
    void Dash(FVector Direction, float Speed);
    virtual void Tick(float DeltaTime) override;
    virtual void BeginPlay() override;
};
// Component: data + behavior attached to an actor
UCLASS()
class UHealthComponent : public UActorComponent
{
    GENERATED_BODY()
public:
    // Replicated properties must also be registered in GetLifetimeReplicatedProps
    UPROPERTY(Replicated)
    float Health = 100.f;
    virtual void GetLifetimeReplicatedProps(TArray<FLifetimeProperty>& OutLifetimeProps) const override;
    void TakeDamage(float Amount);
};
UE Memory: TSharedPtr vs raw pointers
// UObjects: a UPROPERTY reference is visible to the garbage collector, so the
// component stays alive while its owner is reachable (UE5 code often uses TObjectPtr<>)
UPROPERTY()
UHealthComponent* HealthComp;
// Non-UObject heap objects: TSharedPtr
TSharedPtr<FMyData> SharedData = MakeShared<FMyData>();
TWeakPtr<FMyData> WeakRef = SharedData; // non-owning
// Stack data and POD: plain C++ (no GC overhead)
FVector Position;
int32 FrameCount = 0;
SIMD for game math
#include <immintrin.h> // SSE/AVX intrinsics
// Translate positions with SSE, 4 floats per instruction
void transform_positions_sse(
    const float* __restrict src_x, const float* __restrict src_y,
    float* __restrict dst_x, float* __restrict dst_y,
    float tx, float ty, int count)
{
    __m128 vtx = _mm_set1_ps(tx);
    __m128 vty = _mm_set1_ps(ty);
    int i = 0;
    // Vector body: processes 4 entities per iteration instead of 1
    for (; i + 4 <= count; i += 4) {
        __m128 vx = _mm_loadu_ps(src_x + i);
        __m128 vy = _mm_loadu_ps(src_y + i);
        _mm_storeu_ps(dst_x + i, _mm_add_ps(vx, vtx));
        _mm_storeu_ps(dst_y + i, _mm_add_ps(vy, vty));
    }
    // Scalar tail for counts that are not a multiple of 4
    for (; i < count; ++i) {
        dst_x[i] = src_x[i] + tx;
        dst_y[i] = src_y[i] + ty;
    }
}
Key libraries
| Library | Purpose |
|---|---|
| EnTT | ECS, signals, event dispatcher |
| flecs | Advanced ECS with queries and modules |
| GLM | GLSL-compatible math (vectors, matrices) |
| PhysX / Bullet / Jolt | Physics simulation |
| FMOD / SoLoud / miniaudio | Audio |
| Dear ImGui | In-game debug UI |
| Tracy | Frame profiler (low overhead, real-time) |
| recastnavigation | Navmesh generation and pathfinding |
| sokol | Minimal cross-platform 3D API wrapper |
Pitfalls specific to game dev
Virtual functions in hot loops: virtual dispatch usually prevents inlining, and an indirect call whose target keeps changing is hard for the branch predictor. Prefer ECS, CRTP, or std::variant + std::visit.
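A minimal sketch of the std::variant + std::visit alternative follows. `Walker`, `Turret`, `UpdateVisitor`, and `update_all` are hypothetical names. This is not guaranteed to beat virtual calls; the usual gains come from storing objects by value in one contiguous array and from the visitor bodies being visible to the optimizer, so measure before committing to it.

```cpp
#include <cassert>
#include <variant>
#include <vector>

// A closed set of enemy kinds, stored by value instead of behind base pointers.
struct Walker { float x; float speed; };
struct Turret { float cooldown; };
using Enemy = std::variant<Walker, Turret>;

// One overload per alternative; no virtual functions involved.
struct UpdateVisitor {
    float dt;
    void operator()(Walker& w) const { w.x += w.speed * dt; }
    void operator()(Turret& t) const {
        t.cooldown = (t.cooldown > dt) ? t.cooldown - dt : 0.0f;
    }
};

// Enemies live contiguously in the vector; std::visit selects the
// overload matching each element's active alternative.
inline void update_all(std::vector<Enemy>& enemies, float dt) {
    for (auto& enemy : enemies) std::visit(UpdateVisitor{dt}, enemy);
}
```

The cost of this design is that adding a new enemy kind means editing the `Enemy` alias and the visitor, so it suits closed sets of types better than plugin-style extension.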
std::shared_ptr in hot paths: every copy performs an atomic reference-count update, which dirties the control block's cache line and gets expensive under contention. Use pool allocators + raw pointers with clear ownership.
Heap allocation per frame: calling new in the update loop costs unpredictable time and fragments the heap over time. Pre-allocate or use frame allocators.
False sharing: when two threads write to variables that share a cache line (typically 64 bytes), each write invalidates the other core's copy and throughput drops sharply. Align per-thread data to separate cache lines, for example with alignas(64).
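The padding fix for false sharing can be sketched as a layout-only example. `kCacheLine`, `PaddedCounter`, and `WorkerStats` are hypothetical, and the 64-byte line size is an assumption that holds on most x86-64 parts but not everywhere. C++17 also defines `std::hardware_destructive_interference_size` in `<new>` for this purpose, though not every toolchain provides it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size (an assumption, not queried from the hardware).
inline constexpr std::size_t kCacheLine = 64;

// alignas rounds both the alignment and the size up to one cache line,
// so two adjacent counters can never share a line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per line");
static_assert(alignof(PaddedCounter) == kCacheLine, "line-aligned");

// One slot per worker thread; thread i writes only per_thread[i].
struct WorkerStats {
    PaddedCounter per_thread[4];
};

// Distance in bytes between two neighbouring slots.
inline std::size_t slot_distance(const WorkerStats& stats) {
    auto a = reinterpret_cast<std::uintptr_t>(&stats.per_thread[0]);
    auto b = reinterpret_cast<std::uintptr_t>(&stats.per_thread[1]);
    return static_cast<std::size_t>(b - a);
}
```

The trade-off is memory: each counter now occupies a full cache line, which is fine for a handful of per-thread slots but wasteful if applied to large arrays.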