C++
Domain Track
Difficulty 4/5

C++ in Game Development

"C++ for game developers: Unreal Engine architecture, ECS patterns, data-oriented design, real-time constraints, frame budgets, and performance-critical idioms."

Why C++ dominates game development

Most major game engines (Unreal Engine, Unity's native runtime, id Tech, CryEngine, Frostbite, Godot's core) are written in C++. The reasons:

  • Zero-overhead abstractions: with inlining, templates, and CRTP, an optimizing compiler can usually emit machine code comparable to hand-written C.
  • Memory control: custom allocators, placement new, and manual lifetime management.
  • Platform reach: compiles to consoles, PC, mobile, and WebAssembly from a largely shared codebase.
  • Predictable timing: no garbage-collector pauses in the language runtime.

Frame budget thinking

A game running at 60 fps has about 16.7 ms per frame (1000 ms / 60); at 120 fps that drops to about 8.3 ms. Every system (AI, physics, rendering, audio) competes for this budget.

cpp
#include <chrono>

// Illustrative frame budget (AAA game at 60fps, ~16.7ms total); real numbers vary per title
// Rendering:    8-10ms  (GPU-bound, ~5ms CPU submission)
// Physics:      1-2ms
// Animation:    1-2ms
// AI/gameplay:  1-2ms
// Audio:        0.5ms
// Remaining:    ~0ms margin, so everything is hand-tuned

// Measure with a monotonic clock: steady_clock never jumps backwards,
// while high_resolution_clock may alias the adjustable system clock
auto start = std::chrono::steady_clock::now();
update_ai(entities);
auto end = std::chrono::steady_clock::now();
auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
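
Writing the start/end pair by hand around every system gets repetitive, so timing is often wrapped in a small RAII helper. The following is a minimal sketch of that idea, not a replacement for a real profiler; ScopedTimer, frame_budget_ms, fake_update_ai, and ai_within_budget are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <chrono>

// Per-frame budget in milliseconds for a target frame rate.
constexpr double frame_budget_ms(double fps) { return 1000.0 / fps; }

// Adds the wall-clock time spent in the enclosing scope to an accumulator (ms).
class ScopedTimer {
    using Clock = std::chrono::steady_clock;
    double& total_ms_;
    Clock::time_point start_;

public:
    explicit ScopedTimer(double& total_ms)
        : total_ms_(total_ms), start_(Clock::now()) {}
    ~ScopedTimer() {
        std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        total_ms_ += elapsed.count();
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
};

// Hypothetical stand-in for a real system update.
inline long long fake_update_ai(int n) {
    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += i % 7;
    return sum;
}

// Returns true if the measured AI time fits within its share of the frame.
inline bool ai_within_budget(double ai_share_ms) {
    double ai_ms = 0.0;
    {
        ScopedTimer timer(ai_ms);                 // starts timing here
        volatile long long sink = fake_update_ai(1000);
        (void)sink;                               // keep the work from being optimized out
    }                                             // timer stops when the scope ends
    return ai_ms <= ai_share_ms;
}
```

Because the accumulator is passed by reference, several scopes can add to the same per-system total within one frame.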

Data-Oriented Design (DOD)

Object-Oriented Design groups data by type hierarchy. Data-Oriented Design groups data by access pattern.

cpp
// AoS: Array of Structs (cache-unfriendly for transform updates)
struct Entity {
    Vector3 position;   // 12 bytes
    Quaternion rotation;// 16 bytes
    Vector3 scale;      // 12 bytes
    int health;         // 4 bytes
    // ...50 more fields
};
Entity entities[10000];

// To update positions, every full struct travels through the cache
// but only 12 bytes of each are used. If the struct is around 300 bytes,
// roughly 96% of the fetched data is wasted.

// SoA: Struct of Arrays (cache-friendly)
struct EntityPool {
    Vector3  positions[10000];   // 120KB, tightly packed, all used
    Vector3  velocities[10000];  // read by update_positions below
    Quaternion rotations[10000];
    Vector3  scales[10000];
    int      healths[10000];
    int      count = 0;          // number of live entities
};

// Position update streams only position and velocity data, so nearly
// every byte fetched is used (assumes Vector3 has operator+= and operator*)
void update_positions(EntityPool& pool, float dt) {
    for (int i = 0; i < pool.count; ++i) {
        pool.positions[i] += pool.velocities[i] * dt;
    }
}
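
The fixed-size arrays above assume a hard entity cap. As a hedged variation, the same SoA layout can be sketched with std::vector so the pool grows at load time; Vec3 and TransformPool are hypothetical types defined here only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for an engine vector type.
struct Vec3 {
    float x, y, z;
    Vec3& operator+=(const Vec3& o) { x += o.x; y += o.y; z += o.z; return *this; }
};
inline Vec3 operator*(const Vec3& v, float s) { return {v.x * s, v.y * s, v.z * s}; }

// SoA storage: one contiguous array per field, indexed by the same slot.
struct TransformPool {
    std::vector<Vec3> positions;
    std::vector<Vec3> velocities;

    // Appends one entity to every array and returns its slot index.
    std::size_t add(Vec3 pos, Vec3 vel) {
        positions.push_back(pos);
        velocities.push_back(vel);
        return positions.size() - 1;
    }
    std::size_t count() const { return positions.size(); }
};

// Touches only the two arrays it needs; other fields never enter the cache.
inline void update_positions(TransformPool& pool, float dt) {
    for (std::size_t i = 0; i < pool.count(); ++i)
        pool.positions[i] += pool.velocities[i] * dt;
}
```

Growing the vectors should happen during loading or between frames, since push_back may reallocate.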

Entity-Component-System (ECS)

ECS is the architectural pattern that puts DOD into practice at scale.

  • Entity — just an ID (uint32_t or uint64_t)
  • Component — plain data struct, no logic
  • System — processes all entities that have a specific set of components
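
To make the three roles concrete, here is a deliberately tiny, library-free sketch. It assumes component storage indexed directly by entity ID, which is simple but wasteful; production libraries typically use sparse sets or archetypes instead. World and its member functions are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

using Entity = std::uint32_t;          // Entity: just an ID

struct Position { float x, y; };       // Components: plain data, no logic
struct Velocity { float dx, dy; };

class World {
    // One array per component type; an empty optional means "not attached".
    std::vector<std::optional<Position>> positions_;
    std::vector<std::optional<Velocity>> velocities_;

public:
    Entity create() {
        positions_.emplace_back();
        velocities_.emplace_back();
        return static_cast<Entity>(positions_.size() - 1);
    }

    void set(Entity e, Position p) { positions_[e] = p; }
    void set(Entity e, Velocity v) { velocities_[e] = v; }
    const std::optional<Position>& position(Entity e) const { return positions_[e]; }

    // System: processes every entity that has both Position and Velocity.
    void movement_system(float dt) {
        for (Entity e = 0; e < positions_.size(); ++e) {
            if (positions_[e] && velocities_[e]) {
                positions_[e]->x += velocities_[e]->dx * dt;
                positions_[e]->y += velocities_[e]->dy * dt;
            }
        }
    }
};
```

An entity without a Velocity is simply skipped by the movement system, which is the behavior the component-set rule describes.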

EnTT: a widely used C++ ECS library

cpp
#include <entt/entt.hpp>

// Components — plain data
struct Position { float x, y; };
struct Velocity { float dx, dy; };
struct Health   { int hp, max_hp; };

entt::registry registry;

// Create entities
auto player = registry.create();
registry.emplace<Position>(player, 0.f, 0.f);
registry.emplace<Velocity>(player, 1.f, 0.f);
registry.emplace<Health>(player, 100, 100);

auto enemy = registry.create();
registry.emplace<Position>(enemy, 10.f, 5.f);
registry.emplace<Health>(enemy, 50, 50);

// Systems — iterate only entities with required components
void movement_system(entt::registry& reg, float dt) {
    auto view = reg.view<Position, Velocity>();
    for (auto [entity, pos, vel] : view.each()) {
        pos.x += vel.dx * dt;
        pos.y += vel.dy * dt;
    }
}

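// Owning group: EnTT keeps the owned Position pool ordered so grouped
// entities are packed together, which usually iterates faster than a view
void movement_system_grouped(entt::registry& reg, float dt) {
    auto group = reg.group<Position>(entt::get<Velocity>);
    group.each([dt](auto& pos, const auto& vel) {
        pos.x += vel.dx * dt;
        pos.y += vel.dy * dt;
    });
}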

Memory management patterns

Games typically pre-allocate their memory up front and avoid calling new in hot paths.

Pool allocator

cpp
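#include <cassert>
#include <cstddef>

template<typename T, std::size_t Capacity>
class PoolAllocator {
    alignas(T) std::byte storage_[sizeof(T) * Capacity];
    T* free_list_[Capacity];      // stack of free slots
    std::size_t free_count_;

public:
    PoolAllocator() : free_count_(Capacity) {
        T* base = reinterpret_cast<T*>(storage_);
        for (std::size_t i = 0; i < Capacity; ++i)
            free_list_[i] = base + i;
    }

    // Returns raw, uninitialized storage for one T; construct with placement new
    T* allocate() {
        assert(free_count_ > 0 && "pool exhausted");
        return free_list_[--free_count_];
    }

    // p must come from allocate() on this pool, with its destructor already run
    void deallocate(T* p) {
        assert(free_count_ < Capacity && "double free or foreign pointer");
        free_list_[free_count_++] = p;
    }
};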

// Usage — O(1) allocation, no heap fragmentation
PoolAllocator<Bullet, 1024> bullet_pool;
Bullet* b = bullet_pool.allocate();
new(b) Bullet{pos, vel};    // placement new
// ...
b->~Bullet();               // explicit destructor
bullet_pool.deallocate(b);
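
The manual allocate / placement new / destructor / deallocate sequence is easy to get wrong. One possible way to package it, sketched here under the assumption that C++17 is available, is a std::unique_ptr with a custom deleter that returns the slot to the pool. Pool, Pool::make, and this Bullet layout are hypothetical and independent of the PoolAllocator above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

template <typename T, std::size_t Capacity>
class Pool {
    alignas(T) std::byte storage_[sizeof(T) * Capacity];
    void* free_list_[Capacity];              // stack of free slots
    std::size_t free_count_ = Capacity;

public:
    Pool() {
        for (std::size_t i = 0; i < Capacity; ++i)
            free_list_[i] = storage_ + i * sizeof(T);
    }
    Pool(const Pool&) = delete;              // handles store a pointer to the pool
    Pool& operator=(const Pool&) = delete;

    std::size_t available() const { return free_count_; }

    // Destroys the object and pushes its slot back onto the free list.
    struct Deleter {
        Pool* pool;
        void operator()(T* p) const {
            p->~T();
            pool->free_list_[pool->free_count_++] = p;
        }
    };
    using Handle = std::unique_ptr<T, Deleter>;

    // Constructs a T in a free slot; returns an empty handle if the pool is full.
    template <typename... Args>
    Handle make(Args&&... args) {
        if (free_count_ == 0) return Handle(nullptr, Deleter{this});
        void* slot = free_list_[--free_count_];
        return Handle(::new (slot) T{std::forward<Args>(args)...}, Deleter{this});
    }
};

struct Bullet { float x, y, vx, vy; };       // hypothetical payload type
```

The pool must outlive every handle it hands out. The sketch also does not return the slot if T's constructor throws, which is acceptable only for types like Bullet whose construction cannot fail.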

Frame allocator (linear/bump allocator)

cpp
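#include <cstddef>
#include <memory>   // std::align

class FrameAllocator {
    std::byte* buf_;
    std::byte* cur_;
    std::size_t cap_;

public:
    explicit FrameAllocator(std::size_t size)
        : buf_(new std::byte[size]), cur_(buf_), cap_(size) {}
    ~FrameAllocator() { delete[] buf_; }
    FrameAllocator(const FrameAllocator&) = delete;
    FrameAllocator& operator=(const FrameAllocator&) = delete;

    // Returns nullptr when the request does not fit in the remaining space
    void* alloc(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        void* p = cur_;
        std::size_t space = cap_ - static_cast<std::size_t>(cur_ - buf_);
        if (!std::align(align, size, p, space))   // needs lvalues for p and space
            return nullptr;
        cur_ = static_cast<std::byte*>(p) + size;
        return p;
    }

    void reset() { cur_ = buf_; }  // O(1): frees the whole frame, runs no destructors
};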

FrameAllocator frame_alloc(4 * 1024 * 1024);  // 4MB per frame

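// Per-frame temporary allocations (placement new needs <new>)
void update_frame() {
    void* mem = frame_alloc.alloc(sizeof(NavPath), alignof(NavPath));
    auto* temp_path = new (mem) NavPath;   // real code should check mem for nullptr first
    // ... use temp_path during this frame only ...
    temp_path->~NavPath();   // reset() runs no destructors; skip only for trivial types
    frame_alloc.reset();     // end of frame: every pointer handed out above is now invalid
}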
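
Because a bump allocator never runs destructors on reset, it is safest with trivially destructible types. The sketch below shows one way to enforce that at compile time with a typed helper; Arena, Arena::make, and Waypoint are hypothetical names, and the buffer is a std::vector purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

class Arena {
    std::vector<std::byte> buf_;
    std::size_t used_ = 0;

public:
    explicit Arena(std::size_t bytes) : buf_(bytes) {}

    // Bumps the cursor to the next aligned address; nullptr if out of space.
    void* alloc(std::size_t size, std::size_t align) {
        void* p = buf_.data() + used_;
        std::size_t space = buf_.size() - used_;
        if (!std::align(align, size, p, space)) return nullptr;
        used_ = static_cast<std::size_t>(static_cast<std::byte*>(p) - buf_.data()) + size;
        return p;
    }

    // Typed helper: only accepts types that are safe to drop without a destructor call.
    template <typename T, typename... Args>
    T* make(Args&&... args) {
        static_assert(std::is_trivially_destructible_v<T>,
                      "reset() never runs destructors");
        void* p = alloc(sizeof(T), alignof(T));
        return p ? ::new (p) T{std::forward<Args>(args)...} : nullptr;
    }

    std::size_t used() const { return used_; }
    void reset() { used_ = 0; }   // O(1): invalidates everything allocated so far
};

struct Waypoint { float x, y; };  // hypothetical per-frame scratch type
```

Trying to call make with a type such as std::string fails to compile, which turns a silent leak into a build error.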

Unreal Engine C++ patterns

AActor and UObject hierarchy

cpp
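// Actor: has a transform, can tick, can be placed in a level
UCLASS()
class AMyCharacter : public ACharacter
{
    GENERATED_BODY()

public:
    // Exposed to Blueprint and editor. BlueprintReadWrite is rejected on private
    // members unless meta = (AllowPrivateAccess = "true") is added.
    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Combat")
    float AttackDamage = 50.f;

    UFUNCTION(BlueprintCallable, Category = "Movement")
    void Dash(FVector Direction, float Speed);

    virtual void Tick(float DeltaTime) override;
    virtual void BeginPlay() override;
};

// Component: data + behavior attached to an actor
UCLASS()
class UHealthComponent : public UActorComponent
{
    GENERATED_BODY()

public:
    void TakeDamage(float Amount);

    // Needed by any class with Replicated properties; the .cpp registers Health
    // with DOREPLIFETIME, and the component itself must be set to replicate
    virtual void GetLifetimeReplicatedProps(TArray<FLifetimeProperty>& OutLifetimeProps) const override;

private:
    UPROPERTY(Replicated)
    float Health = 100.f;
};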

UE Memory — TSharedPtr vs raw pointers

cpp
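// UObjects: hold references in UPROPERTY members so the garbage collector can see them
UPROPERTY()
UHealthComponent* HealthComp;  // kept alive while the owner is reachable (UE5 style prefers TObjectPtr<UHealthComponent>)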

// Non-UObject heap objects — TSharedPtr
TSharedPtr<FMyData> SharedData = MakeShared<FMyData>();
TWeakPtr<FMyData> WeakRef = SharedData;  // non-owning

// Stack data and POD — plain C++ (no GC overhead)
FVector Position;
int32 FrameCount = 0;

SIMD for game math

cpp
#include <immintrin.h>  // SSE/AVX intrinsics

// Translate positions 4 at a time with SSE (4 floats per 128-bit register)
void transform_positions_sse(
    const float* __restrict src_x, const float* __restrict src_y,
    float* __restrict dst_x, float* __restrict dst_y,
    float tx, float ty, int count)
{
    __m128 vtx = _mm_set1_ps(tx);
    __m128 vty = _mm_set1_ps(ty);

    int i = 0;
    // Vector loop: stops before a partial group so it never reads past the end
    for (; i + 4 <= count; i += 4) {
        __m128 vx = _mm_loadu_ps(src_x + i);
        __m128 vy = _mm_loadu_ps(src_y + i);
        _mm_storeu_ps(dst_x + i, _mm_add_ps(vx, vtx));
        _mm_storeu_ps(dst_y + i, _mm_add_ps(vy, vty));
    }
    // Scalar tail for the last count % 4 elements
    for (; i < count; ++i) {
        dst_x[i] = src_x[i] + tx;
        dst_y[i] = src_y[i] + ty;
    }
}
// The vector loop processes 4 entities per iteration instead of 1

Key libraries

Library                    Purpose
EnTT                       ECS, signals, event dispatcher
flecs                      Advanced ECS with queries and modules
GLM                        GLSL-compatible math (vectors, matrices)
PhysX / Bullet / Jolt      Physics simulation
FMOD / SoLoud / miniaudio  Audio
Dear ImGui                 In-game debug UI
Tracy                      Frame profiler (low overhead, real-time)
recastnavigation           Navmesh generation and pathfinding
sokol                      Minimal cross-platform 3D API wrapper

Pitfalls specific to game dev

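Virtual functions in hot loops: virtual dispatch usually blocks inlining (unless the compiler can devirtualize the call) and adds an indirect branch that is mispredicted more often when object types are mixed. Use ECS, CRTP, or std::variant + std::visit instead.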
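
As an illustration of the std::variant + std::visit alternative, the sketch below stores a closed set of types by value in one contiguous vector and dispatches without a vtable. Circle, Rect, Shape, AreaVisitor, and total_area are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <variant>
#include <vector>

struct Circle { float radius; };
struct Rect   { float width, height; };

// Closed set of alternatives, stored inline (no heap allocation, no vtable).
using Shape = std::variant<Circle, Rect>;

// One overload per alternative; the compiler can inline each of them.
struct AreaVisitor {
    float operator()(const Circle& c) const { return 3.14159265f * c.radius * c.radius; }
    float operator()(const Rect& r) const { return r.width * r.height; }
};

inline float total_area(const std::vector<Shape>& shapes) {
    float sum = 0.f;
    for (const Shape& s : shapes)
        sum += std::visit(AreaVisitor{}, s);
    return sum;
}
```

The trade-off is that adding a new alternative means touching the variant and every visitor, so this fits best when the set of types is small and stable.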

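std::shared_ptr in hot paths: every copy or destruction typically performs an atomic reference-count update, which dirties the control block's cache line and causes contention when several threads share the pointer. Use pool allocators + raw pointers with clear ownership.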

Heap allocation per frame: calling new in the update loop adds unpredictable latency and can fragment the heap over time. Pre-allocate or use frame allocators.
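
A simple form of pre-allocation is to keep scratch containers alive across frames and reuse their capacity. The sketch below assumes a known upper bound on entity count; VisibleSet and rebuild are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class VisibleSet {
    std::vector<int> ids_;

public:
    // Reserve once, at load time, for the worst case.
    explicit VisibleSet(std::size_t max_entities) { ids_.reserve(max_entities); }

    // Rebuilds the list in place every frame. clear() keeps the capacity, so
    // push_back does not allocate while the count stays within the reserve.
    void rebuild(const std::vector<float>& distances, float max_distance) {
        ids_.clear();
        for (std::size_t i = 0; i < distances.size(); ++i)
            if (distances[i] <= max_distance)
                ids_.push_back(static_cast<int>(i));
    }

    const std::vector<int>& ids() const { return ids_; }
    std::size_t capacity() const { return ids_.capacity(); }
};
```

Declaring the vector as a local inside the update function instead would allocate and free on every call.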

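False sharing: when two threads write to variables that share a cache line (typically 64 bytes), each write invalidates the other core's copy of the line and throughput drops sharply. Align per-thread data to the cache-line size (for example with alignas(64)) so each thread writes to its own line.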
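
The layout side of that fix can be sketched as follows. The 64-byte line size is an assumption that holds on most current x86 and many ARM parts; where the standard library provides it, std::hardware_destructive_interference_size from <new> is the portable constant. PaddedCounter and WorkerStats are hypothetical names, and the threads themselves are left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;   // assumed cache-line size

// alignas rounds both the alignment and the size up to a full cache line,
// so neighbouring array elements never share a line.
struct alignas(kCacheLine) PaddedCounter {
    std::uint64_t value = 0;
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per cache line");
static_assert(alignof(PaddedCounter) == kCacheLine, "counters start on a line boundary");

constexpr std::size_t kWorkers = 4;

struct WorkerStats {
    PaddedCounter processed[kWorkers];   // slot i is written only by worker i

    void add(std::size_t worker, std::uint64_t n) { processed[worker].value += n; }

    // Read by one thread after the workers have finished.
    std::uint64_t total() const {
        std::uint64_t sum = 0;
        for (const PaddedCounter& c : processed) sum += c.value;
        return sum;
    }
};
```

The cost is memory: each 8-byte counter now occupies 64 bytes, so this padding is worth applying only to data that is written frequently from different threads.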