C++
Language
since C++98
Advanced

Undefined Behavior

C++ undefined behavior — what it is, how compilers exploit it, the most dangerous forms, and how to detect and eliminate it.

A program that executes an operation the C++ standard designates as undefined behavior gives up all guarantees — the compiler may generate any code it chooses, including code that silently corrupts data, eliminates safety checks, or introduces security vulnerabilities.

Overview

Undefined behavior is not primarily a runtime failure mode. It is a violation of the contract between program and compiler, and the optimizer relies on that contract at compile time. The standard deliberately leaves certain operations undefined so that compilers can optimize without first proving that the problematic case never happens. When you write x + 1 where x is a signed int, the compiler is permitted to assume that the expression never overflows, and it may restructure your code around that assumption.

The core danger: UB does not produce "garbage output." It produces code whose behavior the optimizer derived from a false premise. The result can appear correct in debug builds, fail silently under -O2, and change behavior across compiler versions — without any diagnostic.

The standard defines four categories below fully defined behavior:

| Term | Meaning |
|---|---|
| Undefined behavior | No requirements; anything may happen |
| Unspecified behavior | One of several valid outcomes; the standard doesn't say which |
| Implementation-defined behavior | The implementation must pick one outcome and document it |
| Erroneous behavior (C++26) | Well-defined but incorrect behavior that implementations are encouraged to diagnose |

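C++26 introduces erroneous behavior for reads of uninitialized automatic variables, such as an uninitialized bool. Such a variable holds an erroneous value chosen by the implementation rather than an indeterminate one, and a conforming implementation may either continue with that value or diagnose the read, for example by trapping. This closes part of the practical UB gap without the cost of mandatory zero-initialization.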

How compilers exploit UB

The optimizer operates under the as-if rule: it may rewrite any code that preserves observable behavior for a valid program. Because UB cannot occur in a valid program by definition, any path reachable only via UB is provably dead and may be eliminated.

cpp
// Overflow check the optimizer deletes entirely
bool will_overflow(int x) {
    return x + 1 <= x;   // signed overflow is UB — so x+1 > x always
}
// -O2 output (GCC 14, Clang 18): return false;

cpp
// Null check after dereference — optimizer proves it's dead code
void write(int* p) {
    *p = 42;              // dereference: establishes p != null
    if (p == nullptr)     // provably false — eliminated
        log_error();
}

cpp
// Signed loop counter: overflow is assumed never to occur, so the exit check can be dropped
for (int i = 0; i >= 0; ++i) {   // UB once i == INT_MAX and ++i overflows
    process(i);
}
// GCC -O3: may generate an infinite loop, because "i >= 0" is treated as always true

These are not compiler bugs. They are correct transformations of a program that has already violated the standard.
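
The checks that the optimizer deleted above can be rewritten so that the undefined operation never executes. The following is a minimal sketch; the helper names `increment_would_overflow` and `write_checked` are illustrative assumptions, not part of any library.

```cpp
#include <climits>

// Compare against the limit instead of computing x + 1, so the
// overflowing expression is never evaluated.
bool increment_would_overflow(int x) {
    return x == INT_MAX;
}

// Test the pointer before the dereference. The check now guards the
// write, so the compiler cannot treat it as dead code.
bool write_checked(int* p) {
    if (p == nullptr) {
        return false;
    }
    *p = 42;
    return true;
}
```

Both functions are well defined for every input, so their results do not depend on the optimization level.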


Examples

Signed integer overflow

Signed arithmetic is not modular in C++. The two's complement wrapping that C programmers sometimes rely on is undefined behavior. C++20 mandated two's complement representation, but it did not define signed overflow, which remains UB.

cpp
int x = INT_MAX;          // <climits>
int y = x + 1;            // UB: signed overflow (pre-C++20 and C++20+)

// Safe alternatives (choose one):
if (x > INT_MAX - 1) throw std::overflow_error("overflow");   // <stdexcept>: check first
long long wide = static_cast<long long>(x) + 1;   // widen before adding
int sat = std::add_sat(x, 1);                     // C++26, <numeric>: saturating arithmetic
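
For code that cannot use C++26 yet, the pre-check generalizes to two arbitrary operands. This is one possible sketch rather than a standard facility; `checked_add` and `saturating_add` are hypothetical names chosen for illustration.

```cpp
#include <climits>
#include <optional>

// Returns the sum, or std::nullopt when a + b is not representable.
// Every comparison is arranged so that it cannot overflow itself.
std::optional<int> checked_add(int a, int b) {
    if (b > 0 && a > INT_MAX - b) return std::nullopt;  // would exceed INT_MAX
    if (b < 0 && a < INT_MIN - b) return std::nullopt;  // would go below INT_MIN
    return a + b;
}

// Clamps to the nearest representable value instead of reporting failure.
int saturating_add(int a, int b) {
    if (b > 0 && a > INT_MAX - b) return INT_MAX;
    if (b < 0 && a < INT_MIN - b) return INT_MIN;
    return a + b;
}
```

Returning std::optional forces the caller to decide what an overflow means, while the saturating variant suits cases such as counters where clamping is acceptable.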

Unsigned integers wrap by definition — that is fully specified behavior:

cpp
unsigned u = UINT_MAX;
unsigned v = u + 1;       // 0 — defined, guaranteed modular wrap
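
Because unsigned wraparound is defined, it can be used to detect that a sum did not fit. A small sketch, with `add_wraps` as an illustrative name:

```cpp
#include <climits>

// For unsigned operands, a + b is computed modulo 2^N. The result is
// smaller than an operand exactly when the true sum did not fit.
bool add_wraps(unsigned a, unsigned b) {
    return a + b < a;
}
```

The same test written with signed operands would itself be undefined behavior, which is why it only works for unsigned types.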

Out-of-bounds access

cpp
int arr[5];
int x = arr[5];           // UB: one past end
arr[-1] = 0;              // UB: before start

std::vector<int> v{1, 2, 3};
v[10];                    // UB: operator[] is unchecked
v.at(10);                 // C++98: throws std::out_of_range — use at trust boundaries
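
Where exceptions are unwelcome, an explicit bounds check gives the same safety as at(). The helper below is a hypothetical sketch, not a standard library function:

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Returns the element at `index`, or std::nullopt when the index is
// out of range. operator[] is only reached after the check passes.
std::optional<int> checked_get(const std::vector<int>& v, std::size_t index) {
    if (index >= v.size()) {
        return std::nullopt;
    }
    return v[index];
}
```

A caller might write `checked_get(v, i).value_or(0)` when a default is acceptable, or test the result and report an error otherwise.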

Dangling references and use-after-free

cpp
int& bad() {
    int x = 42;
    return x;             // UB: x destroyed at return
}

// Iterator invalidation — common and silent
std::vector<int> v{1, 2, 3};
auto it = v.begin();
v.push_back(4);           // may reallocate — it is now dangling
*it = 99;                 // UB

// Fix: hold indices instead of iterators across mutations
// Or: v.reserve(N) before inserting, when final size is known
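
The index-based fix can be sketched as follows. The function name `append_copy` is an assumption made for illustration; it appends a copy of every existing element to the same vector.

```cpp
#include <cstddef>
#include <vector>

// Appends a copy of each original element. Indices stay valid when
// push_back reallocates, whereas iterators and references would dangle.
void append_copy(std::vector<int>& v) {
    const std::size_t original_size = v.size();
    for (std::size_t i = 0; i < original_size; ++i) {
        const int value = v[i];   // copy out before the vector may reallocate
        v.push_back(value);
    }
}
```

Capturing the original size up front also keeps the loop from chasing the elements it is adding.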

Uninitialized reads

cpp
int x;                    // indeterminate value — not zero-initialized
if (x > 0) { ... }       // UB: reading indeterminate value

// C++26: reading an uninitialized automatic variable is erroneous behavior
// (not UB). The variable holds an implementation-chosen erroneous value, and
// the implementation may diagnose the read, for example by trapping.
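
Initializing at the point of declaration removes the problem in every language version. A brief sketch, where `RetryPolicy` and `count_positive` are hypothetical names used only for illustration:

```cpp
#include <cstddef>

// Default member initializers give every field a known value,
// even for a plain `RetryPolicy p;` declaration.
struct RetryPolicy {
    int max_attempts = 3;
    bool use_backoff = false;
};

int count_positive(const int* values, std::size_t n) {
    int count{};   // value-initialization: count starts at 0
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] > 0) {
            ++count;
        }
    }
    return count;
}
```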

Strict aliasing violations

The strict aliasing rule permits the compiler to assume that pointers to unrelated types do not refer to the same object, enabling load/store reordering that breaks programs that violate it. The only types through which any object may be accessed are char, unsigned char, and (C++17) std::byte.

cpp
float f = 3.14f;
int* p = reinterpret_cast<int*>(&f);
*p = 0x3f800000;          // UB: accessing float storage through int*

// Correct type punning (both assume sizeof(int) == sizeof(float)):
int bits;
std::memcpy(&bits, &f, sizeof(float));        // <cstring>: always defined
int bits2 = std::bit_cast<int>(f);            // C++20, <bit>: preferred, constexpr-capable

std::bit_cast (C++20) is the canonical solution: it is constexpr, communicates intent, and typically compiles to the same code as memcpy in optimized builds.
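
For code bases that predate C++20, the memcpy approach can be wrapped in small helpers. This sketch assumes a 32-bit IEEE 754 float and enforces that with static_assert; the names `float_bits` and `float_from_bits` are illustrative.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Copies the object representation of f into an integer of the same size.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// The inverse direction: rebuild a float from its bit pattern.
float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Using an unsigned fixed-width integer avoids any dependence on the size or signedness of int.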

Shift overflow

cpp
int x = 1;
x << 31;    // 32-bit int: UB in C++11, implementation-defined after CWG 1457 (C++14), INT_MIN since C++20
x << 32;    // UB all versions: shift amount >= bit width

// C++20 change: left-shift of signed integers is defined as
// truncation (two's complement), so shifting into the sign bit is no longer UB.
// Before C++20, shifting a set bit past the sign bit (e.g. 2 << 31) was UB.

uint32_t u = 1u << 31;   // Always defined: 0x80000000u
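
When the shift amount comes from input, it needs a range check because an oversized shift is undefined even for unsigned types. A minimal sketch, assuming int is at most 32 bits wide so that the operand is not promoted to a wider signed type; `checked_shl` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <optional>

// Returns value << amount, or std::nullopt when the shift amount is
// not smaller than the 32-bit width of the operand.
std::optional<std::uint32_t> checked_shl(std::uint32_t value, unsigned amount) {
    constexpr unsigned kBits = 32;
    if (amount >= kBits) {
        return std::nullopt;
    }
    return value << amount;   // unsigned shift: high bits are discarded
}
```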

Data races

cpp
int counter = 0;
std::thread t1([&]{ ++counter; });   // C++11: std::thread introduced
std::thread t2([&]{ ++counter; });
t1.join(); t2.join();
// UB: concurrent non-atomic writes — C++11 memory model defines this as UB

// Fix: std::atomic<int> counter{0};  // C++11

Detection

| Tool | Catches | Flag |
|---|---|---|
| UBSan | Signed overflow, null deref, invalid enum, OOB | -fsanitize=undefined |
| ASan | OOB access, use-after-free, stack use-after-scope | -fsanitize=address |
| MSan | Uninitialized reads | -fsanitize=memory |
| TSan | Data races | -fsanitize=thread |
| Valgrind | OOB, use-after-free, leaks | No recompile needed |
| clang-tidy | Static: uninitialized vars, aliasing patterns | Compile-time |

bash
# Full sanitizer build — use this for CI on test suites
clang++ -fsanitize=address,undefined -fno-sanitize-recover=all \
        -g -O1 -o myapp main.cpp

# -fno-sanitize-recover=all: first UB hit aborts with stack trace
# -O1: enough optimization to trigger UB, cheap enough for CI

-fsanitize-trap=undefined (Clang) emits a hardware trap instead of a runtime call — useful for embedded targets that cannot link the sanitizer runtime.


Best Practices

  • Enable ASan + UBSan in CI. Run every test suite under -fsanitize=address,undefined -fno-sanitize-recover=all. This catches much of the UB on the code paths your tests execute, at low maintenance cost.
  • Test at -O2, not just debug. Many UB manifestations only appear when the optimizer runs. Debug builds mask the problem.
  • Use std::bit_cast for type punning (C++20). It's constexpr, expresses intent, and compiles to zero overhead.
  • Prefer .at() over [] at trust boundaries — where input size comes from external sources. The bounds check is a single branch.
  • Use std::atomic (C++11) or a mutex for any variable shared between threads. Even a read of non-atomic data is UB when another thread writes to it without synchronization.
  • Initialize all variables at declaration. Modern optimizers eliminate redundant stores; the bug risk from skipping initialization is not worth it.

Common Pitfalls

Signed/unsigned comparison. Given int i = -1; the test if (i < v.size()) is false: size() returns size_t (unsigned), so i is converted to SIZE_MAX before the comparison. Not UB, but almost always a bug. Compile with -Wsign-compare.
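
A range check that handles the sign explicitly avoids the surprising conversion. The helper below is a sketch with an illustrative name; in C++20, std::cmp_less from <utility> performs the same sign-aware comparison.

```cpp
#include <cstddef>
#include <vector>

// Rejects negative indices first, then compares in the unsigned domain,
// where the conversion can no longer change the value.
bool index_in_range(int i, const std::vector<int>& v) {
    return i >= 0 && static_cast<std::size_t>(i) < v.size();
}
```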

std::optional access without checking (C++17):

cpp
std::optional<int> opt;   // C++17
int a = *opt;             // UB: accessing disengaged optional
int b = opt.value();      // throws std::bad_optional_access; safer at trust boundaries
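
Two non-throwing ways to read an optional safely are sketched below. The function names and the default port 8080 are illustrative assumptions.

```cpp
#include <optional>

// value_or supplies a fallback, so a disengaged optional is never dereferenced.
int port_or_default(const std::optional<int>& configured) {
    return configured.value_or(8080);
}

// Explicit check before operator*: the dereference only runs when engaged.
bool try_read(const std::optional<int>& opt, int& out) {
    if (!opt) {
        return false;
    }
    out = *opt;
    return true;
}
```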

Placement new without std::launder (C++17). Reusing storage via placement new and then accessing through the original pointer is UB:

cpp
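alignas(int) char buf[sizeof(int)];
int* fresh = new (buf) int{42};                        // simplest: keep the pointer placement new returns
int* stale = reinterpret_cast<int*>(buf);              // still points at the char array; *stale is UB
int* ok = std::launder(reinterpret_cast<int*>(buf));   // C++17, <new>: refers to the new int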
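
In many cases std::launder is unnecessary because placement new already returns a pointer to the new object. A minimal sketch, using an unsigned char buffer (one of the types the standard allows to provide storage) and an illustrative function name:

```cpp
#include <new>

// Constructs an int inside raw storage and reads it back through the
// pointer that placement new returned, which refers to the new object.
int construct_and_read() {
    alignas(int) unsigned char storage[sizeof(int)];
    int* p = new (storage) int{42};
    const int value = *p;
    // int is trivially destructible, so no explicit destructor call is needed.
    return value;
}
```

Keeping the returned pointer is usually simpler than recovering it later from the buffer address.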

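Assuming sanitizers catch everything. MSan reports an uninitialized value only when it affects a branch, an address, or a system call, and it needs every linked library to be instrumented. UBSan does not check strict aliasing violations; Clang's separate TypeSanitizer (-fsanitize=type) targets them but is recent and still experimental. No sanitizer catches logic errors, and all of them see only the code paths your tests execute. Sanitizers are a floor, not a ceiling.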

Integer overflow in security-sensitive length arithmetic.

cpp
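// Classic heap overflow via wrapping length
uint32_t len = attacker_controlled;
char* out = new char[len + 1];    // len = UINT32_MAX → len + 1 wraps to 0 (defined, but wrong)
memcpy(out, src, len);            // copies UINT32_MAX bytes into a zero-length buffer: UB, heap corruption

// Fix: validate the length before doing arithmetic on it
if (len > MAX_SAFE_SIZE) return error;
char* checked_out = new char[len + 1];   // cannot wrap, assuming MAX_SAFE_SIZE < UINT32_MAX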
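
The validation step can be isolated in a function that either yields a safe allocation size or refuses. This is a sketch under stated assumptions: `kMaxPayload` is a hypothetical limit of 1 MiB and `buffer_size_for` is an illustrative name.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical upper bound on an accepted payload, in bytes.
constexpr std::uint32_t kMaxPayload = 1u << 20;

// Returns the number of bytes to allocate for `len` payload bytes plus a
// terminator, or std::nullopt when the length exceeds the limit.
std::optional<std::size_t> buffer_size_for(std::uint32_t len) {
    if (len > kMaxPayload) {
        return std::nullopt;
    }
    // Widen first, then add: the sum cannot wrap because len <= kMaxPayload.
    return static_cast<std::size_t>(len) + 1;
}
```

Callers allocate only when a value is returned, which keeps the bound check and the size arithmetic in one place.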

See Also