Algebraic Types
Sum types and product types in C++: composing data with std::variant, std::optional, and structs to make illegal states unrepresentable.
Algebraic Typessince C++17Algebraic types are composite types formed by two orthogonal operations: product (all fields present simultaneously, as in a struct) and sum (exactly one of several alternatives active at a time, as in std::variant), enabling the compiler to enforce exhaustive case analysis and eliminate illegal states.
Overview
Algebraic type theory names two fundamental ways to compose types:
- Product types hold all members simultaneously. A
struct Point { double x; double y; }has a state space equal to the Cartesian product ofdouble Γ double. Every struct, class, andstd::tupleis a product type. - Sum types hold exactly one of several alternatives at a time. The state space is the sum β the union β of the alternatives' state spaces. Also called discriminated unions or tagged unions.
C++ has always had raw union and struct, but neither composition primitive was type-safe. Raw union shares storage without tracking which member is active; accessing the wrong member is undefined behaviour. Struct boolean flags like bool has_value; T value; duplicate constraint enforcement across every callsite.
C++17 closed the gap with two vocabulary types:
std::variant<Ts...>β the general sum type. Exactly one alternative is active at all times. Access through the wrong type throws or returns null.std::optional<T>β the degenerate sum:Tor nothing. Semantically equivalent tostd::variant<std::monostate, T>but with a purpose-built interface for nullable values.
C++23 adds std::expected<T, E>, which is morally std::variant<T, E> with monadic operations for chaining fallible computations without exceptions.
The central payoff: when you add an alternative to a variant and fail to handle it in a std::visit call, the program fails to compile. There is no silent runtime fallthrough.
Syntax
Product types
// Plain struct β always available
struct Point { double x; double y; };
// C++11: std::tuple for anonymous products
#include <tuple>
using RGB = std::tuple<uint8_t, uint8_t, uint8_t>;
// C++17: structured bindings decompose any product type
auto [r, g, b] = RGB{255, 128, 0};Sum types with std::variant
#include <variant> // C++17
using Number = std::variant<int, double, long long>;
Number n = 42; // holds int
n = 3.14; // now holds double; previous value is destroyed
n = 100LL; // now holds long long
// Non-throwing index-based probe
if (auto* p = std::get_if<double>(&n)) // C++17 β returns nullptr on mismatch
std::cout << "double: " << *p << '\n';
// Throwing type-based access
double d = std::get<double>(n); // throws std::bad_variant_access if wrong
// Unambiguous construction β avoids implicit-conversion surprises
std::variant<long, double> v{std::in_place_type<long>, 1}; // C++17Exhaustive dispatch with std::visit
// Overload helper β deduction guide implicit in C++20, explicit needed in C++17
template<class... Fs>
struct overloaded : Fs... { using Fs::operator()...; };
template<class... Fs> overloaded(Fs...) -> overloaded<Fs...>; // C++17 deduction guide
Number result = 3.14;
std::visit(overloaded{
[](int i) { std::cout << "int: " << i << '\n'; },
[](double d) { std::cout << "double: " << d << '\n'; },
[](long long l) { std::cout << "ll: " << l << '\n'; },
}, result);
// Missing an alternative β compile error, not a runtime bugExamples
Expression tree (recursive sum type)
Recursive variants require indirection because Expr cannot contain itself directly.
#include <variant>
#include <memory>
#include <string>
#include <unordered_map>
struct Literal { double value; };
struct Variable { std::string name; };
struct BinaryOp;
using Expr = std::variant<Literal, Variable, std::unique_ptr<BinaryOp>>;
struct BinaryOp {
char op;
Expr lhs;
Expr rhs;
};
double eval(const Expr& e, const std::unordered_map<std::string, double>& env) {
return std::visit(overloaded{
[](const Literal& l) { return l.value; },
[&](const Variable& v) { return env.at(v.name); },
[&](const std::unique_ptr<BinaryOp>& b) {
double l = eval(b->lhs, env), r = eval(b->rhs, env);
switch (b->op) {
case '+': return l + r;
case '-': return l - r;
case '*': return l * r;
case '/': return l / r;
default: return 0.0;
}
},
}, e);
}Every new node kind added to Expr forces an update to every std::visit call β the compiler tracks the obligation, not comments or docs.
State machine without sentinel values
struct Idle {};
struct Running { std::string task_name; };
struct Failed { std::string error; int code; };
struct Done { int exit_code; };
using WorkerState = std::variant<Idle, Running, Failed, Done>;
void start(WorkerState& s, std::string task) {
std::visit(overloaded{
[&](Idle&) { s = Running{std::move(task)}; },
[](Running&) { /* already active β no-op or throw */ },
[](Failed&) { s = Idle{}; }, // reset before retry
[](Done&) { s = Idle{}; },
}, s);
}
bool terminal(const WorkerState& s) noexcept {
return std::holds_alternative<Failed>(s) // C++17
|| std::holds_alternative<Done>(s);
}Contrast this with enum class State { Idle, Running, Failed, Done } plus a separate string error; int code: nothing prevents State::Idle while error is non-empty.
Result type without exceptions (C++17 approximation)
template<class T, class E>
using Result = std::variant<T, E>;
struct ParseError { std::string msg; int line; };
Result<double, ParseError> parse_number(std::string_view sv) { // string_view: C++17
char* end{};
double v = std::strtod(sv.data(), &end);
if (end == sv.data())
return ParseError{"not a number", 0};
return v;
}
auto r = parse_number("3.14");
if (auto* val = std::get_if<double>(&r))
std::cout << "parsed: " << *val << '\n';
else if (auto* err = std::get_if<ParseError>(&r))
std::cerr << "error on line " << err->line << ": " << err->msg << '\n';In C++23, replace this pattern with std::expected<double, ParseError>. It provides and_then, transform, and or_else for monadic chaining without manual visiting.
Optional for nullable returns
#include <optional> // C++17
std::optional<std::string> find_user(const Database& db, int id) {
auto row = db.query_one("SELECT name FROM users WHERE id = ?", id);
if (!row) return std::nullopt;
return row->get<std::string>("name");
}
// At the callsite, the absence is explicit and enforced
if (auto name = find_user(db, 42))
greet(*name);
// C++23: monadic interface avoids manual if-checks
auto upper = find_user(db, 42)
.transform([](std::string s) {
std::ranges::transform(s, s.begin(), ::toupper);
return s;
})
.value_or("UNKNOWN");Best Practices
Prefer std::get_if over std::get when the alternative might legitimately be absent. std::get<T> throws std::bad_variant_access on mismatch; std::get_if<T> returns a nullable pointer. In hot paths, the pointer form avoids exception overhead.
Use std::monostate to opt into default construction. A std::variant<A, B> default-constructs to A{}. If A is not default-constructible, the variant is also not default-constructible. Prepend std::monostate to allow it: std::variant<std::monostate, NonDefault>.
Flatten nested variants. std::variant<A, std::variant<B, C>> is rarely correct. Flatten it to std::variant<A, B, C> to keep visit calls manageable and avoid double-dispatch.
Annotate visitor return types explicitly when alternatives differ. std::visit requires all branches to return the same type. If branches return int and long, the lambda deduces conflicting types. Annotate the return: [](int i) -> long { return i; } or wrap in an explicit cast.
Common Pitfalls
Valueless-by-exception state. If an alternative's move constructor throws during emplace, the variant enters a valueless_by_exception() state. Subsequent std::get calls throw. Avoid this by preferring alternatives whose move constructors are noexcept β check with std::is_nothrow_move_constructible_v<T> (C++17).
Ambiguous implicit conversions. std::variant<int, long> v = 1; may fail to compile or pick an unexpected alternative depending on overload resolution. Use std::in_place_type<T> for unambiguous construction, especially with arithmetic types.
std::optional<bool> boolean traps. An empty optional<bool> is falsy; a optional<bool> holding false is truthy. The idiomatic conversion to bool breaks intuition when T is itself bool, pointer, or another contextually-convertible type. Always use .has_value() and dereference separately rather than relying on bare boolean conversion.
Recursive types need indirection. A std::variant cannot list itself as an alternative; the size would be infinite. Introduce std::unique_ptr<Node> or std::shared_ptr<Node> as the recursive branch, as shown in the AST example above.
See Also
std::variantβ cppreference (type-safe union, C++17)std::optionalβ cppreference (nullable value wrapper, C++17)std::expectedβ cppreference (error-return monad, C++23)std::visitβ cppreference (exhaustive sum-type dispatch, C++17)std::monostateβ cppreference (empty alternative tag, C++17)