Skip to content
C++
Library
since C++98
Intermediate

Numeric Algorithms

"C++ numeric algorithms: accumulate, reduce, transform_reduce, prefix scans, inner_product, iota, gcd, lcm, midpoint, and lerp with parallel execution notes."

<numeric> Algorithmssince C++98

The <numeric> header provides fold, prefix-scan, inner-product, and sequence-fill algorithms over ranges; C++17 added parallel-safe reduce, transform_reduce, and scan variants; C++20 added midpoint (in <numeric>) and lerp (in <cmath>).

Overview

<numeric> algorithms fall into four groups:

  • Folds: accumulate (C++98, ordered), reduce (C++17, reorderable/parallel)
  • Map-reduce: inner_product (C++98), transform_reduce (C++17)
  • Prefix scans: partial_sum (C++98), inclusive_scan/exclusive_scan/transform_*_scan (C++17)
  • Utilities: adjacent_difference (C++98), iota (C++11), gcd/lcm (C++17), midpoint (C++20), lerp in <cmath> (C++20)

Parallel overloads accepting std::execution policies require <execution> and linking against an appropriate parallel backend (e.g., TBB on Linux, built-in on MSVC).

std::accumulate vs std::reduce

accumulate (C++98) is a strict left fold β€” it evaluates elements left-to-right in order, making it correct for non-associative and non-commutative operations but preventing parallelism.

cpp
#include <numeric>
#include <functional>

std::vector<double> v{1.5, 2.5, 3.0};

// Init type determines accumulator type β€” use 0.0 for floating-point ranges
double sum  = std::accumulate(v.begin(), v.end(), 0.0);           // 7.0
double prod = std::accumulate(v.begin(), v.end(), 1.0, std::multiplies<>{});

// String concat: order matters β€” accumulate is the correct tool
std::vector<std::string> words{"foo", "bar", "baz"};
std::string s = std::accumulate(words.begin(), words.end(), std::string{},
    [](std::string acc, const std::string& x) { return std::move(acc) + x; });
// "foobarbaz"

reduce (C++17) permits reordering of operations β€” correct when the operation is associative and commutative. It exposes execution policy overloads:

cpp
#include <numeric>
#include <execution>

std::vector<double> data(1'000'000, 1.0);

double s1 = std::reduce(data.begin(), data.end(), 0.0);                    // C++17
double s2 = std::reduce(std::execution::par_unseq,                         // C++17
                        data.begin(), data.end(), 0.0);

// WRONG: string concat order is unspecified with reduce
// auto bad = std::reduce(words.begin(), words.end());  // UB / wrong result

std::transform_reduce

One-pass map-reduce β€” avoids materialising an intermediate transformed range. The two-range overload computes a generalised inner product:

cpp
#include <numeric>

std::vector<double> a{1, 2, 3};
std::vector<double> b{4, 5, 6};

// Dot product: multiply elements pairwise, then sum                        // C++17
double dot = std::transform_reduce(a.begin(), a.end(), b.begin(), 0.0);
// 1*4 + 2*5 + 3*6 = 32.0

// L2 norm squared β€” unary transform overload
double norm_sq = std::transform_reduce(
    a.begin(), a.end(), 0.0,
    std::plus{},
    [](double x) { return x * x; });
// 1 + 4 + 9 = 14.0

// Parallel weighted sum β€” safe: + and * are commutative+associative on reals
std::vector<double> weights{0.5, 0.3, 0.2};
double wsum = std::transform_reduce(
    std::execution::par_unseq,
    a.begin(), a.end(), weights.begin(), 0.0);

Prefer transform_reduce over inner_product in C++17+ code β€” it composes with execution policies.

Prefix Scans

C++17 splits partial_sum (C++98) into explicit inclusive/exclusive variants that accept execution policies and a custom binary operation:

cpp
#include <numeric>

std::vector<int> v{1, 2, 3, 4, 5};
std::vector<int> out(v.size());

// partial_sum (C++98) β€” inclusive, strictly ordered
std::partial_sum(v.begin(), v.end(), out.begin());
// out: {1, 3, 6, 10, 15}

// inclusive_scan (C++17) β€” parallel-capable, same result for commutative ops
std::inclusive_scan(std::execution::par, v.begin(), v.end(), out.begin());
// out: {1, 3, 6, 10, 15}

// exclusive_scan (C++17) β€” identity first, last element excluded
std::exclusive_scan(v.begin(), v.end(), out.begin(), 0);
// out: {0, 1, 3, 6, 10}

// Running product with init value
std::inclusive_scan(v.begin(), v.end(), out.begin(), std::multiplies{}, 1);
// out: {1, 2, 6, 24, 120}

// transform_exclusive_scan (C++17): squares then exclusive prefix sum
std::transform_exclusive_scan(v.begin(), v.end(), out.begin(),
    0, std::plus{}, [](int x) { return x * x; });
// transforms: {1, 4, 9, 16, 25}  β†’  out: {0, 1, 5, 14, 30}

Exclusive scan is the standard building block for work-offset tables in parallel scatter/gather: each element gets the sum of all preceding elements, useful for packing results into a flat array.

std::inner_product and std::adjacent_difference

Two C++98 algorithms with no C++17 parallel equivalents β€” use transform_reduce and manual loops instead when parallelism matters.

cpp
#include <numeric>

std::vector<int> a{1, 2, 3};
std::vector<int> b{4, 5, 6};

// inner_product (C++98): generalised dot product
int dot  = std::inner_product(a.begin(), a.end(), b.begin(), 0);
// 1*4 + 2*5 + 3*6 = 32

// Manhattan distance with custom ops
int dist = std::inner_product(a.begin(), a.end(), b.begin(), 0,
    std::plus{},
    [](int x, int y) { return std::abs(x - y); });
// |1-4| + |2-5| + |3-6| = 9

// adjacent_difference (C++98): discrete derivative
// out[0] = in[0]; out[i] = op(in[i], in[i-1])
std::vector<int> prices{100, 103, 97, 110};
std::vector<int> deltas(prices.size());
std::adjacent_difference(prices.begin(), prices.end(), deltas.begin());
// deltas: {100, 3, -6, 13}  (first element copied, rest are differences)

// Custom op: per-mille change
std::adjacent_difference(prices.begin(), prices.end(), deltas.begin(),
    [](int cur, int prev) { return (cur - prev) * 1000 / prev; });
// deltas: {100, 30, -58, 134}

In-place (d_first == first) is valid for adjacent_difference; a shifted-by-one overlap is not.

std::iota, std::gcd, std::lcm

cpp
#include <numeric>

// iota (C++11): fill with consecutive values β€” classic argsort pattern
std::vector<size_t> idx(data.size());
std::iota(idx.begin(), idx.end(), 0);
std::sort(idx.begin(), idx.end(),
    [&](size_t i, size_t j) { return data[i] < data[j]; });
// idx is now a permutation that stably sorts data

// gcd / lcm (C++17): work on integers, take absolute values internally
std::gcd(12, 8)    // 4
std::gcd(-12, 8)   // 4
std::gcd(0, 5)     // 5
std::lcm(4, 6)     // 12

auto reduce_fraction = [](int n, int d) -> std::pair<int, int> {
    int g = std::gcd(n, d);
    return {n / g, d / g};
};
// reduce_fraction(6, 9) β†’ {2, 3}

// GCD of an entire range via accumulate
std::vector<int> vals{12, 18, 30};
int g = std::accumulate(vals.begin(), vals.end(), 0,
    [](int a, int b) { return std::gcd(a, b); });
// g = 6

std::midpoint and std::lerp (C++20)

midpoint lives in <numeric>; lerp lives in <cmath>. Both are C++20.

cpp
#include <numeric>  // midpoint
#include <cmath>    // lerp

// midpoint (C++20): overflow-safe; rounds toward first argument for integers
std::midpoint(0, 7)                    // 3 (toward 0)
std::midpoint(7, 0)                    // 4 (toward 7)
std::midpoint(INT_MAX - 1, INT_MAX)    // INT_MAX - 1  (no overflow)
std::midpoint(0.0, 1.0)               // 0.5

// Works with pointers into the same array
int arr[10];
int* mid = std::midpoint(arr, arr + 10);  // arr + 5

// lerp (C++20): a + t*(b-a), but with exact boundary guarantees
// lerp(a, b, 0) == a exactly; lerp(a, b, 1) == b exactly
std::lerp(0.0, 10.0, 0.25)  // 2.5
std::lerp(0.0, 10.0, 1.0)   // exactly 10.0

// Monotone for t in [0,1] when a <= b β€” safe for animation blending
struct Vec3 { float x, y, z; };
auto lerp3 = [](Vec3 a, Vec3 b, float t) -> Vec3 {
    return {std::lerp(a.x, b.x, t),
            std::lerp(a.y, b.y, t),
            std::lerp(a.z, b.z, t)};
};
// t outside [0,1] extrapolates β€” valid but check your domain

Best Practices

  • Match init type to result type: accumulate(v.begin(), v.end(), 0) yields int even if v is vector<double>. Always pass 0.0, 0LL, or the correct typed value.
  • Use reduce for arithmetic, accumulate when order matters (string building, running state, non-commutative ops).
  • Prefer transform_reduce over inner_product for dot products in new C++17 code β€” it participates in execution policies.
  • Use exclusive_scan for offset tables β€” its output[i] holds the sum of all preceding elements, the right shape for parallel compaction indices.
  • Use iota + sort with a lambda for argsort rather than maintaining parallel index arrays manually.

Common Pitfalls

Integer init silently truncates:

cpp
std::vector<double> v{1.5, 2.5};
auto bad  = std::accumulate(v.begin(), v.end(), 0);    // int: 3
auto good = std::accumulate(v.begin(), v.end(), 0.0);  // double: 4.0

reduce with non-commutative ops gives unspecified order:

cpp
std::vector<std::string> w{"a", "b", "c"};
// UB/unspecified: result could be "bac", "abc", "cab"
auto s = std::reduce(w.begin(), w.end());
// Fix: std::accumulate

Parallel reduce is not bit-identical to sequential accumulate: floating-point addition is not associative, so parallel reductions may produce slightly different rounding. This is expected, correct, and usually acceptable for scientific workloads β€” but do not use parallel reduce to reproduce a specific numerical sequence.

adjacent_difference first element is always copied: out[0] == in[0] regardless of your custom op. Account for this in downstream consumers.

Quick Reference

AlgorithmHeaderSincePurpose
accumulate<numeric>C++98Ordered left fold
inner_product<numeric>C++98Generalised dot product
partial_sum<numeric>C++98Inclusive prefix sum (ordered)
adjacent_difference<numeric>C++98Discrete derivative
iota<numeric>C++11Fill with consecutive values
reduce<numeric>C++17Reorderable fold, parallel-safe
transform_reduce<numeric>C++17Map then reduce
inclusive_scan<numeric>C++17Inclusive scan + parallel
exclusive_scan<numeric>C++17Exclusive scan + parallel
transform_inclusive_scan<numeric>C++17Map then inclusive scan
transform_exclusive_scan<numeric>C++17Map then exclusive scan
gcd<numeric>C++17Greatest common divisor
lcm<numeric>C++17Least common multiple
midpoint<numeric>C++20Overflow-safe midpoint
lerp<cmath>C++20Linear interpolation

See Also