Numeric Algorithms
"C++ numeric algorithms: accumulate, reduce, transform_reduce, prefix scans, inner_product, iota, gcd, lcm, midpoint, and lerp with parallel execution notes."
<numeric> Algorithmssince C++98The <numeric> header provides fold, prefix-scan, inner-product, and sequence-fill algorithms over ranges; C++17 added parallel-safe reduce, transform_reduce, and scan variants; C++20 added midpoint (in <numeric>) and lerp (in <cmath>).
Overview
<numeric> algorithms fall into four groups:
- Folds:
accumulate(C++98, ordered),reduce(C++17, reorderable/parallel) - Map-reduce:
inner_product(C++98),transform_reduce(C++17) - Prefix scans:
partial_sum(C++98),inclusive_scan/exclusive_scan/transform_*_scan(C++17) - Utilities:
adjacent_difference(C++98),iota(C++11),gcd/lcm(C++17),midpoint(C++20),lerpin<cmath>(C++20)
Parallel overloads accepting std::execution policies require <execution> and linking against an appropriate parallel backend (e.g., TBB on Linux, built-in on MSVC).
std::accumulate vs std::reduce
accumulate (C++98) is a strict left fold β it evaluates elements left-to-right in order, making it correct for non-associative and non-commutative operations but preventing parallelism.
#include <numeric>
#include <functional>
std::vector<double> v{1.5, 2.5, 3.0};
// Init type determines accumulator type β use 0.0 for floating-point ranges
double sum = std::accumulate(v.begin(), v.end(), 0.0); // 7.0
double prod = std::accumulate(v.begin(), v.end(), 1.0, std::multiplies<>{});
// String concat: order matters β accumulate is the correct tool
std::vector<std::string> words{"foo", "bar", "baz"};
std::string s = std::accumulate(words.begin(), words.end(), std::string{},
[](std::string acc, const std::string& x) { return std::move(acc) + x; });
// "foobarbaz"reduce (C++17) permits reordering of operations β correct when the operation is associative and commutative. It exposes execution policy overloads:
#include <numeric>
#include <execution>
std::vector<double> data(1'000'000, 1.0);
double s1 = std::reduce(data.begin(), data.end(), 0.0); // C++17
double s2 = std::reduce(std::execution::par_unseq, // C++17
data.begin(), data.end(), 0.0);
// WRONG: string concat order is unspecified with reduce
// auto bad = std::reduce(words.begin(), words.end()); // UB / wrong resultstd::transform_reduce
One-pass map-reduce β avoids materialising an intermediate transformed range. The two-range overload computes a generalised inner product:
#include <numeric>
std::vector<double> a{1, 2, 3};
std::vector<double> b{4, 5, 6};
// Dot product: multiply elements pairwise, then sum // C++17
double dot = std::transform_reduce(a.begin(), a.end(), b.begin(), 0.0);
// 1*4 + 2*5 + 3*6 = 32.0
// L2 norm squared β unary transform overload
double norm_sq = std::transform_reduce(
a.begin(), a.end(), 0.0,
std::plus{},
[](double x) { return x * x; });
// 1 + 4 + 9 = 14.0
// Parallel weighted sum β safe: + and * are commutative+associative on reals
std::vector<double> weights{0.5, 0.3, 0.2};
double wsum = std::transform_reduce(
std::execution::par_unseq,
a.begin(), a.end(), weights.begin(), 0.0);Prefer transform_reduce over inner_product in C++17+ code β it composes with execution policies.
Prefix Scans
C++17 splits partial_sum (C++98) into explicit inclusive/exclusive variants that accept execution policies and a custom binary operation:
#include <numeric>
std::vector<int> v{1, 2, 3, 4, 5};
std::vector<int> out(v.size());
// partial_sum (C++98) β inclusive, strictly ordered
std::partial_sum(v.begin(), v.end(), out.begin());
// out: {1, 3, 6, 10, 15}
// inclusive_scan (C++17) β parallel-capable, same result for commutative ops
std::inclusive_scan(std::execution::par, v.begin(), v.end(), out.begin());
// out: {1, 3, 6, 10, 15}
// exclusive_scan (C++17) β identity first, last element excluded
std::exclusive_scan(v.begin(), v.end(), out.begin(), 0);
// out: {0, 1, 3, 6, 10}
// Running product with init value
std::inclusive_scan(v.begin(), v.end(), out.begin(), std::multiplies{}, 1);
// out: {1, 2, 6, 24, 120}
// transform_exclusive_scan (C++17): squares then exclusive prefix sum
std::transform_exclusive_scan(v.begin(), v.end(), out.begin(),
0, std::plus{}, [](int x) { return x * x; });
// transforms: {1, 4, 9, 16, 25} β out: {0, 1, 5, 14, 30}Exclusive scan is the standard building block for work-offset tables in parallel scatter/gather: each element gets the sum of all preceding elements, useful for packing results into a flat array.
std::inner_product and std::adjacent_difference
Two C++98 algorithms with no C++17 parallel equivalents β use transform_reduce and manual loops instead when parallelism matters.
#include <numeric>
std::vector<int> a{1, 2, 3};
std::vector<int> b{4, 5, 6};
// inner_product (C++98): generalised dot product
int dot = std::inner_product(a.begin(), a.end(), b.begin(), 0);
// 1*4 + 2*5 + 3*6 = 32
// Manhattan distance with custom ops
int dist = std::inner_product(a.begin(), a.end(), b.begin(), 0,
std::plus{},
[](int x, int y) { return std::abs(x - y); });
// |1-4| + |2-5| + |3-6| = 9
// adjacent_difference (C++98): discrete derivative
// out[0] = in[0]; out[i] = op(in[i], in[i-1])
std::vector<int> prices{100, 103, 97, 110};
std::vector<int> deltas(prices.size());
std::adjacent_difference(prices.begin(), prices.end(), deltas.begin());
// deltas: {100, 3, -6, 13} (first element copied, rest are differences)
// Custom op: per-mille change
std::adjacent_difference(prices.begin(), prices.end(), deltas.begin(),
[](int cur, int prev) { return (cur - prev) * 1000 / prev; });
// deltas: {100, 30, -58, 134}In-place (d_first == first) is valid for adjacent_difference; a shifted-by-one overlap is not.
std::iota, std::gcd, std::lcm
#include <numeric>
// iota (C++11): fill with consecutive values β classic argsort pattern
std::vector<size_t> idx(data.size());
std::iota(idx.begin(), idx.end(), 0);
std::sort(idx.begin(), idx.end(),
[&](size_t i, size_t j) { return data[i] < data[j]; });
// idx is now a permutation that stably sorts data
// gcd / lcm (C++17): work on integers, take absolute values internally
std::gcd(12, 8) // 4
std::gcd(-12, 8) // 4
std::gcd(0, 5) // 5
std::lcm(4, 6) // 12
auto reduce_fraction = [](int n, int d) -> std::pair<int, int> {
int g = std::gcd(n, d);
return {n / g, d / g};
};
// reduce_fraction(6, 9) β {2, 3}
// GCD of an entire range via accumulate
std::vector<int> vals{12, 18, 30};
int g = std::accumulate(vals.begin(), vals.end(), 0,
[](int a, int b) { return std::gcd(a, b); });
// g = 6std::midpoint and std::lerp (C++20)
midpoint lives in <numeric>; lerp lives in <cmath>. Both are C++20.
#include <numeric> // midpoint
#include <cmath> // lerp
// midpoint (C++20): overflow-safe; rounds toward first argument for integers
std::midpoint(0, 7) // 3 (toward 0)
std::midpoint(7, 0) // 4 (toward 7)
std::midpoint(INT_MAX - 1, INT_MAX) // INT_MAX - 1 (no overflow)
std::midpoint(0.0, 1.0) // 0.5
// Works with pointers into the same array
int arr[10];
int* mid = std::midpoint(arr, arr + 10); // arr + 5
// lerp (C++20): a + t*(b-a), but with exact boundary guarantees
// lerp(a, b, 0) == a exactly; lerp(a, b, 1) == b exactly
std::lerp(0.0, 10.0, 0.25) // 2.5
std::lerp(0.0, 10.0, 1.0) // exactly 10.0
// Monotone for t in [0,1] when a <= b β safe for animation blending
struct Vec3 { float x, y, z; };
auto lerp3 = [](Vec3 a, Vec3 b, float t) -> Vec3 {
return {std::lerp(a.x, b.x, t),
std::lerp(a.y, b.y, t),
std::lerp(a.z, b.z, t)};
};
// t outside [0,1] extrapolates β valid but check your domainBest Practices
- Match init type to result type:
accumulate(v.begin(), v.end(), 0)yieldsinteven ifvisvector<double>. Always pass0.0,0LL, or the correct typed value. - Use
reducefor arithmetic,accumulatewhen order matters (string building, running state, non-commutative ops). - Prefer
transform_reduceoverinner_productfor dot products in new C++17 code β it participates in execution policies. - Use
exclusive_scanfor offset tables β its output[i] holds the sum of all preceding elements, the right shape for parallel compaction indices. - Use
iota+sortwith a lambda for argsort rather than maintaining parallel index arrays manually.
Common Pitfalls
Integer init silently truncates:
std::vector<double> v{1.5, 2.5};
auto bad = std::accumulate(v.begin(), v.end(), 0); // int: 3
auto good = std::accumulate(v.begin(), v.end(), 0.0); // double: 4.0reduce with non-commutative ops gives unspecified order:
std::vector<std::string> w{"a", "b", "c"};
// UB/unspecified: result could be "bac", "abc", "cab"
auto s = std::reduce(w.begin(), w.end());
// Fix: std::accumulateParallel reduce is not bit-identical to sequential accumulate: floating-point addition is not associative, so parallel reductions may produce slightly different rounding. This is expected, correct, and usually acceptable for scientific workloads β but do not use parallel reduce to reproduce a specific numerical sequence.
adjacent_difference first element is always copied: out[0] == in[0] regardless of your custom op. Account for this in downstream consumers.
Quick Reference
| Algorithm | Header | Since | Purpose |
|---|---|---|---|
accumulate | <numeric> | C++98 | Ordered left fold |
inner_product | <numeric> | C++98 | Generalised dot product |
partial_sum | <numeric> | C++98 | Inclusive prefix sum (ordered) |
adjacent_difference | <numeric> | C++98 | Discrete derivative |
iota | <numeric> | C++11 | Fill with consecutive values |
reduce | <numeric> | C++17 | Reorderable fold, parallel-safe |
transform_reduce | <numeric> | C++17 | Map then reduce |
inclusive_scan | <numeric> | C++17 | Inclusive scan + parallel |
exclusive_scan | <numeric> | C++17 | Exclusive scan + parallel |
transform_inclusive_scan | <numeric> | C++17 | Map then inclusive scan |
transform_exclusive_scan | <numeric> | C++17 | Map then exclusive scan |
gcd | <numeric> | C++17 | Greatest common divisor |
lcm | <numeric> | C++17 | Least common multiple |
midpoint | <numeric> | C++20 | Overflow-safe midpoint |
lerp | <cmath> | C++20 | Linear interpolation |
See Also
- Execution Policies (
<execution>) βseq,par,par_unseq,unseqsemantics and thread-safety requirements - Ranges Algorithms β C++20 range-based views composable with numeric algorithms
- Fold Expressions β compile-time equivalent for parameter pack reductions
std::valarrayβ C++98 numeric array with built-in element-wise operators and slice views