Fold Operations
Reduce a range to a single value using std::accumulate, std::reduce, std::transform_reduce, and the C++23 ranges::fold family.
Fold Operationssince C++98A fold reduces a sequence of elements to a single accumulated value by repeatedly applying a binary operation, starting from an explicit initial value and consuming elements left-to-right (or right-to-left for right folds).
Overview
The C++ standard library provides a layered progression of fold algorithms, each adding capabilities introduced across standards:
| Algorithm | Header | Since | Notes |
|---|---|---|---|
std::accumulate | <numeric> | C++98 | Sequential left fold |
std::inner_product | <numeric> | C++98 | Two-range fold |
std::partial_sum | <numeric> | C++98 | Prefix fold (scan) |
std::reduce | <numeric> | C++17 | Parallelisable; order unspecified |
std::transform_reduce | <numeric> | C++17 | Map then reduce |
std::inclusive_scan | <numeric> | C++17 | Parallelisable prefix fold |
std::exclusive_scan | <numeric> | C++17 | Prefix fold excluding current element |
std::ranges::fold_left | <algorithm> | C++23 | Range-based left fold |
std::ranges::fold_right | <algorithm> | C++23 | Range-based right fold |
std::ranges::fold_left_first | <algorithm> | C++23 | Left fold; first element as init |
Syntax
// C++98 β sequential left fold
T accumulate(InputIt first, InputIt last, T init);
T accumulate(InputIt first, InputIt last, T init, BinaryOp op);
// C++17 β order-unspecified fold, supports execution policies
T reduce(InputIt first, InputIt last);
T reduce(InputIt first, InputIt last, T init);
T reduce(ExecPolicy&& policy, ForwardIt first, ForwardIt last, T init, BinaryOp op);
// C++17 β transform each element then fold
T transform_reduce(InputIt first, InputIt last, T init, BinaryReduceOp r, UnaryTransformOp t);
T transform_reduce(InputIt1 first1, InputIt1 last1, InputIt2 first2, T init); // dot product
// C++23 β range-based folds
auto ranges::fold_left(R&& r, T init, F op) -> T;
auto ranges::fold_right(R&& r, T init, F op) -> T;
auto ranges::fold_left_first(R&& r, F op) -> optional<T>; // init = first elementExamples
Basic accumulation and the init-type trap
std::accumulate deduces its return type from init, not from the range element type. Passing integer 0 for a range of double silently truncates.
#include <numeric>
#include <vector>
std::vector<double> prices{1.5, 2.3, 0.8, 4.1};
// BAD: init is int β result is int (truncated to 8)
int wrong = std::accumulate(prices.begin(), prices.end(), 0);
// CORRECT: init matches the value type
double total = std::accumulate(prices.begin(), prices.end(), 0.0); // 8.7Custom binary operation
#include <numeric>
#include <vector>
#include <string>
std::vector<std::string> words{"fold", "is", "powerful"};
// Right-fold simulation via reverse iterators
std::string sentence = std::accumulate(
words.rbegin(), words.rend(), std::string{},
[](const std::string& acc, const std::string& w) {
return acc.empty() ? w : acc + " " + w;
}
); // "powerful is fold"std::reduce for parallel workloads (C++17)
std::reduce allows the runtime to reorder and group operations, enabling vectorisation and thread-level parallelism. The binary operation must be associative and commutative for results to be deterministic.
#include <numeric>
#include <execution> // C++17
#include <vector>
std::vector<double> samples(10'000'000);
// ... fill samples ...
// Sequential β identical to accumulate
double seq_sum = std::reduce(samples.begin(), samples.end(), 0.0);
// Parallel β may use all available cores
double par_sum = std::reduce(
std::execution::par_unseq, // C++17 execution policy
samples.begin(), samples.end(), 0.0
);Subtraction and division are not commutative; using them with std::reduce produces undefined behaviour when an execution policy is active.
std::transform_reduce for map-reduce (C++17)
The canonical use case is a dot product or any weighted sum:
#include <numeric>
#include <execution>
#include <vector>
std::vector<double> weights{0.2, 0.5, 0.3};
std::vector<double> values{10.0, 20.0, 30.0};
// Dot product: sum(weights[i] * values[i])
double weighted = std::transform_reduce(
std::execution::par, // C++17
weights.begin(), weights.end(),
values.begin(),
0.0 // init
); // 0.2*10 + 0.5*20 + 0.3*30 = 21.0
// Unary form: sum of squares
std::vector<int> v{1, 2, 3, 4, 5};
int sum_sq = std::transform_reduce(
v.begin(), v.end(), 0,
std::plus<>{},
[](int x){ return x * x; }
); // 55C++23 ranges::fold_left and fold_right
The C++23 ranges fold algorithms work directly on range objects and return the accumulated value without requiring explicit iterator pairs. fold_left_first uses the first element as the initial value, returning std::optional to handle empty ranges safely.
#include <algorithm> // C++23
#include <ranges>
#include <vector>
#include <optional>
std::vector<int> nums{1, 2, 3, 4, 5};
// Left fold: ((((1+2)+3)+4)+5) = 15
int sum = std::ranges::fold_left(nums, 0, std::plus<>{}); // C++23
// Right fold: (1+(2+(3+(4+5)))) = 15
int rsum = std::ranges::fold_right(nums, 0, std::plus<>{}); // C++23
// fold_left_first: no explicit init; uses nums[0] as starting value
std::optional<int> maybe = std::ranges::fold_left_first(nums, std::plus<>{}); // C++23
int result = maybe.value(); // 15; safe β nums is non-empty
// Composes naturally with views
auto evens_sum = std::ranges::fold_left(
nums | std::views::filter([](int x){ return x % 2 == 0; }), // C++20
0, std::plus<>{}
); // 6Best Practices
Prefer std::reduce over std::accumulate for arithmetic on large ranges. With std::execution::par_unseq, the compiler can auto-vectorise and distribute across threads. Reserve std::accumulate for non-commutative operations (e.g., string concatenation, matrix multiplication) or when exact left-to-right order is required.
Prefer std::ranges::fold_left in new C++23 code. It accepts ranges directly, composes with views pipelines, and expresses intent more clearly than iterator pairs. fold_left_first eliminates the need to manually extract front() as an initial value.
Match the init type explicitly. Template deduction derives the return type from init. For any fold over floating-point elements, always write 0.0, 0.0f, or T{} β never a bare integer literal.
Use std::transform_reduce instead of separate transform + accumulate. A single pass is cache-friendlier and enables the parallelism optimisations to span both phases simultaneously.
Common Pitfalls
Mutation in the binary operation. Both std::accumulate and std::reduce pass the accumulator by value. Capturing and mutating state outside the lambda is a data race when a parallel execution policy is active.
Non-commutative ops with std::reduce. Passing std::execution::par to std::reduce with subtraction or non-commutative user ops produces implementation-defined results. The standard allows arbitrary grouping of operations.
Empty range with fold_left_first. When the range is empty, std::ranges::fold_left_first returns std::nullopt. Always check the optional before using the value.
Overflow in the accumulator. If the element type is int and the sum exceeds INT_MAX, the accumulator overflows. Use a wider init type: std::accumulate(v.begin(), v.end(), 0LL) promotes the accumulator to long long.
See Also
std::partial_sum/std::inclusive_scan/std::exclusive_scanβ prefix folds that emit one output per input elementstd::inner_productβ C++98 two-range fold equivalent totransform_reduce's two-iterator form- Execution policies (
std::execution::par,std::execution::par_unseq) β C++17 parallel algorithm dispatch - C++17 fold expressions β language-level pack expansion over variadic templates using
(... op args)syntax, a compile-time analogue to these runtime algorithms