Skip to content
C++
Library
since C++98
Intermediate

Fold Operations

Reduce a range to a single value using std::accumulate, std::reduce, std::transform_reduce, and the C++23 ranges::fold family.

Fold Operationssince C++98

A fold reduces a sequence of elements to a single accumulated value by repeatedly applying a binary operation, starting from an explicit initial value and consuming elements left-to-right (or right-to-left for right folds).

Overview

The C++ standard library provides a layered progression of fold algorithms, each adding capabilities introduced across standards:

AlgorithmHeaderSinceNotes
std::accumulate<numeric>C++98Sequential left fold
std::inner_product<numeric>C++98Two-range fold
std::partial_sum<numeric>C++98Prefix fold (scan)
std::reduce<numeric>C++17Parallelisable; order unspecified
std::transform_reduce<numeric>C++17Map then reduce
std::inclusive_scan<numeric>C++17Parallelisable prefix fold
std::exclusive_scan<numeric>C++17Prefix fold excluding current element
std::ranges::fold_left<algorithm>C++23Range-based left fold
std::ranges::fold_right<algorithm>C++23Range-based right fold
std::ranges::fold_left_first<algorithm>C++23Left fold; first element as init

Syntax

cpp
// C++98 β€” sequential left fold
T accumulate(InputIt first, InputIt last, T init);
T accumulate(InputIt first, InputIt last, T init, BinaryOp op);

// C++17 β€” order-unspecified fold, supports execution policies
T reduce(InputIt first, InputIt last);
T reduce(InputIt first, InputIt last, T init);
T reduce(ExecPolicy&& policy, ForwardIt first, ForwardIt last, T init, BinaryOp op);

// C++17 β€” transform each element then fold
T transform_reduce(InputIt first, InputIt last, T init, BinaryReduceOp r, UnaryTransformOp t);
T transform_reduce(InputIt1 first1, InputIt1 last1, InputIt2 first2, T init); // dot product

// C++23 β€” range-based folds
auto ranges::fold_left(R&& r, T init, F op) -> T;
auto ranges::fold_right(R&& r, T init, F op) -> T;
auto ranges::fold_left_first(R&& r, F op) -> optional<T>;  // init = first element

Examples

Basic accumulation and the init-type trap

std::accumulate deduces its return type from init, not from the range element type. Passing integer 0 for a range of double silently truncates.

cpp
#include <numeric>
#include <vector>

std::vector<double> prices{1.5, 2.3, 0.8, 4.1};

// BAD: init is int β€” result is int (truncated to 8)
int wrong = std::accumulate(prices.begin(), prices.end(), 0);

// CORRECT: init matches the value type
double total = std::accumulate(prices.begin(), prices.end(), 0.0); // 8.7

Custom binary operation

cpp
#include <numeric>
#include <vector>
#include <string>

std::vector<std::string> words{"fold", "is", "powerful"};

// Right-fold simulation via reverse iterators
std::string sentence = std::accumulate(
    words.rbegin(), words.rend(), std::string{},
    [](const std::string& acc, const std::string& w) {
        return acc.empty() ? w : acc + " " + w;
    }
); // "powerful is fold"

std::reduce for parallel workloads (C++17)

std::reduce allows the runtime to reorder and group operations, enabling vectorisation and thread-level parallelism. The binary operation must be associative and commutative for results to be deterministic.

cpp
#include <numeric>
#include <execution>  // C++17
#include <vector>

std::vector<double> samples(10'000'000);
// ... fill samples ...

// Sequential β€” identical to accumulate
double seq_sum = std::reduce(samples.begin(), samples.end(), 0.0);

// Parallel β€” may use all available cores
double par_sum = std::reduce(
    std::execution::par_unseq,   // C++17 execution policy
    samples.begin(), samples.end(), 0.0
);

Subtraction and division are not commutative; using them with std::reduce produces undefined behaviour when an execution policy is active.

std::transform_reduce for map-reduce (C++17)

The canonical use case is a dot product or any weighted sum:

cpp
#include <numeric>
#include <execution>
#include <vector>

std::vector<double> weights{0.2, 0.5, 0.3};
std::vector<double> values{10.0, 20.0, 30.0};

// Dot product: sum(weights[i] * values[i])
double weighted = std::transform_reduce(
    std::execution::par,   // C++17
    weights.begin(), weights.end(),
    values.begin(),
    0.0  // init
);  // 0.2*10 + 0.5*20 + 0.3*30 = 21.0

// Unary form: sum of squares
std::vector<int> v{1, 2, 3, 4, 5};
int sum_sq = std::transform_reduce(
    v.begin(), v.end(), 0,
    std::plus<>{},
    [](int x){ return x * x; }
);  // 55

C++23 ranges::fold_left and fold_right

The C++23 ranges fold algorithms work directly on range objects and return the accumulated value without requiring explicit iterator pairs. fold_left_first uses the first element as the initial value, returning std::optional to handle empty ranges safely.

cpp
#include <algorithm>  // C++23
#include <ranges>
#include <vector>
#include <optional>

std::vector<int> nums{1, 2, 3, 4, 5};

// Left fold: ((((1+2)+3)+4)+5) = 15
int sum = std::ranges::fold_left(nums, 0, std::plus<>{}); // C++23

// Right fold: (1+(2+(3+(4+5)))) = 15
int rsum = std::ranges::fold_right(nums, 0, std::plus<>{}); // C++23

// fold_left_first: no explicit init; uses nums[0] as starting value
std::optional<int> maybe = std::ranges::fold_left_first(nums, std::plus<>{}); // C++23
int result = maybe.value(); // 15; safe β€” nums is non-empty

// Composes naturally with views
auto evens_sum = std::ranges::fold_left(
    nums | std::views::filter([](int x){ return x % 2 == 0; }), // C++20
    0, std::plus<>{}
); // 6

Best Practices

Prefer std::reduce over std::accumulate for arithmetic on large ranges. With std::execution::par_unseq, the compiler can auto-vectorise and distribute across threads. Reserve std::accumulate for non-commutative operations (e.g., string concatenation, matrix multiplication) or when exact left-to-right order is required.

Prefer std::ranges::fold_left in new C++23 code. It accepts ranges directly, composes with views pipelines, and expresses intent more clearly than iterator pairs. fold_left_first eliminates the need to manually extract front() as an initial value.

Match the init type explicitly. Template deduction derives the return type from init. For any fold over floating-point elements, always write 0.0, 0.0f, or T{} β€” never a bare integer literal.

Use std::transform_reduce instead of separate transform + accumulate. A single pass is cache-friendlier and enables the parallelism optimisations to span both phases simultaneously.

Common Pitfalls

Mutation in the binary operation. Both std::accumulate and std::reduce pass the accumulator by value. Capturing and mutating state outside the lambda is a data race when a parallel execution policy is active.

Non-commutative ops with std::reduce. Passing std::execution::par to std::reduce with subtraction or non-commutative user ops produces implementation-defined results. The standard allows arbitrary grouping of operations.

Empty range with fold_left_first. When the range is empty, std::ranges::fold_left_first returns std::nullopt. Always check the optional before using the value.

Overflow in the accumulator. If the element type is int and the sum exceeds INT_MAX, the accumulator overflows. Use a wider init type: std::accumulate(v.begin(), v.end(), 0LL) promotes the accumulator to long long.

See Also

  • std::partial_sum / std::inclusive_scan / std::exclusive_scan β€” prefix folds that emit one output per input element
  • std::inner_product β€” C++98 two-range fold equivalent to transform_reduce's two-iterator form
  • Execution policies (std::execution::par, std::execution::par_unseq) β€” C++17 parallel algorithm dispatch
  • C++17 fold expressions β€” language-level pack expansion over variadic templates using (... op args) syntax, a compile-time analogue to these runtime algorithms