What is std::linalg in C++?

C++26 BLAS-level 1/2/3 free functions operating on std::mdspan views for portable dense linear algebra without external library dependencies.

Which C++ standard introduced std::linalg?

std::linalg was introduced in C++26.

What is the difficulty level of std::linalg?

std::linalg is considered Advanced-level C++ material.

std::linalg

std::linalgsince C++26

A namespace of BLAS-compatible free functions introduced in C++26 under <linalg> that perform dense vector, matrix–vector, and matrix–matrix operations directly on std::mdspan views, supporting row-major, column-major, and strided memory layouts without heap allocation inside the algorithm layer.

Overview

Before C++26, the standard library offered only scattered numerical primitives: std::inner_product, std::valarray, and the <numeric> algorithms. Production code needing true linear algebra reached for BLAS/LAPACK C bindings, Eigen, Blaze, or similar third-party libraries. C++26 addresses this gap with std::linalg: a standardised, namespace-scoped set of functions that mirrors the structure of BLAS Levels 1, 2, and 3.

The entire design rests on std::mdspan (C++23), which provides a non-owning, layout-parameterised view over contiguous or strided memory. Every std::linalg function takes mdspan arguments and dispatches work according to the view's layout and accessor policies. Ownership stays in caller-managed storage; layout policy lives in the mdspan type; algorithms live in linalg. This layering makes composition possible without heap allocation inside any algorithm.

The library ships four zero-copy view adaptors that wrap an existing mdspan to express mathematical structure:

Adaptor	Effect
`std::linalg::transposed(A)`	Swaps row and column extents/strides
`std::linalg::conjugated(A)`	Returns complex conjugates through the accessor
`std::linalg::conjugate_transposed(A)`	Hermitian transpose (†)
`std::linalg::scaled(α, A)`	Multiplies every element access by scalar `α`

Triangular and symmetric structure is communicated via tag-type arguments rather than a separate type hierarchy: std::linalg::upper_triangle / lower_triangle selects which triangle is meaningful; std::linalg::explicit_diagonal / implicit_unit_diagonal controls whether the diagonal is stored or assumed to be 1. This avoids allocating a dense copy just to use a triangular solver.

Syntax

All functions live in namespace std::linalg after #include <linalg>. Include <mdspan> (C++23) explicitly — the standard does not guarantee it is pulled in transitively.

BLAS Level 1 — vector/vector operations

cpp

Godbolt

// Dot product: result = Σ x[i] * y[i]  (real types)          // C++26
auto r = std::linalg::dot(x, y);

// Conjugate dot product: result = Σ conj(x[i]) * y[i]         // C++26
auto rc = std::linalg::dotc(x, y);

// Euclidean 2-norm, accumulated from init                      // C++26
auto n = std::linalg::vector_two_norm(x, 0.0);

// Overflow-safe sum of squares (returns {scale, ssq})         // C++26
auto [sc, ssq] = std::linalg::vector_sum_of_squares(x, {1.0, 0.0});

// In-place scale: x *= alpha                                   // C++26
std::linalg::scale(alpha, x);

// z = x + y                                                    // C++26
std::linalg::add(x, y, z);

// z = alpha*x + y  (compose scaled adaptor with add)          // C++26
std::linalg::add(std::linalg::scaled(alpha, x), y, z);

BLAS Level 2 — matrix/vector

cpp

Godbolt

std::linalg::matrix_vector_product(A, x, y);           // y = A*x     C++26
std::linalg::matrix_vector_product(A, x, z, y);        // y = A*x + z C++26

std::linalg::symmetric_matrix_vector_product(
    A, std::linalg::upper_triangle, x, y);              // y = A*x (symmetric, upper triangle) C++26

std::linalg::triangular_matrix_vector_product(
    A, std::linalg::upper_triangle,
    std::linalg::implicit_unit_diagonal, x, y);         // y = A*x (unit upper triangular) C++26

BLAS Level 3 — matrix/matrix

cpp

Godbolt

std::linalg::matrix_product(A, B, C);                  // C = A*B      C++26
std::linalg::matrix_product(A, B, E, C);               // C = A*B + E  C++26

Matrix norms

cpp

Godbolt

auto f = std::linalg::matrix_frob_norm(A, 0.0);        // Frobenius    C++26
auto o = std::linalg::matrix_one_norm(A, 0.0);         // max col-sum  C++26
auto i = std::linalg::matrix_inf_norm(A, 0.0);         // max row-sum  C++26

Examples

Dot product and 2-norm of a flat array

cpp

Godbolt

#include <linalg>    // C++26
#include <mdspan>    // C++23
#include <array>
#include <print>     // C++23

int main() {
    std::array<double, 4> data{1.0, 2.0, 3.0, 4.0};
    std::mdspan x(data.data(), 4);                             // C++23, rank-1

    auto d = std::linalg::dot(x, x);                          // 1+4+9+16 = 30
    auto n = std::linalg::vector_two_norm(x, 0.0);            // sqrt(30)

    std::println("dot={}, norm={:.6f}", d, n);
}

General matrix–vector product (row-major)

cpp

Godbolt

#include <linalg>
#include <mdspan>
#include <vector>

void gemv_row_major() {
    // Stored row-major: A = [[1,2,3],[4,5,6],[7,8,9]]
    std::vector<double> a_data{1,2,3, 4,5,6, 7,8,9};
    std::vector<double> x_data{1, 0, -1};
    std::vector<double> y_data(3, 0.0);

    // layout_right is the default — row-major, matching C array convention  // C++23
    std::mdspan<double, std::extents<std::size_t, 3, 3>> A(a_data.data());
    std::mdspan<double, std::extents<std::size_t, 3>>    x(x_data.data());
    std::mdspan<double, std::extents<std::size_t, 3>>    y(y_data.data());

    std::linalg::matrix_vector_product(A, x, y);               // C++26 — y = A*x
    // y == {-2.0, -2.0, -2.0}
}

Column-major layout for BLAS interoperability

Traditional BLAS and Fortran routines use column-major storage. Use std::layout_left to model this without copying:

cpp

Godbolt

#include <linalg>
#include <mdspan>
#include <vector>

void gemm_column_major() {
    // Column-major 2×2: a = {1,3,2,4} → [[1,2],[3,4]]
    std::vector<double> a{1,3,2,4};
    std::vector<double> b{5,7,6,8};
    std::vector<double> c(4, 0.0);

    using Ext = std::extents<std::size_t, 2, 2>;
    std::mdspan<double, Ext, std::layout_left> A(a.data()); // C++23
    std::mdspan<double, Ext, std::layout_left> B(b.data());
    std::mdspan<double, Ext, std::layout_left> C(c.data());

    std::linalg::matrix_product(A, B, C);                    // C++26 — C = A*B
}

Zero-copy A^T A using `transposed`

cpp

Godbolt

#include <linalg>
#include <mdspan>

template<class InMat, class OutMat>
void compute_gram_matrix(InMat A, OutMat ATA) {
    // A^T * A without allocating a transposed copy — C++26
    std::linalg::matrix_product(
        std::linalg::transposed(A),
        A,
        ATA
    );
}

Best Practices

Wrap data in mdspan at API boundaries and carry the type through. Building an mdspan from a raw pointer at every call site loses extent and layout information that enables bounds reasoning. Accept and return mdspan in internal interfaces.

Match storage layout to your access pattern. std::layout_right (row-major, the default) is cache-friendly for row-wise traversal. If downstream code expects Fortran or BLAS column-major conventions, create layout_left spans at the construction site rather than inserting a transpose step.

Prefer view adaptors over materialising temporaries. std::linalg::transposed(A) is a zero-cost type-level operation; allocating a transposed copy just to pass it to a product call wastes memory and bandwidth.

Use vector_sum_of_squares over a manual dot for norm computation on extreme-valued data. The two-argument vector_two_norm accumulates Σ x[i]² before taking the square root, which overflows for large elements. vector_sum_of_squares uses a scale/ssq pair that avoids intermediate overflow.

Initialise output spans before passing them to the three-argument accumulating forms. matrix_product(A, B, E, C) computes C = A*B + E, reading from E. A zero-initialised output fed as E is the correct pattern for a plain product that also accumulates into an existing matrix.

Common Pitfalls

Aliasing input and output is undefined behaviour. Unlike std::copy, which explicitly handles overlap, every std::linalg function requires that no output span alias any input span. There is no diagnostic. Passing the same buffer as both x and y in matrix_vector_product(A, x, y) silently corrupts results.

Using dot instead of dotc for complex inner products. For std::complex element types, dot(x, y) computes Σ x[i] * y[i] without conjugation. The standard sesquilinear inner product used in physics and signal processing is dotc(x, y) = Σ conj(x[i]) * y[i]. Using dot with complex vectors produces a bilinear result that is typically wrong in this context.

Misinterpreting extent order for column-major spans. std::mdspan<T, std::extents<size_t, M, N>, std::layout_left> always means M rows and N columns — the extent order is (rows, cols) regardless of storage order. BLAS C wrappers often use (N, M) conventions inherited from Fortran; transposing this mental model when constructing spans is a common source of silent shape errors.

Assuming all linalg functions are available in constant-evaluated contexts. The standard marks functions constexpr where the element type permits it, but compile-time linalg support in standard library implementations is still maturing in early C++26 toolchains. Verify before relying on it in constexpr or consteval contexts.