Data-Parallel Types: Reduction
In this article, I will discuss reduction and mask reduction for data-parallel types.
Reduction
A reduction reduces the SIMD vector to a single element. The library provides three functions for this purpose: reduce, hmin, and hmax. The following program shows how these functions are used.
// reduction.cpp
#include <array>
#include <experimental/simd>
#include <functional>
#include <iostream>
#include <string_view>
namespace stdx = std::experimental;
void println(std::string_view name, auto const& a) {
std::cout << name << ": ";
for (std::size_t i{}; i != std::size(a); ++i)
std::cout << a[i] << ' ';
std::cout << '\n';
}
int main() {
const stdx::fixed_size_simd<int, 16> a([](int i) { return i; });
println("a", a);
auto sum = stdx::reduce(a);
std::cout << "sum: " << sum << "\n\n";
const stdx::fixed_size_simd<int, 8> b([](int i) { return i + 1; });
println("b", b);
auto product = stdx::reduce(b, std::multiplies<>());
std::cout << "product: " << product << "\n\n";
auto maximum = stdx::hmax(b);
std::cout << "maximum: " << maximum << "\n\n";
auto minimum = stdx::hmin(b);
std::cout << "minimum: " << minimum << "\n\n";
}
First, the reduce function is used. By default, the + operator is applied in the usual way. However, this function can also be parameterized with any binary operator. The expression stdx::reduce(b, std::multiplies<>()) applies the function object std::multiplies from the header functional. The functions hmax and hmin determine the maximum and minimum of the SIMD vector b. Finally, the program output follows:
Mask Reduction
When reducing a mask, the SIMD mask is reduced to either true or false value.
all_of, any_of, none_of, and some_of
Here we encounter some old friends from C++: all_of, any_of, none_of, and some_of. When using these functions, the SIMD mask is reduced to either true or false value.
cppreference.com has a nice example of these functions:
// reductionWithMask.cpp
#include <cassert>
#include <experimental/simd>
namespace stq = std::experimental;
int main()
{
using mask = stq::fixed_size_simd_mask<int, 4>;
mask mask1{false}; // = {0, 0, 0, 0}
assert
(
stq::none_of(mask1) == true &&
stq::any_of(mask1) == false &&
stq::some_of(mask1) == false &&
stq::all_of(mask1) == false
);
mask mask2{true}; // = {1, 1, 1, 1}
assert
(
stq::none_of(mask2) == false &&
stq::any_of(mask2) == true &&
stq::some_of(mask2) == false &&
stq::all_of(mask2) == true
);
mask mask3{true};
mask3[0] = mask3[1] = false; // mask3 = {0, 0, 1, 1}
assert
(
stq::none_of(mask3) == false &&
stq::any_of(mask3) == true &&
stq::some_of(mask3) == true &&
stq::all_of(mask3) == false
);
}
popcount
popcount determines how many values in a SIMD mask are true. A program can be written quickly to do this:
// popcount.cpp
#include <experimental/simd>
#include <iostream>
#include <string_view>
namespace stdx = std::experimental;
void println(std::string_view name, auto const& a) {
std::cout << std::boolalpha << name << ": ";
for (std::size_t i{}; i != std::size(a); ++i)
std::cout << a[i] << ' ';
std::cout << '\n';
}
int main() {
const stdx::native_simd<int> a = 1;
println("a", a);
const stdx::native_simd<int> b([](int i) { return i - 2; });
println("b", b);
const auto c = a + b;
println("c", c);
const stdx::native_simd_mask x = c < 0;
println("x", x);
auto cnt = popcount(x);
std::cout << "cnt: " << cnt << '\n';
}
Thanks to the program output, this should be self-explanatory.
find_first_set and find_last_set
The following program demonstrates the two functions in use:
// find_first_set.cpp
#include <experimental/simd>
#include <iostream>
#include <string_view>
namespace stdx = std::experimental;
void println(std::string_view name, auto const& a) {
std::cout << std::boolalpha << name << ": ";
for (std::size_t i{}; i != std::size(a); ++i)
std::cout << a[i] << ' ';
std::cout << '\n';
}
int main() {
stdx::simd_mask<short> x{0};
println("x", x);
x[1] = true;
x[x.size() - 1] = true;
println("x", x);
auto first = stdx::find_first_set(x);
std::cout << "find_first_set(x): " << first << '\n';
auto last = stdx::find_last_set(x);
std::cout << "find_last_set(x): " << last<< '\n';
}
Finally, here is the output of the program:
What’s next?
In my next article, I will focus on the algorithms of data-parallel types.