SC20 C++20 BoF
Michael Wong (Codeplay)
Bjarne Stroustrup (Morgan Stanley)
[date] • [Location]
SYCL WG
State of the Union 2020
Michael Wong
SYCL WG Chair
Codeplay VP of R&D
ISOCPP Director & VP
ISO C++ Directions Group Chair
michael@codeplay.com | wongmichael.com/about
SYCL Present and Future Roadmap (May Change)
2011: OpenCL 1.2 (OpenCL C kernel language)
2015: OpenCL 2.1 (SPIR-V in core); SYCL 1.2 (C++11 single-source programming)
2017: OpenCL 2.2; SYCL 1.2.1 (C++11 single-source programming)
2020: OpenCL 3.0; SYCL 2020 (C++17 single-source programming, many backend options)
2021-????: SYCL 2021-? (C++20 single-source programming, many backend options)
ISO C++ timeline: C++11, C++14, C++17, C++20, C++23
SYCL community is vibrant
SYCL-1.2.1
2X growth
SYCL 2020 Potential Features
Generalization (a.k.a the Backend Model) presented by Gordon Brown
Unified Shared Memory (USM) presented by James Brodman
Improvement to Program class Modules presented by Gordon Brown
Host Task with Interop presented by Gordon Brown
In order queues, presented by James Brodman
SYCL 2020 compared with SYCL 1.2.1
Easier to integrate with C++17 (CTAD, Deduction Guides...)
Less verbose, smaller code size, simplify patterns
Backend independent
Multiple object archives aka modules simplify interoperability
Ease porting C++ applications to SYCL
Enable capabilities to improve programmability
Backwards compatible but minor API break based on user feedback
SYCL Evolution
2017: SYCL 1.2.1
• Improving Software Ecosystem: tools, libraries, GitHub
• Expanding Implementation: DPC++, ComputeCpp, triSYCL, hipSYCL
• Regular Maintenance Updates: spec clarifications, formatting and bug fixes
https://www.khronos.org/registry/SYCL/
Target 2020: Provisional Q3, then Final Q4
• Selected Extension Pipeline aiming for SYCL 2020 Provisional Q3: Reduction, Subgroups, Accessor simplification, Atomic rework, Extension mechanism, Address spaces, Vector rework, Specialization Constants
• Integration of successful Extensions plus new Core functionality
• Converge SYCL with ISO C++ and continue to support OpenCL to deploy on more devices: CPU, GPU, FPGA, AI processors, Custom Processors
Repeat the cycle every 1.5-3 years
SYCL 2020 Roadmap (WIP, MAY CHANGE)
SYCL Ecosystem, Research and Benchmarks
Active Working Group Members
Benchmarks
SYCL-BLAS
Linear Algebra
Libraries
Implementations
SYCL-DNN
Machine Learning
Libraries and Parallel
Acceleration Frameworks
Research
SYCL-MLEigen
RSBench
SYCL Parallel STL
oneAPI
oneMKL
SYCL, Aurora and Exascale computing
SYCL can
run on AMD
ROCM
SYCL 2020 Provisional is here
• SYCL 2020 provisional is released and final in Q4
• We need your feedback asap
• https://app.slack.com/client/TDMDFS87M/CE9UX4CHG
• https://community.khronos.org/c/sycl
• https://sycl.tech
• What features are you looking for in SYCL 2020?
• What feature would you like to aim for in future SYCL?
• How do you join SYCL?
Engaging with the Khronos SYCL Ecosystem
SYCL Working Groups ($)
Khronos members under Khronos NDA and IP Framework participate and vote in working group meetings. Starts at $3.5K/yr.
https://www.khronos.org/members/
https://www.khronos.org/registry/SYCL/
SYCL Advisory Panels ($0)
Invited Advisors under the Khronos NDA and IP Framework can comment and contribute to requirements and draft specifications
https://www.khronos.org/advisors/
Contribute to SYCL open source specs, CTS, tools and ecosystem ($0)
Spec fixes and suggestions made under the Khronos IP Framework. Open source contributions under repo's CLA, typically Apache 2.0
https://github.com/KhronosGroup
https://github.com/KhronosGroup/SYCL-CTS
https://github.com/KhronosGroup/SYCL-Docs
https://github.com/KhronosGroup/SYCL-Shared
https://github.com/KhronosGroup/SYCL-Registry
https://github.com/KhronosGroup/SyclParallelSTL
Khronos SYCL Forums, Slack Channels, stackoverflow, reddit, and SYCL.tech ($0)
Open to all!
https://community.khronos.org/www.khr.io/slack
https://app.slack.com/client/TDMDFS87M/CE9UX4CHG
https://community.khronos.org/c/sycl/
https://stackoverflow.com/questions/tagged/sycl
https://www.reddit.com/r/sycl
https://github.com/codeplaysoftware/syclacademy
https://sycl.tech/
Any member or non-member can propose a new SYCL feature or fix
Thank You!
• Khronos SYCL is creating a cutting-edge, royalty-free open standard
• For C++ heterogeneous compute, vision, and inferencing acceleration
• Information on Khronos SYCL Standards: https://www.khronos.org/sycl
• Any entity/individual is welcome to join Khronos SYCL: https://www.khronos.org/members
• Join the SYCLCon Tutorial Monday and Wednesday Live panel : Wednesday Apr 29 15:00-18:00 GMT
• Have your questions answered live by a group of SYCL experts
• Michael Wong: michael@codeplay.com | wongmichael.com/about
Benefits of Khronos membership
• Gather industry requirements for future open standards
• Draft Specifications Confidential to Khronos members
• Publicly Release Specifications and Conformance Tests
• Gain early insights into industry trends and directions
• Influence the design and direction of key open standards that will drive your business
• Accelerate your time-to-market with early access to specification drafts
• Network with domain experts from diverse companies in your industry
• State-of-the-art IP Framework protects your Intellectual Property
• Enhance your company reputation as an industry leader through Khronos participation
1. C++ 20
2. C++20 Big Features
3. C++20 Parallel Concurrency
4. Backup: C++20 complete feature list
Agenda
C++: evolving towards greater elegance and power
C++98 C++11 C++14 C++17 C++20
Inheritance
Exception
Templates
STL
Auto
Lambda
Concurrency
Move
Futures
Template
deduction
Compile
time if
Parallel
STL
Modules
Concept
Coroutine
Ranges
1998 2011 2014 2017 2019 2020
“the committee”
Stroustrup - C++20 - Aarhus 2020 14
2011
2014
1990
2017
C++20 is here: wg21.link/N4861
C++20
• Major language features
– Modules
– Concepts
– Coroutines
– Improved compile-time programming support
• Major standard-library components
– Ranges
– Dates
– Formats
– Parallel algorithms
– Span
• Many minor language features and standard-library components
• A dense web of interrelated mutually-supporting features
• Most is shipping already
By “major”
I mean
“changes
how we think”
C++20
• Implementations are fast improving
– Keep an eye on new compiler and standard-library releases
– Many/most features are already shipping
• Shipping in GCC, Clang, Microsoft
– Modules (still experimental in GCC)
– Concepts
– Coroutines (still experimental in GCC)
– …
• Available on GitHub
– Ranges
– Dates
– Format
– Span
– …
Not Science Fiction
Making the preprocessor redundant
• Templates
– Concepts
• Type deduction
• Compile-time computation
– Constexpr and consteval functions
– Static reflection (not yet)
– Type traits
• Modules
• std::source_location
• Contracts (not yet)
#define
#include
#ifndef
assert()
__FILE__
Compile-time computation
• Not just built-in types
– Examples from <chrono>
cout << weekday{June/21/2016}; // Tuesday
static_assert( weekday{June/21/2016}==Tuesday ); // At compile time
static_assert(sys_days{2020y/February/Tuesday[last]} == sys_days{February/25/2020}); // true
auto tp = system_clock::now();
cout << tp; // 2019-11-14 10:13:40.785346
cout << zoned_time{current_zone(),tp}; // 2019-11-14 11:13:40.785346 CET
Howard Hinnant
Compile-time computation: invisible
• Compile-time computation tends to be invisible
auto z = sqrt(3.0+2.7i); // call sqrt(complex<double>)
auto d = 5min+10s+200us+300ns; // a duration
auto s = "This is not a pointer to char"s; // a string
// implementations (simplified):
constexpr complex<double> operator""i(long double d) { return {0.0, static_cast<double>(d)}; }
constexpr seconds operator""s(unsigned long long s) { return seconds{s}; }
constexpr string operator""s(const char* str, size_t len) { return {str, len}; }
Template argument type deduction
• Help avoid range errors
int a[128];
// … fill a …
std::span s {a};
for (const auto x : s) cout << x << ' ';
• Help avoid data races
void do_something()
{
std::scoped_lock lck {mut1,mut2};
// … Manipulate shared data …
}
No repeated element type
No repeated array bound
No repeated mutex type
No mutex order dependence
Compile-time computation
Daveed Vandevoorde Michael Wong
constexpr functions
consteval functions
Template aliases
UDL
Concepts
Type deduction
Template aliases
Type traits Gabriel Dos Reis
Michael Spertus
1. C++ 20
2. C++20 Big Features
3. C++20 Parallel Concurrency
4. Backup: C++20 complete feature list
Agenda
Modules
• Order dependence
#include "a.h"
#include "b.h"
• Can be different from
#include "b.h"
#include "a.h"
• #include is textual inclusion
• Implies
– #include is transitive
– much repeated compilation
– Subtle bugs
• Modularity
import a;
import b;
• Same as
import b;
import a;
• import is not transitive
• Implies
– Much compilation can be done once
only
Finally!
Modules
• Better code hygiene: modularity (especially protection from macros)
– => Faster compile times (hopefully factors rather than percent)
export module map_printer; // we are defining a module
import iostream;
import containers;
using namespace std;
export
template<Sequence S>
void print_map(const S& m) {
for (const auto& [key,val] : m) // break out key and value
cout << key << " -> " << val << '\n';
}
Compile speeds
Modularity and transition
• Support for getting from the #include world to the import world
– Global module
– Modular headers
– Module partitions
• A module need not be in a single source file
module;
#include "xx.h" // to global module
export module C;
import "a.h"; // "modular headers"
import "b.h";
import A;
export int f() { … }
• Not yet: a modular standard library
– Versions exist, but not yet in the ISO standard
– I hope for import std;
Potentially
holding back progress
Here,
#include works as ever
Modules and transition
• Source organization
• Header file conversion
– Header and module coexistence
• Build systems
– Build2
– Cmake prototype
Nathan Sidwell
Gabriel Dos Reis
Richard Smith
Generic programming:
The backbone of the C++ standard library
• Containers
– vector, list, stack, queue, priority_queue, ...
• Ranges
• Algorithms
– sort(), partial_sort(), is_sorted(), merge(), find(), find_if(),...
– Most with parallel and vectorized versions
• Concurrency support (type safe)
– Threads, locks, futures, ...
• Time
– time_points, durations, calendars, time_zones
• Random numbers
– distributions and engines (lots)
• Numeric types and algorithms
– complex
– accumulate(), inner_product(), iota(), ...
• Strings and Regular expressions
• Formats
RAII
Type deduction
Parameterized
Types and algorithms
Generic Programming
• Write code that works for all suitable argument types
– void sort(R); // pseudo declaration
• R can be any sequence with random access
• R’s elements can be compared using <
– E* find_if(R,P); // pseudo declaration
• R can be any sequence that you can read from sequentially
• P must be a predicate on R’s element type
• E* must point to the found element of R if any (or one beyond the end)
• That’s what the standard says
– “our job” is to tell this to the compiler
– C++20 enables that
Generic Programming with C++20 Concepts
• Write code that works for all suitable argument types
void sort(Sortable_range auto& r);
vector<string> vs;
// … fill vs …
sort(vs);
array<int,128> ai;
// … fill ai …
sort(ai);
Implicit:
• Type of container
• Type of element
• Number of elements
• Comparison criteria
A concept:
• Specifies what is required of r’s type
vector v {1,2,3};
Pre-C++20:
sort(begin(v), end(v));
C++20:
ranges::sort(v);
auto answer { v | views::transform([](int i) { return to_string(i); }) };
// "1", "2", "3"
// can also reverse, drop, filter
Generic Programming with C++20 Ranges
Generic Programming: Concept with Ranges
• Write code that works for all suitable argument types
– Many/most algorithms have more than one template argument type
– We need to express relationships among template arguments
template<input_range R, indirect_unary_predicate<iterator_t<R>> Pred>
iterator_t<R> ranges::find_if(R&& r, Pred p);
list<int> lsti;
// … fill lsti …
auto p = find_if(lsti, greater_than{7});
vector<string> vs;
// … fill vs …
auto q = find_if(vs, [](const string& s) { return has_vowels(s); });
<ranges>
Overloading
• Overloading based on concepts
void sort(Forward_sortable_range auto&);
void sort(Sortable_range auto&);
void some_code(vector<int>& vec, list<int>& lst)
{
sort(lst); // sort(Forward_sortable_range auto&)
sort(vec); // sort(Sortable_range auto&)
}
• We don’t have to say
– “Sortable_range is stricter/better than Forward_sortable_range”
– we compute that from their definitions
Design principles:
• Don’t force the user to do
what a machine does better
• Zero overhead compared
to unconstrained templates
Concepts
• A concept is a compile-time predicate
– A function run at compile time yielding a Boolean
– Often built from other concepts
template<typename R>
concept Sortable_range =
random_access_range<R> // has begin()/end(), ++, [], +, …
&& sortable<iterator_t<R>>; // can compare and swap elements
template<typename R>
concept Forward_sortable_range =
forward_range<R> // has begin()/end(), ++; no [] or +
&& sortable<iterator_t<R>>; // can compare and swap elements
There are libraries of concepts
<ranges>: random_access_range and sortable
Concepts
• A concept is a compile-time predicate
– A function runs at compile time yielding a Boolean
– One or more arguments
– Can be built from fundamental language properties: use patterns
template<typename T, typename U = T>
concept equality_comparable = requires(T a, U b) {
{a==b} -> convertible_to<bool>;
{a!=b} -> convertible_to<bool>;
{b==a} -> convertible_to<bool>;
{b!=a} -> convertible_to<bool>;
};
There are libraries of concepts
<concepts>: equality_comparable
Types and concepts
• A type
– Specifies the set of operations that can be applied to an object
• Implicitly and explicitly
• Relies on function declarations and language rules
– Specifies how an object is laid out in memory
• A single-argument concept
– Specifies the set of operations that can be applied to an object
• Implicitly and explicitly
• Relies on use patterns
– reflecting function declarations and language rules
– Says nothing about the layout of the object
Ideal:
Use concepts where we now use types,
except for defining layout
Generic Programming is “just” programming
• Why?
– From 1988 to now “template programming” and “ordinary programming” have been very different
• Different syntax
• Different look-up rules
• Different source code organization
• “Expert friendly” programming techniques
– We don’t need two different sets of techniques (and notations)
• Unnecessary complexity
• Make simple things simple!
– “ordinary programming” is expressive and familiar
Make simple things simple!
Do so through generalization
Generic Programming
• will change the way we think about Programming
C++20 Big changes: good for HPC workload
•Concepts: Reduces errors, increases expressiveness; less hacking
•Modules: Better code hygiene; much better compile times
•Ranges: Improved notation, better pipelining
•Coroutines: Better cooperative functions, better concurrency control
•Subtle big changes:
• compile-time computation is improving
• type deduction is simplifying notation
1. C++ Projects
2. C++20 Big Features
3. C++20 Parallel Concurrency
4. Backup: C++20 complete feature list
Agenda
• cooperative cancellation of threads
• new synchronization facilities
• updates to atomics with atomic_ref
• coroutines
C++20 asynchronous, concurrency, parallelism, heterogeneous
programming
Parallel/concurrency before C++11 (C++98)
 | Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous (GPUs, accelerators, FPGA, embedded AI processors)
summary | tasks that run independently and communicate via messages | operations on groups of things, exploit parallelism in data and algorithm structures | avoid races and synchronizing objects in shared memory | Dispatch/offload to other nodes (including distributed)
examples | GUI, background printing, disk/net access | trees, quicksorts, compilation | locked data (99%), lock-free libraries (wizards), atomics (experts) | Pipelines, reactive programming, offload, target, dispatch
key metrics | responsiveness | throughput, many core scalability | race free, lock free | Independent forward progress, load-shared
requirement | isolation, messages | low overhead | composability | Distributed, heterogeneous
today's abstractions | POSIX threads, win32 threads, OpenCL, vendor intrinsic | openmp, TBB, PPL, OpenCL, vendor intrinsic | locks, lock hierarchies, vendor atomic instructions, vendor intrinsic | OpenCL, CUDA
Parallel/concurrency for C++11, 14, 17, C++20
Abstractions from C++11, 14, 17, 20
Asynchronous Agents
• C++11: thread, lambda function, TLS, async
• C++20: jthread + stop_token (proposed as interrupt_token), coroutines
Parallel collections
• C++11: packaged tasks, promises, futures
• C++17: ParallelSTL, control false sharing
• C++20: Vec execution policy, Algorithm un-sequenced policy
Mutable shared state
• C++11: locks, memory model, mutex, condition variable, atomics, static init/term
• C++14: shared_lock/shared_timed_mutex, OOTA, atomic_signal_fence
• C++17: scoped_lock, shared_mutex, ordering of memory models, progress guarantees, TOE, execution policies
• C++20: atomic_ref, Latches and barriers, atomic<shared_ptr>, Atomics & padding bits, Simplified atomic init, Atomic C/C++ compatibility, Semaphores and waiting, Fixed gaps in memory model, Improved atomic flags, Repair memory model
Heterogeneous/Distributed
• C++11: lambda
• C++14: generic lambda
• C++17: progress guarantees, TOE, execution policies
• C++20: atomic_ref
Atomic_ref
atomic_ref <T>
• std::atomic_ref allows you to
perform atomic operations on
non-atomic objects.
• This can be important when sharing
headers with C code, or where a
struct needs to match a specific
binary layout so you can’t use
std::atomic, or if you have distinctive
non-atomic parts of your program
and you only need to do atomic
access in a few places
• In these cases atomic_ref<T> is a
better fit than atomic<T>: the object
keeps its plain type and layout, and
only the accesses that need it are atomic
• While any std::atomic_ref to an
object exists, all accesses to that object
must go through std::atomic_ref.
struct my_c_struct{
int count;
data* ptr;
};
void do_stuff(my_c_struct* p){
std::atomic_ref<int> count_ref(p->count);
++count_ref;
// ...
}
coroutines
• Current futures and promises are
eager and closed
• We aim to move to lazy and open
• Executors switching to a sender
receiver model (similar to promise
futures but without the blocking
synchronization, using lazy
continuation)
• All Awaitables are Senders (some
senders are awaitable)
• Coroutines are receivers
• Can layer Eager operations on top
of lazy
• Futures model problem: shared state
racing between producer completing
and consumer attaching continuation
• synchronization, heap, type erasure
• Returning handle to eagerly started
concurrent operations (e.g. future) is
returning an obligation for the caller to
manually join that operation
• Concurrency resource created by the call
• Returns handle to concurrent operation
e.g. future <T>
• Concurrency resource must be released
by joining
• Join needs to be asynchronous
• But destructors cannot be asynchronous
so can’t use it to join automatically
The future of parallelism is lazy
coroutines
• Not Pre-emptive but cooperative
• A coroutine is a function that can be
suspended mid execution and resumed
at a later time.
• Resuming a coroutine continues from
the suspension point;
• local variables have their values from
the original call
• C++20 provides stackless coroutines
• Only the locals for the current function
are saved
• Everything is localized
• Minimal memory allocation — can have
millions of in-flight coroutines
• Whole coroutine overhead can be
eliminated by the compiler — Gor’s
“disappearing coroutines”
future<remote_data>
async_get_data(key_type key);
future<data> retrieve_data(
key_type key){
auto rem_data=
co_await async_get_data(key);
co_return process(rem_data);
}
Cooperative instead of preemption
•co_await to suspend execution
until resumed
•co_yield to suspend +returning a
value
•co_return to complete+return
value
• You can’t tell from the signature
• Only if body uses the special
keywords
• Just an implementation detail
• Dangling references
• No plain or placeholder return yet
• No std::generator<T>
• No constexpr, constructor,
destructors, main as coroutines
• Not a replacement for callbacks, as
coroutines cannot be overloaded yet
future<data> retrieve_data(
key_type key){
auto rem_data=
co_await
async_get_data(key);
co_return process(rem_data);
}
future<remote_data>
async_get_data(key_type
key){
// code
co_yield rem_data;
}
CALL
RETURN
SUSPEND
AWAIT
RESUME
Like Callback but not a replacement
•co_await to suspend execution
until resumed
•co_yield to suspend +returning a
value
•co_return to complete+return
value
• You can’t tell from the signature
• Only if body uses the special
keywords
• Just an implementation detail
• Dangling references
• No plain or placeholder return yet
• No std::generator<T>
• No constexpr, constructor,
destructors, main as coroutines
• Not a replacement for callbacks, as
coroutines cannot be overloaded yet
future<data> retrieve_data(
key_type key){
auto rem_data=
co_await
async_get_data(key);
//callback code
co_return process(rem_data);
}
future<remote_data>
async_get_data(key_type
key){
// code
co_yield rem_data;
}
CALL
RETURN
SUSPEND
REGISTER
CALLBACK
CALL
CALLBACK
Thread1Thread2
More for HPC
In C++20
• Better lambda
• atomic_ref
• [[likely]][[unlikely]]
• Better constexpr
• Better Class Template
Argument Deduction (CTAD)
• Better library
• Span
• NTTP
• Calendars
• constinit
• [[nodiscard("reason")]]
• …
In future C++
• Linear Algebra
• executors
• machine learning
• data affinity
• data layout
• data locality
• data movement
• mdspan
• mdarray
• Good use of hardware
• originally from C
• CPU, GPU, FPGA, AI/ML chips, …
• Zero-overhead abstraction
• originally from Simula
• performant libraries
• simplified programming
• control of complexity
“if you don’t need the right answer, I can make it as fast as you like”
“if you can afford to waste 98% of your CPU, I can make programming much simpler”
C++
• C++20 is a major release, maybe even bigger than C++11
• Less verbose code
• Solves Error Novel problem with Concepts
• Solves Constant Recompilation Problem with Modules
• Improves STL with Ranges
• Better Lazy Cooperative function Control with Coroutine and atomic_ref
• Works well to improve HPC workloads to make them
• compile faster,
• safer,
• do more at compile time,
• less verbose and
• run faster.
C++20 Take Away
• C++ Projects
• C++20 Big Features
• C++20 Parallel Concurrency
• Backup: C++20 complete feature list
Agenda
Status after Feb Prague C++ Meeting
ISO number | Name | Status | What is it? | C++20?
ISO/IEC TS 19217:2015 | C++ Extensions for Concepts | Published 2015-11-13. (ISO Store). Final draft: n4553 (2015-10-02). Current draft: p0734r0 (2017-07-14). Merged into C++20 (with modifications). | Constrained templates | Merged into C++20, including abbreviated function templates!
 | Executors | | Abstraction for where/how code runs in a concurrent context | Not headed for C++20, now retargeted for C++23
 | Coroutines | | Resumable functions, based on stackless await design | Published! Merged into C++20
 | Reflection TS | PDTS ballot done. Approved for publication as a TS. | Static code reflection mechanisms |
 | Reflection V2 | | A value-based constexpr version of the Reflection TS | Aiming for C++23
SG14 | Lightweight Exceptions | In progress | |
Status after Feb Prague C++ Meeting
ISO number | Name | Status | What is it? | C++20?
ISO/IEC TS 19571:2016 | C++ Extensions for Concurrency TS | Published 2016-01-19. (ISO Store) Final draft: p0159r0 (2015-10-22) | Improvements to future, latches and barriers, atomic smart pointers | Latches, atomic<shared_ptr<T>> merged into C++20. Already in Visual Studio release and Anthony Williams Just Threads! and waiting for subsequent usage experience. Withdrawn as some parts (latches, atomic<shared_ptr<>>) are now in C++20
ISO/IEC DTS 21425:2017 | Ranges TS | Published 2017-12-05. (ISO Store) Draft: n4685 (2017-07-31) | Range-based algorithms and views | Merged in C++20
ISO/IEC TS 19216:2018 | Networking TS | Published 2018-04-24. (ISO Store) Draft n4734 (2017-04-04). Latest draft: n4771 (2018-10-08) | Sockets library based on Boost.ASIO | Published. Not headed to C++20.
ISO/IEC TS 21544:2018 | Modules V1 | Published 2018-05-16. (ISO Store) Final Draft n4720 (2018-01-29) | A component system to supersede the textual header file inclusion model | Published as a TS
SG2 | Modules V2 | | Improvements to Modules v1, including a better transition path | Merged into C++20
SG21 | Contracts | | Pre and post conditions | Removed from C++20. Reset as SG21
Status after Feb Prague C++ Meeting
ISO number | Name | Status | What is it? | C++20?
ISO/IEC DTS 19568:xxxx | Numerics TS | Early development. Draft p0101 (2015-09-27) | Various numerical facilities | Under active development
ISO/IEC DTS 19571:xxxx | Concurrency TS 2 | Early development | Exploring lock-free, hazard pointers, RCU, atomic views, concurrent data structures, fibers. Deprecate volatile, add volatile_load/store, TLS? | Under active development. Possible new clause
ISO/IEC TS 19570:2018 | Parallelism TS 2 | Published 2018-11-15. (ISO Store). Draft: n4773 (2018-10-08) | task blocks, progress guarantees, SIMD<T>, vec, no_vec loop-based execution policy | Published. SIMD<T>, progress guarantees, loop-based execution policy are headed into C++23
ISO/IEC DTS 19841:xxxx | Transactional Memory TS 2 | Early development | Exploring simplified atomic model of only memory updates. | Under active development.
ISO/IEC DTS 19568:xxxx | Graphics TS | Early development. Draft p0267r8 (2018-06-26) | 2D drawing API using Cairo interface, adding stateless interface | Restarted after being shut down.
ISO/IEC DTS 19568:xxxx | Library Fundamentals V3 | Initial draft, early development | Generic scope guard and RAII wrappers | Under development
Status after Feb Prague C++ Meeting
ISO number | Name | Status | What is it? | C++20?
SG14 | Linear Algebra | SG14 SIG, LEWG | Wrapper on BLAS and a C++-based proposal, separated into 3 layers | Under active development. Aiming for C++23
SG19 | Machine Learning | SG19 SIG | Improve C++ for ML, AI, DNN, Statistics, Differential Calculus, Data structure, Graph programming | Under active development. Aiming for C++23
SG16 | Pattern Matching | SG16 WIP | A match-like facility for C++ (WIP) | Under active development. Aiming for C++23
SG12 | Undefined Behaviour/Safety Critical | SG12 WIP | Optimizations that cause UB. Pointer provenance, signed integer overflow. Validate external C++ Safety APIs: MISRA, AUTOSAR | Under active development. Aiming for C++23
SG20 | Education | SG20 WIP | Support educating C++, especially new features | Under active development. Aiming for C++23
SG13 | Audio | SG13 HMI WIP | Audio drivers | Under active development. Aiming for C++23
SG16 | Unicode | SG16 WIP | Compile-time regular expression, source code info capture, charset transcoding | Under active development. Aiming for C++23
SG15 | Tooling Ecosystem | SG15 WIP | Build systems, debug and tools for Modules | Under active development. Aiming for C++23 TR
C++ 20 Language Features
• Removal of Contracts
• Class template argument deduction for
aggregates
• Class template argument deduction for alias
templates
• Mitigating minor modules maladies.
• Relaxing redefinition restrictions for
re-exportation robustness
• Recognizing header unit imports requires full
preprocessing.
• Using unconstrained template template
parameters with constrained templates
• On the non-uniform semantics of
return-type-requirements
• Non-type template parameters are incomplete
without floating-point types
• Inline namespaces: fragility bites.
• rethrow_exception must be allowed to copy
•Adding the constinit keyword
•Permitting trivial default initialization in
constexpr contexts
•Enabling constexpr intrinsics by permitting
unevaluated inline assembly in constexpr
functions
•More constexpr containers.
• When do you actually use <=>?
• Spaceship needs a tune-up
•using enum.
•Additional contexts for implicit move
construction.
•Conditionally trivial special member functions
•[[nodiscard("should have a reason")]]
•[[nodiscard]] for constructors.
•Deprecate uses of the comma operator in
subscripting expressions.
•Deprecating volatile
•Interaction of memory_order_consume with
release sequences
More C++ 20 Language Features
• Most notably, the Concepts Technical
Specification has been merged into
C++20!
• Template parameter lists for generic
lambdas.
• Designated initializers.
• Lambda capture [=, *this]
• A __VA_OPT__ macro to make variadic
macros easier to use.
• Default member initializers for bitfields
• A tweak to C++17’s constructor
template argument deduction rules
• Fixing const-qualified pointers to
members
• The most significant new feature voted in was
operator<=>,
• Range-based for statements with initializer.
• Lambdas in unevaluated contexts.
• Default constructible and assignable stateless
lambdas.
• Simplifying implicit lambda capture.
• Fixing small functionality gaps in constraints.
• Deprecating the notion of “plain old data”
(POD).
• Access checking on specializations.
• const mismatch with defaulted copy
constructor.
• ADL and function templates that are not visible.
• Core issue 1581: when are constexpr member
functions defined?
More C++20 Language Features
• Language support for empty objects
• Relaxing the structured bindings
customization point finding rules.
• Structured bindings to accessible
members.
• Allow pack expansion in lambda
init-capture.
• Symmetry for <=>
• Likely and unlikely attributes
• Down with typename!
• Relaxing range-based for loop’s
customization point finding rules
• Support for contract-based
programming in C++20
•Class types in non-type template
parameters.
•Allowing virtual function calls in
constant expressions.
•Prohibit aggregates with
user-declared constructors.
•Efficient sized deletion for
variable-sized classes.
More C++ 20 Language Features
• Abbreviated function templates
(AFTs).
• Improvements to
return-type-requirements.
• Immediate functions.
• std::is_constant_evaluated()
• try / catch blocks in constexpr
functions.
• Allowing dynamic_cast and
polymorphic typeid in constant
expressions.
• Changing the active member of a
union inside constexpr
• char8_t: a type for UTF-8 characters
and strings.
• Access control in contract
conditions.
• Revising the C++ memory model.
• Weakening release sequences.
• Nested inline namespaces
• Signed integers are two’s
complement
• Consistency improvements for <=>
and other comparison operators.
• Conditionally explicit constructors,
a.k.a. explicit(bool).
• Deprecate implicit capture of this
via [=].
• Integrating feature-test macros
into the C++ working draft.
• A tweak to the rules about when
certain errors related to a class
being abstract are reported.
• A tweak to the treatment of
padding bits during atomic
compare-and-exchange
operations.
• Tweaks to the __VA_OPT__
preprocessor feature.
• Updating the reference to the
Unicode standard.
More C++20 Language Features
•Modules!
•Merging the Coroutines TS into
C++20
•Allow initializing aggregates from
a parenthesized list of values
•<=> != ==, an important fix to the
default comparisons design.
•Extending structured bindings to be
more like variable declarations.
•Reference capture of structured
bindings.
•Contract postconditions and return type
deduction.
•Array size deduction in
new-expressions. This is also a Defect
Report against previous versions of C++.
•Contra CWG DR1778 (a bugfix related to
noexcept and explicitly defaulted
functions).
•Make char16_t/char32_t string literals
be UTF-16/32.
C++20 Library Features
•constexpr INVOKE
•Movability of single-pass iterators
•basic_istream_view::iterator should not
be copyable
•Layout-compatibility and
pointer-interconvertibility traits
•Remove dedicated precalculated hash
lookup interface.
•Miscellaneous minor fixes for chrono
•char8_t backward compatibility
remediation
•bind_front should not unwrap
reference_wrapper
•Iterator difference type and integer
overflow
•Helpful pointers for ContiguousIterator
•Views and size types
•Exposing a narrow contract for ceil2
•constexpr feature macro concerns
•The C++20 Synchronization Library
•Input range adaptors
•Making std::vector constexpr
•Making std::string constexpr
•Stop token and joining thread
•Adopt source_location for C++20.
•Rename concepts to standard_case for
C++20, while we still can.
•The mothership has landed (adding <=> to the library)
•Standard library header units for C++20
•to_array from LFTS with updates
•Bit operations
•Math constants
•Efficient access to basic_stringbuf's
buffer
•Text formatting.
•Integration of <chrono> with text
formatting
•printf corner cases in std::format
•Output std::chrono::days with 'd' suffix
More C++20 Library Features
•Support for detecting
endianness programmatically
•Repairing elementary string
conversions (also a Defect
Report)
•Improvements to the
integration of C++17 class
template argument deduction
into the standard library (also a
Defect Report)
•Extending make_shared to
support arrays
•Transformation trait remove_cvref
•Treating unnecessary decay
•Using nodiscard in the standard library
•Make std::memory_order a scoped
enumeration
•Synchronized buffered ostream
•A utility to convert pointer-like objects to raw
pointers
•Add constexpr modifiers to functions in
<algorithm> and <utility> headers.
•constexpr for std::complex
•Atomic shared_ptr
•Floating-point atomics
•De-pessimize legacy <numeric> algorithms
with std::move
String prefix and suffix checking, i.e.
More C++20 Library Features
•calendar and timezone library.
•std::span
•<version> header
•Tweak on how unordered
containers are compared
•string::reserve() should not shrink
•User specializations of function
templates in namespace std
•Manipulators for C++
synchronized buffer ostream
•constexpr iterator requirements
• The most notable addition at this meeting
was standard library Concepts.
• atomic_ref
• Bit-casting object representations
• Standard library specification in a Concepts
and Contracts world
• Checking for the existence of an element in
associative containers
• Add shift() to <algorithm>
• Implicit conversion traits and utility functions
• Integral power-of-2 operations
• The identity metafunction
• Improving the return value of erase()-like
algorithms
• constexpr comparison operators for
std::array
• constexpr for swap and related functions
• fpos requirements
• Eradicating unnecessarily explicit default
constructors
• Removing some facilities that were
More C++20 Library Features
• The most notable addition at this meeting was merging the
Ranges TS into C++20!
• Fixing operator>>(basic_istream&, CharT*).
• variant and optional should propagate copy/move triviality.
• visit<R>: explicit return type for visit.
• <chrono> zero(), min(), and max() should be noexcept.
• constexpr in std::pointer_traits.
• Miscellaneous constexpr bits.
• unwrap_ref_decay and unwrap_reference
• reference_wrapper for incomplete types
• A sane variant converting constructor
• std::function move constructor should be noexcept
• std::assume_aligned
• Smart pointer creation with default initialization
• Improving completeness requirements for type traits
• Remove CommonReference requirement from
StrictWeakOrdering (a.k.a fixing relations)
• Utility functions to implement uses-allocator
construction
• Should span be Regular?
• Make stateful allocator propagation more consistent
for operator+(basic_string)
• Simplified partial function application
• Heterogeneous lookup for unordered containers
• Adopt consistent container erasure from Library
Fundamentals v2
More C++20 Library Features
•polymorphic_allocator<> as a
vocabulary type.
•Well-behaved interpolation for
numbers and pointers, a.k.a.
std::midpoint
•Signed ssize() functions, unsigned
size() functions in span
•I stream, you stream, we all stream for
istream_iterator.
•Ranges design cleanup
•Target vectorization policies (from the
Parallelism TS v2)
•Usability enhancements for
std::span
•Make create_directory() intuitive.
•Precalculated hash values in lookup
•Traits for [un]bounded arrays
•Making std::underlying_type
SFINAE-friendly.
Parallel/concurrency after C++11
Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous (GPUs, accelerators, FPGA, embedded AI processors)
summary
• Asynchronous Agents: tasks that run independently and communicate via messages
• Parallel collections: operations on groups of things, exploit parallelism in data and algorithm structures
• Mutable shared state: avoid races and synchronizing objects in shared memory
• Heterogeneous: dispatch/offload to other nodes (including distributed)
examples
• Asynchronous Agents: GUI, background printing, disk/net access
• Parallel collections: trees, quicksorts, compilation
• Mutable shared state: locked data (99%), lock-free libraries (wizards), atomics (experts)
• Heterogeneous: pipelines, reactive programming, offload, target, dispatch
key metrics
• Asynchronous Agents: responsiveness
• Parallel collections: throughput, many-core scalability
• Mutable shared state: race free, lock free
• Heterogeneous: independent forward progress, load-shared
requirement
• Asynchronous Agents: isolation, messages
• Parallel collections: low overhead
• Mutable shared state: composability
• Heterogeneous: distributed, heterogeneous
today's abstractions
• Asynchronous Agents: C++11: thread, lambda function, TLS, async
• Parallel collections: C++11: packaged tasks, promises, futures
• Mutable shared state: C++11: locks, memory model, mutex, condition variable, atomics, static init/term
• Heterogeneous: C++11: lambda
Parallel/concurrency after C++14
Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous
summary
• Asynchronous Agents: tasks that run independently and communicate via messages
• Parallel collections: operations on groups of things, exploit parallelism in data and algorithm structures
• Mutable shared state: avoid races and synchronizing objects in shared memory
• Heterogeneous: dispatch/offload to other nodes (including distributed)
examples
• Asynchronous Agents: GUI, background printing, disk/net access
• Parallel collections: trees, quicksorts, compilation
• Mutable shared state: locked data (99%), lock-free libraries (wizards), atomics (experts)
• Heterogeneous: pipelines, reactive programming, offload, target, dispatch
key metrics
• Asynchronous Agents: responsiveness
• Parallel collections: throughput, many-core scalability
• Mutable shared state: race free, lock free
• Heterogeneous: independent forward progress, load-shared
requirement
• Asynchronous Agents: isolation, messages
• Parallel collections: low overhead
• Mutable shared state: composability
• Heterogeneous: distributed, heterogeneous
today's abstractions
• Asynchronous Agents: C++11: thread, lambda function, TLS, async; C++14: generic lambda
• Parallel collections: C++11: packaged tasks, promises, futures
• Mutable shared state: C++11: locks, memory model, mutex, condition variable, atomics, static init/term; C++14: shared_lock/shared_timed_mutex, OOTA, atomic_signal_fence
• Heterogeneous: C++11: lambda; C++14: generic lambda
Parallel/concurrency after C++17
Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous (GPUs, accelerators, FPGA, embedded AI processors)
summary
• Asynchronous Agents: tasks that run independently and communicate via messages
• Parallel collections: operations on groups of things, exploit parallelism in data and algorithm structures
• Mutable shared state: avoid races and synchronizing objects in shared memory
• Heterogeneous: dispatch/offload to other nodes (including distributed)
today's abstractions
• Asynchronous Agents: C++11: thread, lambda function, TLS, async; C++14: generic lambda
• Parallel collections: C++11: packaged tasks, promises, futures; C++17: ParallelSTL, control false sharing
• Mutable shared state: C++11: locks, memory model, mutex, condition variable, atomics, static init/term; C++14: shared_lock/shared_timed_mutex, OOTA, atomic_signal_fence; C++17: scoped_lock, shared_mutex, ordering of memory models, progress guarantees, TOE, execution policies
• Heterogeneous: C++11: lambda; C++14: generic lambda; C++17: progress guarantees, TOE, execution policies


SC20 SYCL and C++ Birds of a Feather 19th Nov 2020

  • 1. SC20 C++20 BoF Michael Wong (Codeplay) Bjarne Stroustrup (Morgan Stanley) [date] • [Location]
  • 2. SYCL WG State of the Union 2020 Michael Wong SYCL WG Chair Codeplay VP of R&D ISOCPP Director & VP ISO C++ Directions Group Chair michael@codeplay.com | wongmichael.com/about
  • 3. SYCL Present and Future Roadmap (May Change) 2011 OpenCL 1.2 OpenCL C Kernel Language OpenCL 2.1 SPIR-V in Core 2015 SYCL 1.2 C++11 Single source programming OpenCL 2.2 2017 SYCL 1.2.1 C++11 Single source programming 2020 SYCL 2020 C++17 Single source programming Many backend options 2021-???? SYCL 2021-? C++20 Single source programming Many backend options C++11 C++14 C++17 C++20 OpenCL 3.0 C++23
  • 4. SYCL community is vibrant SYCL-1.2.1 2X growth
  • 5. SYCL 2020 Potential Features Generalization (a.k.a the Backend Model) presented by Gordon Brown Unified Shared Memory (USM) presented by James Brodman Improvement to Program class Modules presented by Gordon Brown Host Task with Interop presented by Gordon Brown In order queues, presented by James Brodman SYCL 2020 compared with SYCL 1.2.1 Easier to integrate with C++17 (CTAD, Deduction Guides...) Less verbose, smaller code size, simplify patterns Backend independent Multiple object archives aka modules simplify interoperability Ease porting C++ applications to SYCL Enable capabilities to improve programmability Backwards compatible but minor API break based on user feedback SYCL Evolution 2017 SYCL 1.2.1 Improving Software Ecosystem Tool, libraries, GitHub Expanding Implementation DPC++ ComputeCpp triSYCL hipSYCL Regular Maintenance Updates Spec clarifications, formatting and bug fixes https://www.khronos.org/registry/SYCL/ Target 2020 Provisional Q3 then Final Q4 Selected Extension Pipeline aiming for SYCL 2020 Provisional Q3 Reduction Subgroups Accessor simplification Atomic rework Extension mechanism Address spaces Vector rework Specialization Constants Integration of successful Extensions plus new Core functionality Converge SYCL with ISO C++ and continue to support OpenCL to deploy on more devices CPU GPU FPGA AI processors Custom Processors Repeat The Cycle every 1.5-3 years SYCL 2020 Roadmap (WIP, MAY CHANGE)
  • 6. SYCL Ecosystem, Research and Benchmarks Active Working Group Members Benchmarks SYCL-BLAS Linear Algebra Libraries Implementations SYCL-DNN Machine Learning Libraries and Parallel Acceleration Frameworks Research SYCL-MLEigen RSBench SYCL Parallel STL oneAPI oneMKL
  • 7. SYCL, Aurora and Exascale computing SYCL can run on AMD ROCM
  • 8. SYCL 2020 Provisional is here • SYCL 2020 provisional is released and final in Q4 • We need your feedback asap • https://app.slack.com/client/TDMDFS87M/CE9UX4CHG • https://community.khronos.org/c/sycl • https://sycl.tech • What features are you looking for in SYCL 2020? • What feature would you like to aim for in future SYCL? • How do you join SYCL?
  • 9. Engaging with the Khronos SYCL Ecosystem SYCL Working Groups SYCL Advisory Panels Contribute to SYCL open source specs, CTS, tools and ecosystem Khronos SYCL Forums, Slack Channels, stackoverflow, reddit, and SYCL.tech Khronos members under Khronos NDA and IP Framework participate and vote in working group meetings. Starts at $3.5K/yr. https://www.khronos.org/members/ https://www.khronos.org/registry/SYCL/ Invited Advisors under the Khronos NDA and IP Framework can comment and contribute to requirements and draft specifications https://www.khronos.org/advisors/ Spec fixes and suggestions made under the Khronos IP Framework. Open source contributions under repo’s CLA – typically Apache 2.0 https://github.com/KhronosGroup https://github.com/KhronosGroup/SYCL-CTS https://github.com/KhronosGroup/SYCL-Docs https://github.com/KhronosGroup/SYCL-Shared https://github.com/KhronosGroup/SYCL-Registry https://github.com/KhronosGroup/SyclParallelSTL Open to all! https://community.khronos.org/www.khr.io/slack https://app.slack.com/client/TDMDFS87M/CE9UX4CHG https://community.khronos.org/c/sycl/ https://stackoverflow.com/questions/tagged/sycl https://www.reddit.com/r/sycl https://github.com/codeplaysoftware/syclacademy https://sycl.tech/ $0 $0 $0 $ Any member or non-member can propose a new SYCL feature or fix
  • 10. Thank You! • Khronos SYCL is creating cutting-edge royalty-free open standard • For C++ Heterogeneous compute, vision, inferencing acceleration • Information on Khronos SYCL Standards: https://www.khronos.org/sycl • Any entity/individual is welcome to join Khronos SYCL: https://www.khronos.org/members • Join the SYCLCon Tutorial Monday and Wednesday Live panel : Wednesday Apr 29 15:00-18:00 GMT • Have your questions answered live by a group of SYCL experts • Michael Wong: michael@codeplay.com | wongmichael.com/about Gather industry requirements for future open standards Draft Specifications Confidential to Khronos members Publicly Release Specifications and Conformance Tests Gain early insights into industry trends and directions Influence the design and direction of key open standards that will drive your business Accelerate your time-to-market with early access to specification drafts Network with domain experts from diverse companies in your industry State-of-the-art IP Framework protects your Intellectual Property Enhance your company reputation as an industry leader through Khronos participation Benefits of Khronos membership
  • 11. SC20 C++20 BoF Michael Wong (Codeplay) Bjarne Stroustrup (Morgan Stanley) [date] • [Location]
  • 12. 1. C++ 20 2. C++20 Big Features 3. C++20 Parallel Concurrency 4. Backup: C++20 complete feature list Agenda
  • 13. C++: evolving towards greater elegance and power C++11 C++14 C++17 C++20C++98 Inheritance Exception Templates STL Auto Lambda Concurrency Move Futures Template deduction Compile time if Parallel STL Modules Concept Coroutine Ranges 2011 2014 2017 2019 20201998
  • 14. “the committee” Stroustrup - C++20 - Aarhus 2020 14 2011 2014 1990 2017
  • 15. C++20 is here: wg21.link/N4861 Stroustrup - C++20 - SC 2020 15
  • 16. C++20 • Major language features – Modules – Concepts – Coroutines – Improved compile-time programming support • Major standard-library components – Ranges – Dates – Formats – Parallel algorithms – Span • Many minor language features and standard-library components • A dense web of interrelated mutually-supporting features • Most is shipping already Stroustrup - C++20 - SC 2020 16 By “major” I mean “changes how we think”
  • 17. C++20 • Implementations are fast improving – Keep an eye on new compiler and standard-library releases – Many/most features are already shipping • Shipping in GCC, Clang, Microsoft – Modules (still experimental in GCC) – Concepts – Coroutines (still experimental in GCC) – … • Available on GitHub – Ranges – Dates – Format – Span – … Stroustrup - C++20 - SC 2020 17 Not Science Fiction
  • 18. Making the preprocessor redundant • Templates – Concepts • Type deduction • Compile-time computation – Constexpr and consteval functions – Static reflection (not yet) – Type traits • Modules • std::source_location • Contracts (not yet) Stroustrup - C++20 - SC 2020 18 #define #include #ifndef assert() __file__
  • 19. Compile-time computation • Not just built-in types – Examples from <chrono> cout << weekday{June/21/2016}; // Tuesday static_assert( weekday{June/21/2016}==Tuesday ); // At compile time static_assert(2020y/February/Tuesday[last] == February/25/2020); // true auto tp = system_clock::now(); cout << tp; // 2019-11-14 10:13:40.785346 cout << zoned_time{current_zone(),tp}; // 2019-11-14 11:13:40.785346 CET Stroustrup - C++20 - SC 2020 19 Howard Hinnant
  • 20. Compile-time computation: invisible • Compile-time computation tends to be invisible auto z = sqrt(3.0+2.7i); // call sqrt(complex<double>) auto d = 5min+10s+200us+300ns; // a duration auto s = "This is not a pointer to char"s; // a string // implementations: constexpr complex<double> operator""i(long double d) { return {0.0, static_cast<double>(d)}; } constexpr seconds operator""s(unsigned long long s) { return seconds{s}; } constexpr string operator""s(const char* str, size_t len) { return {str, len}; } Stroustrup - C++20 - SC 2020 20
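The literal suffixes on this slide can be checked directly. This is a minimal sketch using only standard literals plus one hypothetical user-defined suffix (_km on a made-up Kilometers type); mixed_duration, complex_root and not_a_pointer are illustrative names as well.

```cpp
#include <cassert>
#include <chrono>
#include <complex>
#include <string>

using namespace std::chrono_literals;
using namespace std::complex_literals;
using namespace std::string_literals;

// Hypothetical unit type with its own literal suffix, for illustration only.
struct Kilometers {
    long double value;
};
constexpr Kilometers operator""_km(long double v) { return Kilometers{v}; }
static_assert((1.5_km).value == 1.5L);

// Mixed units are converted to the finest one (nanoseconds) at compile time.
constexpr std::chrono::nanoseconds mixed_duration() {
    return 5min + 10s + 200us + 300ns;
}
static_assert(mixed_duration().count() == 310'000'200'300LL);

// The i suffix builds a std::complex<double>. The real part is written
// as 3.0 because int + complex<double> does not compile.
inline std::complex<double> complex_root() { return std::sqrt(3.0 + 2.7i); }

// The s suffix on a string literal builds a std::string, not a const char*.
inline std::string not_a_pointer() { return "This is not a pointer to char"s; }
```

The two static_assert lines are the point of the slide: the unit arithmetic costs nothing at run time.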
  • 21. Template argument type deduction • Help avoid range errors int a[128]; // … fill a … std::span s {a}; for (const auto x : s) cout << x << ' '; • Help avoid data races void do_something() { std::scoped_lock lck {mut1,mut2}; // … Manipulate shared data … } Stroustrup - C++20 - SC 2020 21 No repeated element type No repeated array bound No repeated mutex type No mutex order dependence
  • 22. Compile-time computation Stroustrup - C++20 - SC 2020 22 Daveed Vandevoorde Michael Wong constexpr functions consteval functions Template aliases UDL Concepts Type deduction Template aliases Type traits Gabriel Dos Reis Michael Spertus
  • 23. 1. C++ 20 2. C++20 Big Features 3. C++20 Parallel Concurrency 4. Backup: C++20 complete feature list Agenda
  • 24. Modules • Order dependence #include "a.h" #include "b.h" • Can be different from #include "b.h" #include "a.h" • #include is textual inclusion • Implies – #include is transitive – much repeated compilation – Subtle bugs Stroustrup - C++20 - SC 2020 24 • Modularity import a; import b; • Same as import b; import a; • import is not transitive • Implies – Much compilation can be done once only Finally!
  • 25. Modules • Better code hygiene: modularity (especially protection from macros) – => Faster compile times (hopefully factors rather than percent) export module map_printer; // we are defining a module import iostream; import containers; using namespace std; export template<Sequence S> void print_map(const S& m) { for (const auto& [key,val] : m) // break out key and value cout << key << " -> " << val << '\n'; } Stroustrup - C++20 - SC 2020 25
  • 26. Compile speeds Stroustrup - C++20 - SC 2020 26
  • 27. Modularity and transition • Support for getting from the #include world to the import world – Global module – Modular headers – Module partitions • A module need not be in a single source file module; #include "xx.h" // to global module export module C; import "a.h" // “modular headers” import "b.h" import A; export int f() { … } • Not yet: a modular standard library – Versions exist, but not yet in the ISO standard – I hope for import std; Stroustrup - C++20 - SC 2020 27 Potentially holding back progress Here, #include works as ever
  • 28. Modules and transition • Source organization • Header file conversion – Header and module coexistence • Build systems – Build2 – Cmake prototype Stroustrup - C++20 - SC 2020 28 Nathan Sidwell Gabriel Dos Reis Richard Smith
  • 29. Generic programming: The backbone of the C++ standard library • Containers – vector, list, stack, queue, priority_queue, ... • Ranges • Algorithms – sort(), partial_sort(), is_sorted(), merge(), find(), find_if(),... – Most with parallel and vectorized versions • Concurrency support (type safe) – Threads, locks, futures, ... • Time – time_points, durations, calendars, time_zones • Random numbers – distributions and engines (lots) • Numeric types and algorithms – complex – accumulate(), inner_product(), iota(), ... • Strings and Regular expressions • Formats Stroustrup - C++20 - SC 2020 29 RAII Type deduction Parameterized Types and algorithms
  • 30. Generic Programming • Write code that works for all suitable argument types – void sort(R); // pseudo declaration • R can be any sequence with random access • R’s elements can be compared using < – E* find_if(R,P); // pseudo declaration • R can be any sequence that you can read from sequentially • P must be a predicate on R’s element type • E* must point to the found element of R if any (or one beyond the end) • That’s what the standard says – “our job” is to tell this to the compiler – C++20 enables that Stroustrup - C++20 - SC 2020 30
  • 31. Generic Programming with C++20 Concepts • Write code that works for all suitable argument types void sort(Sortable_range auto& r); vector<string> vs; // … fill vs … sort(vs); array<int,128> ai; // … fill ai … sort(ai); Stroustrup - C++20 - SC 2020 31 Implicit: • Type of container • Type of element • Number of elements • Comparison criteria A concept: • Specifies what is required of r’s type
  • 32. Generic Programming with C++20 Ranges • Pre-C++20: vector v {1,2,3}; sort(begin(v), end(v)); • C++20: ranges::sort(v); auto answer { v | views::transform([](int i) { return to_string(i); }) }; // “1”, “2”, “3” // can also reverse, drop, filter
  • 33. Generic Programming: Concept with Ranges • Write code that works for all suitable argument types – Many/most algorithms have more than one template argument type – We need to express relationships among template arguments template<input_range R, indirect_unary_predicate<iterator_t<R>> Pred> iterator_t<R> ranges::find_if(R&& r, Pred p); list<int> lsti; // … fill lsti … auto p = find_if(lsti, greater_than{7}); vector<string> vs; // … fill vs … auto q = find_if(vs, [](const string& s) { return has_vowels(s); }); Stroustrup - C++20 - Copenhagen 2020 33 <ranges>
  • 34. Overloading • Overloading based on concepts void sort(Forward_sortable_range auto&); void sort(Sortable_range auto&); void some_code(vector<int>& vec, list<int> lst) { sort(lst); // sort(Forward_sortable_range auto&) sort(vec); // sort(Sortable_range auto&) } • We don’t have to say – “Sortable_range is stricter/better than Forward_sortable_range” – we compute that from their definitions Stroustrup - C++20 - SC 2020 34 Design principles: • Don’t force the user to do what a machine does better • Zero overhead compared to unconstrained templates
  • 35. Concepts • A concept is a compile-time predicate – A function run at compile time yielding a Boolean – Often built from other concepts template<typename R> concept Sortable_range = random_access_range<R> // has begin()/end(), ++, [], +, … && sortable<iterator_t<R>>; // can compare and swap elements template<typename R> concept Forward_sortable_range = forward_range<R> // has begin()/end(), ++; no [] or + && sortable<iterator_t<R>>; // can compare and swap elements Stroustrup - C++20 - SC 2020 35 There are libraries of concepts <ranges>: random_access_range and sortable
  • 36. Concepts • A concept is a compile-time predicate – A function run at compile time yielding a Boolean – One or more arguments – Can be built from fundamental language properties: use patterns template<typename T, typename U = T> concept equality_comparable = requires(T a, U b) { {a==b} -> std::convertible_to<bool>; {a!=b} -> std::convertible_to<bool>; {b==a} -> std::convertible_to<bool>; {b!=a} -> std::convertible_to<bool>; }; Stroustrup - C++20 - SC 2020 36 There are libraries of concepts <concepts>: equality_comparable
  • 37. Types and concepts • A type – Specifies the set of operations that can be applied to an object • Implicitly and explicitly • Relies on function declarations and language rules – Specifies how an object is laid out in memory • A single-argument concept – Specifies the set of operations that can be applied to an object • Implicitly and explicitly • Relies on use patterns – reflecting function declarations and language rules – Says nothing about the layout of the object Stroustrup - C++20 - SC 2020 37 Ideal: Use concepts where we now use types, except for defining layout
  • 38. Generic Programming is “just” programming • Why? – From 1988 to now “template programming” and “ordinary programming” have been very different • Different syntax • Different look-up rules • Different source code organization • “Expert friendly” programming techniques – We don’t need two different sets of techniques (and notations) • Unnecessary complexity • Make simple things simple! – “ordinary programming” is expressive and familiar Stroustrup - C++20 - SC 2020 38 Make simple things simple! Do so through generalization
  • 39. Generic Programming Stroustrup - C++20 - SC 2020 39 • will change the way we think about Programming
  • 40. C++20 Big changes: good for HPC workload •Concepts: Reduces errors, increases expressiveness; less hacking •Modules: Better code hygiene; much better compile times •Ranges: Improved notation, better pipelining •Coroutines: Better cooperative functions, better concurrency control •Subtle big changes: • compile-time computation is improving • type deduction is simplifying notation
  • 41. 1. C++ Projects 2. C++20 Big Features 3. C++20 Parallel Concurrency 4. Backup: C++20 complete feature list Agenda
  • 42. • cooperative cancellation of threads • new synchronization facilities • updates to atomics with atomic_ref • coroutines C++20 asynchronous, concurrency, parallelism, heterogeneous programming
  • 43. Parallel/concurrency before C++11 (C++98)
    Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous (GPUs, accelerators, FPGA, embedded AI processors)
    summary
    • Asynchronous Agents: tasks that run independently and communicate via messages
    • Parallel collections: operations on groups of things, exploit parallelism in data and algorithm structures
    • Mutable shared state: avoid races and synchronizing objects in shared memory
    • Heterogeneous: dispatch/offload to other nodes (including distributed)
    examples
    • Asynchronous Agents: GUI, background printing, disk/net access
    • Parallel collections: trees, quicksorts, compilation
    • Mutable shared state: locked data (99%), lock-free libraries (wizards), atomics (experts)
    • Heterogeneous: pipelines, reactive programming, offload, target, dispatch
    key metrics
    • Asynchronous Agents: responsiveness
    • Parallel collections: throughput, many-core scalability
    • Mutable shared state: race free, lock free
    • Heterogeneous: independent forward progress, load-shared
    requirement
    • Asynchronous Agents: isolation, messages
    • Parallel collections: low overhead
    • Mutable shared state: composability
    • Heterogeneous: distributed, heterogeneous
    today's abstractions
    • Asynchronous Agents: POSIX threads, win32 threads, OpenCL, vendor intrinsic
    • Parallel collections: openmp, TBB, PPL, OpenCL, vendor intrinsic
    • Mutable shared state: locks, lock hierarchies, vendor atomic instructions, vendor intrinsic
    • Heterogeneous: OpenCL, CUDA
  • 44. Parallel/concurrency for C++11, 14, 17, C++20
    Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous/Distributed
    abstractions from C++11, 14, 17, 20
    • Asynchronous Agents: C++11: thread, lambda function, TLS, async; C++20: jthread + stop_token, coroutines
    • Parallel collections: C++11: packaged tasks, promises, futures; C++17: ParallelSTL, control false sharing; C++20: vec execution policy, algorithm unsequenced policy
    • Mutable shared state: C++11: locks, memory model, mutex, condition variable, atomics, static init/term; C++14: shared_lock/shared_timed_mutex, OOTA, atomic_signal_fence; C++17: scoped_lock, shared_mutex, ordering of memory models, progress guarantees, TOE, execution policies; C++20: atomic_ref, latches and barriers, atomic<shared_ptr>, atomics and padding bits, simplified atomic init, atomic C/C++ compatibility, semaphores and waiting, fixed gaps in memory model, improved atomic flags, repair memory model
    • Heterogeneous/Distributed: C++11: lambda; C++14: generic lambda; C++17: progress guarantees, TOE, execution policies; C++20: atomic_ref
  • 46. atomic_ref<T>
    • std::atomic_ref lets you perform atomic operations on non-atomic objects.
    • This matters when sharing headers with C code, when a struct must match a specific binary layout so you can't use std::atomic, or when most of your program accesses the object non-atomically and only a few places need atomic access.
    • In these cases atomic_ref<T> is a better fit than atomic<T> and is more efficient.
    • While any std::atomic_ref refers to an object, all accesses to that object must go through std::atomic_ref.
        struct my_c_struct {
            int count;
            data* ptr;
        };
        void do_stuff(my_c_struct* p) {
            std::atomic_ref<int> count_ref(p->count);
            ++count_ref;
            // ...
        }
  • 48. The future of parallelism is lazy
    • Current futures and promises are eager and closed
    • We aim to move to lazy and open
    • Executors are switching to a sender/receiver model (similar to promises and futures but without the blocking synchronization, using lazy continuations)
    • All awaitables are senders (some senders are awaitable)
    • Coroutines are receivers
    • Eager operations can be layered on top of lazy ones
    • Futures model problem: the shared state is raced between the producer completing and the consumer attaching a continuation
    • This costs synchronization, heap allocation, and type erasure
    • Returning a handle to an eagerly started concurrent operation (e.g. a future) returns an obligation for the caller to manually join that operation
    • A concurrency resource is created by the call
    • The call returns a handle to the concurrent operation, e.g. future<T>
    • The concurrency resource must be released by joining
    • The join needs to be asynchronous
    • But destructors cannot be asynchronous, so they can't be used to join automatically
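To make "lazy" concrete, here is a hand-rolled sketch of the sender/receiver shape: describing the work does nothing, connecting it to a receiver does nothing, and only `start()` runs it. Every name below (`just`, `then`, `connect`, `start`, `set_value`, `store_int`, `lazy_pipeline_demo`) is an illustrative assumption modelled loosely on the executors proposals; this is not the standardized API.

```cpp
#include <utility>

// A sender that lazily produces one value.
template <class T>
struct just {
    T value;

    // Operation state: holds the work; nothing runs until start().
    template <class Receiver>
    struct operation {
        T value;
        Receiver receiver;
        void start() { receiver.set_value(std::move(value)); }
    };

    template <class Receiver>
    operation<Receiver> connect(Receiver receiver) {
        return {std::move(value), std::move(receiver)};
    }
};

// A sender adaptor that applies fn to the upstream value (a lazy continuation).
template <class Sender, class Fn>
struct then_sender {
    Sender upstream;
    Fn fn;

    template <class Receiver>
    struct then_receiver {
        Fn fn;
        Receiver next;
        template <class V>
        void set_value(V&& v) { next.set_value(fn(std::forward<V>(v))); }
    };

    template <class Receiver>
    auto connect(Receiver receiver) {
        return upstream.connect(
            then_receiver<Receiver>{std::move(fn), std::move(receiver)});
    }
};

template <class Sender, class Fn>
then_sender<Sender, Fn> then(Sender s, Fn f) {
    return {std::move(s), std::move(f)};
}

// Final receiver: stores the result through a pointer.
struct store_int {
    int* out;
    void set_value(int v) { *out = v; }
};

// Returns {did the continuation run before start()?, final result}.
std::pair<bool, int> lazy_pipeline_demo() {
    bool ran = false;
    int result = 0;
    auto work = then(just<int>{20}, [&ran](int x) { ran = true; return x + 22; });
    auto op = work.connect(store_int{&result});
    const bool ran_before_start = ran;   // still false: nothing has executed yet
    op.start();                          // the whole chain runs here
    return {ran_before_start, result};
}
```

There is no shared state, heap allocation, or type erasure in this sketch: the continuation is attached before anything starts, so the race described for futures cannot arise.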
  • 49. Coroutines
    • Not pre-emptive but cooperative
    • A coroutine is a function that can be suspended mid-execution and resumed at a later time.
    • Resuming a coroutine continues from the suspension point;
    • local variables have their values from the original call
    • C++20 provides stackless coroutines
    • Only the locals for the current function are saved
    • Everything is localized
    • Minimal memory allocation: can have millions of in-flight coroutines
    • Whole coroutine overhead can be eliminated by the compiler: Gor's "disappearing coroutines"
        future<remote_data> async_get_data(key_type key);
        future<data> retrieve_data(key_type key) {
            auto rem_data = co_await async_get_data(key);
            co_return process(rem_data);
        }
  • 50. Cooperative instead of preemption
    • co_await to suspend execution until resumed
    • co_yield to suspend and return a value
    • co_return to complete and return a value
    • You can't tell from the signature that a function is a coroutine
    • Only if the body uses the special keywords
    • Just an implementation detail
    • Dangling references
    • No plain or placeholder return yet
    • No std::generator<T>
    • No constexpr, constructors, destructors, or main as coroutines
    • Not a replacement for callbacks, as coroutines cannot be overloaded yet
        future<data> retrieve_data(key_type key) {
            auto rem_data = co_await async_get_data(key);
            co_return process(rem_data);
        }
        future<remote_data> async_get_data(key_type key) {
            // code
            co_yield rem_data;
        }
    [Diagram labels: CALL, RETURN, SUSPEND, AWAIT, RESUME]
  • 51. Like a callback but not a replacement
    • co_await to suspend execution until resumed
    • co_yield to suspend and return a value
    • co_return to complete and return a value
    • You can't tell from the signature that a function is a coroutine
    • Only if the body uses the special keywords
    • Just an implementation detail
    • Dangling references
    • No plain or placeholder return yet
    • No std::generator<T>
    • No constexpr, constructors, destructors, or main as coroutines
    • Not a replacement for callbacks, as coroutines cannot be overloaded yet
        future<data> retrieve_data(key_type key) {
            auto rem_data = co_await async_get_data(key);
            // callback code
            co_return process(rem_data);
        }
        future<remote_data> async_get_data(key_type key) {
            // code
            co_yield rem_data;
        }
    [Diagram labels: CALL, RETURN, SUSPEND, REGISTER CALLBACK, CALL CALLBACK; Thread1, Thread2]
  • 52. More for HPC
    In C++20: • Better lambda • atomic_ref • [[likely]] [[unlikely]] • Better constexpr • Better Class Template Argument Deduction (CTAD) • Better library • Span • NTTP • Calendars • constinit • [[nodiscard("reason")]] • …
    In future C++: • Linear algebra • executors • machine learning • data affinity • data layout • data locality • data movement • mdspan • mdarray
  • 53. C++ • Good use of hardware • originally from C • CPU, GPU, FPGA, AI/ML chips, … • Zero-overhead abstraction • originally from Simula • performant libraries • simplified programming • control of complexity • “if you don’t need the right answer, I can make it as fast as you like” • “if you can afford to waste 98% of your CPU, I can make programming much simpler”
  • 54. C++20 Take Away • C++20 is a major release, maybe even bigger than C++11 • Less verbose code • Solves the "error novel" problem with Concepts • Solves the constant recompilation problem with Modules • Improves the STL with Ranges • Better lazy, cooperative function control with Coroutines and atomic_ref • Works well to improve HPC workloads, making them: • compile faster, • safer, • do more at compile time, • less verbose, and • run faster.
  • 55. Agenda • C++ Projects • C++20 Big Features • C++20 Parallel Concurrency • Backup: C++20 complete feature list
  • 56. Status after Feb Prague C++ Meeting
    Columns: ISO number | Name | Status | Links | C++20?
    ISO/IEC TS 19217:2015 | C++ Extensions for Concepts | Published 2015-11-13 (ISO Store). Final draft: n4553 (2015-10-02). Current draft: p0734r0 (2017-07-14). Merged into C++20 (with modifications). | Constrained templates | Merged into C++20, including abbreviated function templates!
    - | Executors | - | Abstraction for where/how code runs in a concurrent context | Not headed for C++20, now retargeted for C++23
    - | Coroutines | - | Resumable functions, based on stackless await design | Published! Merged into C++20
    - | Reflection TS | - | Static code reflection mechanisms | PDTS ballot done. Approved for publication as a TS.
    - | Reflection V2 | - | A value-based constexpr version of the Reflection TS | Aiming for C++23
    SG14 | Lightweight Exceptions | - | - | In progress
  • 57. Status after Feb Prague C++ Meeting
    Columns: ISO number | Name | Status | What is it? | C++20?
    ISO/IEC TS 19571:2016 | C++ Extensions for Concurrency TS | Published 2016-01-19 (ISO Store). Final draft: p0159r0 (2015-10-22) | Improvements to future, latches and barriers, atomic smart pointers | Latches, atomic<shared_ptr<T>> merged into C++20. Already in Visual Studio release and Anthony Williams Just Threads! and waiting for subsequent usage experience. Withdrawn as some parts (latches, atomic<shared_ptr<>>) are now in C++20
    ISO/IEC DTS 21425:2017 | Ranges TS | Published 2017-12-05 (ISO Store). Draft: n4685 (2017-07-31) | Range-based algorithms and views | Merged in C++20
    ISO/IEC TS 19216:2018 | Networking TS | Published 2018-04-24 (ISO Store). Draft n4734 (2017-04-04). Latest draft: n4771 (2018-10-08) | Sockets library based on Boost.ASIO | Published. Not headed to C++20.
    ISO/IEC TS 21544:2018 | Modules V1 | Published 2018-05-16 (ISO Store). Final draft n4720 (2018-01-29) | A component system to supersede the textual header file inclusion model | Published as a TS
    SG2 | Modules V2 | - | Improvements to Modules v1, including a better transition path | Merged into C++20
    SG21 | Contracts | - | Pre and post conditions | Removed from C++20. Reset as SG21
  • 58. Status after Feb Prague C++ Meeting
    Columns: ISO number | Name | Status | What is it? | C++20?
    ISO/IEC DTS 19568:xxxx | Numerics TS | Early development. Draft p0101 (2015-09-27) | Various numerical facilities | Under active development
    ISO/IEC DTS 19571:xxxx | Concurrency TS 2 | Early development | Exploring lock-free, hazard pointers, RCU, atomic views, concurrent data structures, fibers. Deprecate volatile, add volatile_load/store, TLS? | Under active development. Possible new clause
    ISO/IEC TS 19570:2018 | Parallelism TS 2 | Published 2018-11-15 (ISO Store). Draft: n4773 (2018-10-08) | task blocks, progress guarantees, SIMD<T>, vec, no_vec loop-based execution policy | Published. SIMD<T>, progress guarantees, loop-based execution policy are headed into C++23
    ISO/IEC DTS 19841:xxxx | Transactional Memory TS 2 | Early development | Exploring simplified atomic model of only memory updates. | Under active development.
    ISO/IEC DTS 19568:xxxx | Graphics TS | Early development. Draft p0267r8 (2018-06-26) | 2D drawing API using Cairo interface, adding stateless interface | Restarted after being shut down.
    ISO/IEC DTS 19568:xxxx | Library Fundamentals V3 | Initial draft, early development | Generic scope guard and RAII wrappers | Under development
  • 59. Status after Feb Prague C++ Meeting
    Columns: ISO number | Name | Status | What is it? | C++20?
    SG14 | Linear Algebra | SG14 SIG, LEWG | Wrapper on BLAS and a C++-based proposal, separated into 3 layers | Under active development. Aiming for C++23
    SG19 | Machine Learning | SG19 SIG | Improve C++ for ML, AI, DNN, Statistics, Differential Calculus, Data structure Graph programming | Under active development. Aiming for C++23
    SG16 | Pattern Matching | SG16 WIP | A match-like facility for C++ WIP | Under active development. Aiming for C++23
    SG12 | Undefined Behaviour/Safety Critical | SG12 WIP | Optimization that causes UB. Pointer provenance, signed integer overflow. Validate external C++ Safety APIs: Misra, Autosar | Under active development. Aiming for C++23
    SG20 | Education | SG20 WIP | Support educating C++, especially new features | Under active development. Aiming for C++23
    SG19 | Audio | SG13 HMI WIP | Audio drivers | Under active development. Aiming for C++23
    SG16 | Unicode | SG16 WIP | Compile-time regular expression, source code info capture, charset transcoding | Under active development. Aiming for C++23
    SG15 | Tooling Ecosystem | SG15 WIP | Build systems, debug and tools for Modules | Under active development. Aiming for C++23 TR
  • 60. C++ 20 Language Features • Removal of Contracts • Class template argument deduction for aggregates • Class template argument deduction for alias templates • Mitigating minor modules maladies. • Relaxing redefinition restrictions for re-exportation robustness • Recognizing header unit imports requires full preprocessing. • Using unconstrained template template parameters with constrained templates • On the non-uniform semantics of return-type-requirements • Non-type template parameters are incomplete without floating-point types • Inline namespaces: fragility bites. • rethrow_exception must be allowed to copy •Adding the constinit keyword •Permitting trivial default initialization in constexpr contexts •Enabling constexpr intrinsics by permitting unevaluated inline assembly in constexpr functions •More constexpr containers. • When do you actually use <=>? • Spaceship needs a tune-up •using enum. •Additional contexts for implicit move construction. •Conditionally trivial special member functions •[[nodiscard("should have a reason")]] •[[nodiscard]] for constructors. •Deprecate uses of the comma operator in subscripting expressions. •Deprecating volatile •Interaction of memory_order_consume with release sequences
  • 61. More C++ 20 Language Features • Most notably, the Concepts Technical Specification has been merged into C++20! • Template parameter lists for generic lambdas. • Designated initializers. • Lambda capture [=, this] • A __VA_OPT__ macro to make variadic macros easier to use. • Default member initializers for bitfields • A tweak to C++17’s constructor template argument deduction rules • Fixing const-qualified pointers to members • The most significant new feature voted in was operator<=>. • Range-based for statements with initializer. • Lambdas in unevaluated contexts. • Default constructible and assignable stateless lambdas. • Simplifying implicit lambda capture. • Fixing small functionality gaps in constraints. • Deprecating the notion of “plain old data” (POD). • Access checking on specializations. • const mismatch with defaulted copy constructor. • ADL and function templates that are not visible. • Core issue 1581: when are constexpr member functions defined?
  • 62. More C++20 Language Features • Language support for empty objects • Relaxing the structured bindings customization point finding rules. • Structured bindings to accessible members. • Allow pack expansion in lambda init-capture. • Symmetry for <=> • Likely and unlikely attributes • Down with typename! • Relaxing range-based for loop’s customization point finding rules • Support for contract-based programming in C++20 • Class types in non-type template parameters. • Allowing virtual function calls in constant expressions. • Prohibit aggregates with user-declared constructors. • Efficient sized deletion for variable-sized classes.
  • 63. More C++ 20 Language Features • Abbreviated function templates (AFTs). • Improvements to return-type-requirements. • Immediate functions. • std::is_constant_evaluated() • try / catch blocks in constexpr functions. • Allowing dynamic_cast and polymorphic typeid in constant expressions. • Changing the active member of a union inside constexpr • char8_t: a type for UTF-8 characters and strings. • Access control in contract conditions. • Revising the C++ memory model. • Weakening release sequences. • Nested inline namespaces • Signed integers are two’s complement • Consistency improvements for <=> and other comparison operators. • Conditionally explicit constructors, a.k.a. explicit(bool). • Deprecate implicit capture of this via [=]. • Integrating feature-test macros into the C++ working draft. • A tweak to the rules about when certain errors related to a class being abstract are reported. • A tweak to the treatment of padding bits during atomic compare-and-exchange operations. • Tweaks to the __VA_OPT__ preprocessor feature. • Updating the reference to the Unicode standard.
  • 64. More C++ 20 Language Features •Modules! •Merging the Coroutines TS into C++20 •Allow initializing aggregates from a parenthesized list of values •<=> != ==, an important fix to the default comparisons design. •Extending structured bindings to be more like variable declarations. •Reference capture of structured bindings. •Contract postconditions and return type deduction. •Array size deduction in new-expressions. This is also a Defect Report against previous versions of C++. •Contra CWG DR1778 (a bugfix related to noexcept and explicitly defaulted functions). •Make char16_t/char32_t string literals be UTF-16/32.
  • 65. C++20 Library Features • constexpr INVOKE • Movability of single-pass iterators • basic_istream_view::iterator should not be copyable • Layout-compatibility and pointer-interconvertibility traits • Remove dedicated precalculated hash lookup interface. • Miscellaneous minor fixes for chrono • char8_t backward compatibility remediation • bind_front should not unwrap reference_wrapper • Iterator difference type and integer overflow • Helpful pointers for ContiguousIterator • Views and size types • Exposing a narrow contract for ceil2 • constexpr feature macro concerns • The C++20 Synchronization Library • Input range adaptors • Making std::vector constexpr • Making std::string constexpr • Stop token and joining thread • Adopt source_location for C++20. • Rename concepts to standard_case for C++20, while we still can. • The mothership has landed • Standard library header units for C++20 • to_array from LFTS with updates • Bit operations • Math constants • Efficient access to basic_stringbuf's buffer • Text formatting. • Integration of <chrono> with text formatting • printf corner cases in std::format • Output std::chrono::days with 'd' suffix
  • 66. More C++20 Library Features • Support for detecting endianness programmatically • Repairing elementary string conversions (also a Defect Report) • Improvements to the integration of C++17 class template argument deduction into the standard library (also a Defect Report) • Extending make_shared to support arrays • Transformation trait remove_cvref • Treating unnecessary decay • Using nodiscard in the standard library • Make std::memory_order a scoped enumeration • Synchronized buffered ostream • A utility to convert pointer-like objects to raw pointers • Add constexpr modifiers to functions in <algorithm> and <utility> headers. • constexpr for std::complex • Atomic shared_ptr • Floating-point atomics • De-pessimize legacy <numeric> algorithms with std::move • String prefix and suffix checking, i.e.
  • 67. More C++20 Library Features • Calendar and time zone library. • std::span • <version> header • Tweak on how unordered containers are compared • string::reserve() should not shrink • User specializations of function templates in namespace std • Manipulators for C++ synchronized buffer ostream • constexpr iterator requirements • The most notable addition at this meeting was standard library Concepts. • atomic_ref • Bit-casting object representations • Standard library specification in a Concepts and Contracts world • Checking for the existence of an element in associative containers • Add shift() to <algorithm> • Implicit conversion traits and utility functions • Integral power-of-2 operations • The identity metafunction • Improving the return value of erase()-like algorithms • constexpr comparison operators for std::array • constexpr for swap and related functions • fpos requirements • Eradicating unnecessarily explicit default constructors • Removing some facilities that were
  • 68. More C++20 Library Features • The most notable addition at this meeting was merging the Ranges TS into C++20! • Fixing operator>>(basic_istream&, CharT*). • variant and optional should propagate copy/move triviality. • visit<R>: explicit return type for visit. • <chrono> zero(), min(), and max() should be noexcept. • constexpr in std::pointer_traits. • Miscellaneous constexpr bits. • unwrap_ref_decay and unwrap_reference • reference_wrapper for incomplete types • A sane variant converting constructor • std::function move constructor should be noexcept • std::assume_aligned • Smart pointer creation with default initialization • Improving completeness requirements for type traits • Remove CommonReference requirement from StrictWeakOrdering (a.k.a. fixing relations) • Utility functions to implement uses-allocator construction • Should span be Regular? • Make stateful allocator propagation more consistent for operator+(basic_string) • Simplified partial function application • Heterogeneous lookup for unordered containers • Adopt consistent container erasure from Library Fundamentals v2
  • 69. More C++20 Library Features • polymorphic_allocator<> as a vocabulary type. • Well-behaved interpolation for numbers and pointers, a.k.a. std::midpoint • Signed ssize() functions, unsigned size() functions in span • I stream, you stream, we all stream for istream_iterator. • Ranges design cleanup • Target vectorization policies (from the Parallelism TS v2) • Usability enhancements for std::span • Make create_directory() intuitive. • Precalculated hash values in lookup • Traits for [un]bounded arrays • Making std::underlying_type SFINAE-friendly.
  • 70. Parallel/concurrency after C++11
    Columns: Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous (GPUs, accelerators, FPGA, embedded AI processors)
    Summary: tasks that run independently and communicate via messages | operations on groups of things, exploit parallelism in data and algorithm structures | avoid races and synchronizing objects in shared memory | dispatch/offload to other nodes (including distributed)
    Examples: GUI, background printing, disk/net access | trees, quicksorts, compilation | locked data (99%), lock-free libraries (wizards), atomics (experts) | pipelines, reactive programming, offload, target, dispatch
    Key metrics: responsiveness | throughput, many-core scalability | race free, lock free | independent forward progress, load-shared
    Requirement: isolation, messages | low overhead | composability | distributed, heterogeneous
    Today's abstractions: C++11: thread, lambda function, TLS, async | C++11: packaged tasks, promises, futures | C++11: locks, memory model, mutex, condition variable, atomics, static init/term | C++11: lambda
  • 71. Parallel/concurrency after C++14
    Columns: Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous
    Summary: tasks that run independently and communicate via messages | operations on groups of things, exploit parallelism in data and algorithm structures | avoid races and synchronizing objects in shared memory | dispatch/offload to other nodes (including distributed)
    Examples: GUI, background printing, disk/net access | trees, quicksorts, compilation | locked data (99%), lock-free libraries (wizards), atomics (experts) | pipelines, reactive programming, offload, target, dispatch
    Key metrics: responsiveness | throughput, many-core scalability | race free, lock free | independent forward progress, load-shared
    Requirement: isolation, messages | low overhead | composability | distributed, heterogeneous
    Today's abstractions: C++11: thread, lambda function, TLS, async. C++14: generic lambda | C++11: packaged tasks, promises, futures | C++11: locks, memory model, mutex, condition variable, atomics, static init/term. C++14: shared_lock/shared_timed_mutex, OOTA, atomic_signal_fence | C++11: lambda. C++14: generic lambda
  • 72. Parallel/concurrency after C++17
    Columns: Asynchronous Agents | Parallel collections | Mutable shared state | Heterogeneous (GPUs, accelerators, FPGA, embedded AI processors)
    Summary: tasks that run independently and communicate via messages | operations on groups of things, exploit parallelism in data and algorithm structures | avoid races and synchronizing objects in shared memory | dispatch/offload to other nodes (including distributed)
    Today's abstractions: C++11: thread, lambda function, TLS, async. C++14: generic lambda | C++11: packaged tasks, promises, futures. C++17: Parallel STL, control of false sharing | C++11: locks, memory model, mutex, condition variable, atomics, static init/term. C++14: shared_lock/shared_timed_mutex, OOTA, atomic_signal_fence. C++17: scoped_lock, shared_mutex, ordering of memory models, progress guarantees, TOE, execution policies | C++11: lambda. C++14: generic lambda. C++17: progress guarantees, TOE, execution policies