Meeting: 2025 07 08

What would be most useful for applications to get from vendors w.r.t. RAJAPerf?
Use RAJAPerf both as a performance benchmark and for compiler testing/verification?
- For performance, vendors should be allowed to modify RAJA and RAJAPerf under the following rules:
  - No RAJA API changes
  - Must include RAJA variants (base variants are optional) and cannot remove RAJA from the code
  - Must use the same RAJA version for each kernel (tied to the RAJAPerf version via submodule)
- For compiler testing:
  - Ensure the compiler supports all C++ language features used in RAJA and RAJAPerf
  - Ensure the RAJA and base variants of each kernel perform within some bounds of each other
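
One way the "within some bounds" check could be automated is sketched below. This is only an illustration, not something agreed in the meeting: the `KernelTiming` struct, the function names, and the idea of expressing the bound as a maximum RAJA/base time ratio are all assumptions, and the timings would have to come from actual RAJAPerf output.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record holding one kernel's measured run times.
// The field names are illustrative and not taken from RAJAPerf output.
struct KernelTiming {
  std::string kernel;   // kernel name
  double base_seconds;  // run time of the base variant
  double raja_seconds;  // run time of the RAJA variant
};

// Ratio of RAJA time to base time; 1.0 means identical performance.
double rajaOverBase(const KernelTiming& t) {
  assert(t.base_seconds > 0.0 && t.raja_seconds > 0.0);
  return t.raja_seconds / t.base_seconds;
}

// Returns the names of kernels whose RAJA variant is slower than the
// base variant by more than max_ratio (an assumed, user-chosen bound).
std::vector<std::string> outOfBounds(const std::vector<KernelTiming>& timings,
                                     double max_ratio) {
  std::vector<std::string> failing;
  for (const auto& t : timings) {
    if (rajaOverBase(t) > max_ratio) {
      failing.push_back(t.kernel);
    }
  }
  return failing;
}
```

For example, calling `outOfBounds(timings, 1.10)` would flag every kernel whose RAJA variant takes more than 10% longer than its base variant. The actual bound is still an open question.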
How to prioritize which kernels to include in benchmarking exercises?
- Cover gaps as much as possible in our small set of proxy-apps
- Add a kernel to represent a MARBL case that we don't cover currently (slight modification of MASS3DEA)?
- Stress shared memory usage by increasing the order of one of the MARBL-based kernels
- We have a shared memory version of LTIMES in RAJA examples. No shared memory is used in Kripke?
  - https://github.com/LLNL/RAJA/blob/develop/benchmark/ltimes.cpp#L1529-L1594
  - We can adapt this and add it to RAJAPerf
- Add high-dimensional tensor contraction (Arturo is working on this)
- Vendor-supplied BLAS1/2 (sparse) batched versions -- should this be captured in the contract or represented in a benchmark?
What metrics do we want to require (FOM)?
- Throughput plots
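
As a rough sketch of the data behind a throughput plot, assuming throughput is defined as problem size divided by run time (the notes do not fix a definition, and bandwidth or FLOP rate would be computed the same way with a different numerator), the points could be derived as follows. The `Sample` and `ThroughputPoint` types and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical measurement: one kernel run at a given problem size.
struct Sample {
  std::size_t problem_size;  // number of work items (e.g. loop iterations)
  double seconds;            // measured run time for that size
};

// One point on a throughput-versus-problem-size curve.
struct ThroughputPoint {
  std::size_t problem_size;
  double items_per_second;
};

// Converts timing samples into throughput points, assuming
// throughput = problem_size / seconds.
std::vector<ThroughputPoint> throughputCurve(const std::vector<Sample>& samples) {
  std::vector<ThroughputPoint> curve;
  curve.reserve(samples.size());
  for (const auto& s : samples) {
    assert(s.seconds > 0.0);
    curve.push_back({s.problem_size,
                     static_cast<double>(s.problem_size) / s.seconds});
  }
  return curve;
}
```

Plotting `items_per_second` against `problem_size` for each variant would show where a kernel saturates the hardware.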
Questions for Olga/benchmarking team
- When do we have to freeze/release the code?
- When do we have to have benchmark data?
- How many kernels can we have in Tier 1? Tier 2?
Tier 1 Kernels
- FEMSWEEP (Apps) -- lots of things need to be done to make it Tier 1 ready...
- Others if we can have more than one?
Tier 2 Kernels
- REDUCE_STRUCT (multiple reductions are an important case to cover -- atomics)
