Composable Kernel: ck::mfma_type< MfmaInstr::wmma_f32_16x16x16_f16_gfx12 > Struct Reference

#include <xdlops_gemm.hpp>

Inheritance diagram for ck::mfma_type< MfmaInstr::wmma_f32_16x16x16_f16_gfx12 >:
ck::mfma_type_gfx12_base

Public Member Functions

template<index_t MPerWmma, index_t NPerWmma, class FloatA, class FloatB, class FloatC>
__device__ void run (const FloatA &a, const FloatB &b, FloatC &reg_c) const

Additional Inherited Members

Static Public Attributes inherited from ck::mfma_type_gfx12_base
static constexpr index_t group_size = 8
static constexpr index_t num_groups_per_blk = 1
static constexpr index_t num_regs_per_blk = 8
static constexpr index_t num_threads_per_blk = 16
static constexpr index_t wave_size = 32
static constexpr index_t num_input_blks = 2
static constexpr index_t num_output_blks = 1
static constexpr index_t m_per_blk = 16
static constexpr index_t n_per_blk = 16
static constexpr index_t k_per_blk = 8
static constexpr bool is_k_reduction = true
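The inherited values above are mutually consistent, and the arithmetic explains why k_per_blk is 8 even though the instruction name says 16x16x16. The following is a minimal sketch, not Composable Kernel code: `gfx12_base_values` and `k_per_instruction()` are hypothetical names introduced only for illustration, `index_t` is assumed to be a 32-bit signed integer, and the reading of is_k_reduction as "the input blocks are summed along K" is an interpretation of the field names rather than something this page states.

```cpp
// Standalone mirror of the attribute values listed above.
// `gfx12_base_values` is a hypothetical name, not a Composable Kernel type.
using index_t = int; // assumption: index_t is a 32-bit signed integer

struct gfx12_base_values
{
    static constexpr index_t group_size          = 8;
    static constexpr index_t num_groups_per_blk  = 1;
    static constexpr index_t num_regs_per_blk    = 8;
    static constexpr index_t num_threads_per_blk = 16;
    static constexpr index_t wave_size           = 32;
    static constexpr index_t num_input_blks      = 2;
    static constexpr index_t num_output_blks     = 1;
    static constexpr index_t m_per_blk           = 16;
    static constexpr index_t n_per_blk           = 16;
    static constexpr index_t k_per_blk           = 8;
    static constexpr bool is_k_reduction         = true;
};

using base_values = gfx12_base_values;

// The registers of a block are the groups times the group size: 8 * 1 == 8.
static_assert(base_values::group_size * base_values::num_groups_per_blk ==
                  base_values::num_regs_per_blk,
              "register grouping mismatch");

// The threads of all input blocks add up to one wavefront: 16 * 2 == 32.
static_assert(base_values::num_threads_per_blk * base_values::num_input_blks ==
                  base_values::wave_size,
              "thread count mismatch");

// Lanes times registers per lane equals the output tile size: 32 * 8 == 16 * 16.
static_assert(base_values::wave_size * base_values::num_regs_per_blk ==
                  base_values::m_per_blk * base_values::n_per_blk *
                      base_values::num_output_blks,
              "output tile size mismatch");

// Hypothetical helper: K extent covered by one instruction, assuming the
// input blocks are reduced along K when is_k_reduction is true.
constexpr index_t k_per_instruction()
{
    return base_values::is_k_reduction
               ? base_values::k_per_blk * base_values::num_input_blks
               : base_values::k_per_blk;
}

// 8 * 2 == 16, matching the K in "16x16x16".
static_assert(k_per_instruction() == 16, "K extent mismatch");
```

Because every check is a static_assert, compiling the snippet is the test: a change to any one value that breaks the relationships fails at compile time.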

Member Function Documentation

◆ run()

template<index_t MPerWmma, index_t NPerWmma, class FloatA, class FloatB, class FloatC>
__device__ void ck::mfma_type< MfmaInstr::wmma_f32_16x16x16_f16_gfx12 >::run (const FloatA &a, const FloatB &b, FloatC &reg_c) const [inline]
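This page gives no description of run(), so the sketch below only illustrates the arithmetic that the instruction name suggests: a 16x16 f32 tile accumulated from f16 inputs over K = 16, that is, C += A * B. It is a host-side reference under stated assumptions, not the device implementation. The `sketch` namespace, the tile aliases, `reference_wmma_16x16x16`, the row-major layout, and the use of `float` in place of f16 are all hypothetical choices made for illustration; the real function operates on per-lane register fragments whose layout is not described here.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Tile extents taken from the instruction name "16x16x16".
constexpr std::size_t M = 16;
constexpr std::size_t N = 16;
constexpr std::size_t K = 16;

// Assumption: row-major tiles; float stands in for f16 on the host.
using TileA = std::array<float, M * K>;
using TileB = std::array<float, K * N>;
using TileC = std::array<float, M * N>;

// Hypothetical reference: accumulates A * B into C (C += A * B).
inline void reference_wmma_16x16x16(const TileA& a, const TileB& b, TileC& c)
{
    for(std::size_t m = 0; m < M; ++m)
    {
        for(std::size_t n = 0; n < N; ++n)
        {
            float acc = c[m * N + n]; // start from the existing accumulator
            for(std::size_t k = 0; k < K; ++k)
            {
                acc += a[m * K + k] * b[k * N + n];
            }
            c[m * N + n] = acc;
        }
    }
}

} // namespace sketch
```

A reference like this can be useful for checking device results element by element, provided the per-lane fragments are first gathered into full tiles.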

The documentation for this struct was generated from the following file:

xdlops_gemm.hpp