## Features

- Allow taking a block slice of a tensor, as long as each dimension is contiguous (unit stride).
- Fix some problems in the semantics of `insert_slice_async`.
- More general verification for ops that return a shared layout encoding.

## Known Limitations

- `insert_slice_async` still uses the old semantics. A later PR may add semantics similar to those of `tensor.extract_slice`.
- No encoding verification for `tensor.extract_slice`.
- 3D tensor ops are broken.
- Strided accesses are not allowed.
- May cause a slight performance slowdown, since strides are passed as values rather than constants (e.g., int). Passing strides as attributes would be difficult in the presence of control flow, because a block argument can accept tensors with different strides.
12 lines
154 B
CMake
add_mlir_dialect_library(TritonGPUIR
  Dialect.cpp
  Traits.cpp

  DEPENDS
  TritonGPUTableGen
  TritonGPUAttrDefsIncGen

  LINK_LIBS PUBLIC
  TritonIR
)