[Triton-MLIR] Replace triton.extract_slice with tensor.extract_slice and support more general tensor slicing (#837)

## Features

- Allow taking a block of tensor slice, as long as each dimension is
contiguous (unit stride).
- Fix some problems in `insert_slice_async`'s semantic.
- More general verification for ops that return shared layout encoding.

## Known Limitations

- `insert_slice_async` still uses the old semantic. May submit another
PR later to support similar semantic like `tensor.extract_slice`.
- No encoding verification for `tensor.extract_slice`.
- 3d tensor ops are broken.
- Strided accesses are not allowed.
- May cause a little performance slowdown since we are passing strides
as values but not constants (e.g., int).
It would be difficult to pass strides as attributes when we have control
flows. A block argument is possible to accept tensors with different
strides.
This commit is contained in:
Keren Zhou
2022-11-06 22:59:03 -08:00
committed by GitHub
parent a4ff0c362c
commit fdd59900f7
26 changed files with 507 additions and 339 deletions

View File

@@ -1,4 +1,5 @@
#include "triton/Analysis/Alias.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "triton/Analysis/Utility.h"
#include "triton/Dialect/TritonGPU/IR/Dialect.h"
@@ -24,18 +25,18 @@ ChangeResult SharedMemoryAliasAnalysis::visitOperation(
if (maybeSharedAllocationOp(op)) {
// These ops may allocate a new shared memory buffer.
auto result = op->getResult(0);
if (isSharedEncoding(result)) {
// FIXME(Keren): extract and insert are always alias for now
if (auto extractSliceOp = dyn_cast<triton::gpu::ExtractSliceOp>(op)) {
// extract_slice %src, %index
aliasInfo = AliasInfo(operands[0]->getValue());
} else if (auto insertSliceOp =
dyn_cast<triton::gpu::InsertSliceAsyncOp>(op)) {
// insert_slice_async %src, %dst, %index
aliasInfo = AliasInfo(operands[1]->getValue());
} else {
aliasInfo.insert(result);
}
// FIXME(Keren): extract and insert are always alias for now
if (auto extractSliceOp = dyn_cast<tensor::ExtractSliceOp>(op)) {
// extract_slice %src
aliasInfo = AliasInfo(operands[0]->getValue());
pessimistic = false;
} else if (auto insertSliceOp =
dyn_cast<triton::gpu::InsertSliceAsyncOp>(op)) {
// insert_slice_async %src, %dst, %index
aliasInfo = AliasInfo(operands[1]->getValue());
pessimistic = false;
} else if (isSharedEncoding(result)) {
aliasInfo.insert(result);
pessimistic = false;
}
}