triton

Files

Keren Zhou 3b80801dff [Triton-MLIR][Backend] Fix many problems to get the pipeline working (#809)

1. Rewrite code generation of insert_slice_async.
2. Correct the wrong index passed to extract_slice in the pipeline.
3. Add a prologue in the pipeline to wait for dangling cp.asyncs.
4. Move the scf-to-cf conversion inside TritonGPUToLLVM because we need to
perform membar before scf-to-cf. This shouldn't be a technical limitation
and could be improved by a more general membar analysis.
5. Use an attribute to memoize the shared memory size and support
dynamic shared memory.
6. Prevent the combine pass from reordering insert_slice and extract_slice
across async_wait.

Co-authored-by: Superjomn <yanchunwei@outlook.com>

2022-10-27 22:09:06 -07:00

triton_ops.mlir

[OPTIMIZER] Made layout simplification pass efficient for fused attention kernels (#790)

2022-10-21 16:52:15 -07:00

triton_to_tritongpu.mlir

[OPTIMIZER] Made layout simplification pass efficient for fused attention kernels (#790)

2022-10-21 16:52:15 -07:00

tritongpu_to_llvm.mlir

[Triton-MLIR][Backend] Fix many problems to get the pipeline working (#809)

2022-10-27 22:09:06 -07:00