Triton
This is the development repository of Triton, a language and compiler for writing highly efficient custom Deep-Learning primitives.
The formal foundations of this project are described in the following MAPL 2019 publication: Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations. Please cite us if you use our work!
The main features of Triton at the moment are:
- PyTriton: A Python API for writing custom operations backed by Triton-C compute kernels. PyTriton automatically generates and just-in-time compiles TensorFlow and PyTorch bindings.
- Triton-C: An imperative, single-threaded language for writing highly efficient compute-kernels at a relatively high abstraction level using numpy-like extensions of the C language.
- Triton-IR: An intermediate representation for optimizing multi-dimensional array operations in linear algebra programs.
- Triton-JIT: An optimizing just-in-time compiler for Triton-C, which generates GPU code on par with state-of-the-art CUDA-C (e.g., CUTLASS) and PTX (e.g., ISAAC). This includes transparent support for mixed-precision and Tensor Cores.
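To make the tiled, single-threaded programming model behind these components more concrete, here is a minimal conceptual sketch in plain Python. It is not Triton-C syntax and does not use the PyTriton API; the names `TILE`, `matmul_tile`, and `tiled_matmul` are hypothetical and exist only for illustration. The sketch assumes that a kernel is written as sequential code for one output tile and that the runtime launches one instance per tile, which is simulated here with an ordinary loop.

```python
# Conceptual sketch only: plain Python, not Triton-C or the PyTriton API.
# All names (TILE, matmul_tile, tiled_matmul) are hypothetical.

TILE = 2  # assumed tile size, chosen arbitrarily for the example


def matmul_tile(a, b, c, row0, col0, tile=TILE):
    """Compute the tile of c = a @ b whose top-left corner is (row0, col0).

    This plays the role of a single-threaded kernel body: it reads like
    sequential code and only writes to one output tile.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Clamp the tile bounds so edge tiles do not run past the matrix.
    for i in range(row0, min(row0 + tile, m)):
        for j in range(col0, min(col0 + tile, n)):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc


def tiled_matmul(a, b, tile=TILE):
    """Run one tile program per output tile (sequentially in this sketch)."""
    m, n = len(a), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for row0 in range(0, m, tile):
        for col0 in range(0, n, tile):
            matmul_tile(a, b, c, row0, col0, tile)
    return c
```

Because each call to `matmul_tile` writes a disjoint tile of the output, the calls are independent of one another, which is the property a compiler can exploit when mapping tiles onto parallel hardware.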
Installation
Triton is fairly self-contained: it ships its own parser (forked from wgtcc) and its own LLVM code generator. At the moment, however, it still relies on LLVM 8.0 or later for PTX code generation.
sudo apt-get install llvm-8-dev
git clone https://github.com/ptillet/triton.git
cd triton/python/
python setup.py develop
cd examples
python dot.py
Tutorials
- The PyTriton API
- The Triton-C language
- The Triton-IR representation (coming soon...)
- The Triton-JIT compiler (coming soon...)