Commit Graph

68 Commits

Author SHA1 Message Date
Philippe Tillet
acd5e44611 [GENERAL] Some minor improvements here and there to build systems and docs (#148) 2021-07-28 01:51:17 -07:00
Philippe Tillet
b253b77c71 [DOCS] Improved documentation and integration in CI (#139) 2021-07-27 12:38:49 -07:00
Philippe Tillet
b7b05a560e [DRIVER] Now giving the option to use system ptxas through environment variable (#123) 2021-07-27 12:38:49 -07:00
Philippe Tillet
840140bf26 [CODEGEN] Removed dedicated reassociate pass to merge it into LLVM isel (#101)
This massively simplifies implementation of `reassociate` and also fixes
a bunch of bug. The pass could still be improved, but can already be used
to generate constant pointer offsets in eg the matmul epilogue
2021-07-27 12:38:49 -07:00
Philippe Tillet
bfc0a7587d [PYTHON] Renamed triton.core -> triton.language (#92) 2021-07-27 12:38:49 -07:00
Philippe Tillet
29e33e50b7 [DOCS] Updates and improvements (#87) 2021-07-27 12:38:49 -07:00
Philippe Tillet
39f4730305 Deprecation of Triton-C and Replacement by decorated Python functions (#86)
This PR implements a major overhaul of the frontend for Triton, and replaces Triton-C by a pure Python API in which kernels are defined as @triton.jit decorated functions. The documentation and tutorials have also been updated to accommodate these changes.

See documentations for more information on the new API
2021-07-27 12:38:49 -07:00
Philippe Tillet
1fdb465b71 [DOCS] Various improvements and typo fixes 2021-07-27 12:38:49 -07:00
Philippe Tillet
183878dce5 [DOCS] Added matrix multiplication tutorial 2021-07-27 12:38:49 -07:00
Philippe Tillet
50e58d73db [DOCS] Improved plots in tutorials 2021-07-27 12:38:49 -07:00
Philippe Tillet
d1d09566b1 [DOCS] Improved tutorials documentation 2021-07-27 12:38:49 -07:00
Philippe Tillet
92242ace2c [DOCS] Re-structured documentation hierarchy 2021-07-27 12:38:49 -07:00
Philippe Tillet
ca04da3575 [DOCS] Switched tutorials to Python and use Sphinx Gallery 2021-07-27 12:38:49 -07:00
Philippe Tillet
5172792543 [DOCS] Added .ipynb tutorials in docs 2021-07-27 12:38:49 -07:00
Philippe Tillet
3ecf834a69 [PYTHON] Deleted 01-vector-add.py: it is an unnecessary duplicate of
01-vector-add.ipynb
2021-07-27 12:38:49 -07:00
Philippe Tillet
62835a0979 [RUNTIME] Added auto-alignment mechanism (#71)
This PR adds an automatic memory alignment mechanism in the Triton runtime. Specifically, the JIT compiler detects the alignment (in bytes) of each pointer argument as well as the largest power of two divisor (between 1 and 16) of each integer argument. Proper .aligned and .multipleof attributes are then added to the Triton-IR on-the-fly for all auto-tunable kernels. There is a cache that remembers all the kernels compiled for each possible configuration.

This PR also includes substantial cleaning of the Python API. This adds 2-3us overhead, mostly due to accessing integer #defines from the auto-tuned compilation options. The previous solution was slightly faster but hacky and potentially unsafe, so this is preferred for now.
2021-07-27 12:38:49 -07:00
Philippe Tillet
50ff1aea86 [DOCS] Added Python 02-fused-softmax.ipynb tutorial 2021-07-27 12:38:49 -07:00
Philippe Tillet
269ebc12e5 [PYTHON][TESTS][DOC] Various improvement of the API and code quality:
* Simplified `triton.kernel` API to achieve lower latency:
  > .data_ptr() must now be passed as kernel argument. No more implicit
conversion from torch.tensor
  > compilation options are now constant attributes, i.e., opt.d('VAR')
becomes opt.VAR
  > torch.device must now be passed explicitly to triton.kernel (no
longer inferred from torch.tensor arguments)
* C++ tests moved to `python/tests/`
* C++ tutorial created in `tutorials/`
* Python tutorial created in python/tutorials/
* Version changed to 1.0alpha
* No longer copying C++ headers into the Python package
* added python/triton/ops/ package for pre-written Triton ops
2021-07-27 12:38:48 -07:00