Commit Graph

33 Commits

Author SHA1 Message Date
Philippe Tillet
90ded16c32 [DOCS] Added placeholder docstring for layernorm tutorial 2021-10-15 19:04:01 -07:00
Philippe Tillet
d4baad426d [DOCS] Added layer norm example (#326) 2021-10-08 11:02:10 -07:00
Philippe Tillet
4163d32c49 [DOCS] Fixed leftover exit() in 01-vector-add tutorial 2021-09-10 15:52:26 -07:00
Philippe Tillet
ac10551d55 [PYTHON] Now providing triton.next_power_of_2 (#273) 2021-09-10 11:05:44 -07:00
Philippe Tillet
94c83d30ce [GENERAL] Removed deprecated driver files and added basic compatibility with rocm (#268)
- Removed driver module -- accelerator runtime is handled by pytorch
- Added basic support for ROCM based on @micmelesse 's PR -- now can execute empty kernel on AMD devices without any compile-time changes
- Now only using PREFER_SHARED for kernels when the size of shared memory is greater than 49k. Otherwise there can be poor L1 performance for broadcast tensors
2021-09-09 00:04:28 -07:00
Szymon Sidor
8bedcce9be [LANG] Added seeded random number generation - philox (#261) 2021-09-02 22:02:40 -07:00
Sasank Chilamkurthy
6aa5720d75 [DOCS] use numel for num_elements in elementwise tutorial (#228) 2021-08-19 19:35:12 -07:00
Philippe Tillet
f26a48a3b4 [DOCS] Various improvements (#224)
- Added docstr for autotune, Config, heuristics
- Added docstr for atomics
- Hiding internal _builder argument used for built-in language primitives
- Re-factor docstr to use common templates between similar functions.
2021-08-18 11:15:53 -07:00
Philippe Tillet
70e28ff380 [DOCS] Minor modifications of the matmul tutorial (#199)
Making the code more compact and fixing inconsistencies between text variable names and final python program.
2021-08-11 18:59:15 -07:00
Philippe Tillet
398d4b4aeb [DOCS] softmax tutorial fixup (#198) 2021-08-11 17:35:00 -07:00
Nicholas Joseph
6cd1ec3955 [DOCS] Fix formatting mistakes (#192) 2021-08-06 12:58:43 -07:00
Nicholas Joseph
68f7eeba92 [DOCS] Improve matmul tutorial readability (#188) 2021-08-05 16:05:56 -07:00
Nicholas Joseph
4e6f667c2f [DOCS] Improve readability of 02-fused-softmax.py (#186) 2021-08-05 09:39:07 -07:00
Nicholas Joseph
23c71538fc [DOCS] Improve tutorial readability (#185) 2021-08-05 09:27:06 -07:00
Xiangru Lian
9967e9d4b4 [DOCS] Fix fused softmax example script naive softmax implementation (#178) 2021-08-02 09:37:31 -07:00
Philippe Tillet
acd5e44611 [GENERAL] Some minor improvements here and there to build systems and docs (#148) 2021-07-28 01:51:17 -07:00
Philippe Tillet
b253b77c71 [DOCS] Improved documentation and integration in CI (#139) 2021-07-27 12:38:49 -07:00
Philippe Tillet
b7b05a560e [DRIVER] Now giving the option to use system ptxas through environment variable (#123) 2021-07-27 12:38:49 -07:00
Philippe Tillet
840140bf26 [CODEGEN] Removed dedicated reassociate pass to merge it into LLVM isel (#101)
This massively simplifies implementation of `reassociate` and also fixes
a bunch of bug. The pass could still be improved, but can already be used
to generate constant pointer offsets in eg the matmul epilogue
2021-07-27 12:38:49 -07:00
Philippe Tillet
bfc0a7587d [PYTHON] Renamed triton.core -> triton.language (#92) 2021-07-27 12:38:49 -07:00
Philippe Tillet
29e33e50b7 [DOCS] Updates and improvements (#87) 2021-07-27 12:38:49 -07:00
Philippe Tillet
39f4730305 Deprecation of Triton-C and Replacement by decorated Python functions (#86)
This PR implements a major overhaul of the frontend for Triton, and replaces Triton-C by a pure Python API in which kernels are defined as @triton.jit decorated functions. The documentation and tutorials have also been updated to accommodate these changes.

See documentations for more information on the new API
2021-07-27 12:38:49 -07:00
Philippe Tillet
1fdb465b71 [DOCS] Various improvements and typo fixes 2021-07-27 12:38:49 -07:00
Philippe Tillet
183878dce5 [DOCS] Added matrix multiplication tutorial 2021-07-27 12:38:49 -07:00
Philippe Tillet
50e58d73db [DOCS] Improved plots in tutorials 2021-07-27 12:38:49 -07:00
Philippe Tillet
d1d09566b1 [DOCS] Improved tutorials documentation 2021-07-27 12:38:49 -07:00
Philippe Tillet
92242ace2c [DOCS] Re-structured documentation hierarchy 2021-07-27 12:38:49 -07:00
Philippe Tillet
ca04da3575 [DOCS] Switched tutorials to Python and use Sphinx Gallery 2021-07-27 12:38:49 -07:00
Philippe Tillet
5172792543 [DOCS] Added .ipynb tutorials in docs 2021-07-27 12:38:49 -07:00
Philippe Tillet
3ecf834a69 [PYTHON] Deleted 01-vector-add.py: it is an unnecessary duplicate of
01-vector-add.ipynb
2021-07-27 12:38:49 -07:00
Philippe Tillet
62835a0979 [RUNTIME] Added auto-alignment mechanism (#71)
This PR adds an automatic memory alignment mechanism in the Triton runtime. Specifically, the JIT compiler detects the alignment (in bytes) of each pointer argument as well as the largest power of two divisor (between 1 and 16) of each integer argument. Proper .aligned and .multipleof attributes are then added to the Triton-IR on-the-fly for all auto-tunable kernels. There is a cache that remembers all the kernels compiled for each possible configuration.

This PR also includes substantial cleaning of the Python API. This adds 2-3us overhead, mostly due to accessing integer #defines from the auto-tuned compilation options. The previous solution was slightly faster but hacky and potentially unsafe, so this is preferred for now.
2021-07-27 12:38:49 -07:00
Philippe Tillet
50ff1aea86 [DOCS] Added Python 02-fused-softmax.ipynb tutorial 2021-07-27 12:38:49 -07:00
Philippe Tillet
269ebc12e5 [PYTHON][TESTS][DOC] Various improvement of the API and code quality:
* Simplified `triton.kernel` API to achieve lower latency:
  > .data_ptr() must now be passed as kernel argument. No more implicit
conversion from torch.tensor
  > compilation options are now constant attributes, i.e., opt.d('VAR')
becomes opt.VAR
  > torch.device must now be passed explicitly to triton.kernel (no
longer inferred from torch.tensor arguments)
* C++ tests moved to `python/tests/`
* C++ tutorial created in `tutorials/`
* Python tutorial created in python/tutorials/
* Version changed to 1.0alpha
* No longer copying C++ headers into the Python package
* added python/triton/ops/ package for pre-written Triton ops
2021-07-27 12:38:48 -07:00