Commit Graph

311 Commits

Author SHA1 Message Date
Benjamin Lefaudeux
bd855ac13d [DOCS] Adding some doc on the benchmarks + requirements file (#285) 2021-09-18 16:37:30 -07:00
Philippe Tillet
313d6488f6 [CODEGEN] Fixed over-aggressive division handling in alignment pass (#280) 2021-09-15 00:40:17 -07:00
Philippe Tillet
da5063d898 [TEST] Added performance regression tests (#283) 2021-09-14 01:46:32 -07:00
Philippe Tillet
8fdd7e7ed6 [LANG] Fixed semantics of boolean load/store (#282) 2021-09-13 17:39:06 -07:00
Philippe Tillet
3e395bc84e [LANG] Fixed semantics of NaN in float comparisons (#281) 2021-09-13 15:06:29 -07:00
Min Xu
cecca90bea [DOCS] update installation doc and add gitignore (#279)
Co-authored-by: Min Xu <min.xu.public@gmail.com>
2021-09-12 21:11:45 -07:00
Philippe Tillet
4163d32c49 [DOCS] Fixed leftover exit() in 01-vector-add tutorial 2021-09-10 15:52:26 -07:00
Philippe Tillet
34369906b4 [PYTHON] Fix-up the previous commit 2021-09-10 11:13:25 -07:00
Philippe Tillet
ac10551d55 [PYTHON] Now providing triton.next_power_of_2 (#273) 2021-09-10 11:05:44 -07:00
Philippe Tillet
43723ccb95 [FRONTEND] Removed circular import that broke Python 3.6 support (#272) 2021-09-09 13:46:55 -07:00
Philippe Tillet
585e5cd0ec [TEST] Added test for empty kernel (#271) 2021-09-09 10:20:37 -07:00
Philippe Tillet
94c83d30ce [GENERAL] Removed deprecated driver files and added basic compatibility with rocm (#268)
- Removed driver module -- accelerator runtime is handled by pytorch
- Added basic support for ROCM based on @micmelesse's PR -- now can execute empty kernel on AMD devices without any compile-time changes
- Now only using PREFER_SHARED for kernels when the size of shared memory is greater than 49k. Otherwise there can be poor L1 performance for broadcast tensors
2021-09-09 00:04:28 -07:00
Szymon Sidor
8bedcce9be [LANG] Added seeded random number generation - philox (#261) 2021-09-02 22:02:40 -07:00
Philippe Tillet
c069ef907e [PYTHON] triton.language is now a submodule rather than a single file (#260) 2021-09-02 13:30:14 -07:00
Philippe Tillet
8a882b215f [CODEGEN] Fixed performance regression on vectorized loads (#259) 2021-09-02 01:07:31 -07:00
Philippe Tillet
768e0ded28 [CODEGEN] Fixed bug in pipelining pass and casting semantics analysis (#257) 2021-09-01 20:58:47 -07:00
Rohit Dwivedula
c0daffc625 [DOCS] @heuristics -> @triton.heuristics in some snippets (#253) 2021-09-01 18:50:17 -07:00
daadaada
274d613488 [IR] Better printer (#256) 2021-09-01 09:55:12 -07:00
Philippe Tillet
4ff3714d61 [CODEGEN] Various bugfixes and stability improvements in compiler backend (#240) 2021-08-30 11:50:35 -07:00
daadaada
85426dbaf7 [DOCS] Add comments in layout.h (#249) 2021-08-28 18:07:32 -07:00
milesial
5b29da719d [DRIVER] Add CUDA P2P support (#209) 2021-08-20 21:00:54 -07:00
Sasank Chilamkurthy
6aa5720d75 [DOCS] use numel for num_elements in elementwise tutorial (#228) 2021-08-19 19:35:12 -07:00
Philippe Tillet
f26a48a3b4 [DOCS] Various improvements (#224)
- Added docstr for autotune, Config, heuristics
- Added docstr for atomics
- Hiding internal _builder argument used for built-in language primitives
- Re-factor docstr to use common templates between similar functions.
2021-08-18 11:15:53 -07:00
Philippe Tillet
226fde6ea1 [CODEGEN] Now using atomic_rmw code path for atomic_xchg (#222) 2021-08-17 16:33:23 -07:00
Philippe Tillet
64b8e7222d [LICENSE] Edit copyright notice (#219) 2021-08-17 09:25:19 -07:00
Philippe Tillet
a714b6b856 [PYTHON] re-activated auto-tuner configurations for triton.ops.matmul (#212) 2021-08-16 22:56:21 -07:00
Philippe Tillet
bb1eebb4b4 [CODEGEN] Fixed bug for visit_reduce1d with 64-bit data-types (#207) 2021-08-14 21:07:01 -07:00
Philippe Tillet
6e7593b446 added reset_to_zero in vector addition (#205) 2021-08-14 10:58:38 -07:00
Philippe Tillet
c45c2e9684 [DOCS] Added docs for cos/sin/sqrt (#204) 2021-08-14 10:34:07 -07:00
Philippe Tillet
c7a272cb91 [FRONTEND] Added default arguments for range (#203) 2021-08-14 10:11:18 -07:00
Philippe Tillet
b120d70a0a [CI] Moved from assert_allclose to assert_almost_equal (#200) 2021-08-12 12:00:30 -07:00
Philippe Tillet
70e28ff380 [DOCS] Minor modifications of the matmul tutorial (#199)
Making the code more compact and fixing inconsistencies between text variable names and final Python program.
2021-08-11 18:59:15 -07:00
Philippe Tillet
398d4b4aeb [DOCS] softmax tutorial fixup (#198) 2021-08-11 17:35:00 -07:00
Philippe Tillet
83da7065da [DRIVER] Portability fixup (#195) 2021-08-07 18:53:11 -07:00
Philippe Tillet
298da78058 [CODEGEN/DRIVER] Tweaks for performance optimization (#193) 2021-08-07 16:41:44 -07:00
Nicholas Joseph
6cd1ec3955 [DOCS] Fix formatting mistakes (#192) 2021-08-06 12:58:43 -07:00
Nicholas Joseph
68f7eeba92 [DOCS] Improve matmul tutorial readability (#188) 2021-08-05 16:05:56 -07:00
Nicholas Joseph
4e6f667c2f [DOCS] Improve readability of 02-fused-softmax.py (#186) 2021-08-05 09:39:07 -07:00
Nicholas Joseph
23c71538fc [DOCS] Improve tutorial readability (#185) 2021-08-05 09:27:06 -07:00
Philippe Tillet
3cb77aa126 [README] Added "we're hiring!" with link to some of our blog posts (#180) 2021-08-02 16:46:26 -07:00
Xiangru Lian
9967e9d4b4 [DOCS] Fix fused softmax example script naive softmax implementation (#178) 2021-08-02 09:37:31 -07:00
Philippe Tillet
e8031fe61f [DRIVER] More robust support of unsupported CUDA version (#179) 2021-08-02 09:06:55 -07:00
milesial
b7cdf670c3 [DOCS] Fix related work (#172) 2021-08-01 11:06:37 -07:00
daadaada
c7060eadb2 [CODEGEN] Fix bug in auto-pipeline pass when a value depends on multiple phis (#164) 2021-07-31 23:40:36 -07:00
Philippe Tillet
c0bb895d9d [BUILD] More portable detection of terminfo (#173) 2021-07-31 17:09:49 -07:00
Philippe Tillet
a34c57402f [PYTHON] Improved error message for CPU (#167) 2021-07-30 09:47:27 -07:00
Ikko Ashimine
2293afece7 [README] GitHub format (#165)
Github -> GitHub
2021-07-30 09:47:08 -07:00
Philippe Tillet
cb5c280691 [DOCS] Added contributions section to README.md 2021-07-29 11:40:34 -07:00
Reid Draper
2322d6df2a [CI] Update ptillet to openai (#152) 2021-07-29 11:39:50 -07:00
Philippe Tillet
2f0f51be50 [DRIVER] No longer crashing when encountering CUDA version >11.4 2021-07-29 11:27:55 -07:00