Commit Graph

21 Commits

Author SHA1 Message Date
Philippe Tillet
b8a52c70c9 [LANG] Now requiring tiles have power of 2 number of elements 2021-07-27 12:38:48 -07:00
Philippe Tillet
ad5a30bae1 [LANG] Added __debug_barrier() call to force insertion of a CUDA
__syncthreads
2021-07-27 12:38:48 -07:00
Philippe Tillet
9f9d7b8840 [LANG] Fixed parsing error for built-in functions exp/log/sqrtf 2021-07-27 12:38:48 -07:00
Philippe Tillet
269ebc12e5 [PYTHON][TESTS][DOC] Various improvement of the API and code quality:
* Simplified `triton.kernel` API to achieve lower latency:
  > .data_ptr() must now be passed as kernel argument. No more implicit
conversion from torch.tensor
  > compilation options are now constant attributes, i.e., opt.d('VAR')
becomes opt.VAR
  > torch.device must now be passed explicitly to triton.kernel (no
longer inferred from torch.tensor arguments)
* C++ tests moved to `python/tests/`
* C++ tutorial created in `tutorials/`
* Python tutorial created in python/tutorials/
* Version changed to 1.0alpha
* No longer copying C++ headers into the Python package
* added python/triton/ops/ package for pre-written Triton ops
2021-07-27 12:38:48 -07:00
Philippe Tillet
083bbd1e8d [GENERAL] Merged v1.0alpha into master. Added features are:
- A100 support via mma.16816
- Thread swizzling for conflict-free shared memory accesses without
padding
- Complete overhaul of the LLVM code generation in
codegen/selection/generator.cc to remove overengineering
- Added debugging capabilities in the Python binding
- Compilation error for kernels that spill
2021-07-27 12:38:48 -07:00
Yan Da
01ef691b84 [LANG] Fix gep bug in INC 2021-07-27 12:38:48 -07:00
Yan Da
e9b2335224 [LANG] Add support for POSTFIX_INC and POSTFIX_DEC, and pointer type 2021-07-27 12:38:48 -07:00
Yan Da
05b95b7fa6 [LANG] Add support for PREFIX_INC and PREFIX_DEC. 2021-07-27 12:38:48 -07:00
Philippe Tillet
7d095ec686 [LANG] Added sqrtf support 2021-07-27 12:38:48 -07:00
Philippe Tillet
8f8d36c7a4 [GENERAL] Various bugfixes 2021-07-27 12:38:48 -07:00
Philippe Tillet
50587bbf4b [General] LLVM-9 -> LLVM-10 2021-07-27 12:38:48 -07:00
Philippe Tillet
f152150e7d [LANG] Added log intrinsic 2021-07-27 12:38:48 -07:00
Philippe Tillet
049ab989b5 [GENERAL] Various improvements:
* Sparse einsum in triton.ops.einsum
* Hacky support for fixed-tile-size atomic-add
* Various bugfixes in parser
2021-07-27 12:38:48 -07:00
Philippe Tillet
acff1b5e05 [RUNTIME] Lower-level interface for executing functions 2021-07-27 12:38:48 -07:00
Philippe Tillet
e7461a862b [CODEGEN] Bugfix in Disassociate pass; Added fp32 atomic_add support 2021-07-27 12:38:48 -07:00
Philippe Tillet
bb2d98ce4b [LANG] Added support for flattening 2021-07-27 12:38:48 -07:00
Philippe Tillet
ddd89e1b22 [GENERAL] Fixed some undefined behavior with GCC-9 2021-07-27 12:38:48 -07:00
Philippe Tillet
5943baa53f [GENERAL] Error messages now no longer make terminal color green 2021-07-27 12:38:48 -07:00
Philippe Tillet
3304629de9 [CORE] Fixed several issues that arose in the development of the
torch-blocksparse package:

* Now using warp shuffle in reductions when possible
* Various bugfixes in layout inference
* Added INFINITY, exponential and select
* Better error messages for unimplemented constructs
2021-07-27 12:38:48 -07:00
Philippe Tillet
d22cf4f717 [TRITON][LANG] Added support for bitcast 2021-07-27 12:38:48 -07:00
Philippe Tillet
6d7cf35123 History prior to this date belonged to the now deprecated ISAAC project, and was deleted to save space 2021-07-27 12:38:38 -07:00