Commit Graph

300 Commits

Author SHA1 Message Date
Philippe Tillet
fedbe6f439 [PYTHON] Added triton.__version__ string 2021-07-27 12:38:48 -07:00
Philippe Tillet
6fb4800f57 Improvements w/ Auto-Tuning and standard benchmarks (#57)
[PYTHON] Bug-fixes in the auto-tuning module and improvement of the existing API for it
2021-07-27 12:38:48 -07:00
Philippe Tillet
ad005d49ac [PYTHON] Added benchmark code for CUTLASS 2021-07-27 12:38:48 -07:00
Philippe Tillet
ad5a30bae1 [LANG] Added __debug_barrier() call to force insertion of a CUDA
__syncthreads
2021-07-27 12:38:48 -07:00
Philippe Tillet
3fde4b8f5b [RUNTIME] Auto-tuning now works as expected when the values of
autotune_key change
2021-07-27 12:38:48 -07:00
Philippe Tillet
52af8cda34 [PYTHON] Fixed issue with IS_TK_DIV_K 2021-07-27 12:38:48 -07:00
Philippe Tillet
7cf358a352 [TUTORIALS] Fixed TYPO in CMakeLists.txt 2021-07-27 12:38:48 -07:00
Philippe Tillet
9b31244897 [PYTHON] Added benchmarking code 2021-07-27 12:38:48 -07:00
Philippe Tillet
0b025db2ee [RUNTIME] Added option to print LLVM-IR
Also includes appropriate driver code change for that
2021-07-27 12:38:48 -07:00
Philippe Tillet
7ba242fcce [PYTHON][OPS] Added block-sparse softmax 2021-07-27 12:38:48 -07:00
Philippe Tillet
f81da73b6a [PYTHON] Added utility to read single Triton kernel from provided file
in triton.read
2021-07-27 12:38:48 -07:00
Philippe Tillet
9f9d7b8840 [LANG] Fixed parsing error for built-in functions exp/log/sqrtf 2021-07-27 12:38:48 -07:00
Philippe Tillet
269ebc12e5 [PYTHON][TESTS][DOC] Various improvement of the API and code quality:
* Simplified `triton.kernel` API to achieve lower latency:
  > .data_ptr() must now be passed as kernel argument. No more implicit
conversion from torch.tensor
  > compilation options are now constant attributes, i.e., opt.d('VAR')
becomes opt.VAR
  > torch.device must now be passed explicitly to triton.kernel (no
longer inferred from torch.tensor arguments)
* C++ tests moved to `python/tests/`
* C++ tutorial created in `tutorials/`
* Python tutorial created in python/tutorials/
* Version changed to 1.0alpha
* No longer copying C++ headers into the Python package
* added python/triton/ops/ package for pre-written Triton ops
2021-07-27 12:38:48 -07:00
Philippe Tillet
a5a477c36b [CODEGEN] Fixed bug in recoalesce_inst LLVM codegen 2021-07-27 12:38:48 -07:00
Philippe Tillet
376c876eb8 [RUNTIME] Disable error on spills 2021-07-27 12:38:48 -07:00
Philippe Tillet
3b36a1e60c [CODEGEN] Fixed issue in traversal order for atomic_add and store_inst 2021-07-27 12:38:48 -07:00
Philippe Tillet
fdc8e8ef61 [TESTS] Fixed bug in how test arguments are enqueued 2021-07-27 12:38:48 -07:00
Philippe Tillet
083bbd1e8d [GENERAL] Merged v1.0alpha into master. Added features are:
- A100 support via mma.16816
- Thread swizzling for conflict-free shared memory accesses without
padding
- Complete overhaul of the LLVM code generation in
codegen/selection/generator.cc to remove overengineering
- Added debugging capabilities in the Python binding
- Compilation error for kernels that spill
2021-07-27 12:38:48 -07:00
Philippe Tillet
c0bc7ed8b0 [PYTHON] Added TRITON_DEBUG_MODE which reallocates input tensors outside of the pytorch memory pool to spot out-of-bounds accesses more easily 2021-07-27 12:38:48 -07:00
Philippe Tillet
c4fceeea49 [LANG] Added hacky min/max 2021-07-27 12:38:48 -07:00
Philippe Tillet
d70f54fd6a Merge pull request #45 from daadaada/master
[LANG] Add support for PREFIX_INC, PREFIX_DEC, POSTFIX_INC and POSTFIX_DEC
2021-07-27 12:38:48 -07:00
Philippe Tillet
547a99a5d4 [VERSION] 0.2.3 -> 0.3.0 2021-07-27 12:38:48 -07:00
Yan Da
27dc780871 [IR] Check constant_int type 2021-07-27 12:38:48 -07:00
Philippe Tillet
fd5c72d6a0 [LANG] Added some more atomic_add support 2021-07-27 12:38:48 -07:00
Yan Da
01ef691b84 [LANG] Fix gep bug in INC 2021-07-27 12:38:48 -07:00
Philippe Tillet
5e8f4c934c [DRIVER] Better exception handling of invalid ptx 2021-07-27 12:38:48 -07:00
Yan Da
e9b2335224 [LANG] Add support for POSTFIX_INC and POSTFIX_DEC, and pointer type 2021-07-27 12:38:48 -07:00
Philippe Tillet
44ca2c0cb8 [DRIVER] Removed deprecated files and functions 2021-07-27 12:38:48 -07:00
Yan Da
05b95b7fa6 [LANG] Add support for PREFIX_INC and PREFIX_DEC. 2021-07-27 12:38:48 -07:00
Philippe Tillet
7ab2c2a356 [DRIVER] Removed obsolete SetArg 2021-07-27 12:38:48 -07:00
Philippe Tillet
8ab62803db [PYTHON] Context switching logic moved to PyTorch 2021-07-27 12:38:48 -07:00
Philippe Tillet
4f08d87fed [DRIVER] Simplified Driver API by substantially removing reliance on driver::context 2021-07-27 12:38:48 -07:00
Philippe Tillet
f42b04d925 [DRIVER] Added (slow) support for CUDA11 and Ampere 2021-07-27 12:38:48 -07:00
Philippe Tillet
baa858aa74 [CODEGEN] Fixed bug in atomic_add 2021-07-27 12:38:48 -07:00
Philippe Tillet
7d095ec686 [LANG] Added sqrtf support 2021-07-27 12:38:48 -07:00
Philippe Tillet
073fddffc1 [PYTHON] Compiling Triton in Release mode now... 2021-07-27 12:38:48 -07:00
Philippe Tillet
5d84fde733 tmp 2021-07-27 12:38:48 -07:00
Philippe Tillet
da287bb710 [CODEGEN] Progress on atom.add.f16x2 2021-07-27 12:38:48 -07:00
Philippe Tillet
a77c925dfd [DRIVER] Improved performance of Host driver code 2021-07-27 12:38:48 -07:00
Philippe Tillet
8f8d36c7a4 [GENERAL] Various bugfixes 2021-07-27 12:38:48 -07:00
Philippe Tillet
50587bbf4b [General] LLVM-9 -> LLVM-10 2021-07-27 12:38:48 -07:00
Philippe Tillet
8f3ee53f24 [PYTHON] Added option to show PTX source code in Python 2021-07-27 12:38:48 -07:00
Philippe Tillet
cf80ccc798 [PYTHON] Fixed torch ABI issue 2021-07-27 12:38:48 -07:00
Philippe Tillet
06abc8cb40 [GENERAL] Fix compatibility issue with older Torch versions 2021-07-27 12:38:48 -07:00
Philippe Tillet
f152150e7d [LANG] Added log intrinsic 2021-07-27 12:38:48 -07:00
Philippe Tillet
02a6e81b88 [PYTHON] Cleaning C++ bindings 2021-07-27 12:38:48 -07:00
Philippe Tillet
34f1d5e565 [CODEGEN] Fixed bug in 2D reductions 2021-07-27 12:38:48 -07:00
Philippe Tillet
049ab989b5 [GENERAL] Various improvements:
* Sparse einsum in triton.ops.einsum
* Hacky support for fixed-tile-size atomic-add
* Various bugfixes in parser
2021-07-27 12:38:48 -07:00
Philippe Tillet
444907589d [GENERAL] Fixed MacOS compilation issues 2021-07-27 12:38:48 -07:00
Philippe Tillet
664d3cae89 [DRIVER] Removed OpenCL support
There is no plan to support OpenCL anytime soon (Vulkan would be preferred). Removing the adequate portion of the driver code
2021-07-27 12:38:48 -07:00