Commit Graph

287 Commits

Author SHA1 Message Date
Philippe Tillet
29a0ad6c4d [DRIVER] Now always using PTXv6.4 2021-07-27 12:38:48 -07:00
Philippe Tillet
c251dc50f3 [PACKAGING] Now version 0.2.0 2021-07-27 12:38:48 -07:00
Philippe Tillet
4ccd78f1a6 [EXAMPLES][TUTORIAL] Changed to new triton.kernel API 2021-07-27 12:38:48 -07:00
Philippe Tillet
c33d6d15f5 [TRITON][PYTHON] Reverted back to distutils 2021-07-27 12:38:48 -07:00
Philippe Tillet
955b027103 [TRITON][KERNEL] Fixed issue for concurrent compilation of torch extensions 2021-07-27 12:38:48 -07:00
Philippe Tillet
8bdfbe2514 [ANALYSIS] Replaced min by gcd in layout inference 2021-07-27 12:38:48 -07:00
Philippe Tillet
e18f169a39 [CODEGEN] Fixed various issues in alignment inference pass 2021-07-27 12:38:48 -07:00
Philippe Tillet
da6008128e [CODEGEN] Fixed bug in alignment inference that prevented vectorization in some cases 2021-07-27 12:38:48 -07:00
Philippe Tillet
ca5b7c5df4 [README] Changed requirement to LLVM-9 2021-07-27 12:38:48 -07:00
Philippe Tillet
d85141182d [PACKAGING] Now version 0.1.3 2021-07-27 12:38:48 -07:00
Philippe Tillet
4bb0311f60 [TRITON] Fixed misaligned address issue 2021-07-27 12:38:48 -07:00
Philippe Tillet
a8f1b85c5f [CODEGEN] Removed unnecessary coalescing rematerialization 2021-07-27 12:38:48 -07:00
Philippe Tillet
5995cbff8e [CORE] Auto-tuning now copies scalar buffers. Still needs to copy all buffers that are both read from and written to. 2021-07-27 12:38:48 -07:00
Philippe Tillet
78cd54b0c8 [PYTHON] Added support for FP16 scalar kernel arguments 2021-07-27 12:38:48 -07:00
Philippe Tillet
e7461a862b [CODEGEN] Bugfix in Disassociate pass; Added fp32 atomic_add support 2021-07-27 12:38:48 -07:00
Philippe Tillet
bb2d98ce4b [LANG] Added support for flattening 2021-07-27 12:38:48 -07:00
Philippe Tillet
694bfbddf9 [PACKAGING] Now version 0.1.2 2021-07-27 12:38:48 -07:00
Philippe Tillet
13ff6472e0 [LANG] Fixed undefined behavior in replace_all_uses_with() 2021-07-27 12:38:48 -07:00
Philippe Tillet
ddd89e1b22 [GENERAL] Fixed some undefined behavior with GCC-9 2021-07-27 12:38:48 -07:00
Philippe Tillet
0516ea96d0 [CODEGEN] Fixed bug that caused missing recoalescing for some transpose operations 2021-07-27 12:38:48 -07:00
Philippe Tillet
0c5bd7563a [README] Improved wording 2021-07-27 12:38:48 -07:00
Philippe Tillet
f35b9100e2 [PYTHON] Restored compatibility with powerpc 2021-07-27 12:38:48 -07:00
Philippe Tillet
1426b103e9 [PYTHON] Removed -std=gnu++11 in extra_cflags 2021-07-27 12:38:48 -07:00
Philippe Tillet
04a9ea060b [GENERAL] Added compatibility with pytorch 1.2.0 and powerpc 2021-07-27 12:38:48 -07:00
Philippe Tillet
9984ee8c7a [DOCS] Added pip command in README.md 2021-07-27 12:38:48 -07:00
Philippe Tillet
32d615f8f8 [DOCS] Now specifying pip command in installation.rst 2021-07-27 12:38:48 -07:00
Philippe Tillet
ab75fbccc0 Merge pull request #38 from jack-willturner/master
Add working examples to tutorials and python examples folder
2021-07-27 12:38:48 -07:00
Philippe Tillet
609ef3a24d [CORE] Fixed bug for Multi-GPU 2021-07-27 12:38:48 -07:00
jack-willturner
180ed26b61 [DOCS] Transposition fix 2021-07-27 12:38:48 -07:00
Philippe Tillet
24586e60aa [PACKAGING] sdist now generates working .tar.gz file 2021-07-27 12:38:48 -07:00
jack-willturner
0920da6fae Merge https://github.com/ptillet/triton 2021-07-27 12:38:48 -07:00
Philippe Tillet
769c1180c5 [PACKAGING] Fixed import error 2021-07-27 12:38:48 -07:00
jack-willturner
a98a2db2c2 [DOCS] Matrix copy and transpose 2021-07-27 12:38:48 -07:00
Philippe Tillet
435acbf585 [PACKAGING] Added MANIFEST.in and some symlinks for better packaging 2021-07-27 12:38:48 -07:00
jack-willturner
32819dea51 [DOCS] Matmul and vecadd working examples 2021-07-27 12:38:48 -07:00
Philippe Tillet
ce4a4728f5 [PACKAGING] Fixed typo in setup.py 2021-07-27 12:38:48 -07:00
Philippe Tillet
3709f564e1 [PACKAGING] Added some more files for packaging 2021-07-27 12:38:48 -07:00
Philippe Tillet
c73dee080c [CODEGEN] Fixed bug for phi nodes with constant incoming value 2021-07-27 12:38:48 -07:00
Philippe Tillet
54805596f5 [CODEGEN][ANALYSIS] bugfix in alignment analysis 2021-07-27 12:38:48 -07:00
Philippe Tillet
f805ff278a [PYTHON][SRC][BINDING] Improved code portability across compilers 2021-07-27 12:38:48 -07:00
Philippe Tillet
c36ad6bf8a [PYTHON][EXAMPLES][EINSUM] Updated configs for matmul 2021-07-27 12:38:48 -07:00
Philippe Tillet
7924642b78 [PYTHON][EXAMPLES][EINSUM] Added stride in CONV2D example 2021-07-27 12:38:48 -07:00
Philippe Tillet
f22ad0064c [PYTHON][EXAMPLES][EINSUM] Added group-convolution test/benchmark 2021-07-27 12:38:48 -07:00
Philippe Tillet
5bb977173f [PYTHON][EINSUM] re-established auto-tuning 2021-07-27 12:38:48 -07:00
Philippe Tillet
ec2cb2155e [TESTS] Simplified testing of half-precision transposes 2021-07-27 12:38:48 -07:00
Philippe Tillet
4ae0e28b32 [PYTHON][KERNEL] Added thread-safety when caching custom torch op 2021-07-27 12:38:48 -07:00
Philippe Tillet
677ccfb44e [CORE][RUNTIME] Better error message on internal compilation error 2021-07-27 12:38:48 -07:00
Philippe Tillet
94e8ee7f01 [PYTHON][KERNEL] Better handling of case where cache directory already exists 2021-07-27 12:38:48 -07:00
Philippe Tillet
5943baa53f [GENERAL] Error messages now no longer make terminal color green 2021-07-27 12:38:48 -07:00
Philippe Tillet
3304629de9 [CORE] Fixed several issues that arose in the development of the
torch-blocksparse package:

* Now using warp shuffle in reductions when possible
* Various bugfixes in layout inference
* Added INFINITY, exponential and select
* Better error messages for unimplemented constructs
2021-07-27 12:38:48 -07:00