Philippe Tillet
29a0ad6c4d
[DRIVER] Now always using PTXv6.4
2021-07-27 12:38:48 -07:00
Philippe Tillet
c251dc50f3
[PACKAGING] Now version 0.2.0
2021-07-27 12:38:48 -07:00
Philippe Tillet
4ccd78f1a6
[EXAMPLES][TUTORIAL] Changed to new triton.kernel API
2021-07-27 12:38:48 -07:00
Philippe Tillet
c33d6d15f5
[TRITON][PYTHON] Reverted back to distutils
2021-07-27 12:38:48 -07:00
Philippe Tillet
955b027103
[TRITON][KERNEL] Fixed issue for concurrent compilation of torch extensions
2021-07-27 12:38:48 -07:00
Philippe Tillet
8bdfbe2514
[ANALYSIS] Replaced min by gcd in layout inference
2021-07-27 12:38:48 -07:00
Philippe Tillet
e18f169a39
[CODEGEN] Fixed various issues in alignment inference pass
2021-07-27 12:38:48 -07:00
Philippe Tillet
da6008128e
[CODEGEN] Fixed bug in alignment inference that prevented vectorization in some cases
2021-07-27 12:38:48 -07:00
Philippe Tillet
ca5b7c5df4
[README] Changed requirement to LLVM-9
2021-07-27 12:38:48 -07:00
Philippe Tillet
d85141182d
[PACKAGING] Now version 0.1.3
2021-07-27 12:38:48 -07:00
Philippe Tillet
4bb0311f60
[TRITON] Fixed misaligned address issue
2021-07-27 12:38:48 -07:00
Philippe Tillet
a8f1b85c5f
[CODEGEN] Removed unnecessary coalescing rematerialization
2021-07-27 12:38:48 -07:00
Philippe Tillet
5995cbff8e
[CORE] Auto-tuning now copies scalar buffers. Still needs to copy all buffers that are both read from and written to.
2021-07-27 12:38:48 -07:00
Philippe Tillet
78cd54b0c8
[PYTHON] Added support for FP16 scalar kernel arguments
2021-07-27 12:38:48 -07:00
Philippe Tillet
e7461a862b
[CODEGEN] Bugfix in Disassociate pass; Added fp32 atomic_add support
2021-07-27 12:38:48 -07:00
Philippe Tillet
bb2d98ce4b
[LANG] Added support for flattening
2021-07-27 12:38:48 -07:00
Philippe Tillet
694bfbddf9
[PACKAGING] Now version 0.1.2
2021-07-27 12:38:48 -07:00
Philippe Tillet
13ff6472e0
[LANG] Fixed undefined behavior in replace_all_uses_with()
2021-07-27 12:38:48 -07:00
Philippe Tillet
ddd89e1b22
[GENERAL] Fixed some undefined behavior with GCC-9
2021-07-27 12:38:48 -07:00
Philippe Tillet
0516ea96d0
[CODEGEN] Fixed bug that caused missing recoalescing for some transpose operations
2021-07-27 12:38:48 -07:00
Philippe Tillet
0c5bd7563a
[README] Improved wording
2021-07-27 12:38:48 -07:00
Philippe Tillet
f35b9100e2
[PYTHON] Restored compatibility with powerpc
2021-07-27 12:38:48 -07:00
Philippe Tillet
1426b103e9
[PYTHON] Removed -std=gnu++11 in extra_cflags
2021-07-27 12:38:48 -07:00
Philippe Tillet
04a9ea060b
[GENERAL] Added compatibility with pytorch 1.2.0 and powerpc
2021-07-27 12:38:48 -07:00
Philippe Tillet
9984ee8c7a
[DOCS] Added pip command in README.md
2021-07-27 12:38:48 -07:00
Philippe Tillet
32d615f8f8
[DOCS] Now specifying pip command in installation.rst
2021-07-27 12:38:48 -07:00
Phillippe Tillet
ab75fbccc0
Merge pull request #38 from jack-willturner/master
Add working examples to tutorials and python examples folder
2021-07-27 12:38:48 -07:00
Philippe Tillet
609ef3a24d
[CORE] Fixed bug for Multi-GPU
2021-07-27 12:38:48 -07:00
jack-willturner
180ed26b61
[DOCS] Transposition fix
2021-07-27 12:38:48 -07:00
Philippe Tillet
24586e60aa
[PACKAGING] sdist now generates working .tar.gz file
2021-07-27 12:38:48 -07:00
jack-willturner
0920da6fae
Merge https://github.com/ptillet/triton
2021-07-27 12:38:48 -07:00
Philippe Tillet
769c1180c5
[PACKAGING] Fixed import error
2021-07-27 12:38:48 -07:00
jack-willturner
a98a2db2c2
[DOCS] Matrix copy and transpose
2021-07-27 12:38:48 -07:00
Philippe Tillet
435acbf585
[PACKAGING] Added MANIFEST.in and some symlinks for better packaging
2021-07-27 12:38:48 -07:00
jack-willturner
32819dea51
[DOCS] Matmul and vecadd working examples
2021-07-27 12:38:48 -07:00
Philippe Tillet
ce4a4728f5
[PACKAGING] Fixed typo in setup.py
2021-07-27 12:38:48 -07:00
Philippe Tillet
3709f564e1
[PACKAGING] Added some more files for packaging
2021-07-27 12:38:48 -07:00
Philippe Tillet
c73dee080c
[CODEGEN] Fixed bug for phi nodes with constant incoming value
2021-07-27 12:38:48 -07:00
Philippe Tillet
54805596f5
[CODEGEN][ANALYSIS] bugfix in alignment analysis
2021-07-27 12:38:48 -07:00
Philippe Tillet
f805ff278a
[PYTHON][SRC][BINDING] Improved code portability across compilers
2021-07-27 12:38:48 -07:00
Philippe Tillet
c36ad6bf8a
[PYTHON][EXAMPLES][EINSUM] Updated configs for matmul
2021-07-27 12:38:48 -07:00
Philippe Tillet
7924642b78
[PYTHON][EXAMPLES][EINSUM] Added stride in CONV2D example
2021-07-27 12:38:48 -07:00
Philippe Tillet
f22ad0064c
[PYTHON][EXAMPLES][EINSUM] Added group-convolution test/benchmark
2021-07-27 12:38:48 -07:00
Philippe Tillet
5bb977173f
[PYTHON][EINSUM] re-established auto-tuning
2021-07-27 12:38:48 -07:00
Philippe Tillet
ec2cb2155e
[TESTS] Simplified testing of half-precision transposes
2021-07-27 12:38:48 -07:00
Philippe Tillet
4ae0e28b32
[PYTHON][KERNEL] Added thread-safety when caching custom torch op
2021-07-27 12:38:48 -07:00
Philippe Tillet
677ccfb44e
[CORE][RUNTIME] Better error message on internal compilation error
2021-07-27 12:38:48 -07:00
Philippe Tillet
94e8ee7f01
[PYTHON][KERNEL] Better handling of case where cache directory already exists
2021-07-27 12:38:48 -07:00
Philippe Tillet
5943baa53f
[GENERAL] Error messages now no longer make terminal color green
2021-07-27 12:38:48 -07:00
Philippe Tillet
3304629de9
[CORE] Fixed several issues that arose in the development of the torch-blocksparse package:
* Now using warp shuffle in reductions when possible
* Various bugfixes in layout inference
* Added INFINITY, exponential and select
* Better error messages for unimplemented constructs
2021-07-27 12:38:48 -07:00