Philippe Tillet
|
c0bc7ed8b0
|
[PYTHON] Added TRITON_DEBUG_MODE which reallocates input tensors outside of the pytorch memory pool to spot out-of-bounds accesses more easily
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
547a99a5d4
|
[VERSION] 0.2.3 -> 0.3.0
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8ab62803db
|
[PYTHON] Context switching logic moved to PyTorch
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
4f08d87fed
|
[DRIVER] Simplified Driver API by substantially removing reliance on driver::context
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
073fddffc1
|
[PYTHON] Compiling Triton in Release mode now...
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
a77c925dfd
|
[DRIVER] Improved performance of Host driver code
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8f8d36c7a4
|
[GENERAL] Various bugfixes
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
50587bbf4b
|
[General] LLVM-9 -> LLVM-10
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8f3ee53f24
|
[PYTHON] Added option to show PTX source code in Python
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
cf80ccc798
|
[PYTHON] Fixed torch ABI issue
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
06abc8cb40
|
[GENERAL] Fix compatibility issue with older Torch versions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f152150e7d
|
[LANG] Added log intrinsic
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
02a6e81b88
|
[PYTHON] Cleaning C++ bindings
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
049ab989b5
|
[GENERAL] Various improvements:
* Sparse einsum in triton.ops.einsum
* Hacky support for fixed-tile-size atomic-add
* Various bugfixes in parser
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
840308ab5d
|
[CODEGEN] More work on the CPU backend
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
64eaec016f
|
[Version] Now version 0.2.3
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
db4e4b9dbf
|
[VERSION] Now version 0.2.2
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
7af9d812cf
|
[PYTHON] Added credits to Scott Gray for the idea used in launch.cc
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
acff1b5e05
|
[RUNTIME] Lower-level interface for executing functions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
ba9955ae39
|
[CODEGEN][ANALYSIS] Fixed issue in layout inference
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
89e456107b
|
[EXAMPLES] Improved mat_mul example
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
68c18238a9
|
[EXAMPLES] Added conv2d example
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
46297a949f
|
[PACKAGING] Now version 0.2.1
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
c251dc50f3
|
[PACKAGING] Now version 0.2.0
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
4ccd78f1a6
|
[EXAMPLES][TUTORIAL] Changed to new triton.kernel API
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
c33d6d15f5
|
[TRITON][PYTHON] Reverted back to distutils
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
955b027103
|
[TRITON][KERNEL] Fixed issue for concurrent compilation of torch extensions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
d85141182d
|
[PACKAGING] Now version 0.1.3
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
5995cbff8e
|
[CORE] Auto-tuning now copies scalar buffers. Still needs to copy all buffers that are both read from and written to.
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
78cd54b0c8
|
[PYTHON] Added support for FP16 scalar kernel arguments
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
694bfbddf9
|
[PACKAGING] Now version 0.1.2
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
13ff6472e0
|
[LANG] Fixed undefined behavior in replace_all_uses_with()
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f35b9100e2
|
[PYTHON] Restored compatibility with powerpc
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
1426b103e9
|
[PYTHON] Removed -std=gnu++11 in extra_cflags
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
04a9ea060b
|
[GENERAL] Added compatibility with pytorch 1.2.0 and powerpc
|
2021-07-27 12:38:48 -07:00 |
|
jack-willturner
|
180ed26b61
|
[DOCS] Transposition fix
|
2021-07-27 12:38:48 -07:00 |
|
jack-willturner
|
0920da6fae
|
Merge https://github.com/ptillet/triton
|
2021-07-27 12:38:48 -07:00 |
|
jack-willturner
|
a98a2db2c2
|
[DOCS] Matrix copy and transpose
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
609ef3a24d
|
[CORE] Fixed bug for Multi-GPU
|
2021-07-27 12:38:48 -07:00 |
|
jack-willturner
|
32819dea51
|
[DOCS] Matmul and vecadd working examples
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
24586e60aa
|
[PACKAGING] sdist now generates working .tar.gz file
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
ce4a4728f5
|
[PACKAGING] Fixed typo in setup.py
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
769c1180c5
|
[PACKAGING] Fixed import error
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
3709f564e1
|
[PACKAGING] Added some more files for packaging
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
435acbf585
|
[PACKAGING] Added MANIFEST.in and some symlinks for better packaging
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f805ff278a
|
[PYTHON][SRC][BINDING] Improved code portability across compilers
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
c36ad6bf8a
|
[PYTHON][EXAMPLES][EINSUM] Updated configs for matmul
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
7924642b78
|
[PYTHON][EXAMPLES][EINSUM] Added stride in CONV2D example
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f22ad0064c
|
[PYTHON][EXAMPLES][EINSUM] Added group-convolution test/benchmark
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
5bb977173f
|
[PYTHON][EINSUM] re-established auto-tuning
|
2021-07-27 12:38:48 -07:00 |
|