Philippe Tillet
|
c4fceeea49
|
[LANG] Added hacky min/max
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
d70f54fd6a
|
Merge pull request #45 from daadaada/master
[LANG] Add support for PREFIX_INC, PREFIX_DEC, POSTFIX_INC and POSTFIX_DEC
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
547a99a5d4
|
[VERSION] 0.2.3 -> 0.3.0
|
2021-07-27 12:38:48 -07:00 |
|
Yan Da
|
27dc780871
|
[IR] Check constant_int type
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
fd5c72d6a0
|
[LANG] Added some more atomic_add support
|
2021-07-27 12:38:48 -07:00 |
|
Yan Da
|
01ef691b84
|
[LANG] Fix gep bug in INC
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
5e8f4c934c
|
[DRIVER] Better exception handling of invalid ptx
|
2021-07-27 12:38:48 -07:00 |
|
Yan Da
|
e9b2335224
|
[LANG] Add support for POSTFIX_INC and POSTFIX_DEC, and pointer type
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
44ca2c0cb8
|
[DRIVER] Removed deprecated files and functions
|
2021-07-27 12:38:48 -07:00 |
|
Yan Da
|
05b95b7fa6
|
[LANG] Add support for PREFIX_INC and PREFIX_DEC.
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
7ab2c2a356
|
[DRIVER] Removed obsolete SetArg
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8ab62803db
|
[PYTHON] Context switching logic moved to PyTorch
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
4f08d87fed
|
[DRIVER] Simplified Driver API by substantially removing reliance on driver::context
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f42b04d925
|
[DRIVER] Added (slow) support for CUDA11 and Ampere
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
baa858aa74
|
[CODEGEN] Fixed bug in atomic_add
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
7d095ec686
|
[LANG] Added sqrtf support
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
073fddffc1
|
[PYTHON] Compiling Triton in Release mode now...
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
5d84fde733
|
tmp
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
da287bb710
|
[CODEGEN] Progress on atom.add.f16x2
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
a77c925dfd
|
[DRIVER] Improved performance of Host driver code
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8f8d36c7a4
|
[GENERAL] Various bugfixes
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
50587bbf4b
|
[General] LLVM-9 -> LLVM-10
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8f3ee53f24
|
[PYTHON] Added option to show PTX source code in Python
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
cf80ccc798
|
[PYTHON] Fixed torch ABI issue
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
06abc8cb40
|
[GENERAL] Fix compatibility issue with older Torch versions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f152150e7d
|
[LANG] Added log intrinsic
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
02a6e81b88
|
[PYTHON] Cleaning C++ bindings
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
34f1d5e565
|
[CODEGEN] Fixed bug in 2D reductions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
049ab989b5
|
[GENERAL] Various improvements:
* Sparse einsum in triton.ops.einsum
* Hacky support for fixed-tile-size atomic-add
* Various bugfixes in parser
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
444907589d
|
[GENERAL] Fixed MacOS compilation issues
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
664d3cae89
|
[DRIVER] Removed OpenCL support
There is no plan to support OpenCL anytime soon (Vulkan would be preferred). Removing the corresponding portion of the driver code.
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
840308ab5d
|
[CODEGEN] More work on the CPU backend
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
64eaec016f
|
[Version] Now version 0.2.3
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
a7d285a480
|
Merge pull request #41 from jeffra/fix-conda-build
fix llvm build inside conda environment
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
db4e4b9dbf
|
[VERSION] Now version 0.2.2
|
2021-07-27 12:38:48 -07:00 |
|
Jeff Rasley
|
7fdf2e378c
|
fix llvm build inside conda environment (see link for similar issue)
https://github.com/tensorflow/tensorflow/issues/12998
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
7af9d812cf
|
[PYTHON] Added credits to Scott Gray for the idea used in launch.cc
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
150ba0c70b
|
[TESTS] Updated the test to be compatible with the new runtime API
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
acff1b5e05
|
[RUNTIME] Lower-level interface for executing functions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
f4f216b88a
|
[EXAMPLES] Added C++ example for Conv2d
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
ba9955ae39
|
[CODEGEN][ANALYSIS] Fixed issue in layout inference
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
89e456107b
|
[EXAMPLES] Improved mat_mul example
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
68c18238a9
|
[EXAMPLES] Added conv2d example
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
46297a949f
|
[PACKAGING] Now version 0.2.1
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
29a0ad6c4d
|
[DRIVER] Now always using PTXv6.4
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
c251dc50f3
|
[PACKAGING] Now version 0.2.0
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
4ccd78f1a6
|
[EXAMPLES][TUTORIAL] Changed to new triton.kernel API
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
c33d6d15f5
|
[TRITON][PYTHON] Reverted back to distutils
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
955b027103
|
[TRITON][KERNEL] Fixed issue for concurrent compilation of torch extensions
|
2021-07-27 12:38:48 -07:00 |
|
Philippe Tillet
|
8bdfbe2514
|
[ANALYSIS] Replaced min by gcd in layout inference
|
2021-07-27 12:38:48 -07:00 |
|