Commit Graph

7 Commits

Author SHA1 Message Date
Philippe Tillet
9c05ec148f [BUILD] Added automatic nightly build releases to pip in CI; removed build-time dependence on LLVM and PyTorch (#77)
Recently there has been more and more report about installation issues:

    - Installing Triton before upgrading pytorch can create some issues because Triton uses some torch headers

    - llvm-10-dev not available on some platform; llvm-11-dev not available on e.g. Ubuntu.
    absence of nightly builds

This PR should fix all these issues. Some CMake tricks are used to download and install llvm at build time. Triton Python bindings were modified to remove dependence on pytorch ops. Midnight CI job added to generate binary wheels for all Triton version and update them on pypi's new triton-nightly project.

This PR will also make it very easy to use LLVM forks in the future for whatever needs we have.
2021-03-22 20:03:37 -04:00
Philippe Tillet
15f8e8c3b7 [CODEGEN] Major performance improvements on A100 (#70)
Improved handling of asynchronous copy, scheduling and synchronization for A100. Now achieving CUTLASS-like performance on large square dense matrix multiplication tasks
2021-02-21 18:19:39 -05:00
Philippe Tillet
c847cc6320 [DRIVER] Added options for developers to cache PTX file so that ti can
be manually modified
2021-02-09 00:09:10 -05:00
Philippe Tillet
1726197bb4 Improvements w/ Auto-Tuning and standard benchmarks (#57)
[PYTHON] Bug-fixes in the auto-tuning module and improvement of the existing API for it
2021-02-03 16:37:21 -05:00
Philippe Tillet
6e77538087 [RUNTIME] Auto-tuning now works as expected when the values of
autotune_key change
2021-01-31 19:23:51 -05:00
Philippe Tillet
0b23f95b20 [RUNTIME] Added option to print LLVM-IR
Also includes appropriate driver code change for that
2021-01-31 01:01:32 -05:00
Philippe Tillet
79d098450f [PYTHON][TESTS][DOC] Various improvement of the API and code quality:
* Simplified `triton.kernel` API to achieve lower latency:
  > .data_ptr() must now be passed as kernel argument. No more implicit
conversion from torch.tensor
  > compilation options are now constant attributes, i.e., opt.d('VAR')
becomes opt.VAR
  > torch.device must now be passed explicitly to triton.kernel (no
longer inferred from torch.tensor arguments)
* C++ tests moved to `python/tests/`
* C++ tutorial created in `tutorials/`
* Python tutorial created in python/tutorials/
* Version changed to 1.0alpha
* No longer copying C++ headers into the Python package
* added python/triton/ops/ package for pre-written Triton ops
2021-01-29 17:27:16 -05:00