Commit Graph

  • 8b82c8ed1a [GH-PAGES] Updated website Philippe Tillet 2022-02-09 03:39:04 +00:00
  • 5c5dc17308 [GH-PAGES] Updated website Philippe Tillet 2022-02-09 03:23:53 +00:00
  • e2e2cafecc [GH-PAGES] Updated website Philippe Tillet 2022-02-09 03:17:47 +00:00
  • 1caa8f007a [GH-PAGES] Updated website Philippe Tillet 2022-02-09 01:09:10 +00:00
  • 4941bc7001 [DOCS] Some more fixes (#455) Philippe Tillet 2022-02-08 16:53:56 -08:00
  • 87b6bdfc6e Updated update-website.sh Philippe Tillet 2022-02-08 16:51:48 -08:00
  • 989c163b13 [GH-PAGES] Updated website Philippe Tillet 2022-02-08 23:45:21 +00:00
  • 95bb988ed0 [GH-PAGES] Updated website Philippe Tillet 2022-02-08 20:05:45 +00:00
  • 5b4729ede6 [GH-PAGES] Updated website Philippe Tillet 2022-02-08 20:02:05 +00:00
  • 2fdf0a4fe8 [DOCS] changed build command Philippe Tillet 2022-02-08 11:45:21 -08:00
  • 077d6c8ff0 [DOCS] re-activated tutorials Philippe Tillet 2022-02-08 11:42:39 -08:00
  • a47b2f5208 [GH-PAGES] Updated website Philippe Tillet 2022-02-08 19:40:59 +00:00
  • 822ddcd14b [DOCS] Added versioning (#453) Philippe Tillet 2022-02-08 11:28:18 -08:00
  • cf8c3ba438 [GH-PAGES] Updated website Philippe Tillet 2022-02-08 00:24:15 +00:00
  • 0f03cfcfd3 [GH-PAGES] Updated website Philippe Tillet 2022-02-07 03:25:11 +00:00
  • 7b48340ffd [CI] Some fixes for the build (#451) Philippe Tillet 2022-02-06 19:11:33 -08:00
  • 5a8a544d10 [OPS][BLOCKSPARSE] Improved robustness, clarity and performance (#450) Philippe Tillet 2022-02-06 18:00:45 -08:00
  • 69ff52ea1f [CODEGEN] removed buggy (and mostly useless) optimization in peephole pass (#449) Philippe Tillet 2022-02-05 21:37:23 -08:00
  • 137bb67fad [LANG] Add fp16 to fp8 conversion (#444) TC 2022-02-02 20:42:09 -08:00
  • 3b20170fa3 Merge pull request #448 from openai/v2.0 Philippe Tillet 2022-01-30 20:49:08 -08:00
  • b0d6e2f322 [STYLE] run autopep Philippe Tillet 2022-01-30 20:27:44 -08:00
  • 2922dc141c Merge branch 'master' into v2.0 Philippe Tillet 2022-01-30 20:25:01 -08:00
  • 807d8a1945 [ALL] Merge master (#447) Philippe Tillet 2022-01-30 20:21:20 -08:00
  • bef76b142a [BACKEND] float division is now approximate by default (#446) Philippe Tillet 2022-01-29 18:29:29 -08:00
  • bd52e530a0 [OPS][BLOCKSPARSE] Fix padding issue in DSD LUT (#445) Philippe Tillet 2022-01-28 21:40:30 -08:00
  • e68d6a7776 [BACKEND] Making the warp-level tile "more square" to increase data-reuse for tl.dot. (#442) daadaada 2022-01-28 01:59:54 +08:00
  • 59d371c6eb [BACKEND] Added Int8 mma (#440) daadaada 2022-01-28 01:12:44 +08:00
  • 3a23c1dd33 [BACKEND] minor, hotfix for gcc compilation (#439) Benjamin Lefaudeux 2022-01-23 17:24:02 -05:00
  • ccf9abe0ba [FRONTEND][RANDOM] Improved backward compatibility of RNG (#438) Philippe Tillet 2022-01-21 18:05:55 -08:00
  • 4c97d1ecd7 [FRONTEND] Bunch of fixes here and there (#436) Philippe Tillet 2022-01-20 10:55:59 -08:00
  • e0c5709cc8 [FRONTEND] Fixed semantics bug on ptr to bool conversions (#432) Philippe Tillet 2022-01-17 18:00:03 -08:00
  • 2a944ded53 [TESTS] Added bfloat16 tests (#430) daadaada 2022-01-14 15:38:32 +08:00
  • 4c94359199 [FRONTEND] Alignment fix-up (#428) Philippe Tillet 2022-01-11 23:11:58 -08:00
  • bbc78f6516 [FRONTEND][RANDOM] Make sure offset dtype is always uint32 before calling uint32_to_uniform_float (#427) Philippe Tillet 2022-01-11 11:08:49 -08:00
  • bf32205edc [OPS][BLOCKSPARSE] Remove unnecessary loop and add cuda bool layout support (#425) Botao Yu 2022-01-12 03:07:16 +08:00
  • 94a2e10fe5 [BACKEND] Add bf16 & tf32 mma supports (on A100) (#426) daadaada 2022-01-12 02:20:31 +08:00
  • efdabe6073 [STYLE] check python with flake8 (#424) Madeleine Thompson 2022-01-07 15:28:36 -08:00
  • a70acfec77 [STYLE] add isort and autopep8 config files and check on CI (#423) Madeleine Thompson 2022-01-07 13:11:34 -08:00
  • 9801aa7b56 [DOCS] fix tutorials for v2.0 (#422) Madeleine Thompson 2022-01-07 12:34:38 -08:00
  • 8bf551ae7a [STYLE] run autopep8 and isort (#421) Madeleine Thompson 2022-01-06 14:34:17 -08:00
  • 6f7acad48f [CODEGEN] Avoid use of deprecated AST nodes (#418) Shantanu 2022-01-06 12:04:33 -08:00
  • 120cda015e [FRONTEND] use unsigned integers to simplify RNG (#417) Madeleine Thompson 2022-01-06 10:49:09 -08:00
  • 001fb757fe [OPS][BLOCKSPARSE] Added .contiguous() in blocksparse inputs when necessary (#420) Philippe Tillet 2022-01-06 12:56:22 -05:00
  • 0ab9d67bad uint8, uint16, uint32, and uint64 in kernels (#413) Madeleine Thompson 2022-01-05 15:27:17 -08:00
  • d8db0308cb [TEST] use numpy for reference results in test_core.py (#409) Madeleine Thompson 2022-01-04 13:07:29 -08:00
  • 03f1256f60 [FRONTEND] Added volatile flag for load (#407) Philippe Tillet 2021-12-30 22:33:24 -08:00
  • 3edc2633e9 [TUTORIALS] Fix 01-vector-add.py typo (#406) Noah Ziems 2021-12-29 18:09:34 -05:00
  • 985798f101 add missing bfloat16 repr and improve assertions (#403) Madeleine Thompson 2021-12-23 17:01:17 -08:00
  • d8fce83e7a [FRONTEND] Remade exception picklable Philippe Tillet 2021-12-21 22:14:06 -08:00
  • a425f24d54 [FRONTEND] Better cache hook (#400) Philippe Tillet 2021-12-21 21:29:47 -08:00
  • 2509124dd0 [DRIVER] Fixed some issue with how ptxas is used (#399) Philippe Tillet 2021-12-21 14:31:51 -08:00
  • 39d4bfed83 [OPS] Add performance model for gemm/gemv (#397) daadaada 2021-12-22 01:56:10 +08:00
  • 5cdb948c05 [FRONTEND] signed-integer math fixes and testing (#395) Madeleine Thompson 2021-12-21 09:46:05 -08:00
  • 4a8953efa3 [FRONTEND] Replace the legacy print call in triton.cc with the SlotTracker-based one. (#396) daadaada 2021-12-19 10:03:22 +08:00
  • fa62b4a8f6 [FRONTEND] better stringification (#394) Madeleine Thompson 2021-12-17 20:11:45 -08:00
  • 4e93b41c52 [GENERAL] Some minor fixups (#393) Philippe Tillet 2021-12-17 18:06:21 -08:00
  • e062812969 [CODEGEN] Disabled peephole for masked load + select -- masked_load doesn't work as expected when vectorized Philippe Tillet 2021-12-17 12:44:47 -08:00
  • eb077fc993 [RUNTIME] fixed NVidia DLL names on Windows (#392) Victor 2021-12-16 22:09:52 -08:00
  • e0b92c1380 [FRONTEND] Reverted from .random import *. There are still some namespace errors in the Triton frontend apparently Philippe Tillet 2021-12-16 18:37:51 -08:00
  • 558555630f [FRONTEND] Added xor_sum Philippe Tillet 2021-12-16 17:55:35 -08:00
  • 94d5c2e8b5 [ROCM] enable matmul(dot) and others (#391) Michael Melesse 2021-12-13 12:28:15 -08:00
  • e575ae3443 [FRONTEND] Minor accumulated style and warning fixes (#388) Madeleine Thompson 2021-12-10 15:19:20 -08:00
  • 9def2424ab [RUNTIME] Fix typo in IfExp Philippe Tillet 2021-12-09 15:14:06 -08:00
  • e31b9b4e66 [RUNTIME] Better support for None (#387) Philippe Tillet 2021-12-09 13:21:22 -08:00
  • 73b04d71b2 Fixes for building on Windows (#382) Victor 2021-12-07 14:10:58 -08:00
  • 0ff1a26b70 fixed p2p tests failing when there are no supported p2p devices (#386) Victor 2021-12-06 18:14:03 -08:00
  • f23bf55f15 [RUNTIME] release the gil on launch (#383) Philippe Tillet 2021-12-03 13:01:01 -08:00
  • 8ec9f037bb [BACKEND/CODE_GEN] Fixed float32 matmul problem (#380) Philippe Tillet 2021-11-30 22:00:56 -08:00
  • c86ad9c9ab [FRONTEND] Added default arguments to non-kernel @triton.jit'd function (#379) Philippe Tillet 2021-11-29 19:11:26 -08:00
  • 1296eb877b [RUNTIME] Config hook v2.0 (#373) daadaada 2021-11-22 03:20:59 +08:00
  • 5693b582ea [RUNTIME] Now using pybind11 to avoid memory leaks (#377) Philippe Tillet 2021-11-21 02:30:22 -08:00
  • edd4b0c8b7 [CODEGEN] Fixed issue with jit function passed as constexpr Philippe Tillet 2021-11-16 09:53:34 -08:00
  • 5b7ba3eb96 [CODEGEN] Reverted to old launch method (memory leak?) Philippe Tillet 2021-11-16 01:21:03 -08:00
  • 791b953b21 [CODEGEN] Reverted to old way to query current stream Philippe Tillet 2021-11-16 00:17:27 -08:00
  • b908095872 [VERSION] Bumped triton.__version__ to 2.0.0 Philippe Tillet 2021-11-12 15:10:04 -08:00
  • 01cc3d4503 [RUNTIME] Restored do_not_specialize (#374) Philippe Tillet 2021-11-12 15:06:55 -08:00
  • e66bf76354 [RUNTIME] Bunch of bugfixes (#372) Philippe Tillet 2021-11-12 00:55:00 -08:00
  • f7ab96cfd7 [FRONTEND] Fixed some issues with constexpr Philippe Tillet 2021-11-05 09:26:33 -07:00
  • 9a02dddf29 Fix sdd_lut (#368) daadaada 2021-11-09 00:25:05 +08:00
  • 5d54352164 [FRONTEND] Significantly reduce kernel launch time (#367) Philippe Tillet 2021-11-04 13:25:24 -07:00
  • 2acaa4d0dd [LANG] Added support for constexpr (#361) Philippe Tillet 2021-10-30 00:32:58 -07:00
  • b7f0e87dc2 [DRIVER] Removed std::cout log message Philippe Tillet 2021-10-29 10:42:10 -07:00
  • 770ea96cca [PACKAGING] Bumped dev version to 2.0.0 Philippe Tillet 2021-10-29 01:28:17 -07:00
  • 969d6de8a2 [PACKAGING] Bumped dev version to 1.1.2 Philippe Tillet 2021-10-29 01:24:19 -07:00
  • 2d6df9b518 [PACKAGING] Bumped dev version to 1.1.2 (tag: v1.1.2) Philippe Tillet 2021-10-29 01:24:19 -07:00
  • 1b842f8e5e [CI] Now running integration tests on pull requests on branch v2.0 Philippe Tillet 2021-10-29 01:06:50 -07:00
  • d3e584d4ba Revert "[DRIVER] Fixed CUDA 10.1 bug (#357)" (#358) Philippe Tillet 2021-10-26 15:04:49 -07:00
  • d35014ba47 [DRIVER] Fixed CUDA 10.1 bug (#357) Philippe Tillet 2021-10-26 11:17:06 -07:00
  • 5ce1b726dc [CODEGEN] Various bugfixes that make it possible to fuse RNG in a matmul epilogue (#356) Philippe Tillet 2021-10-24 02:30:46 -07:00
  • 858dec8372 [CODEGEN] Add cache modifier to tl.load (#351) daadaada 2021-10-18 13:14:04 +08:00
  • 90ded16c32 [DOCS] Added placeholder docstring for layernorm tutorial Philippe Tillet 2021-10-15 19:04:01 -07:00
  • abbc554838 [VERSION] Bumped version to 1.1.1 (#350) (tag: v1.1.1) Philippe Tillet 2021-10-14 18:09:39 -07:00
  • 9b32075062 [CODEGEN] Some compiler improvements (#349) Philippe Tillet 2021-10-13 17:49:39 -07:00
  • c2e6b90ff1 [CODEGEN] Fixes masked load exception (#342) Stephen McGroarty 2021-10-13 21:31:52 +01:00
  • bfacc191b3 [FRONTEND] Now cache re-compiles when language changes (#348) Philippe Tillet 2021-10-13 12:29:57 -07:00
  • f5ad168686 [PYTHON] Fix up __version__ (#345) Shantanu 2021-10-13 00:09:00 -07:00
  • c3c0ff0552 [LANGUAGE] Fixed issue with duplicates in large arrays of random uniform numbers (#338) Philippe Tillet 2021-10-10 15:22:34 -07:00
  • 9e9d781912 [CODEGEN] Pipeline fixup (#336) daadaada 2021-10-10 16:47:11 +08:00
  • d5f20dbce0 [IR] Fix error when building in debug mode (#331) daadaada 2021-10-09 12:40:20 +08:00
  • d4baad426d [DOCS] Added layer norm example (#326) Philippe Tillet 2021-10-08 11:02:10 -07:00