Commit Graph

  • 406d03bfaf Improve ROCm support. (#780) Daniil Fukalov 2022-10-14 21:33:42 +03:00
  • 5898352f97 [Triton-IR] Fix LoadOp definition (#771) (#777) Shintaro Iwasaki 2022-10-13 18:53:00 -07:00
  • db3aa1d1fb [FRONTEND] Fix libdevice (#776) Keren Zhou 2022-10-13 17:18:16 -07:00
  • ddae106c0e [DOCS] Update installation.rst to fix windows build error (#747) Twizzes 2022-10-13 13:27:15 -07:00
  • 963d031247 [Triton-IR] Fix LoadOp Triton->TritonGPU conversion (#775) Da Yan 2022-10-14 03:57:39 +08:00
  • bc98aead33 [Backend] Fix for mov.u8 (#766) Keren Zhou 2022-10-12 14:32:27 -07:00
  • 71b46acc42 [IR] Added special-purpose dequantize instruction (#759) Yu Guo 2022-10-12 14:14:45 -07:00
  • 33e6f0df7f [DRIVER] Bumped CUDA requirement to 11.4+. This is to avoid bad performance surprises as older ptxas are much slower. (#769) Philippe Tillet 2022-10-12 12:02:30 -07:00
  • 1baa4e125f [triton-mlir][BACKEND] decouple loading from mma codegen in dot conversion (#764) Yan Chunwei 2022-10-12 10:45:17 +08:00
  • 623c99609f [Triton-IR] Added type inference and verifier for Triton-IR operations (#767) Philippe Tillet 2022-10-11 18:16:41 -07:00
  • af76c989eb [RUNTIME] Make entry point cache key depend on triton version hash (#765) Philippe Tillet 2022-10-11 13:24:30 -07:00
  • 09cc2d454b [FRONTEND] Fix a bool tensor storing problem (#746) Bin Bao 2022-10-10 15:11:50 -04:00
  • b6e5a231e5 [OPTIMIZER] Added swizzling pass (#758) Philippe Tillet 2022-10-10 01:12:37 -07:00
  • 555f94f9b9 [triton-mlir][BACKEND] Support masked load/store (#657) Yan Chunwei 2022-10-10 13:29:53 +08:00
  • 5d4b26d380 [RUNTIME] support multiple devices in the same process (#757) Felipe Petroski Such 2022-10-09 20:30:04 -07:00
  • 9a11a567ce [DOCS] Fixed typos in 01-vector-add.py (#751) Chris 2022-10-09 21:12:46 -04:00
  • ccc5ab6ac9 [BUILD] When set, use MLIR_DIR for finding both MLIR and LLVM (#755) Ian Bearman 2022-10-09 13:11:20 -07:00
  • 89f6e1db5e [BUILD] use cmake to set include path when build isn't triggered by setup.py (#754) Ian Bearman 2022-10-09 12:30:44 -07:00
  • 863578a7fa [BUILD] Enable current-dir inclusion (#753) Ian Bearman 2022-10-09 11:09:49 -07:00
  • 448d14a598 [BUILD] Add TRITON Prefix to build variables (#752) Ian Bearman 2022-10-09 10:55:17 -07:00
  • 1d772cd843 [Triton-MLIR][Backend] Add SCF lowering in the backend (#750) goostavz 2022-10-08 18:36:37 +08:00
  • 498c685b46 [OPTIMIZER] layout simplification: ignore non-tensor iter arguments in for loop rematerialization (#749) Philippe Tillet 2022-10-07 21:52:29 -07:00
  • 11345e9b74 [RUNTIME] Add callback functions for external tools (#738) Keren Zhou 2022-10-05 14:46:55 -07:00
  • bdfdb9a1d2 [RUNTIME] Fixed JIT bug that led some constexpr values to be overridden by specialization parameters (#742) Philippe Tillet 2022-10-05 11:00:32 -07:00
  • e843257295 [Backend] Fix a bug in emitIndicesForBlocked (#740) goostavz 2022-10-05 12:29:59 +08:00
  • 77c752dc78 [RUNTIME] remove fixed cu_include_dir (#739) shenggan 2022-10-05 10:49:57 +08:00
  • d3c925db8a [FRONTEND] properly broadcast scalar where condition (#736) Natalia Gimelshein 2022-10-04 12:44:03 -07:00
  • 289ff293cc [Triton-MLIR] Generate LLVM/PTX code for async ops (#735) Keren Zhou 2022-10-04 09:37:00 -07:00
  • 2b0f877fad [RUNTIME] Support environments with multiple cudalibs (#733) fdrocha 2022-10-03 19:36:24 +01:00
  • f9d7f2f126 [Triton-MLIR][Backend] Support ConvertLayout blocked->shared and a few fixes related with mma(#716) goostavz 2022-10-03 19:33:25 +08:00
  • 4a2d3b7d79 [RUNTIME] Dump llvm, ttir, and sass to help debugging (#732) Keren Zhou 2022-10-02 17:39:52 -07:00
  • f55960e773 [FRONTEND] fix broadcasting for where (#729) Natalia Gimelshein 2022-10-01 13:18:47 -07:00
  • b244db06da [TUTORIALS] Attention tutorial fixup Phil Tillet 2022-09-30 19:30:46 -07:00
  • 7b61303ea1 [CODEGEN] Fix extract_N_bufferable in layout analysis (#728) Shintaro Iwasaki 2022-09-30 12:21:22 -07:00
  • ae59f51c2d [CODEGEN] Fix an inliner to call a function with a phi-node (#727) Shintaro Iwasaki 2022-09-29 21:36:40 -07:00
  • f45e31ba7c [FRONTEND] Make sure to hold the gil when creating python objects (#726) albanD 2022-09-29 21:06:22 -04:00
  • dad97528b2 [TESTING] allclose fixup (#724) Philippe Tillet 2022-09-28 15:49:05 -07:00
  • baba98ad69 [Triton-MLIR] Fix threadsPerWarp derivation in BlockedEncodingAttr (#722) Keren Zhou 2022-09-27 16:41:30 -07:00
  • 9ddf0921fb [OPTIMIZER] Added DotOp to the list of expensive ops we don't want to rematerialize. (#718) Philippe Tillet 2022-09-27 09:05:49 -07:00
  • df8d276089 [Triton-MLIR][Backend] Fix smem base bug in dot codegen (#715) Yan Chunwei 2022-09-27 17:28:17 +08:00
  • 3a84278530 [Triton-MLIR][BACKEND] Refine dot conversion (#710) Yan Chunwei 2022-09-27 14:38:34 +08:00
  • 61b61755e5 [Triton-MLIR][Backend] Support layout conversion between mmaLayout and blockedLayout (#693) goostavz 2022-09-27 11:58:47 +08:00
  • 1e91ed30d0 [RUNTIME] Major code cleanup (#711) Philippe Tillet 2022-09-26 16:38:06 -07:00
  • 8bb09f83ee [CI] Added CODEOWNERS file (#709) Philippe Tillet 2022-09-24 16:32:44 -07:00
  • 998fd5f9af [FRONTEND] Make triton.compile work without a cuda context (#708) Jason Ansel 2022-09-24 13:41:47 -07:00
  • 22ec22c257 [FRONTEND] Backport new runtime from master (#706) Philippe Tillet 2022-09-23 16:09:43 -07:00
  • 3ac929b48b [BUILD] Download pybind11 in setup.py (#703) Shintaro Iwasaki 2022-09-23 15:54:07 -07:00
  • 579c03615d [FRONTEND] Reduce number of compiles in JITFunction (#704) Jason Ansel 2022-09-23 14:44:52 -07:00
  • ecd1bc33df [Triton-MLIR] Keren/code gen for extract slice and alloc tensor (#692) Keren Zhou 2022-09-23 12:38:14 -07:00
  • c56f0198dd Revert "[Triton-MLIR][pybind11] Update pybind11 to 2.10.0" (#702) Philippe Tillet 2022-09-23 12:31:33 -07:00
  • 25e1b36785 Revert "[pybind11] Use git-submodule for pybind11" (#701) Philippe Tillet 2022-09-23 12:25:38 -07:00
  • 61d104ab3a [FRONTEND] Use git-submodule for pybind11 (#699) Shintaro Iwasaki 2022-09-23 09:55:03 -07:00
  • 922155f1d2 [BACKEND] add dot conversion (mma version=2) (#672) Yan Chunwei 2022-09-23 11:43:54 +08:00
  • 23f424c660 [Triton-MLIR][pybind11] Update pybind11 to 2.10.0 (#694) Shintaro Iwasaki 2022-09-22 17:53:42 -07:00
  • 8c3d4d5749 [RUNTIME] now decoupling entry point from cubin (#696) Philippe Tillet 2022-09-22 16:44:22 -07:00
  • 940ef3f0ac [BACKEND] llvm::dyn_cast -> llvm::dyn_cast_or_null (#689) Shintaro Iwasaki 2022-09-21 20:26:40 -07:00
  • df67068bb0 [pybind11] Update pybind11 to 2.10.0 (#691) Shintaro Iwasaki 2022-09-21 20:18:02 -07:00
  • 677ddae618 [FRONTEND] Add warmup for triton.jit() (#684) Philippe Tillet 2022-09-21 12:13:20 -07:00
  • 6abe813d1c Fix issue breaking cudagraphs (#685) Jason Ansel 2022-09-21 10:20:48 -07:00
  • e318185eb4 [DOCS] Improved README.md wording (#683) Philippe Tillet 2022-09-20 18:09:43 -07:00
  • 7dc2a70edb Revert "Add .warmup() for triton.jit()" (#682) Philippe Tillet 2022-09-20 16:05:14 -07:00
  • 48f30550f1 [FRONTEND] Now using raw compiler syscalls when possible (#678) Philippe Tillet 2022-09-19 21:01:36 -07:00
  • 93b1adc53b [FRONTEND] Add .warmup() for triton.jit() (#671) Jason Ansel 2022-09-18 23:09:34 -07:00
  • 82956e5d6b [PACKAGING] Added missing package Phil Tillet 2022-09-18 17:34:05 -07:00
  • 2baf333d44 [DOCS] Fixed typos (#670) Philippe Tillet 2022-09-18 17:13:12 -07:00
  • 49f6bc3f2b [FRONTEND] Fix filename too long error in new runtime (#669) Jason Ansel 2022-09-18 14:26:29 -07:00
  • 00f4ef6958 [CI] wheel/docs workflows now only run on V100 machine Phil Tillet 2022-09-18 13:26:42 -07:00
  • e647402fd3 Fix warning in generated C code (#667) Jason Ansel 2022-09-18 12:57:32 -07:00
  • 4a77dfb042 [FRONTEND] Complete rewrite of the runtime (#644) Philippe Tillet 2022-09-18 08:51:48 -07:00
  • 15bfd0cb79 [BACKEND] Support of ConvertLayoutOp from blocked to blocked and SliceLayout with blocked parent (#658) goostavz 2022-09-18 05:58:42 +08:00
  • 889d9e34a1 [REPO] update gitignore (#666) Ian Bearman 2022-09-17 14:25:28 -07:00
  • 13669b46a6 [DOCS] Correct spelling (#665) Shintaro Iwasaki 2022-09-16 15:07:34 -07:00
  • c668d6596e [DOCS] Fix spelling (#664) Shintaro Iwasaki 2022-09-16 12:26:40 -07:00
  • e9e1a4e682 [FRONTEND] Fix the implicit broadcasting rule (#663) Shintaro Iwasaki 2022-09-16 10:49:15 -07:00
  • 80e3fb5270 [CI] Now using clang-format from pip (#662) Philippe Tillet 2022-09-15 16:24:37 -07:00
  • 43be75ad42 [FRONTEND] Add scalar type support for some ops (#661) Shintaro Iwasaki 2022-09-15 16:12:52 -07:00
  • 2e08450c80 [OPTIMIZER] Better pipeline tests (#660) Da Yan 2022-09-15 14:26:40 +08:00
  • 297d27e1c8 [Triton-MLIR] add GitHub CI runners (#655) Shintaro Iwasaki 2022-09-14 23:09:56 -07:00
  • 4580a04710 [FRONTEND] Improve error message for CPU tensors (#654) Sophia Wisdom 2022-09-14 14:26:42 -07:00
  • cfbbc7b43a [CI] Added V100 tag to disambiguate self-hosted runners (#653) Philippe Tillet 2022-09-14 13:47:50 -07:00
  • c14dff2190 [CI] Added A10 tag to disambiguate self-hosted runners (#652) Philippe Tillet 2022-09-14 13:08:01 -07:00
  • 59a8e25f43 [DOCS] Fix typo (#650) Yunxing Dai 2022-09-14 12:17:05 -07:00
  • affd3325b2 [GH-PAGES] Updated website gh-pages Philippe Tillet 2022-09-14 00:53:02 +00:00
  • 9fd9c56321 [GH-PAGES] Updated website Philippe Tillet 2022-09-13 00:54:01 +00:00
  • a81d78b680 [GH-PAGES] Updated website Philippe Tillet 2022-09-12 00:51:39 +00:00
  • f79b7c6f03 [GH-PAGES] Updated website Philippe Tillet 2022-09-11 00:50:20 +00:00
  • 4588c0bc46 [GH-PAGES] Updated website Philippe Tillet 2022-09-10 00:52:29 +00:00
  • 16aed94ff5 [Analysis/Allocation] Allocation passes now assumes that slices always alias (#108) Keren Zhou 2022-09-09 12:03:41 -07:00
  • 9bd5a3dcd2 [OPTIMIZER] Pipeline async buffer (#110) Philippe Tillet 2022-09-09 11:01:14 -07:00
  • 2a852044d9 [BACKEND] Add C++ tests for PTXFormat and some tiny refinement (#109) Yan Chunwei 2022-09-10 00:15:07 +08:00
  • 8f733c4476 [GH-PAGES] Updated website Philippe Tillet 2022-09-09 00:53:43 +00:00
  • 762f8c9f51 [GH-PAGES] Updated website Philippe Tillet 2022-09-08 00:52:31 +00:00
  • 8e1a3b0434 [GH-PAGES] Updated website Philippe Tillet 2022-09-07 00:51:18 +00:00
  • a9464f4993 [Backend] Vectorize Load/Store Ops (#86) Yan Chunwei 2022-09-07 03:28:09 +08:00
  • 35e346bcff [OPTIMIZER] Better pipeline pass (#100) Da Yan 2022-09-06 23:31:13 +08:00
  • a0bab9748e [OPTIMIZER] Coalesce pass no longer takes a num-warps argument (#99) Philippe Tillet 2022-09-05 18:09:02 -07:00
  • c46759fc89 [GH-PAGES] Updated website Philippe Tillet 2022-09-06 00:50:44 +00:00
  • af0e35297e [GH-PAGES] Updated website Philippe Tillet 2022-09-05 00:53:28 +00:00
  • ea175f689e [CI]Added initial framework of CXX unittest (#98) Jun Yang 2022-09-04 12:50:27 +08:00
  • ef6b89f4f1 [GH-PAGES] Updated website Philippe Tillet 2022-09-04 00:51:14 +00:00