Commit Graph
-
f20f48a255
Move
Jokeren
2022-12-06 13:29:29 -08:00 -
3eff110fbc
Restore
Jokeren
2022-12-06 13:28:43 -08:00 -
5f85b79718
Merge branch 'triton-mlir' into keren/insert-slice-other-nonzero
Jokeren
2022-12-06 13:25:20 -08:00 -
bab7338965
Fix
Jokeren
2022-12-06 13:24:50 -08:00 -
74f3d7a80f
Fix
Jokeren
2022-12-06 12:53:25 -08:00 -
115cd3ac47
[FRONTEND] Added reshape as an alias for view (for now) (#956)
Philippe Tillet
2022-12-06 09:57:05 -08:00 -
532e10cf87
[FRONTEND][BACKEND] Clean-up transpositions (#953)
Philippe Tillet
2022-12-06 09:32:13 -08:00 -
16e973edf2
[BACKEND] Fix dependency analysis in pipeline (#946)
Keren Zhou
2022-12-06 09:08:55 -08:00 -
b539e031e8
Add test
Jokeren
2022-12-05 23:38:54 -08:00 -
46fa29496c
Init
Jokeren
2022-12-05 23:18:13 -08:00 -
9490252261
[FRONTEND] Support alternative install locations of system libdevice.10.bc (#951)
Crutcher Dunnavant
2022-12-05 19:41:44 -08:00 -
e419781978
[Triton-MLIR][BACKEND] Make mmav1 works on basic cases (#944)
Yan Chunwei
2022-12-06 10:57:08 +08:00 -
189491727a
[FRONTEND] Extract and unify @builtin/@extern (#913)
Crutcher Dunnavant
2022-12-05 14:59:41 -08:00 -
e0072d210a
[FRONTEND] Propagate mypy types through @jit, @builtin, etc (#915)
Crutcher Dunnavant
2022-12-05 14:41:02 -08:00 -
2fa17588f7
[FRONTEND] Expand __init__ * imports, add __all__ (#912)
Crutcher Dunnavant
2022-12-05 14:22:55 -08:00 -
e057c65cf0
[BACKEND] Porting the legacy heuristic rule in assigning shared layout for A/B of MMAv1 (#948)
goostavz
2022-12-06 03:30:23 +08:00 -
99c7e0e008
[BUILD] Change default build type (#945)
Philippe Tillet
2022-12-03 17:47:33 -08:00 -
f2fcaeabf3
[BACKEND] Support dot op when the output is mma encoding and allowtf32 is true (#937)
Keren Zhou
2022-12-03 11:14:12 -08:00 -
8edfe813a5
[FRONTEND][BACKEND] Added trans instruction; made flash attention bwd pass work (#943)
Philippe Tillet
2022-12-03 09:58:24 -08:00 -
4d64589b22
[Triton-MLIR][Backend] Fix the definition of MmaEncodingAttr v1, and the output sequence of DotConversion in MMAv1 (#941)
goostavz
2022-12-03 21:12:48 +08:00 -
8650b4d1cb
[DRIVER] Fix typos (#939)
legacy-backend
Yang Hau
2022-12-03 03:13:46 +08:00 -
521ff9ad74
[TRITON-MLIR][FRONTEND]fix scf.if to run through layernorm tutorial (#938)
donproc
2022-12-02 17:45:29 +08:00 -
c280ebda1b
[Triton-MLIR][BACKEND] Fix the membar pass to add missing barriers caused by scf.for (#933)
Keren Zhou
2022-12-01 11:54:18 -08:00 -
9def1bcebf
[TRITON-MLIR][FRONTEND]minor fix to run through atomic_cas test (#925)
donproc
2022-12-01 21:43:26 +08:00 -
7d90a07d0b
[Triton-MLIR][BACKEND] Refactor decompose insert_slice_async (#929)
Keren Zhou
2022-11-30 10:07:34 -08:00 -
6461254fb5
[BACKEND] Make flash attention forward pass work (#928)
Philippe Tillet
2022-11-30 11:13:24 +01:00 -
4e6a8209ed
[Triton-MLIR] Two fixes on allocation and backend related with MMA v1 (#930)
goostavz
2022-11-30 17:27:26 +08:00 -
9bb54402b3
[FRONTEND][BACKEND] Small fixes to multiple_of, num_programs, axisinfo; enable block-sparse tests (#927)
Philippe Tillet
2022-11-29 20:00:34 +01:00 -
66c36c4378
[BACKEND] Fixed bounds-wrapping issues (#926)
Philippe Tillet
2022-11-29 17:56:45 +01:00 -
661be523c0
[Triton-MLIR][BACKEND] Minor fixes of shared memory in ReduceOpConversion (#924)
Qingyi Liu
2022-11-29 11:50:31 +08:00 -
c87fbf886e
[Triton-MLIR][BACKEND] Remove static and unnamed namespace in Utility.h (#923)
Yan Chunwei
2022-11-29 09:06:06 +08:00 -
dfc8e7fb95
Fix
keren/perf-debug
Jokeren
2022-11-28 13:47:13 -08:00 -
2f9aef1132
Fix
Jokeren
2022-11-28 13:00:26 -08:00 -
f605d95b82
unroll_2
Jokeren
2022-11-28 12:59:05 -08:00 -
b378118647
c64
Jokeren
2022-11-28 12:19:52 -08:00 -
cfcf042e55
Init
Jokeren
2022-11-28 11:55:41 -08:00 -
0c1d4d764e
[Triton-MLIR][Backend] support MMA v1 in ConvertLayout (#922)
goostavz
2022-11-28 16:10:30 +08:00 -
9d31998a9d
[Triton-MLIR][BACKEND] Add argmin / argmax implementation for ReduceOp (#918)
Qingyi Liu
2022-11-28 14:59:27 +08:00 -
04ec5deb41
[Triton-MLIR][BACKEND] decouple the dot code (#921)
Yan Chunwei
2022-11-28 13:30:27 +08:00 -
630dc315ee
[Triton-MLIR] uncomment the UT in test_gemm that has already been fixed (#920)
goostavz
2022-11-28 11:23:20 +08:00 -
35c9ec1103
[Triton-MLIR][Backend] Fix number of warps and threads per warp when matrices are small (#917)
Keren Zhou
2022-11-26 12:30:38 -08:00 -
ee098d0341
Merge branch 'master' into keren/improve-hook
Jokeren
2022-11-25 15:04:59 -08:00 -
f63be0e9b5
[TRITON-MLIR][BACKEND]support atomic_cas (#914)
donproc
2022-11-25 12:02:08 +08:00 -
feef58ee8a
Pass fn to CompiliedKernel
Jokeren
2022-11-24 14:22:35 -08:00 -
153aecb339
[Triton-MLIR][BACKEND] insert_slice_async on GPUs < sm80 (#908)
Keren Zhou
2022-11-24 14:05:54 -08:00 -
f98aed1258
[Triton-MLIR][RUNTIME] Add /usr/bin/ptxas as a search path (#909)
Crutcher Dunnavant
2022-11-24 10:49:16 -08:00 -
ace7d28736
[Triton-MLIR][RUNTIME] Fix ir metadata lookup bug (#910)
Crutcher Dunnavant
2022-11-24 00:27:23 -08:00 -
b688f7b7b8
[Triton-MLIR] add_volta_warpsPerTile (#907)
ben-zhang-609
2022-11-24 09:44:29 +08:00 -
8925c2cd11
[TRITON-MLIR][BACKEND]AtomicRMWOp supports scalar (#903)
donproc
2022-11-23 15:59:09 +08:00 -
2e33352419
[Triton-MLIR] Fix side effects (#906)
Keren Zhou
2022-11-22 23:29:18 -08:00 -
037f9efa95
[Triton-MLIR][BACKEND] Fix wpt overflow issue in mma v2 (#904)
Yan Chunwei
2022-11-23 11:27:15 +08:00 -
07786dc932
[Triton-MLIR] Add compute capability (#902)
ben-zhang-609
2022-11-23 03:08:23 +08:00 -
2afebcd79b
[Triton-MLIR][Backend] Remove unnecessary barriers (#901)
Keren Zhou
2022-11-22 10:03:29 -08:00 -
136668bac3
[Triton-MLIR][BACKEND] tiny code cleanup (#899)
Yan Chunwei
2022-11-21 16:00:46 +08:00 -
04b852e031
[Triton-MLIR] Fix warnings and variable names (#898)
Keren Zhou
2022-11-20 22:25:27 -08:00 -
85cccfb81f
[BUILD] Fix compilation problems in the release build (#897)
Keren Zhou
2022-11-20 21:40:36 -08:00 -
23f71daa27
[OPTIMIZER] Fixed up order of shared layouts (#881)
Philippe Tillet
2022-11-21 06:25:02 +01:00 -
44f577984d
Fix format double substitution bug: {i} => {{i}} (#886)
Crutcher Dunnavant
2022-11-20 11:44:42 -08:00 -
4d64ffb5fe
[FRONTEND] Handle for loops with negative constant steps (#896)
Philippe Tillet
2022-11-20 11:37:38 +01:00 -
6c5f646f4e
[WIP][Triton-MLIR] Prefetch pass fixup (#873)
Keren Zhou
2022-11-19 19:57:16 -08:00 -
e8994209f4
[Triton-MLIR][Backend]fix mma-v2 transpose error (#888)
Yan Chunwei
2022-11-20 11:29:09 +08:00 -
8a5647782d
[Triton-MLIR][Testing]Fix tests warning, with small code clean-up (#894)
Jun Yang
2022-11-19 22:33:59 +08:00 -
afaf59b0c9
[TRITON-MLIR][BACKEND] Atomic support mask (#889)
donproc
2022-11-19 19:57:19 +08:00 -
46fd581b0a
Merge pull request #29 from ROCmSoftwarePlatform/parse_amdgcn_from_rocminfo
rocm
rsanthanam-amd
2022-11-18 12:53:25 -06:00 -
8cc448d92e
Changes to eliminate the need for the MI_GPU_ARCH environment variable.
Rohit Santhanam
2022-11-18 12:58:51 +00:00 -
dab4855bdf
[TESTING] Added infrastructure for executing TTGIR program and test for layout conversions (#885)
Philippe Tillet
2022-11-18 07:46:45 +01:00 -
9ea6135eb5
[Triton-MLIR][Backend] Some cleanup in getMultiDimIndex/getLinearIndex (#880)
goostavz
2022-11-18 09:19:21 +08:00 -
0e4691e6dd
[FRONTEND] Fix ExternLibrary(format=) bug; type annotate build_extern.py (#883)
Crutcher Dunnavant
2022-11-17 09:45:30 -08:00 -
5eee738df7
[Triton-MLIR][FRONTEND] [BACKEND] fix atomics (#879)
donproc
2022-11-16 12:25:15 +08:00 -
37f5846280
[Triton-MLIR][Backend] Minor fix for allocation and backend in handling tt.ptr tensors (#878)
goostavz
2022-11-15 18:08:07 +08:00 -
a22ff39017
[Triton-MLIR][BACKEND] Refine/add codegen for get_promgram_id and get_num_programs Op (#877)
Yan Chunwei
2022-11-15 15:45:24 +08:00 -
4c4159c6fa
[Triton-MLIR] Add ex2.approx implementation for ExpOp and fix smem allocation for ReduceOpConversion (#875)
Qingyi Liu
2022-11-15 09:27:32 +08:00 -
c28cfd821b
[Triton-MLIR][Backend] Fix convert_layout blocked->shared in non-default order (#876)
goostavz
2022-11-15 09:02:46 +08:00 -
1eedaf7bec
[Triton-MLIR][BACKEND] adapt DotOp layout for FMADot (#872)
Yan Chunwei
2022-11-14 16:56:30 +08:00 -
516a241234
[Triton-MLIR] Fix some typos (#874)
Chenggang Zhao
2022-11-14 10:15:53 +08:00 -
f40c63fb03
[Triton-MLIR][OPTIMIZER] Cleaned up swizzling (#869)
Philippe Tillet
2022-11-10 12:05:46 -08:00 -
2aa538ec2e
[BACKEND] Added support for mma layouts in reductions (#863)
Philippe Tillet
2022-11-10 09:58:07 -08:00 -
57fd1864a7
[Triton-MLIR] Support FP8 (#864)
Chenggang Zhao
2022-11-10 15:53:06 +08:00 -
4946167241
[Triton-MLIR] tt.dot operands now must have DotOperand layout; also added prefetch pass prototype (#712)
Da Yan
2022-11-10 13:57:27 +08:00 -
8832e32683
[Triton-MLIR][BACKEND] Refine ptxbuilder (#867)
Yan Chunwei
2022-11-10 13:41:52 +08:00 -
4640023d9b
[Triton-MLIR][Backend]add atomic rmw without mask (#842)
donproc
2022-11-10 08:15:58 +08:00 -
0c87360657
[Triton-MLIR][Backend] Port FMADot conversion for DotOp (#844)
Yan Chunwei
2022-11-09 12:57:50 +08:00 -
de5b84c476
[Triton-MLIR][Backend] Fix mma<v2> int8 precision error (#850)
Yan Chunwei
2022-11-09 12:23:43 +08:00 -
e517b58d59
[Triton-MLIR] Minor fixes to enable fused-softmax and layer-norm tutorials (#835)
Qingyi Liu
2022-11-09 10:18:56 +08:00 -
2da71b2aaa
[Triton-MLIR] Increase block size K to completely eliminate shared memory bank conflicts (#862)
Keren Zhou
2022-11-08 17:39:23 -08:00 -
080b4addf8
[Triton-MLIR][Backend] Fix the order in linear/delinear and a few bugs in reduce conversion (#851)
goostavz
2022-11-09 02:10:09 +08:00 -
303790da88
[BUILD] use Python Var In Tests (#859)
Ian Bearman
2022-11-08 09:44:19 -08:00 -
137344946f
[OPTIMIZER] Fix the load-mask issue with the pipeline pass (#857)
Da Yan
2022-11-09 01:29:53 +08:00 -
976cf12af1
[OPTIMIZER] Fixed memory coalescing (#847)
Philippe Tillet
2022-11-07 06:22:18 -08:00 -
b6f15e214b
[FRONTEND] Fixed up type cast in atomics codegen (#853)
Philippe Tillet
2022-11-07 05:46:24 -08:00 -
84ad215268
[Triton-MLIR] Enable libdevice for ptx backend when has external functions. (#848)
ben-zhang-609
2022-11-07 16:01:50 +08:00 -
fdd59900f7
[Triton-MLIR] Replace triton.extract_slice with tensor.extract_slice and support more general tensor slicing (#837)
Keren Zhou
2022-11-06 22:59:03 -08:00 -
a4ff0c362c
[FRONTEND] Fix issues with atomics (#849)
Philippe Tillet
2022-11-06 20:52:11 -08:00 -
d767919bc1
[OPTIMIZER] Not using MMA on FP32 when allowTF32 is false
port-fma
Phil Tillet
2022-11-04 23:16:28 -07:00 -
0d7e753227
[TESTING] use torch.int for autotuning cache (#840)
Natalia Gimelshein
2022-11-04 18:05:16 -07:00 -
b6dbe959f0
[RUNTIME] Re-vamped cache so users can manually patch IR / ptx / cubin files (#845)
Philippe Tillet
2022-11-04 10:57:29 -07:00 -
b39cc56f93
up
Superjomn
2022-11-04 18:04:20 +08:00 -
db64477153
Merge remote-tracking branch 'origin/triton-mlir' into port-fma
Superjomn
2022-11-04 17:43:54 +08:00 -
95d8e383cb
add testing
Superjomn
2022-11-04 17:43:09 +08:00 -
1ed6ee34ba
finish coding
Superjomn
2022-11-04 16:54:05 +08:00