[GH-PAGES] Updated website

Philippe Tillet
2022-06-21 00:46:27 +00:00
parent ab91a5bbc3
commit c168f03e0c
158 changed files with 236 additions and 236 deletions

@@ -459,37 +459,37 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we
 matmul-performance:
 M cuBLAS ... Triton Triton (+ LeakyReLU)
-0 256.0 2.730667 ... 2.978909 2.978909
-1 384.0 7.372800 ... 8.507077 8.507077
-2 512.0 14.563555 ... 16.384000 15.420235
+0 256.0 2.730667 ... 3.276800 3.276800
+1 384.0 7.372800 ... 7.899428 7.899428
+2 512.0 14.563555 ... 15.420235 15.420235
 3 640.0 22.260869 ... 24.380953 24.380953
-4 768.0 32.768000 ... 34.028308 34.028308
-5 896.0 39.025776 ... 40.140799 39.025776
+4 768.0 32.768000 ... 35.389441 34.028308
+5 896.0 37.971025 ... 40.140799 39.025776
 6 1024.0 49.932191 ... 53.773130 52.428801
 7 1152.0 45.242181 ... 48.161033 47.396572
 8 1280.0 51.200001 ... 57.690139 57.690139
 9 1408.0 64.138541 ... 68.147202 66.485074
-10 1536.0 79.526831 ... 80.430545 78.643199
-11 1664.0 62.929456 ... 63.372618 62.492442
+10 1536.0 80.430545 ... 81.355034 78.643199
+11 1664.0 63.372618 ... 63.372618 62.492442
 12 1792.0 72.983276 ... 73.460287 59.467852
-13 1920.0 69.120002 ... 71.257735 70.892307
+13 1920.0 69.467336 ... 71.257735 70.892307
 14 2048.0 73.262953 ... 78.033565 76.959706
-15 2176.0 83.155572 ... 87.876193 85.998493
+15 2176.0 83.155572 ... 87.494120 85.998493
 16 2304.0 68.446623 ... 78.064941 77.057651
-17 2432.0 71.305746 ... 86.711310 84.621881
-18 2560.0 77.833728 ... 82.331658 81.715711
-19 2688.0 83.369354 ... 90.966561 89.044730
-20 2816.0 82.602666 ... 84.197315 83.552120
-21 2944.0 81.698415 ... 83.477440 82.509987
-22 3072.0 82.420822 ... 86.053349 88.612060
-23 3200.0 84.712112 ... 89.387425 95.096582
-24 3328.0 83.808259 ... 85.703924 84.397770
-25 3456.0 82.519518 ... 91.928814 89.579522
-26 3584.0 85.552231 ... 95.756542 95.654673
-27 3712.0 86.044224 ... 89.353616 83.386762
-28 3840.0 85.201850 ... 93.012618 85.597527
-29 3968.0 91.747320 ... 85.330496 89.789505
-30 4096.0 91.522488 ... 90.687655 90.382307
+17 2432.0 71.125224 ... 86.444504 84.877538
+18 2560.0 77.833728 ... 82.956960 81.108913
+19 2688.0 83.552988 ... 90.966561 89.044730
+20 2816.0 82.995641 ... 84.197315 83.712490
+21 2944.0 82.646820 ... 83.758038 82.237674
+22 3072.0 82.661468 ... 87.787755 88.612060
+23 3200.0 81.424937 ... 93.704243 93.430660
+24 3328.0 81.622783 ... 85.703924 84.795401
+25 3456.0 82.604067 ... 92.033756 90.281712
+26 3584.0 85.674507 ... 93.273228 95.960933
+27 3712.0 85.748791 ... 86.118401 87.706180
+28 3840.0 81.019778 ... 88.900318 89.475729
+29 3968.0 88.008611 ... 87.315873 88.040360
+30 4096.0 93.924229 ... 93.727466 87.552332
 [31 rows x 5 columns]
@@ -499,7 +499,7 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we
 .. rst-class:: sphx-glr-timing
-**Total running time of the script:** ( 6 minutes 6.898 seconds)
+**Total running time of the script:** ( 5 minutes 59.034 seconds)
 .. _sphx_glr_download_getting-started_tutorials_03-matrix-multiplication.py: