[GH-PAGES] Updated website
@@ -255,7 +255,7 @@ We can now run the decorated function above. Pass `print_data=True` to see the p
 
 .. rst-class:: sphx-glr-timing
 
-**Total running time of the script:** ( 1 minutes 43.794 seconds)
+**Total running time of the script:** ( 1 minutes 44.974 seconds)
 
 
 .. _sphx_glr_download_getting-started_tutorials_01-vector-add.py:
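
For readers following the diff: the timing line above is sphinx-gallery output from the vector-add tutorial, whose benchmark is the decorated function referenced in the hunk header. A minimal sketch of that pattern, assuming Triton's ``triton.testing`` API; the size range, line names, and the ``add`` wrapper are illustrative, not taken from this commit:

.. code-block:: python

    import torch
    import triton
    import triton.testing

    @triton.testing.perf_report(
        triton.testing.Benchmark(
            x_names=['size'],                      # problem sizes to sweep
            x_vals=[2 ** i for i in range(12, 28)],
            x_log=True,
            line_arg='provider',                   # one plotted line per provider
            line_vals=['torch', 'triton'],
            line_names=['Torch', 'Triton'],
            ylabel='GB/s',
            plot_name='vector-add-performance',
            args={},
        )
    )
    def benchmark(size, provider):
        x = torch.rand(size, device='cuda', dtype=torch.float32)
        y = torch.rand(size, device='cuda', dtype=torch.float32)
        # `add` stands in for the tutorial's Triton kernel wrapper (hypothetical here).
        fn = (lambda: x + y) if provider == 'torch' else (lambda: add(x, y))
        ms = triton.testing.do_bench(fn)  # runtime in ms (return shape varies across Triton versions)
        return 3 * x.numel() * x.element_size() * 1e-9 / (ms * 1e-3)  # GB/s: read x, read y, write out

    # `print_data=True` prints the table that sphinx-gallery captures in the page above.
    benchmark.run(print_data=True, show_plots=False)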
@@ -287,7 +287,7 @@ We will then compare its performance against (1) :code:`torch.softmax` and (2) t
     93  12160.0  812.359066  406.179533  198.733401
     94  12288.0  812.429770  415.661740  198.995960
     95  12416.0  812.498981  412.149375  198.655991
-    96  12544.0  812.566838  412.971190  198.864492
+    96  12544.0  810.925276  412.546756  198.864492
     97  12672.0  811.007961  412.097543  198.971549
 
 [98 rows x 4 columns]
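
The rows above are the tail of the softmax benchmark table: bandwidth figures (GB/s) for the Triton kernel against the Torch baselines named in the hunk header. A hedged sketch of how one such figure is obtained, assuming the usual one-read-one-write traffic model; shapes are illustrative:

.. code-block:: python

    import torch
    import triton.testing

    # One benchmark point: row-wise softmax on an (M, N) matrix.
    N = 12544
    x = torch.randn(4096, N, device='cuda', dtype=torch.float32)

    ms = triton.testing.do_bench(lambda: torch.softmax(x, dim=-1))

    # Softmax reads and writes each element once, so traffic ~= 2 * numel * itemsize.
    gbps = 2 * x.nelement() * x.element_size() * 1e-9 / (ms * 1e-3)
    print(f'{gbps:.6f} GB/s')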
@@ -306,7 +306,7 @@ In the above plot, we can see that:
 
 .. rst-class:: sphx-glr-timing
 
-**Total running time of the script:** ( 3 minutes 29.999 seconds)
+**Total running time of the script:** ( 3 minutes 30.087 seconds)
 
 
 .. _sphx_glr_download_getting-started_tutorials_02-fused-softmax.py:
@@ -459,37 +459,37 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we
 
 matmul-performance:
          M     cuBLAS  ...     Triton  Triton (+ LeakyReLU)
-0    256.0   2.730667  ...   2.978909   3.276800
-1    384.0   7.372800  ...   8.507077   8.507077
-2    512.0  14.563555  ...  16.384000  16.384000
+0    256.0   2.978909  ...   2.978909   2.978909
+1    384.0   7.372800  ...   8.507077   7.899428
+2    512.0  14.563555  ...  16.384000  15.420235
 3    640.0  22.260869  ...  24.380953  24.380953
 4    768.0  32.768000  ...  35.389441  34.028308
 5    896.0  39.025776  ...  40.140799  39.025776
 6   1024.0  49.932191  ...  53.773130  52.428801
-7   1152.0  45.242181  ...  47.396572  47.396572
+7   1152.0  45.242181  ...  48.161033  47.396572
 8   1280.0  51.200001  ...  57.690139  57.690139
-9   1408.0  64.138541  ...  68.147202  67.305878
+9   1408.0  64.138541  ...  69.009825  68.147202
 10  1536.0  80.430545  ...  81.355034  79.526831
-11  1664.0  62.929456  ...  63.372618  62.492442
-12  1792.0  72.512412  ...  73.460287  59.467852
-13  1920.0  69.120002  ...  71.257735  71.257735
-14  2048.0  73.584279  ...  78.398206  77.314362
-15  2176.0  83.500614  ...  87.494120  85.998493
-16  2304.0  68.251065  ...  78.064941  77.307030
-17  2432.0  71.305746  ...  86.711310  83.614477
-18  2560.0  78.019048  ...  82.747477  81.715711
-19  2688.0  83.737433  ...  90.316801  89.254248
-20  2816.0  79.733474  ...  84.197315  83.074685
-21  2944.0  82.034625  ...  83.060049  82.237674
-22  3072.0  82.661468  ...  85.147525  88.750943
-23  3200.0  84.768213  ...  94.814812  95.808380
-24  3328.0  83.034941  ...  85.096096  81.346098
-25  3456.0  81.026701  ...  89.579522  83.545665
-26  3584.0  85.633710  ...  93.661869  94.947616
-27  3712.0  85.455380  ...  87.246590  87.552452
-28  3840.0  81.738356  ...  89.766237  89.693434
-29  3968.0  88.938731  ...  92.163097  85.093402
-30  4096.0  93.401342  ...  86.009438  85.543487
+11  1664.0  63.372618  ...  63.822072  62.492442
+12  1792.0  72.983276  ...  73.943582  59.625589
+13  1920.0  69.467336  ...  71.626943  71.257735
+14  2048.0  73.908442  ...  78.398206  77.314362
+15  2176.0  83.155572  ...  87.304326  85.998493
+16  2304.0  68.446623  ...  78.064941  77.307030
+17  2432.0  71.305746  ...  86.179335  85.653855
+18  2560.0  77.833728  ...  82.956960  81.715711
+19  2688.0  83.369354  ...  90.102270  89.464755
+20  2816.0  80.099554  ...  84.687779  83.873477
+21  2944.0  82.237674  ...  83.337844  82.102191
+22  3072.0  81.589488  ...  89.877939  88.335577
+23  3200.0  84.210524  ...  95.808380  93.841640
+24  3328.0  84.003845  ...  85.398926  84.895397
+25  3456.0  81.766291  ...  92.033756  91.200871
+26  3584.0  86.125852  ...  92.220917  94.647779
+27  3712.0  85.309435  ...  89.035062  82.287760
+28  3840.0  84.485870  ...  92.817458  88.686451
+29  3968.0  92.372393  ...  85.033178  90.724116
+30  4096.0  86.202781  ...  92.820009  88.563330
 
 [31 rows x 5 columns]
 
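
The matmul table compares throughput (TFLOPS, per the tutorial's convention) for cuBLAS, the Triton kernel, and a fused-LeakyReLU variant; the middle columns are elided by pandas. A sketch of the cuBLAS baseline measurement, assuming square fp16 matrices reached through ``torch.matmul``:

.. code-block:: python

    import torch
    import triton.testing

    # One row of matmul-performance: square matrices, M = N = K.
    M = N = K = 2048
    a = torch.randn(M, K, device='cuda', dtype=torch.float16)
    b = torch.randn(K, N, device='cuda', dtype=torch.float16)

    # torch.matmul dispatches to cuBLAS on CUDA tensors.
    ms = triton.testing.do_bench(lambda: torch.matmul(a, b))

    # A GEMM performs 2*M*N*K floating-point operations; convert ms -> TFLOPS.
    tflops = 2 * M * N * K * 1e-12 / (ms * 1e-3)
    print(f'cuBLAS: {tflops:.6f} TFLOPS')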
@@ -499,7 +499,7 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we
 
 .. rst-class:: sphx-glr-timing
 
-**Total running time of the script:** ( 6 minutes 31.264 seconds)
+**Total running time of the script:** ( 6 minutes 33.939 seconds)
 
 
 .. _sphx_glr_download_getting-started_tutorials_03-matrix-multiplication.py:
@@ -40,16 +40,16 @@ Layer Normalization
          N      Triton       Torch        Apex
 0   1024.0  585.142849  277.694907  468.114273
 1   1536.0  630.153868  323.368435  511.999982
-2   2048.0  668.734716  334.367358  520.126988
-3   2560.0  694.237267  365.714281  518.481028
+2   2048.0  682.666643  334.367358  520.126988
+3   2560.0  694.237267  365.714281  512.000013
 4   3072.0  712.347810  378.092307  496.484863
-5   3584.0  725.873439  384.859062  455.111115
+5   3584.0  725.873439  384.859062  448.000001
 6   4096.0  728.177767  381.023256  455.111095
-7   4608.0  670.254540  394.267384  421.302872
-8   5120.0  688.403381  397.669909  424.455959
-9   5632.0  704.000002  395.228063  413.357796
+7   4608.0  670.254540  394.267384  426.173427
+8   5120.0  688.403381  397.669909  422.268057
+9   5632.0  704.000002  395.228063  415.262685
 10  6144.0  697.191505  402.885254  409.600010
-11  6656.0  700.631610  400.360920  400.360920
+11  6656.0  705.271522  400.360920  400.360920
 12  7168.0  690.891575  396.844306  387.459443
 13  7680.0  678.895043  393.846167  386.415087
 14  8192.0  636.271854  393.609605  371.308771
@@ -60,14 +60,14 @@ Layer Normalization
 19  10752.0  547.872604  411.559798  381.445676
 20  11264.0  533.207081  406.826188  373.134567
 21  11776.0  520.486200  409.599991  377.587162
-22  12288.0  514.680630  413.911572  383.251457
+22  12288.0  513.336807  413.911572  383.251457
 23  12800.0  504.433489  410.420828  376.470582
 24  13312.0  494.180982  405.699062  376.976995
 25  13824.0  482.934503  411.888257  379.389355
 26  14336.0  471.967074  406.695045  374.185964
 27  14848.0  461.297068  408.192434  375.304904
 28  15360.0  454.269882  406.214870  378.092307
-29  15872.0  447.098578  406.974373  376.225175
+29  15872.0  447.887117  406.974373  376.225175
 
 
 
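
The layer-norm table reports bandwidth for Triton against ``torch.nn.functional.layer_norm`` and, per the column name, NVIDIA Apex's fused implementation, over rows of length N. A sketch of the Torch baseline under a simple read-plus-write traffic model; the batch size and dtype are assumptions, not taken from this commit:

.. code-block:: python

    import torch
    import triton.testing

    # Torch baseline for one row of the Layer Normalization table.
    N = 4096
    x = torch.randn(4096, N, device='cuda', dtype=torch.float16)
    weight = torch.rand(N, device='cuda', dtype=torch.float16)
    bias = torch.rand(N, device='cuda', dtype=torch.float16)

    ms = triton.testing.do_bench(
        lambda: torch.nn.functional.layer_norm(x, (N,), weight, bias, eps=1e-5)
    )

    # Assume one read and one write of x dominate the memory traffic.
    gbps = 2 * x.nelement() * x.element_size() * 1e-9 / (ms * 1e-3)
    print(f'Torch: {gbps:.6f} GB/s')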
@@ -393,7 +393,7 @@ Layer Normalization
 
 .. rst-class:: sphx-glr-timing
 
-**Total running time of the script:** ( 5 minutes 33.449 seconds)
+**Total running time of the script:** ( 5 minutes 35.450 seconds)
 
 
 .. _sphx_glr_download_getting-started_tutorials_05-layer-norm.py:
@@ -390,7 +390,7 @@ This is a Triton implementation of the Flash Attention algorithm
 
 .. rst-class:: sphx-glr-timing
 
-**Total running time of the script:** ( 0 minutes 0.073 seconds)
+**Total running time of the script:** ( 0 minutes 0.075 seconds)
 
 
 .. _sphx_glr_download_getting-started_tutorials_06-fused-attention.py:
@@ -5,18 +5,18 @@
 
 Computation times
 =================
-**17:18.602** total execution time for **getting-started_tutorials** files:
+**17:24.547** total execution time for **getting-started_tutorials** files:
 
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_03-matrix-multiplication.py` (``03-matrix-multiplication.py``) | 06:31.264 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_03-matrix-multiplication.py` (``03-matrix-multiplication.py``) | 06:33.939 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_05-layer-norm.py` (``05-layer-norm.py``)                       | 05:33.449 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_05-layer-norm.py` (``05-layer-norm.py``)                       | 05:35.450 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_02-fused-softmax.py` (``02-fused-softmax.py``)                 | 03:29.999 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_02-fused-softmax.py` (``02-fused-softmax.py``)                 | 03:30.087 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_01-vector-add.py` (``01-vector-add.py``)                       | 01:43.794 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_01-vector-add.py` (``01-vector-add.py``)                       | 01:44.974 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_06-fused-attention.py` (``06-fused-attention.py``)             | 00:00.073 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_06-fused-attention.py` (``06-fused-attention.py``)             | 00:00.075 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_getting-started_tutorials_04-low-memory-dropout.py` (``04-low-memory-dropout.py``)       | 00:00.012 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+