[GH-PAGES] Updated website

2022-08-20 00:47:58 +00:00
parent a7462d444b
commit b4367e5d30
165 changed files with 234 additions and 234 deletions
--- a/master/_sources/getting-started/tutorials/01-vector-add.rst.txt
+++ b/master/_sources/getting-started/tutorials/01-vector-add.rst.txt
@@ -234,18 +234,18 @@ We can now run the decorated function above. Pass `print_data=True` to see the p
               size      Triton       Torch
    0        4096.0    9.600000    9.600000
    1        8192.0   19.200000   19.200000
-    2       16384.0   31.999999   38.400001
+    2       16384.0   38.400001   38.400001
    3       32768.0   76.800002   76.800002
    4       65536.0  127.999995  127.999995
    5      131072.0  219.428568  219.428568
-    6      262144.0  384.000001  341.333321
+    6      262144.0  341.333321  384.000001
    7      524288.0  472.615390  472.615390
    8     1048576.0  614.400016  614.400016
    9     2097152.0  722.823517  722.823517
    10    4194304.0  780.190482  780.190482
    11    8388608.0  812.429770  812.429770
    12   16777216.0  833.084721  833.084721
-    13   33554432.0  842.004273  842.004273
+    13   33554432.0  842.004273  843.811163
    14   67108864.0  847.448255  848.362445
    15  134217728.0  849.737435  850.656574

@@ -255,7 +255,7 @@ We can now run the decorated function above. Pass `print_data=True` to see the p

 .. rst-class:: sphx-glr-timing

-   **Total running time of the script:** ( 1 minutes  41.469 seconds)
+   **Total running time of the script:** ( 1 minutes  43.980 seconds)


 .. _sphx_glr_download_getting-started_tutorials_01-vector-add.py:
--- a/master/_sources/getting-started/tutorials/02-fused-softmax.rst.txt
+++ b/master/_sources/getting-started/tutorials/02-fused-softmax.rst.txt
@@ -278,17 +278,17 @@ We will then compare its performance against (1) :code:`torch.softmax` and (2) t

    softmax-performance:
              N      Triton  Torch (native)  Torch (jit)
-    0     256.0  546.133347      546.133347   188.321838
+    0     256.0  546.133347      546.133347   190.511628
    1     384.0  614.400016      585.142862   153.600004
    2     512.0  655.360017      606.814814   154.566038
    3     640.0  706.206879      640.000002   160.000000
    4     768.0  722.823517      664.216187   162.754967
    ..      ...         ...             ...          ...
-    93  12160.0  812.359066      406.179533   199.038365
-    94  12288.0  812.429770      415.661740   199.298541
-    95  12416.0  812.498981      412.149375   198.954424
-    96  12544.0  812.566838      412.546756   199.111113
-    97  12672.0  811.007961      412.097543   199.264875
+    93  12160.0  812.359066      406.179533   198.631953
+    94  12288.0  812.429770      415.661740   198.995960
+    95  12416.0  812.498981      412.149375   198.606350
+    96  12544.0  810.925276      412.971190   198.864492
+    97  12672.0  811.007961      412.097543   198.873965

    [98 rows x 4 columns]

@@ -306,7 +306,7 @@ In the above plot, we can see that:

 .. rst-class:: sphx-glr-timing

-   **Total running time of the script:** ( 3 minutes  28.831 seconds)
+   **Total running time of the script:** ( 3 minutes  30.496 seconds)


 .. _sphx_glr_download_getting-started_tutorials_02-fused-softmax.py:
--- a/master/_sources/getting-started/tutorials/03-matrix-multiplication.rst.txt
+++ b/master/_sources/getting-started/tutorials/03-matrix-multiplication.rst.txt
@@ -459,37 +459,37 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we

    matmul-performance:
             M     cuBLAS  ...     Triton  Triton (+ LeakyReLU)
-    0    256.0   2.978909  ...   3.276800              2.978909
+    0    256.0   2.978909  ...   2.978909              2.978909
    1    384.0   7.372800  ...   8.507077              8.507077
-    2    512.0  14.563555  ...  15.420235             16.384000
+    2    512.0  14.563555  ...  16.384000             16.384000
    3    640.0  22.260869  ...  24.380953             24.380953
    4    768.0  32.768000  ...  35.389441             34.028308
    5    896.0  39.025776  ...  40.140799             39.025776
-    6   1024.0  49.932191  ...  53.773130             52.428801
-    7   1152.0  44.566925  ...  48.161033             48.161033
+    6   1024.0  51.150050  ...  53.773130             52.428801
+    7   1152.0  45.242181  ...  47.396572             47.396572
    8   1280.0  51.200001  ...  57.690139             57.690139
-    9   1408.0  64.138541  ...  69.009825             68.147202
+    9   1408.0  64.138541  ...  68.147202             67.305878
    10  1536.0  80.430545  ...  81.355034             79.526831
-    11  1664.0  63.372618  ...  63.372618             62.929456
-    12  1792.0  72.983276  ...  73.460287             59.467852
-    13  1920.0  68.776119  ...  71.257735             71.257735
+    11  1664.0  62.929456  ...  63.372618             62.492442
+    12  1792.0  72.512412  ...  73.460287             59.467852
+    13  1920.0  69.120002  ...  71.626943             70.892307
    14  2048.0  73.908442  ...  78.398206             77.314362
-    15  2176.0  83.500614  ...  87.494120             86.367588
-    16  2304.0  68.251065  ...  78.064941             77.307030
-    17  2432.0  71.305746  ...  85.915795             83.864074
-    18  2560.0  78.019048  ...  82.747477             81.715711
-    19  2688.0  83.922689  ...  91.185232             89.254248
-    20  2816.0  80.026067  ...  83.712490             83.074685
-    21  2944.0  82.102191  ...  82.921853             82.784108
-    22  3072.0  81.884457  ...  89.310890             88.612060
-    23  3200.0  83.879425  ...  96.096095             93.704243
-    24  3328.0  84.101981  ...  85.602017             85.806075
-    25  3456.0  81.435930  ...  90.687926             90.994998
-    26  3584.0  87.042978  ...  93.273228             96.683219
-    27  3712.0  85.822459  ...  88.876645             87.094458
-    28  3840.0  81.798814  ...  88.050954             88.971840
-    29  3968.0  92.758598  ...  85.871877             91.301109
-    30  4096.0  86.536250  ...  87.041329             91.304576
+    15  2176.0  83.500614  ...  87.876193             86.367588
+    16  2304.0  68.446623  ...  78.064941             77.307030
+    17  2432.0  71.305746  ...  85.915795             75.320281
+    18  2560.0  78.019048  ...  82.747477             81.512437
+    19  2688.0  83.186525  ...  90.640519             89.254248
+    20  2816.0  82.135981  ...  84.035084             83.873477
+    21  2944.0  81.298583  ...  81.431424             83.337844
+    22  3072.0  82.062468  ...  85.275760             88.612060
+    23  3200.0  85.219705  ...  96.822991             95.808380
+    24  3328.0  84.101981  ...  85.398926             84.695641
+    25  3456.0  81.766291  ...  91.824110             90.790053
+    26  3584.0  86.540320  ...  88.238857             94.250936
+    27  3712.0  85.382349  ...  87.322855             86.829501
+    28  3840.0  80.990111  ...  93.563449             84.777307
+    29  3968.0  93.112506  ...  82.840416             84.915752
+    30  4096.0  86.259046  ...  85.489001             89.657802

    [31 rows x 5 columns]

@@ -499,7 +499,7 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we

 .. rst-class:: sphx-glr-timing

-   **Total running time of the script:** ( 6 minutes  32.425 seconds)
+   **Total running time of the script:** ( 6 minutes  34.019 seconds)


 .. _sphx_glr_download_getting-started_tutorials_03-matrix-multiplication.py:
--- a/master/_sources/getting-started/tutorials/04-low-memory-dropout.rst.txt
+++ b/master/_sources/getting-started/tutorials/04-low-memory-dropout.rst.txt
@@ -240,7 +240,7 @@ References

 .. rst-class:: sphx-glr-timing

-   **Total running time of the script:** ( 0 minutes  0.012 seconds)
+   **Total running time of the script:** ( 0 minutes  0.013 seconds)


 .. _sphx_glr_download_getting-started_tutorials_04-low-memory-dropout.py:
--- a/master/_sources/getting-started/tutorials/05-layer-norm.rst.txt
+++ b/master/_sources/getting-started/tutorials/05-layer-norm.rst.txt
@@ -42,7 +42,7 @@ Layer Normalization
    1    1536.0  630.153868  323.368435  511.999982
    2    2048.0  682.666643  334.367358  520.126988
    3    2560.0  694.237267  365.714281  518.481028
-    4    3072.0  712.347810  378.092307  501.551037
+    4    3072.0  712.347810  378.092307  496.484863
    5    3584.0  725.873439  384.859062  451.527536
    6    4096.0  728.177767  381.023256  455.111095
    7    4608.0  670.254540  394.267384  421.302872
@@ -53,21 +53,21 @@ Layer Normalization
    12   7168.0  690.891575  396.844306  387.459443
    13   7680.0  678.895043  393.846167  386.415087
    14   8192.0  639.375598  393.609605  372.363633
-    15   8704.0  624.502255  389.005597  380.502740
-    16   9216.0  604.327881  407.337026  383.002605
-    17   9728.0  585.142883  409.599987  383.369452
+    15   8704.0  627.315309  389.005597  380.502740
+    16   9216.0  606.814809  407.337026  383.999986
+    17   9728.0  587.350922  409.599987  383.369452
    18  10240.0  564.965524  408.578556  382.803739
-    19  10752.0  546.133312  411.559798  381.445676
+    19  10752.0  547.872604  411.559798  381.445676
    20  11264.0  533.207081  406.826188  373.134567
    21  11776.0  520.486200  409.599991  377.587162
-    22  12288.0  516.031509  413.911572  383.251457
+    22  12288.0  514.680630  414.784810  383.251457
    23  12800.0  504.433489  410.420828  376.470582
    24  13312.0  494.180982  405.699062  376.976995
    25  13824.0  482.934503  411.888257  379.389355
    26  14336.0  471.967074  406.695045  374.185964
    27  14848.0  461.297068  408.192434  375.304904
    28  15360.0  454.269882  406.214870  378.092307
-    29  15872.0  447.887117  406.974373  376.225175
+    29  15872.0  447.098578  406.974373  376.783377



@@ -393,7 +393,7 @@ Layer Normalization

 .. rst-class:: sphx-glr-timing

-   **Total running time of the script:** ( 5 minutes  36.947 seconds)
+   **Total running time of the script:** ( 5 minutes  34.799 seconds)


 .. _sphx_glr_download_getting-started_tutorials_05-layer-norm.py:
--- a/master/_sources/getting-started/tutorials/06-fused-attention.rst.txt
+++ b/master/_sources/getting-started/tutorials/06-fused-attention.rst.txt
@@ -390,7 +390,7 @@ This is a Triton implementation of the Flash Attention algorithm

 .. rst-class:: sphx-glr-timing

-   **Total running time of the script:** ( 0 minutes  0.077 seconds)
+   **Total running time of the script:** ( 0 minutes  0.080 seconds)


 .. _sphx_glr_download_getting-started_tutorials_06-fused-attention.py:
--- a/master/_sources/getting-started/tutorials/sg_execution_times.rst.txt
+++ b/master/_sources/getting-started/tutorials/sg_execution_times.rst.txt
@@ -5,20 +5,20 @@

 Computation times
 =================
-**17:19.771** total execution time for **getting-started_tutorials** files:
+**17:23.396** total execution time for **getting-started_tutorials** files:

 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_03-matrix-multiplication.py` (``03-matrix-multiplication.py``) | 06:32.425 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_03-matrix-multiplication.py` (``03-matrix-multiplication.py``) | 06:34.019 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_05-layer-norm.py` (``05-layer-norm.py``)                       | 05:36.947 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_05-layer-norm.py` (``05-layer-norm.py``)                       | 05:34.799 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_02-fused-softmax.py` (``02-fused-softmax.py``)                 | 03:28.831 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_02-fused-softmax.py` (``02-fused-softmax.py``)                 | 03:30.496 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_01-vector-add.py` (``01-vector-add.py``)                       | 01:41.469 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_01-vector-add.py` (``01-vector-add.py``)                       | 01:43.980 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_06-fused-attention.py` (``06-fused-attention.py``)             | 00:00.077 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_06-fused-attention.py` (``06-fused-attention.py``)             | 00:00.080 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
-| :ref:`sphx_glr_getting-started_tutorials_04-low-memory-dropout.py` (``04-low-memory-dropout.py``)       | 00:00.012 | 0.0 MB |
+| :ref:`sphx_glr_getting-started_tutorials_04-low-memory-dropout.py` (``04-low-memory-dropout.py``)       | 00:00.013 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+
 | :ref:`sphx_glr_getting-started_tutorials_07-libdevice-function.py` (``07-libdevice-function.py``)       | 00:00.010 | 0.0 MB |
 +---------------------------------------------------------------------------------------------------------+-----------+--------+