[DOCS] Fix spelling (#664)

This PR applies minor spelling fixes in comments and string literals to
`master`. It shouldn't hurt anything.
Author: Shintaro Iwasaki
Date: 2022-09-16 12:26:40 -07:00
Committed by: GitHub
Parent: 4580a04710
Commit: c668d6596e
16 changed files with 19 additions and 19 deletions


@@ -78,7 +78,7 @@ def softmax_kernel(
     input_ptrs = row_start_ptr + col_offsets
     # Load the row into SRAM, using a mask since BLOCK_SIZE may be > than n_cols
     row = tl.load(input_ptrs, mask=col_offsets < n_cols, other=-float('inf'))
-    # Substract maximum for numerical stability
+    # Subtract maximum for numerical stability
     row_minus_max = row - tl.max(row, axis=0)
     # Note that exponentials in Triton are fast but approximate (i.e., think __expf in CUDA)
     numerator = tl.exp(row_minus_max)
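
For context (not part of this diff), the comment being corrected refers to the standard max-subtraction trick: softmax is unchanged when a constant is subtracted from every element of the row, so subtracting the row maximum before exponentiating prevents overflow without altering the result. Below is a minimal NumPy sketch of the same idea; the function name stable_softmax is illustrative only and does not appear in the Triton tutorial.

    import numpy as np

    def stable_softmax(row: np.ndarray) -> np.ndarray:
        # softmax(x) == softmax(x - c) for any constant c; choosing c = max(x)
        # keeps every exponential in (0, 1] and avoids overflow.
        row_minus_max = row - row.max()
        numerator = np.exp(row_minus_max)
        return numerator / numerator.sum()

    # Large logits would overflow a naive np.exp(row), but not the shifted version.
    print(stable_softmax(np.array([1000.0, 1001.0, 1002.0])))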


@@ -18,7 +18,7 @@ You will specifically learn about:
 # They are notoriously hard to optimize, hence their implementation is generally done by
 # hardware vendors themselves as part of so-called "kernel libraries" (e.g., cuBLAS).
 # Unfortunately, these libraries are often proprietary and cannot be easily customized
-# to accomodate the needs of modern deep learning workloads (e.g., fused activation functions).
+# to accommodate the needs of modern deep learning workloads (e.g., fused activation functions).
 # In this tutorial, you will learn how to implement efficient matrix multiplications by
 # yourself with Triton, in a way that is easy to customize and extend.
 #