[GH-PAGES] Updated website
@@ -43,7 +43,7 @@ def add_kernel(
|
|||||||
y = tl.load(y_ptr + offsets, mask=mask)
|
y = tl.load(y_ptr + offsets, mask=mask)
|
||||||
output = x + y
|
output = x + y
|
||||||
# Write x + y back to DRAM
|
# Write x + y back to DRAM
|
||||||
tl.store(output_ptr + offsets, output)
|
tl.store(output_ptr + offsets, output, mask=mask)
|
||||||
|
|
||||||
|
|
||||||
# %%
|
# %%
|
||||||
|
@@ -0,0 +1,100 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"collapsed": false
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%matplotlib inline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"\n# Low-Memory Dropout\n\nIn this tutorial, you will write a memory-efficient implementation of dropout whose state\nwill be composed of a single int32 seed. This differs from more traditional implementations of dropout,\nwhose state is generally composed of a bit mask tensor of the same shape as the input. You will learn about:\n\n- The limitations of naive implementations of Dropout with PyTorch\n- Parallel pseudo-random number generation in Triton\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Baseline\nThe *dropout* operator was first introduced in [SRIVASTAVA2014]_ as a way to improve the performance \nof deep neural networks in low-data regime (i.e. regularization).\n\nIt takes a vector as input and produces a vector of the same shape as output. Each scalar in the\noutput has a probability $p$ of being changed to zero and otherwise it is copied from the input.\nThis forces the network to perform well even when only $1 - p$ scalars from the input are available.\n\nAt evaluation time we want to use the full power of the network so we set $p=0$. Naively this would\nincrease the norm of the output (which can be a bad thing, e.g. it can lead to artificial decrease\nin the output softmax temperature). To prevent this we multiply the output by $\\frac{1}{1 - p}$, which\nkeeps the norm consistent regardless of the dropout probability.\n\nLet's first take a look at the baseline implementation.\n\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"collapsed": false
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import tabulate\nimport torch\nimport triton\nimport triton.language as tl\n\n@triton.jit\ndef _dropout(\n x_ptr, # pointer to the input\n x_keep_ptr, # pointer to a mask of 0s and 1s\n output_ptr, # pointer to the output\n n_elements, # number of elements in the `x` tensor\n p, # probability that an element of `x` is changed to zero\n **meta,\n):\n BLOCK_SIZE = meta['BLOCK_SIZE']\n pid = tl.program_id(axis=0)\n block_start = pid * BLOCK_SIZE\n offsets = block_start + tl.arange(0, BLOCK_SIZE)\n mask = offsets < n_elements\n # Load data\n x = tl.load(x_ptr + offsets, mask=mask)\n x_keep = tl.load(x_keep_ptr + offsets, mask=mask)\n # The line below is the crucial part, described in the paragraph above!\n output = tl.where(x_keep, x / (1 - p), 0.0)\n # Write-back output\n tl.store(output_ptr + offsets, output, mask=mask)\n\n\ndef dropout(x, x_keep, p):\n output = torch.empty_like(x)\n assert x.is_contiguous()\n n_elements = x.numel()\n grid = lambda meta: (triton.cdiv(n_elements, meta['BLOCK_SIZE']),)\n _dropout[grid](x, x_keep, output, n_elements, p, BLOCK_SIZE=1024)\n return output\n\n# Input tensor\nx = torch.randn(size=(10,)).cuda()\n# Dropout mask\np = 0.5\nx_keep = (torch.rand(size=(10,)) > p).to(torch.int32).cuda()\n#\noutput = dropout(x, x_keep=x_keep, p=p)\nprint(tabulate.tabulate([\n [\"input\"] + x.tolist(),\n [\"keep mask\"] + x_keep.tolist(),\n [\"output\"] + output.tolist()\n]))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Seeded dropout\nThe above implementation of dropout works fine, but it can be a bit awkward to deal with. Firstly\nwe need to store the dropout mask for backpropagation. Secondly, dropout state management can get\nvery tricky when using recompute/checkpointing (e.g. see all the notes about `preserve_rng_state` in\nhttps://pytorch.org/docs/1.9.0/checkpoint.html). In this tutorial we'll describe an alternative implementation\nthat (1) has a smaller memory footprint; (2) requires less data movement; and (3) simplifies the management\nof persisting randomness across multiple invocations of the kernel.\n\nPseudorandom number generation in Triton is simple! In this tutorial we will use the\n:code:`triton.language.rand` function which generates a block of uniformly distributed :code:`float32` \nvalues in [0, 1), given a seed and a block of :code:`int32` offsets. But if you need it, Triton also provides\nother `random number generation strategies <Random Number Generation>`.\n\n<div class=\"alert alert-info\"><h4>Note</h4><p>Triton's implementation of PRNG is based on the Philox algorithm (described in [SALMON2011]_).</p></div>\n\nLet's put it all together.\n\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"collapsed": false
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"@triton.jit\ndef _seeded_dropout(\n x_ptr,\n output_ptr,\n n_elements,\n p,\n seed,\n **meta,\n):\n # compute memory offsets of elements handled by this instance\n BLOCK_SIZE = meta['BLOCK_SIZE']\n pid = tl.program_id(axis=0)\n block_start = pid * BLOCK_SIZE\n offsets = block_start + tl.arange(0, BLOCK_SIZE)\n # load data from x\n mask = offsets < n_elements\n x = tl.load(x_ptr + offsets, mask=mask)\n # randomly prune it\n random = tl.rand(seed, offsets)\n x_keep = random > p\n # write-back\n output = tl.where(x_keep, x / (1 - p), 0.0)\n tl.store(output_ptr + offsets, output, mask=mask)\n\n\ndef seeded_dropout(x, p, seed):\n output = torch.empty_like(x)\n assert x.is_contiguous()\n n_elements = x.numel()\n grid = lambda meta: (triton.cdiv(n_elements, meta['BLOCK_SIZE']),)\n _seeded_dropout[grid](x, output, n_elements, p, seed, BLOCK_SIZE=1024)\n return output\n\n\nx = torch.randn(size=(10,)).cuda()\n# Compare this to the baseline - dropout mask is never instantiated!\noutput = seeded_dropout(x, p=0.5, seed=123)\noutput2 = seeded_dropout(x, p=0.5, seed=123)\noutput3 = seeded_dropout(x, p=0.5, seed=512)\n\nprint(tabulate.tabulate([\n [\"input\"] + x.tolist(),\n [\"output (seed = 123)\"] + output.tolist(),\n [\"output (seed = 123)\"] + output2.tolist(),\n [\"output (seed = 512)\"] + output3.tolist()\n]))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Et Voil\u00e0! We have a Triton kernel that applies the same dropout mask provided the seed is the same!\nIf you'd like to explore further applications of pseudorandomness in GPU programming, we encourage you\nto look at the `triton/language/random` folder!\n\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Exercises\n1. Extend the kernel to operate over a matrix and use a vector of seeds - one per row.\n2. Add support for striding.\n3. (challenge) Implement a kernel for a sparse Johnson-Lindenstrauss transform that generates the projection matrix on the fly each time using a seed.\n\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## References\n\n.. [SALMON2011] John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw, \"Parallel Random Numbers: As Easy as 1, 2, 3\", 2011\n.. [SRIVASTAVA2014] Nitish Srivastava and Geoffrey Hinton and Alex Krizhevsky and Ilya Sutskever and Ruslan Salakhutdinov, \"Dropout: A Simple Way to Prevent Neural Networks from Overfitting\", JMLR 2014\n\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.8.10"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 0
|
||||||
|
}
|
@@ -0,0 +1,164 @@
|
|||||||
|
"""
|
||||||
|
Low-Memory Dropout
|
||||||
|
==================
|
||||||
|
|
||||||
|
In this tutorial, you will write a memory-efficient implementation of dropout whose state
|
||||||
|
will be composed of a single int32 seed. This differs from more traditional implementations of dropout,
|
||||||
|
whose state is generally composed of a bit mask tensor of the same shape as the input. You will learn about:
|
||||||
|
|
||||||
|
- The limitations of naive implementations of Dropout with PyTorch
|
||||||
|
- Parallel pseudo-random number generation in Triton
|
||||||
|
"""
|
||||||
|
|
||||||
|
# %%
|
||||||
|
# Baseline
|
||||||
|
# -------------
|
||||||
|
# The *dropout* operator was first introduced in [SRIVASTAVA2014]_ as a way to improve the performance
|
||||||
|
# of deep neural networks in low-data regime (i.e. regularization).
|
||||||
|
#
|
||||||
|
# It takes a vector as input and produces a vector of the same shape as output. Each scalar in the
|
||||||
|
# output has a probability :math:`p` of being changed to zero and otherwise it is copied from the input.
|
||||||
|
# This forces the network to perform well even when only :math:`1 - p` scalars from the input are available.
|
||||||
|
#
|
||||||
|
# At evaluation time we want to use the full power of the network so we set :math:`p=0`. Naively this would
|
||||||
|
# increase the norm of the output (which can be a bad thing, e.g. it can lead to artificial decrease
|
||||||
|
# in the output softmax temperature). To prevent this we multiply the output by :math:`\frac{1}{1 - p}`, which
|
||||||
|
# keeps the norm consistent regardless of the dropout probability.
|
||||||
|
#
|
||||||
|
# Let's first take a look at the baseline implementation.
|
||||||
|
|
||||||
|
|
||||||
|
import tabulate
|
||||||
|
import torch
|
||||||
|
import triton
|
||||||
|
import triton.language as tl
|
||||||
|
|
||||||
|
@triton.jit
|
||||||
|
def _dropout(
|
||||||
|
x_ptr, # pointer to the input
|
||||||
|
x_keep_ptr, # pointer to a mask of 0s and 1s
|
||||||
|
output_ptr, # pointer to the output
|
||||||
|
n_elements, # number of elements in the `x` tensor
|
||||||
|
p, # probability that an element of `x` is changed to zero
|
||||||
|
**meta,
|
||||||
|
):
|
||||||
|
BLOCK_SIZE = meta['BLOCK_SIZE']
|
||||||
|
pid = tl.program_id(axis=0)
|
||||||
|
block_start = pid * BLOCK_SIZE
|
||||||
|
offsets = block_start + tl.arange(0, BLOCK_SIZE)
|
||||||
|
mask = offsets < n_elements
|
||||||
|
# Load data
|
||||||
|
x = tl.load(x_ptr + offsets, mask=mask)
|
||||||
|
x_keep = tl.load(x_keep_ptr + offsets, mask=mask)
|
||||||
|
# The line below is the crucial part, described in the paragraph above!
|
||||||
|
output = tl.where(x_keep, x / (1 - p), 0.0)
|
||||||
|
# Write-back output
|
||||||
|
tl.store(output_ptr + offsets, output, mask=mask)
|
||||||
|
|
||||||
|
|
||||||
|
def dropout(x, x_keep, p):
|
||||||
|
output = torch.empty_like(x)
|
||||||
|
assert x.is_contiguous()
|
||||||
|
n_elements = x.numel()
|
||||||
|
grid = lambda meta: (triton.cdiv(n_elements, meta['BLOCK_SIZE']),)
|
||||||
|
_dropout[grid](x, x_keep, output, n_elements, p, BLOCK_SIZE=1024)
|
||||||
|
return output
|
||||||
|
|
||||||
|
# Input tensor
|
||||||
|
x = torch.randn(size=(10,)).cuda()
|
||||||
|
# Dropout mask
|
||||||
|
p = 0.5
|
||||||
|
x_keep = (torch.rand(size=(10,)) > p).to(torch.int32).cuda()
|
||||||
|
#
|
||||||
|
output = dropout(x, x_keep=x_keep, p=p)
|
||||||
|
print(tabulate.tabulate([
|
||||||
|
["input"] + x.tolist(),
|
||||||
|
["keep mask"] + x_keep.tolist(),
|
||||||
|
["output"] + output.tolist()
|
||||||
|
]))
|
||||||
|
|
||||||
|
# %%
|
||||||
|
# Seeded dropout
|
||||||
|
# --------------
|
||||||
|
# The above implementation of dropout works fine, but it can be a bit awkward to deal with. Firstly
|
||||||
|
# we need to store the dropout mask for backpropagation. Secondly, dropout state management can get
|
||||||
|
# very tricky when using recompute/checkpointing (e.g. see all the notes about `preserve_rng_state` in
|
||||||
|
# https://pytorch.org/docs/1.9.0/checkpoint.html). In this tutorial we'll describe an alternative implementation
|
||||||
|
# that (1) has a smaller memory footprint; (2) requires less data movement; and (3) simplifies the management
|
||||||
|
# of persisting randomness across multiple invocations of the kernel.
|
||||||
|
#
|
||||||
|
# Pseudorandom number generation in Triton is simple! In this tutorial we will use the
|
||||||
|
# :code:`triton.language.rand` function which generates a block of uniformly distributed :code:`float32`
|
||||||
|
# values in [0, 1), given a seed and a block of :code:`int32` offsets. But if you need it, Triton also provides
|
||||||
|
# other :ref:`random number generation strategies <Random Number Generation>`.
|
||||||
|
#
|
||||||
|
# .. note::
|
||||||
|
# Triton's implementation of PRNG is based on the Philox algorithm (described in [SALMON2011]_).
|
||||||
|
#
|
||||||
|
# Let's put it all together.
|
||||||
|
|
||||||
|
@triton.jit
|
||||||
|
def _seeded_dropout(
|
||||||
|
x_ptr,
|
||||||
|
output_ptr,
|
||||||
|
n_elements,
|
||||||
|
p,
|
||||||
|
seed,
|
||||||
|
**meta,
|
||||||
|
):
|
||||||
|
# compute memory offsets of elements handled by this instance
|
||||||
|
BLOCK_SIZE = meta['BLOCK_SIZE']
|
||||||
|
pid = tl.program_id(axis=0)
|
||||||
|
block_start = pid * BLOCK_SIZE
|
||||||
|
offsets = block_start + tl.arange(0, BLOCK_SIZE)
|
||||||
|
# load data from x
|
||||||
|
mask = offsets < n_elements
|
||||||
|
x = tl.load(x_ptr + offsets, mask=mask)
|
||||||
|
# randomly prune it
|
||||||
|
random = tl.rand(seed, offsets)
|
||||||
|
x_keep = random > p
|
||||||
|
# write-back
|
||||||
|
output = tl.where(x_keep, x / (1 - p), 0.0)
|
||||||
|
tl.store(output_ptr + offsets, output, mask=mask)
|
||||||
|
|
||||||
|
|
||||||
|
def seeded_dropout(x, p, seed):
|
||||||
|
output = torch.empty_like(x)
|
||||||
|
assert x.is_contiguous()
|
||||||
|
n_elements = x.numel()
|
||||||
|
grid = lambda meta: (triton.cdiv(n_elements, meta['BLOCK_SIZE']),)
|
||||||
|
_seeded_dropout[grid](x, output, n_elements, p, seed, BLOCK_SIZE=1024)
|
||||||
|
return output
|
||||||
|
|
||||||
|
|
||||||
|
x = torch.randn(size=(10,)).cuda()
|
||||||
|
# Compare this to the baseline - dropout mask is never instantiated!
|
||||||
|
output = seeded_dropout(x, p=0.5, seed=123)
|
||||||
|
output2 = seeded_dropout(x, p=0.5, seed=123)
|
||||||
|
output3 = seeded_dropout(x, p=0.5, seed=512)
|
||||||
|
|
||||||
|
print(tabulate.tabulate([
|
||||||
|
["input"] + x.tolist(),
|
||||||
|
["output (seed = 123)"] + output.tolist(),
|
||||||
|
["output (seed = 123)"] + output2.tolist(),
|
||||||
|
["output (seed = 512)"] + output3.tolist()
|
||||||
|
]))
|
||||||
|
|
||||||
|
# %%
|
||||||
|
# Et Voilà! We have a triton kernel that applies the same dropout mask provided the seed is the same!
|
||||||
|
# If you'd like to explore further applications of pseudorandomness in GPU programming, we encourage you
|
||||||
|
# to explore the `triton/language/random` folder!
|
||||||
|
|
||||||
|
# %%
|
||||||
|
# Exercises
|
||||||
|
# -------------
|
||||||
|
# 1. Extend the kernel to operate over a matrix and use a vector of seeds - one per row.
|
||||||
|
# 2. Add support for striding.
|
||||||
|
# 3. (challenge) Implement a kernel for a sparse Johnson-Lindenstrauss transform that generates the projection matrix on the fly each time using a seed.
|
||||||
|
|
||||||
|
# %%
|
||||||
|
# References
|
||||||
|
# --------------
|
||||||
|
#
|
||||||
|
# .. [SALMON2011] John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw, "Parallel Random Numbers: As Easy as 1, 2, 3", 2011
|
||||||
|
# .. [SRIVASTAVA2014] Nitish Srivastava and Geoffrey Hinton and Alex Krizhevsky and Ilya Sutskever and Ruslan Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting", JMLR 2014
|
@@ -33,7 +33,7 @@
|
|||||||
},
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"import torch\nimport triton\nimport triton.language as tl\n\n\n@triton.jit\ndef add_kernel(\n x_ptr, # *Pointer* to first input vector\n y_ptr, # *Pointer* to second input vector\n output_ptr, # *Pointer* to output vector\n n_elements, # Size of the vector\n **meta, # Optional meta-parameters for the kernel\n):\n BLOCK_SIZE = meta['BLOCK_SIZE'] # How many inputs each program should process\n # There are multiple 'program's processing different data. We identify which program\n # we are here\n pid = tl.program_id(axis=0) # We use a 1D launch grid so axis is 0\n # This program will process inputs that are offset from the initial data.\n # for instance, if you had a vector of length 256 and block_size of 64, the programs\n # would each access the elements [0:64, 64:128, 128:192, 192:256].\n # Note that offsets is a list of pointers\n block_start = pid * BLOCK_SIZE\n offsets = block_start + tl.arange(0, BLOCK_SIZE)\n # Create a mask to guard memory operations against out-of-bounds accesses\n mask = offsets < n_elements\n # Load x and y from DRAM, masking out any extra elements in case the input is not a\n # multiple of the block size\n x = tl.load(x_ptr + offsets, mask=mask)\n y = tl.load(y_ptr + offsets, mask=mask)\n output = x + y\n # Write x + y back to DRAM\n tl.store(output_ptr + offsets, output)"
|
"import torch\nimport triton\nimport triton.language as tl\n\n\n@triton.jit\ndef add_kernel(\n x_ptr, # *Pointer* to first input vector\n y_ptr, # *Pointer* to second input vector\n output_ptr, # *Pointer* to output vector\n n_elements, # Size of the vector\n **meta, # Optional meta-parameters for the kernel\n):\n BLOCK_SIZE = meta['BLOCK_SIZE'] # How many inputs each program should process\n # There are multiple 'program's processing different data. We identify which program\n # we are here\n pid = tl.program_id(axis=0) # We use a 1D launch grid so axis is 0\n # This program will process inputs that are offset from the initial data.\n # for instance, if you had a vector of length 256 and block_size of 64, the programs\n # would each access the elements [0:64, 64:128, 128:192, 192:256].\n # Note that offsets is a list of pointers\n block_start = pid * BLOCK_SIZE\n offsets = block_start + tl.arange(0, BLOCK_SIZE)\n # Create a mask to guard memory operations against out-of-bounds accesses\n mask = offsets < n_elements\n # Load x and y from DRAM, masking out any extra elements in case the input is not a\n # multiple of the block size\n x = tl.load(x_ptr + offsets, mask=mask)\n y = tl.load(y_ptr + offsets, mask=mask)\n output = x + y\n # Write x + y back to DRAM\n tl.store(output_ptr + offsets, output, mask=mask)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
BIN
_images/sphx_glr_04-low-memory-dropout_thumb.png
Normal file
After Width: | Height: | Size: 26 KiB |
@@ -67,7 +67,7 @@ Compute Kernel
|
|||||||
y = tl.load(y_ptr + offsets, mask=mask)
|
y = tl.load(y_ptr + offsets, mask=mask)
|
||||||
output = x + y
|
output = x + y
|
||||||
# Write x + y back to DRAM
|
# Write x + y back to DRAM
|
||||||
tl.store(output_ptr + offsets, output)
|
tl.store(output_ptr + offsets, output, mask=mask)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@@ -231,16 +231,16 @@ We can now run the decorated function above. Pass `print_data=True` to see the p
|
|||||||
|
|
||||||
vector-add-performance:
|
vector-add-performance:
|
||||||
size Triton Torch
|
size Triton Torch
|
||||||
0 4096.0 8.000000 9.600000
|
0 4096.0 9.600000 9.600000
|
||||||
1 8192.0 19.200000 19.200000
|
1 8192.0 19.200000 19.200000
|
||||||
2 16384.0 38.400001 38.400001
|
2 16384.0 38.400001 38.400001
|
||||||
3 32768.0 76.800002 76.800002
|
3 32768.0 76.800002 76.800002
|
||||||
4 65536.0 127.999995 127.999995
|
4 65536.0 127.999995 127.999995
|
||||||
5 131072.0 219.428568 219.428568
|
5 131072.0 219.428568 219.428568
|
||||||
6 262144.0 384.000001 341.333321
|
6 262144.0 341.333321 384.000001
|
||||||
7 524288.0 472.615390 472.615390
|
7 524288.0 472.615390 472.615390
|
||||||
8 1048576.0 614.400016 614.400016
|
8 1048576.0 614.400016 614.400016
|
||||||
9 2097152.0 722.823517 722.823517
|
9 2097152.0 702.171410 722.823517
|
||||||
10 4194304.0 780.190482 780.190482
|
10 4194304.0 780.190482 780.190482
|
||||||
11 8388608.0 812.429770 812.429770
|
11 8388608.0 812.429770 812.429770
|
||||||
12 16777216.0 833.084721 833.084721
|
12 16777216.0 833.084721 833.084721
|
||||||
@@ -254,7 +254,7 @@ We can now run the decorated function above. Pass `print_data=True` to see the p
|
|||||||
|
|
||||||
.. rst-class:: sphx-glr-timing
|
.. rst-class:: sphx-glr-timing
|
||||||
|
|
||||||
**Total running time of the script:** ( 0 minutes 11.053 seconds)
|
**Total running time of the script:** ( 0 minutes 10.972 seconds)
|
||||||
|
|
||||||
|
|
||||||
.. _sphx_glr_download_getting-started_tutorials_01-vector-add.py:
|
.. _sphx_glr_download_getting-started_tutorials_01-vector-add.py:
|
||||||
|
@@ -310,7 +310,7 @@ We will then compare its performance against (1) :code:`torch.softmax` and (2) t
|
|||||||
94 12288.0 812.429770 415.661740 199.298541
|
94 12288.0 812.429770 415.661740 199.298541
|
||||||
95 12416.0 810.840807 412.149375 198.954424
|
95 12416.0 810.840807 412.149375 198.954424
|
||||||
96 12544.0 810.925276 412.971190 199.209928
|
96 12544.0 810.925276 412.971190 199.209928
|
||||||
97 12672.0 811.007961 412.097543 199.167004
|
97 12672.0 811.007961 412.097543 199.264875
|
||||||
|
|
||||||
[98 rows x 4 columns]
|
[98 rows x 4 columns]
|
||||||
|
|
||||||
@@ -328,7 +328,7 @@ In the above plot, we can see that:
|
|||||||
|
|
||||||
.. rst-class:: sphx-glr-timing
|
.. rst-class:: sphx-glr-timing
|
||||||
|
|
||||||
**Total running time of the script:** ( 1 minutes 13.131 seconds)
|
**Total running time of the script:** ( 1 minutes 12.586 seconds)
|
||||||
|
|
||||||
|
|
||||||
.. _sphx_glr_download_getting-started_tutorials_02-fused-softmax.py:
|
.. _sphx_glr_download_getting-started_tutorials_02-fused-softmax.py:
|
||||||
|
@@ -462,37 +462,37 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we
|
|||||||
|
|
||||||
matmul-performance:
|
matmul-performance:
|
||||||
M cuBLAS ... Triton Triton (+ LeakyReLU)
|
M cuBLAS ... Triton Triton (+ LeakyReLU)
|
||||||
0 256.0 2.978909 ... 2.978909 2.978909
|
0 256.0 2.978909 ... 3.276800 3.276800
|
||||||
1 384.0 7.372800 ... 8.507077 8.507077
|
1 384.0 7.372800 ... 8.507077 8.507077
|
||||||
2 512.0 14.563555 ... 16.384000 16.384000
|
2 512.0 14.563555 ... 16.384000 16.384000
|
||||||
3 640.0 22.260869 ... 24.380953 24.380953
|
3 640.0 22.260869 ... 24.380953 24.380953
|
||||||
4 768.0 32.768000 ... 34.028308 34.028308
|
4 768.0 32.768000 ... 35.389441 34.028308
|
||||||
5 896.0 39.025776 ... 40.140799 39.025776
|
5 896.0 39.025776 ... 40.140799 39.025776
|
||||||
6 1024.0 49.932191 ... 53.773130 52.428801
|
6 1024.0 49.932191 ... 52.428801 52.428801
|
||||||
7 1152.0 44.566925 ... 46.656000 46.656000
|
7 1152.0 44.566925 ... 46.656000 46.656000
|
||||||
8 1280.0 51.200001 ... 56.888887 56.888887
|
8 1280.0 51.200001 ... 56.888887 56.888887
|
||||||
9 1408.0 64.138541 ... 63.392744 63.392744
|
9 1408.0 64.138541 ... 63.392744 57.368243
|
||||||
10 1536.0 78.643199 ... 76.106321 76.106321
|
10 1536.0 79.526831 ... 75.296679 75.296679
|
||||||
11 1664.0 63.372618 ... 62.061463 62.061463
|
11 1664.0 62.929456 ... 61.217089 61.636381
|
||||||
12 1792.0 72.983276 ... 62.790080 62.441243
|
12 1792.0 72.983276 ... 62.441243 62.441243
|
||||||
13 1920.0 69.467336 ... 67.106797 69.818184
|
13 1920.0 68.776119 ... 70.172588 69.818184
|
||||||
14 2048.0 73.908442 ... 74.898285 74.565406
|
14 2048.0 73.584279 ... 74.565406 74.565406
|
||||||
15 2176.0 83.155572 ... 81.472263 81.143743
|
15 2176.0 83.155572 ... 80.494588 80.494588
|
||||||
16 2304.0 68.446623 ... 73.501144 73.275679
|
16 2304.0 68.251065 ... 73.275679 73.275679
|
||||||
17 2432.0 71.125224 ... 81.197876 82.147552
|
17 2432.0 71.125224 ... 70.766913 80.041209
|
||||||
18 2560.0 77.649287 ... 76.920185 77.465723
|
18 2560.0 77.649287 ... 76.740048 76.027843
|
||||||
19 2688.0 81.053536 ... 83.737433 80.537273
|
19 2688.0 83.922689 ... 80.880718 83.186525
|
||||||
20 2816.0 82.135981 ... 78.301990 79.733474
|
20 2816.0 83.552120 ... 78.868366 78.442822
|
||||||
21 2944.0 80.510553 ... 78.605729 76.435630
|
21 2944.0 82.102191 ... 77.385141 77.990663
|
||||||
22 3072.0 81.472093 ... 83.638266 84.386148
|
22 3072.0 79.415291 ... 81.238312 83.146995
|
||||||
23 3200.0 84.656085 ... 86.956520 89.635851
|
23 3200.0 84.321474 ... 89.012517 89.761569
|
||||||
24 3328.0 81.530349 ... 84.596116 86.632127
|
24 3328.0 83.226931 ... 85.500351 87.051143
|
||||||
25 3456.0 81.683457 ... 84.068369 83.980802
|
25 3456.0 78.655188 ... 80.300370 83.632331
|
||||||
26 3584.0 87.211821 ... 87.466332 91.099693
|
26 3584.0 85.879071 ... 91.470385 93.661869
|
||||||
27 3712.0 85.896254 ... 83.596102 85.822459
|
27 3712.0 85.822459 ... 84.802499 88.876645
|
||||||
28 3840.0 84.421376 ... 86.197974 86.130841
|
28 3840.0 85.136259 ... 87.424508 88.121115
|
||||||
29 3968.0 92.442373 ... 87.913500 87.787005
|
29 3968.0 92.864488 ... 87.284643 87.597943
|
||||||
30 4096.0 93.596744 ... 89.240508 89.062862
|
30 4096.0 93.466385 ... 90.504200 89.898012
|
||||||
|
|
||||||
[31 rows x 5 columns]
|
[31 rows x 5 columns]
|
||||||
|
|
||||||
@@ -502,7 +502,7 @@ We can now compare the performance of our kernel against that of cuBLAS. Here we
|
|||||||
|
|
||||||
.. rst-class:: sphx-glr-timing
|
.. rst-class:: sphx-glr-timing
|
||||||
|
|
||||||
**Total running time of the script:** ( 2 minutes 14.737 seconds)
|
**Total running time of the script:** ( 2 minutes 20.017 seconds)
|
||||||
|
|
||||||
|
|
||||||
.. _sphx_glr_download_getting-started_tutorials_03-matrix-multiplication.py:
|
.. _sphx_glr_download_getting-started_tutorials_03-matrix-multiplication.py:
|
||||||
|
269
_sources/getting-started/tutorials/04-low-memory-dropout.rst.txt
Normal file
@@ -0,0 +1,269 @@
|
|||||||
|
|
||||||
|
.. DO NOT EDIT.
|
||||||
|
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
|
||||||
|
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
|
||||||
|
.. "getting-started/tutorials/04-low-memory-dropout.py"
|
||||||
|
.. LINE NUMBERS ARE GIVEN BELOW.
|
||||||
|
|
||||||
|
.. only:: html
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
:class: sphx-glr-download-link-note
|
||||||
|
|
||||||
|
Click :ref:`here <sphx_glr_download_getting-started_tutorials_04-low-memory-dropout.py>`
|
||||||
|
to download the full example code
|
||||||
|
|
||||||
|
.. rst-class:: sphx-glr-example-title
|
||||||
|
|
||||||
|
.. _sphx_glr_getting-started_tutorials_04-low-memory-dropout.py:
|
||||||
|
|
||||||
|
|
||||||
|
Low-Memory Dropout
|
||||||
|
==================
|
||||||
|
|
||||||
|
In this tutorial, you will write a memory-efficient implementation of dropout whose state
|
||||||
|
will be composed of a single int32 seed. This differs from more traditional implementations of dropout,
|
||||||
|
whose state is generally composed of a bit mask tensor of the same shape as the input. You will learn about:
|
||||||
|
|
||||||
|
- The limitations of naive implementations of Dropout with PyTorch
|
||||||
|
- Parallel pseudo-random number generation in Triton
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 14-29
|
||||||
|
|
||||||
|
Baseline
|
||||||
|
-------------
|
||||||
|
The *dropout* operator was first introduced in [SRIVASTAVA2014]_ as a way to improve the performance
|
||||||
|
of deep neural networks in low-data regime (i.e. regularization).
|
||||||
|
|
||||||
|
It takes a vector as input and produces a vector of the same shape as output. Each scalar in the
|
||||||
|
output has a probability :math:`p` of being changed to zero and otherwise it is copied from the input.
|
||||||
|
This forces the network to perform well even when only :math:`1 - p` scalars from the input are available.
|
||||||
|
|
||||||
|
At evaluation time we want to use the full power of the network so we set :math:`p=0`. Naively this would
|
||||||
|
increase the norm of the output (which can be a bad thing, e.g. it can lead to artificial decrease
|
||||||
|
in the output softmax temperature). To prevent this we multiply the output by :math:`\frac{1}{1 - p}`, which
|
||||||
|
keeps the norm consistent regardless of the dropout probability.
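The effect of the :math:`\frac{1}{1 - p}` scaling can be checked numerically without a GPU. The following is a minimal pure-Python sketch, not the tutorial's kernel: ``reference_dropout`` is a hypothetical helper name, and the mask is drawn with Python's ``random`` module rather than on the device. It illustrates that, with the scaling applied, the mean of the output stays close to the mean of the input.

```python
import random


def reference_dropout(x, keep, p):
    """Zero the entries where keep is 0 and scale the survivors by 1 / (1 - p)."""
    scale = 1.0 / (1.0 - p)
    return [xi * scale if k else 0.0 for xi, k in zip(x, keep)]


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.5
x = [1.0] * 100_000  # constant input, so the input mean is exactly 1.0
keep = [1 if rng.random() > p else 0 for _ in x]

out = reference_dropout(x, keep, p)
mean_out = sum(out) / len(out)
print(mean_out)  # stays close to the input mean of 1.0
```

Without the scaling the same experiment would give a mean near :math:`1 - p`, which is the train/evaluation mismatch the paragraph above describes.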
|
||||||
|
|
||||||
|
Let's first take a look at the baseline implementation.
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 29-80
|
||||||
|
|
||||||
|
.. code-block:: default
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
import tabulate
|
||||||
|
import torch
|
||||||
|
import triton
|
||||||
|
import triton.language as tl
|
||||||
|
|
||||||
|
@triton.jit
|
||||||
|
def _dropout(
|
||||||
|
x_ptr, # pointer to the input
|
||||||
|
x_keep_ptr, # pointer to a mask of 0s and 1s
|
||||||
|
output_ptr, # pointer to the output
|
||||||
|
n_elements, # number of elements in the `x` tensor
|
||||||
|
p, # probability that an element of `x` is changed to zero
|
||||||
|
**meta,
|
||||||
|
):
|
||||||
|
BLOCK_SIZE = meta['BLOCK_SIZE']
|
||||||
|
pid = tl.program_id(axis=0)
|
||||||
|
block_start = pid * BLOCK_SIZE
|
||||||
|
offsets = block_start + tl.arange(0, BLOCK_SIZE)
|
||||||
|
mask = offsets < n_elements
|
||||||
|
# Load data
|
||||||
|
x = tl.load(x_ptr + offsets, mask=mask)
|
||||||
|
x_keep = tl.load(x_keep_ptr + offsets, mask=mask)
|
||||||
|
# The line below is the crucial part, described in the paragraph above!
|
||||||
|
output = tl.where(x_keep, x / (1 - p), 0.0)
|
||||||
|
# Write-back output
|
||||||
|
tl.store(output_ptr + offsets, output, mask=mask)
|
||||||
|
|
||||||
|
|
||||||
|
def dropout(x, x_keep, p):
|
||||||
|
output = torch.empty_like(x)
|
||||||
|
assert x.is_contiguous()
|
||||||
|
n_elements = x.numel()
|
||||||
|
grid = lambda meta: (triton.cdiv(n_elements, meta['BLOCK_SIZE']),)
|
||||||
|
_dropout[grid](x, x_keep, output, n_elements, p, BLOCK_SIZE=1024)
|
||||||
|
return output
|
||||||
|
|
||||||
|
# Input tensor
|
||||||
|
x = torch.randn(size=(10,)).cuda()
|
||||||
|
# Dropout mask
|
||||||
|
p = 0.5
|
||||||
|
x_keep = (torch.rand(size=(10,)) > p).to(torch.int32).cuda()
|
||||||
|
#
|
||||||
|
output = dropout(x, x_keep=x_keep, p=p)
|
||||||
|
print(tabulate.tabulate([
|
||||||
|
["input"] + x.tolist(),
|
||||||
|
["keep mask"] + x_keep.tolist(),
|
||||||
|
["output"] + output.tolist()
|
||||||
|
]))
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    ---------  -------  ---------  --------  --------  --------  --------  --------  --------  ---------  ---------
    input      1.541    -0.293429  -2.17879  0.568431  -1.08452  -1.3986   0.403347  0.838026  -0.719258  -0.403344
    keep mask  1        1          0         1         0         1         1         0         0          0
    output     3.08199  -0.586858  0         1.13686   0         -2.79719  0.806694  0         0          0
    ---------  -------  ---------  --------  --------  --------  --------  --------  --------  ---------  ---------
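
As a sanity check on the masking and rescaling rule, here is a minimal pure-Python reference of what the kernel is expected to compute for each element. The helper name ``dropout_reference`` is hypothetical and is not part of the tutorial or of Triton; it only mirrors the ``tl.where(x_keep, x / (1 - p), 0.0)`` line on plain lists.

```python
def dropout_reference(x, x_keep, p):
    """Element-wise reference for masked dropout on plain Python lists.

    Hypothetical helper for illustration only: kept elements are divided
    by (1 - p), dropped elements become 0.0.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if len(x) != len(x_keep):
        raise ValueError("x and x_keep must have the same length")
    return [xi / (1.0 - p) if keep else 0.0 for xi, keep in zip(x, x_keep)]


reference = dropout_reference([1.0, -2.0, 4.0], [1, 0, 1], p=0.5)
print(reference)
```

A reference like this could be compared against the kernel output on small inputs, within floating-point tolerance.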
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 81-99
|
||||||
|
|
||||||
|
Seeded dropout
--------------
|
||||||
|
The implementation of dropout above works fine, but it can be a bit awkward to deal with. First,
we need to store the dropout mask for backpropagation. Second, dropout state management can get
very tricky when using recompute/checkpointing (e.g. see all the notes about `preserve_rng_state` in
https://pytorch.org/docs/1.9.0/checkpoint.html). In this tutorial we'll describe an alternative implementation
that (1) has a smaller memory footprint; (2) requires less data movement; and (3) simplifies the management
of persisting randomness across multiple invocations of the kernel.
|
||||||
|
|
||||||
|
Pseudorandom number generation in Triton is simple! In this tutorial we will use the
:code:`triton.language.rand` function, which generates a block of uniformly distributed :code:`float32`
values in [0, 1), given a seed and a block of :code:`int32` offsets. If you need them, Triton also provides
other :ref:`random number generation strategies <Random Number Generation>`.
|
||||||
|
|
||||||
|
.. note::
   Triton's implementation of PRNG is based on the Philox algorithm (described in [SALMON2011]_).
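
The property that matters for dropout is that each random value depends only on the seed and on the element's offset, so no generator state has to be stored between calls. The following pure-Python sketch illustrates that property under loud assumptions: ``toy_rand`` is a hypothetical stand-in that hashes ``(seed, offset)`` with ``hashlib.blake2b``. It is not Philox and does not reproduce the values of ``tl.rand``.

```python
import hashlib
import struct


def toy_rand(seed, offsets):
    """Map (seed, offset) pairs to floats in [0, 1).

    Illustration only: a hash-based stand-in, NOT the Philox generator
    used by Triton.
    """
    values = []
    for offset in offsets:
        # Pack seed and offset as two little-endian int32 values.
        key = struct.pack("<ii", seed, offset)
        digest = hashlib.blake2b(key, digest_size=8).digest()
        bits = int.from_bytes(digest, "little")
        # Keep the top 53 bits so the quotient is an exact double in [0, 1).
        values.append((bits >> 11) / float(1 << 53))
    return values


offsets = range(8)
first = toy_rand(123, offsets)
second = toy_rand(123, offsets)
other = toy_rand(512, offsets)
print(first == second)
print(first == other)
```

Because a value depends only on its own offset, any block of offsets can be generated independently of the others, which is what allows each program instance to handle its own slice of the input.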
|
||||||
|
|
||||||
|
Let's put it all together.
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 99-147
|
||||||
|
|
||||||
|
.. code-block:: default


    @triton.jit
    def _seeded_dropout(
        x_ptr,
        output_ptr,
        n_elements,
        p,
        seed,
        **meta,
    ):
        # compute memory offsets of elements handled by this instance
        BLOCK_SIZE = meta['BLOCK_SIZE']
        pid = tl.program_id(axis=0)
        block_start = pid * BLOCK_SIZE
        offsets = block_start + tl.arange(0, BLOCK_SIZE)
        # load data from x
        mask = offsets < n_elements
        x = tl.load(x_ptr + offsets, mask=mask)
        # randomly prune it
        random = tl.rand(seed, offsets)
        x_keep = random > p
        # write-back
        output = tl.where(x_keep, x / (1 - p), 0.0)
        tl.store(output_ptr + offsets, output, mask=mask)


    def seeded_dropout(x, p, seed):
        output = torch.empty_like(x)
        assert x.is_contiguous()
        n_elements = x.numel()
        grid = lambda meta: (triton.cdiv(n_elements, meta['BLOCK_SIZE']),)
        _seeded_dropout[grid](x, output, n_elements, p, seed, BLOCK_SIZE=1024)
        return output


    x = torch.randn(size=(10,)).cuda()
    # Compare this to the baseline - dropout mask is never instantiated!
    output = seeded_dropout(x, p=0.5, seed=123)
    output2 = seeded_dropout(x, p=0.5, seed=123)
    output3 = seeded_dropout(x, p=0.5, seed=512)

    print(tabulate.tabulate([
        ["input"] + x.tolist(),
        ["output (seed = 123)"] + output.tolist(),
        ["output (seed = 123)"] + output2.tolist(),
        ["output (seed = 512)"] + output3.tolist()
    ]))
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    -------------------  ---------  --------  --------  -------  --------  --------  ---------  ---------  ---------  ---------
    input                -0.952835  0.371721  0.408716  1.42142  0.149397  -0.67086  -0.214186  -0.431969  -0.707878  -0.106434
    output (seed = 123)  0          0.743443  0         2.84284  0.298794  -1.34172  0          0          0          0
    output (seed = 123)  0          0.743443  0         2.84284  0.298794  -1.34172  0          0          0          0
    output (seed = 512)  -1.90567   0.743443  0         2.84284  0.298794  -1.34172  0          -0.863938  0          -0.212868
    -------------------  ---------  --------  --------  -------  --------  --------  ---------  ---------  ---------  ---------
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 148-151
|
||||||
|
|
||||||
|
Et voilà! We have a Triton kernel that applies the same dropout mask provided the seed is the same!
If you'd like to explore further applications of pseudorandomness in GPU programming, we encourage you
to explore the `triton/language/random` folder!
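
One consequence of this reproducibility is that a backward pass can rebuild the keep mask from the seed instead of reading a stored mask. The sketch below shows the idea in pure Python under stated assumptions: ``toy_uniform``, ``seeded_dropout_forward`` and ``seeded_dropout_backward`` are hypothetical names, and the hash-based generator is a stand-in for ``tl.rand``, not Philox.

```python
import hashlib
import struct


def toy_uniform(seed, offset):
    """Hash (seed, offset) to a float in [0, 1). Stand-in for tl.rand, not Philox."""
    key = struct.pack("<ii", seed, offset)
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return (int.from_bytes(digest, "little") >> 11) / float(1 << 53)


def seeded_dropout_forward(x, p, seed):
    # Keep element i when its random value exceeds p, as in the kernel above.
    return [xi / (1 - p) if toy_uniform(seed, i) > p else 0.0
            for i, xi in enumerate(x)]


def seeded_dropout_backward(grad_output, p, seed):
    # Regenerate the same keep decisions from the seed; no mask is stored.
    return [g / (1 - p) if toy_uniform(seed, i) > p else 0.0
            for i, g in enumerate(grad_output)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = seeded_dropout_forward(x, p=0.5, seed=123)
grad_x = seeded_dropout_backward([1.0] * len(x), p=0.5, seed=123)
# Gradients are zero exactly where the forward output was zeroed.
print([yi == 0.0 for yi in y] == [gi == 0.0 for gi in grad_x])
```

The only state shared between the two passes here is the integer seed and the probability ``p``, which is the memory saving the tutorial is after.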
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 153-158
|
||||||
|
|
||||||
|
Exercises
|
||||||
|
-------------
|
||||||
|
1. Extend the kernel to operate over a matrix and use a vector of seeds, one per row.
2. Add support for striding.
3. (challenge) Implement a kernel for the sparse Johnson-Lindenstrauss transform that generates the projection matrix on the fly each time using a seed.
|
||||||
|
|
||||||
|
.. GENERATED FROM PYTHON SOURCE LINES 160-165
|
||||||
|
|
||||||
|
References
|
||||||
|
--------------
|
||||||
|
|
||||||
|
.. [SALMON2011] John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw, "Parallel Random Numbers: As Easy as 1, 2, 3", 2011
.. [SRIVASTAVA2014] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting", JMLR 2014
|
||||||
|
|
||||||
|
|
||||||
|
.. rst-class:: sphx-glr-timing
|
||||||
|
|
||||||
|
**Total running time of the script:** ( 0 minutes 0.316 seconds)
|
||||||
|
|
||||||
|
|
||||||
|
.. _sphx_glr_download_getting-started_tutorials_04-low-memory-dropout.py:
|
||||||
|
|
||||||
|
|
||||||
|
.. only :: html
|
||||||
|
|
||||||
|
.. container:: sphx-glr-footer
|
||||||
|
:class: sphx-glr-footer-example
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
.. container:: sphx-glr-download sphx-glr-download-python
|
||||||
|
|
||||||
|
:download:`Download Python source code: 04-low-memory-dropout.py <04-low-memory-dropout.py>`
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
.. container:: sphx-glr-download sphx-glr-download-jupyter
|
||||||
|
|
||||||
|
:download:`Download Jupyter notebook: 04-low-memory-dropout.ipynb <04-low-memory-dropout.ipynb>`
|
||||||
|
|
||||||
|
|
||||||
|
.. only:: html
|
||||||
|
|
||||||
|
.. rst-class:: sphx-glr-signature
|
||||||
|
|
||||||
|
`Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
|
@@ -72,6 +72,27 @@ Below is a gallery of tutorials for writing various basic operations with Triton
|
|||||||
:hidden:
|
:hidden:
|
||||||
|
|
||||||
/getting-started/tutorials/03-matrix-multiplication
|
/getting-started/tutorials/03-matrix-multiplication
|
||||||
|
|
||||||
|
.. raw:: html
|
||||||
|
|
||||||
|
<div class="sphx-glr-thumbcontainer" tooltip="In this tutorial, you will write a memory-efficient implementation of dropout whose state will ...">
|
||||||
|
|
||||||
|
.. only:: html
|
||||||
|
|
||||||
|
.. figure:: /getting-started/tutorials/images/thumb/sphx_glr_04-low-memory-dropout_thumb.png
|
||||||
|
:alt: Low-Memory Dropout
|
||||||
|
|
||||||
|
:ref:`sphx_glr_getting-started_tutorials_04-low-memory-dropout.py`
|
||||||
|
|
||||||
|
.. raw:: html
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:hidden:
|
||||||
|
|
||||||
|
/getting-started/tutorials/04-low-memory-dropout
|
||||||
.. raw:: html
|
.. raw:: html
|
||||||
|
|
||||||
<div class="sphx-glr-clear"></div>
|
<div class="sphx-glr-clear"></div>
|
||||||
|
@@ -5,12 +5,14 @@
|
|||||||
|
|
||||||
Computation times
|
Computation times
|
||||||
=================
|
=================
|
||||||
**03:38.920** total execution time for **getting-started_tutorials** files:
|
**03:43.892** total execution time for **getting-started_tutorials** files:
|
||||||
|
|
||||||
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
||||||
| :ref:`sphx_glr_getting-started_tutorials_03-matrix-multiplication.py` (``03-matrix-multiplication.py``) | 02:14.737 | 0.0 MB |
|
| :ref:`sphx_glr_getting-started_tutorials_03-matrix-multiplication.py` (``03-matrix-multiplication.py``) | 02:20.017 | 0.0 MB |
|
||||||
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
||||||
| :ref:`sphx_glr_getting-started_tutorials_02-fused-softmax.py` (``02-fused-softmax.py``) | 01:13.131 | 0.0 MB |
|
| :ref:`sphx_glr_getting-started_tutorials_02-fused-softmax.py` (``02-fused-softmax.py``) | 01:12.586 | 0.0 MB |
|
||||||
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
||||||
| :ref:`sphx_glr_getting-started_tutorials_01-vector-add.py` (``01-vector-add.py``) | 00:11.053 | 0.0 MB |
|
| :ref:`sphx_glr_getting-started_tutorials_01-vector-add.py` (``01-vector-add.py``) | 00:10.972 | 0.0 MB |
|
||||||
|
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
||||||
|
| :ref:`sphx_glr_getting-started_tutorials_04-low-memory-dropout.py` (``04-low-memory-dropout.py``) | 00:00.316 | 0.0 MB |
|
||||||
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
+---------------------------------------------------------------------------------------------------------+-----------+--------+
|
||||||
|
@@ -0,0 +1,6 @@
|
|||||||
|
triton.language.rand
|
||||||
|
====================
|
||||||
|
|
||||||
|
.. currentmodule:: triton.language
|
||||||
|
|
||||||
|
.. autofunction:: rand
|
@@ -0,0 +1,6 @@
|
|||||||
|
triton.language.randint
|
||||||
|
=======================
|
||||||
|
|
||||||
|
.. currentmodule:: triton.language
|
||||||
|
|
||||||
|
.. autofunction:: randint
|
@@ -0,0 +1,6 @@
|
|||||||
|
triton.language.randint4x
|
||||||
|
=========================
|
||||||
|
|
||||||
|
.. currentmodule:: triton.language
|
||||||
|
|
||||||
|
.. autofunction:: randint4x
|
@@ -0,0 +1,6 @@
|
|||||||
|
triton.language.randn
|
||||||
|
=====================
|
||||||
|
|
||||||
|
.. currentmodule:: triton.language
|
||||||
|
|
||||||
|
.. autofunction:: randn
|
@@ -121,6 +121,19 @@ Comparison ops
|
|||||||
minimum
|
minimum
|
||||||
maximum
|
maximum
|
||||||
|
|
||||||
|
.. _Random Number Generation:
|
||||||
|
|
||||||
|
Random Number Generation
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
.. autosummary::
|
||||||
|
:toctree: generated
|
||||||
|
:nosignatures:
|
||||||
|
|
||||||
|
randint4x
|
||||||
|
randint
|
||||||
|
rand
|
||||||
|
randn
|
||||||
|
|
||||||
Compiler Hint Ops
|
Compiler Hint Ops
|
||||||
-------------------
|
-------------------
|
||||||
@@ -129,4 +142,4 @@ Compiler Hint Ops
|
|||||||
:toctree: generated
|
:toctree: generated
|
||||||
:nosignatures:
|
:nosignatures:
|
||||||
|
|
||||||
multiple_of
|
multiple_of
|
@@ -339,10 +339,18 @@
|
|||||||
<h2 id="R">R</h2>
|
<h2 id="R">R</h2>
|
||||||
<table style="width: 100%" class="indextable genindextable"><tr>
|
<table style="width: 100%" class="indextable genindextable"><tr>
|
||||||
<td style="width: 33%; vertical-align: top;"><ul>
|
<td style="width: 33%; vertical-align: top;"><ul>
|
||||||
<li><a href="python-api/generated/triton.language.ravel.html#triton.language.ravel">ravel() (in module triton.language)</a>
|
<li><a href="python-api/generated/triton.language.rand.html#triton.language.rand">rand() (in module triton.language)</a>
|
||||||
|
</li>
|
||||||
|
<li><a href="python-api/generated/triton.language.randint.html#triton.language.randint">randint() (in module triton.language)</a>
|
||||||
|
</li>
|
||||||
|
<li><a href="python-api/generated/triton.language.randint4x.html#triton.language.randint4x">randint4x() (in module triton.language)</a>
|
||||||
</li>
|
</li>
|
||||||
</ul></td>
|
</ul></td>
|
||||||
<td style="width: 33%; vertical-align: top;"><ul>
|
<td style="width: 33%; vertical-align: top;"><ul>
|
||||||
|
<li><a href="python-api/generated/triton.language.randn.html#triton.language.randn">randn() (in module triton.language)</a>
|
||||||
|
</li>
|
||||||
|
<li><a href="python-api/generated/triton.language.ravel.html#triton.language.ravel">ravel() (in module triton.language)</a>
|
||||||
|
</li>
|
||||||
<li><a href="python-api/generated/triton.language.reshape.html#triton.language.reshape">reshape() (in module triton.language)</a>
|
<li><a href="python-api/generated/triton.language.reshape.html#triton.language.reshape">reshape() (in module triton.language)</a>
|
||||||
</li>
|
</li>
|
||||||
</ul></td>
|
</ul></td>
|
||||||
|
@@ -103,6 +103,7 @@
|
|||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="02-fused-softmax.html">Fused Softmax</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="02-fused-softmax.html">Fused Softmax</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="04-low-memory-dropout.html">Low-Memory Dropout</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
@@ -231,7 +232,7 @@ to download the full example code</p>
|
|||||||
<span class="n">y</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">y_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
<span class="n">y</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">y_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
<span class="n">output</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
|
<span class="n">output</span> <span class="o">=</span> <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
|
||||||
<span class="c1"># Write x + y back to DRAM</span>
|
<span class="c1"># Write x + y back to DRAM</span>
|
||||||
<span class="n">tl</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="n">output_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">output</span><span class="p">)</span>
|
<span class="n">tl</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="n">output_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
</pre></div>
|
</pre></div>
|
||||||
</div>
|
</div>
|
||||||
<p>Let’s also declare a helper function to (1) allocate the <cite>z</cite> tensor
|
<p>Let’s also declare a helper function to (1) allocate the <cite>z</cite> tensor
|
||||||
@@ -319,16 +320,16 @@ for different problem sizes.</p>
|
|||||||
<p class="sphx-glr-script-out">Out:</p>
|
<p class="sphx-glr-script-out">Out:</p>
|
||||||
<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>vector-add-performance:
|
<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>vector-add-performance:
|
||||||
size Triton Torch
|
size Triton Torch
|
||||||
0 4096.0 8.000000 9.600000
|
0 4096.0 9.600000 9.600000
|
||||||
1 8192.0 19.200000 19.200000
|
1 8192.0 19.200000 19.200000
|
||||||
2 16384.0 38.400001 38.400001
|
2 16384.0 38.400001 38.400001
|
||||||
3 32768.0 76.800002 76.800002
|
3 32768.0 76.800002 76.800002
|
||||||
4 65536.0 127.999995 127.999995
|
4 65536.0 127.999995 127.999995
|
||||||
5 131072.0 219.428568 219.428568
|
5 131072.0 219.428568 219.428568
|
||||||
6 262144.0 384.000001 341.333321
|
6 262144.0 341.333321 384.000001
|
||||||
7 524288.0 472.615390 472.615390
|
7 524288.0 472.615390 472.615390
|
||||||
8 1048576.0 614.400016 614.400016
|
8 1048576.0 614.400016 614.400016
|
||||||
9 2097152.0 722.823517 722.823517
|
9 2097152.0 702.171410 722.823517
|
||||||
10 4194304.0 780.190482 780.190482
|
10 4194304.0 780.190482 780.190482
|
||||||
11 8388608.0 812.429770 812.429770
|
11 8388608.0 812.429770 812.429770
|
||||||
12 16777216.0 833.084721 833.084721
|
12 16777216.0 833.084721 833.084721
|
||||||
@@ -337,7 +338,7 @@ for different problem sizes.</p>
|
|||||||
15 134217728.0 851.577704 850.656574
|
15 134217728.0 851.577704 850.656574
|
||||||
</pre></div>
|
</pre></div>
|
||||||
</div>
|
</div>
|
||||||
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 0 minutes 11.053 seconds)</p>
|
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 0 minutes 10.972 seconds)</p>
|
||||||
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-01-vector-add-py">
|
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-01-vector-add-py">
|
||||||
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
||||||
<p><a class="reference download internal" download="" href="../../_downloads/62d97d49a32414049819dd8bb8378080/01-vector-add.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">01-vector-add.py</span></code></a></p>
|
<p><a class="reference download internal" download="" href="../../_downloads/62d97d49a32414049819dd8bb8378080/01-vector-add.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">01-vector-add.py</span></code></a></p>
|
||||||
|
@@ -106,6 +106,7 @@
|
|||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="04-low-memory-dropout.html">Low-Memory Dropout</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
@@ -395,7 +396,7 @@ We will then compare its performance against (1) <code class="code docutils lite
|
|||||||
94 12288.0 812.429770 415.661740 199.298541
|
94 12288.0 812.429770 415.661740 199.298541
|
||||||
95 12416.0 810.840807 412.149375 198.954424
|
95 12416.0 810.840807 412.149375 198.954424
|
||||||
96 12544.0 810.925276 412.971190 199.209928
|
96 12544.0 810.925276 412.971190 199.209928
|
||||||
97 12672.0 811.007961 412.097543 199.167004
|
97 12672.0 811.007961 412.097543 199.264875
|
||||||
|
|
||||||
[98 rows x 4 columns]
|
[98 rows x 4 columns]
|
||||||
</pre></div>
|
</pre></div>
|
||||||
@@ -408,7 +409,7 @@ We will then compare its performance against (1) <code class="code docutils lite
|
|||||||
Note however that the PyTorch <cite>softmax</cite> operation is more general and will work on tensors of any shape.</p></li>
|
Note however that the PyTorch <cite>softmax</cite> operation is more general and will work on tensors of any shape.</p></li>
|
||||||
</ul>
|
</ul>
|
||||||
</div></blockquote>
|
</div></blockquote>
|
||||||
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes 13.131 seconds)</p>
|
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 1 minutes 12.586 seconds)</p>
|
||||||
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-02-fused-softmax-py">
|
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-02-fused-softmax-py">
|
||||||
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
||||||
<p><a class="reference download internal" download="" href="../../_downloads/d91442ac2982c4e0cc3ab0f43534afbc/02-fused-softmax.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">02-fused-softmax.py</span></code></a></p>
|
<p><a class="reference download internal" download="" href="../../_downloads/d91442ac2982c4e0cc3ab0f43534afbc/02-fused-softmax.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">02-fused-softmax.py</span></code></a></p>
|
||||||
|
@@ -46,7 +46,7 @@
|
|||||||
|
|
||||||
<link rel="index" title="Index" href="../../genindex.html" />
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
<link rel="search" title="Search" href="../../search.html" />
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
<link rel="next" title="triton" href="../../python-api/triton.html" />
|
<link rel="next" title="Low-Memory Dropout" href="04-low-memory-dropout.html" />
|
||||||
<link rel="prev" title="Fused Softmax" href="02-fused-softmax.html" />
|
<link rel="prev" title="Fused Softmax" href="02-fused-softmax.html" />
|
||||||
</head>
|
</head>
|
||||||
|
|
||||||
@@ -113,6 +113,7 @@
|
|||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="04-low-memory-dropout.html">Low-Memory Dropout</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
@@ -566,42 +567,42 @@ torch_output=tensor([[ 1.1045, -36.9688, 31.4688, ..., -11.3906, 24.4531, -3
|
|||||||
<p class="sphx-glr-script-out">Out:</p>
|
<p class="sphx-glr-script-out">Out:</p>
|
||||||
<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>matmul-performance:
|
<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>matmul-performance:
|
||||||
M cuBLAS ... Triton Triton (+ LeakyReLU)
|
M cuBLAS ... Triton Triton (+ LeakyReLU)
|
||||||
0 256.0 2.978909 ... 2.978909 2.978909
|
0 256.0 2.978909 ... 3.276800 3.276800
|
||||||
1 384.0 7.372800 ... 8.507077 8.507077
|
1 384.0 7.372800 ... 8.507077 8.507077
|
||||||
2 512.0 14.563555 ... 16.384000 16.384000
|
2 512.0 14.563555 ... 16.384000 16.384000
|
||||||
3 640.0 22.260869 ... 24.380953 24.380953
|
3 640.0 22.260869 ... 24.380953 24.380953
|
||||||
4 768.0 32.768000 ... 34.028308 34.028308
|
4 768.0 32.768000 ... 35.389441 34.028308
|
||||||
5 896.0 39.025776 ... 40.140799 39.025776
|
5 896.0 39.025776 ... 40.140799 39.025776
|
||||||
6 1024.0 49.932191 ... 53.773130 52.428801
|
6 1024.0 49.932191 ... 52.428801 52.428801
|
||||||
7 1152.0 44.566925 ... 46.656000 46.656000
|
7 1152.0 44.566925 ... 46.656000 46.656000
|
||||||
8 1280.0 51.200001 ... 56.888887 56.888887
|
8 1280.0 51.200001 ... 56.888887 56.888887
|
||||||
9 1408.0 64.138541 ... 63.392744 63.392744
|
9 1408.0 64.138541 ... 63.392744 57.368243
|
||||||
10 1536.0 78.643199 ... 76.106321 76.106321
|
10 1536.0 79.526831 ... 75.296679 75.296679
|
||||||
11 1664.0 63.372618 ... 62.061463 62.061463
|
11 1664.0 62.929456 ... 61.217089 61.636381
|
||||||
12 1792.0 72.983276 ... 62.790080 62.441243
|
12 1792.0 72.983276 ... 62.441243 62.441243
|
||||||
13 1920.0 69.467336 ... 67.106797 69.818184
|
13 1920.0 68.776119 ... 70.172588 69.818184
|
||||||
14 2048.0 73.908442 ... 74.898285 74.565406
|
14 2048.0 73.584279 ... 74.565406 74.565406
|
||||||
15 2176.0 83.155572 ... 81.472263 81.143743
|
15 2176.0 83.155572 ... 80.494588 80.494588
|
||||||
16 2304.0 68.446623 ... 73.501144 73.275679
|
16 2304.0 68.251065 ... 73.275679 73.275679
|
||||||
17 2432.0 71.125224 ... 81.197876 82.147552
|
17 2432.0 71.125224 ... 70.766913 80.041209
|
||||||
18 2560.0 77.649287 ... 76.920185 77.465723
|
18 2560.0 77.649287 ... 76.740048 76.027843
|
||||||
19 2688.0 81.053536 ... 83.737433 80.537273
|
19 2688.0 83.922689 ... 80.880718 83.186525
|
||||||
20 2816.0 82.135981 ... 78.301990 79.733474
|
20 2816.0 83.552120 ... 78.868366 78.442822
|
||||||
21 2944.0 80.510553 ... 78.605729 76.435630
|
21 2944.0 82.102191 ... 77.385141 77.990663
|
||||||
22 3072.0 81.472093 ... 83.638266 84.386148
|
22 3072.0 79.415291 ... 81.238312 83.146995
|
||||||
23 3200.0 84.656085 ... 86.956520 89.635851
|
23 3200.0 84.321474 ... 89.012517 89.761569
|
||||||
24 3328.0 81.530349 ... 84.596116 86.632127
|
24 3328.0 83.226931 ... 85.500351 87.051143
|
||||||
25 3456.0 81.683457 ... 84.068369 83.980802
|
25 3456.0 78.655188 ... 80.300370 83.632331
|
||||||
26 3584.0 87.211821 ... 87.466332 91.099693
|
26 3584.0 85.879071 ... 91.470385 93.661869
|
||||||
27 3712.0 85.896254 ... 83.596102 85.822459
|
27 3712.0 85.822459 ... 84.802499 88.876645
|
||||||
28 3840.0 84.421376 ... 86.197974 86.130841
|
28 3840.0 85.136259 ... 87.424508 88.121115
|
||||||
29 3968.0 92.442373 ... 87.913500 87.787005
|
29 3968.0 92.864488 ... 87.284643 87.597943
|
||||||
30 4096.0 93.596744 ... 89.240508 89.062862
|
30 4096.0 93.466385 ... 90.504200 89.898012
|
||||||
|
|
||||||
[31 rows x 5 columns]
|
[31 rows x 5 columns]
|
||||||
</pre></div>
|
</pre></div>
|
||||||
</div>
|
</div>
|
||||||
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes 14.737 seconds)</p>
|
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 2 minutes 20.017 seconds)</p>
|
||||||
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-03-matrix-multiplication-py">
|
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-03-matrix-multiplication-py">
|
||||||
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
||||||
<p><a class="reference download internal" download="" href="../../_downloads/d5fee5b55a64e47f1b5724ec39adf171/03-matrix-multiplication.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">03-matrix-multiplication.py</span></code></a></p>
|
<p><a class="reference download internal" download="" href="../../_downloads/d5fee5b55a64e47f1b5724ec39adf171/03-matrix-multiplication.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">03-matrix-multiplication.py</span></code></a></p>
|
||||||
@@ -621,7 +622,7 @@ torch_output=tensor([[ 1.1045, -36.9688, 31.4688, ..., -11.3906, 24.4531, -3
|
|||||||
</div>
|
</div>
|
||||||
<footer>
|
<footer>
|
||||||
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
<a href="../../python-api/triton.html" class="btn btn-neutral float-right" title="triton" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
<a href="04-low-memory-dropout.html" class="btn btn-neutral float-right" title="Low-Memory Dropout" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
<a href="02-fused-softmax.html" class="btn btn-neutral float-left" title="Fused Softmax" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
<a href="02-fused-softmax.html" class="btn btn-neutral float-left" title="Fused Softmax" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
434
getting-started/tutorials/04-low-memory-dropout.html
Normal file
@@ -0,0 +1,434 @@
|
|||||||
|
|
||||||
|
|
||||||
|
<!DOCTYPE html>
|
||||||
|
<html class="writer-html5" lang="en" >
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||||
|
|
||||||
|
<title>Low-Memory Dropout — Triton documentation</title>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-binder.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-dataframe.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-rendered-html.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/custom.css" type="text/css" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<!--[if lt IE 9]>
|
||||||
|
<script src="../../_static/js/html5shiv.min.js"></script>
|
||||||
|
<![endif]-->
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script src="../../_static/jquery.js"></script>
|
||||||
|
<script src="../../_static/underscore.js"></script>
|
||||||
|
<script src="../../_static/doctools.js"></script>
|
||||||
|
<script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
|
||||||
|
|
||||||
|
<script type="text/javascript" src="../../_static/js/theme.js"></script>
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
|
<link rel="next" title="triton" href="../../python-api/triton.html" />
|
||||||
|
<link rel="prev" title="Matrix Multiplication" href="03-matrix-multiplication.html" />
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body class="wy-body-for-nav">
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-grid-for-nav">
|
||||||
|
|
||||||
|
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||||
|
<div class="wy-side-scroll">
|
||||||
|
<div class="wy-side-nav-search" >
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../index.html" class="icon icon-home"> Triton
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</a>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="search">
|
||||||
|
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
|
||||||
|
<input type="text" name="q" placeholder="Search docs" />
|
||||||
|
<input type="hidden" name="check_keywords" value="yes" />
|
||||||
|
<input type="hidden" name="area" value="default" />
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Getting Started</span></p>
|
||||||
|
<ul class="current">
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../installation.html">Installation</a></li>
|
||||||
|
<li class="toctree-l1 current"><a class="reference internal" href="index.html">Tutorials</a><ul class="current">
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="01-vector-add.html">Vector Addition</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="02-fused-softmax.html">Fused Softmax</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
||||||
|
<li class="toctree-l2 current"><a class="current reference internal" href="#">Low-Memory Dropout</a><ul>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="#baseline">Baseline</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="#seeded-dropout">Seeded dropout</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="#exercises">Exercises</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="#references">References</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Python API</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../python-api/triton.html">triton</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../python-api/triton.language.html">triton.language</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../python-api/triton.testing.html">triton.testing</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Programming Guide</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-1/introduction.html">Introduction</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-2/related-work.html">Related Work</a></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
|
||||||
|
|
||||||
|
|
||||||
|
<nav class="wy-nav-top" aria-label="top navigation">
|
||||||
|
|
||||||
|
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
||||||
|
<a href="../../index.html">Triton</a>
|
||||||
|
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-nav-content">
|
||||||
|
|
||||||
|
<div class="rst-content">
|
||||||
|
<div role="navigation" aria-label="breadcrumbs navigation">
|
||||||
|
|
||||||
|
<ul class="wy-breadcrumbs">
|
||||||
|
|
||||||
|
<li><a href="../../index.html" class="icon icon-home"></a> »</li>
|
||||||
|
|
||||||
|
<li><a href="index.html">Tutorials</a> »</li>
|
||||||
|
|
||||||
|
<li>Low-Memory Dropout</li>
|
||||||
|
|
||||||
|
|
||||||
|
<li class="wy-breadcrumbs-aside">
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../_sources/getting-started/tutorials/04-low-memory-dropout.rst.txt" rel="nofollow"> View page source</a>
|
||||||
|
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
</div>
|
||||||
|
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
||||||
|
<div itemprop="articleBody">
|
||||||
|
|
||||||
|
<div class="sphx-glr-download-link-note admonition note">
|
||||||
|
<p class="admonition-title">Note</p>
|
||||||
|
<p>Click <a class="reference internal" href="#sphx-glr-download-getting-started-tutorials-04-low-memory-dropout-py"><span class="std std-ref">here</span></a>
|
||||||
|
to download the full example code</p>
|
||||||
|
</div>
|
||||||
|
<div class="sphx-glr-example-title section" id="low-memory-dropout">
|
||||||
|
<span id="sphx-glr-getting-started-tutorials-04-low-memory-dropout-py"></span><h1>Low-Memory Dropout<a class="headerlink" href="#low-memory-dropout" title="Permalink to this headline">¶</a></h1>
|
||||||
|
<p>In this tutorial, you will write a memory-efficient implementation of dropout whose state
|
||||||
|
will be composed of a single int32 seed. This differs from more traditional implementations of dropout,
|
||||||
|
whose state is generally composed of a bit mask tensor of the same shape as the input. You will learn about:</p>
|
||||||
|
<ul class="simple">
|
||||||
|
<li><p>The limitations of naive implementations of Dropout with PyTorch</p></li>
|
||||||
|
<li><p>Parallel pseudo-random number generation in Triton</p></li>
|
||||||
|
</ul>
|
||||||
|
<div class="section" id="baseline">
|
||||||
|
<h2>Baseline<a class="headerlink" href="#baseline" title="Permalink to this headline">¶</a></h2>
|
||||||
|
<p>The <em>dropout</em> operator was first introduced in <a class="reference internal" href="#srivastava2014" id="id1"><span>[SRIVASTAVA2014]</span></a> as a way to improve the performance
|
||||||
|
of deep neural networks in the low-data regime (i.e., as a form of regularization).</p>
|
||||||
|
<p>It takes a vector as input and produces a vector of the same shape as output. Each scalar in the
|
||||||
|
output has a probability <span class="math notranslate nohighlight">\(p\)</span> of being changed to zero and otherwise it is copied from the input.
|
||||||
|
This forces the network to perform well even when only a fraction <span class="math notranslate nohighlight">\(1 - p\)</span> of the input scalars is available.</p>
|
||||||
|
<p>At evaluation time we want to use the full power of the network so we set <span class="math notranslate nohighlight">\(p=0\)</span>. Naively this would
|
||||||
|
increase the norm of the output (which can be a bad thing, e.g. it can lead to an artificial decrease
in the output softmax temperature). To prevent this, we multiply the elements kept during training by <span class="math notranslate nohighlight">\(\frac{1}{1 - p}\)</span>, which
keeps the norm consistent in expectation regardless of the dropout probability.</p>
|
||||||
|
<p>Let’s first take a look at the baseline implementation.</p>
|
||||||
|
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">tabulate</span>
|
||||||
|
<span class="kn">import</span> <span class="nn">torch</span>
|
||||||
|
<span class="kn">import</span> <span class="nn">triton</span>
|
||||||
|
<span class="kn">import</span> <span class="nn">triton.language</span> <span class="k">as</span> <span class="nn">tl</span>
|
||||||
|
|
||||||
|
<span class="nd">@triton</span><span class="o">.</span><span class="n">jit</span>
|
||||||
|
<span class="k">def</span> <span class="nf">_dropout</span><span class="p">(</span>
|
||||||
|
<span class="n">x_ptr</span><span class="p">,</span> <span class="c1"># pointer to the input</span>
|
||||||
|
<span class="n">x_keep_ptr</span><span class="p">,</span> <span class="c1"># pointer to a mask of 0s and 1s</span>
|
||||||
|
<span class="n">output_ptr</span><span class="p">,</span> <span class="c1"># pointer to the output</span>
|
||||||
|
<span class="n">n_elements</span><span class="p">,</span> <span class="c1"># number of elements in the `x` tensor</span>
|
||||||
|
<span class="n">p</span><span class="p">,</span> <span class="c1"># probability that an element of `x` is changed to zero</span>
|
||||||
|
<span class="o">**</span><span class="n">meta</span><span class="p">,</span>
|
||||||
|
<span class="p">):</span>
|
||||||
|
<span class="n">BLOCK_SIZE</span> <span class="o">=</span> <span class="n">meta</span><span class="p">[</span><span class="s1">'BLOCK_SIZE'</span><span class="p">]</span>
|
||||||
|
<span class="n">pid</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">program_id</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
|
||||||
|
<span class="n">block_start</span> <span class="o">=</span> <span class="n">pid</span> <span class="o">*</span> <span class="n">BLOCK_SIZE</span>
|
||||||
|
<span class="n">offsets</span> <span class="o">=</span> <span class="n">block_start</span> <span class="o">+</span> <span class="n">tl</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">BLOCK_SIZE</span><span class="p">)</span>
|
||||||
|
<span class="n">mask</span> <span class="o">=</span> <span class="n">offsets</span> <span class="o"><</span> <span class="n">n_elements</span>
|
||||||
|
<span class="c1"># Load data</span>
|
||||||
|
<span class="n">x</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">x_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
|
<span class="n">x_keep</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">x_keep_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
|
<span class="c1"># The line below is the crucial part, described in the paragraph above!</span>
|
||||||
|
<span class="n">output</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">x_keep</span><span class="p">,</span> <span class="n">x</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">p</span><span class="p">),</span> <span class="mf">0.0</span><span class="p">)</span>
|
||||||
|
<span class="c1"># Write-back output</span>
|
||||||
|
<span class="n">tl</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="n">output_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
|
|
||||||
|
|
||||||
|
<span class="k">def</span> <span class="nf">dropout</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">x_keep</span><span class="p">,</span> <span class="n">p</span><span class="p">):</span>
|
||||||
|
<span class="n">output</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">empty_like</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
|
||||||
|
<span class="k">assert</span> <span class="n">x</span><span class="o">.</span><span class="n">is_contiguous</span><span class="p">()</span>
|
||||||
|
<span class="n">n_elements</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">numel</span><span class="p">()</span>
|
||||||
|
<span class="n">grid</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">meta</span><span class="p">:</span> <span class="p">(</span><span class="n">triton</span><span class="o">.</span><span class="n">cdiv</span><span class="p">(</span><span class="n">n_elements</span><span class="p">,</span> <span class="n">meta</span><span class="p">[</span><span class="s1">'BLOCK_SIZE'</span><span class="p">]),)</span>
|
||||||
|
<span class="n">_dropout</span><span class="p">[</span><span class="n">grid</span><span class="p">](</span><span class="n">x</span><span class="p">,</span> <span class="n">x_keep</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">n_elements</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">BLOCK_SIZE</span><span class="o">=</span><span class="mi">1024</span><span class="p">)</span>
|
||||||
|
<span class="k">return</span> <span class="n">output</span>
|
||||||
|
|
||||||
|
<span class="c1"># Input tensor</span>
|
||||||
|
<span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,))</span><span class="o">.</span><span class="n">cuda</span><span class="p">()</span>
|
||||||
|
<span class="c1"># Dropout mask</span>
|
||||||
|
<span class="n">p</span> <span class="o">=</span> <span class="mf">0.5</span>
|
||||||
|
<span class="n">x_keep</span> <span class="o">=</span> <span class="p">(</span><span class="n">torch</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,))</span> <span class="o">></span> <span class="n">p</span><span class="p">)</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">torch</span><span class="o">.</span><span class="n">int32</span><span class="p">)</span><span class="o">.</span><span class="n">cuda</span><span class="p">()</span>
|
||||||
|
<span class="c1"># Apply dropout using the precomputed keep mask</span>
|
||||||
|
<span class="n">output</span> <span class="o">=</span> <span class="n">dropout</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">x_keep</span><span class="o">=</span><span class="n">x_keep</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="n">p</span><span class="p">)</span>
|
||||||
|
<span class="nb">print</span><span class="p">(</span><span class="n">tabulate</span><span class="o">.</span><span class="n">tabulate</span><span class="p">([</span>
|
||||||
|
<span class="p">[</span><span class="s2">"input"</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span>
|
||||||
|
<span class="p">[</span><span class="s2">"keep mask"</span><span class="p">]</span> <span class="o">+</span> <span class="n">x_keep</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span>
|
||||||
|
<span class="p">[</span><span class="s2">"output"</span><span class="p">]</span> <span class="o">+</span> <span class="n">output</span><span class="o">.</span><span class="n">tolist</span><span class="p">()</span>
|
||||||
|
<span class="p">]))</span>
|
||||||
|
</pre></div>
|
||||||
|
</div>
|
||||||
|
<p class="sphx-glr-script-out">Out:</p>
|
||||||
|
<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>--------- ------- --------- -------- -------- -------- -------- -------- -------- --------- ---------
|
||||||
|
input 1.541 -0.293429 -2.17879 0.568431 -1.08452 -1.3986 0.403347 0.838026 -0.719258 -0.403344
|
||||||
|
keep mask 1 1 0 1 0 1 1 0 0 0
|
||||||
|
output 3.08199 -0.586858 0 1.13686 0 -2.79719 0.806694 0 0 0
|
||||||
|
--------- ------- --------- -------- -------- -------- -------- -------- -------- --------- ---------
|
||||||
|
</pre></div>
|
||||||
|
</div>
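<p>The scaling rule used by the kernel can also be checked without a GPU. The following is a minimal pure-Python sketch and is not part of the tutorial script; the name <code class="code docutils literal notranslate"><span class="pre">dropout_reference</span></code> is hypothetical and introduced only for illustration. It applies a given keep mask and divides the surviving elements by <span class="math notranslate nohighlight">\(1 - p\)</span>, mirroring the <code class="code docutils literal notranslate"><span class="pre">tl.where</span></code> line above.</p>

```python
def dropout_reference(x, x_keep, p):
    """CPU reference for masked dropout (illustrative helper, not a Triton API).

    x      -- list of floats
    x_keep -- list of 0/1 flags of the same length (1 means "keep")
    p      -- drop probability, used here only for the 1 / (1 - p) rescaling
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if len(x) != len(x_keep):
        raise ValueError("x and x_keep must have the same length")
    # Kept elements are divided by (1 - p); dropped elements become 0.0.
    return [xi / (1.0 - p) if keep else 0.0 for xi, keep in zip(x, x_keep)]


x = [1.0, -2.0, 3.0, 0.5]
x_keep = [1, 0, 1, 1]
print(dropout_reference(x, x_keep, p=0.5))  # [2.0, 0.0, 6.0, 1.0]
```

<p>With <span class="math notranslate nohighlight">\(p = 0.5\)</span> every kept element is doubled, which matches the relation between the "input" and "output" rows printed above.</p>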
|
||||||
|
</div>
|
||||||
|
<div class="section" id="seeded-dropout">
|
||||||
|
<h2>Seeded dropout<a class="headerlink" href="#seeded-dropout" title="Permalink to this headline">¶</a></h2>
|
||||||
|
<p>The above implementation of dropout works fine, but it can be a bit awkward to deal with. Firstly,
|
||||||
|
we need to store the dropout mask for backpropagation. Secondly, dropout state management can get
|
||||||
|
very tricky when using recompute/checkpointing (e.g. see all the notes about <cite>preserve_rng_state</cite> in
|
||||||
|
<a class="reference external" href="https://pytorch.org/docs/1.9.0/checkpoint.html">https://pytorch.org/docs/1.9.0/checkpoint.html</a>). In this tutorial we’ll describe an alternative implementation
|
||||||
|
that (1) has a smaller memory footprint; (2) requires less data movement; and (3) simplifies the management
|
||||||
|
of persisting randomness across multiple invocations of the kernel.</p>
|
||||||
|
<p>Pseudorandom number generation in Triton is simple! In this tutorial we will use the
|
||||||
|
<code class="code docutils literal notranslate"><span class="pre">triton.language.rand</span></code> function which generates a block of uniformly distributed <code class="code docutils literal notranslate"><span class="pre">float32</span></code>
|
||||||
|
values in [0, 1), given a seed and a block of <code class="code docutils literal notranslate"><span class="pre">int32</span></code> offsets. But if you need it, Triton also provides
|
||||||
|
other <a class="reference internal" href="../../python-api/triton.language.html#random-number-generation"><span class="std std-ref">random number generation strategies</span></a>.</p>
|
||||||
|
<div class="admonition note">
|
||||||
|
<p class="admonition-title">Note</p>
|
||||||
|
<p>Triton’s implementation of PRNG is based on the Philox algorithm (described in <a class="reference internal" href="#salmon2011" id="id2"><span>[SALMON2011]</span></a>).</p>
|
||||||
|
</div>
|
||||||
|
<p>Let’s put it all together.</p>
|
||||||
|
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nd">@triton</span><span class="o">.</span><span class="n">jit</span>
|
||||||
|
<span class="k">def</span> <span class="nf">_seeded_dropout</span><span class="p">(</span>
|
||||||
|
<span class="n">x_ptr</span><span class="p">,</span>
|
||||||
|
<span class="n">output_ptr</span><span class="p">,</span>
|
||||||
|
<span class="n">n_elements</span><span class="p">,</span>
|
||||||
|
<span class="n">p</span><span class="p">,</span>
|
||||||
|
<span class="n">seed</span><span class="p">,</span>
|
||||||
|
<span class="o">**</span><span class="n">meta</span><span class="p">,</span>
|
||||||
|
<span class="p">):</span>
|
||||||
|
<span class="c1"># compute memory offsets of elements handled by this instance</span>
|
||||||
|
<span class="n">BLOCK_SIZE</span> <span class="o">=</span> <span class="n">meta</span><span class="p">[</span><span class="s1">'BLOCK_SIZE'</span><span class="p">]</span>
|
||||||
|
<span class="n">pid</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">program_id</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
|
||||||
|
<span class="n">block_start</span> <span class="o">=</span> <span class="n">pid</span> <span class="o">*</span> <span class="n">BLOCK_SIZE</span>
|
||||||
|
<span class="n">offsets</span> <span class="o">=</span> <span class="n">block_start</span> <span class="o">+</span> <span class="n">tl</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">BLOCK_SIZE</span><span class="p">)</span>
|
||||||
|
<span class="c1"># load data from x</span>
|
||||||
|
<span class="n">mask</span> <span class="o">=</span> <span class="n">offsets</span> <span class="o"><</span> <span class="n">n_elements</span>
|
||||||
|
<span class="n">x</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">x_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
|
<span class="c1"># randomly prune it</span>
|
||||||
|
<span class="n">random</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="n">seed</span><span class="p">,</span> <span class="n">offsets</span><span class="p">)</span>
|
||||||
|
<span class="n">x_keep</span> <span class="o">=</span> <span class="n">random</span> <span class="o">></span> <span class="n">p</span>
|
||||||
|
<span class="c1"># write-back</span>
|
||||||
|
<span class="n">output</span> <span class="o">=</span> <span class="n">tl</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">x_keep</span><span class="p">,</span> <span class="n">x</span> <span class="o">/</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">p</span><span class="p">),</span> <span class="mf">0.0</span><span class="p">)</span>
|
||||||
|
<span class="n">tl</span><span class="o">.</span><span class="n">store</span><span class="p">(</span><span class="n">output_ptr</span> <span class="o">+</span> <span class="n">offsets</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">mask</span><span class="o">=</span><span class="n">mask</span><span class="p">)</span>
|
||||||
|
|
||||||
|
|
||||||
|
<span class="k">def</span> <span class="nf">seeded_dropout</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">seed</span><span class="p">):</span>
|
||||||
|
<span class="n">output</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">empty_like</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
|
||||||
|
<span class="k">assert</span> <span class="n">x</span><span class="o">.</span><span class="n">is_contiguous</span><span class="p">()</span>
|
||||||
|
<span class="n">n_elements</span> <span class="o">=</span> <span class="n">x</span><span class="o">.</span><span class="n">numel</span><span class="p">()</span>
|
||||||
|
<span class="n">grid</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">meta</span><span class="p">:</span> <span class="p">(</span><span class="n">triton</span><span class="o">.</span><span class="n">cdiv</span><span class="p">(</span><span class="n">n_elements</span><span class="p">,</span> <span class="n">meta</span><span class="p">[</span><span class="s1">'BLOCK_SIZE'</span><span class="p">]),)</span>
|
||||||
|
<span class="n">_seeded_dropout</span><span class="p">[</span><span class="n">grid</span><span class="p">](</span><span class="n">x</span><span class="p">,</span> <span class="n">output</span><span class="p">,</span> <span class="n">n_elements</span><span class="p">,</span> <span class="n">p</span><span class="p">,</span> <span class="n">seed</span><span class="p">,</span> <span class="n">BLOCK_SIZE</span><span class="o">=</span><span class="mi">1024</span><span class="p">)</span>
|
||||||
|
<span class="k">return</span> <span class="n">output</span>
|
||||||
|
|
||||||
|
|
||||||
|
<span class="n">x</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">(</span><span class="mi">10</span><span class="p">,))</span><span class="o">.</span><span class="n">cuda</span><span class="p">()</span>
|
||||||
|
<span class="c1"># Compare this to the baseline - dropout mask is never instantiated!</span>
|
||||||
|
<span class="n">output</span> <span class="o">=</span> <span class="n">seeded_dropout</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="mi">123</span><span class="p">)</span>
|
||||||
|
<span class="n">output2</span> <span class="o">=</span> <span class="n">seeded_dropout</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="mi">123</span><span class="p">)</span>
|
||||||
|
<span class="n">output3</span> <span class="o">=</span> <span class="n">seeded_dropout</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">p</span><span class="o">=</span><span class="mf">0.5</span><span class="p">,</span> <span class="n">seed</span><span class="o">=</span><span class="mi">512</span><span class="p">)</span>
|
||||||
|
|
||||||
|
<span class="nb">print</span><span class="p">(</span><span class="n">tabulate</span><span class="o">.</span><span class="n">tabulate</span><span class="p">([</span>
|
||||||
|
<span class="p">[</span><span class="s2">"input"</span><span class="p">]</span> <span class="o">+</span> <span class="n">x</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span>
|
||||||
|
<span class="p">[</span><span class="s2">"output (seed = 123)"</span><span class="p">]</span> <span class="o">+</span> <span class="n">output</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span>
|
||||||
|
<span class="p">[</span><span class="s2">"output (seed = 123)"</span><span class="p">]</span> <span class="o">+</span> <span class="n">output2</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span>
|
||||||
|
<span class="p">[</span><span class="s2">"output (seed = 512)"</span><span class="p">]</span> <span class="o">+</span> <span class="n">output3</span><span class="o">.</span><span class="n">tolist</span><span class="p">()</span>
|
||||||
|
<span class="p">]))</span>
|
||||||
|
</pre></div>
|
||||||
|
</div>
|
||||||
|
<p class="sphx-glr-script-out">Out:</p>
|
||||||
|
<div class="sphx-glr-script-out highlight-none notranslate"><div class="highlight"><pre><span></span>------------------- --------- -------- -------- ------- -------- -------- --------- --------- --------- ---------
|
||||||
|
input -0.952835 0.371721 0.408716 1.42142 0.149397 -0.67086 -0.214186 -0.431969 -0.707878 -0.106434
|
||||||
|
output (seed = 123) 0 0.743443 0 2.84284 0.298794 -1.34172 0 0 0 0
|
||||||
|
output (seed = 123) 0 0.743443 0 2.84284 0.298794 -1.34172 0 0 0 0
|
||||||
|
output (seed = 512) -1.90567 0.743443 0 2.84284 0.298794 -1.34172 0 -0.863938 0 -0.212868
|
||||||
|
------------------- --------- -------- -------- ------- -------- -------- --------- --------- --------- ---------
|
||||||
|
</pre></div>
|
||||||
|
</div>
|
||||||
|
<p>Et voilà! We have a Triton kernel that applies the same dropout mask provided the seed is the same!
If you’d like to explore further applications of pseudorandomness in GPU programming, we encourage you
to look at the <cite>triton/language/random</cite> folder!</p>
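<p>The key property of the seeded kernel is that the keep mask is a pure function of the seed and the element offset, so it never has to be stored. The sketch below illustrates that idea in plain Python under loud assumptions: <code class="code docutils literal notranslate"><span class="pre">counter_rand</span></code>, <code class="code docutils literal notranslate"><span class="pre">seeded_dropout_reference</span></code> and <code class="code docutils literal notranslate"><span class="pre">seeded_dropout_backward_reference</span></code> are hypothetical names, and SHA-256 is used only as a convenient stand-in for a counter-based generator. It is not the Philox algorithm and will not reproduce the values returned by <code class="code docutils literal notranslate"><span class="pre">tl.rand</span></code>.</p>

```python
import hashlib
import struct


def counter_rand(seed, offset):
    """Stateless uniform float in [0, 1) derived from (seed, offset).

    Assumption: SHA-256 stands in for a counter-based PRNG here because it
    ships with Python. This is NOT Philox and does not match tl.rand.
    """
    digest = hashlib.sha256(struct.pack("<qq", seed, offset)).digest()
    # Keep 24 random bits so the result is always strictly below 1.0.
    bits = int.from_bytes(digest[:4], "little") >> 8
    return bits / float(1 << 24)


def seeded_dropout_reference(x, p, seed):
    """Forward pass: the mask is regenerated from (seed, index), never stored."""
    return [xi / (1.0 - p) if counter_rand(seed, i) > p else 0.0
            for i, xi in enumerate(x)]


def seeded_dropout_backward_reference(grad_output, p, seed):
    """Backward pass: rebuild the same mask from the seed instead of loading it."""
    return [g / (1.0 - p) if counter_rand(seed, i) > p else 0.0
            for i, g in enumerate(grad_output)]


x = [0.5, -1.0, 2.0, 4.0]
out_a = seeded_dropout_reference(x, p=0.5, seed=123)
out_b = seeded_dropout_reference(x, p=0.5, seed=123)
print(out_a == out_b)  # True: same seed, same mask
```

<p>Because the backward helper recomputes the mask from the same seed, only the integer seed needs to be kept between the forward and backward passes, which is the memory saving this section describes.</p>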
|
||||||
|
</div>
|
||||||
|
<div class="section" id="exercises">
|
||||||
|
<h2>Exercises<a class="headerlink" href="#exercises" title="Permalink to this headline">¶</a></h2>
|
||||||
|
<ol class="arabic simple">
|
||||||
|
<li><p>Extend the kernel to operate over a matrix and use a vector of seeds - one per row.</p></li>
|
||||||
|
<li><p>Add support for striding.</p></li>
|
||||||
|
<li><p>(challenge) Implement a kernel for a sparse Johnson-Lindenstrauss transform that generates the projection matrix on the fly each time using a seed.</p></li>
|
||||||
|
</ol>
|
||||||
|
</div>
|
||||||
|
<div class="section" id="references">
|
||||||
|
<h2>References<a class="headerlink" href="#references" title="Permalink to this headline">¶</a></h2>
|
||||||
|
<dl class="citation">
|
||||||
|
<dt class="label" id="salmon2011"><span class="brackets"><a class="fn-backref" href="#id2">SALMON2011</a></span></dt>
|
||||||
|
<dd><p>John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw, “Parallel Random Numbers: As Easy as 1, 2, 3”, 2011</p>
|
||||||
|
</dd>
|
||||||
|
<dt class="label" id="srivastava2014"><span class="brackets"><a class="fn-backref" href="#id1">SRIVASTAVA2014</a></span></dt>
|
||||||
|
<dd><p>Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting”, JMLR 2014</p>
|
||||||
|
</dd>
|
||||||
|
</dl>
|
||||||
|
<p class="sphx-glr-timing"><strong>Total running time of the script:</strong> ( 0 minutes 0.316 seconds)</p>
|
||||||
|
<div class="sphx-glr-footer class sphx-glr-footer-example docutils container" id="sphx-glr-download-getting-started-tutorials-04-low-memory-dropout-py">
|
||||||
|
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
||||||
|
<p><a class="reference download internal" download="" href="../../_downloads/c9aed78977a4c05741d675a38dde3d7d/04-low-memory-dropout.py"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">04-low-memory-dropout.py</span></code></a></p>
|
||||||
|
</div>
|
||||||
|
<div class="sphx-glr-download sphx-glr-download-jupyter docutils container">
|
||||||
|
<p><a class="reference download internal" download="" href="../../_downloads/bc847dec325798bdc436c4ef5ac8b78a/04-low-memory-dropout.ipynb"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">Jupyter</span> <span class="pre">notebook:</span> <span class="pre">04-low-memory-dropout.ipynb</span></code></a></p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<p class="sphx-glr-signature"><a class="reference external" href="https://sphinx-gallery.github.io">Gallery generated by Sphinx-Gallery</a></p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<footer>
|
||||||
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
|
<a href="../../python-api/triton.html" class="btn btn-neutral float-right" title="triton" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
|
<a href="03-matrix-multiplication.html" class="btn btn-neutral float-left" title="Matrix Multiplication" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
|
||||||
|
<div role="contentinfo">
|
||||||
|
<p>
|
||||||
|
© Copyright 2020, Philippe Tillet.
|
||||||
|
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
||||||
|
|
||||||
|
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
||||||
|
|
||||||
|
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||||||
|
|
||||||
|
</footer>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</section>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript">
|
||||||
|
jQuery(function () {
|
||||||
|
SphinxRtdTheme.Navigation.enable(true);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</body>
|
||||||
|
</html>
|
@@ -99,6 +99,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="01-vector-add.html">Vector Addition</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="01-vector-add.html">Vector Addition</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="02-fused-softmax.html">Fused Softmax</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="02-fused-softmax.html">Fused Softmax</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="03-matrix-multiplication.html">Matrix Multiplication</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="04-low-memory-dropout.html">Low-Memory Dropout</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
</ul>
|
</ul>
|
||||||
@@ -200,6 +201,12 @@
|
|||||||
</div>
|
</div>
|
||||||
</div><div class="toctree-wrapper compound">
|
</div><div class="toctree-wrapper compound">
|
||||||
</div>
|
</div>
|
||||||
|
<div class="sphx-glr-thumbcontainer" tooltip="In this tutorial, you will write a memory-efficient implementation of dropout whose state will ..."><div class="figure align-default" id="id4">
|
||||||
|
<img alt="Low-Memory Dropout" src="../../_images/sphx_glr_04-low-memory-dropout_thumb.png" />
|
||||||
|
<p class="caption"><span class="caption-text"><a class="reference internal" href="04-low-memory-dropout.html#sphx-glr-getting-started-tutorials-04-low-memory-dropout-py"><span class="std std-ref">Low-Memory Dropout</span></a></span><a class="headerlink" href="#id4" title="Permalink to this image">¶</a></p>
|
||||||
|
</div>
|
||||||
|
</div><div class="toctree-wrapper compound">
|
||||||
|
</div>
|
||||||
<div class="sphx-glr-clear"></div><div class="sphx-glr-footer class sphx-glr-footer-gallery docutils container">
|
<div class="sphx-glr-clear"></div><div class="sphx-glr-footer class sphx-glr-footer-gallery docutils container">
|
||||||
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
<div class="sphx-glr-download sphx-glr-download-python docutils container">
|
||||||
<p><a class="reference download internal" download="" href="../../_downloads/763344228ae6bc253ed1a6cf586aa30d/tutorials_python.zip"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">all</span> <span class="pre">examples</span> <span class="pre">in</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tutorials_python.zip</span></code></a></p>
|
<p><a class="reference download internal" download="" href="../../_downloads/763344228ae6bc253ed1a6cf586aa30d/tutorials_python.zip"><code class="xref download docutils literal notranslate"><span class="pre">Download</span> <span class="pre">all</span> <span class="pre">examples</span> <span class="pre">in</span> <span class="pre">Python</span> <span class="pre">source</span> <span class="pre">code:</span> <span class="pre">tutorials_python.zip</span></code></a></p>
|
||||||
|
@@ -174,7 +174,7 @@
|
|||||||
|
|
||||||
<div class="section" id="computation-times">
|
<div class="section" id="computation-times">
|
||||||
<span id="sphx-glr-getting-started-tutorials-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
|
<span id="sphx-glr-getting-started-tutorials-sg-execution-times"></span><h1>Computation times<a class="headerlink" href="#computation-times" title="Permalink to this headline">¶</a></h1>
|
||||||
<p><strong>03:38.920</strong> total execution time for <strong>getting-started_tutorials</strong> files:</p>
|
<p><strong>03:43.892</strong> total execution time for <strong>getting-started_tutorials</strong> files:</p>
|
||||||
<table class="docutils align-default">
|
<table class="docutils align-default">
|
||||||
<colgroup>
|
<colgroup>
|
||||||
<col style="width: 85%" />
|
<col style="width: 85%" />
|
||||||
@@ -183,15 +183,19 @@
|
|||||||
</colgroup>
|
</colgroup>
|
||||||
<tbody>
|
<tbody>
|
||||||
<tr class="row-odd"><td><p><a class="reference internal" href="03-matrix-multiplication.html#sphx-glr-getting-started-tutorials-03-matrix-multiplication-py"><span class="std std-ref">Matrix Multiplication</span></a> (<code class="docutils literal notranslate"><span class="pre">03-matrix-multiplication.py</span></code>)</p></td>
|
<tr class="row-odd"><td><p><a class="reference internal" href="03-matrix-multiplication.html#sphx-glr-getting-started-tutorials-03-matrix-multiplication-py"><span class="std std-ref">Matrix Multiplication</span></a> (<code class="docutils literal notranslate"><span class="pre">03-matrix-multiplication.py</span></code>)</p></td>
|
||||||
<td><p>02:14.737</p></td>
|
<td><p>02:20.017</p></td>
|
||||||
<td><p>0.0 MB</p></td>
|
<td><p>0.0 MB</p></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr class="row-even"><td><p><a class="reference internal" href="02-fused-softmax.html#sphx-glr-getting-started-tutorials-02-fused-softmax-py"><span class="std std-ref">Fused Softmax</span></a> (<code class="docutils literal notranslate"><span class="pre">02-fused-softmax.py</span></code>)</p></td>
|
<tr class="row-even"><td><p><a class="reference internal" href="02-fused-softmax.html#sphx-glr-getting-started-tutorials-02-fused-softmax-py"><span class="std std-ref">Fused Softmax</span></a> (<code class="docutils literal notranslate"><span class="pre">02-fused-softmax.py</span></code>)</p></td>
|
||||||
<td><p>01:13.131</p></td>
|
<td><p>01:12.586</p></td>
|
||||||
<td><p>0.0 MB</p></td>
|
<td><p>0.0 MB</p></td>
|
||||||
</tr>
|
</tr>
|
||||||
<tr class="row-odd"><td><p><a class="reference internal" href="01-vector-add.html#sphx-glr-getting-started-tutorials-01-vector-add-py"><span class="std std-ref">Vector Addition</span></a> (<code class="docutils literal notranslate"><span class="pre">01-vector-add.py</span></code>)</p></td>
|
<tr class="row-odd"><td><p><a class="reference internal" href="01-vector-add.html#sphx-glr-getting-started-tutorials-01-vector-add-py"><span class="std std-ref">Vector Addition</span></a> (<code class="docutils literal notranslate"><span class="pre">01-vector-add.py</span></code>)</p></td>
|
||||||
<td><p>00:11.053</p></td>
|
<td><p>00:10.972</p></td>
|
||||||
|
<td><p>0.0 MB</p></td>
|
||||||
|
</tr>
|
||||||
|
<tr class="row-even"><td><p><a class="reference internal" href="04-low-memory-dropout.html#sphx-glr-getting-started-tutorials-04-low-memory-dropout-py"><span class="std std-ref">Low-Memory Dropout</span></a> (<code class="docutils literal notranslate"><span class="pre">04-low-memory-dropout.py</span></code>)</p></td>
|
||||||
|
<td><p>00:00.316</p></td>
|
||||||
<td><p>0.0 MB</p></td>
|
<td><p>0.0 MB</p></td>
|
||||||
</tr>
|
</tr>
|
||||||
</tbody>
|
</tbody>
|
||||||
|
BIN
objects.inv
@@ -115,6 +115,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -117,6 +117,7 @@
|
|||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -123,6 +123,7 @@
|
|||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -117,6 +117,7 @@
|
|||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -117,6 +117,7 @@
|
|||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -117,6 +117,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -116,6 +116,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -114,6 +114,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -117,6 +117,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -116,6 +116,7 @@
|
|||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -46,7 +46,7 @@
|
|||||||
|
|
||||||
<link rel="index" title="Index" href="../../genindex.html" />
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
<link rel="search" title="Search" href="../../search.html" />
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
<link rel="next" title="triton.language.multiple_of" href="triton.language.multiple_of.html" />
|
<link rel="next" title="triton.language.randint4x" href="triton.language.randint4x.html" />
|
||||||
<link rel="prev" title="triton.language.minimum" href="triton.language.minimum.html" />
|
<link rel="prev" title="triton.language.minimum" href="triton.language.minimum.html" />
|
||||||
</head>
|
</head>
|
||||||
|
|
||||||
@@ -115,6 +115,7 @@
|
|||||||
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.maximum</a></li>
|
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.maximum</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
@@ -217,7 +218,7 @@
|
|||||||
</div>
|
</div>
|
||||||
<footer>
|
<footer>
|
||||||
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
<a href="triton.language.multiple_of.html" class="btn btn-neutral float-right" title="triton.language.multiple_of" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
<a href="triton.language.randint4x.html" class="btn btn-neutral float-right" title="triton.language.randint4x" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
<a href="triton.language.minimum.html" class="btn btn-neutral float-left" title="triton.language.minimum" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
<a href="triton.language.minimum.html" class="btn btn-neutral float-left" title="triton.language.minimum" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
@@ -116,6 +116,7 @@
|
|||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -115,6 +115,7 @@
|
|||||||
<li class="toctree-l3"><a class="reference internal" href="triton.language.maximum.html">triton.language.maximum</a></li>
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.maximum.html">triton.language.maximum</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -47,7 +47,7 @@
|
|||||||
<link rel="index" title="Index" href="../../genindex.html" />
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
<link rel="search" title="Search" href="../../search.html" />
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
<link rel="next" title="triton.testing" href="../triton.testing.html" />
|
<link rel="next" title="triton.testing" href="../triton.testing.html" />
|
||||||
<link rel="prev" title="triton.language.maximum" href="triton.language.maximum.html" />
|
<link rel="prev" title="triton.language.randn" href="triton.language.randn.html" />
|
||||||
</head>
|
</head>
|
||||||
|
|
||||||
<body class="wy-body-for-nav">
|
<body class="wy-body-for-nav">
|
||||||
@@ -111,6 +111,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2 current"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a><ul class="current">
|
<li class="toctree-l2 current"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a><ul class="current">
|
||||||
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.multiple_of</a></li>
|
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.multiple_of</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
@@ -209,7 +210,7 @@
|
|||||||
<footer>
|
<footer>
|
||||||
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
<a href="../triton.testing.html" class="btn btn-neutral float-right" title="triton.testing" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
<a href="../triton.testing.html" class="btn btn-neutral float-right" title="triton.testing" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
<a href="triton.language.maximum.html" class="btn btn-neutral float-left" title="triton.language.maximum" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
<a href="triton.language.randn.html" class="btn btn-neutral float-left" title="triton.language.randn" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<hr/>
|
<hr/>
|
||||||
|
@@ -115,6 +115,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -115,6 +115,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
267
python-api/generated/triton.language.rand.html
Normal file
@@ -0,0 +1,267 @@
|
|||||||
|
|
||||||
|
|
||||||
|
<!DOCTYPE html>
|
||||||
|
<html class="writer-html5" lang="en" >
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||||
|
|
||||||
|
<title>triton.language.rand — Triton documentation</title>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-binder.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-dataframe.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-rendered-html.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/custom.css" type="text/css" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<!--[if lt IE 9]>
|
||||||
|
<script src="../../_static/js/html5shiv.min.js"></script>
|
||||||
|
<![endif]-->
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script src="../../_static/jquery.js"></script>
|
||||||
|
<script src="../../_static/underscore.js"></script>
|
||||||
|
<script src="../../_static/doctools.js"></script>
|
||||||
|
<script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
|
||||||
|
|
||||||
|
<script type="text/javascript" src="../../_static/js/theme.js"></script>
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
|
<link rel="next" title="triton.language.randn" href="triton.language.randn.html" />
|
||||||
|
<link rel="prev" title="triton.language.randint" href="triton.language.randint.html" />
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body class="wy-body-for-nav">
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-grid-for-nav">
|
||||||
|
|
||||||
|
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||||
|
<div class="wy-side-scroll">
|
||||||
|
<div class="wy-side-nav-search" >
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../index.html" class="icon icon-home"> Triton
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</a>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="search">
|
||||||
|
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
|
||||||
|
<input type="text" name="q" placeholder="Search docs" />
|
||||||
|
<input type="hidden" name="check_keywords" value="yes" />
|
||||||
|
<input type="hidden" name="area" value="default" />
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Getting Started</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/installation.html">Installation</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/tutorials/index.html">Tutorials</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Python API</span></p>
|
||||||
|
<ul class="current">
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.html">triton</a></li>
|
||||||
|
<li class="toctree-l1 current"><a class="reference internal" href="../triton.language.html">triton.language</a><ul class="current">
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#programming-model">Programming Model</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#creation-ops">Creation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#shape-manipulation-ops">Shape Manipulation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#linear-algebra-ops">Linear Algebra Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#memory-ops">Memory Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#indexing-ops">Indexing Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#math-ops">Math Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2 current"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a><ul class="current">
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randint4x.html">triton.language.randint4x</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randint.html">triton.language.randint</a></li>
|
||||||
|
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.rand</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randn.html">triton.language.randn</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.testing.html">triton.testing</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Programming Guide</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-1/introduction.html">Introduction</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-2/related-work.html">Related Work</a></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
|
||||||
|
|
||||||
|
|
||||||
|
<nav class="wy-nav-top" aria-label="top navigation">
|
||||||
|
|
||||||
|
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
||||||
|
<a href="../../index.html">Triton</a>
|
||||||
|
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-nav-content">
|
||||||
|
|
||||||
|
<div class="rst-content">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="navigation" aria-label="breadcrumbs navigation">
|
||||||
|
|
||||||
|
<ul class="wy-breadcrumbs">
|
||||||
|
|
||||||
|
<li><a href="../../index.html" class="icon icon-home"></a> »</li>
|
||||||
|
|
||||||
|
<li><a href="../triton.language.html">triton.language</a> »</li>
|
||||||
|
|
||||||
|
<li>triton.language.rand</li>
|
||||||
|
|
||||||
|
|
||||||
|
<li class="wy-breadcrumbs-aside">
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../_sources/python-api/generated/triton.language.rand.rst.txt" rel="nofollow"> View page source</a>
|
||||||
|
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
</div>
|
||||||
|
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
||||||
|
<div itemprop="articleBody">
|
||||||
|
|
||||||
|
<div class="section" id="triton-language-rand">
|
||||||
|
<h1>triton.language.rand<a class="headerlink" href="#triton-language-rand" title="Permalink to this headline">¶</a></h1>
|
||||||
|
<dl class="py function">
|
||||||
|
<dt class="sig sig-object py" id="triton.language.rand">
|
||||||
|
<span class="sig-prename descclassname"><span class="pre">triton.language.</span></span><span class="sig-name descname"><span class="pre">rand</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">seed</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">offset</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#triton.language.rand" title="Permalink to this definition">¶</a></dt>
|
||||||
|
<dd><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block,
|
||||||
|
returns a block of random <code class="code docutils literal notranslate"><span class="pre">float32</span></code> values sampled from <span class="math notranslate nohighlight">\(U(0, 1)\)</span>.</p>
|
||||||
|
<dl class="field-list simple">
|
||||||
|
<dt class="field-odd">Parameters</dt>
|
||||||
|
<dd class="field-odd"><ul class="simple">
|
||||||
|
<li><p><strong>seed</strong> – The seed for generating random numbers.</p></li>
|
||||||
|
<li><p><strong>offset</strong> – The block of offsets to generate random numbers for.</p></li>
|
||||||
|
</ul>
|
||||||
|
</dd>
|
||||||
|
</dl>
|
||||||
|
</dd></dl>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<footer>
|
||||||
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
|
<a href="triton.language.randn.html" class="btn btn-neutral float-right" title="triton.language.randn" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
|
<a href="triton.language.randint.html" class="btn btn-neutral float-left" title="triton.language.randint" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
|
||||||
|
<div role="contentinfo">
|
||||||
|
<p>
|
||||||
|
© Copyright 2020, Philippe Tillet.
|
||||||
|
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
||||||
|
|
||||||
|
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
||||||
|
|
||||||
|
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||||||
|
|
||||||
|
</footer>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</section>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript">
|
||||||
|
jQuery(function () {
|
||||||
|
SphinxRtdTheme.Navigation.enable(true);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</body>
|
||||||
|
</html>
|
268
python-api/generated/triton.language.randint.html
Normal file
@@ -0,0 +1,268 @@
|
|||||||
|
|
||||||
|
|
||||||
|
<!DOCTYPE html>
|
||||||
|
<html class="writer-html5" lang="en" >
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||||
|
|
||||||
|
<title>triton.language.randint — Triton documentation</title>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-binder.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-dataframe.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-rendered-html.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/custom.css" type="text/css" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<!--[if lt IE 9]>
|
||||||
|
<script src="../../_static/js/html5shiv.min.js"></script>
|
||||||
|
<![endif]-->
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script src="../../_static/jquery.js"></script>
|
||||||
|
<script src="../../_static/underscore.js"></script>
|
||||||
|
<script src="../../_static/doctools.js"></script>
|
||||||
|
|
||||||
|
<script type="text/javascript" src="../../_static/js/theme.js"></script>
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
|
<link rel="next" title="triton.language.rand" href="triton.language.rand.html" />
|
||||||
|
<link rel="prev" title="triton.language.randint4x" href="triton.language.randint4x.html" />
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body class="wy-body-for-nav">
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-grid-for-nav">
|
||||||
|
|
||||||
|
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||||
|
<div class="wy-side-scroll">
|
||||||
|
<div class="wy-side-nav-search" >
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../index.html" class="icon icon-home"> Triton
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</a>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="search">
|
||||||
|
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
|
||||||
|
<input type="text" name="q" placeholder="Search docs" />
|
||||||
|
<input type="hidden" name="check_keywords" value="yes" />
|
||||||
|
<input type="hidden" name="area" value="default" />
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Getting Started</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/installation.html">Installation</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/tutorials/index.html">Tutorials</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Python API</span></p>
|
||||||
|
<ul class="current">
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.html">triton</a></li>
|
||||||
|
<li class="toctree-l1 current"><a class="reference internal" href="../triton.language.html">triton.language</a><ul class="current">
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#programming-model">Programming Model</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#creation-ops">Creation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#shape-manipulation-ops">Shape Manipulation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#linear-algebra-ops">Linear Algebra Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#memory-ops">Memory Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#indexing-ops">Indexing Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#math-ops">Math Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2 current"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a><ul class="current">
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randint4x.html">triton.language.randint4x</a></li>
|
||||||
|
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.randint</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.rand.html">triton.language.rand</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randn.html">triton.language.randn</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.testing.html">triton.testing</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Programming Guide</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-1/introduction.html">Introduction</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-2/related-work.html">Related Work</a></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
|
||||||
|
|
||||||
|
|
||||||
|
<nav class="wy-nav-top" aria-label="top navigation">
|
||||||
|
|
||||||
|
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
||||||
|
<a href="../../index.html">Triton</a>
|
||||||
|
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-nav-content">
|
||||||
|
|
||||||
|
<div class="rst-content">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="navigation" aria-label="breadcrumbs navigation">
|
||||||
|
|
||||||
|
<ul class="wy-breadcrumbs">
|
||||||
|
|
||||||
|
<li><a href="../../index.html" class="icon icon-home"></a> »</li>
|
||||||
|
|
||||||
|
<li><a href="../triton.language.html">triton.language</a> »</li>
|
||||||
|
|
||||||
|
<li>triton.language.randint</li>
|
||||||
|
|
||||||
|
|
||||||
|
<li class="wy-breadcrumbs-aside">
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../_sources/python-api/generated/triton.language.randint.rst.txt" rel="nofollow"> View page source</a>
|
||||||
|
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
</div>
|
||||||
|
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
||||||
|
<div itemprop="articleBody">
|
||||||
|
|
||||||
|
<div class="section" id="triton-language-randint">
|
||||||
|
<h1>triton.language.randint<a class="headerlink" href="#triton-language-randint" title="Permalink to this headline">¶</a></h1>
|
||||||
|
<dl class="py function">
|
||||||
|
<dt class="sig sig-object py" id="triton.language.randint">
|
||||||
|
<span class="sig-prename descclassname"><span class="pre">triton.language.</span></span><span class="sig-name descname"><span class="pre">randint</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">seed</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">offset</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#triton.language.randint" title="Permalink to this definition">¶</a></dt>
|
||||||
|
<dd><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block, returns a single
|
||||||
|
block of random <code class="code docutils literal notranslate"><span class="pre">int32</span></code> values.</p>
|
||||||
|
<p>If you need multiple streams of random numbers,
|
||||||
|
using <cite>randint4x</cite> is likely to be faster than calling <cite>randint</cite> 4 times.</p>
|
||||||
|
<dl class="field-list simple">
|
||||||
|
<dt class="field-odd">Parameters</dt>
|
||||||
|
<dd class="field-odd"><ul class="simple">
|
||||||
|
<li><p><strong>seed</strong> – The seed for generating random numbers.</p></li>
|
||||||
|
<li><p><strong>offset</strong> – The block of offsets to generate random numbers for.</p></li>
|
||||||
|
</ul>
|
||||||
|
</dd>
|
||||||
|
</dl>
|
||||||
|
</dd></dl>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<footer>
|
||||||
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
|
<a href="triton.language.rand.html" class="btn btn-neutral float-right" title="triton.language.rand" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
|
<a href="triton.language.randint4x.html" class="btn btn-neutral float-left" title="triton.language.randint4x" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
|
||||||
|
<div role="contentinfo">
|
||||||
|
<p>
|
||||||
|
© Copyright 2020, Philippe Tillet.
|
||||||
|
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
||||||
|
|
||||||
|
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
||||||
|
|
||||||
|
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||||||
|
|
||||||
|
</footer>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</section>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript">
|
||||||
|
jQuery(function () {
|
||||||
|
SphinxRtdTheme.Navigation.enable(true);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</body>
|
||||||
|
</html>
|
268
python-api/generated/triton.language.randint4x.html
Normal file
@@ -0,0 +1,268 @@
|
|||||||
|
|
||||||
|
|
||||||
|
<!DOCTYPE html>
|
||||||
|
<html class="writer-html5" lang="en" >
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||||
|
|
||||||
|
<title>triton.language.randint4x — Triton documentation</title>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-binder.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-dataframe.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-rendered-html.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/custom.css" type="text/css" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<!--[if lt IE 9]>
|
||||||
|
<script src="../../_static/js/html5shiv.min.js"></script>
|
||||||
|
<![endif]-->
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script src="../../_static/jquery.js"></script>
|
||||||
|
<script src="../../_static/underscore.js"></script>
|
||||||
|
<script src="../../_static/doctools.js"></script>
|
||||||
|
|
||||||
|
<script type="text/javascript" src="../../_static/js/theme.js"></script>
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
|
<link rel="next" title="triton.language.randint" href="triton.language.randint.html" />
|
||||||
|
<link rel="prev" title="triton.language.maximum" href="triton.language.maximum.html" />
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body class="wy-body-for-nav">
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-grid-for-nav">
|
||||||
|
|
||||||
|
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||||
|
<div class="wy-side-scroll">
|
||||||
|
<div class="wy-side-nav-search" >
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../index.html" class="icon icon-home"> Triton
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</a>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="search">
|
||||||
|
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
|
||||||
|
<input type="text" name="q" placeholder="Search docs" />
|
||||||
|
<input type="hidden" name="check_keywords" value="yes" />
|
||||||
|
<input type="hidden" name="area" value="default" />
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Getting Started</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/installation.html">Installation</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/tutorials/index.html">Tutorials</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Python API</span></p>
|
||||||
|
<ul class="current">
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.html">triton</a></li>
|
||||||
|
<li class="toctree-l1 current"><a class="reference internal" href="../triton.language.html">triton.language</a><ul class="current">
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#programming-model">Programming Model</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#creation-ops">Creation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#shape-manipulation-ops">Shape Manipulation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#linear-algebra-ops">Linear Algebra Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#memory-ops">Memory Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#indexing-ops">Indexing Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#math-ops">Math Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2 current"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a><ul class="current">
|
||||||
|
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.randint4x</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randint.html">triton.language.randint</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.rand.html">triton.language.rand</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randn.html">triton.language.randn</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.testing.html">triton.testing</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Programming Guide</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-1/introduction.html">Introduction</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-2/related-work.html">Related Work</a></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
|
||||||
|
|
||||||
|
|
||||||
|
<nav class="wy-nav-top" aria-label="top navigation">
|
||||||
|
|
||||||
|
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
||||||
|
<a href="../../index.html">Triton</a>
|
||||||
|
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-nav-content">
|
||||||
|
|
||||||
|
<div class="rst-content">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="navigation" aria-label="breadcrumbs navigation">
|
||||||
|
|
||||||
|
<ul class="wy-breadcrumbs">
|
||||||
|
|
||||||
|
<li><a href="../../index.html" class="icon icon-home"></a> »</li>
|
||||||
|
|
||||||
|
<li><a href="../triton.language.html">triton.language</a> »</li>
|
||||||
|
|
||||||
|
<li>triton.language.randint4x</li>
|
||||||
|
|
||||||
|
|
||||||
|
<li class="wy-breadcrumbs-aside">
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../_sources/python-api/generated/triton.language.randint4x.rst.txt" rel="nofollow"> View page source</a>
|
||||||
|
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
</div>
|
||||||
|
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
||||||
|
<div itemprop="articleBody">
|
||||||
|
|
||||||
|
<div class="section" id="triton-language-randint4x">
|
||||||
|
<h1>triton.language.randint4x<a class="headerlink" href="#triton-language-randint4x" title="Permalink to this headline">¶</a></h1>
|
||||||
|
<dl class="py function">
|
||||||
|
<dt class="sig sig-object py" id="triton.language.randint4x">
|
||||||
|
<span class="sig-prename descclassname"><span class="pre">triton.language.</span></span><span class="sig-name descname"><span class="pre">randint4x</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">seed</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">offset</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#triton.language.randint4x" title="Permalink to this definition">¶</a></dt>
|
||||||
|
<dd><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block, returns four
|
||||||
|
blocks of random <code class="code docutils literal notranslate"><span class="pre">int32</span></code>.</p>
|
||||||
|
<p>This is the maximally efficient entry point
|
||||||
|
to Triton’s Philox pseudo-random number generator.</p>
|
||||||
|
<dl class="field-list simple">
|
||||||
|
<dt class="field-odd">Parameters</dt>
|
||||||
|
<dd class="field-odd"><ul class="simple">
|
||||||
|
<li><p><strong>seed</strong> – The seed for generating random numbers.</p></li>
|
||||||
|
<li><p><strong>offset</strong> – The block of offsets to generate random numbers for.</p></li>
|
||||||
|
</ul>
|
||||||
|
</dd>
|
||||||
|
</dl>
|
||||||
|
</dd></dl>
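To make the "four blocks per call" behaviour concrete, the following pure-Python sketch runs a Philox-4x32 style counter-based generator on one seed and a list of offsets. It is an illustration under assumptions, not Triton's implementation: the names `philox4x32_sketch` and `randint4x_sketch`, the round count of 10, and the choice to place the offset in the first counter word and the seed in the first key word are all hypothetical. The multiplier and key-increment constants are the ones commonly published for Philox-4x32. Values are returned as unsigned 32-bit integers; reinterpreting them as signed `int32` is left out.

```python
MASK32 = 0xFFFFFFFF
# Constants commonly published for Philox-4x32 (assumed, not taken from this page).
MUL_A, MUL_B = 0xD2511F53, 0xCD9E8D57
KEY_A, KEY_B = 0x9E3779B9, 0xBB67AE85


def philox4x32_sketch(seed, offset, rounds=10):
    """Return four 32-bit integers for one (seed, offset) pair."""
    # Assumed layout: offset fills the first counter word, seed the first key word.
    c0, c1, c2, c3 = offset & MASK32, 0, 0, 0
    k0, k1 = seed & MASK32, 0
    for _ in range(rounds):
        prod_a = MUL_A * c0  # 64-bit product, split into high and low halves below
        prod_b = MUL_B * c2
        c0, c1, c2, c3 = (
            (prod_b >> 32) ^ c1 ^ k0,
            prod_b & MASK32,
            (prod_a >> 32) ^ c3 ^ k1,
            prod_a & MASK32,
        )
        # Advance the key schedule once per round.
        k0 = (k0 + KEY_A) & MASK32
        k1 = (k1 + KEY_B) & MASK32
    return c0, c1, c2, c3


def randint4x_sketch(seed, offsets):
    """Return four lists ("blocks"), each with one value per offset."""
    rows = [philox4x32_sketch(seed, off) for off in offsets]
    return tuple(list(col) for col in zip(*rows))
```

Calling `randint4x_sketch(42, range(8))` yields four lists of eight values each. Because the generator is a pure function of the seed and the offset, the same call always returns the same values, which is what allows a consumer to keep only the seed as state.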
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<footer>
|
||||||
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
|
<a href="triton.language.randint.html" class="btn btn-neutral float-right" title="triton.language.randint" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
|
<a href="triton.language.maximum.html" class="btn btn-neutral float-left" title="triton.language.maximum" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
|
||||||
|
<div role="contentinfo">
|
||||||
|
<p>
|
||||||
|
© Copyright 2020, Philippe Tillet.
|
||||||
|
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
||||||
|
|
||||||
|
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
||||||
|
|
||||||
|
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||||||
|
|
||||||
|
</footer>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</section>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript">
|
||||||
|
jQuery(function () {
|
||||||
|
SphinxRtdTheme.Navigation.enable(true);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</body>
|
||||||
|
</html>
|
267
python-api/generated/triton.language.randn.html
Normal file
@@ -0,0 +1,267 @@
|
|||||||
|
|
||||||
|
|
||||||
|
<!DOCTYPE html>
|
||||||
|
<html class="writer-html5" lang="en" >
|
||||||
|
<head>
|
||||||
|
<meta charset="utf-8" />
|
||||||
|
|
||||||
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||||||
|
|
||||||
|
<title>triton.language.randn — Triton documentation</title>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/pygments.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-binder.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-dataframe.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/gallery-rendered-html.css" type="text/css" />
|
||||||
|
<link rel="stylesheet" href="../../_static/css/custom.css" type="text/css" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<!--[if lt IE 9]>
|
||||||
|
<script src="../../_static/js/html5shiv.min.js"></script>
|
||||||
|
<![endif]-->
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript" id="documentation_options" data-url_root="../../" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script data-url_root="../../" id="documentation_options" src="../../_static/documentation_options.js"></script>
|
||||||
|
<script src="../../_static/jquery.js"></script>
|
||||||
|
<script src="../../_static/underscore.js"></script>
|
||||||
|
<script src="../../_static/doctools.js"></script>
|
||||||
|
<script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
|
||||||
|
|
||||||
|
<script type="text/javascript" src="../../_static/js/theme.js"></script>
|
||||||
|
|
||||||
|
|
||||||
|
<link rel="index" title="Index" href="../../genindex.html" />
|
||||||
|
<link rel="search" title="Search" href="../../search.html" />
|
||||||
|
<link rel="next" title="triton.language.multiple_of" href="triton.language.multiple_of.html" />
|
||||||
|
<link rel="prev" title="triton.language.rand" href="triton.language.rand.html" />
|
||||||
|
</head>
|
||||||
|
|
||||||
|
<body class="wy-body-for-nav">
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-grid-for-nav">
|
||||||
|
|
||||||
|
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
|
||||||
|
<div class="wy-side-scroll">
|
||||||
|
<div class="wy-side-nav-search" >
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../index.html" class="icon icon-home"> Triton
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</a>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="search">
|
||||||
|
<form id="rtd-search-form" class="wy-form" action="../../search.html" method="get">
|
||||||
|
<input type="text" name="q" placeholder="Search docs" />
|
||||||
|
<input type="hidden" name="check_keywords" value="yes" />
|
||||||
|
<input type="hidden" name="area" value="default" />
|
||||||
|
</form>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="main navigation">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Getting Started</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/installation.html">Installation</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../getting-started/tutorials/index.html">Tutorials</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Python API</span></p>
|
||||||
|
<ul class="current">
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.html">triton</a></li>
|
||||||
|
<li class="toctree-l1 current"><a class="reference internal" href="../triton.language.html">triton.language</a><ul class="current">
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#programming-model">Programming Model</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#creation-ops">Creation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#shape-manipulation-ops">Shape Manipulation Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#linear-algebra-ops">Linear Algebra Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#memory-ops">Memory Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#indexing-ops">Indexing Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#math-ops">Math Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2 current"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a><ul class="current">
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randint4x.html">triton.language.randint4x</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.randint.html">triton.language.randint</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="triton.language.rand.html">triton.language.rand</a></li>
|
||||||
|
<li class="toctree-l3 current"><a class="current reference internal" href="#">triton.language.randn</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../triton.testing.html">triton.testing</a></li>
|
||||||
|
</ul>
|
||||||
|
<p class="caption" role="heading"><span class="caption-text">Programming Guide</span></p>
|
||||||
|
<ul>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-1/introduction.html">Introduction</a></li>
|
||||||
|
<li class="toctree-l1"><a class="reference internal" href="../../programming-guide/chapter-2/related-work.html">Related Work</a></li>
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap">
|
||||||
|
|
||||||
|
|
||||||
|
<nav class="wy-nav-top" aria-label="top navigation">
|
||||||
|
|
||||||
|
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
|
||||||
|
<a href="../../index.html">Triton</a>
|
||||||
|
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
|
||||||
|
<div class="wy-nav-content">
|
||||||
|
|
||||||
|
<div class="rst-content">
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
<div role="navigation" aria-label="breadcrumbs navigation">
|
||||||
|
|
||||||
|
<ul class="wy-breadcrumbs">
|
||||||
|
|
||||||
|
<li><a href="../../index.html" class="icon icon-home"></a> »</li>
|
||||||
|
|
||||||
|
<li><a href="../triton.language.html">triton.language</a> »</li>
|
||||||
|
|
||||||
|
<li>triton.language.randn</li>
|
||||||
|
|
||||||
|
|
||||||
|
<li class="wy-breadcrumbs-aside">
|
||||||
|
|
||||||
|
|
||||||
|
<a href="../../_sources/python-api/generated/triton.language.randn.rst.txt" rel="nofollow"> View page source</a>
|
||||||
|
|
||||||
|
|
||||||
|
</li>
|
||||||
|
|
||||||
|
</ul>
|
||||||
|
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
</div>
|
||||||
|
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
|
||||||
|
<div itemprop="articleBody">
|
||||||
|
|
||||||
|
<div class="section" id="triton-language-randn">
|
||||||
|
<h1>triton.language.randn<a class="headerlink" href="#triton-language-randn" title="Permalink to this headline">¶</a></h1>
|
||||||
|
<dl class="py function">
|
||||||
|
<dt class="sig sig-object py" id="triton.language.randn">
|
||||||
|
<span class="sig-prename descclassname"><span class="pre">triton.language.</span></span><span class="sig-name descname"><span class="pre">randn</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">seed</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">offset</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#triton.language.randn" title="Permalink to this definition">¶</a></dt>
|
||||||
|
<dd><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block,
|
||||||
|
returns a block of random <code class="code docutils literal notranslate"><span class="pre">float32</span></code> drawn from <span class="math notranslate nohighlight">\(\mathcal{N}(0, 1)\)</span>.</p>
|
||||||
|
<dl class="field-list simple">
|
||||||
|
<dt class="field-odd">Parameters</dt>
|
||||||
|
<dd class="field-odd"><ul class="simple">
|
||||||
|
<li><p><strong>seed</strong> – The seed for generating random numbers.</p></li>
|
||||||
|
<li><p><strong>offset</strong> – The block of offsets to generate random numbers for.</p></li>
|
||||||
|
</ul>
|
||||||
|
</dd>
|
||||||
|
</dl>
|
||||||
|
</dd></dl>
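This page does not say how the normal samples are derived. One common way to turn uniform random values into \(\mathcal{N}(0, 1)\) samples is the Box-Muller transform, sketched below in plain Python. Whether Triton uses this transform is an assumption here, and the helper names `uint32_to_unit` and `box_muller` are hypothetical.

```python
import math


def uint32_to_unit(x):
    """Map an unsigned 32-bit integer to a float in (0, 1]."""
    # Adding 1 keeps the result strictly positive, so log() below is defined.
    return (x + 1) / 2**32


def box_muller(u1, u2):
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to two N(0, 1) samples."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)
```

Each call consumes two uniform values and produces two normal values, so a generator that emits several random integers per offset pairs naturally with this transform.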
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
<footer>
|
||||||
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
|
<a href="triton.language.multiple_of.html" class="btn btn-neutral float-right" title="triton.language.multiple_of" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
|
<a href="triton.language.rand.html" class="btn btn-neutral float-left" title="triton.language.rand" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<hr/>
|
||||||
|
|
||||||
|
<div role="contentinfo">
|
||||||
|
<p>
|
||||||
|
© Copyright 2020, Philippe Tillet.
|
||||||
|
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
|
||||||
|
|
||||||
|
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
|
||||||
|
|
||||||
|
provided by <a href="https://readthedocs.org">Read the Docs</a>.
|
||||||
|
|
||||||
|
</footer>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
</section>
|
||||||
|
|
||||||
|
</div>
|
||||||
|
|
||||||
|
|
||||||
|
<script type="text/javascript">
|
||||||
|
jQuery(function () {
|
||||||
|
SphinxRtdTheme.Navigation.enable(true);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
</body>
|
||||||
|
</html>
|
@@ -116,6 +116,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -116,6 +116,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -120,6 +120,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -117,6 +117,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -116,6 +116,7 @@
|
|||||||
</li>
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -114,6 +114,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -115,6 +115,7 @@
|
|||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#reduction-ops">Reduction Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#atomic-ops">Atomic Ops</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#comparison-ops">Comparison ops</a></li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#random-number-generation">Random Number Generation</a></li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
<li class="toctree-l2"><a class="reference internal" href="../triton.language.html#compiler-hint-ops">Compiler Hint Ops</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
@@ -47,7 +47,7 @@
|
|||||||
<link rel="index" title="Index" href="../genindex.html" />
|
<link rel="index" title="Index" href="../genindex.html" />
|
||||||
<link rel="search" title="Search" href="../search.html" />
|
<link rel="search" title="Search" href="../search.html" />
|
||||||
<link rel="next" title="triton.jit" href="generated/triton.jit.html" />
|
<link rel="next" title="triton.jit" href="generated/triton.jit.html" />
|
||||||
<link rel="prev" title="Matrix Multiplication" href="../getting-started/tutorials/03-matrix-multiplication.html" />
|
<link rel="prev" title="Low-Memory Dropout" href="../getting-started/tutorials/04-low-memory-dropout.html" />
|
||||||
</head>
|
</head>
|
||||||
|
|
||||||
<body class="wy-body-for-nav">
|
<body class="wy-body-for-nav">
|
||||||
@@ -211,7 +211,7 @@
|
|||||||
<footer>
|
<footer>
|
||||||
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
<div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">
|
||||||
<a href="generated/triton.jit.html" class="btn btn-neutral float-right" title="triton.jit" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
<a href="generated/triton.jit.html" class="btn btn-neutral float-right" title="triton.jit" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
|
||||||
<a href="../getting-started/tutorials/03-matrix-multiplication.html" class="btn btn-neutral float-left" title="Matrix Multiplication" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
<a href="../getting-started/tutorials/04-low-memory-dropout.html" class="btn btn-neutral float-left" title="Low-Memory Dropout" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<hr/>
|
<hr/>
|
||||||
|
@@ -40,6 +40,7 @@
|
|||||||
<script src="../_static/jquery.js"></script>
|
<script src="../_static/jquery.js"></script>
|
||||||
<script src="../_static/underscore.js"></script>
|
<script src="../_static/underscore.js"></script>
|
||||||
<script src="../_static/doctools.js"></script>
|
<script src="../_static/doctools.js"></script>
|
||||||
|
<script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
|
||||||
|
|
||||||
<script type="text/javascript" src="../_static/js/theme.js"></script>
|
<script type="text/javascript" src="../_static/js/theme.js"></script>
|
||||||
|
|
||||||
@@ -160,6 +161,13 @@
|
|||||||
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.maximum.html">triton.language.maximum</a></li>
|
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.maximum.html">triton.language.maximum</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
|
<li class="toctree-l2"><a class="reference internal" href="#random-number-generation">Random Number Generation</a><ul>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.randint4x.html">triton.language.randint4x</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.randint.html">triton.language.randint</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.rand.html">triton.language.rand</a></li>
|
||||||
|
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.randn.html">triton.language.randn</a></li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
<li class="toctree-l2"><a class="reference internal" href="#compiler-hint-ops">Compiler Hint Ops</a><ul>
|
<li class="toctree-l2"><a class="reference internal" href="#compiler-hint-ops">Compiler Hint Ops</a><ul>
|
||||||
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.multiple_of.html">triton.language.multiple_of</a></li>
|
<li class="toctree-l3"><a class="reference internal" href="generated/triton.language.multiple_of.html">triton.language.multiple_of</a></li>
|
||||||
</ul>
|
</ul>
|
||||||
@@ -438,6 +446,29 @@
|
|||||||
</tbody>
|
</tbody>
|
||||||
</table>
|
</table>
|
||||||
</div>
|
</div>
|
||||||
|
<div class="section" id="random-number-generation">
|
||||||
|
<span id="id1"></span><h2>Random Number Generation<a class="headerlink" href="#random-number-generation" title="Permalink to this headline">¶</a></h2>
|
||||||
|
<table class="longtable docutils align-default">
|
||||||
|
<colgroup>
|
||||||
|
<col style="width: 10%" />
|
||||||
|
<col style="width: 90%" />
|
||||||
|
</colgroup>
|
||||||
|
<tbody>
|
||||||
|
<tr class="row-odd"><td><p><a class="reference internal" href="generated/triton.language.randint4x.html#triton.language.randint4x" title="triton.language.randint4x"><code class="xref py py-obj docutils literal notranslate"><span class="pre">randint4x</span></code></a></p></td>
|
||||||
|
<td><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block, returns four blocks of random <code class="code docutils literal notranslate"><span class="pre">int32</span></code>.</p></td>
|
||||||
|
</tr>
|
||||||
|
<tr class="row-even"><td><p><a class="reference internal" href="generated/triton.language.randint.html#triton.language.randint" title="triton.language.randint"><code class="xref py py-obj docutils literal notranslate"><span class="pre">randint</span></code></a></p></td>
|
||||||
|
<td><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block, returns a single block of random <code class="code docutils literal notranslate"><span class="pre">int32</span></code>.</p></td>
|
||||||
|
</tr>
|
||||||
|
<tr class="row-odd"><td><p><a class="reference internal" href="generated/triton.language.rand.html#triton.language.rand" title="triton.language.rand"><code class="xref py py-obj docutils literal notranslate"><span class="pre">rand</span></code></a></p></td>
|
||||||
|
<td><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block, returns a block of random <code class="code docutils literal notranslate"><span class="pre">float32</span></code> values drawn from <span class="math notranslate nohighlight">\(U(0, 1)\)</span>.</p></td>
|
||||||
|
</tr>
|
||||||
|
<tr class="row-even"><td><p><a class="reference internal" href="generated/triton.language.randn.html#triton.language.randn" title="triton.language.randn"><code class="xref py py-obj docutils literal notranslate"><span class="pre">randn</span></code></a></p></td>
|
||||||
|
<td><p>Given a <code class="code docutils literal notranslate"><span class="pre">seed</span></code> scalar and an <code class="code docutils literal notranslate"><span class="pre">offset</span></code> block, returns a block of random <code class="code docutils literal notranslate"><span class="pre">float32</span></code> values drawn from <span class="math notranslate nohighlight">\(\mathcal{N}(0, 1)\)</span>.</p></td>
|
||||||
|
</tr>
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
</div>
|
||||||
<div class="section" id="compiler-hint-ops">
|
<div class="section" id="compiler-hint-ops">
|
||||||
<h2>Compiler Hint Ops<a class="headerlink" href="#compiler-hint-ops" title="Permalink to this headline">¶</a></h2>
|
<h2>Compiler Hint Ops<a class="headerlink" href="#compiler-hint-ops" title="Permalink to this headline">¶</a></h2>
|
||||||
<table class="longtable docutils align-default">
|
<table class="longtable docutils align-default">
|
||||||
|