# docstring

Write docstrings for PyTorch functions and methods following PyTorch conventions. Use when writing or updating docstrings in PyTorch code.

Install: `bunx add-skill pytorch/pytorch -s docstring`
This skill describes how to write docstrings for functions and methods in the PyTorch project, following the conventions in `torch/_tensor_docs.py` and `torch/nn/functional.py`.
r"""...""") for all docstrings to avoid issues with LaTeX/math backslashesStart with the function signature showing all parameters:
r"""function_name(param1, param2, *, kwarg1=default1, kwarg2=default2) -> ReturnType
Notes:
* separator)Provide a one-line description of what the function does:
r"""conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) -> Tensor
Applies a 2D convolution over an input image composed of several input
planes.
Use Sphinx math directives for mathematical expressions:

```
.. math::
    \text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}
```

Or inline math: :math:`x^2`
Link to related classes and functions using Sphinx roles:

- :class:`~torch.nn.ModuleName` - Link to a class
- :func:`torch.function_name` - Link to a function
- :meth:`~Tensor.method_name` - Link to a method
- :attr:`attribute_name` - Reference an attribute

The `~` prefix shows only the last component (e.g., `Conv2d` instead of `torch.nn.Conv2d`).

Example:

```
See :class:`~torch.nn.Conv2d` for details and output shape.
```
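To illustrate the effect of the `~` prefix (a hypothetical rendering, not from the skill itself), both roles below link to the same class but display differently:

```
:class:`~torch.nn.Conv2d`   renders as "Conv2d"
:class:`torch.nn.Conv2d`    renders as "torch.nn.Conv2d"
```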
Use admonitions for important information:

```
.. note::
    This function doesn't work directly with NLLLoss,
    which expects the Log to be computed between the Softmax and itself.
    Use log_softmax instead (it's faster and has better numerical properties).

.. warning::
    :func:`new_tensor` always copies :attr:`data`. If you have a Tensor
    ``data`` and want to avoid a copy, use :func:`torch.Tensor.requires_grad_`
    or :func:`torch.Tensor.detach`.
```
Document all parameters with type annotations and descriptions:

```
Args:
    input (Tensor): input tensor of shape :math:`(\text{minibatch} , \text{in\_channels} , iH , iW)`
    weight (Tensor): filters of shape :math:`(\text{out\_channels} , \frac{\text{in\_channels}}{\text{groups}} , kH , kW)`
    bias (Tensor, optional): optional bias tensor of shape :math:`(\text{out\_channels})`. Default: ``None``
    stride (int or tuple): the stride of the convolving kernel. Can be a single number or a
        tuple `(sH, sW)`. Default: 1
```
Formatting rules:

- Annotate types as `(Type)`, or `(Type, optional)` for optional parameters
- End with "Default: value" for parameters that have defaults
- Use double backticks for literal values such as ``None``

Sometimes keyword arguments are documented separately:
```
Keyword args:
    dtype (:class:`torch.dtype`, optional): the desired type of returned tensor.
        Default: if None, same :class:`torch.dtype` as this tensor.
    device (:class:`torch.device`, optional): the desired device of returned tensor.
        Default: if None, same :class:`torch.device` as this tensor.
    requires_grad (bool, optional): If autograd should record operations on the
        returned tensor. Default: ``False``.
```
Document the return value:

```
Returns:
    Tensor: Sampled tensor of same shape as `logits` from the Gumbel-Softmax distribution.
    If ``hard=True``, the returned samples will be one-hot, otherwise they will
    be probability distributions that sum to 1 across `dim`.
```
Or simply include it in the function signature line if obvious from context.
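For example, a signature-only return (an illustrative sketch modeled on a simple unary op) lets the `-> Tensor` in the signature line carry the return documentation:

```
r"""abs(input) -> Tensor

Computes the absolute value of each element in :attr:`input`.
"""
```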
Always include examples when possible:

```
Examples::

    >>> inputs = torch.randn(33, 16, 30)
    >>> filters = torch.randn(20, 16, 5)
    >>> F.conv1d(inputs, filters)

    >>> # With square kernels and equal stride
    >>> filters = torch.randn(8, 4, 3, 3)
    >>> inputs = torch.randn(1, 4, 5, 5)
    >>> F.conv2d(inputs, filters, padding=1)
```
Formatting rules:

- Use `Examples::` with a double colon
- Use the `>>>` prompt for Python code
- Add comments with `#` when helpful (still prefixed with `>>>`)

Link to papers or external documentation:
```
.. _Link Name:
    https://arxiv.org/abs/1611.00712
```
Reference them in text: See `Link Name`_
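Putting both pieces together (an illustrative fragment reusing the `Link 1` target name and arXiv URL that appear later in this skill):

```
See `Link 1`_ for background.

.. _Link 1:
    https://arxiv.org/abs/1611.00712
```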
For regular Python functions, use a standard docstring:

```python
def relu(input: Tensor, inplace: bool = False) -> Tensor:
    r"""relu(input, inplace=False) -> Tensor

    Applies the rectified linear unit function element-wise. See
    :class:`~torch.nn.ReLU` for more details.
    """
    # implementation
```
For C-bound functions, use `_add_docstr`:

```python
conv1d = _add_docstr(
    torch.conv1d,
    r"""
conv1d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1) -> Tensor

Applies a 1D convolution over an input signal composed of several input
planes.

See :class:`~torch.nn.Conv1d` for details and output shape.

Args:
    input: input tensor of shape :math:`(\text{minibatch} , \text{in\_channels} , iW)`
    weight: filters of shape :math:`(\text{out\_channels} , \frac{\text{in\_channels}}{\text{groups}} , kW)`
    ...
""",
)
```
For in-place operations (names ending with `_`), reference the original:

```python
add_docstr_all(
    "abs_",
    r"""
abs_() -> Tensor

In-place version of :meth:`~Tensor.abs`
""",
)
```
For aliases, simply reference the original:

```python
add_docstr_all(
    "absolute",
    r"""
absolute() -> Tensor

Alias for :func:`abs`
""",
)
```
Use LaTeX math notation for tensor shapes:

```
:math:`(\text{minibatch} , \text{in\_channels} , iH , iW)`
```
For commonly used arguments, define them once and reuse:

```python
common_args = parse_kwargs(
    """
    dtype (:class:`torch.dtype`, optional): the desired type of returned tensor.
        Default: if None, same as this tensor.
    device (:class:`torch.device`, optional): the desired device of returned tensor.
        Default: if None, same as this tensor.
    """
)

# Then use with .format():
r"""
...
Keyword args:
    {dtype}
    {device}
""".format(**common_args)
```
Insert reproducibility notes or other common text:

```python
r"""
{tf32_note}

{cudnn_reproducibility_note}
""".format(**reproducibility_notes, **tf32_notes)
```
Here's a complete example showing all elements:

```python
def gumbel_softmax(
    logits: Tensor,
    tau: float = 1,
    hard: bool = False,
    eps: float = 1e-10,
    dim: int = -1,
) -> Tensor:
    r"""
    Sample from the Gumbel-Softmax distribution and optionally discretize.

    Args:
        logits (Tensor): `[..., num_features]` unnormalized log probabilities
        tau (float): non-negative scalar temperature
        hard (bool): if ``True``, the returned samples will be discretized as one-hot vectors,
            but will be differentiated as if it is the soft sample in autograd. Default: ``False``
        dim (int): A dimension along which softmax will be computed. Default: -1

    Returns:
        Tensor: Sampled tensor of same shape as `logits` from the Gumbel-Softmax distribution.
        If ``hard=True``, the returned samples will be one-hot, otherwise they will
        be probability distributions that sum to 1 across `dim`.

    .. note::
        This function is here for legacy reasons, may be removed from nn.Functional in the future.

    Examples::
        >>> logits = torch.randn(20, 32)
        >>> # Sample soft categorical using reparametrization trick:
        >>> F.gumbel_softmax(logits, tau=1, hard=False)
        >>> # Sample hard categorical using "Straight-through" trick:
        >>> F.gumbel_softmax(logits, tau=1, hard=True)

    .. _Link 1:
        https://arxiv.org/abs/1611.00712
    """
    # implementation
```
When writing a PyTorch docstring, ensure:

- The docstring is a raw string (`r"""`)
- Cross-references use Sphinx roles (:func:, :class:, :meth:)

Quick reference for Sphinx roles:

- :class:`~torch.nn.Module` - Class reference
- :func:`torch.function` - Function reference
- :meth:`~Tensor.method` - Method reference
- :attr:`attribute` - Attribute reference
- :math:`equation` - Inline math
- :ref:`label` - Internal reference
- ``code`` - Inline code (use double backticks)
- ``True``, ``None``, ``False`` - Literal values in double backticks
- Common parameter types: Tensor, int, float, bool, str, tuple, list, etc.