PyTorch’s Autograd feature is what makes PyTorch flexible and fast for building machine learning projects. It allows for the rapid computation of multiple partial derivatives (gradients) over a complex computation.

Differentiation in Autograd

When creating a PyTorch tensor, setting the parameter requires_grad=True signals to autograd that every operation on that tensor should be tracked.

We create two tensors a and b with requires_grad=True, from which a tensor Q is then built (see the sketch after the snippet):

import torch
 
a = torch.tensor([2., 3.], requires_grad=True)
b = torch.tensor([6., 4.], requires_grad=True)
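
The definition of Q is not shown in this section; as a minimal sketch, assume the polynomial used in the official PyTorch autograd tutorial, Q = 3a³ - b²:

Q = 3*a**3 - b**2   # Q is built from a and b, so autograd tracks each operation
print(Q)            # tensor([-12.,  65.], grad_fn=<SubBackward0>)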

Computational Graph

Autograd keeps a record of data (tensors) and all executed operations (along with the resulting new tensors) in a directed acyclic graph (DAG).

In this DAG, the leaves are the input tensors and the roots are the output tensors. By tracing the graph from roots to leaves, you can automatically compute the gradients using the chain rule.
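
Continuing with the a, b, and Q from the sketch above (an assumption carried over from that snippet), the leaves and the root can be inspected directly: user-created leaf tensors have no grad_fn, while the output tensor carries the grad_fn that roots the backward graph.

print(a.is_leaf, b.is_leaf)   # True True -> a and b are leaves of the DAG
print(a.grad_fn)              # None -> user-created leaves have no backward function
print(Q.is_leaf)              # False -> Q was produced by tracked operations
print(Q.grad_fn)              # <SubBackward0 object at 0x...> -> root of the backward graph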

Forward Pass:

  • Run the requested operation to compute a resulting tensor
  • Maintain the operation’s gradient function (grad_fn) in the DAG, as sketched after this list
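
A minimal sketch of that bookkeeping, using an illustrative tensor x that is not part of the running example: every tracked operation stores its backward function on the result, and wrapping the operation in torch.no_grad() skips the recording.

x = torch.ones(2, requires_grad=True)

y = x * 2                 # forward pass: computes y and records MulBackward0 in the DAG
print(y.grad_fn)          # <MulBackward0 object at 0x...>

with torch.no_grad():     # operations in this block are not tracked
    z = x * 2
print(z.grad_fn)          # None -> nothing was recorded for z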

The backward pass kicks off when .backward() is called on the DAG root. Autograd then:

  • computes the gradients from each .grad_fn
  • accumulates them in the respective tensor’s .grad attribute
  • using the chain rule, propagates all the way to the leaf tensors (see the sketch after this list)
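
A sketch of the backward pass, again assuming Q = 3a³ - b² from above. Because Q is a vector, .backward() needs a gradient argument of Q’s shape (here all ones, i.e. dQ/dQ); autograd then deposits dQ/da = 9a² in a.grad and dQ/db = -2b in b.grad.

external_grad = torch.tensor([1., 1.])   # dQ/dQ, one entry per element of Q
Q.backward(gradient=external_grad)       # kicks off the backward pass at the DAG root

print(a.grad)                  # tensor([36., 81.])  -> equals 9*a**2
print(b.grad)                  # tensor([-12., -8.]) -> equals -2*b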

In a visual representation of the DAG for this example:

  • Arrows show the direction of the forward pass
  • Nodes represent the backward function (grad_fn) of each operation in the forward pass
  • Leaf nodes represent the leaf tensors a and b