XLA

Accelerated Linear Algebra (XLA) is a compiler for high-level array and tensor programs.

It is mainly used for machine learning workloads, where a program is often made of many tensor operations like matrix multiplication, convolution, reshaping, reductions, and elementwise arithmetic.

The basic idea: XLA takes a computation graph and compiles it into optimized machine code for a target device such as a CPU, GPU, or TPU.
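As a minimal sketch of that idea, JAX is one framework that hands its computations to XLA through `jax.jit`. The function name `layer` and the array sizes below are made up for illustration. The function mixes a matrix multiplication, elementwise arithmetic, and a reduction, and the compiled version should return the same values as the uncompiled one.

```python
import jax
import jax.numpy as jnp

def layer(x, w, b):
    # Matrix multiply, elementwise add and tanh, then a reduction over the last axis.
    return jnp.sum(jnp.tanh(x @ w + b), axis=-1)

# jax.jit traces the function once and asks XLA to compile the traced graph.
layer_compiled = jax.jit(layer)

# Illustrative inputs; the shapes are arbitrary.
x = jnp.ones((8, 16))
w = jnp.full((16, 4), 0.1)
b = jnp.zeros((4,))

eager = layer(x, w, b)              # runs operation by operation
compiled = layer_compiled(x, w, b)  # runs the XLA-compiled program
```

Both calls compute the same numbers. The difference is that the compiled call lets XLA see the whole function at once, which is what gives it room to optimize across operations.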

Mental model

XLA is not a machine learning framework by itself.

It sits underneath frameworks such as JAX, TensorFlow, and PyTorch and compiles the numerical parts of the program.
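One way to see this layering, sketched here with JAX as the framework, is to ask for the intermediate program that the framework passes to the compiler. The function `normalize` is a hypothetical example, and the exact text format of the lowered program depends on the JAX version, so treat the printed output as something to inspect rather than rely on.

```python
import jax
import jax.numpy as jnp

def normalize(x):
    # Plain numerical code; nothing here mentions XLA directly.
    return (x - jnp.mean(x)) / (jnp.std(x) + 1e-6)

x = jnp.arange(8.0)

# Stage 1: the framework traces the Python function and lowers it
# to a compiler-level representation for this input shape and dtype.
lowered = jax.jit(normalize).lower(x)
ir_text = lowered.as_text()

# Stage 2: the compiler turns that representation into an executable.
compiled = lowered.compile()
result = compiled(x)

print(ir_text)
```

The Python function is the framework-level view, and `ir_text` is the compiler-level view of the same computation.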

Important tradeoff

XLA works best when shapes and control flow are relatively stable.

If a program changes tensor shapes constantly or keeps much of its logic in dynamic Python-side control flow, compilation becomes less useful: XLA either has to recompile for each new shape or cannot see enough of the computation at once to optimize it.
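The recompilation cost can be made visible with a small JAX sketch. The function `scaled_sum` and the list `traces` are hypothetical names for this illustration. The Python body of a jitted function runs only while JAX traces it, so appending to a list inside the function records each time a new trace (and therefore a new compilation) is needed.

```python
import jax
import jax.numpy as jnp

traces = []  # one entry per trace, recording the input shape that caused it

@jax.jit
def scaled_sum(x):
    # This append is a Python side effect: it runs during tracing only,
    # not on calls that reuse an already compiled program.
    traces.append(x.shape)
    return jnp.sum(x * 2.0)

scaled_sum(jnp.ones(4))
scaled_sum(jnp.zeros(4))   # same shape and dtype: the cached program is reused
traces_after_same_shape = list(traces)

scaled_sum(jnp.ones(8))    # new shape: JAX traces and compiles again
traces_after_new_shape = list(traces)
```

Under these assumptions the first two calls share one compiled program, and the third call triggers a second trace. A workload that keeps producing new shapes pays this cost again and again.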

So XLA is especially useful for large, repeated numerical workloads where the same computation runs many times.

Ayush Garg
