Accelerated Linear Algebra (XLA) is a compiler for high-level array and tensor programs.
It is mainly used for machine learning workloads, where a program is often made of many tensor operations like matrix multiplication, convolution, reshaping, reductions, and elementwise arithmetic.
The basic idea is:
XLA takes a computation graph and compiles it into optimized machine code for a target device such as a CPU, GPU, or TPU.
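A minimal sketch of this idea, assuming JAX as the front end that hands the computation to XLA. The function name scaled_sum and the array shapes are made up for illustration. jax.jit traces the Python function into a graph and asks XLA to compile it. The result should match the uncompiled version.

```python
import jax
import jax.numpy as jnp

def scaled_sum(x, w):
    # Matrix multiplication, elementwise arithmetic, and a reduction:
    # the kinds of tensor operations XLA compiles together.
    return jnp.sum(jnp.tanh(x @ w) * 2.0)

# jax.jit returns a wrapper that compiles scaled_sum with XLA on first call.
compiled_scaled_sum = jax.jit(scaled_sum)

x = jnp.ones((4, 3))  # illustrative shapes, not from the text
w = jnp.ones((3, 2))

result = compiled_scaled_sum(x, w)  # compiled path
reference = scaled_sum(x, w)        # op-by-op path, for comparison
```

Both paths compute the same value. The difference is that the compiled path lets XLA optimize the whole function as one unit instead of running each operation separately.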
Mental model
XLA is not a machine learning framework by itself.
It sits underneath frameworks such as JAX, TensorFlow, and PyTorch (through PyTorch/XLA) and compiles the numerical parts of the program.
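One way to see this layering, sketched here with JAX as the assumed framework: the framework lowers a Python function to an intermediate representation, and XLA compiles that representation. The function normalize and its input are hypothetical. The exact dialect of the printed text depends on the JAX version.

```python
import jax
import jax.numpy as jnp

def normalize(x):
    # Plain framework-level code; nothing here mentions XLA directly.
    return (x - jnp.mean(x)) / (jnp.std(x) + 1e-6)

x = jnp.arange(8.0)

# Step 1: the framework lowers the function for this input shape and dtype.
lowered = jax.jit(normalize).lower(x)
ir_text = lowered.as_text()  # textual form of the program handed to XLA
print(ir_text)

# Step 2: XLA compiles the lowered program into an executable.
executable = lowered.compile()
out = executable(x)
```

The framework owns the Python API and the tracing. XLA only sees the lowered numerical program.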
Important tradeoff
XLA works best when shapes and control flow are relatively stable.
If a program changes shape constantly or has lots of dynamic Python-side logic, compilation becomes less useful: XLA either has to recompile for each new shape, or it sees only small fragments of the computation at once and has little room to optimize across operations.
So XLA is especially useful for large, repeated numerical workloads where the same computation runs many times.
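The recompilation cost can be observed with a small sketch, again assuming JAX as the front end. The names double and trace_log are hypothetical. Python code inside a jitted function runs only while JAX traces it, so recording the shape there shows when a new trace (and therefore a new XLA compilation) happens.

```python
import jax
import jax.numpy as jnp

trace_log = []  # shapes for which the function was traced

@jax.jit
def double(x):
    # This append is a Python side effect, so it runs only during tracing,
    # not on calls that reuse an already compiled executable.
    trace_log.append(x.shape)
    return x * 2.0

double(jnp.ones(3))   # new shape (3,): trace and compile
double(jnp.zeros(3))  # same shape and dtype: reuses the cached executable
double(jnp.ones(4))   # new shape (4,): trace and compile again

print(trace_log)
```

Three calls produce only two traces, because the second call matches the shape and dtype of the first. A workload that feeds a different shape on every call would pay the compilation cost every time, which is why stable shapes matter.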