Ayush Garg

          Attention is All You Need


          Apr 19, 2025

          Paper Link: https://arxiv.org/pdf/1706.03762

          Concepts

          • Transformer Encoder
          • Transformer Decoder
          • Scaled Dot-Product Attention
          • Multi-Head Attention
          • Self Attention
          • Sinusoidal Positional Encoding
          • Layer Normalization


          Created by Ayush Garg using Quartz, © 2026
