
          Self Attention

          Mar 29, 2025, 1 min read

          Self Attention is an application of Scaled Dot-Product Attention

Self Attention is when you apply Scaled Dot-Product Attention to a single sequence, using the same input for Q, K, and V.

Self attention lets each word “look at” other words to decide what’s important, using Scaled Dot-Product Attention. So:

$$\text{SelfAttention}(X) = \text{Attention}(Q = XW_Q,\ K = XW_K,\ V = XW_V)$$

$W_Q$, $W_K$, $W_V$ are learned weight matrices.
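
A minimal NumPy sketch of this, assuming a single attention head and unbatched 2-D inputs; the shapes and the scaled dot-product helper here are illustrative, not taken from the note:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # (seq, seq) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                         # weighted sum of value vectors

def self_attention(X, W_Q, W_K, W_V):
    # Same input X is projected into Q, K, and V by the learned weight matrices.
    return scaled_dot_product_attention(X @ W_Q, X @ W_K, X @ W_V)

# Illustrative example: 4 tokens, model dimension 8.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_Q, W_K, W_V = (rng.standard_normal((8, 8)) for _ in range(3))
out = self_attention(X, W_Q, W_K, W_V)
print(out.shape)  # (4, 8): one context-aware vector per token
```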

