
Self Attention

Mar 29, 2025, 1 min read

Self Attention is an application of Scaled Dot-Product Attention.

Self Attention is when you apply Scaled Dot-Product Attention to a single sequence, using the same input for Q, K, and V.

Self attention lets each word "look at" other words to decide what's important, using Scaled Dot-Product Attention. So:

$$\text{SelfAttention}(X) = \text{Attention}(Q = XW_Q,\ K = XW_K,\ V = XW_V)$$

$W_Q$, $W_K$, $W_V$ are learned weight matrices.
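
For concreteness, here is a minimal NumPy sketch of the formula above. The function names, the toy shapes, and the random weights are assumptions for illustration, not part of the original note.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)         # row-wise softmax
    return weights @ V

def self_attention(X, W_Q, W_K, W_V):
    # Same input X is projected to Q, K, and V by the learned weight matrices.
    return scaled_dot_product_attention(X @ W_Q, X @ W_K, X @ W_V)

# Toy example: 4 tokens, model dimension 8 (shapes chosen arbitrarily)
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W_Q, W_K, W_V = (rng.standard_normal((8, 8)) for _ in range(3))
print(self_attention(X, W_Q, W_K, W_V).shape)  # -> (4, 8)
```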
