Swin Transformer paper explained, visualized, and animated by Ms. Coffee Bean. Find out what the Swin Transformer proposes to do better than the ViT vision transformer.
📺 ViT explained: [ Link ]
📺 Transformer explained: [ Link ]
📺► Positional embeddings (playlist): [ Link ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Thanks to our Patrons who support us in Tiers 2, 3, and 4: 🙏
donor, Dres. Trost GbR, Yannik Schneider
➡️ AI Coffee Break Merch! 🛍️ [ Link ]
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: [ Link ]
Ko-fi: [ Link ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
Paper discussed:
📜 Liu, Ze, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. "Swin Transformer: Hierarchical vision transformer using shifted windows." arXiv preprint arXiv:2103.14030 (2021). [ Link ]
💻 Swin Transformer code on GitHub: [ Link ]
Outline:
00:00 Problems with ViT / Swin Motivation
04:16 Swin Transformer explained
06:00 Shifted Window based Self-attention
08:58 Positional embeddings in the Swin Transformer
09:29 Task performance of the Swin Transformer
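The shifted-window self-attention covered at 06:00 boils down to two steps: self-attention is computed only inside fixed-size, non-overlapping windows, and every other block cyclically shifts the feature map by half a window so information can flow across window borders. Below is a minimal NumPy sketch of just the windowing and shifting (not the attention itself, and not the paper's actual implementation); the function name `window_partition` and the toy 8×8 feature map are our own illustrative choices.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    # Reorder so each window is contiguous: -> (num_windows, ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)

# Toy 8x8 feature map with 1 channel; window size 4 gives 2x2 = 4 windows.
feat = np.arange(8 * 8).reshape(8, 8, 1).astype(float)

windows = window_partition(feat, 4)  # regular (unshifted) windows
# Cyclic shift by ws // 2 = 2 along height and width, as in shifted blocks.
shifted = np.roll(feat, shift=(-2, -2), axis=(0, 1))
shifted_windows = window_partition(shifted, 4)

print(windows.shape)  # (4, 4, 4, 1): 4 windows of 4x4 patches
```

In the real model, self-attention runs independently within each of these windows (linear cost in image size instead of quadratic), and alternating the shifted and unshifted partitioning lets neighboring windows exchange information across layers.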
Music 🎵 : Bay Street Millionaires by Squadda B
---------------------
🔗 Links:
AICoffeeBreakQuiz: [ Link ]
Twitter: [ Link ]
Reddit: [ Link ]
YouTube: [ Link ]
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Video and thumbnail contain emojis designed by OpenMoji – the open-source emoji and icon project. License: CC BY-SA 4.0
Swin Transformer paper animated and explained
Tags:
swin transformer, Swinformer, ViT, hierarchical vision transformer, Microsoft research, research paper explained, 16x16 pixels, image patch, image vector, neural network, AI, artificial intelligence, machine learning, visualized, visualizations, deep learning, easy, beginner, explained, basics, comprehensible, research, computer science, women in ai, algorithm, short, example, machine learning research, aicoffeebean, animated, animation, transformers, letitia parcalabescu, aicoffee, coffeebean