I will cover the Vision Transformer in three parts. This first part focuses on patch embedding in the Vision Transformer.
I will go over all the details and explain everything happening inside the patch embedding of ViT.
I will also go over what an implementation of patch embedding for the Vision Transformer in PyTorch looks like.
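As a preview of the implementation walkthrough, here is a minimal sketch of a ViT patch embedding module in PyTorch. The class name, hyperparameters (img_size, patch_size, embed_dim), and the strided-convolution trick are assumptions for illustration, not the exact code from the video:

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Hypothetical sketch: image -> sequence of patch embeddings + CLS + positions."""
    def __init__(self, img_size=224, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A conv with kernel == stride == patch_size splits the image into
        # non-overlapping patches and projects each to embed_dim in one step.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        # Learnable CLS token prepended to the patch sequence.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # Learnable positional embeddings for CLS token + all patches.
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, embed_dim))

    def forward(self, x):
        batch = x.shape[0]
        x = self.proj(x)                   # (B, embed_dim, H/ps, W/ps)
        x = x.flatten(2).transpose(1, 2)   # (B, num_patches, embed_dim)
        cls = self.cls_token.expand(batch, -1, -1)
        x = torch.cat([cls, x], dim=1)     # prepend CLS token
        return x + self.pos_embed          # add positional information

emb = PatchEmbedding()
out = emb(torch.randn(2, 3, 224, 224))
print(out.shape)  # (2, 197, 768): 14*14 patches + 1 CLS token
```

For a 224x224 image with 16x16 patches, this yields 196 patch tokens plus the CLS token, each of dimension 768.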
The second part, which goes through attention, can be found here -
Attention in Vision Transformer (Part Two) - [ Link ]
The third part, which builds the entire transformer and shows how to visualize attention maps and positional embeddings, can be found below -
Implementing Vision Transformer (Part Three) - [ Link ]
*Timestamps* :
00:00 Intro
00:56 Need for Patch Embedding in Vision Transformer
01:30 Converting Image into Sequence of Patches
01:59 Patch Embedding Projection
02:45 Positional Information for Patches
03:40 CLS Token
04:10 Patch Embedding Responsibilities
04:40 Patch Embedding Module Implementation
08:02 Outro
*Paper Link* - [ Link ]
Implementation will be pushed here after all three videos are out - [ Link ]
*Subscribe* - [ Link ]
Background Track - Fruits of Life by Jimena Contreras
Email - explainingai.official@gmail.com
PATCH EMBEDDING | Vision Transformers explained