Masked Image Modelling
Self-supervised learning pipeline that pre-trains a Vision Transformer (ViT) to reconstruct partially masked images, then fine-tunes it on the Oxford-IIIT Pet dataset for segmentation.
- Focus: generative pre-training, segmentation, ViT fine-tuning
- Stack: PyTorch, Hugging Face
