Masked Image Modelling

A self-supervised learning pipeline that pre-trains a Vision Transformer (ViT) to reconstruct partially masked images, then fine-tunes it on the Oxford-IIIT Pet dataset for segmentation.

  • Focus: generative pre-training, segmentation, ViT fine-tuning
  • Stack: PyTorch, Hugging Face
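
The masking step behind the pre-training objective can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `random_patch_mask` and `masked_mse`, the 16-pixel patch size, the 75% mask ratio, and the choice to zero out hidden patches and score only masked pixels are all assumptions made for the example.

```python
import torch


def random_patch_mask(images, patch_size=16, mask_ratio=0.75, generator=None):
    """Zero out a random subset of non-overlapping patches in each image.

    images: float tensor of shape (B, C, H, W); H and W must be divisible
    by patch_size. Returns (masked_images, mask), where mask has shape
    (B, H // patch_size, W // patch_size) and True marks a hidden patch.
    All names and defaults here are hypothetical.
    """
    b, _, h, w = images.shape
    if h % patch_size or w % patch_size:
        raise ValueError("H and W must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    num_patches = gh * gw
    num_masked = int(mask_ratio * num_patches)

    # Rank patches by random noise per image; the lowest-ranked are masked.
    noise = torch.rand(b, num_patches, generator=generator)
    ranks = noise.argsort(dim=1).argsort(dim=1)
    mask = (ranks < num_masked).view(b, gh, gw)

    # Expand the patch-level mask to pixel resolution and hide those pixels.
    pixel_mask = mask.repeat_interleave(patch_size, dim=1)
    pixel_mask = pixel_mask.repeat_interleave(patch_size, dim=2)
    masked = images.masked_fill(pixel_mask.unsqueeze(1), 0.0)
    return masked, mask


def masked_mse(pred, target, mask, patch_size=16):
    """Mean squared error averaged over the masked pixels only."""
    pixel_mask = mask.repeat_interleave(patch_size, dim=1)
    pixel_mask = pixel_mask.repeat_interleave(patch_size, dim=2)
    pixel_mask = pixel_mask.unsqueeze(1).expand_as(target)
    return ((pred - target) ** 2)[pixel_mask].mean()
```

In a training loop, the masked images would be fed to the ViT and `masked_mse` applied between its output and the original images, so the model is only rewarded for filling in what it could not see.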

Figures: Reconstructions, Segmentation Results

View code on GitHub