Paper Reading: Self-supervised Vision Transformers (Foundation Model)

Event description

DINO is a self-supervised learning approach for computer vision. It trains vision transformers (ViTs) without any labeled data. DINOv2 builds on its predecessor with improvements to performance and scalability. The paper reports features that match or surpass the best available general-purpose features on most of its image-level and pixel-level benchmarks. Notably, DINOv2 performs strongly on downstream tasks such as image classification and object detection.

We will walk through:

  • Basics of computer vision
  • Self-supervised learning
  • ViT (vision transformers)
  • DINO
  • DINOv2 model

Paper link - https://arxiv.org/abs/2304.07193

More details to be shared soon.