Logical Scribbles


[Study] New Study List (2023.11.22)

KimJake 2023. 11. 22. 19:24

Attention Is All You Need
https://arxiv.org/abs/1706.03762


Transformer
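
Since the first two items are about the original Transformer, here is a minimal sketch of scaled dot-product attention, the operation at the core of "Attention Is All You Need". The function name and tensor shapes are my own illustrative choices, not from the paper.

import math
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k); toy single-head self-attention
    d_k = q.size(-1)
    # compare every query with every key, scaled by sqrt(d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq, seq)
    weights = torch.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                                 # weighted sum of values

q = k = v = torch.randn(2, 5, 64)                      # toy input
out = scaled_dot_product_attention(q, k, v)            # (2, 5, 64)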


ViT (Vision Transformer)

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
https://arxiv.org/abs/2010.11929

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place.
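
As a reading aid, here is a minimal sketch of the idea in the title: cut the image into 16x16 patches, embed each patch as a token, and run a plain Transformer encoder over the sequence. The layer sizes are illustrative, not the paper's actual configuration.

import torch
import torch.nn as nn

patch, dim = 16, 192
to_patches = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # patchify + project
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
cls_token = nn.Parameter(torch.zeros(1, 1, dim))
pos_embed = nn.Parameter(torch.zeros(1, (224 // patch) ** 2 + 1, dim))

x = torch.randn(1, 3, 224, 224)                    # one RGB image
tokens = to_patches(x).flatten(2).transpose(1, 2)  # (1, 196, dim) patch tokens
tokens = torch.cat([cls_token, tokens], dim=1) + pos_embed
feats = encoder(tokens)                            # (1, 197, dim)
cls_out = feats[:, 0]                              # class token -> classifier head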

End-to-End Object Detection with Transformers
https://arxiv.org/abs/2005.12872

We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor generation that explicitly encode our prior knowledge about the task.
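
The "direct set prediction" phrase is the key point, so here is a minimal sketch of the matching step, assuming a toy L1 box cost only (the paper's matching cost also includes class and generalized IoU terms). Query counts and shapes are illustrative.

import torch
from scipy.optimize import linear_sum_assignment

pred_boxes = torch.rand(100, 4)   # 100 learned queries -> 100 box predictions
gt_boxes = torch.rand(3, 4)       # 3 ground-truth boxes in this image

cost = torch.cdist(pred_boxes, gt_boxes, p=1)         # (100, 3) pairwise L1 cost
pred_idx, gt_idx = linear_sum_assignment(cost.numpy())  # Hungarian matching
# Each ground-truth box is paired with exactly one query; the unmatched
# queries are trained to predict the "no object" class, so no NMS is needed.
matched = pred_boxes[torch.as_tensor(pred_idx)]
loss = (matched - gt_boxes[torch.as_tensor(gt_idx)]).abs().sum()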

Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
https://arxiv.org/abs/2206.02777

In this paper we present Mask DINO, a unified object detection and segmentation framework. Mask DINO extends DINO (DETR with Improved Denoising Anchor Boxes) by adding a mask prediction branch which supports all image segmentation tasks (instance, panoptic and semantic).
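
A minimal sketch of what "adding a mask prediction branch" can look like, assuming the common query-based design: dot-product each query embedding with a per-pixel feature map to get one mask per query. All shapes here are illustrative, not the paper's exact head.

import torch

queries = torch.randn(100, 256)          # query embeddings from the decoder
pixel_feats = torch.randn(256, 64, 64)   # per-pixel embedding map (C, H, W)

# (100, 256) x (256, H*W) -> one (H, W) mask logit map per query
masks = (queries @ pixel_feats.flatten(1)).view(100, 64, 64)
probs = masks.sigmoid()                  # per-query binary mask probabilities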

Visual Prompt Tuning
https://arxiv.org/abs/2203.12119

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision.
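
A minimal sketch of the prompt-tuning idea, assuming a small stand-in encoder rather than the paper's exact backbone: the pretrained weights stay frozen, and only a few prepended prompt tokens plus a task head are trainable.

import torch
import torch.nn as nn

dim, n_prompts = 192, 8
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
for p in backbone.parameters():
    p.requires_grad = False              # the full backbone stays frozen

prompts = nn.Parameter(torch.zeros(1, n_prompts, dim))  # the only new tokens
head = nn.Linear(dim, 10)                                # plus a task head

patch_tokens = torch.randn(4, 196, dim)                  # embedded image patches
x = torch.cat([prompts.expand(4, -1, -1), patch_tokens], dim=1)
logits = head(backbone(x).mean(dim=1))   # gradients reach only prompts + head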

 

+ Re-read the YOLO papers
