Logical Scribbles
[Study] New Study List (2023.11.22)
Attention is all you need
Transformer
ViT (Vision Transformer)
https://arxiv.org/abs/2010.11929
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place.
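The core idea here is treating an image as a sequence of 16x16 patch tokens fed to a standard Transformer encoder. Below is a minimal sketch of that patch-embedding step in PyTorch, using the common ViT-Base sizes; the class and variable names are my own, not the paper's code.

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Split an image into fixed-size patches and project each patch to an embedding.

    A minimal sketch of the ViT front end; sizes follow the common ViT-Base
    configuration (224x224 input, 16x16 patches, 768-dim embeddings).
    """
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening non-overlapping
        # patches and applying a shared linear projection to each one.
        self.proj = nn.Conv2d(in_chans, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, embed_dim))

    def forward(self, x):
        B = x.shape[0]
        x = self.proj(x)                    # (B, D, H/16, W/16)
        x = x.flatten(2).transpose(1, 2)    # (B, N, D) sequence of patch tokens
        cls = self.cls_token.expand(B, -1, -1)
        x = torch.cat([cls, x], dim=1)      # prepend the [class] token
        return x + self.pos_embed           # add learned position embeddings

tokens = PatchEmbedding()(torch.randn(2, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 197, 768]) -> ready for a standard Transformer encoder
```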
https://arxiv.org/abs/2005.12872
End-to-End Object Detection with Transformers
We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor generation that explicitly encode our prior knowledge about the task.
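The "direct set prediction" view rests on one-to-one bipartite matching between a fixed set of predictions and the ground-truth boxes, which is what lets DETR drop NMS and anchor generation. A rough sketch of that matching step, with the cost simplified to class probability plus L1 box distance; the function and names are illustrative, not the authors' implementation.

```python
import torch
from scipy.optimize import linear_sum_assignment

def match_predictions(pred_logits, pred_boxes, gt_labels, gt_boxes):
    """One-to-one matching of predictions to ground truth for a single image.

    pred_logits: (num_queries, num_classes), pred_boxes: (num_queries, 4)
    gt_labels:   (num_gt,),                  gt_boxes:   (num_gt, 4)
    Returns index pairs (pred_idx, gt_idx) minimizing a simplified cost:
    negative class probability plus L1 box distance.
    """
    prob = pred_logits.softmax(-1)                       # (Q, C)
    cost_class = -prob[:, gt_labels]                     # (Q, G)
    cost_bbox = torch.cdist(pred_boxes, gt_boxes, p=1)   # (Q, G)
    cost = (cost_class + cost_bbox).detach().numpy()
    pred_idx, gt_idx = linear_sum_assignment(cost)       # Hungarian algorithm
    return pred_idx, gt_idx

# Toy example: 5 object queries, 2 ground-truth objects.
pred_idx, gt_idx = match_predictions(
    torch.randn(5, 92), torch.rand(5, 4),
    torch.tensor([3, 17]), torch.rand(2, 4))
print(list(zip(pred_idx, gt_idx)))
```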
https://arxiv.org/abs/2206.02777
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
In this paper we present Mask DINO, a unified object detection and segmentation framework. Mask DINO extends DINO (DETR with Improved Denoising Anchor Boxes) by adding a mask prediction branch which supports all image segmentation tasks (instance, panoptic, and semantic).
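As I understand it, the added mask branch follows the usual query-based recipe: each decoder query embedding is dot-multiplied with a high-resolution per-pixel embedding map to produce one mask per query. A tiny sketch of that mechanism; shapes and names are my own assumptions, not the paper's code.

```python
import torch

def predict_masks(query_embed, pixel_embed):
    """Sketch of a query-based mask branch: each decoder query embedding is
    dot-multiplied with a per-pixel embedding map, yielding one mask logit
    map per query (the general query-based segmentation mechanism).

    query_embed: (B, Q, D)     decoder output embeddings, one per query
    pixel_embed: (B, D, H, W)  high-resolution per-pixel embedding map
    returns:     (B, Q, H, W)  mask logits
    """
    return torch.einsum('bqd,bdhw->bqhw', query_embed, pixel_embed)

masks = predict_masks(torch.randn(2, 100, 256), torch.randn(2, 256, 64, 64))
print(masks.shape)  # torch.Size([2, 100, 64, 64])
```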
https://arxiv.org/abs/2203.12119
Visual Prompt Tuning
The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning for large-scale Transformer models in vision.
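VPT adapts a frozen Transformer by prepending a handful of learnable prompt tokens to the input token sequence and training only the prompts and a task head. A minimal sketch of the shallow variant, assuming a generic ViT-style encoder that maps (B, N, D) token sequences to (B, N, D); the interface and names here are placeholders, not the paper's code.

```python
import torch
import torch.nn as nn

class ShallowVPT(nn.Module):
    """Insert learnable prompt tokens into a frozen encoder's input sequence.

    `encoder` is assumed to map (B, N, D) token sequences to (B, N, D);
    only the prompts and the classification head receive gradients.
    """
    def __init__(self, encoder, embed_dim=768, num_prompts=10, num_classes=100):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():   # freeze the pre-trained backbone
            p.requires_grad = False
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, tokens):                # tokens: (B, 1 + N, D), [class] token first
        B = tokens.shape[0]
        prompts = self.prompts.expand(B, -1, -1)
        # Insert prompts between the [class] token and the patch tokens.
        x = torch.cat([tokens[:, :1], prompts, tokens[:, 1:]], dim=1)
        x = self.encoder(x)
        return self.head(x[:, 0])             # classify from the [class] token

# Toy usage with a stand-in encoder (a single Transformer layer).
encoder = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
model = ShallowVPT(encoder)
logits = model(torch.randn(2, 197, 768))
print(logits.shape)  # torch.Size([2, 100])
```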
+ Re-read the YOLO papers