Training-free framework that converts SAM3 into a real-time, multi-class, open-vocabulary detector. Achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (4 classes, 1008px) on a single RTX 4080.
Abstract: YOLOv10, known for its efficiency among object detection methods, detects objects in images quickly and accurately. However, when detecting small objects in remote sensing imagery, traditional ...
Abstract: Traditional real-time object detection networks deployed in autonomous aerial vehicles (AAVs) struggle to extract features from small objects in complex backgrounds with occlusions and ...