Training-free framework that converts SAM3 into a real-time multi-class open-vocabulary detector. Achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (measured with 4 classes at 1008 px) on a single RTX 4080.
HOI-DETR is a transformer-based framework for detecting hands, hand-held objects, and their interactions in images and video. Built on the Co-DETR architecture, it adds a lightweight interaction ...
This important work introduces an integrated open-source platform for behavioral acquisition and pose estimation that substantially improves the accessibility and speed of real-time animal tracking ...
Abstract: Referring camouflaged object detection (Ref-COD) is a recently proposed task, aiming to segment specified camouflaged objects by leveraging visual reference, i.e., a small set of referring ...
Abstract: Tiny-object detection is increasingly crucial in fields such as remote sensing, traffic monitoring, and robotics. Inspired by human visual perception, the attention mechanism has become a ...