Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-07-23 | Perspective-Invariant 3D Object Detection | Ao Liang et.al. | 2507.17665 | null |
2025-07-23 | Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning | Xinyao Liu et.al. | 2507.17539 | null |
2025-07-23 | Illicit object detection in X-ray imaging using deep learning techniques: A comparative evaluation | Jorgen Cani et.al. | 2507.17508 | null |
2025-07-23 | Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection | Yehao Lu et.al. | 2507.17436 | null |
2025-07-23 | SFUOD: Source-Free Unknown Object Detection | Keon-Hee Park et.al. | 2507.17373 | null |
2025-07-23 | Optimizing Delivery Logistics: Enhancing Speed and Safety with Drone Technology | Maharshi Shastri et.al. | 2507.17253 | null |
2025-07-23 | A Low-Cost Machine Learning Approach for Timber Diameter Estimation | Fatemeh Hasanzadeh Fard et.al. | 2507.17219 | null |
2025-07-22 | Few-Shot Learning in Video and 3D Object Detection: A Survey | Md Meftahul Ferdaus et.al. | 2507.17079 | null |
2025-07-22 | Transformer Based Building Boundary Reconstruction using Attraction Field Maps | Muhammad Kamran et.al. | 2507.17038 | null |
2025-07-22 | Task-Specific Zero-shot Quantization-Aware Training for Object Detection | Changhao Li et.al. | 2507.16782 | null |
2025-07-22 | Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation | Viktor Muryn et.al. | 2507.16704 | null |
2025-07-22 | Benchmarking pig detection and tracking under diverse and challenging conditions | Jonathan Henrich et.al. | 2507.16639 | null |
2025-07-22 | A2Mamba: Attention-augmented State Space Models for Visual Recognition | Meng Lou et.al. | 2507.16624 | null |
2025-07-22 | PlantSAM: An Object Detection-Driven Segmentation Pipeline for Herbarium Specimens | Youcef Sklab et.al. | 2507.16506 | null |
2025-07-22 | Towards Railway Domain Adaptation for LiDAR-based 3D Detection: Road-to-Rail and Sim-to-Real via SynDRA-BBox | Xavier Diaz et.al. | 2507.16413 | null |
2025-07-22 | MAN++: Scaling Momentum Auxiliary Network for Supervised Local Learning in Vision Tasks | Junhao Su et.al. | 2507.16279 | null |
2025-07-22 | Edge-case Synthesis for Fisheye Object Detection: A Data-centric Perspective | Seunghyeon Kim et.al. | 2507.16254 | null |
2025-07-22 | LDRFusion: A LiDAR-Dominant multimodal refinement framework for 3D object detection | Jijun Wang et.al. | 2507.16224 | null |
2025-07-22 | Design and Implementation of a Lightweight Object Detection System for Resource-Constrained Edge Environments | Jiyue Jiang et.al. | 2507.16155 | null |
2025-07-21 | Experimenting active and sequential learning in a medieval music manuscript | Sachin Sharma et.al. | 2507.15633 | null |
2025-07-21 | Few-Shot Object Detection via Spatial-Channel State Space Model | Zhimeng Xin et.al. | 2507.15308 | null |
2025-07-20 | Event-based Graph Representation with Spatial and Motion Vectors for Asynchronous Object Detection | Aayush Atul Verma et.al. | 2507.15150 | null |
2025-07-20 | BleedOrigin: Dynamic Bleeding Source Localization in Endoscopic Submucosal Dissection via Dual-Stage Detection and Tracking | Mengya Xu et.al. | 2507.15094 | null |
2025-07-20 | InsightX Agent: An LMM-based Agentic Framework with Integrated Tools for Reliable X-ray NDT Analysis | Jiale Liu et.al. | 2507.14899 | null |
2025-07-20 | An Uncertainty-aware DETR Enhancement Framework for Object Detection | Xingshu Chen et.al. | 2507.14855 | null |
2025-07-19 | Multispectral State-Space Feature Fusion: Bridging Shared and Cross-Parametric Interactions for Object Detection | Jifeng Shen et.al. | 2507.14643 | null |
2025-07-18 | C-DOG: Training-Free Multi-View Multi-Object Association in Dense Scenes Without Visual Feature via Connected δ-Overlap Graphs | Yung-Hong Sun et.al. | 2507.14095 | null |
2025-07-18 | Enhancing LiDAR Point Features with Foundation Model Priors for 3D Object Detection | Yujian Mo et.al. | 2507.13899 | null |
2025-07-18 | Moving Object Detection from Moving Camera Using Focus of Expansion Likelihood and Segmentation | Masahiro Ogawa et.al. | 2507.13628 | null |
2025-07-17 | NSF-DOE Vera C. Rubin Observatory Observations of Interstellar Comet 3I/ATLAS (C/2025 N1) | Colin Orion Chandler et.al. | 2507.13409 | null |
2025-07-17 | A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains | Antonio Finocchiaro et.al. | 2507.13326 | null |
2025-07-17 | RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images | Xiaozheng Jiang et.al. | 2507.13120 | null |
2025-07-17 | Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection | Riku Inoue et.al. | 2507.13085 | null |
2025-07-17 | Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis | Saswat Priyadarshi Nayak et.al. | 2507.13073 | null |
2025-07-17 | SOD-YOLO: Enhancing YOLO-Based Detection of Small Objects in UAV Imagery | Peijun Wang et.al. | 2507.12727 | null |
2025-07-16 | Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios | Van-Hoang-Anh Phan et.al. | 2507.12449 | null |
2025-07-16 | InterpIoU: Rethinking Bounding Box Regression with Interpolation-Based IoU Optimization | Haoyuan Liu et.al. | 2507.12420 | null |
2025-07-16 | AutoVDC: Automated Vision Data Cleaning Using Vision-Language Models | Santosh Vasa et.al. | 2507.12414 | null |
2025-07-18 | OD-VIRAT: A Large-Scale Benchmark for Object Detection in Realistic Surveillance Environments | Hayat Ullah et.al. | 2507.12396 | null |
2025-07-16 | Improving Lightweight Weed Detection via Knowledge Distillation | Ahmet Oğuz Saltık et.al. | 2507.12344 | null |
2025-07-16 | SS-DC: Spatial-Spectral Decoupling and Coupling Across Visible-Infrared Gap for Domain Adaptive Object Detection | Xiwei Zhang et.al. | 2507.12017 | null |
2025-07-16 | Frequency-Dynamic Attention Modulation for Dense Prediction | Linwei Chen et.al. | 2507.12006 | null |
2025-07-15 | Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping | Yujie Zhang et.al. | 2507.11279 | null |
2025-07-15 | Using Continual Learning for Real-Time Detection of Vulnerable Road Users in Complex Traffic Scenarios | Faryal Aurooj Nasir et.al. | 2507.11046 | null |
2025-07-15 | Combining Transformers and CNNs for Efficient Object Detection in High-Resolution Satellite Imagery | Nicolas Drapier et.al. | 2507.11040 | null |
2025-07-14 | A Lightweight and Robust Framework for Real-Time Colorectal Polyp Detection Using LOF-Based Preprocessing and YOLO-v11n | Saadat Behzadi et.al. | 2507.10864 | null |
2025-07-14 | LLM-Guided Agentic Object Detection for Open-World Understanding | Furkan Mumcu et.al. | 2507.10844 | null |
2025-07-14 | Versatile and Generalizable Manipulation via Goal-Conditioned Reinforcement Learning with Grounded Object Detection | Huiyi Wang et.al. | 2507.10814 | null |
2025-07-14 | Fine-Grained Zero-Shot Object Detection | Hongxu Ma et.al. | 2507.10358 | null |
2025-07-14 | BlueGlass: A Framework for Composite AI Safety | Harshal Nandigramwar et.al. | 2507.10106 | null |
2025-07-14 | SRG/ART-XC All-Sky X-ray Survey: Sensitivity Assessment Based on Aperture Photometry | N. Y. Tyrin et.al. | 2507.10060 | null |
2025-07-14 | 3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving | Yixun Zhang et.al. | 2507.09993 | null |
2025-07-14 | Measuring the Impact of Rotation Equivariance on Aerial Object Detection | Xiuyu Wu et.al. | 2507.09896 | null |
2025-07-13 | MLoRQ: Bridging Low-Rank and Quantization for Transformer Compression | Ofir Gordon et.al. | 2507.09616 | null |
2025-07-12 | Stereo-based 3D Anomaly Object Detection for Autonomous Driving: A New Dataset and Baseline | Shiyi Mu et.al. | 2507.09214 | null |
2025-07-12 | On the Fragility of Multimodal Perception to Temporal Misalignment in Autonomous Driving | Md Hasan Shahriar et.al. | 2507.09095 | null |
2025-07-11 | VISTA: A Visual Analytics Framework to Enhance Foundation Model-Generated Data Labels | Xiwei Xuan et.al. | 2507.09008 | null |
2025-07-11 | RoundaboutHD: High-Resolution Real-World Urban Environment Benchmark for Multi-Camera Vehicle Tracking | Yuqiang Lin et.al. | 2507.08729 | null |
2025-07-11 | DatasetAgent: A Novel Multi-Agent System for Auto-Constructing Datasets from Real-World Images | Haoran Sun et.al. | 2507.08648 | null |
2025-07-11 | OnlineBEV: Recurrent Temporal Fusion in Bird's Eye View Representations for Multi-Camera 3D Perception | Junho Koh et.al. | 2507.08644 | null |
2025-07-11 | Smelly, dense, and spreaded: The Object Detection for Olfactory References (ODOR) dataset | Mathias Zinnen et.al. | 2507.08384 | null |
2025-07-11 | Spectroscopic Observations of Four Candidates for Blue Large-Amplitude Pulsators. No BLAPs at High Galactic Latitudes | P. Pietrukowicz et.al. | 2507.08372 | null |
2025-07-11 | Understanding Driving Risks using Large Language Models: Toward Elderly Driver Assessment | Yuki Yoshihara et.al. | 2507.08367 | null |
2025-07-10 | An Embedded Real-time Object Alert System for Visually Impaired: A Monocular Depth Estimation based Approach through Computer Vision | Jareen Anjom et.al. | 2507.08165 | null |
2025-07-10 | Rainbow Artifacts from Electromagnetic Signal Injection Attacks on Image Sensors | Youqian Zhang et.al. | 2507.07773 | null |
2025-07-09 | Automated Video Segmentation Machine Learning Pipeline | Johannes Merz et.al. | 2507.07242 | null |
2025-07-09 | Aerial Maritime Vessel Detection and Identification | Antonella Barisic Kulas et.al. | 2507.07153 | null |
2025-07-09 | DenoiseCP-Net: Efficient Collective Perception in Adverse Weather via Joint LiDAR-Based 3D Object Detection and Denoising | Sven Teufel et.al. | 2507.06976 | null |
2025-07-09 | A multi-modal dataset for insect biodiversity with imagery and DNA at the trap and individual level | Johanna Orsholm et.al. | 2507.06972 | null |
2025-07-09 | Dataset and Benchmark for Enhancing Critical Retained Foreign Object Detection | Yuli Wang et.al. | 2507.06937 | null |
2025-07-09 | Unlocking Thermal Aerial Imaging: Synthetic Enhancement of UAV Datasets | Antonella Barisic Kulas et.al. | 2507.06797 | null |
2025-07-09 | LOVON: Legged Open-Vocabulary Object Navigator | Daojie Peng et.al. | 2507.06747 | null |
2025-07-09 | EA: An Event Autoencoder for High-Speed Vision Sensing | Riadul Islam et.al. | 2507.06459 | null |
2025-07-08 | Hierarchical Multi-Stage Transformer Architecture for Context-Aware Temporal Action Localization | Hayat Ullah et.al. | 2507.06411 | null |
2025-07-08 | ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge | Daghash K. Alqahtani et.al. | 2507.06011 | null |
2025-07-08 | R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding | Joonhyung Park et.al. | 2507.05673 | null |
2025-07-07 | From a Different Star: 3I/ATLAS in the context of the Ōtautahi-Oxford interstellar object population model | Matthew J. Hopkins et.al. | 2507.05318 | null |
2025-07-07 | Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations | Xiang Xu et.al. | 2507.05260 | null |
2025-07-07 | LERa: Replanning with Visual Feedback in Instruction Following | Svyatoslav Pchelintsev et.al. | 2507.05135 | null |
2025-07-07 | CVFusion: Cross-View Fusion of 4D Radar and Camera for 3D Object Detection | Hanzhi Zhong et.al. | 2507.04587 | null |
2025-07-06 | MambaFusion: Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection | Hanshi Wang et.al. | 2507.04369 | null |
2025-07-06 | DMAT: An End-to-End Framework for Joint Atmospheric Turbulence Mitigation and Object Detection | Paul Hill et.al. | 2507.04323 | null |
2025-07-06 | ZERO: Multi-modal Prompt-based Visual Grounding | Sangbum Choi et.al. | 2507.04270 | null |
2025-07-05 | Towards Accurate and Efficient 3D Object Detection for Autonomous Driving: A Mixture of Experts Computing System on Edge | Linshen Liu et.al. | 2507.04123 | null |
2025-07-04 | Zero Memory Overhead Approach for Protecting Vision Transformer Parameters | Fereshteh Baradaran et.al. | 2507.03816 | null |
2025-07-04 | 2.5D Object Detection for Intelligent Roadside Infrastructure | Nikolai Polley et.al. | 2507.03564 | null |
2025-07-04 | Enhancing Uncertainty Quantification for Runtime Safety Assurance Using Causal Risk Analysis and Operational Design Domain | Radouane Bouchekir et.al. | 2507.03515 | null |
2025-07-03 | Partial Weakly-Supervised Oriented Object Detection | Mingxin Liu et.al. | 2507.02751 | null |
2025-07-03 | Automatic Labelling for Low-Light Pedestrian Detection | Dimitrios Bouzoulas et.al. | 2507.02513 | null |
2025-07-03 | Weakly-supervised Contrastive Learning with Quantity Prompts for Moving Infrared Small Target Detection | Weiwei Duan et.al. | 2507.02454 | null |
2025-07-03 | A Late Collaborative Perception Framework for 3D Multi-Object and Multi-Source Association and Fusion | Maryem Fadili et.al. | 2507.02430 | null |
2025-07-03 | PLOT: Pseudo-Labeling via Video Object Tracking for Scalable Monocular 3D Object Detection | Seokyeong Lee et.al. | 2507.02393 | null |
2025-07-03 | Two-Steps Neural Networks for an Automated Cerebrovascular Landmark Detection | Rafic Nader et.al. | 2507.02349 | null |
2025-07-03 | Perception Activator: An intuitive and portable framework for brain cognitive exploration | Le Xu et.al. | 2507.02311 | null |
2025-07-03 | Understanding Trade offs When Conditioning Synthetic Data | Brandon Trabucco et.al. | 2507.02217 | null |
2025-07-02 | How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks | Rahul Ramachandran et.al. | 2507.01955 | null |
2025-07-02 | Learning from Random Subspace Exploration: Generalized Test-Time Augmentation with Self-supervised Distillation | Andrei Jelea et.al. | 2507.01347 | null |
2025-07-01 | Rapid Salient Object Detection with Difference Convolutional Neural Networks | Zhuo Su et.al. | 2507.01182 | null |
2025-07-01 | Robust Component Detection for Flexible Manufacturing: A Deep Learning Approach to Tray-Free Object Recognition under Variable Lighting | Fatemeh Sadat Daneshmand et.al. | 2507.00852 | null |
2025-07-01 | UAVD-Mamba: Deformable Token Fusion Vision Mamba for Multimodal UAV Detection | Wei Li et.al. | 2507.00849 | null |
2025-07-01 | High-Frequency Semantics and Geometric Priors for End-to-End Detection Transformers in Challenging UAV Imagery | Hongxing Peng et.al. | 2507.00825 | null |
2025-07-01 | Multi-Modal Graph Convolutional Network with Sinusoidal Encoding for Robust Human Action Segmentation | Hao Xing et.al. | 2507.00752 | null |
2025-07-01 | UPRE: Zero-Shot Domain Adaptation for Object Detection via Unified Prompt and Representation Enhancement | Xiao Zhang et.al. | 2507.00721 | null |
2025-07-01 | Rectifying Magnitude Neglect in Linear Attention | Qihang Fan et.al. | 2507.00698 | null |
2025-07-01 | De-Simplifying Pseudo Labels to Enhancing Domain Adaptive Object Detection | Zehua Fu et.al. | 2507.00608 | null |
2025-06-30 | Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios | Deng Li et.al. | 2506.24063 | null |
2025-06-30 | Visual Textualization for Image Prompted Object Detection | Yongjian Wu et.al. | 2506.23785 | null |
2025-06-30 | PBCAT: Patch-based composite adversarial training against physically realizable attacks on object detection | Xiao Li et.al. | 2506.23581 | null |
2025-06-30 | Event-based Tiny Object Detection: A Benchmark Dataset and Baseline | Nuo Chen et.al. | 2506.23575 | null |
2025-06-30 | OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving | Mingqian Ji et.al. | 2506.23565 | null |
2025-06-30 | From Sight to Insight: Unleashing Eye-Tracking in Weakly Supervised Video Salient Object Detection | Qi Qin et.al. | 2506.23519 | null |
2025-06-30 | Improve Underwater Object Detection through YOLOv12 Architecture and Physics-informed Augmentation | Tinh Nguyen et.al. | 2506.23505 | null |
2025-06-29 | Detecting What Matters: A Novel Approach for Out-of-Distribution 3D Object Detection in Autonomous Vehicles | Menna Taha et.al. | 2506.23426 | null |
2025-06-29 | Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement | Siyuan Chai et.al. | 2506.23353 | null |
2025-06-29 | GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields | Shunsuke Yasuki et.al. | 2506.23352 | null |
2025-06-27 | Attention-disentangled Uniform Orthogonal Feature Space Optimization for Few-shot Object Detection | Taijin Zhao et.al. | 2506.22161 | null |
2025-06-27 | Evaluating Pointing Gestures for Target Selection in Human-Robot Collaboration | Noora Sassali et.al. | 2506.22116 | null |
2025-06-27 | CERBERUS: Crack Evaluation & Recognition Benchmark for Engineering Reliability & Urban Stability | Justin Reinman et.al. | 2506.21909 | null |
2025-06-27 | Visual Content Detection in Educational Videos with Transfer Learning and Dataset Enrichment | Dipayan Biswas et.al. | 2506.21903 | null |
2025-06-27 | Embodied Domain Adaptation for Object Detection | Xiangyu Shi et.al. | 2506.21860 | null |
2025-06-26 | PhotonSplat: 3D Scene Reconstruction and Colorization from SPAD Sensors | Sai Sri Teja et.al. | 2506.21680 | null |
2025-06-26 | Towards Reliable Detection of Empty Space: Conditional Marked Point Processes for Object Detection | Tobias J. Riedlinger et.al. | 2506.21486 | null |
2025-06-26 | TITAN: Query-Token based Domain Adaptive Adversarial Learning | Tajamul Ashraf et.al. | 2506.21484 | null |
2025-06-26 | A Comprehensive Dataset for Underground Miner Detection in Diverse Scenario | Cyrus Addy et.al. | 2506.21451 | null |
2025-06-26 | DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic | Munish Monga et.al. | 2506.21260 | null |
2025-06-26 | LASFNet: A Lightweight Attention-Guided Self-Modulation Feature Fusion Network for Multimodal Object Detection | Lei Hao et.al. | 2506.21018 | null |
2025-06-26 | ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation | Shruti Bansal et.al. | 2506.20969 | null |
2025-06-25 | Lightweight Multi-Frame Integration for Robust YOLO Object Detection in Videos | Yitong Quan et.al. | 2506.20550 | null |
2025-06-25 | Learning-based safety lifting monitoring system for cranes on construction sites | Hao Chen et.al. | 2506.20475 | null |
2025-06-25 | Feature Hallucination for Self-supervised Action Recognition | Lei Wang et.al. | 2506.20342 | null |
2025-06-25 | From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents | Sergio Torres Aguilar et.al. | 2506.20326 | null |
2025-06-25 | TDiR: Transformer based Diffusion for Image Restoration Tasks | Abbas Anwar et.al. | 2506.20302 | null |
2025-06-25 | Integrated optomechanical ultrasonic sensors with nano-Pascal-level sensitivity | Xuening Cao et.al. | 2506.20219 | null |
2025-06-24 | A Survey of Multi-sensor Fusion Perception for Embodied AI: Background, Methods, Challenges and Prospects | Shulan Ruan et.al. | 2506.19769 | null |
2025-06-26 | Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance | Xuesong Li et.al. | 2506.19683 | null |
2025-06-24 | Probabilistic modelling and safety assurance of an agriculture robot providing light-treatment | Mustafa Adam et.al. | 2506.19620 | null |
2025-06-24 | USIS16K: High-Quality Dataset for Underwater Salient Instance Segmentation | Lin Hong et.al. | 2506.19472 | null |
2025-06-23 | SpaNN: Detecting Multiple Adversarial Patches on CNNs by Spanning Saliency Thresholds | Mauricio Byrd Victorica et.al. | 2506.18591 | null |
2025-06-23 | Improvement on LiDAR-Camera Calibration Using Square Targets | Zhongyuan Li et.al. | 2506.18294 | null |
2025-06-23 | Learning Approach to Efficient Vision-based Active Tracking of a Flying Target by an Unmanned Aerial Vehicle | Jagadeswara PKV Pothuri et.al. | 2506.18264 | null |
2025-06-23 | Ground tracking for improved landmine detection in a GPR system | Li Tang et.al. | 2506.18258 | null |
2025-06-24 | Referring Expression Instance Retrieval and A Strong End-to-End Baseline | Xiangzhao Hao et.al. | 2506.18246 | null |
2025-06-24 | Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages | Klaudia Ropel et.al. | 2506.18069 | null |
2025-06-21 | YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception | Mengqi Lei et.al. | 2506.17733 | null |
2025-06-21 | CSDN: A Context-Gated Self-Adaptive Detection Network for Real-Time Object Detection | Wei Haolin et.al. | 2506.17679 | null |
2025-06-21 | DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Mihir Godbole et.al. | 2506.17590 | null |
2025-06-20 | YASMOT: Yet another stereo image multi-object tracker | Ketil Malde et.al. | 2506.17186 | link |
2025-06-20 | Class Agnostic Instance-level Descriptor for Visual Instance Search | Qi-Ying Sun et.al. | 2506.16745 | null |
2025-06-20 | Cross-modal Offset-guided Dynamic Alignment and Fusion for Weakly Aligned UAV Object Detection | Liu Zongzhen et.al. | 2506.16737 | null |
2025-06-19 | How Hard Is Snow? A Paired Domain Adaptation Dataset for Clear and Snowy Weather: CADC+ | Mei Qi Tang et.al. | 2506.16531 | null |
2025-06-19 | Can AI Dream of Unseen Galaxies? Conditional Diffusion Model for Galaxy Morphology Augmentation | Chenrui Ma et.al. | 2506.16233 | null |
2025-06-19 | VideoGAN-based Trajectory Proposal for Automated Vehicles | Annajoyce Mariani et.al. | 2506.16209 | null |
2025-06-19 | BLADE: An Automated Framework for Classifying Light Curves from the Center for Near-Earth Object Studies (CNEOS) Fireball Database | Elizabeth A. Silber et.al. | 2506.16099 | null |
2025-06-19 | Polyline Path Masked Attention for Vision Transformer | Zhongchen Zhao et.al. | 2506.15940 | null |
2025-06-18 | BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion | Yuqing Lan et.al. | 2506.15610 | null |
2025-06-18 | Retrospective Memory for Camouflaged Object Detection | Chenxi Zhang et.al. | 2506.15244 | null |
2025-06-19 | Efficient Retail Video Annotation: A Robust Key Frame Generation Approach for Product and Customer Interaction Analysis | Varun Mannam et.al. | 2506.14854 | null |
2025-06-18 | YOLOv11-RGBT: Towards a Comprehensive Single-Stage Multispectral Object Detection Framework | Dahang Wan et.al. | 2506.14696 | null |
2025-06-17 | VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning | Md. Adnanul Islam et.al. | 2506.14629 | null |
2025-06-17 | GAMORA: A Gesture Articulated Meta Operative Robotic Arm for Hazardous Material Handling in Containment-Level Environments | Farha Abdul Wasay et.al. | 2506.14513 | null |
2025-06-17 | Comparison of Two Methods for Stationary Incident Detection Based on Background Image | Deepak Ghimire et.al. | 2506.14256 | null |
2025-06-16 | A Point Cloud Completion Approach for the Grasping of Partially Occluded Objects and Its Applications in Robotic Strawberry Harvesting | Ali Abouzeid et.al. | 2506.14066 | link |
2025-06-16 | FindMeIfYouCan: Bringing Open Set metrics to | Daniel Montoya et.al. | 2506.14008 | null |
2025-06-16 | How Real is CARLAs Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection | Kaiyuan Tan et.al. | 2506.13722 | null |
2025-06-17 | Lecture Video Visual Objects (LVVO) Dataset: A Benchmark for Visual Object Detection in Educational Videos | Dipayan Biswas et.al. | 2506.13657 | link |
2025-06-16 | UAV Object Detection and Positioning in a Mining Industrial Metaverse with Custom Geo-Referenced Data | Vasiliki Balaska et.al. | 2506.13505 | null |
2025-06-16 | Sparse Convolutional Recurrent Learning for Efficient Event-based Neuromorphic Object Detection | Shenqi Wang et.al. | 2506.13440 | null |
2025-06-16 | Cognitive Synergy Architecture: SEGO for Human-Centric Collaborative Robots | Jaehong Oh et.al. | 2506.13149 | null |
2025-06-15 | MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection | Yuxiang Wang et.al. | 2506.12697 | null |
2025-06-14 | UniDet-D: A Unified Dynamic Spectral Attention Model for Object Detection under Adverse Weathers | Yuantao Wang et.al. | 2506.12324 | null |
2025-06-14 | MatchPlant: An Open-Source Pipeline for UAV-Based Single-Plant Detection and Data Extraction | Worasit Sangjan et.al. | 2506.12295 | link |
2025-06-13 | Vision-based Lifting of 2D Object Detections for Automated Driving | Hendrik Königshof et.al. | 2506.11839 | null |
2025-06-13 | Teleoperated Driving: a New Challenge for 3D Object Detection in Compressed Point Clouds | Filippo Bragato et.al. | 2506.11804 | null |
2025-06-13 | GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers | Guang Liang et.al. | 2506.11784 | null |
2025-06-12 | Teaching in adverse scenes: a statistically feedback-driven threshold and mask adjustment teacher-student framework for object detection in UAV images under adverse scenes | Hongyu Chen et.al. | 2506.11175 | null |
2025-06-12 | Discrete Lorenz Attractors in 3D Sinusoidal Maps | Sishu Shankar Muni et.al. | 2506.10788 | null |
2025-06-12 | Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement | Yuqi Shen et.al. | 2506.10712 | null |
2025-06-12 | Semantic-decoupled Spatial Partition Guided Point-supervised Oriented Object Detection | Xinyuan Liu et.al. | 2506.10601 | link |
2025-06-12 | Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration | Jun Wang et.al. | 2506.10573 | null |
2025-06-12 | FSATFusion: Frequency-Spatial Attention Transformer for Infrared and Visible Image Fusion | Tianpei Zhang et.al. | 2506.10366 | link |
2025-06-11 | DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos | Rajeev Yasarla et.al. | 2506.10242 | null |
2025-06-11 | CEM-FBGTinyDet: Context-Enhanced Foreground Balance with Gradient Tuning for tiny Objects | Tao Liu et.al. | 2506.09897 | null |
2025-06-11 | 3DGeoDet: General-purpose Geometry-aware Image-based 3D Object Detection | Yi Zhang et.al. | 2506.09541 | null |
2025-06-11 | MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning | Tong Wang et.al. | 2506.09327 | null |
2025-06-10 | Efficient Edge Deployment of Quantized YOLOv4-Tiny for Aerial Emergency Object Detection on Raspberry Pi 5 | Sindhu Boddu et.al. | 2506.09300 | null |
2025-06-10 | Lightweight Object Detection Using Quantized YOLOv4-Tiny for Emergency Response in Aerial Imagery | Sindhu Boddu et.al. | 2506.09299 | null |
2025-06-10 | WD-DETR: Wavelet Denoising-Enhanced Real-Time Object Detection Transformer for Robot Perception with Event Cameras | Yangjie Cui et.al. | 2506.09098 | null |
2025-06-11 | Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Xuanchi Ren et.al. | 2506.09042 | null |
2025-06-10 | ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations | Amirreza Rouhi et.al. | 2506.08968 | null |
2025-06-10 | Data Augmentation For Small Object using Fast AutoAugment | DaeEun Yoon et.al. | 2506.08956 | null |
2025-06-11 | Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting | Keyi Liu et.al. | 2506.08777 | null |
2025-06-10 | ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction | Juan Yeo et.al. | 2506.08678 | null |
2025-06-10 | Hierarchical Neural Collapse Detection Transformer for Class Incremental Object Detection | Duc Thanh Pham et.al. | 2506.08562 | null |
2025-06-10 | Discovery of Odd Radio Circles and Other Peculiars in the First Year of the EMU Survey using Object Detection | Nikhel Gupta et.al. | 2506.08439 | null |
2025-06-09 | CrosswalkNet: An Optimized Deep Learning Framework for Pedestrian Crosswalk Detection in Aerial Images with High-Performance Computing | Zubin Bhuyan et.al. | 2506.07885 | null |
2025-06-09 | SAM2Auto: Auto Annotation Using FLASH | Arash Rocky et.al. | 2506.07850 | null |
2025-06-09 | Design and Evaluation of Deep Learning-Based Dual-Spectrum Image Fusion Methods | Beining Xu et.al. | 2506.07779 | null |
2025-06-09 | SpikeSMOKE: Spiking Neural Networks for Monocular 3D Object Detection with Cross-Scale Gated Coding | Xuemei Chen et.al. | 2506.07737 | null |
2025-06-09 | Domain Randomization for Object Detection in Manufacturing Applications using Synthetic Data: A Comprehensive Study | Xiaomeng Zhu et.al. | 2506.07539 | null |
2025-06-09 | SpatialLM: Training Large Language Models for Structured Indoor Modeling | Yongsen Mao et.al. | 2506.07491 | null |
2025-06-09 | Happiness Finder: Exploring the Role of AI in Enhancing Well-Being During Four-Leaf Clover Searches | Anna Yokokubo et.al. | 2506.07393 | null |
2025-06-09 | Multiple Object Stitching for Unsupervised Representation Learning | Chengchao Shen et.al. | 2506.07364 | link |
2025-06-09 | CBAM-STN-TPS-YOLO: Enhancing Agricultural Object Detection through Spatially Adaptive Attention Mechanisms | Satvik Praveen et.al. | 2506.07357 | null |
2025-06-08 | UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning | Weiqi Yan et.al. | 2506.07087 | null |
2025-06-06 | Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection | Yu Li et.al. | 2506.05872 | null |
2025-06-06 | Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | Fanhu Zeng et.al. | 2506.05709 | null |
2025-06-06 | Integer Binary-Range Alignment Neuron for Spiking Neural Networks | Binghao Ye et.al. | 2506.05679 | null |
2025-06-05 | Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training | Aneesh Deogan et.al. | 2506.05092 | null |
2025-06-06 | Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets | Mikhail Kennerley et.al. | 2506.04737 | null |
2025-06-05 | Gen-n-Val: Agentic Image Data Generation and Validation | Jing-En Huang et.al. | 2506.04676 | null |
2025-06-05 | VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection | Wuyang Li et.al. | 2506.04623 | null |
2025-06-04 | FALO: Fast and Accurate LiDAR 3D Object Detection on Resource-Constrained Devices | Shizhong Han et.al. | 2506.04499 | null |
2025-06-04 | Neural Object Detection for 4D STEM: High-Throughput Sub-Pixel Electron Diffraction Pattern Recognition | Arda Genc et.al. | 2506.04477 | null |
2025-06-04 | Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector | Boyong He et.al. | 2506.04211 | link |
2025-06-04 | FSHNet: Fully Sparse Hybrid Network for 3D Object Detection | Shuai Liu et.al. | 2506.03714 | null |
2025-06-04 | How PARTs assemble into wholes: Learning the relative composition of images | Melika Ayoughi et.al. | 2506.03682 | null |
2025-06-05 | MambaNeXt-YOLO: A Hybrid State Space Model for Real-time Object Detection | Xiaochun Lei et.al. | 2506.03654 | null |
2025-06-04 | DiagNet: Detecting Objects using Diagonal Constraints on Adjacency Matrix of Graph Neural Network | Chong Hyun Lee et.al. | 2506.03571 | null |
2025-06-03 | SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports | Dheeraj Khanna et.al. | 2506.03335 | null |
2025-06-03 | Simulate Any Radar: Attribute-Controllable Radar Simulation via Waveform Parameter Embedding | Weiqing Xiao et.al. | 2506.03134 | null |
2025-06-03 | Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection | Yechi Ma et.al. | 2506.02914 | null |
2025-06-04 | Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning | Negin Baghbanzadeh et.al. | 2506.02738 | null |
2025-06-03 | GeneA-SLAM2: Dynamic SLAM with AutoEncoder-Preprocessed Genetic Keypoints Resampling and Depth Variance-Guided Dynamic Region Removal | Shufan Qing et.al. | 2506.02736 | link |
2025-06-03 | Sight Guide: A Wearable Assistive Perception and Navigation System for the Vision Assistance Race in the Cybathlon 2024 | Patrick Pfreundschuh et.al. | 2506.02676 | null |
2025-06-03 | Probabilistic Online Event Downsampling | Andreu Girbau-Xalabarder et.al. | 2506.02547 | null |
2025-06-03 | Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning | Kunyu Wang et.al. | 2506.02462 | null |
2025-06-03 | Auto-Labeling Data for Object Detection | Brent A. Griffin et.al. | 2506.02359 | null |
2025-06-02 | OD3: Optimization-free Dataset Distillation for Object Detection | Salwa K. Al Khatib et.al. | 2506.01942 | null |
2025-06-02 | A Novel Context-Adaptive Fusion of Shadow and Highlight Regions for Efficient Sonar Image Classification | Kamal Basha S et.al. | 2506.01445 | null |
2025-05-30 | Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing | Anasse Boutayeb et.al. | 2505.24489 | null |
2025-05-30 | Leadership Assessment in Pediatric Intensive Care Unit Team Training | Liangyang Ouyang et.al. | 2505.24389 | null |
2025-05-30 | D2AF: A Dual-Driven Annotation and Filtering Framework for Visual Grounding | Yichi Zhang et.al. | 2505.24372 | null |
2025-05-29 | Conformal Object Detection by Sequential Risk Control | Léo Andéol et.al. | 2505.24038 | null |
2025-05-29 | Rooms from Motion: Un-posed Indoor 3D Object Detection as Localization and Mapping | Justin Lazarow et.al. | 2505.23756 | null |
2025-05-29 | Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need | Qiang Wang et.al. | 2505.23744 | null |
2025-05-29 | FMG-Det: Foundation Model Guided Robust Object Detection | Darryl Hannan et.al. | 2505.23726 | null |
2025-05-29 | CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection | Woojin Shin et.al. | 2505.23317 | null |
2025-05-30 | WTEFNet: Real-Time Low-Light Object Detection for Advanced Driver Assistance Systems | Hao Wu et.al. | 2505.23201 | null |
2025-05-29 | Language-guided Learning for Object Detection Tackling Multiple Variations in Aerial Images | Sungjune Park et.al. | 2505.23193 | null |
2025-05-29 | DIP-R1: Deep Inspection and Perception with RL Looking Through and Understanding Complex Scenes | Sungjune Park et.al. | 2505.23179 | null |
2025-05-29 | The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector | Aixuan Li et.al. | 2505.22499 | null |
2025-05-28 | Task-Driven Implicit Representations for Automated Design of LiDAR Systems | Nikhil Behari et.al. | 2505.22344 | null |
2025-05-29 | YH-MINER: Multimodal Intelligent System for Natural Ecological Reef Metric Extraction | Mingzhuang Wang et.al. | 2505.22250 | null |
2025-05-28 | S2AFormer: Strip Self-Attention for Efficient Vision Transformer | Guoan Xu et.al. | 2505.22195 | null |
2025-05-28 | Learning A Robust RGB-Thermal Detector for Extreme Modality Imbalance | Chao Tian et.al. | 2505.22154 | null |
2025-05-28 | Prototype Embedding Optimization for Human-Object Interaction Detection in Livestreaming | Menghui Zhang et.al. | 2505.22011 | null |
2025-05-28 | Cross-DINO: Cross the Deep MLP and Transformer for Small Object Detection | Guiping Cao et.al. | 2505.21868 | null |
2025-05-27 | Object Concepts Emerge from Motion | Haoqian Liang et.al. | 2505.21635 | null |
2025-05-27 | Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO | Muzhi Zhu et.al. | 2505.21457 | null |
2025-05-27 | Visual Product Graph: Bridging Visual Products And Composite Images For End-to-End Style Recommendations | Yue Li Du et.al. | 2505.21454 | null |
2025-05-27 | YOLO-SPCI: Enhancing Remote Sensing Object Detection via Selective-Perspective-Class Integration | Xinyuan Wang et.al. | 2505.21370 | null |
2025-05-27 | Assured Autonomy with Neuro-Symbolic Perception | R. Spencer Hallyburton et.al. | 2505.21322 | null |
2025-05-27 | Robust Video-Based Pothole Detection and Area Estimation for Intelligent Vehicles with Depth Map and Kalman Smoothing | Dehao Wang et.al. | 2505.21049 | null |
2025-05-27 | YOLO-FireAD: Efficient Fire Detection via Attention-Guided Inverted Residual Learning and Dual-Pooling Feature Preservation | Weichao Pan et.al. | 2505.20884 | null |
2025-05-27 | Open-Det: An Efficient Learning Framework for Open-Ended Detection | Guiping Cao et.al. | 2505.20639 | null |
2025-05-27 | Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models | Peter Robicheaux et.al. | 2505.20612 | null |
2025-05-26 | From Data to Modeling: Fully Open-vocabulary Scene Graph Generation | Zuyao Chen et.al. | 2505.20106 | null |
2025-05-26 | Underwater Diffusion Attention Network with Contrastive Language-Image Joint Learning for Underwater Image Enhancement | Afrah Shaahid et.al. | 2505.19895 | null |
2025-05-26 | ADD-SLAM: Adaptive Dynamic Dense SLAM with Gaussian Splatting | Wenhua Wu et.al. | 2505.19420 | null |
2025-05-26 | Neural nanophotonic object detector with ultra-wide field-of-view | Ji Chen et.al. | 2505.19379 | null |
2025-05-25 | What do Blind and Low-Vision People Really Want from Assistive Smart Devices? Comparison of the Literature with a Focus Study | Bhanuka Gamage et.al. | 2505.19325 | null |
2025-05-25 | VL-SAM-V2: Open-World Object Detection with General and Specific Query Fusion | Zhiwei Lin et.al. | 2505.18986 | null |
2025-05-24 | Mitigating Context Bias in Domain Adaptation for Object Detection using Mask Pooling | Hojun Son et.al. | 2505.18446 | null |
2025-05-23 | Sampling Strategies for Efficient Training of Deep Learning Object Detection Algorithms | Gefei Shen et.al. | 2505.18302 | null |
2025-05-23 | One RL to See Them All: Visual Triple Unified Reinforcement Learning | Yan Ma et.al. | 2505.18129 | null |
2025-05-23 | SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification | Shashank Agnihotri et.al. | 2505.18015 | null |
2025-05-23 | RQR3D: Reparametrizing the regression targets for BEV-based 3D object detection | Ozsel Kilinc et.al. | 2505.17732 | null |
2025-05-23 | Adaptive Semantic Token Communication for Transformer-based Edge Inference | Alessio Devoto et.al. | 2505.17604 | null |
2025-05-23 | OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | Jiangning Zhu et.al. | 2505.17473 | null |
2025-05-23 | Reflectance Prediction-based Knowledge Distillation for Robust 3D Object Detection in Compressed Point Clouds | Hao Jing et.al. | 2505.17442 | null |
2025-05-23 | Optimizing YOLOv8 for Parking Space Detection: Comparative Analysis of Custom YOLOv8 Architecture | Apar Pokhrel et.al. | 2505.17364 | null |
2025-05-22 | Extending Dataset Pruning to Object Detection: A Variance-based Approach | Ryota Yagi et.al. | 2505.17245 | null |
2025-05-22 | Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining | Shangquan Sun et.al. | 2505.16811 | null |
2025-05-22 | Robust Vision-Based Runway Detection through Conformal Prediction and Conformal mAP | Alya Zouzou et.al. | 2505.16740 | link |
2025-05-22 | CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving | Huitong Yang et.al. | 2505.16524 | null |
2025-05-22 | MAFE R-CNN: Selecting More Samples to Learn Category-aware Features for Small Object Detection | Yichen Li et.al. | 2505.16442 | null |
2025-05-22 | AdvReal: Adversarial Patch Generation Framework with Application to Adversarial Safety Evaluation of Object Detection Systems | Yuanhao Huang et.al. | 2505.16402 | link |
2025-05-22 | Self-Classification Enhancement and Correction for Weakly Supervised Object Detection | Yufei Yin et.al. | 2505.16294 | null |
2025-05-21 | SNAP: A Benchmark for Testing the Effects of Capture Conditions on Fundamental Vision Tasks | Iuliia Kotseruba et.al. | 2505.15628 | link |
2025-05-21 | Detection of Underwater Multi-Targets Based on Self-Supervised Learning and Deformable Path Aggregation Feature Pyramid Network | Chang Liu et.al. | 2505.15518 | null |
2025-05-21 | RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation | Naman Patel et.al. | 2505.15373 | null |
2025-05-21 | Multispectral Detection Transformer with Infrared-Centric Sensor Fusion | Seongmin Hwang et.al. | 2505.15137 | null |
2025-05-20 | SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation | Yuyang Dong et.al. | 2505.14381 | null |
2025-05-20 | Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation | Bin-Bin Gao et.al. | 2505.14239 | null |
2025-05-20 | Intra-class Patch Swap for Self-Distillation | Hongjun Choi et.al. | 2505.14124 | link |
2025-05-20 | Scaling Vision Mamba Across Resolutions via Fractal Traversal | Bo Li et.al. | 2505.14062 | null |
2025-05-20 | Automated Quality Evaluation of Cervical Cytopathology Whole Slide Images Based on Content Analysis | Lanlan Kang et.al. | 2505.13875 | null |
2025-05-20 | Safety2Drive: Safety-Critical Scenario Benchmark for the Evaluation of Autonomous Driving | Jingzheng Li et.al. | 2505.13872 | null |
2025-05-20 | A Challenge to Build Neuro-Symbolic Video Agents | Sahil Shah et.al. | 2505.13851 | link |
2025-05-20 | InstanceBEV: Unifying Instance and BEV Representation for Global Modeling | Feng Li et.al. | 2505.13817 | null |
2025-05-19 | Dynamic Graph Induced Contour-aware Heat Conduction Network for Event-based Object Detection | Xiao Wang et.al. | 2505.12908 | link |
2025-05-19 | Rethinking Features-Fused-Pyramid-Neck for Object Detection | Hulin Li et.al. | 2505.12820 | link |
2025-05-19 | Enhancing Transformers Through Conditioned Embedded Tokens | Hemanth Saratchandran et.al. | 2505.12789 | null |
2025-05-19 | LiDAR MOT-DETR: A LiDAR-based Two-Stage Transformer for 3D Multiple Object Tracking | Martha Teiko Teye et.al. | 2505.12753 | null |
2025-05-19 | VLC Fusion: Vision-Language Conditioned Sensor Fusion for Robust Object Detection | Aditya Taparia et.al. | 2505.12715 | null |
2025-05-17 | EarthSynth: Generating Informative Earth Observation with Diffusion Models | Jiancheng Pan et.al. | 2505.12108 | null |
2025-05-17 | Experimental Study on Automatically Assembling Custom Catering Packages With a 3-DOF Delta Robot Using Deep Learning Methods | Reihaneh Yourdkhani et.al. | 2505.11879 | null |
2025-05-16 | Improving Object Detection Performance through YOLOv8: A Comprehensive Training and Evaluation Study | Rana Poureskandar et.al. | 2505.11424 | null |
2025-05-16 | MTevent: A Multi-Task Event Camera Dataset for 6D Pose Estimation and Moving Object Detection | Shrutarv Awasthi et.al. | 2505.11282 | null |
2025-05-16 | M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | Chao Wang et.al. | 2505.10931 | null |
2025-05-16 | A High-Performance Thermal Infrared Object Detection Framework with Centralized Regulation | Jinke Li et.al. | 2505.10825 | null |
2025-05-15 | StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation | Daniel A. P. Oliveira et.al. | 2505.10292 | link |
2025-05-15 | Defect Detection in Photolithographic Patterns Using Deep Learning Models Trained on Synthetic Data | Prashant P. Shinde et.al. | 2505.10192 | null |
2025-05-15 | Application of YOLOv8 in monocular downward multiple Car Target detection | Shijie Lyu et.al. | 2505.10016 | null |
2025-05-14 | EdgeAI Drone for Autonomous Construction Site Demonstrator | Emre Girgin et.al. | 2505.09837 | link |
2025-05-14 | WhatsAI: Transforming Meta Ray-Bans into an Extensible Generative AI Platform for Accessibility | Nasif Zaman et.al. | 2505.09823 | null |
2025-05-14 | MoRAL: Motion-aware Multi-Frame 4D Radar and LiDAR Fusion for Robust 3D Object Detection | Xiangyuan Peng et.al. | 2505.09422 | null |
2025-05-14 | A drone that learns to efficiently find objects in agricultural fields: from simulation to the real world | Rick van Essen et.al. | 2505.09278 | null |
2025-05-14 | DRRNet: Macro-Micro Feature Fusion and Dual Reverse Refinement for Camouflaged Object Detection | Jianlin Sun et.al. | 2505.09168 | link |
2025-05-14 | Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models | Lucas Choi et.al. | 2505.09139 | null |
2025-05-14 | Promoting SAM for Camouflaged Object Detection via Selective Key Point-based Guidance | Guoying Liang et.al. | 2505.09123 | null |
2025-05-13 | Robustness Analysis against Adversarial Patch Attacks in Fully Unmanned Stores | Hyunsik Na et.al. | 2505.08835 | null |
2025-05-13 | Augmented Reality for RObots (ARRO): Pointing Visuomotor Policies Towards Visual Robustness | Reihaneh Mirjalili et.al. | 2505.08627 | null |
2025-05-14 | Thermal Detection of People with Mobility Restrictions for Barrier Reduction at Traffic Lights Controlled Intersections | Xiao Ni et.al. | 2505.08568 | link |
2025-05-13 | MDF: Multi-Modal Data Fusion with CNN-Based Object Detection for Enhanced Indoor Localization Using LiDAR-SLAM | Saqi Hussain Kalan et.al. | 2505.08388 | null |
2025-05-13 | HMPNet: A Feature Aggregation Architecture for Maritime Object Detection from a Shipborne Perspective | Yu Zhang et.al. | 2505.08231 | link |
2025-05-13 | Object detection in adverse weather conditions for autonomous vehicles using Instruct Pix2Pix | Unai Gurbindo et.al. | 2505.08228 | null |
2025-05-13 | MoKD: Multi-Task Optimization for Knowledge Distillation | Zeeshan Hayder et.al. | 2505.08170 | null |
2025-05-12 | Hybrid Spiking Vision Transformer for Object Detection with Event Cameras | Qi Xu et.al. | 2505.07715 | null |
2025-05-12 | Self-Supervised Event Representations: Towards Accurate, Real-Time Perception on SoC FPGAs | Kamil Jeziorek et.al. | 2505.07556 | null |
2025-05-12 | DepthFusion: Depth-Aware Hybrid Feature Fusion for LiDAR-Camera 3D Object Detection | Mingqian Ji et.al. | 2505.07398 | null |
2025-05-12 | Language-Driven Dual Style Mixing for Single-Domain Generalized Object Detection | Hongda Qin et.al. | 2505.07219 | link |
2025-05-11 | Differentiable NMS via Sinkhorn Matching for End-to-End Fabric Defect Detection | Zhengyang Lu et.al. | 2505.07040 | null |
2025-05-11 | VALISENS: A Validated Innovative Multi-Sensor System for Cooperative Automated Driving | Lei Wan et.al. | 2505.06980 | null |
2025-05-10 | M3CAD: Towards Generic Cooperative Autonomous Driving Benchmark | Morui Zhu et.al. | 2505.06746 | null |
2025-05-10 | Underwater object detection in sonar imagery with detection transformer and Zero-shot neural architecture search | XiaoTong Gu et.al. | 2505.06694 | null |
2025-05-10 | METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection | Yongqi Wang et.al. | 2505.06663 | link |
2025-05-09 | Camera-Only Bird's Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles | Anupkumar Bochare et.al. | 2505.06113 | null |
2025-05-09 | Artificial intelligence pioneers the double-strangeness factory | Yan He et.al. | 2505.05802 | null |
2025-05-09 | Dome-DETR: DETR with Density-Oriented Feature-Query Manipulation for Efficient Tiny Object Detection | Zhangchi Hu et.al. | 2505.05741 | null |
2025-05-09 | DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer | Ho-Joong Kim et.al. | 2505.05711 | link |
2025-05-08 | PillarMamba: Learning Local-Global Context for Roadside Point Cloud via Hybrid State Space Model | Zhang Zhang et.al. | 2505.05397 | null |
2025-05-08 | PaniCar: Securing the Perception of Advanced Driving Assistance Systems Against Emergency Vehicle Lighting | Elad Feldman et.al. | 2505.05183 | null |
2025-05-08 | FG-CLIP: Fine-Grained Visual and Textual Alignment | Chunyu Xie et.al. | 2505.05071 | null |
2025-05-08 | A Simple Detector with Frame Dynamics is a Strong Tracker | Chenxu Peng et.al. | 2505.04917 | null |
2025-05-08 | Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model | Navin Ranjan et.al. | 2505.04861 | null |
2025-05-07 | Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective | Songsong Duan et.al. | 2505.04758 | null |
2025-05-07 | Hyb-KAN ViT: Hybrid Kolmogorov-Arnold Networks Augmented Vision Transformer | Sainath Dey et.al. | 2505.04740 | null |
2025-05-08 | MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection | Zhihao Zhang et.al. | 2505.04594 | null |
2025-05-07 | DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | Junjie Wang et.al. | 2505.04410 | link |
2025-05-06 | LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs | Xinyuan Zhang et.al. | 2505.03460 | null |
2025-05-06 | From Word to Sentence: A Large-Scale Multi-Instance Dataset for Open-Set Aerial Detection | Guoting Wei et.al. | 2505.03334 | null |
2025-05-06 | VISLIX: An XAI Framework for Validating Vision Models with Slice Discovery and Analysis | Xinyuan Yan et.al. | 2505.03132 | null |
2025-05-05 | Sim2Real Transfer for Vision-Based Grasp Verification | Pau Amargant et.al. | 2505.03046 | link |
2025-05-05 | DPNet: Dynamic Pooling Network for Tiny Object Detection | Luqi Gong et.al. | 2505.02797 | null |
2025-05-05 | RGBX-DiffusionDet: A Framework for Multi-Modal RGB-X Object Detection Using DiffusionDet | Eliraz Orfaig et.al. | 2505.02586 | null |
2025-05-05 | Point Cloud Recombination: Systematic Real Data Augmentation Using Robotic Targets for LiDAR Perception Validation | Hubert Padusinski et.al. | 2505.02476 | null |
2025-05-03 | DriveNetBench: An Affordable and Configurable Single-Camera Benchmarking System for Autonomous Driving Networks | Ali Al-Bustami et.al. | 2505.01893 | link |
2025-05-03 | OODTE: A Differential Testing Engine for the ONNX Optimizer | Nikolaos Louloudakis et.al. | 2505.01892 | null |
2025-05-03 | CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture | Vladimir Frants et.al. | 2505.01882 | null |
2025-05-03 | DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion | Haoteng Li et.al. | 2505.01857 | null |
2025-05-03 | Toward Onboard AI-Enabled Solutions to Space Object Detection for Space Sustainability | Wenxuan Zhang et.al. | 2505.01650 | null |
2025-05-02 | CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion | Boyuan Meng et.al. | 2505.00938 | link |
2025-05-01 | Efficient On-Chip Implementation of 4D Radar-Based 3D Object Detection on Hailo-8L | Woong-Chan Byun et.al. | 2505.00757 | null |
2025-05-03 | Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook | Muyi Bao et.al. | 2505.00630 | null |
2025-05-01 | Visual Trajectory Prediction of Vessels for Inland Navigation | Alexander Puzicha et.al. | 2505.00599 | null |
2025-05-01 | Synthesizing and Identifying Noise Levels in Autonomous Vehicle Camera Radar Datasets | Mathis Morales et.al. | 2505.00584 | null |
2025-05-01 | X-ray illicit object detection using hybrid CNN-transformer neural network architectures | Jorgen Cani et.al. | 2505.00564 | null |
2025-05-01 | A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic | Muhammad Imran Zaman et.al. | 2505.00534 | null |
2025-05-01 | Inconsistency-based Active Learning for LiDAR Object Detection | Esteban Rivera et.al. | 2505.00511 | null |
2025-05-01 | HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection | Esteban Rivera et.al. | 2505.00507 | null |
2025-05-05 | Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution | Luigi Sigillo et.al. | 2505.00334 | null |
2025-04-30 | V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving | Jannik Lübberstedt et.al. | 2505.00156 | null |
2025-04-30 | LLM-Empowered Embodied Agent for Memory-Augmented Task Planning in Household Robotics | Marc Glocker et.al. | 2504.21716 | null |
2025-04-29 | T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection | Manikanta Varaganti et.al. | 2504.21231 | null |
2025-04-29 | FLIM-based Salient Object Detection Networks with Adaptive Decoders | Gilson Junior Soares et.al. | 2504.20872 | null |
2025-04-29 | A Survey on Event-based Optical Marker Systems | Nafiseh Jabbari Tofighi et.al. | 2504.20736 | null |
2025-04-29 | Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection | Siwei Wang et.al. | 2504.20602 | null |
2025-04-29 | Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection | Jianhong Han et.al. | 2504.20498 | null |
2025-04-28 | More Clear, More Flexible, More Precise: A Comprehensive Oriented Object Detection benchmark for UAV | Kai Ye et.al. | 2504.20032 | null |
2025-04-28 | Lossy Source Coding with Focal Loss | Alex Dytso et.al. | 2504.19913 | null |
2025-04-28 | Neural network task specialization via domain constraining | Roman Malashin et.al. | 2504.19592 | null |
2025-04-28 | GMAR: Gradient-Driven Multi-Head Attention Rollout for Vision Transformer Interpretability | Sehyeong Jo et.al. | 2504.19414 | null |
2025-04-27 | Improving Small Drone Detection Through Multi-Scale Processing and Data Augmentation | Rayson Laroca et.al. | 2504.19347 | null |
2025-04-27 | ODExAI: A Comprehensive Object Detection Explainable AI Evaluation | Loc Phuc Truong Nguyen et.al. | 2504.19249 | null |
2025-04-27 | Boosting Single-domain Generalized Object Detection via Vision-Language Knowledge Interaction | Xiaoran Xu et.al. | 2504.19086 | null |
2025-04-26 | Federated Learning-based Semantic Segmentation for Lane and Object Detection in Autonomous Driving | Gharbi Khamis Alshammari et.al. | 2504.18939 | null |
2025-04-25 | Dream-Box: Object-wise Outlier Generation for Out-of-Distribution Detection | Brian K. S. Isaac-Medina et.al. | 2504.18746 | null |
2025-04-25 | A Review of 3D Object Detection with Vision-Language Models | Ranjan Sapkota et.al. | 2504.18738 | null |
2025-04-25 | Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models | Patrick Müller et.al. | 2504.18510 | null |
2025-04-25 | Iterative Event-based Motion Segmentation by Variational Contrast Maximization | Ryo Yamaki et.al. | 2504.18447 | null |
2025-04-25 | A Multimodal Hybrid Late-Cascade Fusion Network for Enhanced 3D Object Detection | Carlo Sgaravatti et.al. | 2504.18419 | null |
2025-04-25 | A comprehensive review of classifier probability calibration metrics | Richard Oliver Lane et.al. | 2504.18278 | null |
2025-04-25 | LiDAR-Guided Monocular 3D Object Detection for Long-Range Railway Monitoring | Raul David Dominguez Sanchez et.al. | 2504.18203 | null |
2025-04-25 | Multi-Grained Compositional Visual Clue Learning for Image Intent Recognition | Yin Tang et.al. | 2504.18201 | null |
2025-04-25 | E-InMeMo: Enhanced Prompting for Visual In-Context Learning | Jiahao Zhang et.al. | 2504.18158 | null |
2025-04-25 | MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View | Liugang Lu et.al. | 2504.18136 | null |
2025-04-25 | Opportunistic Collaborative Planning with Large Vision Model Guided Control and Joint Query-Service Optimization | Jiayi Chen et.al. | 2504.18057 | null |
2025-04-25 | Direct sampling method to retrieve small objects from two-dimensional limited-aperture scattered field data | Won-Kwang Park et.al. | 2504.18036 | null |
2025-04-24 | DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks | Yinqi Li et.al. | 2504.17253 | link |
2025-04-24 | Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation | Phillip Y. Lee et.al. | 2504.17207 | null |
2025-04-24 | AUTHENTICATION: Identifying Rare Failure Modes in Autonomous Vehicle Perception Systems using Adversarially Guided Diffusion Models | Mohammad Zarei et.al. | 2504.17179 | null |
2025-04-23 | Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection | Jens Petersen et.al. | 2504.17076 | null |
2025-04-23 | Gaussian Splatting is an Effective Data Generator for 3D Object Detection | Farhad G. Zanjani et.al. | 2504.16740 | null |
2025-04-23 | EHGCN: Hierarchical Euclidean-Hyperbolic Fusion via Motion-Aware GCN for Hybrid Event Stream Perception | Haosheng Chen et.al. | 2504.16616 | null |
2025-04-23 | Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks | Murat Bilgehan Ertan et.al. | 2504.16557 | null |
2025-04-23 | Assessing the Feasibility of Internet-Sourced Video for Automatic Cattle Lameness Detection | Md Fahimuzzman Sohan et.al. | 2504.16404 | null |
2025-04-23 | Revisiting Radar Camera Alignment by Contrastive Learning for 3D Object Detection | Linhua Kong et.al. | 2504.16368 | null |
2025-04-22 | Vision Controlled Orthotic Hand Exoskeleton | Connor Blais et.al. | 2504.16319 | null |
2025-04-22 | | Physical Intelligence et.al. | 2504.16054 | null |
2025-04-22 | SAGA: Semantic-Aware Gray color Augmentation for Visible-to-Thermal Domain Adaptation across Multi-View Drone and Ground-Based Vision Systems | Manjunath D et.al. | 2504.15728 | null |
2025-04-22 | You Sense Only Once Beneath: Ultra-Light Real-Time Underwater Object Detection | Jun Dong et.al. | 2504.15694 | null |
2025-04-22 | A Vision-Enabled Prosthetic Hand for Children with Upper Limb Disabilities | Md Abdul Baset Sarker et.al. | 2504.15654 | null |
2025-04-21 | Context Aware Grounded Teacher for Source Free Object Detection | Tajamul Ashraf et.al. | 2504.15404 | null |
2025-04-21 | SuoiAI: Building a Dataset for Aquatic Invertebrates in Vietnam | Tue Vo et.al. | 2504.15252 | null |
2025-04-21 | An Efficient Aerial Image Detection with Variable Receptive Fields | Liu Wenbin et.al. | 2504.15165 | null |
2025-04-19 | Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization | Nazia Aslam et.al. | 2504.14301 | null |
2025-04-19 | Visual Consensus Prompting for Co-Salient Object Detection | Jie Wang et.al. | 2504.14254 | link |
2025-04-18 | Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models | Junjie Yang et.al. | 2504.13825 | null |
2025-04-18 | Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction | Yushen He et.al. | 2504.13647 | link |
2025-04-18 | DenSe-AdViT: A novel Vision Transformer for Dense SAR Object Detection | Yang Zhang et.al. | 2504.13638 | null |
2025-04-18 | HMPE: HeatMap Embedding for Efficient Transformer-Based Small Object Detection | YangChen Zeng et.al. | 2504.13469 | null |
2025-04-18 | Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Shashank Shriram et.al. | 2504.13399 | link |
2025-04-17 | VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture | Long Li et.al. | 2504.13365 | null |
2025-04-17 | SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling | Yasin Almalioglu et.al. | 2504.13310 | null |
2025-04-17 | Weak Cube R-CNN: Weakly Supervised 3D Detection using only 2D Bounding Boxes | Andreas Lau Hansen et.al. | 2504.13297 | null |
2025-04-17 | RF-DETR Object Detection vs YOLOv12 : A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity | Ranjan Sapkota et.al. | 2504.13099 | null |
2025-04-17 | Self-Supervised Pre-training with Combined Datasets for 3D Perception in Autonomous Driving | Shumin Wang et.al. | 2504.12709 | null |
2025-04-18 | RoPETR: Improving Temporal Camera-Only 3D Detection by Integrating Enhanced Rotary Position Embedding | Hang Ji et.al. | 2504.12643 | null |
2025-04-16 | Towards a General-Purpose Zero-Shot Synthetic Low-Light Image and Video Pipeline | Joanne Lin et.al. | 2504.12169 | null |
2025-04-16 | RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning | Yuan Luo et.al. | 2504.12167 | null |
2025-04-16 | pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild | Jonas Myhre Schiøtt et.al. | 2504.12045 | null |
2025-04-16 | A Review of YOLOv12: Attention-Based Enhancements vs. Previous Versions | Rahima Khanam et.al. | 2504.11995 | null |
2025-04-16 | Multimodal Spatio-temporal Graph Learning for Alignment-free RGBT Video Object Detection | Qishun Wang et.al. | 2504.11779 | null |
2025-04-15 | Multi-level Cellular Automata for FLIM networks | Felipe Crispim Salvagnini et.al. | 2504.11406 | null |
2025-04-15 | CFIS-YOLO: A Lightweight Multi-Scale Fusion Network for Edge-Deployable Wood Defect Detection | Jincheng Kang et.al. | 2504.11305 | null |
2025-04-15 | Flyweight FLIM Networks for Salient Object Detection in Biomedical Images | Leonardo M. Joao et.al. | 2504.11112 | null |
2025-04-15 | S | Yu Lin et.al. | 2504.11111 | null |
2025-04-15 | DRIFT open dataset: A drone-derived intelligence for traffic analysis in urban environment | Hyejin Lee et.al. | 2504.11019 | null |
2025-04-16 | GATE3D: Generalized Attention-based Task-synergized Estimation in 3D | Eunsoo Im et.al. | 2504.11014 | null |
2025-04-15 | CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors | Jiahuan Long et.al. | 2504.10888 | null |
2025-04-15 | Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task | Aviral Chharia et.al. | 2504.10880 | null |
2025-04-15 | Weather-Aware Object Detection Transformer for Domain Adaptation | Soheil Gharatappeh et.al. | 2504.10877 | null |
2025-04-15 | ATLASv2: LLM-Guided Adaptive Landmark Acquisition and Navigation on the Edge | Mikolaj Walczak et.al. | 2504.10784 | null |
2025-04-14 | DiffMOD: Progressive Diffusion Point Denoising for Moving Object Detection in Remote Sensing | Jinyue Zhang et.al. | 2504.10278 | null |
2025-04-14 | Balancing Stability and Plasticity in Pretrained Detector: A Dual-Path Framework for Incremental Object Detection | Songze Li et.al. | 2504.10214 | null |
2025-04-15 | WildLive: Near Real-time Visual Wildlife Tracking onboard UAVs | Nguyen Ngoc Dat et.al. | 2504.10165 | null |
2025-04-14 | COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts | Jiansheng Li et.al. | 2504.10158 | null |
2025-04-14 | Small Object Detection with YOLO: A Performance Analysis Across Model Versions and Hardware | Muhammad Fasih Tariq et.al. | 2504.09900 | null |
2025-04-14 | Density-based Object Detection in Crowded Scenes | Chenyang Zhao et.al. | 2504.09819 | null |
2025-04-13 | Uncertainty Guided Refinement for Fine-Grained Salient Object Detection | Yao Yuan et.al. | 2504.09666 | link |
2025-04-13 | Pillar-Voxel Fusion Network for 3D Object Detection in Airborne Hyperspectral Point Clouds | Yanze Jiang et.al. | 2504.09506 | null |
2025-04-13 | Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation | Yongchao Feng et.al. | 2504.09480 | null |
2025-04-13 | InfoBound: A Provable Information-Bounds Inspired Framework for Both OoD Generalization and OoD Detection | Lin Zhu et.al. | 2504.09448 | null |
2025-04-11 | TinyCenterSpeed: Efficient Center-Based Object Detection for Autonomous Racing | Neil Reichlin et.al. | 2504.08655 | null |
2025-04-11 | Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization | Jialu Li et.al. | 2504.08641 | null |
2025-04-10 | Enhanced Cooperative Perception Through Asynchronous Vehicle to Infrastructure Framework with Delay Mitigation for Connected and Automated Vehicles | Nithish Kumar Saravanan et.al. | 2504.08172 | null |
2025-04-10 | Multi-Task Learning with Multi-Annotation Triplet Loss for Improved Object Detection | Meilun Zhou et.al. | 2504.08054 | null |
2025-04-10 | Detect Anything 3D in the Wild | Hanxue Zhang et.al. | 2504.07958 | null |
2025-04-11 | Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks | Erin Carson et.al. | 2504.07835 | null |
2025-04-10 | P2Object: Single Point Supervised Object Detection and Instance Segmentation | Pengfei Chen et.al. | 2504.07813 | null |
2025-04-10 | Nonlocal Retinex-Based Variational Model and its Deep Unfolding Twin for Low-Light Image Enhancement | Daniel Torres et.al. | 2504.07810 | null |
2025-04-10 | Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Peng Jia et.al. | 2504.07777 | null |
2025-04-10 | VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Haozhan Shen et.al. | 2504.07615 | link |
2025-04-10 | RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions | Youngwan Jin et.al. | 2504.07603 | null |
2025-04-10 | WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer | Huilin Yin et.al. | 2504.07441 | null |
2025-04-09 | Few-Shot Adaptation of Grounding DINO for Agricultural Domain | Rajhans Singh et.al. | 2504.07252 | null |
2025-04-09 | Multi-Object Tracking for Collision Avoidance Using Multiple Cameras in Open RAN Networks | Jordi Serra et.al. | 2504.07163 | null |
2025-04-09 | Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection | Ruoyu Chen et.al. | 2504.07060 | null |
2025-04-09 | UAV Position Estimation using a LiDAR-based 3D Object Detection Method | Uthman Olawoye et.al. | 2504.07028 | null |
2025-04-09 | Towards Efficient Roadside LiDAR Deployment: A Fast Surrogate Metric Based on Entropy-Guided Visibility | Yuze Jiang et.al. | 2504.06772 | null |
2025-04-09 | Domain-Conditioned Scene Graphs for State-Grounded Task Planning | Jonas Herzog et.al. | 2504.06661 | null |
2025-04-09 | Visually Similar Pair Alignment for Robust Cross-Domain Object Detection | Onkar Krishna et.al. | 2504.06607 | null |
2025-04-08 | From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction | Vladimir Golovkin et.al. | 2504.06357 | null |
2025-04-08 | Analyzing the Impact of Low-Rank Adaptation for Cross-Domain Few-Shot Object Detection in Aerial Images | Hicham Talaoubrid et.al. | 2504.06330 | null |
2025-04-08 | Balancing long- and short-term dynamics for the modeling of saliency in videos | Theodor Wulff et.al. | 2504.05913 | null |
2025-04-08 | PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | Sriram Mandalika et.al. | 2504.05908 | null |
2025-04-08 | Intrinsic Saliency Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation | Xiangyu Zheng et.al. | 2504.05904 | null |
2025-04-08 | KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection | Xingyuan Li et.al. | 2504.05878 | null |
2025-04-08 | DefMamba: Deformable Visual State Space Model | Leiye Liu et.al. | 2504.05794 | null |
2025-04-08 | Event-based Civil Infrastructure Visual Defect Detection: ev-CIVIL Dataset and Benchmark | Udayanga G. W. K. N. Gamage et.al. | 2504.05679 | null |
2025-04-08 | POD: Predictive Object Detection with Single-Frame FMCW LiDAR Point Cloud | Yining Shi et.al. | 2504.05649 | null |
2025-04-08 | AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes | Zhenteng Li et.al. | 2504.05601 | null |
2025-04-07 | SSLFusion: Scale & Space Aligned Latent Fusion Model for Multimodal 3D Object Detection | Bonan Ding et.al. | 2504.05170 | null |
2025-04-07 | Inland Waterway Object Detection in Multi-environment: Dataset and Approach | Shanshan Wang et.al. | 2504.04835 | null |
2025-04-07 | Playing Non-Embedded Card-Based Games with Reinforcement Learning | Tianyang Wu et.al. | 2504.04783 | null |
2025-04-07 | Feedback-Enhanced Hallucination-Resistant Vision-Language Model for Real-Time Scene Understanding | Zahir Alsulaimawi et.al. | 2504.04772 | null |
2025-04-07 | Inverse++: Vision-Centric 3D Semantic Occupancy Prediction Assisted with 3D Object Detection | Zhenxing Ming et.al. | 2504.04732 | null |
2025-04-06 | Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Jiancheng Pan et.al. | 2504.04517 | link |
2025-04-06 | eKalibr-Stereo: Continuous-Time Spatiotemporal Calibration for Event-Based Stereo Visual Systems | Shuolong Chen et.al. | 2504.04451 | link |
2025-04-05 | Autoregressive High-Order Finite Difference Modulo Imaging: High-Dynamic Range for Computer Vision Applications | Brayan Monroy et.al. | 2504.04228 | null |
2025-04-05 | An Optimized Density-Based Lane Keeping System for A Cost-Efficient Autonomous Vehicle Platform: AurigaBot V1 | Farbod Younesi et.al. | 2504.04217 | null |
2025-04-05 | Learning about the Physical World through Analytic Concepts | Jianhua Sun et.al. | 2504.04170 | null |
2025-04-04 | PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector | Kaidong Li et.al. | 2504.03563 | null |
2025-04-04 | ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving | Sheng Yang et.al. | 2504.03438 | null |
2025-04-04 | Infrared bubble recognition in the Milky Way and beyond using deep learning | Shimpei Nishimoto et.al. | 2504.03367 | null |
2025-04-04 | Real-Time Roadway Obstacle Detection for Electric Scooters Using Deep Learning and Multi-Sensor Fusion | Zeyang Zheng et.al. | 2504.03171 | null |
2025-04-04 | Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | Lucas Choi et.al. | 2504.03168 | null |
2025-04-03 | LiDAR-based Object Detection with Real-time Voice Specifications | Anurag Kulkarni et.al. | 2504.02920 | null |
2025-04-03 | BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation | Van Nguyen Nguyen et.al. | 2504.02812 | null |
2025-04-03 | Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results | Andrei Dumitriu et.al. | 2504.02558 | null |
2025-04-03 | Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Xiaofeng Han et.al. | 2504.02477 | null |
2025-04-03 | CornerPoint3D: Look at the Nearest Corner Instead of the Center | Ruixiao Zhang et.al. | 2504.02464 | null |
2025-04-03 | Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline | Peifu Liu et.al. | 2504.02416 | null |
2025-04-03 | SemiISP/SemiIE: Semi-Supervised Image Signal Processor and Image Enhancement Leveraging One-to-Many Mapping sRGB-to-RAW | Masakazu Yoshimura et.al. | 2504.02345 | null |
2025-04-03 | LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection | YiMing Yu et.al. | 2504.02280 | null |
2025-04-02 | Cat-Eye Inspired Active-Passive-Composite Aperture-Shared Sub-Terahertz Meta-Imager for Non-Interactive Concealed Object Detection | Mingshuang Hu et.al. | 2504.01473 | null |
2025-04-02 | CFMD: Dynamic Cross-layer Feature Fusion for Salient Object Detection | Jin Lian et.al. | 2504.01326 | null |
2025-04-01 | Enabling Efficient Processing of Spiking Neural Networks with On-Chip Learning on Commodity Neuromorphic Processors for Edge AI Systems | Rachmad Vidya Wicaksana Putra et.al. | 2504.00957 | null |
2025-04-01 | NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds | Mahan Rafidashti et.al. | 2504.00859 | null |
2025-04-01 | AttentiveGRU: Recurrent Spatio-Temporal Modeling for Advanced Radar-Based BEV Object Detection | Loveneet Saini et.al. | 2504.00559 | null |
2025-04-01 | High-Quality Pseudo-Label Generation Based on Visual Prompt Assisted Cloud Model Update | Xinrun Xu et.al. | 2504.00526 | null |
2025-04-01 | Intrinsic-feature-guided 3D Object Detection | Wanjing Zhang et.al. | 2504.00382 | null |
2025-04-01 | CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection | Xin Zhang et.al. | 2504.00375 | null |
2025-03-31 | Towards Precise Action Spotting: Addressing Temporal Misalignment in Labels with Dynamic Label Assignment | Masato Tamura et.al. | 2504.00149 | null |
2025-03-31 | SU-YOLO: Spiking Neural Network for Efficient Underwater Object Detection | Chenyang Li et.al. | 2503.24389 | link |
2025-03-31 | MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing | Karim Radouane et.al. | 2503.24219 | link |
2025-03-31 | Spectral-Adaptive Modulation Networks for Visual Perception | Guhnoo Yun et.al. | 2503.23947 | null |
2025-03-31 | Expanding-and-Shrinking Binary Neural Networks | Xulong Shi et.al. | 2503.23709 | link |
2025-03-30 | Re-Aligning Language to Visual Objects with an Agentic Workflow | Yuming Chen et.al. | 2503.23508 | null |
2025-03-30 | EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing | Hongxiang Jiang et.al. | 2503.23330 | null |
2025-03-29 | Context in object detection: a systematic literature review | Mahtab Jamali et.al. | 2503.23249 | null |
2025-03-29 | Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection | Marc-Antoine Lavoie et.al. | 2503.23220 | null |
2025-03-29 | A GAN-Enhanced Deep Learning Framework for Rooftop Detection from Historical Aerial Imagery | Pengyu Chen et.al. | 2503.23200 | null |
2025-03-29 | Intelligent Bear Prevention System Based on Computer Vision: An Approach to Reduce Human-Bear Conflicts in the Tibetan Plateau Area, China | Pengyu Chen et.al. | 2503.23178 | null |
2025-03-28 | AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization | Martin Kišš et.al. | 2503.22526 | null |
2025-03-28 | Data Quality Matters: Quantifying Image Quality Impact on Machine Learning Performance | Christian Steinhauser et.al. | 2503.22375 | null |
2025-03-28 | ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection | Nandakishor M et.al. | 2503.22363 | null |
2025-03-28 | Knowledge Rectification for Camouflaged Object Detection: Unlocking Insights from Low-Quality Data | Juwei Guan et.al. | 2503.22180 | null |
2025-03-28 | A Survey on Remote Sensing Foundation Models: From Vision to Multimodality | Ziyue Huang et.al. | 2503.22081 | null |
2025-03-27 | AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification | Earl Ranario et.al. | 2503.22019 | null |
2025-03-27 | FACETS: Efficient Once-for-all Object Detection via Constrained Iterative Search | Tony Tran et.al. | 2503.21999 | null |
2025-03-27 | Exponentially Weighted Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection Model Training in Unmanned Aerial Vehicles Surveillance Scenarios | Taufiq Ahmed et.al. | 2503.21893 | null |
2025-03-27 | Learning Class Prototypes for Unified Sparse Supervised 3D Object Detection | Yun Zhu et.al. | 2503.21099 | link |
2025-03-26 | SaViD: Spectravista Aesthetic Vision Integration for Robust and Discerning 3D Object Detection in Challenging Environments | Tanmoy Dam et.al. | 2503.20614 | link |
2025-03-26 | Small Object Detection: A Comprehensive Survey on Challenges, Techniques and Real-World Applications | Mahya Nikouei et.al. | 2503.20516 | null |
2025-03-25 | Gemini Robotics: Bringing AI into the Physical World | Gemini Robotics Team et.al. | 2503.20020 | null |
2025-03-25 | Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception | Luke Chen et.al. | 2503.20011 | null |
2025-03-25 | Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Ilias Stogiannidis et.al. | 2503.19707 | null |
2025-03-25 | BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction | Jan Kohút et.al. | 2503.19658 | null |
2025-03-25 | Single Shot AI-assisted quantification of KI-67 proliferation index in breast cancer | Deepti Madurai Muthu et.al. | 2503.19606 | null |
2025-03-25 | MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection | Jee Won Lee et.al. | 2503.19330 | null |
2025-03-25 | Multiscale Feature Importance-based Bit Allocation for End-to-End Feature Coding for Machines | Junle Liu et.al. | 2503.19278 | null |
2025-03-24 | Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery | Sara Al-Emadi et.al. | 2503.19202 | null |
2025-03-24 | Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach | Jakob Abeßer et.al. | 2503.19161 | null |
2025-03-24 | Cooperative Control of Multi-Quadrotors for Transporting Cable-Suspended Payloads: Obstacle-Aware Planning and Event-Based Nonlinear Model Predictive Control | Tohid Kargar Tasooji et.al. | 2503.19135 | null |
2025-03-24 | Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection | Moussa Kassem Sbeyti et.al. | 2503.18903 | null |
2025-03-24 | LGI-DETR: Local-Global Interaction for UAV Object Detection | Zifa Chen et.al. | 2503.18785 | null |
2025-03-25 | Frequency Dynamic Convolution for Dense Image Prediction | Linwei Chen et.al. | 2503.18783 | null |
2025-03-25 | CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection | Zhichao Sun et.al. | 2503.18430 | null |
2025-03-24 | Vision-Guided Loco-Manipulation with a Snake Robot | Adarsh Salagame et.al. | 2503.18308 | null |
2025-03-22 | MAMAT: 3D Mamba-Based Atmospheric Turbulence Removal and its Object Detection Capability | Paul Hill et.al. | 2503.17700 | null |
2025-03-22 | Sense4FL: Vehicular Crowdsensing Enhanced Federated Learning for Autonomous Driving | Yanan Ma et.al. | 2503.17697 | null |
2025-03-21 | Should we pre-train a decoder in contrastive learning for dense prediction tasks? | Sébastien Quetin et.al. | 2503.17526 | null |
2025-03-21 | Event-Based Crossing Dataset (EBCD) | Joey Mulé et.al. | 2503.17499 | null |
2025-03-21 | You Only Look Once at Anytime (AnytimeYOLO): Analysis and Optimization of Early-Exits for Object-Detection | Daniel Kuhse et.al. | 2503.17497 | null |
2025-03-21 | An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection | Louis Y. Kim et.al. | 2503.17285 | null |
2025-03-21 | Which2comm: An Efficient Collaborative Perception Framework for 3D Object Detection | Duanrui Yu et.al. | 2503.17175 | null |
2025-03-21 | Hi-ALPS -- An Experimental Robustness Quantification of Six LiDAR-based Object Detection Systems for Autonomous Driving | Alexandra Arzberger et.al. | 2503.17168 | null |
2025-03-21 | R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception | Jonas Mirlach et.al. | 2503.17122 | null |
2025-03-21 | Exploring Few-Shot Object Detection on Blood Smear Images: A Case Study of Leukocytes and Schistocytes | Davide Antonio Mura et.al. | 2503.17107 | null |
2025-03-21 | R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model | Boyuan Zheng et.al. | 2503.17097 | null |
2025-03-21 | Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Pablo Garcia-Fernandez et.al. | 2503.17071 | null |
2025-03-21 | Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos | Yuang Feng et.al. | 2503.17050 | null |
2025-03-21 | Salient Object Detection in Traffic Scene through the TSOD10K Dataset | Yu Qiu et.al. | 2503.16910 | null |
2025-03-21 | Seg2Box: 3D Object Detection by Point-Wise Semantics Supervision | Maoji Zheng et.al. | 2503.16811 | null |
2025-03-20 | RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility in Autonomous Vehicles | Dawood Wasif et.al. | 2503.16251 | null |
2025-03-20 | MapGlue: Multimodal Remote Sensing Image Matching | Peihao Wu et.al. | 2503.16185 | null |
2025-03-20 | Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection | Jiangyi Wang et.al. | 2503.16125 | null |
2025-03-20 | Semantic-Guided Global-Local Collaborative Networks for Lightweight Image Super-Resolution | Wanshu Fan et.al. | 2503.16056 | null |
2025-03-19 | DCA: Dividing and Conquering Amnesia in Incremental Object Detection | Aoting Zhang et.al. | 2503.15295 | null |
2025-03-19 | Test-Time Backdoor Detection for Object Detection Models | Hangtao Zhang et.al. | 2503.15293 | null |
2025-03-19 | GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector | Zechuan Li et.al. | 2503.15211 | null |
2025-03-19 | UltraFlwr -- An Efficient Federated Medical and Surgical Object Detection Framework | Yang Li et.al. | 2503.15161 | null |
2025-03-19 | An Investigation of Beam Density on LiDAR Object Detection Performance | Christoph Griesbacher et.al. | 2503.15087 | null |
2025-03-20 | Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark | Ying Liu et.al. | 2503.14862 | null |
2025-03-19 | State Space Model Meets Transformer: A New Paradigm for 3D Object Detection | Chuxin Wang et.al. | 2503.14493 | null |
2025-03-18 | A Revisit to the Decoder for Camouflaged Object Detection | Seung Woo Ko et.al. | 2503.14035 | null |
2025-03-18 | Shift, Scale and Rotation Invariant Multiple Object Detection using Balanced Joint Transform Correlator | Xi Shen et.al. | 2503.14034 | null |
2025-03-18 | LEGNet: Lightweight Edge-Gaussian Driven Network for Low-Quality Remote Sensing Image Object Detection | Wei Lu et.al. | 2503.14012 | link |
2025-03-18 | FrustumFusionNets: A Three-Dimensional Object Detection Network Based on Tractor Road Scene | Lili Yang et.al. | 2503.13951 | null |
2025-03-18 | Is Discretization Fusion All You Need for Collaborative Perception? | Kang Yang et.al. | 2503.13946 | null |
2025-03-18 | PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds | Barza Nisar et.al. | 2503.13914 | null |
2025-03-18 | HSOD-BIT-V2: A New Challenging Benchmark for Hyperspectral Salient Object Detection | Yuhao Qiu et.al. | 2503.13906 | null |
2025-03-18 | TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection | Qiang Qi et.al. | 2503.13903 | null |
2025-03-18 | YOLO-LLTS: Real-Time Low-Light Traffic Sign Detection via Prior-Guided Enhancement and Multi-Branch Feature Interaction | Ziyu Lin et.al. | 2503.13883 | null |
2025-03-17 | Beyond RGB: Adaptive Parallel Processing for RAW Object Detection | Shani Gamrian et.al. | 2503.13163 | null |
2025-03-17 | SparseAlign: A Fully Sparse Framework for Cooperative Object Detection | Yunshuang Yuan et.al. | 2503.12982 | null |
2025-03-17 | Efficient Multimodal 3D Object Detector via Instance-Level Contrastive Distillation | Zhuoqun Su et.al. | 2503.12914 | null |
2025-03-16 | Point Cloud Based Scene Segmentation: A Survey | Dan Halperin et.al. | 2503.12595 | null |
2025-03-16 | GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | Zilun Zhang et.al. | 2503.12490 | null |
2025-03-15 | An Efficient Deep Learning-Based Approach to Automating Invoice Document Validation | Aziz Amari et.al. | 2503.12267 | null |
2025-03-15 | Minuscule Cell Detection in AS-OCT Images with Progressive Field-of-View Focusing | Boyu Chen et.al. | 2503.12249 | null |
2025-03-15 | SFMNet: Sparse Focal Modulation for 3D Object Detection | Oren Shrout et.al. | 2503.12093 | null |
2025-03-18 | UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection | Xin Jin et.al. | 2503.12009 | null |
2025-03-14 | Rethinking Multi-modal Object Detection from the Perspective of Mono-Modality Feature Learning | Tianyi Zhao et.al. | 2503.11780 | null |
2025-03-14 | FLASHμ: Fast Localizing And Sizing of Holographic Microparticles | Ayush Paliwal et.al. | 2503.11538 | null |
2025-03-14 | Falcon: A Remote Sensing Vision-Language Foundation Model | Kelu Yao et.al. | 2503.11070 | null |
2025-03-14 | FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object Detection | Ming Deng et.al. | 2503.11030 | null |
2025-03-17 | Comparative Analysis of Advanced AI-based Object Detection Models for Pavement Marking Quality Assessment during Daytime | Gian Antariksa et.al. | 2503.11008 | null |
2025-03-14 | Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection | Chuhan Zhang et.al. | 2503.11005 | null |
2025-03-13 | The Power of One: A Single Example is All it Takes for Segmentation in VLMs | Mir Rayat Imtiaz Hossain et.al. | 2503.10779 | null |
2025-03-13 | HeightFormer: Learning Height Prediction in Voxel Features for Roadside Vision Centric 3D Object Detection via Transformer | Zhang Zhang et.al. | 2503.10777 | null |
2025-03-15 | Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection | Chaoqun Wang et.al. | 2503.10579 | null |
2025-03-13 | RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation | Yuwen Du et.al. | 2503.10410 | link |
2025-03-13 | RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing | Fengxiang Wang et.al. | 2503.10392 | link |
2025-03-13 | Object detection characteristics in a learning factory environment using YOLOv8 | Toni Schneidereit et.al. | 2503.10356 | null |
2025-03-13 | TARS: Traffic-Aware Radar Scene Flow Estimation | Jialong Wu et.al. | 2503.10210 | null |
2025-03-13 | A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection | Shenghao Fu et.al. | 2503.10152 | link |
2025-03-13 | Deep Learning-Based Direct Leaf Area Estimation using Two RGBD Datasets for Model Development | Namal Jayasuriya et.al. | 2503.10129 | null |
2025-03-13 | Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection | Zihao Zhang et.al. | 2503.09968 | null |
2025-03-12 | CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation | Hariprasath Govindarajan et.al. | 2503.09878 | null |
2025-03-12 | How good are deep learning methods for automated road safety analysis using video data? An experimental study | Qingwu Liu et.al. | 2503.09807 | null |
2025-03-12 | Deep Learning for Climate Action: Computer Vision Analysis of Visual Narratives on X | Katharina Prasse et.al. | 2503.09361 | null |
2025-03-12 | Fully-Synthetic Training for Visual Quality Inspection in Automotive Production | Christoph Huber et.al. | 2503.09354 | null |
2025-03-12 | DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection | Chiara Cappellino et.al. | 2503.09271 | null |
2025-03-12 | Polygonizing Roof Segments from High-Resolution Aerial Images Using Yolov8-Based Edge Detection | Qipeng Mei et.al. | 2503.09187 | null |
2025-03-12 | Dual-Domain Homogeneous Fusion with Cross-Modal Mamba and Progressive Decoder for 3D Object Detection | Xuzhong Hu et.al. | 2503.08992 | null |
2025-03-11 | GBlobs: Explicit Local Structure via Gaussian Blobs for Improved Cross-Domain LiDAR-based 3D Object Detection | Dušan Malić et.al. | 2503.08639 | null |
2025-03-11 | Referring to Any Person | Qing Jiang et.al. | 2503.08507 | null |
2025-03-11 | SuperCap: Multi-resolution Superpixel-based Image Captioning | Henry Senior et.al. | 2503.08496 | null |
2025-03-13 | Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels | Qiming Xia et.al. | 2503.08421 | null |
2025-03-11 | Embodied Crowd Counting | Runling Long et.al. | 2503.08367 | null |
2025-03-11 | Physics-based AI methodology for Material Parameter Extraction from Optical Data | M. Koumans et.al. | 2503.08183 | null |
2025-03-11 | Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method | Fei Wang et.al. | 2503.08144 | null |
2025-03-12 | Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning | Lizhen Xu et.al. | 2503.08101 | link |
2025-03-11 | SparseVoxFormer: Sparse Voxel-based Transformer for Multi-modal 3D Object Detection | Hyeongseok Son et.al. | 2503.08092 | null |
2025-03-11 | Simulating Automotive Radar with Lidar and Camera Inputs | Peili Song et.al. | 2503.08068 | null |
2025-03-10 | YOLOE: Real-Time Seeing Anything | Ao Wang et.al. | 2503.07465 | link |
2025-03-10 | HGO-YOLO: Advancing Anomaly Behavior Detection with Hierarchical Features and Lightweight Optimized Detection | Qizhi Zheng et.al. | 2503.07371 | null |
2025-03-10 | Mitigating Hallucinations in YOLO-based Object Detection Models: A Revisit to Out-of-Distribution Detection | Weicheng He et.al. | 2503.07330 | null |
2025-03-10 | Semantic Communications with Computer Vision Sensing for Edge Video Transmission | Yubo Peng et.al. | 2503.07252 | null |
2025-03-10 | MIRAM: Masked Image Reconstruction Across Multiple Scales for Breast Lesion Risk Prediction | Hung Q. Vo et.al. | 2503.07157 | null |
2025-03-10 | A Light Perspective for 3D Object Detection | Marcelo Eduardo Pederiva et.al. | 2503.07133 | null |
2025-03-10 | SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements | Haiyang Xie et.al. | 2503.07101 | null |
2025-03-10 | RS2V-L: Vehicle-Mounted LiDAR Data Generation from Roadside Sensor Observations | Ruidan Xing et.al. | 2503.07085 | null |
2025-03-10 | Availability-aware Sensor Fusion via Unified Canonical Space for 4D Radar, LiDAR, and Camera | Dong-Hee Paek et.al. | 2503.07029 | null |
2025-03-10 | Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection | Wentao Wu et.al. | 2503.06948 | null |
2025-03-06 | Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach | Soumyadeep Ro et.al. | 2503.04918 | null |
2025-03-06 | Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation | David T. Hoffmann et.al. | 2503.04718 | null |
2025-03-06 | DEAL-YOLO: Drone-based Efficient Animal Localization using YOLO | Aditya Prashant Naidu et.al. | 2503.04698 | null |
2025-03-06 | Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection | Riccardo De Monte et.al. | 2503.04688 | null |
2025-03-09 | ReynoldsFlow: Exquisite Flow Estimation via Reynolds Transport Theorem | Yu-Hsi Chen et.al. | 2503.04500 | null |
2025-03-06 | A lightweight model FDM-YOLO for small target improvement based on YOLOv8 | Xuerui Zhang et.al. | 2503.04452 | null |
2025-03-06 | Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks | Lukáš Gajdošech et.al. | 2503.04308 | null |
2025-03-06 | CA-W3D: Leveraging Context-Aware Knowledge for Weakly Supervised Monocular 3D Detection | Chupeng Liu et.al. | 2503.04154 | null |
2025-03-06 | Robust Computer-Vision based Construction Site Detection for Assistive-Technology Applications | Junchi Feng et.al. | 2503.04139 | null |
2025-03-06 | Fractional Correspondence Framework in Detection Transformer | Masoumeh Zareapoor et.al. | 2503.04107 | null |
2025-03-05 | DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance | Zhao Yang et.al. | 2503.03689 | null |
2025-03-05 | 4D Radar Ground Truth Augmentation with LiDAR-to-4D Radar Data Synthesis | Woo-Jin Jung et.al. | 2503.03637 | null |
2025-03-05 | Simulation-Based Performance Evaluation of 3D Object Detection Methods with Deep Learning for a LiDAR Point Cloud Dataset in a SOTIF-related Use Case | Milin Patel et.al. | 2503.03548 | link |
2025-03-05 | AI-Driven Multi-Stage Computer Vision System for Defect Detection in Laser-Engraved Industrial Nameplates | Adhish Anitha Vilasan et.al. | 2503.03395 | null |
2025-03-05 | MIAdapt: Source-free Few-shot Domain Adaptive Object Detection for Microscopic Images | Nimra Dilawar et.al. | 2503.03370 | null |
2025-03-05 | BEVMOSNet: Multimodal Fusion for BEV Moving Object Segmentation | Hiep Truong Cong et.al. | 2503.03280 | null |
2025-03-04 | Class-Aware PillarMix: Can Mixed Sample Data Augmentation Enhance 3D Object Detection with Radar Point Clouds? | Miao Zhang et.al. | 2503.02687 | null |
2025-03-04 | Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants | Sourav Modak et.al. | 2503.02420 | null |
2025-03-04 | Robust detection of overlapping bioacoustic sound events | Louis Mahon et.al. | 2503.02389 | null |
2025-03-04 | YOLO-PRO: Enhancing Instance-Specific Object Detection with Full-Channel Global Self-Attention | Lin Huang et.al. | 2503.02348 | null |
2025-03-04 | SSNet: Saliency Prior and State Space Model-based Network for Salient Object Detection in RGB-D Images | Gargi Panda et.al. | 2503.02270 | null |
2025-03-03 | Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection | Boyong He et.al. | 2503.02101 | null |
2025-03-03 | Uncertainty Representation in a SOTIF-Related Use Case with Dempster-Shafer Theory for LiDAR Sensor-Based Object Detection | Milin Patel et.al. | 2503.02087 | link |
2025-03-03 | Visual-RFT: Visual Reinforcement Fine-Tuning | Ziyu Liu et.al. | 2503.01785 | link |
2025-03-03 | Enhancing Object Detection Accuracy in Underwater Sonar Images through Deep Learning-based Denoising | Ziyu Wang et.al. | 2503.01655 | null |
2025-03-03 | Evaluating Stenosis Detection with Grounding DINO, YOLO, and DINO-DETR | Muhammad Musab Ansari et.al. | 2503.01601 | null |
2025-02-28 | The Common Objects Underwater (COU) Dataset for Robust Underwater Object Detection | Rishi Mukherjee et.al. | 2502.20651 | null |
2025-02-28 | RTGen: Real-Time Generative Detection Transformer | Chi Ruan et.al. | 2502.20622 | null |
2025-02-28 | LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation | Zhefan Xu et.al. | 2502.20607 | null |
2025-02-27 | Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds | Mohamed Abdelsamad et.al. | 2502.20316 | null |
2025-02-27 | OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Meng Lou et.al. | 2502.20087 | link |
2025-02-27 | Night-Voyager: Consistent and Efficient Nocturnal Vision-Aided State Estimation in Object Maps | Tianxiao Gao et.al. | 2502.20054 | null |
2025-02-27 | Learning Mask Invariant Mutual Information for Masked Image Modeling | Tao Huang et.al. | 2502.19718 | null |
2025-02-27 | BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance | Xin Ye et.al. | 2502.19694 | null |
2025-02-26 | Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras | Hoonhee Cho et.al. | 2502.19630 | null |
2025-02-23 | Rewards-based image analysis in microscopy | Kamyar Barakati et.al. | 2502.18522 | null |
2025-02-25 | Multi-Perspective Data Augmentation for Few-shot Object Detection | Anh-Khoa Nguyen Vu et.al. | 2502.18195 | null |
2025-02-25 | Progressive Local Alignment for Medical Multimodal Pre-training | Huimin Yan et.al. | 2502.18047 | null |
2025-02-25 | Automatic Vehicle Detection using DETR: A Transformer-Based Approach for Navigating Treacherous Roads | Istiaq Ahmed Fahad et.al. | 2502.17843 | null |
2025-02-24 | Semi-Supervised Weed Detection in Vegetable Fields: In-domain and Cross-domain Experiments | Boyang Deng et.al. | 2502.17673 | null |
2025-02-24 | Experimental validation of UAV search and detection system in real wilderness environment | Stella Dumenčić et.al. | 2502.17372 | null |
2025-02-24 | LCV2I: Communication-Efficient and High-Performance Collaborative Perception Framework with Low-Resolution LiDAR | Xinxin Feng et.al. | 2502.17039 | null |
2025-02-23 | Geometry-Aware 3D Salient Object Detection Network | Chen Wang et.al. | 2502.16488 | null |
2025-02-26 | MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering | Caixiong Li et.al. | 2502.16486 | null |
2025-02-23 | Cross-domain Few-shot Object Detection with Multi-modal Textual Enrichment | Zeyu Shangguan et.al. | 2502.16469 | null |
2025-02-23 | Deep learning approaches to surgical video segmentation and object detection: A Scoping Review | Devanish N. Kamtam et.al. | 2502.16459 | null |
2025-02-22 | FeatSharp: Your Vision Model Features, Sharper | Mike Ranzinger et.al. | 2502.16025 | null |
2025-02-21 | Generative AI Framework for 3D Object Generation in Augmented Reality | Majid Behravan et.al. | 2502.15869 | null |
2025-02-21 | Depth-aware Fusion Method based on Image and 4D Radar Spectrum for 3D Object Detection | Yue Sun et.al. | 2502.15516 | null |
2025-02-21 | Q-PETR: Quant-aware Position Embedding Transformation for Multi-View 3D Object Detection | Jiangyong Yu et.al. | 2502.15488 | null |
2025-02-20 | Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios | Richard Marcus et.al. | 2502.15076 | null |
2025-02-20 | YOLOv12: A Breakdown of the Key Architectural Features | Mujadded Al Rabbani Alif et.al. | 2502.14740 | null |
2025-02-20 | LXLv2: Enhanced LiDAR Excluded Lean 3D Object Detection with Fusion of 4D Radar and Camera | Weiyi Xiong et.al. | 2502.14503 | null |
2025-02-20 | ODVerse33: Is the New YOLO Version Always Better? A Multi Domain benchmark from YOLO v5 to v11 | Tianyou Jiang et.al. | 2502.14314 | null |
2025-02-19 | Image compositing is all you need for data augmentation | Ang Jia Ning Shermaine et.al. | 2502.13936 | null |
2025-02-19 | MSVCOD: A Large-Scale Multi-Scene Dataset for Video Camouflage Object Detection | Shuyong Gao et.al. | 2502.13859 | null |
2025-02-19 | An Overall Real-Time Mechanism for Classification and Quality Evaluation of Rice | Wanke Xia et.al. | 2502.13764 | null |
2025-02-18 | Multiple Distribution Shift -- Aerial (MDS-A): A Dataset for Test-Time Error Detection and Model Adaptation | Noel Ngu et.al. | 2502.13289 | null |
2025-02-18 | RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection | Jingtong Yue et.al. | 2502.13071 | null |
2025-02-18 | Task-Oriented Semantic Communication for Stereo-Vision 3D Object Detection | Zijian Cao et.al. | 2502.12735 | null |
2025-02-18 | DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Tanzhe Li et.al. | 2502.12627 | null |
2025-02-18 | Gaseous Object Detection | Kailai Zhou et.al. | 2502.12415 | null |
2025-02-17 | Enhancing Transparent Object Pose Estimation: A Fusion of GDR-Net and Edge Detection | Tessa Pulli et.al. | 2502.12027 | null |
2025-02-16 | DAViMNet: SSMs-Based Domain Adaptive Object Detection | A. Enes Doruk et.al. | 2502.11178 | null |
2025-02-15 | CLoCKDistill: Consistent Location-and-Context-aware Knowledge Distillation for DETRs | Qizhen Lan et.al. | 2502.10683 | null |
2025-02-14 | Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Wenxuan Guo et.al. | 2502.10392 | null |
2025-02-14 | Object Detection and Tracking | Md Pranto et.al. | 2502.10310 | null |
2025-02-14 | Artificial Intelligence to Assess Dental Findings from Panoramic Radiographs -- A Multinational Study | Yin-Chih Chelsea Wang et.al. | 2502.10277 | null |
2025-02-13 | Instance Segmentation of Scene Sketches Using Natural Image Priors | Mia Tang et.al. | 2502.09608 | null |
2025-02-13 | Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection | Yi Yu et.al. | 2502.09471 | link |
2025-02-13 | Mitigating the Impact of Prominent Position Shift in Drone-based RGBT Object Detection | Yan Zhang et.al. | 2502.09311 | null |
2025-02-12 | Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection | Ziyue Yang et.al. | 2502.08373 | link |
2025-02-12 | Plantation Monitoring Using Drone Images: A Dataset and Performance Review | Yashwanth Karumanchi et.al. | 2502.08233 | null |
2025-02-12 | Take What You Need: Flexible Multi-Task Semantic Communications with Channel Adaptation | Xiang Chen et.al. | 2502.08221 | null |
2025-02-13 | SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation | Zhiming Ma et.al. | 2502.08168 | link |
2025-02-12 | Knowledge Swapping via Learning and Unlearning | Mingyu Xing et.al. | 2502.08075 | null |
2025-02-13 | Visual-based spatial audio generation system for multi-speaker environments | Xiaojing Liu et.al. | 2502.07538 | null |
2025-02-11 | Quantitative Analysis of Objects in Prisoner Artworks | Thea Christoffersen et.al. | 2502.07440 | null |
2025-02-11 | Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving | Novendra Setyawan et.al. | 2502.07417 | null |
2025-02-11 | Multi-Task-oriented Nighttime Haze Imaging Enhancer for Vision-driven Measurement Systems | Ai Chen et.al. | 2502.07351 | link |
2025-02-11 | SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer | Wenxi Li et.al. | 2502.07216 | null |
2025-02-11 | Dense Object Detection Based on De-homogenized Queries | Yueming Huang et.al. | 2502.07194 | null |
2025-02-11 | Foreign-Object Detection in High-Voltage Transmission Line Based on Improved YOLOv8m | Zhenyue Wang et.al. | 2502.07175 | null |
2025-02-11 | A Survey on Mamba Architecture for Vision Applications | Fady Ibrahim et.al. | 2502.07161 | null |
2025-02-10 | Multimodal Search on a Line | Jared Coleman et.al. | 2502.07000 | null |
2025-02-10 | AgilePilot: DRL-Based Drone Agent for Real-Time Motion Planning in Dynamic Environments by Leveraging Object Detection | Roohan Ahmed Khan et.al. | 2502.06725 | null |
2025-02-10 | EdgeMLBalancer: A Self-Adaptive Approach for Dynamic Model Switching on Resource-Constrained Edge Devices | Akhila Matathammal et.al. | 2502.06493 | null |
2025-02-10 | Enhancing Document Key Information Localization Through Data Augmentation | Yue Dai et.al. | 2502.06132 | null |
2025-02-10 | Improved YOLOv5s model for key components detection of power transmission lines | Chen Chen et.al. | 2502.06127 | null |
2025-02-10 | A Novel Multi-Teacher Knowledge Distillation for Real-Time Object Detection using 4D Radar | Seung-Hyun Song et.al. | 2502.06114 | null |
2025-02-09 | Training-free Anomaly Event Detection via LLM-guided Symbolic Pattern Discovery | Yuhui Zeng et.al. | 2502.05843 | null |
2025-02-08 | Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector | Qirui Wu et.al. | 2502.05540 | null |
2025-02-07 | LP-DETR: Layer-wise Progressive Relations for Object Detection | Zhengjian Kang et.al. | 2502.05147 | null |
2025-02-07 | Counting Fish with Temporal Representations of Sonar Video | Kai Van Brunt et.al. | 2502.05129 | null |
2025-02-07 | DetVPCC: RoI-based Point Cloud Sequence Compression for 3D Object Detection | Mingxuan Yan et.al. | 2502.04804 | null |
2025-02-07 | MHAF-YOLO: Multi-Branch Heterogeneous Auxiliary Fusion YOLO for accurate object detection | Zhiqiang Yang et.al. | 2502.04656 | null |
2025-02-07 | AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers | Runqing Jiang et.al. | 2502.04628 | null |
2025-02-06 | An Optimized YOLOv5 Based Approach For Real-time Vehicle Detection At Road Intersections Using Fisheye Cameras | Md. Jahin Alam et.al. | 2502.04566 | null |
2025-02-06 | OneTrack-M: A multitask approach to transformer-based MOT models | Luiz C. S. de Araujo et.al. | 2502.04478 | null |
2025-02-07 | Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances | Yi Yu et.al. | 2502.04268 | null |
2025-02-06 | An object detection approach for lane change and overtake detection from motion profiles | Andrea Benericetti et.al. | 2502.04244 | null |
2025-02-06 | YOLOv4: A Breakthrough in Real-Time Object Detection | Athulya Sundaresan Geetha et.al. | 2502.04161 | null |
2025-02-06 | Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks | Yuhui Jin et.al. | 2502.03877 | null |
2025-02-06 | Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount | Yanbiao Ma et.al. | 2502.03852 | null |
2025-02-06 | Single-Domain Generalized Object Detection by Balancing Domain Diversity and Invariance | Zhenwei He et.al. | 2502.03835 | null |
2025-02-06 | UAV Cognitive Semantic Communications Enabled by Knowledge Graph for Robust Object Detection | Xi Song et.al. | 2502.03761 | null |
2025-02-06 | RAMOTS: A Real-Time System for Aerial Multi-Object Tracking based on Deep Learning and Big Data Technology | Nhat-Tan Do et.al. | 2502.03760 | null |
2025-02-05 | An Empirical Study of Methods for Small Object Detection from Satellite Imagery | Xiaohui Yuan et.al. | 2502.03674 | null |
2025-02-05 | Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Indrashis Das et.al. | 2502.03654 | null |
2025-02-05 | RoboGrasp: A Universal Grasping Policy for Robust Robotic Control | Yiqi Huang et.al. | 2502.03072 | null |
2025-02-05 | Enhancing Quantum-ready QUBO-based Suppression for Object Detection with Appearance and Confidence Features | Keiichiro Yamamura et.al. | 2502.02895 | null |
2025-02-05 | RS-YOLOX: A High Precision Detector for Object Detection in Satellite Remote Sensing Images | Lei Yang et.al. | 2502.02850 | null |
2025-02-04 | Learning the RoPEs: Better 2D and 3D Position Encodings with STRING | Connor Schenck et.al. | 2502.02562 | null |
2025-02-04 | Uncertainty Quantification for Collaborative Object Detection Under Adversarial Attacks | Huiqun Huang et.al. | 2502.02537 | null |
2025-02-04 | Improving Generalization Ability for 3D Object Detection by Learning Sparsity-invariant Features | Hsin-Cheng Lu et.al. | 2502.02322 | null |
2025-02-05 | From Fog to Failure: How Dehazing Can Harm Clear Image Object Detection | Ashutosh Kumar et.al. | 2502.02027 | null |
2025-02-04 | Memory Efficient Transformer Adapter for Dense Predictions | Dong Zhang et.al. | 2502.01962 | null |
2025-02-04 | INTACT: Inducing Noise Tolerance through Adversarial Curriculum Training for LiDAR-based Safety-Critical Perception and Autonomy | Nastaran Darabi et.al. | 2502.01896 | null |
2025-02-04 | SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset | Goodarz Mehr et.al. | 2502.01894 | null |
2025-02-03 | Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection | Reza Sadeghian et.al. | 2502.01856 | null |
2025-02-03 | GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection | Jeffri Murrugarra-LLerena et.al. | 2502.01565 | null |
2025-02-03 | Human Body Restoration with One-Step Diffusion Model and A New Benchmark | Jue Gong et.al. | 2502.01411 | null |
2025-01-31 | Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches | Ying Zang et.al. | 2501.19329 | null |
2025-01-31 | GO: The Great Outdoors Multimodal Dataset | Peng Jiang et.al. | 2501.19274 | null |
2025-01-31 | Early Diagnosis and Severity Assessment of Weligama Coconut Leaf Wilt Disease and Coconut Caterpillar Infestation using Deep Learning-based Image Processing Techniques | Samitha Vidhanaarachchi et.al. | 2501.18835 | null |
2025-01-30 | Tuning Event Camera Biases Heuristic for Object Detection Applications in Staring Scenarios | David El-Chai Ben-Ezra et.al. | 2501.18788 | null |
2025-01-30 | Adaptive Object Detection for Indoor Navigation Assistance: A Performance Evaluation of Real-Time Algorithms | Abhinav Pratap et.al. | 2501.18444 | null |
2025-01-29 | Real Time Scheduling Framework for Multi Object Detection via Spiking Neural Networks | Donghwa Kang et.al. | 2501.18412 | null |
2025-01-30 | IROAM: Improving Roadside Monocular 3D Object Detection Learning from Autonomous Vehicle Data Domain | Zhe Wang et.al. | 2501.18162 | null |
2025-02-03 | Efficient Feature Fusion for UAV Object Detection | Xudong Wang et.al. | 2501.17983 | null |
2025-01-29 | TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection | Lei Cheng et.al. | 2501.17977 | link |
2025-01-28 | Object Detection with Deep Learning for Rare Event Search in the GADGET II TPC | Tyler Wheeler et.al. | 2501.17892 | null |
2025-01-29 | Detection of Oscillation-like Patterns in Eclipsing Binary Light Curves using Neural Network-based Object Detection Algorithms | Burak Ulaş et.al. | 2501.17538 | link |
2025-01-30 | Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection | Alicia Allmendinger et.al. | 2501.17387 | null |
2025-01-28 | DINOSTAR: Deep Iterative Neural Object Detector Self-Supervised Training for Roadside LiDAR Applications | Muhammad Shahbaz et.al. | 2501.17076 | null |
2025-01-28 | Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding | Akash Kumar et.al. | 2501.17053 | null |
2025-01-28 | Approach Towards Semi-Automated Certification for Low Criticality ML-Enabled Airborne Applications | Chandrasekar Sridhar et.al. | 2501.17028 | null |
2025-01-28 | Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection | Xiangyu Gao et.al. | 2501.16981 | null |
2025-01-28 | SSF-PAN: Semantic Scene Flow-Based Perception for Autonomous Navigation in Traffic Scenarios | Yinqi Chen et.al. | 2501.16754 | null |
2025-01-28 | DebugAgent: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging | Muxi Chen et.al. | 2501.16751 | null |
2025-01-27 | Efficient Object Detection of Marine Debris using Pruned YOLO Model | Abi Aryaza et.al. | 2501.16571 | null |
2025-01-27 | Object Detection for Medical Image Analysis: Insights from the RT-DETR Model | Weijie He et.al. | 2501.16469 | null |
2025-01-27 | The Linear Attention Resurrection in Vision Transformer | Chuanyang Zheng et.al. | 2501.16182 | null |
2025-01-27 | Real-Time Brain Tumor Detection in Intraoperative Ultrasound Using YOLO11: From Model Training to Deployment in the Operating Room | Santiago Cepeda et.al. | 2501.15994 | null |
2025-01-26 | Breaking the SSL-AL Barrier: A Synergistic Semi-Supervised Active Learning Framework for 3D Object Detection | Zengran Wang et.al. | 2501.15449 | null |
2025-01-26 | FAVbot: An Autonomous Target Tracking Micro-Robot with Frequency Actuation Control | Zhijian Hao et.al. | 2501.15426 | null |
2025-01-26 | Doracamom: Joint 3D Detection and Occupancy Prediction with Multi-view 4D Radars and Cameras for Omnidirectional Perception | Lianqing Zheng et.al. | 2501.15394 | null |
2025-01-26 | iFormer: Integrating ConvNet and Transformer for Mobile Application | Chuanyang Zheng et.al. | 2501.15369 | link |
2025-01-25 | Explainable YOLO-Based Dyslexia Detection in Synthetic Handwriting Data | Nora Fink et.al. | 2501.15263 | null |
2025-01-28 | SpikSSD: Better Extraction and Fusion for Object Detection with Spiking Neuron Networks | Yimeng Fan et.al. | 2501.15151 | link |
2025-01-25 | Comprehensive Evaluation of Cloaking Backdoor Attacks on Object Detector in Real-World | Hua Ma et.al. | 2501.15101 | null |
2025-01-24 | TD-RD: A Top-Down Benchmark with Real-Time Framework for Road Damage Detection | Xi Xiao et.al. | 2501.14302 | null |
2025-01-23 | Efficient Precision Control in Object Detection Models for Enhanced and Reliable Ovarian Follicle Counting | Vincent Blot et.al. | 2501.14036 | null |
2025-01-23 | Enhanced PEC-YOLO for Detecting Improper Safety Gear Wearing Among Power Line Workers | Chen Zuguo et.al. | 2501.13981 | null |
2025-01-23 | PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection | Peiyuan Zhang et.al. | 2501.13898 | link |
2025-01-23 | First Lessons Learned of an Artificial Intelligence Robotic System for Autonomous Coarse Waste Recycling Using Multispectral Imaging-Based Methods | Timo Lange et.al. | 2501.13855 | null |
2025-01-23 | Integrating Causality with Neurochaos Learning: Proposed Approach and Research Agenda | Nanjangud C. Narendra et.al. | 2501.13763 | null |
2025-01-23 | You Only Crash Once v2: Perceptually Consistent Strong Features for One-Stage Domain Adaptive Detection of Space Terrain | Timothy Chase Jr et.al. | 2501.13725 | null |
2025-01-23 | YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID | Iñaki Erregue et.al. | 2501.13710 | link |
2025-01-24 | Multi-aspect Knowledge Distillation with Large Language Model | Taegyeong Lee et.al. | 2501.13341 | null |
2025-01-22 | MONA: Moving Object Detection from Videos Shot by Dynamic Camera | Boxun Hu et.al. | 2501.13183 | null |
2025-01-21 | Large-image Object Detection for Fine-grained Recognition of Punches Patterns in Medieval Panel Painting | Josh Bruegger et.al. | 2501.12489 | link |
2025-01-21 | TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking | Adarsh Kumar Kosta et.al. | 2501.12482 | null |
2025-01-21 | Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems | Stefano Carlo Lambertenghi et.al. | 2501.12269 | null |
2025-01-21 | DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains | Junyu Xia et.al. | 2501.12235 | null |
2025-01-21 | SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology | Dongli Wu et.al. | 2501.12169 | null |
2025-01-21 | Co-Paced Learning Strategy Based on Confidence for Flying Bird Object Detection Model Training | Zi-Wei Sun et.al. | 2501.12071 | null |
2025-01-21 | SMamba: Sparse Mamba for Event-based Object Detection | Nan Yang et.al. | 2501.11971 | null |
2025-01-20 | Enhancing SAR Object Detection with Self-Supervised Pre-training on Masked Auto-Encoders | Xinyang Pu et.al. | 2501.11249 | null |
2025-01-19 | LiFT: Lightweight, FPGA-tailored 3D object detection based on LiDAR data | Konrad Lis et.al. | 2501.11159 | link |
2025-01-19 | Advanced technology in railway track monitoring using the GPR Technique: A Review | Farhad Kooban et.al. | 2501.11132 | null |
2025-01-19 | Green Video Camouflaged Object Detection | Xinyu Wang et.al. | 2501.10914 | null |
2025-01-18 | ClusterViG: Efficient Globally Aware Vision GNNs via Image Partitioning | Dhruv Parikh et.al. | 2501.10640 | null |
2025-01-17 | MutualForce: Mutual-Aware Enhancement for 4D Radar-LiDAR 3D Object Detection | Xiangyuan Peng et.al. | 2501.10266 | null |
2025-01-17 | Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection | Mohamed Lamine Mekhalfi et.al. | 2501.10081 | null |
2025-01-17 | One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression | Keita Miwa et.al. | 2501.10064 | null |
2025-01-17 | LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Wei Lu et.al. | 2501.10040 | link |
2025-01-17 | FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis | Zhe Chen et.al. | 2501.09887 | null |
2025-01-16 | A Simple Aerial Detection Baseline of Multimodal Language Models | Qingyun Li et.al. | 2501.09720 | link |
2025-01-16 | Practical Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2501.09705 | link |
2025-01-16 | Multi-task deep-learning for sleep event detection and stage classification | Adriana Anido-Alonso et.al. | 2501.09519 | link |
2025-01-16 | The Devil is in the Details: Simple Remedies for Image-to-LiDAR Representation Learning | Wonjun Jo et.al. | 2501.09485 | null |
2025-01-16 | MonoSOWA: Scalable monocular 3D Object detector Without human Annotations | Jan Skvrna et.al. | 2501.09481 | null |
2025-01-16 | RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jianrui Shi et.al. | 2501.09465 | null |
2025-01-16 | On the Relation between Optical Aperture and Automotive Object Detection | Ofer Bar-Shalom et.al. | 2501.09456 | null |
2025-01-16 | SoccerSynth-Detection: A Synthetic Dataset for Soccer Player Detection | Haobin Qin et.al. | 2501.09281 | null |
2025-01-15 | Polyp detection in colonoscopy images using YOLOv11 | Alok Ranjan Sahoo et.al. | 2501.09051 | null |
2025-01-15 | PACF: Prototype Augmented Compact Features for Improving Domain Adaptive Object Detection | Chenguang Liu et.al. | 2501.08605 | null |
2025-01-14 | Predicting Performance of Object Detection Models in Electron Microscopy Using Random Forests | Ni Li et.al. | 2501.08465 | link |
2025-01-14 | Bootstrapping Corner Cases: High-Resolution Inpainting for Safety Critical Detect and Avoid for Automated Flying | Jonathan Lyhs et.al. | 2501.08142 | null |
2025-01-14 | Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation | Yunzhi Zhuge et.al. | 2501.07806 | link |
2025-01-14 | Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Zhaokai Wang et.al. | 2501.07783 | link |
2025-01-13 | SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing | Varun Biyyala et.al. | 2501.07554 | link |
2025-01-13 | ML Mule: Mobile-Driven Context-Aware Collaborative Learning | Haoxiang Yu et.al. | 2501.07536 | null |
2025-01-13 | TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations | Daniel Steininger et.al. | 2501.07360 | null |
2025-01-13 | Toward Realistic Camouflaged Object Detection: Benchmarks and Method | Zhimeng Xin et.al. | 2501.07297 | link |
2025-01-13 | Dual Scale-aware Adaptive Masked Knowledge Distillation for Object Detection | ZhouRui Zhang et.al. | 2501.07101 | null |
2025-01-11 | CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection | Yiheng Li et.al. | 2501.06550 | link |
2025-01-11 | CPDR: Towards Highly-Efficient Salient Object Detection via Crossed Post-decoder Refinement | Yijie Li et.al. | 2501.06441 | null |
2025-01-11 | FocusDD: Real-World Scene Infusion for Robust Dataset Distillation | Youbing Hu et.al. | 2501.06405 | null |
2025-01-10 | A Holistically Point-guided Text Framework for Weakly-Supervised Camouflaged Object Detection | Tsui Qin Mok et.al. | 2501.06038 | null |
2025-01-10 | Minimizing Occlusion Effect on Multi-View Camera Perception in BEV with Multi-Sensor Fusion | Sanjay Kumar et.al. | 2501.05997 | null |
2025-01-10 | EDNet: Edge-Optimized Small Target Detection in UAV Imagery -- Faster Context Attention, Better Feature Fusion, and Hardware Acceleration | Zhifan Song et.al. | 2501.05885 | null |
2025-01-10 | Automatic detection of single-electron regime of quantum dots and definition of virtual gates using U-Net and clustering | Yui Muto et.al. | 2501.05878 | null |
2025-01-10 | Zero-shot Shark Tracking and Biometrics from Aerial Imagery | Chinmay K Lalgudi et.al. | 2501.05717 | null |
2025-01-10 | Dark Energy Survey Year 6 Results: Synthetic-source Injection Across the Full Survey Using Balrog | D. Anbajagane et.al. | 2501.05683 | null |
2025-01-09 | Approximate Supervised Object Distance Estimation on Unmanned Surface Vehicles | Benjamin Kiefer et.al. | 2501.05567 | null |
2025-01-09 | Performance of YOLOv7 in Kitchen Safety While Handling Knife | Athulya Sundaresan Geetha et.al. | 2501.05399 | null |
2025-01-09 | A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision | Ali Rohan et.al. | 2501.05147 | null |
2025-01-09 | CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection | Xiang Zhang et.al. | 2501.05132 | null |
2025-01-09 | AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data | Haoran Zhu et.al. | 2501.04969 | link |
2025-01-09 | Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks | Seyed Amir Bidaki et.al. | 2501.04897 | link |
2025-01-08 | Video Summarisation with Incident and Context Information using Generative AI | Ulindu De Silva et.al. | 2501.04764 | null |
2025-01-08 | Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models | Miaoyang He et.al. | 2501.04582 | null |
2025-01-08 | RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark | Xin Zhang et.al. | 2501.04440 | link |
2025-01-08 | FGU3R: Fine-Grained Fusion via Unified 3D Representation for Multimodal 3D Object Detection | Guoxin Zhang et.al. | 2501.04373 | null |
2025-01-08 | H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving | Siran Chen et.al. | 2501.04302 | null |
2025-01-08 | UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles | Abhishek Balasubramaniam et.al. | 2501.04213 | null |
2025-01-07 | LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving | Lingdong Kong et.al. | 2501.04005 | null |
2025-01-07 | Visual question answering: from early developments to recent advances -- a survey | Ngoc Dung Huynh et.al. | 2501.03939 | null |
2025-01-07 | SCC-YOLO: An Improved Object Detector for Assisting in Brain Tumor Diagnosis | Runci Bai et.al. | 2501.03836 | null |
2025-01-08 | Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Xinbin Yuan et.al. | 2501.03775 | link |
2025-01-07 | AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | Ruochen Zhang et.al. | 2501.03700 | null |
2025-01-07 | Anomaly Triplet-Net: Progress Recognition Model Using Deep Metric Learning Considering Occlusion for Manual Assembly Work | Takumi Kitsukawa et.al. | 2501.03533 | null |
2025-01-05 | Multispectral Pedestrian Detection with Sparsely Annotated Label | Chan Lee et.al. | 2501.02640 | null |
2025-01-05 | Generalization-Enhanced Few-Shot Object Detection in Remote Sensing | Hui Lin et.al. | 2501.02474 | link |
2025-01-04 | V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection | Sichao Wang et.al. | 2501.02363 | null |
2025-01-04 | Accurate Crop Yield Estimation of Blueberries using Deep Learning and Smart Drones | Hieu D. Nguyen et.al. | 2501.02344 | null |
2025-01-04 | RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Liye Jia et.al. | 2501.02314 | null |
2025-01-03 | A Separable Self-attention Inspired by the State Space Model for Computer Vision | Juntao Zhang et.al. | 2501.02040 | link |
2025-01-03 | UAV-DETR: Efficient End-to-End Object Detection for Unmanned Aerial Vehicle Imagery | Huaxiang Zhang et.al. | 2501.01855 | null |
2025-01-03 | Dual Mutual Learning Network with Global-local Awareness for RGB-D Salient Object Detection | Kang Yi et.al. | 2501.01648 | null |
2025-01-02 | A Multi-task Supervised Compression Model for Split Computing | Yoshitomo Matsubara et.al. | 2501.01420 | link |
2025-01-02 | MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception | Xiaoshuai Hao et.al. | 2501.01037 | null |
2025-01-01 | A Novel Approach using CapsNet and Deep Belief Network for Detection and Identification of Oral Leukopenia | Hirthik Mathesh GV et.al. | 2501.00876 | null |
2025-01-01 | NMM-HRI: Natural Multi-modal Human-Robot Interaction with Voice and Deictic Posture via Large Language Model | Yuzhi Lai et.al. | 2501.00785 | null |
2024-12-31 | Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting | Kyle Gao et.al. | 2501.00625 | null |
2024-12-31 | B2Net: Camouflaged Object Detection via Boundary Aware and Boundary Fusion | Junmin Cai et.al. | 2501.00426 | null |
2024-12-30 | TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation | Shaoqing Xu et.al. | 2412.20911 | link |
2024-12-30 | Humanoid Robot RHP Friends: Seamless Combination of Autonomous and Teleoperated Tasks in a Nursing Context | Mehdi Benallegue et.al. | 2412.20770 | null |
2024-12-30 | Solar Filaments Detection using Active Contours Without Edges | Sanmoy Bandyopadhyay et.al. | 2412.20749 | null |
2024-12-30 | Open-Set Object Detection By Aligning Known Class Representations | Hiran Sarkar et.al. | 2412.20701 | null |
2024-12-30 | SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Yuxuan Li et.al. | 2412.20665 | link |
2024-12-30 | YOLO-UniOW: Efficient Universal Open-World Object Detection | Lihao Liu et.al. | 2412.20645 | link |
2024-12-29 | A Novel FPGA-based CNN Hardware Accelerator: Optimization for Convolutional Layers using Karatsuba Ofman Multiplier | Amit Sarkar et.al. | 2412.20393 | null |
2024-12-29 | Differential Evolution Integrated Hybrid Deep Learning Model for Object Detection in Pre-made Dishes | Lujia Lv et.al. | 2412.20370 | null |
2024-12-28 | Plastic Waste Classification Using Deep Learning: Insights from the WaDaBa Dataset | Suman Kunwar et.al. | 2412.20232 | null |
2024-12-28 | SimLTD: Simple Supervised and Semi-Supervised Long-Tailed Object Detection | Phi Vu Tran et.al. | 2412.20047 | null |
2024-12-27 | Chimera: A Block-Based Neural Architecture Search Framework for Event-Based Object Detection | Diego A. Silva et.al. | 2412.19646 | null |
2024-12-27 | Optimizing Helmet Detection with Hybrid YOLO Pipelines: A Detailed Analysis | Vaikunth M et.al. | 2412.19467 | null |
2024-12-26 | Revisiting Monocular 3D Object Detection from Scene-Level Depth Retargeting to Instance-Level Spatial Refinement | Qiude Zhang et.al. | 2412.19165 | null |
2024-12-26 | From Coin to Data: The Impact of Object Detection on Digital Numismatics | Rafael Cabral et.al. | 2412.19091 | null |
2024-12-26 | Assessing Pre-trained Models for Transfer Learning through Distribution of Spectral Components | Tengxue Zhang et.al. | 2412.19085 | null |
2024-12-25 | CGCOD: Class-Guided Camouflaged Object Detection | Chenxi Zhang et.al. | 2412.18977 | null |
2024-12-25 | HV-BEV: Decoupling Horizontal and Vertical Feature Sampling for Multi-View 3D Object Detection | Di Wu et.al. | 2412.18884 | null |
2024-12-25 | TSceneJAL: Joint Active Learning of Traffic Scenes for 3D Object Detection | Chenyang Lei et.al. | 2412.18870 | null |
2024-12-25 | Distortion-Aware Adversarial Attacks on Bounding Boxes of Object Detectors | Pham Phuc et.al. | 2412.18815 | link |
2024-12-25 | Unified Local and Global Attention Interaction Modeling for Vision Transformers | Tan Nguyen et.al. | 2412.18778 | null |
2024-12-24 | Sampling Bag of Views for Open-Vocabulary Object Detection | Hojun Choi et.al. | 2412.18273 | null |
2024-12-24 | Efficient Detection Framework Adaptation for Edge Computing: A Plug-and-play Neural Network Toolbox Enabling Edge Deployment | Jiaqi Wu et.al. | 2412.18230 | null |
2024-12-24 | Spectrum-oriented Point-supervised Saliency Detector for Hyperspectral Images | Peifu Liu et.al. | 2412.18112 | link |
2024-12-24 | Multi-Point Positional Insertion Tuning for Small Object Detection | Kanoko Goto et.al. | 2412.18090 | null |
2024-12-24 | COMO: Cross-Mamba Interaction and Offset-Guided Fusion for Multimodal Object Detection | Chang Liu et.al. | 2412.18076 | null |
2024-12-23 | Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Yitong Chen et.al. | 2412.17800 | link |
2024-12-23 | Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions | Huaxu He et.al. | 2412.17654 | null |
2024-12-23 | Impact of Evidence Theory Uncertainty on Training Object Detection Models | M. Tahasanul Ibrahim et.al. | 2412.17405 | null |
2024-12-23 | Feature Based Methods Domain Adaptation for Object Detection: A Review Paper | Helia Mohamadi et.al. | 2412.17325 | null |
2024-12-23 | Towards Unsupervised Model Selection for Domain Adaptive Object Detection | Hengfu Yu et.al. | 2412.17284 | link |
2024-12-22 | NumbOD: A Spatial-Frequency Fusion Attack Against Object Detectors | Ziqi Zhou et.al. | 2412.16955 | link |
2024-12-22 | Separating Drone Point Clouds From Complex Backgrounds by Cluster Filter -- Technical Report for CVPR 2024 UG2 Challenge | Hanfang Liang et.al. | 2412.16947 | null |
2024-12-22 | Seamless Detection: Unifying Salient Object Detection and Camouflaged Object Detection | Yi Liu et.al. | 2412.16840 | link |
2024-12-24 | Human-Guided Image Generation for Expanding Small-Scale Training Image Datasets | Changjian Chen et.al. | 2412.16839 | null |
2024-12-21 | IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks | Yaming Zhang et.al. | 2412.16654 | link |
2024-12-20 | NeRF-To-Real Tester: Neural Radiance Fields as Test Image Generators for Vision of Autonomous Systems | Laura Weihl et.al. | 2412.16141 | null |
2024-12-20 | MR-GDINO: Efficient Open-World Continual Object Detection | Bowen Dong et.al. | 2412.15979 | link |
2024-12-20 | Mask-RadarNet: Enhancing Transformer With Spatial-Temporal Semantic Context for Radar Object Detection in Autonomous Driving | Yuzhi Wu et.al. | 2412.15595 | null |
2024-12-19 | Exploring Machine Learning Engineering for Object Detection and Tracking by Unmanned Aerial Vehicle (UAV) | Aneesha Guna et.al. | 2412.15347 | null |
2024-12-19 | Leveraging Color Channel Independence for Improved Unsupervised Object Detection | Bastian Jäckl et.al. | 2412.15150 | null |
2024-12-19 | A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space | Yonghao He et.al. | 2412.14680 | link |
2024-12-19 | Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers | Rui Ding et.al. | 2412.14633 | null |
2024-12-19 | Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network | Kunpeng Wang et.al. | 2412.14576 | link |
2024-12-19 | SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection | Ruoyu Xu et.al. | 2412.14571 | null |
2024-12-18 | HA-RDet: Hybrid Anchor Rotation Detector for Oriented Object Detection | Phuc D. A. Nguyen et.al. | 2412.14379 | link |
2024-12-18 | Joint Perception and Prediction for Autonomous Driving: A Survey | Lucas Dal'Col et.al. | 2412.14088 | link |
2024-12-18 | Object Style Diffusion for Generalized Object Detection in Urban Scene | Hao Li et.al. | 2412.13815 | null |
2024-12-18 | MMO-IG: Multi-Class and Multi-Scale Object Image Generation for Remote Sensing | Chuang Yang et.al. | 2412.13684 | null |
2024-12-18 | Comparative Analysis of YOLOv9, YOLOv10 and RT-DETR for Real-Time Weed Detection | Ahmet Oğuz Saltık et.al. | 2412.13490 | null |
2024-12-17 | Continuous Patient Monitoring with AI: Real-Time Analysis of Video in Hospital Care Settings | Paolo Gabriel et.al. | 2412.13152 | null |
2024-12-17 | A New Adversarial Perspective for LiDAR-based 3D Object Detection | Shijun Zheng et.al. | 2412.13017 | null |
2024-12-17 | What is YOLOv6? A Deep Insight into the Object Detection Model | Athulya Sundaresan Geetha et.al. | 2412.13006 | null |
2024-12-17 | Differential Alignment for Domain Adaptive Object Detection | Xinyu He et.al. | 2412.12830 | null |
2024-12-17 | RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection | Yiheng Li et.al. | 2412.12799 | link |
2024-12-17 | RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion | Xiaomeng Chu et.al. | 2412.12725 | null |
2024-12-17 | Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images | Zhifei Shi et.al. | 2412.12562 | null |
2024-12-17 | CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics | Ruixin Mao et.al. | 2412.12525 | link |
2024-12-17 | PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts | Kun Guo et.al. | 2412.12460 | link |
2024-12-16 | Domain Generalization in Autonomous Driving: Evaluating YOLOv8s, RT-DETR, and YOLO-NAS with the ROAD-Almaty Dataset | Madiyar Alimov et.al. | 2412.12349 | null |
2024-12-16 | Coconut Palm Tree Counting on Drone Images with Deep Object Detection and Synthetic Training Data | Tobias Rohe et.al. | 2412.11949 | null |
2024-12-16 | Sonar-based Deep Learning in Underwater Robotics: Overview, Robustness and Challenges | Martin Aubard et.al. | 2412.11840 | null |
2024-12-16 | CLDA-YOLO: Visual Contrastive Learning Based Domain Adaptive YOLO Detector | Tianheng Qiu et.al. | 2412.11812 | null |
2024-12-16 | PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection | Xiaoran Xu et.al. | 2412.11807 | link |
2024-12-16 | Learning UAV-based path planning for efficient localization of objects using prior knowledge | Rick van Essen et.al. | 2412.11717 | null |
2024-12-16 | Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning | Chang Xu et.al. | 2412.11582 | null |
2024-12-16 | HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection | Zijian Gu et.al. | 2412.11489 | link |
2024-12-16 | Universal Domain Adaptive Object Detection via Dual Probabilistic Alignment | Yuanfan Zheng et.al. | 2412.11443 | link |
2024-12-16 | V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations | Jin-Cheng Jhang et.al. | 2412.11412 | null |
2024-12-15 | From Simple to Professional: A Combinatorial Controllable Image Captioning Agent | Xinran Wang et.al. | 2412.11025 | link |
2024-12-13 | A dual contrastive framework | Yuan Sun et.al. | 2412.10348 | null |
2024-12-13 | MVQ:Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization | Shuaiting Li et.al. | 2412.10261 | null |
2024-12-13 | Copy-Move Detection in Optical Microscopy: A Segmentation Network and A Dataset | Hao-Chiang Shao et.al. | 2412.10258 | null |
2024-12-13 | UN-DETR: Promoting Objectness Learning via Joint Supervision for Unknown Object Detection | Haomiao Liu et.al. | 2412.10176 | link |
2024-12-13 | HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection | Zican Shi et.al. | 2412.10116 | null |
2024-12-13 | RemDet: Rethinking Efficient Model Design for UAV Object Detection | Chen Li et.al. | 2412.10040 | link |
2024-12-13 | Timealign: A multi-modal object detection method for time misalignment fusing in autonomous driving | Zhihang Song et.al. | 2412.10033 | null |
2024-12-13 | Object-Focused Data Selection for Dense Prediction Tasks | Niclas Popp et.al. | 2412.10032 | null |
2024-12-13 | CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | Qibo Chen et.al. | 2412.09799 | null |
2024-12-12 | FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection | Ke Li et.al. | 2412.09258 | null |
2024-12-12 | UADet: A Remarkably Simple Yet Effective Uncertainty-Aware Open-Set Object Detection Framework | Silin Cheng et.al. | 2412.09229 | null |
2024-12-12 | ContextHOI: Spatial Context Learning for Human-Object Interaction Detection | Mingda Jia et.al. | 2412.09050 | null |
2024-12-12 | STEAM: Squeeze and Transform Enhanced Attention Module | Rishabh Sabharwal et.al. | 2412.09023 | null |
2024-12-12 | Sensing for Space Safety and Sustainability: A Deep Learning Approach with Vision Transformers | Wenxuan Zhang et.al. | 2412.08913 | null |
2024-12-11 | DALI: Domain Adaptive LiDAR Object Detection via Distribution-level and Instance-level Pseudo Label Denoising | Xiaohu Lu et.al. | 2412.08806 | link |
2024-12-11 | Utilizing Multi-step Loss for Single Image Reflection Removal | Abdelrahman Elnenaey et.al. | 2412.08582 | link |
2024-12-11 | PointCFormer: a Relation-based Progressive Feature Extraction Network for Point Cloud Completion | Yi Zhong et.al. | 2412.08421 | null |
2024-12-13 | Physical Informed Driving World Model | Zhuoran Yang et.al. | 2412.08410 | null |
2024-12-11 | Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Jiaming Lv et.al. | 2412.08139 | null |
2024-12-11 | DTAA: A Detect, Track and Avoid Architecture for navigation in spaces with Multiple Velocity Objects | Samuel Nordström et.al. | 2412.08121 | null |
2024-12-11 | THUD++: Large-Scale Dynamic Indoor Scene Dataset and Benchmark for Mobile Robots | Zeshun Li et.al. | 2412.08096 | null |
2024-12-11 | MAGIC: Mastering Physical Adversarial Generation in Context through Collaborative LLM Agents | Yun Xing et.al. | 2412.08014 | null |
2024-12-10 | Low-Latency Scalable Streaming for Event-Based Vision | Andrew Hamara et.al. | 2412.07889 | null |
2024-12-10 | Multimodal Contextualized Support for Enhancing Video Retrieval System | Quoc-Bao Nguyen-Le et.al. | 2412.07584 | null |
2024-12-10 | Making the Flow Glow -- Robot Perception under Severe Lighting Conditions using Normalizing Flow Gradients | Simon Kristoffersson Lind et.al. | 2412.07565 | link |
2024-12-10 | Enhancing 3D Object Detection in Autonomous Vehicles Based on Synthetic Virtual Environment Analysis | Vladislav Li et.al. | 2412.07509 | null |
2024-12-10 | DSFEC: Efficient and Deployable Deep Radar Object Detection | Gayathri Dandugula et.al. | 2412.07411 | null |
2024-12-10 | Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments | Muhayy Ud Din et.al. | 2412.07392 | null |
2024-12-09 | FlexEvent: Event Camera Object Detection at Arbitrary Frequencies | Dongyue Lu et.al. | 2412.06708 | null |
2024-12-09 | EMOv2: Pushing 5M Vision Model Frontier | Jiangning Zhang et.al. | 2412.06674 | link |
2024-12-09 | Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Xiao Wang et.al. | 2412.06647 | null |
2024-12-09 | Self-Paced Learning Strategy with Easy Sample Prior Based on Confidence for the Flying Bird Object Detection Model Training | Zi-Wei Sun et.al. | 2412.06306 | null |
2024-12-09 | No Annotations for Object Detection in Art through Stable Diffusion | Patrick Ramos et.al. | 2412.06286 | link |
2024-12-09 | DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction | Yunheng Li et.al. | 2412.06244 | null |
2024-12-09 | A Real-Time Defense Against Object Vanishing Adversarial Patch Attacks for Object Detection in Autonomous Vehicles | Jaden Mu et.al. | 2412.06215 | null |
2024-12-09 | PoLaRIS Dataset: A Maritime Object Detection and Tracking Dataset in Pohang Canal | Jiwon Choi et.al. | 2412.06192 | null |
2024-12-08 | Tiny Object Detection with Single Point Supervision | Haoran Zhu et.al. | 2412.05837 | null |
2024-12-07 | Rethinking Annotation for Object Detection: Is Annotating Small-size Instances Worth Its Cost? | Yusuke Hosoya et.al. | 2412.05611 | null |
2024-12-06 | From classical techniques to convolution-based models: A review of object detection algorithms | Fnu Neha et.al. | 2412.05252 | null |
2024-12-06 | Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection | Chaoda Zheng et.al. | 2412.05154 | link |
2024-12-06 | DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection | Yishuo Chen et.al. | 2412.04931 | link |
2024-12-06 | Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection | Khurram Azeem Hashmi et.al. | 2412.04915 | null |
2024-12-05 | Cubify Anything: Scaling Indoor 3D Object Detection | Justin Lazarow et.al. | 2412.04458 | null |
2024-12-05 | Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure | Saheli Hazra et.al. | 2412.04337 | null |
2024-12-05 | YOLO-CCA: A Context-Based Approach for Traffic Sign Detection | Linfeng Jiang et.al. | 2412.04289 | link |
2024-12-05 | DEIM: DETR with Improved Matching for Fast Convergence | Shihua Huang et.al. | 2412.04234 | link |
2024-12-05 | Frequency-Adaptive Low-Latency Object Detection Using Events and Frames | Haitian Zhang et.al. | 2412.04149 | null |
2024-12-05 | Thermal and RGB Images Work Better Together in Wind Turbine Damage Detection | Serhii Svystun et.al. | 2412.04114 | null |
2024-12-05 | SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Seokju Yun et.al. | 2412.04077 | null |
2024-12-05 | Space to Policy: Scalable Brick Kiln Detection and Automatic Compliance Monitoring with Geospatial Data | Zeel B Patel et.al. | 2412.04065 | null |
2024-12-05 | UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time | Lars Schmarje et.al. | 2412.03986 | null |
2024-12-05 | MT3DNet: Multi-Task learning Network for 3D Surgical Scene Reconstruction | Mithun Parab et.al. | 2412.03928 | null |
2024-12-04 | Perception Tokens Enhance Visual Reasoning in Multimodal Language Models | Mahtab Bigverdi et.al. | 2412.03548 | null |
2024-12-04 | Data Fusion of Semantic and Depth Information in the Context of Object Detection | Md Abu Yusuf et.al. | 2412.03490 | null |
2024-12-04 | Task-driven Image Fusion with Learnable Fusion Loss | Haowen Bai et.al. | 2412.03240 | null |
2024-12-04 | ObjectFinder: Open-Vocabulary Assistive System for Interactive Object Search by Blind People | Ruiping Liu et.al. | 2412.03118 | null |
2024-12-04 | TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception | Runjian Chen et.al. | 2412.03054 | null |
2024-12-04 | Assessing the performance of CT image denoisers using Laguerre-Gauss Channelized Hotelling Observer for lesion detection | Prabhat Kc et.al. | 2412.02920 | null |
2024-12-03 | EvRT-DETR: The Surprising Effectiveness of DETR-based Detection for Event Cameras | Dmitrii Torbunov et.al. | 2412.02890 | null |
2024-12-03 | Optimized CNNs for Rapid 3D Point Cloud Object Recognition | Tianyi Lyu et.al. | 2412.02855 | null |
2024-12-03 | Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Abdurrahman Zeybey et.al. | 2412.02803 | null |
2024-12-03 | SJTU:Spatial judgments in multimodal models towards unified segmentation through coordinate detection | Joongwon Chae et.al. | 2412.02565 | null |
2024-12-03 | Underload: Defending against Latency Attacks for Object Detectors on Edge Devices | Tianyi Wang et.al. | 2412.02171 | null |
2024-12-03 | Redundant Queries in DETR-Based 3D Detection Methods: Unnecessary and Prunable | Lizhen Xu et.al. | 2412.02054 | null |
2024-12-02 | Smart Parking with Pixel-Wise ROI Selection for Vehicle Detection Using YOLOv8, YOLOv9, YOLOv10, and YOLOv11 | Gustavo P. C. P. da Luz et.al. | 2412.01983 | null |
2024-12-02 | HPRM: High-Performance Robotic Middleware for Intelligent Autonomous Systems | Jacky Kwok et.al. | 2412.01799 | null |
2024-12-02 | Identifying Reliable Predictions in Detection Transformers | Young-Jin Park et.al. | 2412.01782 | null |
2024-12-02 | FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection | Brian K. S. Isaac-Medina et.al. | 2412.01596 | null |
2024-12-02 | Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection | Hao Tang et.al. | 2412.01556 | null |
2024-12-03 | GFreeDet: Exploiting Gaussian Splatting and Foundation Models for Model-free Unseen Object Detection in the BOP Challenge 2024 | Xingyu Liu et.al. | 2412.01552 | null |
2024-12-02 | Improving Object Detection by Modifying Synthetic Data with Explainable AI | Nitish Mital et.al. | 2412.01477 | null |
2024-11-29 | SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection | Philipp Wolters et.al. | 2411.19860 | null |
2024-11-29 | Feedback-driven object detection and iterative model improvement | Sönke Tenckhoff et.al. | 2411.19835 | link |
2024-11-29 | Real-Time Anomaly Detection in Video Streams | Fabien Poirier et.al. | 2411.19731 | null |
2024-11-29 | LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention | Zewen Du et.al. | 2411.19585 | link |
2024-11-29 | Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Wenbo Zhang et.al. | 2411.19551 | null |
2024-11-28 | Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection | Tsun-Hin Cheung et.al. | 2411.19220 | null |
2024-11-28 | Co-Learning: Towards Semi-Supervised Object Detection with Road-side Cameras | Jicheng Yuan et.al. | 2411.19143 | null |
2024-11-28 | On Moving Object Segmentation from Monocular Video with Transformers | Christian Homeyer et.al. | 2411.19141 | null |
2024-11-28 | Dynamic Attention and Bi-directional Fusion for Safety Helmet Wearing Detection | Junwei Feng et.al. | 2411.19071 | null |
2024-11-28 | MVFormer: Diversifying Feature Normalization and Token Mixing for Efficient Vision Transformers | Jongseong Bae et.al. | 2411.18995 | null |
2024-11-27 | Efficient Dynamic LiDAR Odometry for Mobile Robots with Structured Point Clouds | Jonathan Lichtenfeld et.al. | 2411.18443 | link |
2024-11-27 | Deep Fourier-embedded Network for Bi-modal Salient Object Detection | Pengfei Lyu et.al. | 2411.18409 | link |
2024-11-27 | Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks | Chen Zhou et.al. | 2411.18288 | link |
2024-11-27 | From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects | Zizhao Li et.al. | 2411.18207 | link |
2024-11-27 | RPEE-HEADS: A Novel Benchmark for Pedestrian Head Detection in Crowd Videos | Mohamad Abubaker et.al. | 2411.18164 | null |
2024-11-27 | ROICtrl: Boosting Instance Control for Visual Generation | Yuchao Gu et.al. | 2411.17949 | null |
2024-11-26 | Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning | Hoàng-Ân Lê et.al. | 2411.17536 | link |
2024-11-26 | TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Xiaowen Ma et.al. | 2411.17473 | link |
2024-11-26 | Communication-Efficient Cooperative SLAMMOT via Determining the Number of Collaboration Vehicles | Susu Fang et.al. | 2411.17432 | null |
2024-11-26 | DGNN-YOLO: Dynamic Graph Neural Networks with YOLO11 for Small Object Detection and Tracking in Traffic Surveillance | Shahriar Soudeep et.al. | 2411.17251 | null |
2024-11-26 | Event-based Spiking Neural Networks for Object Detection: A Review of Datasets, Architectures, Learning Rules, and Implementation | Craig Iaboni et.al. | 2411.17006 | link |
2024-11-25 | Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory | Zaira Manigrasso et.al. | 2411.16934 | null |
2024-11-25 | Open Vocabulary Monocular 3D Object Detection | Jin Yao et.al. | 2411.16833 | link |
2024-11-25 | Imperceptible Adversarial Examples in the Physical World | Weilin Xu et.al. | 2411.16622 | null |
2024-11-25 | STDWeb: Simple Transient Detection pipeline for the Web | Sergey Karpov et.al. | 2411.16470 | null |
2024-11-25 | Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks | Asanobu Kitamoto et.al. | 2411.16421 | link |
2024-11-26 | CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation | Leon Sick et.al. | 2411.16319 | null |
2024-11-25 | Diagnosis of diabetic retinopathy using machine learning & deep learning technique | Eric Shah et.al. | 2411.16250 | null |
2024-11-25 | Interpreting Object-level Foundation Models via Visual Precision Search | Ruoyu Chen et.al. | 2411.16198 | null |
2024-11-25 | Learn from Foundation Model: Fruit Detection Model without Manual Annotation | Yanan Wang et.al. | 2411.16196 | null |
2024-11-25 | CIA: Controllable Image Augmentation Framework Based on Stable Diffusion | Mohamed Benkedadra et.al. | 2411.16128 | null |
2024-11-25 | You only thermoelastically deform once: Point Absorber Detection in LIGO Test Masses with YOLO | Simon R. Goode et.al. | 2411.16104 | null |
2024-11-25 | Leverage Task Context for Object Affordance Ranking | Haojie Huang et.al. | 2411.16082 | null |
2024-11-22 | A Real-Time DETR Approach to Bangladesh Road Object Detection for Autonomous Vehicles | Irfan Nafiz Shahan et.al. | 2411.15110 | null |
2024-11-22 | MSSF: A 4D Radar and Camera Fusion Framework With Multi-Stage Sampling for 3D Object Detection in Autonomous Driving | Hongsi Liu et.al. | 2411.15016 | null |
2024-11-22 | VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving | Haiming Zhang et.al. | 2411.14716 | null |
2024-11-21 | Unveiling the Hidden: A Comprehensive Evaluation of Underwater Image Enhancement and Its Impact on Object Detection | Ali Awad et.al. | 2411.14626 | null |
2024-11-21 | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Tianhe Ren et.al. | 2411.14347 | link |
2024-11-21 | AnywhereDoor: Multi-Target Backdoor Attacks on Object Detection | Jialin Lu et.al. | 2411.14243 | null |
2024-11-21 | Transforming Static Images Using Generative Models for Video Salient Object Detection | Suhwan Cho et.al. | 2411.13975 | link |
2024-11-21 | Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation | Ming Zhao et.al. | 2411.13847 | null |
2024-11-20 | MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection | Tong Ning et.al. | 2411.13628 | null |
2024-11-20 | DIS-Mine: Instance Segmentation for Disaster-Awareness in Poor-Light Condition in Underground Mines | Mizanur Rahman Jewel et.al. | 2411.13544 | null |
2024-11-20 | A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data | Kavin Chandrasekaran et.al. | 2411.13311 | link |
2024-11-20 | VADet: Multi-frame LiDAR 3D Object Detection using Variable Aggregation | Chengjie Huang et.al. | 2411.13186 | null |
2024-11-20 | RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation | Christoph Reinders et.al. | 2411.13150 | link |
2024-11-20 | YCB-LUMA: YCB Object Dataset with Luminance Keying for Object Localization | Thomas Pöllabauer et.al. | 2411.13149 | link |
2024-11-20 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | Yongdong Luo et.al. | 2411.13093 | link |
2024-11-20 | Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors | Satoru Koda et.al. | 2411.13047 | null |
2024-11-20 | Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection | Xinhao Zhong et.al. | 2411.13001 | null |
2024-11-19 | Maps from Motion (MfM): Generating 2D Semantic Maps from Sparse Multi-view Images | Matteo Toso et.al. | 2411.12620 | null |
2024-11-19 | GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Shaoqing Xu et.al. | 2411.12452 | null |
2024-11-19 | Physics-Guided Detector for SAR Airplanes | Zhongling Huang et.al. | 2411.12301 | link |
2024-11-18 | Scaling Deep Learning Research with Kubernetes on the NRP Nautilus HyperCluster | J. Alex Hurt et.al. | 2411.12038 | null |
2024-11-18 | LightFFDNets: Lightweight Convolutional Neural Networks for Rapid Facial Forgery Detection | Günel Jabbarlı et.al. | 2411.11826 | null |
2024-11-18 | WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images | Lars Nieradzik et.al. | 2411.11738 | null |
2024-11-18 | Exploring Emerging Trends and Research Opportunities in Visual Place Recognition | Antonios Gasteratos et.al. | 2411.11481 | null |
2024-11-18 | SL-YOLO: A Stronger and Lighter Drone Target Detection Model | Defan Chen et.al. | 2411.11477 | null |
2024-11-19 | EVT: Efficient View Transformation for Multi-Modal 3D Object Detection | Yongjin Lee et.al. | 2411.10715 | null |
2024-11-15 | Vision Eagle Attention: A New Lens for Advancing Image Classification | Mahmudul Hasan et.al. | 2411.10564 | link |
2024-11-15 | Interactive Image-Based Aphid Counting in Yellow Water Traps under Stirring Actions | Xumin Gao et.al. | 2411.10357 | null |
2024-11-15 | RETR: Multi-View Radar Detection Transformer for Indoor Perception | Ryoma Yataka et.al. | 2411.10293 | null |
2024-11-15 | Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | Jingru Yang et.al. | 2411.10252 | null |
2024-11-15 | Real-Time AI-Driven People Tracking and Counting Using Overhead Cameras | Ishrath Ahamed et.al. | 2411.10072 | null |
2024-11-15 | Diachronic Document Dataset for Semantic Layout Analysis | Thibault Clérice et.al. | 2411.10068 | null |
2024-11-14 | Adversarial Attacks Using Differentiable Rendering: A Survey | Matthew Hull et.al. | 2411.09749 | null |
2024-11-14 | Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration | Yifan Shao et.al. | 2411.09604 | link |
2024-11-14 | Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction | Chen-Long Duan et.al. | 2411.09453 | null |
2024-11-14 | Instruction-Driven Fusion of Infrared-Visible Images: Tailoring for Diverse Downstream Tasks | Zengyi Yang et.al. | 2411.09387 | null |
2024-11-14 | DT-JRD: Deep Transformer based Just Recognizable Difference Prediction Model for Video Coding for Machines | Junqi Liu et.al. | 2411.09308 | null |
2024-11-14 | Cross-Modal Consistency in Multimodal Large Language Models | Xiang Zhang et.al. | 2411.09273 | null |
2024-11-14 | LEAP:D -- A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection | Chanyeong Park et.al. | 2411.09180 | null |
2024-11-13 | Multimodal Object Detection using Depth and Image Data for Manufacturing Parts | Nazanin Mahjourian et.al. | 2411.09062 | null |
2024-11-13 | DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models | Yongdong Wang et.al. | 2411.09022 | null |
2024-11-13 | UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation | Chengyuan Zhang et.al. | 2411.08569 | null |
2024-11-13 | Methodology for a Statistical Analysis of Influencing Factors on 3D Object Detection Performance | Anton Kuznietsov et.al. | 2411.08482 | null |
2024-11-13 | V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Xun Huang et.al. | 2411.08402 | link |
2024-11-12 | Large-scale Remote Sensing Image Target Recognition and Automatic Annotation | Wuzheng Dong et.al. | 2411.07802 | link |
2024-11-12 | Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning | Jianhao Li et.al. | 2411.07742 | null |
2024-11-12 | Depthwise Separable Convolutions with Deep Residual Convolutions | Md Arid Hasan et.al. | 2411.07544 | null |
2024-11-11 | Transformers for Charged Particle Track Reconstruction in High Energy Physics | Samuel Van Stroud et.al. | 2411.07149 | null |
2024-11-11 | Multi-scale Frequency Enhancement Network for Blind Image Deblurring | Yawen Xiang et.al. | 2411.06893 | null |
2024-11-11 | Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction | Miguel Antunes-García et.al. | 2411.06851 | link |
2024-11-11 | United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images | Yanguang Sun et.al. | 2411.06703 | link |
2024-11-11 | Track Any Peppers: Weakly Supervised Sweet Pepper Tracking Using VLMs | Jia Syuen Lim et.al. | 2411.06702 | null |
2024-11-11 | LFSamba: Marry SAM with Mamba for Light Field Salient Object Detection | Zhengyi Liu et.al. | 2411.06652 | null |
2024-11-09 | LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation | Weijie Ma et.al. | 2411.06173 | link |
2024-11-09 | AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems | Zhiyu Zhu et.al. | 2411.06146 | null |
2024-11-09 | Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing | Kaixuan Lu et.al. | 2411.06091 | null |
2024-11-09 | An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models | Fatemeh Shiri et.al. | 2411.06048 | link |
2024-11-08 | Open-set object detection: towards unified problem formulation and benchmarking | Hejer Ammar et.al. | 2411.05564 | null |
2024-11-08 | ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving | Tao Ma et.al. | 2411.05311 | null |
2024-11-08 | SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection | Yun Zhao et.al. | 2411.05292 | null |
2024-11-07 | On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data | Aitor Martinez-Seras et.al. | 2411.04586 | null |
2024-11-07 | l0-Regularized Sparse Coding-based Interpretable Network for Multi-Modal Image Fusion | Gargi Panda et.al. | 2411.04519 | null |
2024-11-07 | Pose2Trajectory: Using Transformers on Body Pose to Predict Tennis Player's Trajectory | Ali K. AlShami et.al. | 2411.04501 | null |
2024-11-08 | SuperQ-GRASP: Superquadrics-based Grasp Pose Estimation on Larger Objects for Mobile-Manipulation | Xun Tu et.al. | 2411.04386 | null |
2024-11-07 | UEVAVD: A Dataset for Developing UAV's Eye View Active Object Detection | Xinhua Jiang et.al. | 2411.04348 | null |
2024-11-07 | GazeGen: Gaze-Driven User Interaction for Visual Content Generation | He-Yen Hsieh et.al. | 2411.04335 | null |
2024-11-06 | Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection | Pengfei Lyu et.al. | 2411.03728 | link |
2024-11-06 | Estimation of Psychosocial Work Environment Exposures Through Video Object Detection. Proof of Concept Using CCTV Footage | Claus D. Hansen et.al. | 2411.03724 | null |
2024-11-05 | An Application-Agnostic Automatic Target Recognition System Using Vision Language Models | Anthony Palladino et.al. | 2411.03491 | null |
2024-11-05 | Self-supervised cross-modality learning for uncertainty-aware object detection and recognition in applications which lack pre-labelled training data | Irum Mehboob et.al. | 2411.03082 | null |
2024-11-05 | CRT-Fusion: Camera, Radar, Temporal Fusion Using Motion Information for 3D Object Detection | Jisong Kim et.al. | 2411.03013 | null |
2024-11-05 | Centerness-based Instance-aware Knowledge Distillation with Task-wise Mutual Lifting for Object Detection on Drone Imagery | Bowei Du et.al. | 2411.02861 | null |
2024-11-05 | Correlation of Object Detection Performance with Visual Saliency and Depth Estimation | Matthias Bartolo et.al. | 2411.02844 | link |
2024-11-05 | ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing | Yuka Ogino et.al. | 2411.02799 | null |
2024-11-05 | Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection | Yifan Wang et.al. | 2411.02747 | null |
2024-11-05 | Analysis of Multi-epoch JWST Images of | Zijian Zhang et.al. | 2411.02729 | null |
2024-11-04 | Intelligent Video Recording Optimization using Activity Detection for Surveillance Systems | Youssef Elmir et.al. | 2411.02632 | null |
2024-11-04 | SIRA: Scalable Inter-frame Relation and Association for Radar Perception | Ryoma Yataka et.al. | 2411.02220 | null |
2024-11-04 | Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation | Yan Li et.al. | 2411.02057 | link |
2024-11-04 | V-CAS: A Realtime Vehicle Anti Collision System Using Vision Transformer on Multi-Camera Streams | Muhammad Waqas Ashraf et.al. | 2411.01963 | null |
2024-11-04 | Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models | Sharat Agarwal et.al. | 2411.01925 | null |
2024-11-04 | LiDAttack: Robust Black-box Attack on LiDAR-based Object Detection | Jinyin Chen et.al. | 2411.01889 | link |
2024-11-03 | ROAD-Waymo: Action Awareness at Scale for Autonomous Driving | Salman Khan et.al. | 2411.01683 | null |
2024-11-03 | OSAD: Open-Set Aircraft Detection in SAR Images | Xiayang Xiao et.al. | 2411.01597 | null |
2024-11-03 | One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection | Zhenyu Wang et.al. | 2411.01584 | null |
2024-11-03 | A Visual Question Answering Method for SAR Ship: Breaking the Requirement for Multimodal Dataset Construction and Model Fine-Tuning | Fei Wang et.al. | 2411.01445 | null |
2024-11-03 | Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision | Xiangzhong Luo et.al. | 2411.01431 | null |
2024-10-31 | ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images | Timing Yang et.al. | 2410.24001 | link |
2024-10-31 | Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images | Yakun Xie et.al. | 2410.23991 | null |
2024-10-31 | Uncertainty Estimation for 3D Object Detection via Evidential Learning | Nikita Durasov et.al. | 2410.23910 | null |
2024-10-31 | From Web Data to Real Fields: Low-Cost Unsupervised Domain Adaptation for Agricultural Robots | Vasileios Tzouras et.al. | 2410.23906 | null |
2024-10-31 | Open-Set 3D object detection in LiDAR data as an Out-of-Distribution problem | Louis Soum-Fontez et.al. | 2410.23767 | null |
2024-10-31 | Context-Aware Token Selection and Packing for Enhanced Vision Transformer | Tianyi Zhang et.al. | 2410.23608 | null |
2024-10-30 | EMMA: End-to-End Multimodal Model for Autonomous Driving | Jyh-Jing Hwang et.al. | 2410.23262 | null |
2024-10-30 | S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving | Maciej K. Wozniak et.al. | 2410.23085 | null |
2024-10-30 | First Place Solution to the ECCV 2024 ROAD++ Challenge @ ROAD++ Spatiotemporal Agent Detection 2024 | Tengfei Zhang et.al. | 2410.23077 | null |
2024-10-30 | AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection | Yujin Wang et.al. | 2410.22939 | null |
2024-10-29 | Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection | Gyusam Chang et.al. | 2410.22461 | null |
2024-10-29 | Lighten CARAFE: Dynamic Lightweight Upsampling with Guided Reassemble Kernels | Ruigang Fu et.al. | 2410.22139 | link |
2024-10-29 | Data Generation for Hardware-Friendly Post-Training Quantization | Lior Dikstein et.al. | 2410.22110 | null |
2024-10-29 | Cognitive Semantic Augmentation LEO Satellite Networks for Earth Observation | Hong-fu Chou et.al. | 2410.21916 | null |
2024-10-29 | PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices | Ming Kang et.al. | 2410.21822 | link |
2024-10-28 | MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps | Yating Xu et.al. | 2410.21566 | link |
2024-10-28 | TACO: Adversarial Camouflage Optimization on Trucks to Fool Object Detectors | Adonisz Dimitriu et.al. | 2410.21443 | null |
2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
2024-10-28 | IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Manjunath D et.al. | 2410.20953 | null |
2024-10-28 | SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity | Kunyun Wang et.al. | 2410.20790 | null |
2024-10-27 | Sebica: Lightweight Spatial and Efficient Bidirectional Channel Attention Super Resolution Network | Chongxiao Liu et.al. | 2410.20546 | link |
2024-10-27 | Guidance Disentanglement Network for Optics-Guided Thermal UAV Image Super-Resolution | Zhicheng Zhao et.al. | 2410.20466 | link |
2024-10-27 | Open-Vocabulary Object Detection via Language Hierarchy | Jiaxing Huang et.al. | 2410.20371 | null |
2024-10-27 | Historical Test-time Prompt Tuning for Vision Foundation Models | Jingyi Zhang et.al. | 2410.20346 | null |
2024-10-25 | OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery | Philipe Dias et.al. | 2410.19965 | null |
2024-10-25 | MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services | Hongjia Wu et.al. | 2410.19665 | null |
2024-10-25 | Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models | Shenghao Fu et.al. | 2410.19635 | null |
2024-10-25 | MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors | Fanqi Pu et.al. | 2410.19590 | null |
2024-10-25 | DECADE: Towards Designing Efficient-yet-Accurate Distance Estimation Modules for Collision Avoidance in Mobile Advanced Driver Assistance Systems | Muhammad Zaeem Shahzad et.al. | 2410.19336 | null |
2024-10-25 | In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators | Dmytro Humeniuk et.al. | 2410.19277 | null |
2024-10-24 | HUE Dataset: High-Resolution Event and Frame Sequences for Low-Light Vision | Burak Ercan et.al. | 2410.19164 | null |
2024-10-24 | Optimizing Edge Offloading Decisions for Object Detection | Jiaming Qiu et.al. | 2410.18919 | link |
2024-10-24 | You Only Look Around: Learning Illumination Invariant Feature for Low-light Object Detection | Mingbo Hong et.al. | 2410.18398 | null |
2024-10-24 | Thermal Chameleon: Task-Adaptive Tone-mapping for Radiometric Thermal-Infrared images | Dong-Guw Lee et.al. | 2410.18340 | link |
2024-10-23 | Automated Defect Detection and Grading of Piarom Dates Using Deep Learning | Nasrin Azimi et.al. | 2410.18208 | null |
2024-10-23 | DREB-Net: Dual-stream Restoration Embedding Blur-feature Fusion Network for High-mobility UAV Object Detection | Qingpeng Li et.al. | 2410.17822 | link |
2024-10-23 | YOLO-Vehicle-Pro: A Cloud-Edge Collaborative Framework for Object Detection in Autonomous Driving under Adverse Weather Conditions | Xiguang Li et.al. | 2410.17734 | null |
2024-10-23 | YOLOv11: An Overview of the Key Architectural Enhancements | Rahima Khanam et.al. | 2410.17725 | null |
2024-10-23 | PlantCamo: Plant Camouflage Detection | Jinyu Yang et.al. | 2410.17598 | link |
2024-10-23 | OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking | Haiji Liang et.al. | 2410.17534 | link |
2024-10-22 | EPContrast: Effective Point-level Contrastive Learning for Large-scale Point Cloud Understanding | Zhiyi Pan et.al. | 2410.17207 | null |
2024-10-22 | YOLO-TS: Real-Time Traffic Sign Detection with Enhanced Accuracy Using Optimized Receptive Fields and Anchor-Free Fusion | Junzhou Chen et.al. | 2410.17144 | null |
2024-10-22 | FlightAR: AR Flight Assistance Interface with Multiple Video Streams and Object Detection Aimed at Immersive Drone Control | Oleg Sautenkov et.al. | 2410.16943 | null |
2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
2024-10-22 | DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units | Liam Boyle et.al. | 2410.16769 | null |
2024-10-22 | DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Zhixiong Nan et.al. | 2410.16707 | null |
2024-10-22 | Fire and Smoke Detection with Burning Intensity Representation | Xiaoyi Han et.al. | 2410.16642 | link |
2024-10-21 | Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | Yufei Zhan et.al. | 2410.16163 | link |
2024-10-21 | Multi-Sensor Fusion for UAV Classification Based on Feature Maps of Image and Radar Data | Nikos Sakellariou et.al. | 2410.16089 | null |
2024-10-21 | Few-shot target-driven instance detection based on open-vocabulary object detection models | Ben Crulis et.al. | 2410.16028 | null |
2024-10-21 | How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? | Maximilian Ulmer et.al. | 2410.15766 | null |
2024-10-21 | P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving | Mohamed R. Elshamy et.al. | 2410.15602 | null |
2024-10-21 | Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications | Jintao Ren et.al. | 2410.15584 | null |
2024-10-21 | Online Pseudo-Label Unified Object Detection for Multiple Datasets Training | XiaoJun Tang et.al. | 2410.15569 | null |
2024-10-20 | TrackMe: A Simple and Effective Multiple Object Tracking Annotation Tool | Thinh Phan et.al. | 2410.15518 | null |
2024-10-20 | YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary | Hao-Tang Tsui et.al. | 2410.15346 | null |
2024-10-20 | Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability | Yusuke Hosoya et.al. | 2410.15315 | null |
2024-10-18 | MultiOrg: A Multi-rater Organoid-detection Dataset | Christina Bukas et.al. | 2410.14612 | null |
2024-10-18 | Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech | Shuwei He et.al. | 2410.14101 | link |
2024-10-18 | Enhancing In-vehicle Multiple Object Tracking Systems with Embeddable Ising Machines | Kosuke Tatsumura et.al. | 2410.14093 | null |
2024-10-17 | Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring | Kristina Telegraph et.al. | 2410.13616 | null |
2024-10-17 | RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing Images | Kejun Ren et.al. | 2410.13532 | null |
2024-10-16 | Syn2Real Domain Generalization for Underwater Mine-like Object Detection Using Side-Scan Sonar | Aayush Agrawal et.al. | 2410.12953 | null |
2024-10-16 | MambaBEV: An efficient 3D detection model with Mamba2 | Zihan You et.al. | 2410.12673 | null |
2024-10-16 | Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion | Minkyoung Cho et.al. | 2410.12592 | null |
2024-10-16 | Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look | Yong Zhang et.al. | 2410.12396 | null |
2024-10-16 | Real-time Stereo-based 3D Object Detection for Streaming Perception | Changcai Li et.al. | 2410.12394 | link |
2024-10-16 | Context-Infused Visual Grounding for Art | Selina Khan et.al. | 2410.12369 | link |
2024-10-16 | Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond | Pengwei Liang et.al. | 2410.12274 | null |
2024-10-16 | Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm | Guanming Huang et.al. | 2410.12259 | null |
2024-10-17 | SAM-Guided Masked Token Prediction for 3D Scene Understanding | Zhimin Chen et.al. | 2410.12158 | null |
2024-10-16 | Unveiling the Limits of Alignment: Multi-modal Dynamic Local Fusion Network and A Benchmark for Unaligned RGBT Video Object Detection | Qishun Wang et.al. | 2410.12143 | null |
2024-10-17 | Dynamic Open-Vocabulary 3D Scene Graphs for Long-term Language-Guided Mobile Manipulation | Zhijie Yan et.al. | 2410.11989 | null |
2024-10-15 | Fractal Calibration for long-tailed object detection | Konstantinos Panagiotis Alexandridis et.al. | 2410.11774 | null |
2024-10-15 | POLO -- Point-based, multi-class animal detection | Giacomo May et.al. | 2410.11741 | null |
2024-10-15 | YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection | Olalekan Akindele et.al. | 2410.11727 | null |
2024-10-15 | SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment via Contrast Learning for Multimodal Object Detection | Shuhan Dong et.al. | 2410.11358 | null |
2024-10-15 | Open World Object Detection: A Survey | Yiming Li et.al. | 2410.11301 | null |
2024-10-15 | Representation Similarity: A Better Guidance of DNN Layer Sharing for Edge Computing without Training | Bryan Bo Cao et.al. | 2410.11233 | null |
2024-10-15 | TEOcc: Radar-camera Multi-modal Occupancy Prediction via Temporal Enhancement | Zhiwei Lin et.al. | 2410.11228 | null |
2024-10-16 | CVCP-Fusion: On Implicit Depth Estimation for 3D Bounding Box Prediction | Pranav Gupta et.al. | 2410.11211 | link |
2024-10-15 | Multiview Scene Graph | Juexiao Zhang et.al. | 2410.11187 | null |
2024-10-14 | UAV3D: A Large-scale 3D Perception Benchmark for Unmanned Aerial Vehicles | Hui Ye et.al. | 2410.11125 | null |
2024-10-14 | ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2410.10554 | link |
2024-10-14 | Learning to Ground VLMs without Forgetting | Aritra Bhowmik et.al. | 2410.10491 | null |
2024-10-14 | SMART-TRACK: A Novel Kalman Filter-Guided Sensor Fusion For Robust UAV Object Tracking in Dynamic Environments | Khaled Gabr et.al. | 2410.10409 | null |
2024-10-14 | V2M: Visual 2-Dimensional Mamba for Image Representation Learning | Chengkun Wang et.al. | 2410.10382 | link |
2024-10-14 | GlobalMamba: Global Image Serialization for Vision Mamba | Chengkun Wang et.al. | 2410.10316 | link |
2024-10-14 | ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object | Jiwei Chen et.al. | 2410.10298 | null |
2024-10-14 | Out-of-Bounding-Box Triggers: A Stealthy Approach to Cheat Object Detectors | Tao Lin et.al. | 2410.10091 | link |
2024-10-15 | Optimizing Waste Management with Advanced Object Detection for Garbage Classification | Everest Z. Kuang et.al. | 2410.09975 | null |
2024-10-13 | EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition | Jingyu Liu et.al. | 2410.09954 | null |
2024-10-13 | LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond | Md Tanvir Islam et.al. | 2410.09831 | link |
2024-10-11 | DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection | Haochen Li et.al. | 2410.09004 | null |
2024-10-11 | LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection | Mingjia Li et.al. | 2410.08810 | null |
2024-10-11 | Hespi: A pipeline for automatically detecting information from herbarium specimen sheets | Robert Turnbull et.al. | 2410.08740 | null |
2024-10-11 | MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation | Qihang Yang et.al. | 2410.08739 | null |
2024-10-11 | Boosting Open-Vocabulary Object Detection by Handling Background Samples | Ruizhe Zeng et.al. | 2410.08645 | null |
2024-10-11 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | Nguyen Huu Bao Long et.al. | 2410.08582 | link |
2024-10-11 | VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking | Zekun Qian et.al. | 2410.08529 | null |
2024-10-10 | Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving? | Samir Abou Haidar et.al. | 2410.08365 | null |
2024-10-10 | PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection | Botao Ren et.al. | 2410.08210 | null |
2024-10-10 | Dynamic Object Catching with Quadruped Robot Front Legs | André Schakkal et.al. | 2410.08065 | null |
2024-10-10 | HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective | Pei Liu et.al. | 2410.07758 | null |
2024-10-10 | O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out | Mısra Yavuz et.al. | 2410.07514 | null |
2024-10-09 | Progressive Multi-Modal Fusion for Robust 3D Object Detection | Rohit Mohan et.al. | 2410.07475 | null |
2024-10-11 | Self-Supervised Learning for Real-World Object Detection: a Survey | Alina Ciocarlan et.al. | 2410.07442 | null |
2024-10-09 | Robust infrared small target detection using self-supervised and a contrario paradigms | Alina Ciocarlan et.al. | 2410.07437 | null |
2024-10-09 | SurANet: Surrounding-Aware Network for Concealed Object Detection via Highly-Efficient Interactive Contrastive Learning Strategy | Yuhan Kang et.al. | 2410.06842 | link |
2024-10-09 | Rethinking the Evaluation of Visible and Infrared Image Fusion | Dayan Guan et.al. | 2410.06811 | link |
2024-10-10 | QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Fei Xie et.al. | 2410.06806 | link |
2024-10-09 | QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation | Yuxin Li et.al. | 2410.06516 | null |
2024-10-08 | Adver-City: Open-Source Multi-Modal Dataset for Collaborative Perception Under Adverse Weather Conditions | Mateus Karvat et.al. | 2410.06380 | null |
2024-10-08 | Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach | Sha Guo et.al. | 2410.06149 | null |
2024-10-08 | Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts | Zhiwei Lin et.al. | 2410.05963 | null |
2024-10-08 | Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga | Takara Taniguchi et.al. | 2410.05935 | null |
2024-10-08 | Unobserved Object Detection using Generative Models | Subhransu S. Bhattacharjee et.al. | 2410.05869 | null |
2024-10-08 | CASA: Class-Agnostic Shared Attributes in Vision-Language Models for Efficient Incremental Object Detection | Mingyi Guo et.al. | 2410.05804 | null |
2024-10-07 | Real-Time Truly-Coupled Lidar-Inertial Motion Correction and Spatiotemporal Dynamic Object Detection | Cedric Le Gentil et.al. | 2410.05152 | null |
2024-10-07 | Human-in-the-loop Reasoning For Traffic Sign Detection: Collaborative Approach Yolo With Video-llava | Mehdi Azarafza et.al. | 2410.05096 | null |
2024-10-07 | Improving Object Detection via Local-global Contrastive Learning | Danai Triantafyllidou et.al. | 2410.05058 | null |
2024-10-07 | Improved detection of discarded fish species through BoxAL active learning | Maria Sokolova et.al. | 2410.04880 | link |
2024-10-06 | Learning De-Biased Representations for Remote-Sensing Imagery | Zichen Tian et.al. | 2410.04546 | link |
2024-10-05 | ETHcavation: A Dataset and Pipeline for Panoptic Scene Understanding and Object Tracking in Dynamic Construction Environments | Lorenzo Terenzi et.al. | 2410.04250 | null |
2024-10-05 | Fast Object Detection with a Machine Learning Edge Device | Richard C. Rodriguez et.al. | 2410.04173 | null |
2024-10-05 | Robust Task-Oriented Communication Framework for Real-Time Collaborative Vision Perception | Zhengru Fang et.al. | 2410.04168 | null |
2024-10-05 | Cross Resolution Encoding-Decoding For Detection Transformers | Ashish Kumar et.al. | 2410.04088 | link |
2024-10-05 | Mamba Capsule Routing Towards Part-Whole Relational Camouflaged Object Detection | Dingwen Zhang et.al. | 2410.03987 | null |
2024-10-04 | DRAFTS: A Deep Learning-Based Radio Fast Transient Search Pipeline | Yong-Kun Zhang et.al. | 2410.03200 | null |
2024-10-04 | Learning 3D Perception from Others' Predictions | Jinsu Yoo et.al. | 2410.02646 | null |
2024-10-02 | Enhancing Screen Time Identification in Children with a Multi-View Vision Language Model and Screen Time Tracker | Xinlong Hou et.al. | 2410.01966 | null |
2024-10-02 | 3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection | Yang Cao et.al. | 2410.01647 | link |
2024-10-02 | Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection | Hongru Yan et.al. | 2410.01404 | null |
2024-10-02 | Finetuning Pre-trained Model with Limited Data for LiDAR-based 3D Object Detection by Bridging Domain Gaps | Jiyun Jang et.al. | 2410.01319 | null |
2024-10-02 | Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices | Jeho Lee et.al. | 2410.01270 | null |
2024-10-02 | High and Low Resolution Tradeoffs in Roadside Multimodal Sensing | Shaozu Ding et.al. | 2410.01250 | null |
2024-10-07 | Perceptual Piercing: Human Visual Cue-based Object Detection in Low Visibility Conditions | Ashutosh Kumar et.al. | 2410.01225 | link |
2024-10-02 | A versatile machine learning workflow for high-throughput analysis of supported metal catalyst particles | Arda Genc et.al. | 2410.01213 | link |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-07-17 | SOD-YOLO: Enhancing YOLO-Based Detection of Small Objects in UAV Imagery | Peijun Wang et.al. | 2507.12727 | null |
2025-07-16 | InterpIoU: Rethinking Bounding Box Regression with Interpolation-Based IoU Optimization | Haoyuan Liu et.al. | 2507.12420 | null |
2025-07-08 | High-Frequency Semantics and Geometric Priors for End-to-End Detection Transformers in Challenging UAV Imagery | Hongxing Peng et.al. | 2507.00825 | null |
2025-06-30 | Event-based Tiny Object Detection: A Benchmark Dataset and Baseline | Nuo Chen et.al. | 2506.23575 | null |
2025-06-15 | MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection | Yuxiang Wang et.al. | 2506.12697 | null |
2025-05-28 | Cross-DINO: Cross the Deep MLP and Transformer for Small Object Detection | Guiping Cao et.al. | 2505.21868 | null |
2025-05-27 | Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO | Muzhi Zhu et.al. | 2505.21457 | null |
2025-05-27 | Robust Video-Based Pothole Detection and Area Estimation for Intelligent Vehicles with Depth Map and Kalman Smoothing | Dehao Wang et.al. | 2505.21049 | null |
2025-05-22 | MAFE R-CNN: Selecting More Samples to Learn Category-aware Features for Small Object Detection | Yichen Li et.al. | 2505.16442 | null |
2025-07-23 | Application of YOLOv8 in monocular downward multiple Car Target detection | Shijie Lyu et.al. | 2505.10016 | null |
2025-04-30 | Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors | Richard Schmit et.al. | 2505.00044 | null |
2025-04-29 | Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection | Siwei Wang et.al. | 2504.20602 | null |
2025-04-25 | MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View | Liugang Lu et.al. | 2504.18136 | null |
2025-04-18 | HMPE: HeatMap Embedding for Efficient Transformer-Based Small Object Detection | YangChen Zeng et.al. | 2504.13469 | null |
2025-04-11 | SO-DETR: Leveraging Dual-Domain Features and Knowledge Distillation for Small Object Detection | Huaxiang Zhang et.al. | 2504.11470 | null |
2025-04-14 | Small Object Detection with YOLO: A Performance Analysis Across Model Versions and Hardware | Muhammad Fasih Tariq et.al. | 2504.09900 | null |
2025-03-29 | Context in object detection: a systematic literature review | Mahtab Jamali et.al. | 2503.23249 | null |
2025-03-26 | Small Object Detection: A Comprehensive Survey on Challenges, Techniques and Real-World Applications | Mahya Nikouei et.al. | 2503.20516 | null |
2025-03-24 | LGI-DETR: Local-Global Interaction for UAV Object Detection | Zifa Chen et.al. | 2503.18785 | null |
2025-03-30 | YOLO-LLTS: Real-Time Low-Light Traffic Sign Detection via Prior-Guided Enhancement and Multi-Branch Feature Interaction | Ziyu Lin et.al. | 2503.13883 | null |
2025-03-06 | DEAL-YOLO: Drone-based Efficient Animal Localization using YOLO | Aditya Prashant Naidu et.al. | 2503.04698 | null |
2025-02-05 | An Empirical Study of Methods for Small Object Detection from Satellite Imagery | Xiaohui Yuan et.al. | 2502.03674 | null |
2025-01-30 | Tuning Event Camera Biases Heuristic for Object Detection Applications in Staring Scenarios | David El-Chai Ben-Ezra et.al. | 2501.18788 | null |
2024-12-24 | Multi-Point Positional Insertion Tuning for Small Object Detection | Kanoko Goto et.al. | 2412.18090 | null |
2024-12-13 | PanSR: An Object-Centric Mask Transformer for Panoptic Segmentation | Lojze Žust et.al. | 2412.10589 | link |
2024-12-12 | Analysis of Object Detection Models for Tiny Object in Satellite Imagery: A Dataset-Centric Approach | Kailas PS et.al. | 2412.10453 | null |
2024-12-16 | RemDet: Rethinking Efficient Model Design for UAV Object Detection | Chen Li et.al. | 2412.10040 | link |
2025-01-08 | YOLOv5-Based Object Detection for Emergency Response in Aerial Imagery | Sindhu Boddu et.al. | 2412.05394 | null |
2024-11-28 | Dynamic Attention and Bi-directional Fusion for Safety Helmet Wearing Detection | Junwei Feng et.al. | 2411.19071 | null |
2024-12-27 | DGNN-YOLO: Interpretable Dynamic Graph Neural Networks with YOLO11 for Small Object Detection and Tracking in Traffic Surveillance | Shahriar Soudeep et.al. | 2411.17251 | null |
2025-01-13 | SL-YOLO: A Stronger and Lighter Drone Target Detection Model | Defan Chen et.al. | 2411.11477 | null |
2024-11-15 | Interactive Image-Based Aphid Counting in Yellow Water Traps under Stirring Actions | Xumin Gao et.al. | 2411.10357 | null |
2024-11-14 | Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration | Yifan Shao et.al. | 2411.09604 | link |
2024-11-01 | LAM-YOLO: Drones-based Small Object Detection on Lighting-Occlusion Attention Mechanism YOLO | Yuchen Zheng et.al. | 2411.00485 | null |
2024-10-29 | PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices | Ming Kang et.al. | 2410.21822 | link |
2024-10-11 | Self-Supervised Learning for Real-World Object Detection: a Survey | Alina Ciocarlan et.al. | 2410.07442 | null |
2024-10-09 | Robust infrared small target detection using self-supervised and a contrario paradigms | Alina Ciocarlan et.al. | 2410.07437 | null |
2024-08-28 | Small Object Detection for Indoor Assistance to the Blind using YOLO NAS Small and Super Gradients | Rashmi BN et.al. | 2409.07469 | null |
2024-09-07 | Unleashing the Power of Generic Segmentation Models: A Simple Baseline for Infrared Small Target Detection | Mingjin Zhang et.al. | 2409.04714 | null |
2024-09-06 | BFA-YOLO: Balanced multiscale object detection network for multi-view building facade attachments detection | Yangguang Chen et.al. | 2409.04025 | null |
2024-08-16 | Enhancing Object Detection with Hybrid dataset in Manufacturing Environments: Comparing Federated Learning to Conventional Techniques | Vinit Hegiste et.al. | 2408.08974 | null |
2024-08-14 | Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection | Zhonglin Chen et.al. | 2408.07455 | null |
2024-08-08 | SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes | Boshra Khalili et.al. | 2408.04786 | null |
2024-07-29 | Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images | Zewen Du et.al. | 2407.19696 | link |
2024-07-25 | XS-VID: An Extremely Small Video Object Detection Dataset | Jiahao Guo et.al. | 2407.18137 | null |
2024-07-23 | ESOD: Efficient Small Object Detection on High-Resolution Images | Kai Liu et.al. | 2407.16424 | null |
2024-06-20 | Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines | Xinyi Ying et.al. | 2406.14482 | link |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-07-22 | A Single-step Accurate Fingerprint Registration Method Based on Local Feature Matching | Yuwei Jia et.al. | 2507.16201 | null |
2025-07-09 | Dual-Granularity Cross-Modal Identity Association for Weakly-Supervised Text-to-Person Image Matching | Yafei Zhang et.al. | 2507.06744 | null |
2025-07-05 | From Query to Explanation: Uni-RAG for Multi-Modal Retrieval-Augmented Learning in STEM | Xinyi Wu et.al. | 2507.03868 | null |
2025-07-02 | What does really matter in image goal navigation? | Gianluca Monaci et.al. | 2507.01667 | null |
2025-06-30 | Efficient and Accurate Image Provenance Analysis: A Scalable Pipeline for Large-scale Images | Jiewei Lai et.al. | 2506.23707 | null |
2025-06-29 | Dynamic Contrastive Learning for Hierarchical Retrieval: A Case Study of Distance-Aware Cross-View Geo-Localization | Suofei Zhang et.al. | 2506.23077 | null |
2025-06-27 | MatChA: Cross-Algorithm Matching with Feature Augmentation | Paula Carbó Cubero et.al. | 2506.22336 | null |
2025-07-22 | Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs | Shaojie Zhang et.al. | 2506.22139 | null |
2025-06-27 | ZeroReg3D: A Zero-shot Registration Pipeline for 3D Consecutive Histopathology Image Reconstruction | Juming Xiong et.al. | 2506.21923 | null |
2025-06-25 | Fast entropy-regularized SDP relaxations for permutation synchronization | Michael Lindsey et.al. | 2506.20191 | null |
2025-06-18 | ReSeDis: A Dataset for Referring-based Object Search across Large-Scale Image Collections | Ziling Huang et.al. | 2506.15180 | null |
2025-06-16 | EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition | Bingxi Liu et.al. | 2506.13133 | null |
2025-06-12 | RealKeyMorph: Keypoints in Real-world Coordinates for Resolution-agnostic Image Registration | Mina C. Moghadam et.al. | 2506.10344 | null |
2025-06-11 | Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints | Xiangkai Zhang et.al. | 2506.09748 | null |
2025-06-11 | ScaleLSD: Scalable Deep Line Segment Detection Streamlined | Zeran Ke et.al. | 2506.09369 | link |
2025-05-21 | Anti-interrupted sampling repeater jamming via linear canonical Wigner distribution lightweight LFM detection | Jia-Mian Li et.al. | 2506.06302 | null |
2025-06-05 | Vanishing arcs for isolated plane curve singularities | Hanwool Bae et.al. | 2506.04917 | null |
2025-06-05 | Deep Learning Reforms Image Matching: A Survey and Outlook | Shihua Zhang et.al. | 2506.04619 | null |
2025-06-20 | SR3D: Unleashing Single-view 3D Reconstruction for Transparent and Specular Object Grasping | Mingxu Zhang et.al. | 2505.24305 | null |
2025-06-05 | Universal Domain Adaptation for Semantic Segmentation | Seun-An Choe et.al. | 2505.22458 | null |
2025-05-23 | To Glue or Not to Glue? Classical vs Learned Image Matching for Mobile Mapping Cameras to Textured Semantic 3D Building Models | Simone Gaisbauer et.al. | 2505.17973 | link |
2025-05-16 | Multi-view dense image matching with similarity learning and geometry priors | Mohamed Ali Chebbi et.al. | 2505.11264 | null |
2025-05-12 | Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud Anomaly Detection | Yuqi Cheng et.al. | 2505.07375 | link |
2025-05-04 | OBD-Finder: Explainable Coarse-to-Fine Text-Centric Oracle Bone Duplicates Discovery | Chongsheng Zhang et.al. | 2505.03836 | link |
2025-05-06 | LiftFeat: 3D Geometry-Aware Local Feature Matching | Yepeng Liu et.al. | 2505.03422 | link |
2025-05-04 | Focus What Matters: Matchability-Based Reweighting for Local Feature Matching | Dongyue Li et.al. | 2505.02161 | null |
2025-05-15 | Mitigating Modality Bias in Multi-modal Entity Alignment from a Causal Perspective | Taoyu Su et.al. | 2504.19458 | link |
2025-04-28 | Dynamic Arthroscopic Navigation System for Anterior Cruciate Ligament Reconstruction Based on Multi-level Memory Architecture | Shuo Wang et.al. | 2504.19398 | null |
2025-04-23 | Road Similarity-Based BEV-Satellite Image Matching for UGV Localization | Zhenping Sun et.al. | 2504.16346 | null |
2025-04-18 | Outlier-Robust Multi-Model Fitting on Quantum Annealers | Saurabh Pandey et.al. | 2504.13836 | null |
2025-04-11 | Geometric Consistency Refinement for Single Image Novel View Synthesis via Test-Time Adaptation of Diffusion Models | Josef Bengtson et.al. | 2504.08348 | null |
2025-04-10 | Image registration of 2D optical thin sections in a 3D porous medium: Application to a Berea sandstone digital rock image | Jaehong Chung et.al. | 2504.06604 | link |
2025-04-22 | To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition | Davide Sferrazza et.al. | 2504.06116 | link |
2025-04-10 | Learning Affine Correspondences by Integrating Geometric Constraints | Pengju Sun et.al. | 2504.04834 | link |
2025-04-01 | Scaling Prompt Instructed Zero Shot Composed Image Retrieval with Image-Only Data | Yiqun Duan et.al. | 2504.00812 | null |
2025-03-31 | CoMatch: Dynamic Covisibility-Aware Transformer for Bilateral Subpixel-Level Semi-Dense Image Matching | Zizhuo Li et.al. | 2503.23925 | null |
2025-03-28 | Pairwise Matching of Intermediate Representations for Fine-grained Explainability | Lauren Shrack et.al. | 2503.22881 | link |
2025-03-26 | Multimodal Image Matching based on Frequency-domain Information of Local Energy Response | Meng Yang et.al. | 2503.20827 | null |
2025-03-22 | Normalized Matching Transformer | Abtin Pourhadi et.al. | 2503.17715 | link |
2025-03-20 | Loop Closure from Two Views: Revisiting PGO for Scalable Trajectory Estimation through Monocular Priors | Tian Yi Lim et.al. | 2503.16275 | null |
2025-03-20 | MapGlue: Multimodal Remote Sensing Image Matching | Peihao Wu et.al. | 2503.16185 | link |
2025-03-19 | PAPI-Reg: Patch-to-Pixel Solution for Efficient Cross-Modal Registration between LiDAR Point Cloud and Camera Image | Yuanchao Yue et.al. | 2503.15285 | null |
2025-04-07 | Less Biased Noise Scale Estimation for Threshold-Robust RANSAC | Johan Edstedt et.al. | 2503.13433 | null |
2025-03-17 | SatDepth: A Novel Dataset for Satellite Image Matching | Rahul Deshmukh et.al. | 2503.12706 | link |
2025-03-14 | Refining Image Edge Detection via Linear Canonical Riesz Transforms | Shuhui Yang et.al. | 2503.11148 | null |
2025-03-13 | Speedy MASt3R | Jingxing Li et.al. | 2503.10017 | null |
2025-03-11 | Keypoint Detection and Description for Raw Bayer Images | Jiakai Lin et.al. | 2503.08673 | null |
2025-03-06 | Learning 3D Medical Image Models From Brain Functional Connectivity Network Supervision For Mental Disorder Diagnosis | Xingcan Hu et.al. | 2503.04205 | null |
2025-03-07 | Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration | Qianliang Wu et.al. | 2503.04127 | null |
2025-03-05 | JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba | Xiaoyong Lu et.al. | 2503.03437 | null |
2025-02-28 | CNSv2: Probabilistic Correspondence Encoded Neural Image Servo | Anzhe Chen et.al. | 2503.00132 | null |
2025-02-27 | A2-GNN: Angle-Annular GNN for Visual Descriptor-free Camera Relocalization | Yejun Zhang et.al. | 2502.20036 | link |
2025-02-27 | RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges | Thibaut Loiseau et.al. | 2502.19955 | null |
2025-02-26 | BEV-LIO(LC): BEV Image Assisted LiDAR-Inertial Odometry with Loop Closure | Haoxin Cai et.al. | 2502.19242 | link |
2025-02-25 | PromptMID: Modal Invariant Descriptors Based on Diffusion and Vision Foundation Models for Optical-SAR Image Matching | Han Nie et.al. | 2502.18104 | link |
2025-02-25 | Improving Transformer Based Line Segment Detection with Matched Predicting and Re-ranking | Xin Tong et.al. | 2502.17766 | null |
2025-03-04 | Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model | Yaxuan Huang et.al. | 2502.16779 | null |
2025-02-16 | FeaKM: Robust Collaborative Perception under Noisy Pose Conditions | Jiuwu Hao et.al. | 2502.11003 | link |
2025-02-24 | Enhancing Ground-to-Aerial Image Matching for Visual Misinformation Detection Using Semantic Segmentation | Emanuele Mule et.al. | 2502.06288 | link |
2025-02-04 | Muographic Image Upsampling with Machine Learning for Built Infrastructure Applications | William O'Donnell et.al. | 2502.02624 | null |
2025-02-01 | MambaGlue: Fast and Robust Local Feature Matching With Mamba | Kihwan Ryoo et.al. | 2502.00462 | link |
2025-01-24 | Dense-SfM: Structure from Motion with Dense Consistent Matching | JongMin Lee et.al. | 2501.14277 | null |
2025-01-20 | MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching | Yepeng Liu et.al. | 2501.11299 | null |
2025-01-13 | MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training | Xingyi He et.al. | 2501.07556 | null |
2025-01-13 | Matching Free Depth Recovery from Structured Light | Zhuohang Yu et.al. | 2501.07113 | null |
2025-01-02 | Sparis: Neural Implicit Surface Reconstruction of Indoor Scenes from Sparse Views | Yulun Wu et.al. | 2501.01196 | null |
2024-12-31 | Towards Real-Time 2D Mapping: Harnessing Drones, AI, and Computer Vision for Advanced Insights | Bharath Kumar Agnur et.al. | 2412.20210 | null |
2024-12-27 | MINIMA: Modality Invariant Image Matching | Xingyu Jiang et.al. | 2412.19412 | link |
2024-12-24 | GIMS: Image Matching System Based on Adaptive Graph Construction and Graph Neural Network | Xianfeng Song et.al. | 2412.18221 | link |
2024-12-17 | Bringing Multimodality to Amazon Visual Search System | Xinliang Zhu et.al. | 2412.13364 | null |
2024-12-04 | Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis | Siyoon Jin et.al. | 2412.03150 | null |
2024-11-20 | DT-LSD: Deformable Transformer-based Line Segment Detection | Sebastian Janampa et.al. | 2411.13005 | link |
2024-11-15 | Image Matching Filtering and Refinement by Planes and Beyond | Fabio Bellavia et.al. | 2411.09484 | link |
2024-11-11 | XPoint: A Self-Supervised Visual-State-Space based Architecture for Multispectral Image Registration | Ismail Can Yagmur et.al. | 2411.07430 | link |
2024-11-07 | The Impact of Semi-Supervised Learning on Line Segment Detection | Johanna Engman et.al. | 2411.04596 | link |
2024-11-04 | Silver medal Solution for Image Matching Challenge 2024 | Yian Wang et.al. | 2411.01851 | null |
2024-10-30 | Variable Resolution Sampling and Deep Learning Image Recovery for Accelerated Multi-Spectral MRI Near Metal Implants | Azadeh Sharafi et.al. | 2410.23329 | null |
2024-11-05 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | null |
2024-10-31 | ETO: Efficient Transformer-based Local Feature Matching by Organizing Multiple Homography Hypotheses | Junjie Ni et.al. | 2410.22733 | null |
2024-10-30 | LoFLAT: Local Feature Matching using Focused Linear Attention Transformer | Naijian Cao et.al. | 2410.22710 | null |
2024-10-26 | Generative Adversarial Patches for Physical Attacks on Cross-Modal Pedestrian Re-Identification | Yue Su et.al. | 2410.20097 | null |
2024-10-01 | A Robust Multisource Remote Sensing Image Matching Method Utilizing Attention and Feature Enhancement Against Noise Interference | Yuan Li et.al. | 2410.11848 | null |
2024-10-15 | LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images | Yuzhou Cheng et.al. | 2410.11505 | null |
2024-10-12 | Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence | Felipe Cadar et.al. | 2410.09533 | link |
2024-09-27 | Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras | Yipeng Lu et.al. | 2409.18673 | null |
2024-09-25 | Game4Loc: A UAV Geo-Localization Benchmark from Game Data | Yuxiang Ji et.al. | 2409.16925 | link |
2024-09-24 | Automatic Registration of SHG and H&E Images with Feature-based Initial Alignment and Intensity-based Instance Optimization: Contribution to the COMULIS Challenge | Marek Wodzinski et.al. | 2409.15931 | null |
2024-09-10 | Weakly-supervised Camera Localization by Ground-to-satellite Image Registration | Yujiao Shi et.al. | 2409.06471 | link |
2024-09-05 | Enabling Practical and Privacy-Preserving Image Processing | Chao Wang et.al. | 2409.03568 | null |
2024-09-20 | A General Albedo Recovery Approach for Aerial Photogrammetric Images through Inverse Rendering | Shuang Song et.al. | 2409.03032 | link |
2024-08-29 | Super-Resolution works for coastal simulations | Zhi-Song Liu et.al. | 2408.16553 | null |
2024-09-15 | Mismatched: Evaluating the Limits of Image Matching Approaches and Benchmarks | Sierra Bonilla et.al. | 2408.16445 | link |
2024-08-26 | Affine steerers for structured keypoint description | Georg Bökman et.al. | 2408.14186 | link |
2024-08-25 | TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers | Chuanrui Zhang et.al. | 2408.13770 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-10-16 | Development of Image Collection Method Using YOLO and Siamese Network | Chan Young Shin et.al. | 2410.12561 | null |
2024-10-16 | LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment | Juelin Zhu et.al. | 2410.12269 | null |
2024-10-16 | Leveraging Spatial Attention and Edge Context for Optimized Feature Selection in Visual Localization | Nanda Febri Istighfarin et.al. | 2410.12240 | null |
2024-10-15 | LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images | Yuzhou Cheng et.al. | 2410.11505 | null |
2024-10-15 | Multiview Scene Graph | Juexiao Zhang et.al. | 2410.11187 | null |
2024-10-12 | Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence | Felipe Cadar et.al. | 2410.09533 | link |
2024-10-11 | Voxel-SLAM: A Complete, Accurate, and Versatile LiDAR-Inertial SLAM System | Zheng Liu et.al. | 2410.08935 | link |
2024-10-16 | Semantic Token Reweighting for Interpretable and Controllable Text Embeddings in CLIP | Eunji Kim et.al. | 2410.08469 | null |
2024-10-11 | A Unified Deep Semantic Expansion Framework for Domain-Generalized Person Re-identification | Eugene P. W. Ang et.al. | 2410.08456 | null |
2024-10-10 | A Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks | Hoin Jung et.al. | 2410.07593 | null |
2024-10-09 | Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval | Mohammad Omama et.al. | 2410.07022 | null |
2024-10-09 | Pair-VPR: Place-Aware Pre-training and Contrastive Pair Classification for Visual Place Recognition with Vision Transformers | Stephen Hausler et.al. | 2410.06614 | null |
2024-10-09 | MedImageInsight: An Open-Source Embedding Model for General Domain Medical Imaging | Noel C. F. Codella et.al. | 2410.06542 | null |
2024-10-08 | Temporal Image Caption Retrieval Competition -- Description and Results | Jakub Pokrywka et.al. | 2410.06314 | null |
2024-10-08 | Monocular Visual Place Recognition in LiDAR Maps via Cross-Modal State Space Model and Multi-View Matching | Gongxin Yao et.al. | 2410.06285 | null |
2024-10-08 | GSLoc: Visual Localization with 3D Gaussian Splatting | Kazii Botashev et.al. | 2410.06165 | null |
2024-10-08 | Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning | Ayush Singh et.al. | 2410.05928 | null |
2024-10-08 | RNR-Nav: A Real-World Visual Navigation System Using Renderable Neural Radiance Maps | Minsoo Kim et.al. | 2410.05621 | null |
2024-10-11 | LoTLIP: Improving Language-Image Pre-training for Long Text Understanding | Wei Wu et.al. | 2410.05249 | null |
2024-10-06 | LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation | Jianhao Jiao et.al. | 2410.04419 | null |
2024-10-02 | Boosting Weakly-Supervised Referring Image Segmentation via Progressive Comprehension | Zaiquan Yang et.al. | 2410.01544 | null |
2024-10-03 | EUFCC-CIR: a Composed Image Retrieval Dataset for GLAM Collections | Francesc Net et.al. | 2410.01536 | link |
2024-10-04 | CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment | Safouane El Ghazouali et.al. | 2410.01411 | link |
2024-09-30 | Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation | Aleyna Kütük et.al. | 2410.00266 | null |
2024-09-29 | CELLmap: Enhancing LiDAR SLAM through Elastic and Lightweight Spherical Map Representation | Yifan Duan et.al. | 2409.19597 | null |
2024-09-28 | VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition | Ahmad Khaliq et.al. | 2409.19293 | link |
2024-09-27 | MASt3R-SfM: a Fully-Integrated Solution for Unconstrained Structure-from-Motion | Bardienus Duisterhof et.al. | 2409.19152 | null |
2024-09-26 | Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval | Mankeerat Sidhu et.al. | 2409.18733 | null |
2024-09-26 | Revisit Anything: Visual Place Recognition via Image Segment Retrieval | Kartik Garg et.al. | 2409.18049 | link |
2024-09-24 | GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization | Gennady Sidorov et.al. | 2409.16502 | link |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-10-15 | RS-MOCO: A deep learning-based topology-preserving image registration method for cardiac T1 mapping | Chiyi Huang et.al. | 2410.11651 | null |
2024-10-14 | MoonMetaSync: Lunar Image Registration Analysis | Ashutosh Kumar et.al. | 2410.11118 | link |
2024-10-14 | Stationary Velocity Fields on Matrix Groups for Deformable Image Registration | Johannes Bostelmann et.al. | 2410.10997 | null |
2024-10-14 | A Counterexample in Image Registration | Serap A. Savari et.al. | 2410.10725 | null |
2024-10-12 | FiRework: Field Refinement Framework for Efficient Enhancement of Deformable Registration | Haiqiao Wang et.al. | 2410.09595 | link |
2024-10-12 | Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence | Felipe Cadar et.al. | 2410.09533 | link |
2024-10-11 | Hierarchical uncertainty estimation for learning-based registration in neuroimaging | Xiaoling Hu et.al. | 2410.09299 | link |
2024-10-07 | DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration | Yongtai Zhuo et.al. | 2410.05234 | link |
2024-10-07 | Variable Resolution Pixel Quantization for Low Power Machine Vision Application on Edge | Senorita Deb et.al. | 2410.05189 | null |
2024-10-04 | DiffKillR: Killing and Recreating Diffeomorphisms for Cell Annotation in Dense Microscopy Images | Chen Liu et.al. | 2410.03058 | link |
2024-10-03 | Deep Regression 2D-3D Ultrasound Registration for Liver Motion Correction in Focal Tumor Thermal Ablation | Shuwei Xing et.al. | 2410.02579 | link |
2024-10-07 | NestedMorph: Enhancing Deformable Medical Image Registration with Nested Attention Mechanisms | Gurucharan Marthi Krishna Kumar et.al. | 2410.02550 | null |
2024-10-03 | CTARR: A fast and robust method for identifying anatomical regions on CT images via atlas registration | Thomas Buddenkotte et.al. | 2410.02316 | link |
2024-09-30 | Shuffled Linear Regression via Spectral Matching | Hang Liu et.al. | 2410.00078 | null |
2024-09-30 | Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model | Fulong Ma et.al. | 2409.20164 | null |
2024-09-29 | Dual-Attention Frequency Fusion at Multi-Scale for Joint Segmentation and Deformable Medical Image Registration | Hongchao Zhou et.al. | 2409.19658 | null |
2024-09-28 | Trigger-Based Fragile Model Watermarking for Image Transformation Networks | Preston K. Robinette et.al. | 2409.19442 | null |
2024-09-27 | ADEPT: A Noninvasive Method for Determining Elastic Properties of Valve Tissue | Wensi Wu et.al. | 2409.19081 | null |
2024-09-26 | Ophthalmic Biomarker Detection with Parallel Prediction of Transformer and Convolutional Architecture | Md. Touhidul Islam et.al. | 2409.17788 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-07-23 | Mapping ground-based coronagraphic images to Helioprojective-Cartesian coordinate system by image registration | Feiyang Sha et.al. | 2507.17670 | null |
2025-07-22 | Combined Image Data Augmentations diminish the benefits of Adaptive Label Smoothing | Georg Siedel et.al. | 2507.16427 | null |
2025-07-21 | Compress-Align-Detect: onboard change detection from unregistered images | Gabriele Inzerillo et.al. | 2507.15578 | null |
2025-07-17 | fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting | Alicia Durrer et.al. | 2507.13146 | null |
2025-07-17 | cIDIR: Conditioned Implicit Neural Representation for Regularized Deformable Image Registration | Sidaty El Hadramy et.al. | 2507.12953 | null |
2025-07-16 | Pathology-Guided Virtual Staining Metric for Evaluation and Training | Qiankai Wang et.al. | 2507.12624 | null |
2025-07-15 | Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration? | Hanxue Gu et.al. | 2507.11569 | null |
2025-07-14 | Well-posedness of an optical flow based optimal control formulation for image registration | Johannes Haubner et.al. | 2507.10188 | null |
2025-07-09 | Segmentation Regularized Training for Multi-Domain Deep Learning Registration applied to MR-Guided Prostate Cancer Radiotherapy | Sudharsan Madhavan et.al. | 2507.06966 | null |
2025-07-08 | Exploring Partial Multi-Label Learning via Integrating Semantic Co-occurrence Knowledge | Xin Wu et.al. | 2507.05992 | null |
2025-07-08 | From Motion to Meaning: Biomechanics-Informed Neural Network for Explainable Cardiovascular Disease Identification | Comte Valentin et.al. | 2507.05783 | null |
2025-07-06 | Grid-Reg: Grid-Based SAR and Optical Image Registration Across Platforms | Xiaochen Wei et.al. | 2507.04233 | null |
2025-06-29 | Multimodal image registration for effective thermographic fever screening | C. Y. N. Dwith et.al. | 2507.02955 | null |
2025-07-09 | Modality-agnostic, patient-specific digital twins modeling temporally varying digestive motion | Jorge Tapias Gomez et.al. | 2507.01909 | null |
2025-07-01 | On the Intensity-based Inversion Method for Quantitative Quasi-Static Elastography | Ekaterina Sherina et.al. | 2507.01207 | null |
2025-07-08 | Bridging Classical and Learning-based Iterative Registration through Deep Equilibrium Models | Yi Zhang et.al. | 2507.00582 | null |
2025-06-30 | Puzzles: Unbounded Video-Depth Augmentation for Scalable End-to-End 3D Reconstruction | Jiahao Ma et.al. | 2506.23863 | null |
2025-06-27 | Cardiovascular disease classification using radiomics and geometric features from cardiac CT | Ajay Mittal et.al. | 2506.22226 | null |
2025-06-27 | Robust and Accurate Multi-view 2D/3D Image Registration with Differentiable X-ray Rendering and Dual Cross-view Constraints | Yuxin Cui et.al. | 2506.22191 | null |
2025-06-25 | Real-Time 3D Guidewire Reconstruction from Intraoperative DSA Images for Robot-Assisted Endovascular Interventions | Tianliang Yao et.al. | 2506.21631 | null |
2025-06-25 | Photon Absorption Remote Sensing (PARS): Comprehensive Absorption Imaging Enabling Label-Free Biomolecule Characterization and Mapping | Benjamin R. Ecclestone et.al. | 2506.20069 | null |
2025-06-24 | VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration | Hang Zhang et.al. | 2506.19975 | null |
2025-06-24 | Deformable Medical Image Registration with Effective Anatomical Structure Representation and Divide-and-Conquer Network | Xinke Ma et.al. | 2506.19222 | null |
2025-06-23 | A Deep Learning Based Method for Fast Registration of Cardiac Magnetic Resonance Images | Benjamin Graham et.al. | 2506.19167 | null |
2025-06-19 | Watermarking Autoregressive Image Generation | Nikola Jovanović et.al. | 2506.16349 | link |
2025-06-18 | Tree-based adaptive finite element methods for deformable image registration | Nicolás A. Barnafi et.al. | 2506.15876 | null |
2025-06-30 | Mono-Modalizing Extremely Heterogeneous Multi-Modal Medical Image Registration | Kyobin Choo et.al. | 2506.15596 | null |
2025-06-17 | A Digital Twin Framework for Adaptive Treatment Planning in Radiotherapy | Chih-Wei Chang et.al. | 2506.14701 | null |
2025-06-16 | PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images | Lingteng Qiu et.al. | 2506.13766 | null |
2025-06-12 | Unsupervised Deformable Image Registration with Structural Nonparametric Smoothing | Hang Zhang et.al. | 2506.10813 | null |
2025-06-12 | RealKeyMorph: Keypoints in Real-world Coordinates for Resolution-agnostic Image Registration | Mina C. Moghadam et.al. | 2506.10344 | null |
2025-06-11 | CINeMA: Conditional Implicit Neural Multi-Modal Atlas for a Spatio-Temporal Representation of the Perinatal Brain | Maik Dannecker et.al. | 2506.09668 | link |
2025-06-11 | Geometry Reduced Order Modeling (GROM) with application to modeling of glymphatic function | Andreas Solheim et.al. | 2506.09442 | link |
2025-06-07 | Exploring Image Transforms derived from Eye Gaze Variables for Progressive Autism Diagnosis | Abigail Copiaco et.al. | 2506.09065 | null |
2025-06-04 | Personalized MR-Informed Diffusion Models for 3D PET Image Reconstruction | George Webber et.al. | 2506.03804 | null |
2025-06-03 | FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens | Christian Schlarmann et.al. | 2506.03096 | null |
2025-06-03 | Guiding Registration with Emergent Similarity from Pre-Trained Diffusion Models | Nurislam Tursynbek et.al. | 2506.02419 | null |
2025-06-02 | Implicit Deformable Medical Image Registration with Learnable Kernels | Stefano Fogarollo et.al. | 2506.02150 | null |
2025-06-01 | Modality Translation and Registration of MR and Ultrasound Images Using Diffusion Models | Xudong Ma et.al. | 2506.01025 | null |
2025-05-30 | MRDust: Wireless Implant Data Uplink & Localization via Magnetic Resonance Image Modulation | Biqi Rebekah Zhao et.al. | 2506.00143 | null |
2025-05-30 | A Novel Coronary Artery Registration Method Based on Super-pixel Particle Swarm Optimization | Peng Qi et.al. | 2505.24351 | null |
2025-05-30 | Fourier ptychographic microscopy aided with transport of intensity equation for robust full phase spectrum reconstruction | Mikołaj Rogalski et.al. | 2505.24322 | null |
2025-05-30 | Pretraining Deformable Image Registration Networks with Random Images | Junyu Chen et.al. | 2505.24167 | link |
2025-05-30 | Beyond the LUMIR challenge: The pathway to foundational registration models | Junyu Chen et.al. | 2505.24160 | null |
2025-05-28 | Collaborative Learning for Unsupervised Multimodal Remote Sensing Image Registration: Integrating Self-Supervision and MIM-Guided Diffusion-Based Image Translation | Xiaochen Wei et.al. | 2505.22000 | null |
2025-05-27 | Moment kernels: a simple and scalable approach for equivariance to rotations and reflections in deep convolutional networks | Zachary Schlamowitz et.al. | 2505.21736 | null |
2025-05-23 | To Glue or Not to Glue? Classical vs Learned Image Matching for Mobile Mapping Cameras to Textured Semantic 3D Building Models | Simone Gaisbauer et.al. | 2505.17973 | null |
2025-06-01 | 4D-CTA Image and geometry dataset for kinematic analysis of abdominal aortic aneurysms | Mostafa Jamshidian et.al. | 2505.17647 | null |
2025-05-22 | Temporal Differential Fields for 4D Motion Modeling via Image-to-Video Synthesis | Xin You et.al. | 2505.17333 | null |
2025-05-22 | Deep mineralogical segmentation of thin section images based on QEMSCAN maps | Jean Pablo Vieira de Mello et.al. | 2505.17008 | link |
2025-05-23 | Tracking the Flight: Exploring a Computational Framework for Analyzing Escape Responses in Plains Zebra (Equus quagga) | Isla Duporge et.al. | 2505.16882 | link |
2025-05-18 | Kornia-rs: A Low-Level 3D Computer Vision Library In Rust | Edgar Riba et.al. | 2505.12425 | null |
2025-05-15 | IMITATE: Image Registration with Context for unknown time frame recovery | Ziad Kheil et.al. | 2505.10124 | link |
2025-05-15 | Non-Registration Change Detection: A Novel Change Detection Task and Benchmark Dataset | Zhe Shan et.al. | 2505.09939 | link |
2025-05-11 | AugMixCloak: A Defense against Membership Inference Attacks via Image Transformation | Heqing Ren et.al. | 2505.07149 | null |
2025-05-11 | Federated Learning with LoRA Optimized DeiT and Multiscale Patch Embedding for Secure Eye Disease Recognition | Md. Naimur Asif Borno et.al. | 2505.06982 | null |
2025-05-11 | Bi-directional Self-Registration for Misaligned Infrared-Visible Image Fusion | Timing Li et.al. | 2505.06920 | null |
2025-05-10 | Improving Generalization of Medical Image Registration Foundation Model | Jing Hu et.al. | 2505.06527 | link |
2025-05-09 | FF-PNet: A Pyramid Network Based on Feature and Field for Brain Image Registration | Ying Zhang et.al. | 2505.04938 | null |
2025-05-07 | Tetrahedron-Net for Medical Image Registration | Jinhai Xiang et.al. | 2505.04380 | null |
2025-05-05 | Unsupervised training of keypoint-agnostic descriptors for flexible retinal image registration | David Rivas-Villar et.al. | 2505.02787 | null |
2025-05-05 | Unsupervised Deep Learning-based Keypoint Localization Estimating Descriptor Matching Performance | David Rivas-Villar et.al. | 2505.02779 | null |
2025-04-30 | MagicCraft: Natural Language-Driven Generation of Dynamic and Interactive 3D Objects for Commercial Metaverse Platforms | Ryutaro Kurai et.al. | 2504.21332 | null |
2025-04-24 | Spectral Bias Correction in PINNs for Myocardial Image Registration of Pathological Data | Bastien C. Baluyot et.al. | 2504.17945 | null |
2025-04-22 | Towards prediction of morphological heart age from computed tomography angiography | Johan Öfverstedt et.al. | 2504.15783 | null |
2025-04-19 | Exploring Generalizable Pre-training for Real-world Change Detection via Geometric Estimation | Yitao Zhao et.al. | 2504.14306 | null |
2025-04-17 | SC3EF: A Joint Self-Correlation and Cross-Correspondence Estimation Framework for Visible and Thermal Image Registration | Xi Tong et.al. | 2504.12869 | null |
2025-04-17 | Computer-Aided Design of Personalized Occlusal Positioning Splints Using Multimodal 3D Data | Agnieszka Anna Tomaka et.al. | 2504.12868 | null |
2025-04-16 | Correlation Ratio for Unsupervised Learning of Multi-modal Deformable Registration | Xiaojian Chen et.al. | 2504.12265 | link |
2025-04-16 | A Category-Fragment Segmentation Framework for Pelvic Fracture Segmentation in X-ray Images | Daiqi Liu et.al. | 2504.11872 | null |
2025-04-13 | Imaging Transformer for MRI Denoising: a Scalable Model Architecture that enables SNR << 1 Imaging | Hui Xue et.al. | 2504.10534 | null |
2025-04-13 | Capturing Longitudinal Changes in Brain Morphology Using Temporally Parameterized Neural Displacement Fields | Aisha L. Shuaibu et.al. | 2504.09514 | null |
2025-04-09 | IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces | Nian Wu et.al. | 2504.07999 | link |
2025-04-10 | Geometric and Dosimetric Validation of Deformable Image Registration for Prostate MR-guided Adaptive Radiotherapy | Victor N. Malkov et.al. | 2504.07933 | null |
2025-04-09 | OmniCaptioner: One Captioner to Rule Them All | Yiting Lu et.al. | 2504.07089 | link |
2025-04-10 | nnLandmark: A Self-Configuring Method for 3D Medical Landmark Detection | Alexandra Ertl et.al. | 2504.06742 | null |
2025-04-09 | Large Scale Supervised Pretraining For Traumatic Brain Injury Segmentation | Constantin Ulrich et.al. | 2504.06741 | null |
2025-04-09 | EDIT: Enhancing Vision Transformers by Mitigating Attention Sink through an Encoder-Decoder Architecture | Wenfeng Feng et.al. | 2504.06738 | null |
2025-04-10 | Image registration of 2D optical thin sections in a 3D porous medium: Application to a Berea sandstone digital rock image | Jaehong Chung et.al. | 2504.06604 | link |
2025-04-08 | OSDM-MReg: Multimodal Image Registration based One Step Diffusion Model | Xiaochen Wei et.al. | 2504.06027 | null |
2025-04-07 | Biomechanical Constraints Assimilation in Deep-Learning Image Registration: Application to sliding and locally rigid deformations | Ziad Kheil et.al. | 2504.05444 | null |
2025-04-07 | Solving the fully nonlinear Monge-Ampère equation using the Legendre-Kolmogorov-Arnold Network method | Bingcheng Hu et.al. | 2504.05022 | null |
2025-04-03 | IMPACT: A Generic Semantic Loss for Multimodal Medical Image Registration | Valentin Boussot et.al. | 2503.24121 | link |
2025-04-01 | OncoReg: Medical Image Registration for Oncological Challenges | Wiebke Heyer et.al. | 2503.23179 | link |
2025-03-28 | Divide to Conquer: A Field Decomposition Approach for Multi-Organ Whole-Body CT Image Registration | Xuan Loc Pham et.al. | 2503.22281 | null |
2025-03-26 | UWarp: A Whole Slide Image Registration Pipeline to Characterize Scanner-Induced Local Domain Shift | Antoine Schieb et.al. | 2503.20653 | null |
2025-03-26 | Robust Flower Cluster Matching Using The Unscented Transform | Andy Chu et.al. | 2503.20631 | null |
2025-03-26 | Vision-Amplified Semantic Entropy for Hallucination Detection in Medical Visual Question Answering | Zehui Liao et.al. | 2503.20504 | null |
2025-03-25 | SACB-Net: Spatial-awareness Convolutions for Medical Image Registration | Xinxing Cheng et.al. | 2503.19592 | link |
2025-03-21 | Model reduction of convection-dominated viscous conservation laws using implicit feature tracking and landmark image registration | Victor Zucatti et.al. | 2503.17463 | null |
2025-03-21 | Halton Scheduler For Masked Generative Image Transformer | Victor Besnier et.al. | 2503.17076 | link |
2025-03-21 | Downstream Analysis of Foundational Medical Vision Models for Disease Progression | Basar Demir et.al. | 2503.16842 | null |
2025-03-18 | Weakly Supervised Spatial Implicit Neural Representation Learning for 3D MRI-Ultrasound Deformable Image Registration in HDR Prostate Brachytherapy | Jing Wang et.al. | 2503.14395 | null |
2025-03-18 | Text-Guided Image Invariant Feature Learning for Robust Image Watermarking | Muhammad Ahtesham et.al. | 2503.13805 | null |
2025-03-17 | UniReg: Foundation Model for Controllable Medical Image Registration | Zi Li et.al. | 2503.12868 | null |
2025-03-15 | Meta-operators for all-optical image processing | Linzhi Yu et.al. | 2503.12252 | null |
2025-03-14 | Multi-Stage Generative Upscaler: Reconstructing Football Broadcast Images via Diffusion Models | Luca Martini et.al. | 2503.11181 | null |
2025-03-13 | How Should We Evaluate Uncertainty in Accelerated MRI Reconstruction? | Luca Trautmann et.al. | 2503.10527 | null |
2025-03-14 | On the Limitations of Vision-Language Models in Understanding Image Transforms | Ahmad Mustafa Anis et.al. | 2503.09837 | null |
2025-03-10 | NimbleReg: A light-weight deep-learning framework for diffeomorphic image registration | Antoine Legouhy et.al. | 2503.07768 | null |
2025-03-10 | Evaluation of Alignment-Regularity Characteristics in Deformable Image Registration | Vasiliki Sideri-Lampretsa et.al. | 2503.07185 | null |
2025-03-07 | New multimodal similarity measure for image registration via modeling local functional dependence with linear combination of learned basis functions | Joel Honkamaa et.al. | 2503.05335 | link |
2025-03-07 | Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration | Qianliang Wu et.al. | 2503.04127 | null |
2025-03-02 | Cross Modality Medical Image Synthesis for Improving Liver Segmentation | Muhammad Rafiq et.al. | 2503.00945 | null |
2025-03-02 | Personalizing the meshed SPL/NAC Brain Atlas for patient-specific scientific computing using SynthMorph | Andy Huynh et.al. | 2503.00931 | null |
2025-03-02 | NCF: Neural Correspondence Field for Medical Image Registration | Lei Zhou et.al. | 2503.00760 | null |
2025-02-26 | Deep learning and classical computer vision techniques in medical image analysis: Case studies on brain MRI tissue segmentation, lung CT COPD registration, and skin lesion classification | Anyimadu Daniel Tweneboah et.al. | 2502.19258 | null |
2025-02-26 | From Traditional to Deep Learning Approaches in Whole Slide Image Registration: A Methodological Review | Behnaz Elhaminia et.al. | 2502.19123 | null |
2025-02-24 | SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy | Adrian Thummerer et.al. | 2502.17609 | null |
2025-02-22 | Good Representation, Better Explanation: Role of Convolutional Neural Networks in Transformer-Based Remote Sensing Image Captioning | Swadhin Das et.al. | 2502.16095 | null |
2025-02-23 | Triad: Vision Foundation Model for 3D Magnetic Resonance Imaging | Shansong Wang et.al. | 2502.14064 | link |
2025-02-17 | On the Logic Elements Associated with Round-Off Errors and Gaussian Blur in Image Registration: A Simple Case of Commingling | Serap A. Savari et.al. | 2502.11992 | null |
2025-02-17 | Medical Image Registration Meets Vision Foundation Model: Prototype Learning and Contour Awareness | Hao Xu et.al. | 2502.11440 | link |
2025-02-15 | Super Resolution image reconstructs via total variation-based image deconvolution: a majorization-minimization approach | Mouhamad Chehaitly et.al. | 2502.10876 | null |
2025-02-15 | Hybrid Deepfake Image Detection: A Comprehensive Dataset-Driven Approach Integrating Convolutional and Attention Mechanisms with Frequency Domain Features | Kafi Anan et.al. | 2502.10682 | null |
2025-02-14 | PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention Control | Kunal Swami et.al. | 2502.10258 | null |
2025-02-13 | Vision-based Geo-Localization of Future Mars Rotorcraft in Challenging Illumination Conditions | Dario Pisanti et.al. | 2502.09795 | null |
2025-02-12 | MRUCT: Mixed Reality Assistance for Acupuncture Guided by Ultrasonic Computed Tomography | Yue Yang et.al. | 2502.08786 | null |
2025-02-07 | Investigating the impact of kernel harmonization and deformable registration on inspiratory and expiratory chest CT images for people with COPD | Aravind R. Krishnan et.al. | 2502.05119 | null |
2025-02-06 | Expanding Training Data for Endoscopic Phenotyping of Eosinophilic Esophagitis | Juming Xiong et.al. | 2502.04199 | null |
2025-02-05 | REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations | Peter Sushko et.al. | 2502.03629 | null |
2025-02-05 | A Unified Framework for Semi-Supervised Image Segmentation and Registration | Ruizhe Li et.al. | 2502.03229 | null |
2025-02-05 | Tell2Reg: Establishing spatial correspondence between images by the same language prompts | Wen Yan et.al. | 2502.03118 | link |
2025-02-05 | PoleStack: Robust Pole Estimation of Irregular Objects from Silhouette Stacking | Jacopo Villa et.al. | 2502.02907 | null |
2025-02-04 | Test Time Training for 4D Medical Image Interpolation | Qikang Zhang et.al. | 2502.02341 | link |
2025-02-04 | MORPH-LER: Log-Euclidean Regularization for Population-Aware Image Registration | Mokshagna Sai Teja Karanam et.al. | 2502.02029 | null |
2025-02-03 | Label Correction for Road Segmentation Using Road-side Cameras | Henrik Toikka et.al. | 2502.01281 | null |
2025-02-03 | Multi-Resolution SAR and Optical Remote Sensing Image Registration Methods: A Review, Datasets, and Future Perspectives | Wenfei Zhang et.al. | 2502.01002 | null |
2025-01-31 | Transformation trees -- documentation of multimodal image registration | Agnieszka Anna Tomaka et.al. | 2501.19140 | null |
2025-01-31 | An Adversarial Approach to Register Extreme Resolution Tissue Cleared 3D Brain Images | Abdullah Naziba et.al. | 2501.18815 | link |
2025-01-27 | Multi-Objective Deep-Learning-based Biomechanical Deformable Image Registration with MOREA | Georgios Andreadis et.al. | 2501.16525 | null |
2025-01-23 | Variational U-Net with Local Alignment for Joint Tumor Extraction and Registration (VALOR-Net) of Breast MRI Data Acquired at Two Different Field Strengths | Muhammad Shahkar Khan et.al. | 2501.13690 | null |
2025-01-22 | Learning accurate rigid registration for longitudinal brain MRI from synthetic data | Jingru Fu et.al. | 2501.13010 | null |
2025-01-22 | LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation | Jiahao Wang et.al. | 2501.12976 | null |
2025-01-21 | Regressor-Guided Image Editing Regulates Emotional Response to Reduce Online Engagement | Christoph Gebhardt et.al. | 2501.12289 | null |
2025-01-18 | Deformable Image Registration of Dark-Field Chest Radiographs for Local Lung Signal Change Assessment | Fabian Drexel et.al. | 2501.10757 | null |
2025-01-18 | Quasi-linear maps and image transformations | S. V. Butler et.al. | 2501.10635 | null |
2025-01-15 | A Vessel Bifurcation Landmark Pair Dataset for Abdominal CT Deformable Image Registration (DIR) Validation | Edward R Criscuolo et.al. | 2501.09162 | link |
2025-01-15 | TimeFlow: Longitudinal Brain Image Registration and Aging Progression Analysis | Bailiang Jian et.al. | 2501.08667 | null |
2025-01-13 | MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training | Xingyi He et.al. | 2501.07556 | null |
2025-01-13 | Implicit Neural Representations for Registration of Left Ventricle Myocardium During a Cardiac Cycle | Mathias Micheelsen Lowes et.al. | 2501.07248 | link |
2025-01-19 | Improved joint modelling of breast cancer radiomics features and hazard by image registration aided longitudinal CT data | Subrata Mukherjee et.al. | 2501.06814 | null |
2025-01-06 | COph100: A comprehensive fundus image registration dataset from infants constituting the "RIDIRP" database | Yan Hu et.al. | 2501.02800 | null |
2025-01-02 | Rephotography in the Digital Era: Mass Rephotography and re.photos, the Web Portal for Rephotography | Axel Schaffland et.al. | 2501.02017 | null |
2024-12-31 | Estimation of 3T MR images from 1.5T images regularized with Physics based Constraint | Prabhjot Kaur et.al. | 2501.01464 | null |
2024-12-29 | Motion Transfer-Driven intra-class data augmentation for Finger Vein Recognition | Xiu-Feng Huang et.al. | 2412.20327 | link |
2024-12-27 | Structural Similarity in Deep Features: Image Quality Assessment Robust to Geometrically Disparate Reference | Keke Zhang et.al. | 2412.19553 | null |
2024-12-24 | Advancing Deformable Medical Image Registration with Multi-axis Cross-covariance Attention | Mingyuan Meng et.al. | 2412.18545 | null |
2024-12-23 | Unsupervised learning of spatially varying regularization for diffeomorphic image registration | Junyu Chen et.al. | 2412.17982 | null |
2024-12-22 | Classifier-guided registration of coronary CT angiography and intravascular ultrasound | R. L. M. van Herten et.al. | 2412.17100 | null |
2024-12-20 | LEDA: Log-Euclidean Diffeomorphic Autoencoder for Efficient Statistical Analysis of Diffeomorphism | Krithika Iyer et.al. | 2412.16129 | null |
2024-12-20 | From Model Based to Learned Regularization in Medical Image Registration: A Comprehensive Review | Anna Reithmeir et.al. | 2412.15740 | null |
2024-12-19 | MUSTER: Longitudinal Deformable Registration by Composition of Consecutive Deformations | Edvard O. S. Grødem et.al. | 2412.14671 | link |
2024-12-19 | E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling | Zhihang Yuan et.al. | 2412.14170 | null |
2024-12-17 | Image registration is a geometric deep learning task | Vasiliki Sideri-Lampretsa et.al. | 2412.13294 | null |
2024-12-17 | Prompt Augmentation for Self-supervised Text-guided Image Manipulation | Rumeysa Bodur et.al. | 2412.13081 | null |
2024-12-17 | Identifying Bias in Deep Neural Networks Using Image Transforms | Sai Teja Erukude et.al. | 2412.13079 | link |
2024-12-16 | IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation | Yiren Song et.al. | 2412.11638 | null |
2024-12-13 | RAID-Database: human Responses to Affine Image Distortions | Paula Daudén-Oliver et.al. | 2412.10211 | null |
2024-12-12 | On Round-Off Errors and Gaussian Blur in Superresolution and in Image Registration | Serap A. Savari et.al. | 2412.09741 | null |
2024-12-10 | AmCLR: Unified Augmented Learning for Cross-Modal Representations | Ajay Jagannath et.al. | 2412.07979 | link |
2024-12-09 | Table2Image: Interpretable Tabular data Classification with Realistic Image Transformations | Seungeun Lee et.al. | 2412.06265 | link |
2024-12-05 | Blind Underwater Image Restoration using Co-Operational Regressor Networks | Ozer Can Devecioglu et.al. | 2412.03995 | null |
2024-12-04 | MRNet: Multifaceted Resilient Networks for Medical Image-to-Image Translation | Hyojeong Lee et.al. | 2412.03039 | null |
2024-12-02 | CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion | Kai He et.al. | 2412.01792 | null |
2024-12-03 | Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation | Bolin Lai et.al. | 2412.01027 | null |
2024-11-28 | FAN-Unet: Enhancing Unet with vision Fourier Analysis Block for Biomedical Image Segmentation | Jiashu Xu et.al. | 2411.18975 | null |
2024-11-27 | Neural Image Unfolding: Flattening Sparse Anatomical Structures using Neural Fields | Leonhard Rist et.al. | 2411.18415 | null |
2024-11-26 | CAMLD: Contrast-Agnostic Medical Landmark Detection with Consistency-Based Regularization | Soorena Salari et.al. | 2411.17845 | null |
2024-11-25 | Improving Deformable Image Registration Accuracy through a Hybrid Similarity Metric and CycleGAN Based Auto-Segmentation | Keyur D. Shah et.al. | 2411.16992 | null |
2024-11-25 | Oriented histogram-based vector field embedding for characterizing 4D CT data sets in radiotherapy | Frederic Madesta et.al. | 2411.16314 | null |
2024-11-28 | Can Encrypted Images Still Train Neural Networks? Investigating Image Information and Random Vortex Transformation | XiaoKai Cao et.al. | 2411.16207 | link |
2024-11-24 | Making Images from Images: Interleaving Denoising and Transformation | Shumeet Baluja et.al. | 2411.15925 | null |
2024-11-24 | ZeroGS: Training 3D Gaussian Splatting from Unposed Images | Yu Chen et.al. | 2411.15779 | null |
2024-11-23 | LDM-Morph: Latent diffusion model guided deformable image registration | Jiong Wu et.al. | 2411.15426 | link |
2024-11-26 | Exploiting Watermark-Based Defense Mechanisms in Text-to-Image Diffusion Models for Unauthorized Data Usage | Soumil Datta et.al. | 2411.15367 | null |
2024-11-21 | Automatic brain tumor segmentation in 2D intra-operative ultrasound images using MRI tumor annotations | Mathilde Faanes et.al. | 2411.14017 | link |
2024-11-20 | Virtual Staining of Label-Free Tissue in Imaging Mass Spectrometry | Yijie Zhang et.al. | 2411.13120 | null |
2024-11-13 | A generalized software framework for consolidation of radiotherapy planning and delivery data from diverse data sources | Yasin Abdulkadir et.al. | 2411.08876 | null |
2024-11-12 | Atmospheric turbulence restoration by diffeomorphic image registration and blind deconvolution | Jerome Gilles et.al. | 2411.07578 | null |
2024-11-12 | Uncertainty-Aware Test-Time Adaptation for Inverse Consistent Diffeomorphic Lung Image Registration | Muhammad F. A. Chaudhary et.al. | 2411.07567 | null |
2024-11-11 | XPoint: A Self-Supervised Visual-State-Space based Architecture for Multispectral Image Registration | Ismail Can Yagmur et.al. | 2411.07430 | link |
2024-11-10 | Graph Neural Networks for modelling breast biomechanical compression | Hadeel Awwad et.al. | 2411.06596 | link |
2024-11-09 | NeuReg: Domain-invariant 3D Image Registration on Human and Mouse Brains | Taha Razzaq et.al. | 2411.06315 | null |
2024-11-11 | Relationships between the degrees of freedom in the affine Gaussian derivative model for visual receptive fields and 2-D affine image transformations, with application to covariance properties of simple cells in the primary visual cortex | Tony Lindeberg et.al. | 2411.05673 | null |
2024-11-05 | A Symmetric Dynamic Learning Framework for Diffeomorphic Medical Image Registration | Jinqiu Deng et.al. | 2411.02888 | null |
2024-11-05 | Applications of Automatic Differentiation in Image Registration | Warin Watson et.al. | 2411.02806 | link |
2024-11-04 | Multi-modal deformable image registration using untrained neural networks | Quang Luong Nhat Nguyen et.al. | 2411.02672 | null |
2024-11-04 | Advanced computer vision for extracting georeferenced vehicle trajectories from drone imagery | Robert Fonod et.al. | 2411.02136 | null |
2024-11-03 | FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing | Jitesh Joshi et.al. | 2411.01542 | link |
2024-11-03 | MambaReg: Mamba-Based Disentangled Convolutional Sparse Coding for Unsupervised Deformable Multi-Modal Image Registration | Kaiang Wen et.al. | 2411.01399 | null |
2024-11-02 | RLE: A Unified Perspective of Data Augmentation for Cross-Spectral Re-identification | Lei Tan et.al. | 2411.01225 | link |
2024-10-29 | NCA-Morph: Medical Image Registration with Neural Cellular Automata | Amin Ranem et.al. | 2410.22265 | link |
2024-10-27 | Unsupervised Panoptic Interpretation of Latent Spaces in GANs Using Space-Filling Vector Quantization | Mohammad Hassan Vali et.al. | 2410.20573 | link |
2024-10-27 | UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration | Runshi Zhang et.al. | 2410.20348 | link |
2024-10-26 | Cross-Survey Image Transformation: Enhancing SDSS and DECaLS Images to Near-HSC Quality for Advanced Astronomical Analysis | Zhijian Luo et.al. | 2410.20025 | null |
2024-10-25 | Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Ilan Naiman et.al. | 2410.19538 | null |
2024-10-24 | A Counterexample in Cross-Correlation Template Matching | Serap A. Savari et.al. | 2410.19085 | null |
2024-10-24 | Python workflow for segmenting multiphase flow in porous rocks | Catherine Spurin et.al. | 2410.18937 | link |
2024-10-23 | MsMorph: An Unsupervised pyramid learning network for brain image registration | Jiaofen Nan et.al. | 2410.18228 | link |
2024-10-23 | Improving Instance Optimization in Deformable Image Registration with Gradient Projection | Yi Zhang et.al. | 2410.15767 | null |
2024-10-18 | GESH-Net: Graph-Enhanced Spherical Harmonic Convolutional Networks for Cortical Surface Registration | Ruoyu Zhang et.al. | 2410.14805 | null |
2024-10-18 | 2D-3D Deformable Image Registration of Histology Slide and Micro-CT with ML-based Initialization | Junan Chen et.al. | 2410.14343 | null |
2024-10-17 | SAMReg: SAM-enabled Image Registration with ROI-based Correspondence | Shiqi Huang et.al. | 2410.14083 | link |
2024-10-13 | S | Yongxiang Liu et.al. | 2410.13891 | null |
2024-10-15 | RS-MOCO: A deep learning-based topology-preserving image registration method for cardiac T1 mapping | Chiyi Huang et.al. | 2410.11651 | null |
2024-10-14 | MoonMetaSync: Lunar Image Registration Analysis | Ashutosh Kumar et.al. | 2410.11118 | link |
2024-10-14 | Stationary Velocity Fields on Matrix Groups for Deformable Image Registration | Johannes Bostelmann et.al. | 2410.10997 | null |
2024-10-14 | A Counterexample in Image Registration | Serap A. Savari et.al. | 2410.10725 | null |
2024-10-12 | FiRework: Field Refinement Framework for Efficient Enhancement of Deformable Registration | Haiqiao Wang et.al. | 2410.09595 | link |
2024-10-12 | Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence | Felipe Cadar et.al. | 2410.09533 | link |
2024-10-11 | Hierarchical uncertainty estimation for learning-based registration in neuroimaging | Xiaoling Hu et.al. | 2410.09299 | link |