Image Caption

Publish Date	Title	Authors	PDF	Code
2025-06-26	SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark	Alex Costanzino et.al.	2506.21549v1	null
2025-06-26	HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation	Xinzhuo Li et.al.	2506.21546v1	null
2025-06-26	DeOcc-1-to-3: 3D De-Occlusion from a Single Image via Self-Supervised Multi-View Diffusion	Yansong Qu et.al.	2506.21544v1	null
2025-06-26	WorldVLA: Towards Autoregressive Action World Model	Jun Cen et.al.	2506.21539v1	null
2025-06-26	Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval	Hani Alomari et.al.	2506.21538v1	null
2025-06-26	Exploring the Design Space of 3D MLLMs for CT Report Generation	Mohammed Baharoon et.al.	2506.21535v1	null
2025-06-26	The Kaleidoscope Survey: Strong Gravitational Lensing in Galaxy Clusters with Radial Arcs	Catherine Cerny et.al.	2506.21531v1	null
2025-06-26	Revealing electron-lattice decoupling by Peltier thermometry and nanoscale thermal imaging in graphene	Saurabh Kumar Srivastav et.al.	2506.21523v1	null
2025-06-26	Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration	Jiahe Chen et.al.	2506.21509v1	null
2025-06-26	Lightweight Physics-Informed Zero-Shot Ultrasound Plane Wave Denoising	Hojat Asgariandehkordi et.al.	2506.21499v1	null
2025-06-26	TITAN: Query-Token based Domain Adaptive Adversarial Learning	Tajamul Ashraf et.al.	2506.21484v1	null
2025-06-26	SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture	Kehan Sui et.al.	2506.21478v1	null
2025-06-26	Wild refitting for black box prediction	Martin J. Wainwright et.al.	2506.21460v1	null
2025-06-26	Spatial Mental Modeling from Limited Views	Baiqiao Yin et.al.	2506.21458v1	null
2025-06-26	A Comprehensive Dataset for Underground Miner Detection in Diverse Scenario	Cyrus Addy et.al.	2506.21451v1	null
2025-06-26	Controllable 3D Placement of Objects with Scene-Aware Diffusion Models	Mohamed Omran et.al.	2506.21446v1	null
2025-06-26	HyperSORT: Self-Organising Robust Training with hyper-networks	Samuel Joutard et.al.	2506.21430v1	null
2025-06-26	XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation	Bowen Chen et.al.	2506.21416v1	null
2025-06-26	Plasmonically Enhanced Flexural-Mode AlScN Nanoplate Resonator as Uncooled and Ultrafast IR Detector with High Responsivity	Aurelio Venditti et.al.	2506.21412v1	null
2025-06-26	Distributed Cross-Channel Hierarchical Aggregation for Foundation Models	Aristeidis Tsaris et.al.	2506.21411v1	null
2025-06-26	FastRef: Fast Prototype Refinement for Few-Shot Industrial Anomaly Detection	Long Tian et.al.	2506.21398v1	null
2025-06-26	High-quality metalens enables minimally invasive CFB endoscopy	Ruixiang Song et.al.	2506.21379v1	null
2025-06-26	GenFlow: Interactive Modular System for Image Generation	Duc-Hung Nguyen et.al.	2506.21369v1	null
2025-06-26	rQdia: Regularizing Q-Value Distributions With Image Augmentation	Sam Lerman et.al.	2506.21367v1	null
2025-06-26	CA-I2P: Channel-Adaptive Registration Network with Global Optimal Selection	Zhixin Cheng et.al.	2506.21364v1	null
2025-06-26	Primordial Metamaterials	A. Ware et.al.	2506.21359v1	null
2025-06-26	ToosiCubix: Monocular 3D Cuboid Labeling via Vehicle Part Annotations	Behrooz Nasihatkon et.al.	2506.21358v1	null
2025-06-26	ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models	Hongbo Liu et.al.	2506.21356v1	null
2025-06-26	SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning	Melanie Rieff et.al.	2506.21355v1	null
2025-06-26	Generalizable Neural Electromagnetic Inverse Scattering	Yizhe Cheng et.al.	2506.21349v1	null