2025-06-26 | SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark | Alex Costanzino et al. | 2506.21549v1 | null
2025-06-26 | SAM4D: Segment Anything in Camera and LiDAR Streams | Jianyun Xu et al. | 2506.21547v1 | null
2025-06-26 | HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation | Xinzhuo Li et al. | 2506.21546v1 | null
2025-06-26 | Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval | Hani Alomari et al. | 2506.21538v1 | null
2025-06-26 | Exploring the Design Space of 3D MLLMs for CT Report Generation | Mohammed Baharoon et al. | 2506.21535v1 | null
2025-06-26 | Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration | Jiahe Chen et al. | 2506.21509v1 | null
2025-06-26 | Towards Reliable Detection of Empty Space: Conditional Marked Point Processes for Object Detection | Tobias J. Riedlinger et al. | 2506.21486v1 | null
2025-06-26 | Global and Local Entailment Learning for Natural World Imagery | Srikumar Sastry et al. | 2506.21476v1 | null
2025-06-26 | Aligning Spoken Dialogue Models from User Interactions | Anne Wu et al. | 2506.21463v1 | null
2025-06-26 | ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing | Huadai Liu et al. | 2506.21448v1 | null
2025-06-26 | Domain Knowledge-Enhanced LLMs for Fraud and Concept Drift Detection | Ali Şenol et al. | 2506.21443v1 | null
2025-06-26 | HyperSORT: Self-Organising Robust Training with hyper-networks | Samuel Joutard et al. | 2506.21430v1 | null
2025-06-26 | XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation | Bowen Chen et al. | 2506.21416v1 | null
2025-06-26 | Accelerating GNN Training through Locality-aware Dropout and Merge | Gongjian Sun et al. | 2506.21414v1 | null
2025-06-26 | TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding | Junwen Zhang et al. | 2506.21393v1 | null
2025-06-26 | PanSt3R: Multi-view Consistent Panoptic Segmentation | Lojze Zust et al. | 2506.21348v1 | null
2025-06-26 | HieraSurg: Hierarchy-Aware Diffusion Model for Surgical Video Generation | Diego Biagini et al. | 2506.21287v1 | null
2025-06-26 | Temporal Rate Reduction Clustering for Human Motion Segmentation | Xianghan Meng et al. | 2506.21249v1 | null
2025-06-26 | GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models | Qifei Cui et al. | 2506.21245v1 | null
2025-06-26 | ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation | Xiwei Xuan et al. | 2506.21233v1 | null
2025-06-26 | Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval | Yongchan Chun et al. | 2506.21222v1 | null
2025-06-26 | Robust and efficient pre-processing techniques for particle-based methods including dynamic boundary generation | Niklas S. Neher et al. | 2506.21206v1 | null
2025-06-26 | MedPrompt: LLM-CNN Fusion with Weight Routing for Medical Image Segmentation and Classification | Shadman Sobhan et al. | 2506.21199v1 | null
2025-06-26 | Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation | Yihong Cao et al. | 2506.21198v1 | null
2025-06-26 | Out-of-Distribution Semantic Occupancy Prediction | Yuheng Zhang et al. | 2506.21185v1 | null
2025-06-26 | Performance improvement of spatial semantic segmentation with enriched audio features and agent-based error correction for DCASE 2025 Challenge Task 4 | Jongyeon Park et al. | 2506.21174v1 | null
2025-06-26 | Compressed and Smooth Latent Space for Text Diffusion Modeling | Viacheslav Meshchaninov et al. | 2506.21170v1 | null
2025-06-26 | Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition | Longkun Zou et al. | 2506.21165v1 | null
2025-06-26 | Robust Deep Learning for Myocardial Scar Segmentation in Cardiac MRI with Noisy Labels | Aida Moafi et al. | 2506.21151v1 | null
2025-06-26 | Tree-based Semantic Losses: Application to Sparsely-supervised Large Multi-class Hyperspectral Segmentation | Junwen Wang et al. | 2506.21150v1 | null