Alignment

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-06-26 | SAM4D: Segment Anything in Camera and LiDAR Streams | Jianyun Xu et al. | 2506.21547v1 | null |
| 2025-06-26 | Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval | Hani Alomari et al. | 2506.21538v1 | null |
| 2025-06-26 | G$^{2}$D: Boosting Multimodal Learning with Gradient-Guided Distillation | Mohammed Rakib et al. | 2506.21514v1 | null |
| 2025-06-26 | Global and Local Entailment Learning for Natural World Imagery | Srikumar Sastry et al. | 2506.21476v1 | null |
| 2025-06-26 | Deception Detection in Dyadic Exchanges Using Multimodal Machine Learning: A Study on a Swedish Cohort | Franco Rugolon et al. | 2506.21429v1 | null |
| 2025-06-26 | Distributed Cross-Channel Hierarchical Aggregation for Foundation Models | Aristeidis Tsaris et al. | 2506.21411v1 | null |
| 2025-06-26 | CA-I2P: Channel-Adaptive Registration Network with Global Optimal Selection | Zhixin Cheng et al. | 2506.21364v1 | null |
| 2025-06-26 | SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning | Melanie Rieff et al. | 2506.21355v1 | null |
| 2025-06-26 | HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Qize Yang et al. | 2506.21277v1 | null |
| 2025-06-26 | WordCon: Word-level Typography Control in Scene Text Rendering | Wenda Shi et al. | 2506.21276v1 | null |
| 2025-06-26 | Integrating Vehicle Acoustic Data for Enhanced Urban Traffic Management: A Study on Speed Classification in Suzhou | Pengfei Fan et al. | 2506.21269v1 | null |
| 2025-06-26 | GANet-Seg: Adversarial Learning for Brain Tumor Segmentation with Hybrid Generative Models | Qifei Cui et al. | 2506.21245v1 | null |
| 2025-06-26 | DiMPLe -- Disentangled Multi-Modal Prompt Learning: Enhancing Out-Of-Distribution Alignment with Invariant and Spurious Feature Separation | Umaima Rahman et al. | 2506.21237v1 | null |
| 2025-06-26 | MedPrompt: LLM-CNN Fusion with Weight Routing for Medical Image Segmentation and Classification | Shadman Sobhan et al. | 2506.21199v1 | null |
| 2025-06-26 | Personalized Federated Learning via Dual-Prompt Optimization and Cross Fusion | Yuguang Zhang et al. | 2506.21144v1 | null |
| 2025-06-26 | Semantic-aware Digital Twin for AI-based CSI Acquisition | Jiajia Guo et al. | 2506.21126v1 | null |
| 2025-06-26 | IPFormer-VideoLLM: Enhancing Multi-modal Video Understanding for Multi-shot Scenes | Yujia Liang et al. | 2506.21116v1 | null |
| 2025-06-26 | DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning | Kang He et al. | 2506.21096v1 | null |
| 2025-06-26 | EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception | Sanjoy Chowdhury et al. | 2506.21080v1 | null |
| 2025-06-26 | TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence | Feng Jiang et al. | 2506.21028v1 | null |
| 2025-06-26 | LASFNet: A Lightweight Attention-Guided Self-Modulation Feature Fusion Network for Multimodal Object Detection | Lei Hao et al. | 2506.21018v1 | null |
| 2025-06-26 | Multimodal Prompt Alignment for Facial Expression Recognition | Fuyan Ma et al. | 2506.21017v1 | null |
| 2025-06-26 | TSDASeg: A Two-Stage Model with Direct Alignment for Interactive Point Cloud Segmentation | Chade Li et al. | 2506.20991v1 | null |
| 2025-06-26 | EVA: Mixture-of-Experts Semantic Variant Alignment for Compositional Zero-Shot Learning | Xiao Zhang et al. | 2506.20986v1 | null |
| 2025-06-26 | ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation | Shruti Bansal et al. | 2506.20969v1 | null |
| 2025-06-26 | OmniEval: A Benchmark for Evaluating Omni-modal Models with Visual, Auditory, and Textual Inputs | Yiman Zhang et al. | 2506.20960v1 | null |
| 2025-06-26 | Hierarchical Sub-action Tree for Continuous Sign Language Recognition | Dejie Yang et al. | 2506.20947v1 | null |
| 2025-06-26 | A Multi-Stage Framework for Multimodal Controllable Speech Synthesis | Rui Niu et al. | 2506.20945v1 | null |
| 2025-06-26 | E-FreeM2: Efficient Training-Free Multi-Scale and Cross-Modal News Verification via MLLMs | Van-Hoang Phan et al. | 2506.20944v1 | null |
| 2025-06-25 | Stellar Dynamics in Open Clusters Increases the Binary Fraction and Mass Ratios: Evidence from Photometric Binaries in 35 Open Clusters | Anna C. Childs et al. | 2506.20889v1 | null |