Speech Synthesis and Conversion

Publish Date Title Authors PDF Code
2025-06-26 DeOcc-1-to-3: 3D De-Occlusion from a Single Image via Self-Supervised Multi-View Diffusion Yansong Qu et.al. 2506.21544v1 null
2025-06-26 MADrive: Memory-Augmented Driving Scene Modeling Polina Karpikova et.al. 2506.21520v1 null
2025-06-26 Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge Boyu Gou et.al. 2506.21506v1 null
2025-06-26 SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture Kehan Sui et.al. 2506.21478v1 null
2025-06-26 ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing Huadai Liu et.al. 2506.21448v1 null
2025-06-26 Low-metallicity massive single stars with rotation. III. Source of ionization and C-IV emission in I Zw 18 Dorottya Szécsi et.al. 2506.21442v1 null
2025-06-26 EndoFlow-SLAM: Real-Time Endoscopic SLAM with Flow-Constrained Gaussian Splatting Taoyu Wu et.al. 2506.21420v1 null
2025-06-26 XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation Bowen Chen et.al. 2506.21416v1 null
2025-06-26 DynamicBench: Evaluating Real-Time Report Generation in Large Language Models Jingyao Li et.al. 2506.21343v1 null
2025-06-26 Rubidium Abundances in Cool Giants from High-Resolution H-band Spectra: A New Diagnostic for Galactic Chemical Evolution Nils Ryde et.al. 2506.21332v1 null
2025-06-26 An H$α$ Cloud in the HI Tail: Recent Star Formation in the Outskirts of NGC 4258 Revealed by Nanshan 1-m Telescope Cheng Cheng et.al. 2506.21321v1 null
2025-06-26 HieraSurg: Hierarchy-Aware Diffusion Model for Surgical Video Generation Diego Biagini et.al. 2506.21287v1 null
2025-06-26 FairyGen: Storied Cartoon Video from a Single Child-Drawn Character Jiayi Zheng et.al. 2506.21272v1 null
2025-06-26 Wurtzite Boron Nitride as a potential defects host M. Silvetti et.al. 2506.21197v1 null
2025-06-26 Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image Pufan Li et.al. 2506.21152v1 null
2025-06-26 Learning to See in the Extremely Dark Hai Jiang et.al. 2506.21132v1 null
2025-06-26 Improving Diffusion-Based Image Editing Faithfulness via Guidance and Scheduling Hansam Cho et.al. 2506.21045v1 null
2025-06-26 User-in-the-Loop View Sampling with Error Peaking Visualization Ayaka Yasunaga et.al. 2506.21009v1 null
2025-06-26 Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology Qiuyi Qi et.al. 2506.21001v1 null
2025-06-26 DBMovi-GS: Dynamic View Synthesis from Blurry Monocular Video via Sparse-Controlled Gaussian Splatting Yeon-Ji Song et.al. 2506.20998v1 null
2025-06-26 Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance Akio Hayakawa et.al. 2506.20995v1 null
2025-06-26 Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models Donggoo Kang et.al. 2506.20946v1 null
2025-06-26 A Multi-Stage Framework for Multimodal Controllable Speech Synthesis Rui Niu et.al. 2506.20945v1 null
2025-06-25 3DGH: 3D Head Generation with Composable Hair and Face Chengan He et.al. 2506.20875v1 null
2025-06-25 Nowhere left to hide: revealing realistic gravitational-wave populations in high dimensions and high resolution with PixelPop Sofia Alvarez-Lopez et.al. 2506.20731v1 null
2025-06-25 Architectural mechanisms of a universal fault-tolerant quantum computer Dolev Bluvstein et.al. 2506.20661v1 null
2025-06-25 PhasePoly: An Optimization Framework for Phase Polynomials in Quantum Circuits Zihan Chen et.al. 2506.20624v1 null
2025-06-25 Video Perception Models for 3D Scene Synthesis Rui Huang et.al. 2506.20601v1 null
2025-06-25 Interplay of magnetic ordering and charge transport in a distorted ScAl$_3$C$_3$-type GdZn$_3$As$_3$ Zhiyu Zhou et.al. 2506.20454v1 null
2025-06-25 HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling Tobias Vontobel et.al. 2506.20452v1 null