Speech Synthesis and Conversion

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-09-16 | Real-Time Whole-Body Control of Legged Robots with Model-Predictive Path Integral Control | Juan Alvarez-Padilla et al. | 2409.10469v1 | null |
| 2024-09-16 | Magnetic metamaterials by ion-implantation | Christina Vantaraki et al. | 2409.10433v1 | null |
| 2024-09-16 | HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models | Vineet Bhat et al. | 2409.10419v1 | null |
| 2024-09-16 | Robust image representations with counterfactual contrastive learning | Mélanie Roschewitz et al. | 2409.10365v1 | link |
| 2024-09-16 | On Synthetic Texture Datasets: Challenges, Creation, and Curation | Blaine Hoak et al. | 2409.10297v1 | null |
| 2024-09-16 | DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis | Fa-Ting Hong et al. | 2409.10281v1 | null |
| 2024-09-16 | Orienting gaze toward a visual target: Neurophysiological synthesis with epistemological considerations | Laurent Goffart et al. | 2409.10189v1 | null |
| 2024-09-16 | Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization | Xiaoxue Gao et al. | 2409.10157v1 | null |
| 2024-09-16 | MotionCom: Automatic and Motion-Aware Image Composition with LLM and Video Diffusion Prior | Weijing Tao et al. | 2409.10090v1 | null |
| 2024-09-16 | Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models | Alexander Koch et al. | 2409.10089v1 | null |
| 2024-09-16 | StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion | Yinghao Aaron Li et al. | 2409.10058v1 | null |
| 2024-09-16 | From a Single Trajectory to Safety Controller Synthesis of Discrete-Time Nonlinear Polynomial Systems | Behrad Samari et al. | 2409.10026v1 | null |
| 2024-09-16 | Compositional Design of Safety Controllers for Large-scale Stochastic Hybrid Systems | Mahdieh Zaker et al. | 2409.10018v1 | null |
| 2024-09-16 | DNN-based ensemble singing voice synthesis with interactions between singers | Hiroaki Hyodo et al. | 2409.09988v1 | null |
| 2024-09-16 | 2S-ODIS: Two-Stage Omni-Directional Image Synthesis by Geometric Distortion Correction | Atsuya Nakata et al. | 2409.09969v1 | link |
| 2024-09-15 | Safe Control of Quadruped in Varying Dynamics via Safety Index Adaptation | Kai S. Yun et al. | 2409.09882v1 | null |
| 2024-09-15 | Constructing a Singing Style Caption Dataset | Hyunjong Ok et al. | 2409.09866v1 | link |
| 2024-09-15 | Latent Diffusion Models for Controllable RNA Sequence Generation | Kaixuan Huang et al. | 2409.09828v1 | null |
| 2024-09-15 | Room-temperature valley-selective emission enabled by planar chiral quasi-bound states in the continuum | Feng Pan et al. | 2409.09806v1 | null |
| 2024-09-15 | Universal Topology Refinement for Medical Image Segmentation with Polynomial Feature Synthesis | Liu Li et al. | 2409.09796v1 | null |
| 2024-09-15 | Risk-Aware Autonomous Driving for Linear Temporal Logic Specifications | Shuhao Qi et al. | 2409.09769v1 | null |
| 2024-09-15 | MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation | Shuzhao Xie et al. | 2409.09756v1 | null |
| 2024-09-15 | On the Proofs of the Predictive Synthesis Formula | Riku Masuda et al. | 2409.09660v1 | null |
| 2024-09-15 | One-Shot Learning for Pose-Guided Person Image Synthesis in the Wild | Dongqi Fan et al. | 2409.09593v1 | null |
| 2024-09-14 | Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder | Seunghwan Kim et al. | 2409.09361v1 | link |
| 2024-09-14 | MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion | Sho Inoue et al. | 2409.09352v1 | null |
| 2024-09-14 | LawDNet: Enhanced Audio-Driven Lip Synthesis via Local Affine Warping Deformation | Deng Junli et al. | 2409.09326v1 | null |
| 2024-09-14 | The Future of Decoding Non-Standard Nucleotides: Leveraging Nanopore Sequencing for Expanded Genetic Codes | Hyunjin Shim et al. | 2409.09314v1 | null |
| 2024-09-14 | Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation | Changjin Han et al. | 2409.09311v1 | null |
| 2024-09-14 | ManiDext: Hand-Object Manipulation Synthesis via Continuous Correspondence Embeddings and Residual-Guided Diffusion | Jiajun Zhang et al. | 2409.09300v1 | null |