Research Scientist

Velvet
San Francisco, CA
Full Time
Compensation
$250,000–$300,000/year

Job Description

About Us

Velvet is a data research company building the datasets that power the next generation of multimodal AI. Founded by Lucas Mantovani (ex-Meta FAIR) and Lucas Tucker (ex-Adobe Infra), our mission is to make AI more human by producing high-quality audiovisual training data for frontier labs.

We're hiring a Research Scientist to develop and fine-tune models for video and audio data processing and enhancement, and to conduct data-oriented research that pushes the boundaries of multimodal data quality.

What You'll Do

  • Research, develop, and fine-tune models for audio and video enhancement — including denoising, super-resolution, speech restoration, and perceptual quality improvement — ensuring outputs meet the standards required for frontier model training.
  • Experiment with novel architectures, training objectives, and data augmentation strategies to improve model performance across diverse and noisy real-world audiovisual data.
  • Build evaluation frameworks and benchmarks to rigorously measure enhancement quality, guiding iterative model improvement.
  • Collaborate with infrastructure and data pipeline engineers to integrate trained models into large-scale processing workflows that handle wide variation in speech, visual quality, and format.

What We're Looking For

  • Strong research background in deep learning, with hands-on experience training and fine-tuning models for audio processing, video processing, or related domains.
  • Proficiency in PyTorch. Experience designing and running experiments at scale.
  • Solid understanding of signal processing fundamentals and how they inform model design for enhancement tasks.
  • A publication track record or demonstrated research output in relevant areas (audio/speech enhancement, video restoration, generative models, multimodal learning).
  • Ability to work effectively in an early-stage environment where scope is broad and priorities shift fast.

Even Better

  • Prior work at a frontier AI lab or data company focused on multimodal data.
  • Experience fine-tuning large pretrained models (diffusion models, autoencoders, or transformer-based architectures) for perceptual quality tasks.
  • Familiarity with perceptual quality metrics and human evaluation methodologies for audio and video.
  • Track record of working with datasets spanning tens of thousands of hours of audio or video.

You'll Thrive Here If

  • You're excited by applied research with immediate, visible impact on data quality and downstream model performance.
  • You move fluidly between reading papers, writing training loops, and analyzing failure cases.
  • You hold yourself to a high bar for rigor — because you understand that model quality directly determines the value of the data we produce.

Interview Process

  • First round of interviews (remote)
  • Second round of interviews (remote)
  • Work trial (on-site)
  • Offer

Job Details

Category
Research
