Vincent Liu

vincent [dot] liu15 [at] gmail [dot] com

I’m building a robotics company to scale robot learning from humans in the real world. In the process of proving this out, I recently first-authored foundational research on zero-shot human-to-robot learning. I post about AI, robotics, and deep tech startups on X.

I received my BA in Mathematics and MS in Computer Science from Stanford University. My past work in AI has spanned theoretical and applied topics in 3D world models, self-driving, speech generation, multi-task learning, self-supervised learning, computer vision, reinforcement learning, and natural language processing.

news

Jun 03, 2025: We released our research on tactile robot learning from humans, Feel the Force: Contact-Driven Learning from Humans 🤲
May 27, 2025: We released our research on zero-shot human-to-robot learning, EgoZero: Robot Learning from Smart Glasses 📝

research

CSM Implicit Shape Foundation Model
October 2024

I helped train and scale CSM’s Implicit Shape Foundation Model, a 3D-native latent diffusion Transformer that produces 3D occupancy fields from a single image in under 1 minute. I also helped build its pre- and post-training data infrastructure. My work helped CSM land #1 on the 3D Arena Leaderboard and close initial enterprise contracts.

Implicit Shape Foundation Model

CSM Neural Radiance Field Foundation Model
March 2024

I helped train and scale CSM’s Neural Radiance Field (NeRF) Foundation Model, a 2D-native NeRF model that reconstructs 3D occupancy fields from a single image in under 1 minute. This model plugs into a slow refinement process based on Poole et al., 2022 and grew CSM to over 300k users. My work was presented at NVIDIA GTC 2024.

End-to-end Speech Synthesis with Generative Adversarial Networks
June 2022

I developed adversarial methods for end-to-end text-to-speech models as part of my Master’s thesis. I also lectured on these concepts for the Winter 2021 offering of Stanford’s CS 236G: Generative Adversarial Networks. Here are some samples from my implementation, which is an adaptation of Donahue et al., 2021 and Kim et al., 2020.

Peter Piper picked a peck of pickled peppers
The quick brown fox jumps over the lazy dog

Tesla Full Self-Driving
September 2021

I trained and shipped models to the v9.2, v10.1, v10.2, and v10.3 releases of Tesla’s Full Self-Driving (FSD). My work was also featured in Tesla’s 2021 AI Day event.

Tesla FSD vision at AI Day 2021

NVIDIA Neural Modules
September 2020

I trained and shipped lightweight speech vocoders to the v1.0.0 release of NVIDIA’s Neural Modules (NeMo), an open-source speech AI toolkit for developers that has now expanded more broadly to conversational AI and LLMs.


publications

  1. EgoZero: Robot Learning from Smart Glasses
    Vincent Liu, Ademi Adeniji, Haotian Zhan, Raunaq Bhirangi, and 2 more authors
    2025
  2. Feel the Force: Contact-Driven Learning from Humans
    Ademi Adeniji, Zhuoran Chen, Vincent Liu, Venkatesh Pattabiraman, and 4 more authors
    2025
  3. DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision
    Alex Tamkin, Gaurab Banerjee, Mohamed Owda, Vincent Liu, and 2 more authors
    In Advances in Neural Information Processing Systems, 2022
  4. Multivariate Group Robustness
    Jupinder Parmar, Vincent Liu, and Tatsu Hashimoto
    2022
  5. End-to-end Speech Synthesis with Generative Adversarial Networks
    Vincent Liu
    Stanford University, 2022
  6. DABS: a Domain-Agnostic Benchmark for Self-Supervised Learning
    Alex Tamkin, Vincent Liu, Rongfei Lu, Daniel Fein, and 2 more authors
    In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2021
  7. Recurrent Control Nets for Deep Reinforcement Learning
    Vincent Liu, Ademi Adeniji, Nate Lee, Jason Zhao, and 1 more author
    Stanford Undergraduate Research Journal, 2019
  8. Volumetric Semantic Segmentation of Glioblastoma Tumors from MRI Studies
    Ademi Adeniji and Vincent Liu
    2019
  9. Sequence-to-Sequence Generative Argumentative Dialogue Systems with Self-Attention
    Ademi Adeniji, Nate Lee, and Vincent Liu
    2019

teaching

CS 236G: Generative Adversarial Networks
March 2021

I helped create and teach CS 236G: Generative Adversarial Networks, which also became a Coursera course. I designed advanced material based on papers such as Pix2PixHD, GauGAN, SRGAN, BigGAN, and MUNIT. I created workshops (1, 2) covering topics such as mode collapse, spectral normalization, orthogonal initialization, mixed precision, distributed training, and PyTorch best practices. I also gave a guest lecture on adversarial methods in speech synthesis.