Latent Space: The AI Engineer Podcast

How to train your own Large Multimodal Model — with Hugo Laurençon & Leo Tronchon of HuggingFace M4

Jan 19, 2024 · 1h 11m
AI Summary
  • Hugo Laurençon and Leo Tronchon (HuggingFace M4) explain how to build open-source multimodal models by connecting an existing LLM and an existing vision encoder with adapter layers
  • They discuss how DeepMind's Flamingo inspired cheaper alternatives such as LLaVA, BakLLaVA, and FireLLaVA, and why Flamingo itself was not open-sourced
  • They cover LAION's contributions to open-source multimodal research and to making vision-language model training techniques more widely accessible
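
The adapter idea in the first point can be illustrated with a toy sketch. It assumes the simplest adapter form, a single linear projection that maps vision-encoder patch features into the LLM's embedding space (the approach LLaVA-style models are known for), and it is not necessarily the design discussed in the episode. All names, dimensions, and values here (`make_linear_adapter`, `build_multimodal_sequence`, `VISION_DIM`, `LLM_DIM`) are hypothetical and chosen only for illustration; a real model would use a tensor library and learned weights.

```python
import random


def make_linear_adapter(vision_dim, llm_dim, seed=0):
    """Return a function that projects vision features into the LLM embedding space.

    Hypothetical helper: the weights are random here, whereas in a real model
    they would be the (often only) parameters trained during adapter tuning.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(vision_dim)]
               for _ in range(llm_dim)]
    bias = [0.0] * llm_dim

    def project(patch_features):
        # One output vector of size llm_dim per input patch vector.
        return [
            [sum(w * x for w, x in zip(row, patch)) + b
             for row, b in zip(weights, bias)]
            for patch in patch_features
        ]

    return project


def build_multimodal_sequence(image_patches, text_embeddings, project):
    """Prepend projected image patches to the text token embeddings."""
    visual_tokens = project(image_patches)
    return visual_tokens + text_embeddings


VISION_DIM, LLM_DIM = 4, 3  # toy sizes, far smaller than real models

adapter = make_linear_adapter(VISION_DIM, LLM_DIM)

# Stand-ins for frozen vision-encoder outputs (2 patches) and
# frozen LLM token embeddings (3 text tokens).
image_patches = [[0.5, -1.0, 0.25, 2.0], [1.0, 0.0, -0.5, 0.3]]
text_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]

sequence = build_multimodal_sequence(image_patches, text_embeddings, adapter)
print(len(sequence), len(sequence[0]))  # 5 tokens, each of size LLM_DIM
```

The point of the design is that the vision encoder and the LLM can both stay frozen while only the small adapter is trained, which is one reason this family of approaches is cheaper than training a multimodal model from scratch.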
