Latent Space: The AI Engineer Podcast

AI Fundamentals: Datasets 101

Jul 17, 2023 · 1h 0m
AI Summary
  • GPT-3 was trained on ~600GB of data (Wikipedia, Books, WebText, Common Crawl), not on the entire internet as is commonly claimed
  • Dataset quality is critical to model performance no matter how sophisticated the algorithm is: the "garbage in, garbage out" principle
  • The episode is part of the AI Fundamentals series and explains why datasets matter for AI development and training
