Definition
A foundation model is a large-scale AI model trained on broad, diverse datasets using self-supervised learning techniques. The term was coined by Stanford researchers to describe models that serve as a common foundation for many different applications. Examples include GPT-4, Claude, Gemini, and Llama for language, and DALL-E, Stable Diffusion, and Midjourney for images.
What makes foundation models distinctive is their generality. A single model can be adapted through fine-tuning, prompt engineering, or retrieval-augmented generation to perform tasks it was never explicitly trained for, from writing marketing copy to analyzing legal contracts to generating code. This adaptability has made foundation models the dominant building block for modern AI applications.
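A minimal sketch of that adaptability, assuming the OpenAI Python SDK and a hosted chat model (the model name and prompts are illustrative; Claude, Gemini, or an open-weight model behind a similar API would follow the same pattern): the base model never changes, only the system prompt does.

```python
# Sketch: one hosted foundation model, three unrelated tasks, no retraining.
# Model name and prompts are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_task(system_prompt: str, user_input: str) -> str:
    """Adapt the same base model to a task purely through its system prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response.choices[0].message.content

marketing = run_task("You write concise product marketing copy.",
                     "Draft a tagline for a smart coffee mug.")
legal = run_task("You summarize contract clauses in plain English.",
                 "Summarize: 'The licensee shall indemnify the licensor...'")
code = run_task("You write small, well-commented Python functions.",
                "Write a function that deduplicates a list while keeping order.")
```

Fine-tuning and retrieval-augmented generation extend the same idea: the weights or the supplied context change, but the underlying model remains a shared foundation.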
Why It Matters for Product Managers
Foundation models have fundamentally changed the build-versus-buy calculus for AI features. Product managers no longer need to commission custom machine learning models for each capability. Instead, they can evaluate which foundation models best fit their needs and invest engineering effort in integration, prompt design, and fine-tuning rather than model architecture and training from scratch.
However, this shift introduces new strategic questions. PMs must evaluate model providers, negotiate pricing based on token usage, manage vendor lock-in risk, plan for model deprecation and version changes, and make architectural choices about whether to use hosted APIs, run open-weight models, or pursue distillation for cost optimization. Understanding the foundation model landscape is now a core product management skill for AI-powered products.
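To make the token-based pricing trade-off concrete, here is a rough back-of-envelope cost sketch. The request volumes and per-million-token prices are hypothetical placeholders, not any provider's actual rates.

```python
# Sketch: back-of-envelope cost comparison for the hosted-API decision.
# All prices and volumes below are illustrative, not current rates.

def monthly_cost(requests_per_day: int,
                 input_tokens: int,
                 output_tokens: int,
                 price_in_per_m: float,
                 price_out_per_m: float) -> float:
    """Estimate monthly spend given per-million-token prices."""
    per_request = (input_tokens * price_in_per_m +
                   output_tokens * price_out_per_m) / 1_000_000
    return requests_per_day * per_request * 30

# Hypothetical numbers: a large frontier model vs. a smaller distilled one.
frontier = monthly_cost(50_000, 1_200, 300, price_in_per_m=5.00, price_out_per_m=15.00)
distilled = monthly_cost(50_000, 1_200, 300, price_in_per_m=0.30, price_out_per_m=1.20)

print(f"Frontier model:  ${frontier:,.0f}/month")
print(f"Distilled model: ${distilled:,.0f}/month")
```

Even rough numbers like these help clarify when a distilled or smaller open-weight model justifies the additional engineering effort.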
How It Works in Practice
Common Pitfalls
Related Concepts
Foundation models encompass Large Language Models focused on text and, increasingly, Multimodal AI systems that handle multiple data types. They can be customized through Fine-Tuning and compressed via Model Distillation. Their internal representations power the Embeddings used in search and recommendation systems.
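As a toy illustration of that last point, the sketch below ranks documents against a query by cosine similarity over embedding vectors. The vectors are tiny, made-up stand-ins for what a real embedding model would return, and the document names are hypothetical.

```python
# Sketch: how embedding vectors from a foundation model power semantic search.
# Real embeddings come from a model and have hundreds or thousands of dimensions;
# these three-dimensional vectors are fabricated purely for illustration.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

documents = {
    "refund policy": [0.82, 0.10, 0.05],
    "shipping times": [0.15, 0.88, 0.20],
    "api rate limits": [0.05, 0.18, 0.91],
}
query = [0.78, 0.15, 0.10]  # e.g. the embedded query "how do I get my money back?"

ranked = sorted(documents.items(),
                key=lambda kv: cosine_similarity(query, kv[1]),
                reverse=True)
print(ranked[0][0])  # -> "refund policy"
```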