#7 — Working with models
November 29, 2023 • 2 min read

STRUCTURED OUTPUT: YOUR INTEGRATION ADVANTAGE
The stakes: Your downstream systems need clean, consistent data from LLMs, not freeform text that breaks your app.
Real founders are solving this now:
- Rechat's real-estate CRM demands structured responses for frontend widget rendering
- Boba's product strategy tool requires specific fields (title, summary, plausibility score)
- LinkedIn uses LLMs to generate YAML that powers skill selection and parameter setting
Founder move: Follow Postel's Law - be liberal in what you accept, conservative in what you send. Accept natural-language input, but output only typed, machine-readable objects. Use Instructor for commercial APIs (OpenAI, Anthropic) or Outlines for self-hosted models.
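The "typed output" contract can be sketched with the standard library alone (Instructor and Outlines enforce the same idea with real schemas). The field names below mirror Boba's title/summary/plausibility fields but are otherwise hypothetical:

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class ScenarioIdea:
    """Typed output contract (field names mirror Boba's; otherwise illustrative)."""
    title: str
    summary: str
    plausibility: float  # expected in [0, 1]

def parse_idea(raw: str) -> ScenarioIdea:
    """Validate the model's JSON before anything downstream touches it."""
    data = json.loads(raw)
    idea = ScenarioIdea(**{k: data[k] for k in ("title", "summary", "plausibility")})
    if not 0.0 <= idea.plausibility <= 1.0:
        raise ValueError(f"plausibility out of range: {idea.plausibility}")
    return idea

raw = '{"title": "Virtual staging", "summary": "AI-staged listing photos", "plausibility": 0.8}'
print(parse_idea(raw).plausibility)  # 0.8

try:  # freeform or malformed output fails loudly instead of breaking the frontend
    parse_idea('{"title": "no score"}')
except (KeyError, ValueError):
    print("rejected")
```

The point is the boundary: everything past `parse_idea` can assume clean, typed data, which is exactly what Rechat's widget-rendering frontend and LinkedIn's YAML-driven pipelines depend on.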
MODEL MIGRATION: HARDER THAN IT LOOKS
The reality check: Swapping models isn't plug-and-play. Your carefully crafted prompts will likely break.
The numbers tell the story:
- Voiceflow saw a 10% performance drop when moving from gpt-3.5-turbo-0301 to the 1106 version
- GoDaddy's experience was better but still required significant adjustment
Founder move: Build automated evaluation pipelines to measure performance across models before committing. Budget more migration time than you think you need.
VERSION PINNING: YOUR STABILITY INSURANCE
The warning: In machine learning, changing anything changes everything - especially with models you don't control.
Smart founders are:
- Pinning specific model versions (like gpt-4-1106-preview) in production
- Preventing unexpected behavior changes that trigger customer complaints
- Running shadow pipelines with newer models to safely test before deployment
Founder move: Lock your production models while maintaining a parallel testing environment for new versions.
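The pin-plus-shadow pattern fits in a few lines. This is a sketch, not a production router: the model names are example placeholders, and `call_model` is any callable wrapping your provider's API:

```python
import random

PROD_MODEL = "gpt-4-1106-preview"    # pinned: changes only with a deliberate release
SHADOW_MODEL = "gpt-4-0125-preview"  # candidate version under evaluation

def handle(prompt, call_model, log, shadow_rate=0.05):
    """Serve the pinned model; mirror a fraction of traffic to the candidate."""
    answer = call_model(PROD_MODEL, prompt)  # the user always gets the pinned model
    if random.random() < shadow_rate:
        # Shadow response is logged for offline comparison, never returned.
        log.append((prompt, answer, call_model(SHADOW_MODEL, prompt)))
    return answer
```

Because the shadow response never reaches the user, a misbehaving new version costs you log entries, not customer complaints - and the log gives you paired prod/candidate outputs to feed your eval pipeline.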
MODEL RIGHTSIZING: THE EFFICIENCY EDGE
The opportunity: Smaller models = lower costs, faster responses, and competitive advantage.
The surprising truth:
- Haiku with 10-shot prompts can outperform zero-shot Opus and GPT-4
- DistilBERT (67M parameters) and DistilBART (400M parameters) deliver strong baseline performance
- A fine-tuned DistilBART identified hallucinations with 0.84 ROC-AUC, beating larger models at 5% of the cost and latency
Founder move: Start with the biggest model to prove feasibility, then aggressively experiment with smaller alternatives using techniques like few-shot prompting and fine-tuning.
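The few-shot half of that experiment is cheap to set up: pack k labeled examples into the prompt so a smaller model sees the task format before the real query. A minimal sketch (the examples and label format are illustrative, not a benchmarked recipe):

```python
# A handful of labeled examples (hypothetical sentiment task).
EXAMPLES = [
    ("The package never arrived.", "negative"),
    ("Setup took two minutes, love it.", "positive"),
    ("It works, I guess.", "neutral"),
]

def few_shot_prompt(query, examples, k=10):
    """Build a k-shot prompt: k worked examples, then the unlabeled query."""
    shots = [f"Text: {text}\nLabel: {label}" for text, label in examples[:k]]
    shots.append(f"Text: {query}\nLabel:")
    return "\n\n".join(shots)

prompt = few_shot_prompt("Battery died in a day.", EXAMPLES, k=10)
print(prompt)
```

Send the resulting prompt to the small model and score it with the same eval pipeline you used for migration testing; if 10-shot Haiku matches zero-shot Opus on your task, the cost and latency savings are free.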
THE BOTTOM LINE
The winners in the AI startup race won't be those with access to the biggest models, but those who engineer the smartest integration, migration, versioning, and sizing strategies. Your ability to manage these dependencies could be your most valuable technical moat.