
#7 — Working with models

November 29, 2023 · 2 min read


Founders are navigating a new AI landscape where LLM dependencies can make or break your startup. Here's what you need to know.

STRUCTURED OUTPUT: YOUR INTEGRATION ADVANTAGE

The stakes: Your downstream systems need clean, consistent data from LLMs, not freeform text that breaks your app.

Real founders are solving this now:

  • Rechat's real-estate CRM demands structured responses for frontend widget rendering
  • Boba's product strategy tool requires specific fields (title, summary, plausibility score)
  • LinkedIn uses LLMs to generate YAML that powers skill selection and parameter setting

Founder move: Follow Postel's Law, being liberal in the natural language input you accept but conservative in what you emit: only typed, machine-readable objects. Use Instructor with commercial APIs (OpenAI, Anthropic) or Outlines for self-hosted models.
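
A minimal sketch of that pattern, assuming an OpenAI API key and the Instructor library; the `ProductIdea` schema and its fields are illustrative, not any company's actual model:

```python
# Sketch: typed LLM output with Instructor + Pydantic (schema is illustrative).
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field


class ProductIdea(BaseModel):
    """Illustrative schema -- mirror whatever your downstream code expects."""
    title: str
    summary: str
    plausibility_score: float = Field(ge=0.0, le=1.0)


client = instructor.from_openai(OpenAI())  # wraps the client so response_model is enforced

idea = client.chat.completions.create(
    model="gpt-4-turbo",
    response_model=ProductIdea,  # Instructor validates (and retries) until this parses
    messages=[{"role": "user", "content": "Propose one product idea for a real-estate CRM."}],
)
print(idea.model_dump())  # a clean, typed object, not freeform text
```

The point is the boundary: natural language goes in, but only a validated object ever reaches your frontend or database.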

MODEL MIGRATION: HARDER THAN IT LOOKS

The reality check: Swapping models isn't plug-and-play. Your carefully crafted prompts will likely break.

The numbers tell the story:

  • Voiceflow saw a 10% performance drop when moving from gpt-3.5-turbo-0301 to the 1106 version
  • GoDaddy's experience was better but still required significant adjustment

Founder move: Build automated evaluation pipelines to measure performance across models before committing. Budget more migration time than you think you need.
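A sketch of such a pipeline in miniature, assuming the OpenAI Python SDK; the test cases, keyword-match grading, and model names are placeholders for your own eval set and metric:

```python
# Sketch of a tiny eval harness: run one frozen test set against each candidate
# model before migrating. Cases and grading rule are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

TEST_CASES = [  # freeze these and reuse them for every model comparison
    {"prompt": "Summarize: the meeting moved to Friday.", "expected_keyword": "Friday"},
    {"prompt": "Extract the city: 'Flights from Austin are delayed.'", "expected_keyword": "Austin"},
]

def score_model(model: str) -> float:
    hits = 0
    for case in TEST_CASES:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": case["prompt"]}],
        ).choices[0].message.content
        hits += case["expected_keyword"].lower() in reply.lower()
    return hits / len(TEST_CASES)

for candidate in ["gpt-3.5-turbo-0301", "gpt-3.5-turbo-1106"]:
    print(candidate, score_model(candidate))
```

In practice you would swap the keyword check for task-specific metrics or an LLM judge; what matters is that the test set stays frozen so numbers are comparable across models.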

VERSION PINNING: YOUR STABILITY INSURANCE

The warning: In machine learning, changing anything changes everything, especially with models you don't control.

Smart founders are:

  • Pinning specific model versions (like gpt-4-turbo-1106) in production
  • Preventing unexpected behavior changes that trigger customer complaints
  • Running shadow pipelines with newer models to safely test before deployment

Founder move: Lock your production models while maintaining a parallel testing environment for new versions.
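
A sketch of that setup, assuming the OpenAI Python SDK; the model identifiers and the `log_shadow_diff` sink are illustrative placeholders:

```python
# Sketch: serve customers from a pinned model, shadow-test a newer one on the
# same input. Model names and the logging sink are illustrative.
from openai import OpenAI

client = OpenAI()

PINNED_MODEL = "gpt-4-1106-preview"  # what customers actually see
SHADOW_MODEL = "gpt-4-turbo"         # candidate under evaluation

def answer(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]

    # Production path: pinned version only, so behavior never shifts under you.
    live = client.chat.completions.create(model=PINNED_MODEL, messages=messages)

    # Shadow path: same input, newer model; log the diff, never return it.
    try:
        shadow = client.chat.completions.create(model=SHADOW_MODEL, messages=messages)
        log_shadow_diff(prompt,                      # hypothetical logging helper
                        live.choices[0].message.content,
                        shadow.choices[0].message.content)
    except Exception:
        pass  # shadow failures must never affect the live response

    return live.choices[0].message.content
```

In practice you would run the shadow call asynchronously, or only on a sampled slice of traffic, so it adds no latency and doesn't double your bill.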

MODEL RIGHTSIZING: THE EFFICIENCY EDGE

The opportunity: Smaller models = lower costs, faster responses, and competitive advantage.

The surprising truth:

  • Haiku with 10-shot prompts can outperform zero-shot Opus and GPT-4
  • DistilBERT (67M parameters) and DistilBART (400M parameters) deliver strong baseline performance
  • A fine-tuned DistilBART identified hallucinations with 0.84 ROC-AUC, beating larger models at 5% of the cost and latency

Founder move: Start with the biggest model to prove feasibility, then aggressively experiment with smaller alternatives using techniques like few-shot prompting and fine-tuning.
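
A sketch of the few-shot side of that experiment, assuming the Anthropic Python SDK; the ticket-classification task and the exemplars are illustrative:

```python
# Sketch: few-shot prompting a smaller model (Anthropic's Haiku) instead of
# calling a frontier model zero-shot. Exemplars are illustrative; in practice
# you'd curate ~10 pairs from your own data.
import anthropic

client = anthropic.Anthropic()

FEW_SHOT = [
    ("The checkout page crashes on mobile.", "bug"),
    ("Please add a dark mode toggle.", "feature_request"),
    ("How do I export my invoices?", "question"),
]

def classify(ticket: str) -> str:
    examples = "\n".join(f"Ticket: {t}\nLabel: {l}" for t, l in FEW_SHOT)
    prompt = f"{examples}\nTicket: {ticket}\nLabel:"
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=5,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text.strip()

print(classify("The app logs me out every few minutes."))
```

Measure this against your frontier-model baseline with the same frozen eval set; only downsize once the smaller setup holds up on your task.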

THE BOTTOM LINE

The winners in the AI startup race won't be those with access to the biggest models, but those who engineer the smartest integration, migration, versioning, and sizing strategies. Your ability to manage these dependencies could be your most valuable technical moat.


