
#7 — Working with models

November 29, 2023 · 2 min read


Founders are navigating a new AI landscape where LLM dependencies can make or break your startup. Here's what you need to know.

STRUCTURED OUTPUT: YOUR INTEGRATION ADVANTAGE

The stakes: Your downstream systems need clean, consistent data from LLMs, not freeform text that breaks your app.

Real founders are solving this now:

  • Rechat's real-estate CRM demands structured responses for frontend widget rendering
  • Boba's product strategy tool requires specific fields (title, summary, plausibility score)
  • LinkedIn uses LLMs to generate YAML that powers skill selection and parameter setting

Founder move: Follow Postel's Law, being liberal in the natural language input you accept but conservative in what you emit: only typed, machine-readable objects. Use Instructor with commercial APIs (OpenAI, Anthropic) or Outlines for self-hosted models.
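
A minimal sketch of that pattern, assuming an OpenAI API key and the Instructor library; the `ProductIdea` schema and its fields are illustrative, not any company's actual model:

```python
# Sketch: typed LLM output with Instructor + Pydantic (schema is illustrative).
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field


class ProductIdea(BaseModel):
    """Illustrative schema -- mirror whatever your downstream code expects."""
    title: str
    summary: str
    plausibility_score: float = Field(ge=0.0, le=1.0)


client = instructor.from_openai(OpenAI())  # wraps the client so response_model is enforced

idea = client.chat.completions.create(
    model="gpt-4-turbo",
    response_model=ProductIdea,  # Instructor validates (and retries) until this parses
    messages=[{"role": "user", "content": "Propose one product idea for a real-estate CRM."}],
)
print(idea.model_dump())  # a clean, typed object, not freeform text
```

The point is the boundary: natural language goes in, but only a validated object ever reaches your frontend or database.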

MODEL MIGRATION: HARDER THAN IT LOOKS

The reality check: Swapping models isn't plug-and-play. Your carefully crafted prompts will likely break.

The numbers tell the story:

  • Voiceflow saw a 10% performance drop when moving from gpt-3.5-turbo-0301 to the 1106 version
  • GoDaddy's experience was better but still required significant adjustment

Founder move: Build automated evaluation pipelines to measure performance across models before committing. Budget more migration time than you think you need.
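A sketch of such a pipeline in miniature, assuming the OpenAI Python SDK; the test cases, keyword-match grading, and model names are placeholders for your own eval set and metric:

```python
# Sketch of a tiny eval harness: run one frozen test set against each candidate
# model before migrating. Cases and grading rule are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

TEST_CASES = [  # freeze these and reuse them for every model comparison
    {"prompt": "Summarize: the meeting moved to Friday.", "expected_keyword": "Friday"},
    {"prompt": "Extract the city: 'Flights from Austin are delayed.'", "expected_keyword": "Austin"},
]

def score_model(model: str) -> float:
    hits = 0
    for case in TEST_CASES:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": case["prompt"]}],
        ).choices[0].message.content
        hits += case["expected_keyword"].lower() in reply.lower()
    return hits / len(TEST_CASES)

for candidate in ["gpt-3.5-turbo-0301", "gpt-3.5-turbo-1106"]:
    print(candidate, score_model(candidate))
```

In practice you would swap the keyword check for task-specific metrics or an LLM judge; what matters is that the test set stays frozen so numbers are comparable across models.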

VERSION PINNING: YOUR STABILITY INSURANCE

The warning: In machine learning, changing anything changes everything, especially with models you don't control.

Smart founders are:

  • Pinning specific model versions (like gpt-4-turbo-1106) in production
  • Preventing unexpected behavior changes that trigger customer complaints
  • Running shadow pipelines with newer models to safely test before deployment

Founder move: Lock your production models while maintaining a parallel testing environment for new versions.
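
A sketch of that setup, assuming the OpenAI Python SDK; the model identifiers and the `log_shadow_diff` sink are illustrative placeholders:

```python
# Sketch: serve customers from a pinned model, shadow-test a newer one on the
# same input. Model names and the logging sink are illustrative.
from openai import OpenAI

client = OpenAI()

PINNED_MODEL = "gpt-4-1106-preview"  # what customers actually see
SHADOW_MODEL = "gpt-4-turbo"         # candidate under evaluation

def answer(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]

    # Production path: pinned version only, so behavior never shifts under you.
    live = client.chat.completions.create(model=PINNED_MODEL, messages=messages)

    # Shadow path: same input, newer model; log the diff, never return it.
    try:
        shadow = client.chat.completions.create(model=SHADOW_MODEL, messages=messages)
        log_shadow_diff(prompt,                      # hypothetical logging helper
                        live.choices[0].message.content,
                        shadow.choices[0].message.content)
    except Exception:
        pass  # shadow failures must never affect the live response

    return live.choices[0].message.content
```

In practice you would run the shadow call asynchronously, or only on a sampled slice of traffic, so it adds no latency and doesn't double your bill.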

MODEL RIGHTSIZING: THE EFFICIENCY EDGE

The opportunity: Smaller models = lower costs, faster responses, and competitive advantage.

The surprising truth:

  • Haiku with 10-shot prompts can outperform zero-shot Opus and GPT-4
  • DistilBERT (67M parameters) and DistilBART (400M parameters) deliver strong baseline performance
  • A fine-tuned DistilBART identified hallucinations with 0.84 ROC-AUC, beating larger models at 5% of the cost and latency

Founder move: Start with the biggest model to prove feasibility, then aggressively experiment with smaller alternatives using techniques like few-shot prompting and fine-tuning.
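
A sketch of the few-shot side of that experiment, assuming the Anthropic Python SDK; the ticket-classification task and the exemplars are illustrative:

```python
# Sketch: few-shot prompting a smaller model (Anthropic's Haiku) instead of
# calling a frontier model zero-shot. Exemplars are illustrative; in practice
# you'd curate ~10 pairs from your own data.
import anthropic

client = anthropic.Anthropic()

FEW_SHOT = [
    ("The checkout page crashes on mobile.", "bug"),
    ("Please add a dark mode toggle.", "feature_request"),
    ("How do I export my invoices?", "question"),
]

def classify(ticket: str) -> str:
    examples = "\n".join(f"Ticket: {t}\nLabel: {l}" for t, l in FEW_SHOT)
    prompt = f"{examples}\nTicket: {ticket}\nLabel:"
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=5,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text.strip()

print(classify("The app logs me out every few minutes."))
```

Measure this against your frontier-model baseline with the same frozen eval set; only downsize once the smaller setup holds up on your task.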

THE BOTTOM LINE

The winners in the AI startup race won't be those with access to the biggest models, but those who engineer the smartest integration, migration, versioning, and sizing strategies. Your ability to manage these dependencies could be your most valuable technical moat.


