firstdemand.io
A 4-stage AI pipeline that turns a landing page URL into a demand strategy — diagnosis, channel scorecard, 14-day playbook, and copy-ready assets — in under two minutes.
What it does.
firstdemand.io solves a specific problem: technical founders ship a landing page and then stall because they don't know which early-demand channels fit their specific product, ICP, and comfort level. They don't need another SEO tool or generic marketing copilot — they need an opinionated senior strategist who reads their actual product page and hands them a plan they can execute this week.
The product is not a social scheduler, SEO tool, or generic AI marketing assistant. One session. Under two minutes. Four deliverables.
What goes in.
- Landing page URL — scraped live, including JS-rendered SPAs and Cloudflare-protected pages
- Structured intake form (pre-filled by AI from the scrape): product description, ICP, launch stage, goals, constraints, channel preferences
- Optional: plain-language correction to refine the AI's interpretation
Four artifacts, one session.
Demand Readiness Diagnosis
Readiness score 0–100, four signal ratings (positioning / ICP / CTA / proof), main bottlenecks, and a ready/not-ready verdict.
Channel Scorecard
Top 3 channels ranked by fit — with effort, time-to-signal, tradeoffs, and an opinionated "why not the others" for rank 1.
14-Day Playbook
Day-grouped action plan with specific tasks, estimated time per task, and copy anchored to the founder's actual product language.
Asset Pack
Copy-paste ready: directory listings, community posts, outreach messages, CTA variants, founder bio, product one-liners — in any of six languages.
The pipeline.
Five stages, three models, two providers. Each stage streams results to the client via Server-Sent Events as it completes.
Stage 0 — URL Scrape + Form Prefill
GPT-5.4 with webSearchPreview browses the live URL — handles JS-rendered SPAs, Cloudflare protection, React apps. Returns a single unified schema: page signals AND form prefill in one call. This combined approach cut scrape latency from ~40s (two sequential calls) to ~16-18s.
Stage 1 — Demand Readiness Diagnosis
GPT-5.4 analyses positioning quality, ICP clarity, CTA strength, and social proof. Outputs a structured diagnosis with a 0–100 readiness score. Free users receive this stage plus channel theme previews; the pipeline stops at the paywall.
Stage 2 — Channel Scoring
GPT-5.4 matches ICP behaviour to channel fit across six channel families (directories, communities, outreach, partnerships, monitoring, public posting). The channel family is a Zod enum — the model cannot hallucinate outside it.
Stage 3 — 14-Day Playbook
Claude Opus 4-6 via OpenRouter. Chosen for instruction-following on complex day-grouped schemas and noticeably stronger copy quality for sequential action plans. Playbook actions reference specific product language from the scrape, not generic AI copy.
Stage 4 — Asset Pack
Claude Opus 4-6 via OpenRouter. Copywriting quality is the priority here. All assets are generated directly in the requested output language — no translation layer. Six languages supported; on-demand re-generation for additional languages without re-running the full pipeline.
OpenAI handles analytical/reasoning stages. Anthropic handles creative/copywriting stages. Model assignments live in one 14-line providers.ts file — swapping a model is one config change.
What made it hard.
The scrape latency problem
The original scraper used two sequential LLM calls: first GPT-5.4 with webSearchPreview to fetch and read the page (~25-30s), then GPT-5-mini to synthesise the content into form prefill values (~10-15s). Total: ~40s before the user saw anything.
The fix: combine both into a single generateText call with a rich output schema (CombinedScrapeSchema) covering both page signals and form prefill in one pass. Result: ~16-18s. The key insight was designing the schema to serve two masters at once rather than making two focused calls.
Checkpoint resumability vs. idempotency
The pipeline has two overlapping guarantees: idempotency (if you re-hit the generate endpoint for a complete project, it replays cached results as SSE without burning tokens) and checkpoint resumability (if the pipeline dies mid-run, the next retry picks up from the last completed stage, not from scratch).
These two guarantees interact in non-obvious ways — particularly around the justPurchased flag that bypasses idempotency when the user just completed checkout. Both patterns are necessary for a paid product with expensive LLM calls and unreliable serverless infrastructure.
Prompt injection in user corrections
Users can submit plain-language corrections to refine the AI's interpretation of their product. Those corrections are injected into every subsequent prompt. A founder can accidentally (or intentionally) write instructions that override model behaviour.
The sanitiser strips structural injection vectors (role-override phrases, section delimiters, XML tags). The injected block is wrapped in a founder_note XML tag with explicit instruction to treat it as data, not instructions. This is a deliberate trade-off: too much sanitisation strips legitimate context; too little opens the model to manipulation.
URL-level shared cache
The diagnosis cache is keyed by (normalizedUrl, stage) with a 14-day TTL. Two different users analysing the same landing page share the same cached diagnosis — zero extra LLM cost. Only the full scorecard, playbook, and assets are personalised (and therefore not cached across users).
When a user submits a correction, the cache is bypassed entirely for that project. This means a paying user who has corrected once always runs fresh LLM calls, even on re-runs with no further changes. Documented deliberate decision.
By the numbers.
6
LLM call sites in the pipeline
3
distinct models orchestrated
2
AI providers (OpenAI + Anthropic)
6
distinct Zod output schemas
~16s
scrape latency (AI path, current)
~40s
scrape latency (original, 2 calls)
60–90s
full pipeline latency (fresh, pro)
<1s
response time (cache hit)
526
lines in the generation route
9
DB schema migrations
What transfers to client work.
The engineering patterns built for firstdemand.io are directly reusable. These aren't theoretical — they're battle-tested in a live product.
Output.object + Zod schemas
Every AI call uses Vercel AI SDK's Output.object with a ZodSchema pattern. This forces schema design before prompt writing, which consistently produces better-structured output. The SDK handles retries on parse failure automatically.
Context builder pattern
Prompt assembly is separated into pure functions (buildFounderContext, buildPageContext, buildCopyNotes, buildUserContextNote) that take typed objects and return formatted string blocks. Each prompt composes these blocks rather than inline-constructing the full string. Prompt changes stay localised.
Checkpoint + idempotency in multi-stage pipelines
Write a partial result row after each stage. On retry, detect the partial row and resume from the last completed stage. Separate idempotency guard: if a complete result exists, replay as SSE without re-running LLM calls. Applicable to any multi-step AI pipeline where steps are expensive and failures are expected.
Single-call URL scrape + synthesis
Attach a live-browsing tool (webSearchPreview), define a rich output schema covering both extraction and synthesis, let the model fill both in one turn. Cuts latency by ~60% compared to sequential calls. Portable to any product that reads a URL and extracts structured data.