Synthetic audience testing · built on peer-reviewed methodology

Find out if your page converts before you launch it.

Paste your landing page or App Store listing. We sit 150 strangers from your selected peer lane in front of it, capture their unfiltered reactions, and convert them into the same purchase-intent metrics Fortune-500 consumer brands use — plus every objection, in their own words.

Report in ~5 minutes · From $29 · No signup until you confirm the lane

Synthetic panel · live · 0 / 150 responded

“How likely are you to subscribe to this?”

1 def. not · 3 unsure · 5 def. yes

“The free tier sounds generous, but I can't tell what the paid plan actually adds. I'd try it and probably never upgrade.”

Persona 042 · budget-conscious freelancer, 31 · rated 3

mean PI 3.62 · σ 1.04 · 68th percentile vs. consumer-app benchmark

90%

of human panel test–retest reliability, achieved by the underlying method¹

9,300

real survey respondents the methodology was validated against¹

~5 min

from URL to full report — vs. 2–4 weeks for a traditional panel

$29

vs. $15,000+ for a 150-person human concept test

How it works

A consumer research panel, rebuilt in software.

The same four stages a research agency runs over a month — compressed into minutes, at a price an indie budget survives.

Step 1 / Read

We read your page like a customer does

Screenshot + copy extraction of your landing page or store listing. The panel reacts to exactly what visitors see — headline, screenshots, pricing, all of it.

Step 2 / Confirm

You approve the lane

We suggest the peer lane your page should be benchmarked against. Nothing runs until you confirm the fixed panel and baseline.

Step 3 / React

150 strangers respond in their own words

Each persona reacts in free text — no forced ratings. Research shows direct numeric ratings from AI are unrealistic; natural reactions are where the signal lives.¹

Step 4 / Measure

Reactions become research-grade metrics

Semantic Similarity Rating maps every reaction onto a purchase-intent distribution, benchmarked against hundreds of scored pages — plus the top objections, quoted.
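
As a rough illustration of the Measure step, the sketch below pools per-persona rating distributions into one panel distribution, summarizes it as a mean purchase intent (PI) and standard deviation, and ranks that mean against a benchmark. The function names and the benchmark values are hypothetical, and the percentile rule (share of benchmark pages scoring lower) is an assumption, not a description of the production pipeline.

```python
import math

def panel_distribution(pmfs):
    """Average per-persona distributions over ratings 1..5 into one panel distribution."""
    n = len(pmfs)
    return [sum(p[i] for p in pmfs) / n for i in range(len(pmfs[0]))]

def summarize(pmf):
    """Return (mean PI, standard deviation) of a distribution over ratings 1..5."""
    mean = sum((i + 1) * p for i, p in enumerate(pmf))
    var = sum(p * ((i + 1) - mean) ** 2 for i, p in enumerate(pmf))
    return mean, math.sqrt(var)

def percentile_rank(score, benchmark_scores):
    """Assumed rule: percentage of benchmark pages with a strictly lower mean PI."""
    below = sum(1 for b in benchmark_scores if b < score)
    return 100.0 * below / len(benchmark_scores)

# Made-up example: two personas, one firmly "no" and one firmly "yes".
panel = panel_distribution([[1, 0, 0, 0, 0], [0, 0, 0, 0, 1]])
mean_pi, sigma = summarize(panel)
rank = percentile_rank(mean_pi, [2.4, 2.9, 3.3, 3.8])  # hypothetical benchmark means
```

A wide σ like the one in this toy panel signals a split audience, which is why the mean is reported together with the spread.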

The science

Not vibes. A published method, validated against 9,300 real consumers.

150 Strangers implements Semantic Similarity Rating (SSR) — a technique developed by researchers at PyMC Labs and Colgate-Palmolive and tested against 57 real consumer surveys.

Primary reference

“LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings”

Maier, Aslak, Fiaschi, Rismal, Fletcher, Luhmann, Dow, Pappas & Wiecki (2025). arXiv:2510.08338 · open-source reference implementation on GitHub

Asking AI for a 1–5 rating directly fails. Models cluster on “safe” middle answers and produce distributions nothing like real consumers. The study measured it: only 0.26 distributional similarity.
Free-text reactions, mapped by meaning, work. SSR embeds each reaction and measures its semantic distance to calibrated anchor statements — recovering realistic distributions (similarity > 0.85) and product rankings at ~90% of what a repeated human panel achieves.
Detailed personas are non-negotiable. With rich conditioning the method reached ~90% reliability; without it, signal collapsed to ~50%. That's why every lane is assigned a fixed, detailed panel before anything runs.
Comparison is where it shines. The method's strength is ranking — which variant, which competitor, which message wins. We engineered the whole product around that, instead of pretending one number predicts your conversion rate.
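
The mapping from free text to ratings can be sketched in a few lines. This is a simplified illustration, not the reference implementation: the anchor statements are hypothetical, the embedding is a toy bag-of-words stand-in for a real sentence-embedding model, and the normalization (subtract the lowest similarity, then rescale to sum to 1) is one plausible reading of "measures its semantic distance to calibrated anchor statements".

```python
import math
from collections import Counter

# Hypothetical anchor statements, one per Likert point 1..5 (not taken from the paper).
ANCHORS = [
    "I would definitely not subscribe to this",
    "I would probably not subscribe to this",
    "I am unsure whether I would subscribe to this",
    "I would probably subscribe to this",
    "I would definitely subscribe to this",
]

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real sentence-embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def ssr_pmf(reaction, anchors=ANCHORS):
    """Map one free-text reaction to a probability distribution over ratings 1..5."""
    sims = [cosine(embed(reaction), embed(a)) for a in anchors]
    floor = min(sims)
    shifted = [s - floor for s in sims]  # the least similar anchor gets weight 0
    total = sum(shifted)
    if total == 0:  # no anchor stands out: fall back to a uniform distribution
        return [1 / len(anchors)] * len(anchors)
    return [s / total for s in shifted]

pmf = ssr_pmf("I would definitely subscribe to this")
```

Each reaction yields a full distribution rather than a single forced number, so hedged or mixed reactions keep their ambiguity. Word counts cannot tell "definitely" from "definitely not", which is why a real embedding model is needed in practice.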

ρ = 90%

correlation attainment vs. the theoretical maximum set by human test–retest reliability, across 57 product surveys¹
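
One way to read "correlation attainment": divide the correlation between synthetic and human scores by the correlation between two runs of the human panel, which acts as the ceiling. The sketch below assumes that simplified definition; the study's exact procedure for estimating the ceiling may differ, and all numbers are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_attainment(synthetic, human, human_retest):
    """Synthetic-vs-human correlation as a fraction of the human test-retest ceiling."""
    return pearson(synthetic, human) / pearson(human, human_retest)

# Made-up mean purchase-intent scores for four products.
human = [3.1, 3.5, 4.0, 2.8]
human_retest = [3.0, 3.6, 3.9, 2.9]
synthetic = [3.4, 3.3, 4.2, 2.5]
attainment = correlation_attainment(synthetic, human, human_retest)
```

Dividing by the ceiling matters because even a second human panel does not reproduce the first one perfectly, so a raw correlation of 1.0 is not an attainable target.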

KS > 0.85

distributional similarity between synthetic and real consumer response distributions¹
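
For intuition, a Kolmogorov-Smirnov style similarity between two rating distributions can be computed as one minus the largest gap between their cumulative distributions. That definition is an assumption made for this sketch; the inputs are made-up distributions over ratings 1..5.

```python
from itertools import accumulate

def ks_similarity(p, q):
    """1 minus the largest gap between the two cumulative distributions (1.0 = identical)."""
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    return 1.0 - max(abs(a - b) for a, b in zip(cdf_p, cdf_q))

# Made-up example: a middle-clustered distribution vs. a more spread-out one.
clustered = [0.0, 0.1, 0.8, 0.1, 0.0]
spread = [0.1, 0.2, 0.3, 0.25, 0.15]
similarity = ks_similarity(clustered, spread)
```

A score near 1 means the synthetic panel reproduces the shape of the human responses, not just their average.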

0 training

no fine-tuning on your data required — the method works zero-shot, which is what makes a $29 report possible¹

+ words

unlike a human Likert score, every synthetic rating comes with a written rationale — the study found these richer than typical human survey feedback¹

Straight answers

What this can and can't tell you.

A research tool you can't trust is worthless. So here is exactly where the method is strong — and where we'll refuse to oversell it.

Reliable for

Ranking variants: which headline, pricing frame, or screenshot set your audience prefers
Competitive position: how your page lands next to up to 3 competitors, same panel, same question
Objection mining: the recurring reasons skeptics say no — quoted, clustered, segmented
Message clarity: whether your value proposition is even understood at a glance
Segment fit: which of your audience segments responds — and whether it's the one you expected

Not built for

Predicting your conversion rate: synthetic intent is directional, not a revenue forecast — anyone claiming otherwise is selling you something
Replacing real usage data: retention, churn, and pricing elasticity need actual customers
Truly novel domains: if your market has no footprint of real customer conversation online, we flag the report as low-confidence — visibly
Fine-grained demographic claims: the research found subgroup fidelity uneven; we report segments by behavior, not by census box

Pricing

Cheaper than one hour of a researcher's time.

One-time payments. No subscription. If the scrape fails or the report can't be produced, it's auto-refunded.

Single page

Diagnostic

$29 / report

150-stranger panel on one URL
Purchase-intent distribution + benchmark percentile
Top 5 objections, quoted & segmented
Copy fixes suggested per objection

Test one page

Most decisive

Variant duel

$79 / comparison

2–4 versions of your page or copy
Same panel reacts to every variant
Ranked results — the method's strongest mode
Per-segment winner breakdown

Compare variants

Know your battlefield

Competitor scan

$99 / scan

Your page vs. up to 3 competitors
Where you win, where you bleed
Objections unique to your page
Positioning gaps in their copy you can claim

Scan competitors

Questions

Asked by people who should be skeptical.

Aren't AI survey respondents just made up?

Naively, yes — ask a model for a 1–5 rating and you get useless, middle-clustered answers. That's exactly what the underlying research demonstrated, and why we don't do it. SSR elicits natural-language reactions and maps them to ratings by semantic meaning. Validated against 9,300 real respondents across 57 surveys, it recovered ~90% of the reliability a repeated human panel achieves. Synthetic panels aren't a replacement for talking to customers — they're a way to arrive at those conversations with a sharper page.

How do you know which audience to use?

We read your page and suggest a positioning lane, then stop and show it to you. You can switch lanes before a single persona is generated. The scored panel is fixed for that lane, so your percentile compares against peer pages judged by the same buyer types.

Why won't you give me a predicted conversion rate?

Because the method can't honestly deliver one, and we'd rather be trusted than impressive. The validated strength of SSR is relative measurement — rankings, percentiles against a benchmark corpus, and the qualitative why. Synthetic intent distributions are systematically wider and slightly more critical than human ones, which makes them great discriminators and bad absolute forecasters.

What if my product is in a really niche space?

The method works because models have absorbed enormous amounts of real customer conversation about most consumer domains. If your domain has thin coverage — deep tech, novel B2B categories — synthetic reactions get less trustworthy. We detect this and put a low-confidence banner on the report rather than hiding it. If a report is flagged and you don't find it useful, ask for a refund.

What do you do with my page and data?

We screenshot and extract copy from the URL you give us, run the analysis, and store your report so you can revisit it. We don't train models on your pages, don't resell your data, and only ever test publicly accessible URLs you submit.

Has this been tested on itself?

Yes. This page exists because variant B beat variant A in our own panel, and we then verified the prediction with a live A/B test on real traffic. We publish those self-tests as we run them — a method that can't survive its own scrutiny doesn't deserve yours.

Five minutes from now

You could know what 150 strangers think.

Or you could keep guessing, launch, and find out from a flat conversion chart three weeks from now.

$29 · auto-refund if we can't produce your report