Synthetic personas: the role of real insights in the age of AI

Synthetic AI personas are fast and cheap, but they won't surprise you with a heartfelt anecdote or an uncomfortable truth. Here's where synthetic works, where it doesn't, and how to combine it with real human research.

Key takeaways

  • Synthetic personas are computer-generated profiles built from aggregated data patterns. They can answer quick questions and validate early ideas, but they won't surface the unexpected moments that real human conversations reveal.
  • Real human research captures emotion, hesitation, and context that synthetic personas can't replicate, because those reactions happen in the moment, not in a dataset.
  • Synthetic works well for rapid idea validation and creative message checks. For detailed UX testing, emotionally sensitive topics, and strategic decisions, real participants are essential.
  • The biggest risks of leaning too heavily on synthetic personas are transparency (stakeholders deserve to know where insights come from) and bias (LLMs can amplify existing societal biases).
  • The smartest approach combines both. Use synthetic for speed and breadth. Use real customer conversations for depth, nuance, and emotional truth.

There's plenty of chat about synthetic AI personas right now. Fast, cheap, always on. Sounds great, right? But can they really replace talking to real people?

If you work in marketing, product, UX, innovation, or research, the idea of instant answers without recruitment costs is tempting. Before you commit to fully synthetic research, though, it's worth thinking carefully about what you might lose along the way, and where the real value of human conversations actually sits.

What are synthetic personas?

Synthetic personas are computer-generated profiles that simulate customer segments by drawing patterns from existing data, survey responses, purchase histories, and large language models. They let teams "talk to" a representative customer instantly, without recruiting a real one.

The data behind them can come from anywhere: first-party CRM data, third-party panel data, social listening feeds, or the public datasets used to train general-purpose LLMs. Some synthetic personas are narrow and grounded in a company's own customer data. Others are broader, built on top of general-purpose models trained on whatever happens to be on the internet.

That distinction matters. A synthetic persona trained on a representative sample of your actual customers is very different from one generated by asking a general-purpose LLM to "pretend to be a 34-year-old shopper from Manchester who buys your brand." Both get called "synthetic personas." Only one is grounded in anything real.

Synthetic personas in a sentence:

Computer-generated profiles that simulate customer segments by drawing patterns from existing data, designed to give teams instant answers without recruiting real participants.

How do synthetic personas compare to real human research?

Synthetic personas win on speed, cost, and scale. Real human research wins on depth, emotional nuance, and the surprises that make innovation possible. The two approaches sit at different points on the trade-off, and neither fully replaces the other.

| | Synthetic personas | Real human research |
|---|---|---|
| Speed to first answer | Minutes | Days to weeks |
| Cost per "conversation" | Low | Higher (recruitment, incentives, fieldwork) |
| Scale | Effectively unlimited | Bounded by recruitment and budget |
| Emotional depth | Limited | Rich. Tone, hesitation, body language. |
| Unexpected insight | Rare. Outputs mirror training data. | Common. People surprise you. |
| Post-rationalisation | Built in. The model only has the tidy version. | Avoidable with in-the-moment capture. |
| Bias risk | Inherits bias from source data and LLMs | Depends on recruitment and sample |
| Transparency | Depends on methodology and disclosure | Clear. Participants are real and attributable. |
| Best for | Early hypothesis checks, scaled validation | Depth, nuance, strategic decisions |

Why do real human responses matter?

Because people don't always know what they feel, and even when they do, they rarely rationalise it accurately after the fact. Synthetic personas work with polished data. Real customers give you the messy middle. The hesitation. The workaround. The quiet contradiction between what someone says and what they do.

Say you're launching a new marketing campaign. Synthetic personas might reassure you that your messaging ticks the boxes. They won't show you the moment a real customer pauses the ad, goes quiet, then says "yeah, I don't know, it feels a bit like it's talking down to me." Those authentic reactions, sometimes uncomfortable, always revealing, are what spark meaningful change.

It's also about memory. People are remarkably good at constructing logical narratives after the fact. A customer interviewed a week after a purchase will give you a clean, tidy story. The same customer recording a video the moment they open the packaging will give you something very different. That's the gap diary studies and mobile ethnography are designed to close, and it's the gap synthetic personas can never see across.

What are synthetic personas being used for today?

Most teams are using synthetic personas at the start of a project, for rapid idea validation, early messaging checks, and stress-testing concepts before committing real budget.

Common applications include:

  • Generating quick "what would this segment think?" reads during brainstorms
  • Running dozens of creative variations past a simulated audience before narrowing to real testing
  • Synthesising existing qualitative data into a representative persona for internal reference
  • Stress-testing copy, positioning, or pricing before investing in fieldwork

In most of these cases, synthetic works best when it's clearly framed as a first pass, not a final answer.

When are synthetic personas enough, and when do you need real humans?

Synthetic works when the decision is reversible, the stakes are low, and you only need a directional read. Real humans are essential when the decision is high-stakes, emotionally loaded, or tied to a brand or product experience you can't easily change later.

Here's a practical way to think about it:

  • Rapid idea validation. Synthetic can work well, assuming you have a representative data set and aren't relying on general-purpose LLMs.
  • Detailed UX testing. Real customers bring essential depth. Someone navigating your checkout at their kitchen table behaves differently than any simulation will predict.
  • Creative message checks. Start with synthetic, confirm with real customers. Use synthetic to narrow twenty options to three, then let real participants react to those three.
  • Emotionally sensitive or complex topics. Real conversations are essential. Healthcare experiences, financial stress, grief, identity. These are places synthetic personas consistently fall short.
  • Strategic decisions affecting brand or product direction. Real input. Decisions that shape brand perception, customer loyalty, or product success need a nuanced understanding of how people actually think and behave, and that is a tall order for any simulation.

Synthetic personas can reduce costs upfront. But the bigger calls, the ones that shape your brand, your product, your customer relationships, usually need richer, real-world understanding than a simulation can offer.
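If it helps to make the rule of thumb above concrete, here is a minimal sketch in Python. It is an illustration only, not a tool we ship: the ResearchDecision fields and the recommend_method function are hypothetical names, and the three possible answers simply restate the guidance in this section.

```python
from dataclasses import dataclass


@dataclass
class ResearchDecision:
    """Hypothetical description of the decision a study will inform."""
    reversible: bool             # can you easily change course later?
    high_stakes: bool            # does it shape brand or product direction?
    emotionally_sensitive: bool  # healthcare, financial stress, grief, identity
    directional_read_only: bool  # is a rough steer all you need?


def recommend_method(decision: ResearchDecision) -> str:
    """Return 'real', 'synthetic', or 'synthetic then real'."""
    # High-stakes, emotionally loaded, or hard-to-reverse calls need real people.
    if (decision.high_stakes
            or decision.emotionally_sensitive
            or not decision.reversible):
        return "real"
    # Low-stakes, reversible, directional reads can stay synthetic.
    if decision.directional_read_only:
        return "synthetic"
    # Everything in between: narrow with synthetic, confirm with real customers.
    return "synthetic then real"


# Example: an early brainstorm read that is easy to walk back.
brainstorm = ResearchDecision(
    reversible=True,
    high_stakes=False,
    emotionally_sensitive=False,
    directional_read_only=True,
)
print(recommend_method(brainstorm))  # synthetic
```

The ordering is deliberate. The checks that call for real participants run first, so a decision only lands on "synthetic" once none of the higher-risk conditions apply.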

What are the risks of relying on synthetic personas?

Two main risks. First, transparency. If stakeholders think they're hearing from real customers when they're not, trust erodes fast. Second, bias. Synthetic personas inherit whatever bias is in the underlying data, and LLMs can amplify those biases rather than correct for them.

Transparency and trust

Stakeholders deserve to know exactly where insights come from. A slide deck that says "customers told us…" carries very different weight from "our synthetic persona suggested…" Misrepresenting synthetic feedback as real, even unintentionally, is a fast way to burn credibility when the decisions it informs don't work out.

Bias in AI-generated personas

Synthetic insights mirror the biases in their underlying data. If the data is skewed toward a particular demographic, purchase behaviour, or worldview, the synthetic persona will be too. Research has shown that large language models can produce outputs that reflect and even amplify existing societal biases, leading to skewed or stereotypical representations (Marketing Week).

The bias problem, in one line:

A synthetic persona is only as representative as the data it's built on, and LLMs can amplify whatever bias is in that data rather than correct for it.

Cross-checking synthetic outputs against real customer input keeps you grounded. The synthetic layer generates hypotheses. The real layer tests them.
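One simple way to run that cross-check is to code both sets of responses into the same categories and compare the shares. The sketch below assumes you have already done that coding. The function names, the 15-point threshold, and the sample data are all hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter


def response_shares(responses):
    """Share of responses falling into each coded category."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    return {option: n / len(responses) for option, n in counts.items()}


def flag_divergence(synthetic, real, threshold=0.15):
    """Categories where the synthetic share differs from the real share
    by more than `threshold`. Positive gap = synthetic over-represents it."""
    syn = response_shares(synthetic)
    rl = response_shares(real)
    flagged = {}
    for option in sorted(set(syn) | set(rl)):
        gap = syn.get(option, 0.0) - rl.get(option, 0.0)
        if abs(gap) > threshold:
            flagged[option] = round(gap, 2)
    return flagged


# Made-up coded reactions to the same ad concept.
synthetic_reactions = ["like"] * 8 + ["neutral"] * 2
real_reactions = ["like"] * 4 + ["neutral"] * 3 + ["patronising"] * 3

print(flag_divergence(synthetic_reactions, real_reactions))
# {'like': 0.4, 'patronising': -0.3}
```

In this made-up example the synthetic layer overstates approval and misses the "patronising" reaction entirely, which is exactly the kind of gap real participants tend to expose. A real study would need a sample large enough for the shares to mean something.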

How should you combine synthetic and real research in practice?

Use synthetic for breadth and speed. Use real human research for depth and truth. Let the two layers check each other.

Most teams benefit from a simple workflow: synthetic at the edges, real in the middle.

  1. Synthetic for hypothesis generation. Use synthetic personas to explore a problem space quickly, generate hypotheses, and narrow your creative options.
  2. Real human research for depth. Run a diary study, video interviews, or mobile ethnography with a small number of real participants. Watch how they actually behave. Listen to what they say in their own kitchens and cars. Capture the moments synthetic can't predict.
  3. Synthetic for scaled validation. Use what you've learned from real customers to refine your synthetic inputs. Then use the improved synthetic layer to pressure-test at scale.

The order matters. Starting with synthetic and never validating against real humans leaves you building on a picture that might already be wrong. Grounding the synthetic layer in real research before you scale it gives you a foundation that synthetic can extend, rather than replace.

How does Indeemo help you stay connected to real customers?

Indeemo is an end-to-end mobile video research platform. You can recruit real participants from a global panel of 3 million+, run studies in 30+ languages, and use AI to transcribe, translate, and analyse the videos, photos, screen recordings, and texts they submit, turning weeks of fieldwork into days.

The platform is designed to feel familiar. Participants download an app that works like the social media they already use. They capture videos, photos, screen recordings, and texts from their real lives. In their kitchens, at the supermarket, during their commute. Researchers watch submissions come in from a dashboard and can ask follow-up questions in the moment.

On the back end, generative AI handles transcription, translation, thematic analysis, and sentiment detection. Instead of waiting two weeks for a transcription vendor, you can start reviewing insights the same day. Subtitled highlight reels let you share a real participant's face and voice with stakeholders, which lands differently than a bullet-pointed summary or a synthetic quote.

Real customers. Real reactions. Faster than you'd expect.

Do you need to be a research expert?

No. Whether you're an experienced researcher or a brand team exploring mobile video research for the first time, Indeemo can support you.

Use the platform independently if you have the expertise in-house. Or partner with our Catalyst team for study design, recruitment, moderation, analysis, or the full project. If you have research ambitions but not the capacity or expertise to run the project yourself, we can lend a helping hand as and when you need it.

Indeemo can be more than a platform. It can be a partnership.

The smartest approach is still deeply human

Synthetic personas offer something real: speed, scale, and a useful way to think about customer segments early in a project. What they can't offer is the emotional depth, unexpected nuance, and messy authenticity that come from watching a real customer unpack your product in their actual kitchen.

In a world racing towards automation, the smartest approach might still be deeply human. Use AI to guide you. Trust real customer stories to inspire your innovation.

Frequently asked questions

What's the difference between a synthetic persona and a traditional persona?

A traditional persona is usually built by a researcher who synthesises qualitative interviews into a composite representative customer. A synthetic persona is generated by an algorithm drawing patterns from a dataset, often with no direct human interview involved. Both aim to represent a customer segment, but one is a human interpretation of real data and the other is an automated approximation.

Can synthetic personas replace real customer research?

In most cases, no. Synthetic personas are useful for early-stage exploration and scaled validation, but they can't capture the emotional nuance, unexpected insight, or in-the-moment behaviour that real customers reveal. For strategic decisions and emotionally sensitive topics, real human research is still essential.

How accurate are synthetic personas?

Accuracy depends entirely on the underlying data. A synthetic persona built from a representative, first-party customer dataset will be more useful than one generated from a general-purpose LLM. Even in the best case, synthetic outputs should be cross-checked against real customer input before informing major decisions.

What kinds of bias can synthetic personas introduce?

Synthetic personas inherit the biases in their source data, and LLMs can amplify those biases further. Common issues include demographic skew, stereotyping, and over-representation of loud or mainstream voices at the expense of quieter, edge-case perspectives. This is one reason real human research remains important for validation.

How does Indeemo fit into a workflow that uses synthetic personas?

Indeemo gives you the real human layer. Use synthetic personas for early hypothesis generation, then run a diary study, video interviews, or mobile ethnography with real participants to validate and deepen what the synthetic layer suggested.