I Asked 100 AI Agents to Judge an Advertisement
Here's what they said (and how you can ask them, too)
January 13, 2025
In prompt engineer Michael Taylor’s latest piece in his series Also True for Humans, about managing AIs like you'd manage people, he simulates multiple AI agents to get more diverse and useful results from LLMs. Michael applies the wisdom of the crowds concept to working with AI and forecasts how economy-simulating AI agents will help us make better decisions in the future. Plus: If you have a question you’d like to ask an AI focus group, scroll to the end to submit it.—Kate Lee
In one of my first jobs in marketing, the CEO would abruptly change his mind based on the most recent person he talked to. We jokingly called it “management by elevator” because company strategy seemed to pivot based on whoever rode the elevator with him that morning.
As an entrepreneur, I sometimes catch myself doing the same thing, giving myself whiplash by abruptly changing my product roadmap after a single positive customer call. Perhaps I should talk to a wider pool of people, so that any one person doesn’t unduly influence me. After all, I shouldn’t act the first time I see an opportunity; instead, I should wait to see whether patterns form across my interactions.
That’s easier said than done, though, because I have to actively go out and find more people to talk to about my ideas, which might mean asking for referrals, contacting people I don’t know, or even spending money on advertising or consulting. In other words, I need to step outside of the elevator. And there’s no guarantee that the people who are willing to spend time talking to me don’t have an agenda or bias that makes their feedback less valuable.
These days I work from home so there is no elevator, but I do have access to Claude 3.5 Sonnet, my preferred AI model. Large language models have been trained on large swaths of the internet, so they can identify gaps in my knowledge. I use Claude as my brainstorming assistant, running most of my decisions past it for a second opinion.
Still, it’s not a perfect solution. The problem with using AI for market research is that the feedback it gives can be too generic and broad. LLMs are fine-tuned with reinforcement learning, in part so they won’t offend anyone, but that also limits their ability to weigh in as a creative partner. For example, if you ask an LLM to rate a controversial new ad from Jaguar, it gives a bland and unhelpful response, without really considering why the new ad might work for younger, less traditional car buyers.
Source: Screenshots courtesy of the author, using Anthropic’s Claude 3.5 Sonnet.
One way to escape overly generic responses is role-play prompting: asking the AI to adopt a persona when it responds. But here’s the twist. Instead of asking AI to adopt a single persona, you can ask it to simulate many personas, eliciting individual responses from each before combining their thoughts into a single-paragraph response. I call this personas of thought, a riff on chain of thought, a technique for getting an AI to employ step-by-step reasoning. The difference is that this version reasons by simulating what various people would think rather than thinking as an individual step-by-step.
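To make the three moves concrete (simulate many personas, elicit a response from each, combine them into one paragraph), here is a minimal sketch in Python that only assembles the prompt text. It is not the template referred to later in this piece, which isn't reproduced here. The function name `build_personas_prompt`, its parameters, the step wording, and the sample question are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def build_personas_prompt(
    question: str,
    audience: str,
    n_personas: int = 5,
    personas: Optional[Sequence[str]] = None,
) -> str:
    """Assemble one prompt that asks an LLM to answer via several personas.

    Hypothetical helper: the wording below is an illustrative guess at a
    personas-of-thought prompt, not the author's own template.
    """
    if personas:
        # Caller supplied their own personas (e.g. from market research).
        listed = "\n".join(f"- {p}" for p in personas)
        persona_step = f"Use these personas, one at a time:\n{listed}"
    else:
        # Otherwise ask the model to invent personas from the audience.
        persona_step = (
            f"Invent {n_personas} distinct personas drawn from this "
            f"audience: {audience}. Give each a name and a one-line background."
        )
    return "\n\n".join([
        f"Question: {question}",
        f"Step 1. {persona_step}",
        "Step 2. Write each persona's individual response in their own "
        "voice, including any disagreements.",
        "Step 3. Combine those responses into a single-paragraph final answer.",
    ])


# Example: the sample question is made up; the audience comes from the article.
prompt = build_personas_prompt(
    question="How effective is this ad likely to be?",
    audience="new potential buyers of Jaguar cars",
)
print(prompt)
```

The resulting string can be pasted into any chat model. Passing your own `personas` list swaps the "invent personas" step for profiles you already have.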
A template for personas of thought
I frequently use this template to crowdsource opinions from multiple AI personas, and it reliably gives me more insightful and varied responses than naively asking ChatGPT or Claude to answer directly:
We can apply this approach to our Jaguar ad testing to get a far more nuanced discussion, more closely matching the real-world answers you would get if you ran a focus group full of humans. Here’s the prompt for this context:
Then, after generating the relevant personas and their simulated responses, the model combines them into a final answer that is deeper and more insightful than the average LLM response:
The “Copy Nothing” ad doesn’t speak to me personally, but, then again, I’m not in the target market for buying a new Jaguar. Whether you personally like the ad or not isn’t the point. By simulating reactions from a relevant audience (in this case, new potential buyers of Jaguar cars), you get a wider distribution of opinions and deeper insights into the potential effectiveness of ideas you’re testing. You can use AI to explore more creative pathways, without converging on dull and uninteresting responses.
What’s more, this technique scales: If you need more accurate responses, you can add more detailed information for the personas to take into account, or you could even supply your own personas. Most businesses have done at least some market research, which you can repurpose to create in-depth customer profiles to supplement this prompt and make the results more accurate. Now you have a digital twin of your best customers that you can bounce ideas off of without bothering real people or incurring the cost of running focus groups.
Let’s walk through a real-world application: getting customer feedback on product ideas with personas of thought, using customer interview transcripts and Google NotebookLM. We’ll explore the science behind how AI agents predict human behavior, and finish by exploring what happens when we can simulate scenarios with thousands of AI agents in a virtual economy, so we can see how humans might behave in the same conditions...