As much as product teams (and project managers) want to create and follow detailed plans, even the best ones are just dressed-up guesses. Instead, great products are built on experimentation–trying small changes at scale, occasionally failing, and using the results to learn, grow, and optimize.
But creating a culture of experimentation–one where your team is excited to look for out-of-the-box solutions and try structured experiments–isn’t always easy.
That’s because experimentation takes work–both in time and resources and in pushing back against your assumptions and intuition (which can cause an existential crisis for even the most confident product manager).
Creating a culture of strategic and deep experimentation builds companies that are curious, resilient, and constantly iterating and adapting.
88% of retailers believe experimentation will play an important role in keeping businesses competitive in the coming years.
So what makes a good product experiment? How should you run them to get the best results and cleanest data? And most importantly, how do you foster a culture of relentless experimentation on your team and in your company?
What makes a good product experiment?
In the early days, creative ideas, wild swings, and pure willpower are often enough to get you traction. But as your company grows, this type of experimentation needs to also ‘grow up’.
The problem is that small experiments are boring. Small changes beget small results. Why not spend time building out that new feature or marketing campaign that will really change your business?
Two words: Human nature.
We’re terrible at knowing what other people want. Staking your entire company on a big bet because you ‘have a feeling’ is a horrible idea. Instead, a structured approach to small, incremental experimentation helps your team:
- Learn: You’ll quickly understand the real impact of your changes and trust the data.
- Improve: The results of your experiments allow you to optimize design, interactions, pricing, and features continuously.
- Save money & resources: By running many small experiments all the time, you’ll get a quick signal with minimal investment and disruption.
- Mitigate risks: Experimentation reduces the risks of complicated releases. It’s better to get bad news early on and head back to the drawing board rather than see months or years of work flop when you go to market.
That’s why you need to be running experiments. But what makes a good experiment in the first place?
The good (and bad) news is that you can experiment with pretty much anything. All you need is a hunch and a way to test it. However, if you want your tests to lead to actionable insights, they need to include a few specific components:
- Problem: A good experiment solves a real user problem. These problems are discovered through data analysis, customer insights, market research, and experience. Even better is if these problems connect to a central product theme to allow for compounding returns.
- Users: Good experiments have a clear ‘audience size’–the fraction of people who will see it. You need to know who will be impacted by this change and their typical behaviors so you can look out for other knock-on changes.
- Benefit: Good experiments know the benefit they’re giving to users and consider trade-offs. If you increase pricing, will you decrease user retention?
- Feature: Good experiments provide one or more solutions that you believe will create value for users. These are ‘informed bets’–you know there’s a chance they’ll flop. But you’ll still learn even if they do.
- Data: Good experiments are built around a ‘North Star metric’–some ultimate goal you want to change. Most experiments will collect hundreds of data points and can change user behaviors in all sorts of ways. You need to go into it knowing the metric you’re trying to change and by how much.
Above all else, good experimentation connects to your mission, values, and goals. As Fareed Mosavat, former Director of Product at Slack, writes:
“Good experiments advance product strategy. Bad experiments only advance metrics.”
Experimentation is an essential part of being an agile team. You need to make bets, build software, track results and real-world feedback, and then implement them into your next sprint.
How to run powerful product experiments in 4 steps
There are tons of strategies and methods you can use to start implementing product experimentation today. However, the core elements of running a proper product experiment stay the same.
Step 1. Ask the right questions
Experimentation starts with finding opportunities. To find opportunities, you need to ask the right questions.
Questions uncover where you might change things. For example:
- What would happen if we raised our prices?
- What if we got rid of the free tier?
- Why do users keep leaving after a month?
Just sitting here, you can probably come up with a giant list of questions you have about your product, ideal user, and market. And that’s OK! One of the best things you can do is to create a question backlog for future experiments.
However, as we wrote above, good experiments aren’t one-off tests. They connect to a larger theme. To ensure you’re asking the right questions for your needs right now, follow these steps.
1. Define the area of experimentation and list your assumptions
A good product experiment has a purpose. You go into it not just with a single question but a broader business goal you want to achieve. Typically, the purpose of an experiment falls into one of three categories:
- You need to make a decision and need data to help inform it. For example, should we work on project X or Y?
- You want to reduce uncertainty related to an assumption. For example, releasing feature X will increase retention.
- You want to understand and improve performance and impact. For example, are we on track to hit our 2-year growth target?
Let’s say you’re worried about long-term user retention.
That’s the theme of your experimentation. You’ll most likely have a ton of questions about what impacts user retention, what features or changes can improve it, and so on.
You’ll also most likely already have an extensive list of assumptions about why users stick around, as well as current initiatives you’d like to validate. Gather all of these together–decisions, assumptions, and performance validations–and put them in a shared list.
Remember, you’re not even thinking about experiments or solutions at this point. Just listing out as many questions as possible. This process can and should take some time. As John Cutler, Head of Product Education at Amplitude, writes:
“You need to make it safe to ask ‘dumb’ and less fully-formed questions. And you can’t rush it. When the people brainstorming are worried about looking silly, they’ll shut down. If they feel rushed, they’ll stick with surface-level questions. Great questions spring from less-great questions which spring from ‘bad’ questions.”
2. Use sub-questions to get closer to testable questions
Next, you want to break down those decisions, assumptions, and validation questions even further. Pick one from each column and brainstorm sub-questions. Where do you need to validate assumptions or get rid of uncertainty?
To kickstart the process, put up a list of question starters: why, who, what, where, when, which, how many, how, how long, do, are, will, have, should, and is.
Here are some examples:
- How many users churned in the last 30 days?
- What features are customers who stick around using that those who churn aren’t?
- Does changing pricing impact user retention for users in Europe?
- How well do startup customers retain compared to Enterprise ones?
- Is there any low-hanging fruit that will quickly increase user retention?
If you’re struggling to come up with more sub-questions, try going up or down a level of resolution. Open-ended questions are great at inspiring more specific ones, while a specific question can inspire more open-ended ones (for example, ‘why is this important?’).
3. Dot vote to decide what to test as a team
The two-step process of brainstorming high-level ideas and then sub-questions will give you a giant list of themed options. However, unless you’re a massive company with internal tools and automated testing processes, you’ll still need to dial in on the most important ones.
Here’s how you can use dot voting to help decide what questions your team feels are most important:
- Place your list of questions and assumptions up on a board
- Give each team member 3–5 ‘dots’ to vote with (you can even just use sticky notes)
- Allow multiple votes on the same idea to show conviction
- Tally up the questions with the most dots
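If your question backlog lives in a digital tool rather than on sticky notes, the tally step is trivial to script. Here’s a minimal sketch of the dot-voting count (the questions and ballots are invented for illustration):

```python
from collections import Counter

# Each inner list is one team member's 3-5 dot votes; repeating a
# question on the same ballot signals extra conviction.
ballots = [
    ["retention", "pricing", "retention"],
    ["onboarding", "retention"],
    ["pricing", "pricing", "onboarding"],
]

# Flatten all ballots and count dots per question
tally = Counter(vote for ballot in ballots for vote in ballot)

# Questions with the most dots rise to the top
for question, dots in tally.most_common():
    print(f"{question}: {dots} dot(s)")
```

The same idea works with issue IDs instead of free-text labels if your questions are already tracked as issues.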
Democratizing the question-brainstorming process like this builds trust in the results. The more your team is involved from the early stages, the more invested they’ll be in the experiment.
4. Turn your problems into problem statements using the HMW system
Finally, take your condensed list of questions and assumptions and translate as many of them as possible into problem statements.
The easiest way to do this is by using How might we (HMW) statements.
- How might we increase user retention?
- How might we persuade users who are about to churn to keep using our product?
- How might we get unengaged users to try out more features?
Gut check: Could what you’re testing be unethical?
Experiments are great for helping you build a better product. But they can sometimes cross the line into being unethical or manipulative. A good test learns about your users and helps them. A bad one tricks them into taking action.
Before you move on, do a quick gut check to see if any of the questions you’ve come up with are bordering on manipulation.
Step 2. Create a testable hypothesis
Next, you want to transform your questions into a testable hypothesis. If it’s been a few years since your last high-school science class, a hypothesis is: “a proposed explanation made on the basis of limited evidence as a starting point for further investigation.”
You take what you know, bet what will happen if you change it, and use that info to keep experimenting.
Look at your list of top questions and assumptions. What is the outcome you’d like to see when you test it? What do you know right now that will help inform that test?
A well-structured hypothesis gives you a roadmap of what you’re testing, the desired outcome, metrics that matter, and other essential details.
Let’s break down the essential elements of a testable hypothesis.
Try it out for yourself: We have put together this Planio experiment issue so you can see how an experiment might look in Planio.
1. Business outcome/North Star metric
What is the outcome you’d like to see? What business goal are you trying to change? Experiments need a definition of success upfront.
Again, there are probably numerous business goals you’re chasing at any moment. However, for the sake of experimentation, you should choose just one. This is called your North Star metric. It’s the guiding factor in determining whether your experiments are working or not.
At Airbnb, they have tons of different metrics they could look at when experimenting–average purchase, repeat users, time to book, etc. However, their North Star metric is: Nights booked.
This metric gives them a central focus for all experimentation. If an experiment increases nights booked, it was a success. And as a result, their business is growing.
A North Star metric also helps you separate the signal from the noise. Each experiment will produce more data than you could go through. If you don’t know where to look, you won’t be able to learn from it.
2. User groups or personas
Who are you going to be testing? Is there a specific user persona you’re going to show your test to? Or will this experiment go out to a random group of your entire user base? What’s your potential ‘audience size’ (i.e., the number of users who will see this test)?
A user group or persona doesn’t just have to come from demographics. Think about the user behaviors you’re testing. Do people who do X also do Y? Why not?
3. User benefit/behavior
What is the user behavior that will drive the business outcome? In other words, what do you want users to do that will increase your North Star metric?
User behaviors don’t happen in a vacuum, however. There are secondary and complementary behaviors that you’ll want to be aware of to see how they impact each other.
For example, at Airbnb, does an increase in Nights Booked push people to choose cheaper properties, which in turn means they’re less happy? Almost every behavior has a trade-off you need to be aware of.
What are you doing to influence that user behavior to reach your business goals?
Think through what tools you have available to you to influence user behaviors: copy, design, UX, or features. Each solution is a test worth running.
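One way to keep all of these elements front and center is to capture each hypothesis as a structured record. The sketch below is purely illustrative–the field names and example values are invented, not a prescribed format:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    north_star_metric: str  # business outcome you want to move
    user_group: str         # who will see the test
    audience_size: float    # fraction of users exposed
    behavior: str           # user action that drives the metric
    change: str             # what you'll alter to influence that behavior
    expected_lift: float    # relative change you expect to see

# Example: a retention experiment written out as a hypothesis record
h = Hypothesis(
    north_star_metric="30-day retention",
    user_group="new startup-plan signups",
    audience_size=0.2,
    behavior="invite a teammate in the first week",
    change="add an invite prompt to onboarding",
    expected_lift=0.05,
)
```

Forcing every experiment into the same shape makes it obvious when a field is missing–if you can’t fill in the expected lift, you haven’t finished the hypothesis.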
Bringing it all together: A product experiment hypothesis example
So what does a hypothesis look like in practice? To see how all this information could look in a real issue, take a look at one we have prepared for you.
Step 3. Conduct your experiment and monitor data
Your hypothesis will tell you everything you need to know about running the experiment: what you’re doing, who you’re targeting, the length of the experiment, what metric you’re tracking, and why you think it matters.
Now, all that’s left is to define the final parameters, go live, and see what happens. How you run the experiment will depend on what you’re testing. We won’t go into each of these, but the most common types of product experiments are:
- A/B tests: Randomly assign users to one of two product experiences: the ‘experiment’ or the ‘control’.
- Multivariate tests: Same as an A/B test but with multiple variables changing at the same time.
- Funnel experiments: Similar to A/B testing but with changes across multiple pages that a customer will go through (like onboarding or a sales funnel).
- Time-lapse: Comparing benchmarks of KPIs before and after a change. (This usually isn’t recommended but can be helpful if you’re unable to run an A/B test for some reason.)
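To make the A/B mechanics concrete, here’s a minimal sketch of deterministic, hash-based assignment. The function name and bucketing scheme are illustrative, not a reference to any particular tool:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, exposure: float = 0.5) -> str:
    """Deterministically bucket a user into 'experiment' or 'control'.

    Hashing user_id together with the experiment name keeps assignments
    stable across sessions and independent between experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10_000 / 10_000  # uniform value in [0, 1)
    return "experiment" if bucket < exposure else "control"

# The same user always lands in the same group for a given test
assert assign_variant("user-42", "pricing-test") == assign_variant("user-42", "pricing-test")
```

Deterministic assignment matters for the ‘tight exposure groups’ requirement below: a user who flips between variants on every page load contaminates both groups.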
No matter what type of experiment you decide to run, a few key factors will determine its success:
- Tight exposure groups: Everyone in the test groups needs to experience the change.
- Systems for tracking set up before the test starts: You know the results you’re after and aren’t ‘cherry picking’ data to confirm your beliefs after the fact.
- Reliable data: You trust that the data you’re collecting is accurate.
Finally, you need to make sure you’re running your experiment for long enough to get solid results and to discount any drastic short-term signals. Users have a ‘burn-in’ period where they’ll be more interested and engaged with changes or new features, which can inflate early results.
How to know if an experiment’s data is invalid
Experimentation depends on trusting your data. However, there are a few common situations where you should throw out or at least question the validity of your results:
- Contaminated data from running multiple tests at once: If you can’t be sure which test impacted your target metric, the results of both are invalid.
- Small audience size: Outlier events can have an outsized impact on your results when the audience size is too small.
- Stopping the test too early: An early spike or drop can be exciting, but the actual test results take time to see. If you stop too early, you might be looking at data that doesn’t reflect user behaviors.
- External events: Did your competitor launch a new product or promo at the same time? Or was there a massive global crisis (like the pandemic)? These can color the results of your experiments.
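The ‘small audience’ and ‘stopped too early’ problems both come down to sample size. As a rough sanity check before launching, you can estimate the users needed per variant with the standard two-proportion formula. This sketch uses the normal approximation with z-scores hardcoded for a two-sided test at 5% significance and 80% power–an assumption, not a universal rule:

```python
import math

def required_sample_size(baseline: float, relative_lift: float) -> int:
    """Approximate users needed per variant to detect a relative lift
    on a baseline conversion rate (alpha = 0.05 two-sided, 80% power)."""
    z_alpha = 1.96  # two-sided 5% significance
    z_beta = 0.84   # 80% power
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# Detecting a 10% relative lift on a 5% conversion rate takes
# roughly 31,000 users per variant–far more than intuition suggests.
print(required_sample_size(0.05, 0.10))
```

If that number dwarfs your weekly traffic, the honest conclusion is that the test as designed can’t produce a trustworthy result, and you should test a bigger change or a bigger audience instead.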
Step 4. Communicate results and act on them!
A good experimentation process is like a never-ending cycle. Every result–both positive and negative–gives you a data point to create a new hypothesis. Once an experiment is complete, the results need to be understood, communicated, and implemented.
1. Create an action plan for what you’ll do with your results
Before you start, know what happens next depending on the results:
- Do you need to dig in and get more info (i.e., find out why it worked/didn’t)?
- Can you use the results to inform an even bigger experiment?
- If it failed, do you know why? Or do you need to run a validation experiment?
2. Communicate the results with the rest of your team
If you’re using our example experiment issue in Planio as a guide, you can see we have included results in a linked issue and some team insights. This way, you have a clear record of all experiments and ideas for reference. Alternatively or additionally, you can document lessons learned from experiments in your team Wiki.
3. Implement the results
Whether this means creating a new hypothesis or launching a feature, make sure you follow through on your action plan.
However, it won’t always be easy to trust the results of an experiment, especially if they go against what you thought would happen. Yet, data always trumps opinion. The only way to build a successful culture of experimentation is to implement the results no matter what.
4. Celebrate the small wins
A culture of constant experimentation can leave little time to celebrate the wins. But it’s essential to show progress to keep everyone excited and motivated! Using a tool like Planio Wiki gives you a central place to store all your experiments and evangelize the process.
How to create a culture of experimentation
If it sounds like running experiments is a lot of work, that’s because it is. However, the results are well worth the effort.
To get your team on board with running experiments, focus less on the tools you’re using and more on why they should care about experimentation. As Stefan Thomke writes in the Harvard Business Review:
“As companies try to scale up their online experimentation capacity, they often find that the obstacles are not tools and technology but shared behaviors, beliefs, and values. For every experiment that succeeds, nearly 10 don’t—and in the eyes of many organizations that emphasize efficiency, predictability, and ‘winning,’ those failures are wasteful.”
Here are a few of the steps necessary to build a resilient culture of experimentation:
1. Cultivate curiosity across the company
Everyone from leadership down to individual developers should value surprises. This helps shift the judgment of experiment ‘failures’ from a waste of time to an opportunity to learn and grow.
Curiosity needs to be valued across the company. Managers are often afraid of the risks involved with experimentation. So they overemphasize the need for ‘successful’ experiments (which is impossible to predict) or default to minor optimizations with no chance of becoming runaway successes.
But early failures, which are bound to happen, are the easiest way to help your development team learn, eliminate indecision, and refocus on the most promising product ideas.
2. Align everyone around a measurable North Star metric (not just the product team)
Everyone across your company should be able to explain your North Star metric and why it matters. A North Star metric helps you refine what is important to your team and find questions, assumptions, and experiments that will inform it.
3. Hire (or add) data-minded people to your team early and often
Reliable data is a critical part of getting buy-in on experimentation. As you grow, hire more data-minded people and embed them in your teams. Not only will this make it easier to run experiments, but introducing more experimentation vocabulary into your day-to-day will help normalize the process.
4. Humanize your data (to make it accessible to all)
Data can be confusing or even dehumanizing if not presented correctly (which will make your team less interested in experimenting or implementing the results).
Recognize that data doesn’t always have the answers and can sometimes be downright unhelpful. Instead, try to explain what the data means. Frame it as the ‘voice of your users’ and show how you either will or won’t use the data to inform decisions.
5. Build tools that make it easier to run experiments you trust
There are plenty of third-party tools you can use to run experiments if you can’t build your own. What’s most important is that you trust the results and that it’s easy for anyone to start one.
Democratize how you start an experiment as well as the results of it. By training everyone on how experiments work and sharing the same tools, you build trust in the process and the results.
6. Be a role model
You can’t expect your team to build a culture of experimentation if you don’t also test your own suggestions. Ego is one of the quickest ways to kill experimentation.
As Francis Bacon (the 16th-century scientist, not the painter) wrote:
“If a man will begin with certainties, he shall end in doubts; but if he will be content to begin with doubts, he shall end in certainties.”
7. Know when you can run experiments and when you can’t
Experimentation is at the heart of the world’s most successful companies. However, not everything can be tested rigorously before making a decision.
Strategy, experience, and leadership are qualities that drive a company forward through the moments where experimentation can’t be used. Knowing when to use each tool and when to trust your gut over the available data is something that comes with time, not tools.
Even the longest races are won one step at a time
Experimentation is a bias toward action that planning, endless meetings, and indecision can never match. And it’s been that way for decades. As Claude Hopkins, one of the pioneers of advertising, wrote in 1923:
“Almost any question can be answered cheaply, quickly, and finally, by a test campaign. And that’s the way to answer them–not by arguments around a table.”
Over time, experimentation will lead to thousands of small and not-so-small changes that will completely transform your company. Rather than only betting on the big swings, you’ll be slowly progressing every single day.