If you are working in physics or chemistry, and your instruments are precise enough, a measured difference between, say, 62.7% and 63.3% may be both accurate and relevant. But here we are dealing with the behaviors of markets and people, which are much harder to quantify. However, if you properly apply the tools and techniques I’ve been sharing with you, you will significantly increase your chances of success—80ish percent guaranteed!
Now about the arrows in the diagram. That big, ominous black arrow at the bottom labeled “The Law of Market Failure” points to Very Unlikely. That arrow is there to remind us of the hard fact that most new ideas will fail in the market. It’s there to help us make sure that we run enough experiments and collect enough YODA to counteract the dismal initial odds for success predicted by the Law of Market Failure. Let me use an analogy to illustrate this important point.
In US criminal courts, a defendant is considered innocent until proven guilty beyond a reasonable doubt. Not only that, but the burden of proof falls on the prosecution. In a criminal trial, a defendant does not have to prove his or her innocence; it’s the prosecution that has to provide enough convincing evidence to prove guilt. But when we move from criminal law to the law of the market to put an idea on trial, we begin with a presumption that our idea is “guilty” of being The Wrong It—that it will fail in the market. It is our job to provide enough hard evidence to swing the jury in favor of the idea.
In a similar vein, astronomer Carl Sagan popularized the phrase “Extraordinary claims require extraordinary evidence.” Products that are The Right It are not exactly extraordinary (there are too many of them), but they are the exception rather than the rule. So perhaps our motto should be: “Exceptional claims require sufficient evidence to support the exception.” The only evidence that our court will allow is YODA with skin in the game. And the only way we can get that YODA is by running pretotyping experiments. Which brings me to the other arrows.
Each of the smaller, light-colored arrows to the right of the scale represents a separate pretotyping experiment and how that experiment maps onto a specific likelihood of success. To perform that mapping, you have to determine if and how well the data you collected from an experiment corroborates (i.e., supports) your hypotheses. One way you can do this is by asking the following question after you’ve run an experiment and collected data from it:
If this idea is destined to succeed in the market, assuming that it is competently executed, how likely is it that this particular pretotyping experiment would have produced this result?
Remember that a pretotyping experiment is designed to test a specific xyz hypothesis, which is derived from a Market Engagement Hypothesis (MEH). So what you are really asking is this:
If our MEH is true, how likely is it that this pretotyping experiment would generate this data?
Here’s a guideline to help you map your answer on the TRI Meter; a short sketch after the list shows one way to make the mapping concrete:
If the data significantly exceeds what the hypothesis predicts, you point the arrow to Very Likely.
If the data meets or slightly exceeds what the hypothesis predicts, you point the arrow to Likely.
If the data falls a bit short of what the hypothesis predicts, you point the arrow to Unlikely.
If the data falls really short of what the hypothesis predicts, you point the arrow to Very Unlikely.
Finally, if the data is for some reason ambiguous, potentially corrupted, or hard to interpret, you point the arrow to 50/50 or discard the result altogether. After all, even in science not all experiments produce clean and reliable data.
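If it helps to see the whole guideline in one place, here is a minimal sketch of the mapping in Python. The specific cutoffs (1.5 times the prediction for “significantly exceeds,” half the prediction for “falls really short”) are illustrative assumptions, not rules from this chapter; adjust them to fit your hypothesis and your appetite for risk.

```python
def tri_meter_rating(predicted_pct: float, observed_pct: float,
                     ambiguous: bool = False) -> str:
    """Map one pretotyping experiment's result to a TRI Meter arrow.

    predicted_pct: the percentage your xyz hypothesis predicts (e.g., 20).
    observed_pct:  the percentage the experiment actually measured.
    ambiguous:     True if the data is corrupted or hard to interpret.
    The 1.5 and 0.5 cutoffs below are illustrative assumptions.
    """
    if ambiguous:
        return "50/50 (or discard)"
    ratio = observed_pct / predicted_pct
    if ratio >= 1.5:        # significantly exceeds the prediction
        return "Very Likely"
    if ratio >= 1.0:        # meets or slightly exceeds the prediction
        return "Likely"
    if ratio >= 0.5:        # falls a bit short of the prediction
        return "Unlikely"
    return "Very Unlikely"  # falls really short of the prediction


# Example: the xyz hypothesis predicts 20% and the experiment measures 15%.
print(tri_meter_rating(predicted_pct=20, observed_pct=15))  # Unlikely
```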
Example: Second-Day Sushi
All of this sounds more complicated than it actually is, so let me show you how to use the TRI Meter on our Second-Day Sushi example, which, unlike the fish itself, should still be fresh in your mind. First, we have to make sure that we have an XYZ Hypothesis that we can hypozoom into a set of xyz hypotheses.
If you recall, we had the following XYZ Hypothesis for Second-Day Sushi:
At least 20% of packaged-sushi eaters will try Second-Day Sushi if it’s half the price of regular packaged sushi.
And from that, we hypozoomed to our first xyz hypothesis:
At least 20% of students buying packaged sushi at Coupa Café today at lunch will choose Second-Day Sushi if it’s half the price of regular packaged sushi.
To test the first xyz hypothesis, we came up with a Relabel pretotype: stick a label that says “Second-Day Sushi: 1/2 Off” on half of the boxes on display and count how many people choose to buy them. Let’s assume that 100 boxes of sushi are on display and that we label half of them (50 boxes) as Second-Day Sushi. The key piece of data we need to collect is the percentage of Second-Day Sushi boxes sold compared to the total number of sushi boxes sold. In other words, how many people who wanted sushi for lunch chose to buy Second-Day Sushi?
Let’s assume that during lunch, students bought a total of 40 boxes of prepackaged sushi. How many of those boxes were the relabeled Second-Day Sushi? Here are a few possible scenarios:
Outcome   Second-Day Boxes Sold (of 40 total)   % of Total Boxes Sold
A         0                                     0
B         2                                     5
C         6                                     15
D         8                                     20
E         16                                    40
F*        2                                     5
G**       30                                    75
*On the day of the experiment, there was an article in The Stanford Daily about the risks of eating raw fish.
**The lunch crowd included a group of 130 Japanese students visiting the campus.
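For reference, the “% of Total Boxes Sold” column is simply the number of Second-Day Sushi boxes sold divided by the 40 total sushi boxes sold at lunch. Here is a short, self-contained sketch of that arithmetic; the outcome counts come straight from the table, and everything else is illustrative.

```python
# Illustrative sketch: reproduce the "% of Total Boxes Sold" column above.
TOTAL_SUSHI_BOXES_SOLD = 40   # all sushi boxes sold at lunch
PREDICTED_SHARE_PCT = 20      # the xyz hypothesis: at least 20%

second_day_boxes_sold = {     # outcome label -> Second-Day Sushi boxes sold
    "A": 0, "B": 2, "C": 6, "D": 8, "E": 16, "F": 2, "G": 30,
}

for outcome, boxes in second_day_boxes_sold.items():
    share_pct = 100 * boxes / TOTAL_SUSHI_BOXES_SOLD
    print(f"Outcome {outcome}: {boxes} of {TOTAL_SUSHI_BOXES_SOLD} boxes "
          f"-> {share_pct:.0f}% (prediction: at least {PREDICTED_SHARE_PCT}%)")
```

Outcome D, for example, works out to 8 / 40 = 20 percent, exactly the minimum share our xyz hypothesis predicts.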
Here’s how these results map on the TRI Meter:
Outcome A (0% of all boxes sold): Ask yourself, “If Second-Day Sushi is The Right It, how likely is it that we would sell 0 boxes out of forty?” Given that your first xyz hypothesis predicts 8 boxes and you sold 0, this is an easy call. The arrow from that result should point to Very Unlikely.
Outcome B (5% of all boxes sold): This result is not as discouraging as the previous one; after all, two people bought into the idea of slightly stale sushi, but it’s well below the predictions of 20% from our hypothesis. Unless we are ready to dramatically revise our business model and expectations (e.g., target a few daring souls who are really short on cash and hungry for sushi), outcome B should also point to Very Unlikely.
Outcome C (15% of all boxes sold): The data from this experiment provides evidence that some sort of viable market might exist, but that this market is not as big as our existing hypothesis requires for the business to succeed. For now, and unless we decide to adjust our business model and expectations accordingly, these data point to Unlikely.
Outcome D (20% of all boxes sold): This is at the bottom range of our hypothesized market, but it fully meets the minimum requirements for confirming the hypothesis. Yay! It deserves a Likely rating.
Outcome E (40% of all boxes sold): Whoa! This result blows our prediction out of the water. If we ask, “If Second-Day Sushi is The Right It, how likely is it that we would sell 16 boxes out of forty?” we can confidently answer Very Likely.
Outcome F (5% of all boxes sold): This is a discouraging result, but the data is questionable because, by an unlucky coincidence, the school’s daily paper had a front-page article on the risks of eating raw fish. Because of that, we either point the arrow to 50/50 (inconclusive) or discard it.
Outcome G (75% of all boxes sold): This is a fantastic result, but we have to be objective, so we cannot ignore the fact that on the day of the experiment an unusually high number of high-school students on a college tour from Japan visited the café. Perhaps these young students did not fully understand the implication of the Second-Day Sushi name, or perhaps they did not have a lot of money for lunch. Either way, since this was not a normal situation, we should probably dismiss this particular result. As much as we’d like to believe that our idea is the greatest ever, we have to be careful not to fool ourselves.
How Much Data Do You Need?
After you understand the TRI Meter scale and know how to map the data you collect onto the likelihood of success, you need to answer an important question: How much data is enough data? First of all, let me be clear: a single experiment will not do, no matter how decisive or conclusive you think its results are.
Think about it this way. If, using the above example, our first experiment results in outcome A (0%), do we then give up on the idea? Or if it results in outcome E (40%)—twice as good as we expected—do we drop everything and dedicate ourselves full-time to Second-Day Sushi? Before you answer, consider a couple of examples of other types of important decisions:
Would you propose marriage or accept a marriage proposal after a single date? Hopefully not, even if that one date was the perfect date. Those first few hours together might be a promising indication that you may have found your Mr. Right or Ms. Right. But, just like The Right It, the Right Him or the Right Her are the exception and not the rule, so the smart thing is to confirm that initial result with more dates.
If you were interviewing a potential candidate for a job in your organization, would you ask just one question and make a final hiring decision based on that one answer?
“How many Ping-Pong balls can you fit in a school bus?”
“Ahem . . . I am not sure . . . 100,000?”
“Wrong! Not even close. You are clearly not Pong Industries, Inc., material. Thank you for coming and good luck with your job search. The door is that way.”
A single pretotyping experiment is not enough to reliably determine whether or not our idea is likely to succeed—even if the results from that one experiment are clear and compelling. Why? Because many possible factors can skew an experiment. We can discount or discard some results if we are aware of factors that may have tainted them (e.g., the scary news article about sushi safety or the spike in Japanese visitors in our Second-Day Sushi example), but we can’t possibly know or account for all possible ways that our data may be skewed or corrupted.
A total novice at archery may hit the bull’s-eye on the first-ever shot, while an experienced archer may miss badly once in a while. That’s why you need several arrows on the sample TRI Meter.
In order to have confidence in your results, you need to run multiple pretotyping experiments and validate multiple xyz hypotheses. How many experiments do you need to run? That’s like asking how many dates you should go on with someone before you propose or agree to marry that person, or how many interview questions you should ask a candidate for an important job before extending an offer. The answer to such questions depends on a number of factors (how well those dates went, how critical the position you are trying to fill is, etc.), but the number of dates (or questions) should be more than one or two—wouldn’t you agree?
Similarly, when it comes to pretotyping, the answer depends on a number of factors, such as:
How much are you planning to invest in the idea?
How much time or money can you afford to lose if it doesn’t work out?
How much certainty do you need before making a decision?
Are the results from the experiments you’ve run so far conclusive or inconclusive?
As a rule of thumb, I’d say you need to design and run a bare minimum of three to five experiments—and several more if executing the idea involves a significant risk (e.g., quitting your job or “betting the company”) or a major investment. The number of experiments should be commensurate with the investment and the consequences of failure—the amount of your skin in the game.
Interpreting the TRI Meter
Now that you know how to map YODA from individual experiments on the TRI Meter, I will show you how to interpret the overall results and decide on the next steps. To help me do that, I will use a sample scenario that follows a typical journey from an initial version of an idea to The Right It. Put on your boxing gloves and mouth guard, because we are about to enter the ring and go for a few rounds with the Beast of Failure.
Round 1: Punched in the Face
Let’s begin with the most common scenario. Unless you get lucky, after a couple of experiments the TRI Meter for the first iteration of your idea (Idea 1) will look something like this:
If you are new at this, those first punches are going to surprise, hurt, and disorient you. But don’t let this kind of result demoralize or discourage you.
First of all, welcome to the club! What club? The very crowded club of people who thought that their idea was for sure—no doubt about it—The Right It, only to have their hopes and expectations mercilessly dashed by the Beast of Failure.
Second, think how much worse off you’d be if you had gone ahead with that idea without testing it. After investing months of work and lots of money to develop and market your product, you find out that your idea was The Wrong It all along—a knockout punch that sends you to the hospital. Fortunately, our thinking, pretotyping, and analysis tools can help you avoid that. A little pain now can save you tons of pain later. By learning quickly and cheaply that a particular idea is not likely to be successful, you will have plenty of time and resources left to modify your original idea or explore a new set of ideas—to go a few more rounds.
Based on this TRI Meter, we should concede that our beloved new product idea is most likely headed for failure in the market. Round 1 goes to the Beast of Failure. If you are really passionate about your new product, you may decide to get back into the ring and run a few more experiments with the same exact idea—just to be sure. But a more logical and less painful course of action would be to go back to the drawing board (or back to your corner, if you like the boxing metaphor) and use what you’ve learned from your experiment to tweak your idea.
Rounds 2–4: We Take Some, We Give Some
We make some tweaks to our original idea (Idea 1) and run some tests with each of the variations (Ideas 2, 3, and 4). When we map the results on the TRI Meter we get the “Likelihood of Success” shown below.
We still get punched quite a bit—especially with Idea 2—but not as hard as before. Our tweaked versions of the idea manage to stay out of the Very Unlikely zone, and we even manage to land a punch with the fourth version of our idea (Idea 4). That’s a very good sign—we are learning more about the market, tweaking accordingly, and moving closer to The Right It territory.
Round 5: We Land a Few Good Punches
Using Idea 4 (the one that scored a Likely) as the starting point, we make a couple of additional tweaks and go back into the ring with Idea 5.
The arrows from our three experiments with the fifth version of our idea all point to Likely or Very Likely. This is great! Assuming that the experiments that produced those results were properly designed and run and that the data from each was interpreted fairly and objectively, there’s strong evidence that this idea might be The Right It. But that ominous black arrow at the bottom does its job and reminds us how rare it is for a new idea to succeed in the market. Are those three positive results enough to balance and counteract the Law of Market Failure?
Developing this particular idea will require a major investment and commitment, and we want a higher degree of confidence before going ahead. So we decide to run three more experiments using Idea 5.
We map the new results on the TRI Meter alongside the first set of results (second set of results shown in bold), and we get the following:
All right! The new set of experiments on Idea 5 confirms the first set of results. This is great. We can’t completely ignore the big black arrow—the market may still surprise us with an unexpected punch—but there’s a good chance that the fifth iteration of our idea is The Right It.
To help you visualize the process, here’s what our sequence of tweaks and experiments looks like if we chart them all (from Idea 1 to Idea 5) on a single TRI Meter:
We ran a total of twelve pretotyping experiments on five different ideas (or versions of a similar idea). That may sound like a lot of tweaking and experimenting, but with pretotyping this would not have taken more than a couple of weeks—less time than most teams would spend writing an OPD-based business plan.
As we wrap up our discussion of the TRI Meter, let me repeat that only arrows that represent actual data from carefully designed and personally conducted experiments are allowed. No opinions and no OPD (market research done by other people, with other methods, at other times—you know the drill). Your arrows must consist of only YODA with skin in the game.
Part III
Plastic Tactics
7
Tactics Toolkit
In Part II, I introduced you to a set of tools to add clarity to the way you think about your idea, accelerate the way you collect data to validate your idea, and add structure and objectivity to the way you analyze and interpret the data you collect. It’s a powerful toolkit, with many tools to choose from, different ways to use them, and countless ways to combine them. But how do you decide which tools to use, how to use them, and when to use them? That’s what we will cover in Part III: Plastic Tactics.
In case you are wondering, the word plastic in the title for this section is not a reference to Tupperware, but to plasticity—the ability to change and adapt one’s plans and actions in response to new and unexpected situations. Having this ability is critical, because when it comes to bringing new ideas into contact with their intended market, plans rarely go smoothly—regardless of how diligent and careful we are in making those plans.
Sticking with our boxing analogy from Part II, the best quote I’ve heard on the subject of planning comes from an unlikely source, boxer Mike Tyson. Asked to comment about one of his opponent’s plans for a fight, Tyson replied, “Everybody has a plan until I punch them in the mouth.” Expect the market to punch you in the mouth a few times and be prepared to change your plans and tactics accordingly.