This is a summary of Jason Cohen's Business of Software 2012 presentation.
The way you’re approaching metrics in your business is wrong. How you’re using data is wrong, because the tools you’re using are wrong.
When asked to guess how many jelly beans are in a jar, individual answers are off by 67%. However, if you average all the answers together, the average is only off by 3%. People suck at guessing, but averaging guesses across people is a nearly perfect predictor of truth…sometimes.
The wisdom of the crowd only works for certain things. When a crowd was asked to vote for the funniest joke, the winner wasn't actually the funniest. Crowds are wise when there's an objectively correct answer: they are useful for objective questions, but destructive for creative ones. Crowds aren't merely neutral on creative questions, they actively make the result worse.
Take A/B tests as an example. A/B tests are usually not done right. Picking B over A because it beats A by a small margin can be very destructive. Sparefoot, an Austin startup, ran an A/A test using Google Website Optimizer, and the tool declared one A the winner over an identical copy of itself!
When the event you are testing for is rare, the results are overwhelmingly likely to be wrong.
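The A/A pitfall is easy to reproduce in simulation. The sketch below (my own illustration, not code from the talk) runs many A/A tests where both arms have the same true conversion rate, then checks how often a standard 95%-confidence test still crowns a "winner" — by construction, every winner it finds is a false positive.

```python
import random

random.seed(0)

def aa_test(visitors=2_000, rate=0.05):
    """Run one A/A test: both arms have the SAME true conversion rate."""
    a = sum(random.random() < rate for _ in range(visitors))
    b = sum(random.random() < rate for _ in range(visitors))
    # Chi-square statistic for two proportions with equal traffic:
    # (a - b)^2 / (a + b) > 3.84 corresponds to ~95% confidence.
    return (a - b) ** 2 > 3.84 * (a + b)

trials = 500
false_positives = sum(aa_test() for _ in range(trials))
# Roughly 5% of A/A tests will declare a "winner" that doesn't exist.
print(f"{false_positives / trials:.1%} of A/A tests declared a winner")
```

Run enough tests (or enough variants at once) and a spurious winner is nearly guaranteed.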
Jason has a great (and super cute!) article on his blog on easy significance testing (and hamsters).
The way you determine whether an A/B test shows a statistically significant difference is:
- Define N as the total number of conversions across both variants: N = A + B
- Define D as “half the difference between the ‘winner’ and the ‘loser’.” D = (A – B) / 2
- The test result is statistically significant if D^2 is bigger than N.
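The rule above can be written as a one-liner. The example conversion counts are my own illustration:

```python
def significant(a, b):
    """Quick significance rule: with A and B the conversion counts,
    N = A + B and D = (A - B) / 2; the difference is statistically
    significant (roughly 95% confidence) when D^2 > N."""
    n = a + b
    d = (a - b) / 2
    return d * d > n

print(significant(150, 130))  # D = 10, D^2 = 100, N = 280 -> False
print(significant(170, 120))  # D = 25, D^2 = 625, N = 290 -> True
```

Note that D² > N is equivalent to the chi-square two-proportion test with a threshold of 4, close to the standard 3.84 cutoff for 95% confidence.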
Seek large outcomes from more traffic (as in 100,000s of data points, not 10,000s), especially if you are a small company. Go for the big stuff and shy away from the small stuff. A great example of this is Google's famous 41 shades of blue test, run to determine which shade of blue received the most clicks. Running this test with 2 shades of blue and picking one winner at a 95% confidence level leads to a 5% chance of a false positive. Running it with 41 shades of blue leads to an 88% chance of a false positive.
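The jump from 5% to 88% follows from compounding the per-comparison error: with k comparisons each carrying a 5% false-positive chance, the probability that at least one fires is 1 − 0.95^k. A short check (k = 41 reproduces the talk's 88% figure):

```python
def family_false_positive(k, alpha=0.05):
    """Chance of at least one false positive across k independent
    comparisons, each judged at significance level alpha."""
    return 1 - (1 - alpha) ** k

print(f"{family_false_positive(1):.0%}")   # one comparison: 5%
print(f"{family_false_positive(41):.0%}")  # 41 shades: ~88%
```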
Test theories, not headlines. Don't spitball headlines. First form a theory about why a change would be better, then test it. If the theory turns out to be invalid, ask what other assumptions could be wrong. Invalidating a theory gives you an opportunity to think deeper and come up with another theory.
Which metrics actually matter?
Which variables should I care about? The ones that have the biggest impact on growth, revenue and cash.
Let’s take a hypothetical affiliate program for a SaaS product as an example, and figure out what’s important. Affiliate program parameters:
Using a simple model in a spreadsheet, it looks like we will break even in about four to five months. Now let's add 15%/month growth. That 15% growth causes 50% more costs, and that still doesn't count cancellation rates, or affiliate-referred customers being lower quality (i.e. cancelling more often). The end result: you're dead (with a 10%/month cancellation rate). If we then increase the price by $10/month (50% more MRR), we're back to breaking even at 4 – 5 months.
Affiliate program optimization priorities: