
Verifying Probabilities

copterchris

Hi,

I wonder if any mathematics whizzes can help me with this one?

I've always wondered how you would go about giving confidence measures to probabilities? Perhaps I am using incorrect terminology here, so let me illustrate with an example.

If someone said to me that there is a 1 in 100 chance of a meteorite destroying the Earth on any given day, how many days would have to pass without a meteorite destroying the Earth before I could say with (say) 95% confidence that the 1 in 100 estimate was incorrect?

It sounds like there should be an "easy" calculation to give the answer but I'm stumped as to what it would be!

Any help/thoughts appreciated.

Chris
 
I think the answer you are looking for is a formula known as a binomial distribution. Not sure, and I'm leaving now so I can't check. Googling for the above might help.
 
My understanding is that binomial distributions are to do with the probability of 'n' successes in 'N' trials. I can see that they may play some part in the answer to my question but I'm not seeing that it's the whole solution.

Perhaps I'm missing something?

Chris
 
As I understand it (it's been a few years now), confidence is the probability that you have observed an effect not due to random chance.

Let's say you suspect a coin has heads on both sides instead of a head and a tail, but the only test you can do is to have it flipped and get the result.

If it's flipped once, and comes up heads, your confidence is 50%. If it's flipped twice and comes up heads both times, your confidence increases to 75%: the chance of heads on both of two flips is 25%. Three times, and your confidence goes up to 87.5%.

Now let's say somebody tells you the coin is weighted so tails occur only once out of 100 flips (on average). This is a similar problem with a different probability. You still think the coin is two-headed (never comes up tails). But your confidence that this second guy is wrong is only 1% after a single flip. After two flips it's slightly less than 2% (1-(.99)(.99)).

How many flips have to come up heads before you decide the guy is wrong with a confidence level of 95%? That means there's a 5% chance the sequence of heads is just bad luck (produced by chance). You want to find n where .99^n = .05, or n log .99 = log .05, or n = log .05 / log .99.

This works out to 299 flips, rounding up.
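
A quick Python check of that arithmetic (just solving .99^n = .05 for n and rounding up):

import math

# smallest n with 0.99**n <= 0.05, i.e. n >= log(0.05) / log(0.99)
n = math.ceil(math.log(0.05) / math.log(0.99))
print(n)  # 299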
 
Brushing up on my maths...

Say you have an event with a given probability, which you do not know.

After repeated tests, you can count how many times it happened (not a good plan if you're destroying the planet, but bear with me).

This gives you an average probability P (hopefully 0 if you are tossing meteorites around), which is a first estimate. Of course, chance plays a part, and the average probability may not be the exact value (roll one 6-sided die: the average should be 3.5, but I'll be surprised if you get it; roll two: the average is 7, but the odds are 5/6 that you don't get it either...).

So you evaluate the standard deviation S (for sigma), a measure of how spread out the outcomes are (flipping a (fair) coin to get 0 or 1 gives a small S; rolling one die with 1000 sides gives a much larger one). Over many iterations (typically 30+ is enough, but this can vary), the odds of an outcome being within +/- S of P (from P-S to P+S) are about 68%, and about 95% within +/- 2S.

That's the "experimental" way to determine a probability, with a confidence interval. Note that the range of the interval depends on the number of iterations (the relative variance for one coin flip is much more than for 100: you have a 95% chance of getting between 40 and 60 heads with 100 flips, but only a 50% chance of getting 1 head with 2 flips).

Now someone has come up with a theory that says that the probability should have a given value x (1/100, say). How does one test this?

One way is the experimental method: if P and x are close (P-x is small compared to S), x looks like a reasonable estimate. If not, you have a problem.

Of course, the above is very rough, and can be done a lot better.

That's where you start using null hypotheses and other fun stuff.

Zombified used the null hypothesis method above, which I'll restate as follows:

Null hypothesis: the daily probability is 1/100
Alternative: it is not; you reject the null if the observed data would have had less than a 5% chance of occurring under it

And you test it as described.

What are the differences?

The "experimental" method works well with events that happen reasonably often (if nothing happens, your guess is 0 with complete uncertainty as to details). It gives you a fairly strong estimate of what the probabilities could be.

The "null hypothesis" method tells you, with a predetermined chance, whether a specific probability is a good estimate or not. It bogs down if you have many probabilities you want to test, and gives you different information about the problem.

Experimentally, the chance of Earth being destroyed by a meteorite seems to be 0 with a standard deviation of 0 (all the tests indicate Earth is still here), so it's not too promising for this question. By the null hypothesis method, knowing the number of tests made, you could get an upper bound on the risk, with a prespecified confidence level. ("I'm 95% sure that the chance that Earth will end today is at most one in a billion" - but it could also be one in 2 billion, or one in 957 million, or whatever).
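
A rough Python sketch of that upper bound, assuming at most one event per day and reusing the coin logic above (if the daily probability were p, the chance of n event-free days is (1-p)^n, so the largest p you can't rule out at 95% confidence satisfies (1-p)^n = .05; the function name is just for illustration):

def upper_bound_95(n_event_free_days):
    # largest daily probability consistent with n event-free days at 95% confidence
    return 1 - 0.05 ** (1 / n_event_free_days)

print(upper_bound_95(299))    # about 0.00997, consistent with the 299-flip answer above
print(upper_bound_95(10000))  # about 0.0003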
 
copterchris said:

I've always wondered how you would go about giving confidence measures to probabilities?


Here are my attempts:

We think of some probability distribution f that describes the random variable. This probability distribution can be justified by theory or by past experiments, for example. We know that proportions are approximately bell-shaped, because they are averages of 1's and 0's, and we know averages to be approximately bell-shaped (the central limit theorem).

From the distribution we calculate an expected value and a standard error. The area under any probability distribution is 100%. If we want our confidence interval procedure to include the true proportion p 95% of the time, we observe data (the p_hat) and construct the interval:

p_hat +- z*SE(p_hat) or [p_hat - z*SE(p_hat) , p_hat + z*SE(p_hat)]

where z is chosen from f based on the confidence level we want. For example, for 95% confidence it is chosen such that the area to the right of z under f equals (1-.95)/2 = .025.

Typically a normal approximation to the binomial is sensible (especially for large sample sizes). Doing this, for a 95% confidence interval, we get:

p_hat +- 1.96*sqrt[(p_hat(1-p_hat))/n]

As a toy example, pretend we observe 5 serious meteorites out of 1000 days. Somehow we know the proportion of serious meteorites to follow a normal distribution. Our 95% confidence interval for the true proportion of serious meteorites is:

[.0006, .0094] or [.06, .94]%.

It, of course, is a little harder because the distribution of serious meteorite occurrences is probably not normal.
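
As a minimal Python sketch of the toy interval above (normal approximation, nothing more):

import math

# 5 events in 1000 days: p_hat +/- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
x, n = 5, 1000
p_hat = x / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - 1.96 * se, p_hat + 1.96 * se)  # roughly 0.00063 and 0.00937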


If someone said to me that there is a 1 in 100 chance of a meteorite destroying the Earth on any given day, how many days would have to pass without a meteorite destroying the Earth before I could say with (say) 95% confidence that the 1 in 100 estimate was incorrect?


Our hypotheses are:

Ho: p = 1/100
H1: p not equal to 1/100

Our test statistic is:

z = (p_hat - .01)/sqrt[(.01*.99)/n]

You'd reject Ho when z is calculated to be larger than 1.96 or smaller than -1.96. The n that allows you to reject Ho is the number of days (since each day counts as one trial).
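
Here is a minimal Python sketch of that rejection rule (the helper name is just for illustration; it plugs X observed events in n days into the statistic above and checks |z| > 1.96):

import math

# two-sided z test of Ho: p = 0.01, given x events observed in n days
def reject_ho(x, n, p0=0.01, z_crit=1.96):
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z, abs(z) > z_crit

print(reject_ho(0, 100))  # z is about -1.0, not rejected yet
print(reject_ho(0, 400))  # z is about -2.0, rejected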

Trying out some X (observed serious meteorites) and solving for the n such that we reject Ho, we get:

X greater than 5: n = 53
X greater than 10: n = 178
X greater than 20: n = 552
X less than 9 or greater than 100: n = 5448
 
