Saturday, December 3, 2016

What about Beta in Hypothesis testing?

The main purpose of this post is to clarify the concept of Beta risk: how it works and how to interpret its value. It is intended for Black Belts who want a clear and complete understanding of how to deal with hypothesis testing results in different situations.

Alpha risk and its relationship with the p-value are better understood than Beta. Nevertheless, let’s start with a refresher, since Alpha and Beta are both in play whenever we use hypothesis testing methods.
Let’s take an example. In this first example we are going to reject the Null Hypothesis because the p-value is less than the Alpha risk.

Example:

We are considering changing suppliers for a part that we currently purchase from a supplier that charges us a premium for the hardening process. 
The proposed new supplier has provided us with a sample of their product.  They have stated that they can maintain a given characteristic of 5 on their product.
We want to test the samples and determine if their claim is accurate.
Statistical Problem:
H0: μN.S. = 5
Ha: μN.S. ≠ 5
Set Risk Levels and choose test:
  1-Sample t Test (population Standard Deviation unknown, comparing a sample mean to a target).
  α = 0.05       β = 0.10

We have a set of samples as shown below...
We are going to use a 1-sample t-test to test our Null hypothesis (mean = 5) because we can assume the data follow a normal distribution and we do not know the historical standard deviation of this new supplier.

One-Sample T: Values
Test of μ = 5 vs ≠ 5
Variable  N    Mean   StDev  SE Mean       95% CI           T  P-Value
Values    9  4,7889  0,2472   0,0824  (4,5989; 4,9789)  -2,56    0,034

Conclusion:
As the P-Value is less than our Alpha criterion, which was 0,05 (5%), we reject the Null. The P-Value of 0,034 means that, if the supplier's true mean were 5, we would get a sample mean at least this far from 5 in only 3,4% of the cases. This probability is considered low, since our threshold between low and high was 5% (Alpha). So we reject the Null hypothesis and say that there is enough statistical evidence, at a 95% confidence level, that the new supplier is not performing with an average of 5.
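The Minitab figures above can be checked by hand from the summary statistics alone. The following is a minimal sketch in Python, using only the standard library; the function names are ours, not Minitab's, and the p-value is obtained by numerically integrating the Student t density rather than by whatever method Minitab uses internally. Note that the code uses decimal points where this post uses decimal commas.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) under H0: one minus the area between -|t| and +|t|,
    computed with Simpson's rule (steps must be even)."""
    a = abs(t_stat)
    h = 2 * a / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * t_pdf(-a + i * h, df)
    return 1.0 - total * h / 3.0

# Summary statistics taken from the Minitab output above
n, mean, stdev, target = 9, 4.7889, 0.2472, 5.0

se_mean = stdev / math.sqrt(n)       # standard error of the mean
t_stat = (mean - target) / se_mean   # distance from the target in SE units
p_value = two_sided_p_value(t_stat, n - 1)

print(f"SE Mean = {se_mean:.4f}  T = {t_stat:.2f}  P-Value = {p_value:.4f}")
```

The printed values should land close to the Minitab line (SE Mean 0,0824, T -2,56, P-Value 0,034); small differences in the last digit are expected because we start from rounded summary statistics instead of the raw data.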

As we reject the Null, we do not need to worry about Beta, which is the risk of being wrong when we fail to reject the Null Hypothesis.

When we fail to reject the Null instead, the correct statement is: “we do not have enough statistical evidence to reject the Null hypothesis, which was mean = 4,9, so we cannot rule it out”.
In this case, there are two possibilities for what is truly happening. One is that the means are different but, because of the sample size, we are not able to detect a difference of that amount. In other words, we simply carry too much risk (Beta risk) of missing a certain difference.
The other possibility is that they are not different.
Let’s see how Beta works to get a clearer view of what could be happening. We have to look at Beta here because we are failing to reject. One could think that Beta works in a similar way to Alpha. Then we could use Minitab's Power and Sample Size tool with the data we have to work out Power, and from it Beta, which is 1 - Power. The result would be the following:

Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
            Sample
Difference    Size     Power
    0,1111      24  0,559395
Beta = 1 - 0,559 = 0,441
By saying that Beta is 0,441, we are saying that if the population mean were the one from the sample, which differs from the target by 0,1111, we would fail to detect that difference in 44,1% of the cases.
This has nothing to do with the real problem: we simply do not know the population mean, so we cannot say that it equals the sample mean.
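To see where the 0,559 comes from, the power figure can be reproduced outside Minitab. This is a rough sketch, assuming the power of a two-sided 1-sample t test is the probability that a noncentral t variable falls outside the critical values; the helper names (`reject_prob`, `t_critical`, `power_one_sample_t`) are ours, and the integration scheme is a simple Simpson's rule, not Minitab's algorithm.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_pdf(x, df):
    """Density of the chi distribution (square root of a chi-square)."""
    if x <= 0.0:
        return math.sqrt(2.0 / math.pi) if df == 1 else 0.0
    log_pdf = ((1.0 - df / 2.0) * math.log(2.0) + (df - 1.0) * math.log(x)
               - x * x / 2.0 - math.lgamma(df / 2.0))
    return math.exp(log_pdf)

def reject_prob(t_crit, df, ncp, steps=400):
    """P(|T| > t_crit) for a noncentral t variable T, via Simpson's rule.
    Uses T = (Z + ncp) / (X / sqrt(df)), Z standard normal, X chi(df)."""
    upper = math.sqrt(df) + 10.0  # chi density is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        scaled = t_crit * (i * h) / math.sqrt(df)
        tails = (1.0 - phi(scaled - ncp)) + phi(-scaled - ncp)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * chi_pdf(i * h, df) * tails
    return total * h / 3.0

def t_critical(alpha, df):
    """Two-sided critical value, found by bisection with ncp = 0."""
    low, high = 0.0, 100.0
    for _ in range(50):
        mid = (low + high) / 2.0
        if reject_prob(mid, df, 0.0) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

def power_one_sample_t(difference, stdev, n, alpha=0.05):
    """Power to detect `difference` with n samples, given stdev."""
    df = n - 1
    ncp = difference * math.sqrt(n) / stdev  # noncentrality parameter
    return reject_prob(t_critical(alpha, df), df, ncp)

# Inputs copied from the Minitab session above
power = power_one_sample_t(difference=0.1111, stdev=0.2472, n=24)
beta = 1.0 - power
print(f"Power = {power:.4f}  Beta = {beta:.4f}")
```

The point of the sketch is that the difference is an input we type in. Change 0.1111 to any other value and Beta changes with it, which is exactly why plugging in the sample's own difference does not give a "real" Beta.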

The way Beta has to be used in this case is different: since we have been provided with a certain sample, we just decide which risk we consider reasonable. Let’s take the typical one, Beta = 10%. In this case, using Minitab:

Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
Sample
  Size  Power  Difference
    24    0,9    0,170852

Interpretation
The statement would be that with a sample size of 24 and a standard deviation of 0,2472, the difference we can detect with a 10% risk is 0,171.
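The same calculation can be run in the other direction: fix the sample size and the power, and solve for the difference. Below is a minimal sketch of that idea, again assuming the noncentral t model for the power of a two-sided 1-sample t test. The function `detectable_difference` and its helpers are illustrative names of ours, and the bisection search is just one simple way to invert the power curve.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_pdf(x, df):
    """Density of the chi distribution (square root of a chi-square)."""
    if x <= 0.0:
        return math.sqrt(2.0 / math.pi) if df == 1 else 0.0
    log_pdf = ((1.0 - df / 2.0) * math.log(2.0) + (df - 1.0) * math.log(x)
               - x * x / 2.0 - math.lgamma(df / 2.0))
    return math.exp(log_pdf)

def reject_prob(t_crit, df, ncp, steps=400):
    """P(|T| > t_crit) for a noncentral t variable T, via Simpson's rule."""
    upper = math.sqrt(df) + 10.0  # chi density is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        scaled = t_crit * (i * h) / math.sqrt(df)
        tails = (1.0 - phi(scaled - ncp)) + phi(-scaled - ncp)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * chi_pdf(i * h, df) * tails
    return total * h / 3.0

def t_critical(alpha, df):
    """Two-sided critical value, found by bisection with ncp = 0."""
    low, high = 0.0, 100.0
    for _ in range(50):
        mid = (low + high) / 2.0
        if reject_prob(mid, df, 0.0) > alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

def detectable_difference(stdev, n, target_power, alpha=0.05):
    """Smallest difference detected with probability target_power."""
    df = n - 1
    t_crit = t_critical(alpha, df)
    low, high = 0.0, 50.0  # bracket for the noncentrality parameter
    for _ in range(50):
        mid = (low + high) / 2.0
        if reject_prob(t_crit, df, mid) < target_power:
            low = mid
        else:
            high = mid
    ncp = (low + high) / 2.0
    return ncp * stdev / math.sqrt(n)  # convert back to measurement units

# Sample size 24, Beta = 10% (Power = 0.9), as in the Minitab session above
difference = detectable_difference(stdev=0.2472, n=24, target_power=0.9)
print(f"Detectable difference = {difference:.4f}")
```

The result should be close to the 0,170852 reported by Minitab.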

Conclusion
If a difference of 0,171 and/or a risk of 10% is not good enough for us, we have to enter the new values in Minitab and take as many samples as it indicates.

Imagine, using the same example, that a difference of 0,1 is not acceptable, so we want to be able to detect at least that difference with a 10% risk (Beta risk). Therefore:

Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
            Sample  Target
Difference    Size   Power  Actual Power
       0,1      67     0,9      0,903663
We would have needed at least 67 samples to detect a difference of 0,1 with a 10% risk.
Imagine that even a 10% risk is too high for us because of the cost it implies, so only a 1% risk is allowed.

With Beta = 0,01, Power = 0,99:
Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
            Sample  Target
Difference    Size   Power  Actual Power
       0,1     115    0,99      0,990392
We would need a sample size of 115 to detect a difference of 0,1 with a risk of 1%.
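Both sample size calculations (Beta = 10% and Beta = 1%) follow the same recipe: increase n until the power for the chosen difference reaches the target. The sketch below illustrates that recipe under the same noncentral t assumption for a two-sided 1-sample t test. The name `required_sample_size` and the doubling-then-bisection search are our own choices for illustration, not a description of how Minitab searches.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi_pdf(x, df):
    """Density of the chi distribution (square root of a chi-square)."""
    if x <= 0.0:
        return math.sqrt(2.0 / math.pi) if df == 1 else 0.0
    log_pdf = ((1.0 - df / 2.0) * math.log(2.0) + (df - 1.0) * math.log(x)
               - x * x / 2.0 - math.lgamma(df / 2.0))
    return math.exp(log_pdf)

def reject_prob(t_crit, df, ncp, steps=400):
    """P(|T| > t_crit) for a noncentral t variable T, via Simpson's rule."""
    upper = math.sqrt(df) + 10.0  # chi density is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        scaled = t_crit * (i * h) / math.sqrt(df)
        tails = (1.0 - phi(scaled - ncp)) + phi(-scaled - ncp)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * chi_pdf(i * h, df) * tails
    return total * h / 3.0

def power(difference, stdev, n, alpha=0.05):
    """Power of the two-sided 1-sample t test for a given sample size."""
    df = n - 1
    low, high = 0.0, 100.0  # bisection for the critical value (ncp = 0)
    for _ in range(50):
        mid = (low + high) / 2.0
        if reject_prob(mid, df, 0.0) > alpha:
            low = mid
        else:
            high = mid
    return reject_prob((low + high) / 2.0, df, difference * math.sqrt(n) / stdev)

def required_sample_size(difference, stdev, target_power, alpha=0.05):
    """Smallest n whose power reaches target_power (power grows with n)."""
    if power(difference, stdev, 2, alpha) >= target_power:
        return 2
    low, high = 2, 4
    while power(difference, stdev, high, alpha) < target_power:
        low, high = high, high * 2   # double until the target is reached
    while high - low > 1:            # then bisect between low and high
        mid = (low + high) // 2
        if power(difference, stdev, mid, alpha) >= target_power:
            high = mid
        else:
            low = mid
    return high

# Difference 0.1 and standard deviation 0.2472, as in the sessions above
n_10 = required_sample_size(0.1, 0.2472, target_power=0.90)  # Beta = 10%
n_01 = required_sample_size(0.1, 0.2472, target_power=0.99)  # Beta = 1%
power_10 = power(0.1, 0.2472, n_10)
power_01 = power(0.1, 0.2472, n_01)
print(f"Beta 10%: n = {n_10}, actual power = {power_10:.4f}")
print(f"Beta  1%: n = {n_01}, actual power = {power_01:.4f}")
```

The actual power is slightly above the target in both cases because the sample size has to be a whole number, which is the same reason Minitab reports "Target Power" and "Actual Power" in separate columns.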

SUMMARY:

• From the sample we know the estimates of the standard deviation and the mean, so we can work out the probability of getting a sample at least this extreme for a given value of the population parameter, for example the mean (this probability is the p-value).
• Following from the statement above, the only thing we know about the population (thanks to the sample) is the probability distribution estimated from the sample.
• Beta is the risk of not detecting a certain difference when the population changes. As we do not know the value of the population parameter, we cannot treat the difference between the sample value and the target as certain in order to work out a “real” Beta. The sample value can only be used to work out a distribution, which means we have infinitely many possible differences, each with its own probability, and so infinitely many values for Beta as well.
• The way to use Beta is to decide which risk we can accept of not detecting a certain difference if that difference really existed. That gives us a sample size. Or, the other way around, when we are provided with a certain sample, it tells us which difference and which risk we are assuming.
• Beta cannot be used to accept or reject the Null Hypothesis. It has to be used to decide what sample size to take, or whether the sample size we have is enough.
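
For readers who prefer a formula, the points above can be written compactly. The notation is ours and does not appear in the Minitab output: δ is the difference we decide we want to detect, σ the standard deviation, n the sample size, and λ the noncentrality parameter. This assumes the two-sided 1-sample t test used throughout this post.

```latex
\beta(\delta) = P\bigl(\text{fail to reject } H_0 \mid \mu = \mu_0 + \delta\bigr),
\qquad \text{Power} = 1 - \beta(\delta)

\text{Power} = P\bigl(|T| > t_{1-\alpha/2,\,n-1}\bigr),
\qquad T \sim t_{n-1}(\lambda), \quad \lambda = \frac{\delta\,\sqrt{n}}{\sigma}
```

Beta is a function of δ: there is one value of Beta for each difference we choose, which is why a single "real" Beta cannot be read from the sample.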

1 comment:

  1. Alpha, Beta and their relationship when performing hypothesis testing are normally pretty well understood by Six Sigma practitioners. Not so for Beta and sample size. That's why this blog is so important. It is intended for those who feel they need to go deeper into hypothesis testing to better understand how it works and what is in it for them.
