What about Beta in Hypothesis testing?
The main purpose of this blog is to clarify the concept of Beta risk: how it works and how to interpret its value.
This post is intended for Black Belts who want a clear and complete understanding of how to deal with hypothesis testing results in different situations.
Alpha risk and its relationship with the p-value are better understood than Beta. Nevertheless, let’s start with a refresher, since Beta and Alpha are both at work whenever we use hypothesis testing methods.
Let’s look at an example. In this first example we are going to reject the Null Hypothesis because the p-value is less than the Alpha risk.
Example:
We are considering changing suppliers for a part that we currently purchase from a supplier that charges us a premium for the hardening process.
The proposed new supplier has provided us with a sample of their product. They have stated that they can maintain a given characteristic of 5 on their product.
We want to test the samples and determine if their claim is accurate.
Statistical Problem:
H0: μN.S. = 5
Ha: μN.S. ≠ 5
Set Risk Levels and choose test:
1-Sample t Test (population Standard Deviation unknown, comparing a sample mean to a target).
α = 0,05   β = 0,10
We have a set of samples as shown below...
We are going to use a 1-sample t-test to test our Null hypothesis (mean = 5) because we can assume a normal distribution and we do not know the historical standard deviation of this new supplier.
One-Sample T: Values
Test of μ = 5 vs ≠ 5
Variable  N  Mean    StDev   SE Mean  95% CI            T      P-Value
Values    9  4,7889  0,2472  0,0824   (4,5989; 4,9789)  -2,56  0,034
Conclusion:
As the P-Value is less than our Alpha criterion, which was 0,05 (5%), we reject the Null. The P-Value tells us that, if the supplier really had a mean of 5, a sample mean at least this far from 5 would turn up with a probability of only 0,034. This probability is considered low, as our threshold between low and high was 5% (Alpha). So we reject the Null hypothesis and say that there is enough statistical evidence, at a 95% Confidence Level, that the new supplier is not performing with an average of 5.
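As a cross-check of the Minitab output above, the T and P-Value columns can be recomputed from the summary statistics alone. What follows is a minimal Python sketch, not Minitab's own routine. The raw sample values are not listed in this post, so it starts from N, Mean and StDev as printed. The helper names (`t_pdf`, `two_sided_p_value`) are made up for illustration, and the p-value comes from a simple numerical integration of the t density using only the standard library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=10_000):
    """P(|T| >= |t_stat|), integrating the density with the midpoint rule."""
    h = abs(t_stat) / steps
    central_area = h * sum(t_pdf((i + 0.5) * h, df) for i in range(steps))
    return 1.0 - 2.0 * central_area  # what is left in the two tails


# Summary statistics copied from the Minitab output (decimal points, not commas).
n, mean, st_dev, target = 9, 4.7889, 0.2472, 5.0
alpha = 0.05

se_mean = st_dev / math.sqrt(n)
t_stat = (mean - target) / se_mean
p_value = two_sided_p_value(t_stat, df=n - 1)
reject_null = p_value < alpha

print(f"SE Mean = {se_mean:.4f}, T = {t_stat:.2f}, P-Value = {p_value:.4f}")
print("Reject the Null" if reject_null else "Fail to reject the Null")
```

The printed values should land close to Minitab's T = -2,56 and P-Value = 0,034. Small differences in the last digit are possible because the inputs here are already rounded.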
As we reject the Null, we do not need to worry about Beta, which is the risk of being wrong when we fail to reject the Null Hypothesis.
When we do fail to reject, the correct statement is: “we do not have enough statistical evidence to reject the Null hypothesis, which was mean = 4,9, so we have to accept it”.
In this case, there are two possibilities for what is truly happening. One is that the population mean and the hypothesized value really are different but, due to the sample size, we are not able to detect that amount of difference. We are saying that we simply have too much risk (Beta risk) for a certain difference.
The other possibility is that they are not different.
Let’s see how Beta works to get a clearer view of what could be happening. For this purpose we have to use Beta, as we are failing to reject. One could think that Beta works in a similar way to Alpha. Then we could use Minitab's Power and Sample Size tool to work out the power, and from it Beta (which is 1 - power), using the data we have. The result would be:
Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
            Sample
Difference  Size    Power
0,1111      24      0,559395
Beta = 1 - 0,559 = 0,441
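The power figure above can also be approximated outside Minitab. The sketch below is one way to do it in Python, under stated assumptions: it models the test statistic with the noncentral t distribution, which is the textbook model for the power of a t-test (we are assuming, not asserting, that Minitab works the same way internally), and it evaluates that distribution by plain numerical integration with the standard library. All function names are illustrative.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def nct_cdf(t, df, delta=0.0, steps=2000):
    """P(T <= t) for a noncentral t variable (delta = 0 gives Student's t).

    Writes T = (Z + delta) / S with S = sqrt(chi2_df / df) and integrates
    over S with the midpoint rule.
    """
    upper = 1.0 + 8.0 / math.sqrt(2.0 * df)  # S has almost no mass beyond this
    h = upper / steps
    log_c = math.log(2.0) + 0.5 * df * math.log(0.5 * df) - math.lgamma(0.5 * df)
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        density = math.exp(log_c + (df - 1) * math.log(s) - 0.5 * df * s * s)
        total += density * norm_cdf(t * s - delta)
    return total * h


def t_critical(df, alpha):
    """Two-sided critical value, found by bisection on the central t CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, df) < 1.0 - alpha / 2.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power_one_sample_t(difference, n, st_dev, alpha=0.05):
    """Power of a two-sided 1-sample t-test for a given true difference."""
    df = n - 1
    delta = difference / st_dev * math.sqrt(n)  # noncentrality parameter
    crit = t_critical(df, alpha)
    # Probability that the statistic falls in either rejection region.
    return 1.0 - nct_cdf(crit, df, delta) + nct_cdf(-crit, df, delta)


# Inputs taken from the Minitab session above (decimal points, not commas).
power = power_one_sample_t(difference=0.1111, n=24, st_dev=0.2472)
beta = 1.0 - power
print(f"Power = {power:.3f}, Beta = {beta:.3f}")
```

The result should sit close to Minitab's Power of 0,559 and Beta of 0,441. Note that the difference of 0,1111 is an input we chose, not something the calculation discovers.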
By saying that Beta is 0,441, we are saying that if the population mean were the one from the sample, which differs from the target by 0,1111, we would fail to detect that difference in 44,1% of the cases.
This has nothing to do with the real problem: we simply do not know what the population mean is, so we cannot say that the population mean is the one from the sample.
The way Beta has to be used in this case is as follows: since we have been provided with a certain sample, we just decide what our risk is, the risk we consider reasonable. Let’s take the typical one, which is Beta = 10%. In this case, using Minitab:
Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
Sample
Size    Power  Difference
24      0,9    0,170852
Interpretation
The statement would be that, with a sample size of 24 and a standard deviation of 0,2472, the difference we can detect with a 10% risk is 0,171.
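The same question can be asked in code: fix the sample size and the power, then search for the difference. This is a hedged Python sketch, again built on the noncentral t distribution and standard-library numerical integration rather than on whatever Minitab does internally. The function names and the bisection search are illustrative choices.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def nct_cdf(t, df, delta=0.0, steps=2000):
    """P(T <= t) for a noncentral t variable (delta = 0 gives Student's t)."""
    upper = 1.0 + 8.0 / math.sqrt(2.0 * df)  # integration range for S
    h = upper / steps
    log_c = math.log(2.0) + 0.5 * df * math.log(0.5 * df) - math.lgamma(0.5 * df)
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # S = sqrt(chi2_df / df), midpoint rule
        density = math.exp(log_c + (df - 1) * math.log(s) - 0.5 * df * s * s)
        total += density * norm_cdf(t * s - delta)
    return total * h


def t_critical(df, alpha):
    """Two-sided critical value, found by bisection on the central t CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, df) < 1.0 - alpha / 2.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def detectable_difference(n, st_dev, target_power=0.9, alpha=0.05):
    """Smallest true difference that is detected with target_power."""
    df = n - 1
    crit = t_critical(df, alpha)

    def power(difference):
        delta = difference / st_dev * math.sqrt(n)  # noncentrality parameter
        return 1.0 - nct_cdf(crit, df, delta) + nct_cdf(-crit, df, delta)

    # Power rises from alpha (difference = 0) towards 1, so bisection works.
    lo, hi = 0.0, 100.0 * st_dev
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if power(mid) < target_power:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Beta = 10% means Power = 0.9; n and StDev come from the session above.
difference = detectable_difference(n=24, st_dev=0.2472, target_power=0.9)
print(f"Detectable difference = {difference:.4f}")
```

The answer should be close to the 0,171 reported by Minitab. Here the risk is the input and the difference is the output, which is the opposite direction from plugging in the sample difference.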
Conclusion
If a difference of 0,171 and/or a risk of 10% is not good enough for us, we have to enter the new values in Minitab and take enough samples.
Imagine, using the same example, that a difference of 0,1 is not allowed, so we want to be sure of detecting at least that difference with a 10% risk (Beta risk). Therefore:
Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
            Sample  Target
Difference  Size    Power   Actual Power
0,1         67      0,9     0,903663
We would have had to take at least 67 samples to detect a difference of 0,1 with a 10% risk.
Imagine that even a 10% risk is too high for us due to the cost it implies, so only a 1% risk is allowed.
With Beta = 0,01, Power = 0,99:
Power and Sample Size
1-Sample t Test
Testing mean = null (versus ≠ null)
Calculating power for mean = null + difference
α = 0,05  Assumed standard deviation = 0,2472
            Sample  Target
Difference  Size    Power   Actual Power
0,1         115     0,99    0,990392
We would need a sample size of 115 to detect a difference of 0,1 with a risk of 1%.
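Both sample sizes can be searched for in code as well. The following Python sketch assumes the same noncentral t model for power and uses only the standard library. The doubling-then-bisection search over n and all function names are our own illustrative choices, not a description of how Minitab arrives at its numbers.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def nct_cdf(t, df, delta=0.0, steps=2000):
    """P(T <= t) for a noncentral t variable (delta = 0 gives Student's t)."""
    upper = 1.0 + 8.0 / math.sqrt(2.0 * df)  # integration range for S
    h = upper / steps
    log_c = math.log(2.0) + 0.5 * df * math.log(0.5 * df) - math.lgamma(0.5 * df)
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h  # S = sqrt(chi2_df / df), midpoint rule
        density = math.exp(log_c + (df - 1) * math.log(s) - 0.5 * df * s * s)
        total += density * norm_cdf(t * s - delta)
    return total * h


def t_critical(df, alpha):
    """Two-sided critical value, found by bisection on the central t CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, df) < 1.0 - alpha / 2.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power_one_sample_t(difference, n, st_dev, alpha=0.05):
    """Power of a two-sided 1-sample t-test for a given true difference."""
    df = n - 1
    delta = difference / st_dev * math.sqrt(n)  # noncentrality parameter
    crit = t_critical(df, alpha)
    return 1.0 - nct_cdf(crit, df, delta) + nct_cdf(-crit, df, delta)


def required_sample_size(difference, st_dev, target_power, alpha=0.05):
    """Smallest n whose power reaches target_power (power grows with n)."""
    lo, hi = 1, 2  # n = 1 has no degrees of freedom, so it never qualifies
    while power_one_sample_t(difference, hi, st_dev, alpha) < target_power:
        lo, hi = hi, 2 * hi
    while hi - lo > 1:  # invariant: lo falls short of the target, hi reaches it
        mid = (lo + hi) // 2
        if power_one_sample_t(difference, mid, st_dev, alpha) < target_power:
            lo = mid
        else:
            hi = mid
    return hi


# Difference and StDev come from the sessions above (decimal points, not commas).
n_90 = required_sample_size(difference=0.1, st_dev=0.2472, target_power=0.90)
n_99 = required_sample_size(difference=0.1, st_dev=0.2472, target_power=0.99)
print(f"Beta = 10% -> n = {n_90}; Beta = 1% -> n = {n_99}")
```

If the model matches, the two values should agree with the 67 and 115 reported by Minitab. The search takes a second or two because every power evaluation is a numerical integral.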
SUMMARY:
• From the sample we know the estimates of the standard deviation and the mean, so we can work out the probability of getting a sample result at least as extreme as ours for a given value of the population parameter, for example the mean (this probability is the p-value).
• Using the statement from above, the only thing we know about the population (thanks to a sample) is the probability distribution of the sample.
• Beta is the risk of not detecting a certain difference when the population changes. As we do not know the value of the population parameter, we cannot treat the difference between the sample value and the target as certain and use it to work out a “real” Beta. The sample value can only be used to work out a distribution, which means we have infinitely many possible differences, each with its own probability, and therefore infinitely many values of Beta as well.
• The way to use Beta is to decide what risk we are willing to assume of not detecting a certain difference if that difference really exists. That gives us a sample size. Or, the other way around, when we are provided with a certain sample, it tells us which difference and which risk we are assuming.
• Beta cannot be used to accept or reject the Null Hypothesis. It has to be used to decide what sample size to take, or whether the sample size we have is large enough.
Alpha, Beta and their relationship when performing hypothesis testing are normally pretty well understood by Six Sigma practitioners. Not so for Beta and sample size. That's why this blog is so important. It is intended for those who feel they need to go deeper into hypothesis testing to better understand how it works and what is in it for them.