**What is Statistical Significance?**

Statistical Significance is an important term in statistical inference, specifically in **hypothesis testing**. A statistically significant outcome is one that is unlikely to have occurred through randomness or coincidence alone and is therefore attributed **to some cause**. In hypothesis testing, the stated hypothesis is tested for Statistical Significance using the available sample data. If the result is statistically significant, the researcher can make a generalized statement about the population based on this inference. Statistical Significance does not speak to the importance of a result; it speaks to the **truthfulness** of the result, that is, whether the result is likely to be real rather than an artifact of chance. A statistically significant result can be highly significant or weakly significant, but once it is significant, it can be concluded that the result is unlikely to be due to chance.

Statistical Significance tests can be parametric or non-parametric. Parametric tests rely on certain assumptions about the population, and their test statistic follows a known statistical distribution that is used to assess Statistical Significance, whereas non-parametric tests are distribution-free.

**History of Statistical Significance**


**Sir R. A. Fisher**, in his 1925 book ‘Statistical Methods for Research Workers’, worked on hypothesis tests and suggested a cut-off of one in twenty (5%) as a convenient threshold for rejecting the null hypothesis, but he did not give it a name at that time. Later, in 1933, **Egon Pearson** and **Jerzy Neyman** termed the cut-off suggested by Fisher the **significance level** and denoted it by α. They also proposed fixing the alpha level before starting the research. R. A. Fisher later suggested that the alpha level can be set by the researcher based on the situation and need not be fixed at 5%.

**How to determine Statistical Significance**

Statistical Significance is determined in one of two ways: the **classical method**, which uses a critical value, or the **p-value method**. The critical value method is table based, whereas the p-value method is calculation based. The table here refers to the statistical table for the underlying distribution, namely the t distribution, normal distribution, F distribution or Chi-square distribution.


**General Procedure for determining Statistical Significance of hypothesis test**

- State the null and alternative hypothesis.
- Determine the level of significance (α).
- Obtain appropriate test statistic.
- Make a decision.
  - Compare the obtained test statistic with the **critical value** using the rejection rule, or
  - Obtain the **p value** and compare it with the level of significance.
- Write conclusion based on the decision.
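
The procedure above can be sketched end to end for a two tailed z test. This is only an illustration: the summary statistics are hypothetical (they do not come from any real study), and the population standard deviation is assumed to be known so that a z test applies. It uses Python's standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu = 50, H1: mu != 50 (two tailed).
# All numbers below are hypothetical, chosen only for illustration.
hypothesized_mean = 50.0
sample_mean = 52.3
population_sd = 8.0   # assumed known, so a z test is appropriate
n = 64

# Step 2: level of significance.
alpha = 0.05

# Step 3: test statistic.
z = (sample_mean - hypothesized_mean) / (population_sd / sqrt(n))

# Step 4a: decision by the critical value approach.
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject_by_critical_value = abs(z) > z_crit

# Step 4b: decision by the p value approach.
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_p_value = p_value < alpha

# Step 5: conclusion. Both approaches lead to the same decision.
decision = "reject H0" if reject_by_p_value else "fail to reject H0"
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p = {p_value:.4f}: {decision}")
```

The two decision rules are equivalent for the same α, so in practice only one of them needs to be carried out.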

**Making decision using Critical value Approach**

To determine Statistical Significance using the critical value approach, we first need to determine the critical value for the given hypothesis test. The general rule for rejection of the null hypothesis is as follows.

**Decision Rule: Reject the null
hypothesis if test statistic falls in Rejection Region**.

The rejection region **depends on** whether the test is a **one tailed or two tailed** test. For a one tailed test, depending on the direction of the research hypothesis, the rejection region can be on either the right side or the left side. If the research hypothesis (the alternative hypothesis) is right tailed, the rejection region lies in the right tail of the curve. If it is left tailed, the rejection region lies in the left tail of the curve. For a two tailed test, the rejection region lies in both the right and left tails, and α is divided equally between them.

For **right** tailed test, standard rule for rejection is:

Reject the null hypothesis if test
statistic is greater than critical value.

Rejection regions for right tailed
tests for various distributions are listed below.

For z tests: Reject H_{0} if test statistic (Z) > Critical value (Z_{α})

For t tests: Reject H_{0} if test statistic (T) > Critical value (t_{α, df})

For χ^{2} tests: Reject H_{0} if test statistic (χ^{2}) > Critical value (χ^{2}_{α, df})

For F tests: Reject H_{0} if test statistic (F) > Critical value (F_{(α, n1-1, n2-1)})

For **left** tailed test, standard rule for rejection is:

Reject the null hypothesis if test
statistic is less than critical value.

Rejection regions for left tailed tests for various distributions are listed below.

For z tests: Reject H_{0} if test statistic (Z) < Critical value (-Z_{α})

For t tests: Reject H_{0} if test statistic (T) < Critical value (-t_{α, df})

For χ^{2} tests: Reject H_{0} if test statistic (χ^{2}) < Critical value (χ^{2}_{1-α, df})

For F tests: Reject H_{0} if test statistic (F) < Critical value (F_{(1-α, n1-1, n2-1)})

For **two** tailed test, standard rule for rejection is:

Reject the null hypothesis if test
statistic is greater than right tailed critical value or less than left tailed
critical value.

Rejection regions for two tailed
tests for various distributions are listed below.

For z tests: Reject H_{0} if test statistic (Z) > Critical value (Z_{α/2}) OR if test statistic (Z) < Critical value (-Z_{α/2})

For t tests: Reject H_{0} if test statistic (T) > Critical value (t_{α/2, df}) OR if test statistic (T) < Critical value (-t_{α/2, df})

For χ^{2} tests: Reject H_{0} if test statistic (χ^{2}) > Critical value (χ^{2}_{α/2, df}) OR if test statistic (χ^{2}) < Critical value (χ^{2}_{1-α/2, df})

For F tests: Reject H_{0} if test statistic (F) > Critical value (F_{(α/2, n1-1, n2-1)}) OR if test statistic (F) < Critical value (F_{(1-α/2, n1-1, n2-1)})
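
As a rough sketch of how the rejection rules for z tests might be coded, the hypothetical helper below (`reject_h0_z` is an illustrative name, not part of any library) looks up the critical value from the standard normal distribution instead of a printed table. The same pattern would carry over to t, χ^{2} and F tests by swapping in the quantile function of the relevant distribution from a statistics library.

```python
from statistics import NormalDist


def reject_h0_z(z_stat: float, alpha: float, tail: str) -> bool:
    """Apply the critical value rejection rule for a z test.

    tail is "right", "left" or "two" (labels chosen for this sketch).
    """
    std_normal = NormalDist()
    if tail == "right":
        # Reject if Z > Z_alpha (upper-tail critical value).
        return z_stat > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # Reject if Z < -Z_alpha; inv_cdf(alpha) equals -Z_alpha.
        return z_stat < std_normal.inv_cdf(alpha)
    if tail == "two":
        # Alpha is split equally between the two tails.
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        return z_stat > z_crit or z_stat < -z_crit
    raise ValueError("tail must be 'right', 'left' or 'two'")


# The same statistic can be significant one tailed but not two tailed.
print(reject_h0_z(1.8, 0.05, "right"))
print(reject_h0_z(1.8, 0.05, "two"))
```

At α = 0.05 the right tailed critical value is about 1.645 and the two tailed one is about 1.96, which is why a test statistic of 1.8 is rejected in the first call but not in the second.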

**Making decision using P value approach**

To determine Statistical Significance using the P value approach, we need to determine the P value for the given test statistic. This **p value** is a probability, so it ranges between 0 and 1. It is the probability, calculated assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed; the smaller the p value, the stronger the evidence the sample data provide against the null hypothesis. For example, in the case of a single population mean, the p value gives the probability that a sample would show a difference between the sample mean and the population mean at least as large as the one observed, when in reality the population mean is equal to the hypothesized value.

General rule for rejection of null
hypothesis is as follows.

**Decision Rule: Reject the null
hypothesis if P value is less than level of significance (α)**.

This rejection rule is **independent**
of whether test is one tailed or two tailed test. However, calculation of P
value differs for two tailed, right tailed and left tailed tests.

For example, in case of z tests,

If the alternative hypothesis is right tailed, the p value is calculated as:

P value = P (Z > Z observed)

If the alternative hypothesis is left tailed, the p value is calculated as:

P value = P (Z < Z observed)

If the alternative hypothesis is two tailed, the p value is calculated as:

P value = 2 * P (Z > |Z observed|)
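
A minimal sketch of these three calculations for a z test, using Python's standard-library normal distribution, might look as follows. The function name `z_test_p_value` and the tail labels are illustrative choices, not an established API; the two tailed case takes the absolute value of the observed statistic so that negative values are handled correctly.

```python
from statistics import NormalDist


def z_test_p_value(z_observed: float, tail: str) -> float:
    """Return the p value of a z test for the given alternative hypothesis."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "right":
        return 1 - cdf(z_observed)             # P(Z > z observed)
    if tail == "left":
        return cdf(z_observed)                 # P(Z < z observed)
    if tail == "two":
        return 2 * (1 - cdf(abs(z_observed)))  # 2 * P(Z > |z observed|)
    raise ValueError("tail must be 'right', 'left' or 'two'")


alpha = 0.05
p = z_test_p_value(2.3, "two")
print(f"p = {p:.4f}, reject H0: {p < alpha}")
```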

Pictorial representation of P value for all these z tests is shown below:

**What if null hypothesis is rejected?**

When the null hypothesis is rejected, using either the critical value approach or the P value approach, we have sufficient evidence against the null hypothesis at the chosen level of significance. Only then can we conclude that the obtained result is **statistically significant**. When we fail to reject the null hypothesis, the obtained result is **statistically not significant**.

If the obtained result is statistically significant, we can conclude that there is an **effect**. If the obtained result is statistically insignificant, we can conclude only that the data do not provide sufficient evidence of an effect; this is commonly reported as **no effect**, although failing to reject the null hypothesis does not prove that it is true.

**Limitations of Statistical Significance**

There are certain limitations of Statistical Significance. A statistically significant result may not be practically or clinically significant. Thus, experimental significance may be divided into two parts: Statistical Significance and practical significance.

It should be noted that only a result which is both statistically and practically significant is **important** and reliable.

A large sample size can be one of the causes of a statistically significant result. The reason is that with very large n, even a minor difference is detected, and as a result the hypothesis test reveals a significant result. Hence every statistically significant result should be published along with its effect size. **Effect size** is a measure of the strength of a relationship or of the difference between two means, and it measures practical significance.
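
The effect of sample size can be illustrated with a small sketch. It assumes two groups of equal size with a known common standard deviation (so a two-sample z test applies) and uses Cohen's d, one common effect size measure for a difference between two means; the article does not name a specific measure. All numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_p_value(mean_diff: float, sd: float, n_per_group: int) -> float:
    """Two tailed p value for a difference in means, equal group sizes, known sd."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def cohens_d(mean_diff: float, sd: float) -> float:
    """Effect size: the mean difference expressed in standard deviation units."""
    return mean_diff / sd


# Hypothetical values: a very small difference relative to the spread.
mean_diff, sd = 0.5, 10.0
d = cohens_d(mean_diff, sd)

# The effect size stays fixed while the p value shrinks as n grows.
for n in (100, 2000, 20000):
    p = two_sample_z_p_value(mean_diff, sd, n)
    print(f"n per group = {n:>6}: d = {d:.2f}, p = {p:.4g}, significant = {p < 0.05}")
```

The effect size does not depend on n, whereas the p value does, which is why the two are best reported together.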

A statistically significant result sometimes cannot be reproduced on other populations. Such results are referred to as false positives.

**Statistical Significance Calculator**

There are various online calculators available which give the same results as statistical software. These calculators require the user to input sample information such as the sample mean or means, sample standard deviation or sample proportion, sample size, etc. Then, for a given level of significance, we can compare the obtained significance values and draw the conclusion accordingly. One of the Statistical Significance calculators is given **here**.

Statistical Significance is widely used in Psychology, Social science, clinical trials, biology and many other fields. In **Psychology**, Statistical Significance is interpreted in the same way. Using significance tests, we can evaluate and compare various psychological treatments. The relationship between two or more factors can also be studied using Statistical Significance tests. In Psychology, the 5% level of significance is used most often. If the p value is larger than 0.05, the result is statistically not significant, and if the p value is less than 0.05, the result is statistically significant.

Thus, we can see that Statistical Significance testing is one of the most important techniques in statistical data analysis.

**Example of Statistical Significance**

Let us suppose that we want to see whether a certain weight loss method is effective in reducing weight by 15 kg on average, under given health conditions, over a 3-month period. To determine the effectiveness of this weight loss treatment, a Statistical Significance test can be used. To do so, we can plan and conduct an experiment and collect sample data. Using the Statistical Significance testing procedure, we can conclude whether the treatment given by that method really reduces weight or not. Thus, we can test the hypothesis and see whether the treatment produces the stated result to a significant degree.
