In academic research, statistical tests are widely used to make generalizations about a population from a chosen sample. Typically, a test relies on a probability distribution to reach a conclusion about the study hypothesis. Statistical tests are classified into parametric and non-parametric tests, which are further divided into specific tests such as the t-test, ANOVA, Friedman, Kruskal-Wallis, and Mann-Whitney tests.
Parametric tests make assumptions about the population distribution, whereas non-parametric tests do not. However, choosing a suitable test from the pool of statistical approaches is quite tricky and demands thorough knowledge of each test.
So, when should you choose a parametric or a non-parametric test?
1. A parametric test should be selected if the data are sampled from a population that follows a Gaussian (normal) distribution. To judge whether the sample has come from a Gaussian distribution, consider the following factors.
○ Consider the source of the scatter. If the scatter comes from the sum of numerous independent sources, a Gaussian distribution can be expected.
○ Look at the distribution of the collected data points. If the distribution is bell-shaped, a Gaussian distribution can be expected. However, if there are only a few points and visual inspection is not informative, a statistical test such as the Kolmogorov-Smirnov test can be used to check whether the distribution differs from Gaussian.
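The Kolmogorov-Smirnov check mentioned above can be sketched in a few lines. This is a minimal, hand-rolled illustration of the test statistic only (in practice a library routine such as scipy.stats.kstest would be used, and standardizing by the sample mean and SD technically calls for the Lilliefors variant of the critical values); the data here are simulated for demonstration.

```python
import math
import random

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_statistic(sample):
    """One-sample Kolmogorov-Smirnov statistic D: the largest gap
    between the empirical CDF and the standard normal CDF, after
    standardizing the sample by its own mean and SD."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    z = sorted((x - mean) / sd for x in sample)
    d = 0.0
    for i, zi in enumerate(z):
        cdf = normal_cdf(zi)
        # Empirical CDF jumps at each point; check the gap on both sides.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    return d

random.seed(0)
gaussian = [random.gauss(0, 1) for _ in range(200)]
skewed = [random.expovariate(1.0) for _ in range(200)]
print(ks_statistic(gaussian))  # small D: consistent with Gaussian
print(ks_statistic(skewed))    # larger D: departs from Gaussian
```

A large D relative to the critical value for the sample size is evidence that the data did not come from a Gaussian distribution.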
2. A non-parametric test should be chosen under the following circumstances.
○ The outcome is a score or a rank, and, more importantly, the population is not Gaussian. Examples: (a) the visual analog scale for pain, measured on a scale of 0-10, where 0 means no pain and 10 means unbearable pain; (b) class ranking; (c) critics' ratings.
○ Some values are off the scale, that is, too low or too high to measure. Here, even if the population is Gaussian, such data cannot be analyzed with a parametric test, so it is wise to choose a non-parametric test. Assign an arbitrary low value to the off-scale low measurements and an arbitrary high value to the off-scale high ones. Because a non-parametric test considers only the relative ranks of the values, it does not matter that the exact values are unknown.
○ If the data are not sampled from a Gaussian distribution, it may be possible to transform the values to make the distribution Gaussian. For instance, taking the reciprocal or the logarithm of the values often works, and there can be a sound scientific reason behind such a transformation.
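The transformation idea above can be demonstrated with simulated right-skewed data. This is a sketch with made-up log-normal data: on the raw scale the sample skewness is strongly positive, while after a log transform it is close to zero, which is what a Gaussian sample would show.

```python
import math
import random

def skewness(xs):
    """Sample skewness: roughly 0 for a symmetric (e.g. Gaussian) sample,
    strongly positive for a right-skewed one."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sum(((x - mean) / sd) ** 3 for x in xs) / n

random.seed(1)
# Log-normal data: Gaussian on the log scale, heavily right-skewed raw.
raw = [math.exp(random.gauss(0, 1)) for _ in range(500)]
logged = [math.log(x) for x in raw]

print(skewness(raw))     # strongly positive (right-skewed)
print(skewness(logged))  # near zero (roughly symmetric)
```

After such a transformation, the normality check from point 1 can be repeated on the transformed values before applying a parametric test.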
How does the choice between parametric and non-parametric tests make a difference?
The impact of choosing between a parametric and a non-parametric test largely depends on the sample size.
1. If a parametric test is applied to data from a non-Gaussian population with large samples, the central limit theorem ensures that the test performs well: as long as the samples are large, the parametric test can tolerate deviations from the Gaussian distribution. The drawback is that "large" cannot be pinned down precisely, because the required sample size depends on the nature of the specific non-Gaussian distribution.
2. If a non-parametric test is applied to data from a Gaussian population with large samples, the test will still perform well. The P-value will be slightly larger than that of the parametric test, but the discrepancy is small. Put simply, the parametric test is only slightly more powerful than the non-parametric test for large samples.
3. If a parametric test is applied to data from a non-Gaussian population with small samples, the central limit theorem cannot be relied on, and the P-value may be inaccurate.
4. If a non-parametric test is applied to data from a Gaussian population with small samples, the P-values will be too high. That is, non-parametric tests lack statistical power with small samples.
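Point 2 above can be illustrated by running both kinds of test on the same large Gaussian samples. This is a hand-rolled sketch with simulated data, not a substitute for library routines such as scipy.stats.ttest_ind and scipy.stats.mannwhitneyu: the parametric test uses a normal tail (a good approximation of the t-test at this sample size), and the Mann-Whitney U test uses its large-sample normal approximation.

```python
import math
import random

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def two_sample_z_p(a, b):
    """Parametric two-sample test (Welch-style); with large n the
    t statistic is close to a z statistic, so a normal tail suffices."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return 2 * normal_sf(abs(z))

def mann_whitney_p(a, b):
    """Non-parametric Mann-Whitney U test, normal approximation
    (adequate for large samples without many ties)."""
    na, nb = len(a), len(b)
    # U = number of (a, b) pairs in which the a-value is larger.
    u = sum(1 for x in a for y in b if x > y)
    mu = na * nb / 2
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12)
    return 2 * normal_sf(abs((u - mu) / sigma))

random.seed(2)
a = [random.gauss(1.0, 1) for _ in range(100)]  # true shift of 1 SD
b = [random.gauss(0.0, 1) for _ in range(100)]
print(two_sample_z_p(a, b))   # very small P-value
print(mann_whitney_p(a, b))   # also very small, typically slightly larger
```

With large Gaussian samples both tests detect the shift, and their P-values are close, matching point 2; with small samples the non-parametric approximation degrades and its power drops, matching point 4.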