An Update on Data Distribution and Techniques of Data Transformation

Authors

  • Ahmad Najmi, Assistant Professor, Dept. of Pharmacology, AIIMS Bhopal
  • Avik Ray, Masters student, Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, USA

DOI:

https://doi.org/10.21276/zqxbnt82

Keywords:

data distribution, data transformation

Abstract

In biostatistics, a distribution can be defined as the frequencies of the values of a given
variable in a sample. Distributions can be broadly classified as normal or skewed.
The normal distribution is a symmetrical, bell-shaped curve. About 68% of values lie within
±1 standard deviation (S.D.) of the mean, and about 95% lie within ±2 S.D. The mean, median and mode
are equal in a normal distribution. Parametric tests such as the t test and ANOVA are based on the
assumption that the data follow a normal distribution. In a skewed, or asymmetrical, distribution,
cases cluster on one side of the curve. In a right-skewed distribution, the tail of the curve
is on the right side; in a left-skewed distribution, the tail is on the left side. Non-parametric tests can be
used for skewed data. When their assumptions hold, parametric tests are more powerful than non-parametric tests.
An alternative is to transform the numerical variable to another scale on which the values do satisfy
the assumptions needed for the desired parametric or “normal” statistical methods. These
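
The two ideas in the abstract, the share of values within ±1 and ±2 S.D. of the mean and the effect of a transformation on skewed data, can be illustrated with a minimal sketch. It is not taken from the article: the simulated log-normal sample, the random seed, the choice of a log transformation and the helper names `coverage` and `skewness` are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def coverage(k: float) -> float:
    """Share of a normal distribution lying within k S.D. of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)


def skewness(values: list[float]) -> float:
    """Moment-based skewness: the mean of the cubed z-scores."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])


# Hypothetical right-skewed sample (log-normal); the seed is arbitrary.
rng = random.Random(42)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Log transformation: one common choice for positive, right-skewed data.
logged = [math.log(v) for v in raw]

print(f"within 1 S.D.: {coverage(1):.4f}")  # 0.6827
print(f"within 2 S.D.: {coverage(2):.4f}")  # 0.9545
print(f"skewness before transformation: {skewness(raw):.2f}")
print(f"skewness after log transformation: {skewness(logged):.2f}")
```

In this sketch the raw sample has a long right tail and a clearly positive skewness, while the log-transformed values have skewness close to zero, which is the situation in which a parametric test becomes more defensible. The log transformation only applies to strictly positive values; data containing zeros would need a different approach.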

References

Krithikadatta J. Normal distribution. J Conserv Dent. 2014 Jan;17(1):96-7.

Limpert E, Stahel WA. Problems with using the normal distribution--and ways to improve quality and efficiency of data analysis. PLoS One. 2011;6(7):e21403

Peters, W.S. (1987). Normal Distribution. In: Counting for Something. Springer Texts in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-4638-1_8

Deming W. De Moivre's “Miscellanea Analytica”, and the Origin of the Normal Curve. Nature 132, 713 (1933). https://doi.org/10.1038/132713a0

Bennett MR. The origin of Gaussian distributions of synaptic potentials. Prog Neurobiol. 1995 Jul;46(4):331-50.

Sartori, R. The Bell Curve in Psychological Research and Practice: Myth or Reality?. Qual Quant 40, 407–418 (2006)

Delucchi KL, Bostrom A. Methods for analysis of skewed data distributions in psychiatric clinical studies: working with many zero values. Am J Psychiatry. 2004 Jul;161(7):1159-68.

Higgins JP, White IR, Anzures-Cabrera J. Meta-analysis of skewed data: combining results reported on log-transformed or raw scales. Stat Med. 2008 Dec 20;27(29):6072-92

Manikandan S. Data transformation. J Pharmacol Pharmacother. 2010 Jul;1(2):126-7

Feng C, Wang H, Lu N, Chen T, He H, Lu Y, Tu XM. Log-transformation and its implications for data analysis. Shanghai Arch Psychiatry. 2014 Apr;26(2):105-9

Henderson AR. The bootstrap: a technique for data-driven statistics. Using computer-intensive analyses to explore experimental data. Clin Chim Acta. 2005 Sep;359(1-2):1-26

Published

30.01.2023

Section

REVIEW ARTICLES ~ Pharmacology
