An Update on Data Distribution and Techniques of Data Transformation
DOI: https://doi.org/10.21276/zqxbnt82
Keywords: data distribution, data transformation
Abstract
In biostatistics, a distribution can be defined as the distribution of frequencies of the values of a given
variable in a sample. Distributions can be broadly classified into normal and skewed distributions.
The normal distribution is a symmetrical, bell-shaped curve. ±1 standard deviation (S.D.) covers about 68% of values
around the mean and ±2 S.D. covers about 95% of values around the mean. The mean, median and mode
are equal for a normal distribution curve. Parametric tests such as the t test and ANOVA are based on the
assumption that the data follow a normal distribution. In a skewed or asymmetrical distribution, cases
cluster on either the right or the left side of the curve. In a right-skewed distribution, the tail of the curve
is on the right side. In a left-skewed distribution, the tail is on the left side. Non-parametric tests can be
used for skewed data. However, parametric tests are generally more powerful than non-parametric tests when their assumptions are met.
The alternative is to transform the numerical variable into another scale where the values do satisfy
the assumptions needed for the desired parametric or “normal” statistical methods. These
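
The coverage figures and the effect of a transformation described in the abstract can be checked numerically. The following is a minimal sketch using only the Python standard library, not a method taken from the article: the helper names `coverage` and `skewness`, the simulated log-normal sample, the seed and the sample size are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def coverage(k: float) -> float:
    """Share of a normal distribution lying within +/- k S.D. of the mean."""
    std_normal = NormalDist()  # mean 0, S.D. 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


def skewness(values: list) -> float:
    """Moment-based sample skewness; close to 0 for symmetric data."""
    m = fmean(values)
    s = pstdev(values, mu=m)
    return fmean([((v - m) / s) ** 3 for v in values])


# Coverage of +/-1 and +/-2 S.D. under a normal distribution.
cov_1sd = coverage(1.0)
cov_2sd = coverage(2.0)

# Hypothetical right-skewed data: a simulated log-normal sample (assumption).
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Log transformation: one common way to bring right-skewed data
# closer to a symmetric, approximately normal shape.
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)

print(f"within +/-1 S.D.: {cov_1sd:.4f}")
print(f"within +/-2 S.D.: {cov_2sd:.4f}")
print(f"skewness before log transform: {skew_raw:.2f}")
print(f"skewness after log transform:  {skew_logged:.2f}")
```

In this sketch the raw sample has a long right tail, and its skewness falls toward zero after the log transformation. Real data may need a different transformation, and the transformed values should still be checked against the assumptions of the intended parametric test.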
License
Copyright (c) 2024 International Archives of BioMedical and Clinical Research (IABCR)
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors are required to sign and submit the completed “Copyright transfer Form” upon acceptance of publication of the paper. This is determined by a publishing agreement between the author and International Archives of Biomedical and Clinical Research. These rights might include the right to publish, communicate and distribute online. Author(s) retain the copyright of their work. International Archives of Biomedical and Clinical Research supports the need for authors to share, disseminate and maximize the impact of their research.