R: A Powerful Statistical Tool for Researchers

Dr Shruti Traymbak

Associate Professor

R and Python have become two of the most important programming languages among researchers, who use their commands, functions, libraries and packages for data analysis. With these languages, statistical analysis and interpretation have become easier and more accessible. R supports both descriptive and inferential statistical analysis. Building predictive models with linear regression, assessing the accuracy of models, factor analysis and reliability tests are also possible with R.

In descriptive analysis, researchers consider four types of measures: central tendency, dispersion, position and distribution. The measures of central tendency are the mean, median and mode. In R, the “mean” and “median” functions calculate the mean and the median. There is no ready-made function for the statistical mode (the built-in “mode” function reports how an object is stored, not its most frequent value). Instead, the “factor” function can convert a numerical variable into a categorical one, and the “table” function counts how often each value occurs; the value or class with the highest count is the mode.
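As a minimal sketch of these calls, the example below uses a small made-up vector named scores (hypothetical data, invented only for illustration):

```r
# Hypothetical sample data, invented for illustration
scores <- c(4, 7, 7, 8, 10, 12, 7, 9)

mean_score   <- mean(scores)    # arithmetic mean
median_score <- median(scores)  # middle value of the sorted data

# Base R has no statistical mode function, so tabulate the values
# and pick the one with the highest frequency.
freq       <- table(factor(scores))
mode_score <- names(freq)[which.max(freq)]  # returned as a character string

print(mean_score)
print(median_score)
print(mode_score)
```

Note that which.max returns only the first value when several values tie for the highest frequency, so multimodal data would need extra handling.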

For measures of dispersion, use the “var” and “sd” functions to calculate the variance and standard deviation. For measures of position, quartiles, deciles and percentiles are considered. There are three quartiles, and researchers use the “quantile” function to calculate them in R with the probabilities 0.25 (quartile 1), 0.50 (quartile 2) and 0.75 (quartile 3). Quartile 2 is also the median. The same “quantile” function calculates deciles and percentiles when it is given the corresponding probabilities.
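The following sketch shows how these functions might be called. The vector values is hypothetical, and the probs argument is what selects quartiles, deciles or percentiles:

```r
# Hypothetical sample data, invented for illustration
values <- c(12, 15, 11, 18, 20, 14, 16, 19, 13, 17)

variance <- var(values)  # sample variance (divides by n - 1)
std_dev  <- sd(values)   # sample standard deviation, sqrt of the variance

# Quartiles: 25th, 50th and 75th percentiles
quartiles <- quantile(values, probs = c(0.25, 0.50, 0.75))

# Deciles: 10th, 20th, ..., 90th percentiles
deciles <- quantile(values, probs = seq(0.1, 0.9, by = 0.1))

# A single percentile, here the 90th
percentile_90 <- quantile(values, probs = 0.90)

print(variance)
print(std_dev)
print(quartiles)
print(deciles)
```

The quantile function supports several interpolation methods through its type argument, so results can differ slightly from those of other statistical software.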

For measures of distribution, researchers use skewness and kurtosis to check whether the shape of a data set is consistent with a normal distribution. To calculate them in R, the moments package needs to be installed and then loaded with the “library” function. If the skewness value is zero, the distribution is symmetrical; if it is greater than zero, the distribution is positively skewed; and if it is less than zero, the distribution is negatively skewed. For kurtosis, a value of 3 indicates a normal curve (mesokurtic), a value of more than 3 indicates a leptokurtic curve and a value of less than 3 indicates a platykurtic curve. The CRAN website lists the packages available for R, about 20,000 in all.
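To show what these two measures compute, here is a sketch that writes the moment-based formulas out in base R. The helper names skew_manual and kurt_manual and the sample data are hypothetical. The commented lines at the end show how the moments package would typically be used instead, assuming it is installed:

```r
# Moment-based skewness: third central moment over variance^(3/2)
skew_manual <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based kurtosis (not excess kurtosis): a normal curve gives about 3
kurt_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

# Hypothetical data with a long right tail
x <- c(2, 3, 3, 4, 4, 4, 5, 5, 9)

print(skew_manual(x))  # positive for right-skewed data
print(kurt_manual(x))

# With the moments package, the usual calls would be:
# install.packages("moments")
# library(moments)
# skewness(x)
# kurtosis(x)
```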

In statistics, the normality of data can be tested with the Shapiro-Wilk test, for which the R function is “shapiro.test”. The p value from the test is compared with the significance level: if the calculated p value is less than the significance level, the variable is not considered normally distributed. Other functions help to check normality visually, such as “qqnorm” (quantile-quantile plot against the normal distribution) and “qqline” (the reference line for that plot). Thus, the concept of normality is the backbone of many statistical methods.
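A short sketch of a normality check follows. The two samples are simulated (hypothetical) data, and the 0.05 significance level is an assumption chosen for illustration:

```r
set.seed(42)  # make the simulated data reproducible

# Hypothetical samples: one drawn from a normal distribution,
# one from a strongly right-skewed exponential distribution
normal_sample <- rnorm(100, mean = 100, sd = 15)
skewed_sample <- rexp(100, rate = 1)

res_normal <- shapiro.test(normal_sample)
res_skewed <- shapiro.test(skewed_sample)

alpha <- 0.05  # assumed significance level

# TRUE means normality is rejected at the chosen level
print(res_normal$p.value < alpha)
print(res_skewed$p.value < alpha)

# Visual check: points close to the line suggest approximate normality.
# Drawn only in an interactive session.
if (interactive()) {
  qqnorm(normal_sample)
  qqline(normal_sample)
}
```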

In inferential statistics, hypotheses are created and tested with various tests in R. Two independent samples can be compared with either a parametric or a non-parametric test. When the variables are normally distributed, researchers can use the parametric independent sample t-test; when they are not, a non-parametric alternative is more appropriate. Researchers may use the “t.test” function in R for the parametric t-test. If the p value is greater than the significance level (commonly 0.05), the null hypothesis is not rejected; if it is smaller, the null hypothesis is rejected in favour of the alternative hypothesis. For the non-parametric case, researchers may use the Mann-Whitney U test, a very popular non-parametric test, which is available in R through the “wilcox.test” function. The p value is read in the same way: if it is greater than the significance level, the null hypothesis is not rejected. Researchers may also use “t.test” for the paired sample t-test (a parametric test); the function remains the same, but the code is written differently, with the argument paired = TRUE. In this test too, a p value of more than 0.05 means the null hypothesis is not rejected. The paired sample t-test is considered one of the more powerful tests in statistics.
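The three tests described above could be run along these lines. All data are simulated and hypothetical, and the helper decision and the 0.05 level are assumptions made for illustration:

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical independent groups
group_a <- rnorm(30, mean = 50, sd = 10)
group_b <- rnorm(30, mean = 55, sd = 10)

# Parametric: independent sample t-test (Welch version by default;
# add var.equal = TRUE for the classic equal-variance test)
ind_t <- t.test(group_a, group_b)

# Non-parametric: Mann-Whitney U test, called the Wilcoxon rank sum test in R
mwu <- wilcox.test(group_a, group_b)

# Hypothetical paired measurements on the same 20 subjects
before <- rnorm(20, mean = 70, sd = 8)
after  <- before + rnorm(20, mean = 2, sd = 3)

# Parametric: paired sample t-test, same function with paired = TRUE
paired_t <- t.test(before, after, paired = TRUE)

# Hypothetical helper that turns a p value into a decision
decision <- function(p, alpha = 0.05) {
  if (p < alpha) "reject H0" else "fail to reject H0"
}

print(decision(ind_t$p.value))
print(decision(mwu$p.value))
print(decision(paired_t$p.value))
```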

Thus, R can help researchers to carry out descriptive and inferential statistics with simple functions and libraries. Researchers should understand which code is most appropriate for analysing the available data. Apart from this, it is very important for researchers to understand the type of data they are working with.

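It is important to note that R keeps data in memory only for the current session; nothing is stored permanently unless you save it. You can save the objects in your workspace to a file, which is known as an image. The next day, researchers only need to reload the workspace to continue.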
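As a small sketch of saving and reloading, the example below writes one object to a temporary file and loads it back. The object results and the file name analysis.RData are hypothetical. The “save” function stores named objects, while “save.image” stores the whole workspace in the same file format:

```r
# Hypothetical object produced during an analysis
results <- c(mean = 8, median = 7.5)

# Temporary location used for illustration; a project folder is more usual
image_path <- file.path(tempdir(), "analysis.RData")

# Save the named object to the file.
# save.image(file = image_path) would save every object in the workspace.
save(results, file = image_path)

# Simulate a fresh session by removing the object
rm(results)

# Reload the saved objects
load(image_path)
print(results)
```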

Premier institutes like JIMS Kalkaji provide value-added courses in R to enhance placement prospects for students. JIMS Kalkaji also organizes workshops on R and Python for PGDM students from time to time to improve their technical competencies.

For more information, visit: https://www.jagannath.org/

JIMS KALKAJI
