April 2016, Volume 66, Issue 4

Short Reports

Use of statistical tests and statistical software choice in 2014: tale from three Medline indexed Pakistani journals

Masood Ali Shaikh (Independent Consultant, Gulshan-e-Iqbal, Karachi, Pakistan.)

Abstract

Statistical tests help draw meaningful conclusions from the studies conducted and the data collected. This descriptive study analyzed the types of statistical tests used, and the statistical software reported, in the original articles published in 2014 by the three Medline-indexed journals of Pakistan. Cumulatively, 466 original articles were published in 2014. The statistical tests most frequently reported in all three journals were bivariate parametric and non-parametric tests, i.e. those involving comparisons between two groups, e.g. Chi-square test, t-test, and various types of correlations. Cumulatively, 201 (43.1%) articles used these tests. SPSS was the primary choice for statistical analysis, as it was exclusively used in 374 (80.3%) original articles. Compared to 2007, there has been a substantial increase in the number of articles published, and in the sophistication of the statistical tests used, in the Pakistani Medline-indexed journals in 2014.
Keywords: Statistics, Software Program, Pakistan, Publication Research Journal.


Introduction

Descriptive and inferential statistical methods help make sense of data and draw meaningful conclusions. Reviews of the statistical tests used in medical journals from Pakistan and other countries report that, over time, both the use and the sophistication of the statistical tests employed for the data analysis of published studies have increased.1-4
There is only one review of the statistical methods used in articles published in Pakistani medical journals.1 That review covered six medical journals of Pakistan for the years 1998 and 2007, including three Medline-indexed journals, namely the Journal of Pakistan Medical Association (JPMA), the Journal of Ayub Medical College (JAMC), and the Journal of College of Physicians and Surgeons of Pakistan (JCPSP). It reported that out of 299 'original articles' and 'short communications' published in the Medline-indexed journals in 2007, only 7 (2.3%) used a multiple linear regression model, while only 1 (0.6%) out of 177 articles published in the non-indexed journals used this model. This was in sharp contrast to the year 1998, when only 1 (0.6%) article out of 172 in the Medline-indexed journals, and none out of 101 articles in the non-Medline-indexed journals, used a multiple regression model.1 However, this review did not study the type of statistical software reported in the articles.
Pakistan still has the same three Medline-indexed journals, i.e. JPMA, JAMC, and JCPSP. All three are general medical journals and publish papers pertaining to all medical disciplines, including public health. JAMC is a quarterly medical journal, while the other two are published on a monthly basis. In the author's experience, all three journals maintain a regularly updated website where the full text of all published papers is freely available for download. This descriptive study was undertaken to analyze the types of statistical tests used, and the statistical software utilized for analysis, in all the original articles published in the year 2014 by the three Medline-indexed journals of Pakistan.


Methods and Results

In February 2015, the websites of JPMA, JAMC, and JCPSP were accessed, and all 'original articles' published in the year 2014 were downloaded and reviewed for the type of statistical tests applied and the statistical software used for analysis of data. The full text of all the articles is freely available on the websites of these three journals, including for the year 2014. In all instances, the use of statistical tests as reported in the published paper was taken at face value; in the very few instances where such information was not reported, it was inferred from the description provided in the 'methods' section. Data were analyzed in terms of frequencies and percentages using the freely available statistical analysis program R version 3.1.2.
Cumulatively, 466 original articles were published by the three journals in the year 2014. JPMA published 189 (40.6%), while JAMC and JCPSP published 130 (27.9%) and 147 (31.5%) respectively. For the purpose of this analysis, statistical tests were combined in groups as most of the articles reported use of more than one test.
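The frequencies and percentages above can be reproduced with a few lines of code. The study itself used R version 3.1.2; the following is only a minimal Python sketch, in which the function name frequency_table and the dictionary layout are illustrative assumptions, while the counts are those reported in the text.

```python
# Counts of original articles published in 2014, as reported in the text.
articles_2014 = {"JPMA": 189, "JAMC": 130, "JCPSP": 147}


def frequency_table(counts):
    """Return {label: (count, percent)}; percent is rounded to one decimal.

    Hypothetical helper for illustration; not the author's actual code.
    """
    total = sum(counts.values())
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


table = frequency_table(articles_2014)
for journal, (n, pct) in table.items():
    # Prints each journal in the "n (percent%)" style used in the article.
    print(f"{journal}: {n} ({pct}%)")
```

The same helper applies to any of the other counts in this report, provided the correct denominator is supplied (for example, the 8 articles that used more than one software programme rather than all 466 articles).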

Table-1 lists all the groups along with the various statistical tests forming these groups. Univariate analyses such as frequencies, percentages, mean, and median were grouped together. Statistical tests entailing comparison between two groups, e.g. Chi-square test, t-tests, and their non-parametric counterparts, in addition to correlations, were all combined in one group. Statistical tests involving comparison between more than two groups, e.g. Analysis of Variance (ANOVA) and the Kruskal-Wallis test, were combined in a separate group, and regression analyses, e.g. linear and logistic, were combined in yet another group. A hierarchical approach to the grouping of statistical tests was adopted; for example, articles using t-tests in combination with regressions were counted as belonging to the regression group. A few articles were grouped together in an 'other' group: those that used only qualitative analysis techniques, those that used statistical tests that were not part of the other groups, e.g. the Kappa statistic, and those that used statistical tests belonging to more than one group, e.g. Area under the Curve (AUC) in combination with the Kolmogorov-Smirnov test. Four articles used a mixed methods study design, i.e. both quantitative and qualitative analysis techniques; these were also placed in the 'other' group, since their quantitative analysis was limited to frequencies and counts.
Table-1 provides information on the number and type of statistical tests performed and reported in articles, disaggregated by the three Pakistani Medline-indexed journals. The statistical tests most frequently reported in original articles by all three journals were bivariate parametric and non-parametric tests, i.e. those involving comparisons between two groups, e.g. Chi-square test, t-test, and various types of correlations; cumulatively, 201 (43.1%) articles reported such tests. Cumulatively, only 39 (8.4%) original articles used multivariate models, e.g. linear and logistic regressions.
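The hierarchical grouping described above can be sketched as follows. This is a hedged illustration rather than the author's procedure: the group labels, the rank order among the univariate, bivariate, and multi-group categories, the test-name mapping, and the rule that any unmapped technique sends an article to the 'other' group are all assumptions made for this example. Only the rule that regression outranks t-tests is stated in the text.

```python
# Assumed rank order: an article is assigned to its highest-ranked group.
GROUP_RANK = {"univariate": 1, "bivariate": 2, "multigroup": 3, "regression": 4}

# Hypothetical mapping from reported technique names to groups.
TEST_GROUP = {
    "frequency": "univariate",
    "mean": "univariate",
    "chi-square": "bivariate",
    "t-test": "bivariate",
    "correlation": "bivariate",
    "anova": "multigroup",
    "kruskal-wallis": "multigroup",
    "linear regression": "regression",
    "logistic regression": "regression",
}


def classify_article(tests):
    """Return the single group an article is counted under.

    Assumption: an article with no tests, or with any technique outside
    the mapping (e.g. Kappa statistic, qualitative analysis), is 'other'.
    """
    groups = [TEST_GROUP.get(name.lower()) for name in tests]
    if not groups or any(group is None for group in groups):
        return "other"
    return max(groups, key=GROUP_RANK.__getitem__)


# An article reporting t-tests and a regression counts as 'regression'.
print(classify_article(["t-test", "logistic regression"]))
```

Counting each article once under its highest-ranked group keeps the group totals mutually exclusive, so they sum to the number of articles reviewed.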


Table-2 provides information on the type of statistical analysis software programmes reported in the original articles published in the year 2014, disaggregated by the three Medline-indexed journals. SPSS was the primary choice for statistical analysis, as it was exclusively used in 374 (80.3%) original articles; the SPSS versions used ranged from version 10 to 22. SPSS was also used in combination with other software programmes in 7 (87.5%) out of the 8 instances where more than one analysis software programme was used. The use of the freely available software programme Epi Info was reported in 3 (0.6%) original articles. The other freely available and very powerful analysis software programme, R, was reported in only one original article, where it was used in combination with STATA.


Discussion

Results from this study clearly indicate that there has been a substantial increase in the number of articles published, as well as in the number and sophistication of the statistical tests used, in the Pakistani Medline-indexed journals in 2014, compared to 2007.1 Some other article types that were not included in this study, e.g. 'short communication' and articles published under the 'Student's Corner' by the JPMA, also report the use of regression models as well as statistical tests for comparing two groups.5,6 Their inclusion in the study would have further reinforced this conclusion. Bivariate parametric and non-parametric tests, i.e. those involving comparisons between two groups, e.g. Chi-square test, t-test, and various types of correlations, were reported by 201 (43.1%) original articles, and were the most common statistical tests used. These statistical tests have also been reported to be the most commonly used in other studies.1-3,7
The SPSS (SPSS Inc., Chicago, IL, USA) statistical software programme was exclusively used in 374 (80.3%) original articles published in 2014 by the three Pakistani Medline-indexed journals; SPSS was also the most commonly used statistical analysis software reported by the Journal of Periodontal and Implant Science, based on a review covering 2010 to 2014.2 However, STATA (StataCorp LP., College Station, TX, USA) and SAS (SAS Institute Inc., Cary, NC, USA) were reported to be the most commonly used statistical analysis programmes in the health services research studies published in the United States from 2007 to 2009.8
Findings from this descriptive study need to be interpreted with several caveats. Firstly, the analysis was restricted to original articles only, and to one year, i.e. 2014. Secondly, the analysis was restricted to the Pakistani Medline-indexed journals only; 'PakMediNet', which is the leading online database of Pakistani medical journals, lists 77 medical and pharmaceutical journals that are published in Pakistan.9 Finally, many Pakistani health and medical researchers publish in foreign journals. Hence, the results are in no way representative of the statistical tests used for analysis, or of the choice of statistical software, in Pakistan.
Future reviews of the statistical tests used in articles published in Pakistani medical journals need to study their appropriateness in the context of the study design and the data collected, as well as the quality of the inferences drawn from the results of the statistical tests applied.


References

1. Rao MH, Khan N. Comparison of statistical methods, type of articles and study design used in selected Pakistani medical journals in 1998 and 2007. J Pak Med Assoc 2010; 60: 745-50.
2. Choi E, Lyu J, Park J, Kim HY. Statistical methods used in articles published by the Journal of Periodontal and Implant Science. J Periodontal Implant Sci 2014; 44: 288-92.
3. Emerson JD, Colditz GA. Use of statistical analysis in the New England Journal of Medicine. N Engl J Med 1983; 309: 709-13.
4. Wang Y, Yu JM, Geng JP, Zheng JW. A review of the statistical methods used in the journal of prosthetic dentistry. Shanghai Kou Qiang Yi Xue 2004; 13: 201-2.
5. Shaikh MA. Prevalence, correlates, and changes in tobacco use between 2006 and 2010 among 13-15 year Moroccan school attending adolescents. J Pak Med Assoc 2014; 64: 1306-9.
6. Rais R, Saeed M, Haider R, Jassani Z, Riaz A, Perveen T. Rheumatoid arthritis clinical features and management strategies at an urban tertiary facility in Pakistan. J Pak Med Assoc 2014; 64: 1435-7.
7. Rigby AS, Armstrong GK, Campbell MJ, Summerton N. A survey of statistics in three UK general practice journal. BMC Med Res Methodol 2004; 4: 28.
8. Dembe AE, Partridge JS, Geist LC. Statistical software applications used in health services research: analysis of published studies in the U.S. BMC Health Serv Res 2011; 11: 252.
9. Medical journals of Pakistan. [Online] 2015 [Cited 2015 Feb 11]. Available from URL: http://www.pakmedinet.com/journal.php.
