Mehwish Hussain (Department of Biostatistics, Dow University of Health Sciences, Karachi, Pakistan)

#### July 2012, Volume 62, Issue 7

### Learning Research

Researchers analyse and present their data after the collection phase of a study. This step is crucial because it shapes the discussion of the entire report, depending on the objective of the study. Nevertheless, researchers face great difficulty because they are unaware of which statistical tools suit a given objective.^{1} Different studies report that many medical professionals are keen to learn statistical tools. However, they begin to shy away from them when they encounter difficulties in analysis.^{1,2} This phobia may be due to the mathematical tabulation and interpretation of the findings. Nevertheless, learning the core concepts of statistical analysis would help researchers tremendously. Every correct analysis is based on the objective of the study. The guidelines of the International Committee of Medical Journal Editors (ICMJE) explain the importance of acceptable reporting of statistics: "Describe statistical methods with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results."^{3}

**Types of Statistical Analyses:**

There are two main types of statistical analysis: descriptive and inferential. In descriptive statistics, the researcher only describes the findings of the collected data. Inferential statistics includes methods that generalize the findings to the related population with a stated level of confidence and an assessment of the significance of the results. Certain principles must be considered while conducting such statistical computations, and researchers should also be cautious while presenting them. This article covers the considerations for reporting descriptive statistics. Inferential statistics will be discussed in subsequent articles of the 'Learning Research' series.

**Descriptive Statistical Analysis:**

Descriptive analyses are performed when the study objective is to enhance the reader's knowledge, comprehension and application related to the research. The phrasing of the study objective is pivotal, because its exact words are tied to the study design and the statistics. The objective contains clues in the form of verbs defined in Bloom's taxonomy,^{4} such as describe, identify, evaluate, examine, design, review, show and measure. Note carefully that the verbs associate, relate, compare, predict, estimate etc. fall under inferential statistical analysis. Technical statistical terms such as "random," "normal," "significant," "correlations," and "sample" should not be used in a nontechnical manner.^{3} If they are used, these terms should be explained in such a way that the reader can understand the scenario comfortably.

Descriptive analysis involves describing data in terms of frequencies, proportions, mean, median, quartiles, standard deviation, inter-quartile range etc. Which of these statistics is appropriate depends on whether the variable is qualitative or quantitative. Qualitative variables are categorical and describe attributes, e.g. gender, socio-economic status, pain level, treatment group etc. Quantitative variables, on the other hand, are measurable and numerical, e.g. age, height, weight, pain score etc.
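The split between the two variable types can be illustrated with a minimal Python sketch. The function names and the small dataset are hypothetical and exist only for illustration; the quartiles use the default method of Python's `statistics.quantiles`, which is one of several quartile conventions and may differ slightly from other statistical packages.

```python
from collections import Counter
from statistics import mean, median, quantiles, stdev


def describe_qualitative(values):
    """Return {category: (frequency, percentage)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}


def describe_quantitative(values):
    """Return mean, SD, median, quartiles and IQR for a numerical variable."""
    q1, _, q3 = quantiles(values, n=4)  # default "exclusive" quartile method
    return {
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Hypothetical data, for illustration only.
gender = ["male", "female", "female", "male", "female"]
age = [23, 31, 35, 42, 58]

gender_summary = describe_qualitative(gender)
age_summary = describe_quantitative(age)
print(gender_summary)
print(age_summary)
```

Keeping one function per variable type mirrors the rule in the text: frequencies and percentages for qualitative variables, and location and spread statistics for quantitative ones.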

**Qualitative Variable Analysis:**

When the data are qualitative in nature, they should be described in terms of frequencies and percentages. Editors usually ask authors to report both frequencies and percentages when writing the results. A commonly used format is to write the frequency with the percentage in parentheses, or vice versa, e.g.

"Prevalence of obesity was 1450 (30.1%) {men: 1030 (35.13%), women: 420 (22.28%)}."^{5}
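As a minimal sketch of this "frequency (percentage)" format, the hypothetical helper below builds the string from a count and a total. The function name, the counts and the default of one decimal place are assumptions for illustration; the target journal's instructions decide the actual number of decimal places.

```python
def format_count(frequency, total, decimals=1):
    """Format a frequency as 'n (p%)' with a fixed number of decimal places."""
    percentage = 100 * frequency / total
    return f"{frequency} ({percentage:.{decimals}f}%)"


# Hypothetical counts, for illustration only.
obese, sample_size = 45, 150
sentence = f"Prevalence of obesity was {format_count(obese, sample_size)}."
print(sentence)
```

Passing `decimals=2` would give two decimal places instead, which keeps the reporting precision consistent across a manuscript.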

It should also be noted that a sentence must not start with a numeral. If a sentence has to start with a number, the number should be written in words. Furthermore, if the author describes proportions in words or as percentages, the corresponding frequency should be given in parentheses. An example of the same is:

"More than 80% respondents had known or heard about Tetanus (n= 973). Regarding predisposing factors of Tetanus infection, … Majority of respondents had known or heard about Rabies (n= 973; 81%)."^{6}

The author must read the statistical analysis and data description portion of the "Instructions to Authors" of the specific journal. In this section, the editor specifies how many decimal places should be reported for fractional values.

**Quantitative Variable Analysis:**

Basic descriptions of quantitative variables are presented as mean ± standard deviation (SD). Note that this must not be the standard error (SE) of the mean. When the data are skewed, however, the median with inter-quartile range (IQR) should be reported. Authors must be clear that the IQR is different from the range. The range is the difference between the largest and smallest values in a set of values, whereas the IQR is calculated from the quartiles of the dataset.^{7} There are several methods to check the skewness of data; the histogram and box plot are used for initial detection. If the histogram shows a "bell-shaped" curve, or the middle line of the box plot lies at the centre of the box, then the variable is non-skewed (symmetric). If such a mirror-image pattern is not observed, the variable is considered skewed (asymmetric). Another, arithmetical, indication of skewness is the standard deviation. If it is at most one third of the mean, the variable is symmetric. The variable is considered skewed if the standard deviation is more than one third of the mean.
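The box plot check can be expressed numerically as a rough sketch: the position of the median between the first and third quartiles plays the role of the "middle line" in the box. The function name, the `tolerance` cut-off and both datasets are assumptions made for illustration, not a formal test of skewness; reporting the IQR as the two quartiles is one common convention.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev


def summarise(values, tolerance=0.1):
    """Summarise a quantitative variable and choose a reporting format.

    `tolerance` is an assumed cut-off: the variable is treated as symmetric
    when the median lies within 0.5 +/- tolerance of the way from Q1 to Q3.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; the box plot check is not informative")
    med = median(values)
    sd = stdev(values)
    median_position = (med - q1) / iqr  # 0.5 means exactly mid-box
    symmetric = abs(median_position - 0.5) <= tolerance
    if symmetric:
        report = f"{mean(values):.1f} ± {sd:.1f}"  # mean ± SD
    else:
        report = f"{med:.1f} (IQR {q1:.1f} to {q3:.1f})"  # median (IQR)
    return {
        "symmetric": symmetric,
        "report": report,
        "range": max(values) - min(values),  # largest minus smallest value
        "iqr": iqr,  # third quartile minus first quartile
        "sd": sd,
        "se": sd / sqrt(len(values)),  # SE of the mean, not to be reported as SD
    }


# Hypothetical datasets, for illustration only.
symmetric_result = summarise([10, 12, 14, 16, 18, 20, 22])
skewed_result = summarise([1, 2, 2, 3, 4, 9, 40])
print(symmetric_result["report"])
print(skewed_result["report"])
```

The returned dictionary also shows why the range and the IQR, or the SD and the SE, should not be confused: for the same data they are different numbers.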

**Graphical Representation of Data:**

Tabular or graphical presentation of results is a great asset that an author can use to present complicated and large volumes of findings. The pie chart and bar chart are used to present proportions and frequencies obtained for qualitative data. The pie chart is less preferable, however, as it represents only one variable. Also, with a large number of categories, the pie chart becomes hard to read. The colours of the slices or bars must be light, since most journals publish graphs in black and white. To distinguish the categories of a variable, different fill patterns can therefore be used while drawing them (Figure).^{8}

Data labels must be given as either frequencies or percentages. Both the X (horizontal) and Y (vertical) axes must be labelled. The title of the figure should be given beneath the graph and should be detailed but concise.

**Tabular Presentation of Data:**

There are many formats for tabular presentation. Tables describing fewer than three variables are usually not considered good for data presentation. Also, tables with many cells of zero frequency, or with many categories that have small counts, should not be displayed in articles. Tables that are too small or too large should be avoided. The names and categories of variables can be displayed in two ways. Examples of presenting such datasets are shown in Tables 1 and 2.

In the first format, variable names are given in the first column and the categories of each variable in the second column (Table-1).^{9} The cells of the first column should be merged so that each variable name spans as many rows as the variable has categories. In the second format, the variable's name and categories are given together in the first column: one row gives the name of the variable in bold, and the successive rows below list the categories of that variable in normal text (Table-2).^{10} In both formats, the later columns display the statistics for each category. Each cell should contain the frequency with the percentage in parentheses when describing categorical variables. The mean ± SD or median (IQR) should be presented for quantitative variables. Long phrases and abbreviations should be explained in a footnote to the table, marked with symbols such as *, †, ‡ etc.^{10} The title should be given above the table.
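The second layout (variable name in bold, with its categories on the rows beneath) can be sketched in a few lines of Python. The function names, the column headers and the data are hypothetical; the output is a markdown table only for the sake of illustration, and a journal's own template takes precedence.

```python
from collections import Counter


def table_rows(variables):
    """Build (label, statistic) rows: one bold row naming each variable,
    followed by one row per category holding 'n (p%)'."""
    rows = []
    for name, values in variables.items():
        rows.append((f"**{name}**", ""))
        total = len(values)
        for category, n in Counter(values).items():
            rows.append((category, f"{n} ({100 * n / total:.1f}%)"))
    return rows


def to_markdown(rows, header=("Variable", "n (%)")):
    """Render the rows as a two-column markdown table."""
    lines = [f"| {header[0]} | {header[1]} |", "| --- | --- |"]
    lines += [f"| {label} | {stat} |" for label, stat in rows]
    return "\n".join(lines)


# Hypothetical data, for illustration only.
data = {
    "Gender": ["Male", "Female", "Female", "Male"],
    "Residence": ["Urban", "Urban", "Urban", "Rural"],
}
rows = table_rows(data)
print(to_markdown(rows))
```

Building the rows separately from the rendering step means the same rows could be written out in the first layout, or in a word processor table, without recomputing the statistics.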

A few words of caution must be shared with authors here. Authors must never repeat study findings in tables or figures if they are already adequately described in the text of the article. A good researcher first reviews the dataset and decides what can best be presented in figures or tables, and then writes the text accordingly. Authors also need to refer to each table and graph in the main text of the article.

### References

1. Altman DG. Statistics in medical journals: some recent trends. Stat Med 2000; 19: 3275-89.

2. Curran-Everett D, Benos DJ. Guidelines for reporting statistics in journals published by the American Physiological Society. Am J Physiology-Heart and Circulatory Physiology 2004; 287: H447-9.

3. International Committee of Medical Journal Editors. Uniform Requirements for Manuscripts Submitted to Biomedical Journals: Manuscript Preparation and Submission: Preparing a Manuscript for Submission to a Biomedical Journal. (Online) 2009 (Cited 2012 May 6). Available from URL: http://www.icmje.org/manuscript_1prepare.html.

4. Bloom BS. Bloom's Taxonomy. In: Victoria Uo, ed volume 2012. Boston: Allyn and Bacon, 2004.

5. Ustu Y, Ugurlu M, Aslan O, Aksoy YM, Kasim I, Egici MT, Sanisoglu SY. High prevalence of obesity in Tokat, a northern province of Turkey. JPMA 2012; 62: 435-40.

6. Wasay M, Malik A, Fahim A, Yousuf A, Chawla R, Daniel H, Rafay M, Azam I, Razzak J. Knowledge and attitudes about Tetanus and Rabies: A population-based survey from Karachi, Pakistan. JPMA 2012; 62: 378-82.

7. Stat Trek. How to Measure Variability in Statistics. (Online) 2012 (Cited 2012 May 6). Available from URL: http://stattrek.com/descriptive-statistics/variability.aspx.

8. Thaver AM, Kamal A. Impact of information sources on the knowledge of adolescents about hepatitis B. JPMA 2010; 60: 1072-5.

9. Ali S, Ali SF, Imam AM, Ayub S, Billoo AG. Perception and practices of breastfeeding of infants 0-6 months in an urban and a semi-urban community in Pakistan: a cross-sectional study. JPMA 2011; 61: 99-104.

10. Jalilolghadr S, Afaghi A, O'Connor H, Chow CM. Effect of low and high glycaemic index drink on sleep pattern in children. JPMA 2011; 61: 533-6.

**Journal of the Pakistan Medical Association has agreed to receive and publish manuscripts in accordance with the principles of the following committees:**