November 2021, Volume 71, Issue 11

Research Article

Urdu translation and validation of clinically useful depression outcome scale

Yasmeen Wajid Mauna Gauhar (National Institute of Psychology, Center of Excellence, Quaid-i-Azam University, Islamabad, Pakistan.)
Humaira Jami (National Institute of Psychology, Center of Excellence, Quaid-i-Azam University, Islamabad, Pakistan.)


Objective: To translate and validate the Clinically Useful Depression Outcome Scale for the Urdu-speaking population.

Method: The cross-sectional study was conducted in Rawalpindi and Islamabad from January 2018 to November 2019. The process of translation and validation was conducted in two phases. In the first phase, the scale was forward and backward translated. In the second phase two validation studies were conducted; one for computing Cronbach’s alpha, test-retest reliability, and item-total correlation, and exploring convergent and discriminant validity; and the other for exploring linguistic equivalence between the original and the translated scale. Data was analysed using SPSS 22.

Result: The first validation study had 170 subjects; 85(50%) in clinical and 85(50%) in non-clinical settings. The translated scale was found to be internally consistent, and convergent and discriminant validity coefficients were significant (p<0.05). Mean difference between clinical and non-clinical groups was also significant (p<0.05), indicating the diagnostic capability of the translated scale. The second validation study, conducted on a separate sample of 82 bilingual participants, showed that the mean difference between the original and the translated version was non-significant (p>0.05), indicating that the Urdu version can be considered an equivalent to the original scale.

Conclusion: The translated version of the Clinically Useful Depression Outcome Scale (CUDOS-Urdu) was found to be a reliable and valid instrument for measuring depressive symptoms in Urdu-speaking individuals.

Keywords: Depression, Urdu translation, Psychometrics, Reliability, Validity. (JPMA 71: 2524; 2021)


Introduction


Depression is a common mental disorder that can afflict persons of all age groups throughout the globe.1 The World Health Organisation (WHO)2 reported that the number of people living with depression in 2015 was 322 million, with a large share of them living in south-east Asia. The occurrence of depression differs by age.2 Major depressive disorder (MDD) adds to the burden attributed to suicide and increases the risk of death and other adverse health outcomes,1 including ischaemic heart disease (IHD).3 Depression was projected to reach second place by 2020 in the ranking of Disability Adjusted Life Years (DALYs) computed for all ages.4

The prevalence of depressive disorders appears to be particularly high in Pakistan.5 In a systematic review, the mean prevalence of anxiety and depression in an indigenous random community sample was 33.62%. Out of these, 29-66% women and 10-33% men experienced depressive symptoms.6 In another study, the prevalence of depression in Lahore was high (53.4%) compared to Quetta (43.9%) and Karachi (35.7%).7 The prevalence of depression among the elderly aged >65 years was found to be 22.9%.8 Pakistani women experienced more antenatal depression (48.4%) compared to Aboriginal (31.2%) and Caucasian (8.6%) women.9 In the tribal areas of Khyber Pakhtunkhwa (KP), 60% women and 45% men suffered from depression.5 The suicide rate linked with depression was found to be high in Pakistan.10

There is a dearth of scales in Urdu to measure depression. Of the few available, the translated ones have been criticised on the grounds of insufficient evaluation.11 The Siddiqui-Shah Depression Scale (SSDS) is the only indigenously developed scale. However, it has 72 items,11 which makes it daunting for depressed individuals to complete.

Urdu is spoken and understood not only indigenously but also by millions across the world.12 For the detection of depression in the Urdu-speaking population, there is a need for a brief, user-friendly tool that is easy to administer and score. Developing a new measure from scratch is challenging and time-consuming; translating an existing scale into Urdu is a more viable option.

The Clinically Useful Depression Outcome Scale (CUDOS)13 is a reliable, valid, precise and user-friendly self-reporting instrument, which is brief, acceptable to patients, and covers all Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV) diagnostic criteria for MDD.13 It is also reliable with internal consistency and test-retest reliability, has convergent and discriminant validity, indicates symptom severity and remission status, has case-finding capability, assesses psychosocial function, quality of life (QOL) and suicidal thoughts, is sensitive to change, easy to score, is not expensive and can be completed in <3 minutes and scored in <15 seconds.13

CUDOS is internally consistent (α=0.90); its test-retest reliability at baseline was 0.92 and at follow-up, 0.95. It had high convergent validity with other depression scales, like Beck Depression Inventory (BDI) (r=0.81), Extracted Hamilton Rating Scale for Depression (r= 0.69) and with the Clinical Global Index of Depression Severity (r=0.71). The discriminant validity was good with Michigan Alcohol Screening Test (r=0.08), the Drug Abuse Screening Test (r=0.12), and with Whitely Index (r=0.27).13

There are 18 items on the scale, of which 16 cover the symptoms of depression. The total score on the 16 items ranges from 0 to 64. A score of 0-10 represents the non-depressed range, 11-20 minimal depressive symptoms, 21-30 mild depression, 31-45 moderate depression, and 46 and above severe depression. Items 17 and 18 give scores on global perception of psychosocial impairment due to depression and QOL, respectively.

The respondent is instructed to rate the symptom items on a 5-point Likert scale indicating “how well the item describes them during the past week, including today” (0 = not at all true/0 days; 1 = rarely true/1-2 days; 2 = sometimes true/3-4 days; 3 = usually true/5-6 days; 4 = almost always true/every day).
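The scoring rules described above can be illustrated with a minimal Python sketch. It is not part of the original instrument or of the study's analysis; the function name `score_cudos` and its input format are assumptions made for illustration, while the item count, rating range and severity bands are taken from the text.

```python
# Upper bound of each severity band for the 16 symptom items (total 0-64),
# as stated in the text.
SEVERITY_BANDS = [
    (10, "non-depressed"),
    (20, "minimal"),
    (30, "mild"),
    (45, "moderate"),
    (64, "severe"),
]


def score_cudos(symptom_ratings):
    """Sum the 16 symptom ratings (each 0-4) and return (total, band).

    `symptom_ratings` is a hypothetical input: a list of 16 integers for
    items 1-16. Items 17 and 18 are reported separately and are not summed.
    """
    if len(symptom_ratings) != 16:
        raise ValueError("expected ratings for the 16 symptom items")
    if any(r not in (0, 1, 2, 3, 4) for r in symptom_ratings):
        raise ValueError("each rating must be an integer from 0 to 4")
    total = sum(symptom_ratings)
    for upper, label in SEVERITY_BANDS:
        if total <= upper:
            return total, label
    raise AssertionError("unreachable: total is bounded by 64")


# A respondent answering "sometimes true" (2) on every symptom item.
print(score_cudos([2] * 16))
```

A respondent who rates every symptom item as 2 obtains a total of 32, which falls in the moderate band.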

The scale can be used repeatedly to monitor patient progress and treatment efficacy without burdening the test-taker or the administrator.13 In a study, participants found CUDOS easier to understand, less cumbersome and less time-consuming to score than the BDI, and there was less information to read. The majority of patients preferred CUDOS over BDI at every follow-up.14


Subjects and Methods


The cross-sectional study was conducted from January 2018 to November 2019 in Rawalpindi and Islamabad in two phases after approval from the National Institute of Psychology, Centre of Excellence, Quaid-i-Azam University, Islamabad, Pakistan. In phase one, the scale was translated using Brislin's15 criteria for back-translation. In phase two, the psychometric properties of the translated scale were tested through two validation studies.

In Phase 1, the first step was to ascertain the relevance of CUDOS in the local setting. The original CUDOS was administered to four males and four females with depressive symptoms, aged 26-39 years and with postgraduate education. The participants found the items easy to rate and relatable to their symptoms. Next, the scale was translated into Urdu by three independent bilingual individuals: a physician and two Ph.D. scholars of Psychology.

In step 2, the translated versions were then presented to a committee of two expert Ph.D psychologists having previous experience of evaluating translations. Minor changes were made to items 1, 2, 3, 9, 10, 13, 15, 16, and 20 by the committee.

In step 3, the scale was sent for back-translation to three different independent bilinguals who held MS, M.Phil. and MBBS qualifications and had no knowledge of the original scale.

In the final step, the earlier constituted committee analysed the back-translations and modified items 5, 9, 10 and 16. The scale was then sent to the principal author for comments. After the approval, the instrument was ready for validation studies that were carried out in the second phase of the study.

The first validation study was conducted to find out the reliability and validity of the translated CUDOS. Males and females with or without depressive symptoms were invited to participate. The participants were divided into clinical and nonclinical groups.  The venue for the clinical setting was the Psychiatric Department of the Capital Development Authority (CDA) Hospital, Islamabad. Potential participants were interviewed by the doctor at the psychiatric facility. The community sample was collected by contacting participants from all walks of life. Those who met the inclusion criteria and gave informed consent were enrolled.

The clinical group comprised individuals presenting with mental health issues.  They were at least 18 years of age, literate in Urdu, could self-report and had a current diagnosis of DSM-based MDD or dysthymic disorder. Individuals with bipolar disorders, psychotic disorders, schizophrenia spectrum disorders, substance use/drug addiction (tobacco smoking was an exception) and those with any neurological condition, intellectual disability, current pregnancy, and self-harm history were excluded.

The nonclinical group comprised functional individuals from the community who could self-report in Urdu, did not have clinically significant symptoms of depression, who were not on any psychiatric medication or seeing a mental health professional, and did not have difficulty in social or occupational functioning. Those aged <18 years, those who could not read or write Urdu or give self-report, those who had neurological condition/s, intellectual disability, clinically significant symptoms of depression or other psychiatric disorders and those on psychiatric treatment were excluded.

The sample size was calculated using the item response theory,16 which suggested an item-to-respondent ratio of 1:5 to 1:10. Since CUDOS consists of 18 items, the required sample size was 90 or 180 for item-to-respondent ratios of 1:5 and 1:10, respectively. The item-to-respondent ratio in the current study was 1:9.44.
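The arithmetic behind these figures can be checked with a short Python sketch. The variable and function names are assumptions made for illustration; the item count and ratios come from the text, and the figure of 170 respondents is the sample size reported for the first validation study.

```python
N_ITEMS = 18          # items on CUDOS, from the text
N_RESPONDENTS = 170   # participants reported for the first validation study


def required_sample(n_items, respondents_per_item):
    """Sample size implied by a 1:respondents_per_item item-to-respondent ratio."""
    return n_items * respondents_per_item


lower = required_sample(N_ITEMS, 5)     # ratio 1:5
upper = required_sample(N_ITEMS, 10)    # ratio 1:10
achieved = N_RESPONDENTS / N_ITEMS      # respondents per item actually obtained

print(lower, upper, round(achieved, 2))
```

The two bounds come out at 90 and 180, and 170 respondents over 18 items gives roughly 9.44 respondents per item, matching the ratio of 1:9.44 stated above.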

The participants were briefed about the purpose of the study and assured of confidentiality and anonymity. A booklet containing a demographic sheet, the translated CUDOS, the translated17 Depression Anxiety Stress Scale (DASS)18 and the translated Satisfaction with Life Scale (SWLS)19 was handed over to them for scoring. DASS and SWLS were used to determine the convergent and discriminant validity of CUDOS. The subjects in the clinical group were timed while completing the Urdu version so that the completion time could be compared with the <3 minutes reported for the original CUDOS.13

The test-retest reliability of the translated scale was measured on a sample of postgraduate students from the National Institute of Psychology, Centre of Excellence, Quaid-i-Azam University, Islamabad, who gave informed consent. The inclusion and exclusion criteria for this group were the same as those for the non-clinical group. Booklets containing the translated scales were administered on two occasions with a gap of two weeks. Information about confidentiality and about scoring the booklets was given before the participants provided the data.

The second validation study was conducted to examine language equivalence between the translated and the original version of CUDOS. It was hypothesised that if no significant mean difference was observed between the scores obtained from the participants at two points in time, then language equivalence between the original and translated versions would be assumed.

The study was conducted on a separate convenience sample of bilingual students from two public-sector universities in Islamabad.

The participants were approached through faculty members of the universities after seeking permission from the relevant authorities, and those who gave written informed consent were enrolled.

The sample size calculation for each item was based on 1:10 ratio according to the item response theory.16 However, the final analysis had an item response ratio of 1:44.

The scales were administered at two points in time. Each participant had to score both versions of the scale so that their scores on the original and translated CUDOS could be compared and the mean difference analysed. On the first administration, half the participants scored the original English version and the other half scored the translated CUDOS. On the second administration a week later, participants who had scored the English version earlier scored the Urdu version, and those who had scored the Urdu version earlier scored the English version. Data from both studies was analysed using SPSS 22.

Reliability and validity analyses in the first validation study were run for the total data as well as for the clinical and nonclinical groups. Mean, standard deviation, alpha coefficients, and test-retest correlations were computed. Convergent validity of the translated CUDOS was analysed with the DASS total scale and its subscales, and discriminant validity with SWLS. To assess the diagnostic capability of the translated scale, the mean scores of the two sub-groups were compared using the independent-samples t-test.
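The study computed these statistics in SPSS 22. As an illustration of what the alpha coefficient measures, the following is a minimal Python sketch of the standard Cronbach's alpha formula, α = k/(k-1) × (1 - Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy responses are assumptions made for illustration and are not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per respondent.

    Uses sample variances (n - 1 denominator) for both the items and the
    total scores.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings from four respondents on three items (not study data).
toy_responses = [
    [2, 3, 3],
    [1, 1, 2],
    [4, 4, 3],
    [0, 1, 1],
]
print(round(cronbach_alpha(toy_responses), 3))
```

Alpha rises as the items vary together across respondents, which is why it is read as an index of internal consistency.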

In the cross-language validation study, mean, standard deviations and alpha reliabilities were computed for both versions of CUDOS. Paired t-test and correlation between Urdu and English versions were computed to study scales’ correlation and mean difference. P<0.05 was considered statistically significant.
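The paired comparison described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the SPSS procedure the study used: the function names and the toy scores are hypothetical, and the p-value lookup against the t distribution is omitted because the standard library does not provide it.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for matched scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


# Hypothetical total scores of five bilingual respondents (not study data).
urdu_scores = [10, 14, 22, 30, 8]
english_scores = [12, 13, 20, 31, 9]

t_stat, df = paired_t(urdu_scores, english_scores)
r = pearson_r(urdu_scores, english_scores)
print(round(t_stat, 3), df, round(r, 3))
```

A t statistic close to zero together with a high positive correlation is the pattern that the study interprets as evidence of equivalence between the two language versions.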


Results

Of the 250 subjects approached for the first validation study, 170(68%) participated; 85(50%) in clinical and 85(50%) in non-clinical group. Demographic and clinical data of each subject was noted (Table 1).

Alpha reliabilities for all the 18 items were calculated. Items 1-16 measured depression only, while items 17-18 measured psychosocial impairment and QOL. Cronbach alpha values for the translated CUDOS on combined sample and its clinical and nonclinical groups showed good reliability coefficients (Table 2).

The item-total correlations were significant for the entire sample, as well as for the two groups, with the value of item 4 being low throughout the sample (Table 3).

The test-retest reliability of the translated scale showed a significant correlation (r=0.62, p<0.01) between the two administrations. This was assessed on a sample of 37 subjects; 2(5.4%) males and 35(94.6%) females; overall mean age: 21.31±2.03 years.

The translated CUDOS showed significant convergent validity with total DASS and its subscales in total sample as well as in clinical and nonclinical groups, and the discriminant validity with SWLS in the entire sample as well as in contrast groups was significant (Table 4).

Contrast group mean values were computed for both groups on the assumption that clinical scores on the translated CUDOS would be higher than nonclinical scores. The assumption was supported by the statistics (t=5.07; df=83; p<0.001). A significant mean difference was observed between the scores of the clinical and nonclinical groups for the total scale. Mean differences between these subgroups were also calculated separately for the 16 items measuring depressive symptoms and for items 17 and 18 measuring global perception of functioning and QOL. All mean differences were significant (p<0.001) (Table 5).

Of the 196 individuals approached for the second validation study, 82(42%) participated (Table 6).

The alpha reliabilities were found to be good (Table 7).

There was a significant positive relationship (r=0.65, p<0.01) between the original and the translated versions of CUDOS. There was no significant mean difference in participants’ scores on Urdu and English versions (Table 8).


Discussion

The current study translated and validated CUDOS for the indigenous population through two validation studies. The Urdu version of CUDOS in study I exhibited satisfactory internal consistency for the total sample as well as for the subgroups. The alpha coefficient for the total sample was 0.93, which was the same as reported earlier.20 Cronbach’s α=0.91 in the clinical sample (18 items) was equal to that of the Korean version (α=0.91).21 Cronbach’s α=0.89 for the clinical group (16 items) in the current study was close to that of the original English version (α=0.90)13 and of the Spanish adaptation (α=0.88) of CUDOS.22 Item-total correlations indicated that each translated item contributed significantly to the dimension measured, leading to the retention of all items. The item-total correlation of item 4, ‘My appetite was much greater than usual’, was significant but low in the total sample as well as in the subgroups. It seemed that this item did not apply to most subjects who provided data. According to a study,23 half of the patients with MDD experience decreased appetite due to depressive states and one-third experience increased appetite. This may have been the case with the participants of the current study.

The significant correlation on test-retest administration established the scale's temporal reliability. The scale correlated significantly and positively with DASS and its subscales, and significantly and negatively with SWLS, providing support for its convergent and discriminant validity. The significant mean difference between clinical and nonclinical groups provided support for the diagnostic capability of the translated scale. The original CUDOS reported a completion time of <3 minutes.13 Our clinical sample took four minutes on average to complete the translated version, which was closer to the completion time reported for the Spanish version of CUDOS.22

Results of cross-language validation study showed strong alpha coefficients for Urdu and English versions, again showing that the scale was internally consistent. Significant correlation and non-significant mean difference between original and translated versions of CUDOS provided evidence for the language equivalence and suggested that the scale can be used interchangeably.

Since the translated CUDOS is time-efficient, cost-effective and relevant to the subjects’ experiences, it can be used on every visit to a healthcare setting for early detection, treatment, referrals, and for maintaining records. Clinicians who do not use rating scales24 can easily use it, as it does not require any specific training. An assessment of improvement without a standardised measure may increase the chance of relapse if residual symptoms escape the clinician’s attention.24

The current study has limitations in terms of lack of diversity, unequal gender participation, high dropout rate and use of convenience sampling. It was conducted in the region of Rawalpindi and Islamabad, where ethnic groups from other areas of Pakistan were under-represented. Unequal male and female representation may have affected the link of gender differences with the study variable.


Conclusion

The translated measure, CUDOS-Urdu, was found to be reliable and valid. The scale can aid initial screening and may help track clients’ levels of depression over the course of treatment. Its ease of access and use for clinicians and researchers makes it a viable option for clinical use and for conducting epidemiological or other depression-related research.


Disclaimer: The text is based on a doctoral thesis.

Conflict of Interest: None.

Source of Funding: None.


References

1.      Ferrari AJ, Charlson FJ, Norman RE, Patten SB, Freedman G, Murray CJ, et al. Burden of depressive disorders by country, sex, age, and year: findings from the global burden of disease study 2010. PLoS Med. 2013; 10:e1001547.

2.      World Health Organization. Depression and other common mental disorders: global health estimates. World Health Organization; [Online] 2017 [Cited 2021 May 30]. Available from: URL:

3.      Charlson FJ, Moran AE, Freedman G, Norman RE, Stapelberg NJ, Baxter AJ, et al. The contribution of major depression to the global burden of ischemic heart disease: a comparative risk assessment. BMC Med. 2013; 11:250.

4.      Reddy MS. Depression: the disorder and the burden. Indian J Psychol Med. 2010; 32:1-2.

5.      Husain N, Chaudhry IB, Afridi MA, Tomenson B, Creed F. Life stress and depression in a tribal area of Pakistan. Br J Psychiatry. 2007; 190:36-41.

6.      Mirza I, Jenkins R. Risk factors, prevalence, and treatment of anxiety and depressive disorders in Pakistan: systematic review. BMJ. 2004; 328:794.

7.      Gadit AA, Mugford G. Prevalence of depression among households in three capital cities of Pakistan: need to revise the mental health policy. PloS One. 2007; 2:e209.

8.      Ganatra HA, Zafar SN, Qidwai W, Rozi S. Prevalence and predictors of depression among an elderly population of Pakistan. Aging Ment Health. 2008; 12:349-56.

9.      Shah SM, Bowen A, Afridi I, Nowshad G, Muhajarine N. Prevalence of antenatal depression: comparison between Pakistani and Canadian women. J Pak Med Assoc. 2011; 61:242.

10.    Khan MM, Mahmud S, Karim MS, Zaman M, Prince M. Case-control study of suicide in Karachi, Pakistan. Br J Psychiatry. 2008; 193:402-5.

11.    Ahmer S, Faruqui RA, Aijaz A. Psychiatric rating scales in Urdu: a systematic review. BMC Psychiatry. 2007; 7:59.

12.    Ahmad S, Hussain S, Akhtar F, Shah FS. Urdu translation and validation of PHQ-9, a reliable identification, severity and treatment outcome tool for depression. J Pak Med Assoc. 2018; 68:1166-70.

13.    Zimmerman M, Chelminski I, McGlinchey JB, Posternak MA. A clinically useful depression outcome scale. Compr Psychiatry. 2008; 49:131-40.

14.    Zimmerman M, Chelminski I, Young D, Dalrymple K. Using outcome measures to promote better outcomes. Clin Neuropsychiatry. 2011; 8:28-36.

15.    Brislin RW. Translation research and its applications: an introduction. In Translations: Applications and research. New York: Wiley & Halsted, 1976.

16.    Osborne JW, Costello AB. Sample size and subject to item ratio in principal components analysis. Pract Assess Res Evaluat. 2004; 9:1-9.

17.    Aslam N, Tariq N. Psychological disorders and resilience among earthquake affected individuals. (Unpublished M. Phil Dissertation).  Islamabad, Pakistan: National Institute of Psychology, Quaid-i-Azam University, 2007.

18.    Lovibond PF, Lovibond SH. The structure of negative emotional states: Comparison of the Depression Anxiety Stress Scales (DASS) with the Beck Depression and Anxiety Inventories. Behav Res Ther. 1995; 33:335-43.

19.    Diener ED, Emmons RA, Larsen RJ, Griffin S. The satisfaction with life scale. Butt MM, Ghani A, Khan S, translators. Department of Psychology, GC University, Lahore, Pakistan. [Online] [Cited 2021 May 30]. Available from: URL:

20.    Hsu LF, Kao CC, Wang MY, Chang CJ, Tsai PS. Psychometric testing of a Mandarin Chinese Version of the Clinically Useful Depression Outcome Scale for patients diagnosed with type 2 diabetes mellitus. Int J Nurs Stud. 2014; 51:1595-604.

21.    Jeon SW, Han C, Ko YH, Yoon SY, Pae CU, Choi J, et al. Measurement-based treatment of residual symptoms using clinically useful depression outcome scale: Korean validation study. Clin Psychopharmacol Neurosci. 2017; 15:28-34.

22.    Maurino J. Adaptation into Spanish of the clinically useful depression outcome scale (CUDOS) for assessing major depressive disorder from the patient’s perspective. Actas Esp Psiquiatr. 2013; 41:287-300.

23.    Maxwell MA, Cole DA. Weight change and appetite disturbance as symptoms of adolescent depression: Toward an integrative biopsychosocial model. Clin Psychol Rev. 2009; 29:260-73.

24.    Zimmerman M, McGlinchey JB. Why don't psychiatrists use scales to measure outcome when treating depressed patients? J Clin Psychiatry. 2008; 69:1916-9.

