Testing the PROMIS® Depression measures for monitoring depression in a clinical sample outside the US
Introduction
Certain areas of medicine have a sustained interest in the development of Patient Reported Outcome (PRO) instruments (Black and Jenkinson, 2009). This interest has been accompanied by a proliferation of condition-specific instruments, causing a fragmentation of measures that hampers comparability across studies, settings, or pathologies. As a response, the Patient-Reported Outcomes Measurement Information System (PROMIS®) (Cella et al., 2007) was devised in the US as a publicly available measurement system of self-reported health based on a domain-specific approach without attributions to specific conditions or treatments (Cella et al., 2010). PROMIS focuses on comparability between health states and populations through the application of item response theory (IRT), a psychometric method for item-calibration allowing a common metric for different populations, broader range of scores, and greater precision in individual measures compared to classical test theory methods. IRT properties yield the possibility of alternative administration forms: full item banks, static short forms or dynamic computer adaptive testing (CAT) that selects items in real time targeted to the examinee's specific level of ability or impairment, reducing the number of questions needed and respondent burden without a substantial loss of precision (Hambleton et al., 1991, Van der Linden and Glas, 2000). However, administration burden is increased as CAT requires computerized support in applications.
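The adaptive item selection described above can be illustrated with a minimal sketch. It assumes a dichotomous two-parameter logistic (2PL) model for brevity, not the polytomous calibration of the actual PROMIS banks. The item identifiers, the parameter values, and the function names `p_endorse`, `item_information` and `select_next_item` are hypothetical and serve only to show the principle: at each step the CAT administers the unused item that is most informative at the current estimate of the respondent's level.

```python
import math


def p_endorse(theta, a, b):
    """2PL probability of endorsing an item at latent level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Return the id of the unused item with maximum information at theta."""
    candidates = [item_id for item_id in bank if item_id not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))


# Hypothetical bank (invented values): item id -> (discrimination a, location b)
BANK = {
    "item_low": (1.5, -1.0),
    "item_mid": (1.5, 0.0),
    "item_high": (1.5, 1.0),
}

# Start at the population mean, then move to a higher provisional estimate.
first = select_next_item(0.0, BANK, administered=set())
second = select_next_item(1.2, BANK, administered={first})
```

With equal discriminations, the selected item is the one whose location lies closest to the current estimate, which is why a CAT can reach a given precision with fewer questions than a fixed form. A complete CAT would also re-estimate the respondent's level after each response and apply a stopping rule, both of which are omitted here.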
The international extension of PROMIS is currently underway (Alonso et al., 2013) with PROMIS domains being culturally adapted into several languages (Patient-reported outcomes measurement information system, 2015a). To support the usefulness of PROMIS® for cross-national comparisons, it is important to demonstrate that PROMIS measures are valid, reliable and responsive to change when used outside the US. The assessment of cross-cultural differential functioning at the item (DIF) and test (DTF) level is also crucial to ensure that items are similarly understood and the measures are unbiased across different subpopulations, most importantly, countries, cultures and conditions.
A case of particular importance is emotional disturbance and depression, constructs negatively influencing the course of health (Anderson et al., 2001, Scott et al., 2009) that have been recommended as main outcomes to assess the impact of treatments for various specific conditions (Turk et al., 2003). Efforts have been made to develop item banks for CAT depression instruments (Fliege et al., 2005, Forkmann et al., 2013, Gardner et al., 2004, Gibbons et al., 2008, Gibbons et al., 2012). Among them, the PROMIS system includes a depression domain as part of the overall health profile; it is also the only IRT-based depression measure available in Spanish. An interesting feature of PROMIS Depression is that it does not include items regarding somatic symptoms (e.g. sleep problems, appetite disturbances), unlike other commonly used depression measures (Beck et al., 1996, Spitzer et al., 1999). Thus PROMIS avoids potential confounding effects when assessing patients with comorbid physical conditions. Another advantage of PROMIS measures is that they are designed to be population-independent and sensitive not only to prevalence but also to a wide range of severity levels. The dimensional approach also avoids difficulties related to changes in the consensus criteria of categorical nosologies, a problem known to have a great impact on patients' clinical status when the required criteria for a disorder are modified (Pereda and Forero, 2012). Additionally, it can provide valuable information on real or biased cross-national differences in the epidemiology of depression (Forero et al., 2014b, Weissman et al., 1996).
In order to gain evidence about their usefulness, PROMIS Depression attributes should be tested in clinical environments in different languages. Of greatest concern is the evaluation of construct validity and responsiveness in patient samples relevant to the construct of interest. PROMIS Depression has shown good results in patients with major depression (Pilkonis et al., 2014) and other conditions (Amtmann et al., 2014). However, the psychometric properties of the PROMIS Depression measures in Spanish or other language versions have not been evaluated so far.
This study aimed at testing the measurement properties of the Spanish version of PROMIS® Depression in patients seeking mental health care at different care levels in Spain. Specifically, our objectives were to: a) confirm the measurement model and unidimensionality of the PROMIS Depression item bank; b) assess reliability, construct-related validity and responsiveness to change of the item bank and the 8-item static short form.
Selection of the sample
This study was conducted as part of the Inventory of Depression and Anxiety Symptoms (INSAyD) project (Olariu et al., 2014), a prospective study designed to provide brief and easy-to-use tools for diagnosing and assessing severity of mood and anxiety disorders, based on DSM-IV-TR symptom criteria, in a sample of primary care and specialized mental health patients seeking help for active symptoms of mood or anxiety. Patients were invited to participate from October 2011 to February 2013. Three
Results
Out of 244 patients invited, 96.7% were interviewed (8 did not meet inclusion criteria and 3 refused to participate). Of these, 15 did not provide information on self-reported scales including PROMIS. Among the 218 participants who completed baseline self-reported measures, 47 (19.8%) were lost to follow-up after 3 months and one was excluded. Additionally, 20 (8.3%) did not respond to the PROMIS Depression item bank at follow-up. The baseline analysis was carried out with these 218 individuals.
Discussion
This study assesses the psychometric properties of the Spanish version of the PROMIS Depression measures in a sample of individuals with common mental disorders. In this first study evaluating its performance in a clinical sample, the Spanish PROMIS Depression was shown to be reliable, valid and responsive. Both the item bank and the short form were able to discriminate between major depressive episode (MDE) and frequently comorbid disorders while capturing aggravation due to comorbidity. Our results are comparable
Conclusions
Results indicate good reliability, construct validity and responsiveness of the Spanish PROMIS Depression item bank and the 8-item static short form, thus supporting PROMIS as a good measure of depression state levels. The fact that these results were found in a clinical sample demonstrates its ability to monitor depression in clinical settings, even though it was not designed as a clinical diagnostic instrument. Given that it is part of a broader assessment of different health outcomes
Financial disclosure and acknowledgments
We would like to thank the participating patients and health care centers who made this project possible. This study was supported by grants from Instituto de Salud Carlos III FEDER (grant references: FEDER PI10/00530; FEDER PI13/00506). Gemma Vilagut was supported by Fondo de Investigación Sanitaria, ISCIII (ECA07/059).
References (57)
- et al.
The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005-2008
J. Clin. Epidemiol.
(2010)
- et al.
Confirmatory factor analysis of the Beck anxiety and depression inventories in patients with major depression
J. Affect Disord.
(1998)
- et al.
Differential item and test functioning methodology indicated that item response bias was not a substantial cause of country differences in mental well-being
J. Clin. Epidemiol.
(2014)
- et al.
Adaptive screening for depression–recalibration of an item bank for the assessment of depression in persons with mental and somatic diseases and evaluation in a simulated computer-adaptive test environment
J. Psychosom. Res.
(2013)
- et al.
Estimation of a single effect size: parametric and non-parametric methods
- et al.
Validación de las versiones en español de la Montgomery-Asberg depression y la Hamilton anxiety rating scale para la evaluación de la depresión y de la ansiedad
Med. Clin. Barc.
(2002)
- et al.
Psychometric properties of the twelve item World Health Organization disability assessment schedule II (WHO-DAS II) in Spanish primary care patients with a first major depressive episode
J. Affect Disord.
(2010)
- et al.
Validation of the depression item bank from the patient-reported outcomes measurement information system (PROMIS) in a three-month observational study
J. Psychiatr. Res.
(2014)
- et al.
Core outcome domains for chronic pain clinical trials: IMMPACT recommendations
Pain
(2003)
- et al.
The case for an international patient-reported outcomes measurement information system (PROMIS®) initiative
Health Qual. Life Outcomes
(2013)
Comparing CESD-10, PHQ-9, and PROMIS depression instruments in individuals with multiple sclerosis
Rehabil. Psychol.
The prevalence of comorbid depression in adults with diabetes: a meta-analysis
Diabetes Care
PROMIS computerised adaptive tests are dynamic instruments to measure health-related quality of life in patients with cirrhosis
Aliment. Pharmacol. Ther.
An inventory for measuring clinical anxiety: psychometric properties
J. Consult Clin. Psychol.
Comparison of beck depression inventories -IA and -II in psychiatric outpatients
J. Pers. Assess.
Alpha, dimension-free, and model-based internal consistency reliability
Psychometrika
Measuring patients' experiences and outcomes
BMJ
Alternative ways of assessing model fit
Psychometric comparison of PHQ-9 and HADS for measuring depression severity in primary care
Br. J. Gen. Pract.
The patient-reported outcomes measurement information system (PROMIS): progress of an NIH roadmap cooperative group during its first two years
Med. Care
Statistical Power Analysis for the Behavioral Sciences
Cuestionarios, inventarios y escalas. Ansiedad, depresión y habilidades sociales
PROMIS® Instrument Development and Validation Scientific Standards Version 2.0. Appendix 14
Validation and utility of the patient health questionnaire in diagnosing mental disorders in 1003 general hospital Spanish inpatients
Psychosom. Med.
Screening for mental disorders in heart failure patients using computer-adaptive tests
Qual. Life Res.
Development of a computer-adaptive test for depression (D-CAT)
Qual. Life Res.
Towards a biopsychosocial nosology of mental illness: challenges and opportunities for psychiatric epidemiology
J. Epidemiol. Community Health
Computerized adaptive measurement of depression: a simulation study
BMC Psychiatry
1 INSAyD Investigators: Jordi Alonso, Carlos García Forero, Gemma Vilagut, Pilar Álvarez, José-Ignacio Castro-Rodriguez, Luis Miguel Martín-López, Maite Campillo, Lina Abellanas, Carrie Garnier, Maria Rosa Más, Marta Reinoso, Gabriela Barbaglia, Miquel A. Fullana, Alberto Maydeu, Anna Brown.