Development of the First Health-Related Quality of Life Questionnaires in Arabic for Women With Polycystic Ovary Syndrome (Part II): Dual-Center Validation of PCOSQoL-47 and PCOSQoL-42 Questionnaires

Introduction: Validation assesses the acceptability, responsiveness, interpretability, and quality of any questionnaire in any specific population. This is done by correlation matrix evaluation of the proposed test tool against a previously well-validated assessment tool. The study objective is the dual-center assessment of the construct validity of the first health-related quality of life questionnaires for married and unmarried women with polycystic ovary syndrome (PCOS), i.e., PCOSQoL-47 and PCOSQoL-42, respectively.

Materials and methods: At two centers in Iraq, we enrolled 406 married women and 362 unmarried women with PCOS to test the construct validity of PCOSQoL-47 and PCOSQoL-42, respectively, from August 2019 to August 2020 (after obtaining the full results of reliability testing in our previous work). We used the comparable domains of a multiculturally validated questionnaire, the World Health Organization Quality of Life instrument (WHOQOL-BREF), as a comparator to assess the construct validity of the domains of the final, highly reliable drafts of PCOSQoL-47 and PCOSQoL-42 obtained in our previous work. The enrolled women responded to the WHOQOL-BREF and to either PCOSQoL-47 or PCOSQoL-42, according to their marital status. Pearson's parametric correlation coefficient compared the total scores of the matched domains in each of our questionnaires and the WHOQOL-BREF at p≤0.05. Values above 0.3 denoted an important correlation between our test questionnaires and the well-validated WHOQOL-BREF questionnaire. The inter-rater reliability between our questionnaires and the comparator was calculated using Cronbach's alpha, inter-item correlations, and an intraclass correlation coefficient matrix.

Results: We obtained a good respondent-to-item ratio of approximately 9:1 for both questionnaires. We had a good response for the domains of our questionnaires and the WHOQOL-BREF. The coping domain of PCOSQoL-42 showed the highest Pearson's coefficient (0.708), which indicates a strong and significant correlation between the two constructs (p<0.001). Other domains of the PCOSQoL-42 showed moderate, significant correlation coefficient values. The psychological and emotional status domain of PCOSQoL-47 showed a weak yet significant correlation with its corresponding domain of the WHOQOL-BREF. The other domains of the PCOSQoL-47 showed moderate, significant correlation coefficient values >0.5. The PCOSQoL-42 and PCOSQoL-47 showed high inter-rater reliability levels in measuring the requested construct or concept when we used Cronbach's alpha and inter-item correlation matrix assessment.

Conclusion: The individualized PCOSQoL-47 and PCOSQoL-42 for married and unmarried women with PCOS, respectively, represent the first reliable and valid tools for assessing the health-related quality of life (HRQoL) in women with PCOS who use Arabic as a first or native language, and they address sexual function as a separate domain.


Introduction
We use the term "validity" to describe whether the inferences and conclusions drawn from any questionnaire or test score are suitable for what the instrument was designed to measure [1]. Validation assesses the acceptability, responsiveness, interpretability, and quality of any questionnaire in any specific population [2,3]. There is a crossover between reliability and validity testing during the psychometric assessment of a questionnaire [1].
Construct validation is essential in testing the quality of questionnaires that deal with outcomes that are not directly observable, like health-related quality of life (HRQoL). A lack of construct validity between different measures of the questionnaire renders its results difficult to interpret and makes it impossible to draw inferences or associations from its responses [4]. To test construct validity, the questionnaire of interest and a preexisting, well-validated test score or questionnaire are administered to the same group of individuals. Correlation matrix analysis is then used to assess the different forms of association between measures that share the same construct (concept), in a convergent or a divergent manner [5].
In our previous work [6], we discussed the psychometric analysis of the two newly developed HRQoL questionnaires for married and unmarried women with polycystic ovary syndrome (PCOS), i.e., PCOSQoL-47 and PCOSQoL-42, respectively. We described the item pool formation, content and face validity, applicability in a pilot study, and then the test-retest reliability evaluation along with internal consistency. In this article, we discuss the validation analysis, based on construct validity testing of the two new questionnaires in a real-life study of premenopausal women with PCOS at two endocrinology centers in Iraq.

Materials And Methods
This study passed through three phases to reach the final drafts of the questionnaires. The previous article [6] described the first two phases, which covered the full psychometric evaluation of the PCOSQoL-47 and PCOSQoL-42 questionnaires for married and unmarried women with PCOS, respectively.

Phase 3: The second recruitment (August 2019-August 2020)
After the conclusion of the test-retest reliability analysis and internal consistency, we used the third draft to measure the HRQoL in the women with PCOS from the two groups (Tables 1, 2). The same enrollment criteria as before were applied during this recruitment phase, and the ethical consent forms were signed [6]. Each woman received two questionnaire forms simultaneously, i.e., either PCOSQoL-47 or PCOSQoL-42, together with the WHOQOL-BREF. All women received a full description of the questionnaire items from their interviewing endocrinologist.

WHOQOL-BREF
The 26-item WHOQOL-BREF is a cross-culturally validated, generic questionnaire for assessing general health and well-being [7]. It is not specific for assessing HRQoL in PCOS because of the heterogeneous nature of the syndrome. The WHOQOL-BREF has been used as a baseline or comparator instrument to validate questionnaires in different communities [8-11].
The WHOQOL-BREF covers four domains of quality of life (QoL): physical health, psychological, social relationships, and environmental. Its first two questions address overall QoL and general health. The questions refer to the respondent's feelings in the last two weeks. Each question is scored on a five-point Likert scale, where 1 indicates maximum HRQoL impairment and 5 indicates the least impairment [9,12].

Construct validity
We used construct validity testing to evaluate the ability of the present questionnaires to measure a construct (concept) [13], by comparing the equivalent domains in the two questionnaires, i.e., PCOSQoL-47 or PCOSQoL-42, and the WHOQOL-BREF. The analysis was done using Pearson's correlation analysis. Both the present questionnaires and the WHOQOL-BREF are scored on a Likert scale from 1 to 5, so there was no need to create a specific syntax for the scores.
The first step in construct validity testing was to determine the corresponding domains that measure similar constructs, and are presumed to give statistically coherent outcomes, from the two questionnaires, i.e., PCOSQoL-47 or PCOSQoL-42, and the WHOQOL-BREF.

Management of Data
To achieve this goal, we enrolled all the women from both groups who attended the two endocrine centers with a diagnosis of PCOS from August 2019 to August 2020, fulfilled the enrollment criteria of the study, and consented to recruitment [6].
There were 362 unmarried women with PCOS who responded to the PCOSQoL-42 and WHOQOL-BREF, and 406 married women who responded to the PCOSQoL-47 and WHOQOL-BREF. All the responses were checked, registered, and calculated initially by two research members independently. The paper and electronic questionnaire forms for each woman from either group were numbered by the first author, and all other authors were blinded to the numbering, to maximize the anonymity of the responses.
The one-year timeline that the authors initially proposed for the construct validation analysis determined the final sample size: 768 of the 913 women with PCOS (84.1%) who attended the two centers for management from August 2019 to August 2020. We excluded 145 women with PCOS from the study because they did not fulfill the enrollment criteria.
All the paper forms were sorted and stored according to the registration numbers set by the first author, so that they could be retrieved on request from any respondent. All enrolled women were provided with a copy of their own responses to ensure transparency in dealing with their data. All women were told they were free to withdraw their consent at any time during the study until final publication, and that this would not affect in any way the level of medical care provided to them. No woman withdrew her consent. There was no monetary incentive for the participants.

Ethical approval
All the study phases were in accordance with FDEMC ethical committee standards, from whom ethical approval was obtained for the study. The approval number was E43/3/2018. All enrolled women signed informed consent in Arabic before participating in the study.

The construct validation of the third draft
To start the validation analysis, we matched the domains of our already measured PCOSQoL questionnaires with the corresponding similar domains in the Arabic version of the WHOQOL-BREF as a comparator, as shown in Tables 1, 2, and 3. We implemented a second recruitment study that included 362 unmarried and 406 married women with PCOS, who responded to PCOSQoL-42 and PCOSQoL-47, respectively, along with the WHOQOL-BREF at the same time. The sample size gave a respondent-to-item ratio of approximately 9:1.
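The arithmetic behind the reported ratio can be checked with a short sketch. The respondent counts and the 913 attendees are taken from the text; the item counts (47 and 42) are assumed from the questionnaire names, and the variable and function names are illustrative only.

```python
# Respondent counts as reported in the text; item counts are ASSUMED
# from the questionnaire names (PCOSQoL-47 and PCOSQoL-42).
cohorts = {
    "PCOSQoL-47 (married)": {"respondents": 406, "items": 47},
    "PCOSQoL-42 (unmarried)": {"respondents": 362, "items": 42},
}


def respondent_to_item_ratio(respondents, items):
    """Number of respondents available per questionnaire item."""
    return respondents / items


# Ratio for each questionnaire, rounded to one decimal place.
ratios = {
    name: round(respondent_to_item_ratio(**counts), 1)
    for name, counts in cohorts.items()
}

# Share of all attendees (913, as reported) who were enrolled.
total_enrolled = sum(c["respondents"] for c in cohorts.values())
enrolled_percent = round(100 * total_enrolled / 913, 1)
```

Under these assumptions, both ratios round to 8.6, which is the figure quoted in the Discussion and is described here as approximately 9:1.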
The general characteristics of both cohorts and their initial responses are described in Tables 4 and 5. To perform the construct validation, we used parametric analysis to estimate Pearson's correlation coefficients at a two-tailed significance level of ≤0.05. Pearson's coefficient was used to compare continuous variables such as the total domain mean score. Values above 0.3 denoted an important correlation between our test questionnaires and the well-validated WHOQOL-BREF questionnaire.
The coping domain of PCOSQoL-42 showed the highest Pearson's coefficient (0.708), which indicates a strong correlation between the two constructs at a two-tailed significance level of <0.001. Other domains of the PCOSQoL-42 showed moderate, significant correlation coefficient values >0.5. The psychological and emotional status domain of PCOSQoL-47 showed a weak yet significant correlation with its corresponding domain of the WHOQOL-BREF. The other domains of the PCOSQoL-47 showed moderate, significant correlation coefficient values >0.5 (Table 6).
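As a minimal sketch of this step, the following code computes Pearson's coefficient for one pair of matched domains and applies the 0.3 cut-off. It is not the authors' analysis script: the domain scores are hypothetical, the function names are illustrative, and the t statistic is only a rough check on significance. A statistics package such as SciPy (`scipy.stats.pearsonr`) would return the two-tailed p-value directly.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length samples with at least 3 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# HYPOTHETICAL domain mean scores (1-5 scale) for six respondents.
pcosqol_domain = [2.1, 3.4, 3.0, 4.2, 1.8, 3.9]
whoqol_domain = [2.5, 3.1, 3.3, 4.0, 2.2, 3.6]

r = pearson_r(pcosqol_domain, whoqol_domain)
meets_threshold = r > 0.3  # the cut-off for an "important" correlation

# Rough check on a reported value: r = 0.708, assuming all 362
# respondents to PCOSQoL-42 contributed to that correlation.
t_reported = t_statistic(0.708, 362)
```

With 360 degrees of freedom, the two-tailed critical value of t at the 0.05 level is about 1.97, so a t statistic of roughly 19 for the reported coefficient is consistent with the stated significance of <0.001.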

The inter-rater reliability analysis of the third draft
We assessed the inter-rater reliability between our questionnaires and the comparator using a type A two-way mixed-effects model with an absolute agreement definition, through Cronbach's alpha and the inter-item and intraclass correlation coefficient (ICC) matrix, following the same procedure described for the test-retest reliability analysis in our previous paper [6]. The PCOSQoL-42 and PCOSQoL-47 showed high inter-rater reliability levels in measuring the requested construct or concept, as shown in Table 7 (two-tailed significance level <0.001).
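For readers unfamiliar with the absolute agreement definition, the sketch below computes a single-measure ICC from two-way ANOVA mean squares. It is an illustration under stated assumptions rather than the authors' procedure: the function name and the data are hypothetical, and it covers only the single-measure form, whereas statistical packages usually report the average-measure form as well.

```python
def icc_absolute_agreement(scores):
    """Single-measure ICC with absolute agreement, often written ICC(A,1).

    `scores` is a table with one row per respondent and one column per
    measurement (for example, the matched domain of two questionnaires).
    """
    n = len(scores)      # number of respondents (rows)
    k = len(scores[0])   # number of measurements per respondent (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )


# HYPOTHETICAL data: identical columns versus a constant one-point shift.
identical = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
shifted = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]

icc_identical = icc_absolute_agreement(identical)
icc_shifted = icc_absolute_agreement(shifted)
```

The shifted example shows why the agreement definition matters: the two columns are perfectly correlated, yet the systematic one-point difference lowers the absolute agreement ICC below 1.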

The inter-item correlations and ICC for all items of each domain in both questionnaires were >0.3, indicating good internal reliability of the dimensions and implying a highly significant relationship between the questionnaires' domains.
The final versions of PCOSQoL-42 and PCOSQoL-47 in Arabic are available as an appendix to our previous work [6].
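Cronbach's alpha and the mean inter-item correlation can be computed as in the sketch below. It uses the standard variance-based formula for alpha; the item responses and function names are hypothetical and do not come from the study data.

```python
import math
from itertools import combinations
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item,
    all over the same respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson's correlation between two equal-length score lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_inter_item_correlation(items):
    """Average Pearson correlation over all pairs of items."""
    return mean(pearson_r(a, b) for a, b in combinations(items, 2))


# HYPOTHETICAL Likert responses (1-5) of five respondents to three items.
domain_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]

alpha = cronbach_alpha(domain_items)
mean_r = mean_inter_item_correlation(domain_items)
acceptable = alpha >= 0.7 and mean_r > 0.3  # thresholds quoted in the text
```

The two thresholds in the last line are those mentioned in the text (an alpha of 0.7 and inter-item correlations above 0.3); other cut-offs are used elsewhere.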

Discussion
Both the five-domain PCOSQoL-42 and PCOSQoL-47 showed construct validity similar to that of the six-domain PCOSQ-50 when measuring the ICC values [14]. However, we used the WHOQOL-BREF as a comparator [7], while the PCOSQ-50 developers used the Short-Form 36 (SF-36) [13]. Both the WHOQOL-BREF and SF-36 are reliable and internationally validated measures of QoL in general [7,15]. This difference might change the corresponding domains and items between our scales and the WHOQOL-BREF compared with those between the PCOSQ-50 and the SF-36.
All domains in our scales showed significant correlations with the corresponding domains of the WHOQOL-BREF. Four domains from each scale showed a moderate, significant correlation; the coping domain of PCOSQoL-42 showed a strong correlation, and the psychological and emotional status domain of PCOSQoL-47 showed a weak correlation that was nevertheless highly significant at a two-tailed significance level of <0.001 (Table 6).
The internal consistency of both questionnaires, measured by Cronbach's alpha (Table 7), showed that only the coping domain of PCOSQoL-42 exceeded 0.7. The other domains had alpha values ranging from 0.423 to 0.697, which could be considered a minor limitation. Yet other parameters, like the mean inter-item correlation and the ICC, were acceptable and highly significant. The WHOQOL-BREF has insufficient sensitivity in measuring the impact of PCOS symptoms because of the heterogeneity of the PCOS symptomatology [12]. This worked in favor of our study, in that the domains that did not achieve the requested alpha value of 0.7 still showed high validity. It also necessitates comparison with other, more sensitive scales in subsequent multicenter nationwide studies to obtain better validity.
The sample size for the construct validity evaluation could be a strength and a limitation at the same time. Although there are no absolute rules for the sample size needed to validate a questionnaire [16], we aimed for a sample size of 500 respondents for each questionnaire to meet the "very good" sample size proposed by Comrey et al. [17] and to achieve a respondent-to-item ratio of more than 10:1. However, we achieved a ratio of only 8.6:1, which was acceptable [16]. This was caused by the 35% drop in the number of attendees to the endocrine centers due to the SARS-CoV-2 pandemic, lockdown, and social distancing. Referral bias might be a limitation because the study was conducted at two tertiary endocrine centers. We used simple Arabic language with minimal use of ambiguous and medical terms to make the items more acceptable to women from both groups. We also avoided words that might have different meanings and confined ourselves to words with a single meaning.

Conclusions
The individualized PCOSQoL-47 and PCOSQoL-42 for married and unmarried women with PCOS, respectively, represent the first reliable and valid tools for assessing the HRQoL in women with PCOS who use Arabic as a first or native language. Validation of PCOSQoL-47 and PCOSQoL-42 in other local languages, like Kurdish, will be undertaken following this study to cover a more diverse population.
The questionnaires' longitudinal and discriminant validity also need to be tested during treatment trials to assess their sensitivity to the effect of interventions and to change in our local population. This will enable us to use the questionnaires to measure the treatment effect on HRQoL and to compare the effects of different interventions on HRQoL.

Additional Information Disclosures
Human subjects: Consent was obtained or waived by all participants in this study. Faiha Specialized Diabetes Endocrine and Metabolism issued approval 23H/E/FDEMC. All phases of this study, entitled "Development of the First Health-Related Quality of Life Questionnaires in Arabic for Women with Polycystic Ovary Syndrome (Part II): Dual-Center Validation of PCOSQoL-47 and PCOSQoL-42 Questionnaires", were in accordance with FDEMC ethical committee standards. The approval followed the 1964 Declaration of Helsinki and its later amendments. The approval number of the study is 23H/E/FDEMC. Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.