Journal Retraction Rates and Citation Metrics: An Ouroboric Association?

Introduction: Retraction of published papers has a far-reaching impact on the scientific world, especially if the retracted papers were published in high-impact journals. Although it has been noted that the retraction rates of journals correlate with their citation metrics, no conclusive data were available for most clinical specialties. In this study, we determined the retraction rate for anesthesia and for two comparison groups (neurosurgery and high-impact clinical journals), and then studied the correlation of the retraction rate with citation metrics.

Methods: We generated a list of all anesthesia journals that were indexed in the National Library of Medicine database. We obtained the number of papers published in each journal as well as the number of papers retracted from each. We also collated the Impact Factor® and H-index of each journal. The same methodology was followed for neurosurgery and high-impact clinical journals. We then studied the correlations between the retraction rate and the citation metrics of each journal.

Results: The median retraction index was 2.59 for anesthesiology, 0.66 for neurosurgery, and 0.75 for the high-impact clinical journals group. The retraction rate did not correlate with the citation metrics. However, the number of papers published in each journal and the absolute number of retractions showed positive correlations with the citation metrics. The H-index showed stronger correlations with these parameters than the Impact Factor.

Conclusions: The number of retractions increased in proportion to both the number of papers published in a journal and the citation metrics of that journal.


Introduction
Retraction of published papers is an inevitable part of the scientific process and reflects the self-correcting nature of science. However, as the volume of scientific literature increases exponentially, the number of retractions has also been increasing [1]. The increasing number of retractions could reflect better engagement of the scientific community with the process of post-publication review. However, it could also be due to an increased rate of malpractice, owing to the "publish or perish" pressure that is now ubiquitous in science. Startling examples of malpractice that led to immediate real-world harm were seen during the ongoing COVID pandemic. A particularly egregious instance is that of two papers by the same lead authors, published in the Lancet and the New England Journal of Medicine (NEJM), respectively [2,3]. Both of these papers appear to have been based on completely fraudulent data [4,5]. The effect of these papers was to temporarily halt the hydroxychloroquine arm of the multi-national SOLIDARITY trial being run by the World Health Organization, which was then restarted once the papers were retracted. The COVID pandemic has led to a rush to publish any science related to the novel coronavirus. This has often led to compromised and inadequate pre-publication peer review [6]. The number of COVID-related papers available on PubMed Central (as of November 6, 2020) was 74993; 38 of these papers have been retracted to date, yielding a retraction rate of 0.05% [7]. This is much higher than the rate of retractions for the life sciences as a whole, which was 0.01% [8]. The consequences of the dissemination of false information via compromised papers are all too well known. The effects of the (now discredited) paper published by Wakefield et al. are still felt the world over, in the form of a burgeoning anti-vaccine movement [9,10].
The exact impact of scientific papers is difficult to quantify. Citation metrics are imperfect markers for assessing the impact of journals or published papers on science and the community. However, it cannot be gainsaid that the papers published in highly cited journals are more widely disseminated than those published in lower cited ("low impact") journals. Several studies have found that the retraction rates for journals correlate with the Impact Factor® (ImpFac), and thus, rates of retraction are higher for more highly cited journals [11,12]. This would imply that higher cited journals publish more compromised papers that eventually get retracted, but are also more widely disseminated prior to retraction. However, this association has not been reported consistently, and some studies have found either no association or a negative association between the retraction rates in a scientific discipline and the ImpFac of journals in that discipline [13,14].
In this study, we examined if the number and rates of retraction (represented by the "retraction index") correlated with the citation metrics (ImpFac and H-index) of journals in the field of anesthesiology. In order to validate the results, we also generated two comparison groups and ran the same analyses across the comparison groups.

Methods

Anesthesia
We first identified all PubMed indexed anesthesia journals from the National Library of Medicine database [15]. The database was searched using various search strings (anesthesia, anesthesiology, etc.). The results from the searches using these strings were collated and duplicates eliminated, to generate a list of all indexed journals for anesthesia as a broad specialty.
We then searched the Retraction Watch database using the name of each journal from the preceding list, to identify the number of retractions (R) from each journal over the past 10 years (2010-2020, both inclusive) [16]. The R value for each journal was also cross-checked by searching PubMed using the name of the respective journal as the search string, with the retraction filters applied [17] (for instance, this was the strategy used to identify R for the Journal of Anesthesia). The retraction index (RI) for a journal was computed using the following formula [12]:

RI = (R / N) × 1000

where N is the number of indexed papers published by the journal over the same period. Thus, the RI denotes the number of papers retracted for every 1000 papers published by the journal over the defined time period. The RI was computed for each anesthesia journal, and the median RI for anesthesia as a specialty (and anesthesia journals as a group) was then computed.
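The RI calculation can be sketched as follows; the function name is illustrative, and the example uses the anesthesia-group totals reported later in the paper (R = 334, N = 70286).

```python
def retraction_index(retractions: int, papers_published: int) -> float:
    """Retractions per 1000 papers published over the study period."""
    return retractions / papers_published * 1000

# Anesthesia-group totals from the Results: 334 retractions, 70286 papers
print(round(retraction_index(334, 70286), 2))  # 4.75
```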
It should be noted that the preceding methodology studies retractions of anesthesia papers published in "pure" anesthesia journals. Anesthesia papers published in other specialty journals or in general medical journals (such as the NEJM) would not be included in this dataset, since the retraction rates for those journals would be driven by other specialties.
The ImpFac of each journal was obtained from the Journal Citation Reports (JCR) 2020 list (Clarivate Analytics). For those journals that were not included in the JCR list, the citations-per-document (over a two-year period) statistic was obtained from the Scimago site [18]. This statistic is computed using a methodology that is nearly congruent to the ImpFac calculation process. The H-index for each journal was obtained from the Scimago website.
Spearman's rank-order correlation was used to evaluate for any association between the RI and the ImpFac or H-index. Similar correlation analyses were also carried out between the raw number of retracted papers (R) and the ImpFac and H-index. All analyses were carried out on Stata (v14, StataCorp, College Station, TX) and Microsoft Excel (v16.16).
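The study ran these analyses in Stata; purely as an illustration, a minimal pure-Python sketch of Spearman's rank-order correlation (ranks computed with tie-averaging, then Pearson correlation of the ranks) is shown below. The per-journal values are invented for illustration and are not the study dataset.

```python
def ranks(xs):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1  # extend over a run of tied values
        avg = (i + j) / 2 + 1  # average of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical journals: retraction index (RI) vs. H-index
ri      = [17.7, 4.3, 2.6, 1.1, 0.9, 0.0]
h_index = [48, 95, 120, 60, 150, 30]
print(round(spearman_rho(ri, h_index), 2))
```

In practice a statistics package (Stata's `spearman`, or `scipy.stats.spearmanr` in Python) would also return the p-value for the null hypothesis of no association.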

Comparison groups
In order to validate the methodology described above, we generated two comparison groups. For the first comparator group, we elected to study retractions across neurosurgery journals. The same methodology as employed for anesthesia was used to generate the list of neurosurgery journals and to determine R, N, and RI values for each neurosurgery journal.
The second comparator group was generated from the JCR 2020 list. We selected all journals pertaining to clinical disciplines from among the top 200 journals in the JCR. Journals that were common between the anesthesia and neurosurgery groups and this group were preferentially included in the respective specialty group. Journals that exclusively published reviews were not included. Once the list of the highest impact clinical journals was parsed from the top 200 journals on the JCR, the subsequent methodology to identify the R, N and RI was the same as previously described.

Results
Thirty-two anesthesia journals, 23 neurosurgery journals, and 41 high ImpFac clinical journals (HICJs) from the JCR 2020 were finally included in the analysis.

Anesthesia journals
The cumulative number of retractions over the past 10 years across the 32 anesthesia journals included in the analysis (R) was 334 (Table 1). The number of indexed papers published across these journals over the same period (N) was 70286. The median retraction index for anesthesia journals for the period 2010-2020 was 2.59 (range: 0-17.69); the overall RI for anesthesia was 4.75.

TABLE 1: Anesthesia journals included in the analysis and their attributes
The journals are arranged in descending order of their retraction indices (RI). R = number of retracted papers; N = the total number of indexed papers published in these journals over the past 10 years; RI = retraction index.
The number of papers retracted from anesthesia journals over the period 2010-2020 (R) correlated positively with the number of papers published during the same period (N) (Table 2). The RI did not correlate with the ImpFac; however, both R and N correlated positively with the ImpFac (Figure 1). Similar results were obtained when the H-index was used as the citation metric: the RI did not correlate with the H-index, but both R and N correlated positively with the H-index (Table 2).

Comparator group: neurosurgery
The R for the period 2010-2020 across the 23 neurosurgery journals was 57 and N was 71939 (Table 3). The median RI for neurosurgery journals was 0.66 (range: 0.2-4.3) and the overall RI was 0.79. Neither the ImpFac nor the H-index showed a significant correlation with the RI. A positive correlation was noted between R and the ImpFac, but not between the ImpFac and N. The H-index showed a positive correlation with both R and N (Table 2).

TABLE 3: Neurosurgery journals included in the analysis and their attributes
The journals are arranged in descending order of their retraction indices (RI). R = number of retracted papers; N = the total number of indexed papers published in these journals over the past 10 years; RI = retraction index.
The H-index and the ImpFac correlated equally strongly with R (z=-0.018, p=0.493). However, the H-index correlated much more strongly than the ImpFac with the number of papers published over the 10-year period studied (N) (z=-3.345, p<0.0001).

Comparator group: high-impact factor clinical journals
The R for the HICJ group was 192 papers and N was 233632 (Table 4). The median RI for the HICJ group was 0.75 (range: 0-3.05) and the overall RI was 0.82. R correlated strongly with the total number of indexed papers published in the HICJs (N) (Table 2). There was no correlation between the RI and the ImpFac, but both R and N correlated positively with the ImpFac. The H-index correlated strongly with the RI, R, and N (Table 2).

TABLE 4: High-impact clinical journals included in the analysis and their attributes
The journals are arranged in descending order of their retraction indices (RI). R = number of retracted papers; N = the total number of indexed papers published in these journals over the past 10 years; RI = retraction index.
The H-index correlated more strongly with R than the ImpFac (z=-3.095, p=0.001). The H-index also correlated more strongly with N than the ImpFac did (z=-3.583, p<0.0001).

Multivariate analysis: number of retractions and retraction index
On multivariable regression, none of the evaluated variables (N, ImpFac, H-index) predicted the RI or R for the anesthesia group. For neurosurgery, the ImpFac predicted both the RI and R, whereas for the HICJ group, the H-index predicted both.

Discussion

Retractions in anesthesia
The rate of retractions (RI) varies by scientific discipline. Based on the available literature, the RI is 1.5 for genetics, 0.29 for nursing, and 0.1 for the life sciences as a whole [8,14,19]. In the present study, the median RI for the period 2010-2020 (inclusive) was 2.59 for anesthesiology, 0.66 for neurosurgery, and 0.75 for the HICJ group.
Thus, anesthesia appears to have a higher retraction rate than the neurosurgery and HICJ groups; the RI for anesthesia is also higher than that for the life sciences as a whole. Most retractions in anesthesia appear to be due to misconduct by authors, especially data fabrication and plagiarism [20,21]. These reasons for retraction are nearly congruent to those in other specialties [22-24].
However, there is one phenomenon that largely determines the retraction milieu for anesthesia journals. Retractions in anesthesia have been largely driven by a few researchers and groups, unlike in other disciplines. An analysis in 2018 found that 313 papers that were eligible for retraction (278 already retracted and 35 not yet retracted) could be attributed to just three anesthesia researchers/practitioners [25]. Furthermore, when retractions in the corpus of scientific literature as a whole were considered, the top two authors with the highest number of individual retractions were both from the field of anesthesiology [26]. This phenomenon of individuals driving the bulk of retractions in anesthesia skews the retraction index for the specialty.
The journal with the highest rate of retractions in anesthesia was the European Journal of Anesthesiology (EJA), with an RI of 17.69 (Table 1). Of the 32 retractions from the EJA, eight could be linked to Boldt and 12 to Fujii. If the retractions attributable to these two authors were to be eliminated from the calculation, the RI for EJA would be 6.7 per 1000 papers. Another striking example of a few authors driving the retraction rates of anesthesia journals is that of the Canadian Journal of Anesthesia (CJA). Thirty-four of 39 papers retracted from the CJA were authored by Fujii and the remaining five by Boldt. The current RI for the CJA is 17.38 per 1000 papers published. Had the systematic fraud perpetrated by Fujii, Boldt and Scott Reuben not occurred, the RI for the CJA would be 0.
Thus, when the high rate of retractions in anesthesia is discussed, it is necessary to always include this caveat, so as to convey the true impact of systematic fraud on medical science. Sadly, the kind of systematic fraud perpetrated in anesthesia now appears to have been detected in the field of obstetrics and gynecology as well, and an investigation is currently ongoing [27]. This large set of fraudulent papers would inevitably alter the retraction milieu for obstetrics and gynecology. While it cannot be denied that post-publication scrutiny was responsible for detecting these instances of fraud, these examples also underscore the importance of rigorous pre-publication peer review, so as to prevent the dissemination of fraudulent science and the erosion of public faith in the scientific process. Increasing the number of peer reviewers and assistant editors, so as to reduce the workload on each, could be one measure towards this end.

Retractions in highly-vs lower-cited journals
The issue of retractions from highly cited journals is an important one. It has been widely noted that highly cited journals retract more papers than lower cited ones. The reasons for this could be several. First, the papers published in high-impact journals are read more widely and are likely scrutinized more extensively. Second, the data published in high-impact journals are more likely to prompt a spate of replication studies; thus, results that have poor replicability or are fraudulent are more consistently detected. Third, at least part of the blame must be ascribed to the journals themselves: the tendency of high-impact journals to almost exclusively publish strongly positive studies could act as an inducement for data fabrication. Publication in such high-impact journals is likely to lead to significant advancement of a researcher's career; this could be another possible reason to publish compromised data in these journals [12]. The positive association between the rate of retractions and the citation metrics is disturbing, since fraudulent or compromised data published in highly cited journals are widely disseminated and continue to be cited even after retraction [28].
In this study, R (but not the RI) showed a positive correlation with the citation metrics, implying that the higher impact journals retracted more papers than the lower impact ones. However, there is another important nuance to be considered. Since the higher visibility and scrutiny of the articles published in high-impact journals lead to more retractions (vide supra), the gap between the number of retracted papers and the number of retractable papers is very small in high-impact journals. This gap (retractable articles − retracted articles) is much larger in lower impact journals [11]. Moreover, visibility of an article also correlates with greater scrutiny and earlier retraction [11]. Thus, a higher rate of retractions in high-impact journals is, in all likelihood, the result of better post-publication review. The returns of publishing in highly cited journals could act as an inducement to publish papers with strongly positive results, often based on compromised data. However, papers published in high-impact journals are highly cited and, ipso facto, are also scrutinized more than their counterparts in lower cited journals. This increased scrutiny possibly results in a higher number of retractions from the more highly cited journals, thus leading to an ouroboric relationship between the citation metrics and the number of retractions from journals.
All journals should therefore consider implementing measures to improve the visibility of the published papers so as to improve the process of post-publication review, such as automatic open access to papers after a specified lock-in period. Shifting the emphasis of journals from only publishing studies with positive results (and vanishingly small p values), towards a focus on the strength of the study protocols, rates of protocol adherence and rates of follow-up (for clinical studies), could, in itself, act as an incentive to publish honest data. Extensive plagiarism and reference checks, and mandatory data deposition are steps that could aid in the detection of fraud prior to publication rather than in a post hoc fashion [29].

Rate and number of retractions vs citation metrics
In this study, we found that the rate of retraction (as described by the RI) did not correlate with either the ImpFac or the H-index, but the absolute number of retractions (R) and the number of indexed papers published (N) over the past 10 years correlated positively with both the H-index and the ImpFac (Table 2). Thus, unsurprisingly, as a journal (and specialty) published more papers, more retractions were to be expected therein. Although it is widely held that the ImpFac is a better indicator of the impact of a journal, whereas the H-index is more useful for studying the productivity and impact of an individual researcher, we found stronger correlations between the H-index and both R and N in this study [30]. Publishing a higher number of papers while maintaining a high citation rate would lead to a high H-index for a journal; thus, the H-index is probably a better measure of the impact of a journal than the ImpFac.
The factors that predicted the rate of retraction were different across the different groups in this study. For instance, in anesthesia, none of the variables evaluated (N, ImpFac, H-index) were able to predict the RI. In neurosurgery, the ImpFac predicted the RI whereas for the HICJs, the H-index predicted the RI. Similar discrepancies were noted across groups when multivariable regression was performed to study the possible predictors of the absolute number of retractions (R). In anesthesia, none of the studied variables predicted R. In neurosurgery, ImpFac predicted R whereas for the HICJs, H-index predicted R. These discrepancies could reflect differing publication dynamics across specialties. They could also result from the differing types of papers published across these three groups and the proportions thereof.

Conclusions
The number of papers published in a journal over a defined period of time increased in proportion to the citation metrics of the journal. As the number of papers published in a journal increased, the number of retractions therefrom also increased. The number of retractions correlated strongly with the citation metrics; more papers were retracted from highly cited journals. The H-index correlated more strongly with the number of retractions than the Impact Factor did. Besides the citation metrics and the number of papers published, there are likely other predictors of the expected baseline number of retractions, which could serve as a process indicator of the pre-publication review process.

Additional Information

Disclosures
Human subjects: All authors have confirmed that this study did not involve human participants or tissue.
Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.