Analysis of Practices to Promote Reproducibility and Transparency in Anaesthesiology Research: Are Important Aspects “Hidden Behind the Drapes?”

Introduction: Reliable, high-quality research is essential to the field of anaesthesiology. Reproducibility and transparency have been investigated in the biomedical and social sciences, with publications in both areas failing to provide the information necessary to reproduce study findings. In this study, we investigated 14 indicators of reproducibility in anaesthesiology research. Methods: We used the National Library of Medicine (NLM) catalogue to search for all anaesthesiology journals that are MEDLINE indexed and provided English texts. PubMed was searched with the list of journals to identify all publications from January 1, 2014 to December 31, 2018. We randomly sampled 300 publications that fit the inclusion criteria for our analysis. Data extraction was then conducted in a blinded, duplicate fashion using a pilot-tested Google form. Results: The PubMed search of these journals identified 171,441 publications, with 28,310 being within the time frame. From the 300 publications sampled, 296 full-text publications were accessible. Most of the studies did not include materials or protocol availability statements. The majority of publications did not provide a data analysis script statement (121/122, 99% [98% to 100%]) or a preregistration statement (94/122, 77% [72% to 81%]). Conclusion: Anaesthesiology research needs to drastically improve indicators of reproducibility and transparency. By making research publicly available and improving accessibility to detailed study components, primary research can be reproduced in subsequent studies and help contribute to the development of new practice guidelines.


Introduction:
Reliable, high-quality research is essential to the field of anaesthesiology. New research establishes the evidence base for clinical practice guidelines, modifies established protocols, updates the standard of care, affects reimbursement (with possibly significant financial implications), and informs clinical practice for anaesthesiologists. Consider the use of epidural glucocorticoid injections for spinal stenosis as an example. Now controversial, previous guidelines based upon uncontrolled trials recommended epidural steroid injections to treat spinal stenosis pain. [1][2][3][4][5][6][7][8][9][10][11] These studies led to a 271% increase from 1991 to 2011 in physician use of epidural injections to treat spinal stenosis pain and a considerable cost increase from $24 million to $174 million. 12 Friedly and colleagues 12 re-examined these recommendations and conducted a large, randomized, controlled trial that found no statistically significant difference at six weeks for epidural glucocorticoid plus local anaesthetic injection versus epidural local anaesthetic injection only. Because research has direct implications for patient care and healthcare costs, credible science should catalyse change and must be supported by reliable evidence.
The process of peer reviewing, analysing, critiquing, and, eventually, reproducing trials is the cornerstone for creating high-quality, reliable, transparent, reproducible, evidence-based publications. 13 In fact, reproducibility and transparency are core scientific principles. However, published reports may provide only limited summaries of a research study, and these reports often fail to include the key study components (raw data, detailed protocols, materials, and analysis scripts) that provide more comprehensive study details. Access to this additional information enables further analysis and verification of the conclusions from the original research. 14 When researchers strive for transparency and allow primary research to be reproducible, we will see improved efficiency, 15 self-correction, 16 and credible published literature. 17 Because of the vital importance of accurate research and its direct influence on patient care, publishers of high-impact journals have initiated guidelines to help improve the reproducibility and transparency of research. For example, the British Journal of Anaesthesia and Anesthesia & Analgesia provide statements in their authorship guidelines encouraging raw data to be available to readers; however, raw data are not required to be submitted for public viewing. 18,19 Access to raw data is encouraged by these journals for statistical reproducibility, 20 additional analysis, 21 participant-level meta-analyses, 22 and the merging of future or existing datasets. 23
Reproducibility and transparency have been assessed in biomedical and social sciences; however, practices that promote reproducibility and transparency have not, to our knowledge, been evaluated in anaesthesiology research. In this study, we queried indicators of reproducibility to assess the current climate of anaesthesiology research. Results from this investigation may be used to establish a baseline for comparison in future studies.

Methods:
This is an observational, cross-sectional study design. We used the methodology by Hardwicke and colleagues 24 with modifications. This study did not involve human participants and was not subject to oversight by an institutional review board per the United States Code of Federal Regulations. 25 We report our study in accordance with guidelines for meta-epidemiological methodology research. 26 We uploaded our protocol, data extraction form, and other necessary materials for public viewing on the Open Science Framework (https://osf.io/n4yh5/).

Journal Selection
We used the National Library of Medicine (NLM) catalogue to search for all relevant journals using the subject terms tag Anesthesiology [ST]. This search was performed on May 29, 2019. The inclusion criteria required that journals provided full-text publications in English and were MEDLINE indexed. The list of journals in the NLM catalogue fitting the inclusion criteria was then extracted using the electronic International Standard Serial Number (ISSN) or the linking ISSN when the electronic ISSN was unavailable. This series of ISSNs was then used in a PubMed search to identify all publications within these journals. We limited the sample to publications from January 1, 2014 to December 31, 2018, then randomly sampled 300 publications that fit the inclusion criteria for our analysis (https://osf.io/7sk9m/).
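The selection and sampling steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure recorded in the study protocol: the function names, the seed value, and the placeholder ISSNs and PMIDs are hypothetical, and the `[Journal]` and `[dp]` field tags are assumptions about PubMed search syntax that should be checked against the PubMed help pages.

```python
import random


def build_pubmed_query(issns, start="2014/01/01", end="2018/12/31"):
    """Combine ISSNs and a publication-date range into one search string.

    The [Journal] and [dp] field tags are assumed here; verify them against
    the PubMed documentation before relying on this query.
    """
    journals = " OR ".join(f'"{issn}"[Journal]' for issn in issns)
    return f'({journals}) AND ("{start}"[dp] : "{end}"[dp])'


def sample_publications(pmids, k=300, seed=2019):
    """Draw k records without replacement; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    # Sort and de-duplicate first so the draw does not depend on export order.
    return rng.sample(sorted(set(pmids)), k)


if __name__ == "__main__":
    # Placeholder identifiers standing in for the records within the time frame.
    pmids = [str(30000000 + i) for i in range(28310)]
    sample = sample_publications(pmids)
    print(len(sample))
    print(build_pubmed_query(["0000-0000", "1111-1111"]))
```

Recording the seed alongside the exported record list would let a reader regenerate the same 300-publication sample.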

Data Extraction Training
The two investigators responsible for data extraction (OO and DR) underwent a full day of training to ensure inter-rater reliability. The training included an in-person session that reviewed the project study design, protocol, Google extraction form, and examples of where information may be contained using two sample publications. The investigators were then given three example publications from which to extract data in a blinded fashion. Following data extraction, the pair reconciled differences between them by discussion. This training session was recorded and listed online for reference (https://osf.io/tf7nw/). As a final training exercise, investigators extracted data from the first 10 publications of their sample. The investigators held a meeting to reconcile any differences in the data before extracting data from the remaining 290 publications.

Data Extraction
Data extraction on the remaining 290 publications was then conducted in a duplicate, blinded fashion. A final consensus meeting was held with both investigators to resolve disagreements. A third investigator (DT) was available for adjudication but was not needed. We extracted data using a pilot-tested Google form based on the one provided by Hardwicke and colleagues with additions. 24 This form queries information necessary for reproducibility, such as the availability of materials, data, protocols, or analysis scripts (https://osf.io/3nfa5/). The data extracted varied based on the study design, with studies having no empirical data being excluded (e.g., editorials, commentaries [without reanalysis], simulations, news, reviews, and poems). In our Google form, we included the five-year and most-recent-year impact factor, when available. We also expanded the study design options to include cohort studies, case series, secondary analyses, chart reviews, and cross-sectional studies. Finally, we expanded the funding options from public, private, or mixed into the more specific categories of university, hospital, public, private/industry, non-profit, or mixed.
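Duplicate, blinded extraction followed by a consensus meeting amounts to comparing two sets of form responses field by field and listing the mismatches for discussion. The sketch below illustrates that comparison under stated assumptions: each investigator's responses are held as a dictionary keyed by publication ID, and the field names (`data_statement`, `preregistered`) are hypothetical rather than the actual Google form items.

```python
def find_disagreements(coder_a, coder_b):
    """Return (publication_id, field, value_a, value_b) for every mismatch.

    Each argument maps a publication ID to a dict of form answers.
    A field answered by only one coder counts as a mismatch.
    """
    mismatches = []
    for pub_id in sorted(set(coder_a) | set(coder_b)):
        answers_a = coder_a.get(pub_id, {})
        answers_b = coder_b.get(pub_id, {})
        for field in sorted(set(answers_a) | set(answers_b)):
            a, b = answers_a.get(field), answers_b.get(field)
            if a != b:
                mismatches.append((pub_id, field, a, b))
    return mismatches


def percent_agreement(coder_a, coder_b):
    """Share of compared fields on which the two coders gave the same answer."""
    total = sum(
        len(set(coder_a.get(p, {})) | set(coder_b.get(p, {})))
        for p in set(coder_a) | set(coder_b)
    )
    if total == 0:
        return 1.0
    return 1 - len(find_disagreements(coder_a, coder_b)) / total


if __name__ == "__main__":
    # Hypothetical responses for two publications.
    coder_a = {"pub1": {"data_statement": "no", "preregistered": "no"},
               "pub2": {"data_statement": "yes", "preregistered": "no"}}
    coder_b = {"pub1": {"data_statement": "no", "preregistered": "yes"},
               "pub2": {"data_statement": "yes", "preregistered": "no"}}
    for row in find_disagreements(coder_a, coder_b):
        print(row)
    print(percent_agreement(coder_a, coder_b))
```

Raw agreement is used here for simplicity; a chance-corrected statistic such as Cohen's kappa could be substituted if inter-rater reliability is to be reported.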

Evaluation of Open Access Status
We evaluated all 300 publications to see if they were freely available online through open access. We searched the Open Access Button (openaccessbutton.org) with publication titles and DOI numbers. This database actively searches for the full-text online. If the Open Access Button was unable to find the publication, authors (OO and DR) then searched Google and PubMed to determine if the full-text was available on the journal website.

Evaluation of Replication and Whether Publications Were Included in Research Synthesis
For empirical studies, excluding meta-analysis and commentary with analysis, we searched the Web of Science to determine if the publication was cited in a replication study, meta-analysis, or systematic review. The Web of Science additionally lists information important for our study, such as the country of journal publication, five-year impact factor (when available), and most recent impact factor with the year it represents.

Statistical Analysis
We report descriptive statistics for each of our findings with 95% confidence intervals (95% CI) using analysis functions within Microsoft Excel.
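For readers who want to recompute a proportion and its 95% confidence interval outside Excel, the sketch below shows two common interval formulas, the Wald (normal approximation) interval and the Wilson score interval. The text does not state which Excel functions or interval method were used, so neither formula should be read as the method behind the reported values, and the output of this sketch need not match them. The function names are illustrative.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Normal-approximation interval, clipped to the range [0, 1]."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval, which behaves better near 0% and 100%."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Example counts taken from the data availability result (105 of 122).
    for name, fn in (("Wald", wald_ci), ("Wilson", wilson_ci)):
        low, high = fn(105, 122)
        print(f"{name}: {105 / 122:.1%} [{low:.1%} to {high:.1%}]")
```

The Wilson interval is generally preferred for proportions close to 0 or 1, where the Wald interval can be too narrow or extend past the valid range.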

Results:

Publication Characteristics and Availability
Our search of the NLM catalogue identified 86 anaesthesiology journals, but only 36 fit the inclusion criteria. The PubMed search of these journals identified 171,441 publications, with 28,310 being within the time frame. From the 300 publications sampled, 296 full-text publications were obtained (296/300, 98% [95% CI: 97% to 99%]), while four only provided the abstract or could not be accessed (4/300, 1% [95% CI: 0% to 3%]). Of the 300 publications sampled, 53% (160/300) were publicly accessible. The remaining 47% (140/300) were behind a paywall and could be accessed only through academic library subscriptions (Table 1). The journals represented in the sample had varying five-year impact factors, with a median of 2.902 [interquartile range: 1.8-4.0]. For 32 publications, the journal impact factor was unavailable. Other sample characteristics are shown in Table 1 and Supplemental Table 1.

Replication Criteria
The presence of several reproducibility criteria was analyzed, including: publication availability, conflict of interest statement, funding statement, protocol availability, raw data availability, materials availability statement, preregistration statement, and analysis script availability (Table 2). Of the 154 publications containing empirical data, 107 were assessed for a materials availability statement. Meta-analyses, case studies, case series, and commentaries with analysis were excluded from this evaluation (Figure 1). Most of the studies containing data did not include materials availability statements or protocol availability statements (104/107, 97% [95% CI: 95% to 99%]). The availability of raw data, analysis scripts, study protocol, and preregistration was assessed in 122 studies. Case studies and case series were excluded from this evaluation (Figure 1). Most of these studies did not provide a data availability statement (105/122, 86% [95% CI: 82% to 90%]). In the studies that had accessible data, only 8% included all the raw data to reproduce the study findings (1/13 [95% CI: 5% to 11%]). The majority of publications did not provide a data analysis script statement (121/122, 99% [95% CI: 98% to 100%]). Similar to analysis scripts, a majority of the publications did not contain a preregistration statement (94/122, 77% [95% CI: 72% to 81%]). Additional information is available in Table 3 and Supplemental Table 1.

Replication and Evidence Synthesis
The publications were analyzed for their number of citations in replication studies or systematic reviews/meta-analyses. Of the 154 publications containing empirical data, 139 studies were included in this analysis; meta-analyses, systematic reviews, and commentaries with analysis were excluded (Table 2). None of the 139 studies were cited by a replication study. Similarly, most of the publications were not cited by a systematic review or meta-analysis (122/139, 88% [95% CI: 84% to 91%]; Supplemental Table 1).

Conflict of Interest and Funding Statements
The presence of these statements was analyzed for the 296 accessible publications (Table 3).

Discussion:
To our knowledge, this is the first study attempting to objectively quantify specific indicators of reproducibility and transparency in the field of anaesthesiology. Our results show disregard for reproducibility and transparency in currently published anaesthesiology research. The majority of the publications in our sample failed to make key study components available. Materials and protocols were not routinely accessible, many authors did not provide raw data, only one publication provided an analysis script, and the majority were not preregistered. There were no published replication attempts in our sample of publications. Of the indicators we investigated, conflict of interest disclosures and funding statements were the only ones included by a majority of the researchers; however, there is still room for improvement.
Publications failing to provide key study components can have unintended consequences when others attempt to replicate the research or when it is included in a meta-analysis or systematic review. Seitz et al. conducted a systematic review on exposure to general anaesthesia and risk of developing Alzheimer's disease. When pooling the primary studies, only a single study specified the time duration between exposure to general anaesthesia and assessment of dementia. This lack of reporting prevented the authors from calculating a pooled effect estimate for this important outcome. 27 Had better reporting been performed by the primary study authors, this analysis would have been possible.
The lack of publicly available protocols, materials, and data in anaesthesiology literature is concerning. Access to these study components allows for independent verification of results and for confirming that researchers actually did what they planned to do. For example, comparing study protocols or preregistrations with published reports allows for the evaluation of outcome switching or selective reporting bias. This bias occurs when study authors add, remove, promote, or demote outcomes in a study based on whether these outcomes were statistically significant. This form of bias can be problematic, leading to misinterpretations of clinical trials. Several studies have examined the pervasiveness of this problem in the medical literature. 28,29 P-hacking, the practice of running multiple statistical analyses until statistical significance is achieved, is another significant problem that can be mitigated if statistical analysis plans are transparent and available for evaluation. HARKing (Hypothesizing After Results are Known) occurs when post-hoc findings are presented in published study reports as if they were a priori, hypothesis-driven analyses. Thus, all three forms of research malpractice may be detected if study authors provide all available materials and preregister their studies.

Future for Reproducibility and Transparency
Addressing the reproducibility crisis in science requires an actionable response by multiple stakeholder groups. Below, we outline recommendations being adopted inside and outside of medicine that may be useful to the field of anaesthesiology. Here, we focus on the role of academic journals and funders, although the researchers themselves, peer reviewers, institutional review boards, and others certainly play a role in this improvement.
In a recent article, Adams describes efforts by the British Journal of Anaesthesia to improve study reproducibility by considering reproducibility beyond the methods and results. 30 The journal's editor-in-chief has created a novel approach for arriving at more accurate conclusions: independent reviewers, provided with the submitter's raw data, write discussions and conclusions during the peer review process. The approach attempts to eliminate the original authors' conflicts of interest and allegiance biases, which can alter the interpretation of their results. Although we did not inquire about reproducibility with regard to drawing conclusions from submitted data and methodology, this seems to be an additional measure journals could consider taking to ensure published material is not misconstrued. The journal Anesthesiology uses custom software designed to evaluate a study's adherence to reporting guidelines like the Consolidated Standards of Reporting Trials.
Funders play an important role, too. The NIH and NSF have both developed processes to improve the reproducibility of studies funded by federal tax dollars. 31,32 The Wellcome Trust is a "politically and financially independent" group that funds thousands of researchers internationally. The Trust influences researchers and policy makers to improve the methodological quality of publications. In order to receive funding from this group, authors are expected to include accurate records of the methods, procedures, and approvals so that the findings can be replicated. 33 The Bill and Melinda Gates Foundation is also a significant funding source for researchers. Although its guidelines on manuscript submission are not as rigorous, it does require that all publications it funds be immediately open access. The push for open access aids in the dissemination of new research and findings across the world and can actively change the direction of subsequent research designs. 34 Both of these foundations emphasise important aspects of transparent and reproducible research.
Our study has both strengths and limitations. Concerning its strengths, our study examined a wide range of anaesthesiology publications published across several journals. The random sample of these publications used in this study should improve the generalizability of our findings. We used double data extraction throughout the data collection process. This form of data extraction, which incorporates two coders who are blinded to the decision making of the other, is considered the gold standard by the systematic review community and is advocated by the Cochrane Collaboration. 35 Additionally, we have provided our study protocol, data, and other pertinent materials to improve the reproducibility and transparency of this study. Regarding its limitations, our sample was drawn from publications dated from 2014 to 2018 and is meant to be a general overview rather than a complete analysis of anaesthesiology publications. Our data collection is also limited to publications in the field of anaesthesiology. We recommend investigating reproducibility and transparency in other fields of medicine, as there is often overlap that can contribute to the development of clinical guidelines and protocols. For example, the recent Enhanced Recovery After Surgery (ERAS) protocol developed for Cardiac Surgery published in JAMA Surgery included several randomized controlled trials and meta-analyses that would not necessarily have been found in specific anaesthesiology journals. 36
In conclusion, anaesthesiology research needs to drastically improve with regard to reproducibility and transparency. This finding is consistent with previous studies of biomedical and social science research. We speculate that similar findings would be seen in other fields of medicine; however, we recommend further analysis in order to catalyse change in those fields. Our goal in this study is to offer a foundation for publishers to consider when evaluating the validity of a study and for authors and researchers to consider when developing their primary research projects. By including these indicators in primary research, anaesthesiology publications can become more valid, transparent, and reproducible. By making research easily accessible online and by improving accessibility to the detailed study components (raw data, materials, protocols, and analysis scripts), primary research can be reproduced in subsequent studies and help contribute to the development of new practice guidelines, helping change patient care through evidence-based conclusions.

Conflict of Interest
Questions: Does the article disclose any conflict of interest or state that none exist?
Studies included: All studies (n=296)
Rationale: Transparency can be demonstrated by including a statement of potential conflicts of interest. This provides an opportunity for any potential bias to be disclosed.

Evidence synthesis
Questions: Are there any reported citations of the study by a meta-analysis or systematic review?
Studies included: Empirical studies ¶ (n=154)
Rationale: Inclusion of articles in a meta-analysis and systematic review facilitates the production of new studies.

Protocols
Questions: Is a protocol availability statement included in the article? Which elements of the protocol were included?
Studies included: Empirical studies † (n=122)
Rationale: In order to reproduce a study, a complete and thorough protocol is necessary.

Materials
Questions: Is a material availability statement included in the article? How does the article state the materials are available? Are the materials accessible from the statement provided?
Studies included: Empirical studies ‡ (n=107)
Rationale: Restrictions to the accessibility of materials used in a previous study can negatively impact the validity of a subsequent replication study.

Raw data
Questions: Is a data availability statement included in the article? How does the article state the data are available? Are the data accessible from the statement provided? Does the article provide all the raw data that would be required for replication? Are the names of the data files easily identifiable?
Studies included: Empirical studies † (n=122)
Rationale: Top ranking journals (Nature, The Lancet, Annals of Internal Medicine) are more frequently requiring studies to have data availability.

Analysis scripts
Questions: Does the article provide an analysis script availability statement? How does the article state the analysis scripts are available? Are the data accessible from the statement provided?
Studies included: Empirical studies † (n=122)
Rationale: Analysis scripts are unique sets of instructions that can be used in a replication study to mirror previous data analysis.

Pre-registration
Questions: Is a pre-registration statement included in the article? Which organization was the article registered with? Was the pre-registration accessible? Which elements of the pre-registration were available?
Studies included: Empirical studies † (n=122)
Rationale: Reporting bias like P-hacking and outcome switching can be reduced by pre-registration.

¶ 'Empirical studies' have empirical data and include the following study designs: chart review, secondary analysis, case series, clinical trial, cohort, case-control, meta-analysis, systematic review, commentaries [with data analysis], laboratory, case reports, and cross-sectional. Meta-analysis and systematic review were excluded due to this category not being applicable.
† Case series and case reports were excluded because they lack reproducibility criteria, as was done by Wallach et al. 37
‡ Case series, case reports, meta-analysis, systematic review, or commentaries with analysis were excluded due to this category not being applicable.
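The footnoted exclusions determine which publications count toward the denominator of each indicator. A minimal sketch of that rule follows, covering only the conflict of interest indicator and the indicators marked † and ‡. The design labels are normalised from the footnote wording, and the function and constant names are hypothetical; this is an illustration of the stated exclusions, not the extraction form's actual logic.

```python
# Study designs the footnotes list as containing empirical data.
EMPIRICAL_DESIGNS = {
    "chart review", "secondary analysis", "case series", "clinical trial",
    "cohort", "case-control", "meta-analysis", "systematic review",
    "commentary with analysis", "laboratory", "case report", "cross-sectional",
}
# Excluded from the dagger-marked indicators.
CASE_DESIGNS = {"case series", "case report"}
# Additionally excluded from the materials indicator (double dagger).
SYNTHESIS_DESIGNS = {"meta-analysis", "systematic review",
                     "commentary with analysis"}

DAGGER_INDICATORS = ("protocol", "raw data", "analysis script",
                     "pre-registration")


def indicators_assessed(design):
    """Return the indicators for which a publication of this design is assessed."""
    indicators = ["conflict of interest"]  # assessed for every accessible publication
    if design not in EMPIRICAL_DESIGNS or design in CASE_DESIGNS:
        return indicators
    indicators.extend(DAGGER_INDICATORS)
    if design not in SYNTHESIS_DESIGNS:
        indicators.append("materials")
    return indicators


def denominators(designs):
    """Count, per indicator, how many publications are eligible for assessment."""
    counts = {}
    for design in designs:
        for indicator in indicators_assessed(design):
            counts[indicator] = counts.get(indicator, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical mini-sample of study designs.
    print(denominators(["cohort", "meta-analysis", "case series", "editorial"]))
```

Encoding the exclusions once and deriving every denominator from them helps keep the reported counts (for example n=122 and n=107) consistent across tables.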