Objective: Computational reproducibility is vital to research quality but rarely assessed systematically. This study aims to evaluate whether Computational Reproducibility Review (CRR) during the publication process improves reproducibility compared to standard peer review. We hypothesize that this intervention will improve reproducibility, encourage code sharing, and reduce errors.
Method: This randomized controlled trial enrolls manuscripts submitted to partnering journals that meet inclusion criteria, including open data availability and inferential analysis. Manuscripts are randomized (1:1) to either peer review with CRR (intervention group) or standard peer review without intervention (control group). The CRR involves reproducing the essential statistical results using the shared data (with or without the analysis code). Feedback is provided to authors in the intervention group during peer review. Computational reproducibility in the control group will be assessed only after publication. The primary outcome is the difference between the CRR and control groups in the proportion of manuscripts for which we successfully reproduce the essential statistical results after publication. Secondary outcomes include rates of code sharing, overt errors identified, time required for the review, publication timelines, and categorized reproducibility issues.
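As a minimal sketch (not the preregistered analysis plan), the primary outcome could be summarized as a difference in proportions between the two arms, for example with a Fisher's exact test; the counts below are hypothetical placeholders, not study data.

# Hypothetical illustration of the primary outcome: difference in the
# proportion of manuscripts with successfully reproduced essential results
# between the CRR (intervention) and control arms. Counts are made up.
from scipy.stats import fisher_exact

crr     = {"reproduced": 14, "not_reproduced": 4}   # hypothetical CRR arm
control = {"reproduced": 9,  "not_reproduced": 9}   # hypothetical control arm

p_crr = crr["reproduced"] / (crr["reproduced"] + crr["not_reproduced"])
p_ctl = control["reproduced"] / (control["reproduced"] + control["not_reproduced"])
diff = p_crr - p_ctl  # primary outcome: difference in proportions

table = [[crr["reproduced"], crr["not_reproduced"]],
         [control["reproduced"], control["not_reproduced"]]]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

print(f"Proportion reproduced (CRR): {p_crr:.2f}")
print(f"Proportion reproduced (control): {p_ctl:.2f}")
print(f"Difference in proportions: {diff:.2f}, Fisher exact p = {p_value:.3f}")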
Results: The study protocol was preregistered on the Open Science Framework [1]. Enrollment of manuscripts commenced in January 2025 and is ongoing. As of September 2025, 73 manuscripts submitted to GigaScience, BDJ, or RIO Journal have been screened. Of these, 35 were eligible and 33 have been randomized. Sixteen manuscripts were randomized to the intervention group, of which three were rejected after the first round of review. Two manuscripts from the intervention group have been published; none of the 18 manuscripts randomized to the control group have been published yet. Early findings show that CRRs often require direct communication with authors to clarify or supply missing information or materials. The median time spent on the first round of CRR was 8.75 hours (IQR: 5.79–13.73) per manuscript. The project is expected to run until March 2026.
Conclusion: This study addresses an important gap in the assessment of research quality by testing a practical intervention to improve computational reproducibility. If successful, the results could support wider adoption of CRRs in academic journals, leading to more reliable and trustworthy science.