A Bioinformatician, Computer Scientist, and Geneticist lead bioinformatic tool development - which one is better?

The development of accurate bioinformatic software tools is crucial for the effective analysis of complex biological data. This study examines the relationship between the academic department affiliations of authors and the accuracy of the bioinformatic tools they develop. By analyzing a corpus of previously benchmarked bioinformatic software tools, we mapped bioinformatic tools to the academic fields of the corresponding authors and evaluated tool accuracy by field. Our results suggest that “Medical Informatics” outperforms all other fields in bioinformatic software accuracy, with a mean proportion of wins in accuracy rankings exceeding the null expectation. In contrast, tools developed by authors affiliated with “Bioinformatics” and “Engineering” fields tend to be less accurate. However, after correcting for multiple testing, no result is statistically significant (p > 0.05). Our findings reveal no strong association between academic field and bioinformatic software accuracy. These findings suggest that the development of interdisciplinary software applications can be effectively undertaken by any department with sufficient resources and training.

The development of accurate and scalable bioinformatic tools is critical for interpreting such large datasets, and requires both biological insight and advanced computational skills to build algorithms. The advent of high-throughput technologies has driven the growth of bioinformatics, leading to the establishment of specialized groups within biology, computer science, and engineering faculties, each contributing to the field's expansion [5,7].
The contributions of "domain experts" to bioinformatics from the biological and health sciences, such as genetics and molecular biology, are essential as they ensure software tools are relevant and accurate. However, domain experts may lack the advanced computational expertise needed to develop sophisticated software. In contrast, fields like mathematics, engineering, and computational sciences (referred to here as "development experts") offer expertise in algorithm development, mathematical modeling, statistics, and software engineering, which is essential for creating efficient and scalable bioinformatic tools.
Departmental differences may influence bioinformatic tool development by reflecting the distinct expertise, resources, and perspectives offered by different academic fields. Development experts excel in computational efficiency, while domain experts provide essential biological insights. Therefore, the success of a tool may depend on the integration of diverse skills rather than on the specific departmental affiliation of its developers.
The primary objective of this study is to examine whether the academic department affiliation of a corresponding author has a discernible effect on the accuracy (i.e. correctness of predictions) of the bioinformatic tools they develop. Specifically, we aim to determine whether tools created by authors from domain-expert, development-expert, or interdisciplinary fields differ in accuracy. To address this, we analyzed a benchmarked corpus of bioinformatic software tools and evaluated their accuracy based on the developers' academic affiliations.

Results
We explored the relationship between the accuracy of bioinformatic software tools and the academic fields of their developers. Using a previously published corpus of benchmarked accuracy rankings [8], we mapped corresponding authors' addresses to standardized "fields of study" [9] and grouped them into broader categories.
Figure 1 shows the number of tools for the general and specific fields (N ≥ 10). Most bioinformatic tools were developed by authors affiliated with Genetics, Bioinformatics, Computer Science, or similar departments. Among the general fields, Biological Sciences produced the most software tools, followed by Computer Sciences.
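We ranked fields based on the mean proportion of "wins" (i.e., a win is counted for field '1' when its tool 'A' outperforms tool 'B' in benchmark 'X') and calculated Z-scores to compare them against random expectations (i.e., wins = 0.5). A higher proportion of wins and a lower Z-score (plotted as (−1) × Z) indicate better overall accuracy. "Medical Informatics", a branch of "Technologies", outperformed other fields, with a mean win proportion of 0.70 (95% CI: 0.53–0.85) and a Z-score of −1.88. However, this result was not significant after multiple testing correction (P = 0.29). Notably, this category includes five different parameter options for the MAFFT sequence alignment tool [10], and a further four separate corresponding authors who list either the Department or Center of Biomedical Informatics, Harvard Medical School as their affiliation [11,12,13,14]. This and some other redundancies leave just eight departments representing the medical informatics field here.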

Other fields showed confidence intervals that included the null value of 0.5, with modest Z-scores ranging from −0.49 to 0.96, and P-values greater than 0.05.

We tested the assumption that academic department specialization reflects the quality of research software. After correcting for multiple testing, we found no significant association between academic expertise and the accuracy of bioinformatic tools. This suggests that department affiliation does not correlate with software quality, and neither general nor specific research fields showed any significant links to tool accuracy.
[…] the key factor in producing accurate tools, while citation metrics, tool age, and speed are not associated with software accuracy [15]. Our findings complement this by showing that academic field is also not associated with software accuracy.
While other aspects of bioinformatic tools, such as speed and usability, are important, we emphasize that accuracy should remain the top priority, as poor predictions can have long-term impacts on research [16].
Medical Informatics was the top-performing field in developing accurate tools. These tools include methods for structural variation detection, single-cell profiling, long-read assembly, and multiple sequence alignment, and are derived from a limited number of research teams. In contrast, tools from Bioinformatics and Engineering ranked lower, although these differences were not statistically significant.
Therefore, an individual's department is not a reliable indicator of the quality of the software they produce. Academic affiliation should not be used as a proxy for assessing the potential success of software development projects.
Limitations: Some benchmarks include multiple tool options, potentially introducing non-independent effects. The accuracy metrics are diverse, and some have known limitations (e.g., issues with "accuracy" in class-imbalanced datasets [17] and criticisms of the N50 metric for sequence assembly [18]). Additionally, smaller benchmarks and smaller fields may exaggerate rank shifts. We mitigate this in part by only considering departments with 10 or more corresponding tools.

There may be little connection between a researcher's training and their listed department, as illustrated by this author's background, which began in mathematics before positions in departments of computer science, molecular biology and bioinformatics, and a current affiliation with a Biochemistry Department.

The corresponding (last) author is typically the principal investigator and may not be the primary tool developer, but rather oversees the project. There is likely to be a strong overlap between the departments of the first and last authors, but this was not explored in the current study.

Final words: This study does not find strong evidence linking academic department affiliation with bioinformatic software accuracy (p > 0.05 in all instances). Future research should investigate other factors, such as interdisciplinary collaborations and developer training, to understand what drives high-quality tool development. Addressing potential biases against interdisciplinary work [19] and ensuring long-term support for essential software infrastructure will also be critical for advancing the field [20].

Methods
The data, scripts, figures and manuscript draft files are available at the GitHub repository: https://github.com/ppgardne/departments-software-accuracy
Pre-registration: The methods for this study followed the pre-registered proposal outlined prior to any unpublished data collection [8].
Benchmarking data: Software ranks from previously gathered benchmarks are publicly available [15]. These include data from 68 publications that rank the accuracy of different sets of 498 distinct software tools.
Mapping tools to academic field: For each software tool, the corresponding publication(s) were identified, and the addresses of the primary corresponding author were manually extracted when available. If an author listed multiple addresses, only the first two were used. In cases with multiple corresponding authors, the last corresponding author was chosen.
The department names of the authors were mapped to the closest associated "fields of study" as defined by the National Science Foundation [9]. We analysed these fields at three hierarchical levels. First, we used specific fields (e.g. "genetics", "computer science", "bioinformatics" etc). Second, these were mapped to broader general fields (e.g. "biological sciences", "computer sciences" etc). Third, we categorized them into three types of expertise: development experts, domain experts and interdisciplinary experts. Development experts, from fields such as computer science, mathematics, and engineering, are expected to bring relevant expertise in software engineering and the mathematical modeling of biological problems. Domain experts, from the biological and health sciences, are anticipated to possess detailed knowledge of their subject area and to be invested in producing high-performing software for their research needs. Interdisciplinary experts come from fields such as bioinformatics, biostatistics, and biomathematics, and also include researchers who list both development and domain expertise (e.g. "Computer Science" and "Genetics"). We have treated some fields as synonymous; for example, "Computational Biology" was mapped to "Bioinformatics", and "Genomics" was mapped to "Genetics".
We restricted all subsequent analyses to fields that contain at least 10 software tools in our benchmark corpus. This mitigates potential issues due to small sample sizes.

Statistical analysis:
The accuracy data is derived from benchmarks using a diverse set of metrics that include sensitivity, specificity, PPV, FDR, error rates, AUROC, MCC and others [16]. The number of tools ranked in any benchmark ranged from 3 to 50. In order to obtain a representative measure of accuracy for a field that accounts for the diversity in accuracy measures and number of ranked tools, we employed a rank-based and bootstrapping strategy. We randomly sampled, with replacement, sets of 200 tools from the total of 498 tools. For each tool, a corresponding benchmark was selected at random, and the number of times the tool "won" against another tool was recorded, along with the total number of pairwise comparisons made. These counts of wins and total comparisons were then assigned to the corresponding specific and general departments, and expertise areas. In other words, a tool ranked second in a benchmark of 11 tools will contribute 9 wins and 10 comparisons to the totals for its corresponding fields.
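As a rough illustration of the sampling step described above, the following minimal Python sketch computes one bootstrap replicate. The data layout (a list of dictionaries with `field` and `results` keys) and the function names are hypothetical and are not taken from the study's repository; the sketch only assumes that rank 1 is the most accurate tool in a benchmark.

```python
import random
from collections import defaultdict


def wins_and_comparisons(rank, n_tools):
    """Return (wins, comparisons) for a tool at `rank` (1 = best) among `n_tools`.

    A tool "wins" against every tool ranked below it and is compared
    against every other tool in the benchmark.
    """
    return n_tools - rank, n_tools - 1


def bootstrap_replicate(tools, sample_size=200, rng=random):
    """One bootstrap replicate: pooled proportion of wins per field.

    `tools` is a hypothetical list of dicts with keys:
      'field'   -> the field assigned to the tool
      'results' -> list of (rank, n_tools) pairs, one per benchmark
    """
    wins = defaultdict(int)
    total = defaultdict(int)
    # Sample tools with replacement, as described in the text.
    for tool in rng.choices(tools, k=sample_size):
        # Pick one of the tool's benchmarks at random.
        rank, n_tools = rng.choice(tool["results"])
        w, c = wins_and_comparisons(rank, n_tools)
        wins[tool["field"]] += w
        total[tool["field"]] += c
    return {field: wins[field] / total[field] for field in total}
```

Calling `wins_and_comparisons(2, 11)` reproduces the worked example in the text (9 wins out of 10 comparisons). Repeating `bootstrap_replicate` many times gives a distribution of win proportions for each field.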

This process was repeated 1,000 times to estimate the mean proportions of wins for each field, along with a 95% confidence interval for these values (Figure 2A). Additionally, we calculated a Z-score for each field to determine the number of standard deviations the mean proportion of wins deviates from the expected null value of 0.5 for randomly grouped tools (Figure 2B):

$$ z = \frac{x - \mu}{\sigma} $$

where µ is the mean, σ is the standard deviation, and x is the raw value. In this case we set x = 0.5, as this is the null expectation for the proportion of wins for randomly grouped sets of tools. For the purposes of illustration we plot (−1) × z so that the direction is the same as for the "proportion of wins" forest plot (Figure 2A).
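The summary statistics described above can be sketched as follows for a single field, given its list of bootstrap win proportions. This is an illustrative sketch only: the percentile-based confidence interval and the use of the sample standard deviation are assumptions, since the text does not state how these were computed, and the function name is hypothetical.

```python
import statistics


def summarise_field(proportions, null_value=0.5):
    """Summarise bootstrap win proportions for one field.

    Returns the mean, a percentile 95% interval (an assumed method),
    the Z-score with x set to the null value, and the plotted value (-1) * z.
    """
    mu = statistics.mean(proportions)
    sigma = statistics.stdev(proportions)  # sample standard deviation (assumption)
    ordered = sorted(proportions)
    n = len(ordered)
    lower = ordered[int(0.025 * n)]
    upper = ordered[min(n - 1, int(0.975 * n))]
    z = (null_value - mu) / sigma  # z = (x - mu) / sigma with x = 0.5
    return {"mean": mu, "ci": (lower, upper), "z": z, "plotted": -z}
```

With this sign convention, a field whose tools win more often than expected has a mean above 0.5 and therefore a negative z, which is why (−1) × z is plotted.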

P-values are computed from the absolute value of the Z-scores to evaluate whether any field is significantly distinguished from the null, i.e. P[X > x]. The P-values are corrected for multiple testing by controlling the false discovery rate [21].
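A minimal sketch of this step is given below. It assumes a standard normal reference distribution for the upper-tail probability, and it assumes the Benjamini-Hochberg procedure for false discovery rate control; the text only cites [21], so the exact correction used may differ. The function names are hypothetical.

```python
import math


def upper_tail_p(z):
    """P[X > |z|] for a standard normal X (assumed reference distribution)."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values (assumed FDR procedure).

    Returns adjusted values in the same order as the input.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For example, `benjamini_hochberg([upper_tail_p(z) for z in z_scores])` would give one adjusted P-value per field, where `z_scores` is a hypothetical list of per-field Z-scores.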

Figure 2. (A) A forest plot illustrating the mean and 95% confidence intervals of the proportion of times software tools published by a given field "win" in pairwise comparisons. Confidence intervals and the mean were determined using a bootstrapping procedure. Within each field the entries have been sorted by the mean number of wins. The sample size for each field is indicated by the column of numbers on the right of the figure. (B) A Z-score was computed for each distribution of bootstrap samples for each field. The expected proportion of wins for randomly selected groups of tools was used as "x" (i.e. null = 0.5).