Abstract
Reproducible, understandable models that can be reused and combined into true multi-scale systems are required to solve the present and future challenges of systems biology. However, many mathematical models are still built for a single purpose, and reusing them in a different context can be challenging due to an inflexible monolithic structure, confusing code, missing documentation, or other issues. These challenges are very similar to those faced in the engineering of large software systems. It is therefore likely that addressing model design at the software engineering level will also be beneficial in systems biology. To do this, researchers cannot just rely on using an accepted standard language. They need to be aware of the characteristics that make this language desirable, and they need guidelines on how to utilize them to make their models more reproducible, understandable, reusable, and extensible. Drawing upon our experience with translating and extending a model of the human baroreflex, we therefore propose a list of desirable language characteristics and provide guidelines and examples for incorporating them in a model. In our opinion, a mathematical modeling language used in systems biology should be modular, human-readable, hybrid (i.e., supporting multiple formalisms), open, and declarative, and it should support the graphical representation of models. We compare existing modeling languages with respect to these characteristics and show that there is no single best language but that trade-offs always have to be considered. We also illustrate the benefits of the individual language characteristics by translating a monolithic model of the human cardiac conduction system to a modular version, using the modeling language Modelica as an example. Our experiment can be seen as emblematic of model reuse in a multi-scale setting. It illustrates how each characteristic, when applied consistently, can facilitate the reuse of the resulting model.
We therefore recommend that modelers consider these criteria when choosing a programming language for any biological modeling task and hope that our work sparks a discussion about the importance of software engineering aspects in mathematical modeling languages.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
This revision is based on the much-valued feedback of our reviewers on our submission to npj Systems Biology and Applications. It contains several small to medium-sized changes, of which the most prominent are the following: 1. We clarify that by "support for discrete variables" we only mean support for explicitly declaring discrete variables. The old wording made it seem that some languages do not support discrete model parts at all. 2. We remove the example comparing two concrete commits in different projects, as it is quite hard to make such a comparison fair when the commits achieve different goals in different projects.