kinfitr: Reproducible PET Pharmacokinetic Modelling in R

Quantification of Positron Emission Tomography (PET) data is performed using pharmacokinetic models. Many models exist for describing this data, each of which may perform better or worse depending on the specific application, and there are theoretical, practical and empirical reasons to select any one model over another. As such, effective PET modelling requires a high degree of flexibility, while effective communication of all steps taken through scientific publications is not always feasible. Reproducible research practices address these concerns: researchers share analysis code, and data if possible, such that all steps are recorded, allowing an independent researcher to reproduce the results and assess their veracity. In this article, I present kinfitr: a software package for performing kinetic modelling in a reproducible manner using the open-source R language. The R community has a strong culture of reproducible research, and the language offers numerous tools which allow effective and easy sharing and communication of analysis code. The package is written in such a way as to allow the analyst the freedom to use and rapidly switch between approaches, and to assess goodness of fit, with 14 different kinetic models currently implemented using a consistent syntax, as well as tools for working with the data. By providing open-source tools for kinetic modelling, including documentation and examples, it is hoped that this package will extend access to methodology for research groups lacking software engineering expertise, as well as simplify and thereby encourage transparent and reproducible reporting.


INTRODUCTION
Positron emission tomography (PET) is an in vivo neuroimaging method with high biochemical sensitivity and specificity: it is an essential tool for the study of the neurochemical pathophysiology of mental and neurological disease as well as for evaluating pharmacological treatments. This method allows for accurate quantification of picomolar concentrations, thereby allowing insights which are not possible using any other in vivo imaging modality. PET is, however, prohibitively expensive, often costing in excess of USD 10 000 per measurement, and additionally involves exposure of participants to harmful radioactivity. For this reason, accurate quantification is imperative in order to maximise the scientific value of each measurement, as well as to minimise the number of participants who must be exposed to radiation in order to answer the scientific question at hand.

Figure 1.
Compartmental models are the basis of PET kinetic modelling. For both panels, C represents the radioactivity concentrations within each compartment. The red cylinder on the left of each panel represents the arterial blood, containing plasma (P). Within the plasma, the radiotracer is either free (FP), or bound to plasma proteins (PP). The black boxes represent the compartments. TCM refers to Tissue Compartment Model. A. The three-tissue compartment model is the basis for the two- and one-tissue compartment models: transfer between certain compartments is assumed to be sufficiently rapid that they can be considered as single compartments for the two- and one-tissue compartment models (coloured boxes). The compartments include FT (free tracer), NS (non-specifically bound), S (specifically bound), T (total), and ND (non-displaceable). B. Reference region models consider the total concentration of radiotracer in the target (T) and in the reference region (R), and assume that the non-displaceable concentration is equal in both regions, and that the specific binding in the reference region is equal to 0. estimated quantity of interest.

There exist numerous different kinetic models for performing this quantification, which differ in a variety of important ways. Firstly, they differ in their specificity for the target binding, e.g. quantifying only the specific binding itself, as compared to quantifying the total binding, including non-specific [...] only the estimate of the binding, compared to estimating all of the rate constants underlying that estimate. Thirdly, they differ in their relative degree of bias and variance, and hence their sensitivity to noise (i.e. how much they over- or under-fit the data). Fourthly, they differ in their assumptions about the behaviour of the radiotracer in the tissue, e.g. irreversible vs reversible binding, or the compartmental structure of the binding (Figure 1). These assumptions are usually only partially met in any given application, and care must be taken to ensure that the degree to which assumptions are not met does not bias the estimates in important ways (Salinas, Searle, and Gunn 2014). These differences are further complicated by the fact that the performance of different models may vary based on the properties of each specific tracer: a certain model may be more effective for certain cases than others, all else being equal.

For the modeller, there is no silver bullet. Rather, the model used to estimate the quantity of interest should be selected based on the radiotracer, as well as the research question and properties of the data set itself. This is further complicated by the myriad other analytical decisions which must be made prior to modelling, such as statistical weighting schemes, the application of partial volume effect correction, or the numerous ways in which the blood data can be modelled too (the blood, blood-to-plasma ratio and parent fraction curves can all be modelled to derive improved estimates of the arterial input function curve, which can itself also be modelled). As such, effective PET modelling requires a high degree of flexibility [...]

This general issue has led to calls among the broader scientific community for computational reproducibility, or more broadly reproducible research (RR), as a minimum standard for assessment of scientific claims, i.e. that researchers share analysis code and, if possible, data. This ensures not only that all steps are recorded, but also allows an independent researcher to reproduce the results and assess their veracity, as well as their sensitivity to various decisions taken during analysis. RR practices further accelerate scientific progress, as novel methods can be readily validated, applied and extended by other researchers using the shared code.

In this paper, I present kinfitr: a software package for performing PET kinetic modelling of time-activity curves (TACs) using the R language. This tool provides flexibility for effective modelling, while at the same time being written in such a way as to promote transparency of this process. Further, by using the R language, all code is open-source, and reproducible reporting is made easy by the extensive ecosystem of tools available for this purpose in R.

The kinfitr package contains a host of tools for processing and modelling of PET TAC data, i.e. after the raw image data has been transformed into vectors of radioactivity concentrations. The code is available at https://github.com/mathesong/kinfitr. [...] Table 1. All models have associated plotting routines, which include representation of weights, and the corresponding reference curve. Most models also produce standard errors of estimates, estimated using the delta method when the parameters are not directly fitted.

The kinfitr package contains a set of tools for preprocessing of blood data. Blood data can be read into kinfitr directly and automatically from PET BIDS JSON files as blooddata objects. These objects contain the raw data for the arterial blood, arterial plasma, blood-to-plasma ratio, parent fraction, and arterial input function, as well as the models which will be used to interpolate this data. By default, the interpolation method is defined as piecewise linear interpolation, but all curves can also be interpolated using nonlinear models. All of the data points, [...]

Once the user is satisfied with the fits to the blood data, an input object can be created: this consists of an interpolation of the curves into a common time series which can be used for kinetic modelling. Additionally, if the user only has access to blood data which is already preprocessed, they can also bypass the blooddata object, and directly interpolate the data into an input object.
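The idea of interpolating the blood curves onto a common time series can be illustrated with a minimal sketch. This is written in Python using only the standard library; it is not the kinfitr API, and the function names `interp_linear` and `make_input` are hypothetical. The sketch assumes that the blood-to-plasma ratio is defined as blood divided by plasma, and that the arterial input function is the plasma curve multiplied by the parent fraction; both conventions should be checked against the package documentation before relying on them.

```python
from bisect import bisect_right


def interp_linear(t_new, t, y):
    """Piecewise linear interpolation of the samples (t, y) at the times t_new.

    Values are held constant outside the sampled range. `t` must be sorted.
    """
    out = []
    for x in t_new:
        if x <= t[0]:
            out.append(y[0])
        elif x >= t[-1]:
            out.append(y[-1])
        else:
            i = bisect_right(t, x)  # t[i - 1] <= x < t[i]
            frac = (x - t[i - 1]) / (t[i] - t[i - 1])
            out.append(y[i - 1] + frac * (y[i] - y[i - 1]))
    return out


def make_input(t_grid, blood, bpr, parent):
    """Interpolate each curve onto t_grid and combine them (hypothetical helper).

    blood, bpr and parent are (times, values) pairs, possibly sampled at
    different times. Assumes bpr = blood / plasma and aif = plasma * parent.
    """
    b = interp_linear(t_grid, *blood)
    r = interp_linear(t_grid, *bpr)
    pf = interp_linear(t_grid, *parent)
    plasma = [bi / ri for bi, ri in zip(b, r)]
    aif = [pi * fi for pi, fi in zip(plasma, pf)]
    return {"time": list(t_grid), "blood": b, "plasma": plasma,
            "parent_fraction": pf, "aif": aif}


# Illustrative (made-up) curves sampled at different times, in minutes
inp = make_input(
    t_grid=[0, 0.5, 1, 2, 3, 4],
    blood=([0, 1, 2, 4], [0.0, 10.0, 6.0, 4.0]),
    bpr=([0, 4], [1.0, 0.8]),
    parent=([0, 2, 4], [1.0, 0.8, 0.6]),
)
```

Because every curve ends up on the same grid, the curves can be combined element by element, which is what makes a single input object convenient for subsequent kinetic modelling.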

Choosing a Suitable t* in Linear Models
The package also contains a number of helper functions which can be used for the kinds of calculations and processing steps which must often be performed to accompany TAC modelling. The package includes a unit conversion function, for converting between any standard units of radioactivity, as well as functions for applying, and reversing, decay correction. For blood data collected using automated blood sampling systems, the package offers dispersion correction. The package additionally provides methods for estimating weights of TACs.
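As an illustration of what unit conversion and decay correction involve, the following is a minimal Python sketch using only the standard library. It does not reproduce the kinfitr functions: the names `convert_units`, `decay_correct` and `decay_uncorrect` are hypothetical, and the sketch assumes that times are given in minutes relative to the reference time to which the data are corrected. The conversion factors follow from 1 Ci = 3.7e10 Bq.

```python
import math

# Becquerels per unit, from 1 Ci = 3.7e10 Bq
BQ_PER_UNIT = {
    "Bq": 1.0, "kBq": 1e3, "MBq": 1e6,
    "nCi": 37.0, "uCi": 37e3, "mCi": 37e6,
}


def convert_units(values, from_unit, to_unit):
    """Convert radioactivity values between units listed in BQ_PER_UNIT."""
    factor = BQ_PER_UNIT[from_unit] / BQ_PER_UNIT[to_unit]
    return [v * factor for v in values]


def decay_correct(values, times_min, half_life_min):
    """Apply decay correction: scale each value back to the reference time."""
    lam = math.log(2) / half_life_min  # decay constant (1/min)
    return [v * math.exp(lam * t) for v, t in zip(values, times_min)]


def decay_uncorrect(values, times_min, half_life_min):
    """Reverse decay correction: recover the activity present at each time."""
    lam = math.log(2) / half_life_min
    return [v * math.exp(-lam * t) for v, t in zip(values, times_min)]


# Usage: made-up values in kBq, with an approximate carbon-11 half-life of 20.4 min
raw = [12.0, 9.5, 7.1]
times = [1.0, 10.0, 30.0]
corrected = decay_correct(raw, times, half_life_min=20.4)
in_nci = convert_units(corrected, "kBq", "nCi")
```

Applying `decay_uncorrect` to the output of `decay_correct` with the same times and half-life returns the original values, which is the sense in which the correction can be reversed.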

For models involving arterial input, the TAC data and arterial input data must be matched in time. All of the "tissue compartment models" allow for additional fitting of the time delay between these curves, as well as the blood volume fraction as additional parameters.
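To make the roles of the delay and blood volume fraction parameters concrete, the following Python sketch simulates a one-tissue compartment model, in which the tissue concentration follows dC_T/dt = K1 * C_P(t) - k2 * C_T(t). It is a sketch under stated assumptions rather than the method used in kinfitr: the function name `simulate_1tcm` is hypothetical, the input curve is assumed to arrive `delay` minutes later in the tissue (the sign convention is an assumption), the measured signal is assumed to be (1 - vB) * C_T + vB * C_B, and a simple forward Euler integration is used in place of whatever numerical scheme the package employs.

```python
def simulate_1tcm(times, cp, cb, K1, k2, vB=0.0, delay=0.0, dt=0.01):
    """Simulate a measured TAC from a one-tissue compartment model.

    times : increasing list of time points (min) at which to report the TAC
    cp, cb: callables returning plasma and whole-blood concentration at time t
    K1, k2: rate constants; vB: blood volume fraction; delay: shift (min)
            applied to the input curves (assumed convention: input arrives late)
    dt    : forward Euler step size (min); smaller is more accurate
    """
    tac = []
    ct, t = 0.0, 0.0  # tissue concentration and current time
    for t_end in times:
        while t < t_end - 1e-12:
            step = min(dt, t_end - t)
            ct += step * (K1 * cp(t - delay) - k2 * ct)
            t += step
        # Measured signal mixes tissue with the blood contained in the region
        tac.append((1.0 - vB) * ct + vB * cb(t_end - delay))
    return tac


def step_input(t):
    """Made-up input: concentration of 1 from time zero onwards."""
    return 1.0 if t >= 0 else 0.0


tac_no_delay = simulate_1tcm([5, 30, 100], step_input, step_input,
                             K1=0.5, k2=0.1, vB=0.05)
tac_delayed = simulate_1tcm([5, 30, 100], step_input, step_input,
                            K1=0.5, k2=0.1, vB=0.05, delay=10.0)
```

With a constant input, the tissue concentration approaches K1/k2, so the simulated signal approaches (1 - vB) * K1/k2 + vB at late times. With a delay of 10 minutes, the signal at 5 minutes is still zero, which shows why a mismatch in timing between the TAC and the input curve must be accounted for when fitting.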

The great expense and technical difficulty of PET, especially when blood data is also collected, as well as the fact that participants are injected with harmful radioactivity, make it imperative that the resulting data is used in an optimal fashion. The kinfitr package makes it possible to make better use of PET data by providing researchers with access to a wide variety of kinetic models, and by allowing the results of this modelling to be communicated effectively and transparently in reproducible reports.

This package additionally makes it easier for multi-centre collaborative projects to harmonise their data modelling procedures, as all analysis procedures and instructions are contained within the code, which can be shared between centres. Because the package uses the BIDS PET structure for blood data, this complicated data, originating from numerous different sources, can be quickly and uniformly read and analysed.

In summary, it is hoped that this package will help researchers to perform PET modelling in a more reproducible fashion, and to prioritise accuracy and transparency to a greater extent in their research. Furthermore, because this project is open-source and hosted on GitHub, other users will also be able to add additional tools and models to the software through pull requests, which can be merged to improve the software package for everyone using it.