A Workflow for High Through-Put, High Precision Livestock Diagnostic Screening of Locomotor Kinematics

Locomotor kinematics have been challenging inputs for automated diagnostic screening of livestock. Locomotion is a highly variable behavior, and influenced by subject characteristics (e.g. body mass, size, age, disease). We assemble a set of methods from different scientific disciplines, composing an automatic, high through-put workflow which can disentangle behavioral complexity and generate precise individual indicators of non-normal behavior for application in diagnostics and research. For this study, piglets (Sus domesticus) were filmed from lateral perspective during their first ten hours of life, an age at which maturation is quick and body mass and size have major consequences for survival. We then apply deep learning methods for point digitization, calculate joint angle profiles, and apply information-preserving transformations to retrieve a multivariate kinematic data set. We train probabilistic models to infer subject characteristics from kinematics. Model accuracy is validated for strides from piglets of normal birth weight (i.e. the category it was trained on), but the models infer the body mass and size of low birth weight piglets (which were left out of training, out-of-sample inference) to be “normal”. The age of some (but not all) low birth weight individuals is underestimated, indicating developmental delay. Such individuals could be identified automatically, inspected, and treated accordingly. This workflow has potential for automatic, precise screening in livestock management.


Introduction
Veterinary diagnostics have struggled with a methodological trade-off between high precision and high through-put. In the era of genomics, proteomics, and the like, the striving for accurate diagnostics of livestock diseases has directed considerable attention to the development of modern laboratory tests (Howson et al., 2017; Lamy and Mau, 2012). Conventional imaging techniques also play a role, but usually require special equipment and measurement techniques (e.g. radiography, microscopy, ultrasound, cf. Yitbarek and Dagnaw, 2022). These methods are high precision tools, but low through-put or expensive, some potentially invasive, and therefore not generally suitable for broad monitoring of farm animals. On the other hand, computational techniques are increasingly available to mine extensive data sets collected with sensors or cameras for diagnostically relevant signals (Gómez et al., 2021; Neethirajan, 2020; Netukova et al., 2021; Piñeiro et al., 2019; Wurtz et al., 2019). Models are capable of handling complex situations, given sufficient data. Multivariate probabilistic models (see below) are suited to also capture intra-individual variability and yield effect likelihoods. However, the high dimensionality of kinematic data sets, the multi-parameter, multi-level (hierarchical) covariate situations, and the high digitization workload have often been limiting factors for the generation of quantitative models of vertebrate locomotion (Jackson et al., 2016; Michelini et al., 2020; Seethapathi et al., 2019).
Several recent technological advances have enabled researchers to tackle scientific questions on locomotion in a more efficient way. Firstly, the past few years have brought huge leaps in terms of computer vision, deep learning, and thereby semi-automatic video digitization methods (Corcoran et al., 2021; Jackson et al., 2016; Karashchuk et al., 2021; Mathis et al., 2020; Mielke et al., 2020). These tools typically require a manually digitized subset of the data as the "training set" for a neural network, which is then able to digitize further videos in high through-put, hopefully with reasonable accuracy. A second field of technological advance is that of the aforementioned probabilistic models, which build on an elegant computational implementation of Bayesian theory (Markov Chain Monte Carlo / MCMC sampling, cf. Gelman et al., 2020; McElreath, 2018; van de Schoot et al., 2021). Such models can naturally incorporate hierarchical parameter interrelations and intrinsic variability. The main reason for this is that probabilistic models work on data distributions, and their outcomes are distributions and "effect likelihoods", rather than point estimates. This can be informative on an intrinsically varying process such as locomotion (Mielke et al., 2018). Machine Learning methods for video digitization are justifiably becoming the standard in kinematic analysis, whereas probabilistic models still lack recognition in the field, despite their potential. To summarize, the mentioned advances in computer vision and statistical modeling enable us to (1) acquire a lot of quantitative data with minimal to no workload, and (2) model them in a suitable way. It would be desirable to adapt those technological advances for veterinary use, generating a classifier which could identify systematic alterations in the locomotion of domestic animals, and thereby enabling computer-supported diagnostic screening for deficiencies, pathological states, and diseases.
Domestic pigs are a well-studied model system in which scientific interest joins the economic interest of commercial breeding. These animals have been subject to a variety of locomotor studies, including paradigms to test the effects of breed (Mirkiani et al., 2021), birth weight (Vanden Hole et al., 2018a), surface friction (von Wachenfelt et al., 2008), welfare (Guesgen and Bench, 2017), various
Using the semi-automatic, machine-learning digitization techniques mentioned above, one can extend the analysis of gait variables to quantities of intra-limb coordination with manageable workload. However, using the whole set of raw point coordinates of joint points of interest raises the issue of dimensionality (two to three coordinates per reference point, simply too many data variables). Statistical modeling requires a minimum number of observations to infer effects of the different variables (Austin and Steyerberg, 2015; Frick, 1996; Maxwell et al., 2017; Riley et al., 2020). The common solution is to reduce the dimensionality with an appropriate transformation. To choose a transformation, it can be exploited that common analysis procedures in locomotor biomechanics require steady state locomotion.
"Steady state" implies that the behavior consists of repetitive blocks of kinematics, i.e. stride cycles. One of the most common sets of techniques in physics and engineering to handle cyclic data is Fourier Analysis, or more specifically Fourier Series Decomposition (FSD; Bracewell, 2000; Fourier, 1822; Gray and Goodman, 1995; Mielke et al., 2019; Pike and Alexander, 2002; Webb and Sparrow, 2007). With FSD, joint angle profiles are transformed into their representation in the frequency domain, i.e. an array of harmonics. Some of the characteristics of the profiles (namely mean angle, amplitude, and phase) are more readily captured by those harmonics and can optionally be removed. This is most intuitive in the case of phase: removing phase differences enables a mathematically optimal temporal alignment of the profiles. By isolating the other characteristics, mean and amplitude, the joint angle profiles can be transformed to meaningful quantities such as dynamic posture (mean joint angle and effective range of motion), and coordination sensu stricto (relative phase/joint timing and residual kinematics, cf. Mielke et al., 2019). Harmonics are independent of temporal sampling and duration: the coefficient array is of fixed size, which is useful for subsequent multivariate analysis methods, such as Principal Component Analysis (PCA). Another advantage of this transformation procedure is that it is reversible because all mathematical information is retained in the process (which is not the case when using collective variables alone). This means that joint angle profiles can be reconstructed for any observed or hypothetical point in parameter space, which enables in-sample and out-of-sample predictive sampling.
To summarize, the Fourier Series decomposition provides a mathematically convenient and biomechanically meaningful representation of the kinematic data, which opens up new options for data analysis and modeling.
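The decomposition and its reversibility can be illustrated with a minimal sketch. It assumes a joint angle profile that covers exactly one stride and is sampled uniformly; the function names (`fsd`, `reconstruct`, `affine_components`) and the particular amplitude measure are illustrative assumptions, not the published implementation.

```python
import numpy as np

def fsd(profile, n_harmonics=8):
    """Fourier series coefficients c_0..c_n of one cyclic profile (one full stride)."""
    profile = np.asarray(profile, dtype=float)
    # rfft / N yields c_0 (the mean) and the complex coefficients of the harmonics
    return np.fft.rfft(profile)[: n_harmonics + 1] / profile.size

def reconstruct(coeffs, n_samples):
    """Inverse transform: evaluate the truncated series at n_samples time points."""
    t = np.arange(n_samples) / n_samples
    k = np.arange(1, len(coeffs))
    # real signal: c_0 + 2 * Re(sum_k c_k * exp(2*pi*i*k*t))
    harmonics = coeffs[1:, None] * np.exp(2j * np.pi * k[:, None] * t[None, :])
    return coeffs[0].real + 2.0 * harmonics.sum(axis=0).real

def affine_components(coeffs):
    """Mean, amplitude, and phase as captured by the coefficients."""
    mean = coeffs[0].real
    amplitude = np.sqrt(np.sum(np.abs(coeffs[1:]) ** 2))  # assumed amplitude measure
    phase = np.angle(coeffs[1]) / (2 * np.pi)  # first harmonic, in stride cycles
    return mean, amplitude, phase

# synthetic joint angle profile (degrees), for illustration only
t = np.arange(100) / 100
profile = 120 + 15 * np.cos(2 * np.pi * t) + 5 * np.sin(4 * np.pi * t)
coeffs = fsd(profile)                      # fixed size: 9 complex coefficients
resampled = reconstruct(coeffs, 37)        # any sampling can be reconstructed
```

The coefficient array has the same length regardless of how many frames the stride contained, which is what makes it convenient as input for multivariate methods.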
In this study, we establish a workflow which can be automated and used to identify individual animals locomoting differently from the "normal" reference, based on video recordings, deep learning digitization, mathematical transformations, and probabilistic modeling. A conventional, 2D kinematics data set is extracted with the aid of deep learning tools from lateral videos of walking piglets. By applying multivariate analysis and FSD, we separate spatiotemporal gait variables, dynamic posture, and coordination, and model their relation to subject characteristics (mass, size, age, and birth weight category). Crucially, this constitutes the complete information captured by locomotor kinematics, and all parameters are submitted to an inclusive, probabilistic model. As a test case, we tackle the question of whether low birth weight in domestic piglets is an indication of delayed development, and attempt to quantify the delay with an inverse modeling strategy as follows. Intuitively, and conventionally, joint kinematics are considered the output of the locomotor system. Therefore, conventional statistical models might consider them on the "outcome" side; on the "input" side, the effects of birth weight, age, speed, or other parameters are quantified. Herein, we use a different approach, and invert the model.
We construct a probabilistic computer model which describes "age" and other subject characteristics as a function of all available kinematic parameters. The rationale is similar to that in subject recognition tasks: given a certain kinematic profile, can we infer (characteristics of) the subject? We split our data set into birth weight classes (LBW, NBW), and train the model on only the strides from NBW observations. This NBW model is our "kinematic reference" model, quantitatively capturing the expectation of what would be "normal" by inferring the plausible age range for a given kinematic observation. We then use that trained model to compute out-of-sample inference of individual LBW observations.
Our hypothesis is that, if LBW were at the same stage of postnatal locomotor development as their NBW siblings, then the model should accurately infer the age of the LBW animals. Conversely, if the LBW piglets are delayed in development, the model would underestimate their age. Thus, by applying this inverse modeling strategy and comparing the computer-inferred age to the actual age of the LBW piglets, we can quantify and potentially falsify a hypothesized delay in locomotor development.
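The logic of this inverse modeling strategy can be sketched on synthetic data. The example below is a deliberately simplified stand-in: it uses a conjugate Gaussian linear regression with known noise instead of an MCMC-sampled model, and all numbers (slopes, noise levels, a four hour delay) are invented for illustration. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_posterior(X, y, noise_sd=1.0, prior_sd=10.0):
    """Gaussian posterior over linear model weights (known noise, Gaussian prior)."""
    X1 = np.column_stack([np.ones(len(X)), X])  # add intercept column
    precision = X1.T @ X1 / noise_sd**2 + np.eye(X1.shape[1]) / prior_sd**2
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2  # enforce symmetry against round-off
    mean = cov @ X1.T @ y / noise_sd**2
    return mean, cov

def predict(X, mean, cov, noise_sd=1.0, n_samples=1000):
    """Posterior predictive samples, shape (n_samples, n_strides)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    weights = rng.multivariate_normal(mean, cov, size=n_samples)
    return weights @ X1.T + rng.normal(scale=noise_sd, size=(n_samples, len(X)))

# synthetic "NBW" training strides: three kinematic parameters that change with age
slopes = np.array([0.5, -0.3, 0.2])
age_nbw = rng.uniform(0, 10, size=300)
kin_nbw = age_nbw[:, None] * slopes + rng.normal(scale=0.2, size=(300, 3))
mean, cov = fit_posterior(kin_nbw, age_nbw, noise_sd=0.5)

# synthetic "LBW" strides: actual age 8 h, kinematics typical of a 4 h old animal
actual_age = 8.0
kin_lbw = 4.0 * slopes + rng.normal(scale=0.2, size=(40, 3))
pred = predict(kin_lbw, mean, cov, noise_sd=0.5)
delta = pred.mean() - actual_age  # negative value: age is underestimated
```

In this toy setting, the difference between inferred and actual age recovers the simulated delay, whereas strides drawn from the training population are inferred without systematic offset.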
The components of this classification workflow are not novel, and are commonly used in physics and engineering. We use available machine learning tools to digitize videos, apply a series of well-known transformations, and train a probabilistic model classifier. We demonstrate that a set of individual locomotor events can be used to distinguish individuals which develop slower than expected, with a temporal accuracy of four to eight hours (which is a considerable timespan for neonate animals). These are precise diagnostic measurements, generated at high through-put, with the overall aim of improving animal welfare, all of which is in line with the prototypical ideal of Precision Livestock Farming.
grunting vocalization of the researcher were other successful strategies to induce targeted locomotion in the direction perpendicular to the camera axis. After recording sessions the piglets were returned to their litter and remained with the sow. The workflow herein involved handling of the animals as a consequence of the research setting. However, note that the procedure could easily be automated for continuous data collection by a suitable pen arrangement (Meijer et al., 2014;Netukova et al., 2021;Stavrakakis et al., 2014).

Digitization
We used the software DeepLabCut (DLC, Mathis et al., 2018) for digitization of all video material. In addition, a custom-made point tracking software (Mielke et al., 2020) was used to generate a training set.
In total, our dataset contained 180 videos (more than 11 hours, 169 animals). Our goal was to prepare a general DLC network which is capable of automatically tracking piglets at multiple ages, and which can be shared and re-used for subsequent research questions. This is why the full data set was used for digitization and for the calculation of some derived measures (size PCA). However, the analysis focus of this study (see below) was only a subset of the data (i.e. the 58 animals of the youngest age class). The video processing workflow, applied to the full data set, was as follows. To get a balanced training set, one stride of each of the animals was selected, and the video was cut, cropped to runway height, and optionally mirrored horizontally so that movement would always be rightwards. All videos were concatenated and submitted to the DLC training set generation. DLC was set to select 2552 frames from these videos, which were tracked in an external software and re-imported for training (80% training fraction). Seventeen landmarks (i.e. points of interest or "key-points"; usually joint centers, fig. 1) were digitized, representing all body parts visible on the lateral perspective (head: snout, eye, ear; back line: withers, croup, tail base; forelimb: scapula, shoulder, elbow, wrist, metacarpal, forehoof; hindlimb: hip, knee, ankle, metatarsal, hindhoof). We selected a "resnet 152" network architecture and trained for 540,672 iterations (16 days of computer workload). The network was then applied to digitize the continuous, full video recordings twice: once in default direction and once horizontally mirrored, because the training set contained only rightward movement.
The next step is to find the relevant temporal sequences of walking in the continuous videos. Naturally, the trained network would only extract potentially useful landmark traces for episodes which resembled the training set, i.e. in episodes with a piglet moving perpendicular to the image axis, in lateral aspect and rightward direction. We automatically extracted 2597 such sequences by filtering for high digitization "likelihood" provided by DLC, low noise (i.e. steady landmark movement) and consistent, plausible landmark distances. We further applied an automatic algorithm to find footfalls and label stride cycles in the candidate episodes (4730 cycles). This procedure involved a start-end-matching optimization (using Procrustes superimposition) to ensure that strides were indeed cyclical. To further assess digitization quality, gait variables were automatically extracted. Definition of these variables was chosen to simplify the automatic procedure, as follows. Stride distance, frequency, and speed are trivial measures of the animal movement. Duty factor is available for fore- and hindlimb, and measures the fraction of stride time in which the respective hoof is in ground contact. Clearance is approximated by quantifying the ratio of flexion of each limb (one minus the quotient of minimum and maximum absolute hip-toe-distance during the stride). Head and torso angle are the stride-average angles of the snout-ear or withers-croup lines with respect to the coordinate system. Hindlimb phase measures the time between hind- and forehoof touchdown, divided by the stride cycle duration. Where applicable, gait variables were prepared for analysis (see below) by converting them to dimensionless values (Alexander and Jayes, 1983; Hof, 1996) using the cumulated distance of landmarks along the snout-to-tailbase line of the animal as reference, extracted as stride average from the digitized landmarks. Only strides with plausible values (i.e. those which lie within the theoretical distribution of each parameter; 1862 cycles) were processed. Manual inspection further reduced the data set to 897 stride cycles (the others were excluded for digitization errors, multi-animal confusion, non-walking gait, intermittent or sideways locomotion, or incompleteness).
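Some of the gait variable definitions above can be written down compactly. The following is a minimal sketch under stated assumptions: landmark traces are arrays of shape (frames, 2) covering one stride, ground contact is given as one boolean per frame, and the sign convention of the hindlimb phase (forehoof minus hindhoof touchdown, wrapped to the unit interval) is an assumption. Function names are hypothetical.

```python
import numpy as np

def clearance(hip_xy, toe_xy):
    """One minus the ratio of minimum to maximum hip-toe distance over one stride."""
    diff = np.asarray(hip_xy, dtype=float) - np.asarray(toe_xy, dtype=float)
    dist = np.linalg.norm(diff, axis=1)
    return 1.0 - dist.min() / dist.max()

def duty_factor(in_contact):
    """Fraction of stride frames in which the hoof is in ground contact."""
    return float(np.mean(np.asarray(in_contact, dtype=bool)))

def hindlimb_phase(t_hind_touchdown, t_fore_touchdown, stride_duration):
    """Time from hind- to forehoof touchdown as a fraction of the stride cycle."""
    return ((t_fore_touchdown - t_hind_touchdown) / stride_duration) % 1.0

# toy stride with four frames, for illustration only
hip = [[0.0, 1.0]] * 4
toe = [[0.0, 0.0], [0.0, 0.25], [0.0, 0.5], [0.0, 0.0]]
limb_clearance = clearance(hip, toe)          # hip-toe distance ranges from 0.5 to 1.0
contact_fraction = duty_factor([1, 1, 1, 0])  # three of four frames in contact
```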
Finally, 368 of the remaining strides from 58 animals were in the youngest age category (< 10 h) and thus selected for the present analysis; the data table is available online (see below).

Data Processing
The landmark data provided by DLC was further processed for analysis. Python code for the whole procedure is available ( The forelimb angle served as reference to temporally align all cycles in the data set (removal of phase differences between different cycles; forelimb angle was not used further). Then, mean and amplitude of the joint oscillations were isolated for all joint angles and are categorized as "dynamic posture" parameters.
Mean joint angle is the temporal average, whereas amplitude is related to effective range of motion (eROM). The residual, i.e. differences captured by non-affine Fourier coefficients, can be categorized as "coordination" sensu stricto (it measures the precise temporal succession of joint configurations). In our case, there were 96 variables of coordination (6 angles, 8 harmonics, real and imaginary) which were submitted to a PCA. Only the first 12 coordination components (CC) were used for statistical analysis, capturing 80.2% of the variability in coordination.
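As a rough illustration of this dimensionality reduction step, the sketch below runs a PCA via singular value decomposition on a placeholder matrix with the same shape as the coordination data (368 strides, 96 variables). The data are random numbers and the function name `pca` is an assumption; the retained variance will therefore not match the value reported above.

```python
import numpy as np

def pca(data, n_components):
    """PCA via SVD: returns scores, component axes, explained variance ratios, column means."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    u, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # sorted in descending order
    scores = u[:, :n_components] * s[:n_components]
    return scores, vt[:n_components], explained, mean

rng = np.random.default_rng(0)
# placeholder data: 368 strides x 96 coordination variables (random, illustration only)
coordination = rng.normal(size=(368, 96)) * np.linspace(3.0, 0.1, 96)

scores, axes, explained, mean = pca(coordination, n_components=12)
retained = explained[:12].sum()   # share of variability captured by 12 components
approx = scores @ axes + mean     # back-projection; exact only if all components are kept
```

Keeping the component axes and the column means is what allows any point in component space to be translated back to Fourier coefficients, and from there to joint angle profiles.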
To summarize, FSD and FCAS served three purposes: (i) temporal alignment of the cyclic traces, (ii) separation of meaningful parameter categories (dynamic posture and coordination), and (iii) preparation for multivariate analysis via PCA. Basic script code (Python, Matlab and R) to perform FCAS can be found on a dedicated git repository (https://git.sr.ht/~falk/fcas_code).
Information retention is generally a strength of this method. FCAS and PCA are mathematical transformations, which means that the information content after transformation is theoretically identical to that prior to transformation (theoretically, because only a finite number of harmonics can be used, yet this is of little concern for continuous, smooth joint angle profiles). The neglected PCs and the residual not captured by 8 harmonics were the only information from kinematics of the given joints to be lost in this procedure, and by definition these contain the least information. Apart from that, all information present in the raw joint angle profiles enters the analysis. Though we used a 2D dataset herein, the procedure could be applied equally well to angles measured from 3D coordinate data (Scott et al., 2022).
Furthermore, all transformations are reversible, hence any analysis outcome can be translated back to kinematics with high accuracy. Reversibility bears a lot of herein unused potential, for example for interpolating unobserved subject states or for inferring kinematics of fossil species by phylogenetic and morphometric bracketing. Reversibility can also be of use when presenting raw joint angle profiles and their averages, as follows. One crucial aspect of the FCAS procedure is temporal alignment of the joint angle profiles in the frequency domain. In conventional temporal alignment, a single characteristic point in the stride cycle is chosen as a reference, wherein this is only "characteristic" for a certain part of one limb (e.g. left hindlimb hoof touchdown). Temporal alignment to the hindhoof touchdown might cause distinct peaks in the forelimb angle joint profiles to occur at different relative points in the stride cycle (e.g. ankle joint profiles in Fig. 3 below, lower half, green traces). If profiles show such variable peak positions, then their average will have a wider, less pronounced (i.e. lower amplitude), and potentially unnatural peak. For illustration, this is analogous to averaging two sine-waves of identical amplitude, but phase shifted: in the worst case, they cancel each other out (as in "destructive interference"). The problem is not restricted to pronounced peaks, but generally occurs if the temporal intra-limb coordination varies within a data set. Using FCAS, it is possible to get a more representative average of the raw traces which has its amplitude conserved, but phase and mean angle averaged. This is enabled by transformation to the frequency domain, separation of affine components, removal of phase differences by shifting to average phase, profile averaging, followed by inverse transformation back to the time domain.
Because a set of profiles and phases may be calculated for each angle individually, and because phase relations can differ between joints, there are the options to align based on one reference angle (e.g. the whole forelimb, as done herein) or minimize all phase differences across all joints. Choosing the first option herein has implications: when plotting hindlimb joints aligned by a forelimb reference (as in Fig. 3, lower half), phases still differ, and the "destructive interference" problem might hamper averaging. In such cases it is possible to apply an extra, joint-wise FCAS alignment for the sole purpose of generating meaningful averages.
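The "destructive interference" effect and its remedy can be reproduced with two synthetic profiles. This is a minimal sketch, not the FCAS code: it aligns profiles by the delay of the first harmonic only, uses an arithmetic mean of delays (a circular mean would be needed for widely scattered phases), and the function names are hypothetical.

```python
import numpy as np

def shift_phase(profile, delta):
    """Shift a cyclic profile by `delta` stride cycles (positive = later) in the frequency domain."""
    coeffs = np.fft.rfft(profile)
    k = np.arange(coeffs.size)
    return np.fft.irfft(coeffs * np.exp(-2j * np.pi * k * delta), n=len(profile))

def first_harmonic_delay(profile):
    """Delay of the first harmonic, in fractions of a stride cycle."""
    return -np.angle(np.fft.rfft(profile)[1]) / (2 * np.pi)

def half_range(profile):
    """Half of the peak-to-peak range, i.e. the amplitude of a sinusoid."""
    return (np.max(profile) - np.min(profile)) / 2

# two profiles of identical amplitude (10 degrees), peaking at 10% and 40% of the stride
t = np.arange(100) / 100
profiles = [10 * np.cos(2 * np.pi * (t - d)) for d in (0.10, 0.40)]

naive_mean = np.mean(profiles, axis=0)  # amplitude is attenuated

delays = [first_harmonic_delay(p) for p in profiles]
target = np.mean(delays)
aligned = [shift_phase(p, target - d) for p, d in zip(profiles, delays)]
aligned_mean = np.mean(aligned, axis=0)  # amplitude conserved, phase averaged
```

Here the naive average has an amplitude of about 5.9 degrees, whereas the average of the aligned profiles retains the full 10 degrees and peaks at the mean delay.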

Statistical Modeling
To summarize, four categories of variables were used for analysis:
• subject characteristics: age, sex, mass, birth weight category, size
• spatiotemporal gait variables: distance, frequency, speed, clearance (fore-/hindlimb), duty factor (fore-/hindlimb), head angle, hindlimb phase
• dynamic posture: mean joint angles and eROM for six joints
• coordination: the residual after extraction of dynamic posture (see above)
Our guiding question for model design is whether a probabilistic, linear model is able to infer subject characteristics (specifically: age, mass, and size) from raw kinematics (expressed as dynamic posture and coordination) and gait variables (collective variables). Given the common conception that kinematics are a complex output of an individual motor system, this might be considered an "inverse" modeling approach.
The present analysis focused on three outcome variables (Fig. 2): mass (kg), size (arb. units, from a PCA of marker distances), and age (h). Though these outcome variables were specific per individual and recording session, we analyzed them "per stride" (i.e. there were multiple strides with identical subject measures on the outcome side).

The model formula is:   arrays to the actual observed values for one given stride (using 'pymc.set_data'). The number of predictions usually matches the number of training samples, which means that all posterior information is used to construct the prediction distributions. We would thus retrieve mass, size, and age predictions (i.e. probabilistic inference) for each stride in the data set, which were then compared to the known, actual mass, size, and age.

Results
The present analysis is centered around a linear model which is designed to infer mass, size, and age (subject characteristics) from an extensive set of kinematic parameters from 2D videos. The numbers provided by the model sampling are equally extensive, and will only be reported in brief. The key purpose of the model is posterior predictive sampling of the LBW strides which were left out of the model, and which are analyzed in detail below.
To assess whether there are qualitative differences between the birth weight categories, one can compare the joint angle profiles (i.e. raw, angular kinematics) on which the present analysis was performed (Fig. 3). The intra-group variability clearly exceeds the differences between groups, although it must be emphasized that groups are inhomogeneous (with regard to age, speed, etc.), which might lead to a bias if composition of LBW and NBW data differs. LBW piglets walk with a more flexed hindlimb posture, as indicated by the parallel offset of the average hip, knee, and ankle profiles. Additionally, NBW individuals on average seem to set the shoulder at a more extended angle. No differences in coordination are apparent (which would manifest in altered temporal structure of the profiles). These findings indicate that LBW kinematics are hardly distinguishable from NBW kinematics by qualitative, visual assessment, which may at least in part be due to high variability.
A quantitative comparison of variable kinematic measurements can be achieved with probabilistic linear models. For the purpose of predictive sampling (see below), we train models to describe the interrelations of kinematic parameters and subject characteristics in NBW piglets. The outcomes of MCMC sampling of a linear model are value distributions for slopes, which in our case indicated how certain kinematic parameters are associated with a change in mass, size, and age (supplementary material 7.1). Of the gait or coordination parameters, only hindlimb clearance was correlated with differences in animal mass.
Mass was also associated with changes in the dynamic posture of the hip and ankle. For size, the model inferred associations with head angle, hindlimb duty factor and clearance, and one coordination component (CC3), as well as changes in the fore- and hindlimb posture and an effect of sex. Finally, age was associated with an increase in forelimb clearance, potential changes at the hip and wrist, and several coordination components (CC9, CC11). Some eROM slope distributions for age were high in average magnitude, but variable (the "credible interval" contained zero). These model results provide detailed insight into parameter interrelations in the present data set and indicate which of the parameters are the relevant ones to infer a given subject attribute in predictive sampling.
Performing in-sample and out-of-sample predictive inference with the models trained on NBW strides elucidated if and how left-out strides differed from NBW model expectation (Fig. 4). Note that, to capture variance (i.e. uncertainty in the prediction), each stride was sampled repeatedly.  heavier than actual, and their size was overestimated (+1.71 units). Both errors matched the actual differences in magnitude (cf. methods, Fig. 2). In contrast, the age inference for the low birth weight subjects was not normally distributed: most ages were correctly inferred from stride-wise kinematics, but ages for some strides were underestimated. The underestimation of those strides quantified to just below five hours.
In summary, the NBW-trained model "guesses" the size and mass of the animals producing LBW strides to be "normal" (although they are not), which indicates that these defining features of LBW do not reflect in altered kinematics. However, age inference is non-normal, i.e. some strides are classified as typical for animals of younger than actual age.
To find out whether the offset age inference was related to certain individuals, or certain strides from different individuals, we grouped the inferences per stride or subject and calculated the chance of over- or underestimating age. Of the 8 low birth weight subjects who contributed 39 strides, 4 individuals were consistently underestimated (Tab. 1). Consistently means that more than 75% of all predictive samples were below actual age, and that the ages for a majority of strides were on average underestimated. The magnitude of underestimation was between two and five hours. Curiously, those were the individuals recorded at a slightly higher age (> 5 hours). Overestimation in the other four LBW individuals was also consistent, but less so (less extreme underestimation rate, mean ∆ < 2 h). Standard deviation of the estimates did not vary across individuals or birth weight categories.
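The consistency criterion described above can be expressed as a short function. This is a sketch under the assumption that the predictive samples of one subject are arranged as an array of shape (strides, samples); the function name and the example numbers are hypothetical.

```python
import numpy as np

def consistently_underestimated(pred_samples, actual_age, threshold=0.75):
    """Apply the two-part criterion to the predictive samples of one subject.

    pred_samples: array of shape (n_strides, n_samples) with inferred ages.
    Returns (flag, rate), where rate is the share of all samples below actual age.
    """
    pred = np.asarray(pred_samples, dtype=float)
    rate = float(np.mean(pred < actual_age))
    # share of strides whose average inferred age lies below the actual age
    strides_below = float(np.mean(pred.mean(axis=1) < actual_age))
    return bool(rate > threshold and strides_below > 0.5), rate

# invented example: two strides with four predictive samples each, actual age 8 h
flag, rate = consistently_underestimated([[3, 4, 5, 9], [2, 3, 4, 5]], actual_age=8.0)
```

In this invented example, seven of eight samples lie below the actual age and both stride averages are underestimated, so the subject would be flagged.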
We conclude that underestimation of age is consistent over multiple strides of the same individual, and thus individual-specific.

Discussion
Quadruped terrestrial locomotion is the collective output of an ensemble of organismal subsystems, which is both reason and challenge for its usefulness in veterinary diagnostics. On one side, the kinematics can be quantified in multidimensional data sets, capturing the many degrees of freedom of the limb joints. On the other side, kinematic quantities are context dependent and affected by numerous subject characteristics (age, weight, pathologies, ...) which also cross-influence each other.
The challenge emerges to find the right trace of a given (or unknown) condition in the multidimensional observation on the background of kinematic variability. Deep Learning methods for video digitization have become available, and probabilistic computational models offer a flexible framework to mirror complex parameter relations. Once trained to a given question, these computer tools can achieve comparative diagnostic classification with minimal human interaction, e.g. for continuous screening in a farm setting. Multivariate systems have been a challenge to integrated management and precision farming, and the presented locomotor analysis workflow highlights a possible way to succeed in that challenge.
In this study, we have demonstrated a test case for generating a probabilistic model of piglet locomotion which incorporates all kinematic information.
Our example model was trained on a high number of observations which are considered "normal", and applied to classify untrained observations in terms of deviation from normal behavior. The data stems from laterally filmed videos of normal (NBW) and low birth weight (LBW) piglet locomotor behavior from unrestricted walking gait (an inexpensive, high-throughput arrangement, and a common behavior).
Low birth weight is often associated with low vitality (Baxter et al., 2008;Hales et al., 2013;Muns et al., 2013), and this supposedly correlates with deficient locomotion. Hence, the obvious first research question is whether birth weight has an influence on the locomotor behavior. Top-down, direct, visual assessment could justify the hypothesis that LBW walking kinematics are somehow different from "normal" (D'Eath, 2012). Yet that is (i) hard to assess due to high behavioral variability and (ii) trivially expected given the adaptation to different physical properties of their body: gravitational force is a predominant constraint of locomotion, and it simply scales with animal weight. Our results showed that the eight LBW individuals we submitted to the weight-kinematics model were all over-estimated in terms of their weight, by the amount that matched LBW-NBW weight difference (Fig. 4). The same is true for the size model. This indicates that LBW, at least all those in our data set, are capable of walking as if they were of normal birth weight and size. This is the first example of a diagnostic model application: the model confirms quantitatively normal locomotor behavior despite occurrence of a given non-normal co-variate (weight).
A second diagnostic application is the identification of individuals (or even strides) which systematically deviate from an expectation or norm. Probabilistic models do not only classify "normal" or "not": they yield a distribution of plausible values, and thereby a likelihood that a given observation is indicative of a problem. The same model architecture as above, but configured to infer age from a kinematic measurement, estimated some (but not all) individuals to be of lower than actual age (Tab. 1). Those were specifically the older of the LBW individuals, whereas the youngest ones (< 4 h) walked as expected for neonates. Though we cannot fully rule out chance with our limited sample size, this provides evidence that the quick postnatal development was halted in those individuals. Our interpretation is that, at birth, LBW individuals putatively had the same capabilities as their NBW siblings, yet at least some "fell behind" regular development in the first hours. We can think of two possible reasons for this: (1) the birth process as a trauma might mask the actual capabilities of all neonates alike, concealing actual, pre-existing differences (Litten et al., 2003); (2) development is impeded by depleted energy reserves and a failure in (kin) competition and the perinatal struggle for teats and warmth (Le Dividich et al., 2017). We found little support for the first possible reason: top-down locomotor development is quick for both groups (Vanden Hole et al., 2018a), and muscular architecture shows no differences (Vanden Hole et al., 2018b). On the other hand, there is evidence for quick depletion of energy levels in the low birth weight individuals, which rectifies within a period of ten hours (Vanden Hole et al., 2019). This finding is consistent with the present study and supports the perinatal struggle hypothesis.
Delayed development does not necessarily corroborate the hypothesis of locomotor deficiency in LBW.
We would expect truly deficient strides to be substantially different from the data the model was trained on, and thus to be either excluded or misclassified. Exclusion means that the Deep Learning implementation used could not capture deficient strides, or only in a way which led to exclusion in subsequent (automatic) quality checks (see below). We acknowledge that there currently still is room for refinement in the Deep Learning digitization procedure. Yet in the likely case that some deficient strides passed quality checks and were subjected to the model, we would expect them to be more "unpredictable" (i.e. higher variance of posterior samples). Instead, in our data set, inferences were consistent for repeated measures of an individual, without notable increase in variance across inferences per stride. For the affected subjects, we can even quantify a plausible delay of less than five hours, which could nevertheless be critical given the rapid maturation of locomotor behavior in this species (Vanden Hole et al., 2017) and the importance of postnatal competition. Such detailed information is valuable when evaluating the success of different mitigation strategies (e.g. supplementing energy to piglets, Schmitt et al., 2019). It must be emphasized that, just like other computational diagnostic tools, the method outlined herein is not intended for standalone use. Instead, it complements or can facilitate in-depth inspection. Nevertheless, the specificity of the presented gait analysis supersedes mere activity analysis: to our knowledge, being able to automatically retrieve an individual, probabilistic measure for developmental delay in swine has not been achieved before. Information retention is a feature of the presented workflow which we think can enable researchers and veterinarians to differentiate a multitude of potential influences on locomotor behavior, given sufficient reference data and an appropriate model design.
Our data set is limited and potentially biased in terms of LBW observations. There are far fewer valid LBW strides in our data set, in absolute numbers: only 39 of 368 observations are LBW. This could be interpreted as evidence for a lower capacity (despite equal potential) of LBW to produce normal locomotion. Yet there are proximal, trivial explanations: for this study, the lower 10% quantile of birth weights in a litter is considered LBW, and there is a hard cap of 800g. The resulting share is equal in our training set for video digitization, and in the final data set, because of pseudo-random, opportunistic sampling on-site (i.e. recording work was permanent, yet determined by farrowing and feeding of the subjects). The minority of LBW training videos might lead to an under-learning of those animals in the digitization network, which could lead to reduced digitization quality and therefore an exclusion bias for "non-normal" individuals. Though it seems unlikely, we cannot rule out reduced locomotor capacity in LBWs: the present data set is unsuited to count the occurrence of locomotor behavior due to its automatic generation. On the other hand, the strict stride filtering criteria for "good" kinematics may have inadvertently filtered out deficient individuals. Our conclusion that low birth weight individuals are non-deficient is strictly tied to the definition of the low birth weight category, which is herein based on weight criteria and did not regard phenotypical indicators of intra-uterine growth restriction (which we did not record, cf. Amdi et al., 2013).
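Because the conclusion depends on how the LBW category is defined, it may help to spell out one possible reading of the weight criterion. The sketch below assumes that a piglet counts as LBW if its birth weight is at or below the 10% quantile of its litter and, in addition, does not exceed 800 g. This combination rule, the linear-interpolation quantile, the function names, and the example litters are assumptions for illustration, not a specification of the procedure applied in this study.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])


def is_low_birth_weight(weight_g, litter_weights_g, q=0.10, cap_g=800.0):
    """Assumed rule: at or below the litter quantile AND at or below the cap."""
    threshold_g = min(quantile(litter_weights_g, q), cap_g)
    return weight_g <= threshold_g


# Hypothetical litter of eleven piglets (birth weights in grams).
heavy_litter = [700, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900]
light_litter = [500, 600, 650, 700, 720, 740, 760, 780, 790, 795, 799]
```

In the hypothetical heavy litter the 800 g cap is the binding criterion, whereas in the light litter the within-litter quantile is. For reference, 39 of 368 observations correspond to a share of about 10.6%.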
A corollary question is which patterns in the kinematic variables cause the different age inferences. We report high magnitude (but also highly variable, i.e. "non-significant") slopes inferred from the age model (supplementary material 7.1). Note that these slopes solely reflect effects within the NBW data subset.
We also observed slight differences in the average hindlimb dynamic posture (Fig. 3). In fact, a more flexed hindlimb is typical for the youngest animals of both birth weight categories. We emphasized potential differences in group composition to explain that (e.g. sex effect in the "size" model), and different age per group might be a proximal explanation for the non-normal age inference in LBW.
However, the average age of LBW animals (5.3 h) in our data set is nominally above that of NBW (3.8 h), which is a discrepancy with the age underestimation. Yet if we assume that the hypothesis of delayed locomotor development is correct, the nominal age would be misleading, and LBW effectively behave similarly to younger animals. This can explain the apparent discrepancy in age group composition and age inferences from kinematics. It also suggests that dynamic posture might be the major proxy for perinatal maturation, though many other parameters also entered the probabilistic model and influenced the model outcome.
To summarize, we herein assembled state-of-the-art computer techniques for the purpose of individual diagnostics in quadruped locomotion, which we think constitute a valuable workflow for livestock screening and management. All components require some manual and computational efforts for initialization (network training, model regression). However, once that is done, the workflow is as follows:
• generate more video recordings (e.g. in an instrumented runway)
• apply the trained Deep Learning network for automated digitization
• identify stride cycles (automatic with framewise Procrustes comparison)
• stride cycle quality filtering by automatic criteria (end-start difference, constant speed, ...)
• Fourier Series Decomposition, temporal alignment, and parameter transformation (PCA)
• probabilistic classification (i.e. posterior predictive sampling) with an inverted model structure
• validation of above-threshold classifications
Except for the last (crucial) step, all of this can be fully automated, and the whole workflow is readily available for precision livestock farming. Monitoring can happen automatically (as in Litten et al., 2003; Netukova et al., 2021), which reduces delay in identifying individuals in need of intervention. Multiple models can be tested in parallel: in the present test case, the "weight" and "size" models found LBW locomotion indistinguishable from the "normal" reference group, whereas the "age" model specifically identified those animals which likely experience a delay in locomotor development. Likewise, tests for specific diseases could be set up. A more extensive (longitudinal) data set and more specific models are required to bring this tool into "clinical" or economical/commercial use, and one purpose of the present study was also to give sufficient explanations and references for readers unfamiliar with the mentioned methods.
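Two of the automated steps listed above, the end-start quality criterion and the Fourier Series Decomposition of a joint angle profile, can be sketched in a few lines of Python. This is a simplified illustration and not the code used in this study: the function names, the synthetic angle profile, and the choice of three harmonics are hypothetical. Temporal alignment and PCA are omitted.

```python
import cmath
import math


def end_start_difference(angles):
    """Quality criterion: a clean stride cycle should end near where it began."""
    return abs(angles[-1] - angles[0])


def fourier_coefficients(angles, order):
    """Complex Fourier series coefficients c_0..c_order of one stride cycle."""
    n = len(angles)
    return [
        sum(a * cmath.exp(-2j * math.pi * k * t / n) for t, a in enumerate(angles)) / n
        for k in range(order + 1)
    ]


def reconstruct(coeffs, n):
    """Rebuild the real-valued profile at n equally spaced phase points.

    Uses c_{-k} = conj(c_k) for real signals; valid for order < n / 2.
    """
    profile = []
    for t in range(n):
        value = coeffs[0].real
        for k, c in enumerate(coeffs[1:], start=1):
            value += 2 * (c * cmath.exp(2j * math.pi * k * t / n)).real
        profile.append(value)
    return profile


# Synthetic joint angle profile (degrees): mean 10, one harmonic of amplitude 5.
n_frames = 20
angles = [10 + 5 * math.cos(2 * math.pi * t / n_frames) for t in range(n_frames)]
coeffs = fourier_coefficients(angles, order=3)
rebuilt = reconstruct(coeffs, n_frames)
```

In this representation the zeroth coefficient carries the mean angle and the higher coefficients carry the cyclic shape, so a stride of arbitrary frame count reduces to a fixed number of parameters that can be reconstructed without loss up to the chosen order.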
Nevertheless, we demonstrated that the modeling workflow is able to provide a high precision, high throughput method for domestic pig locomotor diagnostics.