## Abstract

The inverse of the genomic relationship matrix (**G**^{-1}) is used in single-step genomic BLUP, which combines genomic, pedigree, and phenotype information for the simultaneous genetic evaluation of genotyped and non-genotyped individuals. The rapidly growing number of genotypes makes inverting a huge **G** a computational constraint. The algorithm for proven and young (APY) is an efficient way of addressing this issue. It relies on **G** having a limited dimensionality. Dividing individuals into core and non-core, **G**^{-1} is approximated via the inverse of the partition of **G** for core individuals. The quality of the approximation depends on the core size and composition. The APY algorithm conditions genomic breeding values of the non-core individuals on those of the core individuals, leading to a diagonal block for non-core individuals in the approximated **G**^{-1}. Dividing observations into two groups (*e.g.*, core and non-core, or genotyped and non-genotyped), any symmetric matrix can be expressed in APY and APY inverse expressions, equal to the matrix itself and its inverse, respectively. Replacing **G**^{nn} (the non-core block of **G**^{-1}) with a diagonal matrix is what makes APY an approximation. The application of APY is extendable to the inversion of any large symmetric matrix with a limited dimensionality at a lower computational cost. Possible applications are: computing the pedigree relationship matrix (**A**) from the APY inverse of **A**^{-1}, computing a diagonal block of **A** (same as the previous one, but avoiding unnecessary calculations), and computing the block of the block-diagonal preconditioner matrix corresponding to marker effects for iterative solving of marker effect model equations. Furthermore, APY may improve the matrix's numerical condition.

## 1 Introduction

Genomic evaluations are mainly performed using the genomic relationship matrix **G** in the method called genomic BLUP (GBLUP, VanRaden, 2008) or using random regression SNP marker models called SNP-BLUP (Koivula et al., 2012). The former predicts genomic breeding values of genotyped individuals, and the latter predicts marker effects (i.e., allele substitution effects). To evaluate genotyped and non-genotyped individuals simultaneously, and to obtain optimal and unbiased evaluations not limited to genotyped individuals, both methods were extended to single-step GBLUP (ssGBLUP, Aguilar et al., 2010; Christensen and Lund, 2010) and single-step SNP-BLUP (ss-SNP-BLUP, Fernando et al., 2014), the latter also called the single-step marker effect model.

The number of genotyped individuals is growing rapidly, and the most expensive operation in GBLUP and ssGBLUP is inverting **G**. As the number of genotyped individuals approaches the number of markers, the numerical condition of **G** deteriorates. Once the number of genotyped individuals exceeds the number of markers, **G** becomes singular and non-invertible. Furthermore, the cost of inverting **G** and **A**_{22} (the block of **A** corresponding to genotyped individuals, where **A** is the pedigree-based additive genetic relationship matrix), both required for ssGBLUP, is cubic, and direct inversion becomes a bottleneck for matrices of size about 150,000 (Fragomeni et al., 2015). Three solutions were proposed for this problem (Misztal et al., 2014; Fernando et al., 2016; Mäntysaari et al., 2017), one being the algorithm for proven and young (APY, Misztal et al., 2014). This algorithm belongs to a group of methods called approximate kernel methods or Gaussian process approximations (Snelson and Ghahramani, 2007). APY forms a sparse representation of **G**^{-1} (**G**_{APY}^{-1}), dividing genotyped individuals into core (*c*) and non-core (*n*) subsets. Direct inversion is only required for the block of **G** corresponding to core individuals (**G**_{cc}). Consequently, the *O*((*c* + *n*)^{3}) computational cost is reduced to *O*(*c*^{3}) + *O*(*n*). In the APY algorithm, genomic breeding values of non-core individuals are conditioned on the genomic breeding values of core individuals. This algorithm is based on the assumption that the dimensionality of **G** is limited and that independent chromosome segments explain the rank of **G** (Misztal, 2016). As long as the number of core individuals is greater than the number of independent chromosome segments (Misztal et al., 2014), and the core subset covers the spectrum of **G** (Bermann et al., 2022), it may not take all the genotyped individuals to explain the variation in **G**. Therefore, the variation in **G** can be explained by the core subset, and genomic breeding values of the non-core individuals are expressed as a linear function of those of the core individuals (Bermann et al., 2022). As such, the accuracy of the APY algorithm depends on the core size and composition.

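The matrix **G**_{APY}^{-1} is calculated as (Bermann et al., 2022):

$$
\mathbf{G}_{APY}^{-1} =
\begin{bmatrix}
\mathbf{G}_{cc}^{-1} + \mathbf{P}_{cn}\mathbf{M}_{nn}^{-1}\mathbf{P}_{nc} & -\mathbf{P}_{cn}\mathbf{M}_{nn}^{-1} \\
-\mathbf{M}_{nn}^{-1}\mathbf{P}_{nc} & \mathbf{M}_{nn}^{-1}
\end{bmatrix}
\tag{1}
$$

where **M**_{nn} = **G**_{nn} – **P**_{nc}**G**_{cn}, and **P**_{cn} = **G**_{cc}^{-1}**G**_{cn} = **P**_{nc}′. In practice, diag(**M**_{nn}) is used instead of **M**_{nn}. Strandén et al. (2017) and Bermann et al. (2022) showed that:

$$
\mathbf{G}_{APY} =
\begin{bmatrix}
\mathbf{G}_{cc} & \mathbf{G}_{cn} \\
\mathbf{G}_{nc} & \mathbf{M}_{nn} + \mathbf{P}_{nc}\mathbf{G}_{cn}
\end{bmatrix}
\tag{2}
$$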
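The block construction of the APY inverse described above can be sketched numerically. The following is a minimal illustration, not the implementation used in any of the cited software: the function name `apy_inverse`, the simulated matrix, and the choice of core indices are all assumptions made for the example. With the full **M**_{nn} the result equals the exact inverse; with diag(**M**_{nn}) it becomes the APY approximation.

```python
import numpy as np

def apy_inverse(G, core, diagonal=True):
    """Sketch of the APY inverse of a symmetric matrix G (hypothetical helper).

    core     : indices of the core individuals
    diagonal : if True, M_nn is replaced by diag(M_nn) (the APY approximation);
               if False, the full M_nn is kept and the exact inverse results.
    """
    n_all = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n_all), core)

    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])     # the only block inverted directly
    Gcn = G[np.ix_(core, noncore)]
    Pcn = Gcc_inv @ Gcn                                # P_cn = G_cc^{-1} G_cn
    Mnn = G[np.ix_(noncore, noncore)] - Gcn.T @ Pcn    # M_nn = G_nn - P_nc G_cn
    if diagonal:
        Mnn = np.diag(np.diag(Mnn))
    Mnn_inv = np.linalg.inv(Mnn)

    out = np.zeros_like(G, dtype=float)
    out[np.ix_(core, core)] = Gcc_inv + Pcn @ Mnn_inv @ Pcn.T
    out[np.ix_(core, noncore)] = -Pcn @ Mnn_inv
    out[np.ix_(noncore, core)] = -Mnn_inv @ Pcn.T
    out[np.ix_(noncore, noncore)] = Mnn_inv
    return out

# Simulated example (assumed data): 8 individuals, 20 markers
rng = np.random.default_rng(1)
W = rng.standard_normal((8, 20))
G = W @ W.T / 20 + 0.01 * np.eye(8)
core = [0, 1, 2, 3]

exact = np.linalg.inv(G)
full = apy_inverse(G, core, diagonal=False)    # APY inverse expression, no approximation
approx = apy_inverse(G, core, diagonal=True)   # APY approximation
print(np.allclose(full, exact), np.abs(approx - exact).max())
```

In a production setting the diagonal of **M**_{nn} would be computed element by element and the result stored in sparse form; the dense arrays here only keep the sketch short.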

The aim of this study is to provide new insights and possible applications for the APY algorithm.

## 2 Theory and discussion

### 2.1 The APY and APY inverse expressions

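In this subsection, it is shown that any covariance or inverse covariance matrix (generally, any symmetric matrix) has two expressions, here called the APY and APY inverse expressions. A new way of understanding the properties of the APY inverse expression of **G** (i.e., **G**_{APY}^{-1}) is through the hybrid pedigree-genomic relationship matrix (**H**) used in ssGBLUP. Legarra et al. (2009) derived various forms of the same relationship matrix, which includes full pedigree and genomic information. Denoting genotyped and non-genotyped individuals as 2 and 1:

$$
\mathbf{H} =
\begin{bmatrix}
\mathbf{A}_{11} + \mathbf{A}_{12}\mathbf{A}_{22}^{-1}(\mathbf{G} - \mathbf{A}_{22})\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{G} \\
\mathbf{G}\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{G}
\end{bmatrix}
$$

$$
\mathbf{H} = \mathbf{A} +
\begin{bmatrix}
\mathbf{A}_{12}\mathbf{A}_{22}^{-1}(\mathbf{G} - \mathbf{A}_{22})\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{A}_{12}\mathbf{A}_{22}^{-1}(\mathbf{G} - \mathbf{A}_{22}) \\
(\mathbf{G} - \mathbf{A}_{22})\mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{G} - \mathbf{A}_{22}
\end{bmatrix}
$$

$$
\mathbf{H} = \mathbf{A} +
\begin{bmatrix} \mathbf{A}_{12}\mathbf{A}_{22}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}
\begin{bmatrix} \mathbf{I} \\ \mathbf{I} \end{bmatrix}
(\mathbf{G} - \mathbf{A}_{22})
\begin{bmatrix} \mathbf{I} & \mathbf{I} \end{bmatrix}
\begin{bmatrix} \mathbf{A}_{22}^{-1}\mathbf{A}_{21} & \mathbf{0} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}
$$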

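It is worth mentioning that replacing **G** with **A**_{22} in any of these equations turns **H** into **A**. Similarly, replacing **G** with **G**_{nn} and **A** with **G** turns **H** into **G**. The above equations can be simplified to:

$$
\mathbf{H} =
\begin{bmatrix}
\mathbf{A}_{11} + \mathbf{P}_{12}(\mathbf{G} - \mathbf{A}_{22})\mathbf{P}_{21} & \mathbf{P}_{12}\mathbf{G} \\
\mathbf{G}\mathbf{P}_{21} & \mathbf{G}
\end{bmatrix}
=
\begin{bmatrix}
(\mathbf{A}^{11})^{-1} + \mathbf{P}_{12}\mathbf{G}\mathbf{P}_{21} & \mathbf{P}_{12}\mathbf{G} \\
\mathbf{G}\mathbf{P}_{21} & \mathbf{G}
\end{bmatrix}
\tag{6}
$$

where the projection matrix **P**_{12} = **A**_{12}**A**_{22}^{-1} = –(**A**^{11})^{-1}**A**^{12}, **P**_{21} = **P**_{12}′, and superscripts denote blocks of **A**^{-1}. A nice property of **H** is that its inverse can be derived directly, with no need to form and invert **H** (Aguilar et al., 2010; Christensen and Lund, 2010):

$$
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1}
\end{bmatrix}
\tag{8}
$$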

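Matrix **H**^{-1} replaces **A**^{-1} in BLUP for ssGBLUP. Replacing **G** with **M**_{nn}^{-1}, the blocks of **A**^{-1} with those of **G**, and notations 1 and 2 with *c* and *n*, respectively, turns Eq. 6 into Eq. 1. This shows that Eq. 6 is the APY inverse expression of **H**^{-1}. Following Eq. 2, the APY expression of **H**^{-1} is:

$$
\mathbf{H}^{-1} =
\begin{bmatrix}
\mathbf{A}^{11} & \mathbf{A}^{12} \\
\mathbf{A}^{21} & \mathbf{M}^{22} + \mathbf{P}^{21}\mathbf{A}^{12}
\end{bmatrix}
$$

where **M**^{22} = **H**^{22} – **P**^{21}**A**^{12}, **P**^{12} = (**A**^{11})^{-1}**A**^{12}, and **P**^{21} = (**P**^{12})′. Because **A**_{22}^{-1} = **A**^{22} – **A**^{21}(**A**^{11})^{-1}**A**^{12}, it follows that **M**^{22} = **G**^{-1}. Similarly, there are APY and APY inverse expressions for **H**.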
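The relationships in this subsection can be checked numerically. The sketch below is only a sanity check under stated assumptions: random symmetric positive definite matrices stand in for **A** and **G** (they are not real relationship matrices), and the helper `random_spd` is hypothetical. It builds **H** from the blocks of **A**, confirms that its inverse equals **A**^{-1} plus the correction in the genotyped block, and confirms that **H** can equally be written from the blocks of **A**^{-1} in the same form as the APY inverse.

```python
import numpy as np

rng = np.random.default_rng(7)

def random_spd(k):
    """Random symmetric positive definite matrix (stand-in for A or G)."""
    X = rng.standard_normal((k, 2 * k))
    return X @ X.T / (2 * k) + 0.1 * np.eye(k)

n1, n2 = 3, 4                      # non-genotyped (1) and genotyped (2) individuals
A = random_spd(n1 + n2)
G = random_spd(n2)

A11, A12 = A[:n1, :n1], A[:n1, n1:]
A22 = A[n1:, n1:]
P12 = A12 @ np.linalg.inv(A22)     # projection matrix P_12 = A_12 A_22^{-1}

# H written from the blocks of A
H = np.block([[A11 + P12 @ (G - A22) @ P12.T, P12 @ G],
              [G @ P12.T,                     G]])

# Direct inverse: A^{-1} plus a correction in the genotyped block
Ainv = np.linalg.inv(A)
Hinv_direct = Ainv.copy()
Hinv_direct[n1:, n1:] += np.linalg.inv(G) - np.linalg.inv(A22)

# The same H written from the blocks of A^{-1} (APY inverse form)
A11_sup_inv = np.linalg.inv(Ainv[:n1, :n1])       # (A^{11})^{-1}
P12_sup = A11_sup_inv @ Ainv[:n1, n1:]            # (A^{11})^{-1} A^{12}
H_apy_form = np.block([[A11_sup_inv + P12_sup @ G @ P12_sup.T, -P12_sup @ G],
                       [-G @ P12_sup.T,                        G]])

print(np.allclose(np.linalg.inv(H), Hinv_direct), np.allclose(H_apy_form, H))
```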

### 2.2 Understanding the differences between G^{-1} and G_{APY}^{-1}

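Considering Eqs. 1 and 2, as long as no change is made to **M**_{nn}, the APY and the APY inverse expressions of **G** are equal to **G** and **G**^{-1}, respectively. Matrix **G**_{APY}^{-1} becomes an approximation of **G**^{-1} when **M**_{nn} is changed to a diagonal matrix with diagonal elements:

$$
m_{ii} = g_{ii} - \mathbf{g}_{ic}\mathbf{G}_{cc}^{-1}\mathbf{g}_{ci}
\tag{10}
$$

representing genomic Mendelian sampling (Misztal et al., 2014). Using Eq. 10, calculations can be parallelised across all genotyped individuals. By comparison, the corresponding block of the exact inverse, **G**^{nn}, is:

$$
\mathbf{G}^{nn} = (\mathbf{G}_{nn} - \mathbf{G}_{nc}\mathbf{G}_{cc}^{-1}\mathbf{G}_{cn})^{-1} = \mathbf{M}_{nn}^{-1}
$$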

The change of **M**_{nn} to diag(**M**_{nn}) is propagated to the other blocks of **G**_{APY}^{-1} via the projection matrix **P**_{cn} (Eq. 1). **G**_{APY} differs from **G** only in the off-diagonal elements of **G**_{nn} (Strandén et al., 2017). Following Eq. 2, **M**_{nn} + **P**_{nc}**G**_{cn} – **G**_{nn} = **0**. Thus, replacing **M**_{nn} with diag(**M**_{nn}) replaces offdiag(**G**_{nn}) with offdiag(**P**_{nc}**G**_{cn}). Therefore, genomic relationships among non-core individuals become a function of **G**_{cc} and **G**_{cn}. The efficiency of the APY algorithm depends on how well offdiag(**P**_{nc}**G**_{cn}) replaces offdiag(**G**_{nn}).
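The effect described in this paragraph can be illustrated with a small numerical sketch. The data are simulated and the variable names are assumptions for the example only. The code inverts the APY inverse and confirms that the implied matrix agrees with **G** everywhere except in the off-diagonal elements of the non-core block, which become those of **P**_{nc}**G**_{cn}.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.standard_normal((7, 30))            # assumed genotype-like matrix
G = W @ W.T / 30 + 0.01 * np.eye(7)
c = 3                                       # the first c individuals form the core

Gcc, Gcn, Gnn = G[:c, :c], G[:c, c:], G[c:, c:]
Pcn = np.linalg.solve(Gcc, Gcn)             # P_cn = G_cc^{-1} G_cn
Mnn = Gnn - Gcn.T @ Pcn                     # M_nn = G_nn - P_nc G_cn
D = np.diag(np.diag(Mnn))                   # diag(M_nn)
Dinv = np.diag(1.0 / np.diag(Mnn))

# APY inverse with diag(M_nn) in place of M_nn
G_apy_inv = np.block([[np.linalg.inv(Gcc) + Pcn @ Dinv @ Pcn.T, -Pcn @ Dinv],
                      [-Dinv @ Pcn.T,                           Dinv]])
G_apy = np.linalg.inv(G_apy_inv)

# Non-core block implied by the approximation: diag(M_nn) + P_nc G_cn
implied_nn = D + Gcn.T @ Pcn

print(np.allclose(G_apy[:c, :], G[:c, :]))          # core rows unchanged
print(np.allclose(G_apy[c:, c:], implied_nn))       # non-core block replaced
print(np.abs(G_apy[c:, c:] - Gnn).max())            # size of the off-diagonal change
```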

### 2.3 Other applications

The application of the APY algorithm is not limited to **G**^{-1}, nor to ssGBLUP and GBLUP. This algorithm can be applied to approximate the inverse of any large symmetric matrix, where the rank of the matrix is smaller than its dimension. Representing any such matrix with **G**, only **G**_{cc} needs to be inverted. Besides reduced matrix inversion cost, there are sparsity-related reduced computational costs.

The first and only time the APY algorithm was suggested for inverting a matrix other than **G** was by Misztal et al. (2014). They suggested the APY algorithm for the inversion of **A**_{22}, which is required in ssGBLUP (Eq. 8). They derived an equivalent formula for the APY approximation of **A**_{22}^{-1}:

$$
(\mathbf{A}_{22}^{-1})_{APY} =
\begin{bmatrix} \mathbf{A}_{cc}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}
+
\begin{bmatrix} -\mathbf{A}_{cc}^{-1}\mathbf{A}_{cn} \\ \mathbf{I} \end{bmatrix}
\mathbf{M}_{nn}^{-1}
\begin{bmatrix} -\mathbf{A}_{nc}\mathbf{A}_{cc}^{-1} & \mathbf{I} \end{bmatrix}
$$

Here, the diagonal elements of **M**_{nn} equal *m*_{ii} = *a*_{ii} – **a**_{ic}**A**_{cc}^{-1}**a**_{ci}, where *i* is a non-core genotyped individual. The **a**_{ci} vectors (columns of **A**_{cn}) can be efficiently computed using the Colleau algorithm (Colleau, 2002), which can be done in parallel for many vectors at a time. The *a*_{ii} elements (diag(**A**_{nn})) are easy to compute by applying the fast and efficient algorithms available for computing inbreeding coefficients (Tier, 1990; Meuwissen and Luo, 1992; Sargolzaei and Iwaisaki, 2005; Sargolzaei et al., 2005). However, computing **A**_{22}^{-1} via the APY algorithm is a circular problem: to obtain the inverse of a block of **A** (i.e., **A**_{22}^{-1}), the inverse of its sub-block (**A**_{cc}^{-1}) is required. There are two other well-established methods for the calculation of **A**_{22}^{-1} (Colleau, 2002; Faux and Gengler, 2013).

Conversely, one may apply the APY algorithm to invert **A**^{-1} to **A**. Though calculating **A** is computationally expensive, calculating **A**^{-1} is fast and efficient (Henderson, 1975), even for large populations. The computational cost of inverting **A**^{-1} to **A** can be reduced by obtaining an APY inverse of **A**^{-1}:

$$
\mathbf{A}_{APY} =
\begin{bmatrix} (\mathbf{A}^{cc})^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}
+
\begin{bmatrix} -(\mathbf{A}^{cc})^{-1}\mathbf{A}^{cn} \\ \mathbf{I} \end{bmatrix}
(\mathbf{M}^{nn})^{-1}
\begin{bmatrix} -\mathbf{A}^{nc}(\mathbf{A}^{cc})^{-1} & \mathbf{I} \end{bmatrix}
\tag{13}
$$

where **M**^{nn} is a diagonal matrix with diagonal elements *m*^{ii} = *a*^{ii} – **a**^{ic}(**A**^{cc})^{-1}**a**^{ci}. Matrix **A**^{nn} is sparse. Thus, compared to **G**^{nn}, there are considerably fewer non-zero off-diagonal elements set to 0. On the other hand, the choices of core size and core composition are likely to be more important. In the APY algorithm, relationships among non-core individuals are conditioned on the information from core individuals. In **A**, the number of relatives that can explain the relationships between a non-core individual and other non-core individuals is limited. Thus, the choice of core individuals becomes more difficult. In contrast, in **G** all individuals share information via many markers, regardless of whether they are relatives.

If, rather than **A**, only a diagonal block of it (**A**_{cc}) is needed, some of the calculations in Eq. 13 become redundant, and **A**_{cc} can be calculated as:

$$
(\mathbf{A}_{cc})_{APY} = (\mathbf{A}^{cc})^{-1} + (\mathbf{A}^{cc})^{-1}\mathbf{A}^{cn}(\mathbf{M}^{nn})^{-1}\mathbf{A}^{nc}(\mathbf{A}^{cc})^{-1}
$$

When calculating (**A**_{cc})_{APY}, there is no choice of core size and composition, because the individuals whose relationship coefficients are to be approximated have already been chosen. The APY approximation of **A**_{cc} might be influenced by **A**^{nn} being changed to the diagonal (**M**^{nn})^{-1}. Should the APY approximation need improvement, the researcher might consider adding a chosen group of non-core individuals to the core subset. An application for **A**_{cc} is to calculate **A**_{22} for blending with **G**, to improve the numerical condition of **G** and to introduce residual polygenic variance not captured by the markers.
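A small sketch may help to see how a block of **A** is recovered from the blocks of **A**^{-1}. Everything here is assumed for illustration: the eight-animal pedigree, the helper `tabular_A`, and the choice of the last three animals as the block of interest. For brevity, **A**^{-1} is obtained by inverting a tabular-method **A**; in practice it would be built directly from the pedigree. With the full **M**^{nn} the block is exact, and with diag(**M**^{nn}) it is the APY approximation.

```python
import numpy as np

def tabular_A(sires, dams):
    """Tabular method for A; parents must precede offspring, -1 means unknown."""
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 + inbreeding coefficient (half the parents' relationship)
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
        for j in range(i):
            a = 0.0
            if s >= 0:
                a += 0.5 * A[j, s]
            if d >= 0:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A

# Hypothetical pedigree: animals 0-2 are founders
sires = [-1, -1, -1, 0, 0, 3, 3, 5]
dams = [-1, -1, -1, 1, 2, 4, 2, 6]
A = tabular_A(sires, dams)
Ainv = np.linalg.inv(A)          # stand-in for a directly constructed A^{-1}

core = [5, 6, 7]                 # individuals whose relationships are wanted
non = [0, 1, 2, 3, 4]

Acc_sup_inv = np.linalg.inv(Ainv[np.ix_(core, core)])   # (A^{cc})^{-1}
K = Acc_sup_inv @ Ainv[np.ix_(core, non)]               # (A^{cc})^{-1} A^{cn}
M_full = Ainv[np.ix_(non, non)] - Ainv[np.ix_(non, core)] @ K

def a_cc(M):
    """Block of A for the core individuals, given a choice of M^{nn}."""
    return Acc_sup_inv + K @ np.linalg.inv(M) @ K.T

exact_block = a_cc(M_full)                       # no approximation
apy_block = a_cc(np.diag(np.diag(M_full)))       # APY: diagonal M^{nn}
print(np.allclose(exact_block, A[np.ix_(core, core)]))
print(np.abs(apy_block - A[np.ix_(core, core)]).max())
```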

The APY algorithm helped overcome the limitations of inverting **G**. In contrast, this constraint does not exist for marker effect models (*i.e.*, SNP-BLUP and ss-SNP-BLUP), because a marker × marker matrix, which does not need to be inverted, is used instead of **G**^{-1}. This advantage comes at the price of dense matrix multiplications and convergence complexities (Vandenplas et al., 2018; Bermann et al., 2022). Unlike **G**, the size of that matrix remains constant over time unless the genotyping platform changes and the old genotypes are imputed to a genotyping platform with a higher marker density. In fact, GBLUP and SNP-BLUP are equivalent models (Bermann et al., 2022). Conversion formulas between these two models are presented in the Appendix.

The mixed model equations (MME) of the marker effect models do not require direct matrix inversion (Fernando et al., 2014). Only **A**^{-1} is needed, which is easy to obtain without forming **A**. However, due to convergence difficulties, a specialised preconditioned conjugate gradient (PCG) solver with a block-diagonal Jacobi preconditioner matrix is applied, which has been extended from single-trait to multi-trait analyses (Harris et al., 2022). As such, a marker × marker diagonal block of the MME (here called **Q**) is inverted, which is expanded by the number of traits in the model. The APY algorithm is a good candidate for this scenario, where the markers are divided into core and non-core. Only the block corresponding to core markers (**Q**_{cc}) is inverted. The rules that apply to **G**_{APY}^{-1} also apply to this scenario, with the difference that the roles of markers and genotyped individuals are switched. Due to collinearity in the marker × individual genotype matrix, this matrix is not of full rank. The main source of collinearity is markers with low minor allele frequency. Also, it would probably not take all the genotyped individuals to explain marker effects. Therefore, **Q** has a limited dimensionality, and the off-diagonal elements of **Q**_{nn} (in the preconditioner matrix, not in the MME) are conditioned on **Q**_{cc} and **Q**_{cn}. Matrix **Q**_{APY}^{-1} is a preconditioner matrix with preconditioning properties similar to those of **Q**^{-1}. Though the number of PCG iterations might differ, storing **Q**_{APY}^{-1} in memory is cheaper, and each PCG iteration is expected to be faster.
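The idea of using an APY inverse of a marker × marker block as a preconditioner can be sketched as follows. This is a toy single-block illustration under several assumptions, not the solver of Harris et al. (2022): the matrix `Q` is simulated as **W**′**W** plus a ridge term, the split into core and non-core markers is arbitrary, and the names `make_apy_preconditioner` and `pcg` are hypothetical. The preconditioner is applied as an operator, so the dense APY inverse is never formed and only the core block is factorised.

```python
import numpy as np

def make_apy_preconditioner(Q, core, noncore):
    """Return a function r -> Q_APY^{-1} r without forming the dense inverse."""
    Qcc = Q[np.ix_(core, core)]
    Qcn = Q[np.ix_(core, noncore)]
    L = np.linalg.cholesky(Qcc)                 # only the core block is factorised

    def solve_cc(b):
        # Two solves with the Cholesky factor give Q_cc^{-1} b
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    Pcn = solve_cc(Qcn)                         # P_cn = Q_cc^{-1} Q_cn
    # m_ii = q_ii - q_ic Q_cc^{-1} q_ci for every non-core marker i
    m = np.diag(Q)[noncore] - np.einsum("ij,ij->j", Qcn, Pcn)

    def apply(r):
        z = np.empty_like(r, dtype=float)
        t = (r[noncore] - Pcn.T @ r[core]) / m  # diag(M)^{-1} (r_n - P_nc r_c)
        z[noncore] = t
        z[core] = solve_cc(r[core]) - Pcn @ t
        return z

    return apply

def pcg(Q, b, precond, tol=1e-10, max_iter=500):
    """Textbook preconditioned conjugate gradient for a symmetric positive definite Q."""
    x = np.zeros_like(b, dtype=float)
    r = b - Q @ x
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, max_iter + 1):
        Qp = Q @ p
        alpha = rz / (p @ Qp)
        x += alpha * p
        r -= alpha * Qp
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, it

# Simulated example (assumed): 6 individuals, 10 markers, ridge term of 1
rng = np.random.default_rng(5)
W = rng.standard_normal((6, 10))
Q = W.T @ W + np.eye(10)
b = rng.standard_normal(10)
core, noncore = np.arange(4), np.arange(4, 10)

x, n_iter = pcg(Q, b, make_apy_preconditioner(Q, core, noncore))
print(n_iter, np.allclose(x, np.linalg.solve(Q, b)))
```

Storage is limited to the Cholesky factor of **Q**_{cc}, the matrix **P**_{cn}, and one value per non-core marker, which is the memory saving referred to above.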

The core size and composition define the APY accuracy. Core size, whose optimum is a function of the effective population size (Pocrnic et al., 2016), is the most important. As long as there is room to increase the core size to span over 98% of the eigenvalue spectrum of **G**, a random set of core individuals has been shown to perform well, because it gives good coverage over generations and breeds in the population (Nilforooshan and Lee, 2019). The problem of non-identical results for random cores and the same data can be addressed by saving the identification of the core individuals. There is ongoing research on finding the optimal core subset, and it is an important topic for admixed populations and when the core size is constrained. When the core size is limited, an optimum core composition can capture a larger part of the variation in **G**. Though with a sufficiently large core size the gain from an optimal core subset would be marginal (Nilforooshan and Lee, 2019), if screening for the optimal core subset is computationally affordable, it would be preferred over a random core subset.
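The link between core size and the eigenvalue spectrum can be made concrete with a short sketch. The function name `core_size_for_spectrum` and the toy matrix are assumptions for illustration; the function simply counts how many of the largest eigenvalues are needed to reach a given proportion of the total.

```python
import numpy as np

def core_size_for_spectrum(G, proportion=0.98):
    """Number of largest eigenvalues of G needed to reach `proportion` of their sum."""
    eigvals = np.linalg.eigvalsh(G)[::-1]        # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)        # guard against tiny negative values
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # First position where the cumulative proportion reaches the target
    return int(np.searchsorted(cumulative, proportion) + 1)

# Toy matrix with known eigenvalues 50, 30, 15, 4, 1 (sum 100)
G_toy = np.diag([50.0, 30.0, 15.0, 4.0, 1.0])
print(core_size_for_spectrum(G_toy, 0.98))       # 4 eigenvalues reach 99% >= 98%
print(core_size_for_spectrum(G_toy, 0.90))       # 3 eigenvalues reach 95% >= 90%
```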

The APY accuracy is usually measured by the correlation between genomic breeding values obtained via **G**^{-1} and **G**_{APY}^{-1}. However, a correlation coefficient slightly less than 1 might be acceptable. A small part of the variation in **G** might be due to collinearity and noise, and is good to discard. The APY algorithm may help reduce the collinearity and noise in **G**. Nilforooshan and Lee (2019) showed that APY reduced the very large max(diag(**G**^{-1})), which is a sign of reduced collinearity and improved condition of **G**. Validation of genomic breeding values is a good complementary check.

It is unknown what proportion of random markers would cover over 98% of the eigenvalue spectrum of **Q**. Analogous to the effective population size defining the optimum number of core individuals for **G**_{APY}^{-1}, there might be a concept of effective marker size defining the optimum number of core markers for **Q**_{APY}^{-1}. Such markers are likely segregating in the coding regions, with effects as independent and orthogonal as possible to other markers; a concept similar to independent chromosome segments, equal to 2*N*_{e}*L*/log(4*N*_{e}*L*) (Goddard, 2009), where *N*_{e} is the effective population size, and *L* is the length of the chromosome in Morgans. Therefore, **G** and **Q** might have similar dimensionality, and the required core size might be the same for both. Possibly, choosing markers corresponding to the highest diagonal elements of **Q** is better than a random set of core markers. This is because those markers cover a larger part of the variation in **Q** (i.e., *trace*(**Q**) = ∑ *eigenvalue*(**Q**)). This would favour choosing markers with lower minor allele frequency. An optimised core subset may reduce the need for a larger core size (i.e., the same variation in **Q** captured by a smaller set of markers). Future research is needed on this topic.
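For orientation, the number of independent chromosome segments given by the formula above can be evaluated with a few lines of code. The function name and the example values of *N*_{e} and *L* are assumptions chosen only for illustration, and the logarithm is taken to be the natural logarithm.

```python
import math

def independent_segments(ne, morgans):
    """Number of independent chromosome segments, 2*Ne*L / log(4*Ne*L).

    ne      : effective population size
    morgans : genome (chromosome) length in Morgans
    The natural logarithm is assumed here.
    """
    return 2.0 * ne * morgans / math.log(4.0 * ne * morgans)

# Hypothetical values: Ne = 100 and L = 30 Morgans
me = independent_segments(100, 30)
print(round(me))
```

Under this formula, a larger effective population size implies more independent segments and, by the argument in the text, a larger required core.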

## Conclusions

This study aimed to provide new insights into the APY algorithm and to introduce possible new applications of it. Starting from the **H** matrix formula, it was shown that every covariance or inverse covariance matrix can be expressed as a combination of its two diagonal blocks (the diagonal blocks for genotyped and non-genotyped individuals in **H**). The projection matrix provides the link (information flow) between the two diagonal blocks. Furthermore, it was shown that any covariance or inverse covariance matrix has APY and APY inverse expressions equal to the matrix itself and its inverse, respectively. The difference arises when a diagonal block of the APY inverse (corresponding to non-core individuals) changes to a specific diagonal matrix. That change is projected to the rest of the inverse matrix via the projection matrix. That diagonal matrix makes non-core individuals independent of each other, conditional on the coefficients provided by the core individuals. The APY algorithm can also be understood as an (approximate) absorption of the off-diagonal elements of a diagonal block into the rest of the matrix.

The APY algorithm is based on the concept of limited dimensionality. A genomic relationship matrix has a limited dimensionality equivalent to the number of independent chromosome segments, which allows a reduction in the dimensionality of **G**. Therefore, inverting **G** only requires the inverse of a diagonal block of **G**. An APY inverse of **G** with a sufficient core size and proper core composition produces genomic breeding values analogous to those obtained with the exact **G**^{-1}. Possible new applications for APY are: computing **A**, computing a diagonal block of **A**, and computing the block of the block-diagonal preconditioner matrix corresponding to marker effects for iterative solving of marker effect model equations. The application of APY is not limited to obtaining the best sparse approximations of **G**^{-1}, and new applications may emerge in the future.

## Appendix

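Considering the MME for GBLUP:

$$
\begin{bmatrix}
\mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\
\mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \mathbf{G}^{-1}\sigma_e^2/\sigma_u^2
\end{bmatrix}
\begin{bmatrix} \hat{\mathbf{b}} \\ \hat{\mathbf{u}} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$

and **G** = **WW**′, conversion of GBLUP to SNP-BLUP MME follows from substituting **û** = **Wâ** and pre-multiplying the second block of equations by **W**′, noting that **W**′**G**^{-1}**Wâ** = **â**:

$$
\begin{bmatrix}
\mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z}\mathbf{W} \\
\mathbf{W}'\mathbf{Z}'\mathbf{X} & \mathbf{W}'\mathbf{Z}'\mathbf{Z}\mathbf{W} + \mathbf{I}\sigma_e^2/\sigma_u^2
\end{bmatrix}
\begin{bmatrix} \hat{\mathbf{b}} \\ \hat{\mathbf{a}} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{W}'\mathbf{Z}'\mathbf{y} \end{bmatrix}
$$

On the other hand, the conversion of SNP-BLUP to GBLUP is as follows. Substituting **â** = **W**′**G**^{-1}**û** and pre-multiplying the second block of equations by **G**^{-1}**W** (so that **G**^{-1}**WW**′ = **I**) returns the GBLUP MME, with:

$$
\hat{\mathbf{u}} = \mathbf{W}\hat{\mathbf{a}}, \qquad \hat{\mathbf{a}} = \mathbf{W}'\mathbf{G}^{-1}\hat{\mathbf{u}}
$$

where $\hat{\mathbf{b}}$, **û** and **â** are the vectors of solutions for fixed effects, individuals' additive genetic merit, and marker effects, respectively, $\sigma_e^2$ is the residual variance, and $\sigma_u^2$ is the additive genetic variance captured by markers.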
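The equivalence between the two sets of mixed model equations can be checked numerically. The sketch below uses simulated data and makes several assumptions for illustration: a single overall mean as the only fixed effect, one record per individual (so the incidence matrix is an identity matrix), more markers than individuals so that **G** = **WW**′ is invertible, and a variance ratio `lam` (residual variance over marker-captured additive variance) fixed at an arbitrary value.

```python
import numpy as np

rng = np.random.default_rng(11)
n, m = 6, 15                                   # individuals, markers (m > n)
W = rng.standard_normal((n, m)) / np.sqrt(m)   # assumed scaled genotype matrix
G = W @ W.T
X = np.ones((n, 1))                            # overall mean only
Z = np.eye(n)                                  # one record per individual
y = rng.standard_normal(n)
lam = 2.0                                      # assumed residual / additive variance ratio

# GBLUP mixed model equations
lhs_g = np.block([[X.T @ X, X.T @ Z],
                  [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(G)]])
rhs_g = np.concatenate([X.T @ y, Z.T @ y])
sol_g = np.linalg.solve(lhs_g, rhs_g)
b_gblup, u_hat = sol_g[:1], sol_g[1:]

# SNP-BLUP mixed model equations
ZW = Z @ W
lhs_s = np.block([[X.T @ X, X.T @ ZW],
                  [ZW.T @ X, ZW.T @ ZW + lam * np.eye(m)]])
rhs_s = np.concatenate([X.T @ y, ZW.T @ y])
sol_s = np.linalg.solve(lhs_s, rhs_s)
b_snp, a_hat = sol_s[:1], sol_s[1:]

# Conversions between the two sets of solutions
print(np.allclose(b_gblup, b_snp))
print(np.allclose(u_hat, W @ a_hat))                       # u = W a
print(np.allclose(a_hat, W.T @ np.linalg.solve(G, u_hat))) # a = W' G^{-1} u
```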