# GCA: An R package for genetic connectedness analysis using pedigree and genomic data

## Introduction

Genetic connectedness quantities the extent of linkage between individuals across management units. The concept of genetic connectedness can be extended to measure a connectedness level between training and testing sets in the whole-genome prediction area. However, there is no user-friendly R package that offers a comprehensive list of connectedness metrics. In this project, we aim to develop an R package, which utilizes either pedigree or genomic data to measure the connectedness between individuals across units.

This package implements three connectedness metrics which are functions of prediction error variance (PEV) matrix, including prediction error variance of difference (PEVD), coefficient of determination (CD) and prediction error correlation (r), coupled with three summary methods for each statistic. For example, the PEVD across units is summarized as 1) average PEV of all pairwise differences between individuals across units; 2) average PEV within and across units; 3) use a contrast vector. Analogous summary methods will be also applied to CD and r statistics. Three additional metrics approximating connectedness using variance of estimates of unit effects (VE) are included, such as variance of estimates of units effects difference (VED), coefficient of determination of VED (CDVED), and connectedness rating (CR). Within each metric, three different methods are names according to the number of corrected factors (e.g., 0, 1 and 2) for fixed effects, such as VED0, VED1 and VED2. Similar corrected function is also applied for metrics of CDVED and CR.

This R package is hosting on a GitHub page accompanied with a detailed vignette document. The core functions of the R code are rewritten with C++ to improve computational speed by taking the advantage of the Rcpp package. We expect this R package will provide a comprehensive tool for genetic connectedness analysis and whole genome prediction.

## Core functions of connectedness metrics

### Prediction Error Variance

A PEV matrix typically serves as the kernel of connectedness measures, which can be obtained from Henderson’s mixed model equations (MME) (Henderson 1984). Assume a standard linear mixed model $$\mathbf{y} = \mathbf{Xb} + \mathbf{Zu} + \boldsymbol{\epsilon}$$, where $$\mathbf{y}$$, $$\mathbf{b}$$, $$\mathbf{u}$$ and $$\boldsymbol{\epsilon}$$ refer to a vector of phenotypes, effects of management units, random additive genetic effects and residuals, respectively. The $$\mathbf{X}$$ and $$\mathbf{Z}$$ are incidence matrices which associate management units effects and individuals to phenotypes, respectively. The variance-covariance structure of this mixed model follows \begin{align*} \begin{pmatrix} \mathbf{y} \\ \mathbf{u} \\ \mathbf{\boldsymbol{\epsilon}} \end{pmatrix} &\sim N \left [ \begin{pmatrix} \mathbf{Xb}\\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \mathbf{ZK}\sigma^2_u \mathbf{Z}' + \mathbf{I}\sigma^2_{\epsilon} & \mathbf{ZK}\sigma^2_u & \mathbf{I}\sigma^2_{\epsilon} \\ \mathbf{KZ'}\sigma^2_u & \mathbf{K}\sigma^2_u & 0 \\ \mathbf{I}\sigma^2_{\epsilon} & 0 & \mathbf{I}\sigma^2_{\epsilon} \end{pmatrix} \right ] . \end{align*} where $$\sigma^2_u$$ and $$\sigma^2_e$$ refer to additive genetic variance and residual variance, respectively. The $$\mathbf{K}$$ is a positive (semi)definite relationship matrix. Following this, the mixed model equation (Henderson 1984) can be written as \begin{align} \begin{bmatrix} \mathbf{X'X} & \mathbf{X'Z} \\ \mathbf{Z'X} & \mathbf{Z'Z + K^{-1}\lambda} \end{bmatrix} \begin{bmatrix} \mathbf{\widehat{b}}\\ \mathbf{\widehat{u}} \end{bmatrix} \mathbf{=} \begin{bmatrix} \mathbf{X'y} \\ \mathbf{Z'y} \end{bmatrix} \label{MMEeq} \end{align} where $$\lambda$$ is a ratio of variance components which equals to $$\frac{\sigma^2_{\epsilon}}{\sigma^2_u}$$. The inverse of the MME coefficient matrix derived from this model is \begin{align*} \mathbf{C}^{-1} &= \begin{bmatrix} \mathbf{X'X} & \mathbf{X'Z} \\ \mathbf{Z'X} & \mathbf{Z'Z} + \mathbf{K}^{-1} \lambda \end{bmatrix}^{-1} \\ &= \begin{bmatrix} \mathbf{C}^{11} & \mathbf{C}^{12} \\ \mathbf{C}^{21} & \mathbf{C}^{22} \end{bmatrix}. \end{align*} With this, the PEV of $$u$$ is defined as (Henderson 1984) \begin{align*} \text{PEV}(u) &= \text{Var}(\widehat{u} - u) \\ &= \text{Var}(u | \widehat{u}) \\ &= (\mathbf{Z'MZ} + \mathbf{K}^{-1} \lambda)^{-1} \sigma^2_{\epsilon} \\ &= \mathbf{C}^{22} \sigma^2_{\epsilon}, \end{align*} where $$\mathbf{M} = \mathbf{I} - \mathbf{X}(\mathbf{X}' \mathbf{X})^{-}\mathbf{X}'$$ is the absorption (projection) matrix for fixed effects.

### Variance of estimates of management units effects

The derivation of PEV oftentimes results in a heavy computational demanding for the high-dimensional dataset. Kennedy and Trus (1993) proposed a direct approximation of mean PEV over unit (PEV_Mean) based on VE, that is

\begin{align} \text{Var}(\widehat{b}) = [\mathbf{X}' \mathbf{X} - \mathbf{X}' \mathbf{Z}(\mathbf{Z'Z} + \mathbf{K}^{-1} \lambda)^{-1}\mathbf{Z}' \mathbf{X}]^{-1}\sigma^2_{\epsilon}. \end{align}

However, Holmes, Dodds, and Lee (2017) pointed out this equation is not an exact approximation of PEV_Mean, and an according correction is suggested to achieve the equality. When unit effect is the only fixed effect including in the model, an approximate mean of PEV over unit is $\begin{equation*} \text{PEV_Mean} = \text{Var}(\widehat{b}) - \sigma^2_{\epsilon} (\mathbf{X}' \mathbf{X})^{-1}, \end{equation*}$ where $$\sigma^2_{\epsilon} (\mathbf{X}' \mathbf{X})^{-1}$$ corrects the number of records within management units. An additional correction was added to derive the mean of PEV over unit which accounts for the additional fixed effects excepts unit effect in the model as (Holmes, Dodds, and Lee 2017) \begin{align*} \text{PEV_Mean} &= \text{Var}(\widehat{b_1}) - \sigma^2_{\epsilon} (\mathbf{X_1}' \mathbf{X_1})^{-1} \\ & + (\mathbf{X_1}'\mathbf{X_1})^{-1}\mathbf{X_1}'\mathbf{X_2}\text{Var}(\widehat{b_2})\mathbf{X_2}'\mathbf{X_1}(\mathbf{X_1}'\mathbf{X_1})^{-1} \\ & + (\mathbf{X_1}'\mathbf{X_1})^{-1}\mathbf{X_1}'\mathbf{X_2}\text{Cov}(\widehat{b_2}, \widehat{b_1}) \\ & + \text{Cov}(\widehat{b_1}, \widehat{b_2})\mathbf{X_2}'\mathbf{X_1}(\mathbf{X_1}'\mathbf{X_1})^{-1} \end{align*} where the $$\mathbf{X_1}$$ and $$\mathbf{X_2}$$ represent incidence matrix for management units and other fixed effects, respectively. The $$\widehat{b_1}$$ refers to the estimates of management units effects, and $$\widehat{b_2}$$ indicates the estimates of other fixed effects.

## Connectedness Metrics

In general, connectedness metrics can be grouped into two sets, which are the function of PEV and VE, respectively.

### Function of PEV

#### Prediction Error Variance of Difference

The PEVD (Kennedy and Trus 1993) measures the prediction error variance difference of breeding values between individuals from different management units. The PEVD between two individuals of $$i$$ and $$j$$ can be expressed as \begin{align*} \text{PEVD}(\widehat{u_i} - \widehat{u_j}) &= [ \text{PEV}(\widehat{u_i}) + \text{PEV}(\widehat{u_j}) - 2 \text{PEC}(\widehat{u_i}, \widehat{u_j})] \\ &= (\mathbf{C}^{22}_{ii} - \mathbf{C}^{22}_{ij} - \mathbf{C}^{22}_{ji} + \mathbf{C}^{22}_{jj}) \sigma^2_{\epsilon} \\ &= (\mathbf{C}^{22}_{ii} + \mathbf{C}^{22}_{jj} - 2\mathbf{C}^{22}_{ij}) \sigma^2_{\epsilon} , \end{align*} where PEC$$_{ij}$$ is the off-diagonal element of the PEV matrix which indicates the prediction error covariance or covariance between errors of genetic values.

• Individual Average PEVD A calculation of summary PEVD can be traced back to Kennedy and Trus (1993) as average PEV of all pairwise differences between individuals between two management units \begin{align*} \text{PEVD}_{i'j'} &= \frac{1}{n_{i'} \cdot n_{j'}} \sum \text{PEVD}_{i'j'} , \end{align*} $$n_{i'}$$ and $$n_{j'}$$ are the total number of records in units $$i'$$ and $$j'$$, accordingly. The $$\sum \text{PEVD}_{i'j'}$$ was the sums of all pairwise differences between two units.

• Group Average PEVD We can also summarize PEVD using average PEV within and across management units \begin{align*} \text{PEVD}_{i'j'} &= \overline{\text{PEV}}_{i'i'} + \overline{\text{PEV}}_{j'j'} -2\overline{\text{PEC}}_{i'j'}, \end{align*} where the $$\overline{\text{PEV}}_{i'i'}$$, $$\overline{\text{PEV}}_{i'i'}$$ and $$\overline{\text{PEC}}_{i'j'}$$ indicate the mean PEV in $$i'$$ management unit, $$j'$$ mamagement unit, and mean prediction error covariance between $$i'$$ and $$j'$$ units.

• Contrast PEVD Alternatively, a summary of PEVD across any pair of management units under a contrast form is (Laloë 1993; Yu et al. 2017)
\begin{align*} \text{PEVD}(\mathbf{x}) &= \mathbf{x}' \mathbf{C}^{22} \mathbf{x} \sigma^2_{\epsilon}, \end{align*} where the sum of elements in $$\mathbf{x}$$ equals to zero.

#### Coefficient of Determination

The coefficient of determination (CD) measures the precision of the estimates breeding values. The pairwise CD between individual $$i$$ and $$j$$ is defined as (Laloë, Phocas, and Menissier 1996)
\begin{align*} \text{CD}_{ij} &= \frac{\text{Var}(\mathbf{u}) - \text{Var}(\mathbf{u}|\mathbf{\widehat{u}})}{\text{Var}(\mathbf{u})} \\ &= 1 - \frac{\text{Var}(\mathbf{u}|\mathbf{\widehat{u}})}{\text{Var}(\mathbf{u})} \\ &= 1 - \lambda \frac{\mathbf{C}^{22}_{ii} + \mathbf{C}^{22}_{jj} - 2\mathbf{C}^{22}_{ij} }{\mathbf{K}_{ii} + \mathbf{K}_{jj} - 2\mathbf{K}_{ij}}, \end{align*}

The $$ii$$ and $$jj$$ indicate the diagonal elements of the $$\mathbf{K}$$ matrix for $$ith$$ and $$jth$$ individuals, accordingly. The $$ij$$ refers to the off-diagonal elements of the K matrix, and $$\lambda$$ is variance ration of $$\frac{\sigma^2_e}{\sigma^2_a}$$.

• Individual Average CD The CD between individuals across two management units can be also derived with average PEV of all pairwise differences \begin{align*} \text{CD}_{i'j'} &= 1 - \frac{\frac{1}{n_{i'} \cdot n_{j'}} \cdot \sigma^2_e \cdot \sum \mathbf{(C^{22}}_{i'i'} + \mathbf{C^{22}}_{j'j'} - 2\mathbf{C^{22}}_{i'j'})}{\frac{1}{n_{i'} \cdot n_{j'}} \cdot \sigma^2_u \cdot \sum (\mathbf{K}_{i'i'} + \mathbf{K}_{j'j'} - 2\mathbf{K}_{i'j'})}\\ &= 1 - \frac{\frac{1}{n_{i'} \cdot n_{j'}}\sum \text{PEVD}_{i' j'}}{\frac{1}{n_{i'} \cdot n_{j'}} \cdot \sigma^2_u \cdot \sum (\mathbf{K}_{i'i'} + \mathbf{K}_{j'j'} - 2\mathbf{K}_{i'j'})}\\ &= 1 - \frac{\sum \text{PEVD}_{i'j'}}{\sigma^2_u \cdot \sum (\mathbf{K}_{i'i'} + \mathbf{K}_{j'j'} - 2\mathbf{K}_{i'j'})}. \end{align*}

• Group Average CD Similar, average PEV within and across management units can be used to calculate the CD summary statistic \begin{align*} \text{CD}_{i'j'} &= 1 - \frac{\sigma^2_e \cdot \overline{\mathbf{C^{22}}}_{i'i'} + \overline{\mathbf{C^{22}}}_{j'j'} -2\overline{\mathbf{C^{22}}}_{i'j'}}{\sigma^2_u \cdot (\overline{\mathbf{K}}_{i'i'} + \overline{\mathbf{K}}_{j'j'} - 2\overline{\mathbf{K}}_{i'j'})}\\ &= 1 - \frac{\overline{\text{PEV}}_{i'i'} + \overline{\text{PEV}}_{j'j'} -2\overline{\text{PEC}}_{i'j'}}{\sigma^2_u \cdot (\overline{\mathbf{K}}_{i'i'} + \overline{\mathbf{K}}_{j'j'} - 2\overline{\mathbf{K}}_{i'j'})}\\ &= 1 - \frac{\text{PEVD}_{i'j'}}{\sigma^2_u \cdot (\overline{\mathbf{K}}_{i'i'} + \overline{\mathbf{K}}_{j'j'} - 2\overline{\mathbf{K}}_{i'j'})}. \end{align*} The $$\overline{\mathbf{K}}_{i'i'}$$, $$\overline{\mathbf{K}}_{j'j'}$$ and $$\overline{\mathbf{K}}_{i'j'}$$ refer to the mean of relationship coefficients in the management unit $$i'$$, $$j'$$, and mean relationship coefficients between two units.

• Contrast CD A contrast notation of summary CD between any pair of management units is (Laloë, Phocas, and Menissier 1996) \begin{align*} \text{CD}(\mathbf{x}) &= 1 - \frac{\text{Var}(\mathbf{x}'\mathbf{u}|\mathbf{\widehat{u}})}{\text{Var}(\mathbf{x}'\mathbf{u})} \\ &= 1- \lambda \frac{\mathbf{x}' \mathbf{C}^{22} \mathbf{x} }{ \mathbf{x}' \mathbf{K} \mathbf{x}}\\ &= 1- \frac{\mathbf{x}' \mathbf{C}^{22} \mathbf{x} \cdot \sigma^2_e }{ \mathbf{x}' \mathbf{K} \mathbf{x} \cdot \sigma^2_u}\\ &= 1- \frac{\text{PEVD(x)}}{ \mathbf{x}' \mathbf{K} \mathbf{x} \cdot \sigma^2_u}, \end{align*} where $$\mathbf{x}$$ and $$\mathbf{K}$$ is a contrast vector and a relationship matrix defined above.

#### Prediction Error Correlation

The pairwise r statistic (Lewis et al. 1999) between individual $$i$$ and $$j$$ can be derived from PEV matrix to a predictive error correlation matrix as
\begin{align*} \text{r}_{ij} = \frac{\text{PEC}(\widehat{u_i}, \widehat{u_j})}{\sqrt{\text{PEV}(\widehat{u_i}) \cdot \text{PEV}(\widehat{u_j})}}. \end{align*}
• Individual connectedness The summary of individual connectedness calculates pairwise r for all pair of individuals followed by averaging all r across management units: \begin{align*} \text{r}_{i'j'} = \frac{1}{n_{i'} \cdot n_{j'}} \cdot \sum{\frac{\text{PEC}(\widehat{u_{i'}}, \widehat{u_{j'}})}{\sqrt{\text{PEV}(\widehat{u_{i'}}) \cdot \text{PEV}(\widehat{u_{j'}})}}}. \end{align*}

• Group connectedness One of the summary statistics of r is flock connectedness, which takes the average of PEV matrix by management units followed by a ratio to r. The Flock r between two management units $$i'$$ and $$j'$$ is given by (Kuehn et al. 2008)
\begin{align*} \text{r}_{i'j'} &= \frac{\overline{\text{PEC}}_{i'j'}}{\sqrt{\overline{\text{PEV}}_{i'i'} \cdot \overline{\text{PEV}}_{j'j'}}}\\ &=\frac{ 1/n_{i'} \sum \text{PEC}_{i' j'} 1/n_{j'}}{ \sqrt{ (1/n_{i'})^2 \sum \text{PEV}_{i' i'} \cdot (1/n_{j'})^2 \sum \text{PEV}_{j' j'} } } \\ &= \frac{ \sum \text{PEC}_{i' j'} }{ \sqrt{ \sum \text{PEV}_{i' i'} \cdot \sum \text{PEV}_{j' j'} } }, \end{align*} where $$n_{i'}$$ and $$n_{j'}$$ indcate the number of individuals in units $$i'$$ and $$j'$$, respectively.

• Contrast r We can also summarize r under a contrast format \begin{align*} \text{r}(\mathbf{x}) &= \mathbf{x}' \mathbf{rx}. \end{align*} The $$\mathbf{r}$$ is a correlation matrix with the element of $$r_{ij} = \frac{\text{PEC}(\widehat{u_i}, \widehat{u_j})}{\sqrt{\text{PEV}(\widehat{u_i}) \cdot \text{PEV}(\widehat{u_j})}}$$, which indicates the paiwise prdiction error correlation between individual $$i$$ and $$j$$.

### Function of VE

#### Variance of estimates of unit effects differences

• VED0 The VED0(Kennedy and Trus 1993) statistic estimates connectedness with the variance of differences of management units effects. The VED between two units $$i'$$ and $$j'$$ is defined as \begin{align*} \text{VED}_{i'j'} &= \text{Var}(\widehat{b})_{i'i'} + \text{Var}(\widehat{b})_{j'j'} - 2\text{Cov}(\widehat{b})_{i'j'}, \end{align*} where Var$$(\widehat{b})_{i'i'}$$, Var$$(\widehat{b})_{j'j'}$$ and Cov$$(\widehat{b})_{i'j'}$$ indicate variance of management unit $$i'$$, $$j'$$ and covariance between two units, respectively.

• VED1 & VED2 Holmes, Dodds, and Lee (2017) suggested quantifying group average PEVD using PEV_Mean \begin{align*} \text{PEVD}_{i'j'} &= \text{PEV_Mean}_{i'i'} + \text{PEV_Mean}_{j'j'} -2\text{PEV_Mean}_{i'j'}, \end{align*} , where PEV_Mean could be directly derived from $$\text{Var}(\widehat{b})$$ by adding correction factors.

#### Coefficient of Determination of VED

• CDVED0 We can also use the variance of estimates management units effects to approximate CD as \begin{align*} \text{CD}_{i'j'} &= 1 - \frac{\text{Var}(\widehat{b})_{i'i'} + \text{Var}(\widehat{b})_{j'j'} - 2\text{Cov}(\widehat{b})_{i'j'}}{\sigma^2_u \cdot (\overline{\mathbf{K}}_{i'i'} + \overline{\mathbf{K}}_{j'j'} - 2\overline{\mathbf{K}}_{i'j'})}. \end{align*}

• CDVED1 & CDVED2 Analogously, CD could be calculated with a correction of $$\text{Var}(\widehat{b})$$ (Holmes, Dodds, and Lee (2017)) as \begin{align*} \text{CD}_{i'j'} &= 1 - \frac{\sigma^2_e \cdot \overline{\mathbf{C^{22}}}_{i'i'} + \overline{\mathbf{C^{22}}}_{j'j'} -2\overline{\mathbf{C^{22}}}_{i'j'}}{\sigma^2_u \cdot (\overline{\mathbf{K}}_{i'i'} + \overline{\mathbf{K}}_{j'j'} - 2\overline{\mathbf{K}}_{i'j'})}\\ &= 1 - \frac{\text{PEV_Mean}_{i'i'} + \text{PEV_Mean}_{j'j'} -2\text{PEV_Mean}_{i'j'}}{\sigma^2_u \cdot (\overline{\mathbf{K}}_{i'i'} + \overline{\mathbf{K}}_{j'j'} - 2\overline{\mathbf{K}}_{i'j'})}. \end{align*}

The $$\text{PEV_Mean}_{i'i'}$$, $$\text{PEV_Mean}_{j'j'}$$ and $$\text{PEV_Mean}_{i'j'}$$ are average PEV over management unit $$i'$$, $$j'$$ and average of prediction error covariance between management units $$i'$$ and $$j'$$, respectively.

#### Connectedness Rating (CR)

• CRO Mathur, Sullivan, and Chesnais (2002) proposed using CR approximates flock connectedness by replacing PEV matrix with the variance of differences of management units estimates. The CR of two management units $$i'$$ and $$j'$$ is given by \begin{align*} \text{CR}_{i'j'} &= \frac{\text{Cov}(\widehat{b})_{i'j'} }{ \sqrt{ \text{Var}(\widehat{b})_{i'i'} \cdot \text{Var}(\widehat{b})_{j'j'}}}. \end{align*}

• CR1 & CR2 The prediction error correlation can be derived with aforementioned correction of variance of estimates of management units effects (Holmes, Dodds, and Lee 2017) \begin{align*} \text{r}_{i'j'} &= \frac{\text{PEV_Mean}_{i'j'}}{\sqrt{\text{PEV_Mean}_{i' i'} \cdot \text{PEV_Mean}_{j' j'}}}. \end{align*}

## Application of GCA package

### Install GCA from Github

install.packages("devtools")
library(devtools)
install_github('HaipengU/GCA')
library(GCA)

### Load example cattle dataset in GCA package.

The cattle dataset of GCcattle contains two files of cattle.pheno and cattle.W, which include phenotype and marker information, respectively. The details can be checked with ?GCcattle.

library(GCA)
data(GCcattle)
dim(cattle.pheno) # phenotype
## [1] 530   6
dim(cattle.W) # marker matrix
## [1]   530 10000

### Example of connectedness across units.

#### Individual average CD

X <- model.matrix(~ -1 + factor(cattle.pheno$Unit)) # design matrix of unit effect with intercept excluded G <- computeG(cattle.W, maf = 0.05, type = 'G2') # genomic relationship matrix (VanRaden) EVD <- eigen(G) # eigendecomposition var <- varcomp(y = cattle.pheno$Phenotype, Evector = EVD$vectors, Evalue = EVD$values) # variance component
var$Vu # variance of additive genetic effects ## [1] 0.6496319 var$Ve # variance of residual
## [1] 0.4445644
CD_IdAve <- gcm(Kmatrix = G, Xmatrix = X, sigma2a = var$Vu, sigma2e = var$Ve,
MUScenario = as.factor(cattle.pheno$Unit), statistic = 'CD_IdAve', NumofMU = 'Pairwise') CD_IdAve ## MU1 MU2 MU3 MU4 MU5 ## MU1 NA 0.6696807 0.6080204 0.6601902 0.6511560 ## MU2 0.6696807 NA 0.6411488 0.6906823 0.6851704 ## MU3 0.6080204 0.6411488 NA 0.6312249 0.6261347 ## MU4 0.6601902 0.6906823 0.6312249 NA 0.6763378 ## MU5 0.6511560 0.6851704 0.6261347 0.6763378 NA The above example showed the individual average CD, which is a function of PEV and returned a pairwise connectedness between all units. Alternative, it can return an overall connectedness which averages all pairwise CD by setting ‘NumofMU’ to ‘Overall’ as CD_IdAve <- gcm(Kmatrix = G, Xmatrix = X, sigma2a = var$Vu, sigma2e = var$Ve, MUScenario = as.factor(cattle.pheno$Unit), statistic = 'CD_IdAve', NumofMU = 'Overall')
CD_IdAve
## [1] 0.6539746

#### Coefficient of determination of VED

We showed the CDVED1 in the following example, which is a function of VE.

CDVED1 <- gcm(Kmatrix = G, Xmatrix = X, sigma2a = var$Vu, sigma2e = var$Ve,
MUScenario = as.factor(cattle.pheno$Unit), statistic = 'CDVED1', NumofMU = 'Pairwise') CDVED1 ## MU1 MU2 MU3 MU4 MU5 ## MU1 NA 0.5821483 0.4925309 0.6316821 0.5744163 ## MU2 0.5821483 NA 0.4636864 0.7036332 0.6301990 ## MU3 0.4925309 0.4636864 NA 0.5363781 0.5087033 ## MU4 0.6316821 0.7036332 0.5363781 NA 0.6791546 ## MU5 0.5744163 0.6301990 0.5087033 0.6791546 NA The above example only considered one fixed effect of unit in the model. When more than one fixed effect is included, we can account for these additional fixed effects using statistic ‘CDVED2’ X2 <- model.matrix(~ -1 + factor(cattle.pheno$Unit)
+ factor(cattle.pheno$Sex)) CDVED2 <- gcm(Kmatrix = G, Uidx = 5, Xmatrix = X2, sigma2a = var$Vu, sigma2e = var$Ve, MUScenario = as.factor(cattle.pheno$Unit), statistic = 'CDVED2', NumofMU = 'Pairwise')
CDVED2
##           MU1       MU2       MU3       MU4       MU5
## MU1        NA 0.5816912 0.4907677 0.6315747 0.5743693
## MU2 0.5816912        NA 0.4624705 0.7035294 0.6288023
## MU3 0.4907677 0.4624705        NA 0.5351207 0.5060286
## MU4 0.6315747 0.7035294 0.5351207        NA 0.6787688
## MU5 0.5743693 0.6288023 0.5060286 0.6787688        NA

Notice that, we added additional argument of ‘Uidx’ in the above example, which is a interger to indicate the last column of unit effects in X matrix. More details can be checked with ?gcm

### Available connectedness metrics in GCA package.

The following section lists all available metrics in GCA package, which can be called by setting the argument of ‘statistic’.

• PEVD_IdAve : Individual average PEVD, the optional argument of ‘scale’ is available.
• PEVD_GrpAve : Groupd average PEVD, the optional arguments of ‘scale’ and ‘diag’ are available.
• PEVD_contrast: Contrast PEVD, the optional argument of ‘scale’ is available.
• CD_IdAve : Individual average CD.
• CD_GrpAve : Group average CD, the optional argument of ‘diag’ is available.
• CD_contrast : Contrast CD.
• r_IdAve : Individual average r.
• r_GrpAve : Group average r, the optional argument of ‘diag’ is available.
• r_contrast : Contrast r.
• VED0 : Variance of estimate of unit effects differences. The optional argument of ‘scale’ is available.
• VED1 : Variance of estimate of unit effects differences with the correction of unit effect. The optional argument of ‘scale’ is available.
• VED2 : Variance of estimate of unit effects differences with the correction of unit effect and additional fixed effects. The additional argument of ‘Uidx’ is required and the optional argument of ‘scale’ is available.
• CDVED0 : Coefficient of determination of VED, the optional argument of ‘diag’ is available.
• CDVED1 : Coefficient of determination of VED with the correction of unit effect. The optional argument of ‘diag’ is available.
• CDVED2 : Coefficient of determination of VED with the correction of unit effect and additional fixed effects. The additional argument of ‘Uidx’ is required and the optional argument of ‘diag’ is available.
• CR0 : Connectedness rating.
• CR1 : Connectedness rating with the correction of unit effect.
• CR2 : Connectedness rating with the correction of unit effect and additional fixed effects. The additional argument of ‘Uidx’ is required.

## References

Henderson, C R. 1984. Applications of Linear Models in Animal Breeding. Third edition, Edited by Schaeffer LR. Guelph: University of Guelph.

Holmes, John B, Ken G Dodds, and Michael A Lee. 2017. “Estimation of Genetic Connectedness Diagnostics Based on Prediction Errors Without the Prediction Error Variance–Covariance Matrix.” Genetics Selection Evolution 49 (1). BioMed Central:29.

Kennedy, BW, and D Trus. 1993. “Considerations on Genetic Connectedness Between Management Units Under an Animal Model.” Journal of Animal Science 71 (9). Oxford University Press:2341–52.

Kuehn, LA, DR Notter, GJ Nieuwhof, and RM Lewis. 2008. “Changes in Connectedness over Time in Alternative Sheep Sire Referencing Schemes.” Journal of Animal Science 86 (3). Oxford University Press:536–44.

Laloë, Denis. 1993. “Precision and Information in Linear Models of Genetic Evaluation.” Genetics Selection Evolution 25 (6). BioMed Central:557.

Laloë, Denis, Florence Phocas, and Francois Menissier. 1996. “Considerations on Measures of Precision and Connectedness in Mixed Linear Models of Genetic Evaluation.” Genetics Selection Evolution 28 (4). BioMed Central:359.

Lewis, RM, RE Crump, G Simm, and R Thompson. 1999. “Assessing Connectedness in Across-Flock Genetic Evaluations.” Proc. Brit. Soc. Anim. Sci 121.

Mathur, PK, BP Sullivan, and JP Chesnais. 2002. “Measuring Connectedness: Concept and Application to a Large Industry Breeding Program.” In Proc. 7th World Congr. Genet. Appl. To Livest. Prod, 19:23.

Yu, Haipeng, Matthew L Spangler, Ronald M Lewis, and Gota Morota. 2017. “Genomic Relatedness Strengthens Genetic Connectedness Across Management Units.” G3: Genes, Genomes, Genetics 7 (10). G3: Genes, Genomes, Genetics:3543–56.