Summarizes information about the components of a gt_dapc object from the
tidypopgen package. The matrix argument determines which component is
returned.
Usage
# S3 method for class 'gt_dapc'
tidy(x, matrix = "eigenvalues", ...)
Arguments
- x
A gt_dapc object (as returned by gt_dapc()).
- matrix
Character specifying which component of the DAPC should be tidied.
"samples", "scores", or "x": returns information about the map from the original space into the discriminant axes.
"v", "rotation", "loadings" or "variables": returns information about the map from discriminant axes space back into the original space (i.e. the genotype frequencies). Note that these are different from the loadings linking to the PCA scores (which are available in the element $loadings of the dapc object).
"d", "eigenvalues" or "lds": returns information about the eigenvalues.
- ...
Not used. Needed to match generic signature only.
Value
A tibble::tibble with columns depending on the component of DAPC being tidied.
If matrix is "scores", each row in the tidied output corresponds to the
original data in PCA space. The columns are:
- row
ID of the original observation (i.e. rowname from original data).
- LD
Integer indicating a principal component.
- value
The score of the observation for that particular principal component. That is, the location of the observation in PCA space.
If matrix is "loadings", each row in the tidied output corresponds to
information about the principal components in the original space. The
columns are:
- row
The variable labels (colnames) of the data set on which PCA was performed.
- LD
An integer vector indicating the principal component.
- value
The value of the eigenvector (axis score) on the indicated principal component.
If matrix is "eigenvalues", the columns are:
- LD
An integer vector indicating the discriminant axis.
- std.dev
Standard deviation (i.e. sqrt(eig/(n-1))) explained by this DA (for compatibility with prcomp).
- cumulative
Cumulative variation explained by principal components up to this component (note that this is NOT phrased as a percentage of total variance, since many methods only estimate a truncated SVD).
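Because cumulative is reported on the raw scale rather than as a percentage, a proportion has to be computed by hand if one is wanted. The sketch below is a minimal base R illustration, not package code. It assumes the column names shown in the printed eigenvalue output in the Examples (LD, eigenvalue, cumulative), and it reuses the rounded values from that output. The prop_retained column is a name introduced here for illustration, and the proportion is relative to the retained axes only, not to total variance.

```r
# Eigenvalues copied from the printed example output (rounded for display,
# so the totals here differ slightly from the unrounded ones).
eig <- data.frame(
  LD = 1:4,
  eigenvalue = c(225, 33.4, 2.29, 0.283)
)

# Running sum of the eigenvalues, on the same raw scale as `cumulative`.
eig$cumulative <- cumsum(eig$eigenvalue)

# Hypothetical extra column: share of the retained axes' total that is
# covered up to each axis. This is NOT a share of total variance.
eig$prop_retained <- eig$cumulative / sum(eig$eigenvalue)

print(eig)
```

With a real result, the same two lines could be applied to the tibble returned by tidy(dapc_res, matrix = "eigenvalues") in place of the hand-built data frame.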
Examples
library(tidypopgen)
# Create a gen_tibble of lobster genotypes
bed_file <-
  system.file("extdata", "lobster", "lobster.bed", package = "tidypopgen")
lobsters <- gen_tibble(bed_file,
  backingfile = tempfile("lobsters"),
  quiet = TRUE
)
# Remove monomorphic loci and impute
lobsters <- lobsters %>% select_loci_if(loci_maf(genotypes) > 0)
lobsters <- gt_impute_simple(lobsters, method = "mode")
# Create PCA and run DAPC
pca <- gt_pca_partialSVD(lobsters)
populations <- as.factor(lobsters$population)
dapc_res <- gt_dapc(pca, n_pca = 6, n_da = 2, pop = populations)
# Tidy scores
tidy(dapc_res, matrix = "scores")
#> # A tibble: 352 × 3
#> row LD value
#> <chr> <dbl> <dbl>
#> 1 Ale04 1 3.87
#> 2 Ale04 2 0.132
#> 3 Ale05 1 3.96
#> 4 Ale05 2 -0.402
#> 5 Ale06 1 3.25
#> 6 Ale06 2 -0.801
#> 7 Ale08 1 3.06
#> 8 Ale08 2 0.398
#> 9 Ale13 1 1.60
#> 10 Ale13 2 1.05
#> # ℹ 342 more rows
# Tidy eigenvalues
tidy(dapc_res, matrix = "eigenvalues")
#> # A tibble: 4 × 3
#> LD eigenvalue cumulative
#> <int> <dbl> <dbl>
#> 1 1 225. 225.
#> 2 2 33.4 259.
#> 3 3 2.29 261.
#> 4 4 0.283 261.
# Tidy loadings
tidy(dapc_res, matrix = "loadings")
#> # A tibble: 158 × 3
#> column LD value
#> <chr> <chr> <dbl>
#> 1 rs3441 LD1 -0.00389
#> 2 rs3441 LD2 -0.00831
#> 3 rs4173 LD1 -0.0157
#> 4 rs4173 LD2 0.0121
#> 5 rs6157 LD1 0.0122
#> 6 rs6157 LD2 -0.162
#> 7 rs7502 LD1 0.163
#> 8 rs7502 LD2 0.0172
#> 9 rs7892 LD1 0.0880
#> 10 rs7892 LD2 0.0206
#> # ℹ 148 more rows
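The tidied scores are in long format, with one row per observation and axis. For plotting one axis against another, a wide layout with one column per axis is often more convenient. The following is a minimal base R sketch of that reshaping. It uses a hand-built data frame holding the first six rows of the printed scores output above rather than a real gt_dapc result, and the column names LD1 and LD2 are chosen here for illustration.

```r
# First six rows of the printed scores output, in long format.
scores_long <- data.frame(
  row = rep(c("Ale04", "Ale05", "Ale06"), each = 2),
  LD = rep(c(1, 2), times = 3),
  value = c(3.87, 0.132, 3.96, -0.402, 3.25, -0.801)
)

# One row per observation, one column per discriminant axis.
scores_wide <- reshape(
  scores_long,
  idvar = "row",
  timevar = "LD",
  direction = "wide"
)

# reshape() names the new columns value.1 and value.2; rename for clarity.
names(scores_wide) <- c("row", "LD1", "LD2")

print(scores_wide)
```

With tidyr available, tidyr::pivot_wider(names_from = LD, values_from = value) would be an alternative way to obtain the same layout from the tibble itself.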
