gt_cluster_pca.Rd
This function implements the clustering procedure used in Discriminant
Analysis of Principal Components (DAPC, Jombart et al. 2010).
This procedure consists of running successive K-means with an
increasing number of clusters (k), after transforming the data with
a principal component analysis (PCA). For each model,
several goodness-of-fit statistics
are computed, which allow the optimal k to be chosen with the function
gt_cluster_pca_best_k().
See details for a description of how to select the optimal k
and vignette("adegenet-dapc") for a tutorial.
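The procedure described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the toy data, the variable names, and the AIC/BIC formulas (computed here from the within-cluster sum of squares as n * log(WSS / n) plus a penalty on k) are assumptions made for the example.

```r
set.seed(1)

# Toy data (hypothetical): two well-separated groups of 50 individuals, 5 variables
x <- rbind(
  matrix(rnorm(50 * 5, mean = 0), ncol = 5),
  matrix(rnorm(50 * 5, mean = 4), ncol = 5)
)

# Step 1: transform the data with a PCA and keep the first few components
pca <- prcomp(x)
n_pca <- 3
scores <- pca$x[, seq_len(n_pca), drop = FALSE]

# Step 2: run K-means for an increasing number of clusters
k_values <- 1:5
n <- nrow(scores)
wss <- sapply(k_values, function(k) {
  kmeans(scores, centers = k, iter.max = 100, nstart = 10)$tot.withinss
})

# Step 3: goodness-of-fit statistics for each k
# (assumed formulas; the package may compute these differently)
res <- data.frame(
  k = k_values,
  WSS = wss,
  AIC = n * log(wss / n) + 2 * k_values,
  BIC = n * log(wss / n) + log(n) * k_values
)
print(res)
```

Plotting one of these statistics against k is a common way to look for the value beyond which adding clusters brings little improvement.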
a gt_pca object returned by one of the gt_pca_* functions.
number of principal components to be used for clustering.
number of clusters to explore, either a single value or a vector of length 2 giving the minimum and maximum (e.g. c(1, 5)). If left NULL, it will use 1 to the number of PCA components divided by 10 (a reasonable guess).
the clustering method, either 'kmeans' or 'ward'
number of iterations for kmeans (only used if method="kmeans")
number of starting points for kmeans (only used if method="kmeans")
boolean indicating whether to suppress information printed to the screen (defaults to FALSE)
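As a rough illustration of the 'ward' alternative to K-means, the sketch below uses base R's hclust() and cutree(). How the package implements this option is not stated here, so treating it as Ward hierarchical clustering cut at each k is an assumption, as are the toy data and variable names.

```r
set.seed(1)

# Toy PCA-like scores (hypothetical): two groups of 30 individuals, 2 components
scores <- rbind(
  matrix(rnorm(30 * 2, mean = 0), ncol = 2),
  matrix(rnorm(30 * 2, mean = 6), ncol = 2)
)

# Ward hierarchical clustering on Euclidean distances between individuals
tree <- hclust(dist(scores), method = "ward.D2")

# The tree is built once, then cut at each k of interest
k_values <- 1:4
groups <- lapply(k_values, function(k) cutree(tree, k = k))

# Number of distinct groups obtained at each cut
n_groups <- sapply(groups, function(g) length(unique(g)))
print(n_groups)
```

Unlike K-means, this approach has no random starting points, so settings for the number of iterations or starts would not apply to it.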
a gt_cluster_pca object, which is a subclass of gt_pca with an additional element 'cluster', a list with elements:
'method' the clustering method (either kmeans or ward)
'n_pca' number of principal components used for clustering
'k' the k values explored by the function
'WSS' within-cluster sum of squares for each k
'AIC' the AIC for each k
'BIC' the BIC for each k
'groups' a list, with each element giving the group assignments for a given k
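To show how a list with the elements above might be navigated, the sketch below builds a hypothetical stand-in for 'cluster' from toy data with base R's kmeans() and looks up the group assignments for one value of k. The object is a mock-up for illustration (AIC and BIC are left out for brevity) and is not produced by the package.

```r
set.seed(42)

# Toy scores (hypothetical): two groups of 40 individuals, 2 components
scores <- rbind(
  matrix(rnorm(40 * 2, mean = 0), ncol = 2),
  matrix(rnorm(40 * 2, mean = 5), ncol = 2)
)

k_values <- 1:4
fits <- lapply(k_values, function(k) kmeans(scores, centers = k, nstart = 10))

# Mock 'cluster' list mirroring the element names described above
cluster <- list(
  method = "kmeans",
  n_pca = ncol(scores),
  k = k_values,
  WSS = sapply(fits, function(f) f$tot.withinss),
  groups = lapply(fits, function(f) f$cluster)
)

# 'groups' is parallel to 'k': find the position of the chosen k, then extract
chosen_k <- 2
assignments <- cluster$groups[[which(cluster$k == chosen_k)]]
print(table(assignments))
```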