Run SNMF from R in tidypopgen

Usage

gt_snmf(
  x,
  k,
  project = "continue",
  n_runs = 1,
  alpha,
  tolerance = 1e-05,
  entropy = FALSE,
  percentage = 0.05,
  I,
  iterations = 200,
  ploidy = 2,
  seed = -1
)

Arguments

x

a gen_tibble, or a character string giving the path to the input .geno file

k

an integer giving the number of clusters

project

one of "continue", "new", or "force": "continue" stores files in the current project, "new" creates a new project, and "force" stores results in the current project even if the .geno input file has been altered

n_runs

the number of runs for each k value (defaults to 1)

alpha

numeric snmf regularization parameter. See LEA::snmf for details

tolerance

numeric value of tolerance (default 0.00001)

entropy

boolean indicating whether to estimate cross-entropy

percentage

numeric value between 0 and 1 giving the proportion of genotypes to mask; used only when entropy = TRUE

I

number of SNPs for initialising the snmf algorithm

iterations

integer giving the maximum number of iterations (default 200)

ploidy

the ploidy of the input data (defaults to 2)

seed

the seed for the random number generator (defaults to -1)

Value

an object of class gt_admix consisting of a list with the following elements:

  • k the number of clusters

  • Q a matrix with the admixture proportions

  • P a matrix with the allele frequencies

  • log a log of the output generated by snmf (usually printed on the screen while it runs)

  • cv the masked cross-entropy (if entropy is TRUE)

  • loglik the log likelihood of the model

  • id the id column of the input gen_tibble (if applicable)

  • group the group column of the input gen_tibble (if applicable)
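
To show how the elements listed above might be used, the sketch below builds a small stand-in list by hand rather than calling gt_snmf. All values are invented for illustration. It assumes that rows of Q are individuals and columns are clusters, which this page does not state, and that the object is a plain list carrying the class gt_admix.

```r
# Hand-made stand-in for a gt_admix result.
# Values are invented; a real object would come from gt_snmf().
# Assumption: rows of Q are individuals, columns are clusters.
Q <- matrix(
  c(0.90, 0.10,
    0.20, 0.80,
    0.55, 0.45),
  ncol = 2, byrow = TRUE
)
mock_fit <- structure(
  list(
    k = 2L,
    Q = Q,
    id = c("ind1", "ind2", "ind3"),
    group = c("popA", "popB", "popA")
  ),
  class = "gt_admix"
)

# Admixture proportions for one individual should sum to 1.
row_totals <- rowSums(mock_fit$Q)

# Cluster with the largest proportion for each individual.
top_cluster <- max.col(mock_fit$Q, ties.method = "first")

# Combine with the id and group elements for inspection.
summary_df <- data.frame(
  id = mock_fit$id,
  group = mock_fit$group,
  top_cluster = top_cluster
)
print(summary_df)
```

With a real result, the same element names (Q, id, group) would be read from the object returned by gt_snmf instead of mock_fit.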

Details

This is a wrapper for the function snmf from the R package LEA.
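
A call might look like the sketch below. The file path is hypothetical, and the values chosen for k, alpha, I and seed are arbitrary illustrations, not recommendations. alpha and I are passed explicitly because the usage gives them no default and this page does not say whether they may be omitted. The guard skips the run when tidypopgen, LEA or the input file is unavailable.

```r
# Hypothetical path to a .geno file; replace with your own data.
geno_path <- file.path("data", "genotypes.geno")

# Argument names follow the usage section; values are illustrative only.
snmf_args <- list(
  x = geno_path,
  k = 3,
  project = "new",
  n_runs = 1,
  alpha = 10,
  entropy = TRUE,
  percentage = 0.05,
  I = 1000,
  seed = 42
)

# Run only if both packages are installed and the input file exists.
can_run <- requireNamespace("tidypopgen", quietly = TRUE) &&
  requireNamespace("LEA", quietly = TRUE) &&
  file.exists(geno_path)

if (can_run) {
  fit <- do.call(tidypopgen::gt_snmf, snmf_args)
  str(fit$Q)  # admixture proportions
  fit$cv      # masked cross-entropy, requested via entropy = TRUE
}
```

Collecting the arguments in a list and using do.call keeps the settings in one place, which can help when repeating the run for several values of k.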