PCA for gen_tibble objects by randomized partial SVD
gt_pca_randomSVD.Rd
This function performs Principal Component Analysis on a gen_tibble by randomised partial SVD, based on the algorithm in RSpectra (by Yixuan Qiu and Jiali Mei).
This algorithm is linear in time in all dimensions and is very memory-efficient, so it can be used on very large big.matrices.
This function is a wrapper for bigstatsr::big_randomSVD().
Usage
gt_pca_randomSVD(
  x,
  k = 10,
  fun_scaling = bigsnpr::snp_scaleBinom(),
  tol = 1e-04,
  verbose = FALSE,
  n_cores = 1,
  fun_prod = bigstatsr::big_prodVec,
  fun_cprod = bigstatsr::big_cprodVec
)
Arguments
- x
  a gen_tbl object
- k
  Number of singular vectors/values to compute. Default is 10. This algorithm should be used to compute a few singular vectors/values.
- fun_scaling
  Usually this can be left unset, as it defaults to bigsnpr::snp_scaleBinom(), which is the appropriate function for biallelic SNPs. Alternatively, it is possible to use a custom function (see bigsnpr::snp_autoSVD() for details).
- tol
  Precision parameter of svds. Default is 1e-4.
- verbose
  Should some progress be printed? Default is FALSE.
- n_cores
  Number of cores used.
- fun_prod
  Function that takes 6 arguments (in this order):
  a matrix-like object X,
  a vector x,
  a vector of row indices ind.row of X,
  a vector of column indices ind.col of X,
  a vector of column centers (corresponding to ind.col),
  a vector of column scales (corresponding to ind.col),
  and computes the product of X (subsetted and scaled) with x.
- fun_cprod
  Same as fun_prod, but for the transpose of X.
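The six-argument contract for fun_prod and fun_cprod can be illustrated with a dense, base-R sketch. This is not the bigstatsr implementation: ref_prod and ref_cprod are hypothetical names, X is an ordinary in-memory matrix rather than a file-backed one, and the centers and scales are computed by hand for the toy data.

```r
# Illustrative reference versions of the fun_prod / fun_cprod contract.
# ref_prod and ref_cprod are hypothetical helpers, not part of any package.
scaled_subset <- function(X, ind.row, ind.col, center, scale) {
  # Subset rows and columns, then centre and scale each selected column.
  X_sub <- X[ind.row, ind.col, drop = FALSE]
  sweep(sweep(X_sub, 2, center, "-"), 2, scale, "/")
}

ref_prod <- function(X, x, ind.row, ind.col, center, scale) {
  # Product of the scaled subset with x: one value per selected row.
  drop(scaled_subset(X, ind.row, ind.col, center, scale) %*% x)
}

ref_cprod <- function(X, x, ind.row, ind.col, center, scale) {
  # Product of the transposed scaled subset with x: one value per selected column.
  drop(crossprod(scaled_subset(X, ind.row, ind.col, center, scale), x))
}

# Toy genotype-like matrix: 8 rows, 5 columns, values in 0..2.
X <- matrix(seq_len(40) %% 3, nrow = 8)
ind.row <- 1:6
ind.col <- c(1, 3, 5)
centers <- colMeans(X[ind.row, ind.col])
scales <- apply(X[ind.row, ind.col], 2, sd)

y <- ref_prod(X, c(1, 0, -1), ind.row, ind.col, centers, scales)
z <- ref_cprod(X, rep(1, 6), ind.row, ind.col, centers, scales)
```

Here y has one entry per selected row and z has one entry per selected column. Because the columns were centered on their own means, multiplying the transpose by a vector of ones gives values that are zero up to floating-point error.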
Value
a gt_pca object, which is a subclass of big_SVD; this is a named S3 list with elements:
- d, the eigenvalues (singular values, i.e. as variances),
- u, the scores for each sample on each component (the left singular vectors),
- v, the loadings (the right singular vectors),
- center, the centering vector,
- scale, the scaling vector,
- method, a string defining the method (in this case 'randomSVD'),
- call, the call that generated the object.
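How the d, u, v, center and scale elements fit together can be sketched on a small dense matrix. This example makes several assumptions: it uses base::svd() as a stand-in for the randomized partial SVD, it builds a plain list rather than a real gt_pca object, and it takes binomial scaling to mean centering each column on its mean and dividing by sqrt(2p(1-p)), with p the allele frequency.

```r
# Toy genotype matrix: 6 samples (rows) x 4 loci (columns), filled column by column.
X <- matrix(c(
  0, 1, 2, 1, 0, 2,
  1, 1, 0, 2, 2, 0,
  2, 0, 1, 0, 1, 1,
  0, 0, 1, 2, 1, 2
), nrow = 6)

# Assumed binomial scaling: centre on the column mean (2p), scale by sqrt(2p(1-p)).
center <- colMeans(X)
p <- center / 2
scale <- sqrt(2 * p * (1 - p))
X_scaled <- sweep(sweep(X, 2, center, "-"), 2, scale, "/")

# Keep only the first k components, as a partial SVD would.
k <- 2
s <- svd(X_scaled, nu = k, nv = k)
pca <- list(
  d = s$d[seq_len(k)],  # singular values
  u = s$u,              # left singular vectors (one row per sample)
  v = s$v,              # right singular vectors (one row per locus)
  center = center,
  scale = scale
)

# Multiplying the scaled data by v gives the same matrix as u %*% diag(d).
projected <- X_scaled %*% pca$v
rescaled_u <- pca$u %*% diag(pca$d)
```

The identity checked at the end, X_scaled %*% v equal to u %*% diag(d), holds for any SVD, which is why the left and right singular vectors carry the sample-level and locus-level information respectively.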
Note: rather than accessing these elements directly, it is better to use tidy and augment. See gt_pca_tidiers.