This function constructs the input data for fit_marginal.

Usage

construct_data(
  sce,
  assay_use = "counts",
  celltype,
  pseudotime,
  spatial,
  other_covariates,
  ncell = dim(sce)[2],
  corr_by,
  parallelization = "mcmapply",
  BPPARAM = NULL
)

Arguments

sce

A SingleCellExperiment object.

assay_use

A string indicating which assay of the sce to use. Default is 'counts'.

celltype

A string giving the name of the cell type variable in the colData of the sce, for example 'cell_type'. The usage above defines no default value for this argument.

pseudotime

A string or character vector giving the name(s) of the pseudotime variable(s); give several names when there are multiple lineages. Use NULL if the data have no pseudotime.

spatial

A length-two character vector giving the names of the spatial coordinates. Use NULL if the data have no spatial coordinates.

other_covariates

A string or character vector naming any other covariates you want to include in the data.

ncell

The number of cells you want to simulate. Default is dim(sce)[2] (the same number of cells as in the input data). If a different number is provided, the function uses a vine copula to simulate a new covariate matrix.

corr_by

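A string or character vector that defines the groups used for the correlation structure. If '1', all cells share one estimated correlation structure. If 'ind', no correlation is estimated (features are treated as independent). Any other value is taken as the name(s) of the variable(s) whose groups determine the correlation structures.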

parallelization

A string indicating the parallelization function to use. Must be one of 'mcmapply', 'bpmapply', or 'pbmcmapply', which correspond to the parallelization functions in the packages parallel, BiocParallel, and pbmcapply, respectively. The default value is 'mcmapply'.

BPPARAM

A MulticoreParam object or NULL. When parallelization = 'mcmapply' or 'pbmcmapply', this parameter must be NULL. When parallelization = 'bpmapply', this parameter must be a MulticoreParam object from the package 'BiocParallel'. The default value is NULL.
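
The three corr_by modes described above can be illustrated with a small base-R sketch. It is not the package's implementation: the helper corr_groups and the toy data frame col_data are hypothetical, and the sketch only shows how cells might be partitioned into groups that each receive their own correlation estimate.

```r
# Hypothetical helper (not part of the package): map a corr_by value to
# one group label per cell. Cells sharing a label would share one
# estimated correlation structure.
corr_groups <- function(col_data, corr_by) {
  if (identical(corr_by, "ind")) {
    # Features treated as independent: no correlation groups at all.
    return(NULL)
  }
  if (identical(corr_by, "1")) {
    # One correlation structure shared by every cell.
    return(factor(rep("1", nrow(col_data))))
  }
  # Otherwise corr_by names one or more columns; each distinct
  # combination of their values forms a group.
  stopifnot(all(corr_by %in% colnames(col_data)))
  interaction(col_data[, corr_by, drop = FALSE], drop = TRUE)
}

# Toy stand-in for colData(sce).
col_data <- data.frame(
  cell_type = c("A", "A", "B", "B"),
  batch     = c("x", "y", "x", "y")
)

groups_all  <- corr_groups(col_data, "1")
groups_ind  <- corr_groups(col_data, "ind")
groups_type <- corr_groups(col_data, "cell_type")
groups_both <- corr_groups(col_data, c("cell_type", "batch"))
```

With a single column name, each level of that variable forms its own group; with several names, the groups are the observed combinations of their values.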

Value

A list with the components:

count_mat

The expression matrix

dat

The original covariate matrix

newCovariate

The simulated new covariate matrix; NULL if the parameter ncell is left at its default.

filtered_gene

The genes that are excluded from the marginal and copula fitting steps because they are expressed in fewer than two cells.
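
The filtering rule behind filtered_gene can be sketched in base R. This is an illustration only, under the assumption that a gene counts as expressed in a cell when its count is greater than zero; the toy matrix and variable names are made up, and the package's own check may differ in detail.

```r
# Toy count matrix: genes in rows, cells in columns (illustrative data).
counts <- matrix(
  c(0, 0, 0, 0,   # gene1: expressed in no cell
    3, 0, 0, 0,   # gene2: expressed in one cell
    1, 2, 0, 0,   # gene3: expressed in two cells
    5, 1, 4, 2),  # gene4: expressed in all cells
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:4))
)

# Number of cells in which each gene has a nonzero count.
n_expressing <- rowSums(counts > 0)

# Genes expressed in fewer than two cells would be set aside.
filtered_gene <- names(n_expressing)[n_expressing < 2]
filtered_gene
```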

Details

This function takes a SingleCellExperiment object as the input. Based on the user's choices, it constructs the matrix of covariates (explanatory variables) and the expression matrix (e.g., the count matrix for scRNA-seq).
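
As a rough picture of the two outputs described above, the following base-R sketch builds a covariate table and an expression matrix from plain objects. It deliberately avoids SingleCellExperiment: counts and col_data are toy stand-ins for the assay and the colData, sketch_inputs is a hypothetical helper, and placing cells in the rows of the expression matrix is an assumption made for this example.

```r
# Hypothetical helper, for illustration only.
sketch_inputs <- function(counts, col_data, celltype = NULL,
                          pseudotime = NULL, spatial = NULL,
                          other_covariates = NULL) {
  # c() drops NULL entries, so only the requested covariates remain.
  keep <- c(celltype, pseudotime, spatial, other_covariates)
  stopifnot(all(keep %in% colnames(col_data)))
  list(
    count_mat = t(counts),                      # cells x genes (assumed)
    dat       = col_data[, keep, drop = FALSE]  # cells x covariates
  )
}

# Toy inputs: 2 genes, 3 cells.
counts <- matrix(1:6, nrow = 2,
                 dimnames = list(c("g1", "g2"), c("c1", "c2", "c3")))
col_data <- data.frame(
  cell_type  = c("A", "A", "B"),
  pseudotime = c(0.1, 0.5, 0.9),
  batch      = c("x", "y", "x"),
  row.names  = colnames(counts)
)

res <- sketch_inputs(counts, col_data,
                     celltype = "cell_type", pseudotime = "pseudotime")
dim(res$count_mat)   # 3 cells by 2 genes
colnames(res$dat)    # "cell_type" "pseudotime"
```

Covariates that are not requested (here, batch) are simply left out of the covariate table.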

Examples

  data(example_sce)
  my_data <- construct_data(
    sce = example_sce,
    assay_use = "counts",
    celltype = "cell_type",
    pseudotime = "pseudotime",
    spatial = NULL,
    other_covariates = NULL,
    corr_by = "1"
  )