Function to prepare cluster list for `cluster_communication()`
preparing_expr_list.Rd
`preparing_expr_list()` produces the cluster list needed for the CCC calculation
Arguments
- mtx
the gene-by-cell matrix used in single-cell data analysis
- clusters
a vector with cluster membership of all the cells in `mtx`
- mean_t
threshold on the expression value used to binarize the expression matrix. Choose this value by looking at the distribution of the mean expression of all the genes, computed without considering the zeros
- cell_t
minimum number of cells of a cluster in which a gene must be expressed above the `mean_t` threshold to be considered in subsequent analysis
- universe
names of the genes of interest. Used to filter the cluster gene lists
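As a rough illustration of how a value for `mean_t` might be explored, the sketch below computes the mean expression of each gene over its non-zero values only and summarizes the resulting distribution. The toy matrix and the `nonzero_mean` name are hypothetical and not part of the package; the sketch also assumes a dense base R matrix.

```r
# Toy gene-by-cell matrix (hypothetical values, for illustration only)
mtx <- matrix(
  c(0, 2, 4,
    0, 0, 6,
    1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3"))
)

# Mean expression of each gene, ignoring the cells where it is zero
nonzero_mean <- apply(mtx, 1, function(x) {
  nz <- x[x > 0]
  if (length(nz) == 0) NA_real_ else mean(nz)
})

# Summaries of the distribution, useful when choosing a candidate threshold
summary(nonzero_mean)
quantile(nonzero_mean, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
```

Which quantile or cut-off is appropriate depends on the data set; the summaries above are only a starting point for inspection.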
Value
The function returns a list with one vector for each cluster provided. Each vector contains the frequency of over-threshold expression of the genes in the cluster, named by the respective genes
Details
The function prepares the cluster list needed as input to `cluster_communication()`. It requires the `gene_by_cell` matrix used in single cell analysis. We suggest using the normalized data to avoid accounting for differences in gene counts. The matrix is binarized by assigning a `1` to all the genes that have a normalized expression value equal to or higher than `mean_t`, and `0` otherwise. Then, for each cluster, the function discards the genes expressed in a number of cells lower than `cell_t` and, for the remaining genes, calculates the frequency of over-threshold expression across all cells of the cluster. The function returns one vector for each cluster, each composed of the frequencies named after the corresponding genes. The vectors are filtered to keep only the genes in `universe`, that is, the genes of interest
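The steps described above can be sketched in base R as follows. This is a minimal illustration written from the description only, not the package's actual implementation: the helper name `sketch_expr_list()` and the toy data are hypothetical, and a dense matrix with gene names as row names is assumed.

```r
# Hypothetical helper mirroring the documented steps (not the package code)
sketch_expr_list <- function(mtx, clusters, mean_t, cell_t, universe) {
  stopifnot(length(clusters) == ncol(mtx))

  # 1. Binarize: 1 where expression is >= mean_t, 0 otherwise
  bin <- (mtx >= mean_t) * 1

  # 2. Process the cells of each cluster separately
  lapply(split(seq_len(ncol(mtx)), clusters), function(idx) {
    sub <- bin[, idx, drop = FALSE]

    # Number of cells of the cluster in which each gene is over threshold
    n_expr <- rowSums(sub)

    # 3. Discard genes expressed in fewer than cell_t cells
    keep <- n_expr >= cell_t

    # 4. Frequency of over-threshold expression across the cluster's cells
    freq <- n_expr[keep] / length(idx)

    # 5. Keep only the genes of interest
    freq[names(freq) %in% universe]
  })
}

# Toy input: 3 genes, 4 cells, 2 clusters (hypothetical values)
mtx <- matrix(
  c(2, 0, 3, 3,
    0, 0, 1, 0,
    5, 4, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:4))
)
clusters <- c("c1", "c1", "c2", "c2")

res <- sketch_expr_list(mtx, clusters, mean_t = 1, cell_t = 1,
                        universe = c("geneA", "geneC"))
res
```

In this toy run, `geneB` passes the `cell_t` filter in cluster `c2` but is dropped at the end because it is not in `universe`, which shows that the two filters act independently.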