SpaceX (spatially dependent gene co-expression network) is a Bayesian methodology for identifying both shared and cluster-specific co-expression networks across genes. The clusters can be cell-type specific or based on spatial regions. SpaceX uses an over-dispersed spatial Poisson model coupled with a high-dimensional factor model, which relies on a dimension reduction technique for computational efficiency.
The figure above shows the overall conceptual flow of our pipeline. Panel A is an image of a tissue section from the region of interest. Panel B shows the spatial gene expression and biomarkers recorded from that tissue section using sequencing techniques. Panel C is the resulting data matrix of gene expression, together with the spatial locations and cluster annotations on the tissue. All of these serve as input to the SpaceX model, which produces the shared (Panel D) and cluster-specific (Panel E) co-expression networks. Finally, we use these networks in downstream analysis to detect gene modules and hub genes across spatial regions (Panels F and G, respectively) for biological interpretation.
This package requires a Fortran compiler in order to work. Here are the instructions:
- Windows: install the version of Rtools that is appropriate for your version of R.
- Mac: go to https://mac.R-project.org/tools/ and follow the instructions.
- Linux: from a terminal, run the following, which will bring in multiple compilers:
sudo apt install gcc
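Before installing, it can help to confirm that R can see a Fortran compiler. The following is a minimal sketch using base R's Sys.which. It assumes the compiler executable is called gfortran and is on the PATH, which may not hold for every setup (Rtools on Windows, for example, manages its own paths), so treat a negative result as a hint rather than proof.

```r
# Look up the gfortran executable on the PATH.
# Sys.which() returns an empty string when the program is not found.
# ASSUMPTION: the Fortran compiler is named "gfortran" on this system.
fortran_path <- Sys.which("gfortran")
has_fortran <- unname(nzchar(fortran_path))

if (has_fortran) {
  message("Fortran compiler found at: ", fortran_path)
} else {
  message("gfortran was not found on the PATH; see the instructions above.")
}
```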
The package requires a dependency that is not available on CRAN. Install it with:
remotes::install_github("rdevito/MSFA")
You can install the released version of SpaceX from GitHub (https://github.com/SatwikAch/SpaceX) with:
devtools::install_github("SatwikAch/SpaceX")
library(SpaceX)
#> Loading required package: PQLseq
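The first input is Gene_expression_mat, which is an N × G matrix of gene expression counts, where N is the number of spatial locations (rows) and G is the number of genes (columns).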
The second input is Spatial_locations, a data frame that contains the spatial coordinates.
The third input is Cluster_annotations.
The fourth input is sPMM. If TRUE, the code will return the estimates of sigma1_sq and sigma2_sq from the spatial Poisson mixed model.
The fifth input is Post_process. If FALSE, the code will return the
posterior samples of
The final input is numCore, the number of cores requested for parallel computing. The default is 1.
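To make the expected layout of the first three inputs concrete, here is a minimal sketch that builds synthetic inputs and checks that they line up. It assumes Gene_expression_mat has one row per location and one column per gene, as the N and G definitions in the breast cancer example suggest. The helper check_spacex_inputs is hypothetical and is not part of the SpaceX package; all values are simulated.

```r
## Synthetic inputs in the assumed layout (not real data).
set.seed(1)
N <- 50   # number of spatial locations
G <- 10   # number of genes

# N x G matrix of counts: rows are locations, columns are genes
Gene_expression_mat <- matrix(rpois(N * G, lambda = 5), nrow = N, ncol = G)
colnames(Gene_expression_mat) <- paste0("gene", seq_len(G))

# Data frame of 2-D spatial coordinates, one row per location
Spatial_locations <- data.frame(x = runif(N), y = runif(N))

# One cluster label per location
Cluster_annotations <- sample(c("A", "B", "C"), N, replace = TRUE)

# HYPOTHETICAL helper (not in SpaceX): stop early if the inputs disagree
check_spacex_inputs <- function(expr, loc, clusters) {
  stopifnot(
    is.matrix(expr), is.numeric(expr),   # expression is a numeric matrix
    is.data.frame(loc), ncol(loc) >= 2,  # at least two coordinate columns
    nrow(expr) == nrow(loc),             # one coordinate row per location
    length(clusters) == nrow(expr)       # one cluster label per location
  )
  invisible(TRUE)
}

inputs_ok <- check_spacex_inputs(Gene_expression_mat,
                                 Spatial_locations,
                                 Cluster_annotations)
```

Running a check like this before calling SpaceX can catch a transposed matrix or a mismatched annotation vector before a long model fit starts.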
You will obtain a list of objects as output.
Posterior_samples contains all the posterior samples.
Shared_network provides the shared co-expression matrix
(transformed correlation matrix of
Cluster_network provides the cluster specific co-expression
matrices (transformed correlation matrices of
The following example uses the breast cancer data to demonstrate how to run the SpaceX function and obtain the shared and cluster-specific networks.
## Reading the Breast cancer data
## Spatial locations
head(BC_loc)
## Gene expression data
head(BC_count)
## Data processing
G <- ncol(BC_count) ## number of genes
N <- nrow(BC_count) ## number of locations
## Apply the SpaceX algorithm (make sure to request enough memory to hold the posterior samples)
BC_fit <- SpaceX(BC_count, BC_loc[, 1:2], BC_loc[, 3], sPMM = FALSE, Post_process = TRUE, numCore = 2)
## Shared_network :: Shared co-expression matrix
## Cluster_network :: Cluster specific co-expression matrices
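The overview mentions using the estimated networks to detect hub genes. As a rough illustration of that idea, the sketch below ranks genes by how many strong co-expression links they have. It is not the authors' downstream procedure: the matrix toy_network is a small hand-made stand-in for a co-expression matrix such as Shared_network, the helper set_pair is hypothetical, and the 0.5 threshold is an arbitrary choice for the example.

```r
## Hand-made symmetric co-expression matrix (illustrative values only).
genes <- paste0("gene", 1:5)
toy_network <- diag(5)
dimnames(toy_network) <- list(genes, genes)

# HYPOTHETICAL helper: set a symmetric pair of entries
set_pair <- function(m, i, j, value) {
  m[i, j] <- value
  m[j, i] <- value
  m
}
toy_network <- set_pair(toy_network, "gene1", "gene2", 0.80)
toy_network <- set_pair(toy_network, "gene1", "gene3", -0.70)
toy_network <- set_pair(toy_network, "gene1", "gene4", 0.60)
toy_network <- set_pair(toy_network, "gene2", "gene3", 0.20)
toy_network <- set_pair(toy_network, "gene4", "gene5", 0.55)

# Keep links whose absolute co-expression exceeds the (arbitrary) threshold
threshold <- 0.5
adjacency <- abs(toy_network) > threshold
diag(adjacency) <- FALSE          # ignore self-links

# Degree = number of strong links per gene; the top-ranked genes are hub candidates
degree <- rowSums(adjacency)
hub_ranking <- sort(degree, decreasing = TRUE)
top_hub <- names(hub_ranking)[1]
print(hub_ranking)
```

With a fitted model, the same steps could in principle be applied to the shared network or to each cluster-specific matrix in turn, provided those objects are plain numeric matrices with gene names as dimnames.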
The tutorial website can be found here.
Satwik Acharyya, Xiang Zhou and Veerabhadran Baladandayuthapani (2022). SpaceX: Gene Co-expression Network Estimation for Spatial Transcriptomics. Bioinformatics, 38(22): 5033–5041.
- Please run the SpaceX package in R 4.1.2.
- Please email [email protected] with any issues.