Tutorial 11: Joint analysis of mouse brain anterior-posterior sections

In this tutorial, we demonstrate how to integrate multiple samples of mouse brain data horizontally using STGMVA.

You can download the two sections of mouse brain data at https://zenodo.org/record/8141084.

This data is stored in “11. Mouse Brain Anterior Posterior Section 2_multi-sample integation” folder.

STGMVA could dsicern the same spatial domains in two different anterior-posterior sections of mouse brain.

Loading package

[10]:

import os
import torch
import pandas as pd
import scanpy as sc
from sklearn import metrics
import multiprocessing as mp
import squidpy as sq
import numpy as np
import matplotlib.pyplot as plt
import anndata as ad
from sklearn.decomposition import PCA

import warnings
warnings.filterwarnings("ignore")

[2]:

from STGMVA.STGMVA import STGMMVE
from STGMVA import mk_dir

2023-07-15 17:33:09.719922: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

Reading ST data

[3]:

adata = sc.read_h5ad("/home/tengliu/Paper6-NC/STGMVA_Tutorials/ST data/filtered_feature_bc_matrix.h5ad")
adata.var_names_make_unique()

[4]:

adata

[4]:

AnnData object with n_obs × n_vars = 6114 × 32285
    obs: 'in_tissue', 'array_row', 'array_col', 'data'
    obsm: 'spatial'

[5]:

sc.pl.embedding(adata, basis='spatial',size=35,
                show=False)

[5]:

<Axes: xlabel='spatial1', ylabel='spatial2'>

_images/Tutorial_8_mouse_brain_anterior_posterior_8_1.svg

Create directory for pretrained model

[6]:

save_path='./Samp_results/'
section_id = "mouse_brain_section2_integration"
mk_dir(save_path,section_id)

Training the model

STGMVA aims to learn the representations by two-step process. First, pretrained the GMM clustering model. Second, discerned the spatial domains for spatial transcriptomics data.

After model training, the learned representations will be saved in adata.obsm[‘embedding’], and can be used for spatial clustering.

[8]:

model = STGMMVE(adata, datatype="mouse_brain", nCluster=26,save_path=save_path, section_id=section_id)

# model.pretrain() # Train your own pretranined model or use the pretrained model we provided.

adata_res = model.train_cluster()

Graph constructed!

R[write to console]:                    __           __
   ____ ___  _____/ /_  _______/ /_
  / __ `__ \/ ___/ / / / / ___/ __/
 / / / / / / /__/ / /_/ (__  ) /_
/_/ /_/ /_/\___/_/\__,_/____/\__/   version 5.4.10
Type 'citation("mclust")' for citing this R package in publications.

fitting ...
  |======================================================================| 100%

  0%|          | 0/50 [00:15<?, ?it/s]

Epoch: 0
NMI=0.798208, ARI=0.693354
Loss=26.6892,  ELBO Loss=89.8866

 10%|█         | 5/50 [02:11<19:26, 25.92s/it]

Epoch: 5
NMI=0.755517, ARI=0.574254
Loss=27.8406,  ELBO Loss=62.3367

 20%|██        | 10/50 [03:59<15:45, 23.64s/it]

Epoch: 10
NMI=0.760985, ARI=0.576200
Loss=27.8027,  ELBO Loss=51.1024

 30%|███       | 15/50 [05:46<12:57, 22.20s/it]

Epoch: 15
NMI=0.768384, ARI=0.607746
Loss=27.7586,  ELBO Loss=46.5559

 40%|████      | 20/50 [07:26<09:43, 19.45s/it]

Epoch: 20
NMI=0.763626, ARI=0.589126
Loss=27.4439,  ELBO Loss=44.2383

 50%|█████     | 25/50 [08:43<06:59, 16.79s/it]

Epoch: 25
NMI=0.778288, ARI=0.611737
Loss=27.3582,  ELBO Loss=43.2493

 60%|██████    | 30/50 [10:15<05:31, 16.59s/it]

Epoch: 30
NMI=0.767700, ARI=0.588367
Loss=27.2577,  ELBO Loss=42.6749

 70%|███████   | 35/50 [11:47<04:41, 18.77s/it]

Epoch: 35
NMI=0.768342, ARI=0.602319
Loss=27.2522,  ELBO Loss=42.2562

 80%|████████  | 40/50 [13:22<03:01, 18.14s/it]

Epoch: 40
NMI=0.751411, ARI=0.563726
Loss=27.1699,  ELBO Loss=41.8975

 90%|█████████ | 45/50 [14:48<01:33, 18.79s/it]

Epoch: 45
NMI=0.792349, ARI=0.657427
Loss=27.1688,  ELBO Loss=41.7008

100%|██████████| 50/50 [15:57<00:00, 19.15s/it]

[9]:

adata_res

[9]:

AnnData object with n_obs × n_vars = 6114 × 32285
    obs: 'in_tissue', 'array_row', 'array_col', 'data', 'pre_label'
    var: 'highly_variable', 'highly_variable_rank', 'means', 'variances', 'variances_norm'
    uns: 'hvg', 'log1p', 'mclust_labels', 'ari_list', 'loss'
    obsm: 'spatial', 'distance_matrix', 'graph_neigh', 'adj', 'feat_mat', 'embedding'

For the downstream clustering method, we use the mclust clustering method.

Visualization

figure1

[13]:

sq.pl.spatial_scatter(adata_res, color="pre_label", size=15, figsize=(8,6),shape=None)

WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`
WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`
WARNING: Please specify a valid `library_id` or set it permanently in `adata.uns['spatial']`

_images/Tutorial_8_mouse_brain_anterior_posterior_22_1.svg

figure2

[14]:

sq.pl.spatial_scatter(adata_res, color="pca_refine_no", size=15, figsize=(8,6),shape=None)

_images/Tutorial_8_mouse_brain_anterior_posterior_24_0.svg

figure3

[15]:

sq.pl.spatial_scatter(adata_res, color="refine_no", size=15, figsize=(8,6),shape=None)

_images/Tutorial_8_mouse_brain_anterior_posterior_26_0.svg

Save the results for more figures

[16]:

adata_res.filename = './'+"mouse_brain2_integration.h5ad"