analysis/00_preprocess.md

Preprocessing

This document describes the preprocessing steps run before the rest of the analysis scripts and points to the scripts used for each step.

Sample table

See code in .qmd document for LaTeX table.

Alignment

For both Slide-seq and Visium data, we built custom bowtie2 indices for the pooled transcriptome of the 129 and CAST mice.

Alignment was run with a command of the following form:

bowtie2 -x bowtie2_index_129xCAST \
        -k 100 \
        -p 32 \
        --very-sensitive \
        -U ./tagged2.fastq |
        samtools view -bS - > ./tagged_bwt2_129_CAST.bam

BAM files were then processed with a custom Python script that identifies uniquely mapped reads based on the number of mismatches. The script also counts reads that map uniquely but cannot be assigned to a single allele; these counts are used as input for cell type assignment.

RCTD

RCTD was run on each sample individually using the scripts:

Note that I use > only to indicate the output file of each script; it is not necessary to run the script with output redirection.

See run_rctd.sbatch for SLURM job submission resources.

Overall ASE: spASE

  1. Overall maternal/paternal bias - we assume $\text{logit}(p_{j}) = \beta_{0,j}$, i.e. the mean maternal probability does not change based on cell type or spatial location.

     - run_spase_hippo1_overall_bias.R > results_overall_bias_hippo_1.rds
     - run_spase_hippo2_overall_bias.R > results_overall_bias_hippo_2.rds
     - run_spase_hippo3_overall_bias.R > results_overall_bias_hippo_3.rds
     - run_spase_cere3_overall_bias.R > results_overall_bias_cere_3.rds
     - run_spase_cere4_visium_overall_bias.R > results_overall_bias_cere_4_visium.rds

See run_spase_overall_bias.sbatch for SLURM job submission resources.
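
To make the intercept-only model concrete, here is a minimal sketch that estimates $\beta_{0,j}$ for a single gene from pooled allele counts. It assumes a plain binomial likelihood, under which the estimate is the logit of the pooled maternal fraction; the spASE scripts may use a different likelihood (for example, one with overdispersion), so this is an illustration rather than the package's method. The function name `overall_bias` and the `pseudocount` parameter are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability in (0, 1)."""
    return math.log(p / (1.0 - p))


def overall_bias(maternal_counts, total_counts, pseudocount=0.5):
    """Estimate beta_0 and a Wald standard error for one gene.

    Counts are pooled over all spots. `pseudocount` (an assumption of this
    sketch) keeps the estimate finite when all reads come from one allele;
    set it to 0.0 to get the plain binomial maximum-likelihood estimate.
    """
    m = sum(maternal_counts) + pseudocount
    n = sum(total_counts) + 2.0 * pseudocount
    p_hat = m / n
    beta0 = logit(p_hat)
    # Standard error of logit(p_hat) under a binomial model.
    se = math.sqrt(1.0 / (n * p_hat * (1.0 - p_hat)))
    return beta0, se


# Example: three spots with (maternal, total) counts for one gene.
beta0, se = overall_bias([8, 5, 7], [10, 10, 10])
z = beta0 / se  # Wald statistic for H0: beta_0 = 0, i.e. no allelic bias
```

A positive `beta0` indicates a maternal bias and a negative one a paternal bias.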

  1. Overall spatial pattern (no cell type effect) - we assume $$\text{logit}(p_{i,j}) = \beta_{0,j} + \sum_{\ell=1}^L x_{i,\ell}\beta_{\ell,j},$$ where $x_{i,\ell}$, $\ell = 1, \dots, L$, are thin plate spline basis functions with $L$ degrees of freedom evaluated at spot $i$.

     - run_spase_hippo1_overall_spatial.R > results_overall_spatial_hippo_1.rds
     - run_spase_hippo2_overall_spatial.R > results_overall_spatial_hippo_2.rds
     - run_spase_hippo3_overall_spatial.R > results_overall_spatial_hippo_3.rds
     - run_spase_cere3_overall_spatial.R > results_overall_spatial_cere_3.rds
     - run_spase_cere4_visium_overall_spatial.R > results_overall_spatial_cere_4.rds

See run_spase_overall_spatial.sbatch for SLURM job submission resources.
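
The spatial model can be illustrated by evaluating the maternal probability at a spot given a set of basis functions and coefficients. The sketch below uses the classic 2-D thin plate spline radial function $r^2 \log r$ centred on a handful of knots. This is an assumption made for illustration: the basis used by spASE may be constructed differently (for example, as a low-rank smoother), and the function names, knots, and coefficients here are hypothetical.

```python
import math


def tps_kernel(r):
    """2-D thin plate spline radial function r^2 * log(r), defined as 0 at r = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r)


def basis_row(spot, knots):
    """Evaluate one radial basis function per knot at a single (x, y) spot."""
    return [tps_kernel(math.dist(spot, knot)) for knot in knots]


def maternal_probability(spot, knots, beta0, betas):
    """Return expit(beta_0 + sum_l x_{i,l} * beta_l) for one spot and one gene."""
    x = basis_row(spot, knots)
    eta = beta0 + sum(x_l * b_l for x_l, b_l in zip(x, betas))
    return 1.0 / (1.0 + math.exp(-eta))


# Example with L = 2 hypothetical knots and made-up coefficients.
knots = [(0.0, 0.0), (1.0, 1.0)]
p = maternal_probability((0.5, 0.2), knots, beta0=0.1, betas=[0.3, -0.2])
```

When every $\beta_{\ell,j}$ is zero, the expression reduces to the overall-bias model, so a test of the spline coefficients asks whether the maternal probability varies across the tissue.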

Cell type-specific analyses: C-SIDE and spASE

  1. Within cell type maternal/paternal bias - we assume $$p_{i,j} = \sum_{k=1}^K \alpha_{i,j,k} \text{expit}(\beta_{0,k,j}).$$
  2. Within cell type spatial pattern - we assume $$p_{i,j} = \sum_{k=1}^K \alpha_{i,j,k} \, \text{expit} \left(\beta_{0,k,j} + \sum_{\ell=1}^L x_{i,\ell}\beta_{\ell,k,j}\right),$$ where $x_{i,\ell}$, $\ell = 1, \dots, L$, are thin plate spline basis functions with $L$ degrees of freedom evaluated at spot $i$.
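
Both cell type-specific models express the maternal probability at a spot as a weighted mixture over cell types. The sketch below evaluates that mixture for one spot and one gene. It assumes the weights $\alpha_{i,j,k}$ are non-negative and sum to one across the $K$ cell types at each spot, which this document does not state explicitly; the function names are hypothetical.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def cell_type_etas(beta0, betas, x):
    """Linear predictor per cell type: beta_{0,k} + sum_l x_l * beta_{l,k}.

    beta0 : list of K intercepts
    betas : list of K coefficient lists, each of length L (use [] for model 1)
    x     : list of L basis values at this spot (use [] for model 1)
    """
    return [b0 + sum(x_l * b_l for x_l, b_l in zip(x, b)) for b0, b in zip(beta0, betas)]


def mixture_probability(alpha, etas):
    """p_{i,j} = sum_k alpha_k * expit(eta_k) for one spot and one gene."""
    if len(alpha) != len(etas):
        raise ValueError("alpha and etas must have one entry per cell type")
    if abs(sum(alpha) - 1.0) > 1e-8:  # assumption: weights sum to one
        raise ValueError("cell type weights are assumed to sum to 1")
    return sum(a * expit(e) for a, e in zip(alpha, etas))


# Model 1 (bias only): two cell types, no spatial terms, made-up values.
p_bias = mixture_probability([0.7, 0.3], cell_type_etas([0.5, -1.0], [[], []], []))

# Model 2 (spatial): the same spot with L = 2 hypothetical basis values.
etas = cell_type_etas([0.5, -1.0], [[0.2, 0.0], [0.0, 0.4]], [1.0, -0.5])
p_spatial = mixture_probability([0.7, 0.3], etas)
```

Note that the mixing happens on the probability scale, after applying expit within each cell type, rather than on the logit scale.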

C-SIDE and spASE were run on each sample individually using the scripts:

See run_spase_celltype.sbatch for example SLURM job submission resources.



lulizou/spASE documentation built on May 22, 2024, 5:24 a.m.