• HIRISA
  • Analysis Methods

     

    Data Processing 

    scRNA-seq FASTQs were processed into count matrices using Cell Ranger (version 7.0).

    10x Flex well-level quality control metrics (total reads, valid barcodes, median genes per cell, and fraction of reads in cells) were used to assess dataset quality and to set filtering thresholds. Data from individual samples, assembled across wells, were given a first pass of cell type labels by Seurat (v5) (Hao, et al., 2023) label transfer, using the PBMC reference dataset provided by the Satija Lab (Hao and Hao, et al., 2021).

    scRNA-seq count matrices were normalized and scaled using Seurat (v5). Cell populations were labelled manually by clustering and inspecting a combination of selected reference markers for each cell type. Cells mapping to dendritic cell populations were removed. Doublets were identified and removed using Scrublet (v0.2) (Wolock, et al., 2019).

    Differential Expression N-of-1 Analysis

    To identify genes that responded significantly to each stimulation, we performed an N-of-1 analysis: each donor was analyzed separately, and only genes that were significant in all five donors were retained in the final gene sets. Within each combination of donor, stimulation, and cell type, we sampled 1,000 cells each from the stimulated and the non-stimulated control samples and tested differential expression using MAST (Finak, et al., 2015). We performed 20 iterations of sampling and testing, and genes with an adjusted p value < 0.01 and |log₂FC| > 0.2 in 10 or more iterations were retained as donor-level significantly differentially expressed genes. Final gene sets were selected by retaining genes that were significantly differentially expressed in all five donors. Log₂FC values reported in final gene sets are the median of log₂FC values across all five donors.
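
    The retention rules above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the MAST results for each sampling iteration are already available as a mapping from gene to (adjusted p value, log₂FC), and the function names are hypothetical. The text does not say how a donor-level log₂FC is summarized across iterations, so the median over iterations is used here purely as an assumption.

```python
from statistics import median

PADJ_CUTOFF = 0.01     # adjusted p value threshold from the text
LFC_CUTOFF = 0.2       # |log2FC| threshold from the text
MIN_ITERATIONS = 10    # a gene must pass in 10 or more of the 20 iterations


def donor_level_genes(iterations):
    """Return {gene: log2FC} for genes passing in >= MIN_ITERATIONS iterations.

    `iterations` is a list with one dict per sampling iteration, mapping
    gene -> (adjusted p value, log2FC). ASSUMPTION: the donor-level log2FC
    is the median over all iterations in which the gene was tested.
    """
    passes = {}
    lfcs = {}
    for result in iterations:
        for gene, (padj, lfc) in result.items():
            lfcs.setdefault(gene, []).append(lfc)
            if padj < PADJ_CUTOFF and abs(lfc) > LFC_CUTOFF:
                passes[gene] = passes.get(gene, 0) + 1
    return {g: median(lfcs[g]) for g, n in passes.items() if n >= MIN_ITERATIONS}


def final_gene_set(per_donor):
    """Keep genes significant in every donor; report the median log2FC across donors.

    `per_donor` is a list of dicts as returned by donor_level_genes,
    one per donor (five donors in the study).
    """
    shared = set(per_donor[0])
    for donor in per_donor[1:]:
        shared &= set(donor)
    return {g: median(d[g] for d in per_donor) for g in sorted(shared)}
```

    Requiring a gene to pass in every donor makes the final set conservative: a single donor in which the gene fails the iteration rule is enough to exclude it.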

    Pathway Analysis

    Gene set enrichment analysis was conducted using fgsea (v4.5) (Korotkevich, et al., 2021) and the 2024 GO Biological Processes dataset. Redundant pathway terms were collapsed into larger term groups using rrvgo (v1) (Sayols, 2023). Enrichment was considered significant at a cutoff of adjusted p value < 0.05.

    PCA Analysis

    Principal components analysis (PCA) of cell type responses was performed using the prcomp function in R. For each stimulated condition, the PCA input matrix consisted of the median log₂FC values for significant ISGs across cell types relative to the unstimulated condition.
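
    For readers more familiar with Python, the computation that prcomp performs with its default settings (column centering, no scaling, singular value decomposition) can be sketched with NumPy. The analysis itself used R. The orientation of the input matrix (rows as cell types, columns as ISGs) is an assumption made for illustration, and the function name is hypothetical.

```python
import numpy as np


def pca_prcomp_like(X):
    """PCA via SVD, mirroring R's prcomp defaults (center=TRUE, scale.=FALSE).

    X: 2-D array; ASSUMED layout is rows = cell types, columns = ISGs,
    with entries being median log2FC values.
    Returns (scores, rotation, sdev).
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)                 # center each column
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U * s                                # coordinates of each row on the PCs
    rotation = Vt.T                               # loadings (one column per PC)
    sdev = s / np.sqrt(X.shape[0] - 1)            # standard deviation of each PC
    return scores, rotation, sdev
```

    The squared values of sdev, divided by their sum, give the fraction of variance explained by each component.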

    IFN Response Scoring Method

    Cell type IFN-α and IFN-γ response scores were calculated by non-negative least squares using nnls in R, given the following:

    Minimize:
    ‖A·x − b‖₂

    Subject to:
    x ≥ 0

    Here, A is an n(ISG) × 2 matrix, where n(ISG) is the number of interferon-stimulated genes in the union of the significant IFN-α and IFN-γ gene sets for that cell type. Column 1 of A contains log₂ fold-changes of these genes in response to IFN-α. Column 2 contains log₂ fold-changes in response to IFN-γ. The vector b contains log₂ fold-changes of the same genes in the sample of interest relative to its matched control. All columns of A and b were scaled. The estimated coefficients given in x, restricted to be non-negative, were taken as the IFN-α and IFN-γ response scores.
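
    As an illustration of the optimization above, the sketch below solves the two-column problem directly in Python. The study used nnls in R; this version instead relies on the fact that, with only two coefficients, the constrained optimum is either the unconstrained least-squares solution (when it is non-negative) or lies on a boundary where at least one coefficient is zero. The text states only that the columns were scaled, so dividing each column by its root mean square without centering is an assumption, and all function names are hypothetical.

```python
import math


def scale_column(values):
    """ASSUMED scaling: divide by the root mean square, without centering."""
    rms = math.sqrt(sum(v * v for v in values) / len(values))
    return [v / rms for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nnls_two_columns(a1, a2, b):
    """Minimize ||A x - b||_2 subject to x >= 0, where A = [a1 a2]."""
    s11, s12, s22 = _dot(a1, a1), _dot(a1, a2), _dot(a2, a2)
    t1, t2 = _dot(a1, b), _dot(a2, b)

    # Boundary candidates: both zero, or one coefficient fixed at zero.
    candidates = [(0.0, 0.0)]
    if s11 > 0:
        candidates.append((max(t1 / s11, 0.0), 0.0))
    if s22 > 0:
        candidates.append((0.0, max(t2 / s22, 0.0)))

    # Interior candidate: unconstrained solution of the normal equations.
    det = s11 * s22 - s12 * s12
    if det > 1e-12:
        x1 = (s22 * t1 - s12 * t2) / det
        x2 = (s11 * t2 - s12 * t1) / det
        if x1 >= 0 and x2 >= 0:
            candidates.append((x1, x2))

    def rss(x):
        return sum((x[0] * p + x[1] * q - r) ** 2 for p, q, r in zip(a1, a2, b))

    return min(candidates, key=rss)


def ifn_response_scores(alpha_lfc, gamma_lfc, sample_lfc):
    """Return (IFN-alpha score, IFN-gamma score) for one sample.

    Each argument is a list of log2FC values over the same ordered ISGs.
    """
    return nnls_two_columns(
        scale_column(alpha_lfc), scale_column(gamma_lfc), scale_column(sample_lfc)
    )
```

    For larger numbers of reference columns, enumerating candidates does not scale, and a general solver such as the Lawson-Hanson algorithm is the usual choice.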
     

    References

    Finak G, McDavid A, Yajima M, Deng J, Gersuk V, Shalek AK, et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 2015;16: 278.
    doi:10.1186/s13059-015-0844-5

    Hao Y, Hao S, Andersen-Nissen E, Mauck WM 3rd, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184: 3573-3587.e29.
    doi:10.1016/j.cell.2021.04.048

    Hao Y, Stuart T, Kowalski MH, Choudhary S, Hoffman P, Hartman A, et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol. 2023.
    doi:10.1038/s41587-023-01767-y

    Korotkevich G, Sukhov V, Budin N, Shpak B, Artyomov MN, Sergushichev A. Fast gene set enrichment analysis. bioRxiv. 2021. p. 060012.
    doi:10.1101/060012

    Sayols S. rrvgo: a Bioconductor package for interpreting lists of Gene Ontology terms. MicroPubl Biol. 2023;2023.
    doi:10.17912/micropub.biology.000811

    Wolock SL, Lopez R, Klein AM. Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Syst. 2019;8: 281-291.e9.
    doi:10.1016/j.cels.2018.11.005