Immunobiology of Aging scRNA-seq

Overview

scRNA-seq data from the Immunobiology of Aging cohort was generated on the 10x Genomics Flex Gene Expression platform.

Below, we provide labeled and annotated PBMC scRNA-seq data from this healthy adult cohort. We provide the full dataset, as well as subsets for each major class of cell types.

Each .h5ad file for this project contains sample and subject metadata, as well as cell-level cell type labels and QC metrics. The following values are stored in the .obs section of these .h5ad files as descriptions of observations (cells):

Sample Identifiers
cohort.cohortGuid: A Globally Unique Identifier (GUID) of the Cohort the Subject enrolled in for our study
subject.subjectGuid: A GUID for the Subject
sample.sampleKitGuid: A GUID for the Sample Kit, representing all material collected at a visit
specimen.specimenGuid: A GUID for the specific aliquot used for the experiment

Subject Metadata
subject.biologicalSex: The biological sex of the Subject
subject.ageAtFirstDraw: The Age of the Subject at their first on-study sample collection
subject.race: The self-reported Race of the Subject
subject.ethnicity: The self-reported Ethnicity of the subject
subject.cmv: The CMV Status of the subject, as determined by an HCMV assay

Sample Metadata
sample.visitName: The name of the study visit (i.e. time point)
sample.subjectAgeAtDraw: The age of the Subject in years at the time of sample collection

Process Identifiers
batch_id: A GUID for the batch of samples processed together (e.g. B039)
pool_id: A GUID for the pool of samples combined for Cell Hashing (e.g. B039-P1)
chip_id: A GUID for the 10x Genomics chip the cells were loaded into (e.g. B039-P1C2)
well_id: A GUID for the 10x Genomics well the cells were loaded into within the chip (e.g. B039-P1C2W4)
*barcodes: A GUID for the individual cell
original_barcodes: The original, sequence-based barcode generated by 10x Genomics Cell Ranger software
cell_name: A quasi-unique, memorable cell identifier generated using an adjective-adjective-animal structure

*used as the primary cell index in our .h5ad files
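
The example values above (B039, B039-P1, B039-P1C2, B039-P1C2W4) suggest that each process identifier extends the one before it. The sketch below splits a well_id into its parent identifiers under that assumption. The regular expression and the helper name split_well_id are hypothetical and inferred only from those four examples, so identifiers in the files that follow a different pattern will raise an error rather than be silently misparsed.

```python
import re

# Assumed pattern, inferred only from the documented examples:
# batch "B039", pool "B039-P1", chip "B039-P1C2", well "B039-P1C2W4".
WELL_ID_PATTERN = re.compile(r"^(B\d+)-(P\d+)(C\d+)(W\d+)$")


def split_well_id(well_id: str) -> dict:
    """Split a well_id into batch, pool, chip, and well identifiers."""
    match = WELL_ID_PATTERN.match(well_id)
    if match is None:
        raise ValueError(f"unexpected well_id format: {well_id!r}")
    batch, pool, chip, well = match.groups()
    return {
        "batch_id": batch,
        "pool_id": f"{batch}-{pool}",
        "chip_id": f"{batch}-{pool}{chip}",
        "well_id": f"{batch}-{pool}{chip}{well}",
    }


parts = split_well_id("B039-P1C2W4")
print(parts)
```

Because batch_id, pool_id, chip_id, and well_id are all stored in .obs, a helper like this is mainly useful as a consistency check or when only a well_id is at hand.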

Cell QC Metrics
n_reads: Number of reads assigned to the cell barcode
n_umis: Number of Unique Molecular Identifiers (unique molecules) detected
n_genes: Number of genes with at least 1 UMI detected
total_counts_mito: Total number of reads that were assigned to mitochondrial genes
pct_counts_mito: Percent of reads that were assigned to mitochondrial genes
doublet_score: Doublet score assigned by Scrublet for doublet detection
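
As an illustration of how these QC columns might be used, the sketch below filters a small, made-up table that stands in for the .obs data frame of one of the files. The thresholds are hypothetical placeholders chosen for the example. They are not the cutoffs used to prepare this dataset.

```python
import pandas as pd

# Toy stand-in for adata.obs; values are invented for illustration.
obs = pd.DataFrame(
    {
        "n_genes": [1800, 150, 2400, 2100],
        "pct_counts_mito": [3.2, 1.0, 22.5, 4.8],
        "doublet_score": [0.05, 0.02, 0.10, 0.61],
    },
    index=["cell_a", "cell_b", "cell_c", "cell_d"],
)

# Hypothetical thresholds (assumptions, not the project's QC criteria).
MIN_GENES = 200
MAX_PCT_MITO = 10.0
MAX_DOUBLET_SCORE = 0.3

# Boolean mask: True for cells that pass all three checks.
keep = (
    (obs["n_genes"] >= MIN_GENES)
    & (obs["pct_counts_mito"] <= MAX_PCT_MITO)
    & (obs["doublet_score"] <= MAX_DOUBLET_SCORE)
)
filtered = obs[keep]
print(filtered.index.tolist())
```

With a loaded AnnData object, the same mask could be applied to the cells as adata[keep.values].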

Cell Labeling Results
AIFI_L1: Final broad class cell type label (9 types)
predicted_AIFI_L1: Predicted AIFI_L1 type assigned by CellTypist
AIFI_L2: Final mid resolution cell type label (29 types)
predicted_AIFI_L2: Predicted AIFI_L2 type assigned by CellTypist
AIFI_L3: Final high resolution cell type label (71 types)
predicted_AIFI_L3: Predicted AIFI_L3 type assigned by CellTypist
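
Since each level has both a final label and a CellTypist prediction, one simple check is how often the two agree. The sketch below does this on a made-up table standing in for .obs. The label strings are placeholders and may not match the actual AIFI_L1 category names.

```python
import pandas as pd

# Toy stand-in for adata.obs; label names are placeholders.
obs = pd.DataFrame(
    {
        "AIFI_L1": ["T cell", "T cell", "B cell", "NK cell", "Monocyte"],
        "predicted_AIFI_L1": ["T cell", "NK cell", "B cell", "NK cell", "Monocyte"],
    }
)

# Fraction of cells whose final label equals the predicted label.
agreement = (obs["AIFI_L1"] == obs["predicted_AIFI_L1"]).mean()

# Contingency table: rows are final labels, columns are predicted labels.
confusion = pd.crosstab(obs["AIFI_L1"], obs["predicted_AIFI_L1"])

print(f"agreement: {agreement:.2f}")
print(confusion)
```

The same comparison applies to the AIFI_L2 and AIFI_L3 column pairs by swapping the column names.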

Cell population .h5ad files

Here, we group cells by major population category. These files contain cells from all samples.

We are providing our scRNA-seq data in AnnData (.h5ad) format. For more details about AnnData, see the AnnData Documentation Page.

These files contain both normalized high-variance genes and raw count data. Normalized data is the active layer by default. In Python, the raw counts can be accessed using:

adata = adata.raw.to_adata()

Each file provided below contains either the full set of ~3.8 million cells or a subset for a major cell population category. Each file contains cells from the full set of 234 samples. Cell counts and approximate file sizes are listed below:

File Name                                 N Cells     File Size
imm-of-aging_all_cells.h5ad               3,758,514   50 GB
imm-of-aging_b-plasma_cells.h5ad          455,893     14 GB
imm-of-aging_cd4-memory-treg_cells.h5ad   854,753     30 GB
imm-of-aging_cd4-naive_cells.h5ad         770,809     25 GB
imm-of-aging_cd8-gdt-mait-dnt_cells.h5ad  717,559     25 GB
imm-of-aging_dc-monocyte_cells.h5ad       389,187     16 GB
imm-of-aging_nk-ilc_cells.h5ad            560,959     19 GB
imm-of-aging_other_cells.h5ad             9,354       0.25 GB