Last modified 2025-12-11

Use Correct Pipeline Syntax (Tutorial)

Abbreviations Key
AIFI         Allen Institute for Immunology
ATAC-seq     assay for transposase-accessible chromatin with sequencing
csv          comma-separated values
FastQ        FastQuality
FCS          flow cytometry standard
fRNA         fixed RNA
HISE         Human Immune System Explorer
QC           quality control
scRNA-seq    single-cell RNA sequencing
TEA-seq      transcripts, epitopes, and chromatin accessibility sequencing

At a Glance

In this document, we discuss AIFI syntax and formatting requirements for input files in bioinformatics pipelines. We focus on data submission for single-cell and multiomics analyses. Topics include formatting of submission sheet fields and FastQ filename conventions for each assay type, including scRNA-seq, scATAC-seq, fixed RNA, Olink proteomics, supervised gating, TEA-seq, V(D)J, and Vizgen. For instructions on creating a tar file, see Submit and Monitor Pipeline Batches (Tutorial). If you have questions, contact Support.

Formatting Submission Sheet Fields

The SubmissionBC field requires a hyphen after the first two letters (for example, AA-0000). "BC" stands for "barcode," a unique identifier assigned to a sample, batch, or submission in bioinformatics and sequencing workflows. The barcode lets each item be traced, matched, and managed accurately throughout the pipeline. Following this standardized format keeps files traceable, allows automated data validation, and maintains consistency across large datasets.
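As a rough illustration, the hyphen rule could be checked locally before upload. This is a minimal Python sketch, not the validation HISE itself performs: the pattern assumes two letters, a hyphen, and then one or more letters or digits, which matches the examples in this document (AA-0000, FC-0413, SP-AAAAB). The function name is hypothetical.

```python
import re

# Assumption: two letters, a hyphen, then letters or digits.
# The rule enforced by the pipeline may be stricter than this.
SUBMISSION_BC = re.compile(r"[A-Za-z]{2}-[A-Za-z0-9]+")


def is_valid_submission_bc(value: str) -> bool:
    """Return True if the value has a hyphen right after its first two letters."""
    return SUBMISSION_BC.fullmatch(value) is not None


print(is_valid_submission_bc("FC-0413"))  # hyphen in the expected position
print(is_valid_submission_bc("FC0413"))   # hyphen missing
```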


Instructions

 Sign In

1. Navigate to HISE, and use your organizational email address to sign in.

 Follow the Examples for Your Submission Type

1. Locate your submission type in the following sections. Then use the example file names and spreadsheets to format and submit your files.

scRNA-seq

In this section, we describe the required naming pattern for scRNA-seq submission sheets, tar archives, and FastQ files. Follow these conventions so that the pipeline can automatically recognize your batch, link it to the correct submission, and distinguish read files using the “R before P” convention.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx
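The naming format above can be sketched as a small parser. The following Python example is only an illustration: it assumes the BatchID contains letters and digits only and that the trailing eight digits are a date in YYYYMMDD order, neither of which is stated explicitly here. The function name is hypothetical.

```python
import re
from datetime import date, datetime

# Assumptions: BatchID is alphanumeric; the 8-digit suffix is YYYYMMDD.
SHEET_NAME = re.compile(
    r"(?P<batch>[A-Za-z0-9]+)_SEQ_Submission_(?P<date>\d{8})\.xlsx"
)


def parse_sheet_name(filename: str) -> tuple[str, date]:
    """Split a submission sheet file name into its BatchID and date."""
    match = SHEET_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected submission sheet name: {filename!r}")
    # strptime raises ValueError for impossible dates such as 20231340.
    sheet_date = datetime.strptime(match["date"], "%Y%m%d").date()
    return match["batch"], sheet_date


batch_id, sheet_date = parse_sheet_name("B100_SEQ_Submission_20230106.xlsx")
print(batch_id, sheet_date.isoformat())
```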

Submission sheet example

Format the header tab of your spreadsheet as shown in the accompanying example.

     A               B       C
1    SubmissionBC    Pool    Type
2    FC-0413         P1      RNA
3    FC-0413         P2      RNA
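To make the layout concrete, the header tab can be thought of as a header row followed by one row per pool. The Python sketch below represents the rows as plain tuples (reading the .xlsx file itself is out of scope) and groups pools by submission barcode. The function name is hypothetical, and the check on the header row is an assumption about what a validator might enforce.

```python
from collections import defaultdict

EXPECTED_HEADER = ("SubmissionBC", "Pool", "Type")


def pools_by_submission(rows: list[tuple[str, str, str]]) -> dict[str, list[str]]:
    """Group the Pool column by SubmissionBC, after checking the header row."""
    header, *records = rows
    if tuple(header) != EXPECTED_HEADER:
        raise ValueError(f"unexpected header row: {header!r}")
    pools: dict[str, list[str]] = defaultdict(list)
    for submission_bc, pool, _assay_type in records:
        pools[submission_bc].append(pool)
    return dict(pools)


rows = [
    ("SubmissionBC", "Pool", "Type"),
    ("FC-0413", "P1", "RNA"),
    ("FC-0413", "P2", "RNA"),
]
print(pools_by_submission(rows))
```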

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

SubmissionBC_AAA_2022-12-26.tar.gz

Example file name

FC-0000_AAA_2022-12-26.tar.gz
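One way to avoid typos is to assemble the tar file name from its parts. This Python sketch assumes the date segment is written as YYYY-MM-DD, as in the example, and restricts the middle segment to ASCII letters. The function name is hypothetical.

```python
from datetime import date


def tar_name(submission_bc: str, label: str, day: date) -> str:
    """Build a name of the form SubmissionBC_AAA_YYYY-MM-DD.tar.gz."""
    # The middle segment may be any assortment of letters.
    if not (label.isascii() and label.isalpha()):
        raise ValueError(f"label must contain letters only: {label!r}")
    return f"{submission_bc}_{label}_{day.isoformat()}.tar.gz"


print(tar_name("FC-0000", "AAA", date(2022, 12, 26)))
```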

FastQ files (R before P)

Naming format

BatchID-RP1C1W1_S1_L001_R1_001.fastq.gz

Example file name

B100-RP1C1W1_S1_L001_R1_001.fastq.gz
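The FastQ naming format can be broken into its parts with a regular expression. In this Python sketch, the `_S1_L001_R1_001` tail is treated as the usual Illumina sample number, lane, and read fields. Treating the `P1` token as the pool, and leaving `C1W1` uninterpreted, are assumptions made for illustration. The function name is hypothetical.

```python
import re

# Assumptions: BatchID is alphanumeric; "R" marks scRNA-seq and precedes the
# pool token (P1); the C1W1 part is kept inside the sample token as-is.
SCRNA_FASTQ = re.compile(
    r"(?P<batch>[A-Za-z0-9]+)-"
    r"(?P<sample>R(?P<pool>P\d+)C\d+W\d+)_"
    r"S(?P<sample_number>\d+)_"
    r"L(?P<lane>\d{3})_"
    r"(?P<read>[RI][12])_001\.fastq\.gz"
)


def parse_scrna_fastq(filename: str) -> dict[str, str]:
    """Return the named parts of an scRNA-seq FastQ file name."""
    match = SCRNA_FASTQ.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected scRNA-seq FastQ name: {filename!r}")
    return match.groupdict()


print(parse_scrna_fastq("B100-RP1C1W1_S1_L001_R1_001.fastq.gz"))
```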

scATAC-seq

In this section, we outline how to name scATAC-seq submission sheets, tar archives, and FastQ files. The patterns mirror scRNA-seq where possible, with an “A before P” convention in FastQ filenames so that the pipeline can reliably route ATAC reads.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx

Submission sheet example

Format the header tab of your spreadsheet as shown in the accompanying example.

     A               B       C
1    SubmissionBC    Pool    Type
2    FC-0279         P0      ATAC

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

SubmissionBC_AAA_2022-12-26.tar.gz

Example file name

FC-0000_AAA_2022-12-26.tar.gz

FastQ files (A before P)

Naming format

BatchID-AP1C1W1_S1_L001_R1_001.fastq.gz

Example file name

B100-AP1C1W1_S1_L001_R1_001.fastq.gz

Fixed RNA

In this section, we explain how to format fixed RNA submission sheets and file names, including how to represent singleplex versus multiplex pools. We also show how to handle cases in which multiple pools share one or more flow cells. Finally, we describe how to use the fR- prefix to help the pipeline identify fixed RNA FastQ files.

Submission sheets

Naming format

BatchID_SEQ_FRNA_Submission_20230106.xlsx

Example file name

B100_SEQ_FRNA_Submission_20230106.xlsx

Submission sheet header

If multiple pools are associated with one flow cell, put the pools into a comma-separated list in the header sheet, as in the accompanying example.

     A               B           C       D
1    SubmissionBC    Pool        Type    MultiplexFRNA
2    FC-00394        P1,P2,P3    fRNA    Yes
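When reading such a header sheet programmatically, the Pool cell has to be split back into individual pools. A minimal Python sketch follows. Tolerating spaces after the commas is an assumption, and the function name is hypothetical.

```python
def split_pools(cell: str) -> list[str]:
    """Split a comma-separated Pool cell such as 'P1,P2,P3' into pool names."""
    pools = [pool.strip() for pool in cell.split(",")]
    if any(not pool for pool in pools):
        raise ValueError(f"empty pool entry in {cell!r}")
    return pools


print(split_pools("P1,P2,P3"))
```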

Submission sheet examples

Format the header tab on each spreadsheet as shown in the accompanying examples.

Singleplex

     A               B       C       D
1    SubmissionBC    Pool    Type    MultiplexFRNA
2    SP-AAAAB        P1      fRNA    No

Multiplex

     A               B        C       D
1    SubmissionBC    Pool     Type    MultiplexFRNA
2    SM-MXOTY        PA,PB    fRNA    Yes

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

Single flow cell

Use this format for tar files that contain one flow cell. Currently, all fixed RNA tar files use this format.

SubmissionBC_AAA_2022-12-26.tar.gz

Multiple pools

Use this format for tar files with multiple pools shared by two flow cells.

BatchID_SubmissionBC_AAA_2022-12-22.tar.gz
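The two fixed RNA tar formats differ only in the leading BatchID segment, so they can be told apart mechanically. The Python sketch below assumes the SubmissionBC always contains a hyphen after its first two letters and that the BatchID is alphanumeric. The function name and the returned labels are hypothetical.

```python
import re

# Shared tail: SubmissionBC_AAA_YYYY-MM-DD.tar.gz
_TAIL = (
    r"(?P<submission_bc>[A-Za-z]{2}-[A-Za-z0-9]+)_"
    r"(?P<label>[A-Za-z]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})\.tar\.gz"
)
SINGLE_FLOW_CELL = re.compile(_TAIL)
MULTIPLE_POOLS = re.compile(r"(?P<batch>[A-Za-z0-9]+)_" + _TAIL)


def tar_layout(filename: str) -> str:
    """Report which of the two fixed RNA tar naming formats a file name uses."""
    if SINGLE_FLOW_CELL.fullmatch(filename):
        return "single flow cell"
    if MULTIPLE_POOLS.fullmatch(filename):
        return "multiple pools"
    raise ValueError(f"unrecognized fixed RNA tar name: {filename!r}")


print(tar_layout("FC-0000_AAA_2022-12-26.tar.gz"))
print(tar_layout("B100_FC-0000_AAA_2022-12-22.tar.gz"))
```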

FastQ files (fR- before P)

Naming format

BatchID-fR-P1C1W1_S1_L001_I1_001.fastq

Example file name

B100-fR-P1C1W1_S1_L001_I1_001.fastq

Supervised Gating

In this section, we describe how to prepare supervised gating submission sheets and tar archives that bundle FCS files and QC outputs. The naming conventions for tar files and their contents allow the pipeline to find the right FCS files, link them to QC reports, and associate them with the appropriate batch.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx

Submission sheet example

Format your spreadsheet as shown in the accompanying example.

     A               B       C
1    SubmissionBC    Pool    Type
2    FC-0204         P1      RNA

Tar files

Naming format

BatchID_PS1_fcs.tar.gz

Example file name

B064_PS1_fcs.tar.gz

Example files

B064_PS1_SampleID_QC_Report.txt

B064_PS1_SampleID_QC.fcs

B064_PS1_SampleID_QC_Passed_TTTT.png
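Before uploading, it may help to confirm that the files inside the archive match the archive name. The Python sketch below assumes every member is expected to start with the same BatchID_PS1_ prefix as the tar file, as the example files suggest. It builds a tiny in-memory archive so that it runs without real data. The function names and the stray `notes.txt` member are hypothetical.

```python
import io
import tarfile

ARCHIVE_SUFFIX = "_fcs.tar.gz"


def expected_prefix(archive_name: str) -> str:
    """Derive the member prefix, e.g. 'B064_PS1_' from 'B064_PS1_fcs.tar.gz'."""
    if not archive_name.endswith(ARCHIVE_SUFFIX):
        raise ValueError(f"not a supervised gating tar name: {archive_name!r}")
    return archive_name[: -len(ARCHIVE_SUFFIX)] + "_"


def unexpected_members(archive: tarfile.TarFile, archive_name: str) -> list[str]:
    """List member names that do not start with the archive's prefix."""
    prefix = expected_prefix(archive_name)
    return [name for name in archive.getnames() if not name.startswith(prefix)]


# Build a small in-memory archive with empty files for demonstration.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    for name in (
        "B064_PS1_SampleID_QC_Report.txt",
        "B064_PS1_SampleID_QC.fcs",
        "notes.txt",
    ):
        tar.addfile(tarfile.TarInfo(name), io.BytesIO(b""))

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
    stray = unexpected_members(tar, "B064_PS1_fcs.tar.gz")
print(stray)
```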

TEA-seq

Here we cover TEA-seq submission sheet structure, including how to represent hashed samples and merged flow cells. We also document the tar file naming pattern and the “E before P” FastQ convention the pipeline uses to distinguish TEA-seq reads from other assays.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx

Submission sheet examples

Individual samples

Use the TEA-seq submission sheet to list all samples, including hashed samples. When you combine multiple flow cells into one pool, reference the merged flow cell ID. To facilitate hashed ATAC processing, set up the mock version of the samples in the sample sheet, as in the accompanying example.

     A                     B            C      D                     E
1    FSQGAZ0BL61-02        EXP-00387    HT1    NA                    P1
2    EXP-00387-MP1C1W1     EXP-00387    NA     EXP-00387-AP1C1W1     P1

Pooled samples

Use the submission barcode sheet to define one row per pooled flow cell, linking SubmissionBC (for example, FC-0371) to the corresponding pool (P1) and assay type (TEA). This sheet is required whether you have a single flow cell or multiple flow cells merged into a pool. Format your spreadsheet as shown in the accompanying example.

     A               B       C
1    SubmissionBC    Pool    Type
2    FC-0371         P1      TEA

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

SubmissionBC_AAA_2022-12-26.tar.gz

Example file name

FC-0000_AAA_2022-12-26.tar.gz

FastQ files (E before P)

Naming format

BatchID-EP1C1W1_S1_L001_I1_001.fastq

Example file name

B100-EP1C1W1_S1_L001_I1_001.fastq
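Taken together, the FastQ conventions in this document (R, A, fR-, or E before P) identify the assay from the file name alone. The following Python sketch shows one way such routing could work. It is not the pipeline's actual logic, it assumes an alphanumeric BatchID, and the function name is hypothetical.

```python
import re

# Prefix that appears between "BatchID-" and the pool token (P1, P2, ...).
ASSAY_BY_PREFIX = {
    "R": "scRNA-seq",
    "A": "scATAC-seq",
    "fR-": "fixed RNA",
    "E": "TEA-seq",
}
ASSAY_PREFIX = re.compile(r"[A-Za-z0-9]+-(?P<prefix>fR-|R|A|E)P\d")


def assay_for(filename: str) -> str:
    """Infer the assay type from the prefix that precedes the pool token."""
    match = ASSAY_PREFIX.match(filename)
    if match is None:
        raise ValueError(f"no known assay prefix in {filename!r}")
    return ASSAY_BY_PREFIX[match["prefix"]]


for name in (
    "B100-RP1C1W1_S1_L001_R1_001.fastq.gz",
    "B100-AP1C1W1_S1_L001_R1_001.fastq.gz",
    "B100-fR-P1C1W1_S1_L001_I1_001.fastq",
    "B100-EP1C1W1_S1_L001_I1_001.fastq",
):
    print(name, "->", assay_for(name))
```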


Upload Your Submission Sheet

To upload your spreadsheet to a watchfolder, follow the instructions in Steps 2 and 3 of Submit and Monitor Pipeline Batches (Tutorial).


Related Resources

Configure a Pipeline (Tutorial)

Understand Automated Pipelines

Use the Sample Status Dashboard (Tutorial)