Last modified 2025-12-11
Use Correct Pipeline Syntax (Tutorial)
| Abbreviations Key | | | |
| AIFI | Allen Institute for Immunology | HISE | Human Immune System Explorer |
| ATAC-seq | assay for transposase-accessible chromatin using sequencing | TEA-seq | transcripts, epitopes, and chromatin accessibility sequencing |
| csv | comma-separated values | scRNA-seq | single-cell RNA sequencing |
| FastQ | FastQuality | QC | quality control |
| FCS | flow cytometry standard | | |
| fRNA | fixed RNA | | |
At a Glance
In this document, we discuss AIFI syntax and formatting requirements for input files in bioinformatics pipelines. We focus on data submission for single-cell and multiomics analyses. Topics include formatting of submission sheet fields and FastQ filename conventions for each assay type, including scRNA-seq, scATAC-seq, fixed RNA, Olink proteomics, supervised gating, TEA-seq, V(D)J, and Vizgen. For instructions on creating a tar file, see Submit and Monitor Pipeline Batches (Tutorial). If you have questions, contact Support.
Formatting Submission Sheet Fields
The SubmissionBC format requires a hyphen after the first two letters (for example, AA-0000). This standardized naming convention ensures file traceability, automates data validation, and maintains consistency across large datasets. "BC" stands for "barcode," a unique identifier assigned to a sample, batch, or submission in bioinformatics and sequencing workflows. The barcode ensures that each item can be traced, matched, and managed accurately throughout the pipeline.
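The following minimal sketch shows one way to check a SubmissionBC value before submission. The pattern is an assumption inferred from the examples in this document (two uppercase letters, a hyphen, then letters or digits), and `is_valid_submission_bc` is a hypothetical helper name. The rules that HISE actually enforces may differ.

```python
import re

# Assumed pattern: two uppercase letters, a hyphen, then one or more
# uppercase letters or digits (for example, AA-0000 or SP-AAAAB).
SUBMISSION_BC_PATTERN = re.compile(r"[A-Z]{2}-[A-Z0-9]+")


def is_valid_submission_bc(value: str) -> bool:
    """Return True if value looks like a SubmissionBC such as FC-0413."""
    return SUBMISSION_BC_PATTERN.fullmatch(value) is not None
```

With this pattern, a value such as FC0413 is rejected because the hyphen after the first two letters is missing.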
Instructions
Sign In
1. Navigate to HISE, and use your organizational email address to sign in.
Follow the Examples for Your Submission Type
1. Locate your submission type in the following sections. Then use the example file names and spreadsheets to format and submit your files.
scRNA-seq
In this section, we describe the required naming pattern for scRNA-seq submission sheets, tar archives, and FastQ files. Follow these conventions so that the pipeline can automatically recognize your batch, link it to the correct submission, and distinguish read files using the “R before P” convention.
Submission sheets
Naming format
BatchID_SEQ_Submission_20230106.xlsx
Example file name
B100_SEQ_Submission_20230106.xlsx
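As an illustration, the naming format above can be checked with a short script. This is a sketch under assumptions: the batch ID is taken to be letters, digits, or hyphens, the trailing eight digits are read as a YYYYMMDD date, and `parse_sheet_name` is a hypothetical helper, not part of HISE.

```python
import re
from datetime import datetime

# Assumed layout: BatchID_SEQ_Submission_YYYYMMDD.xlsx
SHEET_NAME_PATTERN = re.compile(
    r"(?P<batch_id>[A-Za-z0-9-]+)_SEQ_Submission_(?P<date>\d{8})\.xlsx"
)


def parse_sheet_name(filename: str) -> tuple[str, datetime]:
    """Split a submission sheet file name into its batch ID and date."""
    match = SHEET_NAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected submission sheet name: {filename!r}")
    # strptime raises ValueError for impossible dates such as 20231340.
    sheet_date = datetime.strptime(match["date"], "%Y%m%d")
    return match["batch_id"], sheet_date
```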
Submission sheet example
Format the header tab of your spreadsheet as shown in the accompanying example.
| | A | B | C |
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0413 | P1 | RNA |
| 3 | FC-0413 | P2 | RNA |
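The header tab layout shown above can be checked before upload. The sketch below assumes the tab has already been read into a list of rows (for example, with a spreadsheet library of your choice). The function name `check_header_tab` and the exact checks are illustrative assumptions, not the pipeline's own validation.

```python
# Column names as they appear in row 1 of the example header tab.
EXPECTED_HEADER = ["SubmissionBC", "Pool", "Type"]


def check_header_tab(rows: list[list[str]]) -> list[str]:
    """Return a list of problems; an empty list means the rows look fine."""
    problems = []
    if not rows or rows[0] != EXPECTED_HEADER:
        problems.append(f"row 1: expected header {EXPECTED_HEADER}")
        return problems
    # Spreadsheet row numbers start at 1, so data rows start at 2.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(EXPECTED_HEADER) or any(not cell.strip() for cell in row):
            problems.append(
                f"row {number}: expected {len(EXPECTED_HEADER)} non-empty cells"
            )
    return problems
```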
Tar files
Naming format
The AAA portion of the filename can be any assortment of letters.
SubmissionBC_AAA_2022-12-26.tar.gz
Example file name
FC-0000_AAA_2022-12-26.tar.gz
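A tar file name in this format can be assembled from its parts, as in the following sketch. The helper name `build_tar_name` is hypothetical, and the letters-only check on the middle segment reflects the statement above that the AAA portion can be any assortment of letters.

```python
from datetime import date


def build_tar_name(submission_bc: str, label: str, day: date) -> str:
    """Build a name of the form SubmissionBC_AAA_YYYY-MM-DD.tar.gz."""
    # The middle segment is documented as letters only.
    if not (label.isascii() and label.isalpha()):
        raise ValueError("label must contain letters only")
    # date.isoformat() yields YYYY-MM-DD, matching the example file name.
    return f"{submission_bc}_{label}_{day.isoformat()}.tar.gz"
```

For example, `build_tar_name("FC-0000", "AAA", date(2022, 12, 26))` produces the example file name shown above.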
FastQ files (R before P)
Naming format
BatchID-RP1C1W1_S1_L001_R1_001.fastq.gz
Example file name
B100-RP1C1W1_S1_L001_R1_001.fastq.gz
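To make the structure of the FastQ name explicit, the sketch below splits it into named parts. The group names are assumptions chosen for illustration. This document does not define the C and W segments, so they are captured without interpretation. The `_S1_L001_R1_001` tail is read as the usual Illumina sample number, lane, and read fields.

```python
import re

# Assumed layout: BatchID-R<pool><C segment><W segment>_S#_L###_<read>_001.fastq.gz
RNA_FASTQ_PATTERN = re.compile(
    r"(?P<batch_id>[A-Za-z0-9-]+)-R"          # batch ID, then the R marker
    r"(?P<pool>P\d+)(?P<c>C\d+)(?P<w>W\d+)"   # pool, then the C and W segments
    r"_S(?P<sample>\d+)_L(?P<lane>\d{3})"     # sample number and lane
    r"_(?P<read>[RI][12])_001\.fastq\.gz"     # read (R1, R2) or index (I1, I2)
)


def parse_rna_fastq_name(filename: str) -> dict[str, str]:
    """Return the named parts of an scRNA-seq FastQ file name."""
    match = RNA_FASTQ_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"not an scRNA-seq FastQ name: {filename!r}")
    return match.groupdict()
```

A name that lacks the R before the pool, such as an scATAC-seq file, does not match and raises `ValueError`.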
scATAC-seq
In this section, we outline how to name scATAC-seq submission sheets, tar archives, and FastQ files. The patterns mirror scRNA-seq where possible, with an “A before P” convention in FastQ filenames so that the pipeline can reliably route ATAC reads.
Submission sheets
Naming format
BatchID_SEQ_Submission_20230106.xlsx
Example file name
B100_SEQ_Submission_20230106.xlsx
Submission sheet example
Format the header tab of your spreadsheet as shown in the accompanying example.
| | A | B | C |
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0279 | P0 | ATAC |
Tar files
Naming format
The AAA portion of the filename can be any assortment of letters.
SubmissionBC_AAA_2022-12-26.tar.gz
Example file name
FC-0000_AAA_2022-12-26.tar.gz
FastQ files (A before P)
Naming format
BatchID-AP1C1W1_S1_L001_R1_001.fastq.gz
Example file name
B100-AP1C1W1_S1_L001_R1_001.fastq.gz
Fixed RNA
In this section, we explain how to format fixed RNA submission sheets and file names, including how to represent singleplex versus multiplex pools. We also show how to handle cases in which multiple pools share one or more flow cells. Finally, we describe how to use the fR- prefix to help the pipeline identify fixed RNA FastQ files.
Submission sheets
Naming format
BatchID_SEQ_FRNA_Submission_20230106.xlsx
Example file name
B100_SEQ_FRNA_Submission_20230106.xlsx
Submission sheet header
If multiple pools are associated with one flow cell, put the pools into a comma-separated list in the header sheet, as in the accompanying example.
| | A | B | C | D |
| 1 | SubmissionBC | Pool | Type | MultiplexFRNA |
| 2 | FC-00394 | P1,P2,P3 | fRNA | Yes |
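The comma-separated Pool cell can be split back into individual pools as sketched below. `split_pools` is a hypothetical helper. Rejecting empty or repeated entries is an assumption about what counts as a well-formed list, not a documented pipeline rule.

```python
def split_pools(pool_cell: str) -> list[str]:
    """Split a Pool cell such as 'P1,P2,P3' into a list of pool names."""
    pools = [pool.strip() for pool in pool_cell.split(",")]
    if any(not pool for pool in pools):
        raise ValueError(f"empty pool entry in {pool_cell!r}")
    if len(set(pools)) != len(pools):
        raise ValueError(f"duplicate pool entry in {pool_cell!r}")
    return pools
```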
Submission sheet examples
Format the header tab on each spreadsheet as shown in the accompanying examples.
Singleplex
| | A | B | C | D |
| 1 | SubmissionBC | Pool | Type | MultiplexFRNA |
| 2 | SP-AAAAB | P1 | fRNA | No |
Multiplex
| | A | B | C | D |
| 1 | SubmissionBC | Pool | Type | MultiplexFRNA |
| 2 | SM-MXOTY | PA,PB | fRNA | Yes |
Tar files
Naming format
The AAA portion of the filename can be any assortment of letters.
Single flow cell
Use this format for tar files that contain one flow cell. Currently, all fixed RNA tar files use this format.
SubmissionBC_AAA_2022-12-26.tar.gz
Multiple pools
Use this format for tar files with multiple pools shared by two flow cells.
BatchID_SubmissionBC_AAA_2022-12-22.tar.gz
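The two fixed RNA tar formats differ only in the leading BatchID segment, so a script can tell them apart by matching each pattern in turn. The sketch below assumes the SubmissionBC shape used in this document's examples and a batch ID made of letters, digits, or hyphens. `classify_frna_tar` is a hypothetical name.

```python
import re

_DATE = r"\d{4}-\d{2}-\d{2}"
_SUBMISSION_BC = r"[A-Z]{2}-[A-Z0-9]+"  # assumed shape, e.g. SP-AAAAB

# SubmissionBC_AAA_YYYY-MM-DD.tar.gz
SINGLE_FLOW_CELL = re.compile(rf"{_SUBMISSION_BC}_[A-Za-z]+_{_DATE}\.tar\.gz")
# BatchID_SubmissionBC_AAA_YYYY-MM-DD.tar.gz
MULTIPLE_POOLS = re.compile(
    rf"[A-Za-z0-9-]+_{_SUBMISSION_BC}_[A-Za-z]+_{_DATE}\.tar\.gz"
)


def classify_frna_tar(filename: str) -> str:
    """Report which documented fixed RNA tar format a file name follows."""
    if SINGLE_FLOW_CELL.fullmatch(filename):
        return "single flow cell"
    if MULTIPLE_POOLS.fullmatch(filename):
        return "multiple pools"
    raise ValueError(f"unrecognized fixed RNA tar name: {filename!r}")
```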
FastQ files (fR- before P)
Naming format
BatchID-fR-P1C1W1_S1_L001_I1_001.fastq
Example file name
B100-fR-P1C1W1_S1_L001_I1_001.fastq
Supervised Gating
In this section, we describe how to prepare supervised gating submission sheets and tar archives that bundle FCS files and QC outputs. The naming conventions for tar files and their contents allow the pipeline to find the right FCS files, link them to QC reports, and associate them with the appropriate batch.
Submission sheets
Naming format
BatchID_SEQ_Submission_20230106.xlsx
Example file name
B100_SEQ_Submission_20230106.xlsx
Submission sheet example
Format your spreadsheet as shown in the accompanying example.
| | A | B | C |
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0204 | P1 | RNA |
Tar files
Naming format
BatchID_PS1_fcs.tar.gz
Example file name
B064_PS1_fcs.tar.gz
Example files
B064_PS1_SampleID_QC_Report.txt
B064_PS1_SampleID_QC.fcs
B064_PS1_SampleID_QC_Passed_TTTT.png
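In the examples above, every file inside the archive starts with the same BatchID_PS1_ prefix as the tar file itself. Assuming that this is the intended convention, the following sketch lists archive members that do not follow it. `members_missing_prefix` is a hypothetical helper built on the standard `tarfile` module.

```python
import tarfile


def members_missing_prefix(archive: tarfile.TarFile, prefix: str) -> list[str]:
    """Return names of regular files whose base name lacks the prefix."""
    return [
        member.name
        for member in archive.getmembers()
        # Compare only the base name so files in subdirectories are handled.
        if member.isfile() and not member.name.rsplit("/", 1)[-1].startswith(prefix)
    ]


# Usage (the archive path is illustrative):
# with tarfile.open("B064_PS1_fcs.tar.gz", "r:gz") as archive:
#     print(members_missing_prefix(archive, "B064_PS1_"))
```

An empty result means that all files in the archive share the expected prefix.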
TEA-seq
Here we cover TEA-seq submission sheet structure, including how to represent hashed samples and merged flow cells. We also document the tar file naming pattern and the “E before P” FastQ convention the pipeline uses to distinguish TEA-seq reads from other assays.
Submission sheets
Naming format
BatchID_SEQ_Submission_20230106.xlsx
Example file name
B100_SEQ_Submission_20230106.xlsx
Submission sheet examples
Individual samples
Use the TEA-seq submission sheet to list all samples, including hashed samples. When you combine multiple flow cells into one pool, reference the merged flow cell ID. To facilitate hashed ATAC processing, set up the mock version of each sample in the sample sheet, as in the accompanying example.
| | A | B | C | D | E |
| 1 | FSQGAZ0BL61-02 | EXP-00387 | HT1 | NA | P1 |
| 2 | EXP-00387-MP1C1W1 | EXP-00387 | NA | EXP-00387-AP1C1W1 | P1 |
Pooled samples
Use the submission barcode sheet to define one row per pooled flow cell, linking SubmissionBC (for example, FC-0371) to the corresponding pool (P1) and assay type (TEA). This sheet is required whether you have a single flow cell or multiple flow cells merged into a pool. Format your spreadsheet as shown in the accompanying example.
| | A | B | C |
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0371 | P1 | TEA |
Tar files
Naming format
The AAA portion of the filename can be any assortment of letters.
SubmissionBC_AAA_2022-12-26.tar.gz
Example file name
FC-0000_AAA_2022-12-26.tar.gz
FastQ files (E before P)
Naming format
BatchID-EP1C1W1_S1_L001_I1_001.fastq
Example file name
B100-EP1C1W1_S1_L001_I1_001.fastq
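The markers that precede the pool segment in this document (R, A, fR-, and E) can be used to infer the assay from a FastQ file name. The sketch below illustrates that idea only. It is not the pipeline's actual routing logic, and `assay_from_fastq_name` is a hypothetical name.

```python
import re

# Marker before the pool segment, mapped to the assay as listed in this document.
ASSAY_MARKERS = {
    "R": "scRNA-seq",
    "A": "scATAC-seq",
    "fR-": "fixed RNA",
    "E": "TEA-seq",
}

# Find "-<marker>P#C#W#_" anywhere in the name, so batch IDs may contain hyphens.
_MARKER_PATTERN = re.compile(r"-(?P<marker>fR-|[RAE])P\d+C\d+W\d+_")


def assay_from_fastq_name(filename: str) -> str:
    """Infer the assay from the marker that precedes the pool segment."""
    match = _MARKER_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no known assay marker in {filename!r}")
    return ASSAY_MARKERS[match["marker"]]
```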
Upload your submission sheet
To upload your spreadsheet to a watchfolder, follow the instructions in Steps 2 and 3 of Submit and Monitor Pipeline Batches (Tutorial).
Related Resources
Configure a Pipeline (Tutorial)