Use Correct Pipeline File Syntax (Tutorial)

 

Last modified 2026-03-24

 


Abbreviations Key

| Abbreviation | Definition |
|---|---|
| AIFI | Allen Institute for Immunology |
| ATAC-seq | assay for transposase-accessible chromatin using sequencing |
| csv | comma-separated values |
| FastQ | text format for sequence reads with per-base quality scores |
| FCS | flow cytometry standard |
| fRNA | fixed RNA |
| HISE | Human Immune System Explorer |
| QC | quality control |
| scRNA-seq | single-cell RNA sequencing |
| TEA-seq | transcripts, epitopes, and chromatin accessibility sequencing |

At a Glance

In this document, we discuss AIFI syntax and formatting requirements for input files in bioinformatics pipelines. We focus on data submission for single-cell and multiomics analyses. Topics include formatting of submission sheet fields and FastQ filename conventions for each assay type, including scRNA-seq, scATAC-seq, fixed RNA, Olink proteomics, supervised gating, TEA-seq, V(D)J, and Vizgen. For instructions on creating a tar file, see Submit and Monitor Pipeline Batches (Tutorial). If you have questions, contact Support.

When to Use This Feature

Use this feature when your data files and submission sheets must be recognized and processed automatically by HISE pipelines, such as in the following circumstances:

– You're preparing submission sheets, tar archives, and FastQ files for any automated pipeline (for example, scRNA‑seq, scATAC‑seq, fixed RNA, TEA‑seq).

– You want your batches to link correctly to configured pipelines without manual intervention.

– You'd like to avoid ingest failures, misrouted data, or delays caused by incorrect filenames or spreadsheet formats.

This page is intended to complement other pipeline setup and monitoring tutorials (see Related Resources). Use it in addition to tutorials such as Configure a Pipeline, Submit and Monitor Pipeline Batches, and other assay‑specific documentation to make sure both your pipeline configuration and your file syntax are correct before you upload data.

NOTE: Formatting Submission Sheet Fields

The SubmissionBC value requires a hyphen after the first two letters (for example, AA-0000). "BC" stands for "barcode," a unique identifier assigned to a sample, batch, or submission in bioinformatics and sequencing workflows. Using one standardized format lets the pipeline validate submissions automatically and trace, match, and manage each item consistently across large datasets.
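As a minimal sketch, the hyphen rule can be checked before upload with a regular expression. The exact pattern below (two uppercase letters, a hyphen, then letters or digits) is our assumption, inferred from examples on this page such as AA-0000 and SP-AAAAB. It is not a published HISE specification, and the function name is hypothetical.

```python
import re

# Assumed pattern: two uppercase letters, a hyphen, then one or more
# uppercase letters or digits (inferred from the examples on this page).
SUBMISSION_BC = re.compile(r"[A-Z]{2}-[A-Z0-9]+")


def is_valid_submission_bc(value: str) -> bool:
    """Return True if value matches the assumed SubmissionBC pattern."""
    return SUBMISSION_BC.fullmatch(value) is not None


print(is_valid_submission_bc("FC-00610"))  # True
print(is_valid_submission_bc("FC00610"))   # False: hyphen missing
```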


Instructions

Sign In

1. Navigate to HISE, and use your organizational email address to sign in.

Follow the Examples for Your Submission Type

1. Locate your submission type in the following sections. Then use the example file names and spreadsheets to format and submit your files.

scRNA-seq

In this section, we describe the required naming pattern for scRNA-seq submission sheets, tar archives, and FastQ files. Follow these conventions so that the pipeline can automatically recognize your batch, link it to the correct submission, and distinguish read files using the “R before P” convention.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx
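The naming format above can be checked with a short script before you upload a sheet. This is a sketch under two assumptions of ours: the BatchID is alphanumeric (as in B100), and the eight trailing digits are a date in YYYYMMDD order. The function name is hypothetical.

```python
import re
from datetime import datetime

# Assumptions: alphanumeric BatchID and a YYYYMMDD date stamp.
SHEET_NAME = re.compile(
    r"(?P<batch>[A-Za-z0-9]+)_SEQ_Submission_(?P<date>\d{8})\.xlsx"
)


def parse_sheet_name(name: str):
    """Return (batch_id, date) or None if the name does not fit the pattern."""
    match = SHEET_NAME.fullmatch(name)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None
    return match.group("batch"), day
```

For example, parse_sheet_name("B100_SEQ_Submission_20230106.xlsx") returns the batch ID B100 together with the date 2023-01-06, and a name with a missing or malformed date stamp returns None.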

Submission sheet example

Format the Header, Sample, and Library tabs of your spreadsheet as shown in the accompanying examples.

NOTE: The submission sheet has two new columns. On the Header tab, the new Pipeline column requires the value v1. On the Library tab, the new Pool column uses values in the format P1, P2, and so on.

Header tab

|   | A | B | C | D |
|---|---|---|---|---|
| 1 | SubmissionBC | Pool | Type | Pipeline |
| 2 | FC-00610 | P1 | RNA | v1 |
| 3 | FC-00611 | P2 | RNA |  |

Sample tab

|   | A | B | C | D | E | F |
|---|---|---|---|---|---|---|
| 1 | SampleID | BatchID | HashTag | Barcode | WellID | PoolID |
| 2 | PB05905-001 | B258 | HT1 | NA | NA | P1 |
| 3 | PB05882-002 | B258 | HT2 | NA | NA | P1 |

Library tab

|   | A | B | C | D |
|---|---|---|---|---|
| 1 | LibraryID | IndexName | SubmissionBC | Pool |
| 2 | B258-RP1C1W1 | SI-GA-A1 | FC-00610 | P1 |
| 3 | B258-RP1C1W1 | SI-GA-B1 | FC-00610 | P1 |
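Because the Library tab repeats the SubmissionBC and Pool values from the Header tab, a quick consistency check can catch typos before upload. The sketch below represents each tab as a list of dictionaries keyed by column name. We do not know whether HISE performs this check itself, so treat it as an optional pre-upload step; the function name is hypothetical.

```python
def unmatched_library_rows(header_rows, library_rows):
    """Return Library rows whose (SubmissionBC, Pool) pair is absent from the Header tab."""
    known = {(row["SubmissionBC"], row["Pool"]) for row in header_rows}
    return [
        row for row in library_rows
        if (row["SubmissionBC"], row["Pool"]) not in known
    ]


# Rows copied from the example tabs on this page.
header = [
    {"SubmissionBC": "FC-00610", "Pool": "P1", "Type": "RNA"},
    {"SubmissionBC": "FC-00611", "Pool": "P2", "Type": "RNA"},
]
library = [
    {"LibraryID": "B258-RP1C1W1", "IndexName": "SI-GA-A1",
     "SubmissionBC": "FC-00610", "Pool": "P1"},
    {"LibraryID": "B258-RP1C1W1", "IndexName": "SI-GA-B1",
     "SubmissionBC": "FC-00610", "Pool": "P1"},
]

print(unmatched_library_rows(header, library))  # []
```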

FastQ files (R before P)

Naming format

BatchID-RP1C1W1_S1_L001_R1_001.fastq.gz

Example file name

B100-RP1C1W1_S1_L001_R1_001.fastq.gz
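To illustrate how the parts of an scRNA-seq FastQ name fit together, here is a sketch that splits a filename into its fields. The field names are ours, and this page does not define what the C and W components stand for, so the code keeps them under neutral keys. The pattern is inferred from the example above and is not an official parser.

```python
import re

# Assumed layout: BatchID, "-R", pool/C/W components, then the
# sample number, lane, read (R1, R2, I1, ...), and a fixed "_001" suffix.
FASTQ_RNA = re.compile(
    r"(?P<batch>[A-Za-z0-9-]+)-R"
    r"(?P<pool>P\d+)(?P<c>C\d+)(?P<w>W\d+)"
    r"_S(?P<sample>\d+)_L(?P<lane>\d{3})"
    r"_(?P<read>[RI]\d)_001\.fastq\.gz"
)


def parse_rna_fastq(name: str):
    """Return the filename fields as a dict, or None if the name does not match."""
    match = FASTQ_RNA.fullmatch(name)
    return match.groupdict() if match else None


fields = parse_rna_fastq("B100-RP1C1W1_S1_L001_R1_001.fastq.gz")
print(fields["batch"], fields["pool"], fields["read"])  # B100 P1 R1
```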

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

SubmissionBC_AAA_2022-12-26.tar.gz

Example file name

FC-0000_AAA_2022-12-26.tar.gz
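A tar file name in this format can be assembled programmatically, which avoids date-format slips. This sketch assumes the date portion is written as YYYY-MM-DD and that the free-form portion contains ASCII letters only; the function name is hypothetical.

```python
from datetime import date


def tar_name(submission_bc: str, label: str, day: date) -> str:
    """Build SubmissionBC_LABEL_YYYY-MM-DD.tar.gz from its parts."""
    # The page says the middle portion can be any assortment of letters.
    if not (label.isascii() and label.isalpha()):
        raise ValueError("label must contain letters only")
    return f"{submission_bc}_{label}_{day.isoformat()}.tar.gz"


print(tar_name("FC-0000", "AAA", date(2022, 12, 26)))
# FC-0000_AAA_2022-12-26.tar.gz
```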

scATAC-seq

In this section, we outline how to name scATAC-seq submission sheets, tar archives, and FastQ files. The patterns mirror scRNA-seq where possible, with an “A before P” convention in FastQ filenames so that the pipeline can reliably route ATAC reads.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx

Submission sheet example

Format the Header tab of your spreadsheet as shown in the accompanying example.

|   | A | B | C |
|---|---|---|---|
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0279 | P0 | ATAC |

FastQ files (A before P)

Naming format

BatchID-AP1C1W1_S1_L001_R1_001.fastq.gz

Example file name

B100-AP1C1W1_S1_L001_R1_001.fastq.gz

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

SubmissionBC_AAA_2022-12-26.tar.gz

Example file name

FC-0000_AAA_2022-12-26.tar.gz

Fixed RNA

In this section, we explain how to format fixed RNA submission sheets and file names, including how to represent singleplex versus multiplex pools. We also show how to handle cases in which multiple pools share one or more flow cells. Finally, we describe how to use the fR- prefix to help the pipeline identify fixed RNA FastQ files.

Submission sheets

Naming format

BatchID_SEQ_FRNA_Submission_20230106.xlsx

Example file name

B100_SEQ_FRNA_Submission_20230106.xlsx

Submission sheet header

If multiple pools are associated with one flow cell, enter the pools as a comma-separated list in the Pool column of the Header tab, as in the accompanying example.

|   | A | B | C | D |
|---|---|---|---|---|
| 1 | SubmissionBC | Pool | Type | MultiplexFRNA |
| 2 | FC-00394 | P1,P2,P3 | fRNA | Yes |
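If you generate or check sheets with a script, the comma-separated Pool cell can be split into individual pool names as sketched below. Tolerating spaces around the commas is our choice for robustness, not a documented HISE behavior, and the function name is hypothetical.

```python
def split_pools(cell: str) -> list[str]:
    """Split a Pool cell such as 'P1,P2,P3' into a list of pool names."""
    pools = [part.strip() for part in cell.split(",")]
    # An empty entry means a stray or trailing comma in the cell.
    if any(not pool for pool in pools):
        raise ValueError(f"empty pool name in {cell!r}")
    return pools


print(split_pools("P1,P2,P3"))  # ['P1', 'P2', 'P3']
```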

Submission sheet examples

Format the Header tab on each spreadsheet as shown in the accompanying examples.

Singleplex

|   | A | B | C | D |
|---|---|---|---|---|
| 1 | SubmissionBC | Pool | Type | MultiplexFRNA |
| 2 | SP-AAAAB | P1 | fRNA | No |

Multiplex

|   | A | B | C | D |
|---|---|---|---|---|
| 1 | SubmissionBC | Pool | Type | MultiplexFRNA |
| 2 | SM-MXOTY | PA,PB | fRNA | Yes |

FastQ files (fR- before P)

Naming format

BatchID-fR-P1C1W1_S1_L001_I1_001.fastq

Example file name

B100-fR-P1C1W1_S1_L001_I1_001.fastq

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

Single flow cell

Use this format for tar files that have one flow cell (currently all fixed RNA tar files are like this).

SubmissionBC_AAA_2022-12-26.tar.gz

Multiple pools

Use this format for tar files with multiple pools shared by two flow cells.

BatchID_SubmissionBC_AAA_2022-12-22.tar.gz
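The two fixed RNA tar formats differ only in the optional BatchID prefix, so one pattern can recognize both. In this sketch we assume that batch IDs look like B followed by digits (as in B100) and that the SubmissionBC follows the hyphenated format described earlier on this page. Both are assumptions, and the function name is hypothetical.

```python
import re

TAR_NAME = re.compile(
    r"(?:(?P<batch>B\d+)_)?"                  # optional BatchID (multiple-pool form)
    r"(?P<submission_bc>[A-Z]{2}-[A-Z0-9]+)"  # assumed SubmissionBC pattern
    r"_(?P<label>[A-Za-z]+)"                  # free-form letters
    r"_(?P<date>\d{4}-\d{2}-\d{2})\.tar\.gz"
)


def parse_tar_name(name: str):
    """Return the name's fields as a dict (batch is None for the short form), or None."""
    match = TAR_NAME.fullmatch(name)
    return match.groupdict() if match else None


print(parse_tar_name("B100_FC-0000_AAA_2022-12-22.tar.gz")["batch"])  # B100
print(parse_tar_name("FC-0000_AAA_2022-12-26.tar.gz")["batch"])       # None
```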

Supervised Gating

In this section, we describe how to prepare supervised gating submission sheets and tar archives that bundle FCS files and QC outputs. The naming conventions for tar files and their contents allow the pipeline to find the right FCS files, link them to QC reports, and associate them with the appropriate batch.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx

Submission sheet example

Format your spreadsheet as shown in the accompanying example.

|   | A | B | C |
|---|---|---|---|
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0204 | P1 | RNA |

Tar files

Naming format

BatchID_PS1_fcs.tar.gz

Example file name

B064_PS1_fcs.tar.gz

Example files

B064_PS1_SampleID_QC_Report.txt

B064_PS1_SampleID_QC.fcs

B064_PS1_SampleID_QC_Passed_TTTT.png
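In the example above, every file inside the archive starts with the same BatchID_PS1 prefix as the tar file itself. Assuming that shared prefix is required (the page shows it but does not state it as a rule), the following sketch flags member files that do not follow it. The function name is hypothetical.

```python
def misnamed_members(tar_name: str, members: list[str]) -> list[str]:
    """Return member file names that do not start with the tar file's prefix."""
    suffix = "_fcs.tar.gz"
    if not tar_name.endswith(suffix):
        raise ValueError(f"{tar_name!r} does not end with {suffix!r}")
    # "B064_PS1_fcs.tar.gz" gives the expected member prefix "B064_PS1_".
    prefix = tar_name[: -len(suffix)] + "_"
    return [name for name in members if not name.startswith(prefix)]


files = [
    "B064_PS1_SampleID_QC_Report.txt",
    "B064_PS1_SampleID_QC.fcs",
    "B064_PS1_SampleID_QC_Passed_TTTT.png",
]
print(misnamed_members("B064_PS1_fcs.tar.gz", files))  # []
```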

TEA-seq

In this section, we cover TEA-seq submission sheet structure, including how to represent hashed samples and merged flow cells. We also document the tar file naming pattern and the “E before P” FastQ convention that the pipeline uses to distinguish TEA-seq reads from other assays.

Submission sheets

Naming format

BatchID_SEQ_Submission_20230106.xlsx

Example file name

B100_SEQ_Submission_20230106.xlsx

Submission sheet examples

Individual samples

Use the TEA-seq submission sheet to list all samples, including hashed samples. When you combine multiple flow cells into one pool, reference the merged flow cell ID. To facilitate hashed ATAC processing, set up the mock version of each sample in the sample sheet, as in the accompanying example.

|   | A | B | C | D | E |
|---|---|---|---|---|---|
| 1 | FSQGAZ0BL61-02 | EXP-00387 | HT1 | NA | P1 |
| 2 | EXP-00387-MP1C1W1 | EXP-00387 | NA | EXP-00387-AP1C1W1 | P1 |

Pooled samples

Use the submission barcode sheet to define one row per pooled flow cell, linking SubmissionBC (for example, FC-0371) to the corresponding pool (P1) and assay type (TEA). This sheet is required whether you have a single flow cell or multiple flow cells merged into a pool. Format your spreadsheet as shown in the accompanying example.

|   | A | B | C |
|---|---|---|---|
| 1 | SubmissionBC | Pool | Type |
| 2 | FC-0371 | P1 | TEA |

FastQ files (E before P)

Naming format

BatchID-EP1C1W1_S1_L001_I1_001.fastq

Example file name

B100-EP1C1W1_S1_L001_I1_001.fastq
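The FastQ conventions on this page differ in the marker that precedes the P component: R for scRNA-seq, A for scATAC-seq, E for TEA-seq, and fR- for fixed RNA. As a sketch, a script can use that marker to guess the assay from a filename. The mapping below is our simplification based only on the section headings of this page. Library IDs inside a TEA-seq sample sheet can carry other letters (for example, M or A), so this is not how the pipeline itself necessarily routes files. The function name is hypothetical.

```python
import re

# Marker before "P" mapped to the assay section it appears under on this page.
ASSAY_BY_MARKER = {
    "R": "scRNA-seq",
    "A": "scATAC-seq",
    "E": "TEA-seq",
    "fR-": "fixed RNA",
}

# A hyphen, the marker, then the P/C/W components and an underscore.
MARKER = re.compile(r"-(fR-|[RAE])P\d+C\d+W\d+_")


def assay_from_fastq(name: str):
    """Return the assay implied by the marker before P, or None if unrecognized."""
    match = MARKER.search(name)
    return ASSAY_BY_MARKER[match.group(1)] if match else None


print(assay_from_fastq("B100-EP1C1W1_S1_L001_I1_001.fastq"))    # TEA-seq
print(assay_from_fastq("B100-fR-P1C1W1_S1_L001_I1_001.fastq"))  # fixed RNA
```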

Tar files

Naming format

The AAA portion of the filename can be any assortment of letters.

SubmissionBC_AAA_2022-12-26.tar.gz

Example file name

FC-0000_AAA_2022-12-26.tar.gz

Upload your submission sheet

To upload your spreadsheet to a watchfolder, follow the instructions in Steps 2 and 3 of Submit and Monitor Pipeline Batches (Tutorial).


Related Resources

Configure a Pipeline (Tutorial)

Understand Automated Pipelines

Use the Sample Status Dashboard (Tutorial)