Use Metadata to Make Files Searchable (Tutorial)
At a Glance
Abbreviations Key |
CSV | comma-separated values |
cvf | create verbose filename |
ELN | electronic lab notebook |
GUID | globally unique identifier |
gz | gzip |
HISE | Human Immune System Explorer |
IDE | integrated development environment |
NA | not applicable |
SLIMS | simplified laboratory information management system |
tar | tape archive |
UI | user interface |
Automated pipelines enable HISE to generate result files with attached subject and sample metadata. The pipelines standardize the data processing for each assay. Certain types of files, however, such as those containing experimental or pilot data, are unsuitable for pipeline ingestion and must be uploaded directly to a Project Store and then downloaded to an IDE. By default, metadata is not attached to these files. You can either attach metadata to these files at ingest or add it later, after some experimentation. We highly recommend the former.
To make your work reproducible and easily findable in an advanced search, attach metadata to your files at ingest. |
Attach Metadata at Ingest
Step 1: Create a manifest
To attach metadata to files going to a Project Store at ingest, create a manifest file that lists all the files to be ingested, along with their file types and sample references. To be ingested into HISE, this file must be named manifest.csv.
Below is a sample manifest.csv file:
- Lines 1 and 2. The first two rows show the account and project to which the file data belongs. This information is not required but helps ensure that the metadata is associated with the correct account and project (for example, if the file is dropped into the wrong watchfolder).
- Line 3. The third line is a static header showing three columns: file, samples, and fileType. If a file is not associated with a sample, the special keyword reference can be used instead of samples.
- Subsequent lines. The remaining lines contain the actual metadata: each file name, the specific sample(s) with which it's associated, and the file type. Multiple samples are separated with a semicolon (;).
accountGuid | 10f58583-1cdf-4f18-8de4-dc1ca94783e2 | |
projectGuid | e206cf7a-5b13-478f-b842-a305fe4954d8 | |
file | samples | fileType |
population-stats.csv | KT00970;KT01245;KT01244;KT00971 | FlowCytometry |
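If you need to assemble a manifest by hand rather than generate it in SLIMS, the layout described above can be sketched in Python. This is a minimal illustration, not a HISE tool: the function name build_manifest is hypothetical, and the exact CSV serialization (plain comma-separated rows with no trailing empty columns) is an assumption based on the sample table.

```python
import csv
import io


def build_manifest(rows, account_guid=None, project_guid=None):
    """Return manifest.csv text.

    rows: iterable of (file_name, sample_ids, file_type) tuples.
    The two GUID rows are optional, as described above.
    Assumption: rows are plain comma-separated values.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    if account_guid:
        writer.writerow(["accountGuid", account_guid])
    if project_guid:
        writer.writerow(["projectGuid", project_guid])
    # Static header row.
    writer.writerow(["file", "samples", "fileType"])
    for file_name, sample_ids, file_type in rows:
        # Multiple samples are joined with a semicolon.
        writer.writerow([file_name, ";".join(sample_ids), file_type])
    return buf.getvalue()


manifest_text = build_manifest(
    [
        (
            "population-stats.csv",
            ["KT00970", "KT01245", "KT01244", "KT00971"],
            "FlowCytometry",
        )
    ],
    account_guid="10f58583-1cdf-4f18-8de4-dc1ca94783e2",
    project_guid="e206cf7a-5b13-478f-b842-a305fe4954d8",
)
```

To produce the file itself, write manifest_text to a file named manifest.csv, the name HISE requires.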
Instructions
1. Navigate to SLIMS, and use your organizational email address to sign in.
2. From the top navigation menu, click the Content tab.
3. Use the search, filter features, and checkboxes to find and select the project for which you want to create a manifest file.
4. Remain on the Content tab, and click the ELN tab.
5. In the ELN section:
A. Select the content you want to work with. An ELN notebook opens.
B. Select the block with your content.
C. Click the compass icon.
6. In the Generate HISE Manifest box, choose the type of manifest best suited to your project. (To create a manifest.csv in SLIMS, you must be using a Simplified ELN experiment.)
7. In the lower-left corner, click Finish.
If the file type you want to declare doesn't exist in the project, or if the manifest is not available in SLIMS, use the Support button at the top of this page to file an issue. |
Create a Tar File and Ingest It into a Watchfolder
After you create a manifest file, bundle the manifest and the files themselves into a single tar file. For instructions, see Ingest Data into the Project Store (Tutorial). When the tar file is dropped in the watchfolder, the HISE decorator service untars the files and uploads the data to the Project Store linked to the watchfolder. It also adds the samples and/or file types declared in the manifest.
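As a rough sketch of the bundling step, the following Python uses the standard tarfile module. The function name bundle_for_watchfolder is hypothetical, and whether the watchfolder expects a plain or gzip-compressed archive is an assumption here; follow the linked tutorial for the authoritative procedure.

```python
import tarfile
from pathlib import Path


def bundle_for_watchfolder(manifest_path, data_paths, archive_path):
    """Pack the manifest and its data files into one tar archive.

    Assumption: files sit at the top level of the archive, and a
    ".gz" suffix on archive_path means gzip compression is wanted.
    """
    archive_path = Path(archive_path)
    mode = "w:gz" if archive_path.suffix == ".gz" else "w"
    with tarfile.open(archive_path, mode) as tar:
        for path in [manifest_path, *data_paths]:
            path = Path(path)
            # Store only the file name, not the local directory path.
            tar.add(path, arcname=path.name)
    return archive_path
```

For example, calling bundle_for_watchfolder("manifest.csv", ["population-stats.csv"], "upload.tar") produces an archive (the name upload.tar is illustrative) containing both files, ready to drop in the watchfolder.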
Use the Project Store UI to Attach Metadata
Before you can add file metadata, select the projects you are working on. Navigate to your Personal Space, select Projects, and make sure your desired projects are selected.
[select-projects-image]
Now navigate to your Personal Space again, but this time select Project Stores. Then select your project.
[select-project-store]
Now select all the files and click “Add file metadata” in the upper-right corner of the screen. A prompt opens where you can add metadata such as file type, sample kit GUID, or batch ID.
[add-file-metadata]
Finally, click “Submit.” Your selected files now show values for the fields you filled in.
[click-submit]
cohort | cohortDescription | subjectGuid | sampleKitGuid | birthYear | daysSinceFirstVisit | ethnicity | race | sex | specimenGuid | totalCellCount | visitDetails | visitName |
subjectA | sampleBB | 100 | NA | LastVisit | ||||||||
subjectA | sampleBC | 50 | NA | moreVisit | ||||||||
sampleBB | specimenB | 10000 | ||||||||||
sampleBC | specimenC | 8888 |
An admin must set up the demographics scheme mapping before a demographics file is ingested. Each project has its own demographics scheme. Subject demographics for associated samples are provided as part of the manifest delivered to the HISE wet lab. This data is automatically transferred to HISE. However, it is possible to submit some demographics data through a watchfolder instead (for example, as part of a set of survey data).
The following table shows a sample demographics scheme:
Variable in CSV | Variable in HISE |
birthYear | subject.birthYear |
cohort | cohort.cohortGuid |
cohortDescription | cohort.description |
daysSinceFirstVisit | sample.daySinceFirstVisit |
draw date | sample.drawDate |
ethnicity | subject.ethnicity |
race | subject.race |
sampleKitGuid | sample.sampleKitGuid |
sex | subject.biologicalSex |
specimenGuid | specimen.specimenGuid |
subjectGuid | subject.subjectGuid |
totalCellCount | specimen.totalViableCellCount |
visitDetails | sample.visitDetails |
visitName | sample.visitName |
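To check a demographics CSV against a scheme before ingest, the sample mapping above can be expressed as a lookup table. This is only an illustrative sketch: HISE applies the real mapping itself, the scheme differs per project, and the helper names unmapped_columns and to_hise_fields are hypothetical.

```python
# Sample demographics scheme from the table above (CSV column -> HISE field).
DEMOGRAPHICS_SCHEME = {
    "birthYear": "subject.birthYear",
    "cohort": "cohort.cohortGuid",
    "cohortDescription": "cohort.description",
    "daysSinceFirstVisit": "sample.daySinceFirstVisit",
    "draw date": "sample.drawDate",
    "ethnicity": "subject.ethnicity",
    "race": "subject.race",
    "sampleKitGuid": "sample.sampleKitGuid",
    "sex": "subject.biologicalSex",
    "specimenGuid": "specimen.specimenGuid",
    "subjectGuid": "subject.subjectGuid",
    "totalCellCount": "specimen.totalViableCellCount",
    "visitDetails": "sample.visitDetails",
    "visitName": "sample.visitName",
}


def unmapped_columns(csv_header):
    """Return the CSV columns that have no entry in the scheme."""
    return [col for col in csv_header if col not in DEMOGRAPHICS_SCHEME]


def to_hise_fields(row):
    """Translate one CSV row (dict) to HISE field names.

    Empty cells and unmapped columns are skipped.
    """
    return {
        DEMOGRAPHICS_SCHEME[col]: value
        for col, value in row.items()
        if col in DEMOGRAPHICS_SCHEME and value != ""
    }
```

Running unmapped_columns on a file's header row before upload is a quick way to spot columns the scheme would not recognize.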