Migrate to the HISE NextGen IDEs (Q&A)
Last modified 2024-11-22
Abbreviations Key | |
AIFI | Allen Institute for Immunology |
HISE | Human Immune System Explorer |
IDE | integrated development environment |
UI | user interface |
VM | virtual machine |
At a Glance
AIFI now offers IDEs designed specifically for the needs of our scientists and data analysts. These new IDEs are faster and easier to spin up and navigate. They support a workflow that promotes reproducibility, collaboration, and transparency. This document answers frequently asked questions about migrating from the current IDEs to the NextGen IDEs. For specific issues, contact immunology-support@alleninstitute.org.
Q: Why are the current-gen HISE IDEs being replaced?
A: The current-gen IDEs are based on the deprecated Google Vertex AI platform. Google is ending support for Vertex AI managed notebooks. Existing instances will continue to function, but patches and updates will not be available. To avoid the inevitable bugs and poor performance associated with using an unsupported product, AIFI will remove all Vertex-based IDE instances that remain in HISE after January 8. For details, see The Great NextGen IDE Migration (Tutorial).
Q: What advantages do the NextGen IDEs offer?
A: NextGen IDEs offer a significantly better user experience compared with the current-gen IDEs:
- Faster startup
- GitHub integration
- Improved Jupyter notebook UI
- Packages composed by key scientists
Future plans include greater disk bandwidth, improved management of idle instances, and billing support.
Q: How do I find the path to my private folder (gs://<path_to_private_folder_google_bucket>)?
A: The path to your private folder is in the AIFI_IDE_Inventory spreadsheet. Neelima Inala sent this file to each person who owns an IDE that must be migrated from current-gen to NextGen before the deadline. If you can't find the path in the spreadsheet, or you don't have access to the spreadsheet, contact immunology-support@alleninstitute.org, and we'll look up the path for you.
Q: Who has to migrate to the NextGen IDEs?
A: Every HISE user who works with IDEs must make the transition. For details, see The Great NextGen IDE Migration (Tutorial).
Q: How do I create a NextGen IDE?
A: From the top navigation menu in HISE, click RESEARCH, and then choose IDE NextGen. For detailed instructions, see Create Your First NextGen IDE (Tutorial).
Q: What should I do with my existing (current-gen) IDEs?
A: To begin the migration, delete everything in /home/jupyter/cache. Then tar up your entire directory and upload it to your /private folder. For details, see The Great NextGen IDE Migration (Tutorial).
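The first two steps above (clear the cache, then archive the directory) can be sketched in Python as follows. This is a minimal illustration, not the procedure from the tutorial: the function names, the archive location, and the choice of gzip compression are assumptions. The upload step is shown only as a comment, using the bucket placeholder from this document.

```python
import shutil
import tarfile
from pathlib import Path


def clear_cache(cache_dir: Path) -> None:
    """Delete everything inside cache_dir but keep the directory itself."""
    if not cache_dir.is_dir():
        return
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def archive_directory(src_dir: Path, archive_path: Path) -> Path:
    """Write a gzip-compressed tarball of src_dir to archive_path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths in the archive relative to the directory name.
        tar.add(src_dir, arcname=src_dir.name)
    return archive_path


# Hypothetical usage on a current-gen IDE (paths other than
# /home/jupyter/cache are assumptions):
#
#   home = Path("/home/jupyter")
#   clear_cache(home / "cache")
#   # Write the archive outside src_dir so the tarball does not include itself.
#   archive_directory(home, Path("/tmp/jupyter_home.tar.gz"))
#
# Then upload the archive, for example with:
#   gsutil cp /tmp/jupyter_home.tar.gz gs://<path_to_private_folder_google_bucket>/
```

Clearing the cache first keeps the archive small. Check that the location you choose for the tarball has enough free space before you create it.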
Q: Which folders persist from session to session?
A: See the summary table in Explore NextGen IDE Folders.
Q: Where is the root disk?
A: The root disk is part of the VM instance that's built when you create a NextGen IDE. Because the root disk contains the operating system files, it's sometimes called the "boot disk." When you stop and restart your instance, the root disk persists.
The root disk has a hierarchical folder structure that includes the read-only root directory (/). It also contains writeable paths like /var and /home. The root disk is different from the data disk, /home/jupyter, where you save your files and data.
By default, the size of the root disk is 10 GB. When the disk is full, an HTTP 524 error is returned. For details, see Explore NextGen IDE Folders.
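Because a full root disk causes errors, it can help to check free space from a notebook before a large install or download. The sketch below uses only the Python standard library. The function names and the 90 percent threshold are illustrative assumptions, not HISE settings.

```python
import shutil


def disk_usage_report(path: str = "/") -> dict:
    """Return total, used, and free space for the disk that holds path."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    percent_used = 100 * usage.used / usage.total if usage.total else 0.0
    return {
        "total_gib": usage.total / gib,
        "used_gib": usage.used / gib,
        "free_gib": usage.free / gib,
        "percent_used": percent_used,
    }


def is_nearly_full(path: str = "/", threshold_percent: float = 90.0) -> bool:
    """Return True when the disk that holds path is at or above the threshold."""
    return disk_usage_report(path)["percent_used"] >= threshold_percent
```

For example, `disk_usage_report("/")` describes the root disk, and `disk_usage_report("/home/jupyter")` describes the data disk if it is mounted at that path.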
Q: Where is the temporary/scratch disk?
A: The scratch space is your /temp folder. It's used to cache data in active use, process temporary files, and store intermediate computational results. For example, the folder might be used to store data after a system check or repair operation. Files in the /temp folder are persistent. For details, see Explore NextGen IDE Folders.
Q: What is a Conda environment?
A: Conda is a package manager used to bundle source code, tools, dependencies, and instructions for deploying these resources efficiently. When an IDE is created or restarted, Conda quickly spins up an environment based on the selected data modality. Conda packages are maintained in a GitHub repository so that both the steps in an analysis and the development environment used to obtain a given set of results can be reproduced exactly. For details, see Manage Packages in HISE NextGen IDEs.
Q: Why is Conda-based installation recommended?
A: Conda creates a snapshot of the tools and resources necessary to re-create scientific or data analytic results. It captures not only the results themselves, but also the environment in which the steps were taken or computations made. The Conda environment includes the code needed to reproduce the exact version of every package and dependency in use when the researchers reached their conclusions. The original researchers or unaffiliated investigators can then attempt to reproduce those results simply by launching the same Conda environment. Conda packages are maintained in a GitHub repository so that both the steps in an analysis and the development environment used to obtain a given set of results can be reproduced exactly.
Q: After November 28, can I resize an instance?
A: Yes.
Q: If a file goes missing, is it gone forever?
A: No. It's recoverable for about a week. Contact immunology-support@alleninstitute.org for assistance.
Q: When I make a readFiles() or cacheFiles() SDK call, where are the downloaded files saved?
A: The files go to the /input folder (/home/workspace/input) in a nested format, as in the following example. For details, see Explore NextGen IDE Folders.
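The exact nested layout is described in the linked article and is not reproduced here. To see what a readFiles() or cacheFiles() call actually saved, one option is to walk the input folder from a notebook. The helper below is a hypothetical sketch that uses only the Python standard library. It does not call the HISE SDK, and the function name is an assumption.

```python
from pathlib import Path


def list_input_files(input_dir: str = "/home/workspace/input") -> list[str]:
    """Return every file under input_dir as a sorted list of relative paths."""
    root = Path(input_dir)
    if not root.is_dir():
        # Nothing has been downloaded yet, or the folder lives elsewhere.
        return []
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file()
    )
```

Calling `list_input_files()` after a download returns paths relative to the input folder, which shows the nesting directly.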
Q: How can I execute a shell script within a /private folder?
A: As a workaround, you can copy the shell script to a location within /home/workspace, and then execute the script.
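The copy-then-run workaround can also be scripted from a notebook. The following is a minimal sketch, assuming the script runs under bash and that you know the full path to the script in your private folder. The function name and its parameters are hypothetical.

```python
import shutil
import stat
import subprocess
from pathlib import Path


def copy_and_run(
    script: Path,
    work_dir: Path = Path("/home/workspace"),
    shell: str = "bash",
) -> subprocess.CompletedProcess:
    """Copy script into work_dir, mark it executable, and run it there."""
    target = work_dir / script.name
    shutil.copy2(script, target)
    # Add the owner-execute bit so the copy can also be run directly later.
    target.chmod(target.stat().st_mode | stat.S_IXUSR)
    # check=True raises CalledProcessError if the script exits with a nonzero status.
    return subprocess.run(
        [shell, str(target)],
        cwd=work_dir,
        capture_output=True,
        text=True,
        check=True,
    )
```

The returned object exposes the script's output as `result.stdout` and `result.stderr`.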
Q: Are R and Python available within a single environment?
A: Yes. To use both R and Python within a single environment, choose "Minimal modality" when you create your IDE.
Related Resources
Best Practices for NextGen IDE Users
Manage Packages in HISE NextGen IDEs
Manage NextGen IDE Instances (Tutorial)