BYU

Office of Research Computing

Python
Python is a dynamic language, filling a similar role to Perl or other scripting languages.

Python is well suited to preparation and post-processing for jobs, but it is not as well suited to high-performance computing as a compiled language like C or Fortran. We do not recommend using it for a major component of HPC jobs. We do recommend it for preparation, post-processing, and proof-of-concept work such as defining algorithms.

Anaconda / Miniconda3 / conda / mamba

The best way to get a particular Python version and its associated libraries is to use the conda/mamba package manager from Anaconda (https://docs.conda.io/en/latest/miniconda.html), which is already installed on the supercomputer. To use it, your .bashrc file must contain the environment changes that conda/mamba needs in order to run. Navigate to your home directory and open the .bashrc file, creating it if it does not exist. Before adding anything, check whether conda initialization is already present by searching the file for lines containing conda or miniconda.
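
The check for an existing initialization block can be sketched as a small shell helper. The function name has_conda_init is hypothetical, and the sketch assumes that matching the word conda (which also matches miniconda) is enough to detect an existing block.

```shell
# Hypothetical helper: succeed if the given startup file mentions conda.
# Defaults to ~/.bashrc; the pattern "conda" also matches "miniconda".
has_conda_init() {
    rc_file="${1:-$HOME/.bashrc}"
    # -q prints nothing and only sets the exit status. Errors such as a
    # missing file are silenced and count as "not found".
    grep -q 'conda' "$rc_file" 2>/dev/null
}

if has_conda_init; then
    echo "conda lines found in ~/.bashrc; review them before adding anything"
else
    echo "no conda lines found in ~/.bashrc"
fi
```

To see the matching lines themselves, with line numbers, run grep -n conda ~/.bashrc.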

If conda initialization isn't already there, you can add it by copying and pasting the following into the bottom of the file:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/apps/miniconda3/latest/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/apps/miniconda3/latest/etc/profile.d/conda.sh" ]; then
        . "/apps/miniconda3/latest/etc/profile.d/conda.sh"
    else
        export PATH="/apps/miniconda3/latest/bin:$PATH"
    fi
fi
unset __conda_setup

if [ -f "/apps/miniconda3/latest/etc/profile.d/mamba.sh" ]; then
    . "/apps/miniconda3/latest/etc/profile.d/mamba.sh"
fi
# <<< conda initialize <<<

After you edit your .bashrc, it will automatically perform any setup necessary for conda/mamba whenever you log in to the supercomputer. You may log out and log back in, or run the following in order to perform the setup immediately:

[[ -f ~/.bashrc ]] && source ~/.bashrc

(We highly recommend using mamba instead of conda. mamba is a drop-in replacement for conda and does almost everything conda does, but faster. By also running mamba init, you will be able to use mamba.)

To finish, log out and log back in. Your command prompt should now include (base). This signifies that conda/mamba is available and that you have the base environment activated. In the future, you will not need to load the miniconda3 module.
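
As a quick sanity check after logging back in, something like the following reports whether conda and mamba are visible in the current shell. The helper name check_tool is hypothetical; command -v is a standard shell builtin.

```shell
# Hypothetical helper: report where a command resolves, or that it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found (has ~/.bashrc been sourced?)"
    fi
}

check_tool conda
check_tool mamba
```

When a tool is set up as a shell function rather than a plain executable, the reported location may simply be its name instead of a path.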

The base environment is for packages that conda and mamba need and should not usually be used for most computing tasks. However, you can use conda/mamba to create a conda environment and install the packages you need there.

To create a new environment, run

mamba create --name your_new_environment_name

Activate the environment with

mamba activate your_new_environment_name

Your prompt will change from (base) to (your_new_environment_name) to show that you have changed environments.
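
The active environment can also be checked without relying on the prompt. The sketch below reads CONDA_DEFAULT_ENV, an environment variable that is set when an environment is activated; the function name active_env is hypothetical.

```shell
# Hypothetical helper: print the active conda environment, if any.
# CONDA_DEFAULT_ENV is set when an environment is activated.
active_env() {
    if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
        echo "active environment: $CONDA_DEFAULT_ENV"
    else
        echo "no conda environment is active"
    fi
}

active_env
```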

Now, install Python:

mamba install python

or

mamba install python=X.Y.Z

This will download Python (or Python version X.Y.Z, if you specified one) and install it in the current environment. You can install other packages in the same way.

For more details, refer to the miniconda documentation: https://docs.conda.io/en/latest/miniconda.html. You may, for example, want to use different repositories, called channels, to download different or newer software.

Slurm Batch Scripts and Conda

To use a conda environment inside of a Slurm batch script, modify the first line of the script so that it looks like this:

#!/bin/bash --login

After this change, you will be able to activate a conda environment inside of the script.

This change is necessary because conda depends on the initialization code in ~/.bashrc. A Slurm script normally runs in a non-interactive, non-login shell, which means that bash reads neither ~/.bash_profile nor ~/.bashrc. The --login flag instructs bash to read ~/.bash_profile anyway. As long as your ~/.bash_profile sources ~/.bashrc, and your ~/.bashrc is set up as described above, this gives the script access to conda.
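
Putting this together, a minimal batch script might look like the sketch below, which writes the script to a file so that it can be inspected before submission. The file name conda_job.sh, the script name my_script.py, and the resource values are placeholders rather than recommendations; your_new_environment_name stands for an environment you have already created.

```shell
# Write an example Slurm batch script. All resource values are placeholders.
cat > conda_job.sh <<'EOF'
#!/bin/bash --login
# Placeholder resource requests: walltime, task count, memory per CPU core
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1G

# Relies on --login so that the conda setup in ~/.bashrc has been run
mamba activate your_new_environment_name

# my_script.py is a hypothetical script name
python my_script.py
EOF

echo "wrote conda_job.sh; submit it with: sbatch conda_job.sh"
```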

Bioconda

Bioconda (https://bioconda.github.io/) is a conda channel, or repository, of "software packages related to biomedical research". To use Bioconda-provided software, you will need to make a few changes to your ~/.condarc configuration file (the file that contains settings for conda) to gain access to the required channels.

Bioconda's website (https://bioconda.github.io/) specifies that you will need to run these commands in this order:

conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --set channel_priority strict
conda config --set auto_activate_base false

Alternatively, you can edit your ~/.condarc file so that it resembles the following:

channels:
  - bioconda
  - conda-forge
  - defaults
channel_priority: strict
auto_activate_base: false

Your ~/.condarc may also include other settings and channels. If you have other channels, make sure that the relative ordering of the above three channels stays the same.
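
If your ~/.condarc lists additional channels, a rough check of the relative ordering can be sketched as follows. The function name channels_in_order is hypothetical, and the sketch makes a simplifying assumption: it inspects every list item in the file, not only those under the channels key.

```shell
# Hypothetical helper: succeed only if bioconda, conda-forge, and defaults
# appear in that relative order among the list items of a condarc file.
# Simplification: every "- name" line is inspected, not only those under
# the "channels:" key.
channels_in_order() {
    condarc="${1:-$HOME/.condarc}"
    order=$(grep -E '^[[:space:]]*-[[:space:]]*(bioconda|conda-forge|defaults)[[:space:]]*$' "$condarc" 2>/dev/null |
        sed 's/^[[:space:]]*-[[:space:]]*//; s/[[:space:]]*$//' |
        paste -sd ' ' -)
    [ "$order" = "bioconda conda-forge defaults" ]
}

if channels_in_order; then
    echo "channel order matches"
else
    echo "channel order differs (or ~/.condarc is missing)"
fi
```

Running conda config --show channels prints the channel list as conda itself sees it, which is the more authoritative check.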

Once your ~/.condarc contains this information, you will be able to create an environment and install Bioconda packages and their recommended dependencies.

Available Versions

Several full-featured Python versions are also available through modules. To see a current list of Python versions, run the command below. Running module avail without an argument lists all other available software as well.

module avail python

To load Python using modules:

# load Python 3.9.x
module load python/3.9

# switch to Python 3.11.x
module swap python/3.11

Using "pip"

Before installing any Python libraries, always create a dedicated conda/mamba environment. This ensures a clean, isolated workspace and prevents conflicts between packages.

Avoid using pip unless absolutely necessary. Most libraries can and should be installed with mamba for better dependency management and stability. Only resort to pip if a required package isn't available via mamba, and do this after installing all other core dependencies.

This approach ensures a more reliable and reproducible Python environment, avoiding version conflicts and dependency issues.
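
One way to make this habit harder to break is a small wrapper that refuses to run pip outside a dedicated environment. This is only a sketch: the function name safe_pip_install and the package name in the usage comment are hypothetical, and the check relies on the CONDA_DEFAULT_ENV variable that is set when an environment is activated.

```shell
# Hypothetical wrapper: run pip only inside a non-base conda environment.
safe_pip_install() {
    case "${CONDA_DEFAULT_ENV:-}" in
        ""|base)
            echo "activate a dedicated environment before using pip" >&2
            return 1
            ;;
    esac
    # "python -m pip" uses the pip that belongs to the active environment's Python
    python -m pip install "$@"
}

# Usage, after installing everything else with mamba
# (some_pip_only_package is a placeholder name):
#   mamba activate your_new_environment_name
#   safe_pip_install some_pip_only_package
```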

JupyterLab

Jupyter

Please use JupyterLab with OnDemand.