Internet Access from Jobs
Overview
Compute nodes do not have direct Internet access. A beta HTTP/HTTPS proxy service is available for compute nodes that need to reach external Internet endpoints from the cluster. This service is slower than direct Internet access and may occasionally fail. Workflows should automatically retry failed requests, use backoff delays between retries, and set larger timeout values when downloading large files. Some applications and containers require additional configuration to use the proxy and trust the proxy's certificate authority (CA).
On this cluster:
- GPU and high memory nodes should not be used for build processes unless those resources are required for the build process. These resources are in high demand.
- Compute nodes do not have direct Internet access. They must use the proxy service to reach external HTTP and HTTPS endpoints.
- Login nodes do have Internet access, but users may also use the proxy service on login nodes for consistency with compute-node workflows.
The proxy endpoint is:
http://porthole.rc.byu.edu:3128
The proxy supports both HTTP and HTTPS. The proxy also scans files for malware/viruses and filters some web categories. Some sites may be blocked. If you need access to a site that is blocked, please contact rcsupport@byu.edu. Please include a brief description of why you need access to the site.
Quick Start
For most applications, loading the proxy module is sufficient:
module load http_proxy
This module sets environment variables for using the proxy and trusting the certificate authority (CA) the proxy uses.
Some applications do not use the system trust store automatically. These applications may throw self-signed or certificate verification errors. Please configure the application to use the system CA bundle at:
/etc/pki/tls/certs/ca-bundle.crt
The proxy CA certificate file itself is present on all nodes at:
/etc/pki/ca-trust/source/anchors/rc_root_ca.pem
For containers, installing the proxy Certificate Authority (CA) can be the most reliable approach. Bind mounting the host CA bundle may work for some containers, but it is less portable and reliable.
Containers
To use Apptainer with the proxy, run:
module load apptainer_http_proxy
This module will load several environment variables that are passed into the container and instruct applications to use the proxy. However, this alone is not enough for applications to trust the proxy's CA. To correctly add the proxy's CA certificate to the container's trust store, the Linux distribution used by the container must be determined.
Determining the Linux Distribution of a Container
Before installing the proxy CA into a container, first determine what Linux distribution the container is based on. The certificate location and trust update command depend on the distribution family.
A good first check is:
apptainer exec image.sif cat /etc/*release
or for writable sandboxes:
apptainer exec /tmp/mycontainer cat /etc/*release
If the release file is not found, you can also inspect which package manager exists:
apptainer exec image.sif which apk apt dnf yum
This often gives a quick clue:
- apk usually means Alpine
- apt usually means Debian or Ubuntu
- dnf or yum usually means Fedora, Rocky, Red Hat, or AlmaLinux
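The check above can also be scripted. The following is a minimal sketch, not an RC-provided tool: it assumes the output of cat /etc/*release has been captured as text and that the container uses the common os-release ID and ID_LIKE fields. The function names and the FAMILY_BY_ID mapping are hypothetical and only cover the distributions named in this guide.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict, ignoring other lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip('"').strip("'")
    return fields


# Hypothetical mapping from ID / ID_LIKE tokens to the families used in this guide.
FAMILY_BY_ID = {
    "alpine": "alpine",
    "debian": "debian",
    "ubuntu": "debian",
    "fedora": "redhat",
    "rhel": "redhat",
    "rocky": "redhat",
    "almalinux": "redhat",
    "centos": "redhat",
}


def distro_family(release_text):
    """Return 'alpine', 'debian', 'redhat', or 'unknown' for the given release text."""
    fields = parse_os_release(release_text)
    # Check ID first, then fall back to the space-separated ID_LIKE tokens.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for candidate in candidates:
        family = FAMILY_BY_ID.get(candidate.lower())
        if family:
            return family
    return "unknown"
```

The result indicates which of the distribution-specific procedures in the following sections is likely to apply. An "unknown" result means the package manager check is still needed.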
If you are still unsure, please contact rcsupport@byu.edu and include:
- the container source, such as docker://<image>
- the output of cat /etc/*release
- the output of which apk apt dnf yum
Bind Mounting a Certificate Authority (CA) into a Container
Bind mounting maps a directory or file from the host system to a specific location in the container. It is generally quicker than adding the CA certificate to the container, but it may be less reliable for resolving certificate verification errors. If bind mounting does not work, try adding the CA certificate to the container instead.
Alpine, Debian, or Ubuntu Linux
module load apptainer apptainer_http_proxy
module load http_proxy #Only needed if the container image isn't already downloaded
export APPTAINERENV_SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
export APPTAINERENV_REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
export APPTAINERENV_CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
apptainer run --cleanenv --bind /etc/pki/tls/certs/ca-bundle.crt:/etc/ssl/certs/ca-certificates.crt docker://alpine
# The container should now be able to access websites.
Fedora, Rocky, Red Hat, or AlmaLinux
module load apptainer apptainer_http_proxy
module load http_proxy #Only needed if the container image isn't already downloaded
export APPTAINERENV_SSL_CERT_FILE=/etc/pki/tls/certs/ca-bundle.crt
export APPTAINERENV_REQUESTS_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt
export APPTAINERENV_CURL_CA_BUNDLE=/etc/pki/tls/certs/ca-bundle.crt
apptainer run --cleanenv --bind /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt docker://fedora
# The container should now be able to access websites.
Adding a Trusted Certificate Authority (CA) to a Container
Alpine Linux
module load apptainer
module load http_proxy apptainer_http_proxy # This is optional on login nodes
apptainer build --sandbox /tmp/alpine docker://alpine
mkdir -p /tmp/alpine/usr/local/share/ca-certificates
cp /etc/pki/ca-trust/source/anchors/rc_root_ca.pem /tmp/alpine/usr/local/share/ca-certificates/rc_root_ca.crt # This only works from RC provided systems
apptainer shell --writable --cleanenv /tmp/alpine
cat /usr/local/share/ca-certificates/rc_root_ca.crt >> /etc/ssl/certs/ca-certificates.crt
apk add ca-certificates
update-ca-certificates
echo -e "export SSL_CERT_FILE=\"/etc/ssl/certs/ca-certificates.crt\"\nexport REQUESTS_CA_BUNDLE=\"/etc/ssl/certs/ca-certificates.crt\"\nexport CURL_CA_BUNDLE=\"/etc/ssl/certs/ca-certificates.crt\"" >> $APPTAINER_ENVIRONMENT
exit
apptainer build alpine-rc-certs.sif /tmp/alpine
Debian or Ubuntu Linux
module load apptainer
module load http_proxy apptainer_http_proxy # This is optional on login nodes
apptainer build --sandbox /tmp/debian docker://debian
mkdir -p /tmp/debian/usr/local/share/ca-certificates
cp /etc/pki/ca-trust/source/anchors/rc_root_ca.pem /tmp/debian/usr/local/share/ca-certificates/rc_root_ca.crt # This only works from RC provided systems
apptainer shell --writable --fakeroot --cleanenv /tmp/debian
apt update && apt -y install ca-certificates
update-ca-certificates
echo -e "export SSL_CERT_FILE=\"/etc/ssl/certs/ca-certificates.crt\"\nexport REQUESTS_CA_BUNDLE=\"/etc/ssl/certs/ca-certificates.crt\"\nexport CURL_CA_BUNDLE=\"/etc/ssl/certs/ca-certificates.crt\"" >> $APPTAINER_ENVIRONMENT
exit
apptainer build debian-rc-certs.sif /tmp/debian
Fedora, Rocky, Red Hat, or AlmaLinux
module load apptainer
module load http_proxy # This is optional on login nodes
apptainer build --sandbox --fix-perms /tmp/fedora docker://fedora
cp /etc/pki/ca-trust/source/anchors/rc_root_ca.pem /tmp/fedora/etc/pki/ca-trust/source/anchors/rc_root_ca.crt # This only works from RC provided systems
apptainer shell --writable --cleanenv --no-privs /tmp/fedora
update-ca-trust extract
echo -e "export SSL_CERT_FILE=\"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem\"\nexport REQUESTS_CA_BUNDLE=\"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem\"\nexport CURL_CA_BUNDLE=\"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem\"" >> $APPTAINER_ENVIRONMENT
exit
apptainer build --fakeroot fedora-rc-certs.sif /tmp/fedora
Important Guidance for Reliability
This service is currently in beta.
Because the proxy performs malware/virus scanning and filtering, requests may occasionally be slow or fail. In some cases it may take a few minutes to detect issues with a proxy server.
Applications and scripts should be written to:
- Retry failed requests automatically.
- Use a backoff delay between retries.
- Use larger timeout values and handle timeouts gracefully, especially when downloading large files because these take longer to scan.
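The guidance above can be sketched as a small, generic helper. This is only an illustration using the Python standard library. The function name retry_with_backoff and its default values are assumptions, not part of any RC-provided tooling, and the delays should be tuned to the workload.

```python
import time


def retry_with_backoff(func, attempts=5, base_delay=5.0, max_delay=120.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call func() until it succeeds or the attempts are used up.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between
    attempts, capped at max_delay. Only exceptions listed in retry_on are
    retried; anything else propagates immediately.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

A download function that raises OSError (or a subclass such as urllib.error.URLError or TimeoutError) on failure can be wrapped as retry_with_backoff(download). The sleep parameter exists so the delays can be replaced in tests.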
Using Curl with the Proxy
Curl usually honors the environment variables set by the module:
module load http_proxy
curl https://example.com
For retry behavior, use curl's retry support:
module load http_proxy
curl --retry 10 \
--retry-delay 30 \
--max-time 300 \
--retry-max-time 600 \
-o myfile.zip \
https://example.com/myfile.zip
- --retry: Total number of times to retry
- --retry-delay: Seconds to wait before retrying
- --max-time: Seconds for the entire request. This may need to be longer for large files, especially compressed files, as they take longer to scan
- --retry-max-time: Total seconds to retry.
- -o: Outputs to file
Using Python Requests with the Proxy
Please note that the Requests library in Python uses the REQUESTS_CA_BUNDLE environment variable for certificate verification. The http_proxy module sets this variable automatically. However, if Requests is run in a container, this variable must also be set inside the container.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
# Waits between retries: roughly 0, 10, 20, 40, 80 seconds (urllib3 skips the delay before the first retry).
retry_strategy = Retry(
    total=5,
    backoff_factor=5,
    status_forcelist=[429, 500, 502, 503, 504],
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session = requests.Session()
# Apply the retry strategy to both HTTP and HTTPS requests.
session.mount('http://', adapter)
session.mount('https://', adapter)
try:
    # This URL always returns a 500 status code, so the request is retried up to 5 times with exponential backoff.
    response = session.get('https://httpbin.org/status/500', timeout=300)
except requests.exceptions.RetryError:
    print("Request failed after 5 retries.")
Environment Variables
The following is a list of environment variables that we may set to help the proxy work correctly. Any defaults listed are from the http_proxy module.
- http_proxy: Instructs the application which proxy to use for HTTP requests. Usually http://porthole.rc.byu.edu:3128
- HTTP_PROXY: See http_proxy above. Some applications use all capitals for this variable.
- https_proxy: Instructs the application which proxy to use for HTTPS requests. Usually http://porthole.rc.byu.edu:3128
- HTTPS_PROXY: See https_proxy above. Some applications use all capitals for this variable.
- no_proxy: Instructs the application to not use the proxy when connecting to hosts in this list. The list can contain IP addresses and DNS names. Some applications will accept CIDR ranges in this list too.
- NO_PROXY: See no_proxy above. Some applications use all capitals for this variable.
- SSL_CERT_FILE: Specifies where the Certificate Authority (CA) bundle is located on the system. Many applications will recognize this variable. Defaults to /etc/pki/tls/certs/ca-bundle.crt
- REQUESTS_CA_BUNDLE: Used by Python Requests and several other tools to specify a CA bundle instead of relying on internal certificate-verification methods. Defaults to /etc/pki/tls/certs/ca-bundle.crt
- CURL_CA_BUNDLE: Instructs Curl where to find a CA bundle that it should use for certificate verification. Defaults to /etc/pki/tls/certs/ca-bundle.crt
- MAMBA_SSL_VERIFY: Tells Mamba where to find a CA bundle that it should use for certificate verification. Defaults to /etc/pki/tls/certs/ca-bundle.crt
- PIP_TIMEOUT: Tells the Python tool, pip, how many seconds to wait before retrying a package download. Defaults to 300 seconds. This needs to be longer for larger package downloads.
- HF_HUB_DOWNLOAD_TIMEOUT: Integer value to define the number of seconds to wait for server response when downloading a file. If the request times out, a TimeoutError is raised. Defaults to 300 seconds. This needs to be longer to allow scanning of larger models.
- HF_HUB_ETAG_TIMEOUT: Integer value to define the number of seconds to wait for server response when fetching the latest metadata from a repo before downloading a file. If the request times out, huggingface_hub will default to the locally cached files. Defaults to 60 seconds.
- UV_HTTP_TIMEOUT: Used by the uv Python package installer to configure a time limit in seconds for HTTP requests. Defaults to 300 seconds.
- UV_CONCURRENT_DOWNLOADS: Used by the uv Python package installer to configure the maximum number of simultaneous downloads. Defaults to 6. Please do not unset this or set it very high. Many simultaneous downloads can slow the proxy down significantly for others using it.
- NXF_OPTS: Used by Nextflow to configure options. Defaults to -Djavax.net.ssl.trustStore=/etc/pki/ca-trust/extracted/java/cacerts -Djavax.net.ssl.trustStorePassword=changeit which configures a custom CA bundle to be used within Nextflow.
- JAVA_OPTS: Usually used by Java applications. Defaults to -Djavax.net.ssl.trustStore=/etc/pki/ca-trust/extracted/java/cacerts -Djavax.net.ssl.trustStorePassword=changeit which configures a custom CA bundle to be used within Java applications. Not always recognized by Java applications.
Troubleshooting
Certificate verify failed
This usually means the application does not trust the proxy CA certificate.
Check the following:
1. Load the proxy module:
module load http_proxy
2. Confirm the certificate-related environment is set:
echo "$SSL_CERT_FILE"
echo "$REQUESTS_CA_BUNDLE"
echo "$CURL_CA_BUNDLE"
For applications running on the host (i.e., not in Apptainer), these should usually point to:
/etc/pki/tls/certs/ca-bundle.crt
3. For applications inside of a container, make sure the proxy CA was installed into the container trust store and the container trust database was updated. See the section "Adding a Trusted Certificate Authority (CA) to a Container" above for more information.
Ensure the SSL_CERT_FILE environment variable points to the container's CA bundle and not the host's bundle. To check this from inside the container run:
[ -f "$SSL_CERT_FILE" ] && echo "true"
It should echo "true".
4. For applications inside of a container, try using the --cleanenv flag when using apptainer run or apptainer exec. Without this flag, environment variables from the host can be passed into the container. If the container was built from a different Linux distribution, this could incorrectly point the application at the wrong CA bundle.
5. Search for application-specific environment variables for the CA bundle. Unfortunately SSL_CERT_FILE is not a widely followed standard for environment variables. Some applications use other environment variables to point the application to the correct CA bundle.
6. Reach out to rcsupport@byu.edu. Please include how to recreate the problem you are facing and what steps you have taken to try to troubleshoot the issue.
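Steps 2 and 3 above can be combined into a short diagnostic. This is a minimal sketch rather than an official tool: the function name check_ca_variables is hypothetical, and it only checks the three variables named in step 2, so application-specific variables still need to be checked separately.

```python
import os

# The certificate-related variables checked in step 2 above.
CA_VARS = ("SSL_CERT_FILE", "REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE")


def check_ca_variables(environ=None, exists=os.path.isfile):
    """Report whether each CA variable is unset, points to a file, or points nowhere."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in CA_VARS:
        path = environ.get(name)
        if not path:
            report[name] = "unset"
        elif exists(path):
            report[name] = "ok"
        else:
            report[name] = "missing file"
    return report


if __name__ == "__main__":
    for name, status in check_ca_variables().items():
        print(f"{name}: {status} ({os.environ.get(name, '')})")
```

Run it on the host, or inside the container with apptainer exec, to see whether a variable points at a bundle path that does not exist in that environment. A "missing file" result inside a container often means a host path was passed in.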
Other Compute Nodes Can No Longer Be Reached
After running module load http_proxy, you may no longer be able to reach other compute nodes that are running HTTP(S) endpoints needed for your application. This happens because the current node attempts to reach them through the proxy, which does not allow HTTP(S) requests to compute nodes. We recommend one of the following:
- For Wget, use the --no-proxy flag. For Curl, use the --noproxy flag with a host list ('*' matches all hosts).
module load http_proxy
# The following request will not use the proxy even though the module has been loaded.
curl --noproxy '*' http://m8-1-1.rc.byu.edu:12345
- Change the no_proxy and NO_PROXY environment variables. Some applications will read these and not use the configured proxy for the listed endpoints. Note that the value is a comma-separated list.
For an application running on the host:
# The below lines do not use the proxy for localhost, m8-1-1, and m8-1-2. (Including its short hostname, fully-qualified domain name, and IP address)
export no_proxy=127.0.0.1,::1,localhost,192.168.220.1,m8-1-1,m8-1-1.rc.byu.edu,192.168.220.2,m8-1-2,m8-1-2.rc.byu.edu
export NO_PROXY=$no_proxy
python my_script.py
or for applications within Apptainer:
# The below lines do not use the proxy for localhost, m8-1-1, and m8-1-2. (Including its short hostname, fully-qualified domain name, and IP address)
export APPTAINERENV_no_proxy=127.0.0.1,::1,localhost,192.168.220.1,m8-1-1,m8-1-1.rc.byu.edu,192.168.220.2,m8-1-2,m8-1-2.rc.byu.edu
export APPTAINERENV_NO_PROXY=$APPTAINERENV_no_proxy
apptainer run ./myimage.sif
- Unload the http_proxy module with module unload http_proxy. If the application is within Apptainer, run module unload apptainer_http_proxy.
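For scripts that need to bypass the proxy for a changing set of nodes, the comma-separated no_proxy value described above could be assembled programmatically. This is a hedged sketch: build_no_proxy is a hypothetical helper, and the default entries simply mirror the localhost entries used in the examples above.

```python
def build_no_proxy(hosts, defaults=("127.0.0.1", "::1", "localhost")):
    """Join default and extra hosts into a comma-separated no_proxy value.

    Blank entries are dropped and duplicates are removed while keeping
    the original order.
    """
    entries = []
    for entry in list(defaults) + list(hosts):
        entry = entry.strip()
        if entry and entry not in entries:
            entries.append(entry)
    return ",".join(entries)
```

The result can be assigned to both os.environ["no_proxy"] and os.environ["NO_PROXY"] before the application makes its first request, or exported as APPTAINERENV_no_proxy and APPTAINERENV_NO_PROXY for containers.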
A Site is Blocked
If you need access to a site that is currently blocked, please email rcsupport@byu.edu and include:
- the URL
- the tool or application you were using
- a brief note about why access is needed
Still Having Trouble
If you are still having issues, please contact rcsupport@byu.edu with:
- the command you ran
- the full error output
- whether you were on a login node or compute node
- whether you were using the host OS or Apptainer
- the output of
env | grep -i proxy
echo "$SSL_CERT_FILE"
Acceptable Use and Logging Policies
Permitted Uses
The HTTP proxy may be used for activities that support approved institutional, departmental, research and instructional needs including:
- Accessing software updates, package repositories, and container registries
- Retrieving documentation, reference materials, and research data
- Connecting to vendor-supported web services required for installed software
- Accessing other external HTTP or HTTPS resources approved for legitimate work purposes
Prohibited Uses
The HTTP proxy may not be used for activities that violate institutional policy, applicable laws, contracts, or security requirements. Prohibited uses include, but are not limited to:
- Circumventing network, firewall, access control, or security policies
- Accessing malicious, illegal, or unauthorized content
- Downloading or distributing copyrighted material without authorization
- Conducting vulnerability scanning, penetration testing, scraping, crawling, or automated probing of external systems
- Hosting, relaying, tunneling, or anonymizing unauthorized traffic
- Exfiltrating institutional, research, regulated, confidential, or proprietary data
- Accessing personal entertainment, streaming, gaming, social media, or other non-work-related services
- Using the proxy to connect to command-and-control infrastructure, malware repositories, cryptocurrency mining services, or other high-risk destinations
- Attempting to hide user identity, evade logging, or modify proxy configuration without authorization
Logging and Monitoring
Use of the HTTP proxy may be logged, monitored, and recorded for security, operational, compliance, and troubleshooting purposes.
Last changed on Fri May 15 14:29:13 2026