Levi's RUN_PATH article
Dynamically linking software on a GNU/Linux system in 2017
At work I help manage a multi-version software repository for our users. Our users inevitably need different versions of core libraries and applications. We manage this by installing software into different paths and then using environment modules to let our users choose the versions of things they need. For instance we may install the GNU Compiler Collection version 6.4.0 to /apps/gcc/6.4.0 and version 7.2.0 to /apps/gcc/7.2.0. Users who want GCC 6.4 will run module load gcc/6.4 and users who want GCC 7.2 will run module load gcc/7.2. What does this have to do with linking software?
Let's say I've compiled Python 3.6 with GCC 6.4. The python3 binary will likely depend on libraries provided by GCC such as libgcc_s.so. How does the dynamic linker know to use the versions provided by GCC 6.4 instead of the ones provided by the operating system? On GNU/Linux systems the slightly simplified process looks like this:
1. Check the DT_RPATH entry of the binary if and only if the DT_RUNPATH entry does not exist. I will refer to DT_RPATH as RPATH.
2. If the executable is not a setuid/setgid binary then look in LD_LIBRARY_PATH.
3. Check the DT_RUNPATH entry of the binary. I will refer to this as RUNPATH.
4. Look up the library in the ldconfig cache file (commonly /etc/ld.so.cache).
5. Look in the system paths /lib, /usr/lib and /usr/local/lib (or lib64 variants).
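The lookup order above can be modelled in a few lines of Python. This is only a sketch of the simplified process as described here, not of the real dynamic linker: the function and constant names are made up for illustration, the lib64 variants are left out, and /apps/gcc/6.4.0/lib64 is an assumed library directory under the example install prefix.

```python
# Simplified model of the five-step search order described above.
# The names here (search_order, SYSTEM_PATHS, CACHE_FILE) are illustrative only.

SYSTEM_PATHS = ["/lib", "/usr/lib", "/usr/local/lib"]  # lib64 variants omitted
CACHE_FILE = "/etc/ld.so.cache"


def split_path(value):
    """Split a colon-separated search path; None or '' yields no entries."""
    return [entry for entry in (value or "").split(":") if entry]


def search_order(rpath=None, runpath=None, ld_library_path=None, is_setuid=False):
    """Return (source, location) pairs in the order they would be consulted."""
    order = []
    # Step 1: RPATH is honoured only when the binary carries no RUNPATH.
    if runpath is None:
        order += [("RPATH", d) for d in split_path(rpath)]
    # Step 2: LD_LIBRARY_PATH is skipped for setuid/setgid executables.
    if not is_setuid:
        order += [("LD_LIBRARY_PATH", d) for d in split_path(ld_library_path)]
    # Step 3: RUNPATH comes after the environment variable.
    order += [("RUNPATH", d) for d in split_path(runpath)]
    # Step 4: the ldconfig cache is a lookup table rather than a directory.
    order.append(("cache", CACHE_FILE))
    # Step 5: the system paths are the last resort.
    order += [("system", d) for d in SYSTEM_PATHS]
    return order


if __name__ == "__main__":
    # Assumed example: a binary with only RPATH set, run by a user who has
    # also set LD_LIBRARY_PATH.
    for source, location in search_order(
        rpath="/apps/gcc/6.4.0/lib64", ld_library_path="/tmp/library_a/lib"
    ):
        print(f"{source:16} {location}")
```

Passing runpath instead of rpath moves the binary's own directories behind LD_LIBRARY_PATH, which is the practical difference between the two tags.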
If you are not familiar with the idea of a search path: it is a colon-separated list of the directories you want searched, in order. As an example LD_LIBRARY_PATH might look like /tmp/library_a/lib:/tmp/library_b/lib. If we are at step 2 in the above process then the dynamic linker will first check /tmp/library_a/lib and, if it does not find the library there, it will then check /tmp/library_b/lib.
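As a rough illustration of how a search path is walked, the sketch below returns the first directory in a colon-separated list that contains a given file. The function name and the libexample.so file name are hypothetical, and the demo builds throwaway directories instead of touching /tmp/library_a or /tmp/library_b.

```python
import os
import tempfile


def find_in_search_path(filename, search_path):
    """Return the first match for filename in a colon-separated path, or None."""
    for directory in search_path.split(":"):
        if not directory:
            continue  # empty entries are skipped to keep the sketch simple
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Recreate the two-directory example with temporary directories.
    with tempfile.TemporaryDirectory() as root:
        lib_a = os.path.join(root, "library_a", "lib")
        lib_b = os.path.join(root, "library_b", "lib")
        os.makedirs(lib_a)
        os.makedirs(lib_b)
        # Only library_b provides the (hypothetical) libexample.so.
        open(os.path.join(lib_b, "libexample.so"), "w").close()
        print(find_in_search_path("libexample.so", lib_a + ":" + lib_b))
```

If both directories held a copy, the one in the first directory would win, which is why the order of entries matters.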
How do we know which technique to use? My goal for this article is to explain when each option might be used.
Note that the concepts may carry to non-GNU/Linux systems but the details will be different. This article focuses on common GNU/Linux systems.
When to use RPATH
Use RPATH when the paths to search must not be overridden for any reason, both now and in the future. Let's say you are installing software for multiple users and do not want them to be able to goof up running this particular piece of software by setting LD_LIBRARY_PATH. You might consider RPATH in this case. I say might because once this is set there is no overriding it without altering the binary.
I generally recommend against using RPATH because it is very hard to tell the future. How certain can you be, when you are building the software, that neither you nor your users will ever have a valid reason to override this path?
It is worth noting that RPATH is deprecated. However, I suspect it will never permanently go away for two reasons:
- Linux systems really, really try not to break userspace software.
- It actually serves a need that no other technique can fill. If RPATH went away the first thing that would be checked is LD_LIBRARY_PATH. We cannot (and should not even if we could) prevent users from altering LD_LIBRARY_PATH. However, if the user has a library in LD_LIBRARY_PATH that conflicts with the one an administrator-provided binary or library needs then it may not run correctly. RPATH provides a mechanism for administrators to prevent this from happening.
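To find out which of the two tags an existing binary carries, one option is to read its dynamic section with readelf -d and pick out the RPATH and RUNPATH lines. The sketch below assumes binutils readelf is installed and prints its usual English output; the helper names are made up for illustration.

```python
import re
import subprocess

# Matches lines such as:
#  0x000000000000001d (RUNPATH)   Library runpath: [/apps/gcc/6.4.0/lib64]
#  0x000000000000000f (RPATH)     Library rpath: [/apps/gcc/6.4.0/lib64]
_TAG_LINE = re.compile(r"\((RPATH|RUNPATH)\)\s+Library r(?:un)?path: \[(.*)\]")


def path_tags(readelf_output):
    """Map 'RPATH' / 'RUNPATH' to their directory lists from `readelf -d` text."""
    tags = {}
    for line in readelf_output.splitlines():
        match = _TAG_LINE.search(line)
        if match:
            tags[match.group(1)] = match.group(2).split(":")
    return tags


def dynamic_section(binary_path):
    """Return the text printed by `readelf -d` for one binary."""
    result = subprocess.run(
        ["readelf", "-d", binary_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical usage:
#   print(path_tags(dynamic_section("/apps/python/3.6/bin/python3")))
```

A result containing only the RPATH key means LD_LIBRARY_PATH cannot override those directories; a RUNPATH key means it can.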
When to use LD_LIBRARY_PATH
Use of LD_LIBRARY_PATH is highly discouraged. Here is a small sampling of articles that recommend against its use:
- LD_LIBRARY_PATH considered harmful by Georg Sauthoff
- Why LD_LIBRARY_PATH is bad by David Barr
- When should I set LD_LIBRARY_PATH? According to them the short answer is never.
I encourage reading more on this subject but if you want the very quick, probably too-simple summary it's that 1) conflicts are inevitable and 2) it is a potential security issue.
If you do use LD_LIBRARY_PATH it should be for temporary use such as running a test suite with a different version of a dependency to ensure it works correctly. You should be very wary of setting LD_LIBRARY_PATH in a modulefile or some other file that might get sourced by other users; doing so is a sign that what you are doing is permanent and some other method should be used.
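One way to keep such use temporary is to set the variable only for the single command that needs it, leaving your own session untouched. The following is a minimal sketch with made-up helper names; the command and directory in the usage comment are placeholders.

```python
import os
import subprocess


def env_with_library_path(directories, base_env=None):
    """Return a copy of the environment with directories prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(directories) + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


def run_with_library_path(command, directories):
    """Run one command with the modified environment.

    Only the child process sees the change; os.environ is not modified.
    """
    return subprocess.run(command, env=env_with_library_path(directories), check=False)


# Hypothetical usage: run a test suite against a dependency in /tmp/library_a/lib.
#   run_with_library_path(["make", "check"], ["/tmp/library_a/lib"])
```

Because the change lives only in the child's environment, nothing needs to be undone afterwards.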
A cautionary tale
At work we used the LD_LIBRARY_PATH method for hundreds of software packages simply because we did not know any better at the time. Years later when we realized our mistake we were well past the point of recompiling them properly to use different, more appropriate methods. This meant that for new software we were basically forced to use RPATH, which in turn prevents users from using LD_LIBRARY_PATH as it was intended.
Fortunately for us we are preparing a new operating system image. This is the impetus for writing this article: I need a single article that explains the process in enough depth to understand the issues and that I can confidently recommend to my coworkers. Hopefully it will be useful to others as well.
When to use RUNPATH
Most user-space software should prefer using RUNPATH. This is the intended replacement for RPATH and in 2017 most Linux operating systems provide a linker that will understand it. In particular if you ship pre-built software to be installed on systems you do not own you should probably use this method with a RUNPATH that mentions $ORIGIN. The dynamic linker expands $ORIGIN to the directory containing the executable or library being loaded, which allows you to use search paths relative to that location. Let's say your software is extracted into $DIR; your directory structure might be something like:
$DIR/
  bin/
  lib/
Your executables should be in $DIR/bin and should have RUNPATH set to include $ORIGIN/../lib. You put all of the libraries in $DIR/lib and if they have dependencies on other libraries you are also shipping they should have a RUNPATH of $ORIGIN.
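To see what these entries resolve to, here is a small sketch that expands $ORIGIN for a given file the way the layout above intends. It works purely on path strings, the function name is made up, and /opt/example stands in for $DIR as an assumed install location.

```python
import os


def expand_origin(runpath, object_path):
    """Expand $ORIGIN in a colon-separated RUNPATH for the file at object_path."""
    # $ORIGIN stands for the directory that contains the executable or library.
    origin = os.path.dirname(os.path.abspath(object_path))
    expanded = []
    for entry in runpath.split(":"):
        entry = entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
        # normpath collapses ".." so the result is easier to read.
        expanded.append(os.path.normpath(entry))
    return expanded


if __name__ == "__main__":
    # An executable in bin/ looks one level up and then into lib/.
    print(expand_origin("$ORIGIN/../lib", "/opt/example/bin/tool"))
    # A library in lib/ looks next to itself.
    print(expand_origin("$ORIGIN", "/opt/example/lib/libexample.so"))
```

Both calls resolve to the same lib directory, and they would keep doing so if the whole tree were moved elsewhere, which is the point of using relative entries.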
When to use the ldconfig cache
User-space software will rarely use this method directly. Software installed by administrators that is intended for all users will typically go into the system paths but sometimes they may need it to be located elsewhere. In this case they will add a file listing the extra directories to /etc/ld.so.conf.d and regenerate ld.so.cache by running ldconfig.
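After regenerating the cache it can be useful to confirm that a library was actually picked up. The sketch below parses the listing printed by ldconfig -p into a dictionary. It assumes the usual "name (flags) => path" line layout of glibc's ldconfig, and the function name is made up for illustration.

```python
import re

# Matches listing lines such as:
#   libz.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libz.so.1
_ENTRY = re.compile(r"^\s*(\S+) \((.*?)\) => (\S.*)$")


def parse_cache_listing(text):
    """Map each library name in `ldconfig -p` output to the paths listed for it."""
    entries = {}
    for line in text.splitlines():
        match = _ENTRY.match(line)
        if match:
            name, _flags, path = match.groups()
            entries.setdefault(name, []).append(path)
    return entries


# Hypothetical usage (ldconfig may live in /sbin and not be on every user's PATH):
#   import subprocess
#   listing = subprocess.run(["ldconfig", "-p"], capture_output=True, text=True).stdout
#   print(parse_cache_listing(listing).get("libgcc_s.so.1"))
```

A name that maps to more than one path usually means the cache holds builds for several architectures or several directories provide the same library.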
When to use system paths
If all users of the system are intended to use this version of the library nearly all of the time then it should probably belong here. This is the simplest, easiest method because it requires no extra configuration at build-time nor at run-time. Most libraries installed by package managers end up here.
Summary
For most software that is intended to be shared by all users I recommend using the system paths. However, software authors should not assume their software will be installed here and should provide a mechanism to install it to other places.
If the system paths are not suitable because of version conflicts or you do not want all users to automatically use them then I recommend RUNPATH.
I recommend against using LD_LIBRARY_PATH. If you do use LD_LIBRARY_PATH it should be temporary such as loading a newer version of a dependency in order to run test suites against that version.
I recommend against using RPATH. Use RPATH only if you are very certain you do not want users to purposefully or accidentally load different versions of libraries through LD_LIBRARY_PATH.
Last changed on Wed Sep 18 16:51:30 2019