To avoid re-inventing the wheel, libraries were developed to allow code reuse. The first libraries were static libraries: code compiled to one or more .o (object) files, each containing a set of library functions and/or global variables. The object files are then collected into a .a archive file, which can be manipulated with the ar command (see: man ar).
ar t /usr/lib64/libz.a
At the link stage of compilation, the linker searches the .a file's collection of object files and statically links the ones it needs into your program, in the same manner that your own .o object files are linked together.
The downsides of static libraries are that they:
Static libraries still exist and, where available, are linked into your program when it is compiled with the -static option. Static programs have the upside that they will likely work so long as a specific minimum kernel version is met, so a distributed static binary is much more likely to work across all distributions of Linux.
To address the two downsides of static libraries, shared libraries were developed. The main problem with a shared library is knowing where in memory it and its functions/data reside so that a caller can call the functions or access the data. In a program or library, the names of functions and data are called symbols. Each symbol has an address and a type, and a record of them is kept in the program itself after compilation. The symbols of a program or library may be inspected using the following commands (assuming the symbols have not been stripped via the strip command):
readelf -s program|library
nm program
Initially in Linux, prior to kernel version 1.2 when the a.out program executable format was used, a library was compiled so that it would load at a specific address. The caller therefore knew where in memory a function or data item would be found, which allowed the compiler to hard-code the addresses of the particular functions (or the address of a jump table). The downside of this scheme is that the limited 32-bit address space had to have a reservation for each library, limiting the number and locations of libraries in the system. Libraries needed to be registered to prevent conflicts in memory with other libraries.
To make a shared library, you write a C program (the main function is optional) and compile it with the options -shared and -fPIC (Position Independent Code).
#include <stdio.h>

int foo(int n)
{
    return printf("Foo %d\n", n);
}

int main(void)
{
    return foo(10);
}
gcc -shared -fPIC -o libfoo.so foo.c
When Linux switched from a.out (Assembler OUTput) executable format to ELF (Executable and Linking Format, also used by
The program /lib64/ld-linux-x86-64.so.2 is the runtime dynamic linker (also known as ld.so). When a program is compiled against a shared library, the dynamic linker is specified as the interpreter for the program. When such a program is execve()'ed, the kernel launches the dynamic linker, which loads the required libraries and then runs the program.
We can play the part of the dynamic linker in our own programs using the C library functions dlopen() to open a dynamic shared object (.so file) and then get the address of symbols within using dlsym().
#include <dlfcn.h>
void *dlopen(const char *filename, int flags);
int dlclose(void *handle);
dlopen() returns a non-NULL (void *) resource handle if it successfully opens the file, or NULL on failure. flags must include one of RTLD_LAZY (resolve function symbols only when they are first needed) or RTLD_NOW (resolve all symbols immediately), optionally bitwise OR'ed with other flags such as RTLD_GLOBAL or RTLD_LOCAL, which decide whether the library's symbols are available to subsequently loaded libraries.
When opening a library you should be specific about its location, unless you wish to use an established library installed in the normal /lib* directories. If you wish to use a common library, you can include <gnu/lib-names.h> and use defines such as LIBM_SO (the math library).
void *dlsym(void *handle, const char *symbol);
dlsym() resolves the symbol named by symbol within the library identified by the resource handle. The return value is the address of the symbol, or NULL on failure.
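#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    void *handle = dlopen("./libfoo.so", RTLD_LAZY);
    if (handle == NULL) {
        /* dlopen() does not set errno, so report dlerror() rather than perror() */
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    /* Call foo() from the library, if it was found */
    int (*foo)(int) = dlsym(handle, "foo");
    if (foo != NULL)
        foo(5);

    /* Call the library's own main(), if it was found */
    int (*foo_main)(void) = dlsym(handle, "main");
    if (foo_main != NULL)
        foo_main();

    dlclose(handle);
    return 0;
}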