Friend: I tried looking at static linking in Mac OS X and it seems nearly impossible. Take a look at this http://stackoverflow.com/a/3801032
Me: I have no idea what that -static flag does, but I'm pretty sure that's not how you link to a library. Let me RTFM a bit.
Minutes later...
Me: I'm gonna have to write this stuff down.
From Wikipedia:
A static library or statically-linked library is a set of routines, external functions and variables which are resolved in a caller at compile-time and copied into a target application by a compiler, linker, or binder, producing an object file and a stand-alone executable.
A dynamically linked library is a library intended for dynamic linking. Only a minimum amount of work is done by the linker when the executable file is created; it only records what library routines the program needs and the index names or numbers of the routines in the library. The majority of the work of linking is done at the time the application is loaded (load time) or during execution (run time). Usually, the necessary linking program, called a "dynamic linker" or "linking loader", is actually part of the underlying operating system.
First things first, gcc isn't the default compiler in Mac OS X anymore. Since Xcode 5, the Apple developer toolchain uses clang, and gcc is only an alias for clang.
NOTE: All shell outputs in this document were produced with the default bash on Mac OS X El Capitan 10.11.3 with Xcode 7.2 installed.
$ gcc -v
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.3.0
Thread model: posix
So let's look into clang.
$ man clang
clang is a C, C++, and Objective-C compiler which encompasses preprocessing,
parsing, optimization, code generation, assembly, and linking. Depending on
which high-level mode setting is passed, Clang will stop before doing a full
link. While Clang is highly integrated, it is important to understand the stages
of compilation, to understand how to invoke it.
So static linking is an option of the linker stage of compilation. Let's try the manual for ld, the Mac OS X linker.
$ man ld
OPTIONS
Options that control the kind of output
-execute The default. Produce a mach-o main executable that has file
type MH_EXECUTE.
-dylib Produce a mach-o shared library that has file type MH_DYLIB.
-bundle Produce a mach-o bundle that has file type MH_BUNDLE.
-dynamic The default. Implied by -dylib, -bundle, or -execute
-static Produces a mach-o file that does not use the dyld. Only used
building the kernel.
Now we have it. -static does not control how the output links to libraries; it controls the type of output produced by the linker. In this case, -static is used to indicate that no dynamic linking should occur with this binary. Ever. The only file to ever need this option is the kernel.
So how do we tell the linker which library to link against?
$ man ld
Options that control libraries
-lx This option tells the linker to search for libx.dylib or libx.a in
the library search path. If string x is of the form y.o, then that
file is searched for in the same places, but without prepending
`lib' or appending `.a' or `.dylib' to the filename.
-Ldir Add dir to the list of directories in which to search for
libraries. Directories specified with -L are searched in the order
they appear on the command line and before the default search
path. In Xcode4 and later, there can be a space between the -L and
directory.
These are the only two options you need to know to link to most (static or dynamic) libraries. More can be said about lazy linking, frameworks, and many other aspects of linking and libraries but these topics are outside of the scope of this post.
-l is used to tell the linker the name of the libraries to look into for symbols. If you decide to compile a program that uses functions from libpng, you would add -lpng to the arguments passed to clang.
-L is used to tell the linker where to look for the library files. ld maintains a list of directories to search for a library to use. The default library search path is /usr/lib then /usr/local/lib. The -L option will add a new library search path.
At this point it is important to point out that, if the linker finds both a .a (static) and a .dylib (dynamic) file in the same search directory, it will choose the dynamic library.
UNIX man pages are infamous for being dry, to the point and absolutely impossible to read for first-timers. Let's try an example to see how all this fits together.
We'll be using a very simple C program that uses libcurl to retrieve the content at http://google.com.
#include <curl/curl.h>
#include <stdio.h>

int main(int argc, char** argv) {
    CURL *curl;
    CURLcode res;

    curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL, "http://google.com");
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "curl_easy_perform() failed: %s\n",
                    curl_easy_strerror(res));
        curl_easy_cleanup(curl);
    }
    return 0;
}
libcurl is installed by default with Mac OS X, so we don't need to provide headers or libraries to compile this program.
$ ls -1 /usr/include/curl
curl.h
curlbuild.h
curlrules.h
curlver.h
easy.h
mprintf.h
multi.h
stdcheaders.h
typecheck-gcc.h
$ ls -1 /usr/lib/libcurl*
/usr/lib/libcurl.3.dylib
/usr/lib/libcurl.4.dylib
/usr/lib/libcurl.dylib
So let's try to compile our program!
$ clang -o curl_example-system curl_example.c
OH NO! We get a bunch of "Undefined symbols" errors. Why?
We simply forgot to specify that we wanted to link against libcurl at the linker stage. Let's add the suitable -l option.
$ clang -o curl_example-system -lcurl curl_example.c
$ ./curl_example-system
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="fr"
><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop=
"image"><title>Google</title>...
Great! We have a working program that links against libcurl.
So did we link dynamically or statically? When we inspected the contents of /usr/lib, we only found .dylib files, so we can expect that this executable links dynamically. We can confirm this by using otool.
otool is a command-line tool to display different parts of object files. In our case, we want to see which libraries the executable links against, so we use the -L option.
$ otool -L curl_example-system
curl_example-system:
/usr/lib/libcurl.4.dylib (compatibility version 7.0.0, current version 8.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
We can see our executable links against libcurl and libSystem.
That went pretty smoothly, so let's see what happens when we try to link against a custom build of libcurl.
First of all, let's build our own version of curl.
$ git clone https://github.com/curl/curl.git
$ cd curl
$ ./buildconf
$ ./configure
$ make
Supposing everything went smoothly, you now have your own freshly brewed curl and libcurl.
$ cd ..
$ ls -1 curl/lib/.libs
libcurl.4.dylib
libcurl.a
libcurl.dylib
...
So inside curl/lib/.libs we have our libraries, both static and dynamic. Let's try compiling our program linking against our brand new libcurl. We'll have to give the linker the path to our new libraries using the -L option.
$ clang -o curl_example-dynamic -Lcurl/lib/.libs/ -lcurl curl_example.c
SUCCESS! We have a new executable. Let's see which libraries this one links against.
$ otool -L curl_example-dynamic
curl_example-dynamic:
/usr/local/lib/libcurl.4.dylib (compatibility version 9.0.0, current version 9.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
We can definitely see that our new executable does not link against the system libcurl, but where does that /usr/local/lib path come from?
To answer this question, we need to inspect our brand new libcurl.
$ otool -L curl/lib/.libs/libcurl.dylib
curl/lib/.libs/libcurl.dylib:
/usr/local/lib/libcurl.4.dylib (compatibility version 9.0.0, current version 9.0.0)
/usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
That 1st line looks familiar... It's exactly the same as the one in our executable! Actually, on Mac OS X, a dynamic library knows where it's "expected" to be installed and uses that path as an identifier, called the "install name". When the linker created our executable, it looked into our libcurl.dylib and recorded that install name. So let's run our new executable!
$ ./curl_example-dynamic
dyld: Library not loaded: /usr/local/lib/libcurl.4.dylib
Referenced from: /Users/chrales/Desktop/./curl_example-dynamic
Reason: Incompatible library version: curl_example-dynamic requires version 9.0.0 or later, but libcurl.4.dylib provides version 7.0.0
Trace/BPT trap: 5
What? Why?? NO! What happened here?
These types of errors are actually a consequence of "dynamic linking". Remember that dynamic linking requires a program to link symbols at run-time. In Mac OS X, the dynamic linker is a part of the OS. We can still get some insight from the manual pages.
man dyld reveals a number of environment variables that can be used to affect the way symbols are dynamically linked. This one is of particular interest for us.
DYLD_PRINT_LIBRARIES
When this is set, the dynamic linker writes to file descriptor 2
(normally standard error) the filenames of the libraries the program is
using. This is useful to make sure that the use of DYLD_LIBRARY_PATH is
getting what you want.
Seems like a great way to track what's going on with our dynamic linking, and diagnose our previous error.
$ DYLD_PRINT_LIBRARIES=1 ./curl_example-dynamic
dyld: loaded: /Users/chrales/Desktop/curl_example/./curl_example-dynamic
dyld: loaded: /usr/lib/libcurl.4.dylib
dyld: unloaded: /Users/chrales/Desktop/curl_example/./curl_example-dynamic
dyld: Library not loaded: /usr/local/lib/libcurl.4.dylib
Referenced from: /Users/chrales/Desktop/curl_example/./curl_example-dynamic
Reason: Incompatible library version: curl_example-dynamic requires version 9.0.0 or later, but libcurl.4.dylib provides version 7.0.0
Trace/BPT trap: 5
That 2nd line tells it all. Since the dynamic linker did not find libcurl at the expected install path, it fell back to the system library, which is a different (and incompatible) version than the one used when building our executable.
How do we tell the dynamic linker to use our custom library? This is also solved by using an environment variable.
DYLD_LIBRARY_PATH
This is a colon separated list of directories that contain libraries. The
dynamic linker searches these directories before it searches the default
locations for libraries. It allows you to test new versions of existing
libraries.
Seems perfect for the job... and it is!
$ DYLD_LIBRARY_PATH=curl/lib/.libs ./curl_example-dynamic
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="fr"
><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop=
"image"><title>Google</title>...
In the future, should we decide to keep our own libcurl and use it, we could either install it to the expected path, or change the install name to reflect where our library will be located. This is a mildly annoying, but very powerful feature of Mac OS's dynamic linker that we can leverage to distribute custom versions of libraries with our software, even when those libraries are already installed on the system.
Dynamic linking and static linking both have distinct advantages and disadvantages. In our case, statically linking against a particular build of libcurl could be useful if we want to be absolutely sure that no other version of libcurl (especially the system version) is used when running our program, or if we want to distribute our program without having to distribute our libcurl along with it.
We know that the linker will choose the dynamic library rather than the static library whenever it finds it on the search paths. For this example, we'll have to remove the dylib files from the build path.
$ rm curl/lib/.libs/*.dylib
Let's try compiling.
$ clang -o curl_example-static -Lcurl/lib/.libs/ -lcurl curl_example.c
Undefined symbols for architecture x86_64:
"_ASN1_INTEGER_get", referenced from:
_ossl_connect_common in libcurl.a(libcurl_la-openssl.o)
"_ASN1_STRING_data", referenced from:
_ossl_connect_common in libcurl.a(libcurl_la-openssl.o)
"_ASN1_STRING_length", referenced from:
_ossl_connect_common in libcurl.a(libcurl_la-openssl.o)
...
That's quite a large amount of undefined symbols! Why did this happen?
Remember when we used otool to display all the libraries our libcurl.dylib linked against? Remember that libssl, libcrypto, libz and the LDAP framework were all mentioned? That's where the missing symbols are. By linking statically against libcurl, we copy the symbols from libcurl into our executable file, but we still need to link (statically or dynamically) against the symbols from these other libraries. However, since we didn't have any problems running our dynamically linked version of curl_example, the dynamic libraries can be found somewhere on the system. We can just tell the linker to link dynamically to the system libraries to find the missing symbols.
Remember that our libcurl links against libssl and libcrypto from /usr/local/opt/openssl/lib. There is another, incompatible version in /usr/lib; we want to avoid linking against that one, so we'll have to add the path to the one our custom libcurl uses with a -L linker option.
$ clang -o curl_example-static -Lcurl/lib/.libs/ -L/usr/local/opt/openssl/lib/ -lcurl -lssl -lcrypto -lz -framework LDAP curl_example.c
$ ./curl_example-static
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="fr"
><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta
content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop=
"image"><title>Google</title>...
Voilà!
Let's inspect our new executable using otool.
$ otool -L curl_example-static
curl_example-static:
/usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.5)
/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP (compatibility version 1.0.0, current version 2.4.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)
No mention of libcurl, we did it! Notice that we still need to link dynamically (at run-time) against the other libraries. But libcurl is not required at run-time; we can say that libcurl is statically linked to our executable.
So what's actually in our executables? We'll use a tool called nm. nm lists and displays information about the symbols in our executable. Let's try running it on our curl_example-system executable.
$ nm -m curl_example-system
(undefined) external ___stderrp (from libSystem)
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
(undefined) external _curl_easy_cleanup (from libcurl)
(undefined) external _curl_easy_init (from libcurl)
(undefined) external _curl_easy_perform (from libcurl)
(undefined) external _curl_easy_setopt (from libcurl)
(undefined) external _curl_easy_strerror (from libcurl)
(undefined) external _fprintf (from libSystem)
0000000100000e40 (__TEXT,__text) external _main
(undefined) external dyld_stub_binder (from libSystem)
Remember from using otool that curl_example-system links dynamically to libcurl and libSystem. We can see that every symbol linked from one of those libraries is listed as (undefined), meaning they aren't defined in our executable, but rather in another object file, linked at run-time. The only two symbols that are defined in our executable (and located in the __text section of the __TEXT segment) are __mh_execute_header and _main. __mh_execute_header is the "mach header" of a Mach-O executable file, used by the OS to identify an executable file, and _main is the main function of our program. All other symbols are either symbols from the standard C library or from the cURL library.
To learn more about what a "segment" and a "section" are and more info about the internals of an executable file, read Anatomy Of A Program In Memory. (Linux-specific, but a lot of generic information inside).
Let's try curl_example-dynamic.
$ nm -m curl_example-dynamic
(undefined) external ___stderrp (from libSystem)
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
(undefined) external _curl_easy_cleanup (from libcurl)
(undefined) external _curl_easy_init (from libcurl)
(undefined) external _curl_easy_perform (from libcurl)
(undefined) external _curl_easy_setopt (from libcurl)
(undefined) external _curl_easy_strerror (from libcurl)
(undefined) external _fprintf (from libSystem)
0000000100000e40 (__TEXT,__text) external _main
(undefined) external dyld_stub_binder (from libSystem)
(Un)surprisingly, we get exactly the same output, since both executables link dynamically to libraries with the same names. We know this from using otool. We also know from otool that a version and path are stored for each dynamic library we link to, making sure that -system uses the default libcurl and that -dynamic uses our custom build.
Finally, let's look at curl_example-static.
$ nm -m curl_example-static
(undefined) external _ASN1_INTEGER_get (from libcrypto.1)
(undefined) external _ASN1_STRING_data (from libcrypto.1)
(undefined) external _ASN1_STRING_length (from libcrypto.1)
...
(undefined) external _SSL_CIPHER_get_name (from libssl.1)
(undefined) external _SSL_CTX_add_client_CA (from libssl.1)
(undefined) external _SSL_CTX_check_private_key (from libssl.1)
...
(undefined) external ___bzero (from libSystem)
(undefined) external ___error (from libSystem)
(undefined) external ___maskrune (from libSystem)
...
0000000100021cf7 (__TEXT,__text) external _curl_easy_cleanup
0000000100021e66 (__TEXT,__text) external _curl_easy_duphandle
000000010001b2d9 (__TEXT,__text) external _curl_easy_escape
0000000100021d9e (__TEXT,__text) external _curl_easy_getinfo
0000000100021a1d (__TEXT,__text) external _curl_easy_init
0000000100022098 (__TEXT,__text) external _curl_easy_pause
0000000100021b0c (__TEXT,__text) external _curl_easy_perform
000000010002213c (__TEXT,__text) external _curl_easy_recv
0000000100022017 (__TEXT,__text) external _curl_easy_reset
00000001000221f3 (__TEXT,__text) external _curl_easy_send
0000000100021a60 (__TEXT,__text) external _curl_easy_setopt
000000010002a776 (__TEXT,__text) external _curl_easy_strerror
000000010001b554 (__TEXT,__text) external _curl_easy_unescape
...
(undefined) external _inflate (from libz)
(undefined) external _inflateEnd (from libz)
(undefined) external _inflateInit2_ (from libz)
(undefined) external _inflateInit_ (from libz)
...
(undefined) external _ldap_err2string (from LDAP)
(undefined) external _ldap_first_attribute (from LDAP)
(undefined) external _ldap_first_entry (from LDAP)
...
Woah! That's quite a load of symbols. Where do all these symbols come from?
First, let's notice that all symbols from libcurl are now visible in the executable, and that they aren't (undefined) anymore, but defined in (__TEXT,__text). This confirms that by linking statically to libcurl, we actually wrote the code from the library into our executable, dispensing with the need to find those symbols elsewhere at run-time.
Now let's look at all the other new (undefined) symbols. nm is nice enough to mention where the symbols are located. Remember from building our -static executable and from using otool that we had to specify all the libraries libcurl required on the command line for the linker. We can see now that we have a bunch of (undefined) symbols for all those libraries. Just like the symbols marked from libcurl in our previous builds, we need to have libcrypto, libssl (both part of OpenSSL), libz and the LDAP framework installed on the system for our program to run.
While this article does not go in depth about static vs. dynamic linking, it does shed a lot of light on many lesser known developer tools in Mac OS X. To learn more about how Apple's developer tools compile code into an executable, you can look into these links, which were used extensively while writing this article.
- clang User's Manual
- man ld
- man otool
- man dyld
- man nm
- Anatomy of a Program in Memory - Gustavo Duarte
- Mach-O executable - objc.io
HI, @loderunner, Charles! Very well-written article! Great job. Thanks for your effort in getting this out-there.
I came across this article while trying to figure out how to unpack and find the address of a section named "__cstring" that can be produced using the llvm-readelf -x -p __cstring command-line tool.
I am trying to programmatically find out some of this info from within a running program. In Linux I can do:
I am searching for an equivalent call in Mac/OSX land as dladdr() did not quite give me what I wanted.
I looked elsewhere and came upon interfaces like (ignore the commented out code, those were my experiments):
But nothing seems to work.
Your write-up seems to hold some potential to help me figure out how to navigate Mach-O's linker / loader info.
If this comment tickles your brain and you may have some ideas to help me out, would appreciate a small ping.
Btw, I also found this article that you linked in the reference to be very useful!
Even with this WWW it is SO very difficult to find info for esoteric topics like these. So, thanks again, for putting that reference out there.