Example of compiling C libraries

Previously on DevMonologue.com

In my last article we looked at how you can compile C libraries written by other people and integrate them into your Xcode project as static libraries. It covered some basics regarding the build process of many standard open source libraries. However, things often don’t go exactly according to plan, and an inexperienced developer can be left with cryptic compiler and linker errors.

And now… an example of compiling C libraries

So today we are going to look at an example of compiling C libraries. Seeing it in practice will be much more helpful and easier to understand. I’ll try to be as explicit as I can and make sure to explain everything I do. The whole process might prove a little scary to some of you, but you’ll be thanking me one day when you finally rub noses with such a problem.

So let’s get on with it!

The project

All of the articles in this blog are about things that I either encountered in my programming career or found interesting and wanted to learn more about. This post is no different, so the example of compiling C libraries we’re going to be working on is something I built recently. To be honest, there are libraries that would be more useful to the majority of you, but nevertheless I believe the task I had shows many aspects of working with open source libraries, which makes it a perfect project for this article.

Today’s example of compiling C libraries will be something called JusText. The JusText algorithm is designed to take a whole HTML webpage as input, and output the useful, human readable “payload”. It’s quite an interesting project, which has luckily been ported to C++ by István Endrédy (thank you sir!). So for starters, we need to compile that on the Mac for the i386, x86_64, armv7, armv7s and arm64 architectures.

Note: What are all these architectures and why do I need them? Well, these are the currently “mainstream” architectures for iOS projects. Here’s a quick summary:
* i386 (32-bit x86) – this is the standard desktop architecture from Intel that your Mac uses. For iOS apps it allows you to run your project on the simulator
* x86_64 – The 64-bit version of x86
* armv7 – The architecture that older iPhones use (iPhone 3GS through iPhone 4S)
* armv7s – The architecture that slightly newer iPhones use (iPhone 5)
* arm64 – The 64-bit version of the ARM architecture (iPhone 5S and iPhone 6)

However, there’s a caveat to building JusText. It has dependencies:
* htmlcxx
* pcrecpp

And before you think I set you up, know that this is very common. Most libraries out there use other libraries. You cannot realistically expect them to write everything from scratch. So, before we are able to successfully build JusText, we will have to compile htmlcxx and pcrecpp. But what are these projects?

htmlcxx is used to parse HTML. Note that you cannot just use an XML parser since HTML is often not valid XML and parsing will just fail.

pcrecpp is a C++ library that provides regular expression support.

To summarize, our task consists of the following:
* Compile pcrecpp into a static library
* Compile htmlcxx into a static library
* Compile JusText using the libraries above
* Import JusText into your iOS project and start using it!

The plan

Looking at the source code of the libraries you wish to compile, there are two ways to go. The first way is the one I showed you in my previous article – using the library’s Makefile. The other one is to compile the sources yourself using Xcode. There’s nothing stopping you from creating a new project, adding all files and compiling them as you would your own source code. This approach works well for small libraries that don’t need any special build configuration. However, for larger projects, it can quickly become too complicated to set up the Xcode project and figure out all dependencies.

For the purpose of this tutorial, we are going to be using both methods. In fact, we cannot stick to only one of them. Here’s why:

Our final product – the JusText library – doesn’t have a Makefile or a build script. It only contains a Visual Studio project file. So we will have to create a custom Xcode project for it. But don’t worry, the sources themselves are not very complicated so it should be fairly straightforward.

Now, to satisfy JusText’s dependencies, let’s look at pcrecpp and htmlcxx.

With htmlcxx, the decision is trivial. JusText’s author has already opted to include its sources as part of his project. That means that instead of creating a static library for htmlcxx, we will be importing the sources and compiling them as part of the JusText project.

On the other hand, pcrecpp is not included in the project. We will have to satisfy that dependency ourselves before compiling JusText. Now, looking at the source code for pcrecpp, we can see that it DOES contain a Makefile. Additionally, it has multiple products (the pcre library, the pcrecpp library along with other things) so we better let the Makefile deal with creating them rather than try to replicate that with Xcode.

So, upon further investigation, our tasks get a little updated:
* Compile pcrecpp into a static library
* Set up an Xcode project for JusText
* Add pcrecpp as a static library
* Compile JusText as a static library
* Import JusText into your iOS project and start using it!

PCRE

Compile pcrecpp into a static library

Those of you who have already read my article about compiling C libraries should already know how to do that. If not, you should go ahead and read it before continuing.

But there is a problem. Following the steps in my previous article will build pcre for x86_64 (i386 if your Mac is not 64-bit). In order to build for the ARM architectures, we will need to modify our call to the configure script. More specifically, we need to instruct it to NOT use the default gcc and g++ compilers, but rather the ones from Xcode’s toolchain, together with the iOS SDK. That’s because this compiler and SDK combination is set up to build for iOS, meaning the ARM architecture in this case.

So, previously, we called the configure script without parameters:

$ ./configure

But now… wait for it…

$ ./configure --disable-shared --enable-utf8 --host=arm-apple-darwin \
    CFLAGS="-arch armv7 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS8.1.sdk" \
    CXXFLAGS="-arch armv7 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS8.1.sdk" \
    LDFLAGS="-L." \
    CC="/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc" \
    CXX="/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++"

Yeah… So that’s a little different. Configure script parameters are way beyond the scope of a single article and I’m certainly not the man to explain them, but generally there are several things to note here.

Firstly, note that we explicitly instructed the configure script to build for the armv7 architecture using -arch armv7. Note that we can change that to armv7s, i386, x86_64 and arm64. In fact, we will do just that later.

The other parameters set the path to the C and C++ compilers that we want to be used during the build process. Additionally, we specify the path to the iOS SDK that we want to build with.

Note: The file system paths that I’m using might not work for you if you’ve installed Xcode elsewhere or are building with different SDK. If you encounter any errors in the configure script, make sure that these paths are indeed correct.
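One way to avoid hard-coding these paths is to ask Xcode for them. The sketch below assumes the xcrun tool that ships with Xcode; the helper name ios_arch_flags is made up for illustration, and the usage comment simply mirrors the armv7 command above, with clang and clang++ standing in for the toolchain’s cc and c++.

```shell
# Hypothetical helper: print the compiler flags for one architecture.
# $1 = architecture (e.g. armv7), $2 = SDK name known to xcrun (e.g. iphoneos)
ios_arch_flags() {
    arch=$1
    sdk=$2
    # xcrun --show-sdk-path prints the location of the requested SDK
    sdk_path=$(xcrun --sdk "$sdk" --show-sdk-path) || return 1
    printf '%s\n' "-arch $arch -isysroot $sdk_path"
}

# Usage (from the pcre source directory):
#   FLAGS=$(ios_arch_flags armv7 iphoneos)
#   ./configure --disable-shared --enable-utf8 --host=arm-apple-darwin \
#       CFLAGS="$FLAGS" CXXFLAGS="$FLAGS" LDFLAGS="-L." \
#       CC="$(xcrun --sdk iphoneos --find clang)" \
#       CXX="$(xcrun --sdk iphoneos --find clang++)"
```

This way the same command should keep working after an Xcode or SDK update, as long as xcrun can find the SDK you name.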

If all goes well with the configure script, you can go ahead and run make to generate the library file.

At this point, you can go to the .libs folder and find all kinds of goodies:

libpcre.a libpcre.lai libpcrecpp.la libpcreposix.a libpcreposix.lai pcre_stringpiece_unittest pcregrep
libpcre.la libpcrecpp.a libpcrecpp.lai libpcreposix.la pcre_scanner_unittest pcrecpp_unittest pcretest

Note: .libs is a hidden folder so you might not see it in Finder.

Out of these, we will need:

libpcre.a
libpcreposix.a
libpcrecpp.a

Making a universal static library

You might notice that we only built the library for armv7. And I promised that it was going to be working with all the popular architectures we use for iOS development. The way we are going to achieve that might not be the easiest or most elegant, but here it is. We are going to generate a static library file for every architecture we need by modifying our call to the configure script. It should be almost the same as the one shown above – just replace all references to armv7 with the new architecture you want to build for – arm64 for example. (For the simulator architectures, i386 and x86_64, the -isysroot path also has to point at the iPhoneSimulator SDK rather than the iPhoneOS one.) After each build, go ahead and copy the library files into a separate directory. Once you have one for every architecture you want, type the following in the Terminal:

$ lipo -create /path/to/library1 /path/to/library2 /path/to/library3 -output /path/to/universal_library.a

So what just happened? We took several library files and created what’s called a fat library. That’s a library that contains binaries for several architectures.
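The merge step can be scripted once the per-architecture files are in place. This is only a sketch: the function name, the build/<arch>/ directory layout and the architecture list in the usage comment are my assumptions, and it expects each thin library to have been copied to build/<arch>/ after its build.

```shell
# Hypothetical helper: merge per-architecture copies of one library into a fat file.
# $1 = library file name (e.g. libpcre.a), remaining args = architectures.
# Assumes each thin library was copied to build/<arch>/<library> beforehand.
make_universal() {
    lib=$1
    shift
    inputs=""
    for arch in "$@"; do
        if [ ! -f "build/$arch/$lib" ]; then
            echo "missing build/$arch/$lib" >&2
            return 1
        fi
        inputs="$inputs build/$arch/$lib"
    done
    mkdir -p build/universal || return 1
    # $inputs is intentionally unquoted so each path becomes its own argument
    lipo -create $inputs -output "build/universal/$lib"
}

# Usage:
#   make_universal libpcre.a armv7 armv7s arm64
#   lipo -info build/universal/libpcre.a   # lists the architectures inside
```

Repeat the call for libpcreposix.a and libpcrecpp.a, since each of the three libraries needs its own fat file.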

Preparing PCRE for use

Now that we have a pre-built version of the pcre and pcrecpp libraries, we are almost ready to check that off our list and continue with the next task at hand. So what’s still missing?

If you’ve read my previous article you might recall that in order to use a library, you will need two things – the library itself and its public headers. The headers, of course, are needed because the compiler doesn’t know about our library even after importing it into the project. Using the headers, we “promise” that we will provide implementation for the library method call during the linking process.

Normally, running make install automatically copies the public header files into one of the standard header directories – like /usr/local/include/. However, since we were being “clever” during the last step, we didn’t install our library and we will have to copy the headers ourselves.

One way to do that would be to open the Makefile and figure out which files it copies. However, since the PCRE library is relatively simple, you can kind of “guess” which files you’ll need to get from the source files. So we can get away with just copying all .h files from PCRE over to our future project file’s directory.
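Copying the headers by hand works, but it is easy to script. The helper below is a sketch under the assumption that the headers you need sit in the top level of the PCRE source directory; the function name and both paths in the usage comment are placeholders.

```shell
# Hypothetical helper: copy every .h file from the top level of a source tree.
# $1 = source directory, $2 = destination directory (created if needed)
copy_headers() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for header in "$src"/*.h; do
        # if nothing matches, the pattern stays literal, so check it is a file
        if [ -f "$header" ]; then
            cp "$header" "$dest"/ || return 1
        fi
    done
}

# Usage:
#   copy_headers /path/to/pcre-source /path/to/project/headers
```

Subdirectories are deliberately left alone, so test or example headers deeper in the tree do not end up in your project.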

JusText

Good news! We’ve just completed the hard part of this tutorial. Now all we have to do is create a new Xcode project and add our PCRE universal library to it.

As discussed above, we are going to download the source code for JusText from GitHub, add it to our Xcode project, satisfy its dependencies and hopefully… build it.

Set up a new Xcode project

First things first, let’s download JusText’s source code.

Next, in order to create an Xcode project for a static library, go to File -> New -> Project -> Framework & Library -> Cocoa Touch Static Library. Once you have done that, go ahead and add all sources for JusText (excluding the Visual Studio files and the test directory).

Project tree screenshot

As you can see, it already contains the sources for the htmlcxx library so we don’t have to worry about that. What we have to worry about is pcre. Let’s add it to the project, alongside its header files.

PCRE headers
Now we are kind of ready to try and build JusText. But (Spoiler alert) we are going to run into a couple of problems. Go ahead and hit that Build button, I dare you!

Fixing standard library errors

Standard library error

ci_string.h:8:10: 'bits/char_traits.h' file not found

Yeah… Now what? This error might seem a bit obscure… and it is. But a little googling around and you’ll find that Xcode doesn’t always work that well with the C++ standard libraries. bits/char_traits.h is an internal header of GNU’s libstdc++, so the code most likely expects libstdc++ rather than Xcode’s default libc++. So here’s the clever bit – we are going to change the C++ standard library the project is built with.

Changing the standard library

Whaaaaat? I bet you thought you’d never have to change these project settings. But those kinds of problems are to be expected whenever you are trying to work with C++ libraries. They tend to use stuff from the standard libraries and even though they are supported by Xcode, it’s not always the default setting.

Alright! Let’s try building again. Surely everything will be fine this time.

ParserDom error

justext.h:7:10: 'html/ParserDom.h' file not found

Setting header search paths

Oh man, not again! Well, get used to it. You are bound to run into these sorts of problems. So what happened here? The compiler cannot find ParserDom.h. But why? It’s right there in the project tree!

The problem is with the include statement.

#include <html/ParserDom.h>

If we change that to

#include "ParserDom.h"

the problem will be fixed. You may recall that the difference between these statements is that the latter first looks for the file relative to the including file and in the project’s own directories, whereas the former only searches the header search paths and the standard places your environment puts its headers.

That being said, changing the include statement is not the best thing you can do in this situation. It’s not going to win you any respect in the developer community. That’s because both JusText and htmlcxx are third-party projects that you should avoid modifying unless you have something meaningful to add. It’s just cleaner that way.

We need another plan to fix this error. If we cannot change the include statement, we will have to change the places Xcode searches for headers. That’s done in the “Header search paths” field in the project settings. We’ll add JusText/htmlcxx-0.84 in there and specify that Xcode should search in there recursively since the actual headers are one level deeper in the html directory.

htmlcxx search paths

That should be enough to make this error go away. But the truth is that we are still not done… If you try to build once again, you’ll see that we get the same error for PCRE’s headers. At this point you should know how to handle that. Just for the sake of clarity, you need to add JusText/libraries/pcre/headers/ to the search path. This time you don’t need to make that path recursive because that’s the exact path to the header files.
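If you are ever unsure which directory belongs in the search path, you can look for the header on disk and work backwards from the include line. A small sketch, assuming a standard find; the function name is made up for illustration.

```shell
# Hypothetical helper: print every directory under $1 that would make
# `#include <$2>` resolve if it were added to the header search paths.
# $1 = root directory to scan, $2 = the path exactly as written in the #include
find_include_roots() {
    root=$1
    inc=$2
    find "$root" -type f -path "*/$inc" | while IFS= read -r file; do
        # strip the trailing "/<include path>" to get the search path entry
        printf '%s\n' "${file%/$inc}"
    done
}

# Usage:
#   find_include_roots JusText html/ParserDom.h
#   find_include_roots JusText pcrecpp.h
```

Each printed directory is a candidate for the “Header search paths” field. If you list that exact directory, it does not need to be marked as recursive.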

Before you start hating me for making you work with header search paths, please consider that this, along with library search paths, is a big part of working with third-party sources and workspaces. So you better get used to it.

NOTE: Library search paths work the same way as header search paths, but instead of telling the compiler where to look for headers, they tell the linker where to find libraries. If your library file is in a non-standard path, you may need to add its location to the library search paths. We didn’t do that for the JusText project because we added the PCRE static library to our project tree, thus including it in the “Link Binary With Libraries” build phase.

OK, so now what happens when we hit that Build button? Will there be another error? No! At this point the project should compile and link fine! Well done, we finally have a result! If you look under “Products” in your project tree, you should even see your built static library. Isn’t it satisfying…?

Valid architectures

After all that trouble we went through to build PCRE for many architectures, it makes sense to make sure our newly built JusText library does so as well. Unlike what we had to do previously, setting up Xcode to generate a binary valid for several architectures is quite simple.

First of all, in the “Valid Architectures” field of our project file, we will have to enumerate all architectures we need. In our case, that might be “armv7”, “armv7s” and “arm64”. In addition to that, we need to tell Xcode not only that all these options are acceptable build configurations for the project, but also that we want it to include all these architectures in a single binary every time we build. That’s done by specifying “NO” in the “Build Active Architecture Only” field. However, it makes sense to leave that set to “YES” for Debug builds in order to shorten build times.

Valid architectures
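To confirm that the built library really contains every architecture, you can inspect it with lipo -info. The helper below is a sketch: its name and the library file name in the usage comment are assumptions, and it only relies on lipo printing the architecture names as space-separated words.

```shell
# Hypothetical helper: report which required architectures are absent.
# $1 = text listing the architectures present (e.g. the output of `lipo -info`)
# remaining args = the architectures you need
missing_archs() {
    present=$1
    shift
    for arch in "$@"; do
        case " $present " in
            *" $arch "*) ;;                 # found as a whole word
            *) printf '%s\n' "$arch" ;;     # not found: report it
        esac
    done
}

# Usage:
#   missing_archs "$(lipo -info libJusText.a)" armv7 armv7s arm64
# Prints nothing when every architecture is present.
```

Matching whole words matters here, because a plain substring search for armv7 would also be satisfied by armv7s.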

Finishing off

Wow… that was intense! Using third party libraries is hard work… Well, not as hard as writing that extra functionality yourself :)

Sample projects

After all that talking, it is now finally time for me to show you some code. A completed Xcode project, containing everything we discussed here about the example of compiling C libraries, is available on my GitHub page.

Also, I created a simple console utility that utilizes our newly created static library. It is also available on GitHub.

One thing you might notice in JusTextUtil is that it also contains all static libraries related to PCRE. This is no coincidence. In fact, it is the responsibility of a static library’s client to satisfy all dependencies. What’s more, you can safely remove “libpcre.a” from the JusText project and it will build fine. It is the JusTextUtil project where the build process needs to resolve all method calls in order to create the binary.

I hope you enjoyed this example of compiling C libraries, found it useful and you don’t get terrified next time you need to work with libraries. If there is anything unclear about the project we just completed, be sure to leave a comment so we can work together on solving each other’s development problems. For more tutorials about using libraries consider subscribing to the DevMonologue blog via email or RSS. Thanks for reading!

Compiling libraries

This is an article that will teach you how to download, compile and use a C (or C++) library for use in your own projects. It will give you some basic information about static libraries and dependencies that will be the foundation for a series of posts regarding working with third-party libraries.

Introduction

Many developers nowadays know too little about how to compile and work with libraries. And that’s a real shame since they are an essential tool in today’s programming landscape. Virtually all programs you can write nowadays use dozens of them. Even “Hello World”.

Maybe the reason people feel intimidated by C libraries is that they involve some general UNIX knowledge. Most of what I personally know about them comes from playing around with Linux. By doing so, I learned a couple of things about dependencies and how libraries work together by separating responsibility for different aspects of functionality.

Another difficulty comes from the fact that there are different types of libraries – static libraries, shared libraries, frameworks… While they aren’t that different in the way they’re used by developers, it’s yet another thing we need to keep in order. I’m not going to get into much detail about that, so if you’re interested, you may refer to this article explaining the difference between library types.

A quick word on dependencies

You might not be familiar with the concept of dependencies. But I’m pretty sure most, if not all, of your projects have dependencies. For example, your iOS applications depend on the Foundation framework and UIKit (and possibly many others). Foundation and UIKit, in turn, depend on tons of other frameworks and libraries. In order for your app to properly run, all these dependencies have to be satisfied. This means that if Xcode doesn’t have any of these libraries, your project won’t run.

The same way you can use a library to use functionality you don’t want to write yourself, libraries can depend on other libraries. That’s how you get dependencies in “real life”.

Let’s run an experiment and see what dependencies some programs have. In OS X, you can easily do this with the following command:

    $ otool -L <file>

It will give you the names and versions of all libraries used by the executable. For instance, here’s what the ls command (that lists folder contents and file information) uses:

    $ otool -L /bin/ls
    /bin/ls:
        /usr/lib/libutil.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)

Really neat. But how about something from Cocoa? Here’s the output from a sample console utility I wrote recently:

    $ otool -L justextutil
    justextutil:
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
        /usr/local/lib/libpcreposix.0.dylib (compatibility version 1.0.0, current version 1.3.0)
        /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1151.16.0)
        /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)
        /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1151.16.0)

Hey, we can see some familiar faces here like CoreFoundation! Oh man, otool is the best!

But seriously… Dependencies will make you cry sometimes. They will cause “undefined symbol” linker errors that will make you question your intelligence as well as sanity. Consider yourself warned. So get to know (and love) them!

Anatomy of a library

So what is a library exactly? How do you recognize it when you see it?

A library is a collection of object files, combined in an archive that is not compressed.

Huh?! Well, you might think of it as a bundle of functions and classes all combined in a single file, possibly having the “.a”, “.so” or “.dylib” extension. It gets linked alongside your program and whenever your code calls a method from the library, the linker looks up the position of that code within the library file and assembles the resulting binary with it. It is important to note that the functionality inside the library file is already compiled. That’s why companies love them so much. They can provide you with some cool functionality, but at the same time, not give you the source code for it (so you can come back for more :)).

So code from the library is automatically assembled into your binary by the linker. But what about compiling? How do you compile function calls that you don’t have the source code for? If you thought of that while reading the previous paragraph, treat yourself to a pat on the back.

In reality, the library file is not the only thing you need. You also need the library’s header files. Or at least the public ones. You add them to your project so that the compiler knows that the functionality you are using really exists (actually, it is functionality you promise exists). To summarize, in order for you to successfully use third-party libraries in your project, you need two things – the library itself (most often a “.a” file) and the library’s public header files [1].

Hm, well that’s inconvenient. Why should there be two sets of files? Wouldn’t it be great if there was a single item containing them all?

Yes! Yes, there is! It’s called a framework. A framework is a library, bundled together with its headers and resources (among other things). You can even see this in Xcode. In the navigator (the file tree on the left), find a framework and click on the chevron on its left to expand it. It shows a “Headers” directory. Open that and you’ll see all header files available for this framework.

Framework contents

Let’s start compiling libraries!

Let’s start with something I had to deal with recently. We are going to be compiling PCRE for our OS X machines. PCRE, or Perl Compatible Regular Expressions, is a C library (with a C++ wrapper called pcrecpp) that adds regex support. It is used by some major software projects like Apache and Apple’s Safari, and if you someday need to use another C/C++ library there’s a chance that it depends on PCRE (exactly what happened to me last week).

Right, so since we’re going to be building stuff, first of all, we are going to need some source code. Conveniently enough, the source code for PCRE is readily available online. Head over there and download a copy of it. Be sure to get PCRE and not PCRE2 since we’re old school and don’t want any goodies from these newer versions (though I doubt PCRE2 will have a different build method). So, you should be looking at an archive containing the sources. Go ahead and unarchive it either with your GUI tool or, if you’re really hardcore, from the Terminal using:

    tar xzvf filename.tar.gz     # for tar.gz
    tar xjvf filename.tar.bz2    # for tar.bz2

It’s always good to be able to do such basic stuff like that right from the Terminal so that you can impress the cool kids, whenever they are around.

Anyway, you should end up with a directory full of wonderful stuff. Alongside the actual source code, we are going to be focusing on several files:
* autogen.sh
* configure
* Makefile

Autogen

If you check now, you may find that the configure file is not there. autogen.sh is another script that will generate it for you. Note that many libraries don’t ship an autogen.sh. In that case, just skip this step and go straight to configure.

Configure

Configure is a shell script that contains a lot of cryptic code I cannot even begin to understand. But it will prepare your source code for build. Go ahead and run it without any parameters:

    ./configure

You can add additional parameters in order to build the library in a different way, for instance if you want to build it for another architecture, like arm or arm64. Quite useful for iOS developers. However, this is outside the scope of this article and we are going to tackle it separately. So stay tuned to this blog to find out. You can always subscribe in order to get notified about new posts.

make

Once configure finishes successfully, we can go ahead and run make. But before that, make sure configure didn’t fail! This happens sometimes. Read the output carefully and make sure there are no errors. If everything is fine, run make. It will build the library. If it completes successfully, there should already be a built version of the library somewhere inside the directory, most likely in the “.libs” directory. If that’s ok with you, you can stop here, otherwise continue with the next step.

make install

Running make install will install your library in the appropriate places, like /usr/local/lib for instance. You might have to run it as a superuser with sudo make install.

So, to summarize, the following sequence of commands should be enough to build the PCRE library:

    ./configure
    make
    make install

Now that you have the compiled library file, you are ready to use it in your projects. You will also need the public header files, but you can get those from the source code directory. Additionally, make install should copy them in the directories Xcode searches for library headers (e.g. /usr/local/include/).
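If you would rather not install into system directories, configure’s standard --prefix option lets you install somewhere private instead. The sketch below wraps the three steps in one function; the function name and both paths in the usage comment are placeholders of mine.

```shell
# Hypothetical helper: configure, build and install into a private prefix.
# $1 = source directory containing ./configure, $2 = absolute install prefix
build_into_prefix() {
    src=$1
    prefix=$2
    (
        # the subshell keeps the cd from affecting the caller
        cd "$src" || exit 1
        # && stops at the first step that fails instead of carrying on
        ./configure --prefix="$prefix" && make && make install
    )
}

# Usage:
#   build_into_prefix /path/to/pcre-source "$HOME/pcre-install"
#   ls "$HOME/pcre-install/lib" "$HOME/pcre-install/include"
```

Because the prefix lives in your home directory, no sudo is needed, and the library and its headers end up side by side in one place.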

Next, we may need to set up our Xcode project so that it knows where to look for our library and its headers. But that’s a problem for another post. I don’t want to fill your brains with too much information at once.

So that’s it for today. Hope I didn’t scare you away from compiling libraries. And don’t worry – it might take a certain amount of fiddling, but it works out fine eventually.

In the next articles, we are going to be looking at:
* Setting library and header search paths
* Compiling libraries for different architectures

You may want to subscribe to the blog to get notified for these new posts.

[1] What makes a header public is purely the developer’s choice. A library author may not want to expose all library functionality to its users. He/She might choose to leave some headers hidden. Thus, the rest of the headers are the public ones.