DK Ancora Imparo

A cursory look at LLVM and Clang

Today, I spent my day looking at LLVM and Clang. It’s more of a cursory look than an in-depth investigation. The first thing that you want to do is to install the LLVM tools. You could check out the Subversion repositories and build them from source. Be prepared to wait for an hour or so and have about 6 gigabytes of free space. Or, you could just invoke:

    $ brew install llvm

It took only a few minutes because it’s a bottled (prebuilt binary) package. After that, we can write our first program:

    $ cat test.m
    #import <Foundation/Foundation.h>

    int main(int argc, char *argv[]) {
        NSLog(@"Hello World");
        return 0;
    }

    $ clang -framework Foundation test.m -o test

    $ ./test
    2013-08-12 20:12:30.409 test[6490:707] Hello World

Actually, we don’t need to install LLVM to perform the steps above because Xcode comes with clang. But, we do need some LLVM tools if we want to play around with the LLVM intermediate representation (“LLVM IR”).

Compiling a program is actually a multistep process that involves preprocessing, parsing, optimization, code generation, assembly, and linking. In addition, a compiler is normally implemented in a modular way, such that the part that processes the programming language code (the front end) is independent of the part that generates the machine code (the back end). This allows the compiler to handle multiple languages for multiple target machines. Some pictures that show a compiler architecture. If you have some existential questions about compilers, the following article describes bootstrapping a compiler so that it can compile itself.
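The stages listed above can be run one at a time with the clang driver. The following is a minimal sketch, assuming macOS with clang on the PATH and a source file such as the test.m from earlier. The function name compile_in_stages and the intermediate file names are made up for illustration.

```shell
# Sketch: drive clang one stage at a time instead of in a single invocation.
# Assumption: macOS with clang installed; file names are derived from the input.
compile_in_stages() {
    src=$1              # e.g. test.m
    base=${src%.*}      # e.g. test

    # 1. Preprocess only (.mi is clang's extension for preprocessed Objective-C).
    clang -E "$src" -o "$base.mi" &&
    # 2. Parse, optimize, and generate assembly.
    clang -S "$base.mi" -o "$base.s" &&
    # 3. Assemble into an object file.
    clang -c "$base.s" -o "$base.o" &&
    # 4. Link against Foundation to produce the executable.
    clang "$base.o" -framework Foundation -o "$base"
}
```

Calling `compile_in_stages test.m` should leave test.mi, test.s, and test.o next to the final executable, so each intermediate result can be inspected.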

In LLVM, the middle layers, where the front end and the back end meet, are represented by the LLVM IR, a well-specified code representation. The LLVM IR is defined in three isomorphic forms: a human-readable textual format, an in-memory data structure, and an efficient on-disk binary format (bitcode). Using the above source file, we can see the intermediate representation:

    $ clang test.m -S -emit-llvm -o test.ll
    $ cat test.ll
    ; ModuleID = 'test.m'
    target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
    target triple = "x86_64-apple-macosx10.8.0"

    %0 = type opaque
    %struct.NSConstantString = type { i32*, i32, i8*, i64 }

    @__CFConstantStringClassReference = external global [0 x i32]
    @.str = linker_private unnamed_addr constant [12 x i8] c"Hello World\00", align 1
    @_unnamed_cfstring_ = private constant %struct.NSConstantString { i32* getelementptr inbounds ([0 x i32]* @__CFConstantStringClassReference, i32 0, i32 0), i32 1992, i8* getelementptr inbounds ([12 x i8]* @.str, i32 0, i32 0), i64 11 }, section "__DATA,__cfstring"

    define i32 @main(i32 %argc, i8** %argv) uwtable ssp {
      %1 = alloca i32, align 4
      %2 = alloca i32, align 4
      %3 = alloca i8**, align 8
      store i32 0, i32* %1
      store i32 %argc, i32* %2, align 4
      store i8** %argv, i8*** %3, align 8
      call void (%0*, ...)* @NSLog(%0* bitcast (%struct.NSConstantString* @_unnamed_cfstring_ to %0*))
      ret i32 0
    }

    declare void @NSLog(%0*, ...)

    !llvm.module.flags = !{!0, !1, !2, !3}

    !0 = metadata !{i32 1, metadata !"Objective-C Version", i32 2}
    !1 = metadata !{i32 1, metadata !"Objective-C Image Info Version", i32 0}
    !2 = metadata !{i32 1, metadata !"Objective-C Image Info Section", metadata !"__DATA, __objc_imageinfo, regular, no_dead_strip"}
    !3 = metadata !{i32 4, metadata !"Objective-C Garbage Collection", i32 0}

The file test.ll contains the human-readable LLVM IR, which llvm-as converts into the bitcode format (writing test.bc by default):

    $ llvm-as test.ll
    $ file test.bc
    test.bc: LLVM bit-code object x86_64
    $ od -x -N 4 test.bc
    0000000      c0de    0b17
    0000004

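Interesting trivia: the magic number shown here is 0x0b17c0de, which od prints as two little-endian 16-bit words, low half first. Strictly speaking, this is the magic number of the bitcode wrapper header used for Darwin targets; raw bitcode begins with the bytes 'B', 'C', 0xC0, 0xDE. The tool llvm-dis disassembles the bitcode into the human-readable IR. And the tool lli directly executes programs from LLVM bitcode.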

    $ rm test.ll
    $ llvm-dis test.bc # You'll get the test.ll file back
    $ lli -load=/System/Library/Frameworks/Foundation.framework/Versions/Current/Foundation test.bc
    2013-08-12 22:23:50.181 lli[9516:707] Hello World
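The magic number mentioned above can also be checked from a script. Here is a small sketch, assuming a POSIX shell with od and tr available. The function name bitcode_kind is made up for illustration. It distinguishes the wrapper header (0x0b17c0de stored little-endian, as in the od output earlier) from raw bitcode, which starts with the bytes 'B', 'C', 0xC0, 0xDE.

```shell
# Sketch: classify a file by its first four bytes.
# Prints "wrapper", "raw", or "unknown".
bitcode_kind() {
    # Dump the first four bytes as hex and strip all whitespace,
    # so the result does not depend on od's column layout.
    magic=$(od -An -tx1 -N 4 "$1" | tr -d ' \t\n')
    case "$magic" in
        dec0170b) echo "wrapper" ;;   # 0x0b17c0de, little-endian on disk
        4243c0de) echo "raw" ;;       # 'B' 'C' 0xC0 0xDE
        *)        echo "unknown" ;;
    esac
}
```

For the test.bc produced on the Mac above, `bitcode_kind test.bc` would be expected to report the wrapper form, matching the od output.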

Well, that’s it. A cursory look at LLVM and Clang. More to come. Oh by the way, a highly recommended article on LLVM by Chris Lattner himself can be found in The Architecture of Open Source Applications book.

Miscellaneous thoughts on iOS Development

Modules and Dependencies

It’s a good idea to group common and reusable code into a library. It minimises code duplication, saves time, helps in debugging, and simplifies code maintenance. The following tutorial shows how to create a library and structure your Xcode project into subprojects. In addition, we also need a way to manage the dependencies. We could use Git submodules as described in the aforementioned tutorial or we could use CocoaPods. Both approaches are fine as long as you set the version numbers explicitly for the dependencies. For example, do not point your dependency at the master branch of a project. Instead, pin it to a specific release, e.g. 2.5.2. You do not want your app to break because of a late night code change by the project maintainer.

Code Style and Formatting

It’s important to have a consistent code style guide. Preferably, it’s company-wide in scope. It allows new people to move into existing projects easily. It allows people to switch between projects easily and it simplifies collaboration between two or more people in a big project. However, this is a subjective topic and people tend to have strong opinions on it. There are many Objective-C style guides, for example, guides from Google, Apple, Adium, Zarra Studios, and so forth. Related to code style is code formatting. Again, people have strong opinions on code formatting, for example, tabs or spaces, 4 or 8 spaces, bracket on the same line or the next line, and so forth. Some people use a code formatting tool like Uncrustify for Xcode projects, for example, Uncrustify Automator Services. Some people use Git pre-commit hooks to format the code before it’s committed to a repository. To reiterate, it’s important to have a consistent code style guide. Having said that, if it becomes a blocker in the development process, it’s also useful to see this in a larger context. After all, apps on the App Store are not judged by whether they use tabs or spaces.
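A pre-commit hook for formatting could look roughly like the following. This is only a sketch under several assumptions: Uncrustify is installed, its configuration lives in a file called uncrustify.cfg at the repository root, and the function name format_staged_objc and the UNCRUSTIFY and UNCRUSTIFY_CONFIG variables are made up for illustration. It reformats the working copy of each staged file, so it does not handle partially staged files gracefully.

```shell
# Sketch of the body of .git/hooks/pre-commit.
# Formats every staged Objective-C source or header and re-stages it.
format_staged_objc() {
    formatter=${UNCRUSTIFY:-uncrustify}            # assumed to be on the PATH
    config=${UNCRUSTIFY_CONFIG:-uncrustify.cfg}    # assumed config location

    # List added, copied, or modified files in the index that end in .m or .h.
    git diff --cached --name-only --diff-filter=ACM -- '*.m' '*.h' |
    while IFS= read -r file; do
        # -l OC selects Objective-C; --no-backup rewrites the file in place.
        "$formatter" -c "$config" -l OC --no-backup "$file" || exit 1
        git add -- "$file"
    done
}
```

In a real hook file, the script would start with a `#!/bin/sh` line, end with a call to `format_staged_objc`, and be made executable.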

Prototyping

For quick prototyping, just use Storyboard and Interface Builder. The tools are useful and they get the job done in a short amount of time. They allow you to quickly prototype the UI, get something functional running, and test the app with users. In addition, Storyboard allows you to design your table views and cells statically (no code is required). Unfortunately, this is not available in standalone Interface Builder (xib) files. Do not write code for the UI for an initial prototype. The code might not survive the first user acceptance testing. In addition, we could also use one of the many iOS prototyping apps: Briefs, POP, or Prototypr. Or one of the web apps: Fluid or Flinto.

Testing

With iOS 7 and apps that can auto-update, application testing is becoming more and more important. New projects created with the new Xcode 5 come with unit tests by default. In addition, Apple has provided new tools and frameworks that put further emphasis on testing and continuous integration: the new XCTest framework, Xcode 5, Xcode bots, and Xcode Server in Mac OS X. Thus, new apps need to be developed with testing in mind.

Version Control

There is only one answer here: use version control, period. Feel free to use Git or Subversion (the only version control systems supported by Xcode), but most of the time, Git is the right choice. It allows you to collaborate with other people, track changes to your code, and review the changes. Xcode provides you with the code comparison editor and history view. Also, please write meaningful and helpful commit messages as explained by this blog post.

Documentation

Transferring knowledge by talking and hands-on sessions does not scale and it does not work in the long term. It’s really important to document the code for the sake of your future self and the next project maintainer. If the project is big and the design needs to be explained, write a separate text document explaining the rationales and the design decisions. Write comments to explain an intricate code segment and when you change the code, edit the comments accordingly. Write code comments judiciously. Remember that comments can become stale and lines of comments in the code are lines that need to be maintained.

Alloc

I don’t enjoy writing as much as I enjoy programming. But I think writing is a step that I must take to get better at communicating ideas. So yeah, in the fourth month of the year, I’m reviving my blog. Hopefully, this post is not the only post for the year. In the great tradition of Objective-C: alloc (or new if you use ARC).