The LLVM team yesterday released LLVM 2.8, the low-level virtual machine infrastructure that includes a next-generation C/C++ compiler, optimiser, and run-time.
LLVM is actually a collection of projects which together provide a toolchain for compiling C, Objective-C and C++. Compilation itself is done with Clang, which has been able to compile both C and Objective-C for some time. Although C++ support was added in LLVM 2.7, the 2.8 release completes Clang's support for the C++ specification and, combined with libc++, provides a standard library for C++ applications. There's also a new LLDB debugger, which is a replacement for the gdb debugger but uses the same parser and source code tools as the compiler does. Although it has been available for a while, LLDB sees its first release in LLVM 2.8.
Unlike GCC, which is a monolithic compiler released under the GPL, the LLVM family of tools is more modular and, thanks to a more permissive BSD-style license, can be embedded in commercial tools. As a result, applications like Apple's Xcode include Clang support which performs faster than the external gcc application. Not only that, but since Clang's AST can be introspected by the containing tool, the IDE has much more power in understanding how the source code relates to the structure and compiled code.
In addition, the modular architecture allows the Clang static analyzer to run over source code and identify potential bugs. It also underpins KLEE, a symbolic virtual machine which can identify what sequences of events can occur within a program. One feature of KLEE is that when it finds a bug, it can programmatically generate a test case that exercises that condition, which can later be used to demonstrate that the fix has been made.
Not all about C
Finally, the LLVM project isn't only about C or C-based languages. Underlying the front-end parsers is a symbolic instruction set, the LLVM intermediate representation (IR), which is a kind of portable assembly code that can be translated to any of the supported machine architectures. This makes it possible to build other parsers and translators that generate the same assembly code, and so run on any platform that is supported by the LLVM family.
Not only that, but the optimisations work at the level of this assembly code, not at the source level, so any language that can be translated into the LLVM IR automatically takes advantage of the optimisations that are possible. There is also a runtime for interpreting the IR directly, so interpreted languages can use this to get a quick start, followed by subsequent calls to the JIT to optimise certain parts of the application.
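To make the idea of a portable assembly code more concrete, here is a minimal sketch of a C function that could be fed to Clang to see the IR it produces. The function name sum_to and the file name sum.c are hypothetical, chosen only for illustration. Assuming a Clang installation, a command along the lines of "clang -S -emit-llvm sum.c -o sum.ll" should write the textual IR to sum.ll, and adding -O2 should show the same function after the IR-level optimisers have run.

```c
/* sum.c: a hypothetical example file, not part of the LLVM distribution. */

/* Sums the integers 1..n with a plain loop. A front end lowers this
 * loop to IR, and it is that IR, not the C source, that the
 * optimisers work on. */
unsigned sum_to(unsigned n) {
    unsigned total = 0;
    unsigned i;
    for (i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Because the optimisers only see the IR, a front end for a different language that emitted an equivalent loop would be expected to benefit from the same transformations.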
This has already been used in the form of VMKit, which provides a common runtime for JVM and CLR, and LLVM is also used in the runtimes of other languages. The Mono runtime, which also released version 2.8 today, includes support for LLVM as a JIT (with mono-llvm) to aid in runtime optimisation. Other runtimes include Ruby on LLVM, MacRuby and Unladen Swallow. It's even used inside ClamAV to perform efficient virus scanning.
You can try out the web-based demo to see how code is compiled into LLVM IR, read the LLVM blog, or see the documentation for more information.