Sunday, April 29, 2012

Back to Tech : Building for ARM "on the metal"

It's been a while since I posted anything technical, what with the winter season, moaning about the launch of the Pi, price gouging on eBay, playing hockey, teaching my 12-year-old to program, and a whole load of other stuff.  Fear not, though: LambdaPi is still moving onwards.  I have a branch on my machine that separates everything that is Lisp from everything that is kernel, and is moving towards a more solidly real-time base.

Anyway.  I had a request from Angus Hammond asking about build scripts, as follows:

Would you be able to do a blog post at some point explaining how the build script you use in lambdapi works? The linking needed for an OS is obviously different to that needed normally if only because of the interrupts table, and that script is completely beyond my understanding. It would be greatly appreciated if you could explain any of the working behind it.

The answer to this is, of course, "of course".  So let's have a look at what's required.

Firstly, we need to understand what we want.  The way the Pi boots (and the way I'm using qemu) is to (eventually) load a binary image into memory at location 0x00000000.  This binary image is required to start with a standard ARM vector table, which we have already seen.  Execution actually starts by transferring control to the ARM with the PC set to 0x00000000, the standard "reset" vector.
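
As a quick reminder of the layout, here is a small sketch in C of where each slot of the classic ARM vector table lives relative to its base.  The enum and function names are made up for illustration and are not part of LambdaPi; the only facts relied on are that there are eight 4-byte slots and that reset is the first.  The assert is only there for running the sketch on a hosted machine.

```c
#include <assert.h>
#include <stdint.h>

/* Classic ARM exception vectors: eight 4-byte slots starting at the base. */
enum arm_vector {
    VEC_RESET = 0,
    VEC_UNDEFINED,
    VEC_SWI,
    VEC_PREFETCH_ABORT,
    VEC_DATA_ABORT,
    VEC_RESERVED,
    VEC_IRQ,
    VEC_FIQ,
    VEC_COUNT
};

/* Address of one vector slot, given the table base (0x00000000 here). */
static inline uint32_t vector_address(uint32_t base, enum arm_vector v)
{
    assert(v < VEC_COUNT);
    return base + 4u * (uint32_t)v;
}
```

With the image loaded at 0x00000000, vector_address(0, VEC_RESET) is the address the PC starts from, which is why the table has to be the first thing in the image.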

So, what's going on in the build process?

Well, firstly, we compile and assemble everything we need into object files.  So far, so standard.  The tricky bits come with the following Makefile rules :

bin/kernel.img: bin/kernel.elf
  ${OBJCOPY} -O binary $< $@

bin/kernel.elf: lambdapi.ld $(OBJ) $(SYSLIBS)
  ${LD} ${LDFLAGS} -T lambdapi.ld $(OBJ) $(SYSLIBS) -o $@

The first of these to be executed is the one that builds bin/kernel.elf.  It depends on the link script lambdapi.ld, all the object files $(OBJ), and all the assorted libraries we're using $(SYSLIBS).  We call the linker ${LD} with a set of flags ${LDFLAGS} and the linker script lambdapi.ld (given with -T), passing in all the object files and system libraries.  LDFLAGS is set to:

-nostdlib -static --error-unresolved-symbols

This means "don't try and link any standard libraries, link everything statically, and throw an error if there's anything you can't find".

So, at the end of that step, we should have a properly formatted ELF file.  Unfortunately, that's not what we want: we need to lose the ELF headers, symbols and other metadata and end up with a raw image to be loaded at a particular location.  Hence the objcopy stage, which dumps just the contents of the linked sections out of the ELF file into a standalone, flat binary.  (In that rule, $< is the first prerequisite, bin/kernel.elf, and $@ is the target, bin/kernel.img.)
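
To see why the ELF file itself is no good as a boot image, remember that every ELF file starts with a four-byte magic number rather than with code.  The sketch below is a hypothetical host-side helper (the name looks_like_elf is mine, not part of any tool) that checks a buffer for that magic number; it is meant only to illustrate the difference between the two files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf begins with the 4-byte ELF magic: 0x7f 'E' 'L' 'F'. */
static int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(buf != NULL);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

Run over the first few bytes of bin/kernel.elf this should return 1.  Run over bin/kernel.img it should return 0, because the flat image begins directly with the first word of the vector table.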

Now, all we need to know is how to write the linker script.  A standard ld script takes a bunch of object files and munges them all together, keeping the .text (code) sections together, then appending a load of other stuff at the end, notably the data and bss sections.  Unfortunately, it makes no promise that any particular piece of code will come first, and we *need* our vector table to be the very first thing in the image.  So, we need a custom link script.  Let's look at it.

ENTRY(__reset)

SECTIONS
{
  . = 0x0;
  .text : {
    *(.reset)
    *(.text)
  }

  __exidx_start = .;
  .ARM.exidx   : { *(.ARM.exidx* .gnu.linkonce.armexidx.*) }
  __exidx_end = .;

  .data : { *(.data) }
  __bss_start__ = .;
  .bss : { *(.bss) }
  __bss_end__ = .;
}

The first thing we do is tell the linker where the entry point is: the symbol __reset.  In reality, we only need this if we ever use the ELF file directly, which we won't, but it doesn't hurt.

Now for the meat of the script.  We start by setting the link location to be 0x0, and then we create a text (code) section.  The first part of this will be the vector table, which we have handily tagged in vector_table.s with a .section directive as follows:

.section .reset, "ax"

This tells the assembler that the code assembled should be placed in a section named .reset, and the "ax" flags mark that section as allocatable and executable.  I made the name up; I could equally well have used .simon, .foo, or anything else, as long as I was consistent between the ld script and the section name in the code itself.

In fact, we're linking all code labeled with a .reset section name first, but as there's only one of these, we can guarantee it will be first.

After that, and still within the .text section, comes anything linked as .text.  Basically, that's everything "code" spat out of the C compiler, along with everything else we've labelled as code in the assembler.

Then we have a .ARM.exidx section, which holds the index tables used for stack unwinding.  That's required for C++ exception handling if we ever need it.  We don't, at the moment, but might do later.

Then we have the .data section - that's for any variables with pre-defined initial values.  (Strictly read-only constants normally end up in a separate .rodata section.)

And finally, we have the .bss section - space allocated for variables with no initial value, which C expects to start out as zero.
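
As a rough illustration of which C objects land where, here is a small sketch.  The variable and function names are invented for the example, and the exact placement is up to the compiler and target, so treat the section names in the comments as the usual case rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

static int boot_count = 42;               /* has an initial value, writable: .data */
static unsigned char scratch[256];        /* no initial value: .bss */
static const char banner[] = "LambdaPi";  /* read-only: usually .rodata */

/* Sum of every byte in scratch; zero only if the whole array is zero. */
static unsigned scratch_sum(void)
{
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof scratch; ++i)
        sum += scratch[i];
    return sum;
}

/* On a hosted system the C runtime guarantees both of these before main(). */
static void check_initial_values(void)
{
    assert(boot_count == 42);
    assert(scratch_sum() == 0);
}
```

On the metal there is no C runtime to do this for us: the initial values in .data arrive as part of the image, but nothing zeroes .bss unless our own startup code does it.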

You'll also note that we set a couple of symbols, __bss_start__ and __bss_end__, to the current link location "." - these are referred to in the code (so that we know where bss starts and ends, and thus where free memory starts) and will be replaced by the linker with the correct addresses.
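
A minimal sketch of how C code might use those two symbols to zero bss at startup follows.  The names zero_bytes, clear_bss and the LAMBDAPI_BARE_METAL guard are assumptions made for this example, not taken from the LambdaPi source; the guard is only there so the helper can be compiled and tried on a hosted machine, where the linker symbols don't exist.

```c
#include <assert.h>

/* Zero every byte in the half-open range [start, end). */
static void zero_bytes(unsigned char *start, unsigned char *end)
{
    assert(start <= end);
    for (unsigned char *p = start; p < end; ++p)
        *p = 0;
}

#ifdef LAMBDAPI_BARE_METAL  /* hypothetical guard, not from the real build */
/* Defined by the linker script; only their addresses are meaningful. */
extern unsigned char __bss_start__[];
extern unsigned char __bss_end__[];

/* Hypothetical startup hook: call before any C code relies on zeroed globals. */
void clear_bss(void)
{
    zero_bytes(__bss_start__, __bss_end__);
}
#endif
```

Declaring the symbols as arrays means that using their names yields the addresses the linker assigned, which is all we want from them.  In a real -nostdlib build you would probably drop the assert (or define NDEBUG), and be aware that an optimising compiler may turn the loop into a call to memset, which then has to come from one of the system libraries.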

And that's pretty much it.  It looks scary, but it's not.

If you want chapter and verse on this, check out this article which is much more explicit, and covers using system libraries and so on.  Also, it's worth looking at balau's blog entry on newlib (and all of balau's other bare metal articles, which are highly instructive).  And, of course, the gnu binutils documentation.