Tuesday, January 31, 2012

CAR, CDR, CONS - an aside

Commenter pcpete has been following the ARM OS stuff, but was having some trouble with my CAR/CDR/etc macros.  I figured I'd do a little writeup for those who aren't totally au fait with what that's all about.

CAR, CDR and CONS are terminology which has stuck around since the '50s.  It's not going away, so if you want to Lisp, you need to get used to it.

The original Lisp was implemented on the IBM 704 computer.  This was a machine which had a 36 bit word (peculiar, but useful) and a 15 bit address bus. 36 bit words could be split into 4 parts (and hardware support was provided for doing so) known as the address (a 15 bit value), the decrement (again, 15 bits), the tag (3 bits) and the flags (3 bits).  The very earliest work on Lisp provided operators to extract these 4 parts (Contents of Address part of Register or CAR,  and likewise for Decrement, Tag and Flags, providing CDR, CTR and CFR).  Later work (before the first implementation of Lisp) dropped direct manipulation of flags and tag, leaving CAR and CDR, and a "Construct" operator, CONS, which took values for the address and decrement parts respectively and stuffed them into a machine word.

The fact that a machine word was bigger than the size of the address of a machine word meant that it was possible to implement pairs of values and singly linked list cells within one machine word.  A pair of values (for example, 1 & 2) could be created as follows:

CONS(1, 2)

and singly linked lists by treating the 'address' part (paradoxically enough) as a value, and the 'decrement' part as the address of the next cell in the list.  By setting the last pointer to some easily-recognised value (known as 'nil'), it is possible to find the end of a list.  Thus, creating a list containing the numbers 1, 2 and 3, looks like this:

CONS(1, CONS(2, CONS(3, nil)))

Binary trees are also easy to construct using pairs - for example:

CONS(CONS(1, 2), CONS(3, 4))

I'm generally the last one to point people at Wikipedia, but there's a reasonable and rather more graphical explanation of what have become known as 'cons cells' over there

It's worth noting that Lisp also allows you to use the notation 'first' and 'rest' instead of 'car' and 'cdr', although this only really makes sense when in list context, and doesn't allow for composition (caar, cadr, cdar, etc).  Pretty much nobody uses first and rest.

So, Lisp uses what is, in effect, a pair (or list) oriented memory model as opposed to the more usual (these days) approach of addressing single memory cells individually.  And if one is implementing a Lisp (as I am), it makes a certain amount of sense for the low level memory allocation and manipulation functions to operate in this way.  Unfortunately, "modern" machines in general don't have words that are larger than their address bus size, and ARM is no exception to this.  ARM is a 32 bit machine - we could restrict addressing within our lisp to 16 bits (maximum 64Kwords or 256KB) and all values to 16 bits, or, more sensibly, use "virtual" words of 2 machine words (or maybe more) in size.  Usefully, ARM has LDRD (Load Register Dual) and STRD (Store Register Dual) operands that make this a snip to do.

So, we can define cons cells as 8-byte aligned pairs of 32-bit values, manipulate them as pairs using LDRD/STRD, and we're a long way towards implementing a Lisp.  

But what of the Flags and Tag stuff that was originally in Lisp?  Although, by the time Lisp came out, you had no way of directly manipulating them, they were still used to differentiate between pointers, direct values, and constants, amongst other things.  And this is something we need to be able to do as well, otherwise we can't tell if (for example) the value 0x00000000 in the CDR of a cell is a pointer to a cell at the address 0x00000000, the integer value zero, or some special value indicating nil/end of list.

As it happens, all is well. LDRD/STRD require an 8-byte aligned address, which means the low 3 bits of a pointer will *always* be 0.  We can, therefore, use those 3 bits as flags to indicate various other types, and, if we're careful, we can encode the majority of types in such a way as to keep their values "boxed" (i.e. containable in a 32 bit value).  Values requiring more than 32 bits to encode (strings, large numbers, rational numbers, etc) can either be encoded as lists of boxed values, pairs, or using any other encoding the programmer deems useful as long as they are easily recognisable.

So, back to the code I posted, which uses CAR, CDR, SET_CAR and SET_CDR macros.

/* linked list structure */
typedef struct task_list {
  task_t * _car;
  struct task_list * _cdr;
} task_list_t;

/* some LISPy primitives */
#define CAR(x) (x)->_car
#define CDR(x) (x)->_cdr

#define SET_CAR(x,y) CAR(x)=(y)
#define SET_CDR(x,y) CDR(x)=(y)

As we can see, task_list_t is an analogue for the cons cell, being simply 2 address-bus sized values held together.  Indeed, my code elsewhere looks like this:

typedef cons_t task_list_t;

but the original formulation is perhaps easier to grok at first.

So, given a pointer x to a cons cell c, we can see that CAR(x) expands to(x)->_car, in other words returning the _car element of the cell c pointed to by x.  Likewise SET_CAR(x, y) becomes (x)->_car = (y), assigning the value y to the _car element of the cell c pointed to by x. Doing this as macros is probably premature optimisation, but it's "the done thing".

Our task scheduler uses a list (actually, a plurality of lists) of tasks to execute, and considers them for execution in a "round robin" style.  So, assuming we have 4 tasks, a, b, c and d, we might have a list that looks, in lisp syntax, like this:

(a b c d)

or, in diagram form, where the top row is the CDR of the cell, and the bottom the CAR, and * indicates a pointer to the next cell

* - * - * - nil
|   |   |   |
a   b   c   d

We need to rotate the list as we consider each task, so that the next time through we consider the next task. So after considering task a for execution, we want the list to look like this:

(b c d a)

or, in diagram form

* - * - * - nil
|   |   |   |
b   c   d   a

In true lisp style, we'll use a singly linked list for this, with the CAR of each cell being a pointer to the task, and the CDR being a pointer to the next cell, or nil for end of list.  

So, we come in, and we find the CAR and CDR of the (original) list.  CAR is, obviously enough, task 'a', the task we want to consider for execution.  CDR, however, is not task b, but the rest of the list (hence the alternative 'rest' in Lisp) - i.e.

(b c d)

or, in diagram form

* - * - nil
|   |   |
b   c   d

So our next step is to put 'a' onto the end of the list.  This is not as simple as it seems - CONS can only be used to push items onto the front of a list - CONS(list, a) would result in this:

((bcd) . a)

or, in diagram form

* - * - nil
|   |   |
b   c   d

which is not a true list in Lisp terms (it's a "dotted pair" with a list as its first element). We need some sort of APPEND operator that adds an element to the end of a list, a sort of 'reverse cons'.  Lisp has one of those, it's (oddly enough) called APPEND, it's usually defined recursively, and it always creates new cons cells all over the place.  We really don't want to go allocating new memory in a function that's going to be called thousands of times per second simply to move a few values about - we want to keep the same cons cells and simply swap their CDR values around.  The algorithm thus becomes:

first := list
if CDR(first) != NIL
  list, end := CDR(first)
  end := CDR(end) while CDR(end) != NIL
  SET_CDR(first, NIL)
  SET_CDR(end, first)

With that in mind, have another read through http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_14.html


Monday, January 16, 2012

Developing a multitasking OS for ARM part 3 3/4

So, we looked at how to do the difficult part of task swapping last time, now let's look at how we go about handling interrupts and the syscall layer.

Firstly, interrupts.  We'll not bother with FIQs for the moment, we'll stick with IRQs.

Despite having an ARM1176 core, the Raspberry Pi (or, rather, its Broadcom SoC) doesn't have a vectored interrupt controller - instead it has 3 "standard" ARM interrupt controllers.  That makes things a bit more complex for interrupt handling, but it's not too arduous.  As a quick reminder, here's the IRQ code we had before, with a little bit of meat in the middle:

.global _irq_handler

sub lr, lr, #4 /* Save adjusted LR_IRQ */
srsdb sp!, #SYS_MODE /* save LR_irq and SPSR_irq to system mode stack */

cpsid i,#SYS_MODE /* Go to system mode */

push {r0-r12} /* Save registers */

and r0, sp, #4 /* align the stack and save adjustment with LR_user */
sub sp, sp, r0
push {r0, lr}

/* Identify and clear interrupt source */
/* Should return handler address in r0 */
bl identify_and_clear_irq

blxne r0 /* go handle our interrupt if we have a handler */
/* An interruptible handler should disable / enable irqs */

/* Exit is via context switcher */
b switch_context_do

and the context switcher looks like this:

.global switch_context_do
/* Do we need to switch context? */
mov r3, #0x0c /* offset to fourth word of task block */
ldr r0, =__current_task
ldr r1, [r0]
ldr r0, =__next_task
ldr r2, [r0]
cmp r2, #0 /* If there's no next task, we can't switch */
beq .Lswitch_context_exit
cmp r1, #0 /* In the normal case, we will have a __current_task */
bne .Lnormal_case

/* When we get here, we're either idling in system mode at startup, or we've */
/* just voluntarily terminated a task.  In either case, we need to remove the */
/* return information we just pushed onto the stack, as we're never, ever going */
/* back. */
pop {r0, r1} /* remove any potential stack alignment */
add sp, sp, r0
add sp, sp, #0x3c /* and the other registers that should be there */
/* r0-r12, interrupted pc & spsr */
/* Now we can do our first actual task swap */
ldr r0,  =__next_task /* swap out the task */
ldr r2,  [r0]
ldr r0,  =__current_task
str r2,  [r0]
ldr sp,  [r2, r3] /* and restore stack pointer */
b .Lswitch_context_exit /* bail */
cmp r1, r2 /* otherwise, compare current task to next */
beq .Lswitch_context_exit

clrex /* Clear all mutexes */

/* At this point we have everything we need on the sysmode (user) stack */
/* {stack adjust, lr}_user, {r0-r12}_user, {SPSR, LR}_irq */
/* Save our stack pointer, and swap in the new one before returning */

ldr r0, =__current_task /* save current stack pointer */
ldr r0, [r0]
str sp, [r0, r3] /* stack pointer is second word of task object */
ldr r0,  =__next_task /* swap out the task */
ldr r2,  [r0]
ldr r0,  =__current_task
str r2,  [r0]
ldr sp,  [r2, r3] /* and restore stack pointer */
pop {r0, lr} /* restore LR_user and readjust stack */
add sp, sp, r0
pop {r0-r12} /* and other registers */
rfeia sp! /* before returning */

The context switcher looks complex, but it isn't really.  There's 4 cases to cater for, viz:

  • No 'next' task, don't switch
  • No 'current' task, switch to 'next' task, cleanup stack and switch to 'next' task
  • 'next' task is the same as 'current' task, don't switch
  • 'next' task is different from 'current' task, switch
Obviously, the meat of the interrupt handler is held in 'identify_and_clear_interrupt', which does pretty much what it says on the tin.  In this article, I'll show the handler for the qemu platform, which is significantly simpler than that for the Pi, but the Pi handler looks largely the same modulo having to deal with 3 controllers.
.global identify_and_clear_irq

FUNC identify_and_clear_irq

ldr r4, =.Lirq_base

ldr r4, [r4]
/* read the vector address to indicate we're handling the interrupt */
ldr r0, [r4, #IRQ_HANDLER]
/* which IRQs are asserted? */
ldr r0, [r4, #IRQ_STATUS]
ldr r5, =__irq_handlers

clz r6, r0 /* which IRQ was asserted? */
mov r1, #1 /* make a mask */
bic r0, r0, r1, lsl r6 /* clear flag */
str r0, [r4, #IRQ_ACK] /* Now acknowledge the interrupt */
str r0, [r4, #IRQ_SOFTCLEAR] /* and make sure we clear software irqs too */
ldr r0, [r5, r6, lsl #2] /* load handler address */
.Lret: bx lr /* exit */
.word IRQ_BASE

.global __irq_handlers
__irq_handlers: .skip 32 * 4

and patching in a hander is as simple as setting the address for the interrupt handler into __irq_handlers at the appropriate place.  Simples, as they say in internet-land.

Syscalls are very similar.  Here's the syscall handler:

.global _svc_handler
srsdb sp!, #SYS_MODE /* save LR_svc and SPSR_svc to sys mode stack */
cpsid i,#SYS_MODE
push {r0-r12} /* Save registers */

and r0, sp, #4 /* align the stack and save adjustment with LR_user */
sub sp, sp, r0
push {r0, lr}
ldr r0,[lr,#-4] /* Calculate address of SVC instruction */
/* and load it into R0. */
and r0,r0,#0x000000ff /* Mask off top 24 bits of instruction */
/* to give SVC number. */
ldr r2, =__syscall_table /* get the syscall */
ldr r3, [r2, r0, lsl#2]
cmp r3, #0
beq _syscall_exit
tst r3, #0x01 /* what linkage are we using */
bxeq r3 /* ASM, just run away */
bic r3, r3, #0x01
blx r3 /* C, must come back here */

.global _syscall_exit
pop {r0, lr} /* restore LR_user and readjust stack */
add sp, sp, r0

pop {r0-r12} /* and other registers */
rfeia sp! /* before returning */

.section .bss
.global __syscall_table
/* leave space for 256 syscall addresses */
.skip 2048

The fun bit is how we go about getting the syscall number into the handler.  I've taken the "canonical" approach of using the svc operand, hence the bit where we get the instruction and extract the number.  Other ways include using a register, a global variable, pushing onto the stack, or some combination of these.

The other twist here is that I allow for both C and assembler syscall functions by setting (or not) bit 0 of the function address in the syscall table.  Assembler syscall handlers must, of course, exit via _syscall_exit or equivalent, but that's down to the programmer to get it right.

Saturday, January 7, 2012

Developing a multitasking OS for ARM part 3 1/2

Okay, so last time we looked at what we need to schedule and store top level information about tasks in our fledgling OS.  And it was pretty easy to do, because we could do all of it in C.  Unfortunately, things are about to get complicated again, because we're about to dive down into assembler again.

Whoopee!  I can almost hear the sound of "back" buttons being clicked as I type this.

So.  Multitasking.  How's that gonna work, then?  Largely speaking, there's 2 types of multitasking - preemptive multitasking, where tasks run for a specified amount of time then get forcibly swapped out, and cooperative multitasking, where tasks run until they decide to give some time to someone else.  We're going to implement (because there's precious little overhead in doing so) a hybrid where tasks can be "nice" and give time to others, but where the absolute maximum time they get is capped by a preemptive scheduler.

So.  Let's look at the sequence of events for a user-triggered task swap.  We want the user to use a line of code that looks like this (remember, I'm developing something that runs scheme...)


However, the user's code is running in user mode (remember the ARM processor states), and it can't directly access the task scheduler.  This is where software interrupts / SVC calls come in - the user's code *can* use the svc instruction.  This, in fact, is the beginning of what is known as a syscall interface, the way that unprivileged user code calls (or causes to happen) kernel functions.

Executing the svc instruction causes the following to happen:

  • CPSR_usr is transferred to SPSR_svc
  • PC is stored in LR_svc
  • Processor switches to SVC mode (SP_usr and LR_usr are now hidden by SP_svc and LR_svc, CPSR is identical except for processor state change)
  • PC is loaded with the address of the SVC exception handler

At this point, what we need to do is store R0-R12, LR_usr and PC (before entry to svc handler) into the user stack, in a known order, then save the user stack pointer to the task's "housekeeping" information, load the equivalents back in for the next task, and jump out of the handler back to user code.  We'll get onto that in a minute.

Preemptive task swapping will be done by using a timer interrupt.  Thus, for a preemptive task swap, the situation is quasi-identical, with _svc above replaced by _irq.  The only difference is that the PC stored to LR_svc is actually 4 bytes on from where we want to restart, so we need to remember to take that into account.

So.  How do we go about saving our information to the user stack?  This is made quite easy by the fact the user mode register are identical to the system mode registers.  So we need to save the svc/irq mode stuff (LR, SPSR) into the system mode stack, then swap into system mode and save the rest.  The first bit is covered by one instruction, which might have been made for the task:

srs (Store Return State) - store LR and SPSR of the current mode onto the stack of a specified mode

and, of course, its "twin"

rfe (Return From Exception) - load PC and CPSR from address

So, the preamble for the svc handler is as follows:

FUNC _svc_handler
srsdb sp!, #SYS_MODE /* save LR_svc and SPSR_svc to svc mode stack */
cpsid i,#SYS_MODE     /* go sys mode, interrupts disabled */

push {r0-r12} /* Save registers */

and r0, sp, #4 /* align the stack and save adjustment with LR_user */
sub sp, sp, r0
push {r0, lr}

and for an irq:

FUNC _irq_handler
sub lr, lr, #4 /* Save adjusted LR_IRQ */
srsdb sp!, #SYS_MODE /* save LR_irq and SPSR_irq to system mode stack */
cpsid i,#SYS_MODE /* Go to system mode */
push {r0-r12} /* Save registers */
and r0, sp, #4 /* align the stack and save adjustment with LR_user */
sub sp, sp, r0
push {r0, lr}

Note that the only difference is that the irq handler adjusts the return address (as we want to go back to the interrupted instruction, not the next one).

Given that we are now *always* in system mode, exiting from either handler is identical:

pop {r0, lr} /* restore LR_user and readjust stack */
add sp, sp, r0
pop {r0-r12} /* and other registers */
rfeia sp! /* before returning */

I'll move onto how to implement the guts of the two handlers in the next post.  But before we go, here's how we set up a task in "C" land.

Remember, when we switch into a task, we will be pulling a stored process state from the task's stack into the registers, and then restoring the PC and CPSR, also from the task's stack.  Setting up a task, then, involves "faking" the preamble of the exception handlers above.  Note the use of exit_fn as the value of LR_usr, this is where we go when a task dies, and is used to clean up after task exit, and the use of 'entry' (the address of the task's entry function) as the value of LR_svc/irq, which will be used as the "return" address from the exception handler.

The use of exit_fn means we can schedule "one-shot" tasks (which not all realtime OSs can do).  Hooray for us.

  // Allocate stack space and the actual object
  task_t * task = malloc( sizeof(task_t) );
  void * stack = malloc( stack_size * 4 );
  unsigned int * sp = stack + stack_size;
  task->stack_top = stack;
  task->priority = priority;
  task->state = TASK_SLEEP;
  task->id = (unsigned int)task & 0x3fffffff;
  *(--sp) = 0x00000010;             // CPSR (user mode with interrupts enabled)
  *(--sp) = (unsigned int)entry;  // 'return' address (i.e. where we come in)
  *(--sp) = 0x0c0c0c0c;             // r12
  *(--sp) = 0x0b0b0b0b;             // r11
  *(--sp) = 0x0a0a0a0a;             // r10
  *(--sp) = 0x09090909;             // r9
  *(--sp) = 0x08080808;             // r8
  *(--sp) = 0x07070707;             // r7
  *(--sp) = 0x06060606;             // r6
  *(--sp) = 0x05050505;             // r5
  *(--sp) = 0x04040404;             // r4
  *(--sp) = 0x03030303;             // r3
  *(--sp) = 0x02020202;             // r2
  *(--sp) = 0x01010101;             // r1
  *(--sp) = (unsigned int) env;     // r0, i.e. arg to entry function

  if ((unsigned int)sp & 0x07) {
    *(--sp) = 0xdeadc0de;           // Stack filler
    *(--sp) = (unsigned int)exit_fn;              // lr, where we go on exit
    *(--sp) = 0x00000004;           // Stack Adjust
  } else {
    *(--sp) = (unsigned int)exit_fn;              // lr, where we go on exit
    *(--sp) = 0x00000000;           // Stack Adjust
  task->stack_pointer = sp;