<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-9184145065528610199</id><updated>2012-02-01T10:10:57.322+01:00</updated><category term='arm'/><category term='lisp os newton stuff'/><category term='lisp'/><category term='assembler'/><category term='wits android a81e review'/><category term='android newton'/><category term='raspberry pi'/><category term='os'/><title type='text'>Mr Foo</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>17</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-5660486269164190537</id><published>2012-01-31T12:53:00.001+01:00</published><updated>2012-02-01T10:10:57.500+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='arm'/><category 
scheme='http://www.blogger.com/atom/ns#' term='os'/><title type='text'>CAR, CDR, CONS - an aside</title><content type='html'>Commenter pcpete has been following the ARM OS stuff, but was having some trouble with my &lt;a href="http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_14.html"&gt;CAR/CDR/etc macros&lt;/a&gt;. &amp;nbsp;I figured I'd do a little writeup for those who aren't totally au fait with what that's all about.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;CAR, CDR and CONS are terms that have stuck around since the '50s. &amp;nbsp;They're not going away, so if you want to Lisp, you need to get used to them.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The original Lisp was implemented on the IBM 704 computer. &amp;nbsp;This was a machine which had a 36 bit word (peculiar, but useful) and a 15 bit address bus. &amp;nbsp;36 bit words could be split into 4 parts (and hardware support was provided for doing so) known as the &lt;b&gt;address&lt;/b&gt; (a 15 bit value), the &lt;b&gt;decrement&lt;/b&gt;&amp;nbsp;(again, 15 bits), the &lt;b&gt;tag&lt;/b&gt; (3 bits) and the &lt;b&gt;prefix&lt;/b&gt; (3 bits, which I'll loosely call the flags). &amp;nbsp;The very earliest work on Lisp provided operators to extract these 4 parts (Contents of Address part of Register or CAR, and likewise for Decrement, Tag and Prefix, providing CDR, CTR and CPR). &amp;nbsp;Later work (before the first implementation of Lisp) dropped direct manipulation of prefix and tag, leaving CAR and CDR, and a "Construct" operator, CONS, which took values for the address and decrement parts respectively and stuffed them into a machine word.&lt;br /&gt;&lt;br /&gt;The fact that a machine word was more than twice as wide as a machine address meant that it was possible to implement &lt;i&gt;pairs of values&lt;/i&gt; and &lt;i&gt;singly linked list cells&lt;/i&gt; within one machine word. 
&amp;nbsp;A pair of values (for example, 1 &amp;amp; 2) could be created as follows:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;CONS(1, 2)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;and singly linked lists by treating the 'address' part (paradoxically enough) as a value, and the 'decrement' part as the address of the next cell in the list. &amp;nbsp;By setting the last pointer to some easily-recognised value (known as 'nil'), it is possible to find the end of a list. &amp;nbsp;Thus, creating a list containing the numbers 1, 2 and 3, looks like this:&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;CONS(1, CONS(2, CONS(3, nil)))&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Binary trees are also easy to construct using pairs - for example:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;CONS(CONS(1, 2), CONS(3, 4))&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;I'm generally the last one to point people at Wikipedia, but there's a reasonable and rather more graphical explanation of what have become known as&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Cons"&gt;'cons cells' over there&lt;/a&gt;.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;It's worth noting that Lisp also allows you to use the notation '&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;first&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;' and '&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;rest&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;' instead of '&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;car&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;' and '&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;cdr&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;', although this only really makes sense when in list context, and doesn't allow for composition (&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;caar&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;cadr&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;cdar&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, etc). &amp;nbsp;Pretty much nobody uses &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;first&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; and &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;rest&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;So, Lisp uses what is, in effect, a pair (or list) oriented memory model as opposed to the more usual (these days) approach of addressing single memory cells individually. &amp;nbsp;And if one is implementing a Lisp (as I am), it makes a certain amount of sense for the low level memory allocation and manipulation functions to operate in this way. 
&amp;nbsp;Unfortunately, "modern" machines in general don't have words that are wider than their addresses, and ARM is no exception to this. &amp;nbsp;ARM is a 32 bit machine - we could restrict addressing within our lisp to 16 bits (a maximum of 64K words, or 256KB) and all values to 16 bits, or, more sensibly, use "virtual" words of 2 machine words (or maybe more) in size. &amp;nbsp;Usefully, ARM has &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;LDRD&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; (Load Register Dual) and &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;STRD&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; (Store Register Dual) instructions that make this a snip to do.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;So, we can define cons cells as 8-byte aligned pairs of 32-bit values, manipulate them as pairs using &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;LDRD&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;/&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;STRD&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, and we're a long way towards implementing a Lisp. &amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;But what of the Flags and Tag stuff that was originally in Lisp? &amp;nbsp;Although, by the time Lisp came out, you had no way of directly manipulating them, they were still used to differentiate between pointers, direct values, and constants, amongst other things. 
&amp;nbsp;And this is something we need to be able to do as well, otherwise we can't tell if (for example) the value 0x00000000 in the CDR of a cell is a pointer to a cell at the address 0x00000000, the integer value zero, or some special value indicating nil/end of list.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;As it happens, all is well. &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;LDRD/STRD&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; require an 8-byte aligned address, which means the low 3 bits of a pointer to a cons cell will &lt;i&gt;always&lt;/i&gt; be 0. &amp;nbsp;We can, therefore, use those 3 bits as flags to indicate various other types, and, if we're careful, we can encode the majority of types in such a way as to keep their values "immediate" (i.e. containable in a 32 bit value). &amp;nbsp;Values requiring more than 32 bits to encode (strings, large numbers, rational numbers, etc) can be encoded as lists of immediate values, as pairs, or using any other encoding the programmer deems useful, as long as they are easily recognisable.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;So, back to the code I posted, which uses CAR, CDR, SET_CAR and SET_CDR macros.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* linked list structure */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;typedef struct task_list {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task_t * 
_car;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; struct task_list * _cdr;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;} task_list_t;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* some LISPy primitives */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define CAR(x) (x)-&amp;gt;_car&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define CDR(x) (x)-&amp;gt;_cdr&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define SET_CAR(x,y) CAR(x)=(y)&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define SET_CDR(x,y) CDR(x)=(y)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;As we can see, task_list_t is an analogue for the cons cell, being simply 2 address-bus sized values held together. 
&amp;nbsp;Indeed, my code elsewhere looks like this:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;typedef cons_t task_list_t;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;but the original formulation is perhaps easier to grok at first.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;So, given a pointer &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;x&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; to a cons cell &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;c&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, we can see that&amp;nbsp;&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;CAR(x)&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; expands to &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(x)-&amp;gt;_car&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, in other words it yields the &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_car&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; element of the cell &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;c&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; pointed to by &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;x&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;. 
&amp;nbsp;Likewise &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;SET_CAR(x, y)&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; becomes &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(x)-&amp;gt;_car = (y)&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, assigning the value &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;y&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; to the _car element of the cell &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;c&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; pointed to by &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;x&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;. Doing this as macros is probably premature optimisation, but it's "the done thing".&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Our task scheduler uses a list (actually, a plurality of lists) of tasks to execute, and considers them for execution in a "round robin" style. 
&amp;nbsp;So, assuming we have 4 tasks, a, b, c and d, we might have a list that looks, in lisp syntax, like this:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(a b c d)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;or, in diagram form, where the top row is the CDR of the cell, and the bottom the CAR, and * indicates a pointer to the next cell&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;* - * - * - nil&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;| &amp;nbsp; | &amp;nbsp; | &amp;nbsp; |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;a &amp;nbsp; b &amp;nbsp; c &amp;nbsp; d&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;We need to rotate the list as we consider each task, so that the next time through we consider the next task. 
So after considering task a for execution, we want the list to look like this:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(b c d a)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;or, in diagram form&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;* - * - * - nil&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;| &amp;nbsp; | &amp;nbsp; | &amp;nbsp; |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;b &amp;nbsp; c &amp;nbsp; d &amp;nbsp; a&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;In true lisp style, we'll use a singly linked list for this, with the CAR of each cell being a pointer to the task, and the CDR being a pointer to the next cell, or nil for end of list. &amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;So, we come in, and we find the CAR and CDR of the (original) list. &amp;nbsp;CAR is, obviously enough, task 'a', the task we want to consider for execution. 
&amp;nbsp;CDR, however, is not task b, but the &lt;i&gt;rest of the list&lt;/i&gt;&amp;nbsp;(hence the alternative 'rest' in Lisp) - i.e.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(b c d)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;or, in diagram form&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;* - * - nil&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;| &amp;nbsp; | &amp;nbsp; |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;b &amp;nbsp; c &amp;nbsp; d&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;So our next step is to put 'a' onto the end of the list. &amp;nbsp;This is not as simple as it seems, as CONS can only push items onto the front of a list. &amp;nbsp;CONS(list, a), with list now being (b c d), would result in this:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;((b c d) . 
a)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;or, in diagram form&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;a&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;|&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;* - * - nil&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;| &amp;nbsp; | &amp;nbsp; |&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;b &amp;nbsp; c &amp;nbsp; d&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;which is not a true list in Lisp terms (it's a "dotted pair" with a list as its first element). &amp;nbsp;We need some sort of APPEND operator that adds an element to the end of a list, a sort of 'reverse cons'. &amp;nbsp;Lisp has something close to that: it's (oddly enough) called APPEND, it joins two lists, it's usually defined recursively, and it creates new cons cells all over the place (it copies every cell of its first argument). &amp;nbsp;We &lt;i&gt;really&lt;/i&gt; don't want to go allocating new memory in a function that's going to be called thousands of times per second simply to move a few values about - we want to keep the same cons cells and simply swap their CDR values around. 
&amp;nbsp;The algorithm thus becomes:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;first := list&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;if CDR(first) != NIL&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; list, end := CDR(first)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; end := CDR(end) while CDR(end) != NIL&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; SET_CDR(first, NIL)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; SET_CDR(end, first)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;With that in mind, have another read through&amp;nbsp;http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_14.html&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Simon&lt;/span&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/5660486269164190537/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2012/01/car-cdr-cons-aside.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/5660486269164190537'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/5660486269164190537'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2012/01/car-cdr-cons-aside.html' title='CAR, CDR, CONS - an aside'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-260706063652785333</id><published>2012-01-16T09:32:00.002+01:00</published><updated>2012-01-31T12:53:32.248+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='arm'/><category scheme='http://www.blogger.com/atom/ns#' term='os'/><category scheme='http://www.blogger.com/atom/ns#' term='raspberry pi'/><title type='text'>Developing a multitasking OS for ARM part 3 3/4</title><content type='html'>So, we looked at how to do the difficult part of task swapping last time, now let's look at how we go about handling interrupts and the syscall layer.&lt;br /&gt;&lt;br /&gt;Firstly, interrupts. &amp;nbsp;We'll not bother with FIQs for the moment, we'll stick with IRQs.&lt;br /&gt;&lt;br /&gt;Despite having an ARM1176 core, the Raspberry Pi (or, rather, its Broadcom SoC) doesn't have a vectored interrupt controller - instead it has 3 "standard" ARM interrupt controllers. &amp;nbsp;That makes things a bit more complex for interrupt handling, but it's not too arduous. 
&amp;nbsp;As a quick reminder, here's the IRQ code we had before, with a little bit of meat in the middle:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _irq_handler&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_irq_handler:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;lr, lr, #4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save adjusted LR_IRQ */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;srsdb&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!, #SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* save LR_irq and SPSR_irq to system mode stack */&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsid&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;i,#SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Go to system mode */&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: 
pre;"&gt; &lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save registers */&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;and&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, sp, #4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* align the stack and save adjustment with LR_user */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Identify and clear interrupt source */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Should return handler address in r0 */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bl&lt;span class="Apple-tab-span" style="white-space: 
pre;"&gt; &lt;/span&gt;identify_and_clear_irq&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;blxne&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* go handle our interrupt if we have a handler */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* An interruptible handler should disable / enable irqs */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Exit is via context switcher */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;b&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;switch_context_do&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;and the context switcher looks like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global switch_context_do&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;switch_context_do:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Do we need to switch context? 
*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, #0x0c&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* offset to fourth word of task block */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, =__current_task&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, =__next_task&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cmp&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, #0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* If there's no 
next task, we can't switch */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;beq&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.Lswitch_context_exit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cmp&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, #0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* In the normal case, we will have a __current_task */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bne&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.Lnormal_case&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* When we get here, we're either idling in system mode at startup, or we've &lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* just voluntarily terminated a task. 
&amp;nbsp;In either case, we need to remove the&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* return information we just pushed onto the stack, as we're never, ever going */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* back.&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;         &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, r1}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* remove any potential stack alignment */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, #0x3c&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* and the other registers that should be there */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;     &lt;/span&gt;/* r0-r12, interrupted pc &amp;amp; spsr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  
&lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Now we can do our first actual task swap */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, &amp;nbsp;=__next_task&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* swap out the task */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, &amp;nbsp;[r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, &amp;nbsp;=__current_task&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, &amp;nbsp;[r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, &amp;nbsp;[r2, r3]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* and 
restore stack pointer */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;b&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.Lswitch_context_exit&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* bail */&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.Lnormal_case:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cmp&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, r2&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* otherwise, compare current task to next */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;beq&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.Lswitch_context_exit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;clrex&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* Clear all mutexes */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* At this point we have everything we need on the sysmode (user) stack&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* {stack adjust, lr}_user, {r0-r12}_user, {SPSR, LR}_irq &lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save our stack pointer, and swap in the new one before returning&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, =__current_task&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* save current stack pointer */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, [r0, r3]&lt;span class="Apple-tab-span" 
style="white-space: pre;"&gt;  &lt;/span&gt;/* stack pointer is fourth word of task object (offset 0x0c in r3) */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, &amp;nbsp;=__next_task&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* swap out the task */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, &amp;nbsp;[r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, &amp;nbsp;=__current_task&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, &amp;nbsp;[r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, &amp;nbsp;[r2, r3]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* and restore stack pointer */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.Lswitch_context_exit:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* restore LR_user and readjust stack */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* and other registers */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;rfeia&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* before returning */&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The context switcher looks complex, but it isn't really. 
&amp;nbsp;There are 4 cases to cater for, viz:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;No 'next' task, don't switch&lt;/li&gt;&lt;li&gt;No 'current' task, clean up the stack and switch to 'next' task&lt;/li&gt;&lt;li&gt;'next' task is the same as 'current' task, don't switch&lt;/li&gt;&lt;li&gt;'next' task is different from 'current' task, switch&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Obviously, the meat of the interrupt handler is held in 'identify_and_clear_irq', which does pretty much what it says on the tin. &amp;nbsp;In this article, I'll show the handler for the qemu platform, which is significantly simpler than that for the Pi, but the Pi handler looks largely the same modulo having to deal with 3 controllers.&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;identify_and_clear_irq&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;identify_and_clear_irq:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;FUNC&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;identify_and_clear_irq&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r4, =.Lirq_base&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier 
New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r4, [r4]&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* read the vector address to indicate we're handling the interrupt */&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r4, #IRQ_HANDLER]&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* which IRQs are asserted? */&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r4, #IRQ_STATUS]&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r5, =__irq_handlers&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;clz&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r6, r0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* which IRQ was asserted? 
*/&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, #1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* make a mask */&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bic&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, r0, r1, lsl r6&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* clear flag */&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r4, #IRQ_ACK]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Now acknowledge the interrupt */&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r4, #IRQ_SOFTCLEAR]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* and make sure we clear software irqs too */&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r5, r6, lsl #2]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* load handler address */&lt;/div&gt;&lt;div&gt;.Lret:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bx&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;lr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* exit */&lt;/div&gt;&lt;div&gt;&lt;span 
class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;.Lirq_base:&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;IRQ_BASE&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;.bss&lt;/div&gt;&lt;div&gt;.global __irq_handlers&lt;/div&gt;&lt;div&gt;__irq_handlers:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.skip&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;32 * 4&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;and patching in a handler is as simple as storing its address in __irq_handlers at the appropriate place. &amp;nbsp;Simples, as they say in internet-land.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Syscalls are very similar. &amp;nbsp;Here's the syscall handler:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;_svc_handler&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_svc_handler:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;srsdb&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!, #SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* save LR_svc and SPSR_svc to sys mode stack */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsid&lt;span class="Apple-tab-span" 
style="white-space: pre;"&gt; &lt;/span&gt;i,#SYS_MODE&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save registers */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;and&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, sp, #4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* align the stack and save adjustment with LR_user */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;  &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; 
&lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0,[lr,#-4]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Calculate address of SVC instruction */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;     &lt;/span&gt;/* and load it into R0. */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;and&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0,r0,#0x000000ff&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Mask off top 24 bits of instruction */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;     &lt;/span&gt;/* to give SVC number. 
*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;     &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, =__syscall_table&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* get the syscall */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r2, r0, lsl#2]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cmp&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, #0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;beq&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;_syscall_exit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;tst&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, #0x01&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* what linkage are we using */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bxeq&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; 
&lt;/span&gt;r3&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* ASM, just run away */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bic&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, r3, #0x01&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;blx&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* C, must come back here */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _syscall_exit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_syscall_exit:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* restore LR_user and readjust stack */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* and other registers */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;rfeia&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* before returning */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.section .bss&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global __syscall_table&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__syscall_table:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* leave space for 256 syscall addresses (4 bytes each) */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.skip&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;256 * 4&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The fun bit is how we go about getting the syscall number into the handler. &amp;nbsp;I've taken the "canonical" approach of using the svc operand, hence the bit where we get the instruction and extract the number. 
&amp;nbsp;Other ways include using a register, a global variable, pushing onto the stack, or some combination of these.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The other twist here is that I allow for both C and assembler syscall functions by setting (or not) bit 0 of the function address in the syscall table. &amp;nbsp;Assembler syscall handlers must, of course, exit via _syscall_exit or equivalent, but that's down to the programmer to get it right.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-260706063652785333?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/260706063652785333/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2012/01/developing-multitasking-os-for-arm-part_16.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/260706063652785333'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/260706063652785333'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2012/01/developing-multitasking-os-for-arm-part_16.html' title='Developing a multitasking OS for ARM part 3 3/4'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-1903850706476505846</id><published>2012-01-07T10:16:00.002+01:00</published><updated>2012-01-31T12:54:03.319+01:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='arm'/><category scheme='http://www.blogger.com/atom/ns#' term='os'/><category scheme='http://www.blogger.com/atom/ns#' term='raspberry pi'/><title type='text'>Developing a multitasking OS for ARM part 3 1/2</title><content type='html'>Okay, so last time we looked at what we need in order to schedule tasks and store top-level information about them in our fledgling OS. &amp;nbsp;And it was pretty easy to do, because we could do all of it in C. &amp;nbsp;Unfortunately, things are about to get complicated again, because we're diving back down into assembler.&lt;br /&gt;&lt;br /&gt;Whoopee! &amp;nbsp;I can almost hear the sound of "back" buttons being clicked as I type this.&lt;br /&gt;&lt;br /&gt;So. &amp;nbsp;Multitasking. &amp;nbsp;How's that gonna work, then? &amp;nbsp;Broadly speaking, there are two types of multitasking: preemptive multitasking, where tasks run for a specified amount of time then get forcibly swapped out, and cooperative multitasking, where tasks run until they decide to give some time to someone else. &amp;nbsp;We're going to implement (because there's precious little overhead in doing so) a hybrid where tasks can be "nice" and give time to others, but where the absolute maximum time they get is capped by a preemptive scheduler.&lt;br /&gt;&lt;br /&gt;So. &amp;nbsp;Let's look at the sequence of events for a user-triggered task swap. &amp;nbsp;We want the user to use a line of code that looks like this (remember, I'm developing something that runs Scheme...)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(task-swap-out)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;However, the user's code is running in user mode (remember the ARM processor states), and it can't directly access the task scheduler. 
&amp;nbsp;This is where software interrupts / SVC calls come in - the user's code *can* use the &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;svc&lt;/span&gt; instruction. &amp;nbsp;This, in fact, is the beginning of what is known as a syscall interface, the way that unprivileged user code calls (or causes to happen) kernel functions. &lt;br /&gt;&lt;br /&gt;Executing the svc instruction causes the following to happen:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;CPSR_usr is transferred to SPSR_svc&lt;/li&gt;&lt;li&gt;The return address (the address of the instruction following the svc) is stored in LR_svc&lt;/li&gt;&lt;li&gt;Processor switches to SVC mode (SP_usr and LR_usr are now hidden by SP_svc and LR_svc, CPSR is identical except for processor state change)&lt;/li&gt;&lt;li&gt;PC is loaded with the address of the SVC exception handler&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;At this point, what we need to do is store R0-R12, LR_usr and PC (before entry to svc handler) into the user stack, in a known order, then save the user stack pointer to the task's "housekeeping" information, load the equivalents back in for the next task, and jump out of the handler back to user code. &amp;nbsp;We'll get on to that in a minute.&lt;br /&gt;&lt;br /&gt;Preemptive task swapping will be done by using a timer interrupt. &amp;nbsp;Thus, for a preemptive task swap, the situation is quasi-identical, with _svc above replaced by _irq. &amp;nbsp;The only difference is that the address stored to LR_irq is actually 4 bytes on from where we want to restart, so we need to remember to take that into account.&lt;br /&gt;&lt;br /&gt;So. &amp;nbsp;How do we go about saving our information to the user stack? &amp;nbsp;This is made quite easy by the fact that user mode and system mode share the same registers, stack pointer included. &amp;nbsp;So we need to save the svc/irq mode stuff (LR, SPSR) into the system mode stack, then swap into system mode and save the rest. 
&amp;nbsp;The first bit is covered by one instruction, which might have been made for the task:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;srs (Store Return State) - store LR and SPSR of the current mode onto the stack of a specified mode&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;and, of course, its "twin"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;rfe (Return From Exception) - load PC and CPSR from address&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;So, the preamble for the svc handler is as follows:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;FUNC&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;_svc_handler&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;srsdb&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!, #SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* save LR_svc and SPSR_svc to system mode stack */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsid&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;i,#SYS_MODE &amp;nbsp; &amp;nbsp; /* go sys mode, interrupts disabled */&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save registers */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;and&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, sp, #4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* align the stack and save adjustment with LR_user */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;and for an irq:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;FUNC _irq_handler&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;lr, lr, #4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save adjusted LR_IRQ */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;srsdb&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!, #SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* save LR_irq and SPSR_irq to system mode stack */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsid&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;i,#SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Go to system mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Save registers */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;and&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, sp, #4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* align the stack and save adjustment with LR_user */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;push&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Note that the only difference is that the irq handler adjusts the return address (as we want to go back to the interrupted instruction, not the next one).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Given that we are now *always* in system mode, exiting from either handler is identical:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;{r0, lr}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* restore LR_user and readjust stack */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, sp, r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; 
&lt;/span&gt;{r0-r12}&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* and other registers */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;rfeia&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp!&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* before returning */&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'll move on to how to implement the guts of the two handlers in the next post. &amp;nbsp;But before we go, here's how we set up a task in "C" land. &lt;br /&gt;&lt;br /&gt;Remember, when we switch into a task, we will be pulling a stored process state from the task's stack into the registers, and then restoring the PC and CPSR, also from the task's stack. &amp;nbsp;Setting up a task, then, involves&amp;nbsp;"faking" the preamble of the exception handlers above. &amp;nbsp;Note the use of exit_fn as the value of LR_usr: this is where we go when a task dies, and it is used to clean up after task exit. &amp;nbsp;Note also the use of 'entry' (the address of the task's entry function) as the value of LR_svc/irq, which will be used as the "return" address from the exception handler. &lt;br /&gt;&lt;br /&gt;The use of exit_fn means we can schedule "one-shot" tasks (which not all realtime OSs can do). 
&amp;nbsp;Hooray for us.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; // Allocate stack space and the actual object&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task_t * task = malloc( sizeof(task_t) );&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; void * stack = malloc( stack_size * 4 );&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; unsigned int * sp = (unsigned int *)stack + stack_size; // top of stack (stack_size is in words)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task-&amp;gt;stack_top = stack;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task-&amp;gt;priority = priority;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task-&amp;gt;state = TASK_SLEEPING;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task-&amp;gt;id = (unsigned int)task &amp;amp; 0x03ffffff;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x00000010; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // CPSR (user mode with interrupts enabled)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = (unsigned int)entry; &amp;nbsp;// 'return' address (i.e. 
where we come in)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x0c0c0c0c; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r12&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x0b0b0b0b; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r11&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x0a0a0a0a; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r10&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x09090909; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r9&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x08080808; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r8&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x07070707; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r7&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x06060606; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r6&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x05050505; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r5&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x04040404; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r4&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;&amp;nbsp; *(--sp) = 0x03030303; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r3&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x02020202; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = 0x01010101; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // r1&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; *(--sp) = (unsigned int) env; &amp;nbsp; &amp;nbsp; // r0, i.e. arg to entry function&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; if ((unsigned int)sp &amp;amp; 0x07) {&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; *(--sp) = 0xdeadc0de; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // Stack filler&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; *(--sp) = (unsigned int)exit_fn; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;// lr, where we go on exit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; *(--sp) = 0x00000004; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // Stack Adjust&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; } else {&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; *(--sp) = (unsigned int)exit_fn; &amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;// lr, where we go on exit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; *(--sp) = 0x00000000; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; // Stack Adjust&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; }&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task-&amp;gt;stack_pointer = sp;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-1903850706476505846?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/1903850706476505846/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2012/01/developing-multitasking-os-for-arm-part.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/1903850706476505846'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/1903850706476505846'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2012/01/developing-multitasking-os-for-arm-part.html' title='Developing a multitasking OS for ARM part 3 1/2'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-124372955269158907</id><published>2011-12-14T10:49:00.002+01:00</published><updated>2012-01-31T12:52:04.288+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='arm'/><category scheme='http://www.blogger.com/atom/ns#' term='os'/><category scheme='http://www.blogger.com/atom/ns#' term='raspberry pi'/><title type='text'>Developing a multitasking OS for ARM part 3</title><content type='html'>Okay. &amp;nbsp;We're almost done with the big bits of scary assembler. &amp;nbsp;Indeed, this post is almost totally assembler free, and will deal with some C functions and definitions we need for later.&lt;br /&gt;&lt;br /&gt;We want to do something useful with what we have at the moment. &amp;nbsp;We could simply implement "something useful" as a C language routine called &lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;c_entry()&lt;/span&gt;, which would run in SVC mode with interrupts off. &amp;nbsp;In some cases, that would be sufficient. &amp;nbsp;But it would hardly count as an OS, let alone a multitasking one.&lt;br /&gt;&lt;br /&gt;So, let's look at what we want to do, and make some definitions. &amp;nbsp;We want tasks that run in an unprivileged mode (i.e. user mode) and that are either preemptively swapped out by the OS in order to run another task, or periodically yield control of the processor to another task. &amp;nbsp;They must be able to terminate. &amp;nbsp;For the moment, we won't worry too much about protected memory spaces, IPC, or any of that jazz (which complicate matters, but which will come later).&lt;br /&gt;&lt;br /&gt;A task must have its own inviolate set of registers, and its own stack. 
&amp;nbsp;It must also have some other information - entry point, state, and potentially priority. &amp;nbsp;My implementation is based on scheme, so a task must also have an environment, but that's not absolutely necessary.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* function pointer type returning void and taking a pointer to an environment */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;typedef void(*task_entry_point_t)(void * environment);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* potential task states */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;typedef enum {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; TASK_RUNNABLE,&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; TASK_SLEEPING,&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;} task_state_t;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* And the task itself */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;typedef struct {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; void * stack_top; &amp;nbsp; &amp;nbsp; /* limit of the task stack */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; void * stack_pointer; /* current stack pointer &amp;nbsp; */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; 
uint32_t priority:5; &amp;nbsp;/* priority, 0-31 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;*/&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; uint32_t state:1; &amp;nbsp; &amp;nbsp; /* state, task_state_t &amp;nbsp; &amp;nbsp; */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; uint32_t id:26; &amp;nbsp; &amp;nbsp; &amp;nbsp; /* task id &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;} task_t;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Obviously, we need to know what the current task is, and have a list of other tasks that might want to run. &amp;nbsp;This is not identical to my code, as tasks are actually scheme objects, but you get the idea.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* linked list structure */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;typedef struct task_list {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; task_t * _car;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; struct task_list * _cdr;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;} task_list_t;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* some LISPy primitives */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define CAR(x) (x)-&amp;gt;_car&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define CDR(x) (x)-&amp;gt;_cdr&lt;/span&gt;&lt;br /&gt;&lt;br 
/&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define SET_CAR(x,y) CAR(x)=(y)&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define SET_CDR(x,y) CDR(x)=(y)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* We'll need these later */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define nil &amp;nbsp;(task_list_t*)0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;#define skip (task_t*)0&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* And the bits we need for the actual lists */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;task_t * __current_task;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;task_list_t * __priority_lists[32]; /* one list per priority, 0-31 */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Now, the approach we'll be taking to multitasking is this:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Each task is created with a priority, and placed accordingly in one of the priority lists. &amp;nbsp;Every time we need to find a task, we go through the priority lists, starting at zero and ending at 31. &amp;nbsp;We look at each element of a list in turn by removing it from the head of the list and then grafting it onto the end of the list. 
&amp;nbsp;This way we round-robin schedule within each priority. &amp;nbsp;Only "runnable" tasks get scheduled, obviously. &amp;nbsp;If the task is actually the placeholder "skip", we will skip on to the next lower priority. This way, all tasks eventually get a bite of CPU, with high priority tasks getting vastly more than the low priority ones.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Obviously, the initial setup of the lists is critical, and should be done in &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;c_entry&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;task_t * __sleep_task;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;for (int i = 0; i &amp;lt; 31; i++) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; __priority_lists[i] = (task_list_t *)malloc(sizeof(task_list_t));&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; SET_CAR(__priority_lists[i], skip);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; SET_CDR(__priority_lists[i], nil);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__priority_lists[31] 
=&amp;nbsp;&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;(task_list_t *)&lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;malloc(sizeof(task_list_t));&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;SET_CAR(__priority_lists[31], __sleep_task);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;SET_CDR(__priority_lists[31], nil);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;The astute amongst you will notice the use of the C library function &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;malloc()&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; in there, despite our not having a C library. &amp;nbsp;Don't worry about it. &amp;nbsp;It'll come later. Do worry about me not checking for errors :)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Note that priority 31 has *no* 'skip' entry, and points to a real task. 
&amp;nbsp;This task should not, under any circumstances, be missed out.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Now that's all done, we can find the next runnable task.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* lispish function: destructively append cdr to the end of car */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;task_list_t * nconc(task_list_t * car, task_list_t * cdr) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; if (car == nil) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; return cdr;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; } else {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; task_list_t * x = car;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; while (CDR(x) != nil) x = CDR(x); /* walk to the last cell */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; SET_CDR(x, cdr);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; return car;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;task_t * 
next_runnable_task() {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; for (int i = 0; i &amp;lt; 32; i++) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; while (CAR(__priority_lists[i]) != skip) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; /* rotate the list */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; task_list_t * car = __priority_lists[i]; /* the head cell, not the task in it */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; task_list_t * cdr = CDR(__priority_lists[i]);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; SET_CDR(car, nil);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; __priority_lists[i] = nconc(cdr, car);&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; /* check runnability */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; if (car-&amp;gt;_car-&amp;gt;state == TASK_RUNNABLE)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return car-&amp;gt;_car;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; /* we should never get here, but just in case, eh? 
*/&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; return __sleep_task;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Now, that's all fine and well, but what about setting up tasks and actually making them swap? &amp;nbsp;Ah. &amp;nbsp;That's a bit more complex, and we're gonna have to delve down into assembler again. &amp;nbsp;I'll get into that next time round.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Until then, though, here's a couple of little functions we needed before.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;void * malloc(size_t size) {&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; extern char * __heap_top;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; extern char * __memtop;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; char * prev_heap_top = __heap_top;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; if (__heap_top + size &amp;gt; __memtop) {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; return (void *)0;&lt;/span&gt;&lt;br /&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; __heap_top += size;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; return (void *) prev_heap_top;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;}&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;void sys_sleep() {&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; for(;;){&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; // ARMv6 Wait For Interrupt (WFI)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; uint32_t * reg = 0;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; __asm__ __volatile__ ("MCR p15,0,%[t],c7,c0,4" :: [t] "r" (reg));&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; }&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;}&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/124372955269158907/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_14.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/124372955269158907'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/124372955269158907'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_14.html' title='Developing a multitasking OS for ARM part 3'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-6391315865862740870</id><published>2011-12-11T18:08:00.001+01:00</published><updated>2011-12-14T10:50:43.150+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='arm'/><category scheme='http://www.blogger.com/atom/ns#' term='os'/><category scheme='http://www.blogger.com/atom/ns#' term='raspberry pi'/><title type='text'>Developing a multitasking OS for ARM part 2</title><content type='html'>Okay, kids, gather around and we'll carry on where we left off.&lt;br /&gt;&lt;br /&gt;Now, we have the ARM booting, jumping to a reset handler, and dropping into an endless loop. &amp;nbsp;That's a pretty good start. 
&amp;nbsp;But really, we'd like to do something more - well - how to put this - "more".&lt;br /&gt;&lt;br /&gt;In order to do this, we need to have a bit more understanding about how the ARM itself works.&lt;br /&gt;&lt;br /&gt;If we go to the ARMv7-AR Architecture Reference Manual (which can be had by registering at arm.com, or by downloading a hooky copy off the internets; either approach is feasible, and at least one of them is recommended), we find, in section B1 (the System Level Programmer's Model), a certain amount of interesting reading. &amp;nbsp;Forget the "privilege" aspect for the moment, and let's skip ahead to section B1.3.&lt;br /&gt;&lt;br /&gt;We find that the processor has 8 separate operating modes. &amp;nbsp;These are:&lt;br /&gt;&lt;br /&gt;User Mode,&amp;nbsp;System Mode,&amp;nbsp;Supervisor Mode,&amp;nbsp;Monitor Mode,&amp;nbsp;Abort Mode,&amp;nbsp;Undefined Mode,&amp;nbsp;IRQ Mode, and&amp;nbsp;FIQ Mode&lt;br /&gt;&lt;br /&gt;If we look back at the set of vectors we set up earlier, a lot of these "cross over". &amp;nbsp;So when we drop into the IRQ vector, we will be in IRQ mode. &amp;nbsp;FIQ, FIQ mode. &amp;nbsp;Either of the aborts, Abort mode. &amp;nbsp;Undefined instruction, Undefined mode, and so on. &amp;nbsp;What's interesting is how the machine registers are shared between modes, and particularly the fact that each mode has its own stack pointer, except that System and User modes share one.&lt;br /&gt;&lt;br /&gt;Now, when the ARM starts up, it is in Supervisor (SVC) mode. &amp;nbsp;That's the way it is, and you can't change that. &amp;nbsp;And when it starts up, &lt;i&gt;no stack has been defined&lt;/i&gt;. &amp;nbsp;So you need to be really damned careful in the first bits of the reset code.&lt;br /&gt;&lt;br /&gt;Stacks on the ARM conventionally grow downwards, so the best thing to do generally is to put them at the top of memory. 
&amp;nbsp;As such, a typical reset routine will start by finding out how much memory is available, then setting up stack pointers for each of the operating modes. &amp;nbsp;We're nothing if not typical, so let's look at how we do that.&lt;br /&gt;&lt;br /&gt;First thing - sizing memory. &amp;nbsp;On the Versatile baseboard as emulated by qemu, this is easy. &amp;nbsp;We try writing to a bit of memory, then read it back - if the value stuck, there's memory there; if it didn't, we are above the top of memory. &amp;nbsp;It's not quite so simple on the Pi, as trying to write outside of physical RAM will cause an exception. &amp;nbsp;However, we're going to be a bit clever, and try to kill 2 birds with one stone.&lt;br /&gt;&lt;br /&gt;Firstly, we need to set up some big fat global variables.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__memtop&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__memtop:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;0x00400000&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Start checking memory from 4MB */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__system_ram&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__system_ram:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;0x00000000&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* System memory in MB */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier 
New', Courier, monospace;"&gt;.global&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__heap_start&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__heap_start:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__bss_end__&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Start of the dynamic heap */&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global __heap_top&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__heap_top:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__bss_end__&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Current end of dynamic heap */&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;__bss_end__ is set up by the linker, and it would be much better of me to use that for the initial value of __memtop (rounded up to the nearest megabyte) as well. &amp;nbsp;But hey, I'm lazy. &amp;nbsp;It'll come back to bite me later, I'm sure.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, as the Pi causes an exception on writes outside memory, we need to patch in a handler, temporarily. 
&amp;nbsp;Here's the handler:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* temporary data abort handler that sets r4 to zero */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* this will force the "normal" check to work in the */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* case (as, I believe, on RasPi) where access 'out &amp;nbsp;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* of bounds' causes a page fault &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;temp_abort_handler:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r4, #0x00000000&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;lr, lr, #0x04 /* lr is the faulting instruction + 8, so this resumes at the one after it; #0x08 would retry the fault forever */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;movs&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;pc, lr&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Note how the comment indicates I'm not absolutely sure this will work. 
&amp;nbsp;This is, frankly, because &lt;i&gt;I'm not sure if this will work on a real Pi&lt;/i&gt;, and nobody wants to let me get my hands on one. &amp;nbsp;Still, let's pretend, eh?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* This tries to work out how much memory we have available&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* Should work on both Pi and qemu targets&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;FUNC&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;_size_memory&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* patch in temporary fault handler */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r5, =.Ldaha&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r5, [r5]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r6, [r5]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier 
New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r7, =temp_abort_handler&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r7, [r5]&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;DMB&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r12&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Try and work out how much memory we have */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, .Lmemtop&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, .Lmem_page_size&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, [r1]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, .Lsystem_ram&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.Lmem_check:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, r3, #0x04&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r3]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Try and store a value above current __memtop */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;DMB&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r12&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Data memory barrier, in case */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r4, [r3]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Test if it stored 
*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cmp&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, r4&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Did it work? */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bne&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.Lmem_done&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;add&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, r3, r1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Add block size onto __memtop and try again */&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;b&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;.Lmem_check&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;.Lmem_done:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r0]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* get final memory size */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;lsr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, #0x14&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Get number of megabytes */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r3, [r2]&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* And store it */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* unpatch handlers */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r6, [r5]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;DMB&lt;span class="Apple-tab-span" 
style="white-space: pre;"&gt; &lt;/span&gt;r12&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bx&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;lr&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.Lmemtop:&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;div&gt;.extern __memtop&lt;/div&gt;&lt;div&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__memtop&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;.Lmem_page_size:&lt;/div&gt;&lt;div&gt;.extern __mem_page_size&lt;/div&gt;&lt;div&gt;.word __mem_page_size&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;.Lsystem_ram:&lt;/div&gt;&lt;div&gt;.extern __system_ram&lt;/div&gt;&lt;div&gt;.word __system_ram&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="font-family: Times;"&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.Ldaha:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.extern&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;data_abort_handler_address&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;data_abort_handler_address&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', 
Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;We see a few things here. &amp;nbsp;Firstly, how to patch the handler in and out. &amp;nbsp;Also, that I've got fed up with doing the whole .code 32; .global foo; foo: rigmarole and defined a macro called FUNC. &amp;nbsp;We also see a macro called DMB, which implements the ARMv6 Data Memory Barrier (ARMv7 has a dedicated 'dmb' instruction for that; on ARMv6 we have to go through coprocessor 15). &amp;nbsp;For what it's worth, these are the macros:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.macro FUNC name&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.text&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.code 32&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global \name&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;\name:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.endm&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* Data memory barrier */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;/* pass in a spare register */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;.macro DMB reg&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;\reg, #0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mcr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;p15,0,\reg,c7,c10,5&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Data memory barrier on ARMv6 */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.endm&lt;/span&gt;&lt;/div&gt;&lt;div style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: Times, 'Times New Roman', serif;"&gt;So, we can hopefully now find out how much memory we have, with __memtop containing the actual top of memory and __system_ram containing the number of megabytes in case it's useful to know.&lt;/div&gt;&lt;/div&gt;&lt;div style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: Times, 'Times New Roman', serif;"&gt;So let's look at the start of _reset...&lt;/div&gt;&lt;div style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ MODE_BITS, &amp;nbsp; 0x1F&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* Bit mask for mode bits in CPSR */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ USR_MODE, &amp;nbsp; &amp;nbsp;0x10&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* User mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt;.equ FIQ_MODE, &amp;nbsp; &amp;nbsp;0x11&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* Fast Interrupt Request mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ IRQ_MODE, &amp;nbsp; &amp;nbsp;0x12&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* Interrupt Request mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ SVC_MODE, &amp;nbsp; &amp;nbsp;0x13&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* Supervisor mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ ABT_MODE, &amp;nbsp; &amp;nbsp;0x17&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* Abort mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ UND_MODE, &amp;nbsp; &amp;nbsp;0x1B&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* Undefined Instruction mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.equ SYS_MODE, &amp;nbsp; &amp;nbsp;0x1F&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt; /* System mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;FUNC&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;_reset&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Do any hardware initialisation that absolutely must be done first 
*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* No stack set up at this point - be careful */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, =.Lsize_memory&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, [r0]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cmp&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, #0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;blxne&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Assume that at this point, __memtop and __system_ram are populated */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Let's get on with initialising our stacks */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mrs&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, cpsr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Original PSR value */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, __memtop&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Top of memory */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bic&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, r0, #MODE_BITS&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Clear the mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;orr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r0, r0, #IRQ_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Set IRQ mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;msr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsr_c, r0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Change the mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sp, r1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* End of IRQ_STACK */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Subtract IRQ stack size */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, __irq_stack_size&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, r1, r2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bic &amp;nbsp; &amp;nbsp;r0, r0, #MODE_BITS&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Clear the mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;orr &amp;nbsp; &amp;nbsp;r0, r0, #SYS_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Set SYS mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span 
style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;msr &amp;nbsp; &amp;nbsp;cpsr_c, r0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Change the mode &amp;nbsp; */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov &amp;nbsp; &amp;nbsp;sp, r1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* End of SYS_STACK &amp;nbsp;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Subtract SYS stack size */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, __sys_stack_size&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, r1, r2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bic &amp;nbsp; &amp;nbsp;r0, r0, #MODE_BITS&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Clear the mode bits 
*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;orr &amp;nbsp; &amp;nbsp;r0, r0, #FIQ_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Set FIQ mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;msr &amp;nbsp; &amp;nbsp;cpsr_c, r0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Change the mode &amp;nbsp; */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov &amp;nbsp; &amp;nbsp;sp, r1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* End of FIQ_STACK &amp;nbsp;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Subtract FIQ stack size */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, __fiq_stack_size&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, r1, r2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bic &amp;nbsp; &amp;nbsp;r0, r0, #MODE_BITS&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Clear the mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;orr &amp;nbsp; &amp;nbsp;r0, r0, #SVC_MODE&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Set Supervisor mode bits */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;msr &amp;nbsp; &amp;nbsp;cpsr_c, r0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Change the mode */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov &amp;nbsp; &amp;nbsp;sp, r1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;    &lt;/span&gt;/* End of stack */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* And finally subtract Kernel stack size to get final __memtop */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r2, 
__svc_stack_size&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, r1, r2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;str&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;r1, __memtop&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/*-- Leave core in SVC mode ! */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Zero the memory in the .bss section. 
&amp;nbsp;*/&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov &lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;a2, #0&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Second arg: fill value */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;mov&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;fp, a2&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* Null frame pointer */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;a1, .Lbss_start&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* First arg: start of memory block */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;a3, .Lbss_end&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sub&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;a3, a3, a1&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  
&lt;/span&gt;/* Third arg: length of block */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;bl&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;memset&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr r2, .Lc_entry&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;/* Let C coder have at initialisation */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mov &amp;nbsp; &amp;nbsp; lr, pc&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; bx &amp;nbsp; &amp;nbsp; &amp;nbsp;r2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsie&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;i&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* enable irq */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;cpsie&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;f&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;   &lt;/span&gt;/* and fiq */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', 
Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;/* Initialisation done, sleep */&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;ldr r2, .Lsleep&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; mov &amp;nbsp; &amp;nbsp; lr, pc&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; bx &amp;nbsp; &amp;nbsp; &amp;nbsp;r2&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;div&gt;&lt;div&gt;.Lbss_start:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__bss_start__&lt;/div&gt;&lt;div&gt;.Lbss_end:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;__bss_end__&lt;/div&gt;.Lc_entry:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;c_entry&lt;/div&gt;&lt;div&gt;.Lsleep:&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;  &lt;/span&gt;.word&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;sys_sleep&lt;/div&gt;&lt;div&gt;&lt;br 
/&gt;&lt;/div&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Note the use of &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;msr cpsr_c, rx&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; - this is how we change mode. &amp;nbsp;We can change mode this way from any mode &lt;i&gt;except&lt;/i&gt; user mode. &amp;nbsp;Luckily, the user mode stack pointer is shared with system mode, so we don't need to drop into user mode at all. &amp;nbsp;So we go off, find how much memory we have, then for each of the operating modes we care about (IRQ, system, FIQ and supervisor), we set up a stack pointer. &amp;nbsp;We then use a pre-written implementation of &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;memset()&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt; to zero out the .bss section, let the C code have a go at initialising its stuff via &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;c_entry()&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;, turn on interrupts, and go to sleep via &lt;/span&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;sys_sleep()&lt;/span&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Next up, how we go about doing useful work...&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-6391315865862740870?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link 
rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/6391315865862740870/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_11.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/6391315865862740870'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/6391315865862740870'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part_11.html' title='Developing a multitasking OS for ARM part 2'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-5204243276321826045</id><published>2011-12-07T10:11:00.001+01:00</published><updated>2011-12-07T10:54:03.390+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='assembler'/><category scheme='http://www.blogger.com/atom/ns#' term='arm'/><category scheme='http://www.blogger.com/atom/ns#' term='os'/><category scheme='http://www.blogger.com/atom/ns#' term='raspberry pi'/><title type='text'>Developing a multitasking OS for ARM part 1</title><content type='html'>My Scheme OS for the &lt;a href="http://www.raspberrypi.org/"&gt;Raspberry Pi SBC&lt;/a&gt; is coming along nicely (code at https://gitorious.org/lambdapi), and a couple of comments on the Raspberry Pi forum kinda kicked me into actually documenting some of the process of what I've been doing. 
&amp;nbsp;So, here goes.&lt;br /&gt;&lt;br /&gt;Firstly, the toolset.&lt;br /&gt;&lt;br /&gt;The first thing we'll be needing is development tools. &amp;nbsp;Yeah, there's "off the peg" toolsets available, but I wanted to be up at the bleeding edge. &amp;nbsp;So, off to GNU's site, and let's get cracking.&lt;br /&gt;&lt;br /&gt;I built and installed the latest versions of binutils (which includes the assembler and linker), gcc, g++, newlib and gdb, all for target arm-none-eabi. &amp;nbsp;If you want to know how to do this, googling "arm bare metal" should elucidate. &amp;nbsp;Otherwise, there's always codesourcery.&lt;br /&gt;&lt;br /&gt;Now, booting. &amp;nbsp;Obviously, the first thing we need to do is boot the board. &amp;nbsp;In my case, it's very uncomplicated. &amp;nbsp;No first-stage booters, no relocating stuff from flash, just a bunch of RAM that your binary gets loaded into, starting at address 0x00000000. &amp;nbsp;Easy peasy.&lt;br /&gt;&lt;br /&gt;So. &amp;nbsp;How does ARM (specifically, the ARM1176JZF-S processor on the Raspberry Pi) boot? &amp;nbsp;Well, there's chapter and verse on the &lt;a href="http://infocenter.arm.com/"&gt;ARM site&lt;/a&gt;, but here's the TL;DR version.&lt;br /&gt;&lt;br /&gt;&lt;i&gt;When the ARM powers on, it executes ARM (32 bit) instructions starting from address 0x00000000. &amp;nbsp;&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Simples, right? &amp;nbsp;Well, not quite. &amp;nbsp;Address 0x00000000 is the start of what's known as the exception vector table, which contains 4 bytes (one word) for each of 8 potential exceptions. &amp;nbsp;One word is enough to store a single instruction: either a branch, or an instruction to load an address from memory into the program counter. 
&amp;nbsp;So the simplest vector table would look like this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.section .reset&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.code 32&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global __reset&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__reset:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _reset &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; @ Power on reset&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _undef &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; @ Undefined instruction&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _swi &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; @ Software interrupt&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _prefetch_abort &amp;nbsp;@ Prefetch Abort&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _data_abort &amp;nbsp; &amp;nbsp; &amp;nbsp;@ Data abort&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b . &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;@ Unused&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _irq &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; @ IRQ&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b _fiq &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; @ "fast interrupt"&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;And that would be fine. 
&amp;nbsp;However, that's not how it's normally done, mainly because it's impossible, with this setup, to change the vectors on the fly. &amp;nbsp;So what we do is this:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.section .reset, "ax"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.code 32&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global __reset&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;__reset:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; ldr &amp;nbsp;pc, _reset_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; ldr &amp;nbsp;pc, _undef_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; ldr &amp;nbsp;pc, _swi_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; ldr &amp;nbsp;pc,&amp;nbsp;_prefetch_abort_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; ldr &amp;nbsp;pc,&amp;nbsp;_data_abort_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b &amp;nbsp; &amp;nbsp;.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; ldr &amp;nbsp;pc,&amp;nbsp;_irq_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _fiq&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_fiq:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; @ Fast interrupt handler starts here&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', 
Courier, monospace;"&gt;&amp;nbsp; b &amp;nbsp; &amp;nbsp;_no_handler&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _no_handler&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_no_handler:&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b &amp;nbsp; &amp;nbsp;_no_handler&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_reset_address: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.word _reset&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_undef_address: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.word _undef&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_swi_address: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.word _swi&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_prefetch_abort_address: .word _prefetch_abort&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_data_abort_address: &amp;nbsp; &amp;nbsp; .word _data_abort&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_irq_address: &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;.word _irq&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _reset_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, 
monospace;"&gt;.global&amp;nbsp;_undef_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&amp;nbsp;_swi_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&amp;nbsp;_prefetch_abort_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&amp;nbsp;_data_abort_address&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global&amp;nbsp;_irq_address&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.weak _undef&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.set _undef, _no_handler&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.weak _prefetch_abort&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.set _prefetch_abort, _no_handler&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.weak _data_abort&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.set _data_abort, _no_handler&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _reset&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_reset:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b &amp;nbsp; &amp;nbsp; 
_no_handler&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _swi&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_swi:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b &amp;nbsp; &amp;nbsp; _no_handler&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;.global _irq&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;_irq:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp; b &amp;nbsp; &amp;nbsp; _no_handler&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;That's loads bigger, but what does it change, exactly?&lt;br /&gt;&lt;br /&gt;The "fast interrupt" code gets to miss an indirection, so it's faster. &amp;nbsp;We simply start the interrupt handler directly at the end of the vector table. &amp;nbsp;I'm not actually doing this at the moment, but it's possible.&lt;br /&gt;&lt;br /&gt;The other exceptions load their address from an indirection table, so we can repatch them on the fly.&lt;br /&gt;&lt;br /&gt;We have a "generic" handler for unhandled exceptions. &amp;nbsp;The way that gets patched in is to do with the linker. &amp;nbsp;A .weak directive for a symbol will allow us to simply not define a symbol in our code, and the linker will replace it with zero instead of barfing. 
&amp;nbsp;The .set directive enables us to use a default other than zero. &amp;nbsp;Thus, any of the _undef, _prefetch_abort or _data_abort entry points (in the code above) will redirect to _no_handler unless we define those entry points elsewhere. &amp;nbsp;This is a trick we'll use again later. &amp;nbsp;Note that _reset, _swi and _irq have no weak defaults, and thus must be defined explicitly (for the moment, I've defined them to simply jump to _no_handler).&lt;br /&gt;&lt;br /&gt;All we need to do is assemble that and link it to load at 0x00000000, and we have a booter. &amp;nbsp;It will do bugger all, but it will work.</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/5204243276321826045/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/5204243276321826045'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/5204243276321826045'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/12/developing-multitasking-os-for-arm-part.html' title='Developing a multitasking OS for ARM part 1'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-8011543994957468212</id><published>2011-10-19T21:29:00.000+02:00</published><updated>2011-10-20T10:42:43.618+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp'/><title type='text'>Garbage collection for an SD/MMC-based persistent store</title><content type='html'>&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;My Lisp/Scheme based OS is moving on apace.  I've actually got a basic OS running on STM32, but it's feeling a bit cramped, and there's a new game in town - &lt;a href="http://www.raspberrypi.org/"&gt;Raspberry Pi&lt;/a&gt;.  Raspberry Pi (or, to shorten it a bit, RasPi) is a credit-card sized ARM11 based machine intended for educational use.  And it's cheap.  Really, really cheap.  There are a few oddities to it; it's actually based on a Broadcom SoC that's, in reality, a top whack GPU with a sorta old-hat ARM core bolted on as a bit of an afterthought.  As such, it boots "oddly", GPU first and then ARM, but hey.  The board itself has no onboard flash; persistent storage is SD card only (or, I guess, USB mass storage).&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;As could probably be guessed, it's coming with Linux on board, but hey, who cares about Linux?&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Never one to make things easy on myself, I've expanded the scope of my OS.  
The Lisp/Scheme core is still there, the concept of syscalls as closures, all that good stuff.  But I keep coming back to the Newton.  Ah, my beloved NewtonOS.  A totally persistent OS, one that takes power failures and years sitting in a drawer in its stride, a system where you whack new batteries in and you're back &lt;span style="font-style: italic;"&gt;exactly&lt;/span&gt; where you were almost the moment you've let go of the power toggle switch.  This is exactly what &lt;a href="http://www.loper-os.org/"&gt;Stanislav&lt;/a&gt; is talking about in his &lt;a href="http://www.loper-os.org/?p=231"&gt;third law of sane computing&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;So.  Persistence, then.  How's that gonna work?  &lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;i&gt;Before we start, I should note that the following technical stuff is nothing to do with wear levelling (which task is carried out quite adequately by the onboard controller of the SD card).&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;We have a quantity of RAM (256MB in the case of the Pi) which we want to use as a read / write cache for a larger quantity of persistent storage. 
&amp;nbsp;In this case, Flash memory accessed through an SD/MMC card interface.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;We could take a naïve approach, and simply page stuff in and out, but there is a significant performance overhead, particularly with flash memory. &amp;nbsp;Not only are writes to flash slow, but you can only write to a block once, and then it needs to be erased before you write again. &amp;nbsp;And erasing is &lt;i&gt;really&lt;/i&gt; slow.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Any given SD card is split into a number of sectors (typically between 512 bytes and 2K long). &amp;nbsp;A sector is the smallest unit we can read. &amp;nbsp;Those sectors are grouped into erase groups (typically in the 128KB - 2MB range). &amp;nbsp;An erase group is the smallest unit we can erase. &amp;nbsp;We can write to a sector, but if we try to write to it twice, the entire erase group has to be erased and the old contents (plus the new contents of our sector) written back. &amp;nbsp;This is a positively &lt;i&gt;glacial&lt;/i&gt; operation.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;If we take a sane approach, we block everything at the erase group size, and control when (and how) we write back to the card. 
&amp;nbsp;If we take the approach of writing data back to a different erase group, we can put off the pain of erasing the old one, and do it at our (or the card's) leisure.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Now, as we don't know exactly where an erase group will be paged into memory (and we can't restrict it to a specific location, as the card storage is potentially bigger than real memory), we have to use an indirect addressing method, using a "logical erase group number" (which can't be the actual erase group number, as that will change underneath us) and an index or address into the block of memory. &amp;nbsp;This adds some overhead to pointer dereferencing, of course, but it's mainly done using shifts and masks; dropping to assembler to do this should make it at least tolerable.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;The bigger pain is garbage collection. &amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Lisp and Scheme are garbage collected languages, and we need a garbage collector that doesn't hurt us. 
&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;At this point, I should probably set out some requirements for garbage collection:&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;It should reclaim as much memory as possible (duh)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;It should not require paging in gigabytes of memory (that costs potentially enormous amounts of time).&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;It should never (if possible) force data to be written back to storage (which would be worse than paging in).&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;It should (if possible) be concurrent with application usage.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;It should have a bounded worst case time per cycle (Java, I'm looking at you)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" 
style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;A traditional garbage collector is gonna hurt us plenty. &amp;nbsp;Why is that?&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;The way most garbage collectors work is by tracing from a group of accessible objects to find all objects in the heap that are addressable. &amp;nbsp;These addressable objects are then copied somewhere "safe", and the rest thrown away. &amp;nbsp;That's pretty much the gist of it (there are various complications, but the reader is referred to &lt;a href="http://www.amazon.fr/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484"&gt;the relevant literature&lt;/a&gt; for further detail).&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Now, that whole "copying" thing means that, for every garbage collection cycle, the erase group(s) containing the "live" objects will have to be written back to the SD card, whether the objects themselves have changed or not. &amp;nbsp;Ouch.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;But surely, you say, there exists a way of collecting garbage &lt;i&gt;without&lt;/i&gt; moving objects around? As it happens, dear reader, there does. &amp;nbsp;&lt;a href="http://www.pipeline.com/~hbaker1/NoMotionGC.html"&gt;H.G. 
Baker's "Treadmill" non-copying garbage collector&lt;/a&gt; is a particularly good (and easy to code) example. &amp;nbsp;It comes with a certain amount of hurt itself, but it gets rid of the copying requirement, and that's a good start.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;The main hurt of Baker's treadmill is the requirement to hold every object in a doubly-linked list. &amp;nbsp;For a Lisp implementation where objects generally look like this:&lt;/span&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;struct cell {&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;uint32_t car;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;uint32_t cdr;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;};&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;adding in the doubly-linked list produces this, which (with 32 bit pointers) has twice the storage requirement:&lt;/span&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;struct cell {&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;struct cell * next;&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;struct cell * prev;&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;uint32_t car;&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;uint32_t cdr;&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;};&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Ouch, again. &amp;nbsp;But hey, memory's cheap, and we're doing this in order to use RAM as a window on the bigger, persistent store - not really an issue. 
&amp;nbsp;It should be noted that the structure above is only indicative; the links don't need to be persisted and can be kept separate from the actual cell contents. &amp;nbsp;The overhead, of course, remains the same.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;The other hurt is not to do with Baker's algorithm, but is more general - GC tends to want to access &lt;i&gt;all&lt;/i&gt; your heap. &amp;nbsp;And when your heap is potentially orders of magnitude larger than real memory, that's gonna cause a load of thrashing.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;So I started with the idea of not GCing non-swapped-in erase groups. &amp;nbsp;Do your garbage collection on what you have in memory at the time, don't try and pull in any non-swapped-in erase groups, and leave it at that - the other erase groups will get GCed when they eventually get swapped back in. &amp;nbsp;Great, said I, coded about half of it up, and went to bed. &amp;nbsp;And then woke up in the middle of the night.&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;i&gt;"What about objects in a swapped-in erase group which are only pointed to by objects in erase groups that are not swapped in? 
&amp;nbsp;How the hell do I distinguish them from free objects if I don't know where they come from, or even that they are being pointed to?"&lt;/i&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;Ah, &lt;b&gt;tits!&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Yep, if we're doing things in a naïve way, but not swapping in every erase group as we traverse, we're gonna lose objects, and corrupt structures that are partially swapped in.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;We need to get clever.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Firstly, let's stick with the idea of erase groups, and not garbage collecting the "world". &amp;nbsp;Indeed, let's not even try collecting our entire heap; we can restrict ourselves to a single erase group, or even a single sector if we like. &amp;nbsp;All we have to do is decide not to follow links outside the (relatively) little block of memory we're looking at. &amp;nbsp;We take our set of system roots and, for every one that's in our block, trace its links until we get to leaves or a link outside the block. &amp;nbsp;The rest is totally normal as per Baker's algorithm. 
&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;By restricting ourselves to a single block of memory, we can obviate the need to have a full doubly-linked list for our entire heap; all we need is a doubly linked list for the block we're currently collecting, and that can be generated trivially at the beginning of the GC. &amp;nbsp;This buggers the simplicity of allocating under Baker (which is a dereference and pointer increment), but rather than holding a full list all we need to hold is a bitmap.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Still, that hasn't solved the issue of links into our collected block.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;As it turns out, the solution to that is incredibly simple, and falls back to one of the oldest forms of garbage collection - reference counting. &amp;nbsp;Along with our cell data, we carry a reference count variable, which counts the number of times a cell is referenced by cells outside the block. 
&amp;nbsp;It looks a bit like this:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;code&gt;struct cell {&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp;uint32_t car;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;uint32_t cdr;&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;nbsp;uint16_t refcount;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;};&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;If we create a cross-block link, we increment the reference count; if we destroy one (either by explicitly removing the link in code, or by freeing the referencing object during a garbage collection), we decrement it. &amp;nbsp;As part of building our list of root objects, we scan the block for those objects with a non-zero refcount, and push those directly into the list of roots.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Now, maintaining refcounts is a pain in the ass, but with the granularity of allocation blocks we're talking about here (between 128KB and 2MB, remember), the number of cross-block references is liable to be very small. &amp;nbsp;We can help this by preferring allocation of new objects in the block their "owner" inhabits (which will also help reduce object scattering, and thus paging issues).&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;So. 
&amp;nbsp;What does the solution look like?&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Firstly, we have some basic types:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;typedef struct cell {&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; uint32_t car;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; uint32_t cdr;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;} cell_t;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;typedef struct reference_count {&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; uint16_t colour:1; // indicator for gc, see Baker&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; uint16_t count:15;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;} reference_count_t;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Some calculations:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;// Every erase group must store :&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;// n cells (8 bytes each),&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp;n 
refcounts (2 bytes each)&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;// and n / 32 free cell bitmap words (4 bytes each), i.e. 64 + 16 + 1 bits per cell&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;#define CELLS_PER_BLOCK ((erase_group_size * 8) / (64 + 16 + 1))&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Therefore cell storage within an erase group looks like this (it's not actually defined like this, as CELLS_PER_BLOCK is dynamically calculated based on actual card characteristics):&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;cell_t &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;cells[CELLS_PER_BLOCK];&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;reference_count_t refcounts[CELLS_PER_BLOCK];&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;uint32_t &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;bitmaps[(CELLS_PER_BLOCK + 31) / 32]; // one "free" bit per cell&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;The garbage collection algorithm is thus (using a single, transient, doubly linked list for CELLS_PER_BLOCK elements):&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;- For each cell in the erase group&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; - if cell is in set of root objects, or has a refcount != 0, add to "grey" objects&lt;/span&gt;&lt;br /&gt;&lt;span 
class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; - else add to "ecru" objects&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;- Do Baker's treadmill algorithm, restricting ourselves to intra-block links&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;- For each free object&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; - set "free" indicator in bitmaps&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; - if object has extra-group link(s)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; &amp;nbsp; - decrement refcount(s) on externally referenced cells (may require paging)&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; &amp;nbsp; - reinitialize to NIL . NIL or similar&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&amp;nbsp; &amp;nbsp; - flag group as "dirty"&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;And that's about it. &amp;nbsp;Allocating involves finding a free cell in the relevant erase group (made relatively easy using ARM's CLZ instruction), and assignment has to check for cross-erase-group links.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;Note that the only time garbage collection will require the erase group to be flushed back to storage is in the (rare) case where we free a cell that has extra-group links. 
&amp;nbsp;The "free" bitmap may be out of date on loading a group from flash, but it will never indicate that a non-free cell is free (allocation of a cell will update the bitmap to indicate the non-free status of that cell, and require a flush). &amp;nbsp;The bitmap will be updated on the first gc cycle for the group, at which point it should be up-to-date.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;As an added bonus, as long as everything is persisted, suspending a process is simply a case of incrementing the reference counts of all its root objects and stopping the process. &amp;nbsp;Starting it again is decrementing those counts, and you're back to &lt;i&gt;exactly where you were.&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/i&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;How d'ya like them apples?&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-8011543994957468212?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/8011543994957468212/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/10/garbage-collection-for-sdmmc-based.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/8011543994957468212'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/9184145065528610199/posts/default/8011543994957468212'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2011/10/garbage-collection-for-sdmmc-based.html' title='Garbage collection for an SD/MMC-based persistent store'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-7662688938341363545</id><published>2010-12-17T18:01:00.000+01:00</published><updated>2011-12-07T20:02:52.783+01:00</updated><title type='text'>TiVo^H^H^H^HAndroidization</title><content type='html'>A few years back there was a big hoo-hah in the open source world, and it was at least partially responsible for the creation of the GPLv3.  That hoo-hah was the &lt;i&gt;TiVoization&lt;/i&gt; of the Linux kernel.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In short, what happened was this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;TiVo, inc. created their famous PVR device, which took the world (especially the geek world) by storm.  At the heart of the device was a bunch of GPL software, including the Linux kernel.  This latter fact was, IMO, a large part of &lt;i&gt;why&lt;/i&gt; TiVo took the geek world by storm, but that's another argument.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;TiVo did what any good company using GPLed software should do - they delivered the source of their modifications to the GPLed software, and kept their non-GPLed software scrupulously away from the GPLed stuff.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So far, so good, right?  
Wrong.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Whilst it was possible for anyone to recompile their own version of the TiVo firmware, it was &lt;i&gt;not&lt;/i&gt; possible to flash that recompiled firmware onto a TiVo branded device.  The firmware images used to flash the TiVo were cryptographically signed, and the means to do that was not public. There were, of course, good reasons for this, not least of which was that the media companies would have, legally speaking, shat upon TiVo, inc. from a great height if it were not so.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There was much oohing and aahing from the GPL advocates over this, and it was generally decided that this was somehow wrong, and it was going to kill the GPL, and other such crap.  And thus was born the GPLv3, which "protected" not only the software, but also the hardware it was designed to run on.  Linus and the other kernel developers, who saw little wrong with what TiVo had done, decided to stick with the old GPLv2 anyway, and to hell with V3. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Fast forward a few years.  The Linux kernel has not died.  The GPL has not died.  But now there &lt;i&gt;is&lt;/i&gt; a threat to the GPL, and it is attacking both v2 and v3.  It's a real and present danger, and it's being ignored (at best) or cheered on (at worst).  That threat has a name.  Its name is &lt;b&gt;&lt;i&gt;Android&lt;/i&gt;&lt;/b&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's not Android as such that is the threat, but the market in which it is being used - mobile devices.  An open system is anathema to the hermetically closed world of mobile phones and telecoms carriers. &amp;nbsp;Android is being claimed as the saviour of the open source world from the big bad ogres at Apple and MS. &amp;nbsp;I believe this claim is wrong.&lt;br /&gt;&lt;br /&gt;Even Google aren't being the open source heroes they are claimed to be. 
&amp;nbsp;The (non-GPL parts of the) source for Android 4.x (Ice Cream Sandwich) was released months after the first devices hit the street, and I'm not sure if the 3.x series has ever been made public.&lt;br /&gt;&lt;br /&gt;Telephony providers are absolutely against people reflashing their devices (or even rooting them), and many mobiles are every bit as TiVoised as the original TiVo. &amp;nbsp;And that inability to reflash is being used as a marketing device by the providers / manufacturers. &amp;nbsp;"Benefit from Android x.x", they trumpet, "get yourself a new contract with handset / tablet X.1", quietly ignoring the fact that handset / tablet X.0 could quite happily run Android x.x should they bother to take a little time to provide a firmware upgrade. &amp;nbsp;And yet Apple are "evil", and "push you to consume", although they have a policy of supporting hardware for a decent amount of time. &amp;nbsp;No, Apple are evil because they are "closed source", and to avoid that we'll happily get fucked up the arse by someone waving a GPL banner.&lt;br /&gt;&lt;br /&gt;But the worst, the absolute worst, of the lot, are the Chinese manufacturers of cheap Android enabled devices, motherboards, and chipsets. &amp;nbsp;It is absolutely impossible to get them to release the slightest piece of source code, despite the fact they are obliged to do so.&lt;br /&gt;&lt;br /&gt;Android is nothing more than a system for pushing ads to your mobile device. &amp;nbsp;It's nothing to do with freedom, unless you're talking about Google's freedom to rape your private data. &amp;nbsp;The telephony providers are using it because it costs them jack shit. &amp;nbsp;Nothing to do with consumer benefit, nothing to do with freedom, simple bottom line accounting. 
&amp;nbsp;The Chinese manufacturers are using it for the same reason, and because MS have got harder on hooky copies of WinCE.&lt;br /&gt;&lt;br /&gt;None of them give a flying fuck about "freedom", but between them, they may bring the GPL down.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-7662688938341363545?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/7662688938341363545/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/12/tivohhhhandroidization.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/7662688938341363545'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/7662688938341363545'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/12/tivohhhhandroidization.html' title='TiVo^H^H^H^HAndroidization'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-4057874477659723177</id><published>2010-12-17T14:41:00.000+01:00</published><updated>2010-12-17T15:10:15.512+01:00</updated><title type='text'>How good is your hashing?</title><content type='html'>&lt;div&gt;As promised in my last post, I'm working on an embedded lisp-based OS.  Yeah, &lt;i&gt;another&lt;/i&gt; one.  
Because what the world really needs is yet another geek's idea of what an OS should look like.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Well, fuck the world, I'm scratching an itch.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, if I were being sensible, I'd target something usable, like my Wits A81 tablet device.  A little lispy tablet could be really neat (and, indeed, it may well end up being so).  However, I decided to start a bit smaller.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A lot smaller, actually.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yep, a lisp-based OS on a microcontroller.  Not, I might add, the stm8s this blog originally started with (and is titled as), but its bigger brother, the stm32.  It's an ARM Cortex-M3 processor, and the one I have in hand has 128K of flash and 8K of RAM, which is quite nifty.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, anything that's gonna fit into that little space is gonna have to be tight.  Sure, I can push as much code as possible into flash and execute from there, but that 8K is a really tight limit for something like a Lisp, even a cut down one.  And so, I started thinking about what takes up lots of space in lisp.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Firstly, there's objects.  A naive implementation of boxed objects in Lisp means that even the humblest character takes up a significant amount of space, with 24 bytes not being unheard-of.  Well, bollocks to that.  I'm being as tight as I can.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once you've squeezed stuff like characters and numbers and so on down, though, you start looking at what you can get shot of.  Symbols, for example.  They are used as unique identifiers, and are generally stored as a hash value and a literal string.  
Now, the hash value is what's used most of the time, and the string is only really used when "exploding" the symbol out into an array of characters.  So if I'm willing not to do that, I can get away with *just* holding the hash.  That's a significant saving.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So.  What hash to use?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I need something that can be pushed into a single word of space (4 bytes), and which will allow me to detect collisions.  By which I mean that if "a" and "b" resolve to the same hash value, I need to be able to tell.  The first bit is easy enough, but the second is hard as nails.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A google for hashing algorithms will generally end up pointing you to Dan Bernstein's venerable djb2.  Unfortunately, it's not really very good, and it certainly doesn't handle the second case.  For that, what I need is an algorithm that pushes out *two* hash values, a primary and a secondary.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Enter &lt;a href="http://burtleburtle.net/bob/c/lookup3.c"&gt;Bob Jenkins' 'lookup3.c', particularly 'hashlittle2()'&lt;/a&gt;.  This is a rather nice hashing algorithm that throws out a pair of hash values, has very good characteristics, and runs fast.&lt;/div&gt;&lt;div&gt;But, oh noes!  Compiling it for the STM32 results in over 1K of code.  Surely we can do better than that?  
Yes, we can.&lt;/div&gt;&lt;br /&gt;&lt;code&gt;.global hash&lt;br /&gt;.type   hash, %function&lt;br /&gt;.align 2&lt;br /&gt;@ Hash Function&lt;br /&gt;@ Thumb-2 implementation of 'hashlittle2' by Bob Jenkins.&lt;br /&gt;@ In :  R0 -&gt; pointer to string&lt;br /&gt;@       R1 -&gt; length of string&lt;br /&gt;@       R2 -&gt; Initial value for 'hash c'&lt;br /&gt;@       R3 -&gt; Initial value for 'hash b'&lt;br /&gt;@ Out : R0 -&gt; 'hash c', the main hash value&lt;br /&gt;@       R1 -&gt; 'hash b', the secondary hash value&lt;br /&gt;@ Clobbers R2, R3, flags &lt;br /&gt;hash:   push    {r4-r7,lr}&lt;br /&gt;        @ Within the function, we use registers as follows:&lt;br /&gt;        @ r1    : length&lt;br /&gt;        @ r2-r4 : temporary for character loading&lt;br /&gt;        @ r5-r7 : a, b, c&lt;br /&gt;        &lt;br /&gt;        @ initial setup&lt;br /&gt;        adr     r7, hash_constant @ a = b = c = 0xdeadbeef + length + hash c&lt;br /&gt;        ldr     r7, [r7]&lt;br /&gt;        adds    r7, r1&lt;br /&gt;        adds    r7, r2&lt;br /&gt;        mov     r5, r7&lt;br /&gt;        mov     r6, r7&lt;br /&gt;        adds    r7, r3          @ c += hash b&lt;br /&gt;        &lt;br /&gt;hash_loop:&lt;br /&gt;        cmp     r1, #0x0c       @ Is length (r1) &lt;= 12?&lt;br /&gt;        it      le&lt;br /&gt;        ble     hash_tail       @ If so, go do the tail part&lt;br /&gt;        &lt;br /&gt;        ldmia   r0!, {r2, r3, r4}       @ Load values&lt;br /&gt;        adds    r5, r2&lt;br /&gt;        adds    r6, r3&lt;br /&gt;        adds    r7, r4&lt;br /&gt;        subs    r1, #0x0c       @ Subtract 12 from length&lt;br /&gt;        &lt;br /&gt;        @ Mix&lt;br /&gt;        subs    r5, r7                  @ a -= c&lt;br /&gt;        eor.W   r5, r5, r7, ror #28     @ a ^= rot(c,4)&lt;br /&gt;        adds    r7, r6                  @ c += b&lt;br /&gt;        &lt;br /&gt;        subs    r6, r5                  @ b -= a&lt;br /&gt;        eor.W   r6, r6, r5, 
ror #26             @ b ^= rot(a,6)&lt;br /&gt;        adds    r5, r7                  @ a += c;&lt;br /&gt;        &lt;br /&gt;        subs    r7, r6                  @ c -= b;  &lt;br /&gt;        eor.W   r7, r7, r6, ror #24     @ c ^= rot(b, 8);  &lt;br /&gt;        adds    r6, r5                  @ b += a;&lt;br /&gt;        &lt;br /&gt;        subs    r5, r7                  @ a -= c&lt;br /&gt;        eor.W   r5, r5, r7, ror #16     @ a ^= rot(c,16)&lt;br /&gt;        adds    r7, r6                  @ c += b&lt;br /&gt;        &lt;br /&gt;        subs    r6, r5                  @ b -= a&lt;br /&gt;        eor.W   r6, r6, r5, ror #13             @ b ^= rot(a,19)&lt;br /&gt;        adds    r5, r7                  @ a += c;&lt;br /&gt;        &lt;br /&gt;        subs    r7, r6                  @ c -= b;  &lt;br /&gt;        eor.W   r7, r7, r6, ror #28     @ c ^= rot(b, 4);  &lt;br /&gt;        adds    r6, r5                  @ b += a;       &lt;br /&gt;        &lt;br /&gt;        b       hash_loop&lt;br /&gt;hash_tail:&lt;br /&gt;        cbz     r1, hash_done           @ length 0 requires no extra work&lt;br /&gt;        &lt;br /&gt;        ldmia   r0!, {r2, r3, r4}       @ Load values&lt;br /&gt;        adr     r0, do_hash_mask&lt;br /&gt;        add     r0, r0, r1, lsl #2&lt;br /&gt;        &lt;br /&gt;do_hash_mask:&lt;br /&gt;        mov.W   pc, r0                  @ doubles for count 0 entry in masking table, *must* be 4 bytes hence .W&lt;br /&gt;        @ Here we mask off the bits we don't want according to &lt;br /&gt;        @ what data we have&lt;br /&gt;        bic     r2, #0x0000ff00         @ count is 1, mask off all but least significant byte&lt;br /&gt;        bic     r2, #0x00ff0000         @ etc etc&lt;br /&gt;        bic     r2, #0xff000000 &lt;br /&gt;        bic     r3, #0x000000ff&lt;br /&gt;        bic     r3, #0x0000ff00&lt;br /&gt;        bic     r3, #0x00ff0000&lt;br /&gt;        bic     r3, #0xff000000&lt;br /&gt;        bic     r4, 
#0x000000ff&lt;br /&gt;        bic     r4, #0x0000ff00&lt;br /&gt;        bic     r4, #0x00ff0000&lt;br /&gt;        bic     r4, #0xff000000&lt;br /&gt;        bic     r4, #0x00000000&lt;br /&gt;&lt;br /&gt;        adds    r5, r2                  @ Add masked values before final mix&lt;br /&gt;        adds    r6, r3&lt;br /&gt;        adds    r7, r4&lt;br /&gt;        &lt;br /&gt;        @ Final mix&lt;br /&gt;        eors    r7, r6                  @ c ^= b&lt;br /&gt;        sub     r7, r7, r6, ror #18     @ c -= rot(b,14)&lt;br /&gt;        eors    r5, r7                  @ a ^= c&lt;br /&gt;        sub     r5, r5, r7, ror #21     @ a -= rot(c,11)&lt;br /&gt;        eors    r6, r5                  @ b ^= a&lt;br /&gt;        sub     r6, r6, r5, ror #7      @ b -= rot(a,25)&lt;br /&gt;        eors    r7, r6                  @ c ^= b&lt;br /&gt;        sub     r7, r7, r6, ror #16     @ c -= rot(b,16)&lt;br /&gt;        eors    r5, r7                  @ a ^= c&lt;br /&gt;        sub     r5, r5, r7, ror #28     @ a -= rot(c,4)&lt;br /&gt;        eors    r6, r5                  @ b ^= a&lt;br /&gt;        sub     r6, r6, r5, ror #18     @ b -= rot(a,14)&lt;br /&gt;        eors    r7, r6                  @ c ^= b&lt;br /&gt;        sub     r7, r7, r6, ror #8      @ c -= rot(b,24)&lt;br /&gt;        &lt;br /&gt;hash_done:&lt;br /&gt;        mov     r0, r7&lt;br /&gt;        mov     r1, r6&lt;br /&gt;        pop     {r4-r7, pc}&lt;br /&gt;&lt;br /&gt;hash_constant:&lt;br /&gt;        .word   0xdeadbeef&lt;br /&gt;&lt;br /&gt;.ltorg&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;div&gt;There you go.  206 bytes of hash function.  
There's probably a few bytes to shave off here and there, but it works pretty well.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As it happens, I'm only using 30 bits of the main hash, with the lowest 2 bits being used to indicate type, in order that symbols fit completely into a single machine word.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-4057874477659723177?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/4057874477659723177/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/12/how-good-is-your-hashing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/4057874477659723177'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/4057874477659723177'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/12/how-good-is-your-hashing.html' title='How good is your hashing?'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-7841171115493190054</id><published>2010-10-29T12:45:00.001+02:00</published><updated>2010-10-29T16:52:50.024+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='lisp os newton stuff'/><title type='text'>All modern operating systems are shite.</title><content type='html'>&lt;div&gt;"Modern" operating systems are, to my eyes, fundamentally 
broken. Indeed, there's very little that's "modern" about them, when it comes down to it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;They are seemingly all mired in the concept of being nothing more than a layer over hardware, something that allows single-purpose "applications" to access that hardware in a consistent manner.  The focus is wholeheartedly on the applications, and not on the data those applications use.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Admittedly, some of those systems are better than others.  Apple's OSX is nice enough to use, mainly because it's decent enough to get out of the way when you need it to.  But even so, it's conceptually no different to Microsoft's Win7, or even Vista, XP, Win2K (go back down the MS lineage as far as you want here) or any other "desktop" operating system - the only real distinguishing factor apart from what applications are compatible is how you get on with the UI.  Personally, I find OSX tolerable, and Windows execrable, but that's personal preference, and not really much different to preferring a Ford over a Fiat.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Even the handheld market, one which was all but invented by Apple's ground-breaking Newton, has taken massive steps backwards from the beauty and simplicity of Newton, or even the sparse functionalism of Palm; "handheld devices", say Jobs, Brin and Ballmer, "are for consumption of pre-rendered media".  Forget using them as extensions of your computing environment, forget using them for anything useful (except, perhaps, as a poor replacement for a spirit level), they're there as status symbol, toy, and above all, conduit for advertising.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Yeah, I keep coming back to the Newton.  
Funny, that.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, anyway, whilst fiddling with Android, and trying to make mobile devices actually useful again, and generally buggering about with my Wits A81, I've been doing a fair amount of thinking about this.  And, indeed, I've been thinking about going further than that, and actually doing something about it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I started by thinking that something could be done by using the linux kernel as a starting point.  After all, it's already developed, and it runs everywhere - why bother reinventing the wheel?  Just make a performant "shim" OS over the top of it, and a file system that does what you want, and you're laughing.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But that's suboptimal.  Linux (the kernel) is tied to the Unix concept that "everything is a file".  Which is a nice enough abstraction, but it doesn't work for me.  The reason it doesn't work for me is that "a file" is far from being a rich enough concept.  Part of this is to do with metadata (or, more particularly, the lack of it) - it's very difficult to implement interesting inter-application behaviour without metadata and (even more importantly) transparent, dynamic, access methods for that data and metadata.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;As an example, let's consider Apple's OSX, and the interaction between AddressBook.app, Mail.app and iCal.app. This is, pretty much, state of the (current) art - you send me an email suggesting a lunch date tomorrow at midday, I double click on the "midday" text and get a "lunch" appointment added to my calendar with you as an "attendee".  Pretty damn slick.  But all that interaction knows about is those 3 apps - it's hard-coded to only work between those three.  This is because the processing is contained in the applications, and not "owned" by the data.  
I can't make MyFunkyApp.app do this without linking to a bunch of Apple-provided frameworks (and even then, it's hard, as a lot of the behaviour is not externally exposed), and even when I do, Mail.app doesn't suddenly inherit the ability to do what MyFunkyApp.app does - the interaction only goes one way.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We shouldn't be living in a world where metadata and behaviour are application specific.  Developers should be developing behaviour, interfaces and transforms for specific data types, not reinventing the "application" over and over.  Users should be able to edit an image directly in an email, and send it back to the sender, without having to save it, open it in another application, edit it there, save it again, then send it back.  Data should be automatically versioned.  It should follow you around.  Your mobile device should be an extension of your desktop, able to take important data with you and merge changes back when you return, able to contact your machine over the 'net and fetch data you "forgot".  And so on.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Desktop computing hasn't advanced that much since 1984.  There have been a few attempts to make things better, but they have, by and large, failed, at least if we measure success in a commercial sense.  Newton was one of these.  Computing has failed on several of its promises.  We are slaves to the machine, not the other way around.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It is clear to me, at least, that we need a change.  And the idea has been buzzing around my head for some time.  I had considered that much of this could be accomplished by layering something over the Linux kernel, with a metadata-storing file system backing everything up.  The problem with this approach is that the two worlds can never be allowed to collide - Linux provides a hardware abstraction layer and nothing else.  
Everything else the kernel does, including scheduling, would be surplus to requirements.  So why not, I thought to myself, develop an OS directly "on the metal" in a dynamic language, do it all from the ground up?  After all, it's been done before.  The &lt;a href="http://en.wikipedia.org/wiki/Jupiter_Ace"&gt;Jupiter Ace&lt;/a&gt; was a "home micro" which was coded purely in FORTH.  The &lt;a href="http://www.sts.tu-harburg.de/~r.f.moeller/symbolics-info/"&gt;Symbolics Lisp Machines&lt;/a&gt; were coded, from the ground up, in Lisp.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And that hooked me.  I've always loved Lisp.  It has a beauty which is transcendental, a purity of purpose which is unrivalled in any (non-Lisp) language that's come since.  A lot of people get scared by the syntax, but then people get scared by the fact Objective-C uses square brackets and colons, so fuck 'em.  Result - I started looking at Lisp-based OSes, to see if it had all been done before.  And I found this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://bywicket.com/users/mikel/weblog/fbc2a/Closos.html"&gt;Closos&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, this started to tickle my interest.  Not only was I not insane (or, at least, not insane and alone), but Mikel is one of the original guys on the Newton (&lt;a href="http://lispm.dyndns.org/news?ID=NEWS-2004-08-14-1"&gt;and pre-Newton&lt;/a&gt;) project at Apple.  One of the guys who worked on &lt;a href="http://en.wikipedia.org/wiki/SK8"&gt;SK8&lt;/a&gt; at Apple.  An insanely talented guy, who seemingly thinks much the same way as I do.  Sure, it's not much further advanced than I am, but it at least shows I'm (possibly) not utterly wrong.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, I'm not mad.  And I *am* gonna do this.  
Stop thinking, start acting.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-7841171115493190054?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/7841171115493190054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/10/all-modern-operating-systems-are-shite.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/7841171115493190054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/7841171115493190054'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/10/all-modern-operating-systems-are-shite.html' title='All modern operating systems are shite.'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-7193653211926572135</id><published>2010-07-31T09:56:00.001+02:00</published><updated>2010-07-31T10:07:22.983+02:00</updated><title type='text'>OpenEmbedded under OSX.</title><content type='html'>So, as a result of getting the little android tablet I mentioned before, I've been playing with OpenEmbedded.  Unfortunately (for me), it's not supported (or at least not fully) under OSX - even getting the native tools up and running is painful.  
There's a *load* of gnu-isms in the source, which break strict POSIX compliance, and frankly make it a massive pain to get stuff working.&lt;br /&gt;&lt;br /&gt;Much of this can be got around by using precompiled stuff (either pulled from one of the OSX package repositories, fink etc, or, in my case, compiled manually), and then explicitly removed from the OE build system using ASSUME_PROVIDED.&lt;br /&gt;&lt;br /&gt;This doesn't fix a few overall issues, though, much of which comes from OE assuming it's built on a system that uses ELF as its binary format.&lt;br /&gt;&lt;br /&gt;A large part of this can be fixed with one patch, which I've attached here.  $OEBASE/$CHECKOUT/classes/relocatable.bbclass tries to fix up the &lt;span style="font-family:courier new;"&gt;rpath&lt;/span&gt; in any binaries having it, but simply assumes all binaries are ELF format.  Obviously, this crashes and burns horribly under OSX, where the runtime format for native binaries is Mach-O.  Patch below, apply from $OEBASE/$CHECKOUT with patch -p1&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;diff --git a/classes/relocatable.bbclass b/classes/relocatable.bbclass&lt;br /&gt;index 2af3a7a..3a4c119 100644&lt;br /&gt;--- a/classes/relocatable.bbclass&lt;br /&gt;+++ b/classes/relocatable.bbclass&lt;br /&gt;@@ -3,6 +3,19 @@ SYSROOT_PREPROCESS_FUNCS += "relocatable_binaries_preprocess"&lt;br /&gt;CHRPATH_BIN ?= "chrpath"&lt;br /&gt;PREPROCESS_RELOCATE_DIRS ?= ""&lt;br /&gt;&lt;br /&gt;+def is_elf_file (fullpath):&lt;br /&gt;+    import subprocess as sub&lt;br /&gt;+   &lt;br /&gt;+    p = sub.Popen(['file', '-b', fullpath],stdout=sub.PIPE,stderr=sub.PIPE)&lt;br /&gt;+    out, err = p.communicate()&lt;br /&gt;+    if p.returncode != 0:&lt;br /&gt;+        return 0&lt;br /&gt;+   &lt;br /&gt;+    if out.startswith('ELF'):&lt;br /&gt;+        return 1&lt;br /&gt;+    else:&lt;br /&gt;+        return 0&lt;br /&gt;+       &lt;br /&gt;def process_dir (directory, d):&lt;br /&gt;    import subprocess as 
sub&lt;br /&gt;    import stat&lt;br /&gt;@@ -24,7 +37,8 @@ def process_dir (directory, d):&lt;br /&gt;&lt;br /&gt;        if os.path.isdir(fpath):&lt;br /&gt;            process_dir(fpath, d)&lt;br /&gt;-        else:&lt;br /&gt;+        if is_elf_file(fpath) == 1:&lt;br /&gt;+            # only try to relocate ELF files&lt;br /&gt;            #bb.note("Testing %s for relocatability" % fpath)&lt;br /&gt;&lt;br /&gt;            # We need read and write permissions for chrpath, if we don't have&lt;br /&gt;@@ -85,7 +99,7 @@ def rpath_replace (path, d):&lt;br /&gt;    bindirs = bb.data.expand("${bindir} ${sbindir} ${base_sbindir} ${base_bindir} ${libdir} ${base_libdir} ${PREPROCESS_RELOCATE_DIRS}", d).split()&lt;br /&gt;&lt;br /&gt;    for bindir in bindirs:&lt;br /&gt;-        #bb.note ("Processing directory " + bindir)&lt;br /&gt;+        bb.note ("Processing directory " + bindir)&lt;br /&gt;        directory = path + "/" + bindir&lt;br /&gt;        process_dir (directory, d)&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I'll post more in a little while, including full instructions on getting oe up and running under OSX, but I'm off on holiday for a week.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-7193653211926572135?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/7193653211926572135/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/openembedded-under-osx.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/7193653211926572135'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/7193653211926572135'/><link 
rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/openembedded-under-osx.html' title='OpenEmbedded under OSX.'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-3307888844602045341</id><published>2010-07-11T10:03:00.000+02:00</published><updated>2010-07-11T13:52:09.080+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='android newton'/><title type='text'>Android on tablets, rights and wrongs</title><content type='html'>This is likely to be a relatively contentious post.  I've spent a bit of time trying to get to like Android, and, in short, I can't do it.  I don't like Android.  I'm sure it's OK as a &lt;span style="font-weight:bold;"&gt;smartphone&lt;/span&gt; OS, but on a tablet, it simply doesn't work for me.&lt;br /&gt;&lt;br /&gt;Now, I'm aware that I'm probably not the average user, but I'm (humbly enough) pretty well attuned to what people need to make computers work.  That comes from 20+ years in "the biz", using pretty much every operating system there's ever been, including some real weirdos that most people haven't even heard of.&lt;br /&gt;&lt;br /&gt;Now, to start hammering on what Android's got wrong, we really need to look at a comparable system that's got it *right*.  And it needs to be something I have near me, so out comes the trusty Newton.  Yep, that's right.  The Newton.  A 13 year old, underpowered "pocketable" that was ridiculed in its earlier incarnations for its pisspoor handwriting recognition.  Android's gotta be better than that, right?  
Read on for a load of words on why I think the way I do, but the executive summary is "No, wrong".&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Philosophy&lt;/h2&gt;&lt;br /&gt;Before we get to the "nitty gritty" of little things, we need to look at the design rationales of the systems, or, for want of a better phrase, their "philosophy".&lt;br /&gt;&lt;br /&gt;Newton came from a real intention to make a computer that was different, that was better.  A radical shift from the 100lb hulking monoliths that lived on or under our desks at the time.  A powerful computer that could be dropped into your pocket.  A computer that anyone could use.  A computer that didn't need a keyboard.  It may have been a commercial failure (it cost Apple a lot of money at a time it couldn't afford it), but opened a whole range of new business markets and the whole concept of a pocketable "real computer" made a lot of money for other people.&lt;br /&gt;&lt;br /&gt;Android comes from Google.  In short, it can be summed up as a device for getting more eyes to Google's primary business, i.e. advertising.  Google want to hit a new market (originally smartphones), so they leverage a bunch of open source software and try to make something that looks like what the current market leader (iPhone, in this case) looks like, and dump out a beta.  Being "open" means that the hardware manufacturers can make it work on their chips, being free means that the carriers don't have to pay license fees, and being free means the consumer gets a cheaper device.  Everyone wins, right?  Well, in reality, not quite.  Hardware manufacturers have mainly made *one* version work for *one* generation of chips, and have often not lived up to the GPL requirements so that their initial work could be carried on by the public.  Carriers couldn't care less about updates, because they'd rather sell you a new phone with a new contract, and realistically, it's hardware costs that drive the pricepoint anyway.  
Net result is a messy market with a ~33/33/33 split of the 3 current major versions of Android, and most platforms not being upgradeable.&lt;br /&gt;&lt;br /&gt;Then along came iPad, and with it a sudden rush of Android-running tablets.  Because a tablet is just like a scaled-up smartphone, right?  Again, "No, wrong".&lt;br /&gt;&lt;br /&gt;The problem here, I think, is that Android doesn't really fit in with the use case for a tablet device, and particularly not for the pocketable type of tablet.  At least, not for *my* use case, but it's my post so I'll stick with my requirements, thank you very much.  The iPad succeeds because it's a tightly controlled device with a tightly controlled market, aiming at &lt;span style="font-style:italic;"&gt;consumption of media&lt;/span&gt;.  It's not a general purpose computer, really - apps live in their own little walled gardens, and you can't run what you want on it.  It's a fucking good little gadget, though.&lt;br /&gt;&lt;br /&gt;Android thinks it's a phone.  Except when it thinks it's an iPad.  It doesn't have the same overall control that the iPad/iPhone has, and where Google have tried to enforce certain hardware requirements, they often don't make sense (for a phone *or* for a tablet) except when you look at it from an advertising &lt;span style="font-style:italic;"&gt;pusher&lt;/span&gt;'s point of view.  I mean, really, why does a phone, or a tablet, &lt;span style="font-style:italic;"&gt;need&lt;/span&gt; a GPS?  Sure, it might be &lt;span style="font-style:italic;"&gt;useful&lt;/span&gt;, but I fail to see the absolute necessity. Why restrict to a specific set of screen sizes?  Why &lt;span style="font-style:italic;"&gt;must&lt;/span&gt; it have a camera?  After all, anyone who cares about the photos they take won't be using a phone camera anyway.  But I digress slightly.&lt;br /&gt;&lt;br /&gt;To sum it up: IMO, Android is a "me too" iPhone/iPad clone, and not a very good one.  
It suffers from Google's "permanent beta" mania.  Get it out fast and dirty, and fuck the early adopters.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;User interface&lt;/h2&gt;&lt;br /&gt;The overall user interface of Newton is *tight*.  It's intended to be used with a stylus, although most selections can be made with a finger.  Here's what it looks like, more or less (in reality, it's a lot more "green"):&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.guidebookgallery.org/pics/gui/extra/newton/controlpanel.png"/&gt; &lt;img src="http://www.guidebookgallery.org/pics/gui/extra/newton/dates.png"/&gt;&lt;br /&gt;&lt;br /&gt;What we see at the bottom of these two screens is the stuff that's always available, viz: a "menu bar" for the current application, and a "Dock" (to use OSX terminology) for a few common apps and functions (undo, find, assist, more on these later). That uses up a fair amount of available screen space, but the rest belongs to your app.  What's important here is that the interface is consistent.  From the supplied calendar application to 3rd party web browser, you always know where to go.&lt;br /&gt;&lt;br /&gt;Here's another one, with an "add-on" backdrop, and rotated into landscape mode (the green is a little overbearing, probably as it was pulled from an emulator)&lt;br /&gt;&lt;a href="http://www.flickr.com/photos/raparker/3558905798/in/photostream/"&gt;&lt;img src="http://farm4.static.flickr.com/3563/3558905798_a451c3047f.jpg" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The little star at the top of the screen is the "notification" icon for Newton.  What Android mainly uses its waste of space "status bar" for.&lt;br /&gt;&lt;br /&gt;Android is chaotic.  It's mainly intended to be used with a finger, but that doesn't always work.  One app might grab full screen and require some sort of gesture to pull up menus, another leaves menus and stuff on screen.  
Mostly apps leave the status bar at the top of the screen, but it's pretty much a waste of space - there's very little you can do with it.  Admittedly, most apps respond to the "menu" button, but not all devices &lt;span style="font-weight:bold;"&gt;have&lt;/span&gt; a hardware menu button.  Menu widgets don't seem to be standardised, and some are so small as to be unselectable without using a stylus.  If nothing else, it's more food for the "open source can't do UIs" crowd.&lt;br /&gt;&lt;br /&gt;Scrolling is standardised on both platforms - on Newton you use the up-down arrow keys, on Android you "swipe".  I find that my swipes are taken as clicks or some other UI action about 50% of the time, and although the "inertial" scrolling thing looks cool initially, it's intensely frustrating as your swipe goes that bit too far and zooms waaaaaaaay past what you were looking for.&lt;br /&gt;&lt;br /&gt;Closing apps is more or less standardised - the Newt has a little X at the bottom left corner, that closes your app.  Android recognises the "back" and "home" buttons, but, again, &lt;span style="font-style:italic;"&gt;not all devices have them&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Rotation hits some little glitches, as well.  Android assumes you have an accelerometer, and auto-rotates the screen to fit.  That's nice, as long as you have an accelerometer.  I don't, and there doesn't seem to be any standard "rotate" widget as per the Newton.  Which is not to say that the Newt is perfect in this respect, some apps behave very badly when rotated.  I would argue, however, that Android should at least provide a widget as a sop to those who don't have an accelerometer.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Finding your data&lt;/h2&gt;&lt;br /&gt;Newton has a "find" button.  It works.  It &lt;span style="font-weight:bold;"&gt;always&lt;/span&gt; works.  'nuff said.&lt;br /&gt;&lt;br /&gt;Android has a filesystem.  A filesystem that you can't access without some sort of file browser.  
I have 3 on my device already - one works for one thing, one works better for another, and so on.  Gah.  &lt;span style="font-style:italic;"&gt;I don't want to fuck about with filesystems, dammit!&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Otherwise, you must find your data from &lt;span style="font-style:italic;"&gt;within&lt;/span&gt; the application you want to use.  Interfaces vary.  It's messy.  It's inconsistent.&lt;br /&gt;&lt;br /&gt;This reveals another philosophical thing.  Newton doesn't have a "filesystem" as such.  No hierarchy, no structure.  A big searchable "soup" of data and applications.  Everything is data, everything has a bunch of attributes, you can search it.  It's a database.  While this is somewhat shocking to those used to the current way of doing things, it's hardly new : "Pick" did this in the 60s, BeOS did this to a certain extent, so did Apple to a very limited extent with the "classic" MacOS (and also with OSX).  &lt;br /&gt;&lt;br /&gt;The current "hierarchical" way of doing things is massively backwards, and it's holding us back.  We use the name and location of a file to indicate its metadata.  This makes no sense (as, for example, those who've been hit with Windows-based trojans and email viruses can testify).  It works for certain system-level applications, but for user data manipulation, it's utter pants.  &lt;br /&gt;&lt;br /&gt;Currently, this need is covered to a certain extent inside individual applications (think, for example, iTunes), but that data can't be easily shared to other apps without intrinsic knowledge of how to "get at" the database, and upgrades to one app often break other stuff further down the chain.  &lt;br /&gt;&lt;br /&gt;It would be much nicer to simply have something where you could query along the lines of "show me all the emails I sent to Fred over the last month, sorted by date".  "Okay, now show me the emails in that list which had attachments".  
"Now show me the attachments which pertained to project 'foo'".  Etc etc.&lt;br /&gt;&lt;br /&gt;It's about time someone came up with a filesystem that is completely non-hierarchical (I'm currently working on a fusefs that does exactly this, actually).&lt;br /&gt;&lt;h2&gt;App launching&lt;/h2&gt;&lt;br /&gt;On the Newton, click on the "extras" button, go to the tab for stuff *you* have classified as applications, click your app.  Or use "find".  Or click on an associated piece of data.  Or you might use "Assist".&lt;br /&gt;&lt;br /&gt;On Android, go back to the home screen, go to the right tab, click your app.&lt;br /&gt;&lt;br /&gt;To me, the whole "application centric" way of doing things smacks of "desktop computing metaphor crammed into a handheld device" with little thought.  But maybe that's just me.&lt;br /&gt;&lt;h2&gt;Data input&lt;/h2&gt;&lt;br /&gt;With Newton, you write on the screen.  That's pretty much it.  Really.  Just write directly in the boxes. Newton &lt;span style="font-style:italic;"&gt;usually&lt;/span&gt; turns it into a textual representation of what you wrote.  Special characters and all.  No special voodoo ways of writing a la "graffiti" on the Palm.  Made a mistake?  Scribble it out, it goes away.  Or you can opt to have your text remain as a set of handwritten strokes, and have it recognised later (very handy for note taking in meetings, as the "usually" above implies - Newton's HWR doesn't always get it right, and might need some nudging in the right direction, which you can't do when scribbling at full speed).&lt;br /&gt;&lt;br /&gt;Alternatively, there are on-screen keyboards, or, in extremis, the newton serial keyboard, which is very handy for programming direct on the device.&lt;br /&gt;&lt;br /&gt;With HWR, you get around 20-30wpm, about the same as a *fast* typist on a good keyboard.  With OSK, 3-10wpm.  
The Newton keyboard is a bit craptacular though, and you can count on about 15-20wpm using it, along with cramp in the fingers.&lt;br /&gt;&lt;br /&gt;With Android on a tablet, you don't have much choice.  It's on screen keyboards pretty much all the way unless your device supports hardware keyboards.  For standard OSK with prediction, I got about the same as on the Newton (the prediction side of things only seemed to be able to predict one word - "android").  I've heard of 10-15wpm with predictive OSK and a fast OSK replacement you're trained to use (Swype, for example) - about as fast as a "slow" typist.  With more training you might get a bit faster.&lt;br /&gt;&lt;br /&gt;HWR is good technology.  It's fast, and discreet (speech recognition has the potential of being fast, but makes you look like a git, and you can't easily use it in meetings or noisy environments).  If you want *really* fast, shorthand recognition could get you into massive wpm speeds - at least as fast as speech and maybe faster.  The hardware can do it (my cheapo Chinese tablet is, at a conservative guess, 6-10 times as fast as the Newton without taking into account the vector unit and GPU).  Fast, accurate HWR is no idle dream.&lt;br /&gt;&lt;br /&gt;There are other failings with Android's input methods: they hide the entire screen when you're using them, so you have no context - try entering data into a form, and you'll be given a "next" button indicating that entering the current data will take you to the next form field, but &lt;span style="font-style:italic;"&gt;there's no indication of what this, or the next, form field actually are&lt;/span&gt;.  Barking bloody mad, that is.  Maybe it's fixed in Froyo.&lt;br /&gt;&lt;h2&gt;Inter-app communications&lt;/h2&gt;&lt;br /&gt;Nothing standard under Android.  Some apps share, some don't.&lt;br /&gt;&lt;br /&gt;This is more or less the system under Newton as well, but it also has "Assist", and that's available to all apps.  
Yep, that button that looks like a question mark.  What does it do?&lt;br /&gt;&lt;br /&gt;From some app, maybe a word processor, scribble "send to Mark", and select it.  Hit "assist".  Newton goes off and looks, finds that Mark might mean 2 people, and that sending might mean faxing or emailing.  Asks who, and how.  And then goes off and does it.  "Call Fred", Assist, and the telephone dialer fires up, dials the number, and then puts a call logger up, allowing you to scribble notes whilst chatting.&lt;br /&gt;&lt;br /&gt;"Assist" is mad cool, and it was available in the early '90s.  &lt;br /&gt;&lt;br /&gt;The built in apps under Newton *do* share data, and the interface for sharing that data back and forth was available to other developers' apps.  The same is more or less the case under Android, although it's hard / impossible for a developer to shoehorn additional stuff into an existing app (as could be done under Newton) without modifying and recompiling.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Synchronisation and so on&lt;/h2&gt;&lt;br /&gt;The Newton synchronises nicely with desktop apps via serial, ethernet, wifi, bluetooth.  As long as you have a sync app, of course.  They're pretty hard to come by these days, and it's getting difficult to use a newt with a "modern" desktop OS.  You now need 3rd party software to sync with a Mac, but I'll give the Newton a pass on that as the OS the Apple-supplied software ran on hasn't been current for nearly 10 years.  Dunno if the Windows sync stuff works on Vista/7, but it certainly ran on XP.  64 bit might be what fraks it totally.&lt;br /&gt;&lt;br /&gt;Android mainly syncs, as far as I can tell, with "the cloud", which can largely be considered a euphemism for "Google's ads".  Well, frankly, fuck the cloud.  Desktop syncing is payware, so fuck that too.  
It really shouldn't be something one needs 3rd party software for - there *should* be some sort of extendable conduit for syncing - hell, OSX provides most of that anyway.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Other stuff&lt;/h2&gt;&lt;br /&gt;Let's be honest, Android does have a bunch of stuff it can do waaaaay better than the Newt.  Video and audio playback is potentially loads better, mainly down to 12 years of additional hardware development.  My cheapo tablet has more hardware potential than a top-of-the-range laptop from the Newton era, after all.  That said, video playback under Android leaves a lot to be desired.  The hardware should have the power to do NLE on video, yet in most cases you can't even play stuff back unless it's in a specific format with a specific size.  You can't assume codecs are there.  A mess, in short.&lt;br /&gt;&lt;br /&gt;The web experience on Android is loads better than on the Newt, too, even if it can't easily do embedded Flash video (something I heave a sigh of relief over).&lt;br /&gt;&lt;br /&gt;eBooks look nicer under Android.  Higher DPI, colour screen.  Fine.  PDFs don't work at all under Newton, so even having an option to read badly-rendered PDFs slowly under Android is a blessing.  That applies to 3rd party readers - I've not used the Adobe reader for 2 reasons:&lt;br /&gt;&lt;br /&gt;- On the desktop it's a pile of bloatware crap&lt;br /&gt;- My android device doesn't have Google Market, so I can't get to it.&lt;br /&gt;&lt;br /&gt;Photos, idem. Colour screen, higher DPI.  Win for Android by default and the inexorable march of progress.&lt;br /&gt;&lt;br /&gt;One area the Newton wins on is startup.  From "press of button" to the "Happy Newton" chime and a usable device is measured in single digits of seconds.  Android takes an age.  In addition, the Newton comes back to *exactly* where you were when you turned it off or the batteries ran out, no matter how much time has passed between the two.  No data loss, nothing.  
You're back.  No finding files, you're still there.  In 15 years of using Newtons, I've *never* lost data (not something I can say of PalmOS devices, for example).&lt;br /&gt;&lt;br /&gt;Wakeup - I can't compare.  Newton is instant, but my tablet doesn't seem to have any PM stuff enabled, so I don't know if it works or not.  I have to hard power cycle it.  Sucks, but probably not an Android flaw as such.&lt;br /&gt;&lt;br /&gt;As far as gaming goes, Android is more modern.  But 99% (or more) of the "games" I've found so far available under Android are made of suck.  A lot of this is down to Google's utterly braindead decision to use Java, of all things, as a systems programming language.  I mean, really.  Gaming suck on a platform that uses a garbage collected language with no JIT?  Who'd ever have thought it?&lt;br /&gt;&lt;br /&gt;And finally, performance.&lt;br /&gt;&lt;br /&gt;I have in front of me 2 ARM devices.  &lt;br /&gt;One has a 162MHz StrongARM 110 processor, giving 1 DMips/MHz.  It has 4MB of RAM and 4MB of flash, with 16MB of additional flash in one of the PCMCIA slots for a total of 20MB *total storage*.  It's running a 12 year old, interpreted, prototype based, language.&lt;br /&gt;&lt;br /&gt;The other has a 600MHz Cortex-A8 processor, which gives 2 DMips/MHz, and has, in addition, a vector floating point unit, NEON FP extensions, and an OpenCL-enabled GPU.  Even without using the addons, it's nearly *10 times* as fast as the Newton. It has a 13 stage superscalar pipeline and more cache than the Newton has main memory. It has 256MB of main memory and 2GB of flash, with another 8GB of flash in the SD card slot for a total of 10GB storage.  It's running software that's mainly written in Java.  Oh, did I mention that the processor itself is optimised for Java?&lt;br /&gt;&lt;br /&gt;Guess which one feels "snappy"?  
Yep, you're right - the hardware specs don't make up for the software implementation.&lt;br /&gt;&lt;br /&gt;The "feel" of Android 2.1 makes me believe that the vaunted 450% speedup of Froyo's improved Dalvik is probably not overstated.  It also makes me wonder why, if there was that much speedup to be had, it wasn't "had" before initial launch of Android.  And how much more there is to be had under the hood.  Android's been out for over 2 years now; I'm hardly an early adopter.&lt;br /&gt;&lt;br /&gt;No, really.  Four hundred and fifty fucking percent faster.  What the fuck?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-3307888844602045341?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/3307888844602045341/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/android-on-tablets-rights-and-wrongs.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/3307888844602045341'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/3307888844602045341'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/android-on-tablets-rights-and-wrongs.html' title='Android on tablets, rights and wrongs'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://farm4.static.flickr.com/3563/3558905798_a451c3047f_t.jpg' height='72' 
width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-2499725037753765958</id><published>2010-07-10T16:19:00.000+02:00</published><updated>2010-07-19T08:40:00.853+02:00</updated><title type='text'>Teardown, and comparison to the Newton.</title><content type='html'>So, I tore the wits almost all the way down, and at least far enough to get a look at both sides of the motherboard.  I didn't get closeups of the chips, because the little button boards are glued in place and removing them looked like it was gonna be a bit too delicate for a slightly inebriate hardware tech...&lt;br /&gt;&lt;br /&gt;Photos are in &lt;a href="http://www.flickr.com/photos/28888140@N04/sets/72157624338560583/"&gt; this set&lt;/a&gt;, and here's a taster:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.flickr.com/photos/28888140@N04/4780096330/" title="Newton &amp;amp; Wits by tufty tufty tufty, on Flickr"&gt;&lt;img src="http://farm5.static.flickr.com/4093/4780096330_9ef620b805.jpg" width="500" height="375" alt="Newton &amp;amp; Wits" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The hardware is impressive. The case has brass inserts for the screws rather than being the usual self-tapping crap, and the motherboard is *not* the mess of patch wires and crappy soldering I was led to expect.&lt;br /&gt;&lt;br /&gt;Comparison time.  Let's look at the witstech A81 vs a Newton 2x00.  Remember, I am biased, so I may be appearing to be harsh, but I will try to at least explain *why*.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Form Factor&lt;/h2&gt;&lt;br /&gt;Although the Newton is slightly longer for a smaller screen, it wins on ergonomics as opposed to aesthetics.  The big "chin" gives something to hold onto in landscape mode, and the slightly "slimmer in the middle" case suggests (and indeed gives) a comfortable "portrait" mode grip.  
The wits, on the other hand, has nothing much to hold onto, and the case is slippery as opposed to the Newton's slightly "rubbery" feel.  "How do I hold this" was one of my first thoughts when I picked it up.&lt;br /&gt;&lt;br /&gt;The Newton also scores highly for its removable flip-over hard lid (screen protection), large stylus and pop-out stylus holder.  Even the power supply scores well, with a bunch of adaptors supplied and easily interchangeable on the wall-wart.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Screen&lt;/h2&gt;&lt;br /&gt;No contest here.  The wits takes it hands-down, in terms of clarity, resolution, brightness (although the Newton's backlight *is* 12 years old, so...) and so on.  It also beats the Newton in terms of touchscreen performance, mainly because this particular example has a bad case of "the jaggies".&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Other hardware issues&lt;/h2&gt;&lt;br /&gt;The wits has a useful stand.  I'm tempted to cancel this out with the Newton's ability to run on 4 AA pencells if you need it to.  Newton's (single) speaker is pretty good, but the hardware behind it lets it down.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Battery life&lt;/h2&gt;&lt;br /&gt;No contest.  Newton, hands down.  *Weeks* on a set of pencells, and you can happily let it simply run out of power, leave it for months, shove in a new set of batteries and you're *exactly* where you were when you put it down.  That said, the wits does get an honourable mention for having a removable, and relatively high-capacity, battery.  It's far better than most; I got 4 hours 45 minutes running "Operation Sandstorm" continuously (load average 4, exercising the GPU and CPU pretty hard).&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Networking&lt;/h2&gt;&lt;br /&gt;Both Wits and Newton can do bluetooth and wifi (although it's getting hard to find 5V PCMCIA wifi cards these days, and you're not gonna get anything other than 802.11b).  
The newt can happily do dialup, fax, or direct serial connection (this one is usually used as a serial terminal on my Sun Netra, actually).  Oh, and IrDA.  And LocalTalk.  That said, you can probably do most of that with the wits, given a USB device, some drivers, and a following wind, but would you actually *want* to?&lt;br /&gt;&lt;br /&gt;I'll give this one to the wits, as Wifi and bluetooth on the Newt are "kinda neat", but painful to set up.  I'm assuming there will eventually be some android support for the wits' bluetooth chipset here...&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Expandability&lt;/h2&gt;&lt;br /&gt;Grudgingly, the wits takes this.  The newt has 2 PCMCIA slots, but in real world usage, one of those is gonna be taken up by a flash card.  The little serial dongle on the newt is a pain in the ass, too.  USB and a microSD slot show that some things *have* moved on in 12 years.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Cool Factor&lt;/h2&gt;&lt;br /&gt;Are you joking?  Newton by a mile.  It's still got a multicoloured Apple on it, for ${DEITY}'s sake.&lt;br /&gt;&lt;br /&gt;If I'm honest, the Wits is a nicer machine overall, and undoubtedly massively more powerful, but the Newton shows areas where it could have been vastly better in a purely hardware sense.  
A shame, because wits have obviously poured some love into this - it's not crap.&lt;br /&gt;&lt;br /&gt;I will return to this later, with Newton vs Android in a software sense.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-2499725037753765958?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/2499725037753765958/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/teardown-and-comparison-to-newton.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/2499725037753765958'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/2499725037753765958'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/teardown-and-comparison-to-newton.html' title='Teardown, and comparison to the Newton.'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://farm5.static.flickr.com/4093/4780096330_9ef620b805_t.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-5894852525804200309</id><published>2010-07-07T20:26:00.001+02:00</published><updated>2010-07-09T14:12:29.138+02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='wits android a81e review'/><title type='text'>A New Toy</title><content type='html'>Well, I finally cracked.  
I'd been staying away from the Apple store, lest I inadvertently find myself the owner of an iPad, when I came across Archos' 7 Home Tablet - a nifty looking little 7" screened "mobile internet device", running Android.&lt;br /&gt;&lt;br /&gt;So I ordered one.&lt;br /&gt;&lt;br /&gt;After a while (quite a while, actually), and 3 calls from Archos telling me my order had been put back again, I got a bit steamed with Archos and told them where they could stuff their 7" rigid object.  If you catch my drift.&lt;br /&gt;&lt;br /&gt;However, I'd sort of got hooked on the idea of a 7" replacement for my ageing Newton MP2100, so I started casting around and eventually found &lt;a href="http://www.witstech.com.cn"&gt;witstech&lt;/a&gt;, who produce the A81, a nifty looking 7" device with GPS, a Cortex-A8 processor, and all for less moolah than the Archos.  Score!&lt;br /&gt;&lt;br /&gt;So I contacted them, and purchased a brand spanking new, hot off the presses A81(E) (The E meaning that it has extra buttons for Android usage, but more on that later).&lt;br /&gt;&lt;br /&gt;So, sure as eggs is eggs, it got a bit delayed.  But it did eventually get here.&lt;br /&gt;&lt;br /&gt;Review time.&lt;br /&gt;&lt;br /&gt;Out of the box, I find:  One 7" tablet, half of the back removable, and a replaceable battery that slips nicely behind the back.  Okay.  One mini USB B to female USB A cable, one generic 5v, 1.5A charger (and they were kind enough to send one that had the right sort of plug for France, score one for Wits tech people), and (as I have a developer model), a little mini-USB B to 1.8V serial adaptor.  The latter item looks like a mini-USB plug with 3 hand-labelled wires hanging out of it.  
Oh, and a nice enough little semi-hard-case.&lt;br /&gt;&lt;br /&gt;shanzai.com's unboxing video and overall photos &lt;a href="http://shanzai.com/index.php/bandit-gadgets/tablets/1360-first-look-a81-sub-usd-200-mountable-tablet-sports-removable-battery-android-21-cortex-8-processor-and-features-galore"&gt;can be found here&lt;/a&gt;.  Note that they got a few extra goodies that I didn't, like a car charger (already got one) and holder (don't need one).&lt;br /&gt;&lt;br /&gt;In terms of finish, the outside of the hardware is not so bad, considering the price.  It's lightweight, and has a relatively clean look overall.  Although lightweight, it feels tough enough.  A bit "chubby" in terms of depth if you're used to looking at Apple gear, but that's tolerable.  &lt;br /&gt;&lt;br /&gt;Battery is marked 3000mAh, 3.8v.  That's a couple of massive plus points - not only is it a load more beefy than the majority of the competitors' offerings, but it's also removable / replaceable.  More on battery life later.&lt;br /&gt;&lt;br /&gt;The stylus (which is tiny, about the same size as a DS stylus) has a tendency to jam in the holder if you don't push it in "just so" (and it has nothing to help you with alignment - 1 point for industrial design there).&lt;br /&gt;&lt;br /&gt;Nothing much to say about the power and USB ports, which seem solid enough.  The audio out port looks "odd", or rather "cheap".  We'll come to audio later.&lt;br /&gt;&lt;br /&gt;The MicroSD card slot hides behind a little rubber grommet, which is normal, although it doesn't look overly waterproof.  It's a push/push slot, but seems to be mounted a little too far out, it's hard to get the rubber grommet in place with a card in there.  
Half a mm further in wouldn't have hurt.&lt;br /&gt;&lt;br /&gt;The stand is a nice addition, pulls out from the back and holds the unit at a usable angle.&lt;br /&gt;&lt;br /&gt;Button-wise, on the "top" of the unit we have a power button, which (being clear, and backlit with LEDs) also doubles as a charging / level indicator as far as I can tell.  Green for fully charged, orange for charging, I think.  Next to that, there's a pair of left-right buttons, one unit as a "rocker".  Looks very much like a volume up/down pair, and, oddly enough, that appears to be what it's intended as.  On the front, at the bottom left, there are 3 more buttons, with icons corresponding to "home", "back", and "menu".  The thin plastic sheet over the fascia around these buttons is already coming away, and from photos it appears that this is endemic - I haven't had the guts to completely remove the sheet and see if it's just "packing protection" though.&lt;br /&gt;&lt;br /&gt;Touchscreen is shiny, and has the slightly greasy feel that is common to most resistive devices.  The LCD panel itself is bright (20% brightness is more than enough in most circumstances), clear, and (at least on mine) has no dead pixels.  The 800x480 resolution gives roughly 135dpi, which makes for a nice enough display. Viewing angle range is "OK", at about 90° horizontally and 60° vertically before it's difficult to see. &lt;br /&gt;&lt;br /&gt;On the back we have 2 speakers, oddly positioned vertically (assuming a "widescreen" mode of usage), one over the other.  Left and right would have been tricky with the battery placement, I guess, and I'm not expecting hifi quality sound from built in speakers anyway.&lt;br /&gt;&lt;br /&gt;On firing the device up, we find, horror of horrors, WinCE.  Never has a user experience been so well summed up by its name.  
Wikipedia tells me that WinCE has been out for 13 years, but it doesn't appear to have moved on at all - it still feels like Win 3.0 badly shoehorned into a portable device.  It gets everything wrong.  *Everything* runs full screen.  Microscopic close buttons right next to other microscopic buttons - even with the stylus it's hard to hit the right one (and the touchscreen calibration is pretty much pixel perfect).  This is obviously the build for the previous A81 with no buttons, because the buttons themselves don't do anything other than make a "click" noise.  Even though I've told WinCE not to click.  At all.  Bastarding pile of shite.&lt;br /&gt;&lt;br /&gt;Wifi performance is fine, it picked up my access point straight away and gave me decent speed.  The antennae don't seem overly sensitive (I'd like an external antenna port, myself), but hey - it works.  Bluetooth appears functional, although the godawful WinCE interface (particularly the disappearing keyboard) meant that I couldn't actually get the device to pair with my Mac more than once. &lt;br /&gt;&lt;br /&gt;WinCE, surprisingly enough, managed to use a USB keyboard and mouse through the supplied USB cable.  Nice.  Couldn't make my Mac see it as a USB storage device through a standard USB cable though.  This may be an issue with the Mac, as I had the same problems with Android.  &lt;br /&gt;&lt;br /&gt;Before we do away with WinCE, though, let's look at the quality of the speakers (and the reasons for this will become obvious later).  Playing through the speakers sound quality is at least "OK", at least up to about 80% volume (about the volume of a relatively "loud" radio), at which point things start to distort badly.  Cheap speakers, don't expect to use them for much.&lt;br /&gt;&lt;br /&gt;So, we hate WinCE.  Still alpha quality after 13 years.  But we knew that.&lt;br /&gt;&lt;br /&gt;Let's reflash with Android and see where we get.  
After all, Wits have just released an English-language Android 2.1.&lt;br /&gt;&lt;br /&gt;Reflashing is a piece of piss, once you have the SD card formatted properly (a bit tricky on the Mac, as it craps all over the disk as soon as it's formatted and the booter needs to be the very first item in the FAT, but a bit of VirtualBox voodoo gets us there).  Power on, whilst holding down the "left" button, and leave it alone for a few minutes.&lt;br /&gt;&lt;br /&gt;Now, some of my gripes about Android are related to the fact the Android I have here is a beta.  Some are to do with Android itself, which I'm not really a fan of.  Bear this in mind as you read on.&lt;br /&gt;&lt;br /&gt;So, we power on, and we get a 4-colour quadrant display as the kernel loads, followed by a bunch of android robots, and finally an animated "android" logo.  Boot is pretty quick, but it should be, it's solid-state after all.&lt;br /&gt;&lt;br /&gt;Out of the box, we don't have much.  For some reason, Wits have left "phone" and "camera" functionality in place despite the tablet having neither.  A bit daft, that.&lt;br /&gt;&lt;br /&gt;No Google apps (particularly "Market").&lt;br /&gt;&lt;br /&gt;Bluetooth doesn't want to play, at all.  This is, I suspect, a Wits issue, or quite likely something to do with kernel drivers for the WL1271 wifi/bluetooth/fm chip.  The end result is the same - no bluetooth.&lt;br /&gt;&lt;br /&gt;Battery status monitoring is also not included, it seems - the unit reports itself as being on charge, 100% charged, all the time.  That sucks, but at least the power button indicates the charging state.&lt;br /&gt;&lt;br /&gt;Sleep doesn't appear to function - I suspect that there's no power management going on at all.  
Surprisingly, this still allows the unit to function for 7-8 hours with wireless turned on and decoding music (but not playing it through the speakers - my wife would have killed me for leaving it turned on all night making noise).&lt;br /&gt;&lt;br /&gt;Sound drivers give almost zero volume on the external speakers.  That's gonna be a Wits bad as well.&lt;br /&gt;&lt;br /&gt;For the rest, everything seems to function properly.  Calibration is good, again (although the fucking idiot who decided recalibration of the touch panel should require a reboot wants taking out back, shooting, and burying next to the bloke that decided changing network settings in WinNT required a reboot).&lt;br /&gt;&lt;br /&gt;I find Android *very* difficult to get along with.  Swipes are taken as taps, taps as swipes, and the keyboard / input handling is painful.  I've found this to be the case on Android phones, too, but it sometimes makes me want to throw the damned thing at a wall.&lt;br /&gt;&lt;br /&gt;The onboard browser is pretty good (no flash, thankfully), renders pretty fast, and appears to work OK.  I might try Opera Mini to get some adblocking, though.&lt;br /&gt;&lt;br /&gt;Although the onboard speakers are not properly driven by Android, the external audio is plenty loud enough, and without the usual underlying low-level hiss of cheap audio gear (tested with FLAC rips of Om's "At Giza" and Miles Davis' "Jeru"), and doesn't overload even with some really nasty noise pushed through it (Mainliner's "Black Sky").  As an audio player it's pretty good.&lt;br /&gt;&lt;br /&gt;Gaming-wise, Quake3 runs nicely at 30-40fps, and Modern Combat: Sandstorm is equally playable (although multi-touch would be handy for this).  
Radiant Lite is a nice little "old school" shooter, too.&lt;br /&gt;&lt;br /&gt;I haven't tested video playback, but given the gaming performance, it shouldn't be an issue.&lt;br /&gt;&lt;br /&gt;Other apps were less endearing, and indicative of what the average android experience might be, regardless of platform.   Many apps don't bother to check screen size, and run in a little window on the screen.  That's just lazy coding.  Others crash with no explanation, or simply hang.  Again, bad coding - if they are crashing because of missing features, there should be some sort of feedback.&lt;br /&gt;&lt;br /&gt;The device does miss a few features that Google, in their wisdom, have deemed "compulsory" for Android devices, namely: accelerometer, compass (and, to my chagrin, in my case, GPS).  Not that I really need another GPS enabled device, but hey, I was kinda looking forward to playing "Zombies! Run!".&lt;br /&gt;&lt;br /&gt;Overall, Android still feels part-finished, even allowing for the missing functionality of a beta release.  That's part of the overall Google "permanent beta" thing, I guess.  It's not awful, and it's certainly better than Wince, but that's not saying much.&lt;br /&gt;&lt;br /&gt;More later.  
Hopefully I'll be able to find my spudgers and do a teardown, and I'll do a side-by-side comparison versus my Venerable Newton.&lt;br /&gt;&lt;br /&gt;Later&lt;br /&gt;&lt;br /&gt;Simon&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-5894852525804200309?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/5894852525804200309/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/new-toy.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/5894852525804200309'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/5894852525804200309'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2010/07/new-toy.html' title='A New Toy'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-1257796216629429135</id><published>2009-12-06T11:59:00.000+01:00</published><updated>2009-12-06T17:59:19.591+01:00</updated><title type='text'>Performance, performance, performance</title><content type='html'>Let's look at improving the performance of the synthesis code I had written. It's pretty nippy, but we can shave some cycles off it, I'm sure.  
Remember, we're running an interrupt at 32kHz, and then using a divider per "virtual oscillator" to synthesize our waveform; given that we toggle the volume up or down per interrupt, that gives us a maximum frequency of 16kHz.  On top of that, the high frequency range response is going to be pretty poor, as illustrated below&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;"Ticks"  Frequency  Note (approx)&lt;br /&gt;0x0000 = 16kHz&lt;br /&gt;0x0001 = 8kHz&lt;br /&gt;0x0002 = 4kHz     = C8&lt;br /&gt;0x0003 = 2.6kHz   = E7&lt;br /&gt;0x0004 = 2 kHz    = C7&lt;br /&gt;0x0005 = 1.6kHz   = G6&lt;br /&gt;0x0006 = 1.3kHz   = E6&lt;br /&gt;0x0007 = 1.143kHz = D6&lt;br /&gt;0x0008 = 1kHz     = C6&lt;br /&gt;0x0009 = 0.89kHz  = A5&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;As can be seen, we don't even begin to hit (for varying values of "hit", those with perfect pitch need not apply) every note until we get down pretty low.  If we up our interrupt rate to 64kHz, we get the following:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;"Ticks"  Frequency  Note (approx)&lt;br /&gt;0x0000 = 32kHz&lt;br /&gt;0x0001 = 16kHz&lt;br /&gt;0x0002 = 8kHz&lt;br /&gt;0x0003 = 5.3kHz &lt;br /&gt;0x0004 = 4 kHz    = C8&lt;br /&gt;0x0005 = 3.2kHz   = G7&lt;br /&gt;0x0006 = 2.6kHz   = E7&lt;br /&gt;0x0007 = 2.3kHz   = D7&lt;br /&gt;0x0008 = 2kHz     = C7&lt;br /&gt;0x0009 = 1.78kHz  = A6&lt;br /&gt;0x000a = 1.6kHz   = G6&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That gives us a much better spread, and still leaves us plenty of sub-audio / LFO range at the bottom.  However, 64kHz only leaves us 256 cycles to play with, and our existing code only leaves us about 50 cycles of headroom for other code, like dealing with user interface and actually twiddling frequencies.  Not good enough.  
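&lt;br /&gt;&lt;br /&gt;As a rough sanity check on that budget (a sketch only, assuming the 16MHz clock and that the timer divider is exactly the number of CPU cycles between interrupts; the "kHz" figures used here are rounded to powers of two):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Divider  Interrupt rate               Cycles per interrupt&lt;br /&gt;512      16,000,000 / 512 = 31,250Hz  512  ("32kHz")&lt;br /&gt;256      16,000,000 / 256 = 62,500Hz  256  ("64kHz")&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;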
Also, it would be nice to have per-channel fading.&lt;br /&gt;&lt;br /&gt;So, first off, let's look at the original code.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;f_timer_interrupt:&lt;br /&gt;        clr     _volume         ; 1     ; volume to zero&lt;br /&gt;        ld      a, #0x07        ; 1     ; channel counter: 8 channels, 7 down to 0&lt;br /&gt;        ldw     x, #_channels   ; 2     ; load x register with address of channel data&lt;br /&gt;dochannel:&lt;br /&gt;        ldw     y, x            ; 1     ; Load y register with ticks_left&lt;br /&gt;        ldw     y, (y)          ; 2&lt;br /&gt;        jrmi    br3             ; 1 / 2 ; skip if channel is off (bit 15 of ticks_left set)&lt;br /&gt;        decw    y               ; 2     ; decrement&lt;br /&gt;        jrpl    br1             ; 1 / 2 ; if not negative, skip&lt;br /&gt;        ldw     y, x            ; 1     ; reset ticks_left&lt;br /&gt;        ldw     y, (0x02, y)    ; 2&lt;br /&gt;        ldw     (x), y          ; 2     ; store ticks left&lt;br /&gt;        jra     br2             ; 2     ; and skip unnecessary work&lt;br /&gt;br1:    ldw     (x), y          ; 2     ; store ticks_left&lt;br /&gt;        ldw     y, x            ; 1     ; get number of ticks per cycle&lt;br /&gt;        ldw     y, (0x02, y)    ; 2&lt;br /&gt;br2:    srlw    y               ; 2     ; divide by 2&lt;br /&gt;        cpw     y, (x)          ; 2     ; compare with ticks_left&lt;br /&gt;        jrmi    br3             ; 1 / 2&lt;br /&gt;        inc     _volume         ; 1     ; increment volume&lt;br /&gt;br3:    addw    x, #0x0004      ; 2     ; go 4 bytes up the channel data list&lt;br /&gt;        dec     a               ; 1&lt;br /&gt;        jrpl    dochannel       ; 1 / 2 ; go around if we have more channels to do&lt;br /&gt;               &lt;br /&gt;        bres    0x5255, #0x00   ; 1     ; Clear TIM1 Interrupt pending bit&lt;br /&gt;        iret                    ; 11    ; and return&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Okay, it's 
pretty good, but an obvious optimisation would be to "unroll the loop", which saves us a few cycles per channel.  Unfortunately, this is gonna make our code unreadable and unmaintainable, a great long string of assembler.  &lt;br /&gt;&lt;br /&gt;That's what macros are made for.  Macros are a bit like the C preprocessor on steroids: you can define a bunch of "inlined" code as a macro, which will be substituted into the code wherever it's used.&lt;br /&gt;&lt;br /&gt;So, what we're going to do is write a macro that deals with any particular channel.  Having 8 instances of a macro avoids the loop, without rewriting the same code over and over again.  Nice.  On top of that, we can pass in the "channel base address" to the macro, and avoid all the "ldw y, x; ldw y, (y)" tango we were required to do.  That's good for a load of cycles per iteration, and gives us enough headroom to do channel fading as well.&lt;br /&gt;&lt;br /&gt;So, let's redefine our structures, adding a per-channel volume control.  We'll use this, instead of the hokey "top bit" approach taken previously, to know if we can exit fast.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;typedef struct {&lt;br /&gt;        u8      volume;&lt;br /&gt;        u16     ticks_left;&lt;br /&gt;        u16     ticks;&lt;br /&gt;} channel;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Now, in our assembler code, we change the way the zero-page variables are laid out, to match.&lt;br /&gt;&lt;pre&gt;switch .ubsct&lt;br /&gt;_channels:&lt;br /&gt;chan0:  ds.b    5&lt;br /&gt;chan1:  ds.b    5&lt;br /&gt;chan2:  ds.b    5&lt;br /&gt;chan3:  ds.b    5&lt;br /&gt;chan4:  ds.b    5&lt;br /&gt;chan5:  ds.b    5&lt;br /&gt;chan6:  ds.b    5&lt;br /&gt;chan7:  ds.b    5&lt;br /&gt;_volume: ds.b    1&lt;br /&gt;; make them visible to C code&lt;br /&gt;xdef _channels&lt;br /&gt;xdef _volume&lt;/pre&gt;&lt;br /&gt;We keep the "top" label for C, and then we have individual channel labels for the assembler code (we could use simple math, but this makes things more 
explicit).&lt;br /&gt;&lt;br /&gt;Now, the macro itself.&lt;br /&gt;&lt;pre&gt;dochan: macro \chan&lt;br /&gt;        ld      a, \chan        ; 1     ; move volume to accumulator&lt;br /&gt;        jreq    \@done          ; 1 / 2 ; quit if zero volume&lt;br /&gt;        ldw     x, \chan + 1    ; 2     ; get ticks_left&lt;br /&gt;        ldw     y, \chan + 3    ; 2     ; load ticks&lt;br /&gt;        decw    x               ; 2     ; decrement ticks_left&lt;br /&gt;        jrpl    \@br1           ; 1 / 2 ; branch if still ticking&lt;br /&gt;        ldw     x, y            ; 1     ; rolled over: reload ticks_left from ticks&lt;br /&gt;\@br1:  ldw     \chan + 1, x    ; 2     ; store new ticks_left&lt;br /&gt;        srlw    y               ; 2     ; divide ticks by 2&lt;br /&gt;        cpw     y, \chan + 1    ; 2     ; compare with ticks_left&lt;br /&gt;        jrmi    \@done          ; 1 / 2 ;&lt;br /&gt;        add     a, _volume      ; 1     ; add volume&lt;br /&gt;        ld      _volume, a      ; 1     ; and store&lt;br /&gt;\@done:&lt;br /&gt;endm&lt;/pre&gt;&lt;br /&gt;Passing in a base address, this will handle the calculations for a single channel.  Best case performance (channel muted) is a mere 3 cycles, best case (channel non-muted) is 18 cycles, worst case is 19 cycles.  Any given channel will spend half of its time at 18 cycles and half at 19 cycles, giving us an average of 18.5 cycles / non-muted channel.  
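&lt;br /&gt;&lt;br /&gt;For the record, here's how those figures fall out of the per-instruction cycle counts in the comments above (taken at face value, not re-measured on hardware):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;muted:          ld (1) + jreq taken (2)                          =  3&lt;br /&gt;common part:    ld (1) + jreq (1) + ldw (2) + ldw (2) + decw (2)&lt;br /&gt;                + jrpl taken (2), or jrpl (1) + ldw x,y (1)&lt;br /&gt;                + ldw (2) + srlw (2) + cpw (2)                   = 16&lt;br /&gt;volume skipped: 16 + jrmi taken (2)                              = 18&lt;br /&gt;volume added:   16 + jrmi (1) + add (1) + ld (1)                 = 19&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;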
There's probably a few more cycles to shave off there, too.&lt;br /&gt;&lt;br /&gt;Calling the macro is simple.&lt;br /&gt;&lt;pre&gt;f_timer_interrupt:&lt;br /&gt;        clr     _volume         ; 1     ; volume to zero&lt;br /&gt;        dochan  chan0           ; 18.5&lt;br /&gt;        dochan  chan1           ; 18.5&lt;br /&gt;        dochan  chan2           ; 18.5&lt;br /&gt;        dochan  chan3           ; 18.5&lt;br /&gt;        dochan  chan4           ; 18.5&lt;br /&gt;        dochan  chan5           ; 18.5&lt;br /&gt;        dochan  chan6           ; 18.5&lt;br /&gt;        dochan  chan7           ; 18.5&lt;br /&gt;        bres 0x5255, #0x00      ; 1     ; Clear TIM1 Interrupt pending bit&lt;br /&gt;        iret                    ; 11    ; and return&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;As can be seen, the static overhead is now down to 13 cycles, so assuming all channels are "on", worst case time taken is going to be 13 + (19 * channels).  For 8 channels, this is 165 cycles (161 cycles average).  That fits quite neatly into the 256 cycles we have per interrupt, with 90-odd cycles left over.  Not only that, but we can easily trim down the number of channels to fit if needs be - if we wanted, for example, 4 channels would take only 89 cycles, or 6 channels (for 4 drones and 2 LFOs) 127 cycles, or 50% CPU.&lt;br /&gt;&lt;br /&gt;That's pretty good.  But we can do better.&lt;br /&gt;&lt;br /&gt;Let's think about how we are going to go about getting audio off the board.  An R-2R network would work, but uses up to 8 pins and a bunch of additional hardware - why do extra work when we can let the processor do the hard lifting for us?  What we need is to use hardware PWM, and all we need then is a single pin and a lowpass filter.  As this is all going to be thrown out to a set of analog filters anyway, a single-pin approach seems more than reasonable.  Now, we're already doing PWM for the LED brightness, so why not generalise that?  &lt;br /&gt;&lt;br /&gt;So.  
First of all, we set Timer 3 to a prescaler of one and period of 256.  That gives us 8-bit PWM, straight off the bat.  We can now do away with the 'volume' variable, and rather than storing stuff in and out of memory, do it all in the accumulator, dumping that direct to the timer at the end of the interrupt.  We lose the extreme time saving of skipping when a channel is set to volume zero, but we gain speed and cycles overall.&lt;br /&gt;&lt;br /&gt;Here's the new assembler code:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;dochan: macro \chan&lt;br /&gt;        ldw     x, \chan + 1    ; 2     ; get ticks_left&lt;br /&gt;        ldw     y, \chan + 3    ; 2     ; load ticks&lt;br /&gt;        decw    x               ; 2     ; decrement ticks_left&lt;br /&gt;        jrpl    \@br1           ; 1 / 2 ; branch if still ticking&lt;br /&gt;        ldw     x, y            ; 1     ; load ticks&lt;br /&gt;\@br1:  ldw     \chan + 1, x    ; 2     ; store new ticks_left&lt;br /&gt;        srlw    y               ; 2     ; divide ticks by 2&lt;br /&gt;        cpw     y, \chan + 1    ; 2     ; compare with ticks_left&lt;br /&gt;        jrmi    \@done          ; 1 / 2 ;&lt;br /&gt;        add     a, \chan        ; 1     ; add volume&lt;br /&gt;\@done:&lt;br /&gt;endm&lt;br /&gt;&lt;br /&gt;; 14 + (16 * 8) = 142 cycles&lt;br /&gt;f_timer_interrupt:&lt;br /&gt;        clr     a               ; 1     ; initial volume to 0&lt;br /&gt;        dochan  chan0           ;&lt;br /&gt;        dochan  chan1           ;&lt;br /&gt;        dochan  chan2           ;&lt;br /&gt;        dochan  chan3           ;&lt;br /&gt;        dochan  chan4           ;&lt;br /&gt;        dochan  chan5           ;&lt;br /&gt;        dochan  chan6           ;&lt;br /&gt;        dochan  chan7           ;&lt;br /&gt;        ld 0x5330, a            ; 1     ; Store volume in TIM3 PWM register&lt;br /&gt;        bres 0x5255, #0x00      ; 1     ; Clear TIM1 Interrupt pending bit&lt;br /&gt;        iret                 
   ; 11    ; and return&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That gets us down to 16 cycles per channel in *all* cases, leaving us with a total of 142 cycles for the entire interrupt (just over 50% of CPU), with 8 channels.&lt;br /&gt;&lt;br /&gt;Our main loop is, of course, now reduced to a simple&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;while(1);&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Moving from flashy LED action to annoying drony noise action is a simple case of switching to TIM3_OC1, which outputs its PWM on Port D pin 2 - hooking a speaker across CN4 pin 7 (PD2) and CN1 Pin 4 (GND) indeed gives us an annoying drony beaty noise after pushing some of the virtual oscillator frequencies into the audio range.&lt;br /&gt;&lt;br /&gt;Software Synthesis.  Gotta love it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-1257796216629429135?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/1257796216629429135/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2009/12/performance-performance-performance.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/1257796216629429135'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/1257796216629429135'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2009/12/performance-performance-performance.html' title='Performance, performance, performance'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-472235013442047941</id><published>2009-12-06T09:08:00.000+01:00</published><updated>2009-12-06T10:01:57.714+01:00</updated><title type='text'>Why Assembler is better</title><content type='html'>Following on from my last post, I thought I'd expand a little on how assembler is better than C.  On "real" computers, it's often said that a C or C++ compiler will, in the vast majority of cases, produce better code than you can do by hand in assembler.  This is largely true, especially when you're dealing with multiple cores and the like.&lt;br /&gt;&lt;br /&gt;On microcontrollers, however, and especially 8 / 16 bit ones, the C compilers aren't "all that", and the additional overhead imposed by a C compiler can kill your application stone dead.&lt;br /&gt;&lt;br /&gt;Let's take a concrete example. &lt;br /&gt;&lt;br /&gt;In my (very non-optimal) assembler example posted before, I need to clear the interrupt pending flag for the Timer interrupt I'm servicing.  This is easily done in assembler, it's a one-line, one clock cycle instruction, as follows:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;bres        0x5255, #0x00   ; Clear TIM1 Interrupt pending bit&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Simple, right?  Now, let's look at what the C compiler gives us.&lt;br /&gt;&lt;br /&gt;Here's the "C" code we use:&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;// Clear the interrupt pending bit for TIM1.&lt;br /&gt;TIM1_ClearITPendingBit(TIM1_IT_UPDATE);&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Simple enough, right?  There's obviously the overhead of a function call, but we might expect the guts of the function to do a simple bit of inline assembler as above.  
Let's look.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;void TIM1_ClearITPendingBit(TIM1_IT_TypeDef TIM1_IT)&lt;br /&gt;{&lt;br /&gt;    /* Check the parameters */&lt;br /&gt;    assert_param(IS_TIM1_IT_OK(TIM1_IT));&lt;br /&gt;&lt;br /&gt;    /* Clear the IT pending Bit */&lt;br /&gt;    TIM1-&gt;SR1 = (u8)(~(u8)TIM1_IT);&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So, let's look at what that C code produces.&lt;br /&gt;&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;&lt;br /&gt;main.c:64  TIM1_ClearITPendingBit(TIM1_IT_UPDATE); &lt;br /&gt;0x91c3 &lt;main+134&gt;           0xA601          LD    A,#0x01             LD    A,#0x01 &lt;br /&gt;0x91c5 &lt;main+136&gt;           0xCD8C9C        CALL  0x8c9c              CALL  _TIM1_ClearITPendingBit &lt;br /&gt;&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;stm8s_tim1.c:2156     TIM1-&gt;SR1 = (u8)(~(u8)TIM1_IT); &lt;br /&gt;0x8c9c &lt;.ClearITPendingBit&gt; 0x43            CPL   A                   CPL   A &lt;br /&gt;0x8c9d &lt;.earITPendingBit+1&gt; 0xC75255        LD    0x5255,A            LD    0x5255,A &lt;br /&gt;stm8s_tim1.c:2157 } &lt;br /&gt;0x8ca0 &lt;.earITPendingBit+4&gt; 0x81            RET                       RET &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So, we load the accumulator with a value, that's one cycle.  Call a function, 4 cycles.  Complement the accumulator, one cycle.  Store the accumulator in the flag, 1 cycle.  Return from function, 4 cycles.  &lt;br /&gt;&lt;br /&gt;In total, that's 11 cycles and 10 bytes to do the same thing we did in one cycle and 3 bytes. 
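&lt;br /&gt;&lt;br /&gt;Tallying that up instruction by instruction, with the byte counts read off the opcode column of the listing and the cycle counts taken at face value from the figures just quoted:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Instruction     Bytes  Cycles&lt;br /&gt;LD   A,#0x01      2      1&lt;br /&gt;CALL 0x8c9c       3      4&lt;br /&gt;CPL  A            1      1&lt;br /&gt;LD   0x5255,A     3      1&lt;br /&gt;RET               1      4&lt;br /&gt;Total            10     11&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;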
"inlining" this function doesn't make our code any fatter, either, as the 3 bytes we're using are the same as the 3 bytes we would have used to call the function.&lt;br /&gt;&lt;br /&gt;What we do lose is readability, but that's easily enough got back by writing macros.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/9184145065528610199-472235013442047941?l=stm8sdiscovery.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/472235013442047941/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2009/12/why-assembler-is-better.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/472235013442047941'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/472235013442047941'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2009/12/why-assembler-is-better.html' title='Why Assembler is better'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-9184145065528610199.post-2125083467092390482</id><published>2009-12-05T21:23:00.000+01:00</published><updated>2009-12-05T22:14:39.833+01:00</updated><title type='text'>Mixing Assembler and C on the STM8S-Discovery board</title><content type='html'>So, been playing with my STM8S-discovery boards, here's some stuff you might like.  
This builds on Ben's stuff here - http://www.benryves.com/journal/3567231&lt;br /&gt;&lt;br /&gt;Now, one of my projects is to do with making a multi-channel drone synthesizer.  I should really do it with discrete logic and amps, but hey, it's a quickie.  What I want to do is 8 channels of audio-range square waves, at differing and varying frequencies.  How we go about varying the frequencies is another matter, but the main problem is doing 8 channels - I could do 4 with just the timers, but 8 is pretty tricksy. &lt;br /&gt;&lt;br /&gt;Basically, the approach I'm taking is to use a timer firing at a regular frequency, and do additive synthesis to make the final output volume (I'll initially be using a simple approach with no fading of the 8 pseudo-oscillators, so the maximum output volume is 8; that lets us get away with a 4 pin R-2R DAC, but more on that later).&lt;br /&gt;&lt;br /&gt;Now, we need to work out how much we can do, and how long we have to do it.  The clock of the disco board runs at up to 16 MHz, so we can work out easily enough what frequencies we can call our hypothetical interrupt at.  The prescaler value gives us, directly, the number of clocks we have to play with at that frequency:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;1024 = 16kHz&lt;/li&gt;&lt;li&gt;512 = 32kHz&lt;/li&gt;&lt;li&gt;256 = 64kHz&lt;/li&gt;&lt;li&gt;128 = 128kHz&lt;/li&gt;&lt;/ol&gt;and so on.  Bearing in mind that human hearing tops out at ~20kHz, we'd probably like to be using a 32kHz or better clock.  Indeed, due to the way my software works, I can only generate up to 16kHz with a 32kHz clock, and there's a really coarse granularity on the high end - it's basically not much cop above 1kHz or so, around 2 octaves up from middle C.&lt;br /&gt;&lt;br /&gt;Anyway, 512 cycles is not much to play with.  We're gonna need some assembler.  C code is waaaay too fat.  
My first cut interrupt routine has a "worst case" of just over 200 cycles,  so that cuts out 64kHz if we want to do anything clever elsewhere.  And we do.  So, 32kHz it is, and my interrupt is gonna use around 1/3 of total CPU all by itself.  I should be able to streamline it a bit (or even a lot), but I doubt it's ever going to be able to run at 64kHz.  As an idea of how fat C code is, the equivalent C routine had a worst case of 500 cycles.  Yowza.&lt;br /&gt;&lt;br /&gt;So, how do we make our assembler code work with existing "C"?  Well, first thing to do is take a trivial program and compile it; in the "Debug" folder of the project you'll find the generated assembler code (for example, main.ls) - look at this and you can find out how the name mangling works.&lt;br /&gt;&lt;br /&gt;In my case, I was interested in the names of functions and global variables, so I compiled up a quick program with one global variable and one function.&lt;br /&gt;&lt;br /&gt;For a function name of "MyFunction", the actual label generated in the assembler (and thus the format of the name we need to use) is "f_MyFunction".  Groovy.  Variables have an underscore, so "my_var" becomes "_my_var".  Now we know that, we can do some work.&lt;br /&gt;&lt;br /&gt;First, we edit the interrupt table, adding an "extern" reference to our assembler routine.  This stops the C compiler kicking up a fuss.&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;extern @far @interrupt void timer_interrupt(void);&lt;/pre&gt;Now, we add a reference to that routine to the table itself.&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;...&lt;br /&gt;{0x82, timer_interrupt}, /* irq11 */&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;I'm using irq11 because that's the timer 1 overflow interrupt.&lt;br /&gt;&lt;br /&gt;Now.  I'm going to need access to the interrupt's data from elsewhere, so we need to add some references to them, too.  
I've shoved them in main.c.&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;// For our purposes&lt;br /&gt;typedef struct {&lt;br /&gt;    u16 ticks_left;&lt;br /&gt;    u16 ticks;&lt;br /&gt;} channel;&lt;br /&gt;&lt;br /&gt;// Extern, these are defined in synth.asm&lt;br /&gt;extern channel channels[8];&lt;br /&gt;extern u8 volume;&lt;br /&gt;&lt;/pre&gt;So, we have an array of 8 "channel data" structures, with "ticks" defining the number of interrupt "ticks" before we roll over, and "ticks_left" being the number of interrupt ticks this channel has left before its next rollover (it counts down, and gets reloaded from "ticks").  Pretty simple, really.  There's also a single 8-bit "volume" field.  Note the "extern" on both of these - this tells the compiler they are actually defined elsewhere. &lt;br /&gt;&lt;br /&gt;As I haven't bothered to wire up an R-2R ladder on my board yet, I decided to test with a bit of PWM on the LED.  This follows on directly from Ben's stuff.&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;&lt;br /&gt;main()&lt;br /&gt;{&lt;br /&gt;  // Set the internal high-speed oscillator prescaler to 1 to run at 16/1=16MHz.&lt;br /&gt;  CLK_HSIPrescalerConfig(CLK_PRESCALER_HSIDIV1);&lt;br /&gt;&lt;br /&gt;  // Reset ("de-initialise") TIM1.&lt;br /&gt;  TIM1_DeInit();&lt;br /&gt;  // Set TIM1 to use a prescaler of 512 and to have a period of 1.&lt;br /&gt;  TIM1_TimeBaseInit(512, TIM1_COUNTERMODE_UP, 1, 0);&lt;br /&gt;  // Set TIM1 to generate interrupts every time the counter overflows.  
With&lt;br /&gt;  // prescaling of 512, this is a frequency of 32kHz&lt;br /&gt;  TIM1_ITConfig(TIM1_IT_UPDATE, ENABLE);&lt;br /&gt;  // Enable TIM1.&lt;br /&gt;  TIM1_Cmd(ENABLE);&lt;br /&gt;&lt;br /&gt;  // For visualisation only&lt;br /&gt;  // Reset ("de-initialise") TIM3.&lt;br /&gt;  TIM3_DeInit();&lt;br /&gt;  // Set TIM3 to use a prescaler of 1 and have a period of 999.&lt;br /&gt;  TIM3_TimeBaseInit(TIM3_PRESCALER_1, 999);&lt;br /&gt;  // Initialise output channel 2 of TIM3.&lt;br /&gt;  TIM3_OC2Init(TIM3_OCMODE_PWM1, TIM3_OUTPUTSTATE_ENABLE, 0, TIM3_OCPOLARITY_LOW);&lt;br /&gt;&lt;br /&gt;  // Enable TIM3.&lt;br /&gt;  TIM3_Cmd(ENABLE);&lt;br /&gt;&lt;br /&gt;  // Set up our channels&lt;br /&gt;  channels[0].ticks_left = 0;&lt;br /&gt;  channels[1].ticks_left = 0;&lt;br /&gt;  channels[2].ticks_left = 0;&lt;br /&gt;  channels[3].ticks_left = 0;&lt;br /&gt;  channels[4].ticks_left = 0;&lt;br /&gt;  channels[5].ticks_left = 0;&lt;br /&gt;  channels[6].ticks_left = 0;&lt;br /&gt;  channels[7].ticks_left = 0;&lt;br /&gt;  channels[0].ticks = 0x4000;&lt;br /&gt;  channels[1].ticks = 0x4010;&lt;br /&gt;  channels[2].ticks = 0x4020;&lt;br /&gt;  channels[3].ticks = 0x4040;&lt;br /&gt;  channels[4].ticks = 0x4080;&lt;br /&gt;  channels[5].ticks = 0x4100;&lt;br /&gt;  channels[6].ticks = 0x4200;&lt;br /&gt;  channels[7].ticks = 0x4400;&lt;br /&gt;&lt;br /&gt;  enableInterrupts();&lt;br /&gt;&lt;br /&gt;  while (1) {&lt;br /&gt;    TIM3_SetCompare2(volume * (1000 / 8));&lt;br /&gt;  }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt; What's interesting is the setting of the various channels to different (very low frequency) rollover periods - it starts off with everything together, and the channels slowly end up "beating" against one another, which does "interesting" stuff to the LED.  The LED itself is PWM controlled to set its brightness to one of 8 levels.&lt;br /&gt;&lt;br /&gt;Of course, none of this is any good without the interrupt routine itself.  
This goes in a totally separate file with a ".asm" extension.  The toolkit knows how to deal with it.&lt;br /&gt;&lt;pre style="font-family: courier new;"&gt;&lt;br /&gt;; names mangled to C code specs&lt;br /&gt;; uninitialised data in page zero&lt;br /&gt;switch .ubsct&lt;br /&gt;_channels:&lt;br /&gt;        ds.w    16&lt;br /&gt;_volume:&lt;br /&gt;        ds.b    1&lt;br /&gt;&lt;br /&gt;; make them visible to C code&lt;br /&gt;xdef _channels&lt;br /&gt;xdef _volume&lt;br /&gt;&lt;br /&gt;; code&lt;br /&gt;switch .text&lt;br /&gt;f_timer_interrupt:&lt;br /&gt;        clr     _volume         ; 1     ; volume to zero&lt;br /&gt;        ld      a, #0x07        ; 1     ; set channel counter (7 down to 0)&lt;br /&gt;        ldw     x, #_channels   ; 2     ; load x register with address of channel data&lt;br /&gt;dochannel:&lt;br /&gt;        ldw     y, x            ; 1     ; Load y register with ticks_left&lt;br /&gt;        ldw     y, (y)          ; 2&lt;br /&gt;        jrmi    br3             ; 1 / 2 ; skip if channel is off (bit 15 of ticks_left set)&lt;br /&gt;        decw    y               ; 2     ; decrement&lt;br /&gt;        jrpl    br1             ; 1 / 2 ; if not negative, skip&lt;br /&gt;        ldw     y, x            ; 1     ; reset ticks_left&lt;br /&gt;        ldw     y, (0x02, y)    ; 2&lt;br /&gt;        ldw     (x), y          ; 2     ; store ticks_left&lt;br /&gt;        jra     br2             ; 2     ; and skip unnecessary work&lt;br /&gt;br1:    ldw     (x), y          ; 2     ; store ticks_left&lt;br /&gt;        ldw     y, x            ; 1     ; get number of ticks per cycle&lt;br /&gt;        ldw     y, (0x02, y)    ; 2&lt;br /&gt;br2:    srlw    y               ; 2     ; divide by 2&lt;br /&gt;        cpw     y, (x)          ; 2     ; compare with ticks_left&lt;br /&gt;        jrmi    br3             ; 1 / 2 ; skip if ticks_left is still above the halfway point&lt;br /&gt;        inc     _volume         ; 1     ; increment volume&lt;br /&gt;br3:    addw    x, #0x0004      ; 2     ; go 4 
bytes up the channel data list&lt;br /&gt;        dec     a               ; 1&lt;br /&gt;        jrpl    dochannel       ; 1 / 2 ; go around if we have more channels to do&lt;br /&gt;               &lt;br /&gt;        bres    0x5255, #0x00   ; 1     ; Clear TIM1 Interrupt pending bit&lt;br /&gt;        iret                    ; 11    ; and return&lt;br /&gt;               &lt;br /&gt;; make function visible to C code&lt;br /&gt;xdef f_timer_interrupt&lt;br /&gt;&lt;/pre&gt;You'll note the name mangling, and the "xdef" lines to make the labels externally visible.  Also, that I've somewhat anally added the number of cycles per operation to the source.&lt;br /&gt;&lt;br /&gt;This routine allows me to kill any one channel at the next call to the interrupt by setting its "ticks" value to 0x8000, and "ticks_left" to 0x0000.&lt;br /&gt;&lt;br /&gt;Hope this is of some use to people.</content><link rel='replies' type='application/atom+xml' href='http://stm8sdiscovery.blogspot.com/feeds/2125083467092390482/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://stm8sdiscovery.blogspot.com/2009/12/so-been-playing-with-my-stm8s-discovery.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/2125083467092390482'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/9184145065528610199/posts/default/2125083467092390482'/><link rel='alternate' type='text/html' href='http://stm8sdiscovery.blogspot.com/2009/12/so-been-playing-with-my-stm8s-discovery.html' title='Mixing Assembler and C on the STM8S-Discovery 
board'/><author><name>tufty</name><uri>http://www.blogger.com/profile/00602829823862264815</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry></feed>
