Home | Projects | Notes > Computer Architecture & Organization > ARM Assembly Language
At the linking phase, external references in the programs (e.g., .global main
, .global printf
, .global scanf
in the assembly program) are resolved because the linker now has access to all those pieces of code.
The last thing the linker will do is to load the address for the start of your code (main) into the Program Counter (PC). At that time your program takes over the CPU.
An assembler is a program that converts the assembly code into the machine code (binary).
The output of the assembler is directly executed by the CPU (mostly).
Two-Pass Assembler
First pass
A symbol table is built.
The labels are put in a table and the assembler determines the address assigned to that label.
Memory allocations are made for the instructions and data.
Any incorrect instructions, syntax and referenced labels that are not defined are flagged as errors.
Extra spaces, all comments are ignored by the assembler.
Most modern assemblers are NOT case sensitve.
Second pass
All symbol references in the code are resolved and the final binary code is generated.
Typical instruction in memory will look something like this:
xxxxxxxxxx
11Opcode | Addressing Mode | Operands
Even with RISC machines there are several instruction formats depending on the instruction, associated addressing mode and other options like shifting of registers, etc.
The detailed manual on the ARM assembler provides all this information.
Assembler Directives
Assembler directives tell the assembler how to assemble your code. They do NOT get translated into machine code but CAN affect the way the code is created.
[!] Note: Assembly instructions or executable instructions are translated into machine code which are executed by the processor. This is your program.
Assemblers vary and sometimes quite a lot. In most cases:
Ignore extra spaces between operands.
e.g., r1,r2,r3
and r1, r2, r3
are the same.
Not case sensitive for instructions.
e.g., ADD
, add
, Add
are the same.
Comments can start one space after the last operand, and can be /* */
(multi-line) or @
(single-line) delimited.
Ignore extra spaces between operands.
e.g., r1,r2,r3
is identical to r1, r2, r3
.
Labels are case sensitive.
e.g., Loop:
and loop:
are different labels.
Macros
Last line in code file has to be a blank line.
Things you can do with assembler directives:
Where you want your code to be located in memory, when it is loaded.
Allocate storage space to variables
Initialize variables
Tell the assembler where your assembly code stops and do NOT assemble anything after this point (END
or STOP
)
Examples:
xxxxxxxxxx
341.balign 4 @ Forces a word boundary.
2 @ 4 specifies the number of bytes that must be aligned to.
3 @ (this number must be power of 2).
4 @ It means, the next piece of memory that I declare after this directive
5 @ need to start at a memory location that is divisible by 4. It has to be
6 @ aligned with the memory locations that are divisible by 4.
7 @
8 @ When you declare an array, this is important because you want
9 @ each of those elements to have an individual address that is
10 @ divisible by 4. If it is not divisible by 4 (word size), 1, 2,
11 @ or 3 bytes of an individual element can be partially stored before
12 @ the array, after that particular location.
13
14Q: .word 9 @ Defines a label 'Q' at current memory location as word-size and
15 @ sets it to a decimal 9.
16 @ .word allocates 4 byte memory space to hold data.
17 @ .hword allocates 2 byte memory space to hold data.
18 @ .byte allocates 1 byte memory space to hold data.
19
20Array: .skip 4x10 @ Defines a label 'Array' at current memory location and reserves
21 @ 40 bytes of memory.
22 @ Note that '.skip' does NOT initialize the allocated memory; Thus
23 @ 'Array' will contain whatever garbage values it was previously
24 @ set to.
25
26str1: .asciz "This is a sample string.\n"
27 @ .asciz puts a terminating null character at the end of the string.
28
29str2: .ascii "This is a sample string.\0\n"
30 @ .ascii does NOT put a terminating null character at the end.
31 @ (It must be explicitly added by the programmer.)
32 @
33 @ Note that using the null character \0 as the terminator
34 @ for strings is a C construct, not an assembly's.
To use the string defined in the .data
section:
xxxxxxxxxx
41LDR r1, =str1 @ Put the address of the start of the str1 in r1.
2 @ The = (equal sign) in front of a label reference will use
3 @ the address of the label NOT the contents of the memory
4 @ reference.
Good example of using the literal declaration in conjunction with the arrays:
xxxxxxxxxx
391.text
2...
3
4@ Even though the assembler allows you to put these declarations anywhere
5@ in the code, it is a good practice to put them at the top of your code
6@ for the better readability and maintainability.
7
8Monday = 1 @ .equ Monday, 1
9Tuesday = 2 @ .equ Tuesday, 2
10Wednesday = 3 @ .equ Wednesday, 3
11Thursday = 4 @ .equ Thursday, 4
12Friday = 5 @ .equ Friday, 5
13Saturday = 6 @ .equ Saturday, 6
14Sunday = 7 @ .equ Sunday, 7
15
16...
17
18.data
19
20DaysOfWeek: @ Now this is an array containing
21.word Monday @ 1, 2, 3, 4, 5, 6, 7
22.word Tuesday
23.word Wednesday
24.word Thursday
25.word Friday
26.word Saturday
27.word Sunday
28
29WeekendDays: @ Now this is an array containing
30.word Saturday @ 6, 7
31.word Sunday
32
33WeekDays: @ Now this is an array containing
34.word Monday @ 1, 2, 3, 4, 5
35.word Tuesday
36.word Wednesday
37.word Thursday
38.word Friday
39...
Location Counter
Location counter is the pointer to the next location in memory that the assembler maintains when a program is being assembled. This pointer is necessary for the assembler to do the proper memory allocation, variable initialization, etc.
Similar in concept to the program counter except the location counter is used during the assembly process and the program counter is used during the program execution process.