Storing the register contents uses the same addressing modes. The assembly instructions used for storing are STB, STH, and STW. Read and understand these instructions in the TI manual.
(Storing to memory): Write assembly instructions to store the 32-bit constant 53fe 23e4h to memory address 0000 0123h.
Intentionally left blank.
Sometimes it is necessary to access only part of the data stored in memory. For example, suppose you store the 32-bit word 0x11223344 at memory location 0x8000. The four bytes at addresses 0x8000, 0x8001, 0x8002, and 0x8003 then together hold the value 0x11223344. If you now read a single byte from memory location 0x8000, what byte value will be read?
The answer depends on the endian mode of the memory system. In the little endian mode, the lower memory addresses contain the LSB part of the data. Thus, the bytes stored in the four byte addresses will be as follows:
Address | Byte value
0x8000 | 0x44
0x8001 | 0x33
0x8002 | 0x22
0x8003 | 0x11
In the big endian mode, the lower memory addresses contain the MSB part of the data. Thus, we have:
Address | Byte value
0x8000 | 0x11
0x8001 | 0x22
0x8002 | 0x33
0x8003 | 0x44
In this course, we use the little endian mode by default and all the lab programming must assume the little endian mode.
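The two byte layouts above can be reproduced on any host machine with a small C model. This is only a sketch: the function names store_word_little, store_word_big, and load_byte are made up for illustration, and the four-element array stands in for the bytes at addresses 0x8000 through 0x8003.

```c
#include <assert.h>
#include <stdint.h>

/* Little endian: the LSB goes to the lowest address (mem[0]). */
void store_word_little(uint8_t mem[4], uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Big endian: the MSB goes to the lowest address (mem[0]). */
void store_word_big(uint8_t mem[4], uint32_t value) {
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}

/* Read one byte at a byte offset (0..3) from the word's base address. */
uint8_t load_byte(const uint8_t mem[4], int offset) {
    assert(offset >= 0 && offset < 4);
    return mem[offset];
}
```

Storing 0x11223344 with store_word_little and then calling load_byte with offset 0 returns 0x44, while the same read after store_word_big returns 0x11, matching the two tables.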
(Little endian mode): What will be the value in A0 after executing the following assembly instructions? (Functional unit specifications were omitted.)
MVKL 0x80000000, A10
MVKH 0x80000000, A10
MVKL 0x12345678, A9
MVKH 0x12345678, A9
STW A9, *A10
LDB *+A10[2],A0
What would the value in A0 be if the system used the big endian mode?
Intentionally left blank.
In fact, the above addressing method describes the so-called linear addressing mode (default upon reset), where the offset or increment/decrement of pointers occurs without bound. There is also a circular addressing mode that can handle a finite-size buffer efficiently. You will implement circular buffers for the FIR filtering algorithm in the FIR filtering experiments later.
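The difference between the two modes can be sketched in C as plain index arithmetic. This is an illustration of the idea only, not a description of how the C6x hardware is configured: BUF_LEN and both function names are hypothetical, and the power-of-two buffer length is a simplifying choice that lets the wrap-around be a single bit mask.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_LEN 8u /* hypothetical buffer length; must be a power of two here */

/* Linear addressing: the index keeps growing without bound. */
size_t advance_linear(size_t index, size_t step) {
    return index + step;
}

/* Circular addressing: the index wraps back to the start of the buffer. */
size_t advance_circular(size_t index, size_t step) {
    return (index + step) & (BUF_LEN - 1u);
}
```

Starting from index 7, a step of 1 gives 8 with advance_linear (past the end of an 8-element buffer) but 0 with advance_circular, which is the behavior a FIR delay line needs.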
In the C62x CPU, it takes exactly one CPU clock cycle to execute each instruction. However, instructions such as LDW need to access the slow external memory, and the results of the load are not available immediately at the end of the execution. The extra cycles that must pass before the result becomes available are called delay slots.
For example, consider loading the content of memory at the address pointed to by A10 into A1 and then moving the loaded data to A2. You might be tempted to write simple two-line assembly code as follows:
LDW .D1 *A10, A1
MV .D1 A1, A2
What is wrong with the above code? The result of the LDW instruction is not available immediately after LDW is executed. As a consequence, the MV instruction does not copy the desired value of A1 to A2. To prevent this undesirable execution, we need to make the CPU wait until the result of the LDW instruction is correctly loaded into A1 before executing the MV instruction. For load instructions, we need 4 extra clock cycles until the load results are valid. To make the CPU wait for 4 clock cycles, we need to insert 4 NOP (no operation) instructions between LDW and MV. Each NOP instruction makes the CPU idle for one clock cycle. The resulting code will look like this:
LDW .D1 *A10, A1
NOP
NOP
NOP
NOP
MV .D1 A1, A2
Or, more simply, you can write:
LDW .D1 *A10, A1
NOP 4
MV .D1 A1, A2
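The effect of the delay slots can be imitated with a toy C model. This is a sketch under strong assumptions, not a cycle-accurate simulator: ToyCpu, toy_ldw, toy_nop, toy_mv, and run_load_then_move are hypothetical names, only registers A1 and A2 are modeled, and only one load may be in flight at a time.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_DELAY_SLOTS 4 /* extra cycles before a load result is valid */

typedef struct {
    uint32_t a1, a2;   /* modeled registers A1 and A2 */
    uint32_t pending;  /* loaded value still on its way to A1 */
    int slots_left;    /* delay slots remaining; 0 means no load in flight */
} ToyCpu;

/* End one cycle: the pending load lands once its last delay slot has passed. */
static void retire_cycle(ToyCpu *cpu) {
    if (cpu->slots_left > 0 && --cpu->slots_left == 0) {
        cpu->a1 = cpu->pending;
    }
}

/* LDW into A1: the value is not written yet, only scheduled. */
void toy_ldw(ToyCpu *cpu, uint32_t mem_value) {
    cpu->pending = mem_value;
    cpu->slots_left = LOAD_DELAY_SLOTS;
}

/* NOP count: idle for the given number of cycles. */
void toy_nop(ToyCpu *cpu, int count) {
    for (int i = 0; i < count; i++) {
        retire_cycle(cpu);
    }
}

/* MV A1, A2: reads whatever A1 holds in this cycle. */
void toy_mv(ToyCpu *cpu) {
    cpu->a2 = cpu->a1;
    retire_cycle(cpu);
}

/* Run LDW, then `nops` idle cycles, then MV; return the final A2. */
uint32_t run_load_then_move(int nops, uint32_t old_a1, uint32_t mem_value) {
    ToyCpu cpu = { old_a1, 0u, 0u, 0 };
    toy_ldw(&cpu, mem_value);
    toy_nop(&cpu, nops);
    toy_mv(&cpu);
    return cpu.a2;
}
```

In this model, run_load_then_move with 0 or 3 idle cycles returns the stale contents of A1, and only with 4 idle cycles does A2 receive the value loaded from memory.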