; this is a comment
ADD .L1 A1,A2,A3 ; add A1 and A2, store the result in A3
Each instruction can be executed only by particular functional units. Note that some instructions can be executed by several different functional units.
The following figure shows how data and addresses can be transferred between the registers, the functional units, and the external memory. If you observe carefully, the destination paths (marked as dst) going out of the .L1, .S1, .M1 and .D1 units are connected to register file A. This means that an instruction whose destination register is in register file A must be executed by one of these units. Likewise, if the destination register is in register file B, the .L2, .S2, .M2 and .D2 units should be used.
Therefore, if you know the instruction and the destination register, you should be able to assign the functional unit to it.
(Functional units): List all the functional units you can assign to each of these instructions:
ADD .?? A0,A1,A2
B .?? A1
MVKL .?? 000023feh, B0
LDW .?? *A10, A3
If you look at the figure again, each functional unit must receive one of its source operands from the corresponding register file. For example, look at the following assembly instruction:
ADD .L1 A0,B0,A1
The .L1 unit gets data from A0 (this is natural) and B0 (this is not) and stores the result in A1 (this is a must). The data path through which the content of B0 is conveyed to the .L1 unit is called the 1X cross path. When this happens, we add x to the functional unit to designate the cross path:
ADD .L1x A0,B0,A1
Similarly, the data path from register file A to the .M2, .S2 and .L2 units is called the 2X cross path.
(Cross path): List all the functional units that can be assigned to each of these instructions:
ADD .??? B0,A1,B2
MPY .??? A1,B2,A4
In fact, when you write an assembly program, you can omit the functional unit assignment altogether. The assembler figures out the available functional units and properly assigns them. However, manually assigned functional units help you to figure out where the actual execution takes place and how the data move around between register files and functional units. This is particularly useful when you put multiple instructions in parallel. We will learn about parallel instructions later on.
Now you should know enough about C6x assembly to implement the inner product algorithm, which computes the sum of the products a[n]*x[n] over all n.
(Inner product): Write the complete inner product assembly program to compute this sum, where a[] and x[] take the following values:
a[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, a }
x[] = { f, e, d, c, b, a, 9, 8, 7, 6 }
The a[] and x[] values must be stored in memory, and the inner product is computed by reading the memory contents.
When an instruction is executed, it goes through several steps: fetching, decoding, and execution. If these steps are done one at a time for each instruction, the CPU resources are not fully utilized. To increase the throughput, CPUs are designed to be pipelined, meaning that these steps are carried out at the same time for different instructions.