CMPE324, Final, 2016-2017 – Spring

Date: 10/06/2017 - <b>Duration: 100</b> min.

Number:

Q1) [20pts] Consider the following MIPS code segment. Note: values of a0 and a1 are passed from the calling function and the result is returned in v0.

f1:<br>
lw \$t0, 0(\$a0)<br>
addi \$t1, \$0, 1<br>
loop:<br>
bge \$t1, \$a1, exit<br>
mul \$t2, \$t1, 4<br>
add \$t2, \$t2, \$a0<br>
lw \$t2, 0(\$t2)<br>
bge \$t2, \$t0, next<br>
add \$t0, \$t2, \$0<br>
next:<br>
addi \$t1, \$t1, 1<br>
j loop<br>
exit:<br>
add \$v0, \$t0, \$0<br>
jr \$ra

a) Translate the function f1 into a high-level language like C.

b) Briefly describe what the function f1 does.

Q2) [8pts] Consider the following code segment:

addi \$a0, \$zero, 1000<br>
Loop: sub \$a0, \$a0, 1<br>
bne \$a0, \$zero, Loop

a) For a single-cycle processor, how many cycles are needed to execute this code segment?

b) For a multi-cycle processor, how many cycles are needed to execute this code segment?
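As a reference point for Q1 and Q2, the following is a minimal C sketch of one possible reading, not an official solution. The names `f1` parameters `arr` and `n`, and the helper `loop_cycles`, are hypothetical. It assumes a0 holds the base address of an array of 32-bit signed words and a1 the element count, and it treats each line of the Q2 listing (including the `sub` with an immediate operand) as exactly one instruction. The per-instruction cycle counts for the multi-cycle case are parameters, because they depend on the datapath being assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Q1 sketch: assumes a0 = arr (array of words) and a1 = n (element count). */
int32_t f1(const int32_t *arr, int32_t n) {
    int32_t smallest = arr[0];            /* lw $t0, 0($a0): read even if n <= 0 */
    for (int32_t i = 1; i < n; i++) {     /* bge $t1, $a1, exit */
        int32_t x = arr[i];               /* mul / add / lw compute and load arr[i] */
        if (x < smallest) {               /* bge $t2, $t0, next skips the update */
            smallest = x;                 /* add $t0, $t2, $0 */
        }
    }
    return smallest;                      /* add $v0, $t0, $0 */
}

/* Q2 sketch: total cycles for "addi; Loop: sub; bne" starting from a0 = n.
 * alu_cycles and branch_cycles are assumed per-instruction costs. */
long loop_cycles(long n, int alu_cycles, int branch_cycles) {
    assert(n > 0 && alu_cycles > 0 && branch_cycles > 0);
    long cycles = alu_cycles;             /* addi executes once */
    long a0 = n;
    do {
        a0 -= 1;                          /* sub $a0, $a0, 1 */
        cycles += alu_cycles;
        cycles += branch_cycles;          /* bne $a0, $zero, Loop */
    } while (a0 != 0);
    return cycles;
}
```

Under these assumptions the loop body runs 1000 times, so 2001 instructions execute. With one cycle per instruction, `loop_cycles(1000, 1, 1)` gives 2001. With the common textbook multi-cycle costs of 4 cycles for an ALU instruction and 3 for a branch (an assumption, not stated in the exam), `loop_cycles(1000, 4, 3)` gives 7004.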

.....

Q3) [24pts] In a single-cycle processor, assume that it is required to simplify the MIPS instruction set architecture by removing the original lw and sw instructions and replacing them with ones that do not contain a constant offset. The new loads and stores will have the following general forms.

a) Show what changes must be made to the single-cycle datapath shown below to implement the new lw and sw instructions without using the ALU.



b) Fill in the table below with the correct control signals for the new lw and sw instructions. Use 'X' to indicate any don't-care conditions (if necessary).

|    | RegDst | RegWrite | ALUSrc | ALUOp | MemWrite | MemRead | MemToReg | PCSrc |
|----|--------|----------|--------|-------|----------|---------|----------|-------|
| sw |        |          |        |       |          |         |          |       |
| lw |        |          |        |       |          |         |          |       |

c) Assume that memories and the ALU have 2 ns delays, and the registers have a 1 ns delay. Find the minimum clock cycle times for both the original single-cycle datapath and the new modified one (without using the ALU for sw/lw instructions). Show your work.

Original:

Modified:

d) Based on the new modified processor, how would you implement the instruction lw \$t0, 4(\$sp)?

In this case, what would be the required CPU time for executing lw \$t0, 4(\$sp)?

CPU time: .....

Comment on the result: .....
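The timing reasoning in parts c) and d) can be sketched in C as follows. This is only an illustration under stated assumptions: the usual textbook single-cycle datapath, where an instruction's delay is the sum of the units on its path; a new lw that sends the register value straight to the data memory address input; and an offset load rebuilt from two instructions (an addi into a temporary register followed by the new lw). The function names are hypothetical.

```c
#include <assert.h>

enum { MEM_NS = 2, ALU_NS = 2, REG_NS = 1 };

/* Original datapath: lw is the longest path
 * (instruction memory, register read, ALU, data memory, register write). */
int original_cycle_ns(void) {
    return MEM_NS + REG_NS + ALU_NS + MEM_NS + REG_NS;
}

/* Modified datapath (assumption): lw/sw bypass the ALU, so the longest path is
 * the larger of the new lw path and the R-type path. */
int modified_cycle_ns(void) {
    int lw_path    = MEM_NS + REG_NS + MEM_NS + REG_NS;  /* no ALU stage */
    int rtype_path = MEM_NS + REG_NS + ALU_NS + REG_NS;
    return lw_path > rtype_path ? lw_path : rtype_path;
}

/* CPU time for a straight-line sequence on a single-cycle machine (CPI = 1). */
int cpu_time_ns(int instruction_count, int cycle_ns) {
    assert(instruction_count > 0 && cycle_ns > 0);
    return instruction_count * cycle_ns;
}
```

With these numbers the original cycle is 8 ns and the modified one 6 ns. An offset load then takes `cpu_time_ns(2, modified_cycle_ns())`, which is 12 ns, against 8 ns for the single original lw, so the shorter cycle does not help this particular instruction.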

**Q4)** [20pts] Assume that it is required to add the **MemInd rt,offset(rs)** instruction to the multicycle datapath shown below. This instruction performs the following operation:

rt = Memory[Memory[offset + rs]]
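To make the double indirection concrete, here is a small C model of the operation. It is a sketch under assumptions: memory is modelled as an array of 32-bit words, addresses are byte addresses that must be word aligned, and the 16-bit offset is sign-extended as in lw. The function name `mem_ind` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the value that MemInd would write to rt. */
uint32_t mem_ind(const uint32_t *memory, size_t words, uint32_t rs, int16_t offset) {
    /* First address: offset + rs, with the offset sign-extended to 32 bits. */
    uint32_t first = rs + (uint32_t)(int32_t)offset;
    assert(first % 4 == 0 && first / 4 < words);

    /* First memory read yields the second address. */
    uint32_t second = memory[first / 4];
    assert(second % 4 == 0 && second / 4 < words);

    /* Second memory read yields the data for rt. */
    return memory[second / 4];
}
```

The model performs two dependent memory reads, which suggests that the instruction needs at least one more memory-access step than an ordinary lw on the same datapath.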

- a) In how many cycles will this instruction be executed?
- b) Add any necessary datapaths and justify the need for the modifications, if any.



c) Provide the finite state diagram for executing this instruction. Specify the required control line values from the 3<sup>rd</sup> step.



**Q5)** [20pts] Assume that it is required to execute the following immediate addition instruction in the single-cycle datapath:

addi \$29, \$29, 16

The single-cycle datapath below shows the execution of this instruction. Several of the datapath values are filled in, and you are requested to provide values for the remaining signals in the diagram, which are marked with a ? symbol.

Write your answers (in decimal) directly on the diagram. Assume register \$29 initially contains the number 129, and if a value cannot be determined, mark it as 'X.'
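Several of the values involved follow directly from the instruction encoding, which the C sketch below works through. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) with opcode 8 for addi. The helper names are hypothetical, and which of these values appear as ? in the diagram cannot be determined from the text alone.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t opcode;  /* bits 31..26 */
    uint32_t rs;      /* bits 25..21 */
    uint32_t rt;      /* bits 20..16 */
    int32_t  imm;     /* bits 15..0, sign-extended to 32 bits */
} IType;

/* Build the 32-bit machine word for addi rt, rs, imm. */
uint32_t encode_addi(uint32_t rt, uint32_t rs, int16_t imm) {
    assert(rs < 32 && rt < 32);
    return (8u << 26) | (rs << 21) | (rt << 16) | (uint16_t)imm;
}

/* Split a machine word into the fields the datapath routes separately. */
IType decode_itype(uint32_t word) {
    IType f;
    f.opcode = word >> 26;
    f.rs     = (word >> 21) & 31u;
    f.rt     = (word >> 16) & 31u;
    f.imm    = (int16_t)(word & 0xFFFFu);  /* models the sign-extend unit */
    return f;
}

/* ALU operation selected for addi: register value plus extended immediate. */
int32_t alu_add(int32_t reg_value, int32_t imm) {
    return reg_value + imm;
}
```

For addi \$29, \$29, 16, `encode_addi(29, 29, 16)` gives 0x23BD0010, the decoded fields are opcode 8, rs 29, rt 29 and immediate 16, and with register 29 holding 129 the ALU result written back is 145.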



**Q6)** [8pts] Answer the following questions:

- a) For a single-cycle processor, assume the following functional unit delays (other delays are negligible):
  - Instruction and Data Memory (200 ps),
  - ALU and adders (200 ps),
  - Register file access (reads or writes) (100 ps).

  Calculate the clock frequency for executing add, lw, sw, beq, and j instructions.

.....

b) If a multicycle processor is used, what would be the clock frequency?
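One way to organise the calculation for both parts is sketched below in C. It assumes the usual textbook datapath, in which each instruction class passes through a fixed sequence of units, and a multicycle design whose clock period is set by the slowest single unit. These path assumptions and the function names are illustrative, not taken from the exam.

```c
#include <assert.h>

enum { MEM_PS = 200, ALU_PS = 200, REG_PS = 100 };

/* Single-cycle: the period must cover the slowest instruction path. */
int single_cycle_period_ps(void) {
    int paths[] = {
        MEM_PS + REG_PS + ALU_PS + REG_PS,           /* add: IM, read, ALU, write */
        MEM_PS + REG_PS + ALU_PS + MEM_PS + REG_PS,  /* lw: IM, read, ALU, DM, write */
        MEM_PS + REG_PS + ALU_PS + MEM_PS,           /* sw: IM, read, ALU, DM */
        MEM_PS + REG_PS + ALU_PS,                    /* beq: IM, read, ALU */
        MEM_PS                                       /* j: IM only */
    };
    int worst = 0;
    for (int i = 0; i < 5; i++) {
        if (paths[i] > worst) {
            worst = paths[i];
        }
    }
    return worst;
}

/* Multicycle (assumption): one functional unit per cycle, so the period is
 * the delay of the slowest unit. */
int multicycle_period_ps(void) {
    int worst = MEM_PS > ALU_PS ? MEM_PS : ALU_PS;
    return worst > REG_PS ? worst : REG_PS;
}

/* Convert a period in picoseconds to a frequency in GHz. */
double frequency_ghz(int period_ps) {
    assert(period_ps > 0);
    return 1000.0 / period_ps;
}
```

Under these assumptions lw sets the single-cycle period at 800 ps, giving 1.25 GHz, while the multicycle period is 200 ps, giving 5 GHz.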