Question 1:  (5 points)

HP v.4 question 2.1.

 

 

Question 2:  (5 points)

HP v.4 question 2.2.

 

 

Question 3:  (10 points)

HP v.4 question 2.3.

 

 

Question 4:  (10 points)

HP v.4 question 2.4.

 

 

Question 5:  (10 points)

HP v.4 question 2.5.

 

 

Question 6:  (5 points)

HP v.4 question 2.6.

 

 

Question 7:  (10 points)

HP v.4 question 2.7.

 

 

Question 8:  (15 points)

HP v.4 question 2.8.

 

 

Question 9:  (40 points)

Read the paper "Instruction Sets and Beyond: Computers, Complexity, and Controversy," by Colwell et al., and answer the questions below:

 

Question A:        Before describing a "RISC Manifesto," the authors discuss the "drive toward CISC machines." Discuss/explain 3 reasons why CISC-type architectures evolved to become the predominant ISA form prior to the RISC/CISC debate put forth in the paper.

 

Question B:        What is microcode?  How has it enabled CISC scaling and additional ISA complexity?

 

Question C:        Describe the ideas/design principles behind the 801 machine (i.e., how were the designers trying to improve performance over the state of the art?).

 

Question D:        Referencing the Colwell paper, list and explain 2 fallacies in the pro-RISC arguments of the day.

 

Question E:        Think about the different approaches to benchmarking performance chosen by the RISC and CISC design communities. In your opinion, was the pro-CISC approach better, or was the pro-RISC approach better? Justify your answer.

 

 

Question 10:  (40 points)

This question covers pipelining and hazards with SimpleScalar. To answer this question, start with the sim-safe simulator. The main loop of the simulator, sim_main(), executes each instruction and increments the cycle counter by one. Note that sim-safe does NOT model the timing of execution; it models only the functional effects of each instruction. To model timing, you'll have to modify sim-safe.c to count how many cycles have elapsed during each iteration of sim_main(). Run all experiments with the three working benchmarks.

 

Question A:  Assume your processor is a 3-wide superscalar (i.e., can execute a maximum of 3 instructions per cycle). Assuming no hazards of any kind, what is its performance (i.e., how many cycles does it take to run)?
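
As a starting point, the hazard-free case reduces to a ceiling division of the dynamic instruction count by the issue width. The helper below is a minimal sketch, not part of SimpleScalar: the name ideal_cycles and its parameters are hypothetical, and it assumes every cycle except possibly the last issues a full group.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a SimpleScalar function).
 * Returns the number of cycles needed to issue n_insts instructions
 * when up to `width` instructions issue per cycle and nothing stalls. */
uint64_t ideal_cycles(uint64_t n_insts, unsigned width)
{
    assert(width > 0);
    /* Ceiling division written to avoid overflow in n_insts + width - 1. */
    return n_insts / width + (n_insts % width != 0);
}
```

For example, ideal_cycles(10, 3) evaluates to 4, since the tenth instruction needs a cycle of its own.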

 

Question B (Structural Hazards): Now assume that the L1 data cache has only one port and thus the processor can execute at most one memory operation (load or store) per cycle. How does this affect its performance?
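
One way to account for the single cache port is to track, per cycle, how many issue slots and how many memory operations have been used, and to start a new cycle whenever either limit would be exceeded. The sketch below is a standalone model, not SimpleScalar code: the issue_model type and both functions are hypothetical names. It assumes strict in-order issue, so an instruction that cannot issue pushes every later instruction into the next cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-cycle bookkeeping for an in-order issue model. */
typedef struct {
    unsigned width;      /* max instructions per cycle (3 in the assignment) */
    unsigned mem_ports;  /* max loads/stores per cycle (1 in the assignment) */
    unsigned used;       /* issue slots used in the current cycle */
    unsigned mem_used;   /* memory operations issued in the current cycle */
    uint64_t cycles;     /* cycles started so far */
} issue_model;

void issue_init(issue_model *m, unsigned width, unsigned mem_ports)
{
    assert(width > 0 && mem_ports > 0);
    m->width = width;
    m->mem_ports = mem_ports;
    m->used = 0;
    m->mem_used = 0;
    m->cycles = 0;
}

/* Call once per dynamic instruction, in program order. */
void issue_inst(issue_model *m, bool is_mem)
{
    bool full = (m->used == m->width);
    bool port_busy = is_mem && (m->mem_used == m->mem_ports);

    /* Start a new cycle for the first instruction, when all slots are
     * taken, or when a memory operation finds the cache port in use. */
    if (m->cycles == 0 || full || port_busy) {
        m->cycles++;
        m->used = 0;
        m->mem_used = 0;
    }
    m->used++;
    if (is_mem)
        m->mem_used++;
}
```

In a modified simulator, the is_mem argument would come from whatever the simulator reports about the current instruction being a load or store; how that is obtained is left open here.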

 

Question C (Data Hazards): Now further assume that the processor cannot execute data-dependent instructions in the same cycle. For example, if an instruction writes to register 2, then no subsequent instruction (in program order) that reads register 2 can execute in the same cycle (it must wait until the next cycle). How does this affect performance? Note that this question is independent of the pipeline length.
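
The same-cycle dependence rule can be sketched with a bitmask of the registers written so far in the current cycle: an instruction whose source registers overlap that mask must begin a new cycle. The code below is a standalone illustration, not SimpleScalar code. The raw_model type, the REG macro, and the limit of 64 register numbers are assumptions made for the example, and it deliberately models only the read-after-write case described above. It also ignores the memory-port limit from Question B.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: bit r of a mask stands for register r (r < 64). */
#define REG(r) ((uint64_t)1 << (r))

typedef struct {
    unsigned width;    /* max instructions per cycle */
    unsigned used;     /* issue slots used in the current cycle */
    uint64_t written;  /* registers written by instructions in this cycle */
    uint64_t cycles;   /* cycles started so far */
} raw_model;

void raw_init(raw_model *m, unsigned width)
{
    assert(width > 0);
    m->width = width;
    m->used = 0;
    m->written = 0;
    m->cycles = 0;
}

/* Call once per dynamic instruction, in program order.
 * src_mask: registers the instruction reads; dst_mask: registers it writes. */
void raw_issue(raw_model *m, uint64_t src_mask, uint64_t dst_mask)
{
    bool full = (m->used == m->width);
    bool raw_hazard = (src_mask & m->written) != 0;

    /* A dependent instruction cannot share a cycle with its producer. */
    if (m->cycles == 0 || full || raw_hazard) {
        m->cycles++;
        m->used = 0;
        m->written = 0;
    }
    m->used++;
    m->written |= dst_mask;
}
```

If the target ISA has a hardwired zero register, it should probably be left out of both masks so that it never creates a false dependence.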

 

Question D (Control Hazards): Now further assume that the processor has a 10-stage pipeline. The result of a conditional branch (i.e., taken or not-taken) is computed in stage 7. The processor statically predicts all conditional branches as taken and continues fetching from the branch destination. If the branch is indeed taken, then there is no penalty. If the branch is not taken, then all instructions after it are squashed and fetching resumes from the instruction immediately after the branch in program order. How does this affect performance?
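
A simple way to fold mispredictions into the cycle count is to charge a fixed penalty for every conditional branch that turns out to be not taken. The sketch below assumes the penalty equals the number of stages in front of the resolving stage (resolve_stage - 1, or 6 cycles when the branch resolves in stage 7). That penalty is an assumption of this example rather than something the assignment states, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a SimpleScalar function).
 * cond_branches: dynamic count of conditional branches
 * taken:         how many of them were actually taken
 * resolve_stage: pipeline stage (1-based) where the outcome is known
 * Returns the extra cycles lost under a static predict-taken policy,
 * ASSUMING each misprediction costs resolve_stage - 1 cycles. */
uint64_t mispredict_stall_cycles(uint64_t cond_branches, uint64_t taken,
                                 unsigned resolve_stage)
{
    assert(resolve_stage >= 1);
    assert(taken <= cond_branches);
    uint64_t mispredicted = cond_branches - taken;  /* not-taken branches */
    return mispredicted * (uint64_t)(resolve_stage - 1);
}
```

These stall cycles would be added on top of the cycle count obtained from the issue model. A more detailed model might also account for the remaining slots in the branch's own issue group being squashed.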