Spring 2015 :: CSE 502 - Computer Architecture

Directions for Logging in to SBRocks

For this course, we have set up the CEWIT SBRocks cluster with all the tools you need for your assignments. To get started, run "ssh -Y sbrocks.cewit.stonybrook.edu" and log in with your CS Unix ID and password. If you are unable to log in, email rt@cs.stonybrook.edu for help (ECE students in particular will need to explicitly request an account). Once logged in, run "ssh-keygen" to create a key pair, and leave the passphrase empty.

SBRocks is a cluster of machines, and sbrocks.cewit.stonybrook.edu is the cluster head. You should never run your simulations or compilations on the cluster head. Instead, do your work on one of the compute nodes of the cluster. To choose one, pick a random number A between 0 and 4, and another random number B between 0 and 34. Run "ssh -Y compute-A-B" to log in to your randomly chosen compute node. If that node does not let you log in, pick two new random numbers and try again.
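The node selection above can be scripted. The following is a minimal sketch, assuming bash (it relies on the bash variable RANDOM) and assuming both ranges are inclusive; the function name pick_node is hypothetical and not part of the course tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of a randomly chosen compute node.
# Assumes A is in 0..4 and B is in 0..34, both inclusive.
pick_node() {
    a=$(( RANDOM % 5 ))     # A: one of 0, 1, 2, 3, 4
    b=$(( RANDOM % 35 ))    # B: one of 0, 1, ..., 34
    echo "compute-${a}-${b}"
}

# Example use from the cluster head (not executed here):
#   ssh -Y "$(pick_node)"
```

If the login fails, running the same ssh command again picks a fresh pair of numbers.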

Once logged in to a compute node, run the command below that matches your shell to set up your environment for the course tools:

  • if using bash as your shell: do "source /home/facfs1/nhonarmand/cse502/cse502-bashrc"
  • if using csh as your shell: do "source /home/facfs1/nhonarmand/cse502/cse502-cshrc"
(Ideally, you should put the appropriate command in your .bash_profile (for bash) or .login (for csh) so that it is executed automatically every time you log in.)
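For bash users, one way to add the setup line without duplicating it on repeated runs is sketched below. The function name add_course_env and its optional file argument are hypothetical; only the script path comes from the instructions above.

```shell
# Path to the bash setup script, copied from the instructions above.
CSE502_RC=/home/facfs1/nhonarmand/cse502/cse502-bashrc

# Hypothetical helper: append the "source" line to a profile file
# (default: ~/.bash_profile) only if that exact line is not already present.
add_course_env() {
    profile="${1:-$HOME/.bash_profile}"
    line="source ${CSE502_RC}"
    touch "${profile}"                          # create the file if it is missing
    if ! grep -qxF "${line}" "${profile}"; then # -x: whole line, -F: literal match
        printf '%s\n' "${line}" >> "${profile}"
    fi
}
```

Calling add_course_env with no argument edits ~/.bash_profile; the change takes effect at the next login.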


Hardware designers usually use a Hardware Description Language (HDL) to describe their designs in a way that is amenable to automatic translation to hardware by so-called "synthesis tools". HDLs are used for many tasks in a hardware design flow, including hardware description, testing, and verification. In this course you will use an HDL to describe your design, and perhaps to write some test cases.

You will implement your designs in a subset of the SystemVerilog HDL. Although it is a hardware description language, SystemVerilog has many features that make it resemble high-level programming languages such as C or C++. Many of these advanced features, however, are primarily intended for testing and verification, not hardware description. In this course, we will describe our processors using the "synthesizable" subset of the language, that is, the part of the language that a synthesis tool can automatically translate to hardware.

Do not panic if you have not used an HDL before! We will teach and discuss SystemVerilog and its synthesizable subset (which is frankly very simple) in enough detail in the class. We will also provide a SystemVerilog-to-C++ translator (called Verilator) to translate your SystemVerilog code to C++ code that can be compiled and run to simulate your design. We will provide the necessary testing infrastructure that you will compile together with Verilator's output to create a fully functional simulator for your design.

Writing and Running SystemVerilog Tests

We have prepared a simple Makefile and some skeleton code to help you get started with your first SystemVerilog programs. Run "git clone /home/facfs1/nhonarmand/cse502/cse502-sv-skeleton.git" to clone the skeleton code and Makefile.

  • Do "make" to compile your code.
  • Do "make run" to run your compiled code.

Homework 1

The goal of this homework is to implement a direct-mapped cache that you can later use in your course project. To get the code, do "git clone /home/facfs1/nhonarmand/cse502/cse502-hw1.git". Please carefully read the README file for an overview of what you need to do for this homework. You should work on the homework individually and not as a group. Each student should submit a separate solution.
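As a reminder of how a direct-mapped cache locates a block, the sketch below splits an address into tag, set index, and block offset using shell arithmetic. The geometry (64-byte blocks, 512 sets) and the function name split_addr are assumptions made for illustration only; the README defines the actual parameters for the homework.

```shell
# Assumed (hypothetical) cache geometry, for illustration only.
OFFSET_BITS=6    # 64-byte blocks
INDEX_BITS=9     # 512 sets, one block per set (direct-mapped)

# Print the tag, set index, and block offset of the given address.
split_addr() {
    addr=$(( $1 ))
    offset=$(( addr & ((1 << OFFSET_BITS) - 1) ))                # low bits: byte within the block
    index=$(( (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1) )) # middle bits: which set
    tag=$(( addr >> (OFFSET_BITS + INDEX_BITS) ))                # remaining high bits
    printf 'tag=0x%x index=%d offset=%d\n' "${tag}" "${index}" "${offset}"
}
```

With these assumed parameters, "split_addr 0x12345678" prints "tag=0x2468 index=345 offset=56". A lookup is a hit when the set selected by the index is valid and its stored tag equals the address tag.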

Project Overview

In this course, you will design and implement a SPARCv8-compatible processor. At a minimum, your processor will include a 5-stage pipeline (similar to the one covered in class), multiple functional units with varying latencies, and direct-mapped instruction and data caches (40 pts). For more points, you can add the following features to your processor:

  • Set-associative caches (50 pts)
  • Above + superscalar pipeline (60 pts)
  • All above + out-of-order execution (80 pts)
  • All above + branch prediction and speculative execution (90 pts)
  • SMT on top of any of the above (10 extra pts)
  • Successful synthesis to FPGA on top of any of the above (10 extra pts)

Project Details

Getting the Skeleton Code

Do "git clone /home/facfs1/nhonarmand/cse502/cse502-proj.git" to clone the skeleton code for the project. You should implement your processor by modifying the existing SystemVerilog files and adding new ones. top.sv is the top-level SystemVerilog file and Core.sv is your processor core. Read the README file for more information.

Target Instruction Set

Your processor should implement the user-mode (non-privileged) subset of SPARCv8 instructions. You can ignore the ISA subset related to the "Alternate Address Spaces", "Ancillary State Registers", and the "Co-Processor", as well as any instruction that is marked as "privileged" in the SPARCv8 manual. Specifically, your processor needs to implement all the instructions (and the requisite architectural state) described in the following sections of the manual:

  1. Load/store instructions: B.1, B.2, B.4, B.5, B.7, B.8
  2. Arithmetic/logical/shift instructions: B.11, B.12, B.13, B.14, B.15, B.16, B.17, B.18, B.19
  3. Control transfer instructions: B.21, B.22, B.24, B.25, B.27
  4. Floating-point operate: B.33
  5. Misc. instructions: B.9, B.10, B.20, B.28 (only RDY), B.29 (only WRY), B.30 (treat as NOP), B.31, B.32 (treat as NOP unless you have caches)

Traps and Interrupts

Your processor need not deal with external interrupts. It should treat all "exceptions" (that is, traps caused by instructions) as precise, with no deferred interrupts. Your implementation should correctly check for and generate all the exceptions described in the semantics of the instructions you implement.

Virtual Memory and MMU

Your processor need not include any virtual memory support. Hence, you don't need to implement an MMU and can ignore the issues related to the Address Space Identifiers (ASI).

Project Resources