Computer Architecture代写|C代写|C++代写|数据结构代写:这是一个体系结构代写,使用的语言是C,涉及了数据结构的内容
Workload Optimization
For your midterm take home exam, you will optimize the Breadth-First Search (BFS) algorithm.
The project will consist of two parts, each worth 50% of the midterm grade. There is a strict no
collaboration policy for this exam: you should not discuss your work with anyone other than the
instructors. You are also not allowed to consult any material other than the materials provided in
this package. Please use private posts on Piazza for clarification requests.
Breadth-First Search
Breadth-first search (BFS) is an algorithm for searching a graph data structure. For the midterm you
will be optimizing a state-of-the-art implementation known as the Direction-Optimizing [1] approach.
The paper describing the implementation is available here: http://www.scottbeamer.net/pubs/
beamer-sc2012.pdf
Part 1: Optimization Proposal (Due March 21)
For each optimization,
1. describe the optimization;
2. provide a rationale for why you think it will be beneficial;
3. discuss any potential trade-offs that may result from implementing your optimization (e.g.,
increases silicon area to achieve higher performance); and
4. estimate the cost (e.g., energy will increase x%, performance will decrease y% etc.) and benefit
of the optimization.
One of the learning goals for this class is to get an intuitive sense of computer architecture. As such,
you will be evaluated based on your understanding of concepts and your ability to apply them to a
new context appropriately. Your grade for this part will be based on optimizations and the quality of
your analysis of the rationale for speeding up the benchmark.
Between March 22nd & 23rd you will meet for 15min with Prof. Sethumadhavan to
discuss the proposal. The schedule will be made available later.
1
CSEE W4824: Computer Architecture Midterm Project
Part 2: Optimization Implementation (Due April 4)
Based on the feedback you receive for Part 1, implement one or two of your optimizations. Submit
a written report with your results and a discussion of how well the proposed optimization works and
an analysis of the choices you made. Your grade for this part will be based on how well you executed
your optimization.
For hardware optimizations, assume the baseline microarchitecture that follows:
Baseline Microarchitecture Assumptions
• 32nm technology node; 0.5ns clock period.
• ISA: RISC or CISC; assume you can add new instructions
• Instruction supply:
– Ability to fetch and decode one instruction every cycle
• Execution:
– ability to issue/execute one instruction every cycle
– L1 Data TLB: 32 entries, fully associative
– L2 Data TLB: 512 entries, 4-way set associative
• Memory Hierarchy:
– NOTE: all caches have 64B block size
–
Level Capacity Associativity Latency Policies
L1 32KB 8-way 4 cycles LRU; writeback; non-inclusive
L2 256KB 8-way 10 cycles LRU; writeback; non-inclusive
L3 8MB 16-way 35 cycles LRU; writeback; inclusive
• Memory System:
– DDR3-1333 DRAM with 3 memory channels
– 16M × 8 × 8 banks
– tRCD − tRP − CL: 9-9-9 (tCK = 1.5ns)
• Commit: In-order commit with precise exceptions
Getting Started
• Clone the GAP Benchmark Suite repository from here: https://github.com/sbeamer/gapbs
• Download & extract the archive from Courseworks: tar xvf midterm.tar.xz
• Apply bfs.patch to the GAP repo: git apply bfs.patch
• Build the bfs executable: make bfs
• Run by using ./bfs -g 20 -n 3
References
[1] S. Beamer, K. Asanovi´c, and D. Patterson, “Direction-optimizing breadth-first search,” Scientific
Programming, vol. 21, no. 3-4, pp. 137–148, 2013.
Prof. Simha Sethumadhavan Page 2 of 2