CS 352 Spring 2023
代写Compiler | 代写c++ | arm代做 | 代写assembly | java | project代做 – 这是一个关于Compiler 的题目, 主要考察了关于Compiler 的内容,是一个比较经典的题目, 涵盖了report/c++/arm/assembly/Compiler 等程序代做方面, 这个项目是lab代写的代写题目
Compiler project 2
1. Introduction
For an overview of all projects in this semester and their relationship, please review Project 1 handout. However, we will change Project 3. Instead of implementing an interpreter, we will break the original Project 4 into two projects, and the new Project 3 will generate arm assembly code for a very small subset of java programs. Project 4 will generate ARM assembly code for the full scope of MiniJavaP2, the grammar defined in this project, which extends MiniJavaP1.
This project, as all other projects, is to be carried out by each student independently. Please review the course-specific requirements for academic honesty stated in HW0.
2. Main Tasks in Project 2
Project 2 (P2 in short) writes a YACC program to build a bottom-up parser (with the ability of type checking) for a subset of Minijava+ that has been posted on BrightSpace. This subset is defined in the file MiniJavaP 2 .txt posted on BrightSpace in the module titled Project 2. This section gives an overview of the tasks in Project 2. Minute details will be clarified on Ed Discussion as various questions may be posed by students. For brevity, the parser constructed for this project that also performs type checking is called simply the parser.
Project 2 consists of the following main components.
- Modify the Lex script written for P1 to extract new tokens, primarily due to the expansion of the Main class. As in P1, some token classes are embedded in the production rules and students need to define them in the Lex script.
- Write a YACC/Bison script (Bison is a YACC-compatible tool that supports C++) to parse an arbitrary minijavaP2 program. For brevity, we will refer to YACC only throughout the rest of the projects this semester, while students are allowed to use Bison as well.
- Add semantic actions to the YACC script such that an abstract syntax tree (AST) and a symbol table are built for the input program.
- Traverse the AST to perform type checking and report type rule violations. (For brevity, we will simply say a type violation or a type error in this document.)
Items 2 to 4 are elaborated further in the following subsections.
2.1 Writing the Grammar
As discussed in the lectures, right-recursive production rules tend to make the parsing stack grow deep, thus potentially stressing the available memory. It is therefore advisable to change right-recursive rules written for P1 into left-recursive one.
More importantly, the production rules need to be written in such a way that the AST constructed by the semantic actions follow the correct precedence rules in Java. Students are strongly advised to write the production rules in such a way that YACC does not complain about any parsing conflicts. We do not prohibit the use of precedence directives in YACC such as %left to resolve parsing conflicts. We also allow students to use YACCs default parsing actions (i.e., default to shift or to reduce). However, such practice is quite error prone and requires a thorough understanding of the parsing actions that it causes. It is the students responsibility to ensure that the existence of parsing conflicts does not cause any failure to correctly process the input programs.
Note that although YACC resolves parsing conflicts according to the specified operator precedence, it will still give warnings about the existence of such conflicts. To encourage students to apply the ideas discussed in the lectures to minimize parsing conflicts, bonus points will be granted for minimizing such conflicts. This will be explained in more details later in this document.
2.2 Semantic Actions and AST Construction
The purpose of constructing the AST for this project is to facilitate type checking. Students may find that many statements in MiniJavaP2 can be type checked without having an AST constructed. However, due to the possibility of forward reference to method names (i.e., calling a method that is defined after the method making such a call), the absence of AST can make type checking difficult to perform for such method invocations. A few of the test cases for grading will include such forward references. Up to 10 out 100 points will be assigned to such test cases. Hence, if the students want to attempt the full marks, building an AST would be strongly advised.
Ad-hoc ways to avoid constructing AST for handling forward references are possible. However, AST is necessary for assembly code generation in later projects. Therefore, students are strongly advised to construct the AST in Project 2.
2.3 Type Rules and Type Checking
A correct program targeted by P2 must adhere to the following type rules.
- A variable must be declared before it is referenced. In each method, the same variable name may appear in one variable declaration statement only, including the formal parameter of the method.
- All type rules must be consistent with Java standard (as specified in Java document https://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.2), except the following additional restrictions :
a. Both operands of any binary operation must have the same prime type allowed for that
operation. (We have three prime types, int, boolean, and String.) An operand may be a
constant, a variable, the intermediate result of an operation, or a value returned by an
invoked method.
b. The left-hand side and right-hand side of an assignment operation must have the same
prime type.
c. Logical operations are performed on boolean-typed operands only.
d. IF and WHILE conditions must be of the boolean type.
e. Concatenation (+) operations are performed on String-typed operands only.
f. Array indices must be of int type. To simplify the implementation, we assume that an
array is either of one dimension or two dimensions.
- To simplify the handling of strings, it is assumed that the input program contains no backslash character \.
- System.out.println , System.out.print and Integer.parseInt are standard Java methods. The argument expression in the invocation of the first two standard methods may be any of the prime types listed in MiniJavaP2. Integer.parseInt converts a string to an integer and therefore expects a String argument when invoked.
- Except for the standard Java methods explicitly mentioned below in this subsection, a method must be declared in the input program in order to be invoked, although forward references of method names are allowed. The argument list in the method invocation must agree with the list of the formals in the corresponding method declaration, including the strict agreement of the prime type for each parameter.
- MiniJavaP2 has no type conversion and no type casting between prime types. Neither does MiniJavaP2 allow overloading in method declaration and invocation.
2.4 Reporting Syntax Errors and Type Violations
The test cases for grading this project are divided in two categories, discussed separately below. All error reporting messages, including both syntax errors and type violations, must be written to the standard error file ( stderr ).
2.4.1. Syntax rule violation reporting
An input program in the first category contains a violation of the production rules in MiniJavaP2. The parser built for P2 must be able to report such a syntax error. The parser should print the following error message for the whole program in the exact format shown below before aborting the processing of the input program:
Syntax errors in line <line_so_and_so>
where <line_so_and_so> is the line number where the first syntax error is found. Notice the additional word line which is not in the spec of p1. NOTE: If the input program contains no production rule violations, then nothing should be printed except for any type violations found (see below). All printing statements that students added to the parser for the purpose of debugging must be removed or disabled before submission.
2.4. 2. Type violation reporting
An input program in the second category contains no violations of the production rules, but it may or may not contain type rule violations. If the parser incorrectly reports a syntax error, it fails the test case no matter what is performed later for type checking for that program. If the input program contains neither syntax errors nor type violations, then the parser must print out nothing.
Each type violation reporting message must be printed on a separate line , in the form of
Type violation in line <line_number>
The error message must be printed exactly as specified above, except that <line_number> is a line number and the number of spaces (must be > 0) around the printed line number is flexible.
As a general rule, <line_number> should be the line in which the type violation is found. In certain cases, two adjacent lines may both be responsible for a specific type violation. In such cases, set your own tie- breaker to report exactly one of these adjacent lines. Occasionally, yylineno may also deviate slightly from the actual error location slightly, which could be due to the way your production rules are designed_. The test cases for grading will be written to minimize ambiguity of line positions. If needed, however, the grader may inspect the submitted code to see if an_ imprecise line number reported is permitted.
The type violation reporting messages must be printed in the order of the line number. Students are advised to store a sorted list of line numbers internally. Type violation messages are printed after type checking is performed for the entire input. This is especially important if the student does not construct the AST for this project but interleave type checking with parsing actions instead. This is because type violation messages must not be printed before the input has passed grammar check.
NOTE: Many users have found it highly useful to track the line and column numbers using YYLTYPE as in an online document. (https://www.gnu.org/software/bison/manual/html_node/Tracking- Locations.html). Its explanation in the subsection "Location Default Action" can be modified to detect the number of lines in a multiline comment. Note that from your YACC file you can access the contents of the YYLTYPE by using "@$, @1, @2, …" in the same way "$$, $1, $2, …" are accessed.
To avoid excessive type violation reporting due to propagation of type errors, we introduce the concept of undefined type. If an operation is found to violate a type rule, then, after the type violation is
recorded, the type of the operation result becomes undefined , unless the operation is a variable reference. A variable reference has its own type definition rule as stated below.
If a variable, say var_foo , is not yet declared a type, then its type is undefined. As soon as var_foo is declared a type, then it keeps the type no matter what type of expression is assigned to it. However, if var_foo is later re-declared to a different type in a way that violates the scope rule, then, after the type violation is recorded, the type of var_foo becomes undefined. However, if the variable is re-declared to the same type, it should still keep the same type, even though it is still a type violation.
Any arithmetic, comparative, or logical operation that has an undefined operand will not be reported as a type rule violation, and the result of the operation has an undefined type. If a method invocation has an argument of undefined type, no type error is reported either. In an assignment statement, if the right-hand side has an undefined type, then a type error is reported only if the left-hand side variable reference violates the type rules (e.g., the variable being undeclared), as stated in Section 2.3. In all cases not mentioned above, if a type rule cannot be checked due to having an existing undefined type involved, a type error is not to be reported.
As an example, suppose var_b is int, var_c is String, var_d is int, and suppose we have two statements int var_a = var_d + var_b * var_c; String var_a; The operation var_b * var_c is a type violation that must be recorded. The type of var_b * var_c becomes undefined. The operation var_d + var_b * var_c also has an undefined type, but we will not report another type violation. Neither will the assignment to var_a trigger a type violation report. The variable var_a retains its int type until it is re-declared to be String, when the re-declaration must be reported as a type violation and the type of var_a becomes undefined.
An input program may contain one or more type rule violations, and the parser must find and report all of them, with the caveat given above about the undefined types.
Your code generated by YACC must not produce any output (written to stderr or stdout) except those specified above. Hence, if your YACC program has additional printing statements added to yyerror() for debugging purpose, you must remove them before submitting the program. Not meeting this requirement will result in point deduction.
3. Grading
Each of the two categories of test cases mentioned in the previous section carries 50% of the grades, not counting the possible deduction due to submission requirement violations (see Section 4) and bonus points for resolving parsing conflicts.
The bonus points received for removing parsing conflicts depend on the thoroughness of such removal. The maximum bonus points, 5% of the total points, are granted if no parsing conflicts are reported by YACC. With each remaining parsing conflict reported, the bonus points received is decreased by 1 percentage point, i.e., 4% bonus points if one conflict remaining.
4. Submission instruction
It is mandatory for students to meet the submission instruction. Not following any of the requirement may result in grade deduction, up to a 10 point of penalty. (For example, if your code receives 90 points, the final score on the Brightspace for this assignment will be 80 points when maximum deduction is applied.)
1 ) Use the following command on CS lab machines that run any Linux OS to submit your homework. No other forms of submission (e.g., email) will be accepted.
turnin c cs352 p p2 [name of the directory to be submitted]
Prior to executing the turnin command, it is mandatory for you to make sure your submitted directory meets the following content requirement.
Make sure that your submitted directory contains only the following files: the Makefile, your YACC and Lex programs, a README text file (explained below), and the .h and .c files (or the c++ equivalents) you have written to work with the Lex/YACC source code. This is to ensure that the system does not exceed the storage limit set by the system staff. Following good software engineering practice, implementation must not start until at least a high-level design is completed. The required README file, of up to 1000 words, is a design document for your code that must include the following components:
a) The design of the main data structures, which describes the organization of your symbol table,
including how you handle name scopes, what information is included in the symbol table, and
how you maintain the symbol table (e.g., how to insert a new entry to the table).
b) The high-level description of how you perform type checking, e.g., whether you traverse the AST
or if you perform type checking without building an AST. Explain which functions in your parser
perform the specific type-checking activities that you have described.
This document will be checked against your code when grading for consistency. Points may be
deducted if substantial deviations are found.
2 ) You are free to write your own Makefile, as long as the following requirements are met.
The grader will run your Makefile by executing the command make parser to produce the executable program named parser. Hence, it is important that your Makefile produces such an executable when invoked.
You must also write your Makefile such that when make clean is run, your working directory will be cleaned, with the only remaining files being the Makefile, your YACC and Lex programs, and the .h and .c files (or the C++ equivalents) you have written to work with the Lex/YACC source code. This is to allow the grader to easily remove the files generated by the testing after your work is graded.
If you are not familiar with writing of Makefiles, you may wish to use one of the two versions shown below and seek help from the TAs if running into issues. (Note that, usually the command names yacc and bison are aliases on our lab machines.)
The C version of the Makefile: parser: y.tab.c lex.yy.c gcc y.tab.c lex.yy.c -o parser y.tab.c : parser.y yacc -d -g –verbose parser.y lex.yy.c: parser.l lex parser.l clean: rm -f lex.yy.c y.tab.c parser
The C++ version:
FLEXFILE = parser.l BISONFILE = parser.y all: flex bison parser parser: g++ -g scanner.cpp parser.cpp -o parser flex: flex -o scanner.cpp ${FLEXFILE}
bison: bison -o parser.cpp –defines=parser.h ${BISONFILE}
clean: rm -f parser.h parser.cpp scanner.cpp parser NOTE that in these Makefiles, a tab is required as the first character on each command line, e.g.
From previous experience, students may wish to run make clean first each time they run make to rebuild their parser, just to make sure that the make tool does not mistakenly think your parser is up to date and does not rebuild.
-
Your program MUST compile and run without any error on CS labs Linux machines. Please do a final test of this before submission, especially if you do your project on other computers.
-
Your code will be tested for grading by the following command: > ./parser program_name Where program_name is the file name containing the input program.
APPENDIX An Authoritative Reference for Type Checking
The sections above have covered all type rules for P2. If the student wishes to compare with the rules in the full set of Java, the authoritative document for type and run-time evaluation rules for our projects this semester is Java Language Specification (Java SE 7bEdition, abbreviated as JSE7 in the rest of this document). This document can be found at https://docs.oracle.com/javase/specs/jls/se7/html/index.html
This is a large document, but only a small part concerns our Project 2, mainly because MiniJavaP2 is a small subset of Java. The limited syntax form of MiniJavaP2 dictates that its type rules are also limited. Note that we imposed more strict type rules on MiniJavaP2 than the full set of Java. We understand that students have learned a set of type rules in the introductory Java programming course (CS180), but the JSE7 document formalizes the rules. Any type issues not addressed explicitly in this document must be understood by consulting the document referenced above.