CSE526: Principles of Programming Languages
Scott Stoller

project: type checker for AtomicJava
version: 6pm, 29mar2004
team formation due: 22 mar 2004
project due: 29 apr 2004

Read: Cormac Flanagan and Shaz Qadeer. A type and effect system for atomicity. In Proc. 2003 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 338-349. ACM Press, 2003. http://www.cs.ucsc.edu/~cormac/papers/pldi03.ps

Implement a type checker for programs in this type system.

This project is done in teams of two students. One person on each team should send a message containing the names of both team members to cse526@cs.sunysb.edu by the above team formation deadline.

You may use any programming language. Teams that use ocaml will receive 10% extra credit, reflecting the fact that they put in the effort to learn and use a new programming language. ocaml is well suited for this kind of program.

Here is a more detailed description of the requirements.

1. Define a data type for abstract syntax trees of the programming language. This corresponds to the grammar in Figure 1 with the extensions in Section 3.3. If you use ocaml, I suggest that you use a datatype similar in style to the datatypes for expressions in hw1. For example,

     type expression = Var of string
                     | FieldUpdate of expression * string * expression
                     | WhileExpression of expression * expression
                     | ...
     type classDef = ...
     type program = Program of classDef list * expression

If you use Java or C++, I suggest that you introduce a class corresponding to each production in the grammar. For example, there will be classes called Atomicity, Program, and Expression. The Expression class will have a subclass WhileExpression, for example,

     class WhileExpression extends Expression {
       Expression guard;
       Expression body;
       ...
     }

You do not need to write a parser. Inputs to your system will be supplied directly as elements of this data type. For example,

     let test1 _ =
       let exp1 = Var "x" in
       let exp2 = Var "y" in
       let exp3 = FieldUpdate (exp1, "field1", exp2) in
       let classdef1 = ... in
       let pgm1 = Program ([classdef1], exp3) in
       typecheck pgm1

     (* execute test1 *)
     let _ = test1 ()

It is not required---but it is probably worthwhile---to write a simple "pretty printer" that prints programs in a user-friendly format.

2. Define a data type for atomicities (including conditional atomicities), with support for operations on atomicities (sequential composition, join, etc.).
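Here is a rough sketch of what the atomicity data type and two of its operations might look like in ocaml. The constructor names and helper functions (rank, lift, etc.) are only illustrative, not prescribed by the assignment or taken verbatim from the paper, and the cases of basic_seq should be checked carefully against the definitions in the paper; treat this as a starting point, not a reference implementation.

     (* sketch only: conditional atomicities and two operations on them.
        check the basic_seq table against the paper's definition. *)
     type lock = string   (* or however you represent lock expressions *)

     type atomicity =
       | Const
       | Mover
       | Atomic
       | Cmpd
       | Error
       | Cond of lock * atomicity * atomicity   (* l ? a1 : a2 *)

     (* position in the ordering const <= mover <= atomic <= cmpd <= error *)
     let rank = function
       | Const -> 0 | Mover -> 1 | Atomic -> 2 | Cmpd -> 3 | Error -> 4
       | Cond _ -> failwith "rank is defined only on basic atomicities"

     (* join of two basic atomicities: the larger one in the ordering *)
     let basic_join a b = if rank a >= rank b then a else b

     (* sequential composition of two basic atomicities *)
     let basic_seq a b =
       match a, b with
       | Error, _ | _, Error -> Error
       | Cmpd, _ | _, Cmpd -> Cmpd
       | Atomic, Atomic -> Cmpd
       | _ -> basic_join a b   (* e.g. mover;atomic = atomic *)

     (* operations on conditional atomicities distribute over the branches *)
     let rec lift op a b =
       match a, b with
       | Cond (l, a1, a2), _ -> Cond (l, lift op a1 b, lift op a2 b)
       | _, Cond (l, b1, b2) -> Cond (l, lift op a b1, lift op a b2)
       | _ -> op a b

     let join a b = lift basic_join a b
     let seq a b = lift basic_seq a b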
3. Define a type inference function that computes the type and atomicity type of an expression. Specifically, given a program P, a type environment E, and an expression e, this function computes a type t and an atomicity type a such that P; E |- e : t & a. Type inference is necessary because subexpressions of a method are not annotated with their atomicity (such annotations do not appear in the grammar extensions in Section 3.3). This function does intra-procedural (not inter-procedural) type inference: each method is annotated with its atomicity, so when this function encounters a method call, it needs to look only at the declaration of that method, not at the code of that method. This function can be defined by recursion on the structure of expressions.

4. Define a type checking function. Specifically, given a program P, this function either indicates that the program is well-typed, or produces an error message providing some details about a type error that was encountered. For simplicity, the type checker can halt after providing details about the first error it encountered. This function computes, for each method m of each class c, a type t and an atomicity type a for m's body, and then checks that type t and atomicity type a are consistent with the declared type and atomicity type of m.

5. Write several testcases to test your type checker. Be sure to include both well-typed programs and programs with type errors. Your project submission should include the testcases and your type checker's output for each testcase. [more details about this were added below on 26apr2004.]

You have a few small examples of well-typed programs:
1. the program in atomicity-typing-example on the cse526 web page.
2. the atomic bank accounts in section 3.4 of the paper on atomicity types.
3. the Vector example in section 4.2 of the paper on atomicity types. Although it is written in Java, you can capture the "essence" of the example in a small AtomicJava program. (AtomicJava is the language defined in section 3.3.)

One point illustrated by the bank account example is that a well-typed method can have a conditional atomicity containing "error". For example, since "requires" is an abbreviation, the atomicity of the method Account.deposit3 is actually "this ? atomic : error". This method is well-typed; this is not a type error. There is a type error only when the declared type of a method (or the main expression) is not greater-than-or-equal-to the actual type of its body (or of the main expression, respectively). Note that, according to the typing rule [PROG], the main expression of a program is implicitly declared to have the unconditional atomicity "cmpd".

I think it is a good exercise for people to try to make some examples on their own, to understand the type system better before implementing it. If you make a few small examples and indicate whether you think they are well-typed or not, I will be happy to check them.

6. Write a brief README describing how to compile and run your system.

7. Submit the assignment in accordance with the general instructions on the cse526 home page.

GRADING
  functionality: 80%
  clean, clear, and well-documented code: 10%
  good testcases: 10%

Your code should not be grossly inefficient, but clarity and correctness are more important than efficiency in this assignment, so don't bother with minor optimizations.

Each team will hold a 30-minute demo with the TA shortly after the project is due. (A sign-up sheet will be available the week before.) The TA will supply some testcases and will ask the team to run some of the testcases from their TESTING file.

-------------------- added 22apr2004

a purely bottom-up approach won't easily work. you need a simple top-down pass before the bottom-up pass. or you could use a purely top-down approach. here's an example of the need for some top-down analysis. consider

     let x:t = exp1 in exp2

suppose you are type-checking a use of variable x in exp2. a purely bottom-up approach will not have processed the enclosing "let" expression yet, so it does not know the declared type t of x. in other words, we need to do some simple top-down type inference, either before or during the type checking. if every expression were labeled with its type, no type inference would be needed, and a purely bottom-up approach would work (since type checking can be done bottom-up).
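to make this concrete, here is a rough ocaml sketch of how the type environment can be threaded through a single recursive pass, so that the declared type of x is already in the environment when the use of x in exp2 is reached. the constructor names, the environment representation, and the infer function are illustrative choices, not requirements; atomicities are omitted to keep the sketch short.

     (* sketch only: carrying a type environment downward while inferring
        types bottom-up.  names are illustrative, not prescribed. *)
     type typ = IntType | ClassType of string

     type expression =
       | Var of string
       | Let of string * typ * expression * expression   (* let x:t = e1 in e2 *)
       (* ... the other expression constructors ... *)

     type env = (string * typ) list

     exception Type_error of string

     let rec infer (env : env) (e : expression) : typ =
       match e with
       | Var x ->
           (try List.assoc x env
            with Not_found -> raise (Type_error ("unbound variable " ^ x)))
       | Let (x, t, e1, e2) ->
           let t1 = infer env e1 in
           (* depending on how you read the corrected [EXP LET] rule (see the
              25apr2004 note below), you may want to check t1 against the
              declared type t here *)
           if t1 <> t then
             raise (Type_error ("initializer of " ^ x ^ " does not have its declared type"));
           (* the binding for x is pushed into the environment (the "top-down"
              part) before the body exp2 is processed *)
           infer ((x, t1) :: env) e2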
-------------------- added 25apr2004

> For our PPL project, the paper mentioned something called "ls" (lockset)
> in section 3.2. Would you please tell us:
>
> How do you generate a lockset? (or how is it implemented conceptually?)

assuming that you have expanded "requires" clauses into conditional atomicities (I think that's the simplest way to deal with them), the only locksets used in the typing rules are implicitly in [EXP SUB], because to prove "a squareLessThanOrEqual b", you might need to check facts of the form "a' squareLessThanOrEqual^h_n b'", where h and n are locksets. you can see that the locksets in the rules defining squareLessThanOrEqual^h_n come from the locks in the conditional atomicities. in some approaches, the algorithm might need to accumulate locksets as it goes down through (possibly nested) "synchronized" expressions. however, I think this is unnecessary if you do most of the work in a bottom-up pass (preceded by a top-down pass to gather types for local variables).

> Besides, for the typing rule [EXP LET], does it forget to compare t = t1?

you are right. there is a typo in [EXP LET]: "E, t x" should be "E, t_1 x".

-------------------- added 26apr2004

> By the way, can you explain what "a' squareLessThanOrEqual^h_n b'" means,
> because I can't understand its definition in the paper. Then I don't know
> how to implement the function "a' squareLessThanOrEqual^h_n b'".

a squareLessThanOrEqual b means that a is a "stronger" atomicity than b. as we discussed in lecture (some time ago), the sets h and n (intuitively, "held" and "not held", respectively) are used to detect consistency of assumptions about which locks are held. such assumptions arise when considering a specific branch of a conditional atomicity. for example, in "l?(l?a:b):c", the atomicity "b" is "unreachable" and should not affect the meaning of this conditional atomicity, hence should not affect the position of "l?(l?a:b):c" in the squareLessThanOrEqual ordering. the ordering squareLessThanOrEqual^h_n is defined to achieve this.
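to make the role of h and n a bit more concrete, here is a small ocaml sketch of a helper that simplifies a conditional atomicity under accumulated "held" / "not held" assumptions. this is not the paper's definition of squareLessThanOrEqual^h_n (which compares two atomicities); it only illustrates how carrying the two sets into the branches makes unreachable branches, such as the inner "b" above, drop out. the names are illustrative only.

     (* sketch only: simplify a conditional atomicity under assumptions about
        which locks are held (h) and not held (n) *)
     module S = Set.Make (String)

     type atomicity =
       | Const | Mover | Atomic | Cmpd | Error
       | Cond of string * atomicity * atomicity   (* l ? a1 : a2 *)

     let rec simplify (h : S.t) (n : S.t) (a : atomicity) : atomicity =
       match a with
       | Cond (l, a1, a2) ->
           if S.mem l h then simplify h n a1        (* l is assumed held *)
           else if S.mem l n then simplify h n a2   (* l is assumed not held *)
           else Cond (l,
                      simplify (S.add l h) n a1,    (* record the assumption in each branch *)
                      simplify h (S.add l n) a2)
       | basic -> basic

     (* example: l?(l?a:b):c simplifies to l?a:c; the unreachable "b" branch
        (here Cmpd) disappears *)
     let _ =
       let ex = Cond ("l", Cond ("l", Atomic, Cmpd), Mover) in
       assert (simplify S.empty S.empty ex = Cond ("l", Atomic, Mover))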
-------------------- added 26apr2004

in addition to a README (as described above), your submission should include a file called TESTING that summarizes how you tested your type checker. specifically, for each non-trivial (e.g., more than 6 or 8 lines) test program that you used, please include

1. the program written in reader-friendly notation (i.e., as normal text, like the program in atomicity-typing-example, not as a data structure).

2. a brief description of the output of your type checker, e.g., "NullPointerException" (I hope not! :-), or "the program is well-typed", or "type error: the expression ... is expected to have type ... but has type ...". if your program produces voluminous diagnostic information during type-checking, you do not need to include all of it here.

-------------------- added 27apr2004

the due date is extended to 4may2004.

there is a minor problem with the typing rules in the paper. the grammar in Figure 1 does not contain productions for integer constants. accordingly, there are no typing rules for integer constants. however, the [FIELD NO GUARD] and [FIELD GUARD] rules have the hypothesis "P; emptyset |- e : t & mover". without integer constants, there is no way to satisfy this hypothesis when t is "int"! thus, with the current rules, every class that declares a field of type "int" is untypable. note: there is no problem when t is a class, because the typing rules support judgments like "P; emptyset |- new c : c & mover".

there are two ways to fix this problem.

1. eliminate initializers for integer fields. in this case, you should also prohibit integer fields from being declared final. thus, the production for field declarations becomes

     field ::= [final]_opt c fd [g]_opt = e
             | int fd [g]_opt

   corresponding minor changes to the typing rules are needed.

2. add integer constants to the language. specifically, add the following production to the grammar

     e ::= i        (i ranges over the integer constants)

   and add the following typing rule:

            P |- E
     ----------------------
     P;E |- i : int & const

all teams should adopt approach 2. if this causes difficulties, please let me know.
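for teams using ocaml, adopting approach 2 might look roughly like the sketch below. the constructor and function names are illustrative, not required; only the case for integer constants is shown.

     (* sketch only: integer constants in the AST and the corresponding case
        of the inference function, mirroring the rule P;E |- i : int & const *)
     type typ = IntType | ClassType of string

     type atomicity = Const | Mover | Atomic | Cmpd | Error
       (* | Cond of ...  (conditional atomicities, as before) *)

     type expression =
       | IntConst of int        (* the new production  e ::= i *)
       | Var of string
       (* ... the other expression constructors ... *)

     let infer (e : expression) : typ * atomicity =
       match e with
       | IntConst _ -> (IntType, Const)   (* the new case *)
       | _ -> failwith "the other cases are as in your existing inference function"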
-------------------- added 30apr2004

In addition to the README and TESTING files, the project submission must include a file called CONTRIBUTIONS that summarizes each team member's contributions to the project. For example, indicate who wrote or significantly revised each part of the code, and who wrote each testcase. Also indicate contributions to design, debugging, documentation, etc. If both teammates contributed comparable amounts, they will both receive the same score for the project. Otherwise, they will receive different scores, commensurate with their contributions.

-------------------- added 3may2004

during the demo, students will first be asked to run some testcases that we have prepared, and then will be asked to run some of the testcases in their TESTING file, to confirm the accuracy of the results reported there. probably there won't be enough time to run all of the testcases in the TESTING file, so the TA will randomly select testcases to run.

one way to design a comprehensive set of testcases is to use a specification-based coverage metric. for example, the grammar for the programming language is part of the specification of your type-checker, so each construct in the programming language should appear in at least one testcase. similarly, the definitions of several operations on atomicities (e.g., ";", "*", and "S(l,a)") are part of the specification, so for each case in the definition of each of these operations, there should be a testcase whose outcome depends on correct implementation of that case. (you can develop such testcases by "working backwards" based on how those operations are used in the typing rules.) due to time limits, you are not required to develop a test suite that systematically covers all of those cases, but your test suite should at least cover some of the more complicated cases.

-------------------- added 3may2004

updated information about testcases and demos:

0. the project submission deadline is 11:59pm on Tuesday. the testcases will be posted on the cse526 announcements page at that time. projects submitted after that time will receive a -15% penalty. note that your submitted zip file should include the README, TESTING, and CONTRIBUTIONS files. if you can't conveniently submit printouts of those files before your demo, then you can simply bring the printouts to the demo.

1. at the beginning of the demo, the TA checks that the files being used are the same files that were submitted, using one of these two methods:

1.1. at the beginning of the demo, the TA mails the zip file from the cse526 account to the students. the students unzip it and use the files.

1.2. when the project is submitted, students include in the README a checksum and the exact command used to compute it, for example,

       cat *.java */*.java | /usr/local/bin/md5
       47fdea02d713b22c114faf800e9133cc

     at the start of the demo, students re-compute the checksum on their copy of the files, and the TA checks that the value is the same as in the submitted README. if not, use approach 1.1. note: if you are using Solaris, you must use /usr/local/bin/md5 to compute the checksum. if you are using Windows, you must use approach 1.1.

2. students modify the files from step 1 to contain our testcases, by cutting-and-pasting text into one of the existing files, or by copying a file containing the testcases into the directory containing the files from step 1. the TA looks at the text being pasted or the file being copied to make sure that it contains only our testcases (and perhaps a small amount of code to call the typechecker on the testcases).

3. students run the testcases we supplied and then run their own testcases.

-------------------- added 5may2004

> is there any difference between some stupid mistakes (like a typo in the
> program) and serious ones? and can we get some of the points back if we
> can easily modify it into a correct program?

we will try to distinguish small errors from serious ones. in part, this happens automatically, if one believes that serious errors generally affect more testcases than minor errors. if your code has small errors that can be fixed by changing a few lines, write a file called CHANGES that describes (1) the errors and how to fix them, and (2) your type checker's output for the testcases affected by the change. bring a printout of this file to the demo. to be fair, changes of more than a few lines would have to be considered equivalent to submitting the project late. successful fixes in your CHANGES file will earn only partial credit; giving too much credit for them would be unfair to students who are busy with other commitments during the relatively short interval between yesterday evening and the demos tomorrow.

during the demo, you should first demonstrate your submitted code on our testcases, so there is a common baseline. Yiping, after looking at printouts of your TESTING file and your CHANGES file (if any), will ask you to demonstrate selected items from them, to confirm their accuracy. (Demonstrating an item in your CHANGES file means applying the fix and re-running the testcase.) demos can be continued at a later time if necessary, but we hope this will be unnecessary for most teams.

-------------------- added 7may2004

for grading, each team member's contribution to the project is quantified as an individual contribution factor (ICF). a person who does the expected amount of work has an ICF of 1. specifically, for teams with 2 people, a person who did half of the team's work would have an ICF of 1; this should be the norm. a person who did all of the team's work would have an ICF of 2. a person who did less than half of the team's work would have an ICF less than 1.