|
CSE307 Spring 2007 Stony Brook |
Principles of Programming Languages
Annie Liu Assignment 4 |
Handout A4 Mar. 8, 2007 Due Mar. 29 |
Control Abstraction --- Writing Queries
This assignment has two goals: (1) implementing a hierarchical state machine semantics for data in multiple tables, and (2) practicing programming with control abstraction while doing the implementation: recursion, iterators, as well as queries using comprehension and generator expressions.
While tables as data appear to be flat, programs as data appear nested or recursive, especially parse trees and syntax trees that follow highly recursive grammar rules. Even though state machines appear simple as captured easily in tables, they have been used increasingly in specifying complex computer systems, as for example in UML, especially with hierarchically nested states---a form of recursion.
Implementing a hierarchical state machine semantics
In our hierarchical state machine, states can be nested. Precisely, a state can be recursively a state machine; such a state is called a super state, and a state of the state machine inside is called a substate. When a system is in a substate, it is implicitly also in the superstate, i.e., it can take input and perform transition as in the super state, but substate can take additional input and go to a substate in the same machine; this makes the overall state machine more compact and more modular. Operationally, when a system goes in a super state, it immediately goes to the start state of the machine inside; when the system is in a state of the machine inside, an input is first handled according to the machine inside; an un-handled input is automatically passed back to the machine outside, rather than leading to the trap state of the machine inside. Note that there can be many levels of nesting, so the system can go into substates nested many levels inside. The system stops when the input runs out, or when the system goes in the trap state of the top-level machine.
We represent a hierarchical state machine using multiple tables.
For convenience, we assume that states in different tables are
different (one can always rename them to be different otherwise). We
always put the top-level machine as the first table. If a state n is
recursively a state machine, we put the machine for that state in a
separate table, preceded with a line "----state n----".
For convenience, we assume that no other proper data item is of the
form of this special line. Indeed, we could regard the top-level
machine as corresponding to an implicit top-level state too.
For example, the following two tables denote a hierarchical machine with 5 states (1, 2, 21, 22, 23) and 6 possible input labels (a, b, c, d, e, f), and with one machine nested in state 2 of the top-level machine. The top-level machine is the same as in Assignment 3. The machine for state 2 has the start state 21, where it can take input e and go to state 22, where it can then take input b and go to state 23, where it can then take input f and go back to state 21.
1 a b c d 1 1 2 2 2 1 ----state 2---- 21 e b f 21 22 22 23 23 21For example, on input
a a b e b f e c e b d, the system
will go through the following sequence of states (to help trace the
execution, we put input labels between states in parentheses in the
line above):
(a) (a) (b) (e) (b) (f) (e) (c) (e) (b) (d)
0 1 1 1 2 21 22 23 21 22 2 21 22 23 1
To understand hierarchical state machines better, first write down, in your README file, the flat state machine, i.e., one without nested states, that corresponds to the example above [Hint: it has 4 states: 1, 21, 22, 23].
Your programming task for this assignment is to (1) read data in multiple tables in the format described above for a hierarchical state machine, (2) do static semantic analysis and write the resulting data for each table, as with tag "-s" in Assignment 3, and (3) on a given sequence of input labels, output the sequence of states that the system goes through; if the system stops in the trap state of the top-level machine, flag an error.
When run, your program should take 4 file names from command line, for input data, output data, input label sequence, and output state sequence, respectively; (1) read data from the input data file; (2) write the resulting data from static semantic analysis to the output data file; (3) read a sequence of input labels from the input label sequence file; and (4) write the resulting sequence of states to the output state sequence file. When there are errors in processing the data, try to give the best error messages possible and finish all the processing possible.
Programming with control abstraction
Because, in a hierarchical model, state machines can be nested recursively in states, a natural way to trace executions is to use recursion. For most if not all of the other processings, one could use iterators, and further write queries using comprehension and generator expressions. You are asked to use these mechanisms for control abstraction as much as possible for this assignment. If, however, you feel that not using any such mechanisms could make programming easier and programs clearer, you are welcome to do without and describe the reasons. Again, you should not define classes for this entire assignment.
Extra credit suggestions
Here are some ideas. (1) Check that the states across all tables are distinct, and that the hierarchical state machine is deterministic, i.e., no substate and its superstate, supersuperstate, and so on can take a same input label and go to two different states (that are not the trap state). (2) Check whether there is a cycle in the nesting relationship, i.e., there is a sequence of states, s1, s2, ..., sk, where each next state is a substate of the previous state, and s1 is a substate of sk. (3) Generate specialized code that implements the state machine transitions using conditional statements, as opposed to reading off the tables, in another word, use compilation in place of interpretation. (4) Think of other interesting and/or useful things to check and/or to do, and do them; feel free to discuss with me.
Handins
Hand in everything electronically, using Blackboard, by 5pm on the due date. Your handin should include your README file, code, test data, and anything else you have to show your work.
Grading
This assignment will be graded based on 100 points, allocated in proportion to the estimated amount of work for each part. You may do this assignment in a team of two people; the two people will receive the same points. Exceptionally well thought-out and well written work will receive appropriate extra credit. Extra credit work will be graded based on the estimated amount of extra work.