Answer Key for CSE-304 Final of Fall '97
1. a.
%%
"++" {printf("+ 1");}
.|\n {printf("%s", yytext);}
%%
b.
%%
([ \t]*[\n])+ {printf("<P>");}
. {printf("%s", yytext);}
%%
2. If all we have to do is translate \begin...\end commands, such as
\begin{itemize} .. \end{itemize}, then a simple lex translator
will do:
%%
"\begin{itemize}" {printf("<UL>\n");}
"\end{itemize}" {printf("</UL>\n");}
"\begin{document}" {printf("<html>\n");}
"\end{document}" {printf("</html>\n");}
"\begin{enumerate}" {printf("<OL>\n");}
"\end{enumerate}" {printf("</OL>\n");}
"\begin{quote}" {printf("<blockquote>\n");}
"\end{quote}" {printf("</blockquote>\n");}
...
The above translation will ensure that, given a well formed
(syntactically correct) LaTeX file, we will output
a correct HTML file.
On the other hand, if we want to translate other commands, such
as font-changing commands like \em, we run into a block-nesting
problem. In this case, we will need to write a parser for
LaTeX which has grammar rules of the form:
stmt: OPEN_BRACE EM stmt1 CLOSE_BRACE
for which we will write (syntax-directed) translation rules of
the form:
{stmt.html = "<em>" ++ stmt1.html ++ "</em>";}
where .html is an attribute of stmt that contains the html string
corresponding to stmt, and "++" stands for string concatenation.
3. a. Yes, L = (0|1)*00
b. Yes. Finite automata can be used to count using values in a
finite set (0, 1, 2 in this case).
Consider the NFA:
Start state: s0
Transitions:
From To Symbol
s0 s1 1
s0 s0 0
s1 s1 1
s1 s2 0
s2 s1 0
s2 s2 1
Final state: s0.
In this automaton, we reach s_i whenenver input modulo 3 is i,
and hence recognize exactly the numbers divisible by 3.
Regular languages are equivalent to languages recognized by
finite automata, and hence we can write a regular expression
corresponding to the above automaton.
4. a. No. G4 is ambiguous, so cannot be LL(1).
b. I0 = {
E' -> . E
E -> . E + E
E -> . E * E
E -> . id
}
I1 = goto(I0, E)
= {
E' -> E.
E -> E. + E
E -> E. * E
}
I2 = goto(I0, id)
= {
E -> id.
}
I3 = goto(I1, +)
= {
E -> E + .E
E -> . E + E
E -> . E * E
E -> . id
}
I4 = goto(I1, *)
= {
E -> E * .E
E -> . E + E
E -> . E * E
E -> . id
}
I5 = goto(I3, E)
= {
E -> E + E.
E -> E. + E
E -> E. * E
}
I6 = goto(I4, E)
= {
E -> E * E.
E -> E. + E
E -> E. * E
}
c. Let the rules of the grammar be:
0. E' -> E
1. E -> E + E
2. E -> E * E
3. E -> id
The SLR(1) action table is:
State id + * $
0 S,2
1 S,3 S,4 Acc
2 R3 R3 R3
3 S,2
4 S,2
5 S,3/R1 S,4/R1 R1
6 S,3/R2 S,4/R2 R2
d. + is left associative:
action(5, +) = R1
* is left associative:
action(6, *) = R2
+ has lower precedence than *:
action(5, *) = S,4
action(6, +) = R2
5. a. S -> L {S.val = L.val }
L -> L1 B {L.val = L1.val*2 + B.val}
L -> B {L.val = B.val}
B -> 0 {B.val = 0}
B -> 1 {B.val = 1}
b. val is a synthesized attribute of S, L, and B.
In each rule above, note that val of the lhs symbol is
computed based on the val of rhs symbols.
6. a. a[i].x refers to field x of object a[i]. Many classes may have
fields named x; without knowing the class of a[i], x we cannot
know which of these fields a[i].x refers to.
b. x could be a field inherited from a superclass.
c. c.i: int
c.j: boolean
d.i: float
d.j: float
7. a. array(integer, 10)
b. pointer(char) x pointer(char) -> integer
8. In the following, we assume that all expressions are integer expressions.
a. E -> E1 + E2
{
E.code = E1.code ||
E2.code ||
emit(iadd())
}
E -> id
{
E.code = emit(ildc(id.offset)) ||
emit(load());
}
E -> int
{
E.code = emit(ildc(int.val));
}
b. To E -> id in (a) above, add:
E.lcode = emit(ildc(id.offset));
E -> E1 = E2
{
E.code = E1.lcode ||
E2.code ||
emit(store());
}
c. E -> E1 += E2
{
E.code = E1.lcode ||
emit(dup()) || /* need both l- and */
emit(load()) || /* r-values */
E2.code ||
emit(iadd()) ||
emit(store());
}