5/17/00 Summary for final project Most students found the following results, with exceptions and explanations noted later: Clarity and succinctness: . High SETL and Prolog are clearly the shortest and easiest to understand . APTS generated C, and hand coded C were the most difficult to understand . High Java and Low Java are about right in between: they look much more complicated and harder to program than High SETL and Prolog, but are also much simpler and easier to program than C. . Low SETL is clearly low level compared to High SETL and Prolog, but is much clearer than Low Java. . The longest programs were found to be 100 times longer than the shortest ones. Speed: . Low SETL, APTS generated C, and hand coded C are much closer to each other than the rest; they are significantly faster than the rest. . High Java, Low Java, High SETL are all much much slower than the rest, and things gets much worse when inputs are large. . Prolog is in between these two groups. It is clearly slower than the first group, and clearly faster than the second group. . The slowest programs were found to be 100 times slower than the fastest programs. BTW, this can get worse as input size increases. Exceptions/Surprises found by some, and my explanations: . High Java can be optimized to achieve what's in Low Java and thus run almost as fast, and at the same time, it can be kept much simpler than Low Java. This is because the API of the Java collection framework is designed carefully to allow so. Namely, it allows one to write update code in an aggregate way, rather than explicitly one by one as in low Java. However, it is harder to understand aggregate updates than aggregate functions: an example of aggregate functions: S - T (user can put the result in any variable, including S and T) an example of aggregate updates: set S is updated by removing all elements that are also in T Analogy of this: in higher level languages, we do x-y, and put the result in any thing we like, including x and y. we are not restricted to put result only in x. in assembly languages, the above is usually compiled into load x in register i (Ri) load y in register j (Rj) set content of Ri to be content of Ri minus content Rj not put the result in yet another register, for efficiency reason. The same problem occurs for operations on more primitive data, such as integers, but is much more serious and harder to handle on sets than on whatever can fit in a register, and no current compiler can do the kind of incremental computation we did to transform high-level set operations into low-level set operations. So Java API provides such aggregate updates rather than aggregate functions, so that one is not punished by bad performance as badly. . Prolog version, using table in XSB, can grow quadratically. Chris and James brought this up in their project reports. Few people noted this, but if you look carefully, the curves of many people exhibits this. I just asked David Warren, and here is the answer: For the second clause, one can write either reach(X) :- reach(Y),edge(Y,X). or reach(X) :- edge(Y,X),reach(Y). If you use the former, you get linear time, otherwise you get quadratic time. The reason is, roughly, that the former looks into the table first and only uses those Y's that are in table already, so the Y in edge(Y,X) is fixed; then all X's are added to the table. The latter however checks all edge(Y,X) for all Y's and X's. . Low SETL runs so fast! Most people found Low SETL is almost as fast as APTS generated C. Some even found it faster than generated C. Well, they did a good job implementing it; after all, those people are among the earliest language researchers and implementors. . What about my hand code C? - The fastest of any C, from a couple of you, is a little faster than twice as fast as APTS generated C. The reason is simple: the APTS generated C uses doubly linked lists for every sets, but only singly linked lists are needed in this problem; there is also a set that needs only a bit for each element, not a linked list for the set. - Some wrote C from scratch, using adjacent list representation and DFS. I heard at least two complaints that it was a struggle to get things right. I understand. I used to be a student too and did programmer jobs, having to do such assignments all the time. I think I still had fun writing them, and now I think it is more fun to try to do this more systematically. - Some said it was surprisingly easy to write the C code following the Low SETL code and the data structures given and get it right. I'd say "sure!" and "great!". That's a completely systematic method, which is why APTS can even do it fully automatically. - Several people have C programs slower, even much slower than Low SETL and APTS generated C. They were "lazy" in the implementation work; it is much work to get anything work at all so I might as well do some easy things, such as using adjacency matrix representation instead of adjacency list, hopefully less null pointer problems and fewer segmentation faults. . An ML program!? One of you wrote an ML program. Good effort. Its speed is close to High Java and High SETL, sometimes worse. Purely functional programs are easier to reason about. However, it is well-known that it is difficult, if possible, to write purely functional programs for graph algorithms that run efficiently. The reason is simple, efficient computation depends on smart update, but purely functional programs do not allow update. Which is why a usual implementation is as slow as High Java and High SETL.